1. 30 Jul 2019: 2 commits
  2. 23 Jul 2019: 1 commit
  3. 19 Jul 2019: 1 commit
    • drm/i915/execlists: Cancel breadcrumb on preempting the virtual engine · 7d6b60db
      Authored by Chris Wilson
      As we unwind the requests for a preemption event, we return a virtual
      request back to its original virtual engine (so that it is available for
      execution on any of its siblings). In the process, this means that its
      breadcrumb should no longer be associated with the original physical
      engine, and so we are forced to decouple it. Previously, as the request
      could not complete without our awareness, we would move it to the next
      real engine without any danger. However, preempt-to-busy allowed for
      requests to continue on the HW and complete in the background as we
      unwound, which meant that we could end up retiring the request before
      fixing up the breadcrumb link.
      
      [51679.517943] INFO: trying to register non-static key.
      [51679.517956] the code is fine but needs lockdep annotation.
      [51679.517960] turning off the locking correctness validator.
      [51679.517966] CPU: 0 PID: 3270 Comm: kworker/u8:0 Tainted: G     U            5.2.0+ #717
      [51679.517971] Hardware name: Intel Corporation NUC7i5BNK/NUC7i5BNB, BIOS BNKBL357.86A.0052.2017.0918.1346 09/18/2017
      [51679.518012] Workqueue: i915 retire_work_handler [i915]
      [51679.518017] Call Trace:
      [51679.518026]  dump_stack+0x67/0x90
      [51679.518031]  register_lock_class+0x52c/0x540
      [51679.518038]  ? find_held_lock+0x2d/0x90
      [51679.518042]  __lock_acquire+0x68/0x1800
      [51679.518047]  ? find_held_lock+0x2d/0x90
      [51679.518073]  ? __i915_sw_fence_complete+0xff/0x1c0 [i915]
      [51679.518079]  lock_acquire+0x90/0x170
      [51679.518105]  ? i915_request_cancel_breadcrumb+0x29/0x160 [i915]
      [51679.518112]  _raw_spin_lock+0x27/0x40
      [51679.518138]  ? i915_request_cancel_breadcrumb+0x29/0x160 [i915]
      [51679.518165]  i915_request_cancel_breadcrumb+0x29/0x160 [i915]
      [51679.518199]  i915_request_retire+0x43f/0x530 [i915]
      [51679.518232]  retire_requests+0x4d/0x60 [i915]
      [51679.518263]  i915_retire_requests+0xdf/0x1f0 [i915]
      [51679.518294]  retire_work_handler+0x4c/0x60 [i915]
      [51679.518301]  process_one_work+0x22c/0x5c0
      [51679.518307]  worker_thread+0x37/0x390
      [51679.518311]  ? process_one_work+0x5c0/0x5c0
      [51679.518316]  kthread+0x116/0x130
      [51679.518320]  ? kthread_create_on_node+0x40/0x40
      [51679.518325]  ret_from_fork+0x24/0x30
      [51679.520177] ------------[ cut here ]------------
      [51679.520189] list_del corruption, ffff88883675e2f0->next is LIST_POISON1 (dead000000000100)
      
      Fixes: 22b7a426 ("drm/i915/execlists: Preempt-to-busy")
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190716124931.5870-4-chris@chris-wilson.co.uk
  4. 18 Jul 2019: 1 commit
  5. 17 Jul 2019: 2 commits
  6. 16 Jul 2019: 1 commit
  7. 13 Jul 2019: 1 commit
  8. 10 Jul 2019: 3 commits
  9. 05 Jul 2019: 1 commit
  10. 04 Jul 2019: 2 commits
  11. 03 Jul 2019: 1 commit
  12. 02 Jul 2019: 1 commit
  13. 26 Jun 2019: 2 commits
  14. 24 Jun 2019: 1 commit
  15. 22 Jun 2019: 2 commits
  16. 21 Jun 2019: 6 commits
  17. 20 Jun 2019: 3 commits
    • drm/i915/execlists: Minimalistic timeslicing · 8ee36e04
      Authored by Chris Wilson
      If we have multiple contexts of equal priority pending execution,
      activate a timer to demote the currently executing context in favour of
      the next in the queue when that timeslice expires. This enforces
      fairness between contexts (so long as they allow preemption -- forced
      preemption, in the future, will kick those who do not obey) and allows
      us to avoid userspace blocking forward progress with e.g. unbounded
      MI_SEMAPHORE_WAIT.
      
      For the starting point here, we use a jiffy as our timeslice so that
      we should be reasonably efficient wrt frequent CPU wakeups.
      
      Testcase: igt/gem_exec_scheduler/semaphore-resolve
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190620142052.19311-2-chris@chris-wilson.co.uk
    • drm/i915/execlists: Preempt-to-busy · 22b7a426
      Authored by Chris Wilson
      When using a global seqno, we required a precise stop-the-world event to
      handle preemption and unwind the global seqno counter. To accomplish
      this, we would preempt to a special out-of-band context and wait for the
      machine to report that it was idle. Given an idle machine, we could very
      precisely see which requests had completed and which we needed to feed
      back into the run queue.
      
      However, now that we have scrapped the global seqno, we no longer need
      to precisely unwind the global counter and only track requests by their
      per-context seqno. This allows us to loosely unwind inflight requests
      while scheduling a preemption, with the enormous caveat that the
      requests we put back on the run queue are still _inflight_ (until the
      preemption request is complete). This makes request tracking much more
      messy, as at any point then we can see a completed request that we
      believe is not currently scheduled for execution. We also have to be
      careful not to rewind RING_TAIL past RING_HEAD on preempting to the
      running context, and for this we use a semaphore to prevent completion
      of the request before continuing.
      
      To accomplish this feat, we change how we track requests scheduled to
      the HW. Instead of appending our requests onto a single list as we
      submit, we track each submission to ELSP as its own block. Then upon
      receiving the CS preemption event, we promote the pending block to the
      inflight block (discarding what was previously being tracked). As normal
      CS completion events arrive, we then remove stale entries from the
      inflight tracker.
      
      v2: Be a tinge paranoid and ensure we flush the write into the HWS page
      for the GPU semaphore to pick up in a timely fashion.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190620142052.19311-1-chris@chris-wilson.co.uk
    • drm/i915: Keep rings pinned while the context is active · 09c5ab38
      Authored by Chris Wilson
      Remember to keep the rings pinned as well as the context image until the
      GPU is no longer active.
      
      v2: Introduce a ring->pin_count primarily to hide the
      mock_ring that doesn't fit into the normal GGTT vma picture.
      
      v3: Order is important in teardown, ringbuffer submission needs to drop
      the pin count on the engine->kernel_context before it can gleefully free
      its ring.
      
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110946
      Fixes: ce476c80 ("drm/i915: Keep contexts pinned until after the next kernel context switch")
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190619170135.15281-1-chris@chris-wilson.co.uk
  18. 19 Jun 2019: 2 commits
    • drm/i915/execlists: Detect cross-contamination with GuC · 73591341
      Authored by Chris Wilson
      The process_csb routine from execlists_submission is incompatible with
      the GuC backend. Add a warning to detect if we accidentally end up in
      the wrong spot.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
      Cc: Michał Winiarski <michal.winiarski@intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190618110736.31155-1-chris@chris-wilson.co.uk
    • drm/i915: Make the semaphore saturation mask global · 44d89409
      Authored by Chris Wilson
      The idea behind keeping the saturation mask local to a context backfired
      spectacularly. The premise with the local mask was that we would be more
      proactive in attempting to use semaphores after each time the context
      idled, and that all new contexts would attempt to use semaphores
      ignoring the current state of the system. This turns out to be horribly
      optimistic. If the system state is still oversaturated and the existing
      workloads have all stopped using semaphores, the new workloads would
      attempt to use semaphores and be deprioritised behind real work. The
      new contexts would not switch off using semaphores until their initial
      batch of low priority work had completed. Given a sufficient backlog
      of equal user priority, this would completely starve the new work of any
      GPU time.
      
      To compensate, remove the local tracking in favour of keeping it as
      global state on the engine -- once the system is saturated and
      semaphores are disabled, everyone stops attempting to use semaphores
      until the system is idle again. One of the reasons for preferring local
      context tracking was that it worked with virtual engines, so for
      switching to global state we could either do a complete check of all the
      virtual siblings or simply disable semaphores for those requests. This
      takes the simpler approach of disabling semaphores on virtual engines.
      
      The downside is that the decision that the engine is saturated is a
      local measure -- we are only checking whether or not this context was
      scheduled in a timely fashion, it may be legitimately delayed due to user
      priorities. We still have the same dilemma though, that we do not want
      to employ the semaphore poll unless it will be used.
      
      v2: Explain why we need to assume the worst wrt virtual engines.
      
      Fixes: ca6e56f6 ("drm/i915: Disable semaphore busywaits on saturated systems")
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
      Cc: Dmitry Ermilov <dmitry.ermilov@intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190618074153.16055-8-chris@chris-wilson.co.uk
  19. 15 Jun 2019: 2 commits
    • drm/i915: Replace engine->timeline with a plain list · 422d7df4
      Authored by Chris Wilson
      To continue the onslaught of removing the assumption of a global
      execution ordering, another casualty is the engine->timeline. Without an
      actual timeline to track, it is overkill and we can replace it with a
      much less grand plain list. We still need a list of requests inflight,
      for the simple purpose of finding inflight requests (for retiring,
      resetting, preemption etc).
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190614164606.15633-3-chris@chris-wilson.co.uk
    • drm/i915: Keep contexts pinned until after the next kernel context switch · ce476c80
      Authored by Chris Wilson
      We need to keep the context image pinned in memory until after the GPU
      has finished writing into it. Since it continues to write as we signal
      the final breadcrumb, we need to keep it pinned until the request after
      it is complete. Currently we know the order in which requests execute on
      each engine, and so to remove that presumption we need to identify a
      request/context-switch we know must occur after our completion. Any
      request queued after the signal must imply a context switch; for
      simplicity we use a fresh request from the kernel context.
      
      The sequence of operations for keeping the context pinned until saved is:
      
       - On context activation, we preallocate a node for each physical engine
         the context may operate on. This is to avoid allocations during
         unpinning, which may be from inside FS_RECLAIM context (aka the
         shrinker)
      
       - On context deactivation on retirement of the last active request (which
         is before we know the context has been saved), we add the
         preallocated node onto a barrier list on each engine
      
       - On engine idling, we emit a switch to kernel context. When this
         switch completes, we know that all previous contexts must have been
         saved, and so on retiring this request we can finally unpin all the
         contexts that were marked as deactivated prior to the switch.
      
      We can enhance this in future by flushing all the idle contexts on a
      regular heartbeat pulse of a switch to kernel context, which will also
      be used to check for hung engines.
      
      v2: intel_context_active_acquire/_release
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190614164606.15633-1-chris@chris-wilson.co.uk
  20. 11 Jun 2019: 2 commits
  21. 07 Jun 2019: 2 commits
  22. 28 May 2019: 1 commit