1. 06 Mar, 2019 · 1 commit
  2. 05 Mar, 2019 · 1 commit
  3. 26 Feb, 2019 · 2 commits
  4. 19 Feb, 2019 · 1 commit
    • drm/i915: Use time based guilty context banning · 7f4127c4
      Committed by Chris Wilson
      Currently, we accumulate a ban score each time a context hangs the GPU,
      offset against the number of requests it submits, and if that score
      exceeds a certain threshold, we ban that context from submitting any
      more requests (cancelling any work in flight). In contrast, we use a
      simple timer on the file: if we see more than 9 hangs less than 60s apart in
      total across all of its contexts, we will ban the client from creating
      any more contexts. This leads to a confusing situation where the file
      may be banned before the context, so let's use a simple timer scheme for
      each.
      
      If the context submits 3 hanging requests within a 120s period, declare
      it forbidden to ever send more requests.
      
      This has the advantage of not being easy to repair by simply sending
      empty requests, but has the disadvantage that if the context is idle
      then it is forgiven. However, if the context is idle, it is not
      disrupting the system, but a hog can evade the request counting and
      cause much more severe disruption to the system.
      
      Updating ban_score from request retirement is dubious as the retirement
      is purposely not in sync with request submission (i.e. we try to batch
      retirement to reduce overhead and avoid latency on submission), which
      leads to surprising situations where we can forgive a hang immediately
      due to a backlog of requests from before the hang being retired
      afterwards.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@intel.com>
      Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190219122215.8941-2-chris@chris-wilson.co.uk
      7f4127c4
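
      A minimal userspace sketch of the scheme above (a third hang inside
      a 120s window bans the context); the struct and names below are
      illustrative, not the driver's actual implementation:

          #include <stdbool.h>
          #include <time.h>

          #define BAN_PERIOD 120 /* seconds; older hangs are forgiven */
          #define BAN_LIMIT 3    /* hangs within the period before banning */

          struct guilty_ctx {
              time_t hangs[BAN_LIMIT - 1]; /* the two previous hang times */
              unsigned int next;           /* oldest slot, overwritten next */
              bool banned;
          };

          /* Record a hang; if the hang two-before is still inside the ban
           * period, this is the BAN_LIMIT'th hang in BAN_PERIOD seconds. */
          static void context_mark_guilty(struct guilty_ctx *ctx)
          {
              time_t now = time(NULL);
              time_t oldest = ctx->hangs[ctx->next];

              ctx->hangs[ctx->next] = now;
              ctx->next = (ctx->next + 1) % (BAN_LIMIT - 1);

              if (oldest && now - oldest <= BAN_PERIOD)
                  ctx->banned = true; /* an idle context ages out instead */
          }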
  5. 06 Feb, 2019 · 1 commit
  6. 30 Jan, 2019 · 2 commits
    • drm/i915: Drop fake breadcrumb irq · 789659f4
      Committed by Chris Wilson
      Missed breadcrumb detection is defunct due to the tight coupling with
      dma_fence signaling and the myriad ways we may signal fences from
      everywhere but the interrupt, i.e. we frequently signal a fence
      before we even see its interrupt. This means that even if we miss an
      interrupt for a fence, it still is signaled before our breadcrumb
      hangcheck fires, so simplify the breadcrumb hangchecking by moving it
      into the GPU hangcheck and forgo fake interrupts.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190129205230.19056-3-chris@chris-wilson.co.uk
      789659f4
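
      A hedged sketch of the resulting check (struct names are stand-ins,
      not the driver's own): the periodic GPU hangcheck only needs to ask
      whether the oldest fence has signaled, however that signal arrived.

          #include <stdbool.h>

          struct fence {
              bool signaled; /* set by whoever signals the dma_fence */
          };

          struct engine {
              struct fence *oldest;         /* earliest outstanding request */
              unsigned int hangcheck_score; /* evidence of a stuck engine */
          };

          /* Runs from the periodic GPU hangcheck: a fence signaled before we
           * got here (interrupt seen or not) needs no rescue, so no fake
           * breadcrumb interrupt is required. */
          static void engine_hangcheck(struct engine *e)
          {
              if (!e->oldest || e->oldest->signaled) {
                  e->hangcheck_score = 0; /* forward progress */
                  return;
              }
              e->hangcheck_score++; /* stuck: escalate towards reset */
          }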
    • drm/i915: Replace global breadcrumbs with per-context interrupt tracking · 52c0fdb2
      Committed by Chris Wilson
      A few years ago, see commit 688e6c72 ("drm/i915: Slaughter the
      thundering i915_wait_request herd"), the issue of handling multiple
      clients waiting in parallel was brought to our attention. The
      requirement was that every client should be woken immediately upon its
      request being signaled, without incurring any cpu overhead.
      
      Handling certain fragility of our hw meant that we could not do a
      simple check inside the irq handler (some generations required almost
      unbounded delays before we could be sure of seqno coherency), and so
      request completion checking required delegation.
      
      Before commit 688e6c72, the solution was simple. Every client
      waiting on a request would be woken on every interrupt and each would do
      a heavyweight check to see if their request was complete. Commit
      688e6c72 introduced an rbtree so that only the earliest waiter on
      the global timeline would be woken, and it would then wake the next,
      and so on. (Along with various complications to handle requests being
      reordered along the global timeline, and also a requirement for a kthread to provide
      a delegate for fence signaling that had no process context.)
      
      The global rbtree depends on knowing the execution timeline (and global
      seqno). Without knowing that order, we must instead check all contexts
      queued to the HW to see which may have advanced. We trim that list by
      only checking queued contexts that are being waited on, but still we
      keep a list of all active contexts and their active signalers that we
      inspect from inside the irq handler. By moving the waiters onto the fence
      signal list, we can combine the client wakeup with the dma_fence
      signaling (a dramatic reduction in complexity, but it does require the HW
      to be coherent: the seqno must be visible from the cpu before the
      interrupt is raised - we keep a timer backup just in case).
      
      Having previously fixed all the issues with irq-seqno serialisation (by
      inserting delays onto the GPU after each request instead of random delays
      on the CPU after each interrupt), we can rely on the seqno state to
      perform direct wakeups from the interrupt handler. This allows us to
      preserve our single context switch behaviour of the current routine,
      with the only downside that we lose the RT priority sorting of wakeups.
      In general, direct wakeup latency of multiple clients is about the same
      (about 10% better in most cases) with a reduction in total CPU time spent
      in the waiter (about 20-50% depending on gen). Average herd behaviour is
      improved, but at the cost of not delegating wakeups on task_prio.
      
      v2: Capture fence signaling state for error state and add comments to
      warm even the most cold of hearts.
      v3: Check if the request is still active before busywaiting
      v4: Reduce the amount of pointer misdirection with list_for_each_safe
      and using a local i915_request variable inside the loops
      v5: Add a missing pluralisation to a purely informative selftest message.
      
      References: 688e6c72 ("drm/i915: Slaughter the thundering i915_wait_request herd")
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190129205230.19056-2-chris@chris-wilson.co.uk
      52c0fdb2
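
      In outline, the irq handler walks a per-context list of signalers
      sorted by seqno and wakes completed waiters directly; a sketch
      assuming a coherent seqno, with illustrative names throughout:

          #include <stdint.h>

          struct signaler {
              uint32_t seqno;           /* the breadcrumb this waiter needs */
              void (*wake)(void *data); /* combined dma_fence/client wakeup */
              void *data;
              struct signaler *next;    /* list kept sorted, earliest first */
          };

          struct ctx_breadcrumbs {
              uint32_t hw_seqno;        /* visible to the cpu before the irq */
              struct signaler *waiters;
          };

          /* Called from the interrupt handler: the list is sorted, so stop
           * at the first incomplete seqno; completed waiters are unlinked
           * and woken directly, with no delegation to a signaler kthread. */
          static void ctx_signal_breadcrumbs(struct ctx_breadcrumbs *b)
          {
              while (b->waiters &&
                     (int32_t)(b->hw_seqno - b->waiters->seqno) >= 0) {
                  struct signaler *s = b->waiters;

                  b->waiters = s->next; /* unlink before waking */
                  s->wake(s->data);
              }
          }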
  7. 29 Jan, 2019 · 1 commit
    • drm/i915: Stop tracking MRU activity on VMA · 499197dc
      Committed by Chris Wilson
      Our goal is to remove struct_mutex and replace it with fine-grained
      locking. One of the thorny issues is our eviction logic for reclaiming
      space for an execbuffer (or GTT mmapping, among a few other examples).
      While eviction itself is easy to move under a per-VM mutex, performing
      the activity tracking is less agreeable. One solution is not to do any
      MRU tracking and do a simple coarse evaluation during eviction of
      active/inactive, with a loose temporal ordering of last
      insertion/evaluation. That keeps all the locking constrained to when we
      are manipulating the VM itself, neatly avoiding the tricky handling of
      possible recursive locking during execbuf and elsewhere.
      
      Note that discarding the MRU (currently implemented as a pair of lists,
      to avoid scanning the active list for a NONBLOCKING search) is unlikely
      to impact our efficiency at reclaiming VM space (where we think an LRU
      model is best) as our current strategy is to use random idle replacement
      first before doing a search, and over time the use of softpinned 48b
      per-ppGTT is growing (thereby eliminating any need to perform any eviction
      searches, in theory at least) with the remaining users being found on
      much older devices (gen2-gen6).
      
      v2: Changelog and commentary rewritten to elaborate on the duality of a
      single list being both an inactive and active list.
      v3: Consolidate bool parameters into a single set of flags; don't
      comment on the duality of a single variable being a multiplicity of
      bits.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190128102356.15037-1-chris@chris-wilson.co.uk
      499197dc
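
      A sketch of the coarse scan this enables, assuming a single per-VM
      bound list in loose temporal order (the vma fields and helpers here
      are illustrative, not the actual i915 structures):

          #include <stdbool.h>
          #include <stddef.h>

          struct vma {
              size_t size;
              bool active; /* coarse active/inactive, no MRU ordering */
              bool pinned;
              struct vma *next;
          };

          /* Walk the single bound list in loose insertion order, evicting
           * inactive, unpinned VMAs until enough space is reclaimed; only
           * the VM's own lock is needed, with no MRU bookkeeping to update
           * from request retirement. */
          static bool vm_evict_for(struct vma *bound, size_t need,
                                   void (*evict)(struct vma *))
          {
              size_t freed = 0;
              struct vma *v = bound;

              while (v && freed < need) {
                  struct vma *next = v->next; /* evict may unlink v */

                  if (!v->active && !v->pinned) {
                      freed += v->size;
                      evict(v);
                  }
                  v = next;
              }
              return freed >= need;
          }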
  8. 25 Jan, 2019 · 1 commit
  9. 17 Jan, 2019 · 1 commit
  10. 10 Jan, 2019 · 2 commits
  11. 03 Jan, 2019 · 1 commit
  12. 02 Jan, 2019 · 1 commit
  13. 31 Dec, 2018 · 2 commits
  14. 13 Dec, 2018 · 2 commits
  15. 12 Dec, 2018 · 1 commit
  16. 07 Dec, 2018 · 1 commit
  17. 04 Dec, 2018 · 1 commit
  18. 23 Nov, 2018 · 1 commit
    • drm/i915: Cache the error string · 0e39037b
      Committed by Chris Wilson
      Currently, we convert the error state into a string every time we read
      from sysfs (and sysfs reads in page size (4KiB) chunks). We do try to
      window the string and only capture the portion that is being read, but
      that means that we must always convert up to the window to find the
      start. For a very large error state bordering on EXEC_OBJECT_CAPTURE
      abuse, this is noticeable as it degrades to O(N^2)!
      
      As we do not have a convenient hook for sysfs open(), and we would like
      to keep the lazy conversion into a string, do the conversion of the
      whole string on the first read and keep the string until the error state
      is freed.
      
      v2: Don't double advance simple_read_from_buffer
      v3: Due to extreme pain of lack of vrealloc, use a scatterlist
      v4: Keep the forward iterator loosely cached
      v5: Stylistic improvements to reduce patch size
      Reported-by: Jason Ekstrand <jason@jlekstrand.net>
      Testcase: igt/gem_exec_capture/many*
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Cc: Jason Ekstrand <jason@jlekstrand.net>
      Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20181123132325.26541-1-chris@chris-wilson.co.uk
      0e39037b
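
      A sketch of the first-read conversion and windowed copy;
      build_error_string() stands in for the real capture-to-text pass and
      is hypothetical, as is the surrounding struct:

          #include <errno.h>
          #include <stddef.h>
          #include <string.h>
          #include <sys/types.h>

          struct error_state {
              char *str;  /* built once, on first read, kept until freed */
              size_t len;
          };

          char *build_error_string(size_t *len); /* hypothetical converter */

          /* Serve a windowed read (sysfs hands out ~4KiB at a time) from the
           * cached string instead of re-converting up to the window each
           * time, avoiding the O(N^2) cost of repeated partial conversions. */
          static ssize_t error_state_read(struct error_state *e, char *buf,
                                          size_t count, size_t *ppos)
          {
              if (!e->str) {
                  e->str = build_error_string(&e->len);
                  if (!e->str)
                      return -ENOMEM;
              }
              if (*ppos >= e->len)
                  return 0;
              if (count > e->len - *ppos)
                  count = e->len - *ppos;
              memcpy(buf, e->str + *ppos, count);
              *ppos += count; /* advance the offset exactly once (the v2 fix) */
              return count;
          }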
  19. 22 Nov, 2018 · 1 commit
  20. 20 Nov, 2018 · 2 commits
  21. 03 Oct, 2018 · 3 commits
  22. 27 Sep, 2018 · 1 commit
  23. 15 Sep, 2018 · 1 commit
  24. 30 Jul, 2018 · 1 commit
  25. 07 Jul, 2018 · 1 commit
  26. 08 Jun, 2018 · 1 commit
  27. 06 Jun, 2018 · 2 commits
  28. 18 May, 2018 · 3 commits
  29. 03 May, 2018 · 1 commit