1. 26 Feb, 2019 1 commit
    • C
      drm/i915: Replace global_seqno with a hangcheck heartbeat seqno · 89531e7d
      Chris Wilson committed
      To determine whether an engine has 'stuck', we simply check whether or
      not it is still on the same seqno for several seconds. To keep this simple
      mechanism intact over the loss of a global seqno, we can simply add a
      new global heartbeat seqno instead. As we cannot know the sequence in
      which requests will then be completed, we use a primitive random number
      generator instead (with a cycle long enough to not matter over an
      interval of a few thousand requests between hangcheck samples).
      
      The alternative to using a dedicated seqno on every request is to issue
      a heartbeat request and query its progress through the system. Sadly
      this requires us to reduce struct_mutex so that we can issue requests
      without requiring that bkl.
      
      v2: And without the extra CS_STALL for the hangcheck seqno -- we don't
      need strict serialisation with what comes later, we just need to be sure
      we don't write the hangcheck seqno before our batch is flushed.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190226094922.31617-1-chris@chris-wilson.co.uk
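The heartbeat idea in this commit message can be illustrated with a minimal userspace sketch. It is not the driver's code: `struct engine`, `next_heartbeat()`, `request_complete()` and `hangcheck_sample()` are hypothetical names, and xorshift32 merely stands in for the "primitive random number generator" the message mentions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical engine state, for illustration only. */
struct engine {
	uint32_t hangcheck_seqno;     /* published as each request completes */
	uint32_t last_sample;         /* value seen by the previous sample */
	unsigned int stalled_samples; /* consecutive samples without change */
	uint32_t prng_state;          /* must be non-zero for xorshift32 */
};

/* xorshift32: an assumed stand-in for the primitive PRNG. */
static uint32_t next_heartbeat(struct engine *e)
{
	uint32_t x = e->prng_state;

	assert(x != 0);
	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	e->prng_state = x;
	return x;
}

/* Called when a request finishes: publish that request's heartbeat value. */
static void request_complete(struct engine *e, uint32_t heartbeat)
{
	e->hangcheck_seqno = heartbeat;
}

/*
 * Periodic sample. Only equality matters, so the values need no ordering.
 * Returns true once the value has been unchanged for 'limit' samples.
 */
static bool hangcheck_sample(struct engine *e, unsigned int limit)
{
	if (e->hangcheck_seqno != e->last_sample) {
		e->last_sample = e->hangcheck_seqno;
		e->stalled_samples = 0;
		return false;
	}
	return ++e->stalled_samples >= limit;
}
```

Because the sampler only asks "did the value change?", completion order across contexts does not matter, which is the property the message relies on once the global seqno is gone.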
  2. 21 Feb, 2019 1 commit
  3. 12 Feb, 2019 1 commit
  4. 06 Feb, 2019 1 commit
  5. 30 Jan, 2019 3 commits
    • C
      drm/i915: Replace global breadcrumbs with per-context interrupt tracking · 52c0fdb2
      Chris Wilson committed
      A few years ago, see commit 688e6c72 ("drm/i915: Slaughter the
      thundering i915_wait_request herd"), the issue of handling multiple
      clients waiting in parallel was brought to our attention. The
      requirement was that every client should be woken immediately upon its
      request being signaled, without incurring any cpu overhead.
      
      Handling certain fragility of our hw meant that we could not do a
      simple check inside the irq handler (some generations required almost
      unbounded delays before we could be sure of seqno coherency) and so
      request completion checking required delegation.
      
      Before commit 688e6c72, the solution was simple. Every client
      waiting on a request would be woken on every interrupt and each would do
      a heavyweight check to see if their request was complete. Commit
      688e6c72 introduced an rbtree so that only the earliest waiter on
      the global timeline would be woken, and would wake the next and so on.
      (Along with various complications to handle requests being reordered
      along the global timeline, and also a requirement for kthread to provide
      a delegate for fence signaling that had no process context.)
      
      The global rbtree depends on knowing the execution timeline (and global
      seqno). Without knowing that order, we must instead check all contexts
      queued to the HW to see which may have advanced. We trim that list by
      only checking queued contexts that are being waited on, but still we
      keep a list of all active contexts and their active signalers that we
      inspect from inside the irq handler. By moving the waiters onto the fence
      signal list, we can combine the client wakeup with the dma_fence
      signaling (a dramatic reduction in complexity, but does require the HW
      being coherent, the seqno must be visible from the cpu before the
      interrupt is raised - we keep a timer backup just in case).
      
      Having previously fixed all the issues with irq-seqno serialisation (by
      inserting delays onto the GPU after each request instead of random delays
      on the CPU after each interrupt), we can rely on the seqno state to
      perform direct wakeups from the interrupt handler. This allows us to
      preserve our single context switch behaviour of the current routine,
      with the only downside that we lose the RT priority sorting of wakeups.
      In general, direct wakeup latency of multiple clients is about the same
      (about 10% better in most cases) with a reduction in total CPU time spent
      in the waiter (about 20-50% depending on gen). Average herd behaviour is
      improved, but at the cost of not delegating wakeups on task_prio.
      
      v2: Capture fence signaling state for error state and add comments to
      warm even the most cold of hearts.
      v3: Check if the request is still active before busywaiting
      v4: Reduce the amount of pointer misdirection with list_for_each_safe
      and using a local i915_request variable inside the loops
      v5: Add a missing pluralisation to a purely informative selftest message.
      
      References: 688e6c72 ("drm/i915: Slaughter the thundering i915_wait_request herd")
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190129205230.19056-2-chris@chris-wilson.co.uk
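A rough sketch of the per-context signaling described above, under simplifying assumptions: each context keeps its waited-on requests in submission order in a small array, and the interrupt path walks every context with signalers. All names (`struct context`, `context_add_signaler()`, `irq_signal_contexts()`) are hypothetical and the real driver uses linked lists and locking that are omitted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SIGNALERS 8

/* Hypothetical request: complete once the context's seqno reaches 'seqno'. */
struct request {
	uint32_t seqno;
	bool signaled;
};

/* Hypothetical context: its own HW-written seqno plus waited-on requests. */
struct context {
	uint32_t hw_seqno;
	struct request *signalers[MAX_SIGNALERS]; /* in submission order */
	size_t nr_signalers;
};

/* Wrap-safe "a is at or after b" on 32-bit sequence numbers. */
static bool seqno_passed(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) >= 0;
}

static void context_add_signaler(struct context *ctx, struct request *rq)
{
	assert(ctx->nr_signalers < MAX_SIGNALERS);
	ctx->signalers[ctx->nr_signalers++] = rq;
}

/* Interrupt path: signal completed requests, return how many were signaled. */
static size_t irq_signal_contexts(struct context *ctxs, size_t nr_ctx)
{
	size_t woken = 0;

	for (size_t i = 0; i < nr_ctx; i++) {
		struct context *ctx = &ctxs[i];
		size_t done = 0;

		/* Requests are ordered, so stop at the first incomplete one. */
		while (done < ctx->nr_signalers &&
		       seqno_passed(ctx->hw_seqno, ctx->signalers[done]->seqno)) {
			ctx->signalers[done]->signaled = true;
			done++;
		}

		/* Drop the signaled entries from the front of the list. */
		for (size_t j = done; j < ctx->nr_signalers; j++)
			ctx->signalers[j - done] = ctx->signalers[j];
		ctx->nr_signalers -= done;
		woken += done;
	}
	return woken;
}
```

The sketch has no global ordering between contexts: each context is checked only against its own seqno, which is why no execution timeline is needed.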
    • C
      drm/i915: Rename execlists->queue_priority to queue_priority_hint · 4d97cbe0
      Chris Wilson committed
      After noticing that we trigger preemption events for currently executing
      requests, as well as requests that complete before the preemption and
      attempting to suppress those preemption events, it is wise to not
      consider the queue_priority to be authoritative. As we only track the
      maximum priority seen between dequeue passes, if the maximum priority
      request is no longer available for dequeuing (it completed or is even
      executing on another engine), we have no knowledge of the previous
      queue_priority as it would require us to keep a full history of enqueued
      requests -- but we already have that history in the priolists!
      
      Rename the queue_priority to queue_priority_hint so that we do not
      confuse it as being exactly the maximum priority in the queue, but merely
      an indication that we have seen a new maximum priority value and as such
      we should check whether it should preempt the currently running request.
      
      v2: s/preempt_priority_hint/queue_priority_hint/ as preempt implies it
      being only used for the singular task of preemption and not the wider
      question of waking up due to a change in the queue.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190129185452.20989-3-chris@chris-wilson.co.uk
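To make the "hint, not authority" distinction concrete, here is a small sketch. The `struct sched` type and its helpers are hypothetical, and a flat array stands in for the priolists; the point is only that the hint can go stale when a queued request disappears, and that the dequeue pass is where the real answer comes from.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_MAX 16

/* Hypothetical scheduler state; the array stands in for the priolists. */
struct sched {
	int queue[QUEUE_MAX];    /* priorities of requests still waiting */
	size_t count;
	int queue_priority_hint; /* highest priority seen since last dequeue */
};

static void sched_init(struct sched *s)
{
	s->count = 0;
	s->queue_priority_hint = INT_MIN;
}

/* Enqueue a request and raise the hint if this is a new maximum. */
static void sched_submit(struct sched *s, int prio)
{
	assert(s->count < QUEUE_MAX);
	s->queue[s->count++] = prio;
	if (prio > s->queue_priority_hint)
		s->queue_priority_hint = prio;
}

/* A queued request went away (completed or ran elsewhere); hint untouched. */
static void sched_remove(struct sched *s, size_t idx)
{
	assert(idx < s->count);
	s->queue[idx] = s->queue[--s->count];
}

/* Cheap check: the hint only says "go and look", it may be stale. */
static bool sched_should_check(const struct sched *s, int running_prio)
{
	return s->queue_priority_hint > running_prio;
}

/* Dequeue pass: consult the real queue, then refresh the hint. */
static bool sched_dequeue_check(struct sched *s, int running_prio)
{
	int max = INT_MIN;

	for (size_t i = 0; i < s->count; i++)
		if (s->queue[i] > max)
			max = s->queue[i];

	s->queue_priority_hint = max;
	return max > running_prio;
}
```

A stale hint costs one unnecessary dequeue pass, after which it is corrected from the queue itself.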
    • C
      drm/i915: Identify active requests · 85474441
      Chris Wilson committed
      To allow requests to forgo a common execution timeline, one question we
      need to be able to answer is "is this request running?". To track
      whether a request has started on HW, we can emit a breadcrumb at the
      beginning of the request and check its timeline's HWSP to see if the
      breadcrumb has advanced past the start of this request. (This is in
      contrast to the global timeline where we need only ask if we are on the
      global timeline and if the timeline has advanced past the end of the
      previous request.)
      
      There is still confusion from a preempted request, which has already
      started but relinquished the HW to a high priority request. For the
      common case, this discrepancy should be negligible. However, for
      identification of hung requests, knowing which one was running at the
      time of the hang will be much more important.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190129185452.20989-2-chris@chris-wilson.co.uk
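The "is this request running?" test can be sketched as follows. This is an illustration, not the driver's implementation: it assumes the initial breadcrumb writes `seqno - 1` into the timeline's HWSP slot and the final breadcrumb writes `seqno`, and that no other request on the timeline writes `seqno - 1` in between. `struct request` and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical request tied to one timeline's HW status page slot. */
struct request {
	uint32_t seqno;       /* value written by the final breadcrumb */
	const uint32_t *hwsp; /* this timeline's slot, written by the HW */
};

/* Wrap-safe "a is at or after b" on 32-bit sequence numbers. */
static bool seqno_passed(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) >= 0;
}

/* Assumption: the initial breadcrumb writes seqno - 1 on start. */
static bool request_started(const struct request *rq)
{
	assert(rq->hwsp);
	return seqno_passed(*rq->hwsp, rq->seqno - 1);
}

static bool request_completed(const struct request *rq)
{
	assert(rq->hwsp);
	return seqno_passed(*rq->hwsp, rq->seqno);
}

/* Started but not yet completed. */
static bool request_is_running(const struct request *rq)
{
	return request_started(rq) && !request_completed(rq);
}
```

As the message notes, a preempted request still looks "running" under this test, since its initial breadcrumb has already been written.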
  6. 29 Jan, 2019 3 commits
  7. 25 Jan, 2019 3 commits
  8. 18 Jan, 2019 1 commit
  9. 17 Jan, 2019 2 commits
  10. 16 Jan, 2019 1 commit
  11. 15 Jan, 2019 2 commits
  12. 03 Jan, 2019 1 commit
  13. 02 Jan, 2019 1 commit
  14. 31 Dec, 2018 1 commit
  15. 28 Dec, 2018 1 commit
  16. 18 Dec, 2018 1 commit
  17. 13 Dec, 2018 3 commits
  18. 12 Dec, 2018 1 commit
  19. 07 Dec, 2018 1 commit
  20. 05 Dec, 2018 1 commit
    • T
      drm/i915: Introduce per-engine workarounds · 90098efa
      Tvrtko Ursulin committed
      We stopped re-applying the GT workarounds after engine reset since commit
      59b449d5 ("drm/i915: Split out functions for different kinds of
      workarounds").
      
      Issue with this is that some of the GT workarounds live in the MMIO space
      which gets lost during engine resets. So far the registers in 0x2xxx and
      0xbxxx address range have been identified to be affected.
      
      This losing of applied workarounds has obvious negative effects and can
      even lead to hard system hangs (see the linked Bugzilla).
      
      Rather than just restoring this re-application, because we have also
      observed that it is not safe to just re-write all GT workarounds after
      engine resets (GPU might be live and weird hardware states can happen),
      we introduce a new class of per-engine workarounds and move only the
      affected GT workarounds over.
      
      Using the framework introduced in the previous patch, we therefore after
      engine reset, re-apply only the workarounds living in the affected MMIO
      address ranges.
      
      v2:
       * Move Wa_1406609255:icl to engine workarounds as well.
       * Rename API. (Chris Wilson)
       * Drop redundant IS_KABYLAKE. (Chris Wilson)
       * Re-order engine wa/ init so latest platforms are first. (Rodrigo Vivi)
      Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Bugzilla: https://bugzilla.freedesktop.org/show_bug.cgi?id=107945
      Fixes: 59b449d5 ("drm/i915: Split out functions for different kinds of workarounds")
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Jani Nikula <jani.nikula@linux.intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
      Cc: intel-gfx@lists.freedesktop.org
      Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Link: https://patchwork.freedesktop.org/patch/msgid/20181203133341.10258-1-tvrtko.ursulin@linux.intel.com
      (cherry picked from commit 4a15c75c)
      Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
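The split between GT and per-engine workaround lists might look roughly like this in a userspace model. Everything here is an assumption made for illustration: the fake `mmio` register file, the `struct wa_list` layout, the helper names, and the register offsets used in any caller. Only the affected ranges (0x2xxx and 0xbxxx) come from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WA_MAX     8
#define MMIO_WORDS 0x4000 /* fake register file, indexed by offset / 4 */

/* One workaround: set the bits under 'mask' in 'reg' to 'val'. */
struct wa {
	uint32_t reg;
	uint32_t mask;
	uint32_t val;
};

struct wa_list {
	struct wa list[WA_MAX];
	size_t count;
};

static uint32_t mmio[MMIO_WORDS];

static void wa_add(struct wa_list *wal, uint32_t reg, uint32_t mask,
		   uint32_t val)
{
	assert(wal->count < WA_MAX);
	assert(reg / 4 < MMIO_WORDS);
	wal->list[wal->count++] = (struct wa){ reg, mask, val };
}

/* Read-modify-write every entry in the list. */
static void wa_list_apply(const struct wa_list *wal)
{
	for (size_t i = 0; i < wal->count; i++) {
		const struct wa *wa = &wal->list[i];
		uint32_t *r = &mmio[wa->reg / 4];

		*r = (*r & ~wa->mask) | (wa->val & wa->mask);
	}
}

static void clear_range(uint32_t start, uint32_t end)
{
	for (uint32_t reg = start; reg < end; reg += 4)
		mmio[reg / 4] = 0;
}

/*
 * Simulated engine reset: the 0x2xxx and 0xbxxx ranges lose their state,
 * then only the per-engine list is re-applied. The GT list is left alone.
 */
static void engine_reset(const struct wa_list *engine_wal)
{
	clear_range(0x2000, 0x3000);
	clear_range(0xb000, 0xc000);
	wa_list_apply(engine_wal);
}
```

Keeping the two lists separate means a reset never rewrites GT registers that may hold live state, which is the safety concern the message raises.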
  21. 04 Dec, 2018 4 commits
  22. 21 Nov, 2018 1 commit
  23. 16 Nov, 2018 1 commit
    • C
      drm/i915/selftests: Workaround an issue with unused lockdep subclass · f911e723
      Chris Wilson committed
      lockdep insists that if we give a lock a subclass, it must be used.
      Failure to do so triggers a self-consistency check when reading
      lockdep_stats:
      
      [   49.902002] DEBUG_LOCKS_WARN_ON(debug_atomic_read(nr_unused_locks) != nr_unused)
      [   49.902009] WARNING: CPU: 3 PID: 383 at kernel/locking/lockdep_proc.c:249 lockdep_stats_show+0x984/0xa10
      [   49.902026] Modules linked in: nls_ascii nls_cp437 vfat fat crct10dif_pclmul crc32_pclmul crc32c_intel aesni_intel aes_x86_64 crypto_simd cryptd glue_helper intel_cstate intel_uncore intel_rapl_perf intel_gtt efivars prime_numbers ahci libahci i2c_i801 video button efivarfs [last unloaded: drm_kms_helper]
      [   49.902059] CPU: 3 PID: 383 Comm: cat Tainted: G     U            4.20.0-rc2+ #304
      [   49.902068] Hardware name: Intel Corporation NUC7i5BNK/NUC7i5BNB, BIOS BNKBL357.86A.0052.2017.0918.1346 09/18/2017
      [   49.902079] RIP: 0010:lockdep_stats_show+0x984/0xa10
      [   49.902086] Code: 00 85 c0 0f 84 aa f8 ff ff 8b 05 77 37 e2 00 85 c0 0f 85 9c f8 ff ff 48 c7 c6 e0 57 bc 81 48 c7 c7 28 30 bb 81 e8 6b 77 fa ff <0f> 0b e9 82 f8 ff ff 48 c7 44 24 50 00 00 00 00 45 31 e4 31 db 31
      [   49.902103] RSP: 0018:ffffc90000247d58 EFLAGS: 00010292
      [   49.902110] RAX: 0000000000000044 RBX: 00000000000002f0 RCX: 0000000000000000
      [   49.902118] RDX: 0000000000000002 RSI: 0000000000000001 RDI: ffffffff810b3464
      [   49.902126] RBP: 0000000000000039 R08: 0000000000000002 R09: 0000000000000000
      [   49.902133] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000007ead
      [   49.902141] R13: 0000000000000001 R14: ffff88884c021000 R15: 0000000000000097
      [   49.902150] FS:  00007fb347e66540(0000) GS:ffff88885e600000(0000) knlGS:0000000000000000
      [   49.902159] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   49.902165] CR2: 00007fb347aeb000 CR3: 00000008544bd005 CR4: 00000000001606e0
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Michał Winiarski <michal.winiarski@intel.com>
      Cc: Matthew Auld <matthew.auld@intel.com>
      Reviewed-by: Matthew Auld <matthew.auld@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20181115203851.25739-1-chris@chris-wilson.co.uk
  24. 30 Oct, 2018 1 commit
    • R
      drm/i915: Prefer IS_GEN<n> check with bitmask. · 9e783375
      Rodrigo Vivi committed
      Whenever possible we should stick with IS_GEN<n> checks.
      
      The bitmask was introduced in commit ae7617f0 ("drm/i915:
      Allow optimized platform checks") for efficiency.
      
      Let's stick with it whenever possible.
      
      This patch was generated with coccinelle:
      
      spatch -sp_file is_gen.cocci *{c,h} --in-place
      
      is_gen.cocci:
      @gen2@ expression e; @@
      -INTEL_GEN(e) == 2
      +IS_GEN2(e)
      @gen3@ expression e; @@
      -INTEL_GEN(e) == 3
      +IS_GEN3(e)
      @gen4@ expression e; @@
      -INTEL_GEN(e) == 4
      +IS_GEN4(e)
      @gen5@ expression e; @@
      -INTEL_GEN(e) == 5
      +IS_GEN5(e)
      @gen6@ expression e; @@
      -INTEL_GEN(e) == 6
      +IS_GEN6(e)
      @gen7@ expression e; @@
      -INTEL_GEN(e) == 7
      +IS_GEN7(e)
      @gen8@ expression e; @@
      -INTEL_GEN(e) == 8
      +IS_GEN8(e)
      @gen9@ expression e; @@
      -INTEL_GEN(e) == 9
      +IS_GEN9(e)
      @gen10@ expression e; @@
      -INTEL_GEN(e) == 10
      +IS_GEN10(e)
      @gen11@ expression e; @@
      -INTEL_GEN(e) == 11
      +IS_GEN11(e)
      
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
      Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20181026195143.20353-1-rodrigo.vivi@intel.com
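Why a bitmask check can beat `INTEL_GEN(e) == n` is easiest to see in a small model. The sketch below assumes each device stores a mask with exactly one bit set for its generation; `struct device_info`, `GEN_BIT`, `GEN_RANGE`, `is_gen()` and `is_gen_range()` are illustrative names, not the driver's macros.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-device info: generation number plus a one-hot mask. */
struct device_info {
	uint8_t gen;
	uint32_t gen_mask; /* exactly one bit set: bit (gen - 1) */
};

#define GEN_BIT(n) (UINT32_C(1) << ((n) - 1))

/* Mask covering generations s..e inclusive. */
#define GEN_RANGE(s, e) \
	((GEN_BIT(e) | (GEN_BIT(e) - 1)) & ~(GEN_BIT(s) - 1))

static struct device_info make_info(unsigned int gen)
{
	assert(gen >= 1 && gen <= 32);
	return (struct device_info){
		.gen = (uint8_t)gen,
		.gen_mask = GEN_BIT(gen),
	};
}

/* Single-generation check: one AND against a constant. */
static bool is_gen(const struct device_info *info, unsigned int n)
{
	assert(n >= 1 && n <= 32);
	return (info->gen_mask & GEN_BIT(n)) != 0;
}

/* Range check: still one AND, instead of two integer comparisons. */
static bool is_gen_range(const struct device_info *info, unsigned int s,
			 unsigned int e)
{
	assert(s >= 1 && s <= e && e <= 32);
	return (info->gen_mask & GEN_RANGE(s, e)) != 0;
}
```

With constant arguments the masks fold at compile time, so several generation checks can also be merged into a single test by OR-ing their masks.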
  25. 18 Oct, 2018 1 commit
    • T
      drm/i915: GEM_WARN_ON considered harmful · bbb8a9d7
      Tvrtko Ursulin committed
      GEM_WARN_ON currently has dangerous semantics where it is completely
      compiled out on !GEM_DEBUG builds. This can leave users who expect it to
      be more like a WARN_ON, just without a warning in non-debug builds, in
      complete ignorance.
      
      Another gotcha with it is that it cannot be used as a statement. Which is
      again different from a standard kernel WARN_ON.
      
      This patch fixes both problems by making it behave as one would expect.
      
      It can now be used both as an expression and as statement, and also the
      condition evaluates properly in all builds - code under the conditional
      will therefore not unexpectedly disappear.
      
      To satisfy call sites which really want the code under the conditional to
      completely disappear, we add GEM_DEBUG_WARN_ON and convert some of the
      callers to it. This one can also be used as both expression and statement.
      
      From the above it follows GEM_DEBUG_WARN_ON should be used in situations
      where we are certain the condition will be hit during development, but at
      a place in code where error can be handled to the benefit of not crashing
      the machine.
      
      GEM_WARN_ON on the other hand should be used where condition may happen in
      production and we just want to distinguish the level of debugging output
      emitted between the production and debug build.
      
      v2:
       * Dropped BUG_ON hunk.
      Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Matthew Auld <matthew.auld@intel.com>
      Cc: Mika Kuoppala <mika.kuoppala@intel.com>
      Cc: Tomasz Lis <tomasz.lis@intel.com>
      Reviewed-by: Tomasz Lis <tomasz.lis@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20181012063142.16080-1-tvrtko.ursulin@linux.intel.com
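The semantics described in this message can be approximated in portable userspace C. This is a sketch under assumptions, not the kernel macros: `warn_on()` and `quiet_cond()` are hypothetical helpers, `fprintf` stands in for the kernel's warning machinery, and `GEM_DEBUG` is treated as a plain preprocessor symbol.

```c
#include <assert.h>
#include <stdio.h>

static int warn_count;

/* Debug flavour: evaluate, report, and return the truth value. */
static inline int warn_on(int cond, const char *expr)
{
	assert(expr);
	if (cond) {
		warn_count++;
		fprintf(stderr, "WARN_ON(%s)\n", expr);
	}
	return cond;
}

/* Quiet flavour: just pass the truth value through. */
static inline int quiet_cond(int cond)
{
	return cond;
}

#ifdef GEM_DEBUG
#define GEM_WARN_ON(expr)       warn_on(!!(expr), #expr)
#define GEM_DEBUG_WARN_ON(expr) warn_on(!!(expr), #expr)
#else
/* Non-debug: the condition is still evaluated, only the warning is gone. */
#define GEM_WARN_ON(expr)       quiet_cond(!!(expr))
/* Non-debug: short-circuit so the condition is never evaluated. */
#define GEM_DEBUG_WARN_ON(expr) quiet_cond(0 && (expr))
#endif
```

Both macros expand to a function call, so each works as an `if` condition and as a bare statement. In a non-debug build, code guarded by `GEM_WARN_ON` still runs when the condition holds, while code guarded by `GEM_DEBUG_WARN_ON` is unreachable and can be dropped by the compiler.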
  26. 17 Oct, 2018 1 commit
  27. 12 Oct, 2018 1 commit