1. 17 Jan, 2019 1 commit
  2. 16 Jan, 2019 1 commit
  3. 15 Jan, 2019 2 commits
  4. 03 Jan, 2019 1 commit
  5. 02 Jan, 2019 1 commit
  6. 31 Dec, 2018 1 commit
  7. 28 Dec, 2018 1 commit
  8. 18 Dec, 2018 1 commit
  9. 13 Dec, 2018 3 commits
  10. 12 Dec, 2018 1 commit
  11. 05 Dec, 2018 1 commit
    • T
      drm/i915: Introduce per-engine workarounds · 90098efa
      Committed by Tvrtko Ursulin
      We stopped re-applying the GT workarounds after engine reset since commit
      59b449d5 ("drm/i915: Split out functions for different kinds of
      workarounds").
      
      The issue with this is that some of the GT workarounds live in the MMIO
      space which gets lost during engine resets. So far the registers in the
      0x2xxx and 0xbxxx address ranges have been identified as affected.
      
      Losing the applied workarounds has obvious negative effects and can
      even lead to hard system hangs (see the linked Bugzilla).
      
      Rather than just restoring this re-application, because we have also
      observed that it is not safe to just re-write all GT workarounds after
      engine resets (GPU might be live and weird hardware states can happen),
      we introduce a new class of per-engine workarounds and move only the
      affected GT workarounds over.
      
      Using the framework introduced in the previous patch, we therefore
      re-apply, after an engine reset, only the workarounds living in the
      affected MMIO address ranges.
      
      v2:
       * Move Wa_1406609255:icl to engine workarounds as well.
       * Rename API. (Chris Wilson)
       * Drop redundant IS_KABYLAKE. (Chris Wilson)
       * Re-order engine wa/ init so latest platforms are first. (Rodrigo Vivi)
      Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Bugzilla: https://bugzilla.freedesktop.org/show_bug.cgi?id=107945
      Fixes: 59b449d5 ("drm/i915: Split out functions for different kinds of workarounds")
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Jani Nikula <jani.nikula@linux.intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
      Cc: intel-gfx@lists.freedesktop.org
      Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Link: https://patchwork.freedesktop.org/patch/msgid/20181203133341.10258-1-tvrtko.ursulin@linux.intel.com
      (cherry picked from commit 4a15c75c)
      Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
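
The split described in the commit above can be modelled in a few lines of C. This is only a sketch under stated assumptions, not the driver's code: the `wa` and `wa_list` types, the fake `mmio` array, the helper names and the register offsets are all hypothetical. Only the idea comes from the commit message: registers in the 0x2xxx and 0xbxxx ranges lose their contents on engine reset, so only the workarounds in those ranges are re-applied afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One workaround: force the bits in `mask` of register `reg` to `val`. */
struct wa {
	uint32_t reg;
	uint32_t mask;
	uint32_t val;
};

struct wa_list {
	const struct wa *list;
	size_t count;
};

/* Fake MMIO space for the simulation: 64 KiB of 32-bit registers. */
#define MMIO_BYTES 0x10000u
static uint32_t mmio[MMIO_BYTES / 4];

/* Register offsets are made up; only their address ranges matter here. */
static const struct wa gt_was[] = {
	{ 0x4090, 0x1, 0x1 },		/* survives an engine reset */
};
static const struct wa engine_was[] = {
	{ 0x20e0, 0x4, 0x4 },		/* 0x2xxx: lost on engine reset */
	{ 0xb118, 0x100, 0x100 },	/* 0xbxxx: lost on engine reset */
};

static const struct wa_list gt_wa_list = {
	gt_was, sizeof(gt_was) / sizeof(gt_was[0])
};
static const struct wa_list engine_wa_list = {
	engine_was, sizeof(engine_was) / sizeof(engine_was[0])
};

/* Read-modify-write every register named in the list. */
static void wa_list_apply(const struct wa_list *wal)
{
	for (size_t i = 0; i < wal->count; i++) {
		const struct wa *wa = &wal->list[i];

		assert(wa->reg < MMIO_BYTES);
		mmio[wa->reg / 4] = (mmio[wa->reg / 4] & ~wa->mask) | wa->val;
	}
}

/* Return true if every workaround in the list is still in place. */
static bool wa_list_verify(const struct wa_list *wal)
{
	for (size_t i = 0; i < wal->count; i++) {
		const struct wa *wa = &wal->list[i];

		if ((mmio[wa->reg / 4] & wa->mask) != wa->val)
			return false;
	}
	return true;
}

/* Model of an engine reset: only the affected ranges are wiped. */
static void simulate_engine_reset(void)
{
	for (uint32_t reg = 0; reg < MMIO_BYTES; reg += 4) {
		if ((reg & 0xf000) == 0x2000 || (reg & 0xf000) == 0xb000)
			mmio[reg / 4] = 0;
	}
}
```

In this model a reset handler would call `wa_list_apply(&engine_wa_list)` after `simulate_engine_reset()` and leave `gt_wa_list` alone, which avoids rewriting registers that were never lost.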
  12. 04 Dec, 2018 4 commits
  13. 21 Nov, 2018 1 commit
  14. 16 Nov, 2018 1 commit
    • C
      drm/i915/selftests: Workaround an issue with unused lockdep subclass · f911e723
      Committed by Chris Wilson
      lockdep insists that if we give a lock a subclass, it must be used.
      Failure to do so triggers a self-consistency check when reading
      lockdep_stats:
      
      [   49.902002] DEBUG_LOCKS_WARN_ON(debug_atomic_read(nr_unused_locks) != nr_unused)
      [   49.902009] WARNING: CPU: 3 PID: 383 at kernel/locking/lockdep_proc.c:249 lockdep_stats_show+0x984/0xa10
      [   49.902026] Modules linked in: nls_ascii nls_cp437 vfat fat crct10dif_pclmul crc32_pclmul crc32c_intel aesni_intel aes_x86_64 crypto_simd cryptd glue_helper intel_cstate intel_uncore intel_rapl_perf intel_gtt efivars prime_numbers ahci libahci i2c_i801 video button efivarfs [last unloaded: drm_kms_helper]
      [   49.902059] CPU: 3 PID: 383 Comm: cat Tainted: G     U            4.20.0-rc2+ #304
      [   49.902068] Hardware name: Intel Corporation NUC7i5BNK/NUC7i5BNB, BIOS BNKBL357.86A.0052.2017.0918.1346 09/18/2017
      [   49.902079] RIP: 0010:lockdep_stats_show+0x984/0xa10
      [   49.902086] Code: 00 85 c0 0f 84 aa f8 ff ff 8b 05 77 37 e2 00 85 c0 0f 85 9c f8 ff ff 48 c7 c6 e0 57 bc 81 48 c7 c7 28 30 bb 81 e8 6b 77 fa ff <0f> 0b e9 82 f8 ff ff 48 c7 44 24 50 00 00 00 00 45 31 e4 31 db 31
      [   49.902103] RSP: 0018:ffffc90000247d58 EFLAGS: 00010292
      [   49.902110] RAX: 0000000000000044 RBX: 00000000000002f0 RCX: 0000000000000000
      [   49.902118] RDX: 0000000000000002 RSI: 0000000000000001 RDI: ffffffff810b3464
      [   49.902126] RBP: 0000000000000039 R08: 0000000000000002 R09: 0000000000000000
      [   49.902133] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000007ead
      [   49.902141] R13: 0000000000000001 R14: ffff88884c021000 R15: 0000000000000097
      [   49.902150] FS:  00007fb347e66540(0000) GS:ffff88885e600000(0000) knlGS:0000000000000000
      [   49.902159] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   49.902165] CR2: 00007fb347aeb000 CR3: 00000008544bd005 CR4: 00000000001606e0
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Michał Winiarski <michal.winiarski@intel.com>
      Cc: Matthew Auld <matthew.auld@intel.com>
      Reviewed-by: Matthew Auld <matthew.auld@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20181115203851.25739-1-chris@chris-wilson.co.uk
  15. 30 Oct, 2018 1 commit
    • R
      drm/i915: Prefer IS_GEN<n> check with bitmask. · 9e783375
      Committed by Rodrigo Vivi
      Whenever possible we should stick with IS_GEN<n> checks.
      
      Bitmasks were introduced in commit ae7617f0 ("drm/i915:
      Allow optimized platform checks") for efficiency.
      
      Let's stick with it whenever possible.
      
      This patch was generated with coccinelle:
      
      spatch -sp_file is_gen.cocci *{c,h} --in-place
      
      is_gen.cocci:
      @gen2@ expression e; @@
      -INTEL_GEN(e) == 2
      +IS_GEN2(e)
      @gen3@ expression e; @@
      -INTEL_GEN(e) == 3
      +IS_GEN3(e)
      @gen4@ expression e; @@
      -INTEL_GEN(e) == 4
      +IS_GEN4(e)
      @gen5@ expression e; @@
      -INTEL_GEN(e) == 5
      +IS_GEN5(e)
      @gen6@ expression e; @@
      -INTEL_GEN(e) == 6
      +IS_GEN6(e)
      @gen7@ expression e; @@
      -INTEL_GEN(e) == 7
      +IS_GEN7(e)
      @gen8@ expression e; @@
      -INTEL_GEN(e) == 8
      +IS_GEN8(e)
      @gen9@ expression e; @@
      -INTEL_GEN(e) == 9
      +IS_GEN9(e)
      @gen10@ expression e; @@
      -INTEL_GEN(e) == 10
      +IS_GEN10(e)
      @gen11@ expression e; @@
      -INTEL_GEN(e) == 11
      +IS_GEN11(e)
      
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
      Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20181026195143.20353-1-rodrigo.vivi@intel.com
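
The rewrite performed by the coccinelle script above is only safe if the bitmask check and the numeric comparison always agree. The following is a minimal sketch of that equivalence, assuming a one-hot `gen_mask` of `BIT(gen - 1)`; the `device_info` struct, the `DEVICE_INFO` initializer and the generic `IS_GEN(info, n)` helper are simplified, hypothetical stand-ins rather than the driver's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BIT(n) (1u << (n))

/* Hypothetical, simplified stand-in for the driver's device info. */
struct device_info {
	uint8_t gen;		/* numeric generation, e.g. 9 */
	uint16_t gen_mask;	/* one-hot encoding, assumed BIT(gen - 1) */
};

#define DEVICE_INFO(g) { .gen = (g), .gen_mask = (uint16_t)BIT((g) - 1) }

/* Numeric form, as matched by the "-" lines of the semantic patch. */
#define INTEL_GEN(info) ((info)->gen)

/* Bitmask form: a single AND against a compile-time constant. */
#define IS_GEN(info, n) (!!((info)->gen_mask & BIT((n) - 1)))
#define IS_GEN9(info)  IS_GEN(info, 9)
#define IS_GEN11(info) IS_GEN(info, 11)

/* Check that both forms agree for every generation from 2 to 11. */
static bool gen_checks_agree(const struct device_info *info)
{
	for (unsigned int n = 2; n <= 11; n++) {
		bool by_value = INTEL_GEN(info) == n;
		bool by_mask = IS_GEN(info, n);

		if (by_value != by_mask)
			return false;
	}
	return true;
}
```

Because the mask is one-hot, exactly one `IS_GEN<n>` test can be true for a given device, which is what makes the mechanical substitution behaviour-preserving.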
  16. 18 Oct, 2018 1 commit
    • T
      drm/i915: GEM_WARN_ON considered harmful · bbb8a9d7
      Committed by Tvrtko Ursulin
      GEM_WARN_ON currently has dangerous semantics where it is completely
      compiled out on !GEM_DEBUG builds. This can leave users who expect it to
      be more like a WARN_ON, just without a warning in non-debug builds, in
      complete ignorance.
      
      Another gotcha with it is that it cannot be used as a statement, which
      is again different from a standard kernel WARN_ON.
      
      This patch fixes both problems by making it behave as one would expect.
      
      It can now be used both as an expression and as a statement, and the
      condition is evaluated properly in all builds - code under the
      conditional will therefore not unexpectedly disappear.
      
      To satisfy call sites which really want the code under the conditional to
      completely disappear, we add GEM_DEBUG_WARN_ON and convert some of the
      callers to it. This one can also be used as both an expression and a
      statement.
      
      From the above it follows that GEM_DEBUG_WARN_ON should be used in
      situations where we are certain the condition will be hit during
      development, but at a place in the code where the error can be handled
      to the benefit of not crashing the machine.
      
      GEM_WARN_ON, on the other hand, should be used where the condition may
      happen in production and we just want to distinguish the level of
      debugging output emitted between the production and debug builds.
      
      v2:
       * Dropped BUG_ON hunk.
      Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Matthew Auld <matthew.auld@intel.com>
      Cc: Mika Kuoppala <mika.kuoppala@intel.com>
      Cc: Tomasz Lis <tomasz.lis@intel.com>
      Reviewed-by: Tomasz Lis <tomasz.lis@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20181012063142.16080-1-tvrtko.ursulin@linux.intel.com
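
The semantics described in the commit above can be sketched in portable C. This is an illustration under assumptions, not the kernel's macros: `report_warn`, `quiet_check`, `warnings_emitted` and `checked_store` are hypothetical helpers, and `GEM_DEBUG` is a plain preprocessor switch standing in for the driver's debug configuration. The point it tries to capture is that `GEM_WARN_ON` always evaluates its condition, while `GEM_DEBUG_WARN_ON` compiles down to a constant false when debugging is off.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Define GEM_DEBUG as 0 to model a production (non-debug) build. */
#ifndef GEM_DEBUG
#define GEM_DEBUG 1
#endif

static int warnings_emitted;

/* Stand-in for WARN_ON: report when the condition holds, return it. */
static bool report_warn(bool cond, const char *expr, const char *file, int line)
{
	if (cond) {
		warnings_emitted++;
		fprintf(stderr, "WARN_ON(%s) at %s:%d\n", expr, file, line);
	}
	return cond;
}

/* Evaluates the condition without printing anything. */
static bool quiet_check(bool cond)
{
	return cond;
}

#if GEM_DEBUG
#define GEM_WARN_ON(expr) report_warn(!!(expr), #expr, __FILE__, __LINE__)
#define GEM_DEBUG_WARN_ON(expr) GEM_WARN_ON(expr)
#else
/* The condition is still evaluated; only the warning is dropped. */
#define GEM_WARN_ON(expr) quiet_check(!!(expr))
/* Type-checked but never evaluated; the guarded code can be compiled out. */
#define GEM_DEBUG_WARN_ON(expr) ((void)sizeof(!!(expr)), false)
#endif

/* Error handling that must survive in every build uses GEM_WARN_ON. */
static int checked_store(int *slot, int value)
{
	if (GEM_WARN_ON(value < 0))
		return -1;

	*slot = value;
	return 0;
}
```

Both macros work as an expression (inside `if`) and as a bare statement such as `GEM_DEBUG_WARN_ON(slot != 5);`, which is the property the commit message asks for.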
  17. 17 Oct, 2018 1 commit
  18. 12 Oct, 2018 1 commit
  19. 01 Oct, 2018 1 commit
  20. 26 Sep, 2018 1 commit
  21. 14 Sep, 2018 1 commit
  22. 04 Sep, 2018 2 commits
  23. 21 Aug, 2018 1 commit
  24. 15 Aug, 2018 1 commit
  25. 07 Aug, 2018 1 commit
  26. 27 Jul, 2018 1 commit
  27. 24 Jul, 2018 1 commit
  28. 14 Jul, 2018 1 commit
  29. 11 Jul, 2018 1 commit
  30. 07 Jul, 2018 1 commit
  31. 06 Jul, 2018 1 commit
  32. 05 Jul, 2018 1 commit
  33. 29 Jun, 2018 1 commit
    • C
      drm/i915/execlists: Direct submission of new requests (avoid tasklet/ksoftirqd) · 9512f985
      Committed by Chris Wilson
      Back in commit 27af5eea ("drm/i915: Move execlists irq handler to a
      bottom half"), we came to the conclusion that running our CSB processing
      and ELSP submission from inside the irq handler was a bad idea. A really
      bad idea as we could impose nearly 1s latency on other users of the
      system, on average! Deferring our work to a tasklet allowed us to do the
      processing with irqs enabled, reducing the impact to an average of about
      50us.
      
      We have since eradicated the use of forcewaked mmio from inside the CSB
      processing and ELSP submission, bringing the impact down to around 5us
      (on Kabylake); an order of magnitude better than our measurements 2
      years ago on Broadwell and only about 2x worse on average than the
      gem_syslatency on an unladen system.
      
      In this iteration of the tasklet-vs-direct submission debate, we seek a
      compromise whereby we submit new requests immediately to the HW but
      defer processing the CS interrupt onto a tasklet. We gain the advantage
      of low-latency and ksoftirqd avoidance when waking up the HW, while
      avoiding the system-wide starvation of our CS irq-storms.
      
      Comparing the impact on the maximum latency observed (that is the time
      stolen from an RT process) over a 120s interval, repeated several times
      (using gem_syslatency, similar to RT's cyclictest) while the system is
      fully laden with i915 nops, we see that direct submission can actually
      improve the worst case.
      
      Maximum latency in microseconds of a third party RT thread
      (gem_syslatency -t 120 -f 2)
        x Always using tasklets (a couple of >1000us outliers removed)
        + Only using tasklets from CS irq, direct submission of requests
      +------------------------------------------------------------------------+
      |          +                                                             |
      |          +                                                             |
      |          +                                                             |
      |          +       +                                                     |
      |          + +     +                                                     |
      |       +  + +     +  x     x     x                                      |
      |      +++ + +     +  x  x  x  x  x  x                                   |
      |      +++ + ++  + +  *x x  x  x  x  x                                   |
      |      +++ + ++  + *  *x x  *  x  x  x                                   |
      |    + +++ + ++  * * +*xxx  *  x  x  xx                                  |
      |    * +++ + ++++* *x+**xx+ *  x  x xxxx x                               |
      |   **x++++*++**+*x*x****x+ * +x xx xxxx x          x                    |
      |x* ******+***************++*+***xxxxxx* xx*x     xxx +                x+|
      |             |__________MA___________|                                  |
      |      |______M__A________|                                              |
      +------------------------------------------------------------------------+
          N           Min           Max        Median           Avg        Stddev
      x 118            91           186           124     125.28814     16.279137
      + 120            92           187           109     112.00833     13.458617
      Difference at 95.0% confidence
      	-13.2798 +/- 3.79219
      	-10.5994% +/- 3.02677%
      	(Student's t, pooled s = 14.9237)
      
      However the mean latency is adversely affected:
      
      Mean latency in microseconds of a third party RT thread
      (gem_syslatency -t 120 -f 1)
        x Always using tasklets
        + Only using tasklets from CS irq, direct submission of requests
      +------------------------------------------------------------------------+
      |           xxxxxx                                        +   ++         |
      |           xxxxxx                                        +   ++         |
      |           xxxxxx                                      + +++ ++         |
      |           xxxxxxx                                     +++++ ++         |
      |           xxxxxxx                                     +++++ ++         |
      |           xxxxxxx                                     +++++ +++        |
      |           xxxxxxx                                   + ++++++++++       |
      |           xxxxxxxx                                 ++ ++++++++++       |
      |           xxxxxxxx                                 ++ ++++++++++       |
      |          xxxxxxxxxx                                +++++++++++++++     |
      |         xxxxxxxxxxx    x                           +++++++++++++++     |
      |x       xxxxxxxxxxxxx   x           +            + ++++++++++++++++++  +|
      |           |__A__|                                                      |
      |                                                      |____A___|        |
      +------------------------------------------------------------------------+
          N           Min           Max        Median           Avg        Stddev
      x 120         3.506         3.727         3.631     3.6321417    0.02773109
      + 120         3.834         4.149         4.039     4.0375167   0.041221676
      Difference at 95.0% confidence
      	0.405375 +/- 0.00888913
      	11.1608% +/- 0.244735%
      	(Student's t, pooled s = 0.03513)
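
The "Difference at 95.0% confidence" figures in the two tables above can be reproduced from the N, Avg and Stddev columns with a pooled two-sample Student's t computation. The sketch below is a worked check rather than the tool's own code: the `sample` and `diff` types and both helpers are hypothetical, and the critical value 1.960 passed by the caller is an assumption (the large-sample value for 95% confidence) that happens to match the printed margins.

```c
#include <assert.h>

struct sample {
	int n;
	double avg;
	double stddev;
};

struct diff {
	double pooled_s;	/* pooled standard deviation */
	double delta;		/* b.avg - a.avg */
	double margin;		/* half-width of the confidence interval */
	double delta_pct;	/* delta as a percentage of a.avg */
};

/* Newton's method square root, to keep the example free of libm. */
static double sqrt_newton(double x)
{
	double r = x > 1.0 ? x : 1.0;

	assert(x > 0.0);
	for (int i = 0; i < 64; i++)
		r = 0.5 * (r + x / r);
	return r;
}

/* Pooled-variance comparison of two samples at critical value t_crit. */
static struct diff compare(struct sample a, struct sample b, double t_crit)
{
	struct diff d;
	double df = a.n + b.n - 2;
	double var = ((a.n - 1) * a.stddev * a.stddev +
		      (b.n - 1) * b.stddev * b.stddev) / df;

	d.pooled_s = sqrt_newton(var);
	d.delta = b.avg - a.avg;
	d.margin = t_crit * d.pooled_s * sqrt_newton(1.0 / a.n + 1.0 / b.n);
	d.delta_pct = 100.0 * d.delta / a.avg;
	return d;
}
```

For the maximum-latency table, feeding in `{118, 125.28814, 16.279137}` and `{120, 112.00833, 13.458617}` gives a pooled s of about 14.92, a difference of about -13.28 and a margin of about 3.79, in line with the printed values.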
      
      However, since the mean latency corresponds to the amount of irqsoff
      processing we have to do for a CS interrupt, we only need to speed that
      up to benefit not just system latency but our own throughput.
      
      v2: Remember to defer submissions when under reset.
      v4: Only use direct submission for new requests
      v5: Be aware that with mixing direct tasklet evaluation and deferred
      tasklets, we may end up idling before running the deferred tasklet.
      v6: Remove the redundant likely() from tasklet_is_enabled(), restrict the
      annotation to reset_in_progress().
      v7: Take the full timeline.lock when enabling perf_pmu stats as the
      tasklet is no longer a valid guard. A consequence is that the stats are
      now only valid for engines also using the timeline.lock to process
      state.
      
      Testcase: igt/gem_exec_latency/*rthog*
      References: 27af5eea ("drm/i915: Move execlists irq handler to a bottom half")
      Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180628201211.13837-9-chris@chris-wilson.co.uk
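
The compromise described in the commit above (submit new requests to the hardware directly, keep CS interrupt processing in a tasklet, and defer everything while a reset is in progress) can be sketched as a small state machine. Everything here is a hypothetical model: the `engine` struct, its counters and the function names are invented for illustration and do not correspond to the driver's execlists code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one engine's submission state. */
struct engine {
	bool reset_in_progress;
	bool tasklet_pending;
	int queued;		/* requests waiting to reach the hardware */
	int submitted;		/* requests written to the hardware */
	int csb_events;		/* unprocessed context-switch events */
};

/* Move every queued request to the hardware. */
static void dequeue(struct engine *e)
{
	e->submitted += e->queued;
	e->queued = 0;
}

/* New request: submit directly unless a reset forces deferral (v2). */
static void submit_request(struct engine *e)
{
	e->queued++;
	if (e->reset_in_progress)
		e->tasklet_pending = true;	/* tasklet picks it up later */
	else
		dequeue(e);			/* no tasklet or ksoftirqd hop */
}

/* CS interrupt: only record the event, the tasklet does the work. */
static void cs_interrupt(struct engine *e)
{
	e->csb_events++;
	e->tasklet_pending = true;
}

/* Deferred bottom half: process events, then submit what is queued. */
static void tasklet_run(struct engine *e)
{
	if (!e->tasklet_pending || e->reset_in_progress)
		return;

	e->tasklet_pending = false;
	e->csb_events = 0;
	dequeue(e);
}
```

In this model only the interrupt path and the reset case ever rely on the tasklet, which mirrors the trade-off the commit message measures: lower latency when waking the hardware, without handling interrupt storms in irq context.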