提交 · 5179749925933575a67f9d8f16d0cc204f98a29f · openeuler / Kernel

04 12月, 2018 4 次提交

drm/i915: Allocate a common scratch page · 51797499

由 Chris Wilson 提交于 12月 04, 2018

Currently we allocate a scratch page for each engine, but since we only
ever write into it for post-sync operations, it is not exposed to
userspace nor do we care for coherency. As we then do not care about its
contents, we can use one page for all, reducing our allocations and
avoid complications by not assuming per-engine isolation.

For later use, it simplifies engine initialisation (by removing the
allocation that required struct_mutex!) and means that we can always rely
on there being a scratch page.

v2: Check that we allocated a large enough scratch for I830 w/a

Fixes: 06e562e7f515 ("drm/i915/ringbuffer: Delay after EMIT_INVALIDATE for gen4/gen5") # v4.18.20
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108850Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181204141522.13640-1-chris@chris-wilson.co.uk
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: <stable@vger.kernel.org> # v4.18.20+

51797499

drm/i915: Fuse per-context workaround handling with the common framework · 452420d2

由 Tvrtko Ursulin 提交于 12月 03, 2018

Convert the per context workaround handling code to run against the newly
introduced common workaround framework and fuse the two to use the
existing smarter list add helper, the one which does the sorted insert and
merges registers where possible.

This completes migration of all four classes of workarounds onto the
common framework.

Existing macros are kept untouched for smaller code churn.

v2:
 * Rename to list name ctx_wa_list and move from dev_priv to engine.

v3:
 * API rename and parameters tweaking. (Chris Wilson)
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20181203133357.10341-1-tvrtko.ursulin@linux.intel.com

452420d2

drm/i915: Move register white-listing to the common workaround framework · 69bcdecf

由 Tvrtko Ursulin 提交于 12月 03, 2018

Instead of having a separate list of white-listed registers we can
trivially move this to the common workarounds framework.

This brings us one step closer to the goal of driving all workaround
classes using the same code.

v2:
 * Use GEM_DEBUG_WARN_ON for the sanity check. (Chris Wilson)

v3:
 * API rename. (Chris Wilson)
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20181203125014.3219-6-tvrtko.ursulin@linux.intel.com

69bcdecf

drm/i915: Introduce per-engine workarounds · 4a15c75c

由 Tvrtko Ursulin 提交于 12月 03, 2018

We stopped re-applying the GT workarounds after engine reset since commit
59b449d5 ("drm/i915: Split out functions for different kinds of
workarounds").

Issue with this is that some of the GT workarounds live in the MMIO space
which gets lost during engine resets. So far the registers in 0x2xxx and
0xbxxx address range have been identified to be affected.

This losing of applied workarounds has obvious negative effects and can
even lead to hard system hangs (see the linked Bugzilla).

Rather than just restoring this re-application, because we have also
observed that it is not safe to just re-write all GT workarounds after
engine resets (GPU might be live and weird hardware states can happen),
we introduce a new class of per-engine workarounds and move only the
affected GT workarounds over.

Using the framework introduced in the previous patch, we therefore after
engine reset, re-apply only the workarounds living in the affected MMIO
address ranges.

v2:
 * Move Wa_1406609255:icl to engine workarounds as well.
 * Rename API. (Chris Wilson)
 * Drop redundant IS_KABYLAKE. (Chris Wilson)
 * Re-order engine wa/ init so latest platforms are first. (Rodrigo Vivi)
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Bugzilla: https://bugzilla.freedesktop.org/show_bug.cgi?id=107945
Fixes: 59b449d5 ("drm/i915: Split out functions for different kinds of workarounds")
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: intel-gfx@lists.freedesktop.org
Acked-by: NRodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20181203133341.10258-1-tvrtko.ursulin@linux.intel.com

4a15c75c

21 11月, 2018 1 次提交

drm/i915: Show waiter's status on engine dump · aa6a65da

由 Chris Wilson 提交于 11月 21, 2018

When showing the list of waiters, include the task's status so that we
can tell if they have been woken up and are waiting for the CPU, or if
they are still waiting to be woken.

v2: task_state_to_char()
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181121151653.24595-1-chris@chris-wilson.co.uk

aa6a65da

16 11月, 2018 1 次提交

drm/i915/selftests: Workaround an issue with unused lockdep subclass · f911e723

由 Chris Wilson 提交于 11月 15, 2018

lockdep insists that if we give a lock a subclass, it must be used.
Failure to do so triggers a self-consistency check when reading
lockdep_stats:

[   49.902002] DEBUG_LOCKS_WARN_ON(debug_atomic_read(nr_unused_locks) != nr_unused)
[   49.902009] WARNING: CPU: 3 PID: 383 at kernel/locking/lockdep_proc.c:249 lockdep_stats_show+0x984/0xa10
[   49.902026] Modules linked in: nls_ascii nls_cp437 vfat fat crct10dif_pclmul crc32_pclmul crc32c_intel aesni_intel aes_x86_64 crypto_simd cryptd glue_helper intel_cstate intel_uncore intel_rapl_perf intel_gtt efivars prime_numbers ahci libahci i2c_i801 video button efivarfs [last unloaded: drm_kms_helper]
[   49.902059] CPU: 3 PID: 383 Comm: cat Tainted: G     U            4.20.0-rc2+ #304
[   49.902068] Hardware name: Intel Corporation NUC7i5BNK/NUC7i5BNB, BIOS BNKBL357.86A.0052.2017.0918.1346 09/18/2017
[   49.902079] RIP: 0010:lockdep_stats_show+0x984/0xa10
[   49.902086] Code: 00 85 c0 0f 84 aa f8 ff ff 8b 05 77 37 e2 00 85 c0 0f 85 9c f8 ff ff 48 c7 c6 e0 57 bc 81 48 c7 c7 28 30 bb 81 e8 6b 77 fa ff <0f> 0b e9 82 f8 ff ff 48 c7 44 24 50 00 00 00 00 45 31 e4 31 db 31
[   49.902103] RSP: 0018:ffffc90000247d58 EFLAGS: 00010292
[   49.902110] RAX: 0000000000000044 RBX: 00000000000002f0 RCX: 0000000000000000
[   49.902118] RDX: 0000000000000002 RSI: 0000000000000001 RDI: ffffffff810b3464
[   49.902126] RBP: 0000000000000039 R08: 0000000000000002 R09: 0000000000000000
[   49.902133] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000007ead
[   49.902141] R13: 0000000000000001 R14: ffff88884c021000 R15: 0000000000000097
[   49.902150] FS:  00007fb347e66540(0000) GS:ffff88885e600000(0000) knlGS:0000000000000000
[   49.902159] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   49.902165] CR2: 00007fb347aeb000 CR3: 00000008544bd005 CR4: 00000000001606e0
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: NMatthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181115203851.25739-1-chris@chris-wilson.co.uk

f911e723

30 10月, 2018 1 次提交

drm/i915: Prefer IS_GEN<n> check with bitmask. · 9e783375

由 Rodrigo Vivi 提交于 10月 26, 2018

Whenever possible we should stick with IS_GEN<n> checks.

Bitmaks has been introduced on commit ae7617f0 ("drm/i915:
Allow optimized platform checks") for efficiency.

Let's stick with it whenever possible.

This patch was generated with coccinelle:

spatch -sp_file is_gen.cocci *{c,h} --in-place

is_gen.cocci:
@gen2@ expression e; @@
-INTEL_GEN(e) == 2
+IS_GEN2(e)
@gen3@ expression e; @@
-INTEL_GEN(e) == 3
+IS_GEN3(e)
@gen4@ expression e; @@
-INTEL_GEN(e) == 4
+IS_GEN4(e)
@gen5@ expression e; @@
-INTEL_GEN(e) == 5
+IS_GEN5(e)
@gen6@ expression e; @@
-INTEL_GEN(e) == 6
+IS_GEN6(e)
@gen7@ expression e; @@
-INTEL_GEN(e) == 7
+IS_GEN7(e)
@gen8@ expression e; @@
-INTEL_GEN(e) == 8
+IS_GEN8(e)
@gen9@ expression e; @@
-INTEL_GEN(e) == 9
+IS_GEN9(e)
@gen10@ expression e; @@
-INTEL_GEN(e) == 10
+IS_GEN10(e)
@gen11@ expression e; @@
-INTEL_GEN(e) == 11
+IS_GEN11(e)

Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: NRodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181026195143.20353-1-rodrigo.vivi@intel.com

9e783375

18 10月, 2018 1 次提交

drm/i915: GEM_WARN_ON considered harmful · bbb8a9d7

由 Tvrtko Ursulin 提交于 10月 12, 2018

GEM_WARN_ON currently has dangerous semantics where it is completely
compiled out on !GEM_DEBUG builds. This can leave users who expect it to
be more like a WARN_ON, just without a warning in non-debug builds, in
complete ignorance.

Another gotcha with it is that it cannot be used as a statement. Which is
again different from a standard kernel WARN_ON.

This patch fixes both problems by making it behave as one would expect.

It can now be used both as an expression and as statement, and also the
condition evaluates properly in all builds - code under the conditional
will therefore not unexpectedly disappear.

To satisfy call sites which really want the code under the conditional to
completely disappear, we add GEM_DEBUG_WARN_ON and convert some of the
callers to it. This one can also be used as both expression and statement.

>From the above it follows GEM_DEBUG_WARN_ON should be used in situations
where we are certain the condition will be hit during development, but at
a place in code where error can be handled to the benefit of not crashing
the machine.

GEM_WARN_ON on the other hand should be used where condition may happen in
production and we just want to distinguish the level of debugging output
emitted between the production and debug build.

v2:
 * Dropped BUG_ON hunk.
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Tomasz Lis <tomasz.lis@intel.com>
Reviewed-by: NTomasz Lis <tomasz.lis@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181012063142.16080-1-tvrtko.ursulin@linux.intel.com

bbb8a9d7

17 10月, 2018 1 次提交

drm/i915: Ensure intel_engine_init_execlist() builds with Clang · 410ed573

由 Jani Nikula 提交于 10月 16, 2018

Clang build with UBSAN enabled leads to the following build error:

drivers/gpu/drm/i915/intel_engine_cs.o: In function `intel_engine_init_execlist':
drivers/gpu/drm/i915/intel_engine_cs.c:411: undefined reference to `__compiletime_assert_411'

Again, for this to work the code would first need to be inlined and then
constant folded, which doesn't work for Clang because semantic analysis
happens before optimization/inlining.

Use GEM_BUG_ON() instead of BUILD_BUG_ON().

v2: Use is_power_of_2() from log2.h (Chris)

References: http://mid.mail-archive.com/20181015203410.155997-1-swboyd@chromium.orgReported-by: NStephen Boyd <swboyd@chromium.org>
Cc: Stephen Boyd <swboyd@chromium.org>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: NNathan Chancellor <natechancellor@gmail.com>
Tested-by: NStephen Boyd <swboyd@chromium.org>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NNick Desaulniers <ndesaulniers@google.com>
Signed-off-by: NJani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181016122938.18757-2-jani.nikula@intel.com

410ed573

12 10月, 2018 1 次提交

drm/i915: Inject load failure inside intel_engines_init_mmio · 645ff9e3

由 Michal Wajdeczko 提交于 10月 11, 2018

We need extra load failure point to better test error path in
i915_driver_init_mmio.
Suggested-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NMichal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20181011130008.24640-2-michal.wajdeczko@intel.com

645ff9e3

01 10月, 2018 1 次提交

drm/i915: Combine multiple internal plists into the same i915_priolist bucket · 85f5e1f3

由 Chris Wilson 提交于 10月 01, 2018

As we are about to allow ourselves to slightly bump the user priority
into a few different sublevels, packthose internal priority lists
into the same i915_priolist to keep the rbtree compact and avoid having
to allocate the default user priority even after the internal bumping.
The downside to having an requests[] rather than a node per active list,
is that we then have to walk over the empty higher priority lists. To
compensate, we track the active buckets and use a small bitmap to skip
over any inactive ones.

v2: Use MASK of internal levels to simplify our usage.
v3: Prevent overflow when SHIFT is zero.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181001123204.23982-4-chris@chris-wilson.co.uk

85f5e1f3

26 9月, 2018 1 次提交

drm/i915: Convert to BITS_PER_TYPE · 74f6e183

由 Chris Wilson 提交于 9月 26, 2018

In commit 9144d75e ("include/linux/bitops.h: introduce BITS_PER_TYPE"),
we made BITS_PER_TYPE available to all and now we can use the macro to
replace some open-coded computation of sizeof(T) * BITS_PER_BYTE.
Suggested-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NJani Nikula <jani.nikula@intel.com>
Reviewed-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180926104707.17410-1-chris@chris-wilson.co.uk

74f6e183

14 9月, 2018 1 次提交

drm/i915: Flush the tasklet when checking for idle · 22495b68

由 Chris Wilson 提交于 9月 14, 2018

In order to reduce latency when checking for idle we kick the tasklet
directly. Sometimes this is not enough as it is queued on another cpu
and so to improve the accuracy of this idle-check (and so to reduce
latency overall by avoiding another pass, or worse declaring a timeout!)
wait for the tasklet to complete.

References: https://bugs.freedesktop.org/show_bug.cgi?id=107916Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180914080017.30308-2-chris@chris-wilson.co.uk

22495b68

04 9月, 2018 2 次提交

drm/i915: Use a cached mapping for the physical HWS · d6acae36

由 Chris Wilson 提交于 9月 03, 2018

Older gen use a physical address for the hardware status page, for which
we use cache-coherent writes. As the writes are into the cpu cache, we use
a normal WB mapped page to read the HWS, used for our seqno tracking.

Anecdotally, I observed lost breadcrumbs writes into the HWS on i965gm,
which so far have not reoccurred with this patch. How reliable that
evidence is remains to be seen.

v2: Explicitly pass the expected physical address to the hw
v3: Also remember the wild writes we once had for HWS above 4G.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: NDaniel Vetter <daniel@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20180903152304.31589-2-chris@chris-wilson.co.uk

d6acae36

drm/i915: Combine cleanup_status_page() · a0e731f4

由 Chris Wilson 提交于 9月 03, 2018

Pull the physical status page cleanup into a common
cleanup_status_page() for caller simplicity.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NMatthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180903152304.31589-1-chris@chris-wilson.co.uk

a0e731f4

21 8月, 2018 1 次提交

drm/i915: Correct CSB probing for engine state dumper · df4f94e8

由 Chris Wilson 提交于 8月 21, 2018

Since we no longer maintain our read position in the CSB pointers
register, it always returns 0 and not where we last read up to. As a
result the CSB probing in the state dumper starts from 0, either missing
entries or showing stale one.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180821101138.15822-1-chris@chris-wilson.co.uk

df4f94e8

15 8月, 2018 1 次提交

drm/i915: Clear stop-engine for a pardoned reset · a99b32a6

由 Chris Wilson 提交于 8月 14, 2018

If we pardon a per-engine reset, we may leave the STOP_RING bit asserted
in RING_MI_MODE resulting in the engine hanging. Unconditionally clear
it on the per-engine exit path as we know that either we skipped the
reset and so need the cancellation, or the reset was successful and the
cancellation is a no-op, or there was an error and we will follow up
with a full-reset or wedging (both of which will stop the engines again
as required).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107188
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106560Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180814171857.24673-1-chris@chris-wilson.co.uk

a99b32a6

07 8月, 2018 1 次提交

drm/i915: Pull seqno started checks together · 97f06158

由 Chris Wilson 提交于 8月 06, 2018

We have a few instances of checking seqno-1 to see if the HW has started
the request. Pull those together under a helper.

v2: Pull the !seqno assertion higher, as given seqno==1 we may indeed
check to see if we have started using seqno==0.
Suggested-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180806112605.20725-1-chris@chris-wilson.co.uk

97f06158

27 7月, 2018 1 次提交

drm/i915: Eliminate use of PAGE_SIZE as a virtual alignment · 7a859c65

由 Chris Wilson 提交于 7月 27, 2018

Using PAGE_SIZE for virtual offset alignment is superfluous as it is
equal to the minimum gtt alignment and so equivalent to 0. It is also
the wrong value to use as we stopped using physical page constructs for
the virtual GTT, i.e. it would be preferrable to use I915_GTT_PAGE_SIZE
and in these cases merely imply I915_GTT_MIN_ALIGNMENT.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NMatthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180727091855.1879-1-chris@chris-wilson.co.uk

7a859c65

24 7月, 2018 1 次提交

drm/i915: Pull unpin map into vma release · 6a2f59e4

由 Chris Wilson 提交于 7月 21, 2018

A reasonably common operation is to pin the map of the vma alongside the
vma itself for the lifetime of the vma, and so release both pins at the
same time as destroying the vma. It is common enough to pull into the
release function, making that central function more attractive to a
couple of other callsites.

The continual ulterior motive is to sweep over errors on module load
aborting...

Testcase: igt/drv_module_reload/basic-reload-inject
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: NMichał Winiarski <michal.winiarski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180721125037.20127-1-chris@chris-wilson.co.uk

6a2f59e4

14 7月, 2018 1 次提交

drm/i915: Do not short-circuit tasklets during reset · 9701975e

由 Chris Wilson 提交于 7月 13, 2018

Inside intel_engine_is_idle(), we flush the tasklet to ensure that is
being run in a timely fashion (ksoftirqd has taught us to expect the
worst). However, if we are in the middle of reset, the HW may not yet be
ready to execute the submission tasklet and so we must respect the
disable flag.

Fixes: dd0cf235 ("drm/i915: Speed up idle detection by kicking the tasklets")
Testcase: igt/drv_selftest/live_hangcheck
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: NMichel Thierry <michel.thierry@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180713203529.1973-2-chris@chris-wilson.co.uk

9701975e

11 7月, 2018 1 次提交

drm/i915/execlists: Switch to rb_root_cached · 655250a8

由 Chris Wilson 提交于 6月 29, 2018

The kernel recently gained an augmented rbtree with the purpose of
cacheing the leftmost element of the rbtree, a frequent optimisation to
avoid calls to rb_first() which is also employed by the
execlists->queue. Switch from our open-coded cache to the library.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180629075348.27358-9-chris@chris-wilson.co.uk

655250a8

07 7月, 2018 1 次提交

drm/i915: Replace nested subclassing with explicit subclasses · 890fd185

由 Chris Wilson 提交于 7月 06, 2018

In the next patch, we will want a third distinct class of timeline that
may overlap with the current pair of client and engine timeline classes.
Rather than use the ad hoc markup of SINGLE_DEPTH_NESTING, initialise
the different timeline classes with an explicit subclass.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180706210710.16251-1-chris@chris-wilson.co.uk

890fd185

06 7月, 2018 1 次提交

drm/i915: Record logical context support in driver caps · 481827b4

由 Chris Wilson 提交于 7月 06, 2018

Avoid looking at the magical engines[RCS] to decide if the HW and driver
supports logical contexts, and instead record that knowledge during
initialisation.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: NMatthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180706101442.21279-1-chris@chris-wilson.co.uk

481827b4

05 7月, 2018 1 次提交

drm/i915: Mark expected switch fall-throughs · f0d759f0

由 Gustavo A. R. Silva 提交于 6月 28, 2018

In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

Addresses-Coverity-ID: 141432
Addresses-Coverity-ID: 141433
Addresses-Coverity-ID: 141434
Addresses-Coverity-ID: 141435
Addresses-Coverity-ID: 141436
Addresses-Coverity-ID: 1357360
Addresses-Coverity-ID: 1357403
Addresses-Coverity-ID: 1357433
Addresses-Coverity-ID: 1392622
Addresses-Coverity-ID: 1415273
Addresses-Coverity-ID: 1435752
Addresses-Coverity-ID: 1441500
Addresses-Coverity-ID: 1454596
Signed-off-by: NGustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: NJani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180628223541.GA17665@embeddedor.com

f0d759f0

29 6月, 2018 3 次提交

drm/i915/execlists: Direct submission of new requests (avoid tasklet/ksoftirqd) · 9512f985

由 Chris Wilson 提交于 6月 28, 2018

Back in commit 27af5eea ("drm/i915: Move execlists irq handler to a
bottom half"), we came to the conclusion that running our CSB processing
and ELSP submission from inside the irq handler was a bad idea. A really
bad idea as we could impose nearly 1s latency on other users of the
system, on average! Deferring our work to a tasklet allowed us to do the
processing with irqs enabled, reducing the impact to an average of about
50us.

We have since eradicated the use of forcewaked mmio from inside the CSB
processing and ELSP submission, bringing the impact down to around 5us
(on Kabylake); an order of magnitude better than our measurements 2
years ago on Broadwell and only about 2x worse on average than the
gem_syslatency on an unladen system.

In this iteration of the tasklet-vs-direct submission debate, we seek a
compromise where by we submit new requests immediately to the HW but
defer processing the CS interrupt onto a tasklet. We gain the advantage
of low-latency and ksoftirqd avoidance when waking up the HW, while
avoiding the system-wide starvation of our CS irq-storms.

Comparing the impact on the maximum latency observed (that is the time
stolen from an RT process) over a 120s interval, repeated several times
(using gem_syslatency, similar to RT's cyclictest) while the system is
fully laden with i915 nops, we see that direct submission an actually
improve the worse case.

Maximum latency in microseconds of a third party RT thread
(gem_syslatency -t 120 -f 2)
  x Always using tasklets (a couple of >1000us outliers removed)
  + Only using tasklets from CS irq, direct submission of requests
+------------------------------------------------------------------------+
|          +                                                             |
|          +                                                             |
|          +                                                             |
|          +       +                                                     |
|          + +     +                                                     |
|       +  + +     +  x     x     x                                      |
|      +++ + +     +  x  x  x  x  x  x                                   |
|      +++ + ++  + +  *x x  x  x  x  x                                   |
|      +++ + ++  + *  *x x  *  x  x  x                                   |
|    + +++ + ++  * * +*xxx  *  x  x  xx                                  |
|    * +++ + ++++* *x+**xx+ *  x  x xxxx x                               |
|   **x++++*++**+*x*x****x+ * +x xx xxxx x          x                    |
|x* ******+***************++*+***xxxxxx* xx*x     xxx +                x+|
|             |__________MA___________|                                  |
|      |______M__A________|                                              |
+------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x 118            91           186           124     125.28814     16.279137
+ 120            92           187           109     112.00833     13.458617
Difference at 95.0% confidence
	-13.2798 +/- 3.79219
	-10.5994% +/- 3.02677%
	(Student's t, pooled s = 14.9237)

However the mean latency is adversely affected:

Mean latency in microseconds of a third party RT thread
(gem_syslatency -t 120 -f 1)
  x Always using tasklets
  + Only using tasklets from CS irq, direct submission of requests
+------------------------------------------------------------------------+
|           xxxxxx                                        +   ++         |
|           xxxxxx                                        +   ++         |
|           xxxxxx                                      + +++ ++         |
|           xxxxxxx                                     +++++ ++         |
|           xxxxxxx                                     +++++ ++         |
|           xxxxxxx                                     +++++ +++        |
|           xxxxxxx                                   + ++++++++++       |
|           xxxxxxxx                                 ++ ++++++++++       |
|           xxxxxxxx                                 ++ ++++++++++       |
|          xxxxxxxxxx                                +++++++++++++++     |
|         xxxxxxxxxxx    x                           +++++++++++++++     |
|x       xxxxxxxxxxxxx   x           +            + ++++++++++++++++++  +|
|           |__A__|                                                      |
|                                                      |____A___|        |
+------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x 120         3.506         3.727         3.631     3.6321417    0.02773109
+ 120         3.834         4.149         4.039     4.0375167   0.041221676
Difference at 95.0% confidence
	0.405375 +/- 0.00888913
	11.1608% +/- 0.244735%
	(Student's t, pooled s = 0.03513)

However, since the mean latency corresponds to the amount of irqsoff
processing we have to do for a CS interrupt, we only need to speed that
up to benefit not just system latency but our own throughput.

v2: Remember to defer submissions when under reset.
v4: Only use direct submission for new requests
v5: Be aware that with mixing direct tasklet evaluation and deferred
tasklets, we may end up idling before running the deferred tasklet.
v6: Remove the redudant likely() from tasklet_is_enabled(), restrict the
annotation to reset_in_progress().
v7: Take the full timeline.lock when enabling perf_pmu stats as the
tasklet is no longer a valid guard. A consequence is that the stats are
now only valid for engines also using the timeline.lock to process
state.

Testcase: igt/gem_exec_latency/*rthog*
References: 27af5eea ("drm/i915: Move execlists irq handler to a bottom half")
Suggested-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180628201211.13837-9-chris@chris-wilson.co.uk

9512f985

drm/i915/execlists: Trust the CSB · fd8526e5

由 Chris Wilson 提交于 6月 28, 2018

Now that we use the CSB stored in the CPU friendly HWSP, we do not need
to track interrupts for when the mmio CSB registers are valid and can
just check where we read up to last from the cached HWSP. This means we
can forgo the atomic bit tracking from interrupt, and in the next patch
it means we can check the CSB at any time.

v2: Change the splitting inside reset_prepare, we only want to lose
testing the interrupt in this patch, the next patch requires the change
in locking
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180628201211.13837-8-chris@chris-wilson.co.uk

fd8526e5

drm/i915/execlists: Unify CSB access pointers · bc4237ec

由 Chris Wilson 提交于 6月 28, 2018

Following the removal of the last workarounds, the only CSB mmio access
is for the old vGPU interface. The mmio registers presented by vGPU do
not require forcewake and can be treated as ordinary volatile memory,
i.e. they behave just like the HWSP access just at a different location.
We can reduce the CSB access to a set of read/write/buffer pointers and
treat the various paths identically and not worry about forcewake.
(Forcewake is nightmare for worstcase latency, and we want to process
this all with irqsoff -- no latency allowed!)

v2: Comments, comments, comments. Well, 2 bonus comments.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180628201211.13837-5-chris@chris-wilson.co.uk

bc4237ec

21 6月, 2018 1 次提交

drm/i915: Disable bh around call to tasklet · 26eb4cd6

由 Chris Wilson 提交于 6月 20, 2018

The guc submission backends expects to only be run from (at least)
softirq context, but during our intel_engine_is_idle() check we would
call into the tasklet to make sure it was flushed. As this could occur
from process context, occasionally we would be caught out using a
wait_for_atomic() not from an atomic context:

[   59.939091] WARN_ON_ONCE((1) && !(preempt_count() != 0))
[   59.939142] WARNING: CPU: 1 PID: 2901 at drivers/gpu/drm/i915/intel_guc_submission.c:615 guc_submission_tasklet+0x784/0xa90 [i915]
[   59.939143] Modules linked in: vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic i915 x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul snd_hda_intel crc32_pclmul snd_hda_codec ghash_clmulni_intel snd_hwdep snd_hda_core e1000e snd_pcm mei_me mei prime_numbers
[   59.939164] CPU: 1 PID: 2901 Comm: gem_exec_schedu Tainted: G     U  W         4.18.0-rc1-g93475d62c730-drmtip_67+ #1
[   59.939165] Hardware name: System manufacturer System Product Name/Z170M-PLUS, BIOS 3610 03/29/2018
[   59.939188] RIP: 0010:guc_submission_tasklet+0x784/0xa90 [i915]
[   59.939189] Code: fc ff ff 80 3d 2f 87 11 00 00 0f 85 80 fb ff ff 48 c7 c6 f8 49 40 c0 48 c7 c7 80 41 3e c0 c6 05 14 87 11 00 01 e8 2c ea d6 d3 <0f> 0b e9 5f fb ff ff 8b 46 38 89 cf 31 c7 83 e7 c0 75 08 39 c1 0f
[   59.939253] RSP: 0018:ffffaafe08a03c10 EFLAGS: 00010286
[   59.939255] RAX: 0000000000000000 RBX: ffff8f9112c246f0 RCX: 0000000000000001
[   59.939256] RDX: 0000000080000001 RSI: ffffffff95086d8e RDI: 00000000ffffffff
[   59.939257] RBP: ffff8f9112c24680 R08: 000000009517be77 R09: 0000000000000000
[   59.939258] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8f9112c24700
[   59.939259] R13: ffff8f9112c24700 R14: 0000000000000000 R15: ffff8f9112c242a8
[   59.939260] FS:  00007fc2cc7e5980(0000) GS:ffff8f9136c40000(0000) knlGS:0000000000000000
[   59.939261] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   59.939262] CR2: 00007fc2cc815040 CR3: 000000021f10e003 CR4: 00000000003606e0
[   59.939263] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   59.939264] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   59.939265] Call Trace:
[   59.939288]  ? intel_engine_is_idle+0x64/0x160 [i915]
[   59.939323]  ? intel_engine_dump+0x638/0x890 [i915]
[   59.939327]  ? seq_printf+0x49/0x70
[   59.939353]  ? i915_engine_info+0xc8/0x100 [i915]
[   59.939356]  ? drm_get_color_range_name+0x20/0x20
[   59.939361]  ? seq_read+0xf1/0x470
[   59.939365]  ? trace_hardirqs_on_caller+0xe0/0x1b0
[   59.939370]  ? full_proxy_read+0x51/0x80
[   59.939389]  ? __vfs_read+0x31/0x170
[   59.939395]  ? do_sys_open+0x13b/0x240
[   59.939398]  ? rcu_read_lock_sched_held+0x6f/0x80
[   59.939401]  ? vfs_read+0x9e/0x140
[   59.939404]  ? ksys_read+0x50/0xc0
[   59.939409]  ? do_syscall_64+0x55/0x190
[   59.939412]  ? entry_SYSCALL_64_after_hwframe+0x49/0xbe
[   59.939420] irq event stamp: 552834
[   59.939422] hardirqs last  enabled at (552833): [<ffffffff940fc74c>] console_unlock+0x3fc/0x600
[   59.939425] hardirqs last disabled at (552834): [<ffffffff94a0111c>] error_entry+0x7c/0x100
[   59.939451] softirqs last  enabled at (552614): [<ffffffffc02e0f53>] i915_request_add+0x2e3/0x7b0 [i915]
[   59.939470] softirqs last disabled at (552604): [<ffffffffc02e0ecb>] i915_request_add+0x25b/0x7b0 [i915]

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106977
Fixes: dd0cf235 ("drm/i915: Speed up idle detection by kicking the tasklets")
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: NMichel Thierry <michel.thierry@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180620135929.23956-1-chris@chris-wilson.co.uk

26eb4cd6

18 6月, 2018 1 次提交

drm/i915: Fix fallout of fake reset along resume · 4fdd5b4e

由 Chris Wilson 提交于 6月 16, 2018

commit b2209e62 ("drm/i915/execlists: Reset the CSB head tracking on
reset/sanitization") and commit 1288786b ("drm/i915: Move GEM sanitize
from resume_early to resume") show the conflicting requirements on the
code. We must reset the GPU before trashing live state on a fast resume
(hibernation debug, or error paths), but we must only reset our state
tracking iff the GPU is reset (or power cycled). This is tricky if we
are disabling GPU reset to simulate broken hardware; we reset our state
tracking but the GPU is left intact and recovers from its stale state.

v2: Again without the assertion for forcewake, no longer required since
commit b3ee09a4 ("drm/i915/ringbuffer: Fix context restore upon reset")
as the contexts are reset from the CS ensuring everything is powered up.

Fixes: b2209e62 ("drm/i915/execlists: Reset the CSB head tracking on reset/sanitization")
Fixes: 1288786b ("drm/i915: Move GEM sanitize from resume_early to resume")
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180616202534.18767-1-chris@chris-wilson.co.uk

4fdd5b4e

14 6月, 2018 3 次提交

drm/i915: Show CCID in engine dumps · e62230de

由 Chris Wilson 提交于 6月 14, 2018

For debugging context issues, knowing what context the GPU is
loading/using is helpful.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180614094103.18025-4-chris@chris-wilson.co.uk

e62230de

drm/i915: Make the hexdump row offset visually distinct · 286e6153

由 Chris Wilson 提交于 6月 14, 2018

Currently we use %08x for the row offset, and %08x for the binary
contents of the buffer. This makes it very easily to confuse the two, so
switch to using [%04x] for the start-of-row offset.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180614094103.18025-3-chris@chris-wilson.co.uk

286e6153

drm/i915: Dump the ringbuffer of the active request for debugging · 83c31783

由 Chris Wilson 提交于 6月 14, 2018

Sometimes we need to see what instructions we emitted for a request to
try and gather a glimmer of insight into what the GPU is doing when it
stops responding.

v2: Move ring dumping into its own routine
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180614122150.17552-1-chris@chris-wilson.co.ukReviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>

83c31783

12 6月, 2018 1 次提交

drm/i915/ringbuffer: Serialize load of PD_DIR · d9d117e4

由 Chris Wilson 提交于 6月 11, 2018

After triggering the mm switch with a load of PD_DIR, which may be
deferred unto the MI_SET_CONTEXT on rcs, serialise the next commands
with that load by posting a read of PD_DIR (or else those subsequent
commands may access the stale page tables).
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180611171825.13678-2-chris@chris-wilson.co.uk

d9d117e4

11 6月, 2018 1 次提交

drm/i915/ringbuffer: Fix context restore upon reset · b3ee09a4

由 Chris Wilson 提交于 6月 11, 2018

The discovery with trying to enable full-ppgtt was that we were
completely failing to the load both the mm and context following the
reset. Although we were performing mmio to set the PP_DIR (per-process
GTT) and CCID (context), these were taking no effect (the assumption was
that this would trigger reload of the context and restore the page
tables). It was not until we performed the LRI + MI_SET_CONTEXT in a
following context switch would anything occur.

Since we are then required to reset the context image and PP_DIR using
CS commands, we place those commands into every batch. The hardware
should recognise the no-ops and eliminate the expensive context loads,
but we still have to pay the cost of using cross-powerwell register
writes. In practice, this has no effect on actual context switch times,
and only adds a few hundred nanoseconds to no-op switches. We can improve
the latter by eliminating the w/a around known no-op switches, but there
is an ulterior motive to keeping them.

Always emitting the context switch at the beginning of the request (and
relying on HW to skip unneeded switches) does have one key advantage.
Should we implement request reordering on Haswell, we will not know in
advance what the previous executing context was on the GPU and so we
would not be able to elide the MI_SET_CONTEXT commands ourselves and
always have to emit them. Having our hand forced now actually prepares
us for later.

Now since that context and mm follow the request, we no longer (and not
for a long time since requests took over!) require a trace point to tell
when we write the switch into the ring, since it is always. (This is
even more important when you remember that simply writing into the ring
bears no relation to the current mm.)

v2: Sandybridge has to agree to use LRI as well.

Testcase: igt/drv_selftests/live_hangcheck
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180611110845.31890-1-chris@chris-wilson.co.uk

b3ee09a4

06 6月, 2018 1 次提交

drm/i915/gtt: Rename i915_hw_ppgtt base member · 82ad6443

由 Chris Wilson 提交于 6月 05, 2018

In the near future, I want to subclass gen6_hw_ppgtt as it contains a
few specialised members and I wish to add more. To avoid the ugliness of
using ppgtt->base.base, rename the i915_hw_ppgtt base member
(i915_address_space) as vm, which is our common shorthand for an
i915_address_space local.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180605153758.18422-1-chris@chris-wilson.co.uk

82ad6443

31 5月, 2018 1 次提交

drm/i915: Nul-terminate legacy debug string · dd166fbd

由 Chris Wilson 提交于 5月 17, 2018

Make sure that when we don't have any scheduler attributes for the
request, the string is terminated.

Fixes: 247870ac ("drm/i915: Build request info on stack before printk")
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180517152824.11619-1-chris@chris-wilson.co.uk
(cherry picked from commit 96d4f03c)
Signed-off-by: NJani Nikula <jani.nikula@intel.com>

dd166fbd

24 5月, 2018 2 次提交

drm/i915/icl: Enable WaProgramMgsrForCorrectSliceSpecificMmioReads · d78fa508

由 Yunwei Zhang 提交于 5月 18, 2018

WaProgramMgsrForCorrectSliceSpecificMmioReads applies for Icelake as
well.

References: HSD#1405586840, BSID#0575

v2:
 - GEN11 mask is different from its predecessors. (Oscar)
 - Better separate GEN10 and GEN11. (Oscar)

Cc: Oscar Mateo <oscar.mateo@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Signed-off-by: NYunwei Zhang <yunwei.zhang@intel.com>
Reviewed-by: NOscar Mateo <oscar.mateo@intel.com>
Signed-off-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1526683232-24753-1-git-send-email-yunwei.zhang@intel.com

d78fa508

drm/i915/cnl: Implement WaProgramMgsrForCorrectSliceSpecificMmioReads · 1e40d4ae

由 Yunwei Zhang 提交于 5月 18, 2018

WaProgramMgsrForCorrectSliceSpecificMmioReads dictate that before any MMIO
read into Slice/Subslice specific registers, MCR packet control
register(0xFDC) needs to be programmed to point to any enabled
slice/subslice pair. Otherwise, incorrect value will be returned.

However, that means each subsequent MMIO read will be forwarded to a
specific slice/subslice combination as read is unicast. This is OK since
slice/subslice specific register values are consistent in almost all cases
across slice/subslice. There are rare occasions such as INSTDONE that this
value will be dependent on slice/subslice combo, in such cases, we need to
program 0xFDC and recover this after. This is already covered by
read_subslice_reg.

Also, 0xFDC will lose its information after TDR/engine reset/power state
change.

References: HSD#1405586840, BSID#0575

v2:
 - use fls() instead of find_last_bit() (Chris)
 - added INTEL_SSEU to extract sseu from device info. (Chris)
v3:
 - rebase on latest tip
v5:
 - Added references (Mika)
 - Change the ordered of passing arguments and etc. (Ursulin)
v7:
 - Moved WA explanation Comments(Oscar)
 - Rebased.
v8:
 - Renamed sanitize_mcr to calculate_s_ss_select. (Oscar)
 - calculate s/ss selector instead of whole mcr. (Oscar)
v9:
 - Updated function name (Oscar)
 - Remove redundant variables (Oscar)
v10:
 - Separate pre-GEN10 and GEN11 mask. (Oscar)

Cc: Oscar Mateo <oscar.mateo@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Signed-off-by: NYunwei Zhang <yunwei.zhang@intel.com>
Reviewed-by: NOscar Mateo <oscar.mateo@intel.com>
Signed-off-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1526683197-24656-1-git-send-email-yunwei.zhang@intel.com

1e40d4ae

19 5月, 2018 1 次提交

drm/i915/execlists: Handle copying default context state for atomic reset · fe0c4935

由 Chris Wilson 提交于 5月 18, 2018

We want to be able to reset the GPU from inside a timer callback
(hardirq context). One step requires us to copy the default context
state over to the guilty context, which means we need to plan in advance
to have that object accessible from within an atomic context. The atomic
context prevents us from pinning the object or from peeking into the
shmemfs backing store (all may sleep), so we choose to pin the
default_state into memory when the engine becomes active. This
compromise allows us to swap out the default state when idle, when
required.

References: 5692251c ("drm/i915/lrc: Scrub the GPU state of the guilty hanging request")
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180518090212.5349-2-chris@chris-wilson.co.uk

fe0c4935

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功