提交 · 06039d98202f31dbf85af76d2659aaaa455ee0cb · openeuler / Kernel

29 1月, 2019 3 次提交

drm/i915: Track the context's seqno in its own timeline HWSP · 5013eb8c

由 Chris Wilson 提交于 1月 28, 2019

Now that we have allocated ourselves a cacheline to store a breadcrumb,
we can emit a write from the GPU into the timeline's HWSP of the
per-context seqno as we complete each request. This drops the mirroring
of the per-engine HWSP and allows each context to operate independently.
We do not need to unwind the per-context timeline, and so requests are
always consistent with the timeline breadcrumb, greatly simplifying the
completion checks as we no longer need to be concerned about the
global_seqno changing mid check.

One complication though is that we have to be wary that the request may
outlive the HWSP and so avoid touching the potentially danging pointer
after we have retired the fence. We also have to guard our access of the
HWSP with RCU, the release of the obj->mm.pages should already be RCU-safe.

At this point, we are emitting both per-context and global seqno and
still using the single per-engine execution timeline for resolving
interrupts.

v2: s/fake_complete/mark_complete/
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128181812.22804-5-chris@chris-wilson.co.uk

5013eb8c

drm/i915: Allocate a status page for each timeline · 52954edd

由 Chris Wilson 提交于 1月 28, 2019

Allocate a page for use as a status page by a group of timelines, as we
only need a dword of storage for each (rounded up to the cacheline for
safety) we can pack multiple timelines into the same page. Each timeline
will then be able to track its own HW seqno.

v2: Reuse the common per-engine HWSP for the solitary ringbuffer
timeline, so that we do not have to emit (using per-gen specialised
vfuncs) the breadcrumb into the distinct timeline HWSP and instead can
keep on using the common MI_STORE_DWORD_INDEX. However, to maintain the
sleight-of-hand for the global/per-context seqno switchover, we will
store both temporarily (and so use a custom offset for the shared timeline
HWSP until the switch over).

v3: Keep things simple and allocate a page for each timeline, page
sharing comes next.

v4: I was caught repeating the same MI_STORE_DWORD_IMM over and over
again in selftests.

v5: And caught red handed copying create timeline + check.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128181812.22804-3-chris@chris-wilson.co.uk

52954edd

drm/i915: Always allocate an object/vma for the HWSP · 0ca88ba0

由 Chris Wilson 提交于 1月 28, 2019

Currently we only allocate an object and vma if we are using a GGTT
virtual HWSP, and a plain struct page for a physical HWSP. For
convenience later on with global timelines, it will be useful to always
have the status page being tracked by a struct i915_vma. Make it so.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NMatthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128102356.15037-4-chris@chris-wilson.co.uk

0ca88ba0

25 1月, 2019 3 次提交

drm/i915: Remove GPU reset dependence on struct_mutex · eb8d0f5a

由 Chris Wilson 提交于 1月 25, 2019

Now that the submission backends are controlled via their own spinlocks,
with a wave of a magic wand we can lift the struct_mutex requirement
around GPU reset. That is we allow the submission frontend (userspace)
to keep on submitting while we process the GPU reset as we can suspend
the backend independently.

The major change is around the backoff/handoff strategy for performing
the reset. With no mutex deadlock, we no longer have to coordinate with
any waiter, and just perform the reset immediately.

Testcase: igt/gem_mmap_gtt/hang # regresses
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190125132230.22221-3-chris@chris-wilson.co.uk

eb8d0f5a

drm/i915: Remove manual breadcumb counting · 9fa4973e

由 Chris Wilson 提交于 1月 25, 2019

Now that we know we measure the size of the engine->emit_breadcrumb()
correctly, we can remove the previous manual counting.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190125120005.25191-1-chris@chris-wilson.co.uk

9fa4973e

drm/i915: Measure the required reserved size for request emission · e1a73a54

由 Chris Wilson 提交于 1月 25, 2019

Instead of tediously and fragilely counting up the number of dwords
required to emit the breadcrumb to seal a request, fake a request and
measure it automatically once during engine setup.

The downside is that this requires a fair amount of mocking to create a
proper breadcrumb. Still, should be less error prone in future as the
breadcrumb size fluctuates!
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190125100520.20163-1-chris@chris-wilson.co.uk

e1a73a54

18 1月, 2019 1 次提交

drm/i915: Use b->irq_enable() as predicate for mock engine · 293f8c0f

由 Chris Wilson 提交于 1月 18, 2019

Since commit d4ccceb0 ("drm/i915/icl: Ringbuffer interrupt handling")
we have required a mechanism to avoid touching the interrupt hardware
for breadcrumbs, superseding our mock interface for selftests.

The residual problem (ideas welcome) is in probing the mock ring
registers for ring_is_idle. Hmm, maybe we should just install
mock handlers for i915->uncore.mmio__write and friends? Only problem
being is that we would to truly mock some expected reads. :(

References: d4ccceb0 ("drm/i915/icl: Ringbuffer interrupt handling")
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190118112225.13780-1-chris@chris-wilson.co.uk

293f8c0f

17 1月, 2019 2 次提交

drm/i915: small isolated c99 types to kernel types switch · 739f3abd

由 Jani Nikula 提交于 1月 16, 2019

Mixed C99 and kernel types use is getting ugly. Prefer kernel types.

sed -i 's/\buint\(8\|16\|32\|64\)_t\b/u\1/g'

Minor checkpatch fixes sprinkled on top of the changed lines.
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: NJosé Roberto de Souza <jose.souza@intel.com>
Signed-off-by: NJani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/14ed72e7f04c9340a057855c5950b54811f8a477.1547629303.git.jani.nikula@intel.com

739f3abd

drm/i915: Pull all the reset functionality together into i915_reset.c · 9f58892e

由 Chris Wilson 提交于 1月 16, 2019

Currently the code to reset the GPU and our state is spread widely
across a few files. Pull the logic together into a common file.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Acked-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190116153304.787-1-chris@chris-wilson.co.uk

9f58892e

16 1月, 2019 1 次提交

drm/i915: Move intel_execlists_show_requests() aside · 0212bdef

由 Chris Wilson 提交于 1月 15, 2019

Move the debug pretty printer into a standalone routine prior to
extending it in upcoming feature work.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190115212948.10423-1-chris@chris-wilson.co.uk

0212bdef

15 1月, 2019 2 次提交

drm/i915/gem: Track the rpm wakerefs · 538ef96b

由 Chris Wilson 提交于 1月 14, 2019

Keep track of the temporary rpm wakerefs used for user access to the
device, so that we can cancel them upon release and clearly identify any
leaks.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190114142129.24398-10-chris@chris-wilson.co.uk

538ef96b

drm/i915: Markup paired operations on wakerefs · 16e4dd03

由 Chris Wilson 提交于 1月 14, 2019

The majority of runtime-pm operations are bounded and scoped within a
function; these are easy to verify that the wakeref are handled
correctly. We can employ the compiler to help us, and reduce the number
of wakerefs tracked when debugging, by passing around cookies provided
by the various rpm_get functions to their rpm_put counterpart. This
makes the pairing explicit, and given the required wakeref cookie the
compiler can verify that we pass an initialised value to the rpm_put
(quite handy for double checking error paths).

For regular builds, the compiler should be able to eliminate the unused
local variables and the program growth should be minimal. Fwiw, it came
out as a net improvement as gcc was able to refactor rpm_get and
rpm_get_if_in_use together,

v2: Just s/rpm_put/rpm_put_unchecked/ everywhere, leaving the manual
mark up for smaller more targeted patches.
v3: Mention the cookie in Returns
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190114142129.24398-2-chris@chris-wilson.co.uk

16e4dd03

03 1月, 2019 1 次提交

drm/i915: Always try to reset the GPU on takeover · 55277e1f

由 Chris Wilson 提交于 1月 03, 2019

When we first introduced the reset to sanitize the GPU on taking over
from the BIOS and before returning control to third parties (the BIOS!),
we restricted it to only systems utilizing HW contexts as we were
uncertain of how stable our reset mechanism truly was. We now have
reasonable coverage across all machines that expose a GPU reset method,
and so we should be safe to sanitize the GPU state everywhere.

v2: We _have_ to skip the reset if it would clobber the display.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190103112104.19561-1-chris@chris-wilson.co.uk

55277e1f

02 1月, 2019 1 次提交

drm/i915: start moving runtime device info to a separate struct · 0258404f

由 Jani Nikula 提交于 12月 31, 2018

First move the low hanging fruit, the fields that are only initialized
runtime. Use RUNTIME_INFO() exclusively to access the fields.

Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: NJani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/c24fe7a4b0492a888690c46814c0ff21ce2f12b1.1546267488.git.jani.nikula@intel.com

0258404f

31 12月, 2018 1 次提交

drm/i915: Drop unused engine->irq_seqno_barrier w/a · 1216e3c3

由 Chris Wilson 提交于 12月 28, 2018

Now that we have eliminated the CPU-side irq_seqno_barrier by moving the
delays on the GPU before emitting the MI_USER_INTERRUPT, we can remove
the engine->irq_seqno_barrier infrastructure. Though intentionally
slowing down the GPU is nasty, so is the code we can now remove!
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181228171641.16531-6-chris@chris-wilson.co.uk

1216e3c3

28 12月, 2018 1 次提交

drm/i915: Remove HW semaphores for gen7 inter-engine synchronisation · 6faf5916

由 Chris Wilson 提交于 12月 28, 2018

The writing is on the wall for the existence of a single execution queue
along each engine, and as a consequence we will not be able to track
dependencies along the HW queue itself, i.e. we will not be able to use
HW semaphores on gen7 as they use a global set of registers (and unlike
gen8+ we can not effectively target memory to keep per-context seqno and
dependencies).

On the positive side, when we implement request reordering for gen7 we
also can not presume a simple execution queue and would also require
removing the current semaphore generation code. So this bring us another
step closer to request reordering for ringbuffer submission!

The negative side is that using interrupts to drive inter-engine
synchronisation is much slower (4us -> 15us to do a nop on each of the 3
engines on ivb). This is much better than it was at the time of introducing
the HW semaphores and equally important userspace weaned itself off
intermixing dependent BLT/RENDER operations (the prime culprit was glyph
rendering in UXA). So while we regress the microbenchmarks, it should not
impact the user.

References: https://bugs.freedesktop.org/show_bug.cgi?id=108888Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181228140736.32606-2-chris@chris-wilson.co.uk

6faf5916

18 12月, 2018 1 次提交

drm/i915: Apply missed interrupt after reset w/a to all ringbuffer gen · 060f2322

由 Chris Wilson 提交于 12月 18, 2018

Having completed a test run of gem_eio across all machines in CI we also
observe the phenomenon (of lost interrupts after resetting the GPU) on
gen3 machines as well as the previously sighted gen6/gen7. Let's apply
the same HWSTAM workaround that was effective for gen6+ for all, as
although we haven't seen the same failure on gen4/5 it seems prudent to
keep the code the same.

As a consequence we can remove the extra setting of HWSTAM and apply the
register from a single site.

v2: Delazy and move the HWSTAM into its own function
v3: Mask off all HWSP writes on driver unload and engine cleanup.
v4: And what about the physical hwsp?
v5: No, engine->init_hw() is not called from driver_init_hw(), don't be
daft. Really scrub HWSTAM as early as we can in driver_init_mmio()
v6: Rename set_hwsp as it was setting the mask not the hwsp register.
v7: Ville pointed out that although vcs(bsd) was introduced for g4x/ilk,
per-engine HWSTAM was not introduced until gen6!

References: https://bugs.freedesktop.org/show_bug.cgi?id=108735Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181218102712.11058-1-chris@chris-wilson.co.uk

060f2322

13 12月, 2018 3 次提交

drm/i915: merge gen checks to use range · f3ce44a0

由 Lucas De Marchi 提交于 12月 12, 2018

Instead of using IS_GEN() for consecutive gen checks, let's pass the
range to IS_GEN_RANGE(). By code inspection these were the ranges deemed
necessary for spatch:

@@
expression e;
@@
(
- IS_GEN(e, 3) || IS_GEN(e, 2)
+ IS_GEN_RANGE(e, 2, 3)
|
- IS_GEN(e, 3) || IS_GEN(e, 4)
+ IS_GEN_RANGE(e, 3, 4)
|
- IS_GEN(e, 5) || IS_GEN(e, 6)
+ IS_GEN_RANGE(e, 5, 6)
|
- IS_GEN(e, 6) || IS_GEN(e, 7)
+ IS_GEN_RANGE(e, 6, 7)
|
- IS_GEN(e, 7) || IS_GEN(e, 8)
+ IS_GEN_RANGE(e, 7, 8)
|
- IS_GEN(e, 8) || IS_GEN(e, 9)
+ IS_GEN_RANGE(e, 8, 9)
|
- IS_GEN(e, 10) || IS_GEN(e, 9)
+ IS_GEN_RANGE(e, 9, 10)
|
- IS_GEN(e, 9) || IS_GEN(e, 10)
+ IS_GEN_RANGE(e, 9, 10)
)

After conversion, checking we don't have any missing IS_GEN_RANGE() ||
IS_GEN() was also done.
Signed-off-by: NLucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: NJani Nikula <jani.nikula@intel.com>
Signed-off-by: NRodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181212181044.15886-3-lucas.demarchi@intel.com

f3ce44a0

drm/i915: replace IS_GEN<N> with IS_GEN(..., N) · cf819eff

由 Lucas De Marchi 提交于 12月 12, 2018

Define IS_GEN() similarly to our IS_GEN_RANGE(). but use gen instead of
gen_mask to do the comparison. Now callers can pass then gen as a parameter,
so we don't require one macro for each gen.

The following spatch was used to convert the users of these macros:

@@
expression e;
@@
(
- IS_GEN2(e)
+ IS_GEN(e, 2)
|
- IS_GEN3(e)
+ IS_GEN(e, 3)
|
- IS_GEN4(e)
+ IS_GEN(e, 4)
|
- IS_GEN5(e)
+ IS_GEN(e, 5)
|
- IS_GEN6(e)
+ IS_GEN(e, 6)
|
- IS_GEN7(e)
+ IS_GEN(e, 7)
|
- IS_GEN8(e)
+ IS_GEN(e, 8)
|
- IS_GEN9(e)
+ IS_GEN(e, 9)
|
- IS_GEN10(e)
+ IS_GEN(e, 10)
|
- IS_GEN11(e)
+ IS_GEN(e, 11)
)

v2: use IS_GEN rather than GT_GEN and compare to info.gen rather than
    using the bitmask
Signed-off-by: NLucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: NJani Nikula <jani.nikula@intel.com>
Signed-off-by: NRodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181212181044.15886-2-lucas.demarchi@intel.com

cf819eff

drm/i915: Rename IS_GEN to IS_GEN_RANGE · 00690008

由 Lucas De Marchi 提交于 12月 12, 2018

RANGE makes it longer, but clearer. We are also going to add a macro to
check an individual gen, so add the _RANGE prefix here.

Diff generated with:

sed 's/IS_GEN(/IS_GEN_RANGE(/g' drivers/gpu/drm/i915/{*/,}*.{c,h} -i

v2: use IS_GEN rather than GT_GEN
Signed-off-by: NLucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NJani Nikula <jani.nikula@intel.com>
Signed-off-by: NRodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181212181044.15886-1-lucas.demarchi@intel.com

00690008

12 12月, 2018 1 次提交

drm/i915: Allocate a common scratch page · fe78742d

由 Chris Wilson 提交于 12月 04, 2018

Currently we allocate a scratch page for each engine, but since we only
ever write into it for post-sync operations, it is not exposed to
userspace nor do we care for coherency. As we then do not care about its
contents, we can use one page for all, reducing our allocations and
avoid complications by not assuming per-engine isolation.

For later use, it simplifies engine initialisation (by removing the
allocation that required struct_mutex!) and means that we can always rely
on there being a scratch page.

v2: Check that we allocated a large enough scratch for I830 w/a

Fixes: 06e562e7f515 ("drm/i915/ringbuffer: Delay after EMIT_INVALIDATE for gen4/gen5") # v4.18.20
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108850Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181204141522.13640-1-chris@chris-wilson.co.uk
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: <stable@vger.kernel.org> # v4.18.20+
(cherry picked from commit 51797499)
[Joonas: Use new function in gen9_init_indirectctx_bb too]
Signed-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>

fe78742d

07 12月, 2018 1 次提交

dma-buf: make fence sequence numbers 64 bit v2 · b312d8ca

由 Christian König 提交于 11月 14, 2018

For a lot of use cases we need 64bit sequence numbers. Currently drivers
overload the dma_fence structure to store the additional bits.

Stop doing that and make the sequence number in the dma_fence always
64bit.

For compatibility with hardware which can do only 32bit sequences the
comparisons in __dma_fence_is_later only takes the lower 32bits as significant
when the upper 32bits are all zero.

v2: change the logic in __dma_fence_is_later
Signed-off-by: NChristian König <christian.koenig@amd.com>
Reviewed-by: NChunming Zhou <david1.zhou@amd.com>
Link: https://patchwork.freedesktop.org/patch/266927/

b312d8ca

05 12月, 2018 1 次提交

drm/i915: Introduce per-engine workarounds · 90098efa

由 Tvrtko Ursulin 提交于 12月 05, 2018

We stopped re-applying the GT workarounds after engine reset since commit
59b449d5 ("drm/i915: Split out functions for different kinds of
workarounds").

Issue with this is that some of the GT workarounds live in the MMIO space
which gets lost during engine resets. So far the registers in 0x2xxx and
0xbxxx address range have been identified to be affected.

This losing of applied workarounds has obvious negative effects and can
even lead to hard system hangs (see the linked Bugzilla).

Rather than just restoring this re-application, because we have also
observed that it is not safe to just re-write all GT workarounds after
engine resets (GPU might be live and weird hardware states can happen),
we introduce a new class of per-engine workarounds and move only the
affected GT workarounds over.

Using the framework introduced in the previous patch, we therefore after
engine reset, re-apply only the workarounds living in the affected MMIO
address ranges.

v2:
 * Move Wa_1406609255:icl to engine workarounds as well.
 * Rename API. (Chris Wilson)
 * Drop redundant IS_KABYLAKE. (Chris Wilson)
 * Re-order engine wa/ init so latest platforms are first. (Rodrigo Vivi)
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Bugzilla: https://bugzilla.freedesktop.org/show_bug.cgi?id=107945
Fixes: 59b449d5 ("drm/i915: Split out functions for different kinds of workarounds")
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: intel-gfx@lists.freedesktop.org
Acked-by: NRodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20181203133341.10258-1-tvrtko.ursulin@linux.intel.com
(cherry picked from commit 4a15c75c)
Signed-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>

90098efa

04 12月, 2018 4 次提交

drm/i915: Allocate a common scratch page · 51797499

由 Chris Wilson 提交于 12月 04, 2018

Currently we allocate a scratch page for each engine, but since we only
ever write into it for post-sync operations, it is not exposed to
userspace nor do we care for coherency. As we then do not care about its
contents, we can use one page for all, reducing our allocations and
avoid complications by not assuming per-engine isolation.

For later use, it simplifies engine initialisation (by removing the
allocation that required struct_mutex!) and means that we can always rely
on there being a scratch page.

v2: Check that we allocated a large enough scratch for I830 w/a

Fixes: 06e562e7f515 ("drm/i915/ringbuffer: Delay after EMIT_INVALIDATE for gen4/gen5") # v4.18.20
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108850Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181204141522.13640-1-chris@chris-wilson.co.uk
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: <stable@vger.kernel.org> # v4.18.20+

51797499

drm/i915: Fuse per-context workaround handling with the common framework · 452420d2

由 Tvrtko Ursulin 提交于 12月 03, 2018

Convert the per context workaround handling code to run against the newly
introduced common workaround framework and fuse the two to use the
existing smarter list add helper, the one which does the sorted insert and
merges registers where possible.

This completes migration of all four classes of workarounds onto the
common framework.

Existing macros are kept untouched for smaller code churn.

v2:
 * Rename to list name ctx_wa_list and move from dev_priv to engine.

v3:
 * API rename and parameters tweaking. (Chris Wilson)
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20181203133357.10341-1-tvrtko.ursulin@linux.intel.com

452420d2

drm/i915: Move register white-listing to the common workaround framework · 69bcdecf

由 Tvrtko Ursulin 提交于 12月 03, 2018

Instead of having a separate list of white-listed registers we can
trivially move this to the common workarounds framework.

This brings us one step closer to the goal of driving all workaround
classes using the same code.

v2:
 * Use GEM_DEBUG_WARN_ON for the sanity check. (Chris Wilson)

v3:
 * API rename. (Chris Wilson)
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20181203125014.3219-6-tvrtko.ursulin@linux.intel.com

69bcdecf

drm/i915: Introduce per-engine workarounds · 4a15c75c

由 Tvrtko Ursulin 提交于 12月 03, 2018

We stopped re-applying the GT workarounds after engine reset since commit
59b449d5 ("drm/i915: Split out functions for different kinds of
workarounds").

Issue with this is that some of the GT workarounds live in the MMIO space
which gets lost during engine resets. So far the registers in 0x2xxx and
0xbxxx address range have been identified to be affected.

This losing of applied workarounds has obvious negative effects and can
even lead to hard system hangs (see the linked Bugzilla).

Rather than just restoring this re-application, because we have also
observed that it is not safe to just re-write all GT workarounds after
engine resets (GPU might be live and weird hardware states can happen),
we introduce a new class of per-engine workarounds and move only the
affected GT workarounds over.

Using the framework introduced in the previous patch, we therefore after
engine reset, re-apply only the workarounds living in the affected MMIO
address ranges.

v2:
 * Move Wa_1406609255:icl to engine workarounds as well.
 * Rename API. (Chris Wilson)
 * Drop redundant IS_KABYLAKE. (Chris Wilson)
 * Re-order engine wa/ init so latest platforms are first. (Rodrigo Vivi)
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Bugzilla: https://bugzilla.freedesktop.org/show_bug.cgi?id=107945
Fixes: 59b449d5 ("drm/i915: Split out functions for different kinds of workarounds")
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: intel-gfx@lists.freedesktop.org
Acked-by: NRodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20181203133341.10258-1-tvrtko.ursulin@linux.intel.com

4a15c75c

21 11月, 2018 1 次提交

drm/i915: Show waiter's status on engine dump · aa6a65da

由 Chris Wilson 提交于 11月 21, 2018

When showing the list of waiters, include the task's status so that we
can tell if they have been woken up and are waiting for the CPU, or if
they are still waiting to be woken.

v2: task_state_to_char()
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181121151653.24595-1-chris@chris-wilson.co.uk

aa6a65da

16 11月, 2018 1 次提交

drm/i915/selftests: Workaround an issue with unused lockdep subclass · f911e723

由 Chris Wilson 提交于 11月 15, 2018

lockdep insists that if we give a lock a subclass, it must be used.
Failure to do so triggers a self-consistency check when reading
lockdep_stats:

[   49.902002] DEBUG_LOCKS_WARN_ON(debug_atomic_read(nr_unused_locks) != nr_unused)
[   49.902009] WARNING: CPU: 3 PID: 383 at kernel/locking/lockdep_proc.c:249 lockdep_stats_show+0x984/0xa10
[   49.902026] Modules linked in: nls_ascii nls_cp437 vfat fat crct10dif_pclmul crc32_pclmul crc32c_intel aesni_intel aes_x86_64 crypto_simd cryptd glue_helper intel_cstate intel_uncore intel_rapl_perf intel_gtt efivars prime_numbers ahci libahci i2c_i801 video button efivarfs [last unloaded: drm_kms_helper]
[   49.902059] CPU: 3 PID: 383 Comm: cat Tainted: G     U            4.20.0-rc2+ #304
[   49.902068] Hardware name: Intel Corporation NUC7i5BNK/NUC7i5BNB, BIOS BNKBL357.86A.0052.2017.0918.1346 09/18/2017
[   49.902079] RIP: 0010:lockdep_stats_show+0x984/0xa10
[   49.902086] Code: 00 85 c0 0f 84 aa f8 ff ff 8b 05 77 37 e2 00 85 c0 0f 85 9c f8 ff ff 48 c7 c6 e0 57 bc 81 48 c7 c7 28 30 bb 81 e8 6b 77 fa ff <0f> 0b e9 82 f8 ff ff 48 c7 44 24 50 00 00 00 00 45 31 e4 31 db 31
[   49.902103] RSP: 0018:ffffc90000247d58 EFLAGS: 00010292
[   49.902110] RAX: 0000000000000044 RBX: 00000000000002f0 RCX: 0000000000000000
[   49.902118] RDX: 0000000000000002 RSI: 0000000000000001 RDI: ffffffff810b3464
[   49.902126] RBP: 0000000000000039 R08: 0000000000000002 R09: 0000000000000000
[   49.902133] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000007ead
[   49.902141] R13: 0000000000000001 R14: ffff88884c021000 R15: 0000000000000097
[   49.902150] FS:  00007fb347e66540(0000) GS:ffff88885e600000(0000) knlGS:0000000000000000
[   49.902159] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   49.902165] CR2: 00007fb347aeb000 CR3: 00000008544bd005 CR4: 00000000001606e0
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: NMatthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181115203851.25739-1-chris@chris-wilson.co.uk

f911e723

30 10月, 2018 1 次提交

drm/i915: Prefer IS_GEN<n> check with bitmask. · 9e783375

由 Rodrigo Vivi 提交于 10月 26, 2018

Whenever possible we should stick with IS_GEN<n> checks.

Bitmaks has been introduced on commit ae7617f0 ("drm/i915:
Allow optimized platform checks") for efficiency.

Let's stick with it whenever possible.

This patch was generated with coccinelle:

spatch -sp_file is_gen.cocci *{c,h} --in-place

is_gen.cocci:
@gen2@ expression e; @@
-INTEL_GEN(e) == 2
+IS_GEN2(e)
@gen3@ expression e; @@
-INTEL_GEN(e) == 3
+IS_GEN3(e)
@gen4@ expression e; @@
-INTEL_GEN(e) == 4
+IS_GEN4(e)
@gen5@ expression e; @@
-INTEL_GEN(e) == 5
+IS_GEN5(e)
@gen6@ expression e; @@
-INTEL_GEN(e) == 6
+IS_GEN6(e)
@gen7@ expression e; @@
-INTEL_GEN(e) == 7
+IS_GEN7(e)
@gen8@ expression e; @@
-INTEL_GEN(e) == 8
+IS_GEN8(e)
@gen9@ expression e; @@
-INTEL_GEN(e) == 9
+IS_GEN9(e)
@gen10@ expression e; @@
-INTEL_GEN(e) == 10
+IS_GEN10(e)
@gen11@ expression e; @@
-INTEL_GEN(e) == 11
+IS_GEN11(e)

Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: NRodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181026195143.20353-1-rodrigo.vivi@intel.com

9e783375

18 10月, 2018 1 次提交

drm/i915: GEM_WARN_ON considered harmful · bbb8a9d7

由 Tvrtko Ursulin 提交于 10月 12, 2018

GEM_WARN_ON currently has dangerous semantics where it is completely
compiled out on !GEM_DEBUG builds. This can leave users who expect it to
be more like a WARN_ON, just without a warning in non-debug builds, in
complete ignorance.

Another gotcha with it is that it cannot be used as a statement. Which is
again different from a standard kernel WARN_ON.

This patch fixes both problems by making it behave as one would expect.

It can now be used both as an expression and as statement, and also the
condition evaluates properly in all builds - code under the conditional
will therefore not unexpectedly disappear.

To satisfy call sites which really want the code under the conditional to
completely disappear, we add GEM_DEBUG_WARN_ON and convert some of the
callers to it. This one can also be used as both expression and statement.

>From the above it follows GEM_DEBUG_WARN_ON should be used in situations
where we are certain the condition will be hit during development, but at
a place in code where error can be handled to the benefit of not crashing
the machine.

GEM_WARN_ON on the other hand should be used where condition may happen in
production and we just want to distinguish the level of debugging output
emitted between the production and debug build.

v2:
 * Dropped BUG_ON hunk.
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Tomasz Lis <tomasz.lis@intel.com>
Reviewed-by: NTomasz Lis <tomasz.lis@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181012063142.16080-1-tvrtko.ursulin@linux.intel.com

bbb8a9d7

17 10月, 2018 1 次提交

drm/i915: Ensure intel_engine_init_execlist() builds with Clang · 410ed573

由 Jani Nikula 提交于 10月 16, 2018

Clang build with UBSAN enabled leads to the following build error:

drivers/gpu/drm/i915/intel_engine_cs.o: In function `intel_engine_init_execlist':
drivers/gpu/drm/i915/intel_engine_cs.c:411: undefined reference to `__compiletime_assert_411'

Again, for this to work the code would first need to be inlined and then
constant folded, which doesn't work for Clang because semantic analysis
happens before optimization/inlining.

Use GEM_BUG_ON() instead of BUILD_BUG_ON().

v2: Use is_power_of_2() from log2.h (Chris)

References: http://mid.mail-archive.com/20181015203410.155997-1-swboyd@chromium.orgReported-by: NStephen Boyd <swboyd@chromium.org>
Cc: Stephen Boyd <swboyd@chromium.org>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: NNathan Chancellor <natechancellor@gmail.com>
Tested-by: NStephen Boyd <swboyd@chromium.org>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NNick Desaulniers <ndesaulniers@google.com>
Signed-off-by: NJani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181016122938.18757-2-jani.nikula@intel.com

410ed573

12 10月, 2018 1 次提交

drm/i915: Inject load failure inside intel_engines_init_mmio · 645ff9e3

由 Michal Wajdeczko 提交于 10月 11, 2018

We need extra load failure point to better test error path in
i915_driver_init_mmio.
Suggested-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NMichal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20181011130008.24640-2-michal.wajdeczko@intel.com

645ff9e3

01 10月, 2018 1 次提交

drm/i915: Combine multiple internal plists into the same i915_priolist bucket · 85f5e1f3

由 Chris Wilson 提交于 10月 01, 2018

As we are about to allow ourselves to slightly bump the user priority
into a few different sublevels, packthose internal priority lists
into the same i915_priolist to keep the rbtree compact and avoid having
to allocate the default user priority even after the internal bumping.
The downside to having an requests[] rather than a node per active list,
is that we then have to walk over the empty higher priority lists. To
compensate, we track the active buckets and use a small bitmap to skip
over any inactive ones.

v2: Use MASK of internal levels to simplify our usage.
v3: Prevent overflow when SHIFT is zero.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181001123204.23982-4-chris@chris-wilson.co.uk

85f5e1f3

26 9月, 2018 1 次提交

drm/i915: Convert to BITS_PER_TYPE · 74f6e183

由 Chris Wilson 提交于 9月 26, 2018

In commit 9144d75e ("include/linux/bitops.h: introduce BITS_PER_TYPE"),
we made BITS_PER_TYPE available to all and now we can use the macro to
replace some open-coded computation of sizeof(T) * BITS_PER_BYTE.
Suggested-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NJani Nikula <jani.nikula@intel.com>
Reviewed-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180926104707.17410-1-chris@chris-wilson.co.uk

74f6e183

14 9月, 2018 1 次提交

drm/i915: Flush the tasklet when checking for idle · 22495b68

由 Chris Wilson 提交于 9月 14, 2018

In order to reduce latency when checking for idle we kick the tasklet
directly. Sometimes this is not enough as it is queued on another cpu
and so to improve the accuracy of this idle-check (and so to reduce
latency overall by avoiding another pass, or worse declaring a timeout!)
wait for the tasklet to complete.

References: https://bugs.freedesktop.org/show_bug.cgi?id=107916Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180914080017.30308-2-chris@chris-wilson.co.uk

22495b68

04 9月, 2018 2 次提交

drm/i915: Use a cached mapping for the physical HWS · d6acae36

由 Chris Wilson 提交于 9月 03, 2018

Older gen use a physical address for the hardware status page, for which
we use cache-coherent writes. As the writes are into the cpu cache, we use
a normal WB mapped page to read the HWS, used for our seqno tracking.

Anecdotally, I observed lost breadcrumbs writes into the HWS on i965gm,
which so far have not reoccurred with this patch. How reliable that
evidence is remains to be seen.

v2: Explicitly pass the expected physical address to the hw
v3: Also remember the wild writes we once had for HWS above 4G.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: NDaniel Vetter <daniel@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20180903152304.31589-2-chris@chris-wilson.co.uk

d6acae36

drm/i915: Combine cleanup_status_page() · a0e731f4

由 Chris Wilson 提交于 9月 03, 2018

Pull the physical status page cleanup into a common
cleanup_status_page() for caller simplicity.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NMatthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180903152304.31589-1-chris@chris-wilson.co.uk

a0e731f4

21 8月, 2018 1 次提交

drm/i915: Correct CSB probing for engine state dumper · df4f94e8

由 Chris Wilson 提交于 8月 21, 2018

Since we no longer maintain our read position in the CSB pointers
register, it always returns 0 and not where we last read up to. As a
result the CSB probing in the state dumper starts from 0, either missing
entries or showing stale one.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180821101138.15822-1-chris@chris-wilson.co.uk

df4f94e8

15 8月, 2018 1 次提交

drm/i915: Clear stop-engine for a pardoned reset · a99b32a6

由 Chris Wilson 提交于 8月 14, 2018

If we pardon a per-engine reset, we may leave the STOP_RING bit asserted
in RING_MI_MODE resulting in the engine hanging. Unconditionally clear
it on the per-engine exit path as we know that either we skipped the
reset and so need the cancellation, or the reset was successful and the
cancellation is a no-op, or there was an error and we will follow up
with a full-reset or wedging (both of which will stop the engines again
as required).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107188
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106560Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180814171857.24673-1-chris@chris-wilson.co.uk

a99b32a6

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功