提交 · d8857d541c67110cdcedb330e9677efdc9c5844d · openeuler / Kernel

29 6月, 2018 2 次提交

drm/i915/execlists: Pull CSB reset under the timeline.lock · d8857d54

由 Chris Wilson 提交于 6月 28, 2018

In the following patch, we will process the CSB events under the
timeline.lock and not serialised by the tasklet. This also means that we
will need to protect access to common variables such as
execlists->csb_head with the timeline.lock during reset.

v2: Move sync_irq to avoid deadlocks between taking timeline.lock from
our interrupt handler.
v3: Kill off the synchronize_hardirq as it raises more questions than
answered; now we use the timeline.lock entirely for CSB serialisation
between the irq and elsewhere, we don't need to be so heavy handed with
flushing
v4: Treat request cancellation (wedging after failed reset) similarly
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180628201211.13837-3-chris@chris-wilson.co.uk

d8857d54

drm/i915/execlists: Pull submit after dequeue under timeline lock · 0b02befa

由 Chris Wilson 提交于 6月 28, 2018

In the next patch, we will begin processing the CSB from inside the
submission path (underneath an irqsoff section, and even from inside
interrupt handlers). This means that updating the execlists->port[] will
no longer be serialised by the tasklet but needs to be locked by the
engine->timeline.lock instead. Pull dequeue and submit under the same
lock for protection. (An alternate future plan is to keep the in/out
arrays separate for concurrent processing and reduced lock coverage.)
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180628201211.13837-2-chris@chris-wilson.co.uk

0b02befa

25 6月, 2018 2 次提交

drm/i915: Context objects can never be active when freed · efe79d48

由 Chris Wilson 提交于 6月 25, 2018

Due to how we only release the pining on the context state on
retirement and never track activity on the context vma itself, the
object can never be active at the point of release. Replace the
conditional transfer of ownership onto an active-reference with an
assert that the object is idle.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180625100604.22598-2-chris@chris-wilson.co.uk

efe79d48

drm/i915/execlists: Check for ce->state before destroy · dd12c6ca

由 Chris Wilson 提交于 6月 25, 2018

As we may cancel the ce->state allocation during context pinning (but
crucially after we mark ce as operational), that means we may be asked
to destroy a nonexistent ce->state. Given the choice in handing a
complex error path on pinning, and just ignoring the lack of state in
destroy, choice the latter for simplicity.
Reported-by: NZhao Yakui <yakui.zhao@intel.com>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180625100604.22598-1-chris@chris-wilson.co.uk

dd12c6ca

18 6月, 2018 2 次提交

drm/i915/execlists: Pull the w/a LRI emission into a helper · 5ee4a7a6

由 Chris Wilson 提交于 6月 18, 2018

Having the w/a registers as an open-coded table leaves a trap for the
unwary; it would be easy to miss incrementing the LRI counter when
adding a new register to the list. Instead, pull the list of registers
into a table, so that we only need add new registers to that table
rather than try and remember important side-effects of earlier chunks of
GPU instructions.
Suggested-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180618094150.30895-1-chris@chris-wilson.co.uk

5ee4a7a6

drm/i915: Enable provoking vertex fix on Gen9 systems. · b77422f8

由 Kenneth Graunke 提交于 6月 15, 2018

The SF and clipper units mishandle the provoking vertex in some cases,
which can cause misrendering with shaders that use flat shaded inputs.

There are chicken bits in 3D_CHICKEN3 (for SF) and FF_SLICE_CHICKEN
(for the clipper) that work around the issue.  These registers are
unfortunately not part of the logical context (even the power context),
and so we must reload them every time we start executing in a context.

Bugzilla: https://bugs.freedesktop.org/103047Signed-off-by: NKenneth Graunke <kenneth@whitecape.org>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180615190605.16238-1-chris@chris-wilson.co.ukReviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: stable@vger.kernel.org

b77422f8

16 6月, 2018 1 次提交

drm/i915/execlists: Reset the CSB head tracking on reset/sanitization · b2209e62

由 Chris Wilson 提交于 6月 15, 2018

We can avoid the mmio read of the CSB pointers after reset based on the
knowledge that the HW always start writing at entry 0 in the CSB buffer.
We need to reset our CSB head tracking after GPU reset (and on
sanitization after resume) so that we are expecting to read from entry
0, hence we reset our head tracking back to the entry before (the last
entry in the ring).
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180615093137.14270-2-chris@chris-wilson.co.uk

b2209e62

15 6月, 2018 1 次提交

drm/i915/execlists: Push the tasklet kick after reset to reset_finish · 5db1d4ea

由 Chris Wilson 提交于 6月 04, 2018

In the unlikely case where we have failed to keep submitting to the GPU,
we end up with the ELSP queue empty but a pending queue of requests.
Here, we skip the per-engine reset as there is no guilty request, but in
doing so we also skip the engine restart leaving ourselves with a
permanently hung engine. A quick way to recover is by moving the tasklet
kick to execlists_reset_finish() (from init_hw). We still emit the error
on hanging, so the error is not lost but we should be able to recover.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180604073441.6737-2-chris@chris-wilson.co.ukReviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>

5db1d4ea

12 6月, 2018 1 次提交

drm/i915/execlists: Avoid putting the error pointer · 467d3578

由 Chris Wilson 提交于 6月 11, 2018

On allocation error, do not jump to the unwind handler that tries to
free the error pointer.
Reported-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: a89d1f92 ("drm/i915: Split i915_gem_timeline into individual timelines")
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180611153332.14824-1-chris@chris-wilson.co.uk

467d3578

11 6月, 2018 1 次提交

drm/i915: Wrap around the tail offset before setting ring->tail · 41d37680

由 Chris Wilson 提交于 6月 11, 2018

The HW only accepts offsets within ring->size, and fails peculiarly if
the RING_HEAD or RING_TAIL is set to ring->size. Therefore whenever we
set ring->head/ring->tail we want to make sure it is within value (using
intel_ring_wrap()).

v2: Double check execlists as well
v3: Remove redundancy with assert_ring_tail_valid()
v4: Just assert in intel_ring_reset() rather than be over-defensive.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> #v2
Link: https://patchwork.freedesktop.org/patch/msgid/20180611110845.31890-2-chris@chris-wilson.co.uk

41d37680

06 6月, 2018 1 次提交

drm/i915/gtt: Rename i915_hw_ppgtt base member · 82ad6443

由 Chris Wilson 提交于 6月 05, 2018

In the near future, I want to subclass gen6_hw_ppgtt as it contains a
few specialised members and I wish to add more. To avoid the ugliness of
using ppgtt->base.base, rename the i915_hw_ppgtt base member
(i915_address_space) as vm, which is our common shorthand for an
i915_address_space local.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180605153758.18422-1-chris@chris-wilson.co.uk

82ad6443

05 6月, 2018 2 次提交

drm/i915/perf: fix ctx_id read with GuC & ICL · 61d5676b

由 Lionel Landwerlin 提交于 6月 02, 2018

One thing we didn't really understand about the OA report is that the
ContextID field (dword 2) is copy of the context descriptor (dword 1).

On Gen8->10 and without using GuC we didn't notice the issue because
we only checked the 21bits of the ContextID field in the OA reports
which matches exactly the hw_id stored into the context descriptor.

When using GuC submission we have an issue of a non matching hw_id
because GuC uses bit 20 of the hw_id to signal proxy submission. This
change introduces a mask to compare only the relevant bits.

On ICL the context descriptor format has changed and we failed to
address this. On top of using a mask we also need to shift the bits
properly.

v2: Reuse lrc_desc rather than recomputing part of it (Chris/Michel)

v3: Always pin the context we're filtering with (Chris)
Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 1de401c0 ("drm/i915/perf: enable perf support on ICL")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104252
BSpec: 1237
Testcase: igt/perf/gen8-unprivileged-single-ctx-counters
Acked-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NMichel Thierry <michel.thierry@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180602112946.30803-3-lionel.g.landwerlin@intel.com
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: intel-gfx@lists.freedesktop.org

61d5676b

drm/i915: drop one bit on the hw_id when using guc · 218b5000

由 Lionel Landwerlin 提交于 6月 02, 2018

We currently using GuC as a proxy to the hardware. When Guc is used in
such mode, it consumes the bit 20 of the hw_id to indicate that the
workload was submitted by proxy.

So far we probably haven't seen the issue because we need to allocate
1048576+ contexts to hit this issue. Still, we should avoid allocating
the hw_id on that bit and restriction to bits [0:19] (i.e 20bits
instead of 21).

v2: Leave the max hw_id computation in i915_gem_context.c (Michel)

v3: Be consistent on if/else usage (Chris)
Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
BSpec: 1237
Reviewed-by: NMichel Thierry <michel.thierry@intel.com>
Acked-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180602112946.30803-2-lionel.g.landwerlin@intel.com

218b5000

01 6月, 2018 1 次提交

drm/i915: After reset on sanitization, reset the engine backends · c3160da9

由 Chris Wilson 提交于 5月 31, 2018

As we reset the GPU on suspend/resume, we also do need to reset the
engine state tracking so call into the engine backends. This is
especially important so that we can also sanitize the state tracking
across resume.

References: https://bugs.freedesktop.org/show_bug.cgi?id=106702Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180531082246.9763-3-chris@chris-wilson.co.uk

c3160da9

25 5月, 2018 2 次提交

drm/i915/execlists: Wait for ELSP submission on restart · fe25f304

由 Chris Wilson 提交于 5月 22, 2018

After a reset, we will ensure that there is at least one request
submitted to HW to ensure that a context is loaded for powersaving.
Let's wait for this submission via a tasklet to complete before we drop
our forcewake, ensuring the system is ready for rc6 before we let it
possibly sleep.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180522101937.7738-1-chris@chris-wilson.co.ukReviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>

fe25f304

drm/i915: Flush the ring stop bit after clearing RING_HEAD in reset · 9a4dc803

由 Chris Wilson 提交于 5月 18, 2018

Inside the live_hangcheck (reset) selftests, we occasionally see
failures like

<7>[  239.094840] i915_gem_set_wedged rcs0
<7>[  239.094843] i915_gem_set_wedged 	current seqno 19a98, last 19a9a, hangcheck 0 [5158 ms]
<7>[  239.094846] i915_gem_set_wedged 	Reset count: 6239 (global 1)
<7>[  239.094848] i915_gem_set_wedged 	Requests:
<7>[  239.095052] i915_gem_set_wedged 		first  19a99 [e8c:5f] prio=1024 @ 5159ms: (null)
<7>[  239.095056] i915_gem_set_wedged 		last   19a9a [e81:1a] prio=139 @ 5159ms: igt/rcs0[5977]/1
<7>[  239.095059] i915_gem_set_wedged 		active 19a99 [e8c:5f] prio=1024 @ 5159ms: (null)
<7>[  239.095062] i915_gem_set_wedged 		[head 0220, postfix 0280, tail 02a8, batch 0xffffffff_ffffffff]
<7>[  239.100050] i915_gem_set_wedged 		ring->start:  0x00283000
<7>[  239.100053] i915_gem_set_wedged 		ring->head:   0x000001f8
<7>[  239.100055] i915_gem_set_wedged 		ring->tail:   0x000002a8
<7>[  239.100057] i915_gem_set_wedged 		ring->emit:   0x000002a8
<7>[  239.100059] i915_gem_set_wedged 		ring->space:  0x00000f10
<7>[  239.100085] i915_gem_set_wedged 	RING_START: 0x00283000
<7>[  239.100088] i915_gem_set_wedged 	RING_HEAD:  0x00000260
<7>[  239.100091] i915_gem_set_wedged 	RING_TAIL:  0x000002a8
<7>[  239.100094] i915_gem_set_wedged 	RING_CTL:   0x00000001
<7>[  239.100097] i915_gem_set_wedged 	RING_MODE:  0x00000300 [idle]
<7>[  239.100100] i915_gem_set_wedged 	RING_IMR: fffffefe
<7>[  239.100104] i915_gem_set_wedged 	ACTHD:  0x00000000_0000609c
<7>[  239.100108] i915_gem_set_wedged 	BBADDR: 0x00000000_0000609d
<7>[  239.100111] i915_gem_set_wedged 	DMA_FADDR: 0x00000000_00283260
<7>[  239.100114] i915_gem_set_wedged 	IPEIR: 0x00000000
<7>[  239.100117] i915_gem_set_wedged 	IPEHR: 0x02800000
<7>[  239.100120] i915_gem_set_wedged 	Execlist status: 0x00044052 00000002
<7>[  239.100124] i915_gem_set_wedged 	Execlist CSB read 5 [5 cached], write 5 [5 from hws], interrupt posted? no, tasklet queued? no (enabled)
<7>[  239.100128] i915_gem_set_wedged 		ELSP[0] count=1, ring->start=00283000, rq: 19a99 [e8c:5f] prio=1024 @ 5164ms: (null)
<7>[  239.100132] i915_gem_set_wedged 		ELSP[1] count=1, ring->start=00257000, rq: 19a9a [e81:1a] prio=139 @ 5164ms: igt/rcs0[5977]/1
<7>[  239.100135] i915_gem_set_wedged 		HW active? 0x5
<7>[  239.100250] i915_gem_set_wedged 		E 19a99 [e8c:5f] prio=1024 @ 5164ms: (null)
<7>[  239.100338] i915_gem_set_wedged 		E 19a9a [e81:1a] prio=139 @ 5164ms: igt/rcs0[5977]/1
<7>[  239.100340] i915_gem_set_wedged 		Queue priority: 139
<7>[  239.100343] i915_gem_set_wedged 		Q 0 [e98:19] prio=132 @ 5164ms: igt/rcs0[5977]/8
<7>[  239.100346] i915_gem_set_wedged 		Q 0 [e84:19] prio=121 @ 5165ms: igt/rcs0[5977]/2
<7>[  239.100349] i915_gem_set_wedged 		Q 0 [e87:19] prio=82 @ 5165ms: igt/rcs0[5977]/3
<7>[  239.100352] i915_gem_set_wedged 		Q 0 [e84:1a] prio=44 @ 5164ms: igt/rcs0[5977]/2
<7>[  239.100356] i915_gem_set_wedged 		Q 0 [e8b:19] prio=20 @ 5165ms: igt/rcs0[5977]/4
<7>[  239.100362] i915_gem_set_wedged 	drv_selftest [5894] waiting for 19a99

where the GPU saw an arbitration point and idles; AND HAS NOT BEEN RESET!
The RING_MODE indicates that is idle and has the STOP_RING bit set, so
try clearing it.

v2: Only clear the bit on restarting the ring, as we want to be sure the
STOP_RING bit is kept if reset fails on wedging.
v3: Spot when the ring state doesn't make sense when re-initialising the
engine and dump it to the logs so that we don't have to wait for an
error later and try to guess what happened earlier.
v4: Prepare to print all the unexpected state, not just the first.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180518100933.2239-1-chris@chris-wilson.co.uk

9a4dc803

19 5月, 2018 1 次提交

drm/i915/execlists: Handle copying default context state for atomic reset · fe0c4935

由 Chris Wilson 提交于 5月 18, 2018

We want to be able to reset the GPU from inside a timer callback
(hardirq context). One step requires us to copy the default context
state over to the guilty context, which means we need to plan in advance
to have that object accessible from within an atomic context. The atomic
context prevents us from pinning the object or from peeking into the
shmemfs backing store (all may sleep), so we choose to pin the
default_state into memory when the engine becomes active. This
compromise allows us to swap out the default state when idle, when
required.

References: 5692251c ("drm/i915/lrc: Scrub the GPU state of the guilty hanging request")
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180518090212.5349-2-chris@chris-wilson.co.uk

fe0c4935

18 5月, 2018 4 次提交

drm/i915: Pull the context->pin_count dec into the common intel_context_unpin · 867985d4

由 Chris Wilson 提交于 5月 17, 2018

As all backends implement the same pin_count mechanism and do a
dec-and-test as their first step, pull that into the common
intel_context_unpin(). This also pulls into the caller, eliminating the
indirect call in the usual steady state case. The intel_context_pin()
side is a little more complicated as it combines the lookup/alloc as
well as pinning the state, and so is left for a later date.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180517212633.24934-4-chris@chris-wilson.co.uk

867985d4

drm/i915: Store a pointer to intel_context in i915_request · 1fc44d9b

由 Chris Wilson 提交于 5月 17, 2018

To ease the frequent and ugly pointer dance of
&request->gem_context->engine[request->engine->id] during request
submission, store that pointer as request->hw_context. One major
advantage that we will exploit later is that this decouples the logical
context state from the engine itself.

v2: Set mock_context->ops so we don't crash and burn in selftests.
    Cleanups from Tvrtko.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: NZhenyu Wang <zhenyuw@linux.intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180517212633.24934-3-chris@chris-wilson.co.uk

1fc44d9b

drm/i915: Move request->ctx aside · 4e0d64db

由 Chris Wilson 提交于 5月 17, 2018

In the next patch, we want to store the intel_context pointer inside
i915_request, as it is frequently access via a convoluted dance when
submitting the request to hw. Having two context pointers inside
i915_request leads to confusion so first rename the existing
i915_gem_context pointer to i915_request.gem_context.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180517212633.24934-1-chris@chris-wilson.co.uk

4e0d64db

drm/i915/execlists: HWACK checking superseded checking port[0].count · 57877b70

由 Chris Wilson 提交于 5月 17, 2018

The HWACK bit more generically solves the problem of resubmitting ESLP
while the hardware is still processing the current ELSP write. We no
longer need to check port[0].count itself.

References: ba74cb10 ("drm/i915/execlists: Delay writing to ELSP until HW has processed the previous write")
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180517115647.17205-1-chris@chris-wilson.co.uk

57877b70

17 5月, 2018 6 次提交

drm/i915: Stop parking the signaler around reset · 3f6e9822