1. 04 Feb 2014, 1 commit
  2. 10 Sep 2013, 1 commit
    • drm/i915: Write RING_TAIL once per-request · 09246732
      Authored by Chris Wilson
      Ignoring the legacy DRI1 code, and a couple of special cases (to be
      discussed later), all access to the ring is mediated through requests.
      The first write to a ring will grab a seqno and mark the ring as having
      an outstanding_lazy_request. Either through explicitly adding a request
      after an execbuffer or through an implicit wait (either by the CPU or by
      a semaphore), that sequence of writes will be terminated with a request.
      So we can elide all the intervening writes to the tail register and
      send the entire command stream to the GPU at once. This will reduce the
      number of *serialising* writes to the tail register by a factor of 3-5
      times (depending upon architecture and number of workarounds, context
      switches, etc involved). This becomes even more noticeable when the
      register write is overloaded with a number of debugging tools. The
      astute reader will wonder if it is then possible to overflow the ring
      with a single command. It is not. When we start a command sequence to
      the ring, we check for available space and issue a wait if there is not
      enough. The ring wait will in this case be forced to flush the
      outstanding register write and then poll the ACTHD for sufficient space
      to continue.
      
      The exceptions to the rule that everything is inside a request are a few
      initialisation cases where we may want to write GPU commands via the CS
      before userspace wakes up and page flips.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
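      
      A minimal user-space sketch of this write-coalescing scheme follows. All
      names here (ring_emit, ring_advance, write_ring_tail, RING_SIZE) are
      illustrative stand-ins, not the actual i915 functions: each emit advances
      only a software tail, and a single serialising register write submits the
      accumulated stream when the request is finalised.
      
      ```c
      #include <stdint.h>
      #include <stdio.h>
      
      #define RING_SIZE 4096u /* illustrative; must be a power of two */
      
      struct ring {
              uint32_t tail;    /* software tail, advanced on every emit */
              uint32_t hw_tail; /* last value written to the tail register */
      };
      
      /* stand-in for the serialising MMIO write the real driver performs */
      static void write_ring_tail(struct ring *ring, uint32_t val)
      {
              printf("MMIO RING_TAIL <- %u\n", val);
              ring->hw_tail = val;
      }
      
      /* each command dword only advances the software copy: no MMIO traffic */
      static void ring_emit(struct ring *ring, uint32_t dword)
      {
              (void)dword; /* the payload would be copied into the ring here */
              ring->tail = (ring->tail + 4) & (RING_SIZE - 1);
      }
      
      /* called once per request: submit everything with a single tail write */
      static void ring_advance(struct ring *ring)
      {
              if (ring->tail != ring->hw_tail)
                      write_ring_tail(ring, ring->tail);
      }
      
      int main(void)
      {
              struct ring r = {0};
      
              for (int i = 0; i < 5; i++)
                      ring_emit(&r, 0); /* five emits, zero register writes */
              ring_advance(&r);         /* one serialising write for all five */
              return 0;
      }
      ```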
  3. 06 Sep 2013, 1 commit
  4. 05 Sep 2013, 2 commits
  5. 04 Sep 2013, 1 commit
  6. 23 Aug 2013, 1 commit
  7. 22 Aug 2013, 1 commit
  8. 11 Jul 2013, 2 commits
    • drm/i915: unify ring irq refcounts (again) · c7113cc3
      Authored by Daniel Vetter
      With the simplified locking there's no reason any more to keep the
      refcounts separate.
      
      v2: Re-add the lost comment that ring->irq_refcount is protected by
      dev_priv->irq_lock.
      Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • drm/i915: kill dev_priv->rps.lock · 59cdb63d
      Authored by Daniel Vetter
      Now that the rps interrupt locking isn't clearly separated (at least
      conceptually) from all the other interrupt locking, having a different
      lock stopped making sense: it protects much more than just the rps
      workqueue it started out with. But with the addition of VECS the
      separation started to blur and resulted in some more complex locking
      for the ring interrupt refcount.
      
      With this we can (again) unify the ringbuffer irq refcounts without
      causing a massive confusion, but that's for the next patch.
      
      v2: Explain better why the rps.lock once made sense and why no longer,
      requested by Ben.
      Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
  9. 13 Jun 2013, 1 commit
  10. 11 Jun 2013, 1 commit
    • drm/i915: Don't count semaphore waits towards a stuck ring · 6274f212
      Authored by Chris Wilson
      If we detect a ring is in a valid wait for another, just let it be.
      Eventually it will either begin to progress again, or the entire system
      will come grinding to a halt and then hangcheck will fire as soon as the
      deadlock is detected.
      
      This error was foretold by Ben in
      commit 05407ff8
      Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Date:   Thu May 30 09:04:29 2013 +0300
      
          drm/i915: detect hang using per ring hangcheck_score
      
      "If ring B is waiting on ring A via semaphore, and ring A is making
      progress, albeit slowly - the hangcheck will fire. The check will
      determine that A is moving, however ring B will appear hung because
      the ACTHD doesn't move. I honestly can't say if that's actually a
      realistic problem to hit; it probably implies the timeout value is too
      low."
      
      v2: Make sure we don't even incur the KICK cost whilst waiting.
      
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65394
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Cc: Ben Widawsky <ben@bwidawsk.net>
      Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
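      
      A hedged sketch of the rule this patch adds; hangcheck_sample and the
      field names are made up for illustration. A ring sitting in a valid
      semaphore wait is neither scored nor kicked; only a ring whose seqno and
      ACTHD both stop moving accumulates strikes towards a hang.
      
      ```c
      #include <stdint.h>
      #include <stdbool.h>
      
      enum ring_verdict { RING_ACTIVE, RING_WAIT_SEMAPHORE, RING_HUNG };
      
      struct ring_hc {
              uint32_t last_seqno;    /* completed seqno at the previous check */
              uint32_t last_acthd;    /* actual head at the previous check */
              int score;              /* strikes accumulated without progress */
              bool in_semaphore_wait; /* valid wait on another ring? */
      };
      
      enum ring_verdict hangcheck_sample(struct ring_hc *r,
                                         uint32_t seqno, uint32_t acthd)
      {
              if (seqno != r->last_seqno) {   /* requests are completing */
                      r->last_seqno = seqno;
                      r->score = 0;
                      return RING_ACTIVE;
              }
              if (r->in_semaphore_wait)       /* just let it be: no score, */
                      return RING_WAIT_SEMAPHORE; /* and no KICK cost either */
              if (acthd == r->last_acthd && ++r->score >= 3)
                      return RING_HUNG;       /* stuck three checks in a row */
              r->last_acthd = acthd;
              return RING_ACTIVE;
      }
      ```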
  11. 07 Jun 2013, 1 commit
  12. 03 Jun 2013, 1 commit
    • drm/i915: detect hang using per ring hangcheck_score · 05407ff8
      Authored by Mika Kuoppala
      Keep track of ring seqno progress and, if no progress is
      detected, declare a hang. Use the actual head (acthd)
      to distinguish a stuck ring from a looping batchbuffer.
      A stuck ring will be kicked to trigger progress.
      
      This commit adds a hard limit for batchbuffer completion time.
      If batchbuffer completion time is more than 4.5 seconds,
      the gpu will be declared hung.
      
      Review comment from Ben which nicely clarifies the semantic change:
      
      "Maybe I'm just stating the functional changes of the patch, but in case
      they were unintended here is what I see as potential issues:
      
      1. "If ring B is waiting on ring A via semaphore, and ring A is making
         progress, albeit slowly - the hangcheck will fire. The check will
         determine that A is moving, however ring B will appear hung because
         the ACTHD doesn't move. I honestly can't say if that's actually a
         realistic problem to hit; it probably implies the timeout value is too
         low.
      
      2. "There's also another corner case on the kick. If the seqno = 2
         (though not stuck), and on the 3rd hangcheck, the ring is stuck, and
         we try to kick it... we don't actually try to find out if the kick
         helped"
      
      v2: use acthd to detect stuck ring from loop (Ben Widawsky)
      
      v3: Use acthd to check when ring needs kicking.
      Declare hang on third time in order to give time for
      kick_ring to take effect.
      
      v4: Update commit msg
      Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
      Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
      [danvet: Paste in Ben's review comment.]
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
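      
      A sketch of the detection scheme, under the same caveat that all names
      are illustrative: seqno progress resets the score, a frozen ACTHD marks
      a stuck ring that gets kicked, and a moving ACTHD with a frozen seqno
      marks a looping batch that is only declared hung at the hard limit.
      
      ```c
      #include <stdint.h>
      #include <stdbool.h>
      #include <stdio.h>
      
      struct ring_hangcheck {
              uint32_t last_seqno;
              uint32_t last_acthd;
              int score;
      };
      
      static void kick_ring(void)
      {
              /* real driver: poke the ring to try to restore progress */
              printf("kicking ring\n");
      }
      
      /* run from a periodic timer; with a ~1.5s period, three strikes gives
       * the ~4.5 second hard limit on batchbuffer completion time */
      bool ring_hung(struct ring_hangcheck *hc, uint32_t seqno, uint32_t acthd)
      {
              if (seqno != hc->last_seqno) {  /* seqno moving: progress */
                      hc->last_seqno = seqno;
                      hc->score = 0;
                      return false;
              }
              if (acthd == hc->last_acthd)    /* head frozen: stuck, not looping */
                      kick_ring();
              hc->last_acthd = acthd;
              return ++hc->score >= 3;        /* declare hang on the third strike */
      }
      ```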
  13. 01 Jun 2013, 7 commits
  14. 06 May 2013, 1 commit
    • drm/i915: put context upon switching · 112522f6
      Authored by Chris Wilson
      In order to be notified of when the context and all of its associated
      objects are idle (for when the context maps to a ppgtt) we need a callback
      from the retire handler. We can arrange this by using the kref_get/put
      of the context for request tracking and by inserting a request to
      demarcate the switch away from the old context.
      
      [Ben: fixed a minor error so the patch compiles, AND s/last_context/from/]
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
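      
      A user-space sketch of the lifetime scheme, assuming invented names
      (context_get/context_put stand in for the kref usage): the outgoing
      context stays referenced until the request marking the switch retires.
      
      ```c
      #include <stdio.h>
      #include <stdlib.h>
      
      /* minimal stand-in for kref-style tracking of a hw context */
      struct hw_context {
              int refcount;
              const char *name;
      };
      
      static void context_get(struct hw_context *ctx) { ctx->refcount++; }
      
      static void context_put(struct hw_context *ctx)
      {
              if (--ctx->refcount == 0) {
                      /* context and its ppgtt objects are now known idle */
                      printf("context %s idle, freeing\n", ctx->name);
                      free(ctx);
              }
      }
      
      /* on a switch, keep a reference to the outgoing context; the real
       * driver drops it from the retire handler of the request that marks
       * the switch, i.e. only once the GPU has actually moved on */
      void switch_context(struct hw_context **curr, struct hw_context *to)
      {
              struct hw_context *from = *curr;
      
              context_get(to); /* pinned for as long as it is current */
              *curr = to;
              if (from)
                      context_put(from); /* deferred via a request in i915 */
      }
      ```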
  15. 19 Dec 2012, 2 commits
  16. 18 Dec 2012, 1 commit
    • drm/i915: Implement workaround for broken CS tlb on i830/845 · b45305fc
      Authored by Daniel Vetter
      Now that Chris Wilson demonstrated that the key for stability on early
      gen 2 is to simply _never_ exchange the physical backing storage of
      batch buffers I've tried a stab at a kernel solution. Doesn't look too
      nefarious imho, now that I don't try to be too clever for my own good
      any more.
      
      v2: After discussing the various techniques, we've decided to always blit
      batches on the suspect devices, but allow userspace to opt out of the
      kernel workaround and assume full responsibility for providing coherent
      batches. The principal reason is that avoiding the blit does improve
      performance in a few key microbenchmarks and also in cairo-trace
      replays.
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      [danvet:
      - Drop the hunk which uses HAS_BROKEN_CS_TLB to implement the ring
        wrap w/a. Suggested by Chris Wilson.
      - Also add the ACTHD check from Chris Wilson for the error state
        dumping, so that we still catch batches when userspace opts out of
        the w/a.]
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
  17. 06 Dec 2012, 1 commit
  18. 04 Dec 2012, 1 commit
  19. 29 Nov 2012, 2 commits
    • drm/i915: Rearrange code to only have a single method for waiting upon the ring · 3e960501
      Authored by Chris Wilson
      Replace the wait for the ring to be clear with the more common wait for
      the ring to be idle. The principal advantage is one less exported
      intel_ring_wait function, and the removal of a hardcoded value.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • drm/i915: Preallocate next seqno before touching the ring · 9d773091
      Authored by Chris Wilson
      Based on the work by Mika Kuoppala, we realised that we need to handle
      seqno wraparound prior to committing our changes to the ring. The most
      obvious point then is to grab the seqno inside intel_ring_begin(), and
      then to reuse that seqno for all ring operations until the next request.
      As intel_ring_begin() can fail, the callers must already be prepared to
      handle such failure and so we can safely add further checks.
      
      This patch looks like it should be split up into the interface
      changes and the tweaks to move seqno wrapping from the execbuffer into
      the core seqno increment. However, I found no easy way to break it into
      incremental steps without introducing further broken behaviour.
      
      v2: Mika found a silly mistake and a subtle error in the existing code;
      inside i915_gem_retire_requests() we were resetting the sync_seqno of
      the target ring based on the seqno from this ring - which are only
      related by the order of their allocation, not retirement. Hence we were
      applying the optimisation that the rings were synchronised too early;
      fortunately the only real casualty there is the handling of seqno
      wrapping.
      
      v3: Do not forget to reset the sync_seqno upon module reinitialisation,
      ala resume.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@intel.com>
      Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=863861
      Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> [v2]
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
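      
      A rough sketch of the ordering, with invented names (ring_begin,
      handle_seqno_wrap; the wrap test below is deliberately simplified): the
      seqno is reserved before any ring write, while failure can still be
      reported cleanly, and is then reused for all ring operations until the
      next request.
      
      ```c
      #include <stdint.h>
      
      struct ring {
              uint32_t outstanding_seqno; /* 0: no request open yet */
              uint32_t next_seqno;        /* next value to hand out */
      };
      
      static int handle_seqno_wrap(struct ring *ring)
      {
              /* real driver: idle the GPU and reset per-ring sync_seqno
               * bookkeeping before reusing low seqno values */
              ring->next_seqno = 1;
              return 0;
      }
      
      /* grab the seqno before any ring writes, while failure is still safe
       * to report; every write until the next request reuses this seqno */
      int ring_begin(struct ring *ring, int num_dwords)
      {
              if (!ring->outstanding_seqno) {
                      if (ring->next_seqno == 0 && handle_seqno_wrap(ring))
                              return -1;
                      ring->outstanding_seqno = ring->next_seqno++;
              }
              (void)num_dwords; /* then wait for ring space as before */
              return 0;
      }
      ```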
  20. 12 Nov 2012, 1 commit
  21. 18 Oct 2012, 1 commit
    • drm/i915: Allow DRM_ROOT_ONLY|DRM_MASTER to submit privileged batchbuffers · d7d4eedd
      Authored by Chris Wilson
      With the introduction of per-process GTT space, the hardware designers
      thought it wise to also limit the ability to write to MMIO space to only
      a "secure" batch buffer. The ability to rewrite registers is the only
      way to program the hardware to perform certain operations like scanline
      waits (required for tear-free windowed updates). So we either have a
      choice of adding an interface to perform those synchronized updates
      inside the kernel, or we permit certain processes the ability to write
      to the "safe" registers from within its command stream. This patch
      exposes the ability to submit a SECURE batch buffer to
      DRM_ROOT_ONLY|DRM_MASTER processes.
      
      v2: Haswell split up bit8 into a ppgtt bit (still bit8) and a security
      bit (bit 13, accidentally not set). Also add a comment explaining why
      secure batches need a global gtt binding.
      
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> (v1)
      [danvet: added hsw fixup.]
      Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
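      
      A sketch of the permission gate, hedged: the flag bit value and the
      struct fields below are illustrative, not the exact uapi definitions.
      
      ```c
      #include <errno.h>
      #include <stdbool.h>
      
      #define EXEC_SECURE (1u << 9) /* illustrative flag bit */
      
      struct drm_file { bool is_master; bool is_root; };
      
      /* a SECURE batch may write MMIO registers, so only allow it for
       * callers that pass the DRM_ROOT_ONLY|DRM_MASTER check; secure
       * batches must also be bound through the global gtt (see above) */
      int check_exec_flags(const struct drm_file *file, unsigned int flags)
      {
              if ((flags & EXEC_SECURE) && !(file->is_root && file->is_master))
                      return -EPERM;
              return 0;
      }
      ```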
  22. 10 Aug 2012, 1 commit
    • drm/i915: Lazily apply the SNB+ seqno w/a · b2eadbc8
      Authored by Chris Wilson
      Avoid the forcewake overhead when simply retiring requests, as often the
      last seen seqno is good enough to satisfy the retirement process and will
      be promptly re-run in any case. Only ensure that we force the coherent
      seqno read when we are explicitly waiting upon a completion event to be
      sure that none go missing, and also for when we are reporting seqno
      values in case of error or debugging.
      
      This greatly reduces the load for userspace using the busy-ioctl to
      track active buffers, for instance halving the CPU used by X in pushing
      the pixels from a software render (flash). The effect will be even more
      magnified with userptr and so providing a zero-copy upload path in that
      instance, or in similar instances where X is simply compositing DRI
      buffers.
      
      v2: Reverse the polarity of the tachyon stream. Daniel suggested that
      'force' was too generic for the parameter name and that 'lazy_coherency'
      better encapsulated the semantics of it being an optimization and its
      purpose. Also notice that gen6_get_seqno() is only used by gen6/7
      chipsets and so the test for IS_GEN6 || IS_GEN7 is redundant in that
      function.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
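      
      A sketch of the lazy_coherency split, with stand-in functions:
      retirement tolerates a slightly stale seqno (it will be re-run
      promptly anyway), while explicit waits force the coherent read and
      pay the forcewake cost.
      
      ```c
      #include <stdint.h>
      #include <stdbool.h>
      
      static uint32_t hws_seqno; /* stand-in for the hardware status page */
      
      static uint32_t read_seqno_fast(void)
      {
              return hws_seqno; /* cheap read, may lag the hardware slightly */
      }
      
      static uint32_t read_seqno_coherent(void)
      {
              /* real driver: take forcewake / force a posting read first */
              return hws_seqno;
      }
      
      /* retire_requests passes lazy_coherency=true; __wait_seqno and the
       * error/debug paths pass false so no completion can go missing */
      uint32_t get_seqno(bool lazy_coherency)
      {
              return lazy_coherency ? read_seqno_fast() : read_seqno_coherent();
      }
      ```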
  23. 26 Jul 2012, 2 commits
  24. 20 Jun 2012, 1 commit
    • drm/i915: disable flushing_list/gpu_write_list · cc889e0f
      Authored by Daniel Vetter
      This is just the minimal patch to disable all this code so that we can
      do decent amounts of QA before we rip it all out.
      
      The complicating thing is that we need to flush the gpu caches after
      the batchbuffer is emitted, which is past the point of no return where
      execbuffer can't fail any more (otherwise we risk submitting the same
      batch multiple times).
      
      Hence we need to add a flag to track whether any caches associated
      with that ring are dirty, and emit the flush in add_request if that's
      the case.
      
      Note that this has quite a few behaviour changes:
      - Caches get flushed/invalidated unconditionally.
      - Invalidation now happens after potential inter-ring sync.
      
      I've bantered around a bit with Chris on irc whether this fixes
      anything, and it might or might not. The only thing clear is that with
      these changes it's much easier to reason about correctness.
      
      Also rip out a lone get_next_request_seqno in the execbuffer
      retire_commands function. I've dug around and I couldn't figure out
      why that is still there; with the outstanding lazy request stuff it
      shouldn't be necessary.
      
      v2: Chris Wilson complained that I also invalidate the read caches
      when flushing after a batchbuffer. Now optimized.
      
      v3: Added some comments to explain the new flushing behaviour.
      
      Cc: Eric Anholt <eric@anholt.net>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
  25. 14 Jun 2012, 3 commits
    • drm/i915: possibly invalidate TLB before context switch · 12b0286f
      Authored by Ben Widawsky
      From http://intellinuxgraphics.org/documentation/SNB/IHD_OS_Vol1_Part3.pdf
      
      [DevSNB] If Flush TLB invalidation Mode is enabled it's the driver's
      responsibility to invalidate the TLBs at least once after the previous
      context switch after any GTT mappings changed (including new GTT
      entries).  This can be done by a pipelined PIPE_CONTROL with TLB inv bit
      set immediately before MI_SET_CONTEXT.
      
      On GEN7 the invalidation mode is explicitly set, but this appears to be
      lacking for GEN6. Since I don't know the history on this, I've decided
      to dynamically read the value at ring init time, and use that value
      throughout.
      
      v2: better comment (daniel)
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
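      
      A heavily hedged sketch of the shape of this: the register bit and the
      emit helpers below are placeholders, not real GEN6 encodings. The mode
      is sampled once at ring init and consulted on every context switch.
      
      ```c
      #include <stdbool.h>
      #include <stdint.h>
      
      /* placeholder bit: does the hw invalidate TLBs on every flush? */
      #define GFX_TLB_INVALIDATE_ALWAYS (1u << 13)
      
      static bool itlb_before_ctx_switch;
      
      /* sample the mode once at ring init and use that value throughout */
      void ring_init_tlb_mode(uint32_t gfx_mode)
      {
              itlb_before_ctx_switch = !(gfx_mode & GFX_TLB_INVALIDATE_ALWAYS);
      }
      
      static void emit_pipe_control_tlb_inv(void)
      {
              /* pipelined PIPE_CONTROL with the TLB invalidate bit set */
      }
      
      static void emit_mi_set_context(uint32_t ctx_gtt_addr)
      {
              (void)ctx_gtt_addr;
      }
      
      void do_context_switch(uint32_t ctx_gtt_addr)
      {
              if (itlb_before_ctx_switch)
                      emit_pipe_control_tlb_inv(); /* per the [DevSNB] note above */
              emit_mi_set_context(ctx_gtt_addr);
      }
      ```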
    • drm/i915: context switch implementation · e0556841
      Authored by Ben Widawsky
      Implement the context switch code as well as the interfaces to do the
      context switch. This patch also doesn't match 1:1 with the RFC patches.
      The main difference is that from Daniel's responses the last context
      object is now stored instead of the last context. This allows us
      to free the context data structure and the context object independently.
      
      There is room for optimization: this code will pin the context object
      until the next context is active. The optimal way to do it is to
      actually pin the object, move it to the active list, do the context
      switch, and then unpin it. This allows the eviction code to actually
      evict the context object if needed.
      
      The context switch code is missing workarounds, they will be implemented
      in future patches.
      
      v2: actually do obj->dirty=1 in switch (daniel)
      Modified comment around above
      Remove flags to context switch (daniel)
      Move mi_set_context code to i915_gem_context.c (daniel)
      Remove seqno, use lazy request instead (daniel)
      
      v3: use i915_gem_request_next_seqno instead of
            outstanding_lazy_request (Daniel)
      remove id's from trace events (Daniel)
      Put the context BO in the instruction domain (Daniel)
      Don't unref the BO if context switch fails (Chris)
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
    • drm/i915: context basic create & destroy · 40521054
      Authored by Ben Widawsky
      Invent an abstraction for a hw context which is passed around through
      the core functions. The main bit a hw context holds is the buffer object
      which backs the context. The rest of the members are just helper
      functions. Specifically the ring member, which could likely go away if
      we decide to never implement whatever other hw context support exists.
      
      Of note here is the introduction of the 64k alignment constraint for the
      BO. If contexts become heavily used, we should consider tweaking this
      down to 4k. Until the contexts are merged and tested a bit though, I
      think 64k is a nice start (based on docs).
      
      Since we don't yet switch contexts, there is really not much complexity
      here. Creation/destruction works pretty much as one would expect. An idr
      is used to generate the context id numbers which are unique per file
      descriptor.
      
      v2: add DRM_DEBUG_DRIVERS to distinguish ENOMEM failures (ben)
      convert a BUG_ON to WARN_ON, default destruction is still fatal (ben)
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
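      
      A user-space sketch of create/destroy with per-fd ids; a plain counter
      stands in for the idr, and malloc stands in for allocating the
      64k-aligned buffer object.
      
      ```c
      #include <stdint.h>
      #include <stdlib.h>
      
      #define CONTEXT_SIZE (64 * 1024) /* 64k, per the alignment note above */
      
      struct hw_context {
              uint32_t id;
              void *obj; /* stand-in for the BO backing the context */
      };
      
      struct file_priv {
              uint32_t next_ctx_id; /* a plain counter standing in for the idr */
      };
      
      /* context ids are unique per file descriptor, as with the idr scheme */
      struct hw_context *create_hw_context(struct file_priv *file)
      {
              struct hw_context *ctx = calloc(1, sizeof(*ctx));
      
              if (!ctx)
                      return NULL;
              ctx->obj = malloc(CONTEXT_SIZE); /* real driver: 64k-aligned BO */
              if (!ctx->obj) {
                      free(ctx);
                      return NULL; /* caller distinguishes the ENOMEM case */
              }
              ctx->id = ++file->next_ctx_id;
              return ctx;
      }
      
      void destroy_hw_context(struct hw_context *ctx)
      {
              free(ctx->obj);
              free(ctx);
      }
      ```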
  26. 20 May 2012, 1 commit
    • drm/i915: Introduce for_each_ring() macro · b4519513
      Authored by Chris Wilson
      In many places we wish to iterate over the rings associated with the
      GPU, so refactor them to use a common macro.
      
      Along the way, there are a few code removals that should be side-effect
      free and some rearrangement which should only have a cosmetic impact,
      such as error-state.
      
      Note that this slightly changes the semantics in the hangcheck code:
      We now always cycle through all enabled rings instead of
      short-circuiting the logic.
      
      v2: Pull in a couple of suggestions from Ben and Daniel for
      intel_ring_initialized() and not removing the warning (just moving them
      to a new home, closer to the error).
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
      [danvet: Added note to commit message about the small behaviour
      change, suggested by Ben Widawsky.]
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
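      
      A self-contained sketch of such an iterator macro; the field and macro
      names approximate the driver's rather than being copied from it. It
      visits only rings that were actually initialized, so callers like
      hangcheck cycle through every enabled ring.
      
      ```c
      #include <stdbool.h>
      #include <stdio.h>
      
      #define NUM_RINGS 3
      
      struct intel_ring { const char *name; bool initialized; };
      struct dev_priv { struct intel_ring ring[NUM_RINGS]; };
      
      /* iterate over enabled rings only; the assignment happens inside the
       * condition so the loop body sees the current ring pointer */
      #define for_each_ring(ring__, dev_priv__, i__) \
              for ((i__) = 0; (i__) < NUM_RINGS; (i__)++) \
                      if (((ring__) = &(dev_priv__)->ring[(i__)])->initialized)
      
      int main(void)
      {
              struct dev_priv dp = {
                      .ring = { { "render", true }, { "bsd", false }, { "blt", true } },
              };
              struct intel_ring *ring;
              int i;
      
              for_each_ring(ring, &dp, i)
                      printf("ring %d: %s\n", i, ring->name);
              return 0; /* prints the render and blt rings, skips bsd */
      }
      ```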
  27. 03 May 2012, 1 commit