提交 · fc4c79c37e822719cab447a448af0b48f4a52418 · openanolis / cloud-kernel

12 10月, 2016 5 次提交

drm/i915: Consolidate error object printing · fc4c79c3

由 Chris Wilson 提交于 10月 12, 2016

Leave all the pretty printing to userspace and simplify the error
capture to only have a single common object printer. It makes the kernel
code more compact, and the refactoring allows us to apply more complex
transformations like compressing the output.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161012090522.367-5-chris@chris-wilson.co.uk

fc4c79c3

drm/i915: Always use the GTT for error capture · 95374d75

由 Chris Wilson 提交于 10月 12, 2016

Since the GTT provides universal access to any GPU page, we can use it
to reduce our plethora of read methods to just one. It also has the
important characteristic of being exactly what the GPU sees - if there
are incoherency problems, seeing the batch as executed (rather than as
trapped inside the cpu cache) is important.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161012090522.367-4-chris@chris-wilson.co.uk

95374d75

drm/i915: Stop the machine whilst capturing the GPU crash dump · 9f267eb8

由 Chris Wilson 提交于 10月 12, 2016

The error state is purposefully racy as we expect it to be called at any
time and so have avoided any locking whilst capturing the crash dump.
However, with multi-engine GPUs and multiple CPUs, those races can
manifest into OOPSes as we attempt to chase dangling pointers freed on
other CPUs. Under discussion are lots of ways to slow down normal
operation in order to protect the post-mortem error capture, but what it
we take the opposite approach and freeze the machine whilst the error
capture runs (note the GPU may still running, but as long as we don't
process any of the results the driver's bookkeeping will be static).

Note that by of itself, this is not a complete fix. It also depends on
the compiler barriers in list_add/list_del to prevent traversing the
lists into the void. We also depend that we only require state from
carefully controlled sources - i.e. all the state we require for
post-mortem debugging should be reachable from the request itself so
that we only have to worry about retrieving the request carefully. Once
we have the request, we know that all pointers from it are intact.

v2: Avoid drm_clflush_pages() inside stop_machine() as it may use
stop_machine() itself for its wbinvd fallback.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Acked-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20161012090522.367-3-chris@chris-wilson.co.uk

9f267eb8

drm/i915: Allow disabling error capture · 98a2f411

由 Chris Wilson 提交于 10月 12, 2016

We currently capture the GPU state after we detect a hang. This is vital
for us to both triage and debug hangs in the wild (post-mortem
debugging). However, it comes at the cost of running some potentially
dangerous code (since it has to make very few assumption about the state
of the driver) that is quite resource intensive.

This patch introduces both a method to disable error capture at runtime
(for users who hit bugs at runtime and need a workaround) and to disable
error capture at compiletime (for realtime users who want to minimise
any possible latency, and never require error capture, saving ~30k of
code). The cost is that we now have to be wary of (and test!) a kconfig
flag and a module parameter. The effect of the module parameter is easy
to verify through code inspection and runtime testing, but a kconfig flag
needs regular compile checking.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: NJani Nikula <jani.nikula@linux.intel.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch
Link: http://patchwork.freedesktop.org/patch/msgid/20161012090522.367-2-chris@chris-wilson.co.uk

98a2f411

drm/i915: Move common code out of i915_gpu_error.c · 0e704476

由 Chris Wilson 提交于 10月 12, 2016

In the next patch, I want to conditionally compile i915_gpu_error.c and
that requires moving the functions used by debug out of
i915_gpu_error.c!
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161012090522.367-1-chris@chris-wilson.co.uk

0e704476

05 10月, 2016 2 次提交

drm/i915: Reduce trickery in DEV_INFO_FOR_EACH_FLAG · 604db650

由 Joonas Lahtinen 提交于 10月 05, 2016

Get rid of SEP_SEMICOLON and SEP_BLANK in DEV_INFO_FOR_EACH_FLAG.
Consolidate the debug output so that instead of one huge line with
"cap1,cap2,capN" each capability is split to own line and displayed
as "capN: [yes|no]" to make the dumps more historically informative.

v2:
- Do not break auto-indent by keeping semicolon after macro (Jani)
- Consolidate and use yesno() in all locations (Chris)

Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>

604db650

drm/i915: Show bounds of active request in the ring on GPU hang · cdb324bd

由 Chris Wilson 提交于 10月 04, 2016

Include the position of the active request in the ring, and display that
alongside the current RING registers (on a GPU hang).
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161004201132.21801-6-chris@chris-wilson.co.uk

cdb324bd

21 9月, 2016 2 次提交

drm/i915: Try to print INSTDONE bits for all slice/subslice · f9e61372

由 Ben Widawsky 提交于 9月 20, 2016

v2: (Imre)
- Access only subslices that are known to exist.
- Reset explicitly the MCR selector to slice/sub-slice ID 0 after the
  readout.
- Use the subslice INSTDONE bits for the hangcheck/subunits-stuck
  detection too.
- Take the uncore lock for the MCR-select/subslice-readout sequence.
Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
Signed-off-by: NImre Deak <imre.deak@intel.com>
Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1474379673-28326-2-git-send-email-imre.deak@intel.com

f9e61372

drm/i915: Cleanup instdone collection · d636951e

由 Ben Widawsky 提交于 9月 20, 2016

Consolidate the instdone logic so we can get a bit fancier. This patch also
removes the duplicated print of INSTDONE[0].

v2: (Imre)
- Rebased on top of hangcheck INSTDONE changes.
- Move all INSTDONE registers into a single struct, store it within the
  engine error struct during error capturing.
Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
Signed-off-by: NImre Deak <imre.deak@intel.com>
Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1474379673-28326-1-git-send-email-imre.deak@intel.com

d636951e

08 9月, 2016 1 次提交

drm/i915: Make HWS_NEEDS_PHYSICAL the exception · 3177659a

由 Carlos Santa 提交于 8月 17, 2016

Make the .hws_needs_physical the exception by switching the flag
on earlier platforms since they are fewer to support. Remove the flag on
later GPUs hardware since they all use GTT hws by default.

Switch the logic as well in the driver to reflect this change
Signed-off-by: NCarlos Santa <carlos.santa@intel.com>
Reviewed-by: NRodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: NRodrigo Vivi <rodrigo.vivi@intel.com>

3177659a

06 9月, 2016 1 次提交

drm/i915: Don't wait for a spinlock inside error capture · 19eb9189

由 Chris Wilson 提交于 9月 06, 2016

If we can't grab the breadcrumb's spinlock, possibly due to a driver
deadlock inside the waiters, ignore them. Like hangcheck, error
capturing must work no matter how the driver/GPU dies.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20160906073844.22561-1-chris@chris-wilson.co.ukReviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>

19eb9189

22 8月, 2016 1 次提交

drm/i915: pdev cleanup · 52a05c30

由 David Weinehall 提交于 8月 22, 2016

In an effort to simplify things for a future push of dev_priv instead
of dev wherever possible, always take pdev via dev_priv where
feasible, eliminating the direct access from dev. Right now this
only eliminates a few cases of dev, but it also obviates that we pass
dev into a lot of functions where dev_priv would be the more obvious
choice.

v2: Fixed one more place missing in the previous patch set
Signed-off-by: NDavid Weinehall <david.weinehall@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160822103245.24069-5-david.weinehall@linux.intel.comReviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>

52a05c30

20 8月, 2016 1 次提交

drm/i915: Embed the io-mapping struct inside drm_i915_private · f7bbe788

由 Chris Wilson 提交于 8月 19, 2016

As io_mapping.h now always allocates the struct, we can avoid that
allocation and extra pointer dance by embedding the struct inside
drm_i915_private
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160819155428.1670-5-chris@chris-wilson.co.uk

f7bbe788

19 8月, 2016 1 次提交

drm/i915: Move fence tracking from object to vma · 49ef5294

由 Chris Wilson 提交于 8月 18, 2016

In order to handle tiled partial GTT mmappings, we need to associate the
fence with an individual vma.

v2: A couple of silly drops replaced spotted by Joonas
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-21-chris@chris-wilson.co.uk

49ef5294

15 8月, 2016 14 次提交

drm/i915: Record the RING_MODE register for post-mortem debugging · 21a2c58a