提交 · c8be1a5fc5c63b78f86f3c15fd700163ecfed57f · openeuler / Kernel

06 8月, 2019 1 次提交

drm/i915/guc: Prefer intel_guc_is_submission_supported · c8be1a5f

由 Michal Wajdeczko 提交于 8月 04, 2019

No need to use intel_uc_supports_guc_submission(uc) as we
can directly use intel_guc_is_submission_supported(guc)
Signed-off-by: NMichal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190804195052.31140-2-michal.wajdeczko@intel.com

c8be1a5f

04 8月, 2019 1 次提交

drm/i915: Replace struct_mutex for batch pool serialisation · b40d7378

由 Chris Wilson 提交于 8月 04, 2019

Switch to tracking activity via i915_active on individual nodes, only
keeping a list of retired objects in the cache, and reaping the cache
when the engine itself idles.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NMatthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190804124826.30272-2-chris@chris-wilson.co.uk

b40d7378

03 8月, 2019 7 次提交

drm/i915: Hide unshrinkable context objects from the shrinker · 1aff1903

由 Chris Wilson 提交于 8月 02, 2019

The shrinker cannot touch objects used by the contexts (logical state
and ring). Currently we mark those as "pin_global" to let the shrinker
skip over them, however, if we remove them from the shrinker lists
entirely, we don't event have to include them in our shrink accounting.

By keeping the unshrinkable objects in our shrinker tracking, we report
a large number of objects available to be shrunk, and leave the shrinker
deeply unsatisfied when we fail to reclaim those. The shrinker will
persist in trying to reclaim the unavailable objects, forcing the system
into a livelock (not even hitting the dread oomkiller).

v2: Extend unshrinkable protection for perma-pinned scratch and guc
allocations (Tvrtko)
v3: Notice that we should be pinned when marking unshrinkable and so the
link cannot be empty; merge duplicate paths.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NMatthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190802212137.22207-1-chris@chris-wilson.co.uk

1aff1903

drm/i915/wopcm: Don't fail on WOPCM partitioning failure · 6bd0fbe1

由 Michal Wajdeczko 提交于 8月 02, 2019

We don't have to immediately fail on WOPCM partitioning, we can wait
until we will start programming WOPCM registers. This should give us
more options if we decide to restore fallback in case of GuC failures.

v3: rebased
Signed-off-by: NMichal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190802184055.31988-7-michal.wajdeczko@intel.com

6bd0fbe1

drm/i915/uc: Inject probe errors into intel_uc_init_hw · 5d1ef2b4

由 Michal Wajdeczko 提交于 8月 02, 2019

Inject probe errors into intel_uc_init_hw to make sure we
correctly handle any uC initialization failure.

To avoid complains from CI about injected errors use
i915_probe_error to lower message level.

v4: rebased after moving hot fixes moved to separate patches
Signed-off-by: NMichal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> #v1
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190802184055.31988-6-michal.wajdeczko@intel.com

5d1ef2b4

drm/i915/uc: Move GuC error log to uc and release it on fini · 32ff76e8

由 Michal Wajdeczko 提交于 8月 02, 2019

When we fail to load GuC and want to abort probe, we hit:

<7> [229.915779] i915 0000:00:02.0: [drm:intel_uc_init_hw [i915]] GuC initialization failed -6
<7> [229.915813] i915 0000:00:02.0: [drm:i915_gem_init_hw [i915]] Enabling uc failed (-6)
<4> [229.953354] ------------[ cut here ]------------
<4> [229.953355] WARN_ON(dev_priv->mm.shrink_count)
<4> [229.953406] WARNING: CPU: 9 PID: 3287 at drivers/gpu/drm/i915/i915_gem.c:1684 i915_gem_cleanup_early+0xfc/0x110 [i915]
<4> [229.953464] Call Trace:
<4> [229.953489]  i915_driver_late_release+0x19/0x60 [i915]
<4> [229.953514]  i915_driver_probe+0xb82/0x18a0 [i915]
<4> [229.953519]  ? __pm_runtime_resume+0x4f/0x80
<4> [229.953545]  i915_pci_probe+0x43/0x1b0 [i915]
...
<4> [229.962951] ------------[ cut here ]------------
<4> [229.962956] DEBUG_LOCKS_WARN_ON(lock->magic != lock)
<4> [229.962959] WARNING: CPU: 8 PID: 2395 at kernel/locking/mutex.c:912 __mutex_lock+0x750/0x9b0
<4> [229.963091] Call Trace:
<4> [229.963129]  ? i915_vma_destroy+0x86/0x350 [i915]
<4> [229.963166]  ? i915_vma_destroy+0x86/0x350 [i915]
<4> [229.963201]  i915_vma_destroy+0x86/0x350 [i915]
<4> [229.963236]  __i915_gem_free_objects+0xb8/0x510 [i915]
<4> [229.963270]  __i915_gem_free_work+0x5a/0x90 [i915]
<4> [229.963275]  process_one_work+0x245/0x610

as since commit 6f76098f ("drm/i915/uc: Move uC early functions
inside the GT ones") we cleanup uc after gem.

Move captured GuC load error log to uc struct and release it
in intel_uc_fini() instead of intel_uc_driver_late_release()

Note that intel_uc_driver_late_release() is now empty, but
we can leave it as a placeholder for future code.
Signed-off-by: NMichal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190802184055.31988-5-michal.wajdeczko@intel.com

32ff76e8

drm/i915/uc: Reorder firmware status codes · 3243bd09

由 Michal Wajdeczko 提交于 8月 02, 2019

On Gen9 when we try to reload HuC due to GuC upload error, we hit:

<7> [232.025927] [drm:intel_uc_init_hw [i915]] GuC fw load failed: -8; will reset and retry 2 more time(s)
<7> [232.026004] [drm:intel_uc_fw_upload [i915]] HuC fw load i915/kbl_huc_ver02_00_1810.bin
<7> [232.026686] [drm:intel_uc_fw_upload [i915]] HuC fw xfer completed
<6> [232.026688] [drm] HuC: Loaded firmware i915/kbl_huc_ver02_00_1810.bin (version 2.0)
<3> [232.026703] intel_uc_fw_copy_rsa:541 GEM_BUG_ON(!intel_uc_fw_is_available(uc_fw))

as firmware that previously failed to load was wrongly treated as
unavailable since its status code was not matching status check logic.
Signed-off-by: NMichal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190802184055.31988-4-michal.wajdeczko@intel.com

3243bd09

drm/i915/uc: Do full sanitize instead of pure reset · 771051ea

由 Michal Wajdeczko 提交于 8月 02, 2019

On Gen9 when we try to reload HuC due to GuC upload error, we hit:

<7> [229.656688] [drm:intel_uc_init_hw [i915]] GuC fw load failed: -8; will reset and retry 2 more time(s)
<7> [229.656739] [drm:intel_uc_fw_upload [i915]] HuC fw load i915/kbl_huc_ver02_00_1810.bin
<3> [229.656740] intel_uc_fw_upload:425 GEM_BUG_ON(intel_uc_fw_is_loaded(uc_fw))

as we performed only pure reset and didn't sanitized HuC fw status.
Signed-off-by: NMichal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190802184055.31988-3-michal.wajdeczko@intel.com

771051ea

drm/i915: Add i915 to i915_inject_probe_failure · 50d84418

由 Michal Wajdeczko 提交于 8月 02, 2019

With i915 added to i915_inject_probe_failure we can use dedicated
printk when injecting artificial load failure.

Also make this function look like other i915 functions that return
error code and make it more flexible to return any provided error
code instead of previously assumed -ENODEV.
Signed-off-by: NMichal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190802184055.31988-2-michal.wajdeczko@intel.com

50d84418

02 8月, 2019 10 次提交

drm/i915: Allow sharing the idle-barrier from other kernel requests · d8af05ff

由 Chris Wilson 提交于 8月 02, 2019

By placing our idle-barriers in the i915_active fence tree, we expose
those for reuse by other components that are issuing requests along the
kernel_context. Reusing the proto-barrier active_node is perfectly fine
as the new request implies a context-switch, and so an opportune point
to run the idle-barrier. However, the proto-barrier is not equivalent
to a normal active_node and care must be taken to avoid dereferencing the
ERR_PTR used as its request marker.

v2: Comment the more egregious cheek
v3: A glossary!
Reported-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: ce476c80 ("drm/i915: Keep contexts pinned until after the next kernel context switch")
Fixes: a9877da2 ("drm/i915/oa: Reconfigure contexts on the fly")
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190802100015.1281-1-chris@chris-wilson.co.uk

d8af05ff

drm/i915/pmu: Atomically acquire the gt_pm wakeref · 51fbd8de

由 Chris Wilson 提交于 8月 02, 2019

Currently, we only sample if the intel_gt is awake, but we acquire our
own runtime_pm wakeref. Since intel_gt has transitioned to tracking its
own wakeref, we can atomically test and acquire that wakeref instead.

v2: Take engine->wakeref for engine sampling
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190801233616.23007-1-chris@chris-wilson.co.uk

51fbd8de

drm/i915/uc: Stop sanitizing enable_guc modparam · 01158da7

由 Michal Wajdeczko 提交于 8月 01, 2019

As we already track GuC/HuC uses by other means than modparam
there is no point in sanitizing it. Just scan modparam for
major discrepancies between what was requested vs actual.

v2: rebased, reworded info messages
v3: oops
Signed-off-by: NMichal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190801132840.33176-1-michal.wajdeczko@intel.com

01158da7

drm/i915/guc: Use dedicated flag to track submission mode · 724df646

由 Michal Wajdeczko 提交于 7月 31, 2019

Instead of relying on enable_guc modparam to represent actual
GuC submission mode, use dedicated flag and look at modparam
only to check if submission was explicitly disabled by the user.

v2: rebased, simplified condition (Chris)
Signed-off-by: NMichal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190731223321.36436-4-michal.wajdeczko@intel.com

724df646

drm/i915/uc: Consider enable_guc modparam during fw selection · db81bc6e

由 Michal Wajdeczko 提交于 7月 31, 2019

We can use value of enable_guc modparam during firmware path selection
and start using firmware status to see if GuC/HuC is being used.
This is first step to make enable_guc modparam read-only.

v2: rebased, don't care about <0 (Chris)
v3: oops
Signed-off-by: NMichal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190731223321.36436-3-michal.wajdeczko@intel.com

db81bc6e

drm/i915/uc: Rename intel_uc_is_using* into intel_uc_supports* · 57a68c35

由 Michal Wajdeczko 提交于 7月 31, 2019

Rename intel_uc_is_using* into intel_uc_supports* to make clear
distinction from actual state (compare intel_uc_fw_is_running)
Suggested-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NMichal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190731223321.36436-2-michal.wajdeczko@intel.com

57a68c35

drm/i915/gt: Introduce intel_gt_runtime_suspend/resume · 9dfe3459

由 Daniele Ceraolo Spurio 提交于 7月 31, 2019

To be called from the top level runtime functions, to hide the
gt-specific bits (mainly related to intel_uc).

v2: rebased
Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190801005709.34092-3-daniele.ceraolospurio@intel.com

9dfe3459

drm/i915/uc: Move uC early functions inside the GT ones · 6f76098f

由 Daniele Ceraolo Spurio 提交于 7月 31, 2019

uC is a subcomponent of GT, so initialize/clean it as part of it. The
wopcm_init_early doesn't have to be happen before the uC one, but since
in other parts of the code we consider WOPCM first do the same for
consistency.

v2: s/cleanup_early/late_release to match the caller
v3: s/late_release/driver_late_release/ (Chris)
Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> #v1
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190801005709.34092-2-daniele.ceraolospurio@intel.com

6f76098f

drm/i915/gt: Move gt_cleanup_early out of gem_cleanup_early · 6cf72db6

由 Daniele Ceraolo Spurio 提交于 7月 31, 2019

We don't call the init_early function from within the gem code, so we
shouldn't do it for the cleanup either.

v2: while at it, s/gt_cleanup_early/gt_late_release (Chris)
v3: s/late_release/driver_late_release/ (Chris)
Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> #v1
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190801005709.34092-1-daniele.ceraolospurio@intel.com

6cf72db6

drm/i915: Remove lrc default desc from GEM context · a1c9ca22

由 Chris Wilson 提交于 7月 30, 2019

We only compute the lrc_descriptor() on pinning the context, i.e.
infrequently, so we do not benefit from storing the template as the
addressing mode is also fixed for the lifetime of the intel_context.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NPrathap Kumar Valsan <prathap.kumar.valsan@intel.com>
Acked-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190730133035.1977-9-chris@chris-wilson.co.uk

a1c9ca22

01 8月, 2019 1 次提交

drm/i915/execlists: Always clear pending&inflight requests on reset · 10e36489

由 Chris Wilson 提交于 7月 30, 2019

If we skip the reset as we found the engine inactive at the time of the
reset, we still need to clear the residual inflight & pending request
bookkeeping to reflect the current state of HW.

Otherwise, we may end up stuck in a loop like:

<7> [416.490346] hangcheck rcs0
<7> [416.490371] hangcheck 	Awake? 1
<7> [416.490376] hangcheck 	Hangcheck: 8003 ms ago
<7> [416.490380] hangcheck 	Reset count: 0 (global 0)
<7> [416.490383] hangcheck 	Requests:
<7> [416.491210] hangcheck 	RING_START: 0x0017b000
<7> [416.491983] hangcheck 	RING_HEAD:  0x00000048
<7> [416.491992] hangcheck 	RING_TAIL:  0x00000048
<7> [416.492006] hangcheck 	RING_CTL:   0x00000000
<7> [416.492037] hangcheck 	RING_MODE:  0x00000200 [idle]
<7> [416.492044] hangcheck 	RING_IMR: 00000000
<7> [416.492809] hangcheck 	ACTHD:  0x00000000_9ca00048
<7> [416.492824] hangcheck 	BBADDR: 0x00000000_00001004
<7> [416.492838] hangcheck 	DMA_FADDR: 0x00000000_00000000
<7> [416.492845] hangcheck 	IPEIR: 0x00000000
<7> [416.492852] hangcheck 	IPEHR: 0x00000000
<7> [416.492863] hangcheck 	Execlist status: 0x00018001 00000000, entries 12
<7> [416.492869] hangcheck 	Execlist CSB read 1, write 1, tasklet queued? no (enabled)
<7> [416.492938] hangcheck 		Pending[0] ring:{start:0017b000, hwsp:fedf9000, seqno:00016fd6}, rq:  20ffa:16fd6!+  prio=-4094 @ 8307ms: signaled
<7> [416.492972] hangcheck 		Queue priority hint: -4093
<7> [416.492979] hangcheck 		Q  20ffa:16fd8-  prio=-4093 @ 8307ms: [i915]
<7> [416.492985] hangcheck 		Q  20ffa:16fda  prio=-4094 @ 8307ms: [i915]
<7> [416.492990] hangcheck 		Q  20ffa:16fdc  prio=-4094 @ 8307ms: [i915]
<7> [416.492996] hangcheck 		Q  20ffa:16fde  prio=-4094 @ 8307ms: [i915]
<7> [416.493001] hangcheck 		Q  20ffa:16fe0  prio=-4094 @ 8307ms: [i915]
<7> [416.493007] hangcheck 		Q  20ffa:16fe2  prio=-4094 @ 8307ms: [i915]
<7> [416.493013] hangcheck 		Q  20ffa:16fe4  prio=-4094 @ 8307ms: [i915]
<7> [416.493021] hangcheck 		...skipping 21 queued requests...
<7> [416.493027] hangcheck 		Q  20ffa:17010  prio=-4094 @ 8307ms: [i915]
<7> [416.493081] hangcheck HWSP:
<7> [416.493089] hangcheck [0000] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
<7> [416.493094] hangcheck *
<7> [416.493100] hangcheck [0040] 10008002 00000000 10000018 00000000 10000018 00000000 10000001 00000000
<7> [416.493106] hangcheck [0060] 10000018 00000000 10000001 00000000 10000018 00000000 10000001 00000000
<7> [416.493111] hangcheck *
<7> [416.493117] hangcheck [00a0] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000001
<7> [416.493123] hangcheck [00c0] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
<7> [416.493127] hangcheck *
<7> [416.493132] hangcheck Idle? no
<6> [416.512124] i915 0000:00:02.0: GPU HANG: ecode 11:0:0x00000000, hang on rcs0
<6> [416.512205] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
<6> [416.512207] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
<6> [416.512208] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
<6> [416.512210] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
<6> [416.512212] [drm] GPU crash dump saved to /sys/class/drm/card0/error
<5> [416.513602] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
<7> [424.489258] hangcheck rcs0
<7> [424.489263] hangcheck 	Awake? 1
<7> [424.489267] hangcheck 	Hangcheck: 5954 ms ago
<7> [424.489271] hangcheck 	Reset count: 1 (global 0)
<7> [424.489274] hangcheck 	Requests:
<7> [424.490128] hangcheck 	RING_START: 0x00000000
<7> [424.490870] hangcheck 	RING_HEAD:  0x00000000
<7> [424.490877] hangcheck 	RING_TAIL:  0x00000000
<7> [424.490887] hangcheck 	RING_CTL:   0x00000000
<7> [424.490897] hangcheck 	RING_MODE:  0x00000200 [idle]
<7> [424.490904] hangcheck 	RING_IMR: 00000000
<7> [424.490917] hangcheck 	ACTHD:  0x00000000_00000000
<7> [424.490930] hangcheck 	BBADDR: 0x00000000_00000000
<7> [424.490943] hangcheck 	DMA_FADDR: 0x00000000_00000000
<7> [424.490950] hangcheck 	IPEIR: 0x00000000
<7> [424.490956] hangcheck 	IPEHR: 0x00000000
<7> [424.490968] hangcheck 	Execlist status: 0x00000001 00000000, entries 12
<7> [424.490972] hangcheck 	Execlist CSB read 11, write 11, tasklet queued? no (enabled)
<7> [424.490983] hangcheck 		Pending[0] ring:{start:0017b000, hwsp:fedf9000, seqno:00016fd6}, rq:  20ffa:16fd6!+  prio=-4094 @ 16305ms: signaled
<7> [424.490989] hangcheck 		Queue priority hint: -4093
<7> [424.490996] hangcheck 		Q  20ffa:16fd8-  prio=-4093 @ 16305ms: [i915]
<7> [424.491001] hangcheck 		Q  20ffa:16fda  prio=-4094 @ 16305ms: [i915]
<7> [424.491006] hangcheck 		Q  20ffa:16fdc  prio=-4094 @ 16305ms: [i915]
<7> [424.491011] hangcheck 		Q  20ffa:16fde  prio=-4094 @ 16305ms: [i915]
<7> [424.491016] hangcheck 		Q  20ffa:16fe0  prio=-4094 @ 16305ms: [i915]
<7> [424.491022] hangcheck 		Q  20ffa:16fe2  prio=-4094 @ 16305ms: [i915]
<7> [424.491048] hangcheck 		Q  20ffa:16fe4  prio=-4094 @ 16305ms: [i915]
<7> [424.491057] hangcheck 		...skipping 21 queued requests...
<7> [424.491063] hangcheck 		Q  20ffa:17010  prio=-4094 @ 16305ms: [i915]
<7> [424.491095] hangcheck HWSP:
<7> [424.491102] hangcheck [0000] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
<7> [424.491106] hangcheck *
<7> [424.491113] hangcheck [0040] 10008002 00000000 10000018 00000000 10000018 00000000 10000001 00000000
<7> [424.491118] hangcheck [0060] 10000018 00000000 10000001 00000000 10000018 00000000 10000001 00000000
<7> [424.491122] hangcheck *
<7> [424.491127] hangcheck [00a0] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 0000000b
<7> [424.491133] hangcheck [00c0] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
<7> [424.491136] hangcheck *
<7> [424.491141] hangcheck Idle? no
<5> [424.491834] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0

Where not having cleared the pending array on reset, it persists
indefinitely.

Fixes: fff8102a ("drm/i915/execlists: Process interrupted context on reset")
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NAndi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190730133035.1977-2-chris@chris-wilson.co.uk

10e36489

31 7月, 2019 8 次提交

drm/i915: Move MOCS setup to intel_mocs.c · 1b6c3c6d

由 Tvrtko Ursulin 提交于 7月 30, 2019

Hide the details of MOCS setup from i915_gem by moving both current calls
into one in intel_mocs_init.

Cc: Stuart Summers <stuart.summers@intel.com>
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: NLucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: NStuart Summers <stuart.summers@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190713010940.17711-21-lucas.demarchi@intel.com
Link: https://patchwork.freedesktop.org/patch/msgid/20190730180407.5993-6-lucas.demarchi@intel.com

1b6c3c6d

drm/i915/tgl: Tigerlake only has global MOCS registers · a7a7a0e6

由 Michel Thierry 提交于 7月 30, 2019

Until Icelake, each engine had its own set of 64 MOCS registers. In
order to simplify, Tigerlake moves to only 64 Global MOCS registers,
which are no longer part of the engine context. Since these registers
are now global, they also only need to be initialized once.

>From Gen12 onwards, MOCS must specify the target cache (3:2) and LRU
management (5:4) fields and cannot be programmed to 'use the value from
Private PAT', because these fields are no longer part of the PPAT. Also
cacheability control (1:0) field has changed, 00 no longer means 'use
controls from page table', but uncacheable (UC).

v2 (Lucas):
    - Move the changes to the fault registers to a separate commit - the
      old ones overlap with the range used by the new global MOCS
      (requested by Daniele)
v3 (Lucas):
    - Clarify comment about setting the unused entries to the same value
      of index 0, that is the invalid entry (requested by Daniele)
    - Move changes to DONE_REG and ERROR_GEN6 to a separate commit
      (requested by Daniele)

Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: NMichel Thierry <michel.thierry@intel.com>
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: NLucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: NTomasz Lis <tomasz.lis@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190730180407.5993-5-lucas.demarchi@intel.com

a7a7a0e6

drm/i915/tgl: Define MOCS entries for Tigerlake · 2ddf9921

由 Tomasz Lis 提交于 7月 30, 2019

The MOCS table is published as part of bspec, and versioned. Entries
are supposed to never be modified, but new ones can be added. Adding
entries increases table version. The patch includes version 1 entries.

Two of the 3 legacy entries used for gen9 are no longer expected to work.
Although we are changing the gen11 table, those changes are supposed to
be backward compatible since we are only touching previously undefined
entries.

v2: Add the missing entries in 49-51 range and replace "HW reserved"
    terminology to what it actually is: L1 is implicitly enabled
    (from Daniele)
v3: Use a different table for Tiger Lake since entries 0 and 1 are not
    the same (from Daniele)

Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: NTomasz Lis <tomasz.lis@intel.com>
Signed-off-by: NLucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190730180407.5993-4-lucas.demarchi@intel.com

2ddf9921

drm/i915/tgl: Move fault registers to their new offset · 91b59cd9

由 Lucas De Marchi 提交于 7月 30, 2019

The fault registers moved to another offset. The old location is now
taken by the global MOCS registers, to be added in a follow up change.

Based on previous patches by Michel Thierry <michel.thierry@intel.com>.
Signed-off-by: NLucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190730180407.5993-2-lucas.demarchi@intel.com

91b59cd9

drm/i915: remove dangling forward declaration · 900c9173

由 Lucas De Marchi 提交于 7月 30, 2019

Commit 20a7f2fc ("drm/i915: Convert intel_mocs_init_l3cc_table to
intel_gt") removed the only user.
Signed-off-by: NLucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190730182614.14379-1-lucas.demarchi@intel.com

900c9173

drm/i915/uc: Move uC WOPCM setup in uc_init_hw · 63064d82

由 Daniele Ceraolo Spurio 提交于 7月 30, 2019

The register we write are not WOPCM regs but uC ones related to how
GuC and HuC are going to use the WOPCM, so it makes logical sense
for them to be programmed as part of uc_init_hw. The WOPCM map on the
other side is not uC-specific (although that is our main use-case), so
keep that separate.

v2: move write_and_verify to uncore, fix log, re-use err_out tag,
    add intel_wopcm_guc_base, fix log
Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NMichal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190730230743.19542-2-daniele.ceraolospurio@intel.com

63064d82

drm/i915/uc: Don't enable communication twice on resume · 602776f9

由 Daniele Ceraolo Spurio 提交于 7月 30, 2019

When coming out of S3/S4 we sanitize and re-init the HW, which includes
enabling communication during uc_init_hw. We therefore don't want to do
that again in uc_resume and can just tell GuC to reload its state.

v2: split uc_resume and uc_runtime_resume to match the suspend
functions and to better differentiate the expected state in the 2
scenarios (Chris)
Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NMichal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190730230743.19542-1-daniele.ceraolospurio@intel.com

602776f9

drm/i915/selftests: Pass intel_context to igt_spinner · f277bc0c

由 Chris Wilson 提交于 7月 31, 2019

Teach igt_spinner to only use our internal structs, decoupling the
interface from the GEM contexts. This makes it easier to avoid
requiring ce->gem_context back references for kernel_context that may
have them in future.

v2: Lift engine lock to verify_wa() caller.
v3: Less than v2, but more so
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190731081126.9139-1-chris@chris-wilson.co.uk

f277bc0c

30 7月, 2019 5 次提交

drm/i915/gt: Provide a local intel_context.vm · f5d974f9

由 Chris Wilson 提交于 7月 30, 2019

Track the currently bound address space used by the HW context. Minor
conversions to use the local intel_context.vm are made, leaving behind
some more surgery required to make intel_context the primary through the
selftests.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190730143209.4549-2-chris@chris-wilson.co.uk

f5d974f9

drm/i915: Move aliasing_ppgtt underneath its i915_ggtt · c082afac

由 Chris Wilson 提交于 7月 30, 2019

The aliasing_ppgtt provides a PIN_USER alias for the global gtt, so move
it under the i915_ggtt to simplify later transformations to enable
intel_context.vm.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190730143209.4549-1-chris@chris-wilson.co.uk

c082afac

drm/i915: Inline engine->init_context into its caller · a5627721

由 Chris Wilson 提交于 7月 29, 2019

We only use the init_context vfunc once while recording the default
context state, and we use the same sequence in each backend (eliding
steps that do not apply). Remove the vfunc for simplicity and
de-duplication.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190729113720.24830-1-chris@chris-wilson.co.uk

a5627721

drm/i915: use upstream version of header tests · 1032a2af

由 Jani Nikula 提交于 7月 29, 2019

Throw out our local hacks of header tests now that the more generic
kbuild versions are upstream.

At least for now, continue to keep the header tests behind
CONFIG_DRM_I915_WERROR=y knob.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NJani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190729140847.18557-1-jani.nikula@intel.com

1032a2af

drm/i915/uc: Don't fail on HuC firmware failure · 301efe96

由 Michal Wajdeczko 提交于 7月 29, 2019

HuC is usually not a critical component, so we can safely ignore
firmware load or authentication failures unless HuC was explicitly
requested by the user.

v2: add convenient way to disable loading (Chris)
Signed-off-by: NMichal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> #v1
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190729112612.37476-1-michal.wajdeczko@intel.com

301efe96

29 7月, 2019 1 次提交

drm/i915/selftests: Careful not to flush hang_fini on error setups · 76c5399f

由 Chris Wilson 提交于 7月 29, 2019

Smatch spotted that we test at the start of hang_fini for a valid (h->gt
is only set after a request is created) but then used it regardless
later on.

v2: Alternatively, we do not need to check as we now always prime h->gt
in hang_init()

References: cb823ed9 ("drm/i915/gt: Use intel_gt as the primary object for handling resets")
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190729085944.2179-1-chris@chris-wilson.co.uk

76c5399f

27 7月, 2019 6 次提交

drm/i915/uc: Fixup kerneldoc after params were flipped and renamed · 62336cc6

由 Chris Wilson 提交于 7月 27, 2019

drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c:194: warning: Function parameter or member 'i915' not described in 'intel_uc_fw_fetch'
drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c:194: warning: Excess function parameter 'dev_priv' description in 'intel_uc_fw_fetch'

Fixes: 97dee74b ("drm/i915/uc: Reorder params in intel_uc_fw_fetch")
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: NMichal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190727101055.5300-1-chris@chris-wilson.co.uk

62336cc6

drm/i915/uc: Remove redundant RSA offset definition · 08f0e4a7

由 Michal Wajdeczko 提交于 7月 26, 2019

According to Firmware layout definition, RSA signature is located
after CSS header and uCode so actual RSA offset in the blob can be
easily calculated when needed (and we need it only once).
Signed-off-by: NMichal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190726184212.1836-3-michal.wajdeczko@intel.com

08f0e4a7

drm/i915/uc: Remove redundant ucode offset definition · 5de51fa0

由 Michal Wajdeczko 提交于 7月 26, 2019

According to Firmware layout definition, uCode is located right
after CSS header, so ucode offset is always same as header size.
Signed-off-by: NMichal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190726184212.1836-2-michal.wajdeczko@intel.com

5de51fa0

drm/i915/uc: Remove redundant header_offset/size definitions · 3a8c63d2

由 Michal Wajdeczko 提交于 7月 26, 2019

According to Firmware layout definition, CSS header is located
in front of the firmware blob, so header offset is always 0.
Similarly, size of the CSS header is constant and currently
used version is exactly 128.

While here, move type/status enums up and keep them together.

v2: use sizeof consistently (Daniele), update commit message
Signed-off-by: NMichal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190726184212.1836-1-michal.wajdeczko@intel.com

3a8c63d2

drm/i915/gt: Add to timeline requires the timeline mutex · 340c4c8d

由 Chris Wilson 提交于 7月 25, 2019

Modifying a remote context requires careful serialisation with requests
on that context, and that serialisation requires us to take their
timeline->mutex. Make it so.

Note that while struct_mutex rules, we can't create more than one
request in parallel, but that age is soon coming to an end.

v2: Though it doesn't affect the current users, contexts may share
timelines so check if we already hold the right mutex.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190725131447.27515-1-chris@chris-wilson.co.uk

340c4c8d

drm/i915/uc: Don't sanitize guc_log_level modparam · f91bf738

由 Michal Wajdeczko 提交于 7月 25, 2019

We are already storing runtime value of log level in private
field, so there is no need to modify modparam.
Signed-off-by: NMichal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190725205106.36148-1-michal.wajdeczko@intel.com

f91bf738

openeuler / Kernel 接近 2 年 前同步成功

openeuler / Kernel
接近 2 年前同步成功