提交 · 0d8ee5ba8db46c1c833f212a85f8f6d79286722a · openeuler / Kernel

24 9月, 2021 4 次提交

drm/i915: Don't back up pinned LMEM context images and rings during suspend · 0d8ee5ba

由 Thomas Hellström 提交于 9月 22, 2021

Pinned context images are now reset during resume. Don't back them up,
and assuming that rings can be assumed empty at suspend, don't back them
up either.

Introduce a new object flag, I915_BO_ALLOC_PM_VOLATILE meaning that an
object is allowed to lose its content on suspend.

v3:
- Slight documentation clarification (Matthew Auld)
Signed-off-by: NThomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: NMatthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210922062527.865433-7-thomas.hellstrom@linux.intel.com

0d8ee5ba

drm/i915/gt: Register the migrate contexts with their engines · 3e42cc61

由 Thomas Hellström 提交于 9月 22, 2021

Pinned contexts, like the migrate contexts need reset after resume
since their context image may have been lost. Also the GuC needs to
register pinned contexts.

Add a list to struct intel_engine_cs where we add all pinned contexts on
creation, and traverse that list at resume time to reset the pinned
contexts.

This fixes the kms_pipe_crc_basic@suspend-read-crc-pipe-a selftest for now,
but proper LMEM backup / restore is needed for full suspend functionality.
However, note that even with full LMEM backup / restore it may be
desirable to keep the reset since backing up the migrate context images
must happen using memcpy() after the migrate context has become inactive,
and for performance- and other reasons we want to avoid memcpy() from
LMEM.

Also traverse the list at guc_init_lrc_mapping() calling
guc_kernel_context_pin() for the pinned contexts, like is already done
for the kernel context.

v2:
- Don't reset the contexts on each __engine_unpark() but rather at
  resume time (Chris Wilson).
v3:
- Reset contexts in the engine sanitize callback. (Chris Wilson)

Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Brost Matthew <matthew.brost@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NThomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: NMatthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210922062527.865433-6-thomas.hellstrom@linux.intel.com

3e42cc61

drm/i915 Implement LMEM backup and restore for suspend / resume · c56ce956

由 Thomas Hellström 提交于 9月 22, 2021

Just evict unpinned objects to system. For pinned LMEM objects,
make a backup system object and blit the contents to that.

Backup is performed in three steps,
1: Opportunistically evict evictable objects using the gpu blitter.
2: After gt idle, evict evictable objects using the gpu blitter. This will
be modified in an upcoming patch to backup pinned objects that are not used
by the blitter itself.
3: Backup remaining pinned objects using memcpy.

Also move uC suspend to after 2) to make sure we have a functional GuC
during 2) if using GuC submission.

v2:
- Major refactor to make sure gem_exec_suspend@hang-SX subtests work, and
suspend / resume works with a slightly modified GuC submission enabling
patch series.

v3:
- Fix a potential use-after-free (Matthew Auld)
- Use i915_gem_object_create_shmem() instead of
i915_gem_object_create_region (Matthew Auld)
- Minor simplifications (Matthew Auld)
- Fix up kerneldoc for i195_ttm_restore_region().
- Final lmem_suspend() call moved to i915_gem_backup_suspend from
i915_gem_suspend_late, since the latter gets called at driver unload
and we don't unnecessarily want to run it at that time.

v4:
- Interface change of ttm- & lmem suspend / resume functions to use
flags rather than bools. (Matthew Auld)
- Completely drop the i915_gem_backup_suspend change (Matthew Auld)
Signed-off-by: NThomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: NMatthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210922062527.865433-5-thomas.hellstrom@linux.intel.com

c56ce956

drm/i915/gt: Increase suspend timeout · 81387fc4

由 Thomas Hellström 提交于 9月 22, 2021

With GuC submission on DG1, the execution of the requests times out
for the gem_exec_suspend igt test case after executing around 800-900
of 1000 submitted requests.

Given the time we allow elsewhere for fences to signal (in the order of
seconds), increase the timeout before we mark the gt wedged and proceed.
Signed-off-by: NThomas Hellström <thomas.hellstrom@linux.intel.com>
Acked-by: NMatthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210922062527.865433-4-thomas.hellstrom@linux.intel.com

81387fc4

23 9月, 2021 1 次提交

drm/i915/guc, docs: Fix pdfdocs build error by removing nested grid · 017792a0

由 Akira Yokosawa 提交于 9月 20, 2021

Nested grids in grid-table cells are not specified as proper ReST
constructs.
Commit 572f2a5c ("drm/i915/guc: Update firmware to v62.0.0")
added a couple of kerneldoc tables of the form:

  +---+-------+------------------------------------------------------+
  | 1 |  31:0 |  +------------------------------------------------+  |
  +---+-------+  |                                                |  |
  |...|       |  |  Embedded `HXG Message`_                       |  |
  +---+-------+  |                                                |  |
  | n |  31:0 |  +------------------------------------------------+  |
  +---+-------+------------------------------------------------------+

For "make htmldocs", they happen to work as one might expect,
but they are incompatible with "make latexdocs" and "make pdfdocs",
and cause the generated gpu.tex file to become incomplete and
unbuildable by xelatex.

Restore the compatibility by removing those nested grids in the tables.

Size comparison of generated gpu.tex:

                  Sphinx 2.4.4  Sphinx 4.2.0
  v5.14:               3238686       3841631
  v5.15-rc1:            376270        432729
  with this fix:       3377846       3998095

Fixes: 572f2a5c ("drm/i915/guc: Update firmware to v62.0.0")
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: NAkira Yokosawa <akiyks@gmail.com>
Signed-off-by: NJani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/4a227569-074f-c501-58bb-d0d8f60a8ae9@gmail.com

017792a0

21 9月, 2021 5 次提交

drm/i915/xehp: Check new fuse bits for SFC availability · ff04f8be

由 Matt Roper 提交于 9月 17, 2021

Xe_HP adds some new bits to the FUSE1 register to let us know whether a
given SFC unit is present.  We should take this into account while
initializing SFC availability to our VCS and VECS engines.

While we're at it, update the FUSE1 register definition to use
REG_GENMASK / REG_FIELD_GET notation.

Note that, the bspec confusingly names the fuse bits "disable" despite
the register reflecting the *enable* status of the SFC units.  The
original architecture documents which the bspec is based on do properly
name this field "SFC_ENABLE."

Bspec: 52543
Cc: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: NMatt Roper <matthew.d.roper@intel.com>
Reviewed-by: NJosé Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210917161203.812251-2-matthew.d.roper@intel.com

ff04f8be

drm/i915/guc: Enable GuC submission by default on DG1 · 9175ffff

由 Matthew Brost 提交于 9月 16, 2021

Enable GuC submission by default on DG1
Signed-off-by: NMatthew Brost <matthew.brost@intel.com>
Reviewed-by: NJohn Harrison <John.C.Harrison@Intel.com>
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210916162819.27848-5-matthew.brost@intel.com

9175ffff

drm/i915/guc: Add DG1 GuC / HuC firmware defs · 87ba15d6

由 Daniele Ceraolo Spurio 提交于 9月 16, 2021

Add DG1 GuC / HuC firmware defs
Signed-off-by: NMatthew Brost <matthew.brost@intel.com>
Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: NMatthew Brost <matthew.brost@intel.com>
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210916162819.27848-4-matthew.brost@intel.com

87ba15d6

drm/i915/guc: put all guc objects in lmem when available · 7acbbc7c

由 Daniele Ceraolo Spurio 提交于 9月 16, 2021

The firmware binary has to be loaded from lmem and the recommendation is
to put all other objects in there as well. Note that we don't fall back
to system memory if the allocation in lmem fails because all objects are
allocated during driver load and if we have issues with lmem at that point
something is seriously wrong with the system, so no point in trying to
handle it.

Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Cc: Radoslaw Szwichtenberg <radoslaw.szwichtenberg@intel.com>
Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: NMatthew Brost <matthew.brost@intel.com>
Reviewed-by: NMatthew Brost <matthew.brost@intel.com>
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210916162819.27848-3-matthew.brost@intel.com

7acbbc7c

drm/i915: Do not define vma on stack · ea97e44f

由 Venkata Sandeep Dhanalakota 提交于 9月 16, 2021

Defining vma on stack can cause stack overflow, if
vma gets populated with new fields.

v2:
 (Daniel Vetter)
  - Add kerneldoc for new field

Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: NVenkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
Signed-off-by: NMatthew Brost <matthew.brost@intel.com>
Reviewed-by: NMatthew Brost <matthew.brost@intel.com>
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210916162819.27848-2-matthew.brost@intel.com

ea97e44f

20 9月, 2021 2 次提交

drm/i915/gt: Add "intel_" as prefix in set_mocs_index() · 53718bff

由 Ayaz A Siddiqui 提交于 9月 16, 2021

Adding missing "intel_" prefix in set_mocs_index().

Fixes: b62aa57e ("drm/i915/gt: Add support of mocs propagation")
Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: NAyaz A Siddiqui <ayaz.siddiqui@intel.com>
Reviewed-by: NMatt Roper <matthew.d.roper@intel.com>
Signed-off-by: NMatt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210916062736.1733587-1-ayaz.siddiqui@intel.com

53718bff

drm/i915: Make wa list per-gt · d0a65249

由 Venkata Sandeep Dhanalakota 提交于 9月 17, 2021

Support for multiple GT's within a single i915 device will be arriving
soon. Since each GT may have its own fusing and require different
workarounds, we need to make the GT workaround functions and multicast
steering setup per-gt.

Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: NVenkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
Signed-off-by: NMatt Roper <matthew.d.roper@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210917170845.836358-1-matthew.d.roper@intel.com

d0a65249

19 9月, 2021 4 次提交

drm/i915: deduplicate frequency dump on debugfs · d0c56031

由 Lucas De Marchi 提交于 9月 17, 2021

Although commit 9dd4b065 ("drm/i915/gt: Move pm debug files into a
gt aware debugfs") says it was moving debug files to gt/, the
i915_frequency_info file was left behind and its implementation copied
into drivers/gpu/drm/i915/gt/debugfs_gt_pm.c. Over time we had several
patches having to change both places to keep them in sync (and some
patches failing to do so). The initial idea was to remove
i915_frequency_info, but there are user space tools using it. From a
quick code search there are other scripts and test tools besides igt, so
it's not simply updating igt to get rid of the older file.

Here we export a function using drm_printer as parameter and make
both show() implementations to call this same function. Aside from a few
variable name differences, for i915_frequency_info this brings a few
lines that were not previously printed: RP UP EI, RP UP THRESHOLD, RP
DOWN THRESHOLD and RP DOWN EI. These came in as part of
commit 9c878557 ("drm/i915/gt: Use the RPM config register to
determine clk frequencies"), which didn't change both places.
Signed-off-by: NLucas De Marchi <lucas.demarchi@intel.com>
Acked-by: NJani Nikula <jani.nikula@intel.com>
Reviewed-by: NMatt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210918025754.1254705-4-lucas.demarchi@intel.com

d0c56031

drm/i915: rename debugfs_gt_pm files · 23f6a829

由 Lucas De Marchi 提交于 9月 17, 2021

We shouldn't be using debugfs_ namespace for this functionality. Rename
debugfs_gt_pm.[ch] to intel_gt_pm_debugfs.[ch] and then make
functions, defines and structs follow suit.
Signed-off-by: NLucas De Marchi <lucas.demarchi@intel.com>
Acked-by: NJani Nikula <jani.nikula@intel.com>
Reviewed-by: NMatt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210918025754.1254705-3-lucas.demarchi@intel.com

23f6a829

drm/i915: rename debugfs_engines files · 00142bce

由 Lucas De Marchi 提交于 9月 17, 2021

We shouldn't be using debugfs_ namespace for this functionality. Rename
debugfs_engines.[ch] to intel_gt_engines_debugfs.[ch] and then make
functions, defines and structs follow suit.
Signed-off-by: NLucas De Marchi <lucas.demarchi@intel.com>
Acked-by: NJani Nikula <jani.nikula@intel.com>
Reviewed-by: NMatt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210918025754.1254705-2-lucas.demarchi@intel.com

00142bce

drm/i915: rename debugfs_gt files · 022f324c

由 Lucas De Marchi 提交于 9月 17, 2021

We shouldn't be using debugfs_ namespace for this functionality. Rename
debugfs_gt.[ch] to intel_gt_debugfs.[ch] and then make functions,
defines and structs follow suit.

While at it and since we are renaming the header, sort the includes
alphabetically.
Signed-off-by: NLucas De Marchi <lucas.demarchi@intel.com>
Acked-by: NJani Nikula <jani.nikula@intel.com>
Reviewed-by: NMatt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210918025754.1254705-1-lucas.demarchi@intel.com

022f324c

15 9月, 2021 3 次提交

drm/i915: Mark GPU wedging on driver unregister unrecoverable · dc34ca92

由 Janusz Krzysztofik 提交于 9月 03, 2021

GPU wedged flag now set on driver unregister to prevent from further
using the GPU can be then cleared unintentionally when calling
__intel_gt_unset_wedged() still before the flag is finally marked
unrecoverable.  We need to have it marked unrecoverable earlier.
Implement that by replacing a call to intel_gt_set_wedged() in
intel_gt_driver_unregister() with intel_gt_set_wedged_on_fini().

With the above in place, intel_gt_set_wedged_on_fini() is now called
twice on driver remove, second time from __intel_gt_disable().  This
seems harmless, while dropping intel_gt_set_wedged_on_fini() from
__intel_gt_disable() proved to break some driver probe error unwind
paths as well as mock selftest exit path.
Signed-off-by: NJanusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: NMichał Winiarski <michal.winiarski@intel.com>
Signed-off-by: NMatt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210903142837.216978-1-janusz.krzysztofik@linux.intel.com

dc34ca92

drm/i915/dg2: Define MOCS table for DG2 · e9354051

由 Matt Roper 提交于 9月 03, 2021

Bspec: 45101, 45427
Cc: Ramalingam C <ramalingam.c@intel.com>
Signed-off-by: NMatt Roper <matthew.d.roper@intel.com>
Reviewed-by: NMatt Atwood <matthew.s.atwood@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210904003544.2422282-3-matthew.d.roper@intel.com

e9354051

drm/i915/xehpsdv: Define MOCS table for XeHP SDV · 50bc6486

由 Lucas De Marchi 提交于 9月 03, 2021

Like DG1, XeHP SDV doesn't have LLC/eDRAM control values due to being a
dgfx card. XeHP SDV adds 2 more bits: L3_GLBGO to "push the Go point to
memory for L3 destined transaction" and L3_LKP to "enable Lookup for
uncacheable accesses".

Bspec: 45101
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: NLucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: NStuart Summers <stuart.summers@intel.com>
Signed-off-by: NMatt Roper <matthew.d.roper@intel.com>
Reviewed-by: NClint Taylor <Clinton.A.Taylor@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210904003544.2422282-2-matthew.d.roper@intel.com

50bc6486

14 9月, 2021 21 次提交

drm/i915/guc: Add GuC kernel doc · 4f41ddc7

由 Matthew Brost 提交于 9月 09, 2021

Add GuC kernel doc for all structures added thus far for GuC submission
and update the main GuC submission section with the new interface
details.

v2:
 - Drop guc_active.lock DOC
v3:
 - Fixup a few kernel doc comments (Daniele)
v4 (Daniele):
 - Implement doc suggestions from John
 - Add kerneldoc for all members of the GuC structure and pull the file
   in i915.rst
v5 (Daniele):
 - Implement new doc suggestions from John
Signed-off-by: NMatthew Brost <matthew.brost@intel.com>
Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: NJohn Harrison <John.C.Harrison@Intel.com>
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-24-matthew.brost@intel.com

4f41ddc7

drm/i915/guc: Drop guc_active move everything into guc_state · af5bc9f2

由 Matthew Brost 提交于 9月 09, 2021

Now that we have locking hierarchy of sched_engine->lock ->
ce->guc_state everything from guc_active can be moved into guc_state and
protected the guc_state.lock.
Signed-off-by: NMatthew Brost <matthew.brost@intel.com>
Reviewed-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-23-matthew.brost@intel.com

af5bc9f2

drm/i915/guc: Move fields protected by guc->contexts_lock into sub structure · 3cb3e343

由 Matthew Brost 提交于 9月 09, 2021

To make ownership of locking clear move fields (guc_id, guc_id_ref,
guc_id_link) to sub structure guc_id in intel_context.
Reviewed-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: NMatthew Brost <matthew.brost@intel.com>
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-22-matthew.brost@intel.com

3cb3e343

drm/i915/guc: Move GuC priority fields in context under guc_active · 9798b172

由 Matthew Brost 提交于 9月 09, 2021

Move GuC management fields in context under guc_active struct as this is
where the lock that protects theses fields lives. Also only set guc_prio
field once during context init.

v2:
 (Daniele)
  - set CONTEXT_SET_INIT
Signed-off-by: NMatthew Brost <matthew.brost@intel.com>
Reviewed-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-21-matthew.brost@intel.com

9798b172

drm/i915/guc: Drop pin count check trick between sched_disable and re-pin · 5b116c17

由 Matthew Brost 提交于 9月 09, 2021

Drop pin count check trick between a sched_disable and re-pin, now rely
on the lock and counter of the number of committed requests to determine
if scheduling should be disabled on the context.
Reviewed-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: NMatthew Brost <matthew.brost@intel.com>
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-20-matthew.brost@intel.com

5b116c17

drm/i915/guc: Proper xarray usage for contexts_lookup · 1424ba81

由 Matthew Brost 提交于 9月 09, 2021

Lock the xarray and take ref to the context if needed.

v2:
 (Checkpatch)
  - Add new line after declaration
 (Daniel Vetter)
  - Correct put / get accounting in xa_for_loops
v3:
 (Checkpatch)
  - Extra new line
Reviewed-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: NMatthew Brost <matthew.brost@intel.com>
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-19-matthew.brost@intel.com

1424ba81

drm/i915/guc: Rework and simplify locking · 0f797650

由 Matthew Brost 提交于 9月 09, 2021

Rework and simplify the locking with GuC subission. Drop
sched_state_no_lock and move all fields under the guc_state.sched_state
and protect all these fields with guc_state.lock . This requires
changing the locking hierarchy from guc_state.lock -> sched_engine.lock
to sched_engine.lock -> guc_state.lock.

v2:
 (Daniele)
  - Don't check fields outside of lock during sched disable, check less
    fields within lock as some of the outside are no longer needed
Reviewed-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: NMatthew Brost <matthew.brost@intel.com>
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-18-matthew.brost@intel.com

0f797650

drm/i915/guc: Move guc_blocked fence to struct guc_state · 52d66c06

由 Matthew Brost 提交于 9月 09, 2021

Move guc_blocked fence to struct guc_state as the lock which protects
the fence lives there.

s/ce->guc_blocked/ce->guc_state.blocked/g

v2:
 (Daniele)
  - s/blocked_fence/blocked/g
Reviewed-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: NMatthew Brost <matthew.brost@intel.com>
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-17-matthew.brost@intel.com

52d66c06

drm/i915/guc: Release submit fence from an irq_work · b0d83888

由 Matthew Brost 提交于 9月 09, 2021

A subsequent patch will flip the locking hierarchy from
ce->guc_state.lock -> sched_engine->lock to sched_engine->lock ->
ce->guc_state.lock. As such we need to release the submit fence for a
request from an IRQ to break a lock inversion - i.e. the fence must be
release went holding ce->guc_state.lock and the releasing of the can
acquire sched_engine->lock.

v2:
 (Daniele)
  - Delete request from list before calling irq_work_queue
Reviewed-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: NMatthew Brost <matthew.brost@intel.com>
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-16-matthew.brost@intel.com

b0d83888

drm/i915/guc: Reset LRC descriptor if register returns -ENODEV · ae36b629

由 Matthew Brost 提交于 9月 09, 2021

Reset LRC descriptor if a context register returns -ENODEV as this means
we are mid-reset.

Fixes: eb5e7da7 ("drm/i915/guc: Reset implementation for new GuC interface")
Signed-off-by: NMatthew Brost <matthew.brost@intel.com>
Reviewed-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-15-matthew.brost@intel.com

ae36b629

drm/i915/guc: Don't touch guc_state.sched_state without a lock · f16d5cb9

由 Matthew Brost 提交于 9月 09, 2021

Before we did some clever tricks to not use the a lock when touching
guc_state.sched_state in certain cases. Don't do that, enforce the use
of the lock.

v2:
 (kernel test robo )
  - Add __maybe_unused to sched_state_is_init()

v3: rebase after the unused code path removal has been moved to an
earlier patch.
Signed-off-by: NMatthew Brost <matthew.brost@intel.com>
Reported-by: Nkernel test robot <lkp@intel.com>
Reviewed-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-14-matthew.brost@intel.com

f16d5cb9

drm/i915/guc: Take context ref when cancelling request · 422cda4f

由 Matthew Brost 提交于 9月 09, 2021

A context can get destroyed after cancelling a request, if a context or
GT reset occurs, so take a reference to context when cancelling a
request.

Fixes: 62eaf0ae ("drm/i915/guc: Support request cancellation")
Signed-off-by: NMatthew Brost <matthew.brost@intel.com>
Reviewed-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-13-matthew.brost@intel.com

422cda4f

drm/i915/selftests: Add initial GuC selftest for scrubbing lost G2H · d2420c2e

由 Matthew Brost 提交于 9月 09, 2021

While debugging an issue with full GT resets I went down a rabbit hole
thinking the scrubbing of lost G2H wasn't working correctly. This proved
to be incorrect as this was working just fine but this chase inspired me
to write a selftest to prove that this works. This simple selftest
injects errors dropping various G2H and then issues a full GT reset
proving that the scrubbing of these G2H doesn't blow up.

v2:
 (Daniel Vetter)
  - Use ifdef instead of macros for selftests
v3:
 (Checkpatch)
  - A space after 'switch' statement
v4:
 (Daniele)
  - A comment saying GT won't idle if G2H are lost
Reviewed-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: NMatthew Brost <matthew.brost@intel.com>
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-12-matthew.brost@intel.com

d2420c2e

drm/i915/guc: Copy whole golden context, set engine state size of subset · d135865c

由 Matthew Brost 提交于 9月 09, 2021

When the GuC does a media reset, it copies a golden context state back
into the corrupted context's state. The address of the golden context
and the size of the engine state restore are passed in via the GuC ADS.
The i915 had a bug where it passed in the whole size of the golden
context, not the size of the engine state to restore resulting in a
memory corruption.

Also copy the entire golden context on init rather than just the engine
state that is restored.

v2 (Daniele): use defines to avoid duplicated const variables (John).

Fixes: 481d458c ("drm/i915/guc: Add golden context to GuC ADS")
Signed-off-by: NMatthew Brost <matthew.brost@intel.com>
Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: NJohn Harrison <John.C.Harrison@Intel.com>
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-11-matthew.brost@intel.com

d135865c

drm/i915/guc: Don't enable scheduling on a banned context, guc_id invalid, not registered · 9888beaa

由 Matthew Brost 提交于 9月 09, 2021

When unblocking a context, do not enable scheduling if the context is
banned, guc_id invalid, or not registered.

v2:
 (Daniele)
  - Add helper for unblock

Fixes: 62eaf0ae ("drm/i915/guc: Support request cancellation")
Signed-off-by: NMatthew Brost <matthew.brost@intel.com>
Reviewed-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-10-matthew.brost@intel.com

9888beaa

drm/i915/guc: Kick tasklet after queuing a request · cf37e5c8

由 Matthew Brost 提交于 9月 09, 2021

Kick tasklet after queuing a request so it submitted in a timely manner.

Fixes: 3a4cdf19 ("drm/i915/guc: Implement GuC context operations for new inteface")
Signed-off-by: NMatthew Brost <matthew.brost@intel.com>
Reviewed-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-9-matthew.brost@intel.com

cf37e5c8

Revert "drm/i915/gt: Propagate change in error status to children on unhold" · ac653dd7

由 Matthew Brost 提交于 9月 09, 2021

Propagating errors to dependent fences is broken and can lead to errors
from one client ending up in another. In commit 3761baae ("Revert
"drm/i915: Propagate errors on awaiting already signaled fences""), we
attempted to get rid of fence error propagation but missed the case
added in commit 8e9f84cf ("drm/i915/gt: Propagate change in error
status to children on unhold"). Revert that one too. This error was
found by an up-and-coming selftest which triggers a reset during
request cancellation and verifies that subsequent requests complete
successfully.

v2:
 (Daniel Vetter)
  - Use revert
v3:
 (Jason)
  - Update commit message

v4 (Daniele):
 - fix checkpatch error in commit message.

References: '3761baae ("Revert "drm/i915: Propagate errors on awaiting already signaled fences"")'
Signed-off-by: NMatthew Brost <matthew.brost@intel.com>
Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-8-matthew.brost@intel.com

ac653dd7

drm/i915/guc: Workaround reset G2H is received after schedule done G2H · 1ca36cff

由 Matthew Brost 提交于 9月 09, 2021

If the context is reset as a result of the request cancellation the
context reset G2H is received after schedule disable done G2H which is
the wrong order. The schedule disable done G2H release the waiting
request cancellation code which resubmits the context. This races
with the context reset G2H which also wants to resubmit the context but
in this case it really should be a NOP as request cancellation code owns
the resubmit. Use some clever tricks of checking the context state to
seal this race until the GuC firmware is fixed.

v2:
 (Checkpatch)
  - Fix typos
v3:
 (Daniele)
  - State that is a bug in the GuC firmware

Fixes: 62eaf0ae ("drm/i915/guc: Support request cancellation")
Signed-off-by: NMatthew Brost <matthew.brost@intel.com>
Cc: <stable@vger.kernel.org>
Reviewed-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-7-matthew.brost@intel.com

1ca36cff

drm/i915/guc: Process all G2H message at once in work queue · d67e3d5a

由 Matthew Brost 提交于 9月 09, 2021

Rather than processing 1 G2H at a time and re-queuing the work queue if
more messages exist, process all the G2H in a single pass of the work
queue.
Signed-off-by: NMatthew Brost <matthew.brost@intel.com>
Reviewed-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-6-matthew.brost@intel.com

d67e3d5a

drm/i915/guc: Don't drop ce->guc_active.lock when unwinding context · 88209a8e

由 Matthew Brost 提交于 9月 09, 2021

Don't drop ce->guc_active.lock when unwinding a context after reset.
At one point we had to drop this because of a lock inversion but that is
no longer the case. It is much safer to hold the lock so let's do that.

Fixes: eb5e7da7 ("drm/i915/guc: Reset implementation for new GuC interface")
Reviewed-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: NMatthew Brost <matthew.brost@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-5-matthew.brost@intel.com

88209a8e

drm/i915/guc: Unwind context requests in reverse order · c39f51cc

由 Matthew Brost 提交于 9月 09, 2021

When unwinding requests on a reset context, if other requests in the
context are in the priority list the requests could be resubmitted out
of seqno order. Traverse the list of active requests in reverse and
append to the head of the priority list to fix this.

Fixes: eb5e7da7 ("drm/i915/guc: Reset implementation for new GuC interface")
Signed-off-by: NMatthew Brost <matthew.brost@intel.com>
Reviewed-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-4-matthew.brost@intel.com

c39f51cc

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功