提交 · 5664561cbb8b2efe143df94ac17db23971e6d243 · openeuler / Kernel

16 11月, 2022 1 次提交

drm/i915: Update workaround documentation · 5664561c

由 Lucas De Marchi 提交于 11月 15, 2022

There were several updates in the driver on how the workarounds are
handled since its documentation was written. Update the documentation to
reflect the current reality.

v2:
  - Remove footnote that was wrongly referenced, adding back the
    reference in the correct paragraph.
  - Remove "Display workarounds" and just mention "display IP" under
    "Other" category since all of them are peppered around the driver.

Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: NLucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> # v1
Reviewed-by: NMatt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221115192611.179981-1-lucas.demarchi@intel.com

5664561c

15 11月, 2022 5 次提交

drm/i915/guc: handle interrupts from media GuC · a187f13d

由 Daniele Ceraolo Spurio 提交于 11月 07, 2022

The render and media GuCs share the same interrupt enable register, so
we can no longer disable interrupts when we disable communication for
one of the GuCs as this would impact the other GuC. Instead, we keep the
interrupts always enabled in HW and use a variable in the GuC structure
to determine if we want to service the received interrupts or not.

v2: use MTL_ prefix for reg definition (Matt)
Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: NMatt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221108020600.3575467-7-daniele.ceraolospurio@intel.com

a187f13d

drm/i915/guc: define media GT GuC send regs · b910f716

由 Daniele Ceraolo Spurio 提交于 11月 07, 2022

The media GT shares the G-unit with the root GT, so a second set of
communication registers is required for the media GuC.
Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: NJohn Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221108020600.3575467-6-daniele.ceraolospurio@intel.com

b910f716

drm/i915/mtl: Handle wopcm per-GT and limit calculations. · ee71434e

由 Aravind Iddamsetty 提交于 11月 07, 2022

With MTL standalone media architecture the wopcm layout has changed,
with separate partitioning in WOPCM for the root GT GuC and the media
GT GuC. The size of WOPCM is 4MB with the lower 2MB reserved for the
media GT and the upper 2MB for the root GT.

Given that MTL has GuC deprivilege, the WOPCM registers are pre-locked
by the bios. Therefore, we can skip all the math for the partitioning
and just limit ourselves to sanity-checking the values.

v2: fix makefile file ordering (Jani)
v3: drop XELPM_SAMEDIA_WOPCM_SIZE, check huc instead of VDBOX (John)
v4: further clarify commit message, remove blank line (John)
Signed-off-by: NAravind Iddamsetty <aravind.iddamsetty@intel.com>
Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: John Harrison <john.c.harrison@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Reviewed-by: NJohn Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221108020600.3575467-5-daniele.ceraolospurio@intel.com

ee71434e

drm/i915/uc: use different ggtt pin offsets for uc loads · 01624116

由 Daniele Ceraolo Spurio 提交于 11月 07, 2022

Our current FW loading process is the same for all FWs:

- Pin FW to GGTT at the start of the ggtt->uc_fw node
- Load the FW
- Unpin

This worked because we didn't have a case where 2 FWs would be loaded on
the same GGTT at the same time. On MTL, however, this can happen if both
GTs are reset at the same time, so we can't pin everything in the same
spot and we need to use separate offset. For simplicity, instead of
calculating the exact required size, we reserve a 2MB slot for each fw.

v2: fail fetch if FW is > 2MBs, improve comments (John)
v3: more comment improvements (John)
Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <john.c.harrison@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: NJohn Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221108020600.3575467-3-daniele.ceraolospurio@intel.com

01624116

drm/i915/huc: only load HuC on GTs that have VCS engines · 472098c8

由 Daniele Ceraolo Spurio 提交于 11月 07, 2022

On MTL the primary GT doesn't have any media capabilities, so no video
engines and no HuC. We must therefore skip the HuC fetch and load on
that specific case. Given that other multi-GT platforms might have HuC
on the primary GT, we can't just check for that and it is easier to
instead check for the lack of VCS engines.

Based on code from Aravind Iddamsetty

v2: clarify which engine_mask is used for each GT and why (Tvrtko)
Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Cc: John Harrison <john.c.harrison@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: NAravind Iddamsetty <aravind.iddamsetty@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221108020600.3575467-1-daniele.ceraolospurio@intel.com

472098c8

14 11月, 2022 2 次提交

drm/i915: Simplify internal helper function signature · 36537275

由 Tvrtko Ursulin 提交于 11月 10, 2022

Since we are now storing the GT backpointer in the wa list we can drop the
explicit struct intel_gt * argument to wa_list_apply.
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221110124633.3135026-1-tvrtko.ursulin@linux.intel.com

36537275

drm/i915/mtl: Add Wa_14017073508 for SAMedia · 8f70f1ec

由 Badal Nilawar 提交于 11月 04, 2022

This workaround is added for Media tile of MTL A step. It is to help
pcode workaround which handles the hardware issue seen during package C2/C3
transitions due to RC6 entry/exit transitions on Media tile. As a part of
workaround pcode expect kmd to send mailbox message "media busy" when
components of Media tile are in use and "media idle" otherwise.
As per workaround description gucrc need to be disabled so enabled
host based RC for Media tile.

v2:
 - Correct workaround id (Matt)
 - Fix review comments (Rodrigo)

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NBadal Nilawar <badal.nilawar@intel.com>
Reviewed-by: NRodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: NAnshuman Gupta <anshuman.gupta@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221103184559.2306481-1-badal.nilawar@intel.com

8f70f1ec

11 11月, 2022 1 次提交

drm/i915/guc/slpc: Add selftest for slpc tile-tile interaction · 733827ee

由 Riana Tauro 提交于 11月 09, 2022

Run a workload on tiles simultaneously by requesting for RP0 frequency.
Pcode can however limit the frequency being granted due to throttling
reasons. This test checks if there is any throttling but does not fail
if RP0 is not granted due to throttle reasons

v2: Fix build error
v3: Use IS_ERR_OR_NULL to check worker
    Addressed cosmetic review comments (Tvrtko)
v4: do not skip test on media engines if gt type is GT_MEDIA.
    Use correct PERF_LIMIT_REASONS register for MTL (Vinay)
Signed-off-by: NRiana Tauro <riana.tauro@intel.com>
Reviewed-by: NVinay Belgaumkar <vinay.belgaumkar@intel.com>
Signed-off-by: NAnshuman Gupta <anshuman.gupta@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221109112541.275021-2-riana.tauro@intel.com

733827ee

10 11月, 2022 1 次提交

drm/i915: Partial abandonment of legacy DRM logging macros · a10234fd

由 Tvrtko Ursulin 提交于 11月 09, 2022

Convert some usages of legacy DRM logging macros into versions which tell
us on which device have the events occurred.

v2:
 * Don't have struct drm_device as local. (Jani, Ville)

v3:
 * Store gt, not i915, in workaround list. (John)
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: NJani Nikula <jani.nikula@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: NAndrzej Hajda <andrzej.hajda@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221109104633.2579245-1-tvrtko.ursulin@linux.intel.com

a10234fd

08 11月, 2022 6 次提交

drm/i915/guc: don't hardcode BCS0 in guc_hang selftest · 8b693ea2

由 Daniele Ceraolo Spurio 提交于 11月 02, 2022

On MTL there are no BCS engines on the media GT, so we can't always use
BCS0 in the test. There is no actual reason to use a BCS engine over an
engine of a different class, so switch to using any available engine.
Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: NJohn Harrison <John.C.Harrison@Intel.com>
Acked-by: NNirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221102214310.2829310-1-daniele.ceraolospurio@intel.com

8b693ea2

drm/i915/mtl: don't expose GSC command streamer to the user · 194babe2

由 Daniele Ceraolo Spurio 提交于 11月 02, 2022

There is no userspace user for this CS yet, we only need it for internal
kernel ops (e.g. HuC, PXP), so don't expose it.

v2: even if it's not exposed, rename the engine so it is easier to
identify in the debug logs (Matt)
Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: NMatt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221102171047.2787951-6-daniele.ceraolospurio@intel.com

194babe2

drm/i915/mtl: add GSC CS reset support · ef8281ab

由 Daniele Ceraolo Spurio 提交于 11月 02, 2022

The GSC CS has its own dedicated bit in the GDRST register.

Bspec: 52549
Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: NMatt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221102171047.2787951-5-daniele.ceraolospurio@intel.com

ef8281ab

drm/i915/mtl: add GSC CS interrupt support · c07ee636

由 Daniele Ceraolo Spurio 提交于 11月 02, 2022

The GSC CS re-uses the same interrupt bits that the GSC used in older
platforms. This means that we can now have an engine interrupt coming
out of OTHER_CLASS, so we need to handle that appropriately.

v2: clean up the if statement for the engine irq (Tvrtko)
Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: NMatt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221102171047.2787951-4-daniele.ceraolospurio@intel.com

c07ee636

drm/i915/mtl: pass the GSC CS info to the GuC · c9c12ba7

由 Daniele Ceraolo Spurio 提交于 11月 02, 2022

We need to tell the GuC that the GSC CS is there.
Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: NMatt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221102171047.2787951-3-daniele.ceraolospurio@intel.com

c9c12ba7

drm/i915/mtl: add initial definitions for GSC CS · 5fd974d1

由 Daniele Ceraolo Spurio 提交于 11月 02, 2022

Starting on MTL, the GSC is no longer managed with direct MMIO access,
but we instead have a dedicated command streamer for it. As a first step
for adding support for this CS, add the required definitions.
Note that, although it is now a CS, the GSC retains its old
class:instance value (OTHER_CLASS instance 6)

Bspec: 65308, 45605
Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: NMatt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221102171047.2787951-2-daniele.ceraolospurio@intel.com

5fd974d1

05 11月, 2022 3 次提交

drm/i915/guc: Don't deadlock busyness stats vs reset · 178b8a36

由 John Harrison 提交于 11月 02, 2022

The engine busyness stats has a worker function to do things like
64bit extend the 32bit hardware counters. The GuC's reset prepare
function flushes out this worker function to ensure no corruption
happens during the reset. Unforunately, the worker function has an
infinite wait for active resets to finish before doing its work. Thus
a deadlock would occur if the worker function had actually started
just as the reset starts.

The function being used to lock the reset-in-progress mutex is called
intel_gt_reset_trylock(). However, as noted it does not follow
standard 'trylock' conventions and exit if already locked. So rename
the current _trylock function to intel_gt_reset_lock_interruptible(),
which is the behaviour it actually provides. In addition, add a new
implementation of _trylock and call that from the busyness stats
worker instead.

v2: Rename existing trylock to interruptible rather than trying to
preserve the existing (confusing) naming scheme (review comments from
Tvrtko).
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Reviewed-by: NUmesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221102192109.2492625-3-John.C.Harrison@Intel.com

178b8a36

drm/i915/guc: Properly initialise kernel contexts · de51de96

由 John Harrison 提交于 11月 02, 2022

If a context has already been registered prior to first submission
then context init code was not being called. The noticeable effect of
that was the scheduling priority was left at zero (meaning super high
priority) instead of being set to normal. This would occur with
kernel contexts at start of day as they are manually pinned up front
rather than on first submission. So add a call to initialise those
when they are pinned.
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Reviewed-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221102192109.2492625-2-John.C.Harrison@Intel.com

de51de96

drm/i915/guc: Remove excessive line feeds in state dumps · cc2e0cf0

由 John Harrison 提交于 10月 31, 2022

Some of the GuC state dump messages were adding extra line feeds. When
printing via a DRM printer to dmesg, for example, that messes up the
log formatting as it loses any prefixing from the printer. Given that
the extra line feeds are just in the middle of random bits of GuC
state, there isn't any real need for them. So just remove them
completely.
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Reviewed-by: NUmesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221031220007.4176835-1-John.C.Harrison@Intel.com

cc2e0cf0

02 11月, 2022 7 次提交

drm/i915/selftests: Run the perf MI_BB tests on gen4/5 · 1086af67

由 Ville Syrjälä 提交于 10月 31, 2022

Now that we know the ring timestamp frequency on gen4/5 we
can run the perf tests that depend on sampling the timestamp.

On g4x/ilk we must read the udw of the 64bit timestamp
register. Details in {g4x,gen5)_read_clock_frequency().

When executing the read via the CS i965 doesn't seem to need
the double read trick that CPU mmio reads need.
Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221031135703.14670-7-ville.syrjala@linux.intel.comReviewed-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>

1086af67

drm/i915/selftests: Test RING_TIMESTAMP on gen4/5 · 38530a37

由 Ville Syrjälä 提交于 10月 31, 2022

Now that we actually know the cs timestamp frequency on gen4/5
let's run the corresponding test.

On g4x/ilk we must read the udw of the 64bit timestamp
register. Details in {g4x,gen5)_read_clock_frequency().

The one extra caveat is that on i965 (or at least CL, don't
recall if I ever tested on BW) we must read the register
twice to get an up to date value. For some unknown reason
the first read tends to return a stale value.
Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221031135703.14670-6-ville.syrjala@linux.intel.comReviewed-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>

38530a37

drm/i915/selftests: Run MI_BB perf selftests on SNB · cf8a82de

由 Ville Syrjälä 提交于 10月 31, 2022

SNB does have the RING_TIMESTAMP register on the RCS engine.
Run the MI_BB perf tests on it.
Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221031135703.14670-5-ville.syrjala@linux.intel.comReviewed-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>

cf8a82de

drm/i915: Fix cs timestamp frequency for cl/bw · dbea79a5

由 Ville Syrjälä 提交于 10月 31, 2022

Despite what the spec says the TIMESTAMP register seems to
tick once every hrawclk (confirmed on i965gm and g35).

v2: Rebase
Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221031135703.14670-4-ville.syrjala@linux.intel.comReviewed-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>

dbea79a5

drm/i915: Stop claiming cs timestamp frquency on gen2/3 · 78e418d0

由 Ville Syrjälä 提交于 10月 31, 2022

Gen2/3 have no TIMESTAMP registers to sample so no point in thinking
we have any frequency for it either.
Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221031135703.14670-3-ville.syrjala@linux.intel.comReviewed-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>

78e418d0

drm/i915: Fix cs timestamp frequency for ctg/elk/ilk · ad1ea980

由 Ville Syrjälä 提交于 10月 31, 2022

On ilk the UDW of TIMESTAMP increments every 1000 ns,
LDW is mbz. In order to represent that we'd need 52 bits,
but we only have 32 bits. Even worse most things want to
only deal with 32 bits of timestamp. So let's just set
up the timestamp frequency as if we only had the UDW.

On ctg/elk 63:20 of TIMESTAMP increments every 1/4 ns, 19:0
are mbz. To make life simpler let's ignore the LDW and set up
timestamp frequency based on the UDW only (increments every
1024 ns).

v2: Rebase
Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221031135703.14670-2-ville.syrjala@linux.intel.comReviewed-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>

ad1ea980

drm/i915/dg2: Introduce Wa_18017747507 · ea9c6215

由 Wayne Boyer 提交于 10月 31, 2022

WA 18017747507 applies to all DG2 skus.

BSpec: 56035, 46121, 68173
Signed-off-by: NWayne Boyer <wayne.boyer@intel.com>
Reviewed-by: NMatt Roper <matthew.d.roper@intel.com>
Signed-off-by: NMatt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221031131509.3411195-1-wayne.boyer@intel.com

ea9c6215

31 10月, 2022 1 次提交

drm/i915: Encapsulate lmem rpm stuff in intel_runtime_pm · e66c8dcf

由 Anshuman Gupta 提交于 10月 27, 2022

Runtime pm is not really per GT, therefore it make sense to
move lmem_userfault_list, lmem_userfault_lock and
userfault_wakeref from intel_gt to intel_runtime_pm structure,
which is embedded to i915.

No functional change.

v2:
- Fixes the code comment nit. [Matt Auld]
Signed-off-by: NAnshuman Gupta <anshuman.gupta@intel.com>
Reviewed-by: NMatthew Auld <matthew.auld@intel.com>
Acked-by: NRodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221027092242.1476080-2-anshuman.gupta@intel.com

e66c8dcf

29 10月, 2022 1 次提交

drm/i915/mtl: Add missing steering table terminators · 876e9047

由 Matt Roper 提交于 10月 28, 2022

The termination entries were missing for a couple of the recently-added
MTL steering tables.

Fixes: f32898c9 ("drm/i915/xelpg: Add multicast steering")
Fixes: a7ec65fc ("drm/i915/xelpmp: Add multicast steering for media GT")
Signed-off-by: NMatt Roper <matthew.d.roper@intel.com>
Reviewed-by: NLucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221028224022.964997-1-matthew.d.roper@intel.com

876e9047

28 10月, 2022 8 次提交

drm/i915/guc: Support OA when Wa_16011777198 is enabled · 01e74274

由 Vinay Belgaumkar 提交于 10月 26, 2022

On DG2, a w/a resets RCS/CCS before it goes into RC6. This breaks OA
since OA does not expect engine resets during its use. Fix it by
disabling RC6.

v2: (Ashutosh)
- Bring back slpc_unset_param helper
- Update commit msg
- Use with_intel_runtime_pm helper for set/unset

v3: (Ashutosh)
- Just use intel_uc_uses_guc_rc
Signed-off-by: NVinay Belgaumkar <vinay.belgaumkar@intel.com>
Signed-off-by: NUmesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: NAshutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221026222102.5526-15-umesh.nerlige.ramappa@intel.com

01e74274

drm/i915/perf: Save/restore EU flex counters across reset · fa569804

由 Umesh Nerlige Ramappa 提交于 10月 26, 2022

If a drm client is killed, then hw contexts used by the client are reset
immediately. This reset clears the EU flex counter configuration. If an
OA use case is running in parallel, it would start seeing zeroed eu
counter values following the reset even if the drm client is restarted.
Save/restore the EU flex counter config so that the EU counters can be
monitored continuously across resets.

v2:
- Save/restore eu flex config only for gen12, as for pre-gen12, these
  are saved and restored in the context image.
Signed-off-by: NUmesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: NAshutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221026222102.5526-14-umesh.nerlige.ramappa@intel.com

fa569804

drm/i915/perf: Add Wa_1508761755:dg2 · ed6b25aa

由 Umesh Nerlige Ramappa 提交于 10月 26, 2022

Disable Clock gating in EU when gathering the events so that EU events
are not lost.

v2: Fix checkpatch issues
v3: User MCR helpers to write to MC reg
v4: Indent correctly (checkpatch)
Signed-off-by: NUmesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: NAshutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221026222102.5526-12-umesh.nerlige.ramappa@intel.com

ed6b25aa

drm/i915/perf: Move gt-specific data from i915->perf to gt->perf · 9677a9f3

由 Umesh Nerlige Ramappa 提交于 10月 26, 2022

Make perf part of gt as the OAG buffer is specific to a gt. The refactor
eventually simplifies programming the right OA buffer and the right HW
registers when supporting multiple gts.
Signed-off-by: NUmesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: NAshutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221026222102.5526-8-umesh.nerlige.ramappa@intel.com

9677a9f3

drm/i915/perf: Determine gen12 oa ctx offset at runtime · a5c3a3cb

由 Umesh Nerlige Ramappa 提交于 10月 26, 2022

Some SKUs of same gen12 platform may have different oactxctrl
offsets. For gen12, determine oactxctrl offsets at runtime.

v2: (Lionel)
- Move MI definitions to intel_gpu_commands.h
- Ensure __find_reg_in_lri does read past context image size

v3: (Ashutosh)
- Drop unnecessary use of double underscores
- fix find_reg_in_lri
- Return error if oa context offset is U32_MAX
- Error out if oa_ctx_ctrl_offset does not find offset

v4: (Ashutosh)
- Warn on odd MI LRI_LEN
- Remove unnecessary check for valid_oactxctrl_offset
- Drop valid_oactxctrl_offset macro

v5: Drop unrelated comment
Signed-off-by: NUmesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: NAshutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221026222102.5526-5-umesh.nerlige.ramappa@intel.com

a5c3a3cb

drm/i915/perf: Fix noa wait predication for DG2 · 2d9da585

由 Umesh Nerlige Ramappa 提交于 10月 26, 2022

Predication for batch buffer commands changed in XEHPSDV.
MI_BATCH_BUFFER_START predicates based on MI_SET_PREDICATE_RESULT
register. The MI_SET_PREDICATE_RESULT register can only be modified
with MI_SET_PREDICATE command. When configured, the MI_SET_PREDICATE
command sets MI_SET_PREDICATE_RESULT based on bit 0 of
MI_PREDICATE_RESULT_2. Use this to configure predication in noa_wait.
Signed-off-by: NUmesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: NAshutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221026222102.5526-4-umesh.nerlige.ramappa@intel.com

2d9da585

drm/i915/perf: Fix OA filtering logic for GuC mode · 682aa437

由 Umesh Nerlige Ramappa 提交于 10月 26, 2022

With GuC mode of submission, GuC is in control of defining the context
id field that is part of the OA reports. To filter reports, UMD and KMD
must know what sw context id was chosen by GuC. There is not interface
between KMD and GuC to determine this, so read the upper-dword of
EXECLIST_STATUS to filter/squash OA reports for the specific context.

v2: Explain guc id stealing w.r.t OA use case
Signed-off-by: NUmesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: NAshutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221026222102.5526-2-umesh.nerlige.ramappa@intel.com

682aa437

drm/i915: Fix CFI violations in gt_sysfs · a8a4f046

由 Nathan Chancellor 提交于 10月 25, 2022

When booting with CONFIG_CFI_CLANG, there are numerous violations when
accessing the files under
/sys/devices/pci0000:00/0000:00:02.0/drm/card0/gt/gt0:

  $ cd /sys/devices/pci0000:00/0000:00:02.0/drm/card0/gt/gt0

  $ grep . *
  id:0
  punit_req_freq_mhz:350
  rc6_enable:1
  rc6_residency_ms:214934
  rps_act_freq_mhz:1300
  rps_boost_freq_mhz:1300
  rps_cur_freq_mhz:350
  rps_max_freq_mhz:1300
  rps_min_freq_mhz:350
  rps_RP0_freq_mhz:1300
  rps_RP1_freq_mhz:350
  rps_RPn_freq_mhz:350
  throttle_reason_pl1:0
  throttle_reason_pl2:0
  throttle_reason_pl4:0
  throttle_reason_prochot:0
  throttle_reason_ratl:0
  throttle_reason_status:0
  throttle_reason_thermal:0
  throttle_reason_vr_tdc:0
  throttle_reason_vr_thermalert:0

  $ sudo dmesg &| grep "CFI failure at"
  [  214.595903] CFI failure at kobj_attr_show+0x19/0x30 (target: id_show+0x0/0x70 [i915]; expected type: 0xc527b809)
  [  214.596064] CFI failure at kobj_attr_show+0x19/0x30 (target: punit_req_freq_mhz_show+0x0/0x40 [i915]; expected type: 0xc527b809)
  [  214.596407] CFI failure at kobj_attr_show+0x19/0x30 (target: rc6_enable_show+0x0/0x40 [i915]; expected type: 0xc527b809)
  [  214.596528] CFI failure at kobj_attr_show+0x19/0x30 (target: rc6_residency_ms_show+0x0/0x270 [i915]; expected type: 0xc527b809)
  [  214.596682] CFI failure at kobj_attr_show+0x19/0x30 (target: act_freq_mhz_show+0x0/0xe0 [i915]; expected type: 0xc527b809)
  [  214.596792] CFI failure at kobj_attr_show+0x19/0x30 (target: boost_freq_mhz_show+0x0/0xe0 [i915]; expected type: 0xc527b809)
  [  214.596893] CFI failure at kobj_attr_show+0x19/0x30 (target: cur_freq_mhz_show+0x0/0xe0 [i915]; expected type: 0xc527b809)
  [  214.596996] CFI failure at kobj_attr_show+0x19/0x30 (target: max_freq_mhz_show+0x0/0xe0 [i915]; expected type: 0xc527b809)
  [  214.597099] CFI failure at kobj_attr_show+0x19/0x30 (target: min_freq_mhz_show+0x0/0xe0 [i915]; expected type: 0xc527b809)
  [  214.597198] CFI failure at kobj_attr_show+0x19/0x30 (target: RP0_freq_mhz_show+0x0/0xe0 [i915]; expected type: 0xc527b809)
  [  214.597301] CFI failure at kobj_attr_show+0x19/0x30 (target: RP1_freq_mhz_show+0x0/0xe0 [i915]; expected type: 0xc527b809)
  [  214.597405] CFI failure at kobj_attr_show+0x19/0x30 (target: RPn_freq_mhz_show+0x0/0xe0 [i915]; expected type: 0xc527b809)
  [  214.597538] CFI failure at kobj_attr_show+0x19/0x30 (target: throttle_reason_bool_show+0x0/0x50 [i915]; expected type: 0xc527b809)
  [  214.597701] CFI failure at kobj_attr_show+0x19/0x30 (target: throttle_reason_bool_show+0x0/0x50 [i915]; expected type: 0xc527b809)
  [  214.597836] CFI failure at kobj_attr_show+0x19/0x30 (target: throttle_reason_bool_show+0x0/0x50 [i915]; expected type: 0xc527b809)
  [  214.597952] CFI failure at kobj_attr_show+0x19/0x30 (target: throttle_reason_bool_show+0x0/0x50 [i915]; expected type: 0xc527b809)
  [  214.598071] CFI failure at kobj_attr_show+0x19/0x30 (target: throttle_reason_bool_show+0x0/0x50 [i915]; expected type: 0xc527b809)
  [  214.598177] CFI failure at kobj_attr_show+0x19/0x30 (target: throttle_reason_bool_show+0x0/0x50 [i915]; expected type: 0xc527b809)
  [  214.598307] CFI failure at kobj_attr_show+0x19/0x30 (target: throttle_reason_bool_show+0x0/0x50 [i915]; expected type: 0xc527b809)
  [  214.598439] CFI failure at kobj_attr_show+0x19/0x30 (target: throttle_reason_bool_show+0x0/0x50 [i915]; expected type: 0xc527b809)
  [  214.598542] CFI failure at kobj_attr_show+0x19/0x30 (target: throttle_reason_bool_show+0x0/0x50 [i915]; expected type: 0xc527b809)

With kCFI, indirect calls are validated against their expected type
versus actual type and failures occur when the two types do not match.
The ultimate issue is that these sysfs functions are expecting to be
called via dev_attr_show() but they may also be called via
kobj_attr_show(), as certain files are created under two different
kobjects that have two different sysfs_ops in intel_gt_sysfs_register(),
hence the warnings above. When accessing the gt_ files under
/sys/devices/pci0000:00/0000:00:02.0/drm/card0, which are using the same
sysfs functions, there are no violations, meaning the functions are
being called with the proper type.

To make everything work properly, adjust certain functions to match the
type of the ->show() and ->store() members in 'struct kobj_attribute'.
Add a macro to generate functions for that can be called via both
dev_attr_{show,store}() or kobj_attr_{show,store}() so that they can be
called through both kobject locations without violating kCFI and adjust
the attribute groups to account for this.

Link: https://github.com/ClangBuiltLinux/linux/issues/1716Reviewed-by: NAndi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: NAndrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: NKees Cook <keescook@chromium.org>
Signed-off-by: NNathan Chancellor <nathan@kernel.org>
Signed-off-by: NAndi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221013205909.1282545-1-nathan@kernel.org

a8a4f046

27 10月, 2022 4 次提交

drm/i915/guc: Remove intel_context:number_committed_requests counter · a7ac9d84

由 Alan Previn 提交于 10月 06, 2022

With the introduction of the delayed disable-sched behavior,
we use the GuC's xarray of valid guc-id's as a way to
identify if new requests had been added to a context
when the said context is being checked for closure.

Additionally that prior change also closes the race for when
a new incoming request fails to cancel the pending
delayed disable-sched worker.

With these two complementary checks, we see no more
use for intel_context:guc_state:number_committed_requests.
Signed-off-by: NAlan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: NJohn Harrison <John.C.Harrison@Intel.com>
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221006225121.826257-3-alan.previn.teres.alexis@intel.com

a7ac9d84

drm/i915/guc: Delay disabling guc_id scheduling for better hysteresis · 83321094

由 Matthew Brost 提交于 10月 06, 2022

Add a delay, configurable via debugfs (default 34ms), to disable
scheduling of a context after the pin count goes to zero. Disable
scheduling is a costly operation as it requires synchronizing with
the GuC. So the idea is that a delay allows the user to resubmit
something before doing this operation. This delay is only done if
the context isn't closed and less than a given threshold
(default is 3/4) of the guc_ids are in use.

Alan Previn: Matt Brost first introduced this patch back in Oct 2021.
However no real world workload with measured performance impact was
available to prove the intended results. Today, this series is being
republished in response to a real world workload that benefited greatly
from it along with measured performance improvement.

Workload description: 36 containers were created on a DG2 device where
each container was performing a combination of 720p 3d game rendering
and 30fps video encoding. The workload density was configured in a way
that guaranteed each container to ALWAYS be able to render and
encode no less than 30fps with a predefined maximum render + encode
latency time. That means the totality of all 36 containers and their
workloads were not saturating the engines to their max (in order to
maintain just enough headrooom to meet the min fps and max latencies
of incoming container submissions).

Problem statement: It was observed that the CPU core processing the i915
soft IRQ work was experiencing severe load. Using tracelogs and an
instrumentation patch to count specific i915 IRQ events, it was confirmed
that the majority of the CPU cycles were caused by the
gen11_other_irq_handler() -> guc_irq_handler() code path. The vast
majority of the cycles was determined to be processing a specific G2H
IRQ: i.e. INTEL_GUC_ACTION_SCHED_CONTEXT_MODE_DONE. These IRQs are sent
by GuC in response to i915 KMD sending H2G requests:
INTEL_GUC_ACTION_SCHED_CONTEXT_MODE_SET. Those H2G requests are sent
whenever a context goes idle so that we can unpin the context from GuC.
The high CPU utilization % symptom was limiting density scaling.

Root Cause Analysis: Because the incoming execution buffers were spread
across 36 different containers (each with multiple contexts) but the
system in totality was NOT saturated to the max, it was assumed that each
context was constantly idling between submissions. This was causing
a thrashing of unpinning contexts from GuC at one moment, followed quickly
by repinning them due to incoming workload the very next moment. These
event-pairs were being triggered across multiple contexts per container,
across all containers at the rate of > 30 times per sec per context.

Metrics: When running this workload without this patch, we measured an
average of ~69K INTEL_GUC_ACTION_SCHED_CONTEXT_MODE_DONE events every 10
seconds or ~10 million times over ~25+ mins. With this patch, the count
reduced to ~480 every 10 seconds or about ~28K over ~10 mins. The
improvement observed is ~99% for the average counts per 10 seconds.

Design awareness: Selftest impact.
As temporary WA disable this feature for the selftests. Selftests are
very timing sensitive and any change in timing can cause failure. A
follow up patch will fixup the selftests to understand this delay.

Design awareness: Race between guc_request_alloc and guc_context_close.
If a context close is issued while there is a request submission in
flight and a delayed schedule disable is pending, guc_context_close
and guc_request_alloc will race to cancel the delayed disable.
To close the race, make sure that guc_request_alloc waits for
guc_context_close to finish running before checking any state.

Design awareness: GT Reset event.
If a gt reset is triggered, as preparation steps, add an additional step
to ensure all contexts that have a pending delay-disable-schedule task
be flushed of it. Move them directly into the closed state after cancelling
the worker. This is okay because the existing flow flushes all
yet-to-arrive G2H's dropping them anyway.
Signed-off-by: NMatthew Brost <matthew.brost@intel.com>
Signed-off-by: NAlan Previn <alan.previn.teres.alexis@intel.com>
Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: NJohn Harrison <John.C.Harrison@Intel.com>
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221006225121.826257-2-alan.previn.teres.alexis@intel.com

83321094

drm/i915/guc: Fix GuC error capture sizing estimation and reporting · befb231d

由 Alan Previn 提交于 10月 25, 2022

During GuC error capture initialization, we estimate the amount of size
we need for the error-capture-region of the shared GuC-log-buffer.
This calculation was incorrect so fix that. With the fixed calculation
we can reduce the allocation of error-capture region from 4MB to 1MB
(see note2 below for reasoning). Additionally, switch from drm_notice to
drm_debug for the 3X spare size check since that would be impossible to
hit without redesigning gpu_coredump framework to hold multiple captures.

NOTE1: Even for 1x the min size estimation case, actually running out
of space is a corner case because it can only occur if all engine
instances get reset all at once and i915 isn't able extract the capture
data fast enough within G2H handler worker.

NOTE2: With the corrected calculation, a DG2 part required ~77K and a PVC
required ~115K (1X min-est-size that is calculated as one-shot all-engine-
reset scenario).

Fixes: d7c15d76 ("drm/i915/guc: Check sizing of guc_capture output")
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Cc: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris.p.wilson@intel.com>
Signed-off-by: NAlan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: NJohn Harrison <John.C.Harrison@Intel.com>
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221026060506.1007830-2-alan.previn.teres.alexis@intel.com

befb231d

drm/i915/slpc: Use platform limits for min/max frequency · 37d52e44

由 Vinay Belgaumkar 提交于 10月 24, 2022

GuC will set the min/max frequencies to theoretical max on
ATS-M. This will break kernel ABI, so limit min/max frequency
to RP0(platform max) instead.

Also modify the SLPC selftest to update the min frequency
when we have a server part so that we can iterate between
platform min and max.

v2: Check softlimits instead of platform limits (Riana)
v3: More review comments (Ashutosh)
v4: No need to use saved_min_freq and other comments (Ashutosh)

Bug: https://gitlab.freedesktop.org/drm/intel/-/issues/7030Acked-by: NNirmoy Das <nirmoy.das@intel.com>
Reviewed-by: NRiana Tauro <riana.tauro@intel.com>
Signed-off-by: NVinay Belgaumkar <vinay.belgaumkar@intel.com>
Reviewed-by: NAshutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221024225453.4856-1-vinay.belgaumkar@intel.com

37d52e44

openeuler / Kernel 接近 2 年 前同步成功

openeuler / Kernel
接近 2 年前同步成功