提交 · 79398d24da4c9294285bdedf67018ff09fe97bdc · openeuler / Kernel

30 6月, 2022 1 次提交

drm/i915/guc/slpc: Add a new SLPC selftest · 79398d24

由 Vinay Belgaumkar 提交于 6月 27, 2022

This test will validate we can achieve actual frequency of RP0. Pcode
grants frequencies based on what GuC is requesting. However, thermal
throttling can limit what is being granted. Add a test to request for
max, but don't fail the test if RP0 is not granted due to throttle
reasons.

Also optimize the selftest by using a common run_test function to avoid
code duplication. Rename the "clamp" tests to vary_max_freq and
vary_min_freq.

v2: Fix compile warning
v3: Review comments (Ashutosh). Added a FIXME for the media RP0 case.
v4: Checkpatch (strict) fixes, remove FIXME and other comments (Ashutosh)

Fixes commit 8ee2c227 ("drm/i915/guc/slpc: Add SLPC selftest")

Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: NVinay Belgaumkar <vinay.belgaumkar@intel.com>
Reviewed-by: NAshutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220627230346.27720-1-vinay.belgaumkar@intel.com

79398d24

29 6月, 2022 1 次提交

drm/i915: Fix a lockdep warning at error capture · a0696856

由 Nirmoy Das 提交于 6月 24, 2022

For some platfroms we use stop_machine version of
gen8_ggtt_insert_page/gen8_ggtt_insert_entries to avoid a
concurrent GGTT access bug but this causes a circular locking
dependency warning:

  Possible unsafe locking scenario:
        CPU0                    CPU1
        ----                    ----
   lock(&ggtt->error_mutex);
                                lock(dma_fence_map);
                                lock(&ggtt->error_mutex);
   lock(cpu_hotplug_lock);

Fix this by calling gen8_ggtt_insert_page/gen8_ggtt_insert_entries
directly at error capture which is concurrent GGTT access safe because
reset path make sure of that.

v2: Fix rebase conflict and added a comment.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/5595Reviewed-by: NGwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Suggested-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NNirmoy Das <nirmoy.das@intel.com>
Reviewed-by: NAndrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: NRamalingam C <ramalingam.c@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220624110821.29190-1-nirmoy.das@intel.com

a0696856

28 6月, 2022 5 次提交

drm/i915/guc/slpc: Use non-blocking H2G for waitboost · 58eaa6b3

由 Vinay Belgaumkar 提交于 6月 22, 2022

SLPC min/max frequency updates require H2G calls. We are seeing
timeouts when GuC channel is backed up and it is unable to respond
in a timely fashion causing warnings and affecting CI.

This is seen when waitboosting happens during a stress test.
this patch updates the waitboost path to use a non-blocking
H2G call instead, which returns as soon as the message is
successfully transmitted.

v2: Use drm_notice to report any errors that might occur while
sending the waitboost H2G request (Tvrtko)
v3: Add drm_notice inside force_min_freq (Ashutosh)

Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: NVinay Belgaumkar <vinay.belgaumkar@intel.com>
Reviewed-by: NAshutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220623003225.23301-1-vinay.belgaumkar@intel.com

58eaa6b3

drm/i915/reset: Add additional steps for Wa_22011802037 for execlist backend · 0667429c

由 Umesh Nerlige Ramappa 提交于 6月 21, 2022

For execlists backend, current implementation of Wa_22011802037 is to
stop the CS before doing a reset of the engine. This WA was further
extended to wait for any pending MI FORCE WAKEUPs before issuing a
reset. Add the extended steps in the execlist path of reset.

In addition, extend the WA to gen11.

v2: (Tvrtko)
- Clarify comments, commit message, fix typos
- Use IS_GRAPHICS_VER for gen 11/12 checks

v3: (Daneile)
- Drop changes to intel_ring_submission since WA does not apply to it
- Log an error if MSG IDLE is not defined for an engine
Signed-off-by: NUmesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Fixes: f6aa0d71 ("drm/i915: Add Wa_22011802037 force cs halt")
Acked-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220621192105.2100585-1-umesh.nerlige.ramappa@intel.com

0667429c

drm/i915/guc: Don't update engine busyness stats too frequently · 59bcdb56

由 Alan Previn 提交于 6月 22, 2022

Using two different types of workoads, it was observed that
guc_update_engine_gt_clks was being called too frequently and/or
causing a CPU-to-lmem bandwidth hit over PCIE. Details on
the workloads and numbers are in the notes below.

Background: At the moment, guc_update_engine_gt_clks can be invoked
via one of 3 ways. #1 and #2 are infrequent under normal operating
conditions:
1.When a predefined "ping_delay" timer expires so that GuC-
busyness can sample the GTPM clock counter to ensure it
doesn't miss a wrap-around of the 32-bits of the HW counter.
(The ping_delay is calculated based on 1/8th the time taken
for the counter go from 0x0 to 0xffffffff based on the
GT frequency. This comes to about once every 28 seconds at a
GT frequency of 19.2Mhz).
2.In preparation for a gt reset.
3.In response to __gt_park events (as the gt power management
puts the gt into a lower power state when there is no work
being done).

Root-cause: For both the workloads described farther below, it was
observed that when user space calls IOCTLs that unparks the
gt momentarily and repeats such calls many times in quick succession,
it triggers calling guc_update_engine_gt_clks as many times. However,
the primary purpose of guc_update_engine_gt_clks is to ensure we don't
miss the wraparound while the counter is ticking. Thus, the solution
is to ensure we skip that check if gt_park is calling this function
earlier than necessary.

Solution: Snapshot jiffies when we do actually update the busyness
stats. Then get the new jiffies every time intel_guc_busyness_park
is called and bail if we are being called too soon. Use half of the
ping_delay as a safe threshold.

NOTE1: Workload1: IGTs' gem_create was modified to create a file handle,
allocate memory with sizes that range from a min of 4K to the max supported
(in power of two step-sizes). Its maps, modifies and reads back the
memory. Allocations and modification is repeated until total memory
allocation reaches the max. Then the file handle is closed. With this
workload, guc_update_engine_gt_clks was called over 188 thousand times
in the span of 15 seconds while this test ran three times. With this patch,
the number of calls reduced to 14.

NOTE2: Workload2: 30 transcode sessions are created in quick succession.
While these sessions are created, pcm-iio tool was used to measure I/O
read operation bandwidth consumption sampled at 100 milisecond intervals
over the course of 20 seconds. The total bandwidth consumed over 20 seconds
without this patch was measured at average at 311KBps per sample. With this
patch, the number went down to about 175Kbps which is about a 43% savings.
Signed-off-by: NAlan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: NUmesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Acked-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220623023157.211650-2-alan.previn.teres.alexis@intel.com

59bcdb56

Revert "drm/i915: Hold reference to intel_context over life of i915_request" · bcb9aa45

由 Niranjana Vishwanathapura 提交于 6月 15, 2022

This reverts commit 1e98d8c5.

The problem with this patch is that it makes i915_request to hold a
reference to intel_context, which in turn holds a reference on the VM.
This strong back referencing can lead to reference loops which leads
to resource leak.

An example is the upcoming VM_BIND work which requires VM to hold
a reference to some shared VM specific BO. But this BO's dma-resv
fences holds reference to the i915_request thus leading to reference
loop.

v2:
  Do not use reserved requests for virtual engines
Signed-off-by: NNiranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: NRamalingam C <ramalingam.c@intel.com>
Suggested-by: NMatthew Brost <matthew.brost@intel.com>
Reviewed-by: NMathew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220614184348.23746-3-ramalingam.c@intel.com

bcb9aa45

drm/i915: Do not access rq->engine without a reference · 7307e91b

由 Niranjana Vishwanathapura 提交于 6月 15, 2022

In i915_fence_get_driver_name(), user may not hold a
reference to rq->engine. Hence do not access it. Instead,
store required device private pointer in 'rq->i915' and use it.
Signed-off-by: NNiranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Suggested-by: NMatthew Brost <matthew.brost@intel.com>
Reviewed-by: NMatthew Brost <matthew.brost@intel.com>
Signed-off-by: NRamalingam C <ramalingam.c@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220614184348.23746-2-ramalingam.c@intel.com

7307e91b

27 6月, 2022 3 次提交

drm/i915: Prefer "XEHP_" prefix for registers · 7d809707

由 Matt Roper 提交于 6月 24, 2022

We've been introducing new registers with a mix of "XEHP_"
(architecture) and "XEHPSDV_" (platform) prefixes.  For consistency,
let's settle on "XEHP_" as the preferred form.

XEHPSDV_RP_STATE_CAP stays with its current name since that's truly a
platform-specific register and not something that applies to the Xe_HP
architecture as a whole.
Signed-off-by: NMatt Roper <matthew.d.roper@intel.com>
Reviewed-by: NCaz Yokoyama <caz@caztech.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220624210328.308630-2-matthew.d.roper@intel.com

7d809707

drm/i915: Correct duplicated/misplaced GT register definitions · 8524bb67

由 Matt Roper 提交于 6月 24, 2022

XEHPSDV_FLAT_CCS_BASE_ADDR, GEN8_L3_LRA_1_GPGPU, and MMCD_MISC_CTRL were
duplicated between i915_reg.h and intel_gt_regs.h. These are all GT
registers, so we should drop the copy from i915_reg.h.

XEHPSDV_TILE0_ADDR_RANGE was defined in i915_reg.h, but really belongs
in intel_gt_regs.h. Move it.
Signed-off-by: NMatt Roper <matthew.d.roper@intel.com>
Reviewed-by: NLucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220624210328.308630-1-matthew.d.roper@intel.com

8524bb67

drm/i915: tweak the ordering in cpu_write_needs_clflush · 563aaf4a

由 Matthew Auld 提交于 6月 22, 2022

For imported dma-buf objects we leave the object as cache_coherent = 0
across all platforms, which is reasonable given that have no clue what
the memory underneath is, and its not like the driver can ever manually
clflush the pages anyway (like with i915_gem_clflush_object) for such
objects. However on discrete we choose to treat cache_dirty = true as a
programmer error, leading to a warning. The simplest fix looks to be to
just change the ordering in cpu_write_needs_clflush to prevent ever
setting cache_dirty for dma-buf objects on discrete.

Fixes: d028a769 ("drm/i915/dmabuf: Fix prime_mmap to work when using LMEM")
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/5266Signed-off-by: NMatthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Reviewed-by: NGwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220622155919.355081-1-matthew.auld@intel.com

563aaf4a

23 6月, 2022 3 次提交

drm/i915/selftests: Increase timeout for live_parallel_switch · 373269ae

由 Akeem G Abodunrin 提交于 6月 22, 2022

With GuC submission, it takes a little bit longer switching contexts
among all available engines simultaneously, when running
live_parallel_switch subtest. Increase the timeout.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/5885Signed-off-by: NAkeem G Abodunrin <akeem.g.abodunrin@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: NMatthew Auld <matthew.auld@intel.com>
Signed-off-by: NMatthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220622141104.334432-1-matthew.auld@intel.com

373269ae

drm/i915/gt: Re-do the intel-gtt split · 9ce07d94

由 Lucas De Marchi 提交于 6月 17, 2022

Re-do what was attempted in commit 7a5c9223 ("drm/i915/gt: Split
intel-gtt functions by arch"). The goal of that commit was to split the
handlers for older hardware that depend on intel-gtt.ko so i915 can
be built for non-x86 archs, after some more patches. Other archs do not
need intel-gtt.ko.

Main issue with the previous approach: it moved all the hooks, including
the gen8, which is used by all platforms gen8 and newer.  Re-do the
split moving only the handlers for gen < 6, which are the only ones
calling out to the separate module.

While at it do some minor cleanups:
  - Rename the prefix s/gen5_/gmch_/ to be more accurate what platforms
    are covered by intel_ggtt_gmch.c
  - Remove dead code for gen12 out of needs_idle_maps()
  - Remove TODO comment leftover
  - Re-order if/else ladder in ggtt_probe_hw() to keep newest platforms
    first

v2: Add minor cleanups (Matt Roper)
Signed-off-by: NLucas De Marchi <lucas.demarchi@intel.com>
Acked-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NMatt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220617230559.2109427-2-lucas.demarchi@intel.com

9ce07d94

agp/intel: Rename intel-gtt symbols · 64e06652

由 Lucas De Marchi 提交于 6月 17, 2022

Exporting the symbols like intel_gtt_* creates some confusion inside
i915 that has symbols named similarly. In an attempt to isolate
platforms needing intel-gtt.ko, commit 7a5c9223 ("drm/i915/gt: Split
intel-gtt functions by arch") moved way too much
inside gt/intel_gt_gmch.c, even the functions that don't callout to this
module. Rename the symbols to make the separation clear.
Signed-off-by: NLucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220617230559.2109427-1-lucas.demarchi@intel.com

64e06652

22 6月, 2022 6 次提交

drm/i915/display: Add smem fallback allocation for dpt · 0dc987b6

由 Juha-Pekka Heikkila 提交于 6月 10, 2022

Add fallback smem allocation for dpt if stolen memory
allocation failed.
Signed-off-by: NJuha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: NMatthew Auld <matthew.auld@intel.com>
Signed-off-by: NMatthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220610121205.29645-1-juhapekka.heikkila@gmail.com

0dc987b6

drm/i915: extend i915_vma_pin_iomap() · d976521a

由 CQ Tang 提交于 6月 10, 2022

In the future display might try call this with a normal smem object,
which doesn't require PIN_MAPPABLE underneath in order to CPU map the
pages (unlike stolen).  Extend i915_vma_pin_iomap() to directly use
i915_gem_object_pin_map() for such cases.

This change was suggested by Chris P Wilson, that we pin
the smem with i915_gem_object_pin_map_unlocked().

v2 (jheikkil): Change i915_gem_object_pin_map_unlocked to
               i915_gem_object_pin_map
Signed-off-by: NCQ Tang <cq.tang@intel.com>
Signed-off-by: NJuha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Cc: Chris Wilson <chris.p.wilson@intel.com>
Cc: Jari Tahvanainen <jari.tahvanainen@intel.com>
Reviewed-by: NMatthew Auld <matthew.auld@intel.com>
Signed-off-by: NMatthew Auld <matthew.auld@intel.com>
[mauld: tweak commit message, plus minor checkpatch fix]
Link: https://patchwork.freedesktop.org/patch/msgid/20220610121205.29645-2-juhapekka.heikkila@gmail.com

d976521a

drm/i915: don't leak lmem mapping in vma_evict · afd5cb39

由 Juha-Pekka Heikkila 提交于 6月 10, 2022

Don't leak lmem mapping in vma_evict, move __i915_vma_iounmap outside
i915_vma_is_map_and_fenceable.
Signed-off-by: NJuha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: NMatthew Auld <matthew.auld@intel.com>
Signed-off-by: NMatthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220610121205.29645-3-juhapekka.heikkila@gmail.com

afd5cb39

drm/i915/gem: add missing else · 7482a656

由 katrinzhou 提交于 6月 21, 2022

Add missing else in set_proto_ctx_param() to fix coverity issue.

Addresses-Coverity: ("Unused value")
Fixes: d4433c76 ("drm/i915/gem: Use the proto-context to handle create parameters (v5)")
Suggested-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Nkatrinzhou <katrinzhou@tencent.com>
[tursulin: fixup alignment]
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220621124926.615884-1-tvrtko.ursulin@linux.intel.com

7482a656

drm/i915: Fix spelling typo in comment · 14d6a086

由 pengfuyuan 提交于 6月 16, 2022

Fix spelling typo in comment.
Reported-by: Nk2ci <kernel-bot@kylinos.cn>
Signed-off-by: Npengfuyuan <pengfuyuan@kylinos.cn>
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/tencent_7B226C4A9BC2B5EEB37B70C188B5015D290A@qq.com

14d6a086

drm/i915: Add global forcewake request to drpc · fc98eb49

由 Vinay Belgaumkar 提交于 6月 17, 2022

We have seen multiple RC6 issues where it is useful to know
which global forcewake bits are set. Add this to the 'drpc'
debugfs output.

v2: Review comments (Ashutosh)
Reviewed-by: NAshutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: NVinay Belgaumkar <vinay.belgaumkar@intel.com>
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220617212032.34577-1-vinay.belgaumkar@intel.com

fc98eb49

20 6月, 2022 1 次提交

drm/i915: Improve on suspend / resume time with VT-d enabled · 2ef6efa7

由 Thomas Hellström 提交于 6月 17, 2022

When DMAR / VT-d is enabled, the display engine uses overfetching,
presumably to deal with the increased latency. To avoid display engine
errors and DMAR faults, as a workaround the GGTT is populated with scatch
PTEs when VT-d is enabled. However starting with gen10, Write-combined
writing of scratch PTES is no longer possible and as a result, populating
the full GGTT with scratch PTEs like on resume becomes very slow as
uncached access is needed.

Therefore, on integrated GPUs utilize the fact that the PTEs are stored in
stolen memory which retain content across S3 suspend. Don't clear the PTEs
on suspend and resume. This improves on resume time with around 100 ms.
While 100+ms might appear like a short time it's 10% to 20% of total resume
time and important in some applications.

One notable exception is Intel Rapid Start Technology which may cause
stolen memory to be lost across what the OS percieves as S3 suspend.
If IRST is enabled or if we can't detect whether IRST is enabled, retain
the old workaround, clearing and re-instating PTEs.

As an additional measure, if we detect that the last ggtt pte was lost
during suspend, print a warning and re-populate the GGTT ptes

On discrete GPUs, the display engine scans out from LMEM which isn't
subject to DMAR, and presumably the workaround is therefore not needed,
but that needs to be verified and disabling the workaround for dGPU,
if possible, will be deferred to a follow-up patch.

v2:
- Rely on retained ptes to also speed up suspend and resume re-binding.
- Re-build GGTT ptes if Intel rst is enabled.
v3:
- Re-build GGTT ptes also if we can't detect whether Intel rst is enabled,
and if the guard page PTE and end of GGTT was lost.
v4:
- Fix some kerneldoc issues (Matthew Auld), rebase.
Signed-off-by: NThomas Hellström <thomas.hellstrom@linux.intel.com>
Acked-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: NMatthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220617152856.249295-1-thomas.hellstrom@linux.intel.com

2ef6efa7

17 6月, 2022 4 次提交

drm/i915/gt: Cleanup interface for MCR operations · 3fe6c7f5

由 Matt Roper 提交于 6月 14, 2022

Let's replace the assortment of intel_gt_* and intel_uncore_* functions
that operate on MCR registers with a cleaner set of interfaces:

  * intel_gt_mcr_read -- unicast read from specific instance
  * intel_gt_mcr_read_any[_fw] -- unicast read from any non-terminated
    instance
  * intel_gt_mcr_unicast_write -- unicast write to specific instance
  * intel_gt_mcr_multicast_write[_fw] -- multicast write to all instances

We'll also replace the historic "slice" and "subslice" terminology with
"group" and "instance" to match the documentation for more recent
platforms; these days MCR steering applies to more types of replication
than just slice/subslice.

v2:
 - Reference the new kerneldoc from i915.rst.  (Jani)
 - Tweak the wording of the documentation for a couple functions to
   clarify the difference between "_fw" and non-"_fw" forms.

v3:
 - s/read/write/ to fix copy-paste mistake in a couple comments.
   (Harish)
Signed-off-by: NMatt Roper <matthew.d.roper@intel.com>
Acked-by: NJani Nikula <jani.nikula@linux.intel.com>
Reviewed-by: NHarish Chegondi <harish.chegondi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220615001019.1821989-3-matthew.d.roper@intel.com

3fe6c7f5

drm/i915/gt: Move multicast register handling to a dedicated file · e7858254

由 Matt Roper 提交于 6月 14, 2022

Handling of multicast/replicated registers is spread across intel_gt.c
and intel_uncore.c today. As multicast handling and the related
steering logic gets more complicated with the addition of new platforms
and new rules it makes sense to centralize it all in one place.

For now the existing functions have been moved to the new .c/.h as-is.
Function renames and updates to operate in a more consistent manner will
be done in subsequent patches.
Signed-off-by: NMatt Roper <matthew.d.roper@intel.com>
Acked-by: NJani Nikula <jani.nikula@linux.intel.com>
Reviewed-by: NHarish Chegondi <harish.chegondi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220615001019.1821989-2-matthew.d.roper@intel.com

e7858254

drm/i915/fdinfo: Don't show engine classes not present · 9f1b1d0b

由 Tvrtko Ursulin 提交于 6月 16, 2022

Stop displaying engine classes with no engines - it is not a huge problem
if they are shown, since the values will correctly be all zeroes, but it
does count as misleading.
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: 055634e4 ("drm/i915: Expose client engine utilisation via fdinfo")
Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: NUmesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220616140056.559074-1-tvrtko.ursulin@linux.intel.com

9f1b1d0b

drm/i915: Improve user experience and driver robustness under SIGINT or similar · 45c64ecf

由 Tvrtko Ursulin 提交于 5月 27, 2022

We have long standing customer complaints that pressing Ctrl-C (or to the
effect of) causes engine resets with otherwise well behaving programs.

Not only is logging engine resets during normal operation not desirable
since it creates support incidents, but more fundamentally we should avoid
going the engine reset path when we can since any engine reset introduces
a chance of harming an innocent context.

Reason for this undesirable behaviour is that the driver currently does
not distinguish between banned contexts and non-persistent contexts which
have been closed.

To fix this we add the distinction between the two reasons for revoking
contexts, which then allows the strict timeout only be applied to banned,
while innocent contexts (well behaving) can preempt cleanly and exit
without triggering the engine reset path.

Note that the added context exiting category applies both to closed non-
persistent context, and any exiting context when hangcheck has been
disabled by the user.

At the same time we rename the backend operation from 'ban' to 'revoke'
which more accurately describes the actual semantics. (There is no ban at
the backend level since banning is a concept driven by the scheduling
frontend. Backends are simply able to revoke a running context so that
is the more appropriate name chosen.)
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NAndrzej Hajda <andrzej.hajda@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220527072452.2225610-1-tvrtko.ursulin@linux.intel.com

45c64ecf

16 6月, 2022 1 次提交

drm/i915/pvc: Add recommended MMIO setting · 1556c3b4

由 Matt Roper 提交于 6月 13, 2022

As with past platforms, the bspec's performance tuning guide provides
recommended MMIO settings.  Although not technically "workarounds" we
apply these through the workaround framework to ensure that they're
re-applied at the proper times (e.g., on engine resets) and that any
conflicts with real workarounds are flagged.

Bspec: 72161
Signed-off-by: NMatt Roper <matthew.d.roper@intel.com>
Reviewed-by: NJosé Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220613165314.862029-1-matthew.d.roper@intel.com

1556c3b4

15 6月, 2022 1 次提交

drm/i915/pvc: Adjust EU per SS according to HAS_ONE_EU_PER_FUSE_BIT() · 9affc1b8

由 Matt Roper 提交于 6月 10, 2022

If we're treating each bit in the EU fuse register as a single EU
instead of a pair of EUs, then that also cuts the number of potential
EUs per subslice in half.

Fixes: 5ac342ef ("drm/i915/pvc: Add SSEU changes")
Signed-off-by: NMatt Roper <matthew.d.roper@intel.com>
Reviewed-by: NAnusha Srivatsa <anusha.srivatsa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220610230801.459577-1-matthew.d.roper@intel.com

9affc1b8

09 6月, 2022 1 次提交

drm/i915/pvc: Add register steering · e0d7371b

由 Matt Roper 提交于 6月 08, 2022

Ponte Vecchio no longer has MSLICE or LNCF steering, but the bspec does
document several new types of multicast register ranges. Fortunately,
most of the different MCR types all provide valid values at instance
(0,0) so there's no need to read fuse registers and calculate a
non-terminated instance. We'll lump all of those range types (BSLICE,
HALFBSLICE, TILEPSMI, CC, and L3BANK) into a single category called
"INSTANCE0" to keep things simple. We'll also perform explicit steering
for each of these multicast register types, even if the implicit
steering setup for COMPUTE/DSS ranges would have worked too; this is
based on guidance from our hardware architects who suggested that we
move away from implicit steering and start explicitly steer all MCR
register accesses on modern platforms (we'll work on transitioning
COMPUTE/DSS to explicit steering in the future).

Note that there's one additional MCR range type defined in the bspec
(SQIDI) that we don't handle here. Those ranges use a different
steering control register that we never touch; since instance 0 is also
always a valid setting there, we can just ignore those ranges.

Finally, we'll rename the HAS_MSLICES() macro to HAS_MSLICE_STEERING().
PVC hardware still has units referred to as mslices, but there's no
register steering based on mslice for this platform.

v2:
- Rebase on other recent changes
- Swap two table rows to keep table sorted & easy to read. (Harish)

Bspec: 67609
Signed-off-by: NMatt Roper <matthew.d.roper@intel.com>
Reviewed-by: NHarish Chegondi <harish.chegondi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220608170700.4026648-1-matthew.d.roper@intel.com

e0d7371b

08 6月, 2022 6 次提交

drm/i915/xehp: Correct steering initialization · 17f65658

由 Matt Roper 提交于 6月 07, 2022

Another mistake during the conversion to DSS bitmaps: after retrieving
the DSS ID intel_sseu_find_first_xehp_dss() we forgot to modulo it down
to obtain which ID within the current gslice it is.

Fixes: b87d3901 ("drm/i915/sseu: Disassociate internal subslice mask representation from uapi")
Cc: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Signed-off-by: NMatt Roper <matthew.d.roper@intel.com>
Acked-by: NBalasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220607175716.3338661-1-matthew.d.roper@intel.com

17f65658

drm/i915: More PVC+DG2 workarounds · c5cb0002

由 Matt Roper 提交于 6月 07, 2022

A new PVC+DG2 workaround has appeared recently:
 - Wa_16015675438

And a couple existing DG2 workarounds have been extended to PVC:
 - Wa_14015795083
 - Wa_18018781329

Note that Wa_16015675438 asks us to program a register that is in the
0x2xxx range typically associated with the RCS engine, even though PVC
does not have an RCS.  By default the GuC will think we've made a
mistake and throw an exception when it sees this register on a CCS
engine's save/restore list, so we need to pass an extra GuC control flag
to tell it that this is expected and not a problem.
Signed-off-by: NAnshuman Gupta <anshuman.gupta@intel.com>
Signed-off-by: NBadal Nilawar <badal.nilawar@intel.com>
Signed-off-by: NPrathap Kumar Valsan <prathap.kumar.valsan@intel.com>
Signed-off-by: NMatt Roper <matthew.d.roper@intel.com>
Reviewed-by: NAnshuman Gupta <anshuman.gupta@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220608005108.3717895-1-matthew.d.roper@intel.com

c5cb0002

drm/i915/uc: remove accidental static from a local variable · 5821a0bb

由 Jani Nikula 提交于 5月 11, 2022

The arrays are static const, but the pointer shouldn't be static.

Fixes: 3d832f37 ("drm/i915/uc: Allow platforms to have GuC but not HuC")
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: NJani Nikula <jani.nikula@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220511094619.27889-1-jani.nikula@intel.com

5821a0bb

drm/i915/dg2: Correct DSS check for Wa_1308578152 · 81298056

由 Matt Roper 提交于 6月 07, 2022

When converting our DSS masks to bitmaps, we fumbled the condition used
to check whether any DSS are present in the first gslice. Since
intel_sseu_find_first_xehp_dss() returns a 0-based number, we need a >=
condition rather than >.

Fixes: b87d3901 ("drm/i915/sseu: Disassociate internal subslice mask representation from uapi")
Reported-by: NBalasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Signed-off-by: NMatt Roper <matthew.d.roper@intel.com>
Acked-by: NBalasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220607154724.3155521-1-matthew.d.roper@intel.com

81298056

drm/i915/dg2: Add Wa_14015795083 · c6e38067

由 Anshuman Gupta 提交于 6月 07, 2022

i915 must disable Render DOP clock gating globally.

v2:
- Addressed cosmetic review comments.

Bspec: 52621
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Badal Nilawar <badal.nilawar@intel.com>
Signed-off-by: NAnshuman Gupta <anshuman.gupta@intel.com>
Reviewed-by: NMatt Roper <matthew.d.roper@intel.com>
Signed-off-by: NMatt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220607104542.8559-1-anshuman.gupta@intel.com

c6e38067

drm/i915/client: only include what's needed · 34b68c17

由 Jani Nikula 提交于 6月 07, 2022

Only the uapi header is required.
Signed-off-by: NJani Nikula <jani.nikula@intel.com>
Acked-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220607092207.476653-1-jani.nikula@intel.com

34b68c17

03 6月, 2022 2 次提交

drm/i915/pvc: GuC depriv applies to PVC · f7dad0da

由 Matt Roper 提交于 6月 02, 2022

We missed this setting in the initial device info patch's definition of
XE_HPC_FEATURES.
Signed-off-by: NMatt Roper <matthew.d.roper@intel.com>
Reviewed-by: NJosé Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220602233019.1659283-1-matthew.d.roper@intel.com

f7dad0da

drm/i915: Add extra registers to GPU error dump · b729cfee

由 Stuart Summers 提交于 6月 01, 2022

Our internal teams have identified a few additional engine registers
that are worth inspecting in error state dumps during development &
debug. Let's capture and print them as part of our error dump.

For simplicity we'll just dump these registers on gen11 and beyond.
Most of these registers have existed since earlier platforms (e.g., gen6
or gen7) but were initially introduced only for a subset of the
platforms' engines; gen11 seems to be where they became available on all
engines.
Signed-off-by: NStuart Summers <stuart.summers@intel.com>
Signed-off-by: NMatt Roper <matthew.d.roper@intel.com>
Reviewed-by: NJosé Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220601210646.615946-1-matthew.d.roper@intel.com

b729cfee

02 6月, 2022 5 次提交

drm/i915/pvc: Add SSEU changes · 5ac342ef

由 Matt Roper 提交于 6月 01, 2022

PVC splits the mask of enabled DSS over two registers. It also changes
the meaning of the EU fuse register such that each bit represents a
single EU rather than a pair of EUs.
Signed-off-by: NMatt Roper <matthew.d.roper@intel.com>
Acked-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NBalasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220601150725.521468-7-matthew.d.roper@intel.com

5ac342ef

drm/i915/sseu: Disassociate internal subslice mask representation from uapi · b87d3901

由 Matt Roper 提交于 6月 01, 2022

As with EU masks, it's easier to store subslice/DSS masks internally in
a format that's more natural for the driver to work with, and then only
covert into the u8[] uapi form when the query ioctl is invoked.  Since
the hardware design changed significantly with Xe_HP, we'll use a union
to choose between the old "hsw-style" subslice masks or the newer xehp
mask.  HSW-style masks will be stored in an array of u8's, indexed by
slice (there's never more than 6 subslices per slice on older
platforms).  For Xe_HP and beyond where slices no longer exist, we only
need a single bitmask.  However we already know that this mask is
eventually going to grow too large for a simple u64 to hold, so we'll
represent it in a manner that can be operated on by the utilities in
linux/bitmap.h.

v2:
 - Fix typo: BIT(s) -> BIT(ss) in gen9_sseu_device_status()

v3:
 - Eliminate sseu->ss_stride and just calculate the stride while
   specifically handling uapi.  (Tvrtko)
 - Use BITMAP_BITS() macro to refer to size of masks rather than
   passing I915_MAX_SS_FUSE_BITS directly.  (Tvrtko)
 - Report compute/geometry DSS masks separately when dumping Xe_HP SSEU
   info.  (Tvrtko)
 - Restore dropped range checks to intel_sseu_has_subslice().  (Tvrtko)

v4:
 - Make the bitmap size macro check the size of the .xehp field rather
   than the containing union.  (Tvrtko)
 - Don't add GEM_BUG_ON() intel_sseu_has_subslice()'s check for whether
   slice or subslice ID exceed sseu->max_[sub]slices; various loops
   in the driver are expected to exceed these, so we should just
   silently return 'false.'

v5:
 - Move XEHP_BITMAP_BITS() to the header so that we can also replace a
   usage of I915_MAX_SS_FUSE_BITS in one of the inline functions.
   (Bala)
 - Change the local variable in intel_slicemask_from_xehp_dssmask() from
   u16 to 'unsigned long' to make it a bit more future-proof.

Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Signed-off-by: NMatt Roper <matthew.d.roper@intel.com>
Acked-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NBalasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220601150725.521468-6-matthew.d.roper@intel.com

b87d3901

drm/i915/sseu: Don't try to store EU mask internally in UAPI format · bc3c5e08

由 Matt Roper 提交于 6月 01, 2022

Storing the EU mask internally in the same format the I915_QUERY
topology queries use makes the final copy_to_user() a bit simpler, but
makes the rest of the driver's SSEU more complicated and harder to
follow.  Let's switch to an internal representation that's more natural:
Xe_HP platforms will be a simple array of u16 masks, whereas pre-Xe_HP
platforms will be a two-dimensional array, indexed by [slice][subslice].
We'll convert to the uapi format only when the query uapi is called.

v2:
 - Drop has_common_ss_eumask.  We waste some space repeating identical
   EU masks for every single DSS, but the code is simpler without it.
   (Tvrtko)

v3:
 - Mask down EUs passed to sseu_set_eus at the callsite rather than
   inside the function.  (Tvrtko)
 - Eliminate sseu->eu_stride and calculate it when needed.  (Tvrtko)

Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Signed-off-by: NMatt Roper <matthew.d.roper@intel.com>
Acked-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NBalasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220601150725.521468-5-matthew.d.roper@intel.com

bc3c5e08

drm/i915/sseu: Simplify gen11+ SSEU handling · 4cfd1665

由 Matt Roper 提交于 6月 01, 2022

Although gen11 and gen12 architectures supported the concept of multiple
slices, in practice all the platforms that were actually designed only
had a single slice (i.e., note the parameters to 'intel_sseu_set_info'
that we pass for each platform).  We can simplify the code slightly by
dropping the multi-slice logic from gen11+ platforms.

v2:
 - Promote drm_dbg to drm_WARN_ON if the slice fuse register reports
   unexpected fusing.  (Tvrtko)

Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Signed-off-by: NMatt Roper <matthew.d.roper@intel.com>
Acked-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NBalasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220601150725.521468-4-matthew.d.roper@intel.com

4cfd1665

drm/i915/xehp: Drop GETPARAM lookups of I915_PARAM_[SUB]SLICE_MASK · aa2bdc48

由 Matt Roper 提交于 6月 01, 2022

Slice/subslice/EU information should be obtained via the topology
queries provided by the I915_QUERY interface; let's turn off support for
the old GETPARAM lookups on Xe_HP and beyond where we can't return
meaningful values.

The slice mask lookup is meaningless since Xe_HP doesn't support
traditional slices (and we make no attempt to return the various new
units like gslices, cslices, mslices, etc.) here.

The subslice mask lookup is even more problematic; given the distinct
masks for geometry vs compute purposes, the combined mask returned here
is likely not what userspace would want to act upon anyway. The value
is also limited to 32-bits by the nature of the GETPARAM ioctl which is
sufficient for the initial Xe_HP platforms, but is unable to convey the
larger masks that will be needed on other upcoming platforms. Finally,
the value returned here becomes even less meaningful when used on
multi-tile platforms where each tile will have its own masks.
Signed-off-by: NMatt Roper <matthew.d.roper@intel.com>
Acked-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> # mesa
Reviewed-by: NBalasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220601150725.521468-3-matthew.d.roper@intel.com

aa2bdc48

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功