- 12 9月, 2022 15 次提交
-
-
由 Matt Roper 提交于
Top-level handling of standalone media interrupts will be processed as part of the primary GT's interrupt handler (since primary and media GTs share an MMIO space, unlike remote tile setups). When we get down to the point of handling engine interrupts, we need to take care to lookup VCS and VECS engines in the media GT rather than the primary. There are also a couple of additional "other" instance bits that correspond to the media GT's GuC and media GT's power management interrupts; we need to direct those to the media GT instance as well. Bspec: 45605 Cc: Anusha Srivatsa <anusha.srivatsa@intel.com> Signed-off-by: NMatt Roper <matthew.d.roper@intel.com> Reviewed-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220906234934.3655440-15-matthew.d.roper@intel.comSigned-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
由 Matt Roper 提交于
When we hook up interrupts (in the next patch), interrupts for the media GT are still processed as part of the primary GT's interrupt flow. As such, we should share the same IRQ lock with the primary GT. Let's convert gt->irq_lock into a pointer and just point the media GT's instance at the same lock the primary GT is using. v2: - Point media's gt->irq_lock at the primary GT lock properly. (Daniele) - Fix jump target for intel_root_gt_init_early errors. (Daniele) Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: NMatt Roper <matthew.d.roper@intel.com> Reviewed-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220906234934.3655440-14-matthew.d.roper@intel.comSigned-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
由 Matt Roper 提交于
Xe_LPM+ platforms have "standalone media." I.e., the media unit is designed as an additional GT with its own engine list, GuC, forcewake, etc. Let's allow platforms to include media GTs in their device info. v2: - Simplify GSI register handling and split it out to a separate patch for ease of review. (Daniele) Cc: Aravind Iddamsetty <aravind.iddamsetty@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: NMatt Roper <matthew.d.roper@intel.com> Reviewed-by: NAravind Iddamsetty <aravind.iddamsetty@intel.com> Acked-by: NAravind Iddamsetty <aravind.iddamsetty@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220906234934.3655440-13-matthew.d.roper@intel.comSigned-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
由 Matt Roper 提交于
The aux table invalidation registers are a bit unique --- they're engine-centric registers that reside in the GSI register space rather than within the engines' regular MMIO ranges. That means that when issuing invalidation on engines in the standalone media GT, the GSI offset must be added to the regular MMIO offset for the invalidation registers. Cc: Aravind Iddamsetty <aravind.iddamsetty@intel.com> Signed-off-by: NMatt Roper <matthew.d.roper@intel.com> Reviewed-by: NAravind Iddamsetty <aravind.iddamsetty@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220906234934.3655440-12-matthew.d.roper@intel.comSigned-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
由 Matt Roper 提交于
GT non-engine registers (referred to as "GSI" registers by the spec) have the same relative offsets on standalone media as they do on the primary GT, just with an additional "GSI offset" added to their MMIO address. If we store this GSI offset in the standalone media's intel_uncore structure, it can be automatically applied to all GSI reg reads/writes that happen on that GT, allowing us to re-use our existing GT code with minimal changes. Forcewake and shadowed register tables for the media GT (which will be added in a future patch) are listed as final addresses that already include the GSI offset, so we also need to add the GSI offset before doing lookups of registers in one of those tables. v2: - Add comment on raw_reg_*() macros explaining why we don't bother with GSI offsets in them. (Daniele) Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: NMatt Roper <matthew.d.roper@intel.com> Reviewed-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220908224550.821257-1-matthew.d.roper@intel.comSigned-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
由 Matt Roper 提交于
The common early GT init is needed for initialization of all GT types (root/primary, remote tile, standalone media). Since standalone media (coming in a future patch) will be implemented in a separate file, rename and expose the function for use. Signed-off-by: NMatt Roper <matthew.d.roper@intel.com> Reviewed-by: NRadhakrishna Sripada <radhakrishna.sripada@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220906234934.3655440-7-matthew.d.roper@intel.comSigned-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
由 Matt Roper 提交于
We're going to introduce an additional intel_gt for MTL's media unit soon. Let's provide a bit more multi-GT initialization framework in preparation for that. The initialization will pull the list of GTs for a platform from the device info structure. Although necessary for the immediate MTL media enabling, this same framework will also be used farther down the road when we enable remote tiles on xehpsdv and pvc. v2: - Re-add missing test for !HAS_EXTRA_GT_LIST in intel_gt_probe_all(). v3: - Move intel_gt_definition struct to intel_gt_types.h. (Jani) - Drop gtdef->setup(). For now we'll just use a switch() based on GT type since we don't have too many different handlers for the foreseeable future. (Jani) Cc: Aravind Iddamsetty <aravind.iddamsetty@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Signed-off-by: NMatt Roper <matthew.d.roper@intel.com> Reviewed-by: NRadhakrishna Sripada <radhakrishna.sripada@intel.com> Reviewed-by: NAravind Iddamsetty <aravind.iddamsetty@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220906234934.3655440-6-matthew.d.roper@intel.comSigned-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
由 Matt Roper 提交于
Unmapping of the MMIO range can be done as a DRM-managed action, which will take care of the unmapping on device teardown and error paths. This will also ensure proper ordering with respect to other DRM-managed actions that we'll be using to clean up non-primary GTs in upcoming patches. We have not yet enabled any non-root GTs in the driver yet, so the kfree() of the GT structure is effectively dead code. When we do start enabling non-root GTs in upcoming patches, those are going to be using DRM-managed allocations tied to the device lifetime, so we don't need to explicitly free them (and kfree would be incorrect anyway). Signed-off-by: NMatt Roper <matthew.d.roper@intel.com> Reviewed-by: NLucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220906234934.3655440-5-matthew.d.roper@intel.comSigned-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
由 Matt Roper 提交于
We're slowly transitioning the init-time kzalloc's of the driver over to DRM-managed allocations; let's make sure the uncore objects allocated for non-root GTs are thus allocated. Signed-off-by: NMatt Roper <matthew.d.roper@intel.com> Reviewed-by: NRadhakrishna Sripada <radhakrishna.sripada@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220906234934.3655440-4-matthew.d.roper@intel.comSigned-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
由 Matt Roper 提交于
The original intent of intel_uncore_mmio_debug as described in commit 0a9b2630 ("drm/i915: split out uncore_mmio_debug") was to be a singleton structure that could be shared between multiple GTs' uncore objects in a multi-tile system. Somehow we went off track and started allocating separate instances of this structure for each GT, which defeats that original goal. But in reality, there isn't even a need to share the mmio_debug between multiple GTs; on all modern platforms (i.e., everything after gen7) unclaimed register accesses are something that can only be detected for display registers. There's no point in grabbing the debug spinlock and checking for unclaimed accesses on an uncore used by an xehpsdv or pvc remote tile GT, or the uncore used by a mtl standalone media GT since all of the display accesses go through the primary intel_uncore. The simplest solution is to simply leave uncore->debug NULL on all intel_uncore instances except for the primary one. This will allow us to avoid the pointless debug spinlock acquisition we've been doing on MMIO accesses coming in through these intel_uncores. Signed-off-by: NMatt Roper <matthew.d.roper@intel.com> Reviewed-by: NRadhakrishna Sripada <radhakrishna.sripada@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220906234934.3655440-3-matthew.d.roper@intel.comSigned-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
由 Umesh Nerlige Ramappa 提交于
The worker is canceled in gt_park path, but earlier it was assumed that gt_park path cannot sleep and the cancel is asynchronous. This caused a race with suspend flow where the worker runs after suspend and causes an unclaimed register access warning. Cancel the worker synchronously since the gt_park is indeed allowed to sleep. v2: Fix author name and sign-off mismatch Signed-off-by: NUmesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/4419 Fixes: 77cdd054 ("drm/i915/pmu: Connect engine busyness stats from GuC to pmu") Reviewed-by: NAshutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220827002135.139349-1-umesh.nerlige.ramappa@intel.comSigned-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
由 Tomas Winkler 提交于
GSC requires more operational memory than available on chip. Reserve 4M of LMEM for GSC operation. The memory is provided to the GSC as struct resource to the auxiliary data of the child device. Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Signed-off-by: NTomas Winkler <tomas.winkler@intel.com> Signed-off-by: NAlexander Usyskin <alexander.usyskin@intel.com> Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: NAlan Previn <alan.previn.teres.alexis@intel.com> Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220907215113.1596567-16-tomas.winkler@intel.comSigned-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
由 Alexander Usyskin 提交于
Define GSC on XeHP SDV (Intel(R) dGPU without display) XeHP SDV uses the same hardware settings as DG1, but uses polling instead of interrupts and runs the firmware in slow pace due to hardware limitations. Signed-off-by: NVitaly Lubart <vitaly.lubart@intel.com> Signed-off-by: NTomas Winkler <tomas.winkler@intel.com> Signed-off-by: NAlexander Usyskin <alexander.usyskin@intel.com> Reviewed-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220907215113.1596567-6-tomas.winkler@intel.comSigned-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
由 Alexander Usyskin 提交于
Add slow_firmware flag to the gsc device definition and pass it to mei auxiliary device, this instructs the driver to use longer operation timeouts. Signed-off-by: NAlexander Usyskin <alexander.usyskin@intel.com> Signed-off-by: NTomas Winkler <tomas.winkler@intel.com> Reviewed-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220907215113.1596567-5-tomas.winkler@intel.comSigned-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
由 Vitaly Lubart 提交于
Some platforms require the host to poll on the GSC registers instead of relaying on the interrupts. For those platforms, irq initialization should be skipped Signed-off-by: NVitaly Lubart <vitaly.lubart@intel.com> Signed-off-by: NTomas Winkler <tomas.winkler@intel.com> Signed-off-by: NAlexander Usyskin <alexander.usyskin@intel.com> Reviewed-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220907215113.1596567-2-tomas.winkler@intel.comSigned-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
- 08 9月, 2022 3 次提交
-
-
So far, different views (normal, partial, rotated and remapped) into the same object are only supported for GGTT mappings. But with the upcoming VM_BIND feature, PPGTT will also use the partial view mapping. Hence rename ggtt_view to more generic gtt_view. Signed-off-by: NNiranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Acked-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220901183854.3446-1-niranjana.vishwanathapura@intel.com
-
由 John Harrison 提交于
With the move to un-versioned filenames, it becomes more difficult to know exactly what version of a given firmware is being used. So add the patch level version number to the debugfs output. Also, support matching by patch level when selecting code paths for firmware compatibility. While a patch level change cannot be backwards breaking, it is potentially possible that a new feature only works from a given patch level onwards (even though it was theoretically added in an earlier version that bumped the major or minor version). Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com> Reviewed-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220906230147.479945-2-daniele.ceraolospurio@intel.com
-
由 John Harrison 提交于
There was a misunderstanding in how firmware file compatibility should be managed within i915. This has been clarified as: i915 must support all existing firmware releases forever new minor firmware releases should replace prior versions only backwards compatibility breaking releases should be a new file This patch cleans up the single fallback file support that was added as a quick fix emergency effort. That is now removed in preference to supporting arbitrary numbers of firmware files per platform. The patch also adds support for having GuC firmware files that are named by major version only (because the major version indicates backwards breaking changes that affect the KMD) and for having HuC firmware files with no version number at all (because the KMD has no interface requirements with the HuC). For GuC, the driver will report via dmesg if the found file is older than expected. For HuC, the KMD will no longer require updating for any new HuC release so will not be able to report what the latest expected version is. Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com> Reviewed-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220906230147.479945-1-daniele.ceraolospurio@intel.com
-
- 07 9月, 2022 1 次提交
-
-
由 Rodrigo Vivi 提交于
Specially in GT reset case this could be triggered and try to disable things that had never been enabled. Let's add some protection here. Cc: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: NRodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: NAshutosh Dixit <ashutosh.dixit@intel.com> Acked-by: NNirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220902095126.373036-1-rodrigo.vivi@intel.com
-
- 02 9月, 2022 1 次提交
-
-
由 Rodrigo Vivi 提交于
We need to inform PCODE of a desired ring frequencies so PCODE update the memory frequencies to us. rps->min_freq and rps->max_freq are the frequencies used in that request. However they were unset when SLPC was enabled and PCODE never updated the memory freq. v2 (as Suggested by Ashutosh): if SLPC is in use, let's pick the right frequencies from the get_ia_constants instead of the fake init of rps' min and max. v3: don't forget the max <= min return v4: Move all the freq conversion to intel_rps.c. And the max <= min check to where it belongs. v5: (Ashutosh) Fix old comment s/50 HZ/50 MHz and add a doc explaining the "raw format" Fixes: 7ba79a67 ("drm/i915/guc/slpc: Gate Host RPS when SLPC is enabled") Cc: <stable@vger.kernel.org> # v5.15+ Cc: Ashutosh Dixit <ashutosh.dixit@intel.com> Tested-by: NSushma Venkatesh Reddy <sushma.venkatesh.reddy@intel.com> Signed-off-by: NRodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: NAshutosh Dixit <ashutosh.dixit@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220831214538.143950-1-rodrigo.vivi@intel.com
-
- 01 9月, 2022 1 次提交
-
-
由 Rodrigo Vivi 提交于
Fix for intel_guc_slpc_set_min_freq() warn: inconsistent returns '&slpc->lock'. v2: Avoid with_intel_runtime_pm with the internal goto/return. (Ashutosh) Also standardize the 'ret' if this came from the efficient setup. And avoid the 'unlikely'. Fixes: 95ccf312 ("drm/i915/guc/slpc: Allow SLPC to use efficient frequency") Reported-by: Nkernel test robot <lkp@intel.com> Reported-by: NDan Carpenter <dan.carpenter@oracle.com> Cc: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: NRodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: NAshutosh Dixit <ashutosh.dixit@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220830193537.52201-1-rodrigo.vivi@intel.com
-
- 31 8月, 2022 2 次提交
-
-
由 Matt Roper 提交于
On client DG2 platforms, optimal performance is achieved with the hardware's default "age based" thread execution setting. However on ATS-M, switching this to "round robin after dependencies" provides better performance. We'll add a new "tuning" feature flag to the ATS-M device info to enable/disable this setting. Bspec: 68331 Cc: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: NMatt Roper <matthew.d.roper@intel.com> Reviewed-by: NMatt Atwood <matthew.s.atwood@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220826212718.409948-1-matthew.d.roper@intel.com
-
由 Matt Roper 提交于
This reverts commit ca692081. The intent of Wa_14015141709 was to inform us that userspace can no longer control object-level preemption as it has on past platforms (i.e., by twiddling register bit CS_CHICKEN1[0]). The description of the workaround in the spec wasn't terribly well-written, and when we requested clarification from the hardware teams we were told that on the kernel side we should also probably stop setting FF_SLICE_CS_CHICKEN1[14], which is the register bit that directs the hardware to honor the settings in per-context register CS_CHICKEN1. It turns out that this guidance about FF_SLICE_CS_CHICKEN1[14] was a mistake; even though CS_CHICKEN1[0] is non-operational and useless to userspace, there are other bits in the register that do still work and might need to be adjusted by userspace in the future (e.g., to implement other workarounds that show up). If we don't set FF_SLICE_CS_CHICKEN1[14] in i915, then those future workarounds would not take effect. This miscommunication came to light because another workaround (Wa_16013994831) has now shown up that requires userspace to adjust the value of CS_CHICKEN[10] in certain circumstances. To ensure userspace's updates to this chicken bit are handled properly by the hardware, we need to make sure that FF_SLICE_CS_CHICKEN1[14] is once again set by the kernel. Signed-off-by: NMatt Roper <matthew.d.roper@intel.com> Reviewed-by: NMatt Atwood <matthew.s.atwood@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220826210233.406482-1-matthew.d.roper@intel.com
-
- 29 8月, 2022 1 次提交
-
-
由 Joonas Lahtinen 提交于
Remove the module parameters for configuring GuC log size. We should instead rely on tuning the defaults to be usable for reporting bugs. v2: - Use correct 1M unit Fixes: 8ad0152a ("drm/i915/guc: Make GuC log sizes runtime configurable") Signed-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Reviewed-by: NJani Nikula <jani.nikula@intel.com> Reviewed-by: NRodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220826092343.184568-1-joonas.lahtinen@linux.intel.com
-
- 26 8月, 2022 2 次提交
-
-
由 Matt Roper 提交于
Although register tuning settings are generally implemented via the workaround infrastructure, it turns out that the DRAW_WATERMARK register is not properly saved/restored by hardware around power events (i.e., RC6 entry) so updates to the value cannot be applied in the usual manner. New workaround Wa_16014892111 informs us that any tuning updates to this register must instead be applied via an INDIRECT_CTX batch buffer. This will ensure that the necessary value is re-applied when a context begins running, even if an RC6 entry had wiped the register back to hardware defaults since the last context ran. Fixes: 6dc85721 ("drm/i915/dg2: Add additional tuning settings") Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/6642Signed-off-by: NMatt Roper <matthew.d.roper@intel.com> Reviewed-by: NBalasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220823202449.83727-1-matthew.d.roper@intel.com
-
由 Matt Roper 提交于
Unlike the Xe_HP platforms, MTL only has a single CCS engine; the quad-based engine masking logic does not apply to this platform (or presumably any future platforms that only have 0 or 1 CCS). Signed-off-by: NMatt Roper <matthew.d.roper@intel.com> Signed-off-by: NRadhakrishna Sripada <radhakrishna.sripada@intel.com> Reviewed-by: NBalasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220818234202.451742-5-radhakrishna.sripada@intel.com
-
- 24 8月, 2022 2 次提交
-
-
由 Vinay Belgaumkar 提交于
Host Turbo operates at efficient frequency when GT is not idle unless the user or workload has forced it to a higher level. Replicate the same behavior in SLPC by allowing the algorithm to use efficient frequency. We had disabled it during boot due to concerns that it might break kernel ABI for min frequency. However, this is not the case since SLPC will still abide by the (min,max) range limits. With this change, min freq will be at efficient frequency level at init instead of fused min (RPn). If user chooses to reduce min freq below the efficient freq, we will turn off usage of efficient frequency and honor the user request. When a higher value is written, it will get toggled back again. The patch also corrects the register which needs to be read for obtaining the correct efficient frequency for Gen9+. We see much better perf numbers with benchmarks like glmark2 with efficient frequency usage enabled as expected. v2: Address review comments (Rodrigo) v3: with efficient frequency being dynamic, it is possible that the req frequency may go beyond max freq. This will cause SLPC selftests to fail. Add a FIXME there to start the test with [RPn, RP0] instead and restore it afterwards. BugLink: https://gitlab.freedesktop.org/drm/intel/-/issues/5468 Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: NVinay Belgaumkar <vinay.belgaumkar@intel.com> Signed-off-by: NRodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220820010832.15350-1-vinay.belgaumkar@intel.com
-
由 Jani Nikula 提交于
Commit 368d179a ("drm/i915/guc: Add GuC <-> kernel time stamp translation information") added intel_device_info_print_runtime() in the time info dump for no obvious reason or explanation in the commit message. It only logs the rawclk freq. Remove it. Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Signed-off-by: NJani Nikula <jani.nikula@intel.com> Reviewed-by: NJohn Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/b395ac4c909042f5daabf29959d8733993545aa2.1660910433.git.jani.nikula@intel.com
-
- 21 8月, 2022 1 次提交
-
-
由 Matthew Auld 提交于
This reverts commit 6a079903. Everything in CI using GuC is now timing out[1], and killing the machine with this change (perhaps a deadlock?). CI was recently on fire due to some changes coming in from -rc1, so likely the pre-merge CI results for this series were invalid? For now just revert, unless GuC experts already have a fix in mind. [1] https://intel-gfx-ci.01.org/tree/drm-tip/index.html? Signed-off-by: NMatthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: NJohn Harrison <John.C.Harrison@Intel.com> Signed-off-by: NLucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220819123904.913750-1-matthew.auld@intel.com
-
- 19 8月, 2022 1 次提交
-
-
由 Matthew Brost 提交于
Add a delay, configurable via debugfs (default 34ms), to disable scheduling of a context after the pin count goes to zero. Disable scheduling is a costly operation as it requires synchronizing with the GuC. So the idea is that a delay allows the user to resubmit something before doing this operation. This delay is only done if the context isn't closed and less than a given threshold (default is 3/4) of the guc_ids are in use. As temporary WA disable this feature for the selftests. Selftests are very timing sensitive and any change in timing can cause failure. A follow up patch will fixup the selftests to understand this delay. Alan Previn: Matt Brost first introduced this series back in Oct 2021. However no real world workload with measured performance impact was available to prove the intended results. Today, this series is being republished in response to a real world workload that benefited greatly from it along with measured performance improvement. Workload description: 36 containers were created on a DG2 device where each container was performing a combination of 720p 3d game rendering and 30fps video encoding. The workload density was configured in a way that guaranteed each container to ALWAYS be able to render and encode no less than 30fps with a predefined maximum render + encode latency time. That means the totality of all 36 containers and their workloads were not saturating the engines to their max (in order to maintain just enough headrooom to meet the min fps and max latencies of incoming container submissions). Problem statement: It was observed that the CPU core processing the i915 soft IRQ work was experiencing severe load. Using tracelogs and an instrumentation patch to count specific i915 IRQ events, it was confirmed that the majority of the CPU cycles were caused by the gen11_other_irq_handler() -> guc_irq_handler() code path. The vast majority of the cycles was determined to be processing a specific G2H IRQ: i.e. INTEL_GUC_ACTION_SCHED_CONTEXT_MODE_DONE. These IRQs are sent by GuC in response to i915 KMD sending H2G requests: INTEL_GUC_ACTION_SCHED_CONTEXT_MODE_SET. Those H2G requests are sent whenever a context goes idle so that we can unpin the context from GuC. The high CPU utilization % symptom was limiting density scaling. Root Cause Analysis: Because the incoming execution buffers were spread across 36 different containers (each with multiple contexts) but the system in totality was NOT saturated to the max, it was assumed that each context was constantly idling between submissions. This was causing a thrashing of unpinning contexts from GuC at one moment, followed quickly by repinning them due to incoming workload the very next moment. These event-pairs were being triggered across multiple contexts per container, across all containers at the rate of > 30 times per sec per context. Metrics: When running this workload without this patch, we measured an average of ~69K INTEL_GUC_ACTION_SCHED_CONTEXT_MODE_DONE events every 10 seconds or ~10 million times over ~25+ mins. With this patch, the count reduced to ~480 every 10 seconds or about ~28K over ~10 mins. The improvement observed is ~99% for the average counts per 10 seconds. Signed-off-by: NMatthew Brost <matthew.brost@intel.com> Signed-off-by: NAlan Previn <alan.previn.teres.alexis@intel.com> Reviewed-by: NJohn Harrison <John.C.Harrison@Intel.com> Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220817020511.2180747-3-alan.previn.teres.alexis@intel.com
-
- 18 8月, 2022 8 次提交
-
-
由 Daniele Ceraolo Spurio 提交于
If the GuC CTs are full and we need to stall the request submission while waiting for space, we save the stalled request and where the stall occurred; when the CTs have space again we pick up the request submission from where we left off. If a full GT reset occurs, the state of all contexts is cleared and all non-guilty requests are unsubmitted, therefore we need to restart the stalled request submission from scratch. To make sure that we do so, clear the saved request after a reset. Fixes note: the patch that introduced the bug is in 5.15, but no officially supported platform had GuC submission enabled by default in that kernel, so the backport to that particular version (and only that one) can potentially be skipped. Fixes: 925dc1cf ("drm/i915/guc: Implement GuC submission tasklet") Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: John Harrison <john.c.harrison@intel.com> Cc: <stable@vger.kernel.org> # v5.15+ Reviewed-by: NJohn Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220811210812.3239621-1-daniele.ceraolospurio@intel.com
-
由 Daniele Ceraolo Spurio 提交于
The test needs GT reset to trigger the scrubbing logic, so we can only run it when reset is enabled. Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: John Harrison <john.c.harrison@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: NJohn Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220708224158.929327-1-daniele.ceraolospurio@intel.com
-
由 John Harrison 提交于
Some debug code got left in when the GuC based register save for error capture was added. Remove that. Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com> Reviewed-by: NAlan Previn <alan.previn.teres.alexis@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220728022028.2190627-8-John.C.Harrison@Intel.com
-
由 John Harrison 提交于
The GuC log buffer sizes had to be configured statically at compile time. This can be quite troublesome when needing to get larger logs out of a released driver. So re-organise the code to allow a boot time module parameter override. Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com> Reviewed-by: NAlan Previn <alan.previn.teres.alexis@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220728022028.2190627-7-John.C.Harrison@Intel.com
-
由 Chris Wilson 提交于
Use a temporary page and mempy_from_wc to reduce the time it takes to dump the guc log to debugfs. Signed-off-by: NChris Wilson <chris.p.wilson@intel.com> Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com> Reviewed-by: NJohn Harrison <John.C.Harrison@Intel.com> Reviewed-by: NAlan Previn <alan.previn.teres.alexis@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220728022028.2190627-6-John.C.Harrison@Intel.com
-
由 John Harrison 提交于
It is useful to be able to match GuC events to kernel events when looking at the GuC log. That requires being able to convert GuC timestamps to kernel time. So, when dumping error captures and/or GuC logs, include a stamp in both time zones plus the clock frequency. Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com> Reviewed-by: NAlan Previn <alan.previn.teres.alexis@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220728022028.2190627-4-John.C.Harrison@Intel.com
-
由 John Harrison 提交于
There was a size check to warn if the GuC error state capture buffer allocation would be too small to fit a reasonable amount of capture data for the current platform. Unfortunately, the test was done too early in the boot sequence and was actually testing 'if(-ENODEV > size)'. Move the check to be later. The check is only used to print a warning message, so it doesn't really matter how early or late it is done. Note that it is not possible to dynamically size the buffer because the allocation needs to be done before the engine information is available (at least, it would be in the intended two-phase GuC init process). Now that the check works, it is reporting size too small for newer platforms. The check includes a 3x oversample multiplier to allow for multiple error captures to be bufferd by GuC before i915 has a chance to read them out. This is less important than simply being big enough to fit the first capture. So a) bump the default size to be large enough for one capture minimum and b) make the warning only if one capture won't fit, instead use a notice for the 3x size. Note that the size estimate is a worst case scenario. Actual captures will likely be smaller. Lastly, use drm_warn istead of DRM_WARN as the former provides more infmration and the latter is deprecated. Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com> Reviewed-by: NAlan Previn <alan.previn.teres.alexis@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220728022028.2190627-3-John.C.Harrison@Intel.com
-
由 Alan Previn 提交于
Add a helper to get GuC log buffer size. Signed-off-by: NAlan Previn <alan.previn.teres.alexis@intel.com> Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com> Reviewed-by: NMatthew Brost <matthew.brost@intel.com> Reviewed-by: NAlan Previn <alan.previn.teres.alexis@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220728022028.2190627-2-John.C.Harrison@Intel.com
-
- 17 8月, 2022 2 次提交
-
-
由 Matt Roper 提交于
Some additional MMIO tuning settings have appeared in the bspec's performance tuning guide section. One of the tuning settings here is also documented as formal workaround Wa_22012654132 for some steppings of DG2. However the tuning setting applies to all DG2 variants and steppings, making it a superset of the workaround. v2: - Move DRAW_WATERMARK to engine workaround section. It only moves into the engine context on future platforms. (Lucas) - CHICKEN_RASTER_2 needs to be handled as a masked register. (Lucas) Bspec: 68331 Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: NMatt Roper <matthew.d.roper@intel.com> Reviewed-by: NLucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220816210601.2041572-2-matthew.d.roper@intel.com
-
由 Matt Roper 提交于
The bspec performance tuning section gives recommended settings that the driver should program for various MMIO registers. Although these settings aren't "workarounds" we use the workaround infrastructure to do this programming to make sure it is handled at the appropriate places and doesn't conflict with any real workarounds. Since more of these are starting to show up on recent platforms, it's a good time to create a dedicated function to hold them so that there's less ambiguity about how/where to implement new ones. Cc: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: NMatt Roper <matthew.d.roper@intel.com> Reviewed-by: NLucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220816210601.2041572-1-matthew.d.roper@intel.com
-