- 29 1月, 2019 3 次提交
-
-
由 Chris Wilson 提交于
Now that we have allocated ourselves a cacheline to store a breadcrumb, we can emit a write from the GPU into the timeline's HWSP of the per-context seqno as we complete each request. This drops the mirroring of the per-engine HWSP and allows each context to operate independently. We do not need to unwind the per-context timeline, and so requests are always consistent with the timeline breadcrumb, greatly simplifying the completion checks as we no longer need to be concerned about the global_seqno changing mid check. One complication though is that we have to be wary that the request may outlive the HWSP and so avoid touching the potentially danging pointer after we have retired the fence. We also have to guard our access of the HWSP with RCU, the release of the obj->mm.pages should already be RCU-safe. At this point, we are emitting both per-context and global seqno and still using the single per-engine execution timeline for resolving interrupts. v2: s/fake_complete/mark_complete/ Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190128181812.22804-5-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Allocate a page for use as a status page by a group of timelines, as we only need a dword of storage for each (rounded up to the cacheline for safety) we can pack multiple timelines into the same page. Each timeline will then be able to track its own HW seqno. v2: Reuse the common per-engine HWSP for the solitary ringbuffer timeline, so that we do not have to emit (using per-gen specialised vfuncs) the breadcrumb into the distinct timeline HWSP and instead can keep on using the common MI_STORE_DWORD_INDEX. However, to maintain the sleight-of-hand for the global/per-context seqno switchover, we will store both temporarily (and so use a custom offset for the shared timeline HWSP until the switch over). v3: Keep things simple and allocate a page for each timeline, page sharing comes next. v4: I was caught repeating the same MI_STORE_DWORD_IMM over and over again in selftests. v5: And caught red handed copying create timeline + check. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190128181812.22804-3-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Currently we only allocate an object and vma if we are using a GGTT virtual HWSP, and a plain struct page for a physical HWSP. For convenience later on with global timelines, it will be useful to always have the status page being tracked by a struct i915_vma. Make it so. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMatthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190128102356.15037-4-chris@chris-wilson.co.uk
-
- 25 1月, 2019 3 次提交
-
-
由 Chris Wilson 提交于
Now that the submission backends are controlled via their own spinlocks, with a wave of a magic wand we can lift the struct_mutex requirement around GPU reset. That is we allow the submission frontend (userspace) to keep on submitting while we process the GPU reset as we can suspend the backend independently. The major change is around the backoff/handoff strategy for performing the reset. With no mutex deadlock, we no longer have to coordinate with any waiter, and just perform the reset immediately. Testcase: igt/gem_mmap_gtt/hang # regresses Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190125132230.22221-3-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Now that we know we measure the size of the engine->emit_breadcrumb() correctly, we can remove the previous manual counting. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190125120005.25191-1-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Instead of tediously and fragilely counting up the number of dwords required to emit the breadcrumb to seal a request, fake a request and measure it automatically once during engine setup. The downside is that this requires a fair amount of mocking to create a proper breadcrumb. Still, should be less error prone in future as the breadcrumb size fluctuates! Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190125100520.20163-1-chris@chris-wilson.co.uk
-
- 18 1月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
Since commit d4ccceb0 ("drm/i915/icl: Ringbuffer interrupt handling") we have required a mechanism to avoid touching the interrupt hardware for breadcrumbs, superseding our mock interface for selftests. The residual problem (ideas welcome) is in probing the mock ring registers for ring_is_idle. Hmm, maybe we should just install mock handlers for i915->uncore.mmio__write and friends? Only problem being is that we would to truly mock some expected reads. :( References: d4ccceb0 ("drm/i915/icl: Ringbuffer interrupt handling") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190118112225.13780-1-chris@chris-wilson.co.uk
-
- 17 1月, 2019 2 次提交
-
-
由 Jani Nikula 提交于
Mixed C99 and kernel types use is getting ugly. Prefer kernel types. sed -i 's/\buint\(8\|16\|32\|64\)_t\b/u\1/g' Minor checkpatch fixes sprinkled on top of the changed lines. Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: NJosé Roberto de Souza <jose.souza@intel.com> Signed-off-by: NJani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/14ed72e7f04c9340a057855c5950b54811f8a477.1547629303.git.jani.nikula@intel.com
-
由 Chris Wilson 提交于
Currently the code to reset the GPU and our state is spread widely across a few files. Pull the logic together into a common file. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Acked-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190116153304.787-1-chris@chris-wilson.co.uk
-
- 16 1月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
Move the debug pretty printer into a standalone routine prior to extending it in upcoming feature work. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190115212948.10423-1-chris@chris-wilson.co.uk
-
- 15 1月, 2019 2 次提交
-
-
由 Chris Wilson 提交于
Keep track of the temporary rpm wakerefs used for user access to the device, so that we can cancel them upon release and clearly identify any leaks. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Jani Nikula <jani.nikula@intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190114142129.24398-10-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
The majority of runtime-pm operations are bounded and scoped within a function; these are easy to verify that the wakeref are handled correctly. We can employ the compiler to help us, and reduce the number of wakerefs tracked when debugging, by passing around cookies provided by the various rpm_get functions to their rpm_put counterpart. This makes the pairing explicit, and given the required wakeref cookie the compiler can verify that we pass an initialised value to the rpm_put (quite handy for double checking error paths). For regular builds, the compiler should be able to eliminate the unused local variables and the program growth should be minimal. Fwiw, it came out as a net improvement as gcc was able to refactor rpm_get and rpm_get_if_in_use together, v2: Just s/rpm_put/rpm_put_unchecked/ everywhere, leaving the manual mark up for smaller more targeted patches. v3: Mention the cookie in Returns Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190114142129.24398-2-chris@chris-wilson.co.uk
-
- 03 1月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
When we first introduced the reset to sanitize the GPU on taking over from the BIOS and before returning control to third parties (the BIOS!), we restricted it to only systems utilizing HW contexts as we were uncertain of how stable our reset mechanism truly was. We now have reasonable coverage across all machines that expose a GPU reset method, and so we should be safe to sanitize the GPU state everywhere. v2: We _have_ to skip the reset if it would clobber the display. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190103112104.19561-1-chris@chris-wilson.co.uk
-
- 02 1月, 2019 1 次提交
-
-
由 Jani Nikula 提交于
First move the low hanging fruit, the fields that are only initialized runtime. Use RUNTIME_INFO() exclusively to access the fields. Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: NJani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/c24fe7a4b0492a888690c46814c0ff21ce2f12b1.1546267488.git.jani.nikula@intel.com
-
- 31 12月, 2018 1 次提交
-
-
由 Chris Wilson 提交于
Now that we have eliminated the CPU-side irq_seqno_barrier by moving the delays on the GPU before emitting the MI_USER_INTERRUPT, we can remove the engine->irq_seqno_barrier infrastructure. Though intentionally slowing down the GPU is nasty, so is the code we can now remove! Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181228171641.16531-6-chris@chris-wilson.co.uk
-
- 28 12月, 2018 1 次提交
-
-
由 Chris Wilson 提交于
The writing is on the wall for the existence of a single execution queue along each engine, and as a consequence we will not be able to track dependencies along the HW queue itself, i.e. we will not be able to use HW semaphores on gen7 as they use a global set of registers (and unlike gen8+ we can not effectively target memory to keep per-context seqno and dependencies). On the positive side, when we implement request reordering for gen7 we also can not presume a simple execution queue and would also require removing the current semaphore generation code. So this bring us another step closer to request reordering for ringbuffer submission! The negative side is that using interrupts to drive inter-engine synchronisation is much slower (4us -> 15us to do a nop on each of the 3 engines on ivb). This is much better than it was at the time of introducing the HW semaphores and equally important userspace weaned itself off intermixing dependent BLT/RENDER operations (the prime culprit was glyph rendering in UXA). So while we regress the microbenchmarks, it should not impact the user. References: https://bugs.freedesktop.org/show_bug.cgi?id=108888Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181228140736.32606-2-chris@chris-wilson.co.uk
-
- 18 12月, 2018 1 次提交
-
-
由 Chris Wilson 提交于
Having completed a test run of gem_eio across all machines in CI we also observe the phenomenon (of lost interrupts after resetting the GPU) on gen3 machines as well as the previously sighted gen6/gen7. Let's apply the same HWSTAM workaround that was effective for gen6+ for all, as although we haven't seen the same failure on gen4/5 it seems prudent to keep the code the same. As a consequence we can remove the extra setting of HWSTAM and apply the register from a single site. v2: Delazy and move the HWSTAM into its own function v3: Mask off all HWSP writes on driver unload and engine cleanup. v4: And what about the physical hwsp? v5: No, engine->init_hw() is not called from driver_init_hw(), don't be daft. Really scrub HWSTAM as early as we can in driver_init_mmio() v6: Rename set_hwsp as it was setting the mask not the hwsp register. v7: Ville pointed out that although vcs(bsd) was introduced for g4x/ilk, per-engine HWSTAM was not introduced until gen6! References: https://bugs.freedesktop.org/show_bug.cgi?id=108735Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181218102712.11058-1-chris@chris-wilson.co.uk
-
- 13 12月, 2018 3 次提交
-
-
由 Lucas De Marchi 提交于
Instead of using IS_GEN() for consecutive gen checks, let's pass the range to IS_GEN_RANGE(). By code inspection these were the ranges deemed necessary for spatch: @@ expression e; @@ ( - IS_GEN(e, 3) || IS_GEN(e, 2) + IS_GEN_RANGE(e, 2, 3) | - IS_GEN(e, 3) || IS_GEN(e, 4) + IS_GEN_RANGE(e, 3, 4) | - IS_GEN(e, 5) || IS_GEN(e, 6) + IS_GEN_RANGE(e, 5, 6) | - IS_GEN(e, 6) || IS_GEN(e, 7) + IS_GEN_RANGE(e, 6, 7) | - IS_GEN(e, 7) || IS_GEN(e, 8) + IS_GEN_RANGE(e, 7, 8) | - IS_GEN(e, 8) || IS_GEN(e, 9) + IS_GEN_RANGE(e, 8, 9) | - IS_GEN(e, 10) || IS_GEN(e, 9) + IS_GEN_RANGE(e, 9, 10) | - IS_GEN(e, 9) || IS_GEN(e, 10) + IS_GEN_RANGE(e, 9, 10) ) After conversion, checking we don't have any missing IS_GEN_RANGE() || IS_GEN() was also done. Signed-off-by: NLucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: NJani Nikula <jani.nikula@intel.com> Signed-off-by: NRodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181212181044.15886-3-lucas.demarchi@intel.com
-
由 Lucas De Marchi 提交于
Define IS_GEN() similarly to our IS_GEN_RANGE(). but use gen instead of gen_mask to do the comparison. Now callers can pass then gen as a parameter, so we don't require one macro for each gen. The following spatch was used to convert the users of these macros: @@ expression e; @@ ( - IS_GEN2(e) + IS_GEN(e, 2) | - IS_GEN3(e) + IS_GEN(e, 3) | - IS_GEN4(e) + IS_GEN(e, 4) | - IS_GEN5(e) + IS_GEN(e, 5) | - IS_GEN6(e) + IS_GEN(e, 6) | - IS_GEN7(e) + IS_GEN(e, 7) | - IS_GEN8(e) + IS_GEN(e, 8) | - IS_GEN9(e) + IS_GEN(e, 9) | - IS_GEN10(e) + IS_GEN(e, 10) | - IS_GEN11(e) + IS_GEN(e, 11) ) v2: use IS_GEN rather than GT_GEN and compare to info.gen rather than using the bitmask Signed-off-by: NLucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: NJani Nikula <jani.nikula@intel.com> Signed-off-by: NRodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181212181044.15886-2-lucas.demarchi@intel.com
-
由 Lucas De Marchi 提交于
RANGE makes it longer, but clearer. We are also going to add a macro to check an individual gen, so add the _RANGE prefix here. Diff generated with: sed 's/IS_GEN(/IS_GEN_RANGE(/g' drivers/gpu/drm/i915/{*/,}*.{c,h} -i v2: use IS_GEN rather than GT_GEN Signed-off-by: NLucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NJani Nikula <jani.nikula@intel.com> Signed-off-by: NRodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181212181044.15886-1-lucas.demarchi@intel.com
-
- 12 12月, 2018 1 次提交
-
-
由 Chris Wilson 提交于
Currently we allocate a scratch page for each engine, but since we only ever write into it for post-sync operations, it is not exposed to userspace nor do we care for coherency. As we then do not care about its contents, we can use one page for all, reducing our allocations and avoid complications by not assuming per-engine isolation. For later use, it simplifies engine initialisation (by removing the allocation that required struct_mutex!) and means that we can always rely on there being a scratch page. v2: Check that we allocated a large enough scratch for I830 w/a Fixes: 06e562e7f515 ("drm/i915/ringbuffer: Delay after EMIT_INVALIDATE for gen4/gen5") # v4.18.20 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108850Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181204141522.13640-1-chris@chris-wilson.co.uk Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: <stable@vger.kernel.org> # v4.18.20+ (cherry picked from commit 51797499) [Joonas: Use new function in gen9_init_indirectctx_bb too] Signed-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
- 07 12月, 2018 1 次提交
-
-
由 Christian König 提交于
For a lot of use cases we need 64bit sequence numbers. Currently drivers overload the dma_fence structure to store the additional bits. Stop doing that and make the sequence number in the dma_fence always 64bit. For compatibility with hardware which can do only 32bit sequences the comparisons in __dma_fence_is_later only takes the lower 32bits as significant when the upper 32bits are all zero. v2: change the logic in __dma_fence_is_later Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NChunming Zhou <david1.zhou@amd.com> Link: https://patchwork.freedesktop.org/patch/266927/
-
- 05 12月, 2018 1 次提交
-
-
由 Tvrtko Ursulin 提交于
We stopped re-applying the GT workarounds after engine reset since commit 59b449d5 ("drm/i915: Split out functions for different kinds of workarounds"). Issue with this is that some of the GT workarounds live in the MMIO space which gets lost during engine resets. So far the registers in 0x2xxx and 0xbxxx address range have been identified to be affected. This losing of applied workarounds has obvious negative effects and can even lead to hard system hangs (see the linked Bugzilla). Rather than just restoring this re-application, because we have also observed that it is not safe to just re-write all GT workarounds after engine resets (GPU might be live and weird hardware states can happen), we introduce a new class of per-engine workarounds and move only the affected GT workarounds over. Using the framework introduced in the previous patch, we therefore after engine reset, re-apply only the workarounds living in the affected MMIO address ranges. v2: * Move Wa_1406609255:icl to engine workarounds as well. * Rename API. (Chris Wilson) * Drop redundant IS_KABYLAKE. (Chris Wilson) * Re-order engine wa/ init so latest platforms are first. (Rodrigo Vivi) Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Bugzilla: https://bugzilla.freedesktop.org/show_bug.cgi?id=107945 Fixes: 59b449d5 ("drm/i915: Split out functions for different kinds of workarounds") Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: intel-gfx@lists.freedesktop.org Acked-by: NRodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20181203133341.10258-1-tvrtko.ursulin@linux.intel.com (cherry picked from commit 4a15c75c) Signed-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
- 04 12月, 2018 4 次提交
-
-
由 Chris Wilson 提交于
Currently we allocate a scratch page for each engine, but since we only ever write into it for post-sync operations, it is not exposed to userspace nor do we care for coherency. As we then do not care about its contents, we can use one page for all, reducing our allocations and avoid complications by not assuming per-engine isolation. For later use, it simplifies engine initialisation (by removing the allocation that required struct_mutex!) and means that we can always rely on there being a scratch page. v2: Check that we allocated a large enough scratch for I830 w/a Fixes: 06e562e7f515 ("drm/i915/ringbuffer: Delay after EMIT_INVALIDATE for gen4/gen5") # v4.18.20 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108850Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181204141522.13640-1-chris@chris-wilson.co.uk Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: <stable@vger.kernel.org> # v4.18.20+
-
由 Tvrtko Ursulin 提交于
Convert the per context workaround handling code to run against the newly introduced common workaround framework and fuse the two to use the existing smarter list add helper, the one which does the sorted insert and merges registers where possible. This completes migration of all four classes of workarounds onto the common framework. Existing macros are kept untouched for smaller code churn. v2: * Rename to list name ctx_wa_list and move from dev_priv to engine. v3: * API rename and parameters tweaking. (Chris Wilson) Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20181203133357.10341-1-tvrtko.ursulin@linux.intel.com
-
由 Tvrtko Ursulin 提交于
Instead of having a separate list of white-listed registers we can trivially move this to the common workarounds framework. This brings us one step closer to the goal of driving all workaround classes using the same code. v2: * Use GEM_DEBUG_WARN_ON for the sanity check. (Chris Wilson) v3: * API rename. (Chris Wilson) Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20181203125014.3219-6-tvrtko.ursulin@linux.intel.com
-
由 Tvrtko Ursulin 提交于
We stopped re-applying the GT workarounds after engine reset since commit 59b449d5 ("drm/i915: Split out functions for different kinds of workarounds"). Issue with this is that some of the GT workarounds live in the MMIO space which gets lost during engine resets. So far the registers in 0x2xxx and 0xbxxx address range have been identified to be affected. This losing of applied workarounds has obvious negative effects and can even lead to hard system hangs (see the linked Bugzilla). Rather than just restoring this re-application, because we have also observed that it is not safe to just re-write all GT workarounds after engine resets (GPU might be live and weird hardware states can happen), we introduce a new class of per-engine workarounds and move only the affected GT workarounds over. Using the framework introduced in the previous patch, we therefore after engine reset, re-apply only the workarounds living in the affected MMIO address ranges. v2: * Move Wa_1406609255:icl to engine workarounds as well. * Rename API. (Chris Wilson) * Drop redundant IS_KABYLAKE. (Chris Wilson) * Re-order engine wa/ init so latest platforms are first. (Rodrigo Vivi) Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Bugzilla: https://bugzilla.freedesktop.org/show_bug.cgi?id=107945 Fixes: 59b449d5 ("drm/i915: Split out functions for different kinds of workarounds") Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: intel-gfx@lists.freedesktop.org Acked-by: NRodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20181203133341.10258-1-tvrtko.ursulin@linux.intel.com
-
- 21 11月, 2018 1 次提交
-
-
由 Chris Wilson 提交于
When showing the list of waiters, include the task's status so that we can tell if they have been woken up and are waiting for the CPU, or if they are still waiting to be woken. v2: task_state_to_char() Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181121151653.24595-1-chris@chris-wilson.co.uk
-
- 16 11月, 2018 1 次提交
-
-
由 Chris Wilson 提交于
lockdep insists that if we give a lock a subclass, it must be used. Failure to do so triggers a self-consistency check when reading lockdep_stats: [ 49.902002] DEBUG_LOCKS_WARN_ON(debug_atomic_read(nr_unused_locks) != nr_unused) [ 49.902009] WARNING: CPU: 3 PID: 383 at kernel/locking/lockdep_proc.c:249 lockdep_stats_show+0x984/0xa10 [ 49.902026] Modules linked in: nls_ascii nls_cp437 vfat fat crct10dif_pclmul crc32_pclmul crc32c_intel aesni_intel aes_x86_64 crypto_simd cryptd glue_helper intel_cstate intel_uncore intel_rapl_perf intel_gtt efivars prime_numbers ahci libahci i2c_i801 video button efivarfs [last unloaded: drm_kms_helper] [ 49.902059] CPU: 3 PID: 383 Comm: cat Tainted: G U 4.20.0-rc2+ #304 [ 49.902068] Hardware name: Intel Corporation NUC7i5BNK/NUC7i5BNB, BIOS BNKBL357.86A.0052.2017.0918.1346 09/18/2017 [ 49.902079] RIP: 0010:lockdep_stats_show+0x984/0xa10 [ 49.902086] Code: 00 85 c0 0f 84 aa f8 ff ff 8b 05 77 37 e2 00 85 c0 0f 85 9c f8 ff ff 48 c7 c6 e0 57 bc 81 48 c7 c7 28 30 bb 81 e8 6b 77 fa ff <0f> 0b e9 82 f8 ff ff 48 c7 44 24 50 00 00 00 00 45 31 e4 31 db 31 [ 49.902103] RSP: 0018:ffffc90000247d58 EFLAGS: 00010292 [ 49.902110] RAX: 0000000000000044 RBX: 00000000000002f0 RCX: 0000000000000000 [ 49.902118] RDX: 0000000000000002 RSI: 0000000000000001 RDI: ffffffff810b3464 [ 49.902126] RBP: 0000000000000039 R08: 0000000000000002 R09: 0000000000000000 [ 49.902133] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000007ead [ 49.902141] R13: 0000000000000001 R14: ffff88884c021000 R15: 0000000000000097 [ 49.902150] FS: 00007fb347e66540(0000) GS:ffff88885e600000(0000) knlGS:0000000000000000 [ 49.902159] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 49.902165] CR2: 00007fb347aeb000 CR3: 00000008544bd005 CR4: 00000000001606e0 Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: NMatthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181115203851.25739-1-chris@chris-wilson.co.uk
-
- 30 10月, 2018 1 次提交
-
-
由 Rodrigo Vivi 提交于
Whenever possible we should stick with IS_GEN<n> checks. Bitmaks has been introduced on commit ae7617f0 ("drm/i915: Allow optimized platform checks") for efficiency. Let's stick with it whenever possible. This patch was generated with coccinelle: spatch -sp_file is_gen.cocci *{c,h} --in-place is_gen.cocci: @gen2@ expression e; @@ -INTEL_GEN(e) == 2 +IS_GEN2(e) @gen3@ expression e; @@ -INTEL_GEN(e) == 3 +IS_GEN3(e) @gen4@ expression e; @@ -INTEL_GEN(e) == 4 +IS_GEN4(e) @gen5@ expression e; @@ -INTEL_GEN(e) == 5 +IS_GEN5(e) @gen6@ expression e; @@ -INTEL_GEN(e) == 6 +IS_GEN6(e) @gen7@ expression e; @@ -INTEL_GEN(e) == 7 +IS_GEN7(e) @gen8@ expression e; @@ -INTEL_GEN(e) == 8 +IS_GEN8(e) @gen9@ expression e; @@ -INTEL_GEN(e) == 9 +IS_GEN9(e) @gen10@ expression e; @@ -INTEL_GEN(e) == 10 +IS_GEN10(e) @gen11@ expression e; @@ -INTEL_GEN(e) == 11 +IS_GEN11(e) Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: NRodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181026195143.20353-1-rodrigo.vivi@intel.com
-
- 18 10月, 2018 1 次提交
-
-
由 Tvrtko Ursulin 提交于
GEM_WARN_ON currently has dangerous semantics where it is completely compiled out on !GEM_DEBUG builds. This can leave users who expect it to be more like a WARN_ON, just without a warning in non-debug builds, in complete ignorance. Another gotcha with it is that it cannot be used as a statement. Which is again different from a standard kernel WARN_ON. This patch fixes both problems by making it behave as one would expect. It can now be used both as an expression and as statement, and also the condition evaluates properly in all builds - code under the conditional will therefore not unexpectedly disappear. To satisfy call sites which really want the code under the conditional to completely disappear, we add GEM_DEBUG_WARN_ON and convert some of the callers to it. This one can also be used as both expression and statement. >From the above it follows GEM_DEBUG_WARN_ON should be used in situations where we are certain the condition will be hit during development, but at a place in code where error can be handled to the benefit of not crashing the machine. GEM_WARN_ON on the other hand should be used where condition may happen in production and we just want to distinguish the level of debugging output emitted between the production and debug build. v2: * Dropped BUG_ON hunk. Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Tomasz Lis <tomasz.lis@intel.com> Reviewed-by: NTomasz Lis <tomasz.lis@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181012063142.16080-1-tvrtko.ursulin@linux.intel.com
-
- 17 10月, 2018 1 次提交
-
-
由 Jani Nikula 提交于
Clang build with UBSAN enabled leads to the following build error: drivers/gpu/drm/i915/intel_engine_cs.o: In function `intel_engine_init_execlist': drivers/gpu/drm/i915/intel_engine_cs.c:411: undefined reference to `__compiletime_assert_411' Again, for this to work the code would first need to be inlined and then constant folded, which doesn't work for Clang because semantic analysis happens before optimization/inlining. Use GEM_BUG_ON() instead of BUILD_BUG_ON(). v2: Use is_power_of_2() from log2.h (Chris) References: http://mid.mail-archive.com/20181015203410.155997-1-swboyd@chromium.orgReported-by: NStephen Boyd <swboyd@chromium.org> Cc: Stephen Boyd <swboyd@chromium.org> Cc: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: NNathan Chancellor <natechancellor@gmail.com> Tested-by: NStephen Boyd <swboyd@chromium.org> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NNick Desaulniers <ndesaulniers@google.com> Signed-off-by: NJani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181016122938.18757-2-jani.nikula@intel.com
-
- 12 10月, 2018 1 次提交
-
-
由 Michal Wajdeczko 提交于
We need extra load failure point to better test error path in i915_driver_init_mmio. Suggested-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NMichal Wajdeczko <michal.wajdeczko@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20181011130008.24640-2-michal.wajdeczko@intel.com
-
- 01 10月, 2018 1 次提交
-
-
由 Chris Wilson 提交于
As we are about to allow ourselves to slightly bump the user priority into a few different sublevels, packthose internal priority lists into the same i915_priolist to keep the rbtree compact and avoid having to allocate the default user priority even after the internal bumping. The downside to having an requests[] rather than a node per active list, is that we then have to walk over the empty higher priority lists. To compensate, we track the active buckets and use a small bitmap to skip over any inactive ones. v2: Use MASK of internal levels to simplify our usage. v3: Prevent overflow when SHIFT is zero. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181001123204.23982-4-chris@chris-wilson.co.uk
-
- 26 9月, 2018 1 次提交
-
-
由 Chris Wilson 提交于
In commit 9144d75e ("include/linux/bitops.h: introduce BITS_PER_TYPE"), we made BITS_PER_TYPE available to all and now we can use the macro to replace some open-coded computation of sizeof(T) * BITS_PER_BYTE. Suggested-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NJani Nikula <jani.nikula@intel.com> Reviewed-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180926104707.17410-1-chris@chris-wilson.co.uk
-
- 14 9月, 2018 1 次提交
-
-
由 Chris Wilson 提交于
In order to reduce latency when checking for idle we kick the tasklet directly. Sometimes this is not enough as it is queued on another cpu and so to improve the accuracy of this idle-check (and so to reduce latency overall by avoiding another pass, or worse declaring a timeout!) wait for the tasklet to complete. References: https://bugs.freedesktop.org/show_bug.cgi?id=107916Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Michel Thierry <michel.thierry@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180914080017.30308-2-chris@chris-wilson.co.uk
-
- 04 9月, 2018 2 次提交
-
-
由 Chris Wilson 提交于
Older gen use a physical address for the hardware status page, for which we use cache-coherent writes. As the writes are into the cpu cache, we use a normal WB mapped page to read the HWS, used for our seqno tracking. Anecdotally, I observed lost breadcrumbs writes into the HWS on i965gm, which so far have not reoccurred with this patch. How reliable that evidence is remains to be seen. v2: Explicitly pass the expected physical address to the hw v3: Also remember the wild writes we once had for HWS above 4G. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: NDaniel Vetter <daniel@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20180903152304.31589-2-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Pull the physical status page cleanup into a common cleanup_status_page() for caller simplicity. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMatthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180903152304.31589-1-chris@chris-wilson.co.uk
-
- 21 8月, 2018 1 次提交
-
-
由 Chris Wilson 提交于
Since we no longer maintain our read position in the CSB pointers register, it always returns 0 and not where we last read up to. As a result the CSB probing in the state dumper starts from 0, either missing entries or showing stale one. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180821101138.15822-1-chris@chris-wilson.co.uk
-
- 15 8月, 2018 1 次提交
-
-
由 Chris Wilson 提交于
If we pardon a per-engine reset, we may leave the STOP_RING bit asserted in RING_MI_MODE resulting in the engine hanging. Unconditionally clear it on the per-engine exit path as we know that either we skipped the reset and so need the cancellation, or the reset was successful and the cancellation is a no-op, or there was an error and we will follow up with a full-reset or wedging (both of which will stop the engines again as required). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107188 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106560Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180814171857.24673-1-chris@chris-wilson.co.uk
-