- 11 6月, 2018 1 次提交
-
-
由 Chris Wilson 提交于
The discovery with trying to enable full-ppgtt was that we were completely failing to the load both the mm and context following the reset. Although we were performing mmio to set the PP_DIR (per-process GTT) and CCID (context), these were taking no effect (the assumption was that this would trigger reload of the context and restore the page tables). It was not until we performed the LRI + MI_SET_CONTEXT in a following context switch would anything occur. Since we are then required to reset the context image and PP_DIR using CS commands, we place those commands into every batch. The hardware should recognise the no-ops and eliminate the expensive context loads, but we still have to pay the cost of using cross-powerwell register writes. In practice, this has no effect on actual context switch times, and only adds a few hundred nanoseconds to no-op switches. We can improve the latter by eliminating the w/a around known no-op switches, but there is an ulterior motive to keeping them. Always emitting the context switch at the beginning of the request (and relying on HW to skip unneeded switches) does have one key advantage. Should we implement request reordering on Haswell, we will not know in advance what the previous executing context was on the GPU and so we would not be able to elide the MI_SET_CONTEXT commands ourselves and always have to emit them. Having our hand forced now actually prepares us for later. Now since that context and mm follow the request, we no longer (and not for a long time since requests took over!) require a trace point to tell when we write the switch into the ring, since it is always. (This is even more important when you remember that simply writing into the ring bears no relation to the current mm.) v2: Sandybridge has to agree to use LRI as well. Testcase: igt/drv_selftests/live_hangcheck Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Matthew Auld <matthew.william.auld@gmail.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180611110845.31890-1-chris@chris-wilson.co.uk
-
- 06 6月, 2018 1 次提交
-
-
由 Chris Wilson 提交于
In the near future, I want to subclass gen6_hw_ppgtt as it contains a few specialised members and I wish to add more. To avoid the ugliness of using ppgtt->base.base, rename the i915_hw_ppgtt base member (i915_address_space) as vm, which is our common shorthand for an i915_address_space local. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Matthew Auld <matthew.william.auld@gmail.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180605153758.18422-1-chris@chris-wilson.co.uk
-
- 24 5月, 2018 2 次提交
-
-
由 Yunwei Zhang 提交于
WaProgramMgsrForCorrectSliceSpecificMmioReads applies for Icelake as well. References: HSD#1405586840, BSID#0575 v2: - GEN11 mask is different from its predecessors. (Oscar) - Better separate GEN10 and GEN11. (Oscar) Cc: Oscar Mateo <oscar.mateo@intel.com> Cc: Michel Thierry <michel.thierry@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Signed-off-by: NYunwei Zhang <yunwei.zhang@intel.com> Reviewed-by: NOscar Mateo <oscar.mateo@intel.com> Signed-off-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/1526683232-24753-1-git-send-email-yunwei.zhang@intel.com
-
由 Yunwei Zhang 提交于
WaProgramMgsrForCorrectSliceSpecificMmioReads dictate that before any MMIO read into Slice/Subslice specific registers, MCR packet control register(0xFDC) needs to be programmed to point to any enabled slice/subslice pair. Otherwise, incorrect value will be returned. However, that means each subsequent MMIO read will be forwarded to a specific slice/subslice combination as read is unicast. This is OK since slice/subslice specific register values are consistent in almost all cases across slice/subslice. There are rare occasions such as INSTDONE that this value will be dependent on slice/subslice combo, in such cases, we need to program 0xFDC and recover this after. This is already covered by read_subslice_reg. Also, 0xFDC will lose its information after TDR/engine reset/power state change. References: HSD#1405586840, BSID#0575 v2: - use fls() instead of find_last_bit() (Chris) - added INTEL_SSEU to extract sseu from device info. (Chris) v3: - rebase on latest tip v5: - Added references (Mika) - Change the ordered of passing arguments and etc. (Ursulin) v7: - Moved WA explanation Comments(Oscar) - Rebased. v8: - Renamed sanitize_mcr to calculate_s_ss_select. (Oscar) - calculate s/ss selector instead of whole mcr. (Oscar) v9: - Updated function name (Oscar) - Remove redundant variables (Oscar) v10: - Separate pre-GEN10 and GEN11 mask. (Oscar) Cc: Oscar Mateo <oscar.mateo@intel.com> Cc: Michel Thierry <michel.thierry@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Signed-off-by: NYunwei Zhang <yunwei.zhang@intel.com> Reviewed-by: NOscar Mateo <oscar.mateo@intel.com> Signed-off-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/1526683197-24656-1-git-send-email-yunwei.zhang@intel.com
-
- 19 5月, 2018 3 次提交
-
-
由 Chris Wilson 提交于
We want to be able to reset the GPU from inside a timer callback (hardirq context). One step requires us to copy the default context state over to the guilty context, which means we need to plan in advance to have that object accessible from within an atomic context. The atomic context prevents us from pinning the object or from peeking into the shmemfs backing store (all may sleep), so we choose to pin the default_state into memory when the engine becomes active. This compromise allows us to swap out the default state when idle, when required. References: 5692251c ("drm/i915/lrc: Scrub the GPU state of the guilty hanging request") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180518090212.5349-2-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
To be useful later, enable intel_engine_dump() to be called from irq context (i.e. using saving and restoring irq start rather than assuming we enter with irqs enabled). Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180518090212.5349-1-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
We rely on ksoftirqd to run in a timely fashion in order to drain the execlists queue. Quite frequently, it does not. In some cases we may see latencies of over 200ms triggering our idle timeouts and forcing us to declare the driver wedged! Thus we can speed up idle detection by bypassing ksoftirqd in these cases and flush our tasklet to confirm if we are indeed still waiting for the ELSP to drain. v2: Put the execlists.first check back; it is required for handling reset! References: https://bugs.freedesktop.org/show_bug.cgi?id=106373Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180506171328.30034-1-chris@chris-wilson.co.uk
-
- 18 5月, 2018 4 次提交
-
-
由 Chris Wilson 提交于
To ease the frequent and ugly pointer dance of &request->gem_context->engine[request->engine->id] during request submission, store that pointer as request->hw_context. One major advantage that we will exploit later is that this decouples the logical context state from the engine itself. v2: Set mock_context->ops so we don't crash and burn in selftests. Cleanups from Tvrtko. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Acked-by: NZhenyu Wang <zhenyuw@linux.intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180517212633.24934-3-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Move the knowledge about resetting the current context tracking on the engine from inside i915_gem_context.c into intel_engine_cs.c Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180517212633.24934-2-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
In the next patch, we want to store the intel_context pointer inside i915_request, as it is frequently access via a convoluted dance when submitting the request to hw. Having two context pointers inside i915_request leads to confusion so first rename the existing i915_gem_context pointer to i915_request.gem_context. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180517212633.24934-1-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Make sure that when we don't have any scheduler attributes for the request, the string is terminated. Fixes: 247870ac ("drm/i915: Build request info on stack before printk") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180517152824.11619-1-chris@chris-wilson.co.uk
-
- 17 5月, 2018 1 次提交
-
-
由 Chris Wilson 提交于
We cannot call kthread_park() from softirq context, so let's avoid it entirely during the reset. We wanted to suspend the signaler so that it would not mark a request as complete at the same time as we marked it as being in error. Instead of parking the signaling, stop the engine from advancing so that the GPU doesn't emit the breadcrumb for our chosen "guilty" request. v2: Refactor setting STOP_RING so that we don't have the same code thrice Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Michałt Winiarski <michal.winiarski@intel.com> CC: Michel Thierry <michel.thierry@intel.com> Cc: Jeff McGee <jeff.mcgee@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180516183355.10553-8-chris@chris-wilson.co.uk
-
- 14 5月, 2018 1 次提交
-
-
由 Chris Wilson 提交于
The original switch to use CSB from the HWSP was plagued by the effect of read ordering on VT-d; we would read the WRITE pointer from the HWSP before it had completed writing the CSB contents. The mystery comes down to the lack of rmb() for correct ordering with respect to the writes from HW, and with that resolved we can remove the VT-d special casing. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: Michel Thierry <michel.thierry@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180511121147.31915-3-chris@chris-wilson.co.ukTested-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
-
- 11 5月, 2018 1 次提交
-
-
由 Chris Wilson 提交于
In the previous patch (to include a rmb() after readig the CSB WRITE pointer from the HWSP) we believe we have fixed the underlying bug, and so can re-enable using the HWSP on Cannolake. This reverts commit 61bf9719 ("drm/i915/cnl: Use mmio access to context status buffer"). References: https://bugs.freedesktop.org/show_bug.cgi?id=105888 References: https://bugs.freedesktop.org/show_bug.cgi?id=106185 References: 61bf9719 ("drm/i915/cnl: Use mmio access to context status buffer") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: Rafael Antognolli <rafael.antognolli@intel.com> Cc: Michel Thierry <michel.thierry@intel.com> Cc: Timo Aaltonen <tjaalton@ubuntu.com> Tested-by: NTimo Aaltonen <tjaalton@ubuntu.com> Acked-by: NMichel Thierry <michel.thierry@intel.com> Acked-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180511121147.31915-2-chris@chris-wilson.co.uk
-
- 03 5月, 2018 3 次提交
-
-
由 Chris Wilson 提交于
As we unpark the engines and are about to begin a new cycle of activity, mark the current status of the hangceck as idle so that we avoid carrying over a stale timestamp/action into the next cycle. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180502220313.6459-2-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
We need to move to a more flexible timeline that doesn't assume one fence context per engine, and so allow for a single timeline to be used across a combination of engines. This means that preallocating a fence context per engine is now a hindrance, and so we want to introduce the singular timeline. From the code perspective, this has the notable advantage of clearing up a lot of mirky semantics and some clumsy pointer chasing. By splitting the timeline up into a single entity rather than an array of per-engine timelines, we can realise the goal of the previous patch of tracking the timeline alongside the ring. v2: Tweak wait_for_idle to stop the compiling thinking that ret may be uninitialised. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180502163839.3248-2-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
In the future, we want to move a request between engines. To achieve this, we first realise that we have two timelines in effect here. The first runs through the GTT is required for ordering vma access, which is tracked currently by engine. The second is implied by sequential execution of commands inside the ringbuffer. This timeline is one that maps to userspace's expectations when submitting requests (i.e. given the same context, batch A is executed before batch B). As the rings's timelines map to userspace and the GTT timeline an implementation detail, move the timeline from the GTT into the ring itself (per-context in logical-ring-contexts/execlists, or a global per-engine timeline for the shared ringbuffers in legacy submission. The two timelines are still assumed to be equivalent at the moment (no migrating requests between engines yet) and so we can simply move from one to the other without adding extra ordering. v2: Reinforce that one isn't allowed to mix the engine execution timeline with the client timeline from userspace (on the ring). Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180502163839.3248-1-chris@chris-wilson.co.uk
-
- 02 5月, 2018 1 次提交
-
-
由 Chris Wilson 提交于
Since the advent of execlists, the HW no longer executes from a single statically assigned ring, but instead switches to a different ring for each context (logical ringbuffer contexts as it is called). So a good way to tally the executing context against what we have queued is by comparing the RING_START register against our requests. Make it so. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180502104150.29874-1-chris@chris-wilson.co.uk
-
- 30 4月, 2018 2 次提交
-
-
由 Chris Wilson 提交于
Make life easier in upcoming patches by moving the context_pin and context_unpin vfuncs into inline helpers. v2: Fixup mock_engine to mark the context as pinned on use. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180430131503.5375-2-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
In commit 9b6586ae ("drm/i915: Keep a global seqno per-engine"), we moved from a global inflight counter to per-engine counters in the hope that will be easy to run concurrently in future. However, with the advent of the desire to move requests between engines, we do need a global counter to preserve the semantics that no engine wraps in the middle of a submit. (Although this semantic is now only required for gen7 semaphore support, which only supports greater-then comparisons!) v2: Keep a global counter of all requests ever submitted and force the reset when it wraps. References: 9b6586ae ("drm/i915: Keep a global seqno per-engine") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180430131503.5375-1-chris@chris-wilson.co.uk
-
- 26 4月, 2018 1 次提交
-
-
由 Tvrtko Ursulin 提交于
We can convert engine stats from a spinlock to seqlock to ensure interrupt processing is never even a tiny bit delayed by parallel readers. There is a smidgen bit more cost on the write lock side, and an extremely unlikely chance that readers will have to retry a few times in face of heavy interrupt load. But it should be extremely unlikely given how lightweight read side section is compared to the interrupt processing side, and also compared to the rest of the code paths which can lead into it. Furthermore, writer is the ones doing the real, latency sensitive work, while readers are only informative. Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Suggested-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20180426074716.7352-1-tvrtko.ursulin@linux.intel.com
-
- 24 4月, 2018 3 次提交
-
-
由 Chris Wilson 提交于
Knowing the offset of the per-engine scratch/HWS page during boot is not very informative, so remove the DRM_DEBUG. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180424115236.2022-1-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
If we have more than a few, possibly several thousand request in the queue, don't show the central portion, just the first few and the last being executed and/or queued. The first few should be enough to help identify a problem in execution, and most often comparing the first/last in the queue is enough to identify problems in the scheduling. We may need some fine tuning to set MAX_REQUESTS_TO_SHOW for common debug scenarios, but for the moment if we can avoiding spending more than a few seconds dumping the GPU state that will avoid a nasty livelock (where hangcheck spends so long dumping the state, it fires again and starts to dump the state again in parallel, ad infinitum). v2: Remember to print last not the stale rq iter after the loop. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180424081600.27544-1-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
printk unhelpfully inserts a '\n' between consecutive calls, and since our drm_printf wrapper may be emitting info a seq_file instead, KERN_CONT is not an option. To work with any drm_printf destination, we need to build up the output into a temporary buf on the stack and then feed the complete line in a single call to printk. Fixes: b7268c5e ("drm/i915: Pack params to engine->schedule() into a struct") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180424010839.22860-1-chris@chris-wilson.co.uk
-
- 19 4月, 2018 2 次提交
-
-
由 Chris Wilson 提交于
Today we only want to pass along the priority to engine->schedule(), but in the future we want to have much more control over the various aspects of the GPU during a context's execution, for example controlling the frequency allowed. As we need an ever growing number of parameters for scheduling, move those into a struct for convenience. v2: Move the anonymous struct into its own function for legibility and ye olde gcc. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180418184052.7129-3-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Having moved the priotree struct into i915_scheduler.h, identify it as the scheduling element and rebrand into i915_sched. This becomes more useful as we start attaching more information we require to propagate through the scheduler. v2: Use i915_sched_node for future distinctiveness Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180418184052.7129-2-chris@chris-wilson.co.uk
-
- 13 4月, 2018 1 次提交
-
-
由 Mika Kuoppala 提交于
Evidence indicates that Cannonlake HWSP is not coherent as it should. Revert to using mmio access for now. Testcase: igt/gem_ctx_switch References: https://bugs.freedesktop.org/show_bug.cgi?id=105888 Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Rafael Antognolli <rafael.antognolli@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Acked-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20180412145802.23313-1-mika.kuoppala@linux.intel.com
-
- 12 4月, 2018 2 次提交
-
-
由 Oscar Mateo 提交于
This has grown to be a sizable amount of code, so move it to its own file before we try to refactor anything. For the moment, we are leaving behind the WA BB code and the WAs that get applied (incorrectly) in init_clock_gating, but we will deal with it later. v2: Use intel_ prefix for code that deals with the hardware (Chris) v3: Rebased v4: - Rebased - New license header v5: - Rebased - Added some organisational notes to the file (Chris) v6: Include DOC section in the documentation build (Jani) Signed-off-by: NOscar Mateo <oscar.mateo@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> [ickle: appease checkpatch, mostly] Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/1523376767-18480-1-git-send-email-oscar.mateo@intel.com
-
由 Chris Wilson 提交于
We can refine our current execlists->queue_priority if we inspect ELSP[1] rather than the head of the unsubmitted queue. Currently, we use the unsubmitted queue and say that if a subsequent request is more important than the current queue, we will rerun the submission tasklet to evaluate the need for preemption. However, we only want to preempt if we need to jump ahead of a currently executing request in ELSP. The second reason for running the submission tasklet is amalgamate requests into the active context on ELSP[0] to avoid a stall when ELSP[0] drains. (Though repeatedly amalgamating requests into the active context and triggering many lite-restore is off question gain, the goal really is to put a context into ELSP[1] to cover the interrupt.) So if instead of looking at the head of the queue, we look at the context in ELSP[1] we can answer both of the questions more accurately -- we don't need to rerun the submission tasklet unless our new request is important enough to feed into, at least, ELSP[1]. v2: Add some comments from the discussion with Tvrtko. v3: More commentary to cross-reference queue_request() References: f6322edd ("drm/i915/preemption: Allow preemption between submission ports") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: Michel Thierry <michel.thierry@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180411103929.27374-1-chris@chris-wilson.co.uk
-
- 27 3月, 2018 1 次提交
-
-
由 Chris Wilson 提交于
For the off-chance we have an interrupt posted and haven't processed the CSB. v2: Include tasklet enable/disable state for good measure. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20180326115044.2505-4-chris@chris-wilson.co.ukReviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
-
- 20 3月, 2018 1 次提交
-
-
由 Kelvin Gardiner 提交于
ICL 11 has a greater number of maximum subslices. This patch reflects this. v2: GEN11 updates to MCR_SELECTOR (Oscar) v3: Copypasta error in the new defines (Lionel) Bspec: 21139 BSpec: 21108 Signed-off-by: NKelvin Gardiner <kelvin.gardiner@intel.com> Reviewed-by: Oscar Mateo <oscar.mateo@intel.com> (v1) Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> (v1) Signed-off-by: NOscar Mateo <oscar.mateo@intel.com> Reviewed-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180316121456.11577-3-mika.kuoppala@linux.intel.comSigned-off-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
-
- 15 3月, 2018 3 次提交
-
-
由 Daniele Ceraolo Spurio 提交于
The only usage outside the intel_lrc.c file is in the ringbuffer init, but the irq mask calculated there is then overwritten for all engines that have a non-zero shift, so we can drop it. This change is not aimed at code saving but at removing from intel_engines information that does not apply to all gens that have the engine. When checking without the temporary WARN_ON, code size is basically unchanged. v2: make the irq_shifts array static const v3: rebase, move irq_shifts array to logical_ring_default_irqs v4: move array inside the if and use u8 for it (Chris) Suggested-by: NMichel Thierry <michel.thierry@intel.com> Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20180314182653.26981-4-daniele.ceraolospurio@intel.com
-
由 Daniele Ceraolo Spurio 提交于
Check that the entries are in reverse gen order and that all entries with gen > 0 have an mmio base set. v2: loop forward, simplify logic, use i915_subtests (Chris) Suggested-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20180314182653.26981-2-daniele.ceraolospurio@intel.com
-
由 Daniele Ceraolo Spurio 提交于
The mmio bases we're currently storing in the intel_engines array are only valid for a subset of gens, so we need to ignore them and use different values in some cases. Instead of doing that, we can have a table of [starting gen, mmio base] pairs for each engine in intel_engines and select the correct one based on the gen we're running on in a consistent way. v2: document that the list goes in reverse order, update starting gen for render (Chris) v3: starting gen for render back to 1 to make our life easier with selftests (Chris) Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> #v2 Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20180314182653.26981-1-daniele.ceraolospurio@intel.com
-
- 14 3月, 2018 1 次提交
-
-
由 Chris Wilson 提交于
Not only is the context suspect to disappearing, but so is it's timeline. Under a lockless inspection of the requests for debugging from intel_engine_dump(), the context may already have been freed and we have to check before chasing the dangling pointer. [28033.681755] Modules linked in: vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic x86_pkg_temp_thermal intel_powerclamp coretemp snd_hda_intel crct10dif_pclmul crc32_pclmul snd_hda_codec snd_hwdep snd_hda_core ghash_clmulni_intel snd_pcm mei_me mei i915 r8169 mii prime_numbers i2c_hid [28033.681796] CPU: 3 PID: 3058 Comm: gem_exec_schedu Tainted: G U 4.16.0-rc5+ #9 [28033.681804] Hardware name: Acer Aspire E5-575G/Ironman_SK , BIOS V1.12 08/02/2016 [28033.681834] RIP: 0010:print_request+0x2b/0xb0 [i915] [28033.681840] RSP: 0018:ffffc90004afbc18 EFLAGS: 00010202 [28033.681847] RAX: 6b6b6b6b6b6b6b6b RBX: ffff8801921b5a40 RCX: 0000000000000006 [28033.681854] RDX: ffffc90004afbc60 RSI: ffff8801921b5a40 RDI: 0000000000000004 [28033.681861] RBP: ffffc90004afbd80 R08: 0000000000000000 R09: 0000000000000001 [28033.681868] R10: ffffc90004afbbd0 R11: ffffc90004afbc73 R12: ffffc90004afbc60 [28033.681875] R13: ffffc90004afbd80 R14: ffff8801d40ec670 R15: ffff8801921b5a40 [28033.681883] FS: 00007fbba5f6c8c0(0000) GS:ffff8801e8400000(0000) knlGS:0000000000000000 [28033.681891] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [28033.681897] CR2: 00007fbba5f8f000 CR3: 00000001b2efa002 CR4: 00000000003606e0 [28033.681904] Call Trace: [28033.681932] intel_engine_print_registers+0x6a7/0x930 [i915] [28033.681962] intel_engine_dump+0x30d/0x740 [i915] [28033.681971] ? seq_printf+0x3a/0x50 [28033.681995] i915_engine_info+0xb8/0xe0 [i915] [28033.682003] ? drm_get_color_range_name+0x20/0x20 [28033.682010] seq_read+0xe1/0x440 [28033.682018] full_proxy_read+0x51/0x80 [28033.682025] __vfs_read+0x21/0x130 [28033.682031] ? do_sys_open+0x134/0x220 [28033.682037] ? kmem_cache_free+0x177/0x2b0 [28033.682043] vfs_read+0xa1/0x150 [28033.682049] SyS_read+0x40/0xa0 [28033.682055] do_syscall_64+0x6b/0x1b0 [28033.682063] entry_SYSCALL_64_after_hwframe+0x42/0xb7 [28033.682069] RIP: 0033:0x7fbba4655d11 [28033.682074] RSP: 002b:00007ffd8c49da58 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 [28033.682082] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fbba4655d11 [28033.682089] RDX: 000000000000003f RSI: 00005647bfbfc260 RDI: 0000000000000006 [28033.682096] RBP: 000000000000003f R08: 00000000ffffffff R09: 0000000000000000 [28033.682104] R10: 0000000000000000 R11: 0000000000000246 R12: 00005647bfbfc260 [28033.682111] R13: 0000000000000006 R14: 0000000000000000 R15: 00005647bfbfc260 [28033.682119] Code: 41 55 41 54 49 89 d4 55 53 48 89 fd 48 8b 86 c8 00 00 00 48 8b 3d d6 1e 14 e2 48 89 f3 48 2b be a8 02 00 00 48 8b 80 b0 00 00 00 <4c> 8b 68 18 e8 bc 80 02 e1 8b 8b 70 02 00 00 8b b3 28 02 00 00 [28033.682206] RIP: print_request+0x2b/0xb0 [i915] RSP: ffffc90004afbc18 Reported-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180314101630.8933-1-chris@chris-wilson.co.uk
-
- 10 3月, 2018 1 次提交
-
-
由 Michal Wajdeczko 提交于
Function i915_gem_batch_pool_init() failed to follow obj-verb naming schema. Fix that by swapping function parameters. While here, change license text to SPDX format. v2: use intel_engine_init_batch_pool (Chris) as proxy (Michal) Signed-off-by: NMichal Wajdeczko <michal.wajdeczko@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20180308095037.18264-3-michal.wajdeczko@intel.com
-
- 09 3月, 2018 1 次提交
-
-
由 Chris Wilson 提交于
Include ring->emit and ring->space alongside ring->(head,tail) when printing debug information. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180307134226.25492-4-chris@chris-wilson.co.uk
-
- 07 3月, 2018 2 次提交
-
-
由 Daniele Ceraolo Spurio 提交于
Starting from Gen11 the context descriptor format has been updated in the HW. The hw_id field has been considerably reduced in size and engine class and instance fields have been added. There is a slight name clashing issue because the field that we call hw_id is actually called SW Context ID in the specs for Gen11+. With the current size of the hw_id field we can have a maximum of 2k contexts at any time, but we could use the sw_counter field (which is sw defined) to increase that because the HW requirement is that engine_id + sw id + sw_counter is a unique number. GuC uses a similar method to support more contexts but does its tracking at lrc level. To avoid doing an implementation that will need to be reworked once GuC support lands, defer it for now and mark it as TODO. v2: rebased, add documentation, fix GEN11_ENGINE_INSTANCE_SHIFT v3: rebased, bring back lost code from i915_gem_context.c v4: make TODO comment more generic v5: be consistent with bit ordering, add extra checks (Chris) Cc: Oscar Mateo <oscar.mateo@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: NOscar Mateo <oscar.mateo@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180302161501.28594-3-mika.kuoppala@linux.intel.comSigned-off-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
-
由 Oscar Mateo 提交于
Gen11 has up to 4 VCS and up to 2 VECS engines, this patch adds mmio base definitions for all of them. Bspec: 20944 Bspec: 7021 v2: Set the correct mmio_base in intel_engines_init_mmio; updating the base mmio values any later would cause incorrect reads in i915_gem_sanitize (Michel). Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Ceraolo Spurio, Daniele <daniele.ceraolospurio@intel.com> Signed-off-by: NOscar Mateo <oscar.mateo@intel.com> Signed-off-by: NMichel Thierry <michel.thierry@intel.com> Reviewed-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180302161501.28594-2-mika.kuoppala@linux.intel.comSigned-off-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
-
- 28 2月, 2018 1 次提交
-
-
由 Chris Wilson 提交于
Although we protect the request itself, we don't lock inside intel_engine_dump() and so the request maybe retired as we peek into it. One consequence is that the request->ctx may be freed before we dereference it, leading to a use-after-free. Replace the hw_id we are peeking from inside request->ctx with the request->fence.context, with which we can still track from which context the request originated (although to tie to HW reports requires a little more legwork, but is good enough to follow the GEM traces). [52640.729670] general protection fault: 0000 [#2] SMP [52640.729694] Dumping ftrace buffer: [52640.729701] (ftrace buffer empty) [52640.729705] Modules linked in: vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic x86_pkg_\ temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul snd_hda_intel snd_hda_codec snd_hwdep gha\ sh_clmulni_intel snd_hda_core snd_pcm mei_me mei i915 r8169 mii prime_numbers i2c_hid [52640.729748] CPU: 2 PID: 4335 Comm: gem_exec_schedu Tainted: G UD W 4.16.0-rc3+ #7 [52640.729759] Hardware name: Acer Aspire E5-575G/Ironman_SK , BIOS V1.12 08/02/2016 [52640.729803] RIP: 0010:print_request+0x2b/0xb0 [i915] [52640.729811] RSP: 0018:ffffc90001453c18 EFLAGS: 00010206 [52640.729820] RAX: 6b6b6b6b6b6b6b6b RBX: ffff8801e0292d40 RCX: 0000000000000006 [52640.729829] RDX: ffffc90001453c60 RSI: ffff8801e0292d40 RDI: 0000000000000003 [52640.729838] RBP: ffffc90001453d80 R08: 0000000000000000 R09: 0000000000000001 [52640.729847] R10: ffffc90001453bd0 R11: ffffc90001453c73 R12: ffffc90001453c60 [52640.729856] R13: ffffc90001453d80 R14: ffff8801d5a683c8 R15: ffff8801e0292d40 [52640.729866] FS: 00007f1ee50548c0(0000) GS:ffff8801e8200000(0000) knlGS:0000000000000000 [52640.729876] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [52640.729884] CR2: 00007f1ee5077000 CR3: 00000001d9411004 CR4: 00000000003606e0 [52640.729893] Call Trace: [52640.729922] intel_engine_print_registers+0x623/0x890 [i915] [52640.729948] intel_engine_dump+0x4a3/0x590 [i915] [52640.729957] ? seq_printf+0x3a/0x50 [52640.729977] i915_engine_info+0xb8/0xe0 [i915] [52640.729984] ? drm_mode_gamma_get_ioctl+0xf0/0xf0 [52640.729990] seq_read+0xd5/0x410 [52640.729997] full_proxy_read+0x4b/0x70 [52640.730004] __vfs_read+0x1e/0x120 [52640.730009] ? do_sys_open+0x134/0x220 [52640.730015] ? kmem_cache_free+0x174/0x2b0 [52640.730021] vfs_read+0xa1/0x150 [52640.730026] SyS_read+0x40/0xa0 [52640.730032] do_syscall_64+0x65/0x1a0 [52640.730038] entry_SYSCALL_64_after_hwframe+0x42/0xb7 Reported-by: NMika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180228094732.28462-1-chris@chris-wilson.co.uk
-