- 03 1月, 2018 2 次提交
-
-
由 Chris Wilson 提交于
Currently, we record the elsp register offset inside init-hw but we only need to do it once during engine setup (after we know the mmio iomapping). Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMichał Winiarski <michal.winiarski@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180102151235.3949-17-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Move the clearing of the CS-interrupt into the engine reset phase, before the current init-hw phase. This helps clarify that we clear the pending interrupts prior to any restarting of the execlists. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMichał Winiarski <michal.winiarski@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180102151235.3949-16-chris@chris-wilson.co.uk
-
- 23 12月, 2017 1 次提交
-
-
由 Chris Wilson 提交于
We already emit a GEM_TRACE for when we start preemption, but we lack one to show when the preemption is completed and we return to the regular queue. This is to continue the investigation into the mysterious <0>[ 197.854177] <idle>-0 1..s1 197837017us : execlists_submission_tasklet: rcs0 cs-irq head=0 [0], tail=0 [0] <0>[ 197.854209] drv_self-6008 2.... 197837390us : reset_common_ring: rcs0 seqno=15515 <0>[ 197.854240] drv_self-6008 2.... 197837415us : reset_common_ring: bcs0 seqno=0 <0>[ 197.854270] drv_self-6008 2.... 197837443us : reset_common_ring: vcs0 seqno=0 <0>[ 197.854300] drv_self-6008 2.... 197837463us : reset_common_ring: vcs1 seqno=0 <0>[ 197.854330] drv_self-6008 2.... 197837482us : reset_common_ring: vecs0 seqno=0 <0>[ 197.854360] ksoftirq-23 2..s. 197838341us : execlists_submission_tasklet: bcs0 in[0]: ctx=0.1, seqno=1dce7 <0>[ 197.854392] <idle>-0 1..s1 197838347us : execlists_submission_tasklet: bcs0 cs-irq head=0 [0], tail=0 [0] <0>[ 197.854423] ksoftirq-23 2..s. 197838354us : execlists_submission_tasklet: vcs0 in[0]: ctx=0.1, seqno=1d027 <0>[ 197.854456] ksoftirq-23 2.Ns. 197838361us : execlists_submission_tasklet: vcs1 in[0]: ctx=0.1, seqno=1e738 <0>[ 197.854488] ksoftirq-23 2.Ns. 197838366us : execlists_submission_tasklet: vecs0 in[0]: ctx=0.1, seqno=235aa <0>[ 197.854520] ksoftirq-23 2.Ns. 197838376us : execlists_submission_tasklet: rcs0 in[0]: ctx=0.1, seqno=15518 <0>[ 197.854552] <idle>-0 1..s1 197853285us : execlists_submission_tasklet: rcs0 cs-irq head=0 [0], tail=7 [7] <0>[ 197.854584] <idle>-0 1..s1 197853285us : execlists_submission_tasklet: rcs0 csb[1]: status=0x00000018:0x00000000 <0>[ 197.854616] <idle>-0 1..s1 197853286us : execlists_submission_tasklet: rcs0 out[0]: ctx=0.0, seqno=0 Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171222132742.4272-1-chris@chris-wilson.co.ukReviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
-
- 20 12月, 2017 2 次提交
-
-
由 Chris Wilson 提交于
Looking at the coordination of resets with the submission of execlists, it will be useful to have a GEM_TRACE for when we issue the reset. Whilst there tidy up the other GEM_TRACE to always include the engine name, and be careful not to trust any pointers prior to asserts. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171220090626.31643-1-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
A lesson that has to be relearnt over and over again is that the request does not keep a reference to the context and so we cannot freely dereference the context from inside the execlists_submission_tasklet. In particular, we try to do so in the new GEM_TRACE() so convert those over to the port->context_id we keep for GEM debugging. This means the tracing now depends on DRM_I915_GEM_DEBUG. Fixes: bccd3b83 ("drm/i915: Use trace_printk to provide a death rattle for GEM") References: https://bugs.freedesktop.org/show_bug.cgi?id=104066 References: https://bugs.freedesktop.org/show_bug.cgi?id=104162 References: https://bugs.freedesktop.org/show_bug.cgi?id=104242 References: https://bugs.freedesktop.org/show_bug.cgi?id=104310Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NMichel Thierry <michel.thierry@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171219220916.30882-1-chris@chris-wilson.co.uk
-
- 08 12月, 2017 1 次提交
-
-
由 Chris Wilson 提交于
Currently on every submission, we recalculate the ELSP register offset for the engine, after chasing the pointers to find the iomem base. Since this is fixed for the lifetime of the driver, record the offset in the execlists struct. In practice the difference is negligible, it just happens to remove 27 bytes of eyesore pointer dancing from next to the hottest instruction (which is itself due to stalling for a cache miss) in perf profiles of the execlists_submission_tasklet(). v2: Trim off one more elsp local. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: NMichel Thierry <michel.thierry@intel.com> Reviewed-by: NRodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171207222434.17686-1-chris@chris-wilson.co.uk
-
- 29 11月, 2017 1 次提交
-
-
由 Tvrtko Ursulin 提交于
Sagar noticed the check can be consolidated between the engine stats implementation and the PMU. My first choice was a static inline helper but that got into include ordering mess quickly fast so I went with a macro instead. At some point we should perhaps looking into taking out the non-ringubffer bits from intel_ringbuffer.h into a new intel_engine.h or something. v2: Use engine->flags. (Chris Wilson) v3: Rebase and mark GuC as not yet supported. (Chris Wilson) v4: Move flag setting to intel_engines_reset_default_submission. (Chris Wilson) v5: Move flag setting to logical_ring_setup. v6: intel_engines_reset_default_submission is the wrong place to set the flag - it needs to be in execlists_set_default_submission. (Sagar) v7: Flag setting in logical_ring_setup is not required. (Chris) Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Suggested-by: NSagar Arun Kamble <sagar.a.kamble@intel.com> Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Sagar Arun Kamble <sagar.a.kamble@intel.com> (v6) Link: https://patchwork.freedesktop.org/patch/msgid/20171129102805.22690-1-tvrtko.ursulin@linux.intel.com
-
- 22 11月, 2017 2 次提交
-
-
由 Tvrtko Ursulin 提交于
Track total time requests have been executing on the hardware. We add new kernel API to allow software tracking of time GPU engines are spending executing requests. Both per-engine and global API is added with the latter also being exported for use by external users. v2: * Squashed with the internal API. * Dropped static key. * Made per-engine. * Store time in monotonic ktime. v3: Moved stats clearing to disable. v4: * Comments. * Don't export the API just yet. v5: Whitespace cleanup. v6: * Rename ref to active. * Drop engine aggregate stats for now. * Account initial busy period after enabling stats. v7: * Rebase. v8: * Move context in notification after the notifier. (Chris Wilson) v9: In cases where stats tracking is getting disabled while there is an active context on an engine, add up the current value to the total. This also implies we don't clear the total when tracking is disabled any longer. There is no real need to do so because we define the stats as relative while enabled, meaning comparison between two samples while tracking is enabled is the valid usage. However, when busy stats will later be plugged into the perf PMU API, it is beneficial to not reset the total, since the PMU core likes to do some counter disable/enable cycles on startup, and while doing so during a single long context executing on an engine we would lose some accuracy and so make unit testing more difficult than needs to be. v10: * Fix accounting for preemption. v11: * Rebase for i915_modparams.enable_execlists removal. Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171121181852.16128-5-tvrtko.ursulin@linux.intel.com
-
由 Tvrtko Ursulin 提交于
No functional change just something which will be handy in the following patch. Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171121181852.16128-4-tvrtko.ursulin@linux.intel.com
-
- 21 11月, 2017 2 次提交
-
-
由 Chris Wilson 提交于
Execlists and legacy ringbuffer submission are no longer feature comparable (execlists now offer greater functionality that should overcome their performance hit) and obsoletes the unsafe module parameter, i.e. comparing the two modes of execution is no longer useful, so remove the debug tool. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> #i915_perf.c Link: https://patchwork.freedesktop.org/patch/msgid/20171120205504.21892-1-chris@chris-wilson.co.uk
-
由 Michel Thierry 提交于
The hardware needs some time to process the information received in the ExecList Submission Port, and expects us to not write anything more until it has 'acknowledged' this new submission by sending an IDLE_ACTIVE or PREEMPTED CSB event. If we do not follow this, the driver could write new data into the ELSP before HW had finishing fetching the previous one, putting us in 'undefined behaviour' space. This seems to be the problem causing the spurious PREEMPTED & COMPLETE events after a COMPLETE like the one below: [] vcs0: sw rd pointer = 2, hw wr pointer = 0, current 'head' = 3. [] vcs0: Execlist CSB[0]: 0x00000018 _ 0x00000007 [] vcs0: Execlist CSB[1]: 0x00000001 _ 0x00000000 [] vcs0: Execlist CSB[2]: 0x00000018 _ 0x00000007 <<< COMPLETE [] vcs0: Execlist CSB[3]: 0x00000012 _ 0x00000007 <<< PREEMPTED & COMPLETE [] vcs0: Execlist CSB[4]: 0x00008002 _ 0x00000006 [] vcs0: Execlist CSB[5]: 0x00000014 _ 0x00000006 The ELSP writes that lead to this CSB sequence show that the HW hadn't started executing the previous execlist (the one with only ctx 0x6) by the time the new one was submitted; this is a bit more clear in the data show in the EXECLIST_STATUS register at the time of the ELSP write. [] vcs0: ELSP[0] = 0x0_0 [execlist1] - status_reg = 0x0_302 [] vcs0: ELSP[1] = 0x6_fedb2119 [execlist0] - status_reg = 0x0_8302 [] vcs0: ELSP[2] = 0x7_fedaf119 [execlist1] - status_reg = 0x0_8308 [] vcs0: ELSP[3] = 0x6_fedb2119 [execlist0] - status_reg = 0x7_8308 Note that having to wait for this ack does not disable lite-restores, although it may reduce their numbers. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102035Signed-off-by: NMichel Thierry <michel.thierry@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/<20171118003038.7935-1-michel.thierry@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171120123458.23242-4-chris@chris-wilson.co.ukReviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Tested-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
-
- 20 11月, 2017 3 次提交
-
-
由 Chris Wilson 提交于
If IDLE_ACTIVE is set, then all other bits are invalid. For us, we can assert that if we see a COMPLETE | PREEMPTED event, then it should be impossible for it to also contain an IDLE_ACTIVE flag. Suggested-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Michal Winiarski <michal.winiarski@intel.com> Cc: Michel Thierry <michel.thierry@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171120123458.23242-3-chris@chris-wilson.co.ukReviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
由 Chris Wilson 提交于
Since we get a COMPLETE event when the context switch occurs on RING_HEAD == RING_TAIL and a PREEMPTED event when a switch occurs before that point, COMPLETE | PREEMPTED should cover all possible context switch completion events. We can move the ELEMENT_SWITCH info message from the COMPLETED_MASK into an assertion for when we are performing a switch to port[1]. Suggested-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Michal Winiarski <michal.winiarski@intel.com> Cc: Michel Thierry <michel.thierry@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171120123458.23242-2-chris@chris-wilson.co.ukReviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
由 Chris Wilson 提交于
Since commit e1fee72c Author: Oscar Mateo <oscar.mateo@intel.com> Date: Thu Jul 24 17:04:40 2014 +0100 drm/i915/bdw: Avoid non-lite-restore preemptions execlists has listened to (ACTIVE_IDLE | ELEMENT_SWITCH) for detecting when one context completed and it either continued onto the next (in port 1) or idled. We would always see COMPLETE | ACTIVE_IDLE on the final context-switch event, but on recent gen it appears that we now get separate ACTIVE_IDLE and COMPLETE events. In particular, the ACTIVE_IDLE events may not be coupled to a context (since it is a general state rather than a specific context completion event). v2: Update the history, execlists did originally start out by listening to the COMPLETE event not ACTIVE_IDLE. v3: Update preempt completion test to also use COMPLETE not ACTIVE_IDLE. References: bspec/12255 References: https://bugs.freedesktop.org/show_bug.cgi?id=103800Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Oscar Mateo <oscar.mateo@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Michal Winiarski <michal.winiarski@intel.com> Cc: Michel Thierry <michel.thierry@intel.com> Acked-by: NMichel Thierry <michel.thierry@intel.com> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171120123458.23242-1-chris@chris-wilson.co.uk
-
- 16 11月, 2017 2 次提交
-
-
由 Sagar Arun Kamble 提交于
intel_lrc_irq_handler and i915_guc_irq_handler are HW submission related tasklet functions. Name them with "submission_tasklet" suffix and remove intel/i915 prefix as they are static. Also rename irq_tasklet as just tasklet for clarity. v2: s/_bh/_tasklet (Chris) Suggested-by: NMichal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: NSagar Arun Kamble <sagar.a.kamble@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Michal Winiarski <michal.winiarski@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: NMichal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/1510839162-25197-2-git-send-email-sagar.a.kamble@intel.comSigned-off-by: NChris Wilson <chris@chris-wilson.co.uk>
-
由 Chris Wilson 提交于
At the start of building a request, we would wait for roughly enough space to fit the average request (to reduce the likelihood of having to wait and abort partway through request construction). To achieve we would try to begin a 0-length command packet, this just adds extra confusion so make the wait-for-space explicit, as in the next patch we want to move it from the backend to the i915_gem_request_alloc() so it can ensure that the wait-for-space is the first operation in building a new request. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171115151204.8105-2-chris@chris-wilson.co.uk
-
- 11 11月, 2017 3 次提交
-
-
由 Chris Wilson 提交于
As we now record the default HW state and so only emit the "golden" renderstate once to prepare the HW, there is no advantage in keeping the renderstate batch around as it will never be used again. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171110142634.10551-8-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Take a copy of the HW state after a reset upon module loading by executing a context switch from a blank context to the kernel context, thus saving the default hw state over the blank context image. We can then use the default hw state to initialise any future context, ensuring that each starts with the default view of hw state. v2: Unmap our default state from the GTT after stealing it from the context. This should stop us from accidentally overwriting it via the GTT (and frees up some precious GTT space). Testcase: igt/gem_ctx_isolation Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171110142634.10551-7-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
In the next few patches, we will want to both copy out of the context image and write a valid image into a new context. To be completely safe, we should then couple in our domain tracking to ensure that we don't have any issues with stale data remaining in unwanted cachelines. Historically, we omitted the .write=true from the call to set-gtt-domain in i915_switch_context() in order to avoid a stall between every request as we would want to wait for the previous context write from the gpu. Since then, we limit the set-gtt-domain to only occur when we first bind the vma, so once in use we will never stall, and we are sure to flush the context following a load from swap. Equally we never applied the lessons learnt from ringbuffer submission to execlists; so time to apply the flush of the lrc after load as well. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Acked-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171110142634.10551-6-chris@chris-wilson.co.uk
-
- 10 11月, 2017 1 次提交
-
-
由 Chris Wilson 提交于
Trying to enable printk debugging for GEM is fraught with the issue of spam; interactions with HW are very frequent and often boring. However, one instance where they are not so boring is just before a BUG; here ftrace provides a facility to dump its ringbuffer on an oops. So for CI let's enable trace_printk() to capture the last exchanges with HW as a death rattle. For example, [ 79.234110] ------------[ cut here ]------------ [ 79.234137] kernel BUG at drivers/gpu/drm/i915/intel_lrc.c:907! [ 79.234145] invalid opcode: 0000 [#1] SMP [ 79.234153] Dumping ftrace buffer: [ 79.234158] --------------------------------- ... [ 79.314044] gem_conc-1059 1..s1 79203443us : intel_lrc_irq_handler: bcs0 out[0]: ctx=5.2, seqno=145 [ 79.314089] gem_conc-1059 1..s. 79220800us : intel_lrc_irq_handler: bcs0 csb[1/1]: status=0x00000018:0x00000005 [ 79.314133] gem_conc-1059 1..s. 79220803us : intel_lrc_irq_handler: bcs0 out[0]: ctx=5.1, seqno=145 [ 79.314177] gem_conc-1062 2..s1 79230458us : intel_lrc_irq_handler: bcs0 in[0]: ctx=8.1, seqno=146 [ 79.314220] gem_conc-1062 2..s1 79230515us : intel_lrc_irq_handler: bcs0 in[0]: ctx=8.2, seqno=147 [ 79.314265] gem_conc-1059 1..s1 79230951us : intel_lrc_irq_handler: bcs0 csb[2/3]: status=0x00000012:0x00000008 [ 79.314309] gem_conc-1059 1..s1 79230954us : intel_lrc_irq_handler: bcs0 out[0]: ctx=8.2, seqno=147 [ 79.314353] gem_conc-1059 1..s1 79230954us : intel_lrc_irq_handler: bcs0 csb[3/3]: status=0x00008002:0x00000008 [ 79.314396] gem_conc-1059 1..s1 79230955us : intel_lrc_irq_handler: bcs0 out[0]: ctx=8.1, seqno=147 [ 79.314402] --------------------------------- v2: Tweak the formatting to be more consistent between in/out. v3: do {} while (0) stub macro protection Suggested-by: NMichał Winiarski <michal.winiarski@intel.com> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171109143019.16568-1-chris@chris-wilson.co.uk
-
- 09 11月, 2017 2 次提交
-
-
由 Chris Wilson 提交于
Originally we set the priority to max upon inserting the request into the execlists queue (and removing it from the scheduler lists). We could then use the prio==INT_MAX as a shortcut within execlists_schedule() to detect the end of the dependency chain. Since commit 1f181225 ("drm/i915/execlists: Keep request->priority for its lifetime") this is no longer true as we use the request completion as an indicator the schedule dependency chain is complete instead. (This allows us to then reschedule requests even when its context is in flight.) However, this makes the GEM_BUG_ON() inside execlists_schedule() racy as we may change the rq->prio at the same time. As the assertion is useful, let's keep the assertion and remove the micro-optimisation. Fixes: 1f181225 ("drm/i915/execlists: Keep request->priority for its lifetime") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171024115501.21033-1-chris@chris-wilson.co.ukReviewed-by: NMichał Winiarski <michal.winiarski@intel.com> (cherry picked from commit 64b80085) Signed-off-by: NJani Nikula <jani.nikula@intel.com>
-
由 Chris Wilson 提交于
Back in commit a4b2b015 ("drm/i915: Don't mark an execlists context-switch when idle") we noticed the presence of late context-switch interrupts. We were able to filter those out by looking at whether the ELSP remained active, but in commit beecec90 ("drm/i915/execlists: Preemption!") that became problematic as we now anticipate receiving a context-switch event for preemption while ELSP may be empty. To restore the spurious interrupt suppression, add a counter for the expected number of pending context-switches and skip if we do not need to handle this interrupt to make forward progress. v2: Don't forget to switch on for preempt. v3: Reduce the counter to a on/off boolean tracker. Declare the HW as active when we first submit, and idle after the final completion event (with which we confirm the HW says it is idle), and track each source of activity separately. With a finite number of sources, it should aide us in debugging which gets stuck. Fixes: beecec90 ("drm/i915/execlists: Preemption!") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Michal Winiarski <michal.winiarski@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171023213237.26536-3-chris@chris-wilson.co.ukReviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> (cherry picked from commit 4a118ecb) Signed-off-by: NJani Nikula <jani.nikula@intel.com>
-
- 27 10月, 2017 3 次提交
-
-
由 Michał Winiarski 提交于
Pretty similar to what we have on execlists. We're reusing most of the GEM code, however, due to GuC quirks we need a couple of extra bits. Preemption is implemented as GuC action, and actions can be pretty slow. Because of that, we're using a mutex to serialize them. Since we're requesting preemption from the tasklet, the task of creating a workitem and wrapping it in GuC action is delegated to a worker. To distinguish that preemption has finished, we're using additional piece of HWSP, and since we're not getting context switch interrupts, we're also adding a user interrupt. The fact that our special preempt context has completed unfortunately doesn't mean that we're ready to submit new work. We also need to wait for GuC to finish its own processing. v2: Don't compile out the wait for GuC, handle workqueue flush on reset, no need for ordered workqueue, put on a reviewer hat when looking at my own patches (Chris) Move struct work around in intel_guc, move user interruput outside of conditional (Michał) Keep ring around rather than chase though intel_context v3: Extract WA for flushing ggtt writes to a helper (Chris) Keep work_struct in intel_guc rather than engine (Michał) Use ordered workqueue for inject_preempt worker to avoid GuC quirks. v4: Drop now unused INTEL_GUC_PREEMPT_OPTION_IMMEDIATE (Daniele) Drop stray newlines, use container_of for intel_guc in worker, check for presence of workqueue when flushing it, rather than enable_guc_submission modparam, reorder preempt postprocessing (Chris) v5: Make wq NULL after destroying it v6: Swap struct guc_preempt_work members (Michał) Signed-off-by: NMichał Winiarski <michal.winiarski@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jeff McGee <jeff.mcgee@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Oscar Mateo <oscar.mateo@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171026133558.19580-1-michal.winiarski@intel.com
-
由 Michał Winiarski 提交于
We would also like to make use of execlist_cancel_port_requests and unwind_incomplete_requests in GuC preemption backend. Let's rename the functions to use the correct prefixes, so that we can simply add the declarations in the following patch. Similar thing for applies for can_preempt, except we're introducing HAS_LOGICAL_RING_PREEMPTION macro instad, converting other users that were previously touching device info directly. v2: s/intel_engine/execlists and pass execlists to unwind (Chris) v3: use locked version for exporting, drop const qual (Chris) Signed-off-by: NMichał Winiarski <michal.winiarski@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171025200020.16636-11-michal.winiarski@intel.com
-
由 Michał Winiarski 提交于
Let's separate the "emit" part from touching any internal structures, this way we can have a generic "emit coherent GGTT write" function. We would like to reuse this functionality for emitting HWSP write, to confirm that preempt-to-idle has finished. v2: Reorder args to match emit_pipe_control, s/render/rcs (Chris) Signed-off-by: NMichał Winiarski <michal.winiarski@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171025200020.16636-8-michal.winiarski@intel.com
-
- 26 10月, 2017 2 次提交
-
-
由 Michał Winiarski 提交于
Now that we're handling request resubmission the same way as regular submission (from the tasklet), we can move GuC initialization earlier, before restarting the engines. This way, we're no longer being in the state of flux during engine restart - we're already in user requested submission mode. Signed-off-by: NMichał Winiarski <michal.winiarski@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Michel Thierry <michel.thierry@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Oscar Mateo <oscar.mateo@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171025172519.10670-5-chris@chris-wilson.co.ukSigned-off-by: NChris Wilson <chris@chris-wilson.co.uk>
-
由 Chris Wilson 提交于
In the next patch, we will want to install a callback when the engines (GT as a whole) become idle and similarly when they first become busy. To enable that callback, first rename intel_engines_mark_idle() to intel_engines_park() and provide the companion intel_engines_unpark(). Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171025143943.7661-2-chris@chris-wilson.co.uk
-
- 24 10月, 2017 2 次提交
-
-
由 Chris Wilson 提交于
Originally we set the priority to max upon inserting the request into the execlists queue (and removing it from the scheduler lists). We could then use the prio==INT_MAX as a shortcut within execlists_schedule() to detect the end of the dependency chain. Since commit 1f181225 ("drm/i915/execlists: Keep request->priority for its lifetime") this is no longer true as we use the request completion as an indicator the schedule dependency chain is complete instead. (This allows us to then reschedule requests even when its context is in flight.) However, this makes the GEM_BUG_ON() inside execlists_schedule() racy as we may change the rq->prio at the same time. As the assertion is useful, let's keep the assertion and remove the micro-optimisation. Fixes: 1f181225 ("drm/i915/execlists: Keep request->priority for its lifetime") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171024115501.21033-1-chris@chris-wilson.co.ukReviewed-by: NMichał Winiarski <michal.winiarski@intel.com>
-
由 Chris Wilson 提交于
Back in commit a4b2b015 ("drm/i915: Don't mark an execlists context-switch when idle") we noticed the presence of late context-switch interrupts. We were able to filter those out by looking at whether the ELSP remained active, but in commit beecec90 ("drm/i915/execlists: Preemption!") that became problematic as we now anticipate receiving a context-switch event for preemption while ELSP may be empty. To restore the spurious interrupt suppression, add a counter for the expected number of pending context-switches and skip if we do not need to handle this interrupt to make forward progress. v2: Don't forget to switch on for preempt. v3: Reduce the counter to a on/off boolean tracker. Declare the HW as active when we first submit, and idle after the final completion event (with which we confirm the HW says it is idle), and track each source of activity separately. With a finite number of sources, it should aide us in debugging which gets stuck. Fixes: beecec90 ("drm/i915/execlists: Preemption!") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Michal Winiarski <michal.winiarski@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171023213237.26536-3-chris@chris-wilson.co.ukReviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
-
- 17 10月, 2017 1 次提交
-
-
由 Chris Wilson 提交于
In the next patch, we want to reduce the lock coverage within the shrinker, and one of the dangerous walks we have is over obj->vma_list. We are only walking the obj->vma_list in order to check whether it has been permanently pinned by HW access, typically via use on the scanout. But we have a couple of other long term pins, the context objects for which we currently have to check the individual vma pin_count. If we instead mark these using obj->pin_display, we can forgo the dangerous and sometimes slow list iteration. v2: Rearrange code to try and avoid confusion from false associations due to arrangement of whitespace along with rebasing on obj->pin_global. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171013202621.7276-4-chris@chris-wilson.co.uk
-
- 16 10月, 2017 1 次提交
-
-
由 Weinan Li 提交于
Let GVT-g VM read the CSB and CSB write pointer from virtual HWSP, not all the host support this feature, need to check the BIT(3) of caps in PVINFO. v3 : Remove unnecessary comments. v4 : Separate VM enable patch with GVT-g implementation patch due to code dependency. v5 : Use inline for GVT virtual HWSP caps check function. v6 : Comments refine. Signed-off-by: NWeinan Li <weinan.z.li@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/1508039725-1066-1-git-send-email-weinan.z.li@intel.com
-
- 10 10月, 2017 1 次提交
-
-
由 Mika Kuoppala 提交于
There is function to tell how many ports we have, so use it. We still have direct relationship with array size and port count, so no harm was done. Fixes: 76e70087 ("drm/i915: Make execlist port count variable") Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NMika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171010114857.13108-1-mika.kuoppala@intel.com
-
- 07 10月, 2017 1 次提交
-
-
由 Chris Wilson 提交于
Michel Thierry noticed that we were applying WaDisableCtxRestoreArbitration even to gen9, which does not require the w/a. The rationale is that we need to enable MI arbitration for execlists to work, and to be safe we do that before every batch (in addition to every context switch into the batch). Since this is not clear from the single line comment suggesting the MI_ARB_ENABLE is solely for the w/a, add a little more detail. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Michel Thierry <michel.thierry@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Michał Winiarski <michal.winiarski@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171005191005.13462-1-chris@chris-wilson.co.ukReviewed-by: NMichel Thierry <michel.thierry@intel.com>
-
- 05 10月, 2017 4 次提交
-
-
由 Chris Wilson 提交于
When we write to ELSP, it triggers a context preemption at the earliest arbitration point (3DPRIMITIVE, some PIPECONTROLs, a few other operations and the explicit MI_ARB_CHECK). If this is to the same context, it triggers a LITE_RESTORE where the RING_TAIL is merely updated (used currently to chain requests from the same context together, avoiding bubbles). However, if it is to a different context, a full context-switch is performed and it will start to execute the new context saving the image of the old for later execution. Previously we avoided preemption by only submitting a new context when the old was idle. But now we wish embrace it, and if the new request has a higher priority than the currently executing request, we write to the ELSP regardless, thus triggering preemption, but we tell the GPU to switch to our special preemption context (not the target). In the context-switch interrupt handler, we know that the previous contexts have finished execution and so can unwind all the incomplete requests and compute the new highest priority request to execute. It would be feasible to avoid the switch-to-idle intermediate by programming the ELSP with the target context. The difficulty is in tracking which request that should be whilst maintaining the dependency change, the error comes in with coalesced requests. As we only track the most recent request and its priority, we may run into the issue of being tricked in preempting a high priority request that was followed by a low priority request from the same context (e.g. for PI); worse still that earlier request may be our own dependency and the order then broken by preemption. By injecting the switch-to-idle and then recomputing the priority queue, we avoid the issue with tracking in-flight coalesced requests. Having tried the preempt-to-busy approach, and failed to find a way around the coalesced priority issue, Michal's original proposal to inject an idle context (based on handling GuC preemption) succeeds. The current heuristic for deciding when to preempt are only if the new request is of higher priority, and has the privileged priority of greater than 0. Note that the scheduler remains unfair! v2: Disable for gen8 (bdw/bsw) as we need additional w/a for GPGPU. Since, the feature is now conditional and not always available when we have a scheduler, make it known via the HAS_SCHEDULER GETPARAM (now a capability mask). v3: Stylistic tweaks. v4: Appease Joonas with a snippet of kerneldoc, only to fuel to fire of the preempt vs preempting debate. Suggested-by: NMichal Winiarski <michal.winiarski@intel.com> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Michal Winiarski <michal.winiarski@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Ben Widawsky <benjamin.widawsky@intel.com> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Cc: Zhi Wang <zhi.a.wang@intel.com> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171003203453.15692-8-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
With preemption, we will want to "unsubmit" a request, taking it back from the hw and returning it to the priority sorted execution list. In order to know where to insert it into that list, we need to remember its adjust priority (which may change even as it was being executed). This also affects reset for execlists as we are now unsubmitting the requests following the reset (rather than directly writing the ELSP for the inflight contexts). This turns reset into an accidental preemption point, as after the reset we may choose a different pair of contexts to submit to hw. GuC is not updated as this series doesn't add preemption to the GuC submission, and so it can keep benefiting from the early pruning of the DFS inside execlists_schedule() for a little longer. We also need to find a way of reducing the cost of that DFS... v2: Include priority in error-state Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: NMichał Winiarski <michal.winiarski@intel.com> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171003203453.15692-6-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Move the re-enabling of MI arbitration from a per-bb w/a buffer to the emission of the batch buffer itself. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171003203453.15692-5-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Let the listener know that the context we just scheduled out was not complete, and will be scheduled back in at a later point. v2: Handle CONTEXT_STATUS_PREEMPTED in gvt by aliasing it to CONTEXT_STATUS_OUT for the moment, gvt can expand upon the difference later. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: "Zhenyu Wang" <zhenyuw@linux.intel.com> Cc: "Wang, Zhi A" <zhi.a.wang@intel.com> Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171003203453.15692-3-chris@chris-wilson.co.uk
-
- 04 10月, 2017 1 次提交
-
-
由 Maarten Lankhorst 提交于
This has been unused since commit afa8ce5b ("drm/i915: Nuke legacy flip queueing code"). Changes since v1: - Rebase on top of all the changes to modparams. Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NMaarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> \o/-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171004094416.31306-1-maarten.lankhorst@linux.intel.com
-
- 29 9月, 2017 2 次提交
-
-
由 Michał Winiarski 提交于
Avoid the repeated rbtree lookup for each request as we unwind them by tracking the last priolist. v2: Fix up my unhelpful suggestion of using default_priolist. Signed-off-by: NMichał Winiarski <michal.winiarski@intel.com> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170928193910.17988-4-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
We use INT_MIN to denote the priority of a request that has not been submitted to the scheduler; we treat INT_MIN as an invalid priority and initialise the request to it. Give the value a name so it stands out. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170928193910.17988-3-chris@chris-wilson.co.ukReviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com
-