- 04 5月, 2017 3 次提交
-
-
由 Chris Wilson 提交于
Some callers immediately want to know the current ring->space after calling intel_ring_update_space(), which we can freely provide via the return parameter. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170504130846.4807-2-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Exploit the power-of-two ring size to compute the space across the wraparound using a mask rather than a if. Convert to unsigned integers so the operation is well defined. References: https://bugs.freedesktop.org/show_bug.cgi?id=99671Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170504130846.4807-1-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Since unifying ringbuffer/execlist submission to use engine->pin_context, we ensure that the intel_ring is available before we start constructing the request. We can therefore move the assignment of the request->ring to the central i915_gem_request_alloc() and not require it in every engine->request_alloc() callback. Another small step towards simplification (of the core, but at a cost of handling error pointers in less important callers of engine->pin_context). v2: Rearrange a few branches to reduce impact of PTR_ERR() on gcc's code generation. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Oscar Mateo <oscar.mateo@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: NOscar Mateo <oscar.mateo@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170504093308.4137-1-chris@chris-wilson.co.uk
-
- 28 4月, 2017 1 次提交
-
-
由 Joonas Lahtinen 提交于
Pre-calculate engine context size based on engine class and device generation and store it in the engine instance. v2: - Squash and get rid of hw_context_size (Chris) v3: - Move after MMIO init for probing on Gen7 and 8 (Chris) - Retained rounding (Tvrtko) v4: - Rebase for deferred legacy context allocation Signed-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Oscar Mateo <oscar.mateo@intel.com> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Cc: intel-gvt-dev@lists.freedesktop.org Acked-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
-
- 26 4月, 2017 1 次提交
-
-
由 Chris Wilson 提交于
If we are enabling the breadcrumbs signaling prior to submitting the request, we know that we cannot have missed the interrupt and can therefore skip immediately waking the signaler to check. This reduces a significant chunk of the __i915_gem_request_submit() overhead for inter-engine synchronisation, for example in gem_exec_whisper. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170426080659.28771-1-chris@chris-wilson.co.ukReviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
-
- 25 4月, 2017 1 次提交
-
-
由 Chris Wilson 提交于
We need to keep track of the last location we ask the hw to read up to (RING_TAIL) separately from our last write location into the ring, so that in the event of a GPU reset we do not tell the HW to proceed into a partially written request (which can happen if that request is waiting for an external signal before being executed). v2: Refactor intel_ring_reset() (Mika) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100144 Testcase: igt/gem_exec_fence/await-hang Fixes: 821ed7df ("drm/i915: Update reset path to fix incomplete requests") Fixes: d55ac5bf ("drm/i915: Defer transfer onto execution timeline to actual hw submission") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170425130049.26147-1-chris@chris-wilson.co.ukReviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>
-
- 11 4月, 2017 4 次提交
-
-
由 Chris Wilson 提交于
We want to refer to the index of the engine consistently throughout the userspace ABI. We already have such an index through the execbuffer engine specifier, that needs to be able to refer to each engine specifically, so rename it the index to uabi_id to reflect its generality beyond execbuf. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170411124306.15448-1-chris@chris-wilson.co.uk
-
由 Oscar Mateo 提交于
Not really needed, but makes the next change a little bit more compact. v2: - Use zero-based numbering for engine names: xcs0, xcs1.. xcsN (Tvrtko, Chris) - Make sure the mock engine name is null-terminated (Tvrtko, Chris) v3: Because I'm stupid (Chris) v4: Verify engine name wasn't truncated (Michal) v5: - Kill the warning in mock engine (Chris) Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NMichal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: NOscar Mateo <oscar.mateo@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1491834873-9345-4-git-send-email-oscar.mateo@intel.comReviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
-
由 Oscar Mateo 提交于
If we needed to do something different for the init functions, we could always look at the engine instance to make the distinction. But, in any case, the two functions are virtually identical already (please notice that BSD2_RING is only used from gen8 onwards). With this, the init functions depends excusively on the engine class (a fact that we will use soon). v2: Commit message Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: NOscar Mateo <oscar.mateo@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1491834873-9345-3-git-send-email-oscar.mateo@intel.comReviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
-
由 Daniele Ceraolo Spurio 提交于
In such a way that vcs and vcs2 are just two different instances (0 and 1) of the same engine class (VIDEO_DECODE_CLASS). v2: Align the instance types (Tvrtko) v3: Don't use enums for bspec-defined stuff (Michal) Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NMichal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: NOscar Mateo <oscar.mateo@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1491834873-9345-2-git-send-email-oscar.mateo@intel.comReviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
-
- 03 4月, 2017 1 次提交
-
-
由 Chris Wilson 提交于
Or rather it is used only by intel_ring_pin() to extract the drm_i915_private which we can easily pass in. As this is a relatively rare operation, save the space in the struct, and as such it is even break even in the extra code for passing around the parameter: add/remove: 0/0 grow/shrink: 2/3 up/down: 15/-15 (0) function old new delta intel_init_ring_buffer 906 918 +12 execlists_context_pin 1308 1311 +3 mock_engine 407 403 -4 intel_engine_create_ring 367 363 -4 intel_ring_pin 326 319 -7 Total: Before=1261794, After=1261794, chg +0.00% v2: Reorder intel_init_ring_buffer to keep the ring setup together: add/remove: 0/0 grow/shrink: 2/3 up/down: 9/-15 (-6) function old new delta intel_init_ring_buffer 906 912 +6 execlists_context_pin 1308 1311 +3 mock_engine 407 403 -4 intel_engine_create_ring 367 363 -4 intel_ring_pin 326 319 -7 Total: Before=1261794, After=1261788, chg -0.00% Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170403113426.25707-1-chris@chris-wilson.co.uk
-
- 29 3月, 2017 1 次提交
-
-
由 Chris Wilson 提交于
If the request->wa_tail is 0 (because it landed exactly on the end of the ringbuffer), when we reconstruct request->tail following a reset we fill in an illegal value (-8 or 0x001ffff8). As a result, RING_HEAD is never able to catch up with RING_TAIL and the GPU spins endlessly. If the ring contains a couple of breadcrumbs, even our hangcheck is unable to catch the busy-looping as the ACTHD and seqno continually advance. v2: Move the wrap into a common intel_ring_wrap(). Fixes: a3aabe86 ("drm/i915/execlists: Reinitialise context image after GPU hang") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: <stable@vger.kernel.org> # v4.10+ Link: http://patchwork.freedesktop.org/patch/msgid/20170327130009.4678-1-chris@chris-wilson.co.ukReviewed-by: NMika Kuoppala <mika.kuoppala@intel.com> (cherry picked from commit 450362d3) Signed-off-by: NJani Nikula <jani.nikula@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170329121315.1290-1-chris@chris-wilson.co.uk
-
- 27 3月, 2017 5 次提交
-
-
由 Chris Wilson 提交于
Whilst I like having the assertions clearly visible in the code, they are quite repetitious! As we find new limits we want to incorporate into the set of assertions, it make sense to refactor them to a common routine. v2: Add a guc holdout. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170327131412.20293-1-chris@chris-wilson.co.ukReviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>
-
由 Chris Wilson 提交于
If the request->wa_tail is 0 (because it landed exactly on the end of the ringbuffer), when we reconstruct request->tail following a reset we fill in an illegal value (-8 or 0x001ffff8). As a result, RING_HEAD is never able to catch up with RING_TAIL and the GPU spins endlessly. If the ring contains a couple of breadcrumbs, even our hangcheck is unable to catch the busy-looping as the ACTHD and seqno continually advance. v2: Move the wrap into a common intel_ring_wrap(). Fixes: a3aabe86 ("drm/i915/execlists: Reinitialise context image after GPU hang") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: <stable@vger.kernel.org> # v4.10+ Link: http://patchwork.freedesktop.org/patch/msgid/20170327130009.4678-1-chris@chris-wilson.co.ukReviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>
-
由 Chris Wilson 提交于
Since the engine's flag is just the bit of its id, use BIT(). Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170324163540.31981-3-chris@chris-wilson.co.ukReviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
由 Chris Wilson 提交于
intel_flush_status_page() is defunct since commit f8dd2934 ("drm/i915: Remove BXT incoherent seqno write workaround"), time to remove it. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170324163540.31981-2-chris@chris-wilson.co.ukReviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
由 Chris Wilson 提交于
Not all of our target platforms have clflush. For those without, just assume the status page is sufficiently coherent that we do not need our paranoia. Reported-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Fixes: 14a6bbf9 ("drm/i915: Replace irq_seqno_barrier on hws write with a clflush") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170324163540.31981-1-chris@chris-wilson.co.ukTested-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>
-
- 21 3月, 2017 2 次提交
-
-
由 Changbin Du 提交于
GVTg has introduced the context status notifier to schedule the GVTg workload. At that time, the notifier is bound to GVTg context only, so GVTg is not aware of host workloads. Now we are going to improve GVTg's guest workload scheduler policy, and add Guc emulation support for new Gen graphics. Both these two features require acknowledgment for all contexts running on hardware. (But will not alter host workload.) So here try to make some change. The change is simple: 1. Move the context status notifier head from i915_gem_context to intel_engine_cs. Which means there is a notifier head per engine instead of per context. Execlist driver still call notifier for each context sched-in/out events of current engine. 2. At GVTg side, it binds a notifier_block for each physical engine at GVTg initialization period. Then GVTg can hear all context status events. In this patch, GVTg do nothing for host context event, but later will add a function there. But in any case, the notifier callback is a noop if this is no active vGPU. Since intel_gvt_init() is called at early initialization stage and require the status notifier head has been initiated, I initiate it in intel_engine_setup(). v2: remove a redundant newline. (chris) Fixes: 3c7ba635 ("drm/i915: Introduce execlist context status change notification") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100232Signed-off-by: NChangbin Du <changbin.du@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Zhi Wang <zhi.a.wang@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170313024711.28591-1-changbin.du@intel.comAcked-by: NZhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> (cherry picked from commit 3fc03069) Signed-off-by: NJani Nikula <jani.nikula@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170321144720.17020-1-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Storing the position of the breadcrumb of the last retired request as a separate last_retired_head is superfluous as we always copy that into head prior to recalculation of the intel_ring.space. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170321102552.24357-1-chris@chris-wilson.co.uk
-
- 17 3月, 2017 2 次提交
-
-
由 Chris Wilson 提交于
It turns out that we may want to restore the original engine->submit_request (and engine->schedule) callbacks from more than just the guc <-> execlists transition. Move this to a vfunc so we can have a common interface. v2: Move initial selection to intel_engines_init_common(), repaint vfunc with engine->set_default_submission (and a similar colour for the helper). Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170316171305.12972-2-chris@chris-wilson.co.uk
-
由 Changbin Du 提交于
GVTg has introduced the context status notifier to schedule the GVTg workload. At that time, the notifier is bound to GVTg context only, so GVTg is not aware of host workloads. Now we are going to improve GVTg's guest workload scheduler policy, and add Guc emulation support for new Gen graphics. Both these two features require acknowledgment for all contexts running on hardware. (But will not alter host workload.) So here try to make some change. The change is simple: 1. Move the context status notifier head from i915_gem_context to intel_engine_cs. Which means there is a notifier head per engine instead of per context. Execlist driver still call notifier for each context sched-in/out events of current engine. 2. At GVTg side, it binds a notifier_block for each physical engine at GVTg initialization period. Then GVTg can hear all context status events. In this patch, GVTg do nothing for host context event, but later will add a function there. But in any case, the notifier callback is a noop if this is no active vGPU. Since intel_gvt_init() is called at early initialization stage and require the status notifier head has been initiated, I initiate it in intel_engine_setup(). v2: remove a redundant newline. (chris) Fixes: 3c7ba635 ("drm/i915: Introduce execlist context status change notification") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100232Signed-off-by: NChangbin Du <changbin.du@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Zhi Wang <zhi.a.wang@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170313024711.28591-1-changbin.du@intel.comAcked-by: NZhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
-
- 16 3月, 2017 1 次提交
-
-
由 Chris Wilson 提交于
When manually overwriting the HWS, rather than assume irq_seqno_barrier does the right thing, we can explicitly flush the cacheline instead. This avoids us calling the engine->irq_seqno_barrier() from an illegal context: [ 1472.651797] BUG: scheduling while atomic: migration/0/11/0x00000002 [ 1472.651807] Modules linked in: ctr ccm arc4 snd_hda_codec_hdmi bnep rfcomm iwldvm snd_hda_codec_conexant snd_hda_codec_generic snd_hda_intel mac80211 snd_hda_codec snd_hda_core snd_pcm dm_multipath snd_hwdep intel_powerclamp coretemp snd_seq_midi crct10dif_pclmul snd_seq_midi_event crc32_pclmul iwlwifi ghash_clmulni_intel btusb snd_rawmidi btrtl aesni_intel btbcm aes_x86_64 crypto_simd btintel cryptd glue_helper bluetooth snd_seq cfg80211 snd_timer snd_seq_device intel_ips binfmt_misc snd mei_me soundcore mei dm_mirror dm_region_hash dm_log i915 intel_gtt i2c_algo_bit drm_kms_helper cfbfillrect syscopyarea cfbimgblt sysfillrect sysimgblt fb_sys_fops cfbcopyarea prime_numbers e1000e drm ahci libahci [ 1472.651897] CPU: 0 PID: 11 Comm: migration/0 Tainted: G U 4.11.0-rc1+ #203 [ 1472.651899] Hardware name: LENOVO 514328U/514328U, BIOS 6QET44WW (1.14 ) 04/20/2010 [ 1472.651900] Call Trace: [ 1472.651913] dump_stack+0x63/0x90 [ 1472.651922] __schedule_bug+0x5d/0x6b [ 1472.651930] __schedule+0x46a/0x5f0 [ 1472.651934] schedule+0x38/0x90 [ 1472.651938] schedule_hrtimeout_range_clock+0x85/0x110 [ 1472.651945] ? hrtimer_init+0x10/0x10 [ 1472.651949] schedule_hrtimeout_range+0xe/0x10 [ 1472.651952] usleep_range+0x4d/0x60 [ 1472.652037] gen5_seqno_barrier+0x13/0x20 [i915] [ 1472.652101] intel_engine_init_global_seqno+0xd7/0x160 [i915] [ 1472.652160] __i915_gem_set_wedged_BKL+0xa0/0x180 [i915] [ 1472.652166] multi_cpu_stop+0xbb/0xe0 [ 1472.652170] ? cpu_stop_queue_work+0x90/0x90 [ 1472.652174] cpu_stopper_thread+0x82/0x110 [ 1472.652179] smpboot_thread_fn+0x137/0x190 [ 1472.652184] kthread+0xf7/0x130 [ 1472.652187] ? sort_range+0x20/0x20 [ 1472.652191] ? kthread_park+0x90/0x90 [ 1472.652195] ret_from_fork+0x2c/0x40 Testcase: igt/gem_eio #ilk Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170314111452.9375-1-chris@chris-wilson.co.ukReviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>
-
- 04 3月, 2017 2 次提交
-
-
由 Michal Wajdeczko 提交于
Generally we are using macros for any hardware identifiers as these may change between Gens. Do the same with hardware engine ids. v2: move hw engine defs to i915_reg.h (Chris) Signed-off-by: NMichal Wajdeczko <michal.wajdeczko@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170301202615.118632-1-michal.wajdeczko@intel.comReviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
-
由 Chris Wilson 提交于
As we now take the breadcrumbs spinlock within the interrupt handler, we wish to minimise its hold time. During the interrupt we do not care about the state of the full rbtree, only that of the first element, so we can guard that with a separate lock. v2: Rename first_wait to irq_wait to make it clearer that it is guarded by irq_lock. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170303190824.1330-1-chris@chris-wilson.co.uk
-
- 03 3月, 2017 2 次提交
-
-
由 Chris Wilson 提交于
The code to check for execlists completion is generic, so move it to intel_engine_cs.c, where we can reuse the new intel_engine_is_idle(). Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170303121947.20482-2-chris@chris-wilson.co.ukReviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>
-
由 Chris Wilson 提交于
During reset_all_global_seqno() on seqno rollover, we have to update the HWS. This causes all in flight requests to be completed, so first we wait. However, we were only waiting for the requests themselves to be completed and clearing out the waiter rbtrees - what I had missed was the extra reference in execlists->port[]. Since commit fe9ae7a3 ("drm/i915/execlists: Detect an out-of-order context switch") we can detect when the request is retired before the context switch interrupt is completed. The impact should be neglible outside of debugging. Testcase: igt/gem_exec_whisper Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170303121947.20482-1-chris@chris-wilson.co.ukReviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>
-
- 28 2月, 2017 3 次提交
-
-
由 Chris Wilson 提交于
A significant cost in setting up a wait is the overhead of enabling the interrupt. As we disable the interrupt whenever the queue of waiters is empty, if we are frequently waiting on alternating batches, we end up re-enabling the interrupt on a frequent basis. We do want to disable the interrupt during normal operations as under high load it may add several thousand interrupts/s - we have been known in the past to occupy whole cores with our interrupt handler after accidentally leaving user interrupts enabled. As a compromise, leave the interrupt enabled until the next IRQ, or the system is idle. This gives a small window for a waiter to keep the interrupt active and not be delayed by having to re-enable the interrupt. v2: Restore hangcheck/missed-irq detection for continuations v3: Be more careful restoring the hangcheck timer after reset v4: Be more careful restoring the fake irq after reset (if required!) v5: Redo changes to intel_engine_wakeup() v6: Factor out __intel_engine_wakeup() v7: Improve commentary for declaring a missed wakeup Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170227205850.2828-4-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
As execlists and other non-semaphore multi-engine devices coordinate between engines using interrupts, we can shave off a few 10s of microsecond of scheduling latency by doing the fence signaling from the interrupt as opposed to a RT kthread. (Realistically the delay adds about 1% to an individual cross-engine workload.) We only signal the first fence in order to limit the amount of work we move into the interrupt handler. We also have to remember that our breadcrumbs may be unordered with respect to the interrupt and so we still require the waiter process to perform some heavyweight coherency fixups, as well as traversing the tree of waiters. v2: No need for early exit in irq handler - it breaks the flow between patches and prevents the tracepoint v3: Restore rcu hold across irq signaling of request Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170227205850.2828-2-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
The two users of the return value from intel_engine_wakeup() are expecting different results. In the breadcrumbs hangcheck, we are using it to determine whether wake_up_process() detected the waiter was currently running (and if so we presume that it hasn't yet missed the interrupt). However, in the fake_irq path, we are using the return value as a check as to whether there are any waiters, and so we may incorrectly stop the fake-irq if that waiter was currently running. To handle the two different needs, return both bits of information! We uninline it from the irq path in preparation for the next patch which makes the irq hotpath special and relegates intel_engine_wakeup() to the slow fixup paths. v2: s/ret/result/ Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170227205850.2828-1-chris@chris-wilson.co.uk
-
- 23 2月, 2017 4 次提交
-
-
由 Chris Wilson 提交于
If we preempt a request and remove it from the execution queue, we need to undo its global seqno and restart any waiters. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170223074422.4125-11-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
The plan in the near-future is to allow requests to be removed from the signaler. We can no longer then rely on holding a reference to the request for the duration it is in the signaling tree, and instead must obtain a reference to the request for the current operation using RCU. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170223074422.4125-10-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
A request is assigned a global seqno only when it is on the hardware execution queue. The global seqno can be used to maintain a list of requests on the same engine in retirement order, for example for constructing a priority queue for waiting. Prior to its execution, or if it is subsequently removed in the event of preemption, its global seqno is zero. As both insertion and removal from the execution queue may operate in IRQ context, it is not guarded by the usual struct_mutex BKL. Instead those relying on the global seqno must be prepared for its value to change between reads. Only when the request is complete can the global seqno be stable (due to the memory barriers on submitting the commands to the hardware to write the breadcrumb, if the HWS shows that it has passed the global seqno and the global seqno is unchanged after the read, it is indeed complete). Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170223074422.4125-9-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Replace the global device seqno with one for each engine, and account for in-flight seqno on each separately. This is consistent with dma-fence as each timeline has separate fence-contexts for each engine and a seqno is only ordered within a fence-context (i.e. seqno do not need to be ordered wrt to other engines, just ordered within a single engine). This is required to enable request rewinding for preemption on individual engines (we have to rewind the global seqno to avoid overflow, and we do not have to rewind all engines just to preempt one.) v2: Rename active_seqno to inflight_seqnos to more clearly indicate that it is a counter and not equivalent to the existing seqno. Update functions that operated on active_seqno similarly. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170223074422.4125-3-chris@chris-wilson.co.uk
-
- 17 2月, 2017 3 次提交
-
-
由 Chris Wilson 提交于
When the timer expires for checking on interrupt processing, check to see if any interrupts arrived within the last time period. If real interrupts are still being delivered, we can be reassured that we haven't missed the final interrupt as the waiter will still be woken. Only once all activity ceases, do we have to worry about the waiter never being woken and so need to install a timer to kick the waiter for a slow arrival of a seqno. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170217151304.16665-2-chris@chris-wilson.co.uk
-
由 Tvrtko Ursulin 提交于
We have a few open coded instances in the execlists code and an almost suitable helper in intel_ringbuf.c We can consolidate to a single helper if we change the existing helper to emit directly to ring buffer memory and move the space reservation outside it. v2: Drop memcpy for memset. (Chris Wilson) Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170216122325.31391-2-tvrtko.ursulin@linux.intel.com
-
由 Tvrtko Ursulin 提交于
It is only used within intel_ringbuffer.c Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.oc.uk>
-
- 15 2月, 2017 1 次提交
-
-
由 Tvrtko Ursulin 提交于
intel_ring_workarounds_emit is exactly the same code. Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170214150017.16058-1-tvrtko.ursulin@linux.intel.com
-
- 14 2月, 2017 2 次提交
-
-
由 Tvrtko Ursulin 提交于
This removes the usage of intel_ring_emit in favour of directly writing to the ring buffer. intel_ring_emit was preventing the compiler for optimising fetch and increment of the current ring buffer pointer and therefore generating very verbose code for every write. It had no useful purpose since all ringbuffer operations are started and ended with intel_ring_begin and intel_ring_advance respectively, with no bail out in the middle possible, so it is fine to increment the tail in intel_ring_begin and let the code manage the pointer itself. Useless instruction removal amounts to approximately two and half kilobytes of saved text on my build. Not sure if this has any measurable performance implications but executing a ton of useless instructions on fast paths cannot be good. v2: * Change return from intel_ring_begin to error pointer by popular demand. * Move tail increment to intel_ring_advance to enable some error checking. v3: * Move tail advance back into intel_ring_begin. * Rebase and tidy. v4: * Complete rebase after a few months since v3. v5: * Remove unecessary cast and fix !debug compile. (Chris Wilson) v6: * Make intel_ring_offset take request as well. * Fix recording of request postfix plus a sprinkle of asserts. (Chris Wilson) v7: * Use intel_ring_offset to get the postfix. (Chris Wilson) * Convert GVT code as well. v8: * Rename *out++ to *cs++. v9: * Fix GVT out to cs conversion in GVT. v10: * Rebase for new intel_ring_begin in selftests. Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Zhi Wang <zhi.a.wang@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Acked-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170214113242.29241-1-tvrtko.ursulin@linux.intel.com
-
由 Chris Wilson 提交于
First retroactive test, make sure that the waiters are in global seqno order after random inserts and removals. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170213171558.20942-3-chris@chris-wilson.co.uk
-
- 11 2月, 2017 1 次提交
-
-
由 Chris Wilson 提交于
After a brief discussion, we settled on a naming convention for the conditional GEM debugging data that should be clearer to the casual user: GEM_DEBUG Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170207102319.10910-1-chris@chris-wilson.co.ukReviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
-