- 05 10月, 2017 1 次提交
-
-
由 Chris Wilson 提交于
Let the listener know that the context we just scheduled out was not complete, and will be scheduled back in at a later point. v2: Handle CONTEXT_STATUS_PREEMPTED in gvt by aliasing it to CONTEXT_STATUS_OUT for the moment, gvt can expand upon the difference later. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: "Zhenyu Wang" <zhenyuw@linux.intel.com> Cc: "Wang, Zhi A" <zhi.a.wang@intel.com> Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171003203453.15692-3-chris@chris-wilson.co.uk
-
- 04 10月, 2017 1 次提交
-
-
由 Maarten Lankhorst 提交于
This has been unused since commit afa8ce5b ("drm/i915: Nuke legacy flip queueing code"). Changes since v1: - Rebase on top of all the changes to modparams. Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NMaarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> \o/-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171004094416.31306-1-maarten.lankhorst@linux.intel.com
-
- 29 9月, 2017 3 次提交
-
-
由 Michał Winiarski 提交于
Avoid the repeated rbtree lookup for each request as we unwind them by tracking the last priolist. v2: Fix up my unhelpful suggestion of using default_priolist. Signed-off-by: NMichał Winiarski <michal.winiarski@intel.com> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170928193910.17988-4-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
We use INT_MIN to denote the priority of a request that has not been submitted to the scheduler; we treat INT_MIN as an invalid priority and initialise the request to it. Give the value a name so it stands out. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170928193910.17988-3-chris@chris-wilson.co.ukReviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com
-
由 Chris Wilson 提交于
In the future, we will want to unwind requests following a preemption point. This requires the same steps as for unwinding upon a reset, so extract the existing code to a separate function for later use. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170928193910.17988-2-chris@chris-wilson.co.uk
-
- 27 9月, 2017 1 次提交
-
-
由 Chris Wilson 提交于
When cancelling requests, also send the notification to any listeners (gvt) that the request is no longer scheduled on hw. They may require to keep the in/out exactly balanced, and so the reuse after the reset may confuse the listener. Fixes: 221ab971 ("drm/i915/execlists: Unwind incomplete requests on resets") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: "Zhenyu Wang" <zhenyuw@linux.intel.com> Cc: "Wang, Zhi A" <zhi.a.wang@intel.com> Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170926101720.9479-1-chris@chris-wilson.co.ukReviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
-
- 26 9月, 2017 1 次提交
-
-
由 Chris Wilson 提交于
Just rearrange the code slightly to trim the number of iterations required. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20170925124929.16974-1-chris@chris-wilson.co.ukReviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
-
- 25 9月, 2017 7 次提交
-
-
由 Chris Wilson 提交于
Since we inherited the context image setup from gen8 which needed a per-bb workaround (for GPGPU), we are submitting an empty per-bb buffer on gen9. Now that we can skip adding the buffer to the context image, remove the dangling per-bb. This slightly improves execution latency, most notably on an idle engine. References: https://bugs.freedesktop.org/show_bug.cgi?id=87725Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170921135444.27330-2-chris@chris-wilson.co.ukReviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
-
由 Chris Wilson 提交于
The per-context and per-batch workaround buffers are optional, yet we tell the GPU to execute them even if they contain no instructions. Doing so incurs the dispatch latency, which we can avoid if we don't ask the GPU to execute the no-op buffers. Allow ourselves to skip setup of empty buffer, and then to only enable non-empty buffers in the context image. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170921135444.27330-1-chris@chris-wilson.co.ukReviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
-
由 Mika Kuoppala 提交于
As we emulate execlists on top of the GuC workqueue, it is not restricted to just 2 ports and we can increase that number arbitrarily to trade-off queue depth (i.e. scheduling latency) against pipeline bubbles. v2: rebase. better commit msg (Chris) v3: rebase Signed-off-by: NMika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20170922124307.10914-5-mika.kuoppala@intel.com
-
由 Mika Kuoppala 提交于
When first execlist entry is processed, we move the port (contents). Introduce function for this as execlist and guc use this common operation. v2: rebase. s/GEM_DEBUG_BUG/GEM_BUG (Chris) v3: rebase Signed-off-by: NMika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20170922124307.10914-4-mika.kuoppala@intel.com
-
由 Mika Kuoppala 提交于
On reset and wedged path, we want to release the requests that are tied to ports and then mark the ports to be unset. Introduce a function for this. v2: rebase v3: drop local, keep GEM_BUG_ON (Michał, Chris) v4: rebase Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NMika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: NMichał Winiarski <michal.winiarski@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20170922124307.10914-3-mika.kuoppala@intel.com
-
由 Mika Kuoppala 提交于
Move execlist init into a common engine setup. As it is common to both guc and hw execlists. v2: rebase with csb changes v3: rebase Signed-off-by: NMika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20170922124307.10914-2-mika.kuoppala@intel.com
-
由 Mika Kuoppala 提交于
Engine's execlist related items have been increasing to a point where a separate struct is warranted. Carve execlist specific items to a dedicated struct to add clarity. v2: add kerneldoc and fix whitespace (Joonas, Chris) v3: csb_mmio changes, rebase v4: s/\b(el|execlist)\b/execlists/ (Joonas) Suggested-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: NMika Kuoppala <mika.kuoppala@intel.com> Acked-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Michał Winiarski <michal.winiarski@intel.com> (v3) Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v3) Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170922124307.10914-1-mika.kuoppala@intel.com
-
- 22 9月, 2017 1 次提交
-
-
由 Michal Wajdeczko 提交于
Our global struct with params is named exactly the same way as new preferred name for the drm_i915_private function parameter. To avoid such name reuse lets use different name for the global. v5: pure rename v6: fix Credits-to: Coccinelle @@ identifier n; @@ ( - i915.n + i915_modparams.n ) Signed-off-by: NMichal Wajdeczko <michal.wajdeczko@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Ville Syrjala <ville.syrjala@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Acked-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: NJani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170919193846.38060-1-michal.wajdeczko@intel.com
-
- 18 9月, 2017 6 次提交
-
-
由 Michał Winiarski 提交于
To create an upper bound on number of GuC workitems, we need to change the way that requests are being submitted. Rather than submitting each request as an individual workitem, we can do coalescing in a similar way we're handlig execlist submission ports. We also need to stop pretending that we're doing "lite-restore" in GuC submission (we would create a workitem each time we hit this condition). This allows us to completely remove the reservation, replacing it with a compile time check. v2: Also coalesce when replaying on reset (Daniele) v3: Consistent wq_resv - per-request (Daniele) v4: Squash removing wq_resv v5: Reflect i915_guc_submit argument changes in doc v6: Rebase on top of execlists reset/restart fix (Chris,Michał) References: https://bugs.freedesktop.org/show_bug.cgi?id=101873 Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Jeff McGee <jeff.mcgee@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Oscar Mateo <oscar.mateo@intel.com> Signed-off-by: NMichał Winiarski <michal.winiarski@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20170914083216.10192-2-michal.winiarski@intel.comSigned-off-by: NChris Wilson <chris@chris-wilson.co.uk>
-
由 Chris Wilson 提交于
Given the mechanism to unwind and replay requests (designed to support preemption), we have an alternative to the current method of resubmitting the ELSP upon reset. Resubmitting ELSP turns out to be more complicated than expected, due to having to handle lost context-switch interrupts and so guessing what ELSP we need to resubmit later. Instead, by unwinding the requests and clearing the ELSP tracking entirely, we can then just dequeue the first pair of ready requests after resetting, using the normal submission procedure. Currently, the unwound requests have maximum priority and so are guaranteed to be resubmitted upon resume. If we are lucky, we may be able to coalesce a new request on top! Suggested-by: NMichał Winiarski <michal.winiarski@intel.com> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170916204414.32762-4-chris@chris-wilson.co.ukReviewed-by: NMichał Winiarski <michal.winiarski@intel.com>
-
由 Chris Wilson 提交于
In the next patch we will want to reinsert a request not at the end of the priority queue, but at the front. Here we split insert_request() into two, the first function retrieves the priority list (for reuse for unsubmit later) and a wrapper function to insert at the end of that list and to schedule the tasklet if we were first. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMichał Winiarski <michal.winiarski@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170916204414.32762-3-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Move insert_request() earlier to avoid a forward declaration in a later patch. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMichał Winiarski <michal.winiarski@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170916204414.32762-2-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
During a reset, we may skip over completed requests and lost context-switch interrupts. Following the reset, we may then may end up with no active requests in the ELSP (and so do not resubmit to restart the engine), but have a queue of requests ready for execution. This is unlikely, it requires the last request to complete after the hang is detected, but not impossible. The outcome of this is that the engine stalls, possibly leading to full ring and indefinite wait under struct_mutex, eventually leading to a full driver hang. Alternatively, we can solve this by unsubmitting the incomplete requests and just kickstarting the tasklet. Michał has patches for that, which I initially disliked due to the extra complexity, but the complexity of this "simple" restart is growing... Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170916204414.32762-1-chris@chris-wilson.co.ukReviewed-by: NMichał Winiarski <michal.winiarski@intel.com>
-
由 Chris Wilson 提交于
When wedging the hw, we want to mark all in-flight requests as -EIO. This is made slightly more complex by execlists who store the ready but not yet submitted-to-hw requests on a private queue (an rbtree priolist). Call into execlists to cancel not only the ELSP tracking for the submitted requests, but also the queue of unsubmitted requests. v2: Move the majority of engine_set_wedged to the backends (both legacy ringbuffer and execlists handling their own lists). Reported-by: NMichał Winiarski <michal.winiarski@intel.com> Testcase: igt/gem_eio/in-flight-contexts Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170915173100.26470-1-chris@chris-wilson.co.ukReviewed-by: NMichał Winiarski <michal.winiarski@intel.com>
-
- 14 9月, 2017 2 次提交
-
-
由 Chris Wilson 提交于
The engine also provides a mirror of the CSB write pointer in the HWSP, but not of our read pointer. To take advantage of this we need to remember where we read up to on the last interrupt and continue off from there. This poses a problem following a reset, as we don't know where the hw will start writing from, and due to the use of power contexts we cannot perform that query during the reset itself. So we continue the current modus operandi of delaying the first read of the context-status read/write pointers until after the first interrupt. With this we should now have eliminated all uncached mmio reads in handling the context-status interrupt, though we still have the uncached mmio writes for submitting new work, and many uncached mmio reads in the global interrupt handler itself. Still a step in the right direction towards reducing our resubmit latency, although it appears lost in the noise! v2: Cannonlake moved the CSB write index v3: Include the sw/hwsp state in debugfs/i915_engine_info v4: Also revert to using CSB mmio for GVT-g v5: Prevent the compiler reloading tail (Mika) Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Michel Thierry <michel.thierry@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Cc: Zhi Wang <zhi.a.wang@intel.com> Acked-by: NMichel Thierry <michel.thierry@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170913085605.18299-6-chris@chris-wilson.co.ukReviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>
-
由 Chris Wilson 提交于
The engine provides a mirror of the CSB in the HWSP. If we use the cacheable reads from the HWSP, we can shave off a few mmio reads per context-switch interrupt (which are quite frequent!). Just removing a couple of mmio is not enough to actually reduce any latency, but a small reduction in overall cpu usage. Much appreciation for Ben dropping the bombshell that the CSB was in the HWSP and for Michel in digging out the details. v2: Don't be lazy, add the defines for the indices. v3: Include the HWSP in debugfs/i915_engine_info v4: Check for GVT-g, it currently depends on intercepting CSB mmio v5: Fixup GVT-g mmio path v6: Disable HWSP if VT-d is active as the iommu adds unpredictable memory latency. (Mika) v7: Also markup the CSB read with READ_ONCE() as it may still be an mmio read and we want to stop the compiler from issuing a later (v.slow) reload. Suggested-by: NBen Widawsky <benjamin.widawsky@intel.com> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Michel Thierry <michel.thierry@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Cc: Zhi Wang <zhi.a.wang@intel.com> Acked-by: NMichel Thierry <michel.thierry@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170913133534.26927-1-chris@chris-wilson.co.ukReviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>
-
- 13 9月, 2017 3 次提交
-
-
由 Daniele Ceraolo Spurio 提交于
On gen8+ we're currently using the PPHWSP of the kernel ctx as the global HWSP. However, when the kernel ctx gets submitted (e.g. from __intel_autoenable_gt_powersave) the HW will use that page as both HWSP and PPHWSP. This causes a conflict in the register arena of the HWSP, i.e. dword indices below 0x30. We don't current utilize this arena, but in the following patches we will take advantage of the cached register state for handling execlist's context status interrupt. To avoid the conflict, instead of re-using the PPHWSP of the kernel ctx we can allocate a separate page for the HWSP like what happens for pre-execlists platform. v2: Add a use-case for the register arena of the HWSP. Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Michel Thierry <michel.thierry@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1499357440-34688-1-git-send-email-daniele.ceraolospurio@intel.comSigned-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMichel Thierry <michel.thierry@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170913085605.18299-3-chris@chris-wilson.co.uk
-
由 Michel Thierry 提交于
Not only the context image consist of two parts (the PPHWSP, and the logical context state), but we also allocate a header at the start of for sharing data with GuC. Thus every lrc looks like this: | [guc] | [hwsp] [logical state] | |<- our header ->|<- context image ->| So far, we have oversimplified whenever we use each of these parts of the context, just because the GuC header happens to be in page 0, and the (PP)HWSP is in page 1. But this had led to using the same define for more than one meaning (as a page index in the lrc and as 1 page). This patch adds defines for the GuC shared page, the PPHWSP page and the start of the logical state. It also updated the places where the old define was being used. Since we are not changing the size (or format) of the context, there are no functional changes. v2: Use PPHWSP index for hws again. Suggested-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NMichel Thierry <michel.thierry@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Oscar Mateo <oscar.mateo@intel.com> Cc: intel-gvt-dev@lists.freedesktop.org Link: http://patchwork.freedesktop.org/patch/msgid/20170712193032.27080-1-michel.thierry@intel.comReviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20170913085605.18299-1-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
The context descriptor is stored inside the per-engine context state, as we only need to compute it once and access it frequently. However, currently only intel_lrc.c has easy access, but i915_guc_submission.c would like to frequently read it as well, and more so only ever needs the lower 32bits. Make it an inline as the compiler should be able to retrieve the value in less instructions than it takes to do the function call: add/remove: 0/1 grow/shrink: 1/0 up/down: 8/-45 (-37) function old new delta i915_guc_submit 621 629 +8 intel_lr_context_descriptor 45 - -45 Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20170912214905.21987-1-chris@chris-wilson.co.ukReviewed-by: NOscar Mateo <oscar.mateo@intel.com>
-
- 21 8月, 2017 1 次提交
-
-
由 Chris Wilson 提交于
During a global reset, we disable the irq. As we disable the irq, the hardware may be raising a GT interrupt that we then ignore, leaving it pending in the GTIIR. After the reset, we then re-enable the irq, triggering the pending interrupt. However, that interrupt was for the stale state from before the reset, and the contents of the CSB buffer are now invalid. v2: Add a comment to make it clear that the double clear is purely my paranoia. Reported-by: N"Dong, Chuanxiao" <chuanxiao.dong@intel.com> Fixes: 821ed7df ("drm/i915: Update reset path to fix incomplete requests") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: "Dong, Chuanxiao" <chuanxiao.dong@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Michal Winiarski <michal.winiarski@intel.com> Cc: Michel Thierry <michel.thierry@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170807121919.30165-1-chris@chris-wilson.co.uk Link: https://patchwork.freedesktop.org/patch/msgid/20170818090509.5363-1-chris@chris-wilson.co.ukReviewed-by: NMichel Thierry <michel.thierry@intel.com> (cherry picked from commit 64f09f00) Signed-off-by: NJani Nikula <jani.nikula@intel.com>
-
- 19 8月, 2017 2 次提交
-
-
由 Rodrigo Vivi 提交于
Let's inherit workarounds from previous platforms that according to wa_database and BSpec are still valid for Cannonlake. v2: Add missed workarounds. v3: Rebase v4: Remove bad chunk that was added to rc6 disable. (Ander) Also remove A0 W/a that are not needed anymore. v5: Rebase on top of CFL. v6: Remove empty gen9_init_perctx_bb and gen9_init_indirectctx_bb since they don't carry any gen10 related W/a. (by Oscar). Also Remove A0 exclusive workaround. v7: Remove more A0 exclusive workarounds. As pointed out by Oscar many workarounds were changed to be A0 only so let's remove them. Cc: Oscar Mateo <oscar.mateo@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: NRodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: NOscar Mateo <oscar.mateo@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170815231651.975-1-rodrigo.vivi@intel.com
-
由 Chris Wilson 提交于
During a global reset, we disable the irq. As we disable the irq, the hardware may be raising a GT interrupt that we then ignore, leaving it pending in the GTIIR. After the reset, we then re-enable the irq, triggering the pending interrupt. However, that interrupt was for the stale state from before the reset, and the contents of the CSB buffer are now invalid. v2: Add a comment to make it clear that the double clear is purely my paranoia. Reported-by: N"Dong, Chuanxiao" <chuanxiao.dong@intel.com> Fixes: 821ed7df ("drm/i915: Update reset path to fix incomplete requests") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: "Dong, Chuanxiao" <chuanxiao.dong@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Michal Winiarski <michal.winiarski@intel.com> Cc: Michel Thierry <michel.thierry@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170807121919.30165-1-chris@chris-wilson.co.uk Link: https://patchwork.freedesktop.org/patch/msgid/20170818090509.5363-1-chris@chris-wilson.co.ukReviewed-by: NMichel Thierry <michel.thierry@intel.com>
-
- 27 7月, 2017 1 次提交
-
-
由 Chris Wilson 提交于
When doing a GPU reset, the CSB register will be trashed and we will lose any context-switch notifications that happened since the tasklet was disabled. If we find that all requests on this engine were completed, we want to make sure that the ELSP tracker is similarly empty so that we do not feed back in the completed requests upon recovering from the reset. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170721123238.16428-4-chris@chris-wilson.co.ukSigned-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 21 6月, 2017 1 次提交
-
-
由 Chris Wilson 提交于
Create a substruct to hold all the global context state under drm_i915_private. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170620110547.15947-1-chris@chris-wilson.co.uk
-
- 19 6月, 2017 1 次提交
-
-
由 Chris Wilson 提交于
We need to keep track of the last location we ask the hw to read up to (RING_TAIL) separately from our last write location into the ring, so that in the event of a GPU reset we do not tell the HW to proceed into a partially written request (which can happen if that request is waiting for an external signal before being executed). v2: Refactor intel_ring_reset() (Mika) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100144 Testcase: igt/gem_exec_fence/await-hang Fixes: 821ed7df ("drm/i915: Update reset path to fix incomplete requests") Fixes: d55ac5bf ("drm/i915: Defer transfer onto execution timeline to actual hw submission") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170425130049.26147-1-chris@chris-wilson.co.ukReviewed-by: NMika Kuoppala <mika.kuoppala@intel.com> (cherry picked from commit e6ba9992) Signed-off-by: NJani Nikula <jani.nikula@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170615131129.3061-1-chris@chris-wilson.co.uk
-
- 15 6月, 2017 1 次提交
-
-
由 Robert Bragg 提交于
Enables access to OA unit metrics for BDW, CHV, SKL and BXT which all share (more-or-less) the same OA unit design. Of particular note in comparison to Haswell: some OA unit HW config state has become per-context state and as a consequence it is somewhat more complicated to manage synchronous state changes from the cpu while there's no guarantee of what context (if any) is currently actively running on the gpu. The periodic sampling frequency which can be particularly useful for system-wide analysis (as opposed to command stream synchronised MI_REPORT_PERF_COUNT commands) is perhaps the most surprising state to have become per-context save and restored (while the OABUFFER destination is still a shared, system-wide resource). This support for gen8+ takes care to consider a number of timing challenges involved in synchronously updating per-context state primarily by programming all config state from the cpu and updating all current and saved contexts synchronously while the OA unit is still disabled. The driver intentionally avoids depending on command streamer programming to update OA state considering the lack of synchronization between the automatic loading of OACTXCONTROL state (that includes the periodic sampling state and enable state) on context restore and the parsing of any general purpose BB the driver can control. I.e. this implementation is careful to avoid the possibility of a context restore temporarily enabling any out-of-date periodic sampling state. In addition to the risk of transiently-out-of-date state being loaded automatically; there are also internal HW latencies involved in the loading of MUX configurations which would be difficult to account for from the command streamer (and we only want to enable the unit when once the MUX configuration is complete). Since the Gen8+ OA unit design no longer supports clock gating the unit off for a single given context (which effectively stopped any progress of counters while any other context was running) and instead supports tagging OA reports with a context ID for filtering on the CPU, it means we can no longer hide the system-wide progress of counters from a non-privileged application only interested in metrics for its own context. Although we could theoretically try and subtract the progress of other contexts before forwarding reports via read() we aren't in a position to filter reports captured via MI_REPORT_PERF_COUNT commands. As a result, for Gen8+, we always require the dev.i915.perf_stream_paranoid to be unset for any access to OA metrics if not root. v5: Drain submitted requests when enabling metric set to ensure no lite-restore erases the context image we just updated (Lionel) v6: In addition to drain, switch to kernel context & update all context in place (Chris) v7: Add missing mutex_unlock() if switching to kernel context fails (Matthew) v8: Simplify OA period/flex-eu-counters programming by using the batchbuffer instead of modifying ctx-image (Lionel) v9: Back to updating the context image (due to erroneous testing, batchbuffer programming the OA unit doesn't actually work) (Lionel) Pin context before updating context image (Chris) Drop MMIO programming now that we switch to a kernel context with right values in initial context image (Chris) v10: Just pin_map the contexts we want to modify or let the configuration happen on first use (Chris) v11: Update kernel context OA config through the batchbuffer rather than on the fly ctx-image update (Lionel) v12: Rework OA context registers update again by swithing away from user contexts and reconfiguring the kernel context through the batchbuffer and updating all the other contexts' context image. Also take care to lock slice/subslice configuration when OA is on. (Lionel) v13: Request rpcs updates on all engine when updating the OA config (Lionel) v14: Drop any kind of rpcs management now that we monitor sseu configuration changes in a later patch (Lionel) Remove usleep after programming the NOA configs on Gen8+, this doesn't seem to be needed (Lionel) v15: Respect coding style for block comments (Chris) v16: Add missing i915_add_request() in case we fail to emit OA configuration (Matthew) Signed-off-by: NRobert Bragg <robert@sixbynine.org> Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> \o/ Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
-
- 07 6月, 2017 1 次提交
-
-
由 Michel Thierry 提交于
Indirect Context Offset Pointer has changed for Cannonlake. INDIRECT_CTX_OFFSET[15:6] valid value for CNL is 19h per Spec. v2: rebased to intel_lr_indirect_ctx_offset v3: Commit message added per Tvrtko request. Signed-off-by: NMichel Thierry <michel.thierry@intel.com> Signed-off-by: NRodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1496781040-20888-9-git-send-email-rodrigo.vivi@intel.com
-
- 19 5月, 2017 1 次提交
-
-
由 Chuanxiao Dong 提交于
During execlist_context_deferred_alloc() we presumed that the context is uninitialised (we only just allocated the state object for it!) and chose to optimise away the later call to engine->init_context() if engine->init_context were NULL. This breaks with GVT's contexts that are marked as pre-initialised to avoid us annoyingly calling engine->init_context(). The fix is to not override ce->initialised if it is already true. Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NChuanxiao Dong <chuanxiao.dong@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1494497262-24855-1-git-send-email-chuanxiao.dong@intel.comReviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> (cherry picked from commit 0d402a24) Signed-off-by: NJani Nikula <jani.nikula@intel.com>
-
- 17 5月, 2017 4 次提交
-
-
由 Chris Wilson 提交于
If we do not require to perform priority bumping, and we haven't yet submitted the request, we can update its priority in situ and skip acquiring the engine locks -- thus avoiding any contention between us and submit/execute. v2: Remove the stack element from the list if we can do the early assignment. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170517121007.27224-10-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
The i915_priolist are allocated within an atomic context on a path where we wish to minimise latency. If we use a dedicated kmem_cache, we have the advantage of a local freelist from which to service new requests that should keep the latency impact of an allocation small. Though currently we expect the majority of requests to be at default priority (and so hit the preallocate priolist), once userspace starts using priorities they are likely to use many fine grained policies improving the utilisation of a private slab. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170517121007.27224-9-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
All the requests at the same priority are executed in FIFO order. They do not need to be stored in the rbtree themselves, as they are a simple list within a level. If we move the requests at one priority into a list, we can then reduce the rbtree to the set of priorities. This should keep the height of the rbtree small, as the number of active priorities can not exceed the number of active requests and should be typically only a few. Currently, we have ~2k possible different priority levels, that may increase to allow even more fine grained selection. Allocating those in advance seems a waste (and may be impossible), so we opt for allocating upon first use, and freeing after its requests are depleted. To avoid the possibility of an allocation failure causing us to lose a request, we preallocate the default priority (0) and bump any request to that priority if we fail to allocate it the appropriate plist. Having a request (that is ready to run, so not leading to corruption) execute out-of-order is better than leaking the request (and its dependency tree) entirely. There should be a benefit to reducing execlists_dequeue() to principally using a simple list (and reducing the frequency of both rbtree iteration and balancing on erase) but for typical workloads, request coalescing should be small enough that we don't notice any change. The main gain is from improving PI calls to schedule, and the explicit list within a level should make request unwinding simpler (we just need to insert at the head of the list rather than the tail and not have to make the rbtree search more complicated). v2: Avoid use-after-free when deleting a depleted priolist v3: Michał found the solution to handling the allocation failure gracefully. If we disable all priority scheduling following the allocation failure, those requests will be executed in fifo and we will ensure that this request and its dependencies are in strict fifo (even when it doesn't realise it is only a single list). Normal scheduling is restored once we know the device is idle, until the next failure! Suggested-by: NMichał Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: NMichał Winiarski <michal.winiarski@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170517121007.27224-8-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
add/remove: 1/1 grow/shrink: 5/4 up/down: 391/-578 (-187) function old new delta execlists_submit_ports 262 471 +209 port_assign.isra - 136 +136 capture 6344 6359 +15 reset_common_ring 438 452 +14 execlists_submit_request 228 238 +10 gen8_init_common_ring 334 341 +7 intel_engine_is_idle 106 105 -1 i915_engine_info 2314 2290 -24 __i915_gem_set_wedged_BKL 485 411 -74 intel_lrc_irq_handler 1789 1604 -185 execlists_update_context 294 - -294 The most important change there is the improve to the intel_lrc_irq_handler and excclist_submit_ports (net improvement since execlists_update_context is now inlined). v2: Use the port_api() for guc as well (even though currently we do not pack any counters in there, yet) and hide all port->request_count inside the helpers. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170517121007.27224-5-chris@chris-wilson.co.uk
-
- 12 5月, 2017 1 次提交
-
-
由 Chuanxiao Dong 提交于
During execlist_context_deferred_alloc() we presumed that the context is uninitialised (we only just allocated the state object for it!) and chose to optimise away the later call to engine->init_context() if engine->init_context were NULL. This breaks with GVT's contexts that are marked as pre-initialised to avoid us annoyingly calling engine->init_context(). The fix is to not override ce->initialised if it is already true. Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NChuanxiao Dong <chuanxiao.dong@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1494497262-24855-1-git-send-email-chuanxiao.dong@intel.comReviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
-