- 06 3月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
In the next patch, we are introducing a broad virtual engine to encompass multiple physical engines, losing the 1:1 nature of BIT(engine->id). To reflect the broader set of engines implied by the virtual instance, lets store the full bitmask. v2: Use intel_engine_mask_t (s/ring_mask/engine_mask/) v3: Tvrtko voted for moah churn so teach everyone to not mention ring and use $class$instance throughout. v4: Comment upon the disparity in bspec for using VCS1,VCS2 in gen8 and VCS[0-4] in later gen. We opt to keep the code consistent and use 0-index naming throughout. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190305180332.30900-1-chris@chris-wilson.co.uk
-
- 05 3月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
We no longer use the semaphore sync registers on gen6/7, so including them in the GPU error state is mere noise. References: 6faf5916 ("drm/i915: Remove HW semaphores for gen7 inter-engine synchronisation") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190305150914.11340-2-chris@chris-wilson.co.uk
-
- 26 2月, 2019 2 次提交
-
-
由 Chris Wilson 提交于
Having weaned the interrupt handling off using a single global execution queue, we no longer need to emit a global_seqno. Note that we still have a few assumptions about execution order along engine timelines, but this removes the most obvious artefact! Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190226094922.31617-3-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Stop accessing the HWSP to read the global seqno, and stop tracking the mirror in the engine's execution timeline -- it is unused. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190226094922.31617-2-chris@chris-wilson.co.uk
-
- 19 2月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
Currently, we accumulate each time a context hangs the GPU, offset against the number of requests it submits, and if that score exceeds a certain threshold, we ban that context from submitting any more requests (cancelling any work in flight). In contrast, we use a simple timer on the file, that if we see more than a 9 hangs faster than 60s apart in total across all of its contexts, we will ban the client from creating any more contexts. This leads to a confusing situation where the file may be banned before the context, so lets use a simple timer scheme for each. If the context submits 3 hanging requests within a 120s period, declare it forbidden to ever send more requests. This has the advantage of not being easy to repair by simply sending empty requests, but has the disadvantage that if the context is idle then it is forgiven. However, if the context is idle, it is not disrupting the system, but a hog can evade the request counting and cause much more severe disruption to the system. Updating ban_score from request retirement is dubious as the retirement is purposely not in sync with request submission (i.e. we try and batch retirement to reduce overhead and avoid latency on submission), which leads to surprising situations where we can forgive a hang immediately due to a backlog of requests from before the hang being retired afterwards. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190219122215.8941-2-chris@chris-wilson.co.uk
-
- 06 2月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
Looking forward, we need to break the struct_mutex dependency on i915_gem_active. In the meantime, external use of i915_gem_active is quite beguiling, little do new users suspect that it implies a barrier as each request it tracks must be ordered wrt the previous one. As one of many, it can be used to track activity across multiple timelines, a shared fence, which fits our unordered request submission much better. We need to steer external users away from the singular, exclusive fence imposed by i915_gem_active to i915_active instead. As part of that process, we move i915_gem_active out of i915_request.c into i915_active.c to start separating the two concepts, and rename it to i915_active_request (both to tie it to the concept of tracking just one request, and to give it a longer, less appealing name). Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190205130005.2807-5-chris@chris-wilson.co.uk
-
- 30 1月, 2019 2 次提交
-
-
由 Chris Wilson 提交于
Missed breadcrumb detection is defunct due to the tight coupling with dma_fence signaling and the myriad ways we may signal fences from everywhere but from an interrupt, i.e. we frequently signal a fence before we even see its interrupt. This means that even if we miss an interrupt for a fence, it still is signaled before our breadcrumb hangcheck fires, so simplify the breadcrumb hangchecking by moving it into the GPU hangcheck and forgo fake interrupts. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190129205230.19056-3-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
A few years ago, see commit 688e6c72 ("drm/i915: Slaughter the thundering i915_wait_request herd"), the issue of handling multiple clients waiting in parallel was brought to our attention. The requirement was that every client should be woken immediately upon its request being signaled, without incurring any cpu overhead. To handle certain fragility of our hw meant that we could not do a simple check inside the irq handler (some generations required almost unbounded delays before we could be sure of seqno coherency) and so request completion checking required delegation. Before commit 688e6c72, the solution was simple. Every client waiting on a request would be woken on every interrupt and each would do a heavyweight check to see if their request was complete. Commit 688e6c72 introduced an rbtree so that only the earliest waiter on the global timeline would woken, and would wake the next and so on. (Along with various complications to handle requests being reordered along the global timeline, and also a requirement for kthread to provide a delegate for fence signaling that had no process context.) The global rbtree depends on knowing the execution timeline (and global seqno). Without knowing that order, we must instead check all contexts queued to the HW to see which may have advanced. We trim that list by only checking queued contexts that are being waited on, but still we keep a list of all active contexts and their active signalers that we inspect from inside the irq handler. By moving the waiters onto the fence signal list, we can combine the client wakeup with the dma_fence signaling (a dramatic reduction in complexity, but does require the HW being coherent, the seqno must be visible from the cpu before the interrupt is raised - we keep a timer backup just in case). Having previously fixed all the issues with irq-seqno serialisation (by inserting delays onto the GPU after each request instead of random delays on the CPU after each interrupt), we can rely on the seqno state to perfom direct wakeups from the interrupt handler. This allows us to preserve our single context switch behaviour of the current routine, with the only downside that we lose the RT priority sorting of wakeups. In general, direct wakeup latency of multiple clients is about the same (about 10% better in most cases) with a reduction in total CPU time spent in the waiter (about 20-50% depending on gen). Average herd behaviour is improved, but at the cost of not delegating wakeups on task_prio. v2: Capture fence signaling state for error state and add comments to warm even the most cold of hearts. v3: Check if the request is still active before busywaiting v4: Reduce the amount of pointer misdirection with list_for_each_safe and using a local i915_request variable inside the loops v5: Add a missing pluralisation to a purely informative selftest message. References: 688e6c72 ("drm/i915: Slaughter the thundering i915_wait_request herd") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190129205230.19056-2-chris@chris-wilson.co.uk
-
- 29 1月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
Our goal is to remove struct_mutex and replace it with fine grained locking. One of the thorny issues is our eviction logic for reclaiming space for an execbuffer (or GTT mmaping, among a few other examples). While eviction itself is easy to move under a per-VM mutex, performing the activity tracking is less agreeable. One solution is not to do any MRU tracking and do a simple coarse evaluation during eviction of active/inactive, with a loose temporal ordering of last insertion/evaluation. That keeps all the locking constrained to when we are manipulating the VM itself, neatly avoiding the tricky handling of possible recursive locking during execbuf and elsewhere. Note that discarding the MRU (currently implemented as a pair of lists, to avoid scanning the active list for a NONBLOCKING search) is unlikely to impact upon our efficiency to reclaim VM space (where we think a LRU model is best) as our current strategy is to use random idle replacement first before doing a search, and over time the use of softpinned 48b per-ppGTT is growing (thereby eliminating any need to perform any eviction searches, in theory at least) with the remaining users being found on much older devices (gen2-gen6). v2: Changelog and commentary rewritten to elaborate on the duality of a single list being both an inactive and active list. v3: Consolidate bool parameters into a single set of flags; don't comment on the duality of a single variable being a multiplicity of bits. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190128102356.15037-1-chris@chris-wilson.co.uk
-
- 25 1月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
Now that the submission backends are controlled via their own spinlocks, with a wave of a magic wand we can lift the struct_mutex requirement around GPU reset. That is we allow the submission frontend (userspace) to keep on submitting while we process the GPU reset as we can suspend the backend independently. The major change is around the backoff/handoff strategy for performing the reset. With no mutex deadlock, we no longer have to coordinate with any waiter, and just perform the reset immediately. Testcase: igt/gem_mmap_gtt/hang # regresses Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190125132230.22221-3-chris@chris-wilson.co.uk
-
- 17 1月, 2019 1 次提交
-
-
由 Jani Nikula 提交于
Mixed C99 and kernel types use is getting ugly. Prefer kernel types. sed -i 's/\buint\(8\|16\|32\|64\)_t\b/u\1/g' Minor checkpatch fixes sprinkled on top of the changed lines. Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: NJosé Roberto de Souza <jose.souza@intel.com> Signed-off-by: NJani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/14ed72e7f04c9340a057855c5950b54811f8a477.1547629303.git.jani.nikula@intel.com
-
- 10 1月, 2019 2 次提交
-
-
由 Chris Wilson 提交于
If we find an incompletely setup vma inside the request/engine at the time of a hang, it may not have vma->pages initialised, so skip capturing the object before we iterate over NULL. Spotted by Matthew in preparation for using unpinned vma to track engine state. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: NMatthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190110111522.11023-1-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Although commit fb6f0b64 ("drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture") applied cleanly after a 24 month hiatus, the code had moved on with new methods for peeking and fetching the captured gpu info. Make sure we catch all uses of the stashed error state and avoid dereferencing the error pointer. v2: Move error pointer determination into i915_gpu_capture_state v3: Restore early check to avoid capturing and then throwing away subsequent GPU error states. Fixes: fb6f0b64 ("drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181207110554.19897-1-chris@chris-wilson.co.uk (cherry picked from commit e6154e4c) Signed-off-by: NJani Nikula <jani.nikula@intel.com>
-
- 03 1月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
As the question of 32b/64b kernels became relevant in the light of certain bugs, include that information in the error state. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190103101245.15100-1-chris@chris-wilson.co.uk
-
- 02 1月, 2019 1 次提交
-
-
由 Jani Nikula 提交于
First move the low hanging fruit, the fields that are only initialized runtime. Use RUNTIME_INFO() exclusively to access the fields. Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: NJani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/c24fe7a4b0492a888690c46814c0ff21ce2f12b1.1546267488.git.jani.nikula@intel.com
-
- 31 12月, 2018 2 次提交
-
-
由 Jani Nikula 提交于
Abstract the one user in anticipation of more. Set the dangling pointers to NULL while at it. Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: NJani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/8637d1e5049dc003718772f19d664aeaf9540856.1545920737.git.jani.nikula@intel.com
-
由 Jani Nikula 提交于
Abstract the one user in anticipation of more. Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NJani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/c6a94b4da8dc723df025b1f602fe46d76d00d53f.1545920737.git.jani.nikula@intel.com
-
- 13 12月, 2018 2 次提交
-
-
由 Lucas De Marchi 提交于
Instead of using IS_GEN() for consecutive gen checks, let's pass the range to IS_GEN_RANGE(). By code inspection these were the ranges deemed necessary for spatch: @@ expression e; @@ ( - IS_GEN(e, 3) || IS_GEN(e, 2) + IS_GEN_RANGE(e, 2, 3) | - IS_GEN(e, 3) || IS_GEN(e, 4) + IS_GEN_RANGE(e, 3, 4) | - IS_GEN(e, 5) || IS_GEN(e, 6) + IS_GEN_RANGE(e, 5, 6) | - IS_GEN(e, 6) || IS_GEN(e, 7) + IS_GEN_RANGE(e, 6, 7) | - IS_GEN(e, 7) || IS_GEN(e, 8) + IS_GEN_RANGE(e, 7, 8) | - IS_GEN(e, 8) || IS_GEN(e, 9) + IS_GEN_RANGE(e, 8, 9) | - IS_GEN(e, 10) || IS_GEN(e, 9) + IS_GEN_RANGE(e, 9, 10) | - IS_GEN(e, 9) || IS_GEN(e, 10) + IS_GEN_RANGE(e, 9, 10) ) After conversion, checking we don't have any missing IS_GEN_RANGE() || IS_GEN() was also done. Signed-off-by: NLucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: NJani Nikula <jani.nikula@intel.com> Signed-off-by: NRodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181212181044.15886-3-lucas.demarchi@intel.com
-
由 Lucas De Marchi 提交于
Define IS_GEN() similarly to our IS_GEN_RANGE(). but use gen instead of gen_mask to do the comparison. Now callers can pass then gen as a parameter, so we don't require one macro for each gen. The following spatch was used to convert the users of these macros: @@ expression e; @@ ( - IS_GEN2(e) + IS_GEN(e, 2) | - IS_GEN3(e) + IS_GEN(e, 3) | - IS_GEN4(e) + IS_GEN(e, 4) | - IS_GEN5(e) + IS_GEN(e, 5) | - IS_GEN6(e) + IS_GEN(e, 6) | - IS_GEN7(e) + IS_GEN(e, 7) | - IS_GEN8(e) + IS_GEN(e, 8) | - IS_GEN9(e) + IS_GEN(e, 9) | - IS_GEN10(e) + IS_GEN(e, 10) | - IS_GEN11(e) + IS_GEN(e, 11) ) v2: use IS_GEN rather than GT_GEN and compare to info.gen rather than using the bitmask Signed-off-by: NLucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: NJani Nikula <jani.nikula@intel.com> Signed-off-by: NRodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181212181044.15886-2-lucas.demarchi@intel.com
-
- 12 12月, 2018 1 次提交
-
-
由 Chris Wilson 提交于
Currently we allocate a scratch page for each engine, but since we only ever write into it for post-sync operations, it is not exposed to userspace nor do we care for coherency. As we then do not care about its contents, we can use one page for all, reducing our allocations and avoid complications by not assuming per-engine isolation. For later use, it simplifies engine initialisation (by removing the allocation that required struct_mutex!) and means that we can always rely on there being a scratch page. v2: Check that we allocated a large enough scratch for I830 w/a Fixes: 06e562e7f515 ("drm/i915/ringbuffer: Delay after EMIT_INVALIDATE for gen4/gen5") # v4.18.20 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108850Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181204141522.13640-1-chris@chris-wilson.co.uk Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: <stable@vger.kernel.org> # v4.18.20+ (cherry picked from commit 51797499) [Joonas: Use new function in gen9_init_indirectctx_bb too] Signed-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
- 07 12月, 2018 1 次提交
-
-
由 Chris Wilson 提交于
Although commit fb6f0b64 ("drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture") applied cleanly after a 24 month hiatus, the code had moved on with new methods for peeking and fetching the captured gpu info. Make sure we catch all uses of the stashed error state and avoid dereferencing the error pointer. v2: Move error pointer determination into i915_gpu_capture_state v3: Restore early check to avoid capturing and then throwing away subsequent GPU error states. Fixes: fb6f0b64 ("drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181207110554.19897-1-chris@chris-wilson.co.uk
-
- 04 12月, 2018 1 次提交
-
-
由 Chris Wilson 提交于
Currently we allocate a scratch page for each engine, but since we only ever write into it for post-sync operations, it is not exposed to userspace nor do we care for coherency. As we then do not care about its contents, we can use one page for all, reducing our allocations and avoid complications by not assuming per-engine isolation. For later use, it simplifies engine initialisation (by removing the allocation that required struct_mutex!) and means that we can always rely on there being a scratch page. v2: Check that we allocated a large enough scratch for I830 w/a Fixes: 06e562e7f515 ("drm/i915/ringbuffer: Delay after EMIT_INVALIDATE for gen4/gen5") # v4.18.20 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108850Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181204141522.13640-1-chris@chris-wilson.co.uk Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: <stable@vger.kernel.org> # v4.18.20+
-
- 23 11月, 2018 1 次提交
-
-
由 Chris Wilson 提交于
Currently, we convert the error state into a string every time we read from sysfs (and sysfs reads in page size (4KiB) chunks). We do try to window the string and only capture the portion that is being read, but that means that we must always convert up to the window to find the start. For a very large error state bordering on EXEC_OBJECT_CAPTURE abuse, this is noticeable as it degrades to O(N^2)! As we do not have a convenient hook for sysfs open(), and we would like to keep the lazy conversion into a string, do the conversion of the whole string on the first read and keep the string until the error state is freed. v2: Don't double advance simple_read_from_buffer v3: Due to extreme pain of lack of vrealloc, use a scatterlist v4: Keep the forward iterator loosely cached v5: Stylistic improvements to reduce patch size Reported-by: NJason Ekstrand <jason@jlekstrand.net> Testcase: igt/gem_exec_capture/many* Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181123132325.26541-1-chris@chris-wilson.co.uk
-
- 22 11月, 2018 1 次提交
-
-
由 Hans Holmberg 提交于
There is no need to rebuild i915_gpu_error.o when the version string changes as the version is available in init_utsname()->release. Signed-off-by: NHans Holmberg <hans.holmberg@cnexlabs.com> Reviewed-by: NJani Nikula <jani.nikula@intel.com> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181121095423.20760-1-hans.ml.holmberg@owltronix.com
-
- 20 11月, 2018 2 次提交
-
-
由 Chris Wilson 提交于
Since capturing the error state requires fiddling around with the GGTT to read arbitrary buffers and is itself run under stop_machine(), it deadlocks the machine (effectively a hard hang) when run in conjunction with Broxton's VTd workaround to serialize GGTT access. v2: Store the ERR_PTR in first_error so that the error can be reported to the user via sysfs. v3: Mention the quirk in dmesg (using info as per usual) Fixes: 0ef34ad6 ("drm/i915: Serialize GTT/Aperture accesses on BXT") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Jon Bloomfield <jon.bloomfield@intel.com> Cc: John Harrison <john.C.Harrison@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181102161232.17742-5-chris@chris-wilson.co.uk (cherry picked from commit fb6f0b64) Signed-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
由 Chris Wilson 提交于
Since capturing the error state requires fiddling around with the GGTT to read arbitrary buffers and is itself run under stop_machine(), it deadlocks the machine (effectively a hard hang) when run in conjunction with Broxton's VTd workaround to serialize GGTT access. v2: Store the ERR_PTR in first_error so that the error can be reported to the user via sysfs. v3: Mention the quirk in dmesg (using info as per usual) Fixes: 0ef34ad6 ("drm/i915: Serialize GTT/Aperture accesses on BXT") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Jon Bloomfield <jon.bloomfield@intel.com> Cc: John Harrison <john.C.Harrison@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181102161232.17742-5-chris@chris-wilson.co.uk
-
- 03 10月, 2018 3 次提交
-
-
由 Chris Wilson 提交于
The final call to zlib_deflate(Z_FINISH) may require more output space to be allocated and so needs to re-invoked. Failure to do so in the current code leads to incomplete zlib streams (albeit intact due to the use of Z_SYNC_FLUSH) resulting in the occasional short object capture. v2: Check against overrunning our pre-allocated page array v3: Drop Z_SYNC_FLUSH entirely Testcase: igt/i915-error-capture.js Fixes: 0a97015d ("drm/i915: Compress GPU objects in error state") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: <stable@vger.kernel.org> # v4.10+ Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181003082422.23214-1-chris@chris-wilson.co.uk (cherry picked from commit 83bc0f5b) Signed-off-by: NRodrigo Vivi <rodrigo.vivi@intel.com>
-
由 Chris Wilson 提交于
We do not need to continually clear our dedicated PTE for error capture as it will be updated and invalidated to the next object. Only at the end do we wish to be sure that the PTE doesn't point back to any buffer. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181001194447.29910-3-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
The final call to zlib_deflate(Z_FINISH) may require more output space to be allocated and so needs to re-invoked. Failure to do so in the current code leads to incomplete zlib streams (albeit intact due to the use of Z_SYNC_FLUSH) resulting in the occasional short object capture. v2: Check against overrunning our pre-allocated page array v3: Drop Z_SYNC_FLUSH entirely Testcase: igt/i915-error-capture.js Fixes: 0a97015d ("drm/i915: Compress GPU objects in error state") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: <stable@vger.kernel.org> # v4.10+ Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181003082422.23214-1-chris@chris-wilson.co.uk
-
- 27 9月, 2018 1 次提交
-
-
由 Chris Wilson 提交于
Now that we are confident in providing full-ppgtt where supported, remove the ability to override the context isolation. v2: Remove faked aliasing-ppgtt for testing as it no longer is accepted. v3: s/USES/HAS/ to match usage and reject attempts to load the module on old GVT-g setups that do not provide support for full-ppgtt. v4: Insulate ABI ppGTT values from our internal enum (later plans involve moving ppGTT depth out of the enum, thus potentially breaking ABI unless we document the current values). Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Acked-by: NZhi Wang <zhi.a.wang@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180926201222.5643-1-chris@chris-wilson.co.uk
-
- 15 9月, 2018 1 次提交
-
-
由 Chris Wilson 提交于
If we fail to allocate an array for a large number of user requested capture objects, reduce the array size and try to grab at least some of the objects! Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180911115810.8917-3-chris@chris-wilson.co.uk
-
- 30 7月, 2018 1 次提交
-
-
由 Jordan Crouse 提交于
The i915 DRM driver very cleverly used ascii85 encoding for their GPU state file. Move the encode functions to a general header file to support other drivers that might be interested in the same functionality. v4: Make the return value const char * as suggested by Chris Wilson v3: Fix error_puts -> err_puts pointed out by the 01.org bot v2: Update API to be cleaner for the caller as suggested by Chris Wilson Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NJordan Crouse <jcrouse@codeaurora.org> Signed-off-by: NRob Clark <robdclark@gmail.com>
-
- 07 7月, 2018 1 次提交
-
-
由 Chris Wilson 提交于
In the next patch, we will want to be able to use more flexible request timelines that can hop between engines. From the vma pov, we can then not rely on the binding of this request to an engine and so can not ensure that different requests are ordered through a per-engine timeline, and so we must track activity of all timelines. (We track activity on the vma itself to prevent unbinding from HW before the HW has finished accessing it.) v2: Switch to a rbtree for 32b safety (since using u64 as a radixtree index is fraught with aliasing of unsigned longs). v3: s/lookup_active/active_instance/ because we can never agree on names Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180706103947.15919-5-chris@chris-wilson.co.uk
-
- 08 6月, 2018 1 次提交
-
-
由 Chris Wilson 提交于
In order to allow ourselves to use VMA to wrap other entities other than GEM objects, we need to allow for the vma->obj backpointer to be NULL. In most cases, we know we are operating on a GEM object and its vma, but we need the core code (such as i915_vma_pin/insert/bind/unbind) to work regardless of the innards. The remaining eyesore here is vma->obj->cache_level and related (but less of an issue) vma->obj->gt_ro. With a bit of care we should mirror those on the vma itself. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Matthew Auld <matthew.william.auld@gmail.com> Reviewed-by: NMatthew Auld <matthew.william.auld@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180607154047.9171-1-chris@chris-wilson.co.uk
-
- 06 6月, 2018 2 次提交
-
-
由 Chris Wilson 提交于
In the near future, I want to subclass gen6_hw_ppgtt as it contains a few specialised members and I wish to add more. To avoid the ugliness of using ppgtt->base.base, rename the i915_hw_ppgtt base member (i915_address_space) as vm, which is our common shorthand for an i915_address_space local. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Matthew Auld <matthew.william.auld@gmail.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180605153758.18422-1-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
The inactive counter was over the active list, and vice versa. Fortuitously this should not cause a problem in practice as they shared the same array and clamped the number of entries they would write. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Matthew Auld <matthew.william.auld@gmail.com> Reviewed-by: NMatthew Auld <matthew.william.auld@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180605160623.30163-1-chris@chris-wilson.co.uk
-
- 18 5月, 2018 3 次提交
-
-
由 Chris Wilson 提交于
To ease the frequent and ugly pointer dance of &request->gem_context->engine[request->engine->id] during request submission, store that pointer as request->hw_context. One major advantage that we will exploit later is that this decouples the logical context state from the engine itself. v2: Set mock_context->ops so we don't crash and burn in selftests. Cleanups from Tvrtko. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Acked-by: NZhenyu Wang <zhenyuw@linux.intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180517212633.24934-3-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
In the next patch, we want to store the intel_context pointer inside i915_request, as it is frequently access via a convoluted dance when submitting the request to hw. Having two context pointers inside i915_request leads to confusion so first rename the existing i915_gem_context pointer to i915_request.gem_context. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180517212633.24934-1-chris@chris-wilson.co.uk
-
由 Oscar Mateo 提交于
Stop reading some now deprecated interrupt registers in both debugfs and error state. Instead, read the new equivalents in the Gen11 interrupt repartitioning scheme. Note that the equivalent to the PM ISR & IIR cannot be read without affecting the current state of the system, so I've opted for leaving them out. See gen11_reset_one_iir() for more info. v2: else if !!! (Paulo) v3: another else if (Vinay) v4: - Rebased - Renamed patch - Improved the ordering of GENs - Improved the printing of per-GEN info v5: Avoid maybe-unitialized & add comment explaining the lack of PM ISR & IIR Suggested-by: NPaulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: NOscar Mateo <oscar.mateo@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com> Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Reviewed-by: NVinay Belgaumkar <vinay.belgaumkar@intel.com> [Paulo: fix commit message and coding style.] Signed-off-by: NPaulo Zanoni <paulo.r.zanoni@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/1525989595-18220-1-git-send-email-oscar.mateo@intel.com
-
- 03 5月, 2018 1 次提交
-
-
由 Chris Wilson 提交于
We need to move to a more flexible timeline that doesn't assume one fence context per engine, and so allow for a single timeline to be used across a combination of engines. This means that preallocating a fence context per engine is now a hindrance, and so we want to introduce the singular timeline. From the code perspective, this has the notable advantage of clearing up a lot of mirky semantics and some clumsy pointer chasing. By splitting the timeline up into a single entity rather than an array of per-engine timelines, we can realise the goal of the previous patch of tracking the timeline alongside the ring. v2: Tweak wait_for_idle to stop the compiling thinking that ret may be uninitialised. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180502163839.3248-2-chris@chris-wilson.co.uk
-