- 25 Jul 2019, 1 commit
-
-
Committed by Daniele Ceraolo Spurio

We currently track fetch and load status separately, but the two are actually sequential in the uC lifetime (fetch must complete before we can attempt the load!). Unifying the two variables lets us better follow the sequential states and improves our tracking of the uC state. Also, sprinkle some GEM_BUG_ON to make sure we transition correctly between states.

v2: rename states, add the running state (Michal), drop some logs in the fetch path (Michal, Chris)
v3: re-rename states, extend early status check to all helpers (Michal)

Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190725001813.4740-5-daniele.ceraolospurio@intel.com
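A minimal sketch of the unified, sequential status idea above. The enum values, struct, and helper below are illustrative stand-ins (assert() plays the role of GEM_BUG_ON), not the actual i915 definitions.

```c
#include <assert.h>

/* One sequential status variable replaces separate fetch/load flags. */
enum uc_fw_status {
	UC_FW_DISABLED,     /* firmware not wanted */
	UC_FW_SELECTED,     /* blob chosen, fetch not yet attempted */
	UC_FW_AVAILABLE,    /* fetch from the filesystem succeeded */
	UC_FW_TRANSFERRED,  /* DMA to the microcontroller succeeded */
	UC_FW_RUNNING,      /* microcontroller confirmed up (the v2 state) */
	UC_FW_FAIL,
};

struct uc_fw {
	enum uc_fw_status status;
};

/* Load may only be attempted once the fetch has completed. */
static void uc_fw_load(struct uc_fw *fw)
{
	assert(fw->status == UC_FW_AVAILABLE); /* stand-in for GEM_BUG_ON */
	fw->status = UC_FW_TRANSFERRED;
}
```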
-
- 14 Jul 2019, 3 commits
-
-
Committed by Daniele Ceraolo Spurio

With our HW interface logic moving from i915 to gt, and with GuC and HuC being part of the gt HW, it makes sense to use the intel_gt structure instead of i915 as our reference object in GuC/HuC paths.

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190713100016.8026-9-chris@chris-wilson.co.uk
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
-
Committed by Daniele Ceraolo Spurio

Being part of the GT HW, it makes sense to keep the guc/huc structures inside the GT structure. To help with the encapsulation work done by the following patches, both structures are placed inside a new intel_uc container. Although this results in code with ugly nested dereferences (i915->gt.uc.guc...), it saves us the extra work required in moving the structures twice (i915 -> gt -> uc). The following patches will reduce the number of places where we try to access the guc/huc structures directly from i915 and reduce the ugliness.

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190713100016.8026-7-chris@chris-wilson.co.uk
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
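A compile-time sketch of the container layout this commit describes. The nesting and the i915->gt.uc.guc path follow the commit text; the member fields are invented for illustration.

```c
struct intel_guc { int fw_status; };  /* contents invented for the sketch */
struct intel_huc { int fw_status; };

struct intel_uc {                     /* the new container */
	struct intel_guc guc;
	struct intel_huc huc;
};

struct intel_gt {
	struct intel_uc uc;
};

struct drm_i915_private {
	struct intel_gt gt;
};

/* The "ugly nested dereference" the commit mentions: */
static inline struct intel_guc *i915_to_guc(struct drm_i915_private *i915)
{
	return &i915->gt.uc.guc;
}
```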
-
Committed by Daniele Ceraolo Spurio

Both microcontrollers are part of the GT HW and are closely related to GT operations. To keep all the files cleanly together, they've been placed in their own subdir inside the gt/ folder.

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190713100016.8026-6-chris@chris-wilson.co.uk
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
-
- 13 Jul 2019, 1 commit
-
-
Committed by Chris Wilson

Having taken the first step in encapsulating the functionality by moving the related files under gt/, the next step is to start encapsulating by passing around the relevant structs rather than the global drm_i915_private. In this step, we pass intel_gt to intel_reset.c.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190712192953.9187-1-chris@chris-wilson.co.uk
-
- 12 Jul 2019, 1 commit
-
-
Committed by Chris Wilson

drivers/gpu/drm/i915/intel_guc_submission.c:799: warning: Excess function parameter 'ctx' description in 'guc_client_alloc'

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190711162415.2938-1-chris@chris-wilson.co.uk
-
- 11 Jul 2019, 2 commits
-
-
Committed by Daniele Ceraolo Spurio

We originally added support, in some cases partial, for different modes of operation via GuC clients:

- proxy vs direct submission;
- variable engine mask per-client.

We only ever used one flow (all submissions via a single proxy), so the other code paths haven't been exercised and are most likely non-functional. The GuC firmware interface is also in the process of being updated to better fit the i915 flow, and our client abstraction will need to change accordingly (or possibly go away entirely), so these old unused paths can be considered dead and removed.

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Acked-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190710005437.3496-3-daniele.ceraolospurio@intel.com
-
Committed by Chris Wilson

Preemption via GuC submission is not being supported in its current legacy incarnation. The current FW does support a similar preemption flow via H2G, but it is class-based instead of instance-based, which doesn't fit well with the i915 tracking. To fix this, the firmware is being updated to better support our needs with a new flow, so we can safely remove the old code.

v2 (Daniele): resurrect & rebase, reword commit message, remove preempt_context as well

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Acked-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190710005437.3496-2-daniele.ceraolospurio@intel.com
-
- 10 Jul 2019, 1 commit
-
-
Committed by Lionel Landwerlin

We want to set this flag in the next commit on requests containing perf queries, so that the result of the perf query can just be a delta of global counters rather than doing post-processing of the OA buffer.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
[ickle: add basic selftest for nopreempt]
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190709164227.25859-1-chris@chris-wilson.co.uk
-
- 20 Jun 2019, 1 commit
-
-
Committed by Chris Wilson

When using a global seqno, we required a precise stop-the-world event to handle preemption and unwind the global seqno counter. To accomplish this, we would preempt to a special out-of-band context and wait for the machine to report that it was idle. Given an idle machine, we could very precisely see which requests had completed and which we needed to feed back into the run queue.

However, now that we have scrapped the global seqno, we no longer need to precisely unwind the global counter and only track requests by their per-context seqno. This allows us to loosely unwind inflight requests while scheduling a preemption, with the enormous caveat that the requests we put back on the run queue are still _inflight_ (until the preemption request is complete). This makes request tracking much more messy, as at any point we can then see a completed request that we believe is not currently scheduled for execution. We also have to be careful not to rewind RING_TAIL past RING_HEAD on preempting to the running context, and for this we use a semaphore to prevent completion of the request before continuing.

To accomplish this feat, we change how we track requests scheduled to the HW. Instead of appending our requests onto a single list as we submit, we track each submission to ELSP as its own block. Then upon receiving the CS preemption event, we promote the pending block to the inflight block (discarding what was previously being tracked). As normal CS completion events arrive, we then remove stale entries from the inflight tracker.

v2: Be a tinge paranoid and ensure we flush the write into the HWS page for the GPU semaphore to pick it up in a timely fashion.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190620142052.19311-1-chris@chris-wilson.co.uk
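A toy model of the block-based tracking described above: submissions are staged in a pending block and promoted to the inflight block on the CS preemption event. The names, the int stand-ins, and the two-port array are assumptions for illustration; the real code tracks struct i915_request pointers and handles far more edge cases.

```c
#include <string.h>

#define TOY_EXECLIST_PORTS 2

struct toy_execlists {
	int inflight[TOY_EXECLIST_PORTS]; /* block last promoted by a CS event */
	int pending[TOY_EXECLIST_PORTS];  /* block written to ELSP, awaiting ack */
};

/* Submit: stage a new block in pending[]; inflight[] is left untouched. */
static void toy_submit(struct toy_execlists *el, const int *rq, size_t n)
{
	memcpy(el->pending, rq, n * sizeof(*rq));
}

/* CS preemption event: promote pending[] to inflight[], discarding what
 * was previously tracked (those requests were unwound and requeued). */
static void toy_promote_pending(struct toy_execlists *el)
{
	memcpy(el->inflight, el->pending, sizeof(el->inflight));
	memset(el->pending, 0, sizeof(el->pending));
}
```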
-
- 15 Jun 2019, 1 commit
-
-
Committed by Chris Wilson

To continue the onslaught of removing the assumption of a global execution ordering, another casualty is the engine->timeline. Without an actual timeline to track, it is overkill and we can replace it with a much less grand plain list. We still need a list of requests inflight, for the simple purpose of finding inflight requests (for retiring, resetting, preemption etc).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190614164606.15633-3-chris@chris-wilson.co.uk
-
- 12 Jun 2019, 1 commit
-
-
Committed by Tvrtko Ursulin

Only a few call sites remain, and these have been converted to uncore mmio accessors, so the macro can be removed.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190611104548.30545-2-tvrtko.ursulin@linux.intel.com
-
- 07 Jun 2019, 1 commit
-
-
Committed by Tvrtko Ursulin

Remove a couple of dev_priv locals as a consequence.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190607084521.16845-1-tvrtko.ursulin@linux.intel.com
-
- 28 May 2019, 2 commits
-
-
Committed by Chris Wilson

Continuing the theme of separating out the GEM clutter.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-8-chris@chris-wilson.co.uk
-
Committed by Michal Wajdeczko

With newer GuC firmware it is always OK to ask the GuC to update power domain states. Make it an unconditional initialization step.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Spotswood <john.a.spotswood@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: John Spotswood <john.a.spotswood@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190527183613.17076-6-michal.wajdeczko@intel.com
-
- 24 May 2019, 1 commit
-
-
Committed by Michal Wajdeczko

This function just checks our software flag, while 'is_alive' may suggest that we are checking the runtime firmware status.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190522193203.23932-5-michal.wajdeczko@intel.com
-
- 13 May 2019, 1 commit
-
-
Committed by Chris Wilson

Currently there is an underlying assumption that i915_request_unsubmit() is synchronous wrt the GPU -- that is, the request is no longer in flight as we remove it. In the near future that may change, and this may upset our signaling as we can process an interrupt for that request while it is no longer in flight.

    CPU0                                  CPU1
    intel_engine_breadcrumbs_irq
    (queue request completion)
                                          i915_request_cancel_signaling
    ...                                   ...
                                          i915_request_enable_signaling
    dma_fence_signal

Hence in the time it took us to drop the lock to signal the request, a preemption event may have occurred and re-queued the request. In the process, that request would have seen I915_FENCE_FLAG_SIGNAL clear and so reused the rq->signal_link that was in use on CPU0, leading to bad pointer chasing in intel_engine_breadcrumbs_irq.

A related issue was that if someone started listening for a signal on a completed but no longer in-flight request, we missed the opportunity to immediately signal that request.

Furthermore, as intel_contexts may be immediately released during request retirement, in order to be entirely sure that intel_engine_breadcrumbs_irq may no longer dereference the intel_context (ce->signals and ce->signal_link), we must wait for the irq spinlock.

In order to prevent the race, we use a bit in the fence.flags to signal the transfer onto the signal list inside intel_engine_breadcrumbs_irq. For simplicity, we use the DMA_FENCE_FLAG_SIGNALED_BIT as it then quickly signals to any outside observer that the fence is indeed signaled.

v2: Sketch out potential dma-fence API for manual signaling
v3: And the test_and_set_bit()

Fixes: 52c0fdb2 ("drm/i915: Replace global breadcrumbs with per-context interrupt tracking")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190508112452.18942-1-chris@chris-wilson.co.uk
(cherry picked from commit 0152b3b3)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
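A userspace analogue of the v3 test_and_set_bit() fix: atomically claiming DMA_FENCE_FLAG_SIGNALED_BIT picks a single winner between the two racing CPUs in the diagram above. The fence struct and helpers are invented for the sketch; only the flag name comes from the commit.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define DMA_FENCE_FLAG_SIGNALED_BIT 0

struct toy_fence {
	atomic_uint flags;
};

/* Stand-in for the kernel's test_and_set_bit(): atomically set the bit
 * and report whether it was already set. */
static bool toy_test_and_set_bit(unsigned int bit, atomic_uint *flags)
{
	return atomic_fetch_or(flags, 1u << bit) & (1u << bit);
}

static void toy_signal_fence(struct toy_fence *fence)
{
	if (toy_test_and_set_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
		return; /* the other CPU won the race; nothing left to do */
	/* ... transfer onto the signal list and wake any waiters ... */
}
```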
-
- 08 May 2019, 2 commits
-
-
Committed by Chris Wilson

Currently there is an underlying assumption that i915_request_unsubmit() is synchronous wrt the GPU -- that is, the request is no longer in flight as we remove it. In the near future that may change, and this may upset our signaling as we can process an interrupt for that request while it is no longer in flight.

    CPU0                                  CPU1
    intel_engine_breadcrumbs_irq
    (queue request completion)
                                          i915_request_cancel_signaling
    ...                                   ...
                                          i915_request_enable_signaling
    dma_fence_signal

Hence in the time it took us to drop the lock to signal the request, a preemption event may have occurred and re-queued the request. In the process, that request would have seen I915_FENCE_FLAG_SIGNAL clear and so reused the rq->signal_link that was in use on CPU0, leading to bad pointer chasing in intel_engine_breadcrumbs_irq.

A related issue was that if someone started listening for a signal on a completed but no longer in-flight request, we missed the opportunity to immediately signal that request.

Furthermore, as intel_contexts may be immediately released during request retirement, in order to be entirely sure that intel_engine_breadcrumbs_irq may no longer dereference the intel_context (ce->signals and ce->signal_link), we must wait for the irq spinlock.

In order to prevent the race, we use a bit in the fence.flags to signal the transfer onto the signal list inside intel_engine_breadcrumbs_irq. For simplicity, we use the DMA_FENCE_FLAG_SIGNALED_BIT as it then quickly signals to any outside observer that the fence is indeed signaled.

v2: Sketch out potential dma-fence API for manual signaling
v3: And the test_and_set_bit()

Fixes: 52c0fdb2 ("drm/i915: Replace global breadcrumbs with per-context interrupt tracking")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190508112452.18942-1-chris@chris-wilson.co.uk
-
Committed by Chris Wilson

If we couple the scheduler more tightly with the execlists policy, we can apply the preemption policy to the question of whether we need to kick the tasklet at all for this priority bump.

v2: Rephrase it as a core i915 policy and not an execlists foible.
v3: Pull the kick together.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190507122544.12698-1-chris@chris-wilson.co.uk
-
- 03 May 2019, 1 commit
-
-
Committed by Chris Wilson

Tidy up the cleanup sequence by always ensuring that the tasklet is flushed on parking (before we clean up). Parking provides a convenient point to ensure that the backend is truly idle.

v2: Do the full check for idleness before parking, to be sure we flush any residual interrupt.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190503080942.30151-1-chris@chris-wilson.co.uk
-
- 27 Apr 2019, 1 commit
-
-
Committed by Chris Wilson

We switched to a tree of per-engine HW contexts to accommodate the introduction of virtual engines. However, we plan to also support multiple instances of the same engine within the GEM context, defeating our use of the engine as a key for looking up the HW context. Just allocate a logical per-engine instance and always use an index into the ctx->engines[]. Later on, this ctx->engines[] may be replaced by a user-specified map.

v2: Add for_each_gem_engine() helper to iterate within the engines lock
v3: intel_context_create_request() helper
v4: s/unsigned long/unsigned int/ 4 billion engines is quite enough.
v5: Push iterator locking to caller

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190426163336.15906-7-chris@chris-wilson.co.uk
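A sketch of the index-keyed lookup described above, with invented types; the point is that ctx->engines[] is addressed by index, so the same physical engine may legitimately occupy several slots once a user-specified map is allowed.

```c
#include <stddef.h>

struct toy_intel_context { int hw_id; };

struct toy_gem_engines {
	unsigned int num_engines;             /* v4: unsigned int suffices */
	struct toy_intel_context *engines[8]; /* later: user-specified map */
};

/* Lookup by slot index, not by physical engine pointer. */
static struct toy_intel_context *
toy_engines_get(struct toy_gem_engines *e, unsigned int idx)
{
	return idx < e->num_engines ? e->engines[idx] : NULL;
}
```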
-
- 25 Apr 2019, 1 commit
-
-
Committed by Chris Wilson

Start partitioning off the code that talks to the hardware (GT) from the uapi layers and move the device-facing code under gt/.

One casualty is s/intel_ringbuffer.h/intel_engine.h/ with the plan to subdivide that header and body further (and split out the submission code from the ringbuffer and logical context handling).

This patch aims to be simple motion so git can fix up inflight patches with little mess.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190424174839.7141-1-chris@chris-wilson.co.uk
-
- 12 Apr 2019, 1 commit
-
-
Committed by Chris Wilson

Before causing guc and execlists to diverge further (breaking guc in the process), take a copy of the current reset procedure and make it local to the guc submission backend.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190411130515.20716-1-chris@chris-wilson.co.uk
-
- 09 Apr 2019, 2 commits
-
-
Committed by Chris Wilson

Circumvent the dance we currently perform to find the preempt_client and look up its HW context for this engine, as we know we have already pinned the preempt_context on the engine.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190408091728.20207-15-chris@chris-wilson.co.uk
-
Committed by Chris Wilson

Replace the WARN with a simple if() + error message to squelch the sparse warning that the entire wait_for() macro was being stringified:

drivers/gpu/drm/i915/intel_guc_submission.c:658:9: error: too long token expansion

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190408091728.20207-2-chris@chris-wilson.co.uk
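The shape of the fix, sketched with invented helpers: the condition moves into a plain if(), so only a short, fixed format string is emitted rather than a stringification of the whole expanded wait_for() body.

```c
#include <stdbool.h>
#include <stdio.h>

static bool doorbell_idle; /* toy state polled by the wait below */

/* Toy stand-in for i915's wait_for(): 0 on success, nonzero on timeout. */
static int toy_wait_for_idle(unsigned int timeout_ms)
{
	(void)timeout_ms;
	return doorbell_idle ? 0 : -1;
}

static void toy_disable_doorbell(unsigned int id)
{
	/* Before (sketch): WARN(wait_for(...), ...) stringified the whole
	 * expanded macro for its message, tripping sparse's limit. */
	if (toy_wait_for_idle(10))
		fprintf(stderr, "Doorbell %u did not go idle\n", id);
}
```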
-
- 03 Apr 2019, 1 commit
-
-
Committed by Chris Wilson

Use the engine->flags to store whether we want to kick the submission tasklet on receipt of a breadcrumb interrupt, so that this decision can be made by the submission backend rather than depending on a limited feature test within the interrupt handler. This should make it easier to adapt to different submission backends.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Xiaolin Zhang <xiaolin.zhang@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190329154912.13781-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
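A sketch of the flag-based decision, with a hypothetical flag name: the backend sets a bit in engine->flags when it wants the breadcrumb interrupt to kick its tasklet, and the interrupt handler merely tests it.

```c
#include <stdbool.h>

#define TOY_ENGINE_NEEDS_BREADCRUMB_TASKLET (1u << 0)

struct toy_engine {
	unsigned int flags;
};

/* Set by the submission backend at engine setup... */
static void toy_backend_setup(struct toy_engine *engine)
{
	engine->flags |= TOY_ENGINE_NEEDS_BREADCRUMB_TASKLET;
}

/* ...tested by the interrupt handler, with no backend feature checks. */
static bool toy_irq_should_kick_tasklet(const struct toy_engine *engine)
{
	return engine->flags & TOY_ENGINE_NEEDS_BREADCRUMB_TASKLET;
}
```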
-
- 19 Mar 2019, 1 commit
-
-
Committed by Chris Wilson

If we use the STORE_DATA_INDEX function we can use a fixed offset and avoid having to look up the engine HWS address. A step closer to being able to emit the final breadcrumb during request_add rather than later in the submission interrupt handler.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190318095204.9913-9-chris@chris-wilson.co.uk
-
- 08 Mar 2019, 1 commit
-
-
Committed by Chris Wilson

In preparation for an ever-growing number of engines, and so an ever-increasing static array of HW contexts within the GEM context, move the array over to an rbtree, allocated upon first use.

Unfortunately, this imposes an rbtree lookup at a few frequent callsites, but we should be able to mitigate those by moving over to using the HW context as our primary type, and so only incur the lookup on the boundary with the user GEM context and engines.

v2: Check for no HW context in guc_stage_desc_init

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190308132522.21573-4-chris@chris-wilson.co.uk
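A sketch of an engine-keyed rbtree lookup using the kernel's <linux/rbtree.h> API; the context struct and the pointer-comparison key are illustrative assumptions, not the i915 code.

```c
#include <linux/rbtree.h>

struct toy_engine;

struct toy_hw_context {
	struct rb_node node;
	struct toy_engine *engine; /* the lookup key */
};

static struct toy_hw_context *
toy_context_lookup(struct rb_root *root, struct toy_engine *engine)
{
	struct rb_node *p = root->rb_node;

	while (p) {
		struct toy_hw_context *ce =
			rb_entry(p, struct toy_hw_context, node);

		if (ce->engine < engine)
			p = p->rb_right;
		else if (ce->engine > engine)
			p = p->rb_left;
		else
			return ce;
	}

	return NULL; /* tree entries are allocated upon first use */
}
```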
-
- 06 Mar 2019, 1 commit
-
-
Committed by Chris Wilson

In the next patch, we are introducing a broad virtual engine to encompass multiple physical engines, losing the 1:1 nature of BIT(engine->id). To reflect the broader set of engines implied by the virtual instance, let's store the full bitmask.

v2: Use intel_engine_mask_t (s/ring_mask/engine_mask/)
v3: Tvrtko voted for moah churn so teach everyone to not mention ring and use $class$instance throughout.
v4: Comment upon the disparity in bspec for using VCS1,VCS2 in gen8 and VCS[0-4] in later gen. We opt to keep the code consistent and use 0-index naming throughout.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190305180332.30900-1-chris@chris-wilson.co.uk
-
- 01 Mar 2019, 1 commit
-
-
Committed by Chris Wilson

WAIT is occasionally suppressed by virtue of preempted requests being promoted to NEWCLIENT if they have not already received that boost. Make this consistent for all WAIT boosts: they are not allowed to preempt executing contexts and are merely granted the right to be at the front of the queue for the next execution slot. This is in keeping with the desire that the WAIT boost be a minor tweak that does not give excessive promotion to its user and open ourselves to trivial abuse.

The problem with the inconsistent WAIT preemption becomes more apparent as the preemption is propagated across the engines, where one engine may preempt and the other not, and we would be relying on the exact execution order being consistent across engines (e.g. using HW semaphores to coordinate parallel execution).

v2: Also protect GuC submission from false preemption loops.
v3: Build bug safeguards and better debug messages for st.
v4: Do the priority bumping in unsubmit (i.e. on preemption/reset unwind); applying it earlier during submit causes out-of-order execution combined with execute fences.
v5: Call sw_fence_fini for our dummy request (Matthew)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190228220639.3173-1-chris@chris-wilson.co.uk
-
- 28 Feb 2019, 1 commit
-
-
Committed by Chris Wilson

As kmem_caches share the same properties (size, allocation/free behaviour) for all potential devices, we can use global caches. While this potentially has worse fragmentation behaviour (one can argue that different devices would have different activity lifetimes, but you can also argue that activity is temporal across the system), it is the default behaviour of the system at large to amalgamate matching caches.

The benefit for us is much reduced pointer dancing along the frequent allocation paths.

v2: Defer shrinking until after a global grace period, for future-proofing multiple consumers of the slab caches, similar to the current strategy for avoiding shrinking too early.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190228102035.5857-1-chris@chris-wilson.co.uk
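A sketch of the global-cache pattern, assuming an invented request struct and module names: one kmem_cache for the whole module instead of one per device, destroyed only after an RCU grace period (mirroring the v2 caveat).

```c
#include <linux/module.h>
#include <linux/rcupdate.h>
#include <linux/slab.h>

struct toy_request {
	u64 seqno;
};

/* One cache shared by every device instance, created once at module init. */
static struct kmem_cache *toy_request_cache;

static int __init toy_globals_init(void)
{
	toy_request_cache = KMEM_CACHE(toy_request, SLAB_HWCACHE_ALIGN);
	return toy_request_cache ? 0 : -ENOMEM;
}

static void __exit toy_globals_exit(void)
{
	rcu_barrier(); /* let any deferred frees drain before destroying */
	kmem_cache_destroy(toy_request_cache);
}
```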
-
- 26 Feb 2019, 1 commit
-
-
Committed by Chris Wilson

Having weaned the interrupt handling off using a single global execution queue, we no longer need to emit a global_seqno. Note that we still have a few assumptions about execution order along engine timelines, but this removes the most obvious artefact!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190226094922.31617-3-chris@chris-wilson.co.uk
-
- 30 Jan 2019, 2 commits
-
-
Committed by Chris Wilson

In order to avoid preempting ourselves, we currently refuse to schedule the tasklet if we reschedule an inflight context. However, this glosses over a few issues, such as what happens after a CS completion event when we then preempt the newly executing context with itself, or if something else causes a tasklet_schedule triggering the same evaluation to preempt the active context with itself.

However, when we avoid preempting ELSP[0], we still retain the preemption value, as it may match a second preemption request within the same time period that we need to resolve after the next CS event. However, since we only store the maximum preemption priority seen, it may not match the subsequent event, and so we should double-check whether or not we actually do need to trigger a preempt-to-idle by comparing the top priorities from each queue. Later, this gives us a hook for finer control over deciding whether the preempt-to-idle is justified.

The sequence of events where we end up preempting to no avail is:

1. Queue requests/contexts A, B
2. Priority boost A; no preemption as it is executing, but keep hint
3. After CS switch, B is less than hint, force preempt-to-idle
4. Resubmit B after idling

v2: We can simplify a bunch of tests based on the knowledge that PI will ensure that earlier requests along the same context will have the highest priority.
v3: Demonstrate the stale preemption hint with a selftest

References: a2bf92e8 ("drm/i915/execlists: Avoid kicking priority on the current context")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190129185452.20989-4-chris@chris-wilson.co.uk
-
Committed by Chris Wilson

After noticing that we trigger preemption events for currently executing requests, as well as requests that complete before the preemption, and attempting to suppress those preemption events, it is wise to not consider the queue_priority to be authoritative. As we only track the maximum priority seen between dequeue passes, if the maximum-priority request is no longer available for dequeuing (it completed or is even executing on another engine), we have no knowledge of the previous queue_priority, as that would require us to keep a full history of enqueued requests -- but we already have that history in the priolists!

Rename queue_priority to queue_priority_hint so that we do not confuse it as being exactly the maximum priority in the queue, but merely an indication that we have seen a new maximum priority value and as such we should check whether it should preempt the currently running request.

v2: s/preempt_priority_hint/queue_priority_hint/ as preempt implies it being only used for the singular task of preemption and not the wider question of waking up due to a change in the queue.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190129185452.20989-3-chris@chris-wilson.co.uk
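A toy model of the hint-versus-authority distinction drawn in the last two commits: the hint only says a preemption might be due, and the final decision re-reads the actual top priorities. All names are invented for the sketch.

```c
#include <stdbool.h>

struct toy_engine_sched {
	int queue_priority_hint; /* max priority seen since the last dequeue */
	int active_priority;     /* priority of the executing request */
	int queue_top_priority;  /* authoritative: head of the priolists */
};

/* Cheap check: is it even worth kicking the tasklet? */
static bool toy_maybe_preempt(const struct toy_engine_sched *s)
{
	return s->queue_priority_hint > s->active_priority;
}

/* Double-check in the tasklet: the hint may be stale (the boosted request
 * may have completed), so compare the real queue heads before forcing a
 * preempt-to-idle. */
static bool toy_need_preempt(const struct toy_engine_sched *s)
{
	return s->queue_top_priority > s->active_priority;
}
```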
-
- 29 Jan 2019, 1 commit
-
-
Committed by Chris Wilson

Currently we only allocate an object and vma if we are using a GGTT virtual HWSP, and a plain struct page for a physical HWSP. For convenience later on with global timelines, it will be useful to always have the status page tracked by a struct i915_vma. Make it so.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128102356.15037-4-chris@chris-wilson.co.uk
-
- 25 Jan 2019, 2 commits
-
-
Committed by Chris Wilson

Now that the submission backends are controlled via their own spinlocks, with a wave of a magic wand we can lift the struct_mutex requirement around GPU reset. That is, we allow the submission frontend (userspace) to keep on submitting while we process the GPU reset, as we can suspend the backend independently.

The major change is around the backoff/handoff strategy for performing the reset. With no mutex deadlock, we no longer have to coordinate with any waiter and can just perform the reset immediately.

Testcase: igt/gem_mmap_gtt/hang # regresses
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190125132230.22221-3-chris@chris-wilson.co.uk
-
Committed by Chris Wilson

Simplify by using sizeof(u32) to convert from an index inside the HWSP to the byte offset. This has the advantage of not only being shorter (and so not upsetting checkpatch!) but also that it matches usage where we are writing to byte addresses using commands other than MI_STORE_DWORD_IMM.

v2: Drop the now-superfluous MI_STORE_DWORD_INDEX_SHIFT; it appears to be a local invention, so keeping it after the final use does not help to clarify the GPU instruction.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190125120005.25191-2-chris@chris-wilson.co.uk
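The arithmetic in miniature, with invented names: the HWSP is a dword array, so a byte address for commands other than MI_STORE_DWORD_IMM is just the index scaled by sizeof(u32).

```c
#include <stdint.h>

typedef uint32_t u32;

#define TOY_HWS_SEQNO_INDEX 0x30 /* dword index into the status page */
#define TOY_HWS_SEQNO_ADDR  (TOY_HWS_SEQNO_INDEX * sizeof(u32)) /* byte offset */
```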
-
- 29 Dec 2018, 1 commit
-
-
Committed by Chris Wilson

In preparation for removing the manual EMIT_FLUSH prior to emitting the breadcrumb, implement the flush inline with writing the breadcrumb for execlists. Using one command to both flush and write the breadcrumb is naturally a tiny bit faster than splitting it into two.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181228153114.4948-1-chris@chris-wilson.co.uk
-
- 23 Oct 2018, 2 commits
-
-
Committed by Daniele Ceraolo Spurio

A collection of very small cleanups/improvements around doorbell checking that do not deserve their own patch:

- Move doorbell-related HW defs to intel_guc_reg.h
- Use GUC_NUM_DOORBELLS instead of GUC_DOORBELL_INVALID where appropriate
- Do not stop on error in guc_verify_doorbells
- Do not print drbreg on error: the only content of the register apart from the valid bit is the lower part of the physical memory address, which we can't use even if valid, because we don't know which descriptor it came from (since the doorbell is in an unexpected state)
- Move the checking of the doorbell valid bit to a common helper

v2: add more cleanups (move defs, use GUC_NUM_DOORBELLS, don't stop in guc_verify_doorbells) (Michal)
v3: move more things to intel_guc_reg, redefine GUC_DOORBELL_INVALID (Michal), drop guc_doorbell_qw since it just duplicates guc_doorbell_info

Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20181022230427.5616-3-daniele.ceraolospurio@intel.com
-
Committed by Daniele Ceraolo Spurio

Cacheline selection is only needed if we actually manage to reserve a doorbell.

Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20181022230427.5616-2-daniele.ceraolospurio@intel.com
-