- 19 Jun 2021, 2 commits
-
-
By Matthew Brost
The schedule function should be in the schedule object.
v3: (Jason Ekstrand) Add kernel doc
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210618010638.98941-6-matthew.brost@intel.com
-
By Matthew Brost
Move active request tracking and its lock to i915_sched_engine. This lock is also the submission lock, so the i915_sched_engine is the correct place for it.
v3: (Jason Ekstrand) Add kernel doc
v6: Rebase
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210618010638.98941-5-matthew.brost@intel.com
-
- 07 Jun 2021, 1 commit
-
-
By Lucas De Marchi
This was done by the following semantic patch:

    @@
    expression i915;
    @@
    - INTEL_GEN(i915)
    + GRAPHICS_VER(i915)

    @@
    expression i915;
    expression E;
    @@
    - INTEL_GEN(i915) >= E
    + GRAPHICS_VER(i915) >= E

    @@
    expression dev_priv;
    expression E;
    @@
    - !IS_GEN(dev_priv, E)
    + GRAPHICS_VER(dev_priv) != E

    @@
    expression dev_priv;
    expression E;
    @@
    - IS_GEN(dev_priv, E)
    + GRAPHICS_VER(dev_priv) == E

    @@
    expression dev_priv;
    expression from, until;
    @@
    - IS_GEN_RANGE(dev_priv, from, until)
    + IS_GRAPHICS_VER(dev_priv, from, until)

    @def@
    expression E;
    identifier id =~ "^gen$";
    @@
    - id = GRAPHICS_VER(E)
    + ver = GRAPHICS_VER(E)

    @@
    identifier def.id;
    @@
    - id
    + ver

It also takes care of renaming the variable we assign to GRAPHICS_VER() so to use "ver" rather than "gen".
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210606045050.103862-2-lucas.demarchi@intel.com
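For reference, a Coccinelle semantic patch like the one above is normally applied with the spatch tool; the .cocci file name below is only an illustrative placeholder, not part of the original change:

    spatch --sp-file graphics_ver.cocci --in-place --dir drivers/gpu/drm/i915/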
-
- 06 Jun 2021, 2 commits
-
-
By Christian König
The functions can be called both in _rcu context as well as while holding the lock.
v2: add some kerneldoc as suggested by Daniel
v3: fix indentation
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210602111714.212426-7-christian.koenig@amd.com
-
By Christian König
That describes much better what the function is doing here.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210602111714.212426-6-christian.koenig@amd.com
-
- 01 May 2021, 1 commit
-
-
By Bernard Zhao
This may exercise lockdep through the fs_reclaim annotations.
Signed-off-by: Bernard Zhao <bernard@vivo.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210429021327.57944-1-bernard@vivo.com
-
- 26 Apr 2021, 1 commit
-
-
By Tvrtko Ursulin
A reference needs to be taken before arming the timer. Luckily, given the default timer period of 20s, the chance of actually hitting the race is extremely small.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: 9b4d0598 ("drm/i915: Request watchdog infrastructure")
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210326105759.2387104-1-tvrtko.ursulin@linux.intel.com
(cherry picked from commit f7c37977)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
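As a minimal sketch of the ordering this fix describes (hypothetical names, not the driver's actual code): take the reference first, then arm the timer, so the timer callback can never run against an object whose last reference has already been dropped.

    #include <linux/kref.h>
    #include <linux/timer.h>
    #include <linux/jiffies.h>

    struct my_watched {
            struct kref ref;            /* dropped again from the timer/worker path */
            struct timer_list timer;
            unsigned long period;       /* in jiffies */
    };

    static void my_watched_arm(struct my_watched *w)
    {
            kref_get(&w->ref);          /* reference taken *before* arming */
            mod_timer(&w->timer, jiffies + w->period);
    }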
-
- 09 Apr 2021, 1 commit
-
-
By Tvrtko Ursulin
A reference needs to be taken before arming the timer. Luckily, given the default timer period of 20s, the chance of actually hitting the race is extremely small.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: 9b4d0598 ("drm/i915: Request watchdog infrastructure")
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210326105759.2387104-1-tvrtko.ursulin@linux.intel.com
-
- 26 Mar 2021, 3 commits
-
-
By Tvrtko Ursulin
Prepares the plumbing for setting a request/fence expiration time. All code is put in place but is never activated yet, due to the still-missing ability to actually configure the timer.

Outline of the basic operation: a timer is started when a request is ready for execution. If the request completes (retires) before the timer fires, the timer is cancelled and nothing further happens. If the timer fires, the request is added to a lockless list and a worker is queued. The purpose of this is twofold: a) it allows request cancellation from a more friendly context and b) it coalesces multiple expirations into a single event of consuming the list. The worker locklessly consumes the list of expired requests and cancels them all using the previously added i915_request_cancel(). The associated timeout value is stored in rq->context.watchdog.timeout_us.

v2:
 * Log expiration.
v3:
 * Include more information about the user timeline in the log message.
v4:
 * Remove obsolete comment and fix formatting. (Matt)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210324121335.2307063-6-tvrtko.ursulin@linux.intel.com
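A rough sketch of the lockless-list-plus-worker pattern described above, with made-up names; cancel_request() is a stand-in for i915_request_cancel(), not the driver's real code:

    #include <linux/llist.h>
    #include <linux/workqueue.h>

    struct my_request {
            struct llist_node watchdog_link;
            /* ... */
    };

    static LLIST_HEAD(expired_requests);

    static void expired_worker_fn(struct work_struct *wrk);
    static DECLARE_WORK(expired_work, expired_worker_fn);

    static void cancel_request(struct my_request *rq);      /* hypothetical stand-in */

    /* Timer (softirq) context: queue the request and defer to a worker. */
    static void watchdog_expired(struct my_request *rq)
    {
            llist_add(&rq->watchdog_link, &expired_requests);
            schedule_work(&expired_work);
    }

    /* Process context: consume all expirations coalesced onto the list in one go. */
    static void expired_worker_fn(struct work_struct *wrk)
    {
            struct my_request *rq, *rn;

            llist_for_each_entry_safe(rq, rn, llist_del_all(&expired_requests),
                                      watchdog_link)
                    cancel_request(rq);
    }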
-
By Chris Wilson
Currently, we cancel outstanding requests within a context when the context is closed. We may also want to cancel individual requests using the same graceful preemption mechanism.
v2 (Tvrtko):
 * Cancel waiters carefully, considering no timeline lock and RCU.
 * Fixed selftests.
v3 (Tvrtko):
 * Remove error propagation to waiters for now.
v4 (Tvrtko):
 * Rebase for extracted i915_request_active_engine. (Matt)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
[danvet: Resolve conflict because intel_engine_flush_scheduler is still called intel_engine_flush_submission]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210324121335.2307063-3-tvrtko.ursulin@linux.intel.com
-
By Tvrtko Ursulin
Move the active engine lookup to the exported i915_request_active_engine.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
[danvet: Slight rebase, engine->sched.lock is still called engine->active.lock.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210324121335.2307063-2-tvrtko.ursulin@linux.intel.com
-
- 25 Mar 2021, 1 commit
-
-
By Chris Wilson
As soon as we mark a request as completed, it may be retired. So when cancelling a request and marking it complete, make sure we first keep a reference to the request.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210201085715.27435-4-chris@chris-wilson.co.uk
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
- 24 Mar 2021, 1 commit
-
-
By Maarten Lankhorst
Instead of sharing pages with breadcrumbs, give each timeline a single page. This allows unrelated timelines not to share locks any more during command submission. As an additional benefit, seqno wraparound no longer requires i915_vma_pin, which means we no longer need to worry about a potential -EDEADLK at a point where we are ready to submit.

Changes since v1:
- Fix erroneous i915_vma_acquire that should be a i915_vma_release (ickle).
- Extra check for completion in intel_read_hwsp().
Changes since v2:
- Fix inconsistent indent in hwsp_alloc() (kbuild)
- memset entire cacheline to 0.
Changes since v3:
- Do same in intel_timeline_reset_seqno(), and clflush for good measure.
Changes since v4:
- Use refcounting on timeline, instead of relying on i915_active.
- Fix waiting on kernel requests.
Changes since v5:
- Bump amount of slots to maximum (256), for best wraparounds.
- Add hwsp_offset to i915_request to fix potential wraparound hang.
- Ensure timeline wrap test works with the changes.
- Assign hwsp in intel_timeline_read_hwsp() within the rcu lock to fix a hang.
Changes since v6:
- Rename i915_request_active_offset to i915_request_active_seqno(), and elaborate the function. (tvrtko)
Changes since v7:
- Move hunk to where it belongs. (jekstrand)
- Replace CACHELINE_BYTES with TIMELINE_SEQNO_BYTES. (jekstrand)

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com> #v1
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210323155059.628690-2-maarten.lankhorst@linux.intel.com
-
- 15 Jan 2021, 3 commits
-
-
By Chris Wilson
Avoid the full-blown memory barrier of test_and_set_bit() by noting the completed request and removing it from the lists.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210114135612.13210-5-chris@chris-wilson.co.uk
-
By Chris Wilson
Originally, we used the signal->lock as a means of following the previous link in its timeline and peeking at the previous fence. However, we have replaced the explicit serialisation with a series of very careful probes that anticipate the links being deleted and the fences recycled before we are able to acquire a strong reference to it. We do not need the signal->lock crutch anymore, nor do we want the contention.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210114135612.13210-2-chris@chris-wilson.co.uk
-
By Chris Wilson
When we know that we are inside the timeline mutex, or inside the submission flow (under active.lock or the holder's rcu lock), we know that the rq->hwsp is stable and we can use the simpler direct version.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210114135612.13210-1-chris@chris-wilson.co.uk
-
- 10 Jan 2021, 1 commit
-
-
By Chris Wilson
When wedging the device, we cancel all outstanding requests and mark them as EIO. Rather than duplicate the small function to do so between each submission backend, export one.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210109163455.28466-3-chris@chris-wilson.co.uk
-
- 31 Dec 2020, 3 commits
-
-
By Chris Wilson
Since we use a flag within i915_request.flags to indicate when we have boosted the request (so that we only apply the boost once), this can be used as the serialisation with i915_request_retire() to avoid having to explicitly take the i915_request.lock, which is more heavily contended.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201231093149.19086-1-chris@chris-wilson.co.uk
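Illustratively (a hedged sketch with made-up names, not the driver's code), an atomic test-and-set on a flags word both applies the boost at most once and acts as the serialisation, so no spinlock is needed:

    #include <linux/bitops.h>

    #define MY_FENCE_FLAG_BOOST 0

    static void apply_boost(void);              /* hypothetical stand-in */

    static void maybe_boost(unsigned long *fence_flags)
    {
            /* Only the first caller to set the bit performs the boost. */
            if (!test_and_set_bit(MY_FENCE_FLAG_BOOST, fence_flags))
                    apply_boost();
    }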
-
By Chris Wilson
We only need to evaluate the current status of the context when it is scheduled in; we will force a reschedule when the context is closed, propagating the change to inflight contexts.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201231093946.11649-1-chris@chris-wilson.co.uk
-
By Chris Wilson
Since we process schedule-in of a context after submitting the request, if we decide to reset the context at that time, we also have to cancel the requests we have marked for submission.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201230220028.17089-1-chris@chris-wilson.co.uk
-
- 24 Dec 2020, 1 commit
-
-
By Chris Wilson
Rather than having special-case code for opportunistically calling process_csb() and performing a direct submit while holding the engine spinlock for submitting the request, simply call the tasklet directly. This allows us to retain the direct submission path, including the CS draining to allow fast/immediate submissions, without requiring any duplicated code paths, and most importantly greatly simplifying the control flow by removing reentrancy. This will enable us to close a few races in the virtual engines in the next few patches.

The trickiest part here is to ensure that paired operations (such as schedule_in/schedule_out) remain under consistent locking domains, e.g. when pulled outside of the engine->active.lock.

v2: Use bh kicking, see commit 3c53776e ("Mark HI and TASKLET softirq synchronous").
v3: Update engine-reset to be tasklet aware.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201224135544.1713-1-chris@chris-wilson.co.uk
-
- 21 Dec 2020, 1 commit
-
-
By Chris Wilson
tasklet_kill() ensures that we _yield_ the processor until a remote tasklet is completed. However, this leads to a starvation condition, as being at the bottom of the scheduler's runqueue means that anything else is able to run, including all hogs keeping the tasklet occupied.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201220134858.10510-1-chris@chris-wilson.co.uk
-
- 16 Dec 2020, 1 commit
-
-
By Chris Wilson
Reduce the pollution of intel_engine.h by moving gen8_emit_pipe_control and friends to gen8_engine_cs.h.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201216135452.6063-1-chris@chris-wilson.co.uk
-
- 27 Nov 2020, 1 commit
-
-
By Chris Wilson
Since the introduction of preempt-to-busy, requests can complete in the background, even while they are not on the engine->active.requests list. As such, the engine->active.requests list itself is not in strict retirement order, and we have to scan the entire list while unwinding to not miss any. However, if the request is completed we currently leave it on the list [until retirement], but we could just as simply remove it and stop treating it as active. We would then only have to traverse it once while unwinding in quick succession.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201126140407.31952-1-chris@chris-wilson.co.uk
-
- 24 Nov 2020, 1 commit
-
-
By Peter Zijlstra
Get rid of the __call_single_node union and clean up the API a little to avoid external code relying on the structure layout as much.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
-
- 20 Nov 2020, 2 commits
-
-
By Chris Wilson
We plan to expand upon the number of available statuses for when we pretty-print the requests along the timelines, and so need a new set of flags. We have settled upon:

Unready [U]
 - initial status after being submitted, the request is not ready for execution as it is waiting for external fences

Ready [R]
 - all fences the request was waiting on have been signaled, and the request is now ready for execution and will be in a backend queue
 - a ready request may still need to wait on semaphores [internal fences]

Ready/virtual [V]
 - same as ready, but queued over multiple backends

Executing [E]
 - the request has been transferred from the backend queue and submitted for execution on HW
 - a completed request may still be regarded as executing, its status may not be updated until it is retired and removed from the lists

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201119165616.10834-3-chris@chris-wilson.co.uk
-
By Chris Wilson
Extract i915_request_show for reuse in other request chain pretty printers. For a bonus point, quietly change the seqno format from %llx to %lld to match everywhere else.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201119165616.10834-2-chris@chris-wilson.co.uk
-
- 13 Oct 2020, 1 commit
-
-
By Peter Zijlstra
We'll use these in the new, lockless kretprobes code.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/159870619318.1229682.13027387548510028721.stgit@devnote2
-
- 01 Oct 2020, 2 commits
-
-
By Chris Wilson
We only allow persistent requests to remain on the GPU past the closure of their containing context (and process) so long as they are continuously checked for hangs or allow other requests to preempt them, as we need to ensure forward progress of the system. If we allow persistent contexts to remain on the system after the hangcheck mechanism is disabled, the system may grind to a halt. On disabling the mechanism, we sent a pulse along the engine to remove all executing contexts from the engine which would check for hung contexts -- but we did not prevent those contexts from being resubmitted if they survived the final hangcheck.
Fixes: 9a40bddd ("drm/i915/gt: Expose heartbeat interval via sysfs")
Testcase: igt/gem_ctx_persistence/heartbeat-stop
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.7+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200928221510.26044-1-chris@chris-wilson.co.uk
(cherry picked from commit 7a991cd3)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
By Chris Wilson
The reordering and rebasing of commit 2e4c6c1a ("drm/i915: Remove i915_request.lock requirement for execution callbacks") caused it to revert an earlier correction. Let us restore commit 99f0a640d464 ("drm/i915: Remove requirement for holding i915_request.lock for breadcrumbs").
Fixes: 2e4c6c1a ("drm/i915: Remove i915_request.lock requirement for execution callbacks")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200925101107.27869-1-chris@chris-wilson.co.uk
(cherry picked from commit 35faeb7d)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
- 29 Sep 2020, 1 commit
-
-
By Chris Wilson
We only allow persistent requests to remain on the GPU past the closure of their containing context (and process) so long as they are continuously checked for hangs or allow other requests to preempt them, as we need to ensure forward progress of the system. If we allow persistent contexts to remain on the system after the hangcheck mechanism is disabled, the system may grind to a halt. On disabling the mechanism, we sent a pulse along the engine to remove all executing contexts from the engine which would check for hung contexts -- but we did not prevent those contexts from being resubmitted if they survived the final hangcheck.
Fixes: 9a40bddd ("drm/i915/gt: Expose heartbeat interval via sysfs")
Testcase: igt/gem_ctx_persistence/heartbeat-stop
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.7+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200928221510.26044-1-chris@chris-wilson.co.uk
-
- 26 Sep 2020, 1 commit
-
-
By Chris Wilson
The reordering and rebasing of commit 2e4c6c1a ("drm/i915: Remove i915_request.lock requirement for execution callbacks") caused it to revert an earlier correction. Let us restore commit 99f0a640d464 ("drm/i915: Remove requirement for holding i915_request.lock for breadcrumbs").
Fixes: 2e4c6c1a ("drm/i915: Remove i915_request.lock requirement for execution callbacks")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200925101107.27869-1-chris@chris-wilson.co.uk
-
- 16 Sep 2020, 1 commit
-
-
By Chris Wilson
To implement preempt-to-busy (and so efficient timeslicing and best utilization of the hardware submission ports) we let the GPU run asynchronously with respect to the ELSP submission queue. This created challenges in keeping and accessing the driver state mirroring the asynchronous GPU execution. The latest occurrence of this was spotted by KCSAN:

[ 1413.563200] BUG: KCSAN: data-race in __await_execution+0x217/0x370 [i915]
[ 1413.563221]
[ 1413.563236] race at unknown origin, with read to 0xffff88885bb6c478 of 8 bytes by task 9654 on cpu 1:
[ 1413.563548]  __await_execution+0x217/0x370 [i915]
[ 1413.563891]  i915_request_await_dma_fence+0x4eb/0x6a0 [i915]
[ 1413.564235]  i915_request_await_object+0x421/0x490 [i915]
[ 1413.564577]  i915_gem_do_execbuffer+0x29b7/0x3c40 [i915]
[ 1413.564967]  i915_gem_execbuffer2_ioctl+0x22f/0x5c0 [i915]
[ 1413.564998]  drm_ioctl_kernel+0x156/0x1b0
[ 1413.565022]  drm_ioctl+0x2ff/0x480
[ 1413.565046]  __x64_sys_ioctl+0x87/0xd0
[ 1413.565069]  do_syscall_64+0x4d/0x80
[ 1413.565094]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

To complicate matters, we have to both avoid the read tearing of *active and avoid any write tearing as we perform the pending[] -> inflight[] promotion of the execlists. This is because we cannot rely on the memcpy doing u64-aligned copies on all kernels/platforms, and so we opt to open-code it with explicit WRITE_ONCE annotations to satisfy KCSAN.

v2: When in doubt, write the same comment again.
v3: Expanded commit message.

Fixes: b55230e5 ("drm/i915: Check for awaits on still currently executing requests")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200716142207.13003-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[Joonas: Rebased and reordered into drm-intel-gt-next branch]
[Joonas: Added expanded commit message from Tvrtko and Chris]
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
(cherry picked from commit b4d9145b)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
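As a hedged sketch of the open-coded promotion described above (hypothetical type and function names, not the driver's actual code), each slot is stored with WRITE_ONCE() rather than via memcpy(), so a lockless reader pairing with READ_ONCE() can never observe a torn 64-bit pointer:

    #include <linux/compiler.h>

    struct my_request;

    static void promote_pending(struct my_request **inflight,
                                struct my_request *const *pending,
                                unsigned int count)
    {
            unsigned int i;

            /* One non-torn store per slot instead of a byte-wise memcpy(). */
            for (i = 0; i < count; i++)
                    WRITE_ONCE(inflight[i], pending[i]);
    }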
-
- 07 Sep 2020, 7 commits
-
-
By Chris Wilson
To implement preempt-to-busy (and so efficient timeslicing and best utilization of the hardware submission ports) we let the GPU run asynchronously with respect to the ELSP submission queue. This created challenges in keeping and accessing the driver state mirroring the asynchronous GPU execution.

The previous fix 1d9221e9 ("drm/i915: Skip signaling a signaled request") however did not correctly serialize request retirement with the execution callbacks. We were using the i915_request.lock to serialise adding an execution callback with __i915_request_submit. However, if we use an atomic llist_add to serialise multiple waiters and then check to see if the request is already executing, we can remove the irq-spinlock and fix serialization between retirement and execution callbacks in one go.

v2: Avoid using the irq_work when outside of the irq-spinlocks, where we can execute the callbacks immediately.
v3: Pay close attention to the order of setting ACTIVE on retirement; we need to ensure the request is signaled and breadcrumbs detached before we finish removing the request from the engine.
v4: Expanded commit message.

Fixes: 1d9221e9 ("drm/i915: Skip signaling a signaled request")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200716142207.13003-2-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[Joonas: Rebased and reordered into drm-intel-gt-next branch]
[Joonas: Added expanded commit message from Tvrtko and Chris]
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
By Chris Wilson
To implement preempt-to-busy (and so efficient timeslicing and best utilization of the hardware submission ports) we let the GPU run asynchronously with respect to the ELSP submission queue. This created challenges in keeping and accessing the driver state mirroring the asynchronous GPU execution. The latest occurrence of this was spotted by KCSAN:

[ 1413.563200] BUG: KCSAN: data-race in __await_execution+0x217/0x370 [i915]
[ 1413.563221]
[ 1413.563236] race at unknown origin, with read to 0xffff88885bb6c478 of 8 bytes by task 9654 on cpu 1:
[ 1413.563548]  __await_execution+0x217/0x370 [i915]
[ 1413.563891]  i915_request_await_dma_fence+0x4eb/0x6a0 [i915]
[ 1413.564235]  i915_request_await_object+0x421/0x490 [i915]
[ 1413.564577]  i915_gem_do_execbuffer+0x29b7/0x3c40 [i915]
[ 1413.564967]  i915_gem_execbuffer2_ioctl+0x22f/0x5c0 [i915]
[ 1413.564998]  drm_ioctl_kernel+0x156/0x1b0
[ 1413.565022]  drm_ioctl+0x2ff/0x480
[ 1413.565046]  __x64_sys_ioctl+0x87/0xd0
[ 1413.565069]  do_syscall_64+0x4d/0x80
[ 1413.565094]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

To complicate matters, we have to both avoid the read tearing of *active and avoid any write tearing as we perform the pending[] -> inflight[] promotion of the execlists. This is because we cannot rely on the memcpy doing u64-aligned copies on all kernels/platforms, and so we opt to open-code it with explicit WRITE_ONCE annotations to satisfy KCSAN.

v2: When in doubt, write the same comment again.
v3: Expanded commit message.

Fixes: b55230e5 ("drm/i915: Check for awaits on still currently executing requests")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200716142207.13003-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[Joonas: Rebased and reordered into drm-intel-gt-next branch]
[Joonas: Added expanded commit message from Tvrtko and Chris]
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
By Chris Wilson
Currently we hold no actual reference to the request nor the context while they are attached to a breadcrumb. To avoid freeing the request/context too early, we serialise with cancel-breadcrumbs by taking the irq spinlock in i915_request_retire(). The alternative is to take a reference for a new breadcrumb and release it upon signaling, removing the more frequently hit contention point in i915_request_retire().
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200801160225.6814-2-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[Joonas: Rebased and reordered into drm-intel-gt-next branch]
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
By Chris Wilson
On the virtual engines, we only use the intel_breadcrumbs for tracking signaling of stale breadcrumbs from the irq_workers. They do not have any associated interrupt handling; active requests are passed to a physical engine and its associated breadcrumb interrupt handler. This causes issues for us as we need to ensure that we do not actually try to enable interrupts and the power management required for them on the virtual engine, as they will never be disabled. Instead, let's specify the physical engine used for the interrupt handler on a particular breadcrumb.
v2: Drop b->irq_armed = true mocking for no interrupt HW
Fixes: 4fe6abb8 ("drm/i915/gt: Ignore irq enabling on the virtual engines")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200731154834.8378-4-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
By Chris Wilson
After staring at the breadcrumb enabling/cancellation and coming to the conclusion that the cause of the mysterious stale breadcrumbs must be the act of submitting completed requests, we can then redirect those completed requests onto a dedicated signaled_list at the time of construction and so eliminate intel_engine_transfer_stale_breadcrumbs().
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200731154834.8378-2-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
By Chris Wilson
Since the breadcrumb enabling/cancelling itself is serialised by the breadcrumbs.irq_lock, with a bit of care we can remove the outer serialisation with i915_request.lock for concurrent dma_fence_enable_signaling(). This has the important side-effect of eliminating the nested i915_request.lock within request submission.

The challenge in serialisation is around the unsubmission, where we take an active request that wants a breadcrumb on the signaling engine and put it to sleep. We do not want a concurrent dma_fence_enable_signaling() to attach a breadcrumb as we unsubmit, so we must mark the request as no longer active before serialising with the concurrent enable-signaling. On retire, we serialise with the concurrent enable-signaling, but instead of clearing ACTIVE, we mark it as SIGNALED.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200731154834.8378-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[Joonas: Rebased and reordered into drm-intel-gt-next branch]
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
By Chris Wilson
I915_GEM_THROTTLE dates back to the time before contexts, when there was just a single engine and therefore a single timeline and request list globally. That request list was in execution/retirement order, and so walking it to find a particular aged request made sense and could be split per file. That is no more. We now have many timelines within a file, as many as the user wants to construct (essentially per-engine, per-context). Each of those runs independently and so makes the single list futile.

Remove the disordered list, and iterate over all the timelines to find a request to wait on in each, to satisfy the criterion that the CPU is no more than 20ms ahead of its oldest request. It should go without saying that the I915_GEM_THROTTLE ioctl is no longer used as the primary means of throttling, so it makes sense to push the complication into the ioctl, where it only impacts upon its few irregular users, rather than into execbuf/retire where everybody has to pay the cost. Fortunately, the few users do not create vast numbers of contexts, so the loops over contexts/engines should be concise.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200728152010.30701-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
-