- 24 1月, 2020 1 次提交
-
-
由 Chris Wilson 提交于
If the ctx->vm is freed before we can acquire a local reference to it, we proceed to call i915_vm_put(NULL), which is invalid. Reported-by: NColin Ian King <colin.king@canonical.com> Fixes: 5dbd2b7b ("drm/i915/gem: Convert vm idr to xarray") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Colin Ian King <colin.king@canonical.com> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200123152602.1432282-1-chris@chris-wilson.co.uk
-
- 23 1月, 2020 1 次提交
-
-
由 Chris Wilson 提交于
Replace the vm_idr + vm_idr_mutex to an XArray. The XArray data structure is now used to implement IDRs, and provides its own locking. We can simply remove the IDR wrapper and in the process also remove our extra mutex. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NMichael J. Ruhl <michael.j.ruhl@intel.com> Acked-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200122161531.508903-1-chris@chris-wilson.co.uk
-
- 09 1月, 2020 1 次提交
-
-
由 Chris Wilson 提交于
Since we now allow the intel_context_unpin() to run unserialised, we risk our operations under the intel_context_lock_pinned() being run as the context is unpinned (and thus invalidating our state). We can atomically acquire the pin, testing to see if it is pinned in the process, thus ensuring that the state remains consistent during the course of the whole operation. Fixes: 84135022 ("drm/i915/gt: Drop mutex serialisation between context pin/unpin") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200109085142.871563-1-chris@chris-wilson.co.uk
-
- 08 1月, 2020 1 次提交
-
-
由 Matthew Auld 提交于
Attempt to split i915_gem_gtt.[ch] into more manageable chunks. Suggested-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NMatthew Auld <matthew.auld@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200107134009.3255354-1-chris@chris-wilson.co.uk
-
- 24 12月, 2019 1 次提交
-
-
由 Tvrtko Ursulin 提交于
IDR internally uses xarray so we can use it directly which simplifies our code by removing the need to do external locking. Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191224095920.2386297-1-chris@chris-wilson.co.uk
-
- 23 12月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
The only protection for intel_context.gem_cotext is granted by RCU, so annotate it as a rcu protected pointer and carefully dereference it in the few occasions we need to use it. Fixes: 9f3ccd40 ("drm/i915: Drop GEM context as a direct link from i915_request") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Andi Shyti <andi.shyti@intel.com> Acked-by: NAndi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191222233558.2201901-1-chris@chris-wilson.co.uk
-
- 22 12月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
Allocate only an internal intel_context for the kernel_context, forgoing a global GEM context for internal use as we only require a separate address space (for our own protection). Now having weaned GT from requiring ce->gem_context, we can stop referencing it entirely. This also means we no longer have to create random and unnecessary GEM contexts for internal use. GEM contexts are now entirely for tracking GEM clients, and intel_context the execution environment on the GPU. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Andi Shyti <andi.shyti@intel.com> Acked-by: NAndi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191221160324.1073045-1-chris@chris-wilson.co.uk
-
- 20 12月, 2019 2 次提交
-
-
由 Chris Wilson 提交于
Instead of rummaging through the intel_context to peek at the GEM context in the middle of request submission to decide whether to use semaphores, store that information on the intel_context itself. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Andi Shyti <andi.shyti@intel.com> Reviewed-by: NAndi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191220101230.256839-2-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Keep the intel_context as being the primary state for i915_request, with the GEM context a backpointer from the low level state for the rarer cases we need client information. Our goal is to remove such references to clients from the backend, and leave the HW submission agnostic to client interfaces and self-contained. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Andi Shyti <andi.shyti@intel.com> Reviewed-by: NAndi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191220101230.256839-1-chris@chris-wilson.co.uk
-
- 18 12月, 2019 1 次提交
-
-
由 Tvrtko Ursulin 提交于
Get_pid_task() needs to be paired with a put_pid or we leak a pid reference every time a banned client tries to create a context. v2: * task_pid_nr helper exists! (Chris) Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Fixes: b083a087 ("drm/i915: Add per client max context ban limit") Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191217170933.8108-1-tvrtko.ursulin@linux.intel.com
-
- 07 12月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
All pinning must be done prior to i915_request_create, to avoid timeline->mutex inversions. Here we slightly abuse the context_barrier_task stages to utilise the 'skip' decision as an opportunity to acquire the pin on the new ppgtt. Consider it s/skip/prepare/. At the moment, we only have on user of context_barrier_task, so it might be worth breaking it down for the specific task of set-vm and refactor it later if we find a second purpose. <4> [402.377487] WARNING: possible circular locking dependency detected <4> [402.377493] 5.4.0-rc8-CI-CI_DRM_7491+ #1 Tainted: G U <4> [402.377497] ------------------------------------------------------ <4> [402.377502] gem_exec_parall/2506 is trying to acquire lock: <4> [402.377507] ffff888403cdac70 (&kernel#2){+.+.}, at: i915_request_create+0x16/0x1c0 [i915] <4> [402.377593] but task is already holding lock: <4> [402.377597] ffff88835efad550 (&ppgtt->pin_mutex){+.+.}, at: gen6_ppgtt_pin+0x4d/0x110 [i915] <4> [402.377660] which lock already depends on the new lock. <4> [402.377664] the existing dependency chain (in reverse order) is: <4> [402.377668] -> #1 (&ppgtt->pin_mutex){+.+.}: <4> [402.377674] __mutex_lock+0x9a/0x9d0 <4> [402.377713] gen6_ppgtt_pin+0x4d/0x110 [i915] <4> [402.377756] emit_ppgtt_update+0x1dc/0x370 [i915] <4> [402.377801] context_barrier_task+0x176/0x310 [i915] <4> [402.377844] ctx_setparam+0x400/0xb10 [i915] <4> [402.377886] i915_gem_context_setparam_ioctl+0xc8/0x160 [i915] <4> [402.377891] drm_ioctl_kernel+0xa7/0xf0 <4> [402.377895] drm_ioctl+0x2e1/0x390 <4> [402.377899] do_vfs_ioctl+0xa0/0x6f0 <4> [402.377903] ksys_ioctl+0x35/0x60 <4> [402.377906] __x64_sys_ioctl+0x11/0x20 <4> [402.377910] do_syscall_64+0x4f/0x210 <4> [402.377914] entry_SYSCALL_64_after_hwframe+0x49/0xbe <4> [402.377917] -> #0 (&kernel#2){+.+.}: <4> [402.377923] __lock_acquire+0x1328/0x15d0 <4> [402.377926] lock_acquire+0xa7/0x1c0 <4> [402.377930] __mutex_lock+0x9a/0x9d0 <4> [402.377977] i915_request_create+0x16/0x1c0 [i915] <4> [402.378013] intel_engine_flush_barriers+0x4c/0x100 [i915] <4> [402.378062] i915_ggtt_pin+0x7d/0x130 [i915] <4> [402.378108] gen6_ppgtt_pin+0x9c/0x110 [i915] <4> [402.378148] ring_context_pin+0x2e/0xc0 [i915] <4> [402.378183] __intel_context_do_pin+0x6b/0x190 [i915] <4> [402.378226] i915_gem_do_execbuffer+0x180c/0x26b0 [i915] <4> [402.378268] i915_gem_execbuffer2_ioctl+0x11b/0x460 [i915] <4> [402.378272] drm_ioctl_kernel+0xa7/0xf0 <4> [402.378275] drm_ioctl+0x2e1/0x390 <4> [402.378279] do_vfs_ioctl+0xa0/0x6f0 <4> [402.378282] ksys_ioctl+0x35/0x60 <4> [402.378286] __x64_sys_ioctl+0x11/0x20 <4> [402.378289] do_syscall_64+0x4f/0x210 <4> [402.378292] entry_SYSCALL_64_after_hwframe+0x49/0xbe <4> [402.378295] other info that might help us debug this: <4> [402.378299] Possible unsafe locking scenario: <4> [402.378302] CPU0 CPU1 <4> [402.378305] ---- ---- <4> [402.378307] lock(&ppgtt->pin_mutex); <4> [402.378310] lock(&kernel#2); <4> [402.378314] lock(&ppgtt->pin_mutex); <4> [402.378317] lock(&kernel#2); <4> [402.378320] Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NAndi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191206105527.1130413-4-chris@chris-wilson.co.uk
-
- 02 12月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
Though the context is closed and so no more requests can be added to the timeline, retirement can still be removing requests. It can even be removing the very request we are inspecting and so cause us to wander into dead links. Serialise with the retirement by taking the timeline->mutex used for guarding the timeline->requests list. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112404 Fixes: 4a317415 ("drm/i915/gem: Refine occupancy test in kill_context()") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191129151845.1092933-1-chris@chris-wilson.co.uk (cherry picked from commit 7ce596a8) Signed-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
- 30 11月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
Though the context is closed and so no more requests can be added to the timeline, retirement can still be removing requests. It can even be removing the very request we are inspecting and so cause us to wander into dead links. Serialise with the retirement by taking the timeline->mutex used for guarding the timeline->requests list. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112404 Fixes: 4a317415 ("drm/i915/gem: Refine occupancy test in kill_context()") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191129151845.1092933-1-chris@chris-wilson.co.uk
-
- 28 11月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
One does not lightly add a new hidden struct_mutex dependency deep within the execbuf bowels! The immediate suspicion in seeing the whitelist cached on the context, is that it is intended to be preserved between batches, as the kernel is quite adept at caching small allocations itself. But no, it's sole purpose is to serialise command submission in order to save a kmalloc on a slow, slow path! By removing the whitelist dependency from the context, our freedom to chop the big struct_mutex is greatly augmented. v2: s/set_bit/__set_bit/ as the whitelist shall never be accessed concurrently. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191128113424.3885958-1-chris@chris-wilson.co.uk
-
- 25 11月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
As the engine->kernel_context is used within the engine-pm barrier, we have to be careful when emitting requests outside of the barrier, as the strict timeline locking rules do not apply. Instead, we must ensure the engine_park() cannot be entered as we build the request, which is simplest by taking an explicit engine-pm wakeref around the request construction. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191125105858.1718307-1-chris@chris-wilson.co.uk
-
- 18 11月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
Throw in a flush_work() to specifically flush the context cleanup work before the module is unloaded. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112248 Fixes: a4e7ccda ("drm/i915: Move context management under GEM") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191112150051.1603-1-chris@chris-wilson.co.uk (cherry picked from commit 5f00cac9) Signed-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
- 16 11月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
Inside the constructor, while cloning, we need to replace the dst->engines. Having forgotten that dst->engines is marked as RCU protected, we need to add the appropriate annotations to make sparse happy. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191114225736.616885-5-chris@chris-wilson.co.uk
-
- 13 11月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
Throw in a flush_work() to specifically flush the context cleanup work before the module is unloaded. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112248 Fixes: a4e7ccda ("drm/i915: Move context management under GEM") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191112150051.1603-1-chris@chris-wilson.co.uk
-
- 11 11月, 2019 3 次提交
-
-
由 Chris Wilson 提交于
Update the context.name on closing so that the persistent requests are clear in debug prints. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191111114323.5833-3-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Use a small char buffer inside the i915_gem_context to store the user friendly name so that ctx->name has the same lifetime as the RCU protected GEM context. That is, e.g. when using print_request() that prints the timeline name (ctx->name), the name will not be prematurely freed upon the context being closed and the last reference dropped. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191111114323.5833-2-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
When inside the lock, remember to unlock even if you want to leave early. Reported-by: NDan Carpenter <dan.carpenter@oracle.com> Fixes: a4e7ccda ("drm/i915: Move context management under GEM") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191106144155.25727-1-chris@chris-wilson.co.uk (cherry picked from commit feba2b81) Signed-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
- 08 11月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
As we read the ctx->vm unlocked before cloning/exporting, we should validate our reference is correct before returning it. We already do for clone_vm() but were not so strict around get_ppgtt(). Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NNiranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191106091312.12921-1-chris@chris-wilson.co.uk
-
- 06 11月, 2019 2 次提交
-
-
由 Chris Wilson 提交于
When inside the lock, remember to unlock even if you want to leave early. Reported-by: NDan Carpenter <dan.carpenter@oracle.com> Fixes: a4e7ccda ("drm/i915: Move context management under GEM") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191106144155.25727-1-chris@chris-wilson.co.uk
-
由 Jon Bloomfield 提交于
To keep things manageable, the pre-gen9 cmdparser does not attempt to track any form of nested BB_START's. This did not prevent usermode from using nested starts, or even chained batches because the cmdparser is not strictly enforced pre gen9. Instead, the existence of a nested BB_START would cause the batch to be emitted in insecure mode, and any privileged capabilities would not be available. For Gen9, the cmdparser becomes mandatory (for BCS at least), and so not providing any form of nested BB_START support becomes overly restrictive. Any such batch will simply not run. We make heavy use of backward jumps in igt, and it is much easier to add support for this restricted subset of nested jumps, than to rewrite the whole of our test suite to avoid them. Add the required logic to support limited backward jumps, to instructions that have already been validated by the parser. Note that it's not sufficient to simply approve any BB_START that jumps backwards in the buffer because this would allow an attacker to embed a rogue instruction sequence within the operand words of a harmless instruction (say LRI) and jump to that. We introduce a bit array to track every instr offset successfully validated, and test the target of BB_START against this. If the target offset hits, it is re-written to the same offset in the shadow buffer and the BB_START cmd is allowed. Note: This patch deliberately ignores checkpatch issues in the cmdtables, in order to match the style of the surrounding code. We'll correct the entire file in one go in a later patch. v2: set dispatch secure late (Mika) v3: rebase (Mika) v4: Clear whitelist on each parse Minor review updates (Chris) v5: Correct backward jump batching v6: fix compilation error due to struct eb shuffle (Mika) Cc: Tony Luck <tony.luck@intel.com> Cc: Dave Airlie <airlied@redhat.com> Cc: Takashi Iwai <tiwai@suse.de> Cc: Tyler Hicks <tyhicks@canonical.com> Signed-off-by: NJon Bloomfield <jon.bloomfield@intel.com> Signed-off-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: NChris Wilson <chris.p.wilson@intel.com>
-
- 01 11月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
Don't just look at the very last request in a queue when deciding if we need to evict the context from the GPU, as that request may still be in the submission queue while the rest of the context is running! Instead, walk back along the queued requests looking for the active request and checking that. Fixes: 2e0986a5 ("drm/i915/gem: Cancel contexts when hangchecking is disabled") Testcase: igt/gem_ctx_persistence/queued Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: NMatthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191031090104.22245-1-chris@chris-wilson.co.uk
-
- 30 10月, 2019 2 次提交
-
-
由 Paul E. McKenney 提交于
This commit replaces the use of rcu_swap_protected() with the more intuitively appealing rcu_replace_pointer() as a step towards removing rcu_swap_protected(). Link: https://lore.kernel.org/lkml/CAHk-=wiAsJLw1egFEE=Z7-GGtM6wcvtyytXZA1+BHqta4gg6Hw@mail.gmail.com/Reported-by: NLinus Torvalds <torvalds@linux-foundation.org> [ paulmck: From rcu_replace() to rcu_replace_pointer() per Ingo Molnar. ] Signed-off-by: NPaul E. McKenney <paulmck@kernel.org> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: <intel-gfx@lists.freedesktop.org> Cc: <dri-devel@lists.freedesktop.org>
-
由 Chris Wilson 提交于
Our existing behaviour is to allow contexts and their GPU requests to persist past the point of closure until the requests are complete. This allows clients to operate in a 'fire-and-forget' manner where they can setup a rendering pipeline and hand it over to the display server and immediately exit. As the rendering pipeline is kept alive until completion, the display server (or other consumer) can use the results in the future and present them to the user. The compute model is a little different. They have little to no buffer sharing between processes as their kernels tend to operate on a continuous stream, feeding the results back to the client application. These kernels operate for an indeterminate length of time, with many clients wishing that the kernel was always running for as long as they keep feeding in the data, i.e. acting like a DSP. Not all clients want this persistent "desktop" behaviour and would prefer that the contexts are cleaned up immediately upon closure. This ensures that when clients are run without hangchecking (e.g. for compute kernels of indeterminate runtime), any GPU hang or other unexpected workloads are terminated with the process and does not continue to hog resources. The default behaviour for new contexts is the legacy persistence mode, as some desktop applications are dependent upon the existing behaviour. New clients will have to opt in to immediate cleanup on context closure. If the hangchecking modparam is disabled, so is persistent context support -- all contexts will be terminated on closure. We expect this behaviour change to be welcomed by compute users, who have often been caught between a rock and a hard place. They disable hangchecking to avoid their kernels being "unfairly" declared hung, but have also experienced true hangs that the system was then unable to clean up. Naturally, this leads to bug reports. Testcase: igt/gem_ctx_persistence Link: https://github.com/intel/compute-runtime/pull/228Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: Jon Bloomfield <jon.bloomfield@intel.com> Reviewed-by: NJon Bloomfield <jon.bloomfield@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Acked-by: NJason Ekstrand <jason@jlekstrand.net> Link: https://patchwork.freedesktop.org/patch/msgid/20191029202338.8841-1-chris@chris-wilson.co.uk
-
- 26 10月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
Avoid angering clang and smatch by using a constant value in a '&&' test, by forcing that constant value into a boolean. E.g., drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c:159:13: warning: use of logical '&&' with constant operand [-Wconstant-logical-operand] if (!delay && CONFIG_DRM_I915_PREEMPT_TIMEOUT) { ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Reported-by: Nkbuild test robot <lkp@intel.com> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Nathan Chancellor <natechancellor@gmail.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Jani Nikula <jani.nikula@intel.com> Reviewed-by: NNathan Chancellor <natechancellor@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191025135943.12524-1-chris@chris-wilson.co.uk
-
- 24 10月, 2019 2 次提交
-
-
由 Chris Wilson 提交于
Split the legacy submission backend from the common CS ring buffer handling. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191024100344.5041-1-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Normally, we rely on our hangcheck to prevent persistent batches from hogging the GPU. However, if the user disables hangcheck, this mechanism breaks down. Despite our insistence that this is unsafe, the users are equally insistent that they want to use endless batches and will disable the hangcheck mechanism. We are looking at replacing hangcheck, in the next patch, with a softer mechanism, that sends a pulse down the engine to check if it is well. We can use the same preemptive pulse to flush an active context off the GPU upon context close, preventing resources being lost and unkillable requests remaining on the GPU after process termination. Testcase: igt/gem_ctx_exec/basic-nohangcheck Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: Jon Bloomfield <jon.bloomfield@intel.com> Reviewed-by: NJon Bloomfield <jon.bloomfield@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191023133108.21401-4-chris@chris-wilson.co.uk
-
- 18 10月, 2019 1 次提交
-
-
由 Tvrtko Ursulin 提交于
Medium term goal is to eliminate the i915->engine[] array and to get there we have recently introduced equivalent array in intel_gt. Now we need to migrate the code further towards this state. This next step is to eliminate usage of i915->engines[] from the for_each_engine_masked iterator. For this to work we also need to use engine->id as index when populating the gt->engine[] array and adjust the default engine set indexing to use engine->legacy_idx instead of assuming gt->engines[] indexing. v2: * Populate gt->engine[] earlier. * Check that we don't duplicate engine->legacy_idx v3: * Work around the initialization order issue between default_engines() and intel_engines_driver_register() which sets engine->legacy_idx for now. It will be fixed properly later. v4: * Merge with forgotten v2.5. Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191017161852.8836-1-tvrtko.ursulin@linux.intel.com
-
- 04 10月, 2019 6 次提交
-
-
由 Chris Wilson 提交于
Keep track of the GEM contexts underneath i915->gem.contexts and assign them their own lock for the purposes of list management. v2: Focus on lock tracking; ctx->vm is protected by ctx->mutex v3: Correct split with removal of logical HW ID Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-15-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
With the introduction of ctx->engines[] we allow multiple logical contexts to be used on the same engine (e.g. with virtual engines). According to bspec, aach logical context requires a unique tag in order for context-switching to occur correctly between them. [Simple experiments show that it is not so easy to trick the HW into performing a lite-restore with matching logical IDs, though my memory from early Broadwell experiments do suggest that it should be generating lite-restores.] We only need to keep a unique tag for the active lifetime of the context, and for as long as we need to identify that context. The HW uses the tag to determine if it should use a lite-restore (why not the LRCA?) and passes the tag back for various status identifies. The only status we need to track is for OA, so when using perf, we assign the specific context a unique tag. v2: Calculate required number of tags to fill ELSP. Fixes: 976b55f0 ("drm/i915: Allow a context to define its set of engines") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111895Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Acked-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-14-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
We don't need to hold struct_mutex now for retiring requests, so drop it from i915_retire_requests() and i915_gem_wait_for_idle(), finally removing I915_WAIT_LOCKED for good. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-8-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Forgo the struct_mutex serialisation for i915_active, and interpose its own mutex handling for active/retire. This is a multi-layered sleight-of-hand. First, we had to ensure that no active/retire callbacks accidentally inverted the mutex ordering rules, nor assumed that they were themselves serialised by struct_mutex. More challenging though, is the rule over updating elements of the active rbtree. Instead of the whole i915_active now being serialised by struct_mutex, allocations/rotations of the tree are serialised by the i915_active.mutex and individual nodes are serialised by the caller using the i915_timeline.mutex (we need to use nested spinlocks to interact with the dma_fence callback lists). The pain point here is that instead of a single mutex around execbuf, we now have to take a mutex for active tracker (one for each vma, context, etc) and a couple of spinlocks for each fence update. The improvement in fine grained locking allowing for multiple concurrent clients (eventually!) should be worth it in typical loads. v2: Add some comments that barely elucidate anything :( Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-6-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
As we need to use a mutex to serialise i915_active activation (because we want to allow the callback to sleep), we need to push the i915_active.retire into a worker callback in case we get need to retire from an atomic context. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMatthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-5-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Replace the struct_mutex requirement for pinning the i915_vma with the local vm->mutex instead. Note that the vm->mutex is tainted by the shrinker (we require unbinding from inside fs-reclaim) and so we cannot allocate while holding that mutex. Instead we have to preallocate workers to do allocate and apply the PTE updates after we have we reserved their slot in the drm_mm (using fences to order the PTE writes with the GPU work and with later unbind). In adding the asynchronous vma binding, one subtle requirement is to avoid coupling the binding fence into the backing object->resv. That is the asynchronous binding only applies to the vma timeline itself and not to the pages as that is a more global timeline (the binding of one vma does not need to be ordered with another vma, nor does the implicit GEM fencing depend on a vma, only on writes to the backing store). Keeping the vma binding distinct from the backing store timelines is verified by a number of async gem_exec_fence and gem_exec_schedule tests. The way we do this is quite simple, we keep the fence for the vma binding separate and only wait on it as required, and never add it to the obj->resv itself. Another consequence in reducing the locking around the vma is the destruction of the vma is no longer globally serialised by struct_mutex. A natural solution would be to add a kref to i915_vma, but that requires decoupling the reference cycles, possibly by introducing a new i915_mm_pages object that is own by both obj->mm and vma->pages. However, we have not taken that route due to the overshadowing lmem/ttm discussions, and instead play a series of complicated games with trylocks to (hopefully) ensure that only one destruction path is called! v2: Add some commentary, and some helpers to reduce patch churn. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-4-chris@chris-wilson.co.uk
-
- 25 9月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
Before we submit the first context to HW, we need to construct a valid image of the register state. This layout is defined by the HW and should match the layout generated by HW when it saves the context image. Asserting that this should be equivalent should help avoid any undefined behaviour and verify that we haven't missed anything important! Of course, having insisted that the initial register state within the LRC should match that returned by HW, we need to ensure that it does. v2: Drop the RELATIVE_MMIO flag from gen11, we ignore it for constructing the lrc image. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190924145950.3011-1-chris@chris-wilson.co.uk
-
- 20 9月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
The request->timeline is only valid until the request is retired (i.e. before it is completed). Upon retiring the request, the context may be unpinned and freed, and along with it the timeline may be freed. We therefore need to be very careful when chasing rq->timeline that the pointer does not disappear beneath us. The vast majority of users are in a protected context, either during request construction or retirement, where the timeline->mutex is held and the timeline cannot disappear. It is those few off the beaten path (where we access a second timeline) that need extra scrutiny -- to be added in the next patch after first adding the warnings about dangerous access. One complication, where we cannot use the timeline->mutex itself, is during request submission onto hardware (under spinlocks). Here, we want to check on the timeline to finalize the breadcrumb, and so we need to impose a second rule to ensure that the request->timeline is indeed valid. As we are submitting the request, it's context and timeline must be pinned, as it will be used by the hardware. Since it is pinned, we know the request->timeline must still be valid, and we cannot submit the idle barrier until after we release the engine->active.lock, ergo while submitting and holding that spinlock, a second thread cannot release the timeline. v2: Don't be lazy inside selftests; hold the timeline->mutex for as long as we need it, and tidy up acquiring the timeline with a bit of refactoring (i915_active_add_request) Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190919111912.21631-1-chris@chris-wilson.co.uk
-
- 03 9月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
The aliasing-ppgtt is constrained to be the same size as the Global GTT since it aliases the same address space. Simplifying gtt size reporting in this case. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMatthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190902040303.14195-2-chris@chris-wilson.co.uk
-