- 28 1月, 2020 1 次提交
-
-
由 Chris Wilson 提交于
Similar to commit ac0e331a ("drm/i915: Tighten atomicity of i915_active_acquire vs i915_active_release") we have the same race of trying to pin the context underneath a mutex while allowing the decrement to be atomic outside of that mutex. This leads to the problem where two threads may simultaneously try to pin the context and the second not notice that they needed to repin the context. <2> [198.669621] kernel BUG at drivers/gpu/drm/i915/gt/intel_timeline.c:387! <4> [198.669703] invalid opcode: 0000 [#1] PREEMPT SMP PTI <4> [198.669712] CPU: 0 PID: 1246 Comm: gem_exec_create Tainted: G U W 5.5.0-rc6-CI-CI_DRM_7755+ #1 <4> [198.669723] Hardware name: /NUC7i5BNB, BIOS BNKBL357.86A.0054.2017.1025.1822 10/25/2017 <4> [198.669776] RIP: 0010:timeline_advance+0x7b/0xe0 [i915] <4> [198.669785] Code: 00 48 c7 c2 10 f1 46 a0 48 c7 c7 70 1b 32 a0 e8 bb dd e7 e0 bf 01 00 00 00 e8 d1 af e7 e0 31 f6 bf 09 00 00 00 e8 35 ef d8 e0 <0f> 0b 48 c7 c1 48 fa 49 a0 ba 84 01 00 00 48 c7 c6 10 f1 46 a0 48 <4> [198.669803] RSP: 0018:ffffc900004c3a38 EFLAGS: 00010296 <4> [198.669810] RAX: ffff888270b35140 RBX: ffff88826f32ee00 RCX: 0000000000000006 <4> [198.669818] RDX: 00000000000017c5 RSI: 0000000000000000 RDI: 0000000000000009 <4> [198.669826] RBP: ffffc900004c3a64 R08: 0000000000000000 R09: 0000000000000000 <4> [198.669834] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88826f9b5980 <4> [198.669841] R13: 0000000000000cc0 R14: ffffc900004c3dc0 R15: ffff888253610068 <4> [198.669849] FS: 00007f63e663fe40(0000) GS:ffff888276c00000(0000) knlGS:0000000000000000 <4> [198.669857] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 <4> [198.669864] CR2: 00007f171f8e39a8 CR3: 000000026b1f6005 CR4: 00000000003606f0 <4> [198.669872] Call Trace: <4> [198.669924] intel_timeline_get_seqno+0x12/0x40 [i915] <4> [198.669977] __i915_request_create+0x76/0x5a0 [i915] <4> [198.670024] i915_request_create+0x86/0x1c0 [i915] <4> [198.670068] i915_gem_do_execbuffer+0xbf2/0x2500 [i915] <4> [198.670082] ? __lock_acquire+0x460/0x15d0 <4> [198.670128] i915_gem_execbuffer2_ioctl+0x11f/0x470 [i915] <4> [198.670171] ? i915_gem_execbuffer_ioctl+0x300/0x300 [i915] <4> [198.670181] drm_ioctl_kernel+0xa7/0xf0 <4> [198.670188] drm_ioctl+0x2e1/0x390 <4> [198.670233] ? i915_gem_execbuffer_ioctl+0x300/0x300 [i915] Fixes: 84135022 ("drm/i915/gt: Drop mutex serialisation between context pin/unpin") References: ac0e331a ("drm/i915: Tighten atomicity of i915_active_acquire vs i915_active_release") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200127152829.2842149-1-chris@chris-wilson.co.uk
-
- 10 1月, 2020 2 次提交
-
-
由 Chris Wilson 提交于
As we use the active state to keep the vma alive while we are reading its contents during GPU error capture, we need to mark the ring->vma as active during execution if we want to include the rinbuffer in the error state. Reported-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: b1e3177b ("drm/i915: Coordinate i915_active with its own mutex") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200110110402.1231745-3-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
As we use the active state to keep the vma alive while we are reading its contents during GPU error capture, we need to mark the context->state vma as active during execution if we want to include it in the error state. Reported-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: b1e3177b ("drm/i915: Coordinate i915_active with its own mutex") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200110110402.1231745-2-chris@chris-wilson.co.uk
-
- 09 1月, 2020 3 次提交
-
-
由 Chris Wilson 提交于
Now that we have moved the runtime-pm management out of intel_context_acctive_acquire, and that itself out of ce->ops->pin(), no explicit runtime pm wakeref is required in intel_context_pin(). Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMaarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200109085717.873326-3-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
While this is encroaching on midlayer territory, having already made the state allocation a previous step in pinning, we can now pull the common intel_context_active_acquire() into intel_context_pin() itself. This is a prelude to make the activation a separate step inside pinning, outside of the ce->pin_mutex Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NMaarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200109085717.873326-2-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Allow for knowledgeable users to preallocate the context state, and to separate the allocation step from the pinning step during intel_context_pin() Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NMaarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200109085717.873326-1-chris@chris-wilson.co.uk
-
- 06 1月, 2020 2 次提交
-
-
由 Chris Wilson 提交于
The last remaining reason for serialising the pin/unpin of the intel_context is to ensure that our preallocated wakerefs are not consumed too early (i.e. the unpin of the previous phase does not emit the idle barriers for this phase before we even submit). All of the other operations within the context pin/unpin are supposed to be atomic... Therefore, we can reduce the serialisation to being just on the i915_active.preallocated_barriers itself and drop the nested pin_mutex from intel_context_unpin(). Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: NMaarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200106114234.2529613-5-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Convert the few remaining GEM_TRACE() used for debugging over to the appropriate GT_TRACE or RQ_TRACE. References: 639f2f24 ("drm/i915: Introduce new macros for tracing") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com> Reviewed-by: NMaarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200106114234.2529613-4-chris@chris-wilson.co.uk
-
- 22 12月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
Allocate only an internal intel_context for the kernel_context, forgoing a global GEM context for internal use as we only require a separate address space (for our own protection). Now having weaned GT from requiring ce->gem_context, we can stop referencing it entirely. This also means we no longer have to create random and unnecessary GEM contexts for internal use. GEM contexts are now entirely for tracking GEM clients, and intel_context the execution environment on the GPU. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Andi Shyti <andi.shyti@intel.com> Acked-by: NAndi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191221160324.1073045-1-chris@chris-wilson.co.uk
-
- 20 12月, 2019 2 次提交
-
-
由 Chris Wilson 提交于
Instead of rummaging through the intel_context to peek at the GEM context in the middle of request submission to decide whether to use semaphores, store that information on the intel_context itself. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Andi Shyti <andi.shyti@intel.com> Reviewed-by: NAndi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191220101230.256839-2-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Keep the intel_context as being the primary state for i915_request, with the GEM context a backpointer from the low level state for the rarer cases we need client information. Our goal is to remove such references to clients from the backend, and leave the HW submission agnostic to client interfaces and self-contained. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Andi Shyti <andi.shyti@intel.com> Reviewed-by: NAndi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191220101230.256839-1-chris@chris-wilson.co.uk
-
- 14 12月, 2019 1 次提交
-
-
New macros ENGINE_TRACE(), CE_TRACE(), RQ_TRACE() and GT_TRACE() are introduce to tag device name and engine name with contexts and requests tracing in i915. Cc: Sudeep Dutt <sudeep.dutt@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jani Nikula <jani.nikula@intel.com> Signed-off-by: NVenkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191213155152.69182-2-venkata.s.dhanalakota@intel.com
-
- 05 12月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
It is not acceptable for context pinning to fail with -ENOSPC as we should always be able to make space in the GGTT. The only reason we may fail is that other "temporary" context pins are reserving their space and we need to wait for an available slot. Closes: https://gitlab.freedesktop.org/drm/intel/issues/676Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Acked-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191205113726.413351-2-chris@chris-wilson.co.uk
-
- 04 12月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
Rather than assume if and only if the engine->default_state is not set that the context is invalid, instead track when we know the context has valid state -- either because we have copied the default_state or we have completed a context switch to save the HW state. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Acked-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191203124155.3019926-1-chris@chris-wilson.co.uk
-
- 28 11月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
The expected downside to commit 58b4c1a0 ("drm/i915: Reduce nested prepare_remote_context() to a trylock") was that it would need to return -EAGAIN to userspace in order to resolve potential mutex inversion. Such an unsightly round trip is unnecessary if we could atomically insert a barrier into the i915_active_fence, so make it happen. Currently, we use the timeline->mutex (or some other named outer lock) to order insertion into the i915_active_fence (and so individual nodes of i915_active). Inside __i915_active_fence_set, we only need then serialise with the interrupt handler in order to claim the timeline for ourselves. However, if we remove the outer lock, we need to ensure the order is intact between not only multiple threads trying to insert themselves into the timeline, but also with the interrupt handler completing the previous occupant. We use xchg() on insert so that we have an ordered sequence of insertions (and each caller knows the previous fence on which to wait, preserving the chain of all fences in the timeline), but we then have to cmpxchg() in the interrupt handler to avoid overwriting the new occupant. The only nasty side-effect is having to temporarily strip off the RCU-annotations to apply the atomic operations, otherwise the rules are much more conventional! Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112402 Fixes: 58b4c1a0 ("drm/i915: Reduce nested prepare_remote_context() to a trylock") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191127134527.3438410-1-chris@chris-wilson.co.uk
-
- 27 11月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
On context retiring, we may invoke the kernel_context to unpin this context. Elsewhere, we may use the kernel_context to modify this context. This currently leads to an AB-BA lock inversion, so we need to back-off from the contended lock, and repeat. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111732Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Fixes: a9877da2 ("drm/i915/oa: Reconfigure contexts on the fly") Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191126065521.2331017-1-chris@chris-wilson.co.uk (cherry picked from commit 58b4c1a0) Signed-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
- 26 11月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
On context retiring, we may invoke the kernel_context to unpin this context. Elsewhere, we may use the kernel_context to modify this context. This currently leads to an AB-BA lock inversion, so we need to back-off from the contended lock, and repeat. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111732Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Fixes: a9877da2 ("drm/i915/oa: Reconfigure contexts on the fly") Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191126065521.2331017-1-chris@chris-wilson.co.uk
-
- 24 10月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
Split the legacy submission backend from the common CS ring buffer handling. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191024100344.5041-1-chris@chris-wilson.co.uk
-
- 08 10月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
Avoid going to the base i915 device when we already have a path from gt to the runtime powermanagement interface. The benefit is that it looks a bit more self-consistent to always be acquiring the gt->uncore->rpm for use with the gt->uncore. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191007154531.1750-1-chris@chris-wilson.co.uk
-
- 04 10月, 2019 3 次提交
-
-
由 Chris Wilson 提交于
Keep track of the GEM contexts underneath i915->gem.contexts and assign them their own lock for the purposes of list management. v2: Focus on lock tracking; ctx->vm is protected by ctx->mutex v3: Correct split with removal of logical HW ID Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-15-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Forgo the struct_mutex serialisation for i915_active, and interpose its own mutex handling for active/retire. This is a multi-layered sleight-of-hand. First, we had to ensure that no active/retire callbacks accidentally inverted the mutex ordering rules, nor assumed that they were themselves serialised by struct_mutex. More challenging though, is the rule over updating elements of the active rbtree. Instead of the whole i915_active now being serialised by struct_mutex, allocations/rotations of the tree are serialised by the i915_active.mutex and individual nodes are serialised by the caller using the i915_timeline.mutex (we need to use nested spinlocks to interact with the dma_fence callback lists). The pain point here is that instead of a single mutex around execbuf, we now have to take a mutex for active tracker (one for each vma, context, etc) and a couple of spinlocks for each fence update. The improvement in fine grained locking allowing for multiple concurrent clients (eventually!) should be worth it in typical loads. v2: Add some comments that barely elucidate anything :( Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-6-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
As we need to use a mutex to serialise i915_active activation (because we want to allow the callback to sleep), we need to push the i915_active.retire into a worker callback in case we get need to retire from an atomic context. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMatthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-5-chris@chris-wilson.co.uk
-
- 20 9月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
The request->timeline is only valid until the request is retired (i.e. before it is completed). Upon retiring the request, the context may be unpinned and freed, and along with it the timeline may be freed. We therefore need to be very careful when chasing rq->timeline that the pointer does not disappear beneath us. The vast majority of users are in a protected context, either during request construction or retirement, where the timeline->mutex is held and the timeline cannot disappear. It is those few off the beaten path (where we access a second timeline) that need extra scrutiny -- to be added in the next patch after first adding the warnings about dangerous access. One complication, where we cannot use the timeline->mutex itself, is during request submission onto hardware (under spinlocks). Here, we want to check on the timeline to finalize the breadcrumb, and so we need to impose a second rule to ensure that the request->timeline is indeed valid. As we are submitting the request, it's context and timeline must be pinned, as it will be used by the hardware. Since it is pinned, we know the request->timeline must still be valid, and we cannot submit the idle barrier until after we release the engine->active.lock, ergo while submitting and holding that spinlock, a second thread cannot release the timeline. v2: Don't be lazy inside selftests; hold the timeline->mutex for as long as we need it, and tidy up acquiring the timeline with a bit of refactoring (i915_active_add_request) Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190919111912.21631-1-chris@chris-wilson.co.uk
-
- 11 9月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
Add an atomic counter and always take the spinlock around the pin/unpin events, so that we can perform the list manipulation concurrently. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMatthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190910212204.17190-1-chris@chris-wilson.co.uk
-
- 17 8月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
As every i915_active_request should be serialised by a dedicated lock, i915_active consists of a tree of locks; one for each node. Markup up the i915_active_request with what lock is supposed to be guarding it so that we can verify that the serialised updated are indeed serialised. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190816121000.8507-2-chris@chris-wilson.co.uk
-
- 16 8月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
Lift moving the timeline to/from the active_list on enter/exit in order to shorten the active tracking span in comparison to the existing pin/unpin. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMatthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190815205709.24285-1-chris@chris-wilson.co.uk
-
- 10 8月, 2019 3 次提交
-
-
由 Chris Wilson 提交于
Move the timeline from being inside the intel_ring to intel_context itself. This saves much pointer dancing and makes the relations of the context to its timeline much clearer. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMatthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190809182518.20486-4-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Push the ring creation flags from the outer GEM context to the inner intel_context to avoid an unsightly back-reference from inside the backend. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NAndi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190809182518.20486-3-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Refactor the backends to handle the deferred context allocation in a consistent manner, and allow calling it as an explicit first step in pinning a context for the first time. This should make it easier for backends to keep track of partially constructed contexts from initialisation. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190809182518.20486-2-chris@chris-wilson.co.uk
-
- 03 8月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
The shrinker cannot touch objects used by the contexts (logical state and ring). Currently we mark those as "pin_global" to let the shrinker skip over them, however, if we remove them from the shrinker lists entirely, we don't event have to include them in our shrink accounting. By keeping the unshrinkable objects in our shrinker tracking, we report a large number of objects available to be shrunk, and leave the shrinker deeply unsatisfied when we fail to reclaim those. The shrinker will persist in trying to reclaim the unavailable objects, forcing the system into a livelock (not even hitting the dread oomkiller). v2: Extend unshrinkable protection for perma-pinned scratch and guc allocations (Tvrtko) v3: Notice that we should be pinned when marking unshrinkable and so the link cannot be empty; merge duplicate paths. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NMatthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190802212137.22207-1-chris@chris-wilson.co.uk
-
- 02 8月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
By placing our idle-barriers in the i915_active fence tree, we expose those for reuse by other components that are issuing requests along the kernel_context. Reusing the proto-barrier active_node is perfectly fine as the new request implies a context-switch, and so an opportune point to run the idle-barrier. However, the proto-barrier is not equivalent to a normal active_node and care must be taken to avoid dereferencing the ERR_PTR used as its request marker. v2: Comment the more egregious cheek v3: A glossary! Reported-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: ce476c80 ("drm/i915: Keep contexts pinned until after the next kernel context switch") Fixes: a9877da2 ("drm/i915/oa: Reconfigure contexts on the fly") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190802100015.1281-1-chris@chris-wilson.co.uk
-
- 30 7月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
Track the currently bound address space used by the HW context. Minor conversions to use the local intel_context.vm are made, leaving behind some more surgery required to make intel_context the primary through the selftests. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190730143209.4549-2-chris@chris-wilson.co.uk
-
- 29 7月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
Remember to keep the rings pinned as well as the context image until the GPU is no longer active. v2: Introduce a ring->pin_count primarily to hide the mock_ring that doesn't fit into the normal GGTT vma picture. v3: Order is important in teardown, ringbuffer submission needs to drop the pin count on the engine->kernel_context before it can gleefully free its ring. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110946 Fixes: ce476c80 ("drm/i915: Keep contexts pinned until after the next kernel context switch") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190619170135.15281-1-chris@chris-wilson.co.uk (cherry picked from commit 09c5ab38) Signed-off-by: NJani Nikula <jani.nikula@intel.com>
-
- 27 7月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
Modifying a remote context requires careful serialisation with requests on that context, and that serialisation requires us to take their timeline->mutex. Make it so. Note that while struct_mutex rules, we can't create more than one request in parallel, but that age is soon coming to an end. v2: Though it doesn't affect the current users, contexts may share timelines so check if we already hold the right mutex. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190725131447.27515-1-chris@chris-wilson.co.uk
-
- 23 7月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
Prior to freeing the struct, call the fini function to cleanup the common members. Currently this only calls the debug functions to mark the structs as destroyed, but may be extended to real work in future. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190718070024.21781-2-chris@chris-wilson.co.uk
-
- 17 7月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
Avoid a global idle barrier by reconfiguring each context by rewriting them with MI_STORE_DWORD from the kernel context. v2: We only need to determine the desired register values once, they are the same for all contexts. v3: Don't remove the kernel context from the list of known GEM contexts; the world is not ready for that yet. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190716213443.9874-1-chris@chris-wilson.co.uk
-
- 26 6月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
Add the context pin/unpin events to the trace for post-mortem debugging. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMatthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190625194859.28005-1-chris@chris-wilson.co.uk
-
- 22 6月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
If we introduce a callback for i915_active that is only called the first time we use the i915_active and is symmetrically paired with the i915_active.retire callback, we can replace the open-coded and non-atomic implementations -- which will be very fragile (i.e. broken) upon removing the struct_mutex serialisation. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMatthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190621183801.23252-4-chris@chris-wilson.co.uk
-
- 20 6月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
Remember to keep the rings pinned as well as the context image until the GPU is no longer active. v2: Introduce a ring->pin_count primarily to hide the mock_ring that doesn't fit into the normal GGTT vma picture. v3: Order is important in teardown, ringbuffer submission needs to drop the pin count on the engine->kernel_context before it can gleefully free its ring. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110946 Fixes: ce476c80 ("drm/i915: Keep contexts pinned until after the next kernel context switch") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190619170135.15281-1-chris@chris-wilson.co.uk
-
- 19 6月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
The idea behind keeping the saturation mask local to a context backfired spectacularly. The premise with the local mask was that we would be more proactive in attempting to use semaphores after each time the context idled, and that all new contexts would attempt to use semaphores ignoring the current state of the system. This turns out to be horribly optimistic. If the system state is still oversaturated and the existing workloads have all stopped using semaphores, the new workloads would attempt to use semaphores and be deprioritised behind real work. The new contexts would not switch off using semaphores until their initial batch of low priority work had completed. Given sufficient backload load of equal user priority, this would completely starve the new work of any GPU time. To compensate, remove the local tracking in favour of keeping it as global state on the engine -- once the system is saturated and semaphores are disabled, everyone stops attempting to use semaphores until the system is idle again. One of the reason for preferring local context tracking was that it worked with virtual engines, so for switching to global state we could either do a complete check of all the virtual siblings or simply disable semaphores for those requests. This takes the simpler approach of disabling semaphores on virtual engines. The downside is that the decision that the engine is saturated is a local measure -- we are only checking whether or not this context was scheduled in a timely fashion, it may be legitimately delayed due to user priorities. We still have the same dilemma though, that we do not want to employ the semaphore poll unless it will be used. v2: Explain why we need to assume the worst wrt virtual engines. Fixes: ca6e56f6 ("drm/i915: Disable semaphore busywaits on saturated systems") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com> Cc: Dmitry Ermilov <dmitry.ermilov@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190618074153.16055-8-chris@chris-wilson.co.uk
-