- 18 8月, 2020 2 次提交
-
-
由 Lionel Landwerlin 提交于
Introduces a new parameters to execbuf so that we can specify syncobj handles as well as timeline points. v2: Reuse i915_user_extension_fn v3: Check that the chained extension is only present once (Chris) v4: Check that dma_fence_chain_find_seqno returns a non NULL fence (Lionel) v5: Use BIT_ULL (Chris) v6: Fix issue with already signaled timeline points, dma_fence_chain_find_seqno() setting fence to NULL (Chris) v7: Report ENOENT with invalid syncobj handle (Lionel) v8: Check for out of order timeline point insertion (Chris) v9: After explanations on https://lists.freedesktop.org/archives/dri-devel/2019-August/229287.html drop the ordering check from v8 (Lionel) v10: Set first extension enum item to 1 (Jason) v11: Rebase v12: Allow multiple extension nodes of timeline syncobj (Chris) Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Co-authored-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> (v11) Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200804085954.350343-3-lionel.g.landwerlin@intel.com Link: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2901Signed-off-by: NRodrigo Vivi <rodrigo.vivi@intel.com>
-
由 Lionel Landwerlin 提交于
We're planning to use this for a couple of new feature where we need to provide additional parameters to execbuf. v2: Check for invalid flags in execbuffer2 (Lionel) v3: Rename I915_EXEC_EXT -> I915_EXEC_USE_EXTENSIONS (Chris) v4: Rebase Move array fence parsing in i915_gem_do_execbuffer() Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200804085954.350343-2-lionel.g.landwerlin@intel.com Link: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2901Signed-off-by: NRodrigo Vivi <rodrigo.vivi@intel.com>
-
- 09 7月, 2020 1 次提交
-
-
由 Daniele Ceraolo Spurio 提交于
Since the engines belong to the GT, move the runtime-updated list of available engines to the intel_gt struct. The original mask has been renamed to indicate it contains the maximum engine list that can be found on a matching device. In preparation for other info being moved to the gt in follow up patches (sseu), introduce an intel_gt_info structure to group all gt-related runtime info. v2: s/max_engine_mask/platform_engine_mask (tvrtko), fix selftest Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Andi Shyti <andi.shyti@intel.com> Cc: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> #v1 Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200708003952.21831-5-daniele.ceraolospurio@intel.com
-
- 03 7月, 2020 1 次提交
-
-
由 Chris Wilson 提交于
Rather than reuse the common ctx->mutex for locking the execbuffer LUT, split it into its own lock to avoid being taken [as part of ctx->mutex] at inappropriate times. In particular to avoid the inversion from taking the timeline->mutex for the whole execbuf submission in the next patch. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NAndi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200703004306.11117-1-chris@chris-wilson.co.uk
-
- 01 7月, 2020 1 次提交
-
-
由 Chris Wilson 提交于
The obj->lut_list is traversed when the object is closed as the file table is destroyed during process termination. As this occurs before we kill any outstanding context if, due to some bug or another, the closure is blocked, then we fail to shootdown any inflight operations potentially leaving the GPU spinning forever. As we only need to guard the list against concurrent closures and insertions, the hold is short and merits being treated as a simple spinlock. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMichael J. Ruhl <michael.j.ruhl@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200701084439.17025-1-chris@chris-wilson.co.uk
-
- 08 6月, 2020 1 次提交
-
-
由 Chris Wilson 提交于
If the execbuf is interrupted after building the cmdparser pipeline, and before we commit to submitting the request to HW, we would attempt to clean up the cmdparser early. While we held active references to the vma being parsed and constructed, we did not hold an active reference for the buffer pool itself. The result was that an interrupted execbuf could still have run the cmdparser pipeline, but since the buffer pool was idle, its target vma could have been recycled. Note this problem only occurs if the cmdparser is running async due to pipelined waits on busy fences, and the execbuf is interrupted. Fixes: 686c7c35 ("drm/i915/gem: Asynchronous cmdparser") Fixes: 16e87459 ("drm/i915/gt: Move the batch buffer pool from the engine to the gt") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMatthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200604103751.18816-1-chris@chris-wilson.co.uk (cherry picked from commit 57a78ca4) Signed-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
- 06 6月, 2020 1 次提交
-
-
由 Chris Wilson 提交于
Unused as of commit 9e0f9464 ("drm/i915/gem: Async GPU relocations only"), but left behind. >> drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:933:21: error: unused function 'unmask_page' [-Werror,-Wunused-function] static inline void *unmask_page(unsigned long p) ^ >> drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:938:28: error: unused function 'unmask_flags' [-Werror,-Wunused-function] static inline unsigned int unmask_flags(unsigned long p) ^ >> drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:945:33: error: unused function 'cache_to_ggtt' [-Werror,-Wunused-function] static inline struct i915_ggtt *cache_to_ggtt(struct reloc_cache *cache) Reported-by: Nkernel test robot <lkp@intel.com> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMatthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200605200357.13069-1-chris@chris-wilson.co.uk
-
- 05 6月, 2020 1 次提交
-
-
由 Chris Wilson 提交于
Reduce the 3 relocation paths down to the single path that accommodates all. The primary motivation for this is to guard the relocations with a natural fence (derived from the i915_request used to write the relocation from the GPU). The tradeoff in using async gpu relocations is that it increases latency over using direct CPU relocations, for the cases where the target is idle and accessible by the CPU. The benefit is greatly reduced lock contention and improved concurrency by pipelining. Note that forcing the async gpu relocations does reveal a few issues they have. Firstly, is that they are visible as writes to gem_busy, causing to mark some buffers are being to written to by the GPU even though userspace only reads. Secondly is that, in combination with the cmdparser, they can cause priority inversions. This should be the case where the work is being put into a common workqueue losing our priority information and so being executed in FIFO from the worker, denying us the opportunity to reorder the requests afterwards. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMatthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200604211457.19696-1-chris@chris-wilson.co.uk
-
- 04 6月, 2020 1 次提交
-
-
由 Chris Wilson 提交于
If the execbuf is interrupted after building the cmdparser pipeline, and before we commit to submitting the request to HW, we would attempt to clean up the cmdparser early. While we held active references to the vma being parsed and constructed, we did not hold an active reference for the buffer pool itself. The result was that an interrupted execbuf could still have run the cmdparser pipeline, but since the buffer pool was idle, its target vma could have been recycled. Note this problem only occurs if the cmdparser is running async due to pipelined waits on busy fences, and the execbuf is interrupted. Fixes: 686c7c35 ("drm/i915/gem: Asynchronous cmdparser") Fixes: 16e87459 ("drm/i915/gt: Move the batch buffer pool from the engine to the gt") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMatthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200604103751.18816-1-chris@chris-wilson.co.uk
-
- 03 6月, 2020 1 次提交
-
-
由 Chris Wilson 提交于
We infrequently use the direct i915 backpointer from the i915_request, so do we really need to waste the space in the struct for it? 8 bytes from the most frequently allocated struct vs an 3 bytes and pointer chasing in using rq->engine->i915? Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NAkeem G Abodunrin <akeem.g.abodunrin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200602220953.21178-1-chris@chris-wilson.co.uk
-
- 25 5月, 2020 1 次提交
-
-
由 Chris Wilson 提交于
Leave the error propagation in place, but limit the warnings to only show up in CI if the unlikely errors are hit. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200525141957.3061-2-chris@chris-wilson.co.uk
-
- 19 5月, 2020 1 次提交
-
-
由 Pankaj Bharadiya 提交于
struct drm_device specific drm_WARN* macros include device information in the backtrace, so we know what device the warnings originate from. Prefer drm_WARN* over WARN* at places where struct drm_device pointer can be extracted. Signed-off-by: NPankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com> Signed-off-by: NJani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200504181600.18503-6-pankaj.laxminarayan.bharadiya@intel.com
-
- 14 5月, 2020 2 次提交
-
-
由 Chris Wilson 提交于
Now that we have fast timeslicing on semaphores, we no longer need to prioritise none-semaphore work as we will yield any work blocked on a semaphore to the next in the queue. Previously with no timeslicing, blocking on the semaphore caused extremely bad scheduling with multiple clients utilising multiple rings. Now, there is no impact and we can remove the complication. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200513173504.28322-1-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Since there can only be one of in_fence/exec_fence, just use the single in_fence local. v2: Consolidate lookup Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200513180937.28992-1-chris@chris-wilson.co.uk
-
- 08 5月, 2020 1 次提交
-
-
由 Chris Wilson 提交于
Upon waiting a request (when asked), we gave that request a small priority boost, not enough for it to cause preemption, but enough for it to be scheduled next before all equals. We also used that bit to give new clients a small priority boost, similar to FQ_CODEL, such that we favoured short interactive tasks ahead of long running streams. However, this is causing lots of complications with timeslicing where we both want to honour the boost and yet ignore it. Those complications cause unexpected user behaviour (tasks not being timesliced and run concurrently as epxected), and the easiest way to resolve that is to remove the boost. Hopefully, we can find a compromise again if we need to, but in theory timeslicing itself and future more advanced schedulers should give us the interactivity boost we seek. Testcase: igt/gem_exec_schedule/lateslice Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200507152338.7452-3-chris@chris-wilson.co.uk
-
- 04 5月, 2020 2 次提交
-
-
由 Chris Wilson 提交于
The older arches did not convert MI_STORE_DATA_IMM to using the GTT, but left them writing to a physical address. The notes suggest that the primary reason would be so that the writes were cache coherent, as the CPU cache uses physical tagging. As such we did not implement the legacy variant of MI_STORE_DATA_IMM and so left all the relocations synchronous -- but with a small function to convert from the vma address into the physical address, we can implement asynchronous relocs on these older arches, fixing up a few tests that require them. In order to be able to test the legacy paths, refactor the gpu relocations so that we can hook them up to a selftest. v2: Use an array of offsets not enum labels for the selftest v3: Refactor the common igt_hexdump() Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/757Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200504140629.28240-1-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
It is required that a chained batch be in the same address domain as its parent, and also that must be specified in the command for earlier gen as it is not inferred from the chaining until gen6. Fixes: 964a9b0f ("drm/i915/gem: Use chained reloc batches") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200504125149.4396-1-chris@chris-wilson.co.uk
-
- 02 5月, 2020 3 次提交
-
-
由 Chris Wilson 提交于
If at first we don't succeed, try try again. Not all engines may support the MI ops we need to perform asynchronous relocation patching, and so we end up falling back to a synchronous operation that has a liability of blocking. However, Tvrtko pointed out we don't need to use the same engine to perform the relocations as we are planning to execute the execbuf on, and so if we switch over to a working engine, we can perform the relocation asynchronously. The user execbuf will be queued after the relocations by virtue of fencing. This patch creates a new context per execbuf requiring asynchronous relocations on an unusable engines. This is perhaps a bit excessive and can be ameliorated by a small context cache, but for the moment we only need it for working around a little used engine on Sandybridge, and only if relocations are actually required to an active batch buffer. Now we just need to teach the relocation code to handle physical addressing for gen2/3, and we should then have universal support! Suggested-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Testcase: igt/gem_exec_reloc/basic-spin # snb Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200501192945.22215-3-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
As we can now keep chaining together a relocation batch to process any number of relocations, we can keep building that relocation batch for all of the target vma. This avoiding emitting a new request into the ring for each target, consuming precious ring space and a potential stall. v2: Propagate the failure from submitting the relocation batch. Testcase: igt/gem_exec_reloc/basic-wide-active Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200501192945.22215-2-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
The ring is a precious resource: we anticipate to only use a few hundred bytes for a request, and only try to reserve that before we start. If we go beyond our guess in building the request, then instead of waiting at the start of execbuf before we hold any locks or other resources, we may trigger a wait inside a critical region. One example is in using gpu relocations, where currently we emit a new MI_BB_START from the ring every time we overflow a page of relocation entries. However, instead of insert the command into the precious ring, we can chain the next page of relocation entries as MI_BB_START from the end of the previous. v2: Delay the emit_bb_start until after all the chained vma synchronisation is complete. Since the buffer pool batches are idle, this _should_ be a no-op, but one day we may some fancy async GPU bindings for new vma! v3: Use pool/batch consitently, once we start thinking in terms of the batch vma, use batch->obj. v4: Explain the magic number 4. Tvrtko spotted that we lose propagation of the error for failing to submit the relocation request; that's easier to fix up in the next patch. Testcase: igt/gem_exec_reloc/basic-many-active Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200501192945.22215-1-chris@chris-wilson.co.uk
-
- 01 5月, 2020 2 次提交
-
-
由 Christophe Leroy 提交于
When i915_gem_execbuffer2_ioctl() is using user_access_begin(), that's only to perform unsafe_put_user() so use user_write_access_begin() in order to only open write access. Signed-off-by: NChristophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: NKees Cook <keescook@chromium.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/ebf1250b6d4f351469fb339e5399d8b92aa8a1c1.1585898438.git.christophe.leroy@c-s.fr
-
由 Chris Wilson 提交于
Since the introduction of 'soft-rc6', we aim to park the device quickly and that results in frequent idling of the whole device. Currently upon idling we free the batch buffer pool, and so this renders the cache ineffective for many workloads. If we want to have an effective cache of recently allocated buffers available for reuse, we need to decouple that cache from the engine powermanagement and make it timer based. As there is no reason then to keep it within the engine (where it once made retirement order easier to track), we can move it up the hierarchy to the owner of the memory allocations. v2: Hook up to debugfs/drop_caches to clear the cache on demand. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200430111819.10262-2-chris@chris-wilson.co.uk
-
- 24 4月, 2020 1 次提交
-
-
由 Chris Wilson 提交于
The history of i915_vma_close() is confusing, as is its use. As the lifetime of the i915_vma is currently bounded by the object it is attached to, we needed a means of identify when a vma was no longer in use by userspace (via the user's fd). This is further complicated by that only ppgtt vma should be closed at the user's behest, as the ggtt were always shared. Now that we attach the vma to a lut on the user's context, the open count does indicate how many unique and open context/vm are referencing this vma from the user. As such, we can and should just use the open_count to track when the vma is still in use by userspace. It's a poor man's replacement for reference counting. Closes: https://gitlab.freedesktop.org/drm/intel/issues/1193Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NAndi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200422190558.30509-1-chris@chris-wilson.co.uk
-
- 07 4月, 2020 3 次提交
-
-
由 Chris Wilson 提交于
Tidy the code by casting remain to unsigned long once for the duration of eb_relocate_vma() Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200407085930.19421-1-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
__i915_gem_object_flush_map() takes a byte range, so feed it the written bytes and do not mistake the u32 index as bytes! Fixes: a679f58d ("drm/i915: Flush pages on acquisition") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.william.auld@gmail.com> Cc: <stable@vger.kernel.org> # v5.2+ Reviewed-by: NMatthew Auld <matthew.william.auld@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200406114821.10949-1-chris@chris-wilson.co.uk (cherry picked from commit 30c88a47) Signed-off-by: NRodrigo Vivi <rodrigo.vivi@intel.com>
-
由 Chris Wilson 提交于
If the user passes in a readonly reloc[], by the time we notice we have already committed to modifying the execobjects, or have indeed done so already. Reporting the failure just compounds the issue as we have no second pass to fall back to anymore. "Be damned if you do, and damned if you don't." Testcase: igt/gem_exec_reloc/readonly Fixes: 7dc8f114 ("drm/i915/gem: Drop relocation slowpath") References: fddcd00a ("drm/i915: Force the slow path after a user-write error") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: NAndi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200331162150.3635-1-chris@chris-wilson.co.uk (cherry picked from commit 97a37c91) Signed-off-by: NRodrigo Vivi <rodrigo.vivi@intel.com>
-
- 06 4月, 2020 2 次提交
-
-
由 Chris Wilson 提交于
If we set the debug flag to force ourselves not to relocate via the gpu, do not relocate via the gpu. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMatthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200406123616.7334-1-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
__i915_gem_object_flush_map() takes a byte range, so feed it the written bytes and do not mistake the u32 index as bytes! Fixes: a679f58d ("drm/i915: Flush pages on acquisition") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.william.auld@gmail.com> Cc: <stable@vger.kernel.org> # v5.2+ Reviewed-by: NMatthew Auld <matthew.william.auld@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200406114821.10949-1-chris@chris-wilson.co.uk
-
- 02 4月, 2020 1 次提交
-
-
由 Chris Wilson 提交于
If the current node/entry location is occupied, and the object is not pinned, try assigning it some free space. We cannot wait here, so if in doubt, we unreserve and try to grab all at once. v2: Use the final pin_flags so that we won't have to move the object if we find the wrong free space. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMatthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200401194135.5442-1-chris@chris-wilson.co.uk
-
- 01 4月, 2020 1 次提交
-
-
由 Chris Wilson 提交于
If the user passes in a readonly reloc[], by the time we notice we have already committed to modifying the execobjects, or have indeed done so already. Reporting the failure just compounds the issue as we have no second pass to fall back to anymore. "Be damned if you do, and damned if you don't." Testcase: igt/gem_exec_reloc/readonly Fixes: 7dc8f114 ("drm/i915/gem: Drop relocation slowpath") References: fddcd00a ("drm/i915: Force the slow path after a user-write error") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: NAndi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200331162150.3635-1-chris@chris-wilson.co.uk
-
- 31 3月, 2020 1 次提交
-
-
由 Chris Wilson 提交于
Use a separate array allocation for the execbuf vma, so that we can track their lifetime independently from the copy of the user arguments. With luck, this has a secondary benefit of splitting the malloc size to within reason and avoid vmalloc. The downside is that we might require two separate vmallocs -- but much less likely. In the process, this prevents a memory leak on the ww_mutex error unwind. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1390Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200330133710.14385-1-chris@chris-wilson.co.uk
-
- 27 3月, 2020 2 次提交
-
-
由 Nathan Chancellor 提交于
A recent commit in clang added -Wtautological-compare to -Wall, which is enabled for i915 after -Wtautological-compare is disabled for the rest of the kernel so we see the following warning on x86_64: ../drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:1433:22: warning: result of comparison of constant 576460752303423487 with expression of type 'unsigned int' is always false [-Wtautological-constant-out-of-range-compare] if (unlikely(remain > N_RELOC(ULONG_MAX))) ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~ ../include/linux/compiler.h:78:42: note: expanded from macro 'unlikely' # define unlikely(x) __builtin_expect(!!(x), 0) ^ 1 warning generated. It is not wrong in the case where ULONG_MAX > UINT_MAX but it does not account for the case where this file is built for 32-bit x86, where ULONG_MAX == UINT_MAX and this check is still relevant. Cast remain to unsigned long, which keeps the generated code the same (verified with clang-11 on x86_64 and GCC 9.2.0 on x86 and x86_64) and the warning is silenced so we can catch more potential issues in the future. Closes: https://github.com/ClangBuiltLinux/linux/issues/778Suggested-by: NMichel Dänzer <michel@daenzer.net> Reviewed-by: NJani Nikula <jani.nikula@intel.com> Signed-off-by: NNathan Chancellor <natechancellor@gmail.com> Signed-off-by: NJani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200214054706.33870-1-natechancellor@gmail.com
-
由 Chris Wilson 提交于
I need to keep the GEM context around a bit longer so adding an explicit flag for syncing execbuf with closed/abandonded contexts. v2: * Use already available context flags. (Chris) Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200319170707.8262-1-chris@chris-wilson.co.uk (cherry picked from commit 207e4a71) Signed-off-by: NRodrigo Vivi <rodrigo.vivi@intel.com>
-
- 25 3月, 2020 1 次提交
-
-
由 Chris Wilson 提交于
If the caller allows and we do not have to wait for any signals, immediately execute the work within the caller's process. By doing so we avoid the overhead of scheduling a new task, and the latency in executing it, at the cost of pulling that work back into the immediate context. (Sometimes we still prefer to offload the task to another cpu, especially if we plan on executing many such tasks in parallel for this client.) Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200325120227.8044-2-chris@chris-wilson.co.uk
-
- 23 3月, 2020 2 次提交
-
-
由 Chris Wilson 提交于
Drop the pretense of kicking the tasklet (used only for the defunct guc submission backend, it should just take ownership of the submit!) and so remove the bh-kicking from around submission. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200323092841.22240-5-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
As we store the handle lookup inside a radix tree, we do not need the gem_context->mutex except until we need to insert our lookup into the common radix tree. This takes a small bit of rearranging to ensure that the lut we insert into the tree is ready prior to actually inserting it (as soon as it is exposed via the radixtree, it is visible to any other submission). v2: For brownie points, remove the goto spaghetti. v3: Tighten up the closed-handle checks. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200323092841.22240-8-chris@chris-wilson.co.uk
-
- 20 3月, 2020 1 次提交
-
-
由 Chris Wilson 提交于
I need to keep the GEM context around a bit longer so adding an explicit flag for syncing execbuf with closed/abandonded contexts. v2: * Use already available context flags. (Chris) Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200319170707.8262-1-chris@chris-wilson.co.uk
-
- 13 3月, 2020 1 次提交
-
-
由 Chris Wilson 提交于
Since the relocations are no longer performed under a global struct_mutex, or any other lock, that is also held by pagefault handlers, we can relax and allow our fast path to take a fault. As we no longer need to abort the fast path for lock avoidance, we no longer need the slow path handling at all. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMatthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200311160310.26711-1-chris@chris-wilson.co.uk
-
- 12 3月, 2020 1 次提交
-
-
由 Matthew Auld 提交于
The alignment is u64, and yet is_power_of_2() assumes unsigned long, which might give different results between 32b and 64b kernel. Signed-off-by: NMatthew Auld <matthew.auld@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200305203534.210466-1-matthew.auld@intel.com Cc: stable@vger.kernel.org (cherry picked from commit 2920516b) Signed-off-by: NJani Nikula <jani.nikula@intel.com>
-
- 06 3月, 2020 1 次提交
-
-
由 Chris Wilson 提交于
We only need to serialise the multiple pinning during the eb_reserve phase. Ideally this would be using the vm->mutex as an outer lock, or using a composite global mutex (ww_mutex), but at the moment we are using struct_mutex for the group. Closes: https://gitlab.freedesktop.org/drm/intel/issues/1381 Fixes: 003d8b91 ("drm/i915/gem: Only call eb_lookup_vma once during execbuf ioctl") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200306071614.2846708-3-chris@chris-wilson.co.uk
-