- 26 Jul 2021, 1 commit
-
-
Committed by Jason Ekstrand
We don't roll them together entirely because there are still a couple of cases where we want a separate can_migrate check. For instance, the display code checks that you can migrate a buffer to LMEM before it accepts it in fb_create. The dma-buf import code also uses it to do an early check and return a different error code if someone tries to attach a LMEM-only dma-buf to another driver. However, no one actually wants to call object_migrate when can_migrate has failed. The stated intention is for self-tests, but none of those actually take advantage of this unsafe migration.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: Daniel Vetter <daniel@ffwll.ch>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210723172142.3273510-2-jason@jlekstrand.net
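For illustration, a minimal sketch of the consumer-side pattern described above, built on i915_gem_object_can_migrate() from this series; the helper name, call site and error code are assumptions:

    /*
     * Sketch only: an fb_create-style early rejection. Display needs
     * the buffer to be migratable to LMEM, so refuse it up front
     * rather than calling object_migrate after can_migrate failed.
     */
    static int validate_fb_bo(struct drm_i915_gem_object *obj)
    {
            if (!i915_gem_object_can_migrate(obj, INTEL_REGION_LMEM))
                    return -EINVAL; /* error code is an assumption */

            return 0;
    }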
-
- 22 Jul 2021, 1 commit
-
-
Committed by Daniel Vetter
This essentially reverts

  commit 84a10749
  Author: Chris Wilson <chris@chris-wilson.co.uk>
  Date:   Wed Jan 24 11:36:08 2018 +0000

      drm/i915: Shrink the GEM kmem_caches upon idling

mm/vmscan.c:do_shrink_slab() is a thing; if there's an issue with it then we need to fix it there, not hand-roll our own slab-shrinking code in i915. Also, when this was added there was only one other caller of kmem_cache_shrink (added in 2005 to the acpi code). Now there's a second one outside of i915 in a kunit test, which seems legit since that wants to very carefully control what's in the kmem_cache. That's out of a total of over 500 calls to kmem_cache_create. This alone should have been warning sign enough that we were doing something silly. Noticed while reviewing a patch set from Jason to fix up some issues in our i915_init() and i915_exit() module load/cleanup code. Now that i915_globals.c isn't any different from normal init/exit functions, we should convert them over to one unified table and remove i915_globals.[hc] entirely.

v2: Improve commit message (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: David Airlie <airlied@linux.ie>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210721183229.4136488-1-daniel.vetter@ffwll.ch
-
- 09 Jul 2021, 1 commit
-
-
Committed by Matthew Auld
For discrete, users of pin_map() need to obey the same rules as the TTM backend, where we map system-memory-only objects as WB, and everything else as WC. The simplest approach for now is to just force the correct mapping type as per the new rules for discrete.

Suggested-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Ramalingam C <ramalingam.c@intel.com>
Reviewed-by: Ramalingam C <ramalingam.c@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210705135310.1502437-1-matthew.auld@intel.com
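As a hedged illustration of that rule, a sketch of the coercion the pin_map() path has to apply on discrete; the helper name is invented and the check is simplified (it keys off current LMEM residency rather than the allowed placement list):

    /* Sketch: force the mapping type mandated by the TTM rules. */
    static enum i915_map_type discrete_map_type(struct drm_i915_gem_object *obj)
    {
            if (i915_gem_object_is_lmem(obj))
                    return I915_MAP_WC; /* device memory: write-combined */

            return I915_MAP_WB;         /* system-only: write-back */
    }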
-
- 30 Jun 2021, 2 commits
-
-
Committed by Matthew Auld
A selftest for the gem object migrate functionality. Slightly adapted from the original by Matthew to the new interface and the new fill blit code.

v4:
- Initialize buffers and check contents after migration (Suggested by Matthew Auld)
- Perform async migration (if implemented) in the igt_lmem_pages_migrate test
- Test also migration to the current region.

Co-developed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> #v3
Link: https://patchwork.freedesktop.org/patch/msgid/20210629151203.209465-3-thomas.hellstrom@linux.intel.com
-
Committed by Thomas Hellström
Introduce an interface to migrate objects between regions. This is primarily intended to migrate objects to LMEM for display and to SYSTEM for dma-buf, but might be reused in one form or another for performance-based migration.

v2:
- Verify that the memory region given as an id really exists. (Reported by Matthew Auld)
- Call i915_gem_object_{init,release}_memory_region() when switching region to handle also switching region lists. (Reported by Matthew Auld)

v3:
- Fix i915_gem_object_can_migrate() to return true if the object is already in the correct region, even if the object ops don't provide a migrate() callback.
- Fix a typo in the commit message.
- Fix kerneldoc of i915_gem_object_wait_migration().

v4:
- Improve documentation (Suggested by Matthew Auld and Michael Ruhl)
- Always assume TTM migration hits a TTM move and unsets the pages through move_notify. (Reported by Matthew Auld)
- Add a dma_fence_might_wait() annotation to i915_gem_object_wait_migration() (Suggested by Daniel Vetter)

v5:
- Re-add might_sleep() instead of __dma_fence_might_wait(). v4 was sent with the wrong version, didn't compile, and __dma_fence_might_wait() is not exported.
- Added an R-B.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210629151203.209465-2-thomas.hellstrom@linux.intel.com
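A hedged sketch of the intended call flow, using the functions this patch introduces; the ww handling, flags and error code are simplified assumptions:

    /* Sketch: migrate an object to LMEM and wait for the move to land. */
    static int move_to_lmem(struct drm_i915_gem_object *obj,
                            struct i915_gem_ww_ctx *ww)
    {
            int err;

            if (!i915_gem_object_can_migrate(obj, INTEL_REGION_LMEM))
                    return -EOPNOTSUPP; /* error code is an assumption */

            err = i915_gem_object_migrate(obj, ww, INTEL_REGION_LMEM);
            if (err)
                    return err;

            /* Annotated with might_sleep() in the real version (see v5). */
            return i915_gem_object_wait_migration(obj, 0);
    }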
-
- 25 Jun 2021, 1 commit
-
-
Committed by Thomas Hellström
The object ops flag I915_GEM_OBJECT_HAS_IOMEM and the object flag I915_BO_ALLOC_STRUCT_PAGE are considered immutable by much of our code. Introduce a new mem_flags member to hold these, and make sure checks for these flags being set are done either under the object lock or with pages properly pinned. The flags will change during migration under the object lock.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210624084240.270219-2-thomas.hellstrom@linux.intel.com
-
- 11 Jun 2021, 2 commits
-
-
Committed by Thomas Hellström
Since objects can be migrated or evicted when not pinned or locked, update the checks for lmem residency or future residency so that the value returned is not immediately stale.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210610070152.572423-3-thomas.hellstrom@linux.intel.com
-
Committed by Thomas Hellström
The most logical place to introduce TTM buffer objects is as an i915 gem object backend. We need to add some ops to account for added functionality like delayed delete and LRU list manipulation. Initially we support only LMEM and SYSTEM memory, but SYSTEM (which in this case means evicted LMEM objects) is not visible to i915 GEM yet. The plan is to move the i915 gem system region over to the TTM system memory type in upcoming patches. We set up GPU bindings directly both from LMEM and from the system region, as there is no need to use the legacy TTM_TT memory type. We reserve that for future porting of GGTT bindings to TTM. Remove the old lmem backend.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210610070152.572423-2-thomas.hellstrom@linux.intel.com
-
- 02 Jun 2021, 1 commit
-
-
Committed by Thomas Hellström
Embed a struct ttm_buffer_object into the i915 gem object, making sure we alias the gem object part. It's a bit unfortunate that struct ttm_buffer_object embeds a gem object, since we otherwise could make the TTM part private to the TTM backend and use the usual i915 gem object for the other backends. To make this a bit more storage-efficient for the other backends, we'd have to use a pointer for the gem object, which would require a lot of changes in the driver. We postpone that for later.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210602083818.241793-3-thomas.hellstrom@linux.intel.com
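A minimal sketch of the aliasing constraint, with invented struct names; the real driver enforces the same property with its own compile-time check:

    #include <linux/build_bug.h>
    #include <drm/drm_gem.h>
    #include <drm/ttm/ttm_bo_api.h>

    /*
     * Sketch: the embedded TTM object's drm_gem_object must overlay
     * the i915 object's own base, so both views name the same storage.
     */
    struct sketch_gem_object {
            union {
                    struct drm_gem_object base;
                    struct ttm_buffer_object ttm; /* ttm.base is a drm_gem_object */
            };
    };

    static_assert(offsetof(struct sketch_gem_object, base) ==
                  offsetof(struct sketch_gem_object, ttm.base));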
-
- 01 Jun 2021, 1 commit
-
-
Committed by Thomas Hellström
We are currently sharing the VM reservation locks across a number of gem objects with page-table memory. Since TTM will individualize the reservation locks when freeing objects, including accessing the shared locks, make sure that the shared locks are not freed until that is done. For PPGTT we add an additional refcount; for GGTT we take additional measures to make sure objects sharing the GGTT reservation lock are freed at GGTT takedown.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210601074654.3103-3-thomas.hellstrom@linux.intel.com
-
- 04 May 2021, 1 commit
-
-
Committed by Matthew Auld
Add a new extension to support setting an immutable priority list of potential placements at creation time. If we use the normal gem_create or gem_create_ext without the extensions/placements then we still get the old behaviour, with only placing the object in system memory.

v2 (Daniel & Jason):
- Add a bunch of kernel-doc
- Simplify design for placements extension

Testcase: igt/gem_create/create-ext-placement-sanity-check
Testcase: igt/gem_create/create-ext-placement-each
Testcase: igt/gem_create/create-ext-placement-all
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: CQ Tang <cq.tang@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@linux.intel.com>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: Dave Airlie <airlied@gmail.com>
Cc: dri-devel@lists.freedesktop.org
Cc: mesa-dev@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20210429103056.407067-6-matthew.auld@intel.com
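For illustration, a userspace sketch of the resulting uapi (struct drm_i915_gem_create_ext chained with an I915_GEM_CREATE_EXT_MEMORY_REGIONS extension); treat the field and flag names as assumptions to be checked against i915_drm.h:

    #include <stdint.h>
    #include <sys/ioctl.h>
    #include <drm/i915_drm.h>

    /* Sketch: create a buffer preferring device memory, then system. */
    static uint32_t create_lmem_then_smem(int fd, uint64_t size)
    {
            struct drm_i915_gem_memory_class_instance regions[] = {
                    { .memory_class = I915_MEMORY_CLASS_DEVICE, .memory_instance = 0 },
                    { .memory_class = I915_MEMORY_CLASS_SYSTEM, .memory_instance = 0 },
            };
            struct drm_i915_gem_create_ext_memory_regions ext = {
                    .base.name = I915_GEM_CREATE_EXT_MEMORY_REGIONS,
                    .num_regions = 2,
                    .regions = (uintptr_t)regions, /* priority-ordered list */
            };
            struct drm_i915_gem_create_ext create = {
                    .size = size,
                    .extensions = (uintptr_t)&ext,
            };

            if (ioctl(fd, DRM_IOCTL_I915_GEM_CREATE_EXT, &create))
                    return 0; /* creation failed */

            return create.handle;
    }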
-
- 25 Mar 2021, 2 commits
-
-
Committed by Maarten Lankhorst
With all callers and selftests fixed to use ww locking, we can now finally remove this lock.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210323155059.628690-62-maarten.lankhorst@linux.intel.com
-
Committed by Maarten Lankhorst
With userptr fixed, there is no need for all the separate lockdep classes now, and we can remove all the lockdep tricks used. A trylock in the shrinker is all we need now to flatten the locking hierarchy.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
[danvet: Resolve conflict because we don't have the patch from Chris to rebrand i915_gem_shrinker_taints_mutex to fs_reclaim_taints_mutex. It's not a bad idea, but if we do it, it should be moved to the right header. See https://lore.kernel.org/intel-gfx/20210202154318.19246-1-chris@chris-wilson.co.uk/]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210323155059.628690-18-maarten.lankhorst@linux.intel.com
-
- 24 Mar 2021, 1 commit
-
-
Committed by Maarten Lankhorst
We want to remove the changing of the ops structure for attaching phys pages, so we need to kill off HAS_STRUCT_PAGE from ops->flags and put it in the bo. This removes a potential race of dereferencing the wrong obj->ops without the ww mutex held.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
[danvet: apply with wiggle]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210323155059.628690-8-maarten.lankhorst@linux.intel.com
-
- 22 Jan 2021, 1 commit
-
-
Committed by Imre Deak
Add a simple helper to read data with the CPU from the page of a GEM object. Do the read either via a kmap if the object has struct pages, or via an iomap otherwise. This is needed by the next patch, which reads a u64 value from the object (without requiring the obj to be mapped to the GPU). Suggested by Chris.

v2 (Chris):
- Sanitize the type and order of func params.
- Avoid consts requiring too many casts.
- Use BUG_ON instead of WARN_ON, simplify the conditions.
- Fix __iomem sparse errors.
- Leave locking/syncing/pinning up to the caller, require only that the caller has pinned the object pages.
- Check for an iomem backing store before reading via an iomap.

v3:
- Fix offset passed to io_mapping_map_wc() missing a mem.region.start delta. (Chris, Matthew)

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20210120213834.1435710-1-imre.deak@intel.com
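A hedged sketch of the helper's shape (it landed as i915_gem_object_read_from_page()); the caller must have pinned the pages, the iomap offset includes the region-start delta from v3, and the field layout is an assumption based on the i915 of that era:

    /* Sketch: CPU-read a few bytes from one page of a GEM object. */
    static void object_read_from_page(struct drm_i915_gem_object *obj,
                                      u64 offset, void *dst, int size)
    {
            pgoff_t idx = offset >> PAGE_SHIFT;

            if (i915_gem_object_has_struct_page(obj)) {
                    void *src = kmap_atomic(i915_gem_object_get_page(obj, idx));

                    memcpy(dst, src + offset_in_page(offset), size);
                    kunmap_atomic(src);
            } else {
                    dma_addr_t dma = i915_gem_object_get_dma_address(obj, idx);
                    void __iomem *src =
                            io_mapping_map_wc(&obj->mm.region->iomap,
                                              dma - obj->mm.region->region.start,
                                              PAGE_SIZE);

                    memcpy_fromio(dst, src + offset_in_page(offset), size);
                    io_mapping_unmap(src);
            }
    }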
-
- 20 Jan 2021, 1 commit
-
-
Committed by Chris Wilson
flush_write_domain() is only used within the GEM domain-management code, so move it to i915_gem_domain.c and drop the export.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210119144912.12653-5-chris@chris-wilson.co.uk
-
- 06 Oct 2020, 1 commit
-
-
Committed by Tvrtko Ursulin
As the previous patch fixed the places where we walk the whole scatterlist for DMA addresses, this patch fixes the random lookup functionality. To achieve this we have to add a second lookup iterator and an i915_gem_object_get_sg_dma helper, to be used analogously to the existing i915_gem_object_get_sg. Therefore two lookup caches are maintained per object, and they are flushed at the same point for simplicity. (Strictly speaking the DMA cache should be flushed from i915_gem_gtt_finish_pages, but today this coincides with unsetting of the pages in general.) The partial VMA view is then fixed to use the new DMA lookup and to properly query the sg length.

v2:
* Checkpatch.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Tom Murphy <murphyt7@tcd.ie>
Cc: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201006092508.1064287-2-tvrtko.ursulin@linux.intel.com
-
- 25 Sep 2020, 1 commit
-
-
Committed by Thomas Zimmermann
GEM object functions deprecate several similar callback interfaces in struct drm_driver. This patch replaces the per-driver callbacks with per-instance callbacks in i915.

v2:
* move object-function instance to i915_gem_object.c (Jani)

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200923102159.24084-7-tzimmermann@suse.de
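A short sketch of the mechanism: the DRM core consults a per-object drm_gem_object_funcs table (falling back to the drm_driver callbacks of that era when it is NULL), so each backend installs its own hooks at creation. Names and hook bodies here are placeholders:

    #include <drm/drm_gem.h>

    static void sketch_gem_free(struct drm_gem_object *gem)
    {
            /* backend-specific teardown would go here */
    }

    static const struct drm_gem_object_funcs sketch_gem_funcs = {
            .free = sketch_gem_free,
            /* .pin, .vmap, .mmap, ... as the backend requires */
    };

    static void sketch_gem_setup(struct drm_gem_object *gem)
    {
            gem->funcs = &sketch_gem_funcs; /* per-instance, not per-driver */
    }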
-
- 03 Jul 2020, 2 commits
-
-
Committed by Chris Wilson
Rather than reuse the common ctx->mutex for locking the execbuffer LUT, split it into its own lock to avoid being taken [as part of ctx->mutex] at inappropriate times. In particular, this avoids the inversion from taking the timeline->mutex for the whole execbuf submission in the next patch.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200703004306.11117-1-chris@chris-wilson.co.uk
-
Committed by Chris Wilson
Avoid waking up the device and taking stale locks if we know that the object is not currently mmapped. This is particularly useful as not many objects are actually mmapped and so we can destroy them without waking the device up, and it gives us a little more freedom of workqueue ordering during shutdown.

v2: Pull the release_mmap() into its single user in freeing the objects, where there can not be any race with a concurrent user of the freed object. Or so one hopes!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200702163623.6402-2-chris@chris-wilson.co.uk
-
- 01 Jul 2020, 1 commit
-
-
Committed by Chris Wilson
The obj->lut_list is traversed when the object is closed, as the file table is destroyed during process termination. As this occurs before we kill any outstanding context, if, due to some bug or another, the closure is blocked, then we fail to shoot down any inflight operations, potentially leaving the GPU spinning forever. As we only need to guard the list against concurrent closures and insertions, the hold is short and merits being treated as a simple spinlock.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200701084439.17025-1-chris@chris-wilson.co.uk
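A hedged sketch of the pattern: since insertions and closures are the only contended operations, the list can be detached under a short-hold spinlock and reaped outside it. The lock name follows the patch; the teardown body is elided:

    /* Sketch: shoot down all handle LUT entries for an object at close. */
    static void close_object_luts(struct drm_i915_gem_object *obj)
    {
            struct i915_lut_handle *lut, *ln;
            LIST_HEAD(close);

            spin_lock(&obj->lut_lock);
            list_splice_init(&obj->lut_list, &close); /* short hold, never sleeps */
            spin_unlock(&obj->lut_lock);

            list_for_each_entry_safe(lut, ln, &close, obj_link) {
                    /* drop the ctx handle -> vma translation; may sleep here */
            }
    }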
-
- 30 May 2020, 2 commits
-
-
Committed by Chris Wilson
Name the object classes and their offspring for easier lockdep debugging.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200529183204.16850-2-chris@chris-wilson.co.uk
-
Committed by Chris Wilson
If we declare that an object type is shrinkable (any that we can reclaim to recover system pages), make sure we taint the object mutex so that lockdep expects us to use it within fs_reclaim. lockdep will then complain the first time we try to allocate while holding the plain mutex, as doing so invites potential recursion.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200529183204.16850-1-chris@chris-wilson.co.uk
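For illustration, the priming pattern this relies on (the same shape as i915_gem_shrinker_taints_mutex()): acquire the mutex once inside a fake fs_reclaim section so lockdep learns that the lock nests under reclaim:

    #include <linux/mutex.h>
    #include <linux/sched/mm.h>

    /* Sketch: teach lockdep that @mutex may be taken from the shrinker. */
    static void shrinker_taints_mutex(struct mutex *mutex)
    {
            if (!IS_ENABLED(CONFIG_LOCKDEP))
                    return;

            fs_reclaim_acquire(GFP_KERNEL);
            mutex_acquire(&mutex->dep_map, 0, 0, _RET_IP_);
            mutex_release(&mutex->dep_map, _RET_IP_);
            fs_reclaim_release(GFP_KERNEL);
    }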
-
- 04 May 2020, 1 commit
-
-
Committed by Chris Wilson
We only need the device wakeref on freeing the objects if we have to unbind the object from the global GTT, or otherwise update device information. If the objects are clean, we never need the wakeref, so avoid taking it until required.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Reviewed-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200503171513.18704-1-chris@chris-wilson.co.uk
-
- 24 Apr 2020, 1 commit
-
-
Committed by Chris Wilson
The history of i915_vma_close() is confusing, as is its use. As the lifetime of the i915_vma is currently bounded by the object it is attached to, we needed a means of identifying when a vma was no longer in use by userspace (via the user's fd). This is further complicated by the fact that only ppgtt vma should be closed at the user's behest, as the ggtt vma were always shared. Now that we attach the vma to a lut on the user's context, the open count does indicate how many unique and open context/vm are referencing this vma from the user. As such, we can and should just use the open_count to track when the vma is still in use by userspace. It's a poor man's replacement for reference counting.

Closes: https://gitlab.freedesktop.org/drm/intel/issues/1193
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200422190558.30509-1-chris@chris-wilson.co.uk
-
- 02 Apr 2020, 1 commit
-
-
Committed by Chris Wilson
We cached the number of vma bound to the object in order to speed up shrinker decisions. This has been superseded by being more proactive in removing objects we cannot shrink from the shrinker lists, and so we can drop the clumsy attempt at atomically counting the bind count and comparing it to the number of pinned mappings of the object. This would only get clumsier with asynchronous binding and unbinding.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200401223924.16667-1-chris@chris-wilson.co.uk
-
- 02 Mar 2020, 1 commit
-
-
Committed by Chris Wilson
Call cond_resched() between each freed object in case we have a really, really long list, and we don't want to block normal processes.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200221100953.2587176-1-chris@chris-wilson.co.uk
(cherry picked from commit deeee411)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
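A minimal sketch of the fix, mirroring the shape of the object free worker (freed objects are reaped from an llist); the point is simply the cond_resched() between items:

    #include <linux/llist.h>
    #include <linux/sched.h>

    /* Sketch: reap a potentially very long list of dead objects politely. */
    static void free_objects(struct llist_node *freed)
    {
            struct drm_i915_gem_object *obj, *on;

            llist_for_each_entry_safe(obj, on, freed, freed) {
                    /* per-object teardown elided */
                    cond_resched(); /* yield so normal processes aren't starved */
            }
    }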
-
- 22 Feb 2020, 1 commit
-
-
Committed by Chris Wilson
Call cond_resched() between each freed object in case we have a really, really long list, and we don't want to block normal processes.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200221100953.2587176-1-chris@chris-wilson.co.uk
-
- 11 Feb 2020, 1 commit
-
-
Committed by Chris Wilson
Currently we create a new mmap_offset for every call to mmap_offset_ioctl. This exposes us to an abusive client that may simply create new mmap_offsets ad infinitum, which will exhaust physical memory and the virtual address space. In addition to the exhaustion, a very long linear list of mmap_offsets causes other clients using the object to incur long list walks; these long lists can also be generated by simply having many clients generate their own mmap_offset. However, we can simply use the drm_vma_node itself to manage the file association (allow/revoke), dropping our need to keep an mmo per-file. Then if we keep a small rbtree of per-type mmap_offsets, we can look up duplicate requests quickly.

Fixes: cc662126 ("drm/i915: Introduce DRM_I915_GEM_MMAP_OFFSET")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200120104924.4000706-3-chris@chris-wilson.co.uk
(cherry picked from commit 78655598)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
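A hedged sketch of the per-object rbtree lookup described above, following the shape of the patch (one i915_mmap_offset node per mmap type, searched under a per-object lock); the field names are assumptions:

    #include <linux/rbtree.h>

    /* Sketch: find an existing mmap offset of @type instead of minting one. */
    static struct i915_mmap_offset *
    lookup_mmo(struct drm_i915_gem_object *obj, enum i915_mmap_type type)
    {
            struct rb_node *rb = obj->mmo.offsets.rb_node; /* under obj->mmo.lock */

            while (rb) {
                    struct i915_mmap_offset *mmo =
                            rb_entry(rb, struct i915_mmap_offset, offset);

                    if (mmo->mmap_type == type)
                            return mmo; /* duplicate request: reuse this node */

                    if (type < mmo->mmap_type)
                            rb = rb->rb_left;
                    else
                            rb = rb->rb_right;
            }

            return NULL; /* caller allocates and inserts a new node */
    }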
-
- 20 Jan 2020, 1 commit
-
-
Committed by Chris Wilson
Currently we create a new mmap_offset for every call to mmap_offset_ioctl. This exposes us to an abusive client that may simply create new mmap_offsets ad infinitum, which will exhaust physical memory and the virtual address space. In addition to the exhaustion, a very long linear list of mmap_offsets causes other clients using the object to incur long list walks; these long lists can also be generated by simply having many clients generate their own mmap_offset. However, we can simply use the drm_vma_node itself to manage the file association (allow/revoke), dropping our need to keep an mmo per-file. Then if we keep a small rbtree of per-type mmap_offsets, we can look up duplicate requests quickly.

Fixes: cc662126 ("drm/i915: Introduce DRM_I915_GEM_MMAP_OFFSET")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200120104924.4000706-3-chris@chris-wilson.co.uk
-
- 23 Dec 2019, 2 commits
-
-
Committed by Chris Wilson
Start introducing a kref on i915_vma in order to protect the vma unbind (i915_gem_object_unbind) from a parallel destruction (i915_vma_parked). Later, we will use the refcount to manage all access and turn i915_vma into a first-class container.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Imre Deak <imre.deak@intel.com>
Acked-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191222210256.2066451-2-chris@chris-wilson.co.uk
-
Committed by Chris Wilson
Since obj->frontbuffer is no longer protected by the struct_mutex, as we are processing the execbuf, it may be removed. Mark the intel_frontbuffer as rcu protected, and so acquire a reference to the struct as we track activity upon it.

Closes: https://gitlab.freedesktop.org/drm/intel/issues/827
Fixes: 8e7cb179 ("drm/i915: Extract intel_frontbuffer active tracking")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: <stable@vger.kernel.org> # v5.4+
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191218104043.3539458-1-chris@chris-wilson.co.uk
(cherry picked from commit da42104f)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
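For illustration, the RCU lookup pattern the fix depends on (the shape of intel_frontbuffer_get()): resolve the pointer under rcu_read_lock() and keep it only if a reference can still be taken:

    #include <linux/kref.h>
    #include <linux/rcupdate.h>

    /* Sketch: safely take a reference on an RCU-protected frontbuffer. */
    static struct intel_frontbuffer *
    frontbuffer_get(struct drm_i915_gem_object *obj)
    {
            struct intel_frontbuffer *front;

            rcu_read_lock();
            front = rcu_dereference(obj->frontbuffer);
            if (front && !kref_get_unless_zero(&front->ref))
                    front = NULL; /* lost the race with the final put */
            rcu_read_unlock();

            return front;
    }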
-
- 18 Dec 2019, 1 commit
-
-
Committed by Chris Wilson
Since obj->frontbuffer is no longer protected by the struct_mutex, as we are processing the execbuf, it may be removed. Mark the intel_frontbuffer as rcu protected, and so acquire a reference to the struct as we track activity upon it.

Closes: https://gitlab.freedesktop.org/drm/intel/issues/827
Fixes: 8e7cb179 ("drm/i915: Extract intel_frontbuffer active tracking")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: <stable@vger.kernel.org> # v5.4+
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191218104043.3539458-1-chris@chris-wilson.co.uk
-
- 04 Dec 2019, 1 commit
-
-
Committed by Abdiel Janulgue
This is really just an alias of mmap_gtt. The 'mmap offset' nomenclature comes from the value returned by this ioctl, which is the offset into the device fd that userspace uses with mmap(2). mmap_gtt was our initial mmap_offset implementation; this extends our CPU mmap support to allow additional fault handlers that depend on the object's backing pages. Note that we multiplex mmap_gtt and mmap_offset through the same ioctl, and use the zero-extending behaviour of drm to differentiate between them, when we inspect the flags. To support multiple mmap types on an object we need to support multiple mmap_offsets for an object (each offset in the global device address space corresponding to a unique instance of the object for a file + mmap type). As we drop the simplified drm core idea of a single mmap_offset, we need to provide replacement hooks for the dumb mmap interface as well.

Link: https://gitlab.freedesktop.org/mesa/mesa/merge_requests/1675
Testcase: igt/gem_mmap_offset
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191204120032.3682839-1-chris@chris-wilson.co.uk
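A userspace sketch of the multiplexing described above, assuming the uapi as merged (struct drm_i915_gem_mmap_offset with a flags field selecting the fault-handler type); verify the names against i915_drm.h:

    #include <stdint.h>
    #include <sys/ioctl.h>
    #include <sys/mman.h>
    #include <drm/i915_drm.h>

    /* Sketch: map an object write-combined through its backing pages. */
    static void *gem_mmap_wc(int fd, uint32_t handle, size_t size)
    {
            struct drm_i915_gem_mmap_offset arg = {
                    .handle = handle,
                    .flags = I915_MMAP_OFFSET_WC, /* WB/UC/GTT pick other handlers */
            };

            if (ioctl(fd, DRM_IOCTL_I915_GEM_MMAP_OFFSET, &arg))
                    return MAP_FAILED;

            return mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED,
                        fd, arg.offset);
    }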
-
- 19 Nov 2019, 2 commits
-
-
Committed by Chris Wilson
Take the obj->vma.lock to prevent modifications to the list as we iterate, to avoid the dreaded NULL pointer.

<1>[ 347.820823] BUG: kernel NULL pointer dereference, address: 0000000000000150
<1>[ 347.820856] #PF: supervisor read access in kernel mode
<1>[ 347.820874] #PF: error_code(0x0000) - not-present page
<6>[ 347.820892] PGD 0 P4D 0
<4>[ 347.820908] Oops: 0000 [#1] PREEMPT SMP NOPTI
<4>[ 347.820926] CPU: 3 PID: 1303 Comm: gem_persistent_ Tainted: G U 5.4.0-rc7-CI-CI_DRM_7352+ #1
<4>[ 347.820956] Hardware name: /NUC6CAYB, BIOS AYAPLCEL.86A.0049.2018.0508.1356 05/08/2018
<4>[ 347.821132] RIP: 0010:i915_gem_object_flush_write_domain+0xd9/0x1d0 [i915]
<4>[ 347.821157] Code: 0f 84 e9 00 00 00 48 8b 80 e0 fd ff ff f6 c4 40 75 11 e9 ed 00 00 00 48 8b 80 e0 fd ff ff f6 c4 40 74 26 48 8b 83 b0 00 00 00 <48> 8b b8 50 01 00 00 e8 fb 20 fb ff 48 8b 83 30 03 00 00 49 39 c4
<4>[ 347.821210] RSP: 0018:ffffc90000a1f8f8 EFLAGS: 00010202
<4>[ 347.821229] RAX: 0000000000000000 RBX: ffffc900008479a0 RCX: 0000000000000018
<4>[ 347.821252] RDX: 0000000000000000 RSI: 000000000000000d RDI: ffff888275a090b0
<4>[ 347.821274] RBP: ffff8882673c8040 R08: ffff88825991b8d0 R09: 0000000000000000
<4>[ 347.821297] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8882673c8280
<4>[ 347.821319] R13: ffff8882673c8368 R14: 0000000000000000 R15: ffff888266a54000
<4>[ 347.821343] FS: 00007f75865f4240(0000) GS:ffff888277b80000(0000) knlGS:0000000000000000
<4>[ 347.821368] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[ 347.821389] CR2: 0000000000000150 CR3: 000000025aee0000 CR4: 00000000003406e0
<4>[ 347.821411] Call Trace:
<4>[ 347.821555] i915_gem_object_prepare_read+0xea/0x2a0 [i915]
<4>[ 347.821706] intel_engine_cmd_parser+0x5ce/0xe90 [i915]
<4>[ 347.821834] ? __i915_sw_fence_complete+0x1a0/0x250 [i915]
<4>[ 347.821990] i915_gem_do_execbuffer+0xb4c/0x2550 [i915]

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191119100929.2628356-8-chris@chris-wilson.co.uk
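A hedged sketch of the fix, simplified from the flush path in the oops above: hold obj->vma.lock across the GGTT vma walk so a concurrent unbind cannot hand the iterator a stale element:

    /* Sketch: walk the object's GGTT vma under the list lock. */
    static void flush_ggtt_writes(struct drm_i915_gem_object *obj)
    {
            struct i915_vma *vma;

            spin_lock(&obj->vma.lock);
            for_each_ggtt_vma(vma, obj)
                    i915_vma_unset_ggtt_write(vma); /* cheap, spinlock-safe */
            spin_unlock(&obj->vma.lock);
    }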
-
Committed by Chris Wilson
We only need the one loop to find the dirty vma, flush them, and flush the chipset.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191119100929.2628356-6-chris@chris-wilson.co.uk
-
- 07 Nov 2019, 1 commit
-
-
Committed by Daniel Vetter
The trouble with having a plain nesting flag for locks which do not naturally nest (unlike block devices and their partitions, which is the original motivation for nesting levels) is that lockdep will never spot a true deadlock if you screw up. This patch is an attempt at trying better, by highlighting a bit more of the actual nature of the nesting that's going on. Essentially we have two kinds of objects:

- objects without pages allocated, which cannot be on any lru and are hence inaccessible to the shrinker.
- objects which have pages allocated, which are on an lru, and which the shrinker can decide to throw out.

For the former type of object, memory allocations while holding obj->mm.lock are permissible. For the latter they are not. And get/put_pages transitions between the two types of objects. This is still not entirely fool-proof since the rules might change. But as long as we ever run such code at runtime, lockdep should be able to observe the inconsistency and complain (like with any other lockdep class that we've split up into multiple classes). But there are a few clear benefits:

- We can drop the nesting flag parameter from __i915_gem_object_put_pages, because that function by definition is never going to allocate memory, and calling it on an object which doesn't have its pages allocated would be a bug.
- We strictly catch more bugs, since there's not only one place in the entire tree which is annotated with the special class. All the other places that had explicit lockdep nesting annotations we're now going to leave up to lockdep again.
- Specifically this catches stuff like calling get_pages from put_pages (which isn't really a good idea, if we can call get_pages so could the shrinker). I've seen patches do exactly that.

Of course I fully expect CI will show me for the fool I am with this one here :-)

v2: There can only be one (lockdep only has a cache for the first subclass, not for deeper ones, and we don't want to make these locks even slower). Still separate enums for better documentation. Real fix: don't forget about phys objs and pin_map(), and fix the shrinker to have the right annotations ... silly me.

v3: Forgot userptr too ...

v4: Improve comment for pages_pin_count, drop the IMPORTANT comment and instead prime lockdep (Chris).

v5: Appease checkpatch, no double empty lines (Chris)

v6: More rebasing over selftest changes. Also somehow I forgot to push this patch :-/ Also format comments consistently while at it.

v7: Fix typo in commit message (Joonas)
Also drop the priming; with the lmem merge we now have allocations while holding the lmem lock, which wrecks the generic priming I've done in earlier patches. Should probably be resurrected when lmem is fixed. See

  commit 232a6eba
  Author: Matthew Auld <matthew.auld@intel.com>
  Date:   Tue Oct 8 17:01:14 2019 +0100

      drm/i915: introduce intel_memory_region

I'm keeping the priming patch locally so it won't get lost.

Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: "Tang, CQ" <cq.tang@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v5)
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (v6)
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191105090148.30269-1-daniel.vetter@ffwll.ch
[mlankhorst: Fix commit typos pointed out by Michael Ruhl]
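As a sketch of the two-class idea (key names invented): shrinkable objects get an obj->mm.lock class that lockdep has seen inside fs_reclaim, while purely internal objects get one under which allocation is allowed:

    #include <linux/mutex.h>

    static struct lock_class_key lock_class_shrinkable; /* pages on an lru */
    static struct lock_class_key lock_class_internal;   /* never shrinker-visible */

    /* Sketch: pick the lockdep class by whether the shrinker can see us. */
    static void init_mm_lock(struct mutex *lock, bool shrinkable)
    {
            if (shrinkable)
                    __mutex_init(lock, "obj->mm.lock(shrinkable)",
                                 &lock_class_shrinkable);
            else
                    __mutex_init(lock, "obj->mm.lock(internal)",
                                 &lock_class_internal);
    }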
-
- 22 Oct 2019, 1 commit
-
-
Committed by Chris Wilson
Separate each object class into a separate lock type to avoid lockdep cross-contamination between paths (i.e. userptr!).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191022144501.26486-1-chris@chris-wilson.co.uk
-
- 04 Oct 2019, 1 commit
-
-
Committed by Chris Wilson
Replace the struct_mutex requirement for pinning the i915_vma with the local vm->mutex instead. Note that the vm->mutex is tainted by the shrinker (we require unbinding from inside fs-reclaim) and so we cannot allocate while holding that mutex. Instead we have to preallocate workers to allocate and apply the PTE updates after we have reserved their slot in the drm_mm (using fences to order the PTE writes with the GPU work and with later unbind). In adding the asynchronous vma binding, one subtle requirement is to avoid coupling the binding fence into the backing object->resv. That is, the asynchronous binding only applies to the vma timeline itself and not to the pages, as that is a more global timeline (the binding of one vma does not need to be ordered with another vma, nor does the implicit GEM fencing depend on a vma, only on writes to the backing store). Keeping the vma binding distinct from the backing store timelines is verified by a number of async gem_exec_fence and gem_exec_schedule tests. The way we do this is quite simple: we keep the fence for the vma binding separate and only wait on it as required, and never add it to the obj->resv itself. Another consequence of reducing the locking around the vma is that the destruction of the vma is no longer globally serialised by struct_mutex. A natural solution would be to add a kref to i915_vma, but that requires decoupling the reference cycles, possibly by introducing a new i915_mm_pages object that is owned by both obj->mm and vma->pages. However, we have not taken that route due to the overshadowing lmem/ttm discussions, and instead play a series of complicated games with trylocks to (hopefully) ensure that only one destruction path is called!

v2: Add some commentary, and some helpers to reduce patch churn.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-4-chris@chris-wilson.co.uk
-
- 11 Sep 2019, 1 commit
-
-
Committed by Chris Wilson
In preparation for reducing the struct_mutex stranglehold around the vm, make the vma.flags atomic so that we can acquire a pin on the vma atomically before deciding if we need to take the mutex.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190911090243.16786-1-chris@chris-wilson.co.uk
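A hedged sketch of the fast path this enables (the constant is invented): bump the pin count with a cmpxchg loop while the vma is already bound, and fall back to the vm mutex only when a new binding is actually needed:

    #include <linux/atomic.h>

    #define SKETCH_PIN_INCREMENT 1 /* stands in for the real pin-count LSB */

    /* Sketch: atomically pin an already-bound vma without the vm mutex. */
    static bool try_pin_fast(atomic_t *flags, int bound_mask)
    {
            int old = atomic_read(flags);

            do {
                    if (!(old & bound_mask))
                            return false; /* unbound: take the mutex path */
            } while (!atomic_try_cmpxchg(flags, &old,
                                         old + SKETCH_PIN_INCREMENT));

            return true;
    }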
-