- 28 5月, 2019 2 次提交
-
-
由 Chris Wilson 提交于
Use the per-object local lock to control the cache domain of the individual GEM objects, not struct_mutex. This is a huge leap forward for us in terms of object-level synchronisation; execbuffers are coordinated using the ww_mutex and pread/pwrite is finally fully serialised again. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMatthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-10-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Out scatterlist utility routines can be pulled out of i915_gem.h for a bit more decluttering. v2: Push I915_GTT_PAGE_SIZE out of i915_scatterlist itself and into the caller. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMatthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-9-chris@chris-wilson.co.uk
-
- 27 5月, 2019 1 次提交
-
-
由 Colin Ian King 提交于
Currently when the allocation of ppgtt->work fails the error return path via err_free returns an uninitialized value in err. Fix this by setting err to the appropriate error return of -ENOMEM. Addresses-Coverity: ("Uninitialized scalar variable") Fixes: d3622099 ("drm/i915/gtt: Always acquire struct_mutex for gen6_ppgtt_cleanup") Signed-off-by: NColin Ian King <colin.king@canonical.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190524212627.24256-1-colin.king@canonical.com
-
- 24 5月, 2019 2 次提交
-
-
由 Chris Wilson 提交于
Having deferred the vma destruction to a worker where we can acquire the struct_mutex, we have to avoid chasing back into the now destroyed ppgtt. The pd_vma is special in having a custom unbind function to scan for unused pages despite the VMA itself being notionally part of the GGTT. As such, we need to disable that callback to avoid a use-after-free. This unfortunately blew up so early during boot that CI declared the machine unreachable as opposed to being the major failure it was. Oops. Fixes: d3622099 ("drm/i915/gtt: Always acquire struct_mutex for gen6_ppgtt_cleanup") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Tomi Sarvela <tomi.p.sarvela@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524064529.20514-1-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
We rearranged the vm_destroy_ioctl to avoid taking struct_mutex, little realising that buried underneath the gen6 ppgtt release path was a struct_mutex requirement (to remove its GGTT vma). Until that struct_mutex is vanquished, take a detour in gen6_ppgtt_cleanup to do the i915_vma_destroy from inside a worker under the struct_mutex. <4> [257.740160] WARN_ON(debug_locks && !lock_is_held(&(&vma->vm->i915->drm.struct_mutex)->dep_map)) <4> [257.740213] WARNING: CPU: 3 PID: 1507 at drivers/gpu/drm/i915/i915_vma.c:841 i915_vma_destroy+0x1ae/0x3a0 [i915] <4> [257.740214] Modules linked in: snd_hda_codec_hdmi i915 x86_pkg_temp_thermal mei_hdcp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core r8169 realtek snd_pcm mei_me mei prime_numbers lpc_ich <4> [257.740224] CPU: 3 PID: 1507 Comm: gem_vm_create Tainted: G U 5.2.0-rc1-CI-CI_DRM_6118+ #1 <4> [257.740225] Hardware name: MSI MS-7924/Z97M-G43(MS-7924), BIOS V1.12 02/15/2016 <4> [257.740249] RIP: 0010:i915_vma_destroy+0x1ae/0x3a0 [i915] <4> [257.740250] Code: 00 00 00 48 81 c7 c8 00 00 00 e8 ed 08 f0 e0 85 c0 0f 85 78 fe ff ff 48 c7 c6 e8 ec 30 a0 48 c7 c7 da 55 33 a0 e8 42 8c e9 e0 <0f> 0b 8b 83 40 01 00 00 85 c0 0f 84 63 fe ff ff 48 c7 c1 c1 58 33 <4> [257.740251] RSP: 0018:ffffc90000aafc68 EFLAGS: 00010282 <4> [257.740252] RAX: 0000000000000000 RBX: ffff8883f7957840 RCX: 0000000000000003 <4> [257.740253] RDX: 0000000000000046 RSI: 0000000000000006 RDI: ffffffff8212d1b9 <4> [257.740254] RBP: ffffc90000aafcc8 R08: 0000000000000000 R09: 0000000000000000 <4> [257.740255] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8883f4d5c2a8 <4> [257.740256] R13: ffff8883f4d5d680 R14: ffff8883f4d5c668 R15: ffff8883f4d5c2f0 <4> [257.740257] FS: 00007f777fa8fe40(0000) GS:ffff88840f780000(0000) knlGS:0000000000000000 <4> [257.740258] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 <4> [257.740259] CR2: 00007f777f6522b0 CR3: 00000003c612a006 CR4: 00000000001606e0 <4> [257.740260] Call Trace: <4> [257.740283] gen6_ppgtt_cleanup+0x25/0x60 [i915] <4> [257.740306] i915_ppgtt_release+0x102/0x290 [i915] <4> [257.740330] i915_gem_vm_destroy_ioctl+0x7c/0xa0 [i915] <4> [257.740376] ? i915_gem_vm_create_ioctl+0x160/0x160 [i915] <4> [257.740379] drm_ioctl_kernel+0x83/0xf0 <4> [257.740382] drm_ioctl+0x2f3/0x3b0 <4> [257.740422] ? i915_gem_vm_create_ioctl+0x160/0x160 [i915] <4> [257.740426] ? _raw_spin_unlock_irqrestore+0x39/0x60 <4> [257.740430] do_vfs_ioctl+0xa0/0x6e0 <4> [257.740433] ? lock_acquire+0xa6/0x1c0 <4> [257.740436] ? __task_pid_nr_ns+0xb9/0x1f0 <4> [257.740439] ksys_ioctl+0x35/0x60 <4> [257.740441] __x64_sys_ioctl+0x11/0x20 <4> [257.740443] do_syscall_64+0x55/0x1c0 <4> [257.740445] entry_SYSCALL_64_after_hwframe+0x49/0xbe References: e0695db7 ("drm/i915: Create/destroy VM (ppGTT) for use with contexts") Fixes: 7f3f317a ("drm/i915: Restore control over ppgtt for context creation ABI") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190523064933.23604-1-chris@chris-wilson.co.uk
-
- 20 5月, 2019 1 次提交
-
-
由 Ville Syrjälä 提交于
To overcome display engine stride limits we'll want to remap the pages in the GTT. To that end we need a new gtt_view type which is just like the "rotated" type except not rotated. v2: Use intel_remapped_plane_info base type s/unused/unused_mbz/ (Chris) Separate BUILD_BUG_ON()s (Chris) Use I915_GTT_PAGE_SIZE (Chris) v3: Use i915_gem_object_get_dma_address() (Chris) Trim the sg (Tvrtko) v4: Actually trim this time. Limit the max length to one row of pages to keep things simple Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190509122159.24376-2-ville.syrjala@linux.intel.comReviewed-by: NMaarten Lankhorst <maarten.lankhorst@linux.intel.com>
-
- 25 4月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
Start partitioning off the code that talks to the hardware (GT) from the uapi layers and move the device facing code under gt/ One casualty is s/intel_ringbuffer.h/intel_engine.h/ with the plan to subdivide that header and body further (and split out the submission code from the ringbuffer and logical context handling). This patch aims to be simple motion so git can fixup inflight patches with little mess. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Acked-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Acked-by: NJani Nikula <jani.nikula@intel.com> Acked-by: NRodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190424174839.7141-1-chris@chris-wilson.co.uk
-
- 20 4月, 2019 2 次提交
-
-
由 Fernando Pacheco 提交于
GuC and HuC depend on struct_mutex for device reinitialization. Moving away from this dependency requires perma-pinning the firmware images in GGTT. The upper portion of the GuC address space has a sizeable hole (several MB) that is inaccessible by GuC. Reserve this range within GGTT as it can comfortably hold GuC/HuC firmware images. v2: Reserve node rather than insert (Chris) Simpler determination of node start/size (Daniele) Move reserve/release out to intel_guc.* files v3: Reserve starting at GUC_GGTT_TOP only and bail if this fails (Chris) Signed-off-by: NFernando Pacheco <fernando.pacheco@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190419230015.18121-3-fernando.pacheco@intel.com
-
由 Chris Wilson 提交于
If we know that the user cannot access the GGTT, by virtue of having a segregated memory area, we can skip clearing the unused entries as they cannot be accessed. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Matthew Auld <matthew.william.auld@gmail.com> Reviewed-by: NMatthew Auld <matthew.william.auld@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190419201207.5477-1-chris@chris-wilson.co.uk
-
- 12 4月, 2019 1 次提交
-
-
由 Mika Kuoppala 提交于
On gen11 writing to read only ppgtt page causes a gpu hang. This behaviour is different than with previous gen where read only ppgtt access is supported. On those, the write is just dropped without visible side effects. Disable ro ppgtt support on gen11 until a solution can be found to bring it into line with its predecessors. References: HSDES#1807136187 References: https://bugzilla.freedesktop.org/show_bug.cgi?id=108569 Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Acked-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190411083034.28311-1-mika.kuoppala@linux.intel.com
-
- 22 3月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
In preparation to making the ppGTT binding for a context explicit (to facilitate reusing the same ppGTT between different contexts), allow the user to create and destroy named ppGTT. v2: Replace global barrier for swapping over the ppgtt and tlbs with a local context barrier (Tvrtko) v3: serialise with struct_mutex; it's lazy but required dammit v4: Rewrite igt_ctx_shared_exec to be more different (aimed to be more similarly, turned out different!) v5: Fix up test unwind for aliasing-ppgtt (snb) v6: Tighten language for uapi struct drm_i915_gem_vm_control. v7: Patch the context image for runtime ppgtt switching! Testcase: igt/gem_vm_create Testcase: igt/gem_ctx_param/vm Testcase: igt/gem_ctx_clone/vm Testcase: igt/gem_ctx_shared Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190322092325.5883-2-chris@chris-wilson.co.uk
-
- 21 3月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
In later patches, it became apparent that userspace can see a partially constructed GEM context and begin using it before it was ready, to much hilarity. Close this window of opportunity by lifting the registration of the context with userspace (the insertion of the context into the filp's idr) to the very end of the CONTEXT_CREATE ioctl. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190321140711.11190-1-chris@chris-wilson.co.uk
-
- 15 3月, 2019 3 次提交
-
-
由 Chris Wilson 提交于
The basic setup of the i915_hw_ppgtt is the same between gen6 and gen8, so refactor that into a common routine. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Bob Paauwe <bob.j.paauwe@intel.com> Cc: Matthew Auld <matthew.william.auld@gmail.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: NRodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190314223839.28258-5-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Large ppGTT are differentiated by the requirement to go to four levels to address more than 32b. Given the introduction of more 4 level ppGTT with different sizes of addressable bits, rename i915_vm_is_48b() to better reflect the commonality of using 4 levels. Based on a patch by Bob Paauwe. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Bob Paauwe <bob.j.paauwe@intel.com> Cc: Matthew Auld <matthew.william.auld@gmail.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: NRodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190314223839.28258-4-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
As the maximum addressable bits is determined by platform, record that information in our static chipset tables. This has the advantage of being clearly recorded in our capability dumps for dmesg, debugfs and error states. Based on a patch by Bob Paauwe. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Bob Paauwe <bob.j.paauwe@intel.com> Cc: Matthew Auld <matthew.william.auld@gmail.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: NRodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190314223839.28258-2-chris@chris-wilson.co.uk
-
- 06 3月, 2019 2 次提交
-
-
由 Chris Wilson 提交于
Small simplification to set all bits in the dirty mask rather than lookup the exact mask of populated engines. The bits for the engines that do not exist are unused and so can safely set and then ignored. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190305180332.30900-2-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
In the next patch, we are introducing a broad virtual engine to encompass multiple physical engines, losing the 1:1 nature of BIT(engine->id). To reflect the broader set of engines implied by the virtual instance, lets store the full bitmask. v2: Use intel_engine_mask_t (s/ring_mask/engine_mask/) v3: Tvrtko voted for moah churn so teach everyone to not mention ring and use $class$instance throughout. v4: Comment upon the disparity in bspec for using VCS1,VCS2 in gen8 and VCS[0-4] in later gen. We opt to keep the code consistent and use 0-index naming throughout. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190305180332.30900-1-chris@chris-wilson.co.uk
-
- 05 3月, 2019 2 次提交
-
-
由 Chris Wilson 提交于
As the scratch page is the only one to be allocated with variable size, rather than keep an unused slot in all i915_page_table structs, store it alongside the vm->scratch_page. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: NMatthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190305135430.4948-1-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Replace the open-coded memset loops with the memset32/64 routines that reduce to a single instruction or two: add/remove: 0/0 grow/shrink: 0/2 up/down: 0/-83 (-83) Function old new delta gen6_ppgtt_clear_range 371 344 -27 gen8_ppgtt_clear_pd 575 519 -56 Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: NMatthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190304230646.23714-1-chris@chris-wilson.co.uk
-
- 28 2月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
As our allocations are not device specific, we can move our slab caches to a global scope. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190228102035.5857-2-chris@chris-wilson.co.uk
-
- 22 2月, 2019 1 次提交
-
-
由 Chengguang Xu 提交于
unlikely has already included in IS_ERR(), so just remove redundant likely/unlikely annotation. Signed-off-by: NChengguang Xu <cgxu519@gmx.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190221020819.21832-1-cgxu519@gmx.com
-
- 06 2月, 2019 2 次提交
-
-
由 Chris Wilson 提交于
Looking forward, we need to break the struct_mutex dependency on i915_gem_active. In the meantime, external use of i915_gem_active is quite beguiling, little do new users suspect that it implies a barrier as each request it tracks must be ordered wrt the previous one. As one of many, it can be used to track activity across multiple timelines, a shared fence, which fits our unordered request submission much better. We need to steer external users away from the singular, exclusive fence imposed by i915_gem_active to i915_active instead. As part of that process, we move i915_gem_active out of i915_request.c into i915_active.c to start separating the two concepts, and rename it to i915_active_request (both to tie it to the concept of tracking just one request, and to give it a longer, less appealing name). Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190205130005.2807-5-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
We currently track GPU memory usage inside VMA, such that we never release memory used by the GPU until after it has finished accessing it. However, we may want to track other resources aside from VMA, or we may want to split a VMA into multiple independent regions and track each separately. For this purpose, generalise our request tracking (akin to struct reservation_object) so that we can embed it into other objects. v2: Tweak error handling during selftest setup. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190205130005.2807-2-chris@chris-wilson.co.uk
-
- 29 1月, 2019 2 次提交
-
-
由 Chris Wilson 提交于
A starting point to counter the pervasive struct_mutex. For the goal of avoiding (or at least blocking under them!) global locks during user request submission, a simple but important step is being able to manage each clients GTT separately. For which, we want to replace using the struct_mutex as the guard for all things GTT/VM and switch instead to a specific mutex inside i915_address_space. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190128102356.15037-2-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Our goal is to remove struct_mutex and replace it with fine grained locking. One of the thorny issues is our eviction logic for reclaiming space for an execbuffer (or GTT mmaping, among a few other examples). While eviction itself is easy to move under a per-VM mutex, performing the activity tracking is less agreeable. One solution is not to do any MRU tracking and do a simple coarse evaluation during eviction of active/inactive, with a loose temporal ordering of last insertion/evaluation. That keeps all the locking constrained to when we are manipulating the VM itself, neatly avoiding the tricky handling of possible recursive locking during execbuf and elsewhere. Note that discarding the MRU (currently implemented as a pair of lists, to avoid scanning the active list for a NONBLOCKING search) is unlikely to impact upon our efficiency to reclaim VM space (where we think a LRU model is best) as our current strategy is to use random idle replacement first before doing a search, and over time the use of softpinned 48b per-ppGTT is growing (thereby eliminating any need to perform any eviction searches, in theory at least) with the remaining users being found on much older devices (gen2-gen6). v2: Changelog and commentary rewritten to elaborate on the duality of a single list being both an inactive and active list. v3: Consolidate bool parameters into a single set of flags; don't comment on the duality of a single variable being a multiplicity of bits. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190128102356.15037-1-chris@chris-wilson.co.uk
-
- 17 1月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
Currently the code to reset the GPU and our state is spread widely across a few files. Pull the logic together into a common file. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Acked-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190116153304.787-1-chris@chris-wilson.co.uk
-
- 15 1月, 2019 5 次提交
-
-
由 Chris Wilson 提交于
On Braswell, under heavy stress, if we update the GGTT while simultaneously accessing another region inside the GTT, we are returned the wrong values. To prevent this we stop the machine to update the GGTT entries so that no memory traffic can occur at the same time. This was first spotted in commit 5bab6f60 Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Fri Oct 23 18:43:32 2015 +0100 drm/i915: Serialise updates to GGTT with access through GGTT on Braswell but removed again in forlorn hope with commit 4509276e Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Mon Feb 20 12:47:18 2017 +0000 drm/i915: Remove Braswell GGTT update w/a However, gem_concurrent_blit is once again only stable with the patch applied and CI is detecting the odd failure in forked gem_mmap_gtt tests (which smell like the same issue). Fwiw, a wide variety of CPU memory barriers (around GGTT flushing, fence updates, PTE updates) and GPU flushes/invalidates (between requests, after PTE updates) were tried as part of the investigation to find an alternate cause, nothing comes close to serialised GGTT updates. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105591 Testcase: igt/gem_concurrent_blit Testcase: igt/gem_mmap_gtt/*forked* References: 5bab6f60 ("drm/i915: Serialise updates to GGTT with access through GGTT on Braswell") References: 4509276e ("drm/i915: Remove Braswell GGTT update w/a") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190114211729.30352-1-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
We have two classes of VM, global GTT and per-process GTT. In order to allow ourselves the freedom to mix both along call chains, distinguish the two classes with regards to their mutex and lockdep maps. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190114215956.32266-1-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Frequently, we use intel_runtime_pm_get/_put around a small block. Formalise that usage by providing a macro to define such a block with an automatic closure to scope the intel_runtime_pm wakeref to that block, i.e. macro abuse smelling of python. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Jani Nikula <jani.nikula@intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190114142129.24398-15-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Keep track of the temporary rpm wakerefs used for user access to the device, so that we can cancel them upon release and clearly identify any leaks. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Jani Nikula <jani.nikula@intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190114142129.24398-10-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
The majority of runtime-pm operations are bounded and scoped within a function; these are easy to verify that the wakeref are handled correctly. We can employ the compiler to help us, and reduce the number of wakerefs tracked when debugging, by passing around cookies provided by the various rpm_get functions to their rpm_put counterpart. This makes the pairing explicit, and given the required wakeref cookie the compiler can verify that we pass an initialised value to the rpm_put (quite handy for double checking error paths). For regular builds, the compiler should be able to eliminate the unused local variables and the program growth should be minimal. Fwiw, it came out as a net improvement as gcc was able to refactor rpm_get and rpm_get_if_in_use together, v2: Just s/rpm_put/rpm_put_unchecked/ everywhere, leaving the manual mark up for smaller more targeted patches. v3: Mention the cookie in Returns Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190114142129.24398-2-chris@chris-wilson.co.uk
-
- 10 1月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
If we fail to pin the ggtt vma slot for the ppgtt page tables, we need to unwind the locals before reporting the error. Or else on subsequent attempts to bind the page tables into the ggtt, we will already believe that the vma has been pinned and continue on blithely. If something else should happen to be at that location, choas ensues. Fixes: a2bbf714 ("drm/i915/gtt: Only keep gen6 page directories pinned while active") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Matthew Auld <matthew.william.auld@gmail.com> Cc: <stable@vger.kernel.org> # v4.19+ Reviewed-by: NMatthew Auld <matthew.william.auld@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181222030623.21710-1-chris@chris-wilson.co.uk (cherry picked from commit d4de7535) Signed-off-by: NJani Nikula <jani.nikula@intel.com>
-
- 09 1月, 2019 1 次提交
-
-
由 Jani Nikula 提交于
Needs just a few additional includes here and there. Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Acked-by: NDaniel Vetter <daniel.vetter@ffwll.ch> Acked-by: NSam Ravnborg <sam@ravnborg.org> Signed-off-by: NJani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190108082709.3748-1-jani.nikula@intel.com
-
- 08 1月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
Ignore trying to shrink from i915 if we fail to acquire the struct_mutex in the shrinker while performing direct-reclaim. The trade-off being (much) lower latency for non-i915 clients at an increased risk of being unable to obtain a page from direct-reclaim without hitting the oom-notifier. The proviso being that we still keep trying to hard obtain the lock for kswapd so that we can reap under heavy memory pressure. v2: Taint all mutexes taken within the shrinker with the struct_mutex subclass as an early warning system, and drop I915_SHRINK_ACTIVE from vmap to reduce the number of dangerous paths. We also have to drop I915_SHRINK_ACTIVE from oom-notifier to be able to make the same claim that ACTIVE is only used from outside context, which fits in with a longer strategy of avoiding stalls due to scanning active during shrinking. The danger in using the subclass struct_mutex is that we declare ourselves more knowledgable than lockdep and deprive ourselves of automatic coverage. Instead, we require ourselves to mark up any mutex taken inside the shrinker in order to detect lock-inversion, and if we miss any we are doomed to a deadlock at the worst possible moment. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190107115509.12523-1-chris@chris-wilson.co.uk
-
- 27 12月, 2018 1 次提交
-
-
由 Chris Wilson 提交于
The information presented here is not relevant to current development. We can either use the context information, but more often we want to inspect the active gpu state. The ulterior motive is to eradicate dev->filelist. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181227121549.29139-1-chris@chris-wilson.co.uk
-
- 22 12月, 2018 1 次提交
-
-
由 Chris Wilson 提交于
If we fail to pin the ggtt vma slot for the ppgtt page tables, we need to unwind the locals before reporting the error. Or else on subsequent attempts to bind the page tables into the ggtt, we will already believe that the vma has been pinned and continue on blithely. If something else should happen to be at that location, choas ensues. Fixes: a2bbf714 ("drm/i915/gtt: Only keep gen6 page directories pinned while active") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Matthew Auld <matthew.william.auld@gmail.com> Cc: <stable@vger.kernel.org> # v4.19+ Reviewed-by: NMatthew Auld <matthew.william.auld@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181222030623.21710-1-chris@chris-wilson.co.uk
-
- 13 12月, 2018 1 次提交
-
-
由 Lucas De Marchi 提交于
Define IS_GEN() similarly to our IS_GEN_RANGE(). but use gen instead of gen_mask to do the comparison. Now callers can pass then gen as a parameter, so we don't require one macro for each gen. The following spatch was used to convert the users of these macros: @@ expression e; @@ ( - IS_GEN2(e) + IS_GEN(e, 2) | - IS_GEN3(e) + IS_GEN(e, 3) | - IS_GEN4(e) + IS_GEN(e, 4) | - IS_GEN5(e) + IS_GEN(e, 5) | - IS_GEN6(e) + IS_GEN(e, 6) | - IS_GEN7(e) + IS_GEN(e, 7) | - IS_GEN8(e) + IS_GEN(e, 8) | - IS_GEN9(e) + IS_GEN(e, 9) | - IS_GEN10(e) + IS_GEN(e, 10) | - IS_GEN11(e) + IS_GEN(e, 11) ) v2: use IS_GEN rather than GT_GEN and compare to info.gen rather than using the bitmask Signed-off-by: NLucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: NJani Nikula <jani.nikula@intel.com> Signed-off-by: NRodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181212181044.15886-2-lucas.demarchi@intel.com
-
- 07 12月, 2018 1 次提交
-
-
由 Chris Wilson 提交于
Currently we face a severe problem on Braswell that manifests as invalid ppGTT accesses. The code tries to maintain the PDP (page directory pointers) inside the context in two ways, direct write into the context and a pipelined LRI update. The direct write into the context is fundamentally racy as it is unserialised with any access (read or write) the GPU is doing. By asserting that Braswell is not used with vGPU (currently an unsupported platform) we can eliminate the dangerous direct write into the context image and solely use the pipelined update. However, the LRI of the PDP fouls up the GPU, causing it to freeze and take out the machine with "forcewake ack timeouts". This seems possible to workaround by preventing the GPU from sleeping (via means of disabling the power-state management interface, i.e. forcing each ring to remain awake) around the update. Equally, it seems an EMIT_INVALIDATE before the LRI is sufficient to prevent the forcewake errors. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108656 References: https://bugs.freedesktop.org/show_bug.cgi?id=108714Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181207090213.14352-3-chris@chris-wilson.co.uk
-
- 20 11月, 2018 2 次提交
-
-
由 Chris Wilson 提交于
Since capturing the error state requires fiddling around with the GGTT to read arbitrary buffers and is itself run under stop_machine(), it deadlocks the machine (effectively a hard hang) when run in conjunction with Broxton's VTd workaround to serialize GGTT access. v2: Store the ERR_PTR in first_error so that the error can be reported to the user via sysfs. v3: Mention the quirk in dmesg (using info as per usual) Fixes: 0ef34ad6 ("drm/i915: Serialize GTT/Aperture accesses on BXT") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Jon Bloomfield <jon.bloomfield@intel.com> Cc: John Harrison <john.C.Harrison@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181102161232.17742-5-chris@chris-wilson.co.uk (cherry picked from commit fb6f0b64) Signed-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
由 Chris Wilson 提交于
Since capturing the error state requires fiddling around with the GGTT to read arbitrary buffers and is itself run under stop_machine(), it deadlocks the machine (effectively a hard hang) when run in conjunction with Broxton's VTd workaround to serialize GGTT access. v2: Store the ERR_PTR in first_error so that the error can be reported to the user via sysfs. v3: Mention the quirk in dmesg (using info as per usual) Fixes: 0ef34ad6 ("drm/i915: Serialize GTT/Aperture accesses on BXT") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Jon Bloomfield <jon.bloomfield@intel.com> Cc: John Harrison <john.C.Harrison@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181102161232.17742-5-chris@chris-wilson.co.uk
-