- 03 8月, 2016 3 次提交
-
-
由 Chris Wilson 提交于
Now that emitting requests is identical between legacy and execlists, we can use the same function to build up the ring for submitting to either engine. (With the exception of i915_switch_contexts(), but in time that will also be handled gracefully.) Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-30-git-send-email-chris@chris-wilson.co.uk Link: http://patchwork.freedesktop.org/patch/msgid/1470174640-18242-21-git-send-email-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
If is simpler and leads to more readable code through the callstack if the allocation returns the allocated struct through the return value. The importance of this is that it no longer looks like we accidentally allocate requests as side-effect of calling certain functions. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-19-git-send-email-chris@chris-wilson.co.ukReviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470174640-18242-9-git-send-email-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
The state stored in this struct is not only the information about the buffer object, but the ring used to communicate with the hardware. Using buffer here is overly specific and, for me at least, conflates with the notion of buffer objects themselves. s/struct intel_ringbuffer/struct intel_ring/ s/enum intel_ring_hangcheck/enum intel_engine_hangcheck/ s/describe_ctx_ringbuf()/describe_ctx_ring()/ s/intel_ring_get_active_head()/intel_engine_get_active_head()/ s/intel_ring_sync_index()/intel_engine_sync_index()/ s/intel_ring_init_seqno()/intel_engine_init_seqno()/ s/ring_stuck()/engine_stuck()/ s/intel_cleanup_engine()/intel_engine_cleanup()/ s/intel_stop_engine()/intel_engine_stop()/ s/intel_pin_and_map_ringbuffer_obj()/intel_pin_and_map_ring()/ s/intel_unpin_ringbuffer()/intel_unpin_ring()/ s/intel_engine_create_ringbuffer()/intel_engine_create_ring()/ s/intel_ring_flush_all_caches()/intel_engine_flush_all_caches()/ s/intel_ring_invalidate_all_caches()/intel_engine_invalidate_all_caches()/ s/intel_ringbuffer_free()/intel_ring_free()/ Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-15-git-send-email-chris@chris-wilson.co.uk Link: http://patchwork.freedesktop.org/patch/msgid/1470174640-18242-4-git-send-email-chris@chris-wilson.co.uk
-
- 27 7月, 2016 2 次提交
-
-
由 Chris Wilson 提交于
A few places we use ring when referring to the struct intel_engine_cs. An anachronism we are pruning out. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-9-git-send-email-chris@chris-wilson.co.ukReviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1469606850-28659-4-git-send-email-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
This patch transitions the execbuf engine selection away from using the ring nomenclature - though we still refer to the user's incoming selector as their user_ring_id. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-7-git-send-email-chris@chris-wilson.co.ukReviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1469606850-28659-2-git-send-email-chris@chris-wilson.co.uk
-
- 26 7月, 2016 1 次提交
-
-
由 Chris Wilson 提交于
Upon release of the file (i.e. the user calls close(fd)), we decouple all objects from the client list so that we don't chase the dangling file_priv. As we always inspect file_priv first, we only need to nullify that pointer and can safely ignore the list_head. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-4-git-send-email-chris@chris-wilson.co.ukReviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1469530913-17180-3-git-send-email-chris@chris-wilson.co.uk
-
- 24 7月, 2016 1 次提交
-
-
由 Chris Wilson 提交于
During the idle-worker we disable the hangcheck and so kick any waiters that should have been completed (since the GPU is now idle). Unlike the hangcheck, we do not take any care to avoid the race between the irq handler and ourselves, and so it is possible for us to declare a missed interrupt even as the bottom-half is being scheduled to run. Let's ignore this race to stop a potential false-positive error. References: https://bugs.freedesktop.org/show_bug.cgi?id=96974Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1469351421-13493-1-git-send-email-chris@chris-wilson.co.uk
-
- 22 7月, 2016 2 次提交
-
-
由 Chris Wilson 提交于
This reverts commit b12e0ee2 ("drm/i915: Enable RC6 immediately"), as it was never meant to be sent anywhere other than the bug report for experimentation. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1469132179-4052-1-git-send-email-chris@chris-wilson.co.ukAcked-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Chris Wilson 提交于
Now that PCU communication is reasonably fast, we do not need to defer RC6 initialisation to a workqueue. References: https://bugs.freedesktop.org/show_bug.cgi?id=97017Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
-
- 20 7月, 2016 10 次提交
-
-
由 Chris Wilson 提交于
Rather than recomputing whether semaphores are enabled, we can do that computation once during early initialisation as the i915.semaphores module parameter is now read-only. s/i915_semaphores_is_enabled/i915.semaphores/ v2: Add the state to the debug dmesg as well Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1469005202-9659-10-git-send-email-chris@chris-wilson.co.ukReviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1469017917-15134-9-git-send-email-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Whilst this ultimately wraps kref_put_mutex(), our goal here is the lockless variant, so keep the _unlocked() suffix until we need it no more. s/drm_gem_object_unreference_unlocked/i915_gem_object_put_unlocked/ Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1469005202-9659-7-git-send-email-chris@chris-wilson.co.ukReviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1469017917-15134-6-git-send-email-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Ultimately wraps kref_put(), so adopt its nomenclature for consistency with other subsystems. s/drm_gem_object_unreference/i915_gem_object_put/ Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1469005202-9659-6-git-send-email-chris@chris-wilson.co.ukReviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1469017917-15134-5-git-send-email-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Ultimately wraps kref_get(), so adopt its nomenclature for consistency with other subsystems. s/drm_gem_object_reference/i915_gem_object_get/ Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1469005202-9659-5-git-send-email-chris@chris-wilson.co.ukReviewed-by: NDave Gordon <david.s.gordon@intel.com> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1469017917-15134-4-git-send-email-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
For symmetry with a forthcoming i915_gem_object_get() and i915_gem_object_put(). Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1469005202-9659-4-git-send-email-chris@chris-wilson.co.ukReviewed-by: NDave Gordon <david.s.gordon@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1469017917-15134-3-git-send-email-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Now that we derive requests from struct fence, swap over to its nomenclature for references. It's shorter and more idiomatic across the kernel. s/i915_gem_request_reference/i915_gem_request_get/ s/i915_gem_request_unreference/i915_gem_request_put/ Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1469005202-9659-2-git-send-email-chris@chris-wilson.co.uk Link: http://patchwork.freedesktop.org/patch/msgid/1469017917-15134-1-git-send-email-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
When transitioning to the GTT or CPU domain we wait on all rendering from i915 to complete (with the optimisation of allowing concurrent read access by both the GPU and client). We don't yet ensure all rendering from third parties (tracked by implicit fences on the dma-buf) is complete. Since implicitly tracked rendering by third parties will ignore our cache-domain tracking, we have to always wait upon rendering from third-parties when transitioning to direct access to the backing store. We still rely on clients notifying us of cache domain changes (i.e. they need to move to the GTT read or write domain after doing a CPU access before letting the third party render again). v2: This introduces a potential WARN_ON into i915_gem_object_free() as the current i915_vma_unbind() calls i915_gem_object_wait_rendering(). To hit this path we first need to render with the GPU, have a dma-buf attached with an unsignaled fence and then interrupt the wait. It does get fixed later in the series (when i915_vma_unbind() only waits on the active VMA and not all, including third-party, rendering. To offset that risk, use the __i915_vma_unbind_no_wait hack. Testcase: igt/prime_vgem/basic-fence-read Testcase: igt/prime_vgem/basic-fence-mmap Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1469002875-2335-8-git-send-email-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Since commit a6f766f3 ("drm/i915: Limit ring synchronisation (sw sempahores) RPS boosts") and commit bcafc4e3 ("drm/i915: Limit mmio flip RPS boosts") we have limited the waitboosting for semaphores and flips. Ideally we do not want to boost in either of these instances as no userspace consumer is waiting upon the results (though a userspace producer may be stalled trying to submit an execbuf - but in this case the producer is being throttled due to the engine being saturated with work). With the introduction of NO_WAITBOOST in the previous patch, we can finally disable these needless boosts. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1469002875-2335-6-git-send-email-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Following a GPU reset upon hang, we retire all the requests and then mark them all as complete. If we mark them as complete first, we both keep the normal retirement order (completed first then retired) and provide a small optimisation for concurrent lookups. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1469002875-2335-3-git-send-email-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Migrate the request operations out of the main body of i915_gem.c and into their own C file for easier expansion. v2: Move __i915_add_request() across as well Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Acked-by: NMika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1469002875-2335-1-git-send-email-chris@chris-wilson.co.uk
-
- 19 7月, 2016 2 次提交
-
-
由 Daniel Vetter 提交于
Not sure why so much slips through when 0day is catching these. Hopefully the much faster sphinx toolchain helps in unlazying people. Acked-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1468612088-9721-10-git-send-email-daniel.vetter@ffwll.ch
-
由 Chris Wilson 提交于
Even after adding individual page support for GTT mmaping, we can still fail to find any space within the mappable region, and drm_mm_insert_node() will then report ENOSPC. We have to then handle this error by using the shmem access to the pages. Fixes: b50a5371 ("drm/i915: Support for pread/pwrite ... objects") Testcase: igt/gem_concurrent_blit Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com Link: http://patchwork.freedesktop.org/patch/msgid/1468690956-23480-1-git-send-email-chris@chris-wilson.co.ukReviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 16 7月, 2016 1 次提交
-
-
由 Chris Wilson 提交于
Before suspend, and especially before building the hibernation image, we need to context image to be coherent in memory. To do this we require that we perform a context switch to a disposable context (i.e. the dev_priv->kernel_context) - when that switch is complete, all other context images will be complete. This leaves the kernel_context image as incomplete, but fortunately that is disposable and we can do a quick fixup of the logical state after resuming. v2: Share the nearly identical code to switch to the kernel context with eviction. v3: Explain why we need the switch and reset. Testcase: igt/gem_exec_suspend # bsw References: https://bugs.freedesktop.org/show_bug.cgi?id=96526Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Tested-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1468590980-6186-2-git-send-email-chris@chris-wilson.co.uk
-
- 14 7月, 2016 4 次提交
-
-
由 Chris Wilson 提交于
Some hardware requires a valid render context before it can initiate rc6 power gating of the GPU; the default state of the GPU is not sufficient and may lead to undefined behaviour. The first execution of any batch will load the "golden render state", at which point it is safe to enable rc6. As we do not forcibly load the kernel context at resume, we have to hook into the batch submission to be sure that the render state is setup before enabling rc6. However, since we don't enable powersaving until that first batch, we queued a delayed task in order to guarantee that the batch is indeed submitted. v2: Rearrange intel_disable_gt_powersave() to match. v3: Apply user specified cur_freq (or idle_freq if not set). v4: Give in, and supply a delayed work to autoenable rc6 v5: Mika suggested a couple of better names for delayed_resume_work v6: Rebalance rpm_put around the autoenable task Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1468397438-21226-7-git-send-email-chris@chris-wilson.co.ukReviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>
-
由 Chris Wilson 提交于
Upon resetting the GPU, we force the engines to be idle by clearing their request lists. However, I neglected to clear the GT active status and so the next request following the reset was not marking the device as busy again. (We had to wait until any outstanding retire worker finally ran and cleared the active status.) Fixes: 67d97da3 ("drm/i915: Only start retire worker when idle") Testcase: igt/pm_rps/reset Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1468397438-21226-1-git-send-email-chris@chris-wilson.co.ukReviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
由 Matthew Auld 提交于
This should already be handled by drm_gem_object_release, which is called later on. Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NMatthew Auld <matthew.auld@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1467720019-31876-1-git-send-email-matthew.auld@intel.com
-
由 Tvrtko Ursulin 提交于
With the unified common engine setup done, and the execlist engine initialization loop clearly split into two phases, we can eliminate the separate legacy engine initialization code. v2: Fix cleanup path for legacy. v3: Rename constructors. (Chris Wilson) Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Chris Wilson <chris-wilson.co.uk>
-
- 11 7月, 2016 2 次提交
-
-
由 Chris Wilson 提交于
Let's ensure that we cannot run indefinitely without the hangcheck worker being queued. We removed it from being kicked on every request because we were kicking it a few millions times in every hangcheck interval and only once is necessary! However, that leaves us with the issue of what if userspace never waits for a request, or runs out of resources, what if userspace just issues a request then spins on BUSY_IOCTL? Testcase: igt/gem_busy Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1468055535-19740-3-git-send-email-chris@chris-wilson.co.ukReviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>
-
由 Chris Wilson 提交于
Never go to sleep waiting on the GPU without first ensuring that we will get woken up. We have a choice of queuing the hangcheck before every schedule() or the first time we wakeup. In order to simply accommodate both the signaler and the ordinary waiter, move the queuing to the common point of enabling the irq. We lose the paranoid safety of ensuring that the hangcheck is active before the sleep, but avoid code duplication (and redundant hangcheck queuing). Testcase: igt/prime_busy Fixes: c81d4613 ("drm/i915: Convert trace-irq to the breadcrumb waiter") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1468055535-19740-2-git-send-email-chris@chris-wilson.co.ukReviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>
-
- 05 7月, 2016 2 次提交
-
-
由 Chris Wilson 提交于
Since drm_i915_private is now a subclass of drm_device we do not need to chase the drm_i915_private->dev backpointer and can instead simply access drm_i915_private->drm directly. text data bss dec hex filename 1068757 4565 416 1073738 10624a drivers/gpu/drm/i915/i915.ko 1066949 4565 416 1071930 105b3a drivers/gpu/drm/i915/i915.ko Created by the coccinelle script: @@ struct drm_i915_private *d; identifier i; @@ ( - d->dev->i + d->drm.i | - d->dev + &d->drm ) and for good measure the dev_priv->dev backpointer was removed entirely. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMatthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1467711623-2905-4-git-send-email-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
After Joonas complained about using READ_ONCE() on the only use of the variable in the function, where the intent was to simply document that the read was intentionally racy and unlocked, I switched the READ_ONCE() over to lockless_dereference(). However, in linux-next that has a stronger type-check to only allow pointers and is no longer interchangeable with READ_ONCE(), see commit 331b6d8c ("locking/barriers: Validate lockless_dereference() is used on a pointer type") Reported-by: NStephen Rothwell <sfr@canb.auug.org.au> Fixes: 67d97da3 ("drm/i915: Only start retire worker when idle") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1467705276-707-1-git-send-email-chris@chris-wilson.co.ukReviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>
-
- 04 7月, 2016 8 次提交
-
-
由 Dave Gordon 提交于
Also remove some redundant dev and dev_priv locals Signed-off-by: NDave Gordon <david.s.gordon@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1467626365-29871-1-git-send-email-david.s.gordon@intel.comReviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1467628477-25379-2-git-send-email-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Since we now subclass struct drm_device, we can save pointer dances by noting the equivalence of struct drm_device and struct drm_i915_private, i.e. by using to_i915(). text data bss dec hex filename 1073824 4562 416 1078802 107612 drivers/gpu/drm/i915/i915.ko 1068976 4562 416 1073954 106322 drivers/gpu/drm/i915/i915.ko Created by the coccinelle script: @@ expression E; identifier p; @@ - struct drm_i915_private *p = E->dev_private; + struct drm_i915_private *p = to_i915(E); Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NDave Gordon <david.s.gordon@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1467628477-25379-1-git-send-email-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Now that we have (near) universal GPU recovery code, we can inject a real hang from userspace and not need any fakery. Not only does this mean that the testing is far more realistic, but we can simplify the kernel in the process. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NArun Siluvery <arun.siluvery@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1467616119-4093-7-git-send-email-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Describe the intent of boosting the GPU frequency to maximum before waiting on the GPU. RPS waitboosting was introduced with commit b29c19b6 ("drm/i915: Boost RPS frequency for CPU stalls") but lacked a concise comment in the code to explain itself. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1467616119-4093-5-git-send-email-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Ideally, we want to automagically have the GPU respond to the instantaneous load by reclocking itself. However, reclocking occurs relatively slowly, and to the client waiting for a result from the GPU, too late. To compensate and reduce the client latency, we allow the first wait from a client to boost the GPU clocks to maximum. This overcomes the lag in autoreclocking, at the expense of forcing the GPU clocks too high. So to offset the excessive power usage, we currently allow a client to only boost the clocks once before we detect the GPU is idle again. This works reasonably for say the first frame in a benchmark, but for many more synchronous workloads (like OpenCL) we find the GPU clocks remain too low. By noting a wait which would idle the GPU (i.e. we just waited upon the last known request), we can give that client the idle boost credit (for their next wait) without the 100ms delay required for us to detect the GPU idle state. The intention is to boost clients that are stalling in the process of feeding the GPU more work (and who in doing so let the GPU idle), without granting boost credits to clients that are throttling themselves (such as compositors). Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: "Zou, Nanhai" <nanhai.zou@intel.com> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Reviewed-by: NJesse Barnes <jbarnes@virtuousgeek.org> Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1467616119-4093-4-git-send-email-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
We know, by design, that whilst the GPU is active (and thus we are throttling) the retire_worker is queued. Therefore attempting to requeue it with queue_delayed_work() is a no-op and we can safely remove it. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1467616119-4093-3-git-send-email-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Rather than persistently postponing the idle-work everytime somebody calls i915_gem_retire_requests() (potentially ensuring that we never reach the idle state), queue the work the first time we detect all requests are complete. Then if in 100ms, more requests have been queued, we will abort the idle-worker and wait again until all the new requests have been completed. Of course, this does depend upon the idle worker cancelling itself gracefully from the previous patch. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1467616119-4093-2-git-send-email-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
The retire worker is a low frequency task that makes sure we retire outstanding requests if userspace is being lax. We only need to start it once as it remains active until the GPU is idle, so do a cheap test before the more expensive queue_work(). A consequence of this is that we need correct locking in the worker to make the hot path of request submission cheap. To keep the symmetry and keep hangcheck strictly bound by the GPU's wakelock, we move the cancel_sync(hangcheck) to the idle worker before dropping the wakelock. v2: Guard against RCU fouling the breadcrumbs bottom-half whilst we kick the waiter. v3: Remove the wakeref assertion squelching (now we hold a wakeref for the hangcheck, any rpm error there is genuine). v4: To prevent excess work when retiring requests, we split the busy flag into two, a boolean to denote whether we hold the wakeref and a bitmask of active engines. v5: Reorder cancelling hangcheck upon idling to avoid a race where we might cancel a hangcheck after being preempted by a new task Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> References: https://bugs.freedesktop.org/show_bug.cgi?id=88437Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1467616119-4093-1-git-send-email-chris@chris-wilson.co.uk
-
- 02 7月, 2016 2 次提交
-
-
由 Chris Wilson 提交于
If we convert the tracing over from direct use of ring->irq_get() and over to the breadcrumb infrastructure, we only have a single user of the ring->irq_get and so we will be able to simplify the driver routines (eliminating the redundant validation and irq refcounting). Process context is preferred over softirq (or even hardirq) for a couple of reasons: - we already utilize process context to have fast wakeup of a single client (i.e. the client waiting for the GPU inspects the seqno for itself following an interrupt to avoid the overhead of a context switch before it returns to userspace) - engine->irq_seqno() is not suitable for use from an softirq/hardirq context as we may require long waits (100-250us) to ensure the seqno write is posted before we read it from the CPU A signaling framework is a requirement for enabling dma-fences. v2: Move to a signaling framework based upon the waiter. v3: Track the first-signal to avoid having to walk the rbtree everytime. v4: Mark the signaler thread as RT priority to reduce latency in the indirect wakeups. v5: Make failure to allocate the thread fatal. v6: Rename kthreads to i915/signal:%u Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-16-git-send-email-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
We have testcases to ensure that seqno wraparound works fine, so we can forgo forcing everyone to encounter seqno wraparound during early uptime. seqno wraparound incurs a full GPU stall so not forcing it will eliminate one jitter from the early system. Using the testcases, we have very deterministic testing which given how difficult it would be to debug an issue (GPU hang) stemming from a wraparound using pure postmortem analysis I see no value in forcing a wrap during boot. Advancing the global next_seqno after a GPU reset is equally pointless. References? https://bugs.freedesktop.org/show_bug.cgi?id=95023Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-15-git-send-email-chris@chris-wilson.co.uk
-