- 05 12月, 2016 2 次提交
-
-
由 Chris Wilson 提交于
The i915_next_seqno read value is to be the next seqno used by the kernel. However, in the conversion to atomics ops for gt.next_seqno, in commit 28176ef4 ("drm/i915: Reserve space in the global seqno during request allocation"), this was changed from a post-increment to a pre-increment. This increment was missed from the value reported by debugfs, so in effect it was reporting the current seqno (last assigned), not the next seqno. Fixes: 28176ef4 ("drm/i915: Reserve space in the global seqno during request allocation") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81209Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161124094752.19129-1-chris@chris-wilson.co.ukReviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> (cherry picked from commit 9607ae79) Signed-off-by: NJani Nikula <jani.nikula@intel.com>
-
由 Chris Wilson 提交于
i915_hws_info() has not been kept upto date (missing new engines) and so I consider it to be unused. HWS is included in the error state, which would be an avenue to retrieving it if required in future (possibly via i915_engine_info). As it is currently oopsing with an rpm testcase, just remove it. Fixes: 3b3f1650 ("drm/i915: Allocate intel_engine_cs structure only for the enabled engines") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98838Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161124093401.18852-1-chris@chris-wilson.co.ukReviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> (cherry picked from commit 30576a2c) Signed-off-by: NJani Nikula <jani.nikula@intel.com>
-
- 17 11月, 2016 2 次提交
-
-
由 Tvrtko Ursulin 提交于
Similar to existing yesno and onoff and use it throughout the code. Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1479385814-2358-2-git-send-email-tvrtko.ursulin@linux.intel.com
-
由 Ville Syrjälä 提交于
It has been suggested that having per-plane modifiers is making life more difficult for userspace, so let's just retire modifier[1-3] and use modifier[0] to apply to the entire framebuffer. Obviosuly this means that if individual planes need different tiling layouts and whatnot we will need a new modifier for each combination of planes with different tiling layouts. For a bit of extra backwards compatilbilty the kernel will allow non-zero modifier[1+] but it require that they will match modifier[0]. This in case there's existing userspace out there that sets modifier[1+] to something non-zero with planar formats. Mostly a cocci job, with a bit of manual stuff mixed in. @@ struct drm_framebuffer *fb; expression E; @@ - fb->modifier[E] + fb->modifier @@ struct drm_framebuffer fb; expression E; @@ - fb.modifier[E] + fb.modifier Cc: Kristian Høgsberg <hoegsberg@gmail.com> Cc: Ben Widawsky <benjamin.widawsky@intel.com> Cc: Rob Clark <robdclark@gmail.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Tomeu Vizoso <tomeu@tomeuvizoso.net> Cc: dczaplejewicz@collabora.co.uk Suggested-by: NKristian Høgsberg <hoegsberg@gmail.com> Acked-by: NBen Widawsky <ben@bwidawsk.net> Acked-by: NDaniel Stone <daniels@collabora.com> Acked-by: NRob Clark <robdclark@gmail.com> Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1479295996-26246-1-git-send-email-ville.syrjala@linux.intel.com
-
- 15 11月, 2016 2 次提交
-
-
由 Chris Wilson 提交于
Track the priority of each request and use it to determine the order in which we submit requests to the hardware via execlists. The priority of the request is determined by the user (eventually via the context) but may be overridden at any time by the driver. When we set the priority of the request, we bump the priority of all of its dependencies to match - so that a high priority drawing operation is not stuck behind a background task. When the request is ready to execute (i.e. we have signaled the submit fence following completion of all its dependencies, including third party fences), we put the request into a priority sorted rbtree to be submitted to the hardware. If the request is higher priority than all pending requests, it will be submitted on the next context-switch interrupt as soon as the hardware has completed the current request. We do not currently preempt any current execution to immediately run a very high priority request, at least not yet. One more limitation, is that this is first implementation is for execlists only so currently limited to gen8/gen9. v2: Replace recursive priority inheritance bumping with an iterative depth-first search list. v3: list_next_entry() for walking lists v4: Explain how the dfs solves the recursion problem with PI. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161114204105.29171-8-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
The execlist_lock is now completely subsumed by the engine->timeline->lock, and so we can remove the redundant layer of locking. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161114204105.29171-5-chris@chris-wilson.co.uk
-
- 12 11月, 2016 1 次提交
-
-
由 Eric Engestrom 提交于
The function's behaviour was changed in 90844f00, without changing its signature, causing people to keep using it the old way without realising they were now leaking memory. Rob Clark also noticed it was also allocating GFP_KERNEL memory in atomic contexts, breaking them. Instead of having to allocate GFP_ATOMIC memory and fixing the callers to make them cleanup the memory afterwards, let's change the function's signature by having the caller take care of the memory and passing it to the function. The new parameter is a single-field struct in order to enforce the size of its buffer and help callers to correctly manage their memory. Fixes: 90844f00 ("drm: make drm_get_format_name thread-safe") Cc: Rob Clark <robdclark@gmail.com> Cc: Christian König <christian.koenig@amd.com> Acked-by: NChristian König <christian.koenig@amd.com> Acked-by: NRob Clark <robdclark@gmail.com> Acked-by: Sinclair Yeh <syeh@vmware.com> (vmwgfx) Reviewed-by: NJani Nikula <jani.nikula@intel.com> Suggested-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: NEric Engestrom <eric@engestrom.ch> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20161112011309.9799-1-eric@engestrom.ch
-
- 08 11月, 2016 1 次提交
-
-
由 Joonas Lahtinen 提交于
Get rid of sloppy inline functions now that we don't have more users: i915_gem_request_get_seqno i915_gem_request_get_engine v2: - request->engine is always non-NULL (Chris) Signed-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1478589108-3702-1-git-send-email-joonas.lahtinen@linux.intel.com
-
- 02 11月, 2016 2 次提交
-
-
由 Chris Wilson 提交于
When looking at freezes whilst working on execlists, knowing the order of the pending requests in the driver is useful. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20161027000348.4641-2-chris@chris-wilson.co.ukReviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
由 Joonas Lahtinen 提交于
$ sed -i -r 's/\bglobal_list\b/global_link/g' *.c *.h Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1478081764-8058-1-git-send-email-joonas.lahtinen@linux.intel.com
-
- 01 11月, 2016 5 次提交
-
-
由 Ville Syrjälä 提交于
Replace the open coded dev_priv->pipe_to_crtc_mapping[] usage with intel_get_crtc_for_pipe(). Mostly done with coccinelle, with a few manual tweaks @@ expression E1, E2; @@ ( - E1->pipe_to_crtc_mapping[E2] + intel_get_crtc_for_pipe(E1, E2) | - E1->plane_to_crtc_mapping[E2] + intel_get_crtc_for_plane(E1, E2) ) Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1477946245-14134-12-git-send-email-ville.syrjala@linux.intel.comReviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
-
由 Ville Syrjälä 提交于
Unify our approach to things by passing around dev_priv instead of dev. Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1477946245-14134-11-git-send-email-ville.syrjala@linux.intel.comReviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
-
由 Ville Syrjälä 提交于
Unify our approach to things by passing around dev_priv instead of dev. Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1477946245-14134-8-git-send-email-ville.syrjala@linux.intel.comReviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
-
由 Ville Syrjälä 提交于
A lot of users of the {pipe,plane}_to_crtc_mapping[] will end up casting the result to intel_crtc, so let's just store the intel_crtc pointer in the first place. Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1477946245-14134-7-git-send-email-ville.syrjala@linux.intel.comReviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
-
由 Chris Wilson 提交于
Whilst waiting on a request, we may do so without holding any locks or any guards beyond a reference to the request. In order to avoid taking locks within request deallocation, we drop references to its timeline (via the context and ppgtt) upon retirement. We should avoid chasing such pointers outside of their control, in particular we inspect the request->timeline to see if we may restore the RPS waitboost for a client. If we instead look at the engine->timeline, we will have similar behaviour on both full-ppgtt and !full-ppgtt systems and reduce the amount of reward we give towards stalling clients (i.e. only if the client stalls and the GPU is uncontended does it reclaim its boost). This restores behaviour back to pre-timelines, whilst fixing: [ 645.078485] BUG: KASAN: use-after-free in i915_gem_object_wait_fence+0x1ee/0x2e0 at addr ffff8802335643a0 [ 645.078577] Read of size 4 by task gem_exec_schedu/28408 [ 645.078638] CPU: 1 PID: 28408 Comm: gem_exec_schedu Not tainted 4.9.0-rc2+ #64 [ 645.078724] Hardware name: / , BIOS PYBSWCEL.86A.0027.2015.0507.1758 05/07/2015 [ 645.078816] ffff88022daef9a0 ffffffff8143d059 ffff880235402a80 ffff880233564200 [ 645.078998] ffff88022daef9c8 ffffffff81229c5c ffff88022daefa48 ffff880233564200 [ 645.079172] ffff880235402a80 ffff88022daefa38 ffffffff81229ef0 000000008110a796 [ 645.079345] Call Trace: [ 645.079404] [<ffffffff8143d059>] dump_stack+0x68/0x9f [ 645.079467] [<ffffffff81229c5c>] kasan_object_err+0x1c/0x70 [ 645.079534] [<ffffffff81229ef0>] kasan_report_error+0x1f0/0x4b0 [ 645.079601] [<ffffffff8122a244>] kasan_report+0x34/0x40 [ 645.079676] [<ffffffff81634f5e>] ? i915_gem_object_wait_fence+0x1ee/0x2e0 [ 645.079741] [<ffffffff81229951>] __asan_load4+0x61/0x80 [ 645.079807] [<ffffffff81634f5e>] i915_gem_object_wait_fence+0x1ee/0x2e0 [ 645.079876] [<ffffffff816364bf>] i915_gem_object_wait+0x19f/0x590 [ 645.079944] [<ffffffff81636320>] ? i915_gem_object_wait_priority+0x500/0x500 [ 645.080016] [<ffffffff8110fb30>] ? debug_show_all_locks+0x1e0/0x1e0 [ 645.080084] [<ffffffff8110abdc>] ? check_chain_key+0x14c/0x210 [ 645.080157] [<ffffffff8110a796>] ? __lock_is_held+0x46/0xc0 [ 645.080226] [<ffffffff8163bc61>] ? i915_gem_set_domain_ioctl+0x141/0x690 [ 645.080296] [<ffffffff8163bcc2>] i915_gem_set_domain_ioctl+0x1a2/0x690 [ 645.080366] [<ffffffff811f8f85>] ? __might_fault+0x75/0xe0 [ 645.080433] [<ffffffff815a55f7>] drm_ioctl+0x327/0x640 [ 645.080508] [<ffffffff8163bb20>] ? i915_gem_obj_prepare_shmem_write+0x3a0/0x3a0 [ 645.080603] [<ffffffff815a52d0>] ? drm_ioctl_permit+0x120/0x120 [ 645.080670] [<ffffffff8110abdc>] ? check_chain_key+0x14c/0x210 [ 645.080738] [<ffffffff81275717>] do_vfs_ioctl+0x127/0xa20 [ 645.080804] [<ffffffff8120268c>] ? do_mmap+0x47c/0x580 [ 645.080871] [<ffffffff811da567>] ? vm_mmap_pgoff+0x117/0x140 [ 645.080938] [<ffffffff812755f0>] ? ioctl_preallocate+0x150/0x150 [ 645.081011] [<ffffffff81108c53>] ? up_write+0x23/0x50 [ 645.081078] [<ffffffff811da567>] ? vm_mmap_pgoff+0x117/0x140 [ 645.081145] [<ffffffff811da450>] ? vma_is_stack_for_current+0x90/0x90 [ 645.081214] [<ffffffff8110d853>] ? mark_held_locks+0x23/0xc0 [ 645.082030] [<ffffffff81288408>] ? __fget+0x168/0x250 [ 645.082106] [<ffffffff819ad517>] ? entry_SYSCALL_64_fastpath+0x5/0xb1 [ 645.082176] [<ffffffff81288592>] ? __fget_light+0xa2/0xc0 [ 645.082242] [<ffffffff8127604c>] SyS_ioctl+0x3c/0x70 [ 645.082309] [<ffffffff819ad52e>] entry_SYSCALL_64_fastpath+0x1c/0xb1 [ 645.082374] Object at ffff880233564200, in cache kmalloc-8192 size: 8192 [ 645.082431] Allocated: [ 645.082480] PID = 28408 [ 645.082535] [ 645.082566] [<ffffffff8103ae66>] save_stack_trace+0x16/0x20 [ 645.082623] [ 645.082656] [<ffffffff81228b06>] save_stack+0x46/0xd0 [ 645.082716] [ 645.082756] [<ffffffff812292fd>] kasan_kmalloc+0xad/0xe0 [ 645.082817] [ 645.082848] [<ffffffff81631752>] i915_ppgtt_create+0x52/0x220 [ 645.082908] [ 645.082941] [<ffffffff8161db96>] i915_gem_create_context+0x396/0x560 [ 645.083027] [ 645.083059] [<ffffffff8161f857>] i915_gem_context_create_ioctl+0x97/0xf0 [ 645.083152] [ 645.083183] [<ffffffff815a55f7>] drm_ioctl+0x327/0x640 [ 645.083243] [ 645.083274] [<ffffffff81275717>] do_vfs_ioctl+0x127/0xa20 [ 645.083334] [ 645.083372] [<ffffffff8127604c>] SyS_ioctl+0x3c/0x70 [ 645.083432] [ 645.083464] [<ffffffff819ad52e>] entry_SYSCALL_64_fastpath+0x1c/0xb1 [ 645.083551] Freed: [ 645.083599] PID = 27629 [ 645.083648] [ 645.083676] [<ffffffff8103ae66>] save_stack_trace+0x16/0x20 [ 645.083738] [ 645.083770] [<ffffffff81228b06>] save_stack+0x46/0xd0 [ 645.083830] [ 645.083862] [<ffffffff81229203>] kasan_slab_free+0x73/0xc0 [ 645.083922] [ 645.083961] [<ffffffff812279c9>] kfree+0xa9/0x170 [ 645.084021] [ 645.084053] [<ffffffff81629f60>] i915_ppgtt_release+0x100/0x180 [ 645.084139] [ 645.084171] [<ffffffff8161d414>] i915_gem_context_free+0x1b4/0x230 [ 645.084257] [ 645.084288] [<ffffffff816537b2>] intel_lr_context_unpin+0x192/0x230 [ 645.084380] [ 645.084413] [<ffffffff81645250>] i915_gem_request_retire+0x620/0x630 [ 645.084500] [ 645.085226] [<ffffffff816473d1>] i915_gem_retire_requests+0x181/0x280 [ 645.085313] [ 645.085352] [<ffffffff816352ba>] i915_gem_retire_work_handler+0xca/0xe0 [ 645.085440] [ 645.085471] [<ffffffff810c725b>] process_one_work+0x4fb/0x920 [ 645.085532] [ 645.085562] [<ffffffff810c770d>] worker_thread+0x8d/0x840 [ 645.085622] [ 645.085653] [<ffffffff810d21e5>] kthread+0x185/0x1b0 [ 645.085718] [ 645.085750] [<ffffffff819ad7a7>] ret_from_fork+0x27/0x40 [ 645.085811] Memory state around the buggy address: [ 645.085869] ffff880233564280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 645.085956] ffff880233564300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 645.086053] >ffff880233564380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 645.086138] ^ [ 645.086193] ffff880233564400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 645.086283] ffff880233564480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb v2: Add a comment to document the hint like nature of intel_engine_last_submit() Fixes: 73cb9701 ("drm/i915: Combine seqno + tracking into a global timeline struct") Fixes: 80b204bc ("drm/i915: Enable multiple timelines") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161101100317.11129-1-chris@chris-wilson.co.uk
-
- 29 10月, 2016 10 次提交
-
-
由 Chris Wilson 提交于
A restriction on our global seqno is that they cannot wrap, and that we cannot use the value 0. This allows us to detect when a request has not yet been submitted, its global seqno is still 0, and ensures that hardware semaphores are monotonic as required by older hardware. To meet these restrictions when we defer the assignment of the global seqno, we must check that we have an available slot in the global seqno space during request construction. If that test fails, we wait for all requests to be completed and reset the hardware back to 0. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-33-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
The breadcrumbs are about to be used from within IRQ context sections (e.g. nouveau signals a fence from an interrupt handler causing us to submit a new request) and/or from bottom-half tasklets (i.e. intel_lrc_irq_handler), therefore we need to employ the irqsafe spinlock variants. For example, deferring the request submission to the intel_lrc_irq_handler generates this trace: [ 66.388639] ================================= [ 66.388650] [ INFO: inconsistent lock state ] [ 66.388663] 4.9.0-rc2+ #56 Not tainted [ 66.388672] --------------------------------- [ 66.388682] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage. [ 66.388695] swapper/1/0 [HC0[0]:SC1[1]:HE0:SE0] takes: [ 66.388706] (&(&b->lock)->rlock){+.?...} , at: [<ffffffff81401c88>] intel_engine_enable_signaling+0x78/0x150 [ 66.388761] {SOFTIRQ-ON-W} state was registered at: [ 66.388772] [ 66.388783] [<ffffffff810bd842>] __lock_acquire+0x682/0x1870 [ 66.388795] [ 66.388803] [<ffffffff810bedbc>] lock_acquire+0x6c/0xb0 [ 66.388814] [ 66.388824] [<ffffffff8161753a>] _raw_spin_lock+0x2a/0x40 [ 66.388835] [ 66.388845] [<ffffffff81401e41>] intel_engine_reset_breadcrumbs+0x21/0xb0 [ 66.388857] [ 66.388866] [<ffffffff81403ae7>] gen8_init_common_ring+0x67/0x100 [ 66.388878] [ 66.388887] [<ffffffff81403b92>] gen8_init_render_ring+0x12/0x60 [ 66.388903] [ 66.388912] [<ffffffff813f8707>] i915_gem_init_hw+0xf7/0x2a0 [ 66.388927] [ 66.388936] [<ffffffff813f899b>] i915_gem_init+0xbb/0xf0 [ 66.388950] [ 66.388959] [<ffffffff813b4980>] i915_driver_load+0x7e0/0x1330 [ 66.388978] [ 66.388988] [<ffffffff813c09d8>] i915_pci_probe+0x28/0x40 [ 66.389003] [ 66.389013] [<ffffffff812fa0db>] pci_device_probe+0x8b/0xf0 [ 66.389028] [ 66.389037] [<ffffffff8147737e>] driver_probe_device+0x21e/0x430 [ 66.389056] [ 66.389065] [<ffffffff8147766e>] __driver_attach+0xde/0xe0 [ 66.389080] [ 66.389090] [<ffffffff814751ad>] bus_for_each_dev+0x5d/0x90 [ 66.389105] [ 66.389113] [<ffffffff81477799>] driver_attach+0x19/0x20 [ 66.389134] [ 66.389144] [<ffffffff81475ced>] bus_add_driver+0x15d/0x260 [ 66.389159] [ 66.389168] [<ffffffff81477e3b>] driver_register+0x5b/0xd0 [ 66.389183] [ 66.389281] [<ffffffff812fa19b>] __pci_register_driver+0x5b/0x60 [ 66.389301] [ 66.389312] [<ffffffff81aed333>] i915_init+0x3e/0x45 [ 66.389326] [ 66.389336] [<ffffffff81ac2ffa>] do_one_initcall+0x8b/0x118 [ 66.389350] [ 66.389359] [<ffffffff81ac323a>] kernel_init_freeable+0x1b3/0x23b [ 66.389378] [ 66.389387] [<ffffffff8160fc39>] kernel_init+0x9/0x100 [ 66.389402] [ 66.389411] [<ffffffff816180e7>] ret_from_fork+0x27/0x40 [ 66.389426] irq event stamp: 315865 [ 66.389438] hardirqs last enabled at (315864): [<ffffffff816178f1>] _raw_spin_unlock_irqrestore+0x31/0x50 [ 66.389469] hardirqs last disabled at (315865): [<ffffffff816176b3>] _raw_spin_lock_irqsave+0x13/0x50 [ 66.389499] softirqs last enabled at (315818): [<ffffffff8107a04c>] _local_bh_enable+0x1c/0x50 [ 66.389530] softirqs last disabled at (315819): [<ffffffff8107a50e>] irq_exit+0xbe/0xd0 [ 66.389559] [ 66.389559] other info that might help us debug this: [ 66.389580] Possible unsafe locking scenario: [ 66.389580] [ 66.389598] CPU0 [ 66.389609] ---- [ 66.389620] lock(&(&b->lock)->rlock); [ 66.389650] <Interrupt> [ 66.389661] lock(&(&b->lock)->rlock); [ 66.389690] [ 66.389690] *** DEADLOCK *** [ 66.389690] [ 66.389715] 2 locks held by swapper/1/0: [ 66.389728] #0: (&(&tl->lock)->rlock){..-...}, at: [<ffffffff81403e01>] intel_lrc_irq_handler+0x201/0x3c0 [ 66.389785] #1: (&(&req->lock)->rlock/1){..-...}, at: [<ffffffff813fc0af>] __i915_gem_request_submit+0x8f/0x170 [ 66.389854] [ 66.389854] stack backtrace: [ 66.389959] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.9.0-rc2+ #56 [ 66.389976] Hardware name: / , BIOS PYBSWCEL.86A.0027.2015.0507.1758 05/07/2015 [ 66.389999] ffff88027fd03c58 ffffffff812beae5 ffff88027696e680 ffffffff822afe20 [ 66.390036] ffff88027fd03ca8 ffffffff810bb420 0000000000000001 0000000000000000 [ 66.390070] 0000000000000000 0000000000000006 0000000000000004 ffff88027696ee10 [ 66.390104] Call Trace: [ 66.390117] <IRQ> [ 66.390128] [<ffffffff812beae5>] dump_stack+0x68/0x93 [ 66.390147] [<ffffffff810bb420>] print_usage_bug+0x1d0/0x1e0 [ 66.390164] [<ffffffff810bb8a0>] mark_lock+0x470/0x4f0 [ 66.390181] [<ffffffff810ba9d0>] ? print_shortest_lock_dependencies+0x1b0/0x1b0 [ 66.390203] [<ffffffff810bd75d>] __lock_acquire+0x59d/0x1870 [ 66.390221] [<ffffffff810bedbc>] lock_acquire+0x6c/0xb0 [ 66.390237] [<ffffffff810bedbc>] ? lock_acquire+0x6c/0xb0 [ 66.390255] [<ffffffff81401c88>] ? intel_engine_enable_signaling+0x78/0x150 [ 66.390273] [<ffffffff8161753a>] _raw_spin_lock+0x2a/0x40 [ 66.390291] [<ffffffff81401c88>] ? intel_engine_enable_signaling+0x78/0x150 [ 66.390309] [<ffffffff81401c88>] intel_engine_enable_signaling+0x78/0x150 [ 66.390327] [<ffffffff813fc170>] __i915_gem_request_submit+0x150/0x170 [ 66.390345] [<ffffffff81403e8b>] intel_lrc_irq_handler+0x28b/0x3c0 [ 66.390363] [<ffffffff81079d97>] tasklet_action+0x57/0xc0 [ 66.390380] [<ffffffff8107a249>] __do_softirq+0x119/0x240 [ 66.390396] [<ffffffff8107a50e>] irq_exit+0xbe/0xd0 [ 66.390414] [<ffffffff8101afd5>] do_IRQ+0x65/0x110 [ 66.390431] [<ffffffff81618806>] common_interrupt+0x86/0x86 [ 66.390446] <EOI> [ 66.390457] [<ffffffff814ec6d1>] ? cpuidle_enter_state+0x151/0x200 [ 66.390480] [<ffffffff814ec7a2>] cpuidle_enter+0x12/0x20 [ 66.390498] [<ffffffff810b639e>] call_cpuidle+0x1e/0x40 [ 66.390516] [<ffffffff810b65ae>] cpu_startup_entry+0x10e/0x1f0 [ 66.390534] [<ffffffff81036133>] start_secondary+0x103/0x130 (This is split out of the defer global seqno allocation patch due to realisation that we need a more complete conversion if we want to defer request submission even further.) v2: lockdep was warning about mixed SOFTIRQ contexts not HARDIRQ contexts so we only need to use spin_lock_bh and not disable interrupts. v3: We need full irq protection as we may be called from a third party interrupt handler (via fences). Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-32-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
This will be used for communicating issues with this context to userspace, so we want to identify the parent process and the individual context. Note that the name isn't quite unique, it makes the presumption of there only being a single device fd per process. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-31-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Currently we try to reduce the number of synchronisations (now the number of requests we need to wait upon) by noting that if we have earlier waited upon a request, all subsequent requests in the timeline will be after the wait. This only applies to requests in this timeline, as other timelines will not be ordered by that waiter. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-30-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Though we will have multiple timelines, we still have a single timeline of execution. This we can use to provide an execution and retirement order of requests. This keeps tracking execution of requests simple, and vital for preserving a single waiter (i.e. so that we can order the waiters so that only the earliest to wakeup need be woken). To accomplish this we distinguish the seqno used to order requests per-context (external) and that used internally for execution. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-26-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Our timelines are more than just a seqno. They also provide an ordered list of requests to be executed. Due to the restriction of handling individual address spaces, we are limited to a timeline per address space but we use a fence context per engine within. Our first step to introducing independent timelines per context (i.e. to allow each context to have a queue of requests to execute that have a defined set of dependencies on other requests) is to provide a timeline abstraction for the global execution queue. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-23-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
In preparation to support many distinct timelines, we need to expand the activity tracking on the GEM object to handle more than just a request per engine. We already use the struct reservation_object on the dma-buf to handle many fence contexts, so integrating that into the GEM object itself is the preferred solution. (For example, we can now share the same reservation_object between every consumer/producer using this buffer and skip the manual import/export via dma-buf.) v2: Reimplement busy-ioctl (by walking the reservation object), postpone the ABI change for another day. Similarly use the reservation object to find the last_write request (if active and from i915) for choosing display CS flips. Caveats: * busy-ioctl: busy-ioctl only reports on the native fences, it will not warn of stalls (in set-domain-ioctl, pread/pwrite etc) if the object is being rendered to by external fences. It also will not report the same busy state as wait-ioctl (or polling on the dma-buf) in the same circumstances. On the plus side, it does retain reporting of which *i915* engines are engaged with this object. * non-blocking atomic modesets take a step backwards as the wait for render completion blocks the ioctl. This is fixed in a subsequent patch to use a fence instead for awaiting on the rendering, see "drm/i915: Restore nonblocking awaits for modesetting" * dynamic array manipulation for shared-fences in reservation is slower than the previous lockless static assignment (e.g. gem_exec_lut_handle runtime on ivb goes from 42s to 66s), mainly due to atomic operations (maintaining the fence refcounts). * loss of object-level retirement callbacks, emulated by VMA retirement tracking. * minor loss of object-level last activity information from debugfs, could be replaced with per-vma information if desired Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-21-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
We want to hide the latency of releasing objects and their backing storage from the submission, so we move the actual free to a worker. This allows us to switch to struct_mutex freeing of the object in the next patch. Furthermore, if we know that the object we are dereferencing remains valid for the duration of our access, we can forgo the usual synchronisation barriers and atomic reference counting. To ensure this we defer freeing an object til after an RCU grace period, such that any lookup of the object within an RCU read critical section will remain valid until after we exit that critical section. We also employ this delay for rate-limiting the serialisation on reallocation - we have to slow down object creation in order to prevent resource starvation (in particular, files). v2: Return early in i915_gem_tiling() ioctl to skip over superfluous work on error. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-19-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
The plan is to make obtaining the backing storage for the object avoid struct_mutex (i.e. use its own locking). The first step is to update the API so that normal users only call pin/unpin whilst working on the backing storage. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-12-chris@chris-wilson.co.uk
-
由 Matt Roper 提交于
This macro's name is a bit misleading; it doesn't actually iterate over all planes since it omits the cursor plane. Its only uses are in gen9 code which is using it to iterate over the universal planes (which we treat as primary+sprites); in these cases the legacy cursor registers are programmed independently if necessary. The macro's iterator value (0 for primary plane, spritenum+1 for each secondary plane) also isn't meaningful outside the gen9 context where the hardware considers them to all be "universal" planes that follow this numbering. This is just a renaming/clarification patch with no functional change. However it will make the subsequent patches more clear. Signed-off-by: NMatt Roper <matthew.d.roper@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1477522291-10874-2-git-send-email-matthew.d.roper@intel.comReviewed-by: NPaulo Zanoni <paulo.r.zanoni@intel.com>
-
- 25 10月, 2016 4 次提交
-
-
由 Sagar Arun Kamble 提交于
This patch provides debugfs interface i915_guc_output_control for on the fly enabling/disabling of logging in GuC firmware and controlling the verbosity level of logs. The value written to the file, should have bit 0 set to enable logging and bits 4-7 should contain the verbosity info. v2: Add a forceful flush, to collect left over logs, on disabling logging. Useful for Validation. v3: Besides minor cleanup, implement read method for the debugfs file and set the guc_log_level to -1 when logging is disabled. (Tvrtko) v4: Minor cleanup & rebase. (Tvrtko) v5: - Lock struct_mutex after the NULL check for guc log buffer vma. (Chris) - Rebase. Signed-off-by: NSagar Arun Kamble <sagar.a.kamble@intel.com> Signed-off-by: NAkash Goel <akash.goel@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
-
由 Akash Goel 提交于
GuC firmware sends an interrupt to flush the log buffer when it becomes half full. GuC firmware also tracks how many times the buffer overflowed. It would be useful to maintain a statistics of how many flush interrupts were received and for which type of log buffer, along with the overflow count of each buffer type. Augmented i915_log_info debugfs to report back these statistics. v2: - Update the logic to detect multiple overflows between the 2 flush interrupts and also log a message for overflow (Tvrtko) - Track the number of times there was no free sub buffer to capture the GuC log buffer. (Tvrtko) v3: - Fix the printf field width for overflow counter, set it to 10 as per the max value of u32, which takes 10 digits in decimal form. (Tvrtko) v4: - Move the log buffer overflow handling to a new function for better readability. (Tvrtko) Signed-off-by: NAkash Goel <akash.goel@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
-
由 Akash Goel 提交于
So far there were 2 fields related to GuC logs in 'intel_guc' structure. For the support of capturing GuC logs & storing them in a local buffer, multiple new fields would have to be added. This warrants a separate structure to contain the fields related to GuC logging state. Added a new structure 'intel_guc_log' and instance of it inside 'intel_guc' structure. v2: Rebase. Signed-off-by: NAkash Goel <akash.goel@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
-
由 Paulo Zanoni 提交于
Its size is 11:0 instead of 10:0. Found by inspecting the spec. I'm not aware of any real-world IGT failures caused by this. Signed-off-by: NPaulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1477065346-13736-2-git-send-email-paulo.r.zanoni@intel.com
-
- 24 10月, 2016 2 次提交
-
-
由 Chris Wilson 提交于
We can remove the false coupling between RPM and struct mutex by the observation that we can use the RPM wakeref as the barrier around user mmap access. That is as we tear down the user's PTE atomically from within rpm suspend and then to fault in new PTE requires the rpm wakeref, means that no user access is possible through those PTE without RPM being awake. Having made that observation, we can then remove the presumption of having to take rpm outside of struct_mutex and so allow fine grained acquisition of a wakeref around hw access rather than having to remember to acquire the wakeref early on. v2: Rejig placement of the new intel_runtime_pm_get() to be as tight as possible around the GTT pread/pwrite. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Imre Deak <imre.deak@intel.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: NDaniel Vetter <daniel@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20161024124218.18252-2-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
We want to decouple RPM and struct_mutex, but currently RPM has to walk the list of bound objects and remove userspace mmapping before we suspend (otherwise userspace may continue to access the GTT whilst it is powered down). This currently requires the struct_mutex to walk the bound_list, but if we move that to a separate list and lock we can take the first step towards removing the struct_mutex. v2: Split runtime suspend unmapping vs regular unmapping, to make the locking (and barriers) clearer. Add the object to the userfault_list prior to inserting the first PTE, the race between add/revoke depends upon struct_mutex for regular unmappings and rpm for runtime-suspend. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> #v1 Link: http://patchwork.freedesktop.org/patch/msgid/20161024124218.18252-1-chris@chris-wilson.co.uk
-
- 18 10月, 2016 1 次提交
-
-
由 Chris Wilson 提交于
Internally we allow for using more objects than a single process can allocate, i.e. we allow for a 64bit GPU address space even on a 32bit system. Using size_t may oveerflow. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161018120251.25043-1-chris@chris-wilson.co.uk
-
- 17 10月, 2016 1 次提交
-
-
由 Chris Wilson 提交于
drm_atomic_state has a complicated single owner model that tracks the single reference from allocation through to destruction on another thread - or perhaps on a local error path. We can simplify this tracking by using reference counting (at a cost of a few more atomics). This is even more beneficial when the lifetime of the state becomes more convoluted than being passed to a single worker thread for the commit. v2: Double check !intel atomic_commit functions for missing gets v3: Update kerneldocs Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: dri-devel@lists.freedesktop.org Reviewed-by: NEric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: NSean Paul <seanpaul@chromium.org> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20161014121833.439-27-chris@chris-wilson.co.uk
-
- 14 10月, 2016 2 次提交
-
-
由 Tvrtko Ursulin 提交于
Saves 1416 bytes of .rodata strings. v2: Add parantheses around dev_priv. (Ville Syrjala) Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NDavid Weinehall <david.weinehall@linux.intel.com> Acked-by: NDaniel Vetter <daniel.vetter@ffwll.ch> Acked-by: NJani Nikula <jani.nikula@linux.intel.com> Acked-by: NChris Wilson <chris@chris-wilson.co.uk> Acked-by: NMaarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1476352990-2504-1-git-send-email-tvrtko.ursulin@linux.intel.com
-
由 Akash Goel 提交于
With the possibility of addition of many more number of rings in future, the drm_i915_private structure could bloat as an array, of type intel_engine_cs, is embedded inside it. struct intel_engine_cs engine[I915_NUM_ENGINES]; Though this is still fine as generally there is only a single instance of drm_i915_private structure used, but not all of the possible rings would be enabled or active on most of the platforms. Some memory can be saved by allocating intel_engine_cs structure only for the enabled/active engines. Currently the engine/ring ID is kept static and dev_priv->engine[] is simply indexed using the enums defined in intel_engine_id. To save memory and continue using the static engine/ring IDs, 'engine' is defined as an array of pointers. struct intel_engine_cs *engine[I915_NUM_ENGINES]; dev_priv->engine[engine_ID] will be NULL for disabled engine instances. There is a text size reduction of 928 bytes, from 1028200 to 1027272, for i915.o file (but for i915.ko file text size remain same as 1193131 bytes). v2: - Remove the engine iterator field added in drm_i915_private structure, instead pass a local iterator variable to the for_each_engine** macros. (Chris) - Do away with intel_engine_initialized() and instead directly use the NULL pointer check on engine pointer. (Chris) v3: - Remove for_each_engine_id() macro, as the updated macro for_each_engine() can be used in place of it. (Chris) - Protect the access to Render engine Fault register with a NULL check, as engine specific init is done later in Driver load sequence. v4: - Use !!dev_priv->engine[VCS] style for the engine check in getparam. (Chris) - Kill the superfluous init_engine_lists(). v5: - Cleanup the intel_engines_init() & intel_engines_setup(), with respect to allocation of intel_engine_cs structure. (Chris) v6: - Rebase. v7: - Optimize the for_each_engine_masked() macro. (Chris) - Change the type of 'iter' local variable to enum intel_engine_id. (Chris) - Rebase. v8: Rebase. v9: Rebase. v10: - For index calculation use engine ID instead of pointer based arithmetic in intel_engine_sync_index() as engine pointers are not contiguous now (Chris) - For appropriateness, rename local enum variable 'iter' to 'id'. (Joonas) - Use for_each_engine macro for cleanup in intel_engines_init() and remove check for NULL engine pointer in cleanup() routines. (Joonas) v11: Rebase. Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NAkash Goel <akash.goel@intel.com> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1476378888-7372-1-git-send-email-akash.goel@intel.com
-
- 12 10月, 2016 3 次提交
-
-
由 Chris Wilson 提交于
The current meaning of whether an object has a GGTT vma is very ill-defined (and note we don't check for any partials either), it just means that at some point it was in the GGTT but it may not be now. The information we really care about here is whether it is taking up precious mappable aperture space. This is the obj->fault_mappable flag. We have a redundant long form reprinting of this information, so remove that in favour of the compact flag. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161012114827.17031-2-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
We currently capture the GPU state after we detect a hang. This is vital for us to both triage and debug hangs in the wild (post-mortem debugging). However, it comes at the cost of running some potentially dangerous code (since it has to make very few assumption about the state of the driver) that is quite resource intensive. This patch introduces both a method to disable error capture at runtime (for users who hit bugs at runtime and need a workaround) and to disable error capture at compiletime (for realtime users who want to minimise any possible latency, and never require error capture, saving ~30k of code). The cost is that we now have to be wary of (and test!) a kconfig flag and a module parameter. The effect of the module parameter is easy to verify through code inspection and runtime testing, but a kconfig flag needs regular compile checking. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Acked-by: NJani Nikula <jani.nikula@linux.intel.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch Link: http://patchwork.freedesktop.org/patch/msgid/20161012090522.367-2-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
In the next patch, I want to conditionally compile i915_gpu_error.c and that requires moving the functions used by debug out of i915_gpu_error.c! Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161012090522.367-1-chris@chris-wilson.co.uk
-
- 05 10月, 2016 2 次提交
-
-
由 Joonas Lahtinen 提交于
Get rid of SEP_SEMICOLON and SEP_BLANK in DEV_INFO_FOR_EACH_FLAG. Consolidate the debug output so that instead of one huge line with "cap1,cap2,capN" each capability is split to own line and displayed as "capN: [yes|no]" to make the dumps more historically informative. v2: - Do not break auto-indent by keeping semicolon after macro (Jani) - Consolidate and use yesno() in all locations (Chris) Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
由 Chris Wilson 提交于
It is convenient to know what processes are waiting when looking at hangcheck status in debugfs. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161004201132.21801-8-chris@chris-wilson.co.uk
-