1. 09 May 2018, 1 commit
    • drm/i915: Annotate timeline lock nesting · 0adb90d3
      Authored by Chris Wilson
      CI noticed
      
      <4>[   23.430701] ============================================
      <4>[   23.430706] WARNING: possible recursive locking detected
      <4>[   23.430713] 4.17.0-rc4-CI-CI_DRM_4156+ #1 Not tainted
      <4>[   23.430720] --------------------------------------------
      <4>[   23.430725] systemd-udevd/169 is trying to acquire lock:
      <4>[   23.430732]         (ptrval) (&(&timeline->lock)->rlock){....}, at: move_to_timeline+0x48/0x12c [i915]
      <4>[   23.430888]
                        but task is already holding lock:
      <4>[   23.430894]         (ptrval) (&(&timeline->lock)->rlock){....}, at: i915_request_submit+0x1a/0x40 [i915]
      <4>[   23.430995]
                        other info that might help us debug this:
      <4>[   23.431002]  Possible unsafe locking scenario:
      
      <4>[   23.431007]        CPU0
      <4>[   23.431010]        ----
      <4>[   23.431013]   lock(&(&timeline->lock)->rlock);
      <4>[   23.431021]   lock(&(&timeline->lock)->rlock);
      <4>[   23.431028]
                         *** DEADLOCK ***
      
      <4>[   23.431036]  May be due to missing lock nesting notation
      
      <4>[   23.431044] 5 locks held by systemd-udevd/169:
      <4>[   23.431049]  #0:         (ptrval) (&dev->mutex){....}, at: __driver_attach+0x42/0xe0
      <4>[   23.431065]  #1:         (ptrval) (&dev->mutex){....}, at: __driver_attach+0x50/0xe0
      <4>[   23.431078]  #2:         (ptrval) (&dev->struct_mutex){+.+.}, at: i915_gem_init+0xca/0x630 [i915]
      <4>[   23.431174]  #3:         (ptrval) (rcu_read_lock){....}, at: submit_notify+0x35/0x124 [i915]
      <4>[   23.431271]  #4:         (ptrval) (&(&timeline->lock)->rlock){....}, at: i915_request_submit+0x1a/0x40 [i915]
      <4>[   23.431369]
                        stack backtrace:
      <4>[   23.431377] CPU: 0 PID: 169 Comm: systemd-udevd Not tainted 4.17.0-rc4-CI-CI_DRM_4156+ #1
      <4>[   23.431385] Hardware name: Dell Inc.                 OptiPlex GX280               /0G8310, BIOS A04 02/09/2005
      <4>[   23.431394] Call Trace:
      <4>[   23.431403]  dump_stack+0x67/0x9b
      <4>[   23.431411]  __lock_acquire+0xc67/0x1b50
      <4>[   23.431421]  ? ring_buffer_lock_reserve+0x154/0x3f0
      <4>[   23.431429]  ? lock_acquire+0xa6/0x210
      <4>[   23.431435]  lock_acquire+0xa6/0x210
      <4>[   23.431530]  ? move_to_timeline+0x48/0x12c [i915]
      <4>[   23.431540]  _raw_spin_lock+0x2a/0x40
      <4>[   23.431634]  ? move_to_timeline+0x48/0x12c [i915]
      <4>[   23.431730]  move_to_timeline+0x48/0x12c [i915]
      <4>[   23.431826]  __i915_request_submit+0xfa/0x280 [i915]
      <4>[   23.431923]  i915_request_submit+0x25/0x40 [i915]
      <4>[   23.432024]  i9xx_submit_request+0x11/0x140 [i915]
      <4>[   23.432120]  submit_notify+0x8d/0x124 [i915]
      <4>[   23.432202]  __i915_sw_fence_complete+0x81/0x250 [i915]
      <4>[   23.432300]  __i915_request_add+0x31c/0x7c0 [i915]
      <4>[   23.432395]  i915_gem_init+0x621/0x630 [i915]
      <4>[   23.432476]  i915_driver_load+0xbee/0x10b0 [i915]
      <4>[   23.432485]  ? trace_hardirqs_on_caller+0xe0/0x1b0
      <4>[   23.432566]  i915_pci_probe+0x29/0x90 [i915]
      <4>[   23.432574]  pci_device_probe+0xa1/0x130
      <4>[   23.432582]  driver_probe_device+0x306/0x480
      <4>[   23.432589]  __driver_attach+0xb7/0xe0
      <4>[   23.432596]  ? driver_probe_device+0x480/0x480
      <4>[   23.432602]  ? driver_probe_device+0x480/0x480
      <4>[   23.432609]  bus_for_each_dev+0x74/0xc0
      <4>[   23.432616]  bus_add_driver+0x15f/0x250
      <4>[   23.432623]  ? 0xffffffffa02d7000
      <4>[   23.432629]  driver_register+0x52/0xc0
      <4>[   23.432635]  ? 0xffffffffa02d7000
      <4>[   23.432642]  do_one_initcall+0x58/0x370
      <4>[   23.432653]  ? do_init_module+0x1d/0x1ea
      <4>[   23.432660]  ? rcu_read_lock_sched_held+0x6f/0x80
      <4>[   23.432667]  ? kmem_cache_alloc_trace+0x282/0x2e0
      <4>[   23.432675]  do_init_module+0x56/0x1ea
      <4>[   23.432682]  load_module+0x2435/0x2b20
      <4>[   23.432694]  ? __se_sys_finit_module+0xd3/0xf0
      <4>[   23.432701]  __se_sys_finit_module+0xd3/0xf0
      <4>[   23.432710]  do_syscall_64+0x55/0x190
      <4>[   23.432717]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      <4>[   23.432724] RIP: 0033:0x7fa780782839
      <4>[   23.432729] RSP: 002b:00007ffcea73e668 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
      <4>[   23.432738] RAX: ffffffffffffffda RBX: 0000561a472a4b30 RCX: 00007fa780782839
      <4>[   23.432745] RDX: 0000000000000000 RSI: 00007fa7804610e5 RDI: 000000000000000e
      <4>[   23.432752] RBP: 00007fa7804610e5 R08: 0000000000000000 R09: 00007ffcea73e780
      <4>[   23.432758] R10: 000000000000000e R11: 0000000000000246 R12: 0000000000000000
      <4>[   23.432765] R13: 0000561a47296450 R14: 0000000000020000 R15: 0000561a472a4b30
      
      but did not report it as an issue as it only occurred during the first
      module load on boot. This is due to the removal of the distinct global
      timeline, and its separate lock class. So instead mark up the expected
      nesting. An alternative would be to define a separate lock class for the
      engine, but since we only expect to have a single point of nesting, we
      can avoid having multiple lock classes for the struct.
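
      As a rough sketch of the technique (structure and helper names below are
      illustrative, not the exact i915 change): lockdep's spin_lock_nested()
      with SINGLE_DEPTH_NESTING marks the inner acquisition of a second lock of
      the same class as an intentional, bounded recursion rather than a deadlock.

      #include <linux/spinlock.h>
      #include <linux/lockdep.h>
      #include <linux/list.h>

      struct timeline {
              spinlock_t lock;
              struct list_head requests;
      };

      struct request {
              struct list_head link;
      };

      /* Caller already holds from->lock (the engine's timeline). */
      static void move_request(struct request *rq,
                               struct timeline *from, struct timeline *to)
      {
              lockdep_assert_held(&from->lock);

              /* Same lock class as 'from': annotate the single expected level
               * of nesting so lockdep does not report recursive locking. */
              spin_lock_nested(&to->lock, SINGLE_DEPTH_NESTING);
              list_move_tail(&rq->link, &to->requests);
              spin_unlock(&to->lock);
      }

      Keeping a single annotated nesting point avoids defining a second lock
      class for the engine timeline, as the message above explains.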
      
      Fixes: a89d1f92 ("drm/i915: Split i915_gem_timeline into individual timelines")
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Tested-by: Michel Thierry <michel.thierry@intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180508153514.20251-1-chris@chris-wilson.co.uk
  2. 08 May 2018, 1 commit
  3. 04 May 2018, 2 commits
    • drm/i915: Remove assertion of active_rings must be non-empty if active_requests · 43c8c441
      Authored by Chris Wilson
      "An outstanding request must still be on an active ring somewhere" is
      only true if we haven't just been interrupted by the shrinker in the
      middle of allocating the request itself. (At the start of
      i915_request_alloc() we pin the context and prepare the GT for activity,
      marking it as active, and then try to allocate the request. If this
      allocation invokes the shrinker, we try to reclaim some space by calling
      i915_retire_requests() which may then be confused by the pre-reservation
      of active_requests.)
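
      A simplified sketch of a retire loop written under that assumption
      (structure and field names are illustrative, not the exact i915 code):
      the active_requests count may run ahead of the active_rings list, so the
      loop walks whatever rings are live instead of asserting the list is
      non-empty.

      #include <linux/list.h>
      #include <linux/types.h>

      struct ring {
              struct list_head active_link;
      };

      struct gt_state {
              unsigned int active_requests;
              struct list_head active_rings;
      };

      static void ring_retire_requests(struct ring *ring) { /* ... */ }

      static void retire_requests(struct gt_state *gt)
      {
              struct ring *ring, *next;

              if (!gt->active_requests)
                      return;

              /* No assertion that active_rings is non-empty: the shrinker may
               * have interrupted request allocation after the active_requests
               * pre-reservation but before any ring joined the active list. */
              list_for_each_entry_safe(ring, next, &gt->active_rings, active_link)
                      ring_retire_requests(ring);
      }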
      
      <3>[  125.472695] i915_retire_requests:1429 GEM_BUG_ON(list_empty(&i915->gt.active_rings))
      <2>[  125.472792] kernel BUG at drivers/gpu/drm/i915/i915_request.c:1429!
      <4>[  125.472822] invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
      <4>[  125.498764] Modules linked in: snd_hda_codec_hdmi x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel btusb btrtl btbcm btintel cdc_ether snd_hda_codec_realtek bluetooth i915 snd_hda_codec_generic usbnet r8152 mii ecdh_generic lpc_ich mei_me snd_hda_intel snd_hda_codec mei snd_hwdep snd_hda_core snd_pcm prime_numbers
      <4>[  125.498923] CPU: 0 PID: 1115 Comm: gem_exec_create Tainted: G     U            4.17.0-rc3-gc49cbe0d1eb8-kasan_32+ #1
      <4>[  125.498955] Hardware name: GOOGLE Peppy/Peppy, BIOS MrChromebox 02/04/2018
      <4>[  125.499074] RIP: 0010:i915_retire_requests+0x3f2/0x590 [i915]
      <4>[  125.499095] RSP: 0018:ffff88004e5dec40 EFLAGS: 00010282
      <4>[  125.499117] RAX: 0000000000000010 RBX: ffff8800458f0000 RCX: 0000000000000000
      <4>[  125.499140] RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff880060c2f6f0
      <4>[  125.499164] RBP: ffff88004e5dee30 R08: ffffed000c185ee6 R09: ffffed000c185ee6
      <4>[  125.499187] R10: 0000000000000001 R11: ffffed000c185ee5 R12: ffff8800553da160
      <4>[  125.499210] R13: dffffc0000000000 R14: 0000000000000000 R15: ffff8800458faed0
      <4>[  125.499235] FS:  00007fe18f052980(0000) GS:ffff880065400000(0000) knlGS:0000000000000000
      <4>[  125.499262] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      <4>[  125.499282] CR2: 00007f01df11efb8 CR3: 00000000518d4001 CR4: 00000000000606f0
      <4>[  125.499304] Call Trace:
      <4>[  125.499417]  i915_gem_shrink+0x576/0xb50 [i915]
      <4>[  125.499532]  ? i915_gem_shrinker_count+0x2f0/0x2f0 [i915]
      <4>[  125.499561]  ? trace_hardirqs_on_thunk+0x1a/0x1c
      <4>[  125.499671]  ? i915_gem_shrinker_count+0x1d6/0x2f0 [i915]
      <4>[  125.499782]  ? i915_gem_shrinker_scan+0xc4/0x320 [i915]
      <4>[  125.499889]  i915_gem_shrinker_scan+0xc4/0x320 [i915]
      <4>[  125.499997]  ? i915_gem_shrinker_vmap+0x3a0/0x3a0 [i915]
      <4>[  125.500021]  ? do_raw_spin_unlock+0x4f/0x240
      <4>[  125.500042]  ? _raw_spin_unlock+0x29/0x40
      <4>[  125.500149]  ? i915_gem_shrinker_count+0x1d6/0x2f0 [i915]
      <4>[  125.500177]  shrink_slab.part.18+0x23e/0x8f0
      <4>[  125.500202]  ? unregister_shrinker+0x1f0/0x1f0
      <4>[  125.500226]  ? mem_cgroup_iter+0x379/0xcc0
      <4>[  125.500249]  shrink_node+0xa7e/0x1180
      <4>[  125.500276]  ? shrink_node_memcg+0x11f0/0x11f0
      <4>[  125.500297]  ? __delayacct_freepages_start+0x38/0x80
      <4>[  125.500319]  ? __is_insn_slot_addr+0xe3/0x1a0
      <4>[  125.500342]  ? recalibrate_cpu_khz+0x10/0x10
      <4>[  125.500361]  ? ktime_get+0xb2/0x140
      <4>[  125.500382]  do_try_to_free_pages+0x2d3/0xe40
      <4>[  125.500407]  ? allow_direct_reclaim.part.23+0x1e0/0x1e0
      <4>[  125.500429]  ? shrink_node+0x1180/0x1180
      <4>[  125.500450]  ? __read_once_size_nocheck.constprop.4+0x10/0x10
      <4>[  125.500476]  try_to_free_pages+0x1af/0x560
      <4>[  125.500497]  ? do_try_to_free_pages+0xe40/0xe40
      <4>[  125.500525]  __alloc_pages_nodemask+0xadc/0x2130
      <4>[  125.500553]  ? gfp_pfmemalloc_allowed+0x150/0x150
      <4>[  125.500654]  ? i915_gem_do_execbuffer+0x219d/0x32e0 [i915]
      <4>[  125.500678]  ? debug_check_no_locks_freed+0x2a0/0x2a0
      <4>[  125.500701]  ? __debug_object_init+0x322/0xd90
      <4>[  125.500722]  ? debug_check_no_locks_freed+0x2a0/0x2a0
      <4>[  125.500827]  ? i915_gem_do_execbuffer+0xdc2/0x32e0 [i915]
      <4>[  125.500942]  ? i915_request_alloc+0x5b5/0x13f0 [i915]
      <4>[  125.500964]  ? page_frag_free+0x170/0x170
      <4>[  125.500984]  ? debug_check_no_locks_freed+0x2a0/0x2a0
      <4>[  125.501008]  new_slab+0x21d/0x5c0
      <4>[  125.501029]  ___slab_alloc.constprop.35+0x322/0x3e0
      <4>[  125.501052]  ? reservation_object_reserve_shared+0x10b/0x250
      <4>[  125.501074]  ? __ww_mutex_lock.constprop.3+0x1104/0x2cf0
      <4>[  125.501097]  ? _raw_spin_unlock_irqrestore+0x39/0x60
      <4>[  125.501120]  ? fs_reclaim_acquire+0x10/0x10
      <4>[  125.501138]  ? lock_acquire+0x138/0x3c0
      <4>[  125.501156]  ? lock_acquire+0x3c0/0x3c0
      <4>[  125.501176]  ? reservation_object_reserve_shared+0x10b/0x250
      <4>[  125.501198]  ? __slab_alloc.isra.27.constprop.34+0x3d/0x70
      <4>[  125.501219]  __slab_alloc.isra.27.constprop.34+0x3d/0x70
      <4>[  125.501243]  ? reservation_object_reserve_shared+0x10b/0x250
      <4>[  125.501265]  __kmalloc_track_caller+0x313/0x350
      <4>[  125.501287]  krealloc+0x62/0xb0
      <4>[  125.501305]  reservation_object_reserve_shared+0x10b/0x250
      <4>[  125.501411]  i915_gem_do_execbuffer+0x2040/0x32e0 [i915]
      <4>[  125.501522]  ? eb_relocate_slow+0xad0/0xad0 [i915]
      <4>[  125.501544]  ? debug_check_no_locks_freed+0x2a0/0x2a0
      <4>[  125.501646]  ? i915_gem_execbuffer2_ioctl+0x108/0x770 [i915]
      <4>[  125.501755]  ? i915_gem_execbuffer2_ioctl+0x108/0x770 [i915]
      <4>[  125.501779]  ? drm_dev_get+0x20/0x20
      <4>[  125.501803]  ? __might_fault+0xea/0x1a0
      <4>[  125.501902]  ? i915_gem_execbuffer2_ioctl+0x108/0x770 [i915]
      <4>[  125.502012]  ? i915_gem_execbuffer_ioctl+0xb90/0xb90 [i915]
      <4>[  125.502116]  ? i915_gem_execbuffer_ioctl+0xb90/0xb90 [i915]
      <4>[  125.502218]  i915_gem_execbuffer2_ioctl+0x3c5/0x770 [i915]
      <4>[  125.502243]  ? drm_dev_enter+0xe0/0xe0
      <4>[  125.502260]  ? lock_acquire+0x138/0x3c0
      <4>[  125.502362]  ? i915_gem_execbuffer_ioctl+0xb90/0xb90 [i915]
      <4>[  125.502470]  ? i915_gem_object_create.part.28+0x570/0x570 [i915]
      <4>[  125.502575]  ? i915_gem_execbuffer_ioctl+0xb90/0xb90 [i915]
      <4>[  125.502680]  ? i915_gem_execbuffer_ioctl+0xb90/0xb90 [i915]
      <4>[  125.502702]  drm_ioctl_kernel+0x151/0x200
      <4>[  125.502721]  ? drm_ioctl_permit+0x2a0/0x2a0
      <4>[  125.502746]  drm_ioctl+0x63a/0x920
      <4>[  125.502844]  ? i915_gem_execbuffer_ioctl+0xb90/0xb90 [i915]
      <4>[  125.502868]  ? drm_getstats+0x20/0x20
      <4>[  125.502886]  ? trace_hardirqs_on_thunk+0x1a/0x1c
      <4>[  125.502919]  do_vfs_ioctl+0x173/0xe90
      <4>[  125.502936]  ? trace_hardirqs_on_thunk+0x1a/0x1c
      <4>[  125.502957]  ? ioctl_preallocate+0x170/0x170
      <4>[  125.502978]  ? trace_hardirqs_on_thunk+0x1a/0x1c
      <4>[  125.503002]  ? retint_kernel+0x2d/0x2d
      <4>[  125.503024]  ksys_ioctl+0x35/0x60
      <4>[  125.503043]  __x64_sys_ioctl+0x6a/0xb0
      <4>[  125.503061]  do_syscall_64+0x97/0x400
      <4>[  125.503081]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      <4>[  125.503101] RIP: 0033:0x7fe18e4f65d7
      <4>[  125.503116] RSP: 002b:00007ffe2ffc06a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
      <4>[  125.503145] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fe18e4f65d7
      <4>[  125.503168] RDX: 00007ffe2ffc07f0 RSI: 0000000040406469 RDI: 0000000000000003
      <4>[  125.503191] RBP: 00007ffe2ffc07f0 R08: 0000000000000004 R09: 00007ffe2ffcf080
      <4>[  125.503215] R10: 000000000002c7de R11: 0000000000000246 R12: 0000000040406469
      <4>[  125.503238] R13: 0000000000000003 R14: 0000000000000000 R15: 0000000000000000
      <4>[  125.503268] Code: e8 18 a0 c9 da 48 8b 35 25 3a 47 00 49 c7 c0 a0 3b 88 c0 b9 95 05 00 00 48 c7 c2 e0 49 88 c0 48 c7 c7 8d 3b 5d c0 e8 ee 7e db da <0f> 0b 48 89 ef e8 a4 26 f5 da e9 51 fe ff ff e8 8a 26 f5 da e9
      <1>[  125.503548] RIP: i915_retire_requests+0x3f2/0x590 [i915] RSP: ffff88004e5dec40
      
      Fixes: 643b450a ("drm/i915: Only track live rings for retiring")
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180504101147.26286-1-chris@chris-wilson.co.uk
    • drm/i915: Keep one request in our ring_list · 7c572e1b
      Authored by Chris Wilson
      Don't pre-emptively retire the oldest request in our ring's list if it
      is the only request. We keep various bits of state alive using the
      active reference from the request and would rather transfer that state
      over to a new request than go through the more involved process of
      retiring and reacquiring it.
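
      A minimal sketch of that policy (names are illustrative, not the exact
      upstream diff): when opportunistically retiring from the allocation path,
      leave the oldest request alone if it is the only entry on the ring's list.

      #include <linux/list.h>
      #include <linux/types.h>

      struct request {
              struct list_head ring_link;
      };

      struct ring {
              struct list_head request_list;
      };

      static bool request_completed(const struct request *rq) { return true; }
      static void request_retire(struct request *rq) { list_del(&rq->ring_link); }

      static void retire_oldest_unless_last(struct ring *ring)
      {
              struct request *rq;

              if (list_empty(&ring->request_list))
                      return;

              rq = list_first_entry(&ring->request_list, typeof(*rq), ring_link);

              /* Keep the sole remaining request: its active reference is
               * cheaper to hand over to the new request than to retire and
               * reacquire the state it keeps alive. */
              if (!list_is_last(&rq->ring_link, &ring->request_list) &&
                  request_completed(rq))
                      request_retire(rq);
      }
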
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180503195115.22309-2-chris@chris-wilson.co.uk
  4. 03 May 2018, 3 commits
  5. 30 Apr 2018, 4 commits
  6. 19 Apr 2018, 2 commits
  7. 09 Apr 2018, 1 commit
  8. 07 Apr 2018, 2 commits
    • drm/i915: Pass the set of guilty engines to i915_reset() · d0667e9c
      Authored by Chris Wilson
      Currently, we rely on inspecting the hangcheck state from within the
      i915_reset() routines to determine which engines were guilty of the
      hang. This is problematic for cases where we want to run
      i915_handle_error() and call i915_reset() independently of hangcheck.
      Instead of relying on the indirect parameter passing, turn it into an
      explicit parameter providing the set of stalled engines which then are
      treated as guilty until proven innocent.
      
      While we are removing the implicit stalled parameter, also make the
      reason into an explicit parameter to i915_reset(). We still need a
      back-channel for i915_handle_error() to hand over the task to the locked
      waiter, but let's keep that its own channel rather than incriminate
      another.
      
      This leaves stalled/seqno as being private to hangcheck, with no more
      nefarious snooping by reset, be it whole-device or per-engine. \o/
      
      The only real issue now is that this makes it crystal clear that we
      don't actually do any testing of hangcheck per se in
      drv_selftest/live_hangcheck, merely of resets!
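
      A rough sketch of the shape of the change (types and signatures here are
      illustrative, not the upstream prototypes): the caller computes which
      engines stalled and hands that set to the reset path as an explicit
      bitmask, together with an explicit reason string.

      #include <linux/bitops.h>
      #include <linux/types.h>

      #define MAX_ENGINES 8

      struct gpu;

      static void engine_mark_guilty(struct gpu *gpu, unsigned int engine_id)
      {
              /* contexts on a stalled engine are presumed guilty */
      }

      /* Engines set in @stalled_mask are treated as guilty until proven
       * innocent; the reset path no longer inspects hangcheck state. */
      static void gpu_reset(struct gpu *gpu, unsigned long stalled_mask,
                            const char *reason)
      {
              unsigned int id;

              for_each_set_bit(id, &stalled_mask, MAX_ENGINES)
                      engine_mark_guilty(gpu, id);

              /* ... perform the device or per-engine reset ... */
      }
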
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Michel Thierry <michel.thierry@intel.com>
      Cc: Jeff McGee <jeff.mcgee@intel.com>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Reviewed-by: Michel Thierry <michel.thierry@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180406220354.18911-2-chris@chris-wilson.co.uk
    • drm/i915: Split out parking from the idle worker for reuse · e4d2006f
      Authored by Chris Wilson
      We will want to park GEM before disengaging the drive^W^W^W unwedging.
      Since we already do the work for idling, expose the guts as a new
      function that we can then reuse.
      
      v2: Just skip if already parked; makes it more forgiving to use by
      future callers.
      v3: Extract mark_busy, rename it to i915_gem_unpark and place it next to
      i915_gem_park so that we can evaluate it for symmetry more easily.
      Calling GEM from inside i915_request looks to be a bit of a layering
      violation; for the moment I am imagining them as being notify_cb.
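
      A minimal sketch of the split (the i915_gem_park/i915_gem_unpark names
      follow the text above; the internals are illustrative): the parking work
      is a standalone function that simply returns if the GT is still busy or
      already parked, so both the idle worker and the unwedge path can call it.

      #include <linux/mutex.h>
      #include <linux/lockdep.h>
      #include <linux/types.h>

      struct gt {
              struct mutex lock;              /* stand-in for struct_mutex */
              unsigned int active_requests;
              bool awake;
      };

      static void i915_gem_park(struct gt *gt)
      {
              lockdep_assert_held(&gt->lock);

              if (gt->active_requests)        /* still busy */
                      return;

              if (!gt->awake)                 /* already parked: be forgiving */
                      return;

              gt->awake = false;
              /* ... cancel retire/hangcheck workers, release the GT wakeref ... */
      }

      static void i915_gem_unpark(struct gt *gt)
      {
              lockdep_assert_held(&gt->lock);

              if (gt->awake)
                      return;

              gt->awake = true;
              /* ... take the GT wakeref, queue the retire worker ... */
      }
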
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
      Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> #v1
      Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180406155144.27791-1-chris@chris-wilson.co.uk
  9. 29 Mar 2018, 1 commit
  10. 22 Mar 2018, 2 commits
  11. 20 Mar 2018, 1 commit
  12. 16 Mar 2018, 2 commits
  13. 13 Mar 2018, 1 commit
  14. 09 Mar 2018, 2 commits
  15. 07 Mar 2018, 2 commits
  16. 24 Feb 2018, 1 commit
  17. 22 Feb 2018, 1 commit
  18. 21 Feb 2018, 1 commit
  19. 08 Feb 2018, 2 commits
  20. 07 Feb 2018, 2 commits
  21. 05 Feb 2018, 1 commit
  22. 01 Feb 2018, 1 commit
    • drm/i915: Always run hangcheck while the GPU is busy · b26a32a8
      Authored by Chris Wilson
      Previously, we relied on only running the hangcheck while somebody was
      waiting on the GPU, in order to minimise the amount of time hangcheck
      had to run. (If nobody was watching the GPU, nobody would notice if the
      GPU wasn't responding -- eventually somebody would care and so kick
      hangcheck into action.) However, this falls apart from around commit
      4680816b ("drm/i915: Wait first for submission, before waiting for
      request completion"), as not all waiters declare themselves to hangcheck
      and so we could switch off hangcheck and miss GPU hangs even when
      waiting under the struct_mutex.
      
      If we enable hangcheck from the first request submission, and let it run
      until the GPU is idle again, we forgo all the complexity involved with
      only enabling around waiters. We just have to remember to be careful that
      we do not declare a GPU hang when idly waiting for the next request to
      become ready, as we will run hangcheck continuously even when the
      engines are stalled waiting for external events. This should be true
      already as we should only be tracking requests submitted to hardware for
      execution as an indicator that the engine is busy.
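
      A sketch of that scheme (names are illustrative): hangcheck is armed when
      the GPU leaves idle, i.e. on the first request submitted, and its worker
      keeps re-queuing itself until the GPU parks again, rather than being
      switched on and off around waiters.

      #include <linux/workqueue.h>
      #include <linux/jiffies.h>
      #include <linux/kernel.h>
      #include <linux/types.h>

      #define HANGCHECK_PERIOD msecs_to_jiffies(1500)

      struct gpu {
              bool awake;
              struct delayed_work hangcheck_work;
      };

      static void queue_hangcheck(struct gpu *gpu)
      {
              queue_delayed_work(system_long_wq, &gpu->hangcheck_work,
                                 HANGCHECK_PERIOD);
      }

      /* Called when the GPU leaves idle, i.e. on the first request submitted. */
      static void mark_busy(struct gpu *gpu)
      {
              if (gpu->awake)
                      return;

              gpu->awake = true;
              queue_hangcheck(gpu);
      }

      static void hangcheck_worker(struct work_struct *work)
      {
              struct gpu *gpu = container_of(work, struct gpu, hangcheck_work.work);

              /* Sample engine state here; only requests actually submitted to
               * hardware count as expected progress, so an engine stalled on an
               * external event is not declared hung. */

              if (gpu->awake)                 /* keep running until idle */
                      queue_hangcheck(gpu);
      }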
      
      Fixes: 4680816b ("drm/i915: Wait first for submission, before waiting for request completion")
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104840
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180129144104.3921-1-chris@chris-wilson.co.uk
      Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      (cherry picked from commit 88923048)
      Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
  23. 31 Jan 2018, 1 commit
    • drm/i915: Always run hangcheck while the GPU is busy · 88923048
      Authored by Chris Wilson
      Previously, we relied on only running the hangcheck while somebody was
      waiting on the GPU, in order to minimise the amount of time hangcheck
      had to run. (If nobody was watching the GPU, nobody would notice if the
      GPU wasn't responding -- eventually somebody would care and so kick
      hangcheck into action.) However, this falls apart from around commit
      4680816b ("drm/i915: Wait first for submission, before waiting for
      request completion"), as not all waiters declare themselves to hangcheck
      and so we could switch off hangcheck and miss GPU hangs even when
      waiting under the struct_mutex.
      
      If we enable hangcheck from the first request submission, and let it run
      until the GPU is idle again, we forgo all the complexity involved with
      only enabling around waiters. We just have to remember to be careful that
      we do not declare a GPU hang when idly waiting for the next request to
      become ready, as we will run hangcheck continuously even when the
      engines are stalled waiting for external events. This should be true
      already as we should only be tracking requests submitted to hardware for
      execution as an indicator that the engine is busy.
      
      Fixes: 4680816b ("drm/i915: Wait first for submission, before waiting for request completion")
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104840
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180129144104.3921-1-chris@chris-wilson.co.uk
      Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
  24. 29 Jan 2018, 1 commit
  25. 24 Jan 2018, 1 commit
  26. 20 Jan 2018, 1 commit