1. 23 Mar 2020, 1 commit
    • drm/i915/gt: Mark timeline->cacheline as destroyed after rcu grace period · 8e87e013
      Authored by Chris Wilson
      Since we take advantage of RCU for some i915_active objects, like the
      intel_timeline_cacheline, we need to delay the i915_active_fini until
      after the RCU grace period and we perform the kfree -- that is until
      after all RCU protected readers.
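
      A minimal sketch of the resulting destruction path, assuming a
      reduced struct (the i915 types are taken as in scope, and the helper
      names are illustrative rather than the exact code from the patch):

          #include <linux/rcupdate.h>
          #include <linux/slab.h>

          struct intel_timeline_cacheline {
                  struct i915_active active; /* peeked at by RCU readers */
                  struct rcu_head rcu;
                  /* ... the real struct also carries the vma and vaddr */
          };

          static void __cacheline_free_rcu(struct rcu_head *rcu)
          {
                  struct intel_timeline_cacheline *cl =
                          container_of(rcu, typeof(*cl), rcu);

                  /* All rcu_read_lock() readers have drained, so it is
                   * now safe to mark the i915_active as destroyed and
                   * release the memory.
                   */
                  i915_active_fini(&cl->active);
                  kfree(cl);
          }

          static void cacheline_free(struct intel_timeline_cacheline *cl)
          {
                  /* Readers may still inspect cl->active under RCU, so
                   * defer i915_active_fini() past the grace period
                   * instead of calling it here.
                   */
                  call_rcu(&cl->rcu, __cacheline_free_rcu);
          }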
      
      <3> [108.204873] ODEBUG: assert_init not available (active state 0) object type: i915_active hint: __cacheline_active+0x0/0x80 [i915]
      <4> [108.207377] WARNING: CPU: 3 PID: 2342 at lib/debugobjects.c:488 debug_print_object+0x67/0x90
      <4> [108.207400] Modules linked in: vgem snd_hda_codec_hdmi x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul snd_hda_intel ghash_clmulni_intel snd_intel_dspcfg snd_hda_codec ax88179_178a snd_hwdep usbnet btusb snd_hda_core btrtl mii btbcm btintel snd_pcm bluetooth ecdh_generic ecc i915 i2c_hid pinctrl_sunrisepoint pinctrl_intel intel_lpss_pci prime_numbers
      <4> [108.207587] CPU: 3 PID: 2342 Comm: gem_exec_parall Tainted: G     U            5.6.0-rc6-CI-Patchwork_17047+ #1
      <4> [108.207609] Hardware name: Google Soraka/Soraka, BIOS MrChromebox-4.10 08/25/2019
      <4> [108.207639] RIP: 0010:debug_print_object+0x67/0x90
      <4> [108.207668] Code: 83 c2 01 8b 4b 14 4c 8b 45 00 89 15 87 d2 8a 02 8b 53 10 4c 89 e6 48 c7 c7 38 2b 32 82 48 8b 14 d5 80 2f 07 82 e8 49 d5 b7 ff <0f> 0b 5b 83 05 c3 f6 22 01 01 5d 41 5c c3 83 05 b8 f6 22 01 01 c3
      <4> [108.207692] RSP: 0018:ffffc90000e7f890 EFLAGS: 00010282
      <4> [108.207723] RAX: 0000000000000000 RBX: ffffc90000e7f8b0 RCX: 0000000000000001
      <4> [108.207747] RDX: 0000000080000001 RSI: ffff88817ada8cb8 RDI: 00000000ffffffff
      <4> [108.207770] RBP: ffffffffa0341cc0 R08: ffff88816b5a8948 R09: 0000000000000000
      <4> [108.207792] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff82322d54
      <4> [108.207814] R13: ffffffffa0341cc0 R14: ffffffff83df9568 R15: ffff88816064f400
      <4> [108.207839] FS:  00007f437d753700(0000) GS:ffff88817ad80000(0000) knlGS:0000000000000000
      <4> [108.207863] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      <4> [108.207887] CR2: 00007f2ad1fb5000 CR3: 00000001725d8004 CR4: 00000000003606e0
      <4> [108.207907] Call Trace:
      <4> [108.207959]  debug_object_assert_init+0x15c/0x180
      <4> [108.208475]  ? i915_active_acquire_if_busy+0x10/0x50 [i915]
      <4> [108.208513]  ? rcu_read_lock_held+0x4d/0x60
      <4> [108.208970]  i915_active_acquire_if_busy+0x10/0x50 [i915]
      <4> [108.209380]  intel_timeline_read_hwsp+0x81/0x540 [i915]
      <4> [108.210262]  __emit_semaphore_wait+0x45/0x1b0 [i915]
      <4> [108.210726]  ? i915_request_await_dma_fence+0x143/0x560 [i915]
      <4> [108.211156]  i915_request_await_dma_fence+0x28a/0x560 [i915]
      <4> [108.211633]  i915_request_await_object+0x24a/0x3f0 [i915]
      <4> [108.212102]  eb_submit.isra.47+0x58f/0x920 [i915]
      <4> [108.212622]  i915_gem_do_execbuffer+0x1706/0x2c70 [i915]
      <4> [108.213071]  ? i915_gem_execbuffer2_ioctl+0xc0/0x470 [i915]
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Matthew Auld <matthew.auld@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20200323092841.22240-1-chris@chris-wilson.co.uk
  2. 07 Mar 2020, 1 commit
  3. 03 Feb 2020, 1 commit
  4. 31 Jan 2020, 1 commit
  5. 09 Jan 2020, 1 commit
  6. 18 Dec 2019, 1 commit
  7. 14 Dec 2019, 1 commit
  8. 28 Nov 2019, 1 commit
    • drm/i915: Serialise i915_active_fence_set() with itself · df9f85d8
      Authored by Chris Wilson
      The expected downside to commit 58b4c1a0 ("drm/i915: Reduce nested
      prepare_remote_context() to a trylock") was that it would need to return
      -EAGAIN to userspace in order to resolve potential mutex inversion. Such
      an unsightly round trip is unnecessary if we could atomically insert a
      barrier into the i915_active_fence, so make it happen.
      
      Currently, we use the timeline->mutex (or some other named outer lock)
      to order insertion into the i915_active_fence (and so individual nodes
      of i915_active). Inside __i915_active_fence_set, we then only need to
      serialise with the interrupt handler in order to claim the timeline
      for ourselves.
      
      However, if we remove the outer lock, we need to ensure the order is
      intact between not only multiple threads trying to insert themselves
      into the timeline, but also with the interrupt handler completing the
      previous occupant. We use xchg() on insert so that we have an ordered
      sequence of insertions (and each caller knows the previous fence on
      which to wait, preserving the chain of all fences in the timeline), but
      we then have to cmpxchg() in the interrupt handler to avoid overwriting
      the new occupant. The only nasty side-effect is having to temporarily
      strip off the RCU-annotations to apply the atomic operations, otherwise
      the rules are much more conventional!
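
      A hedged sketch of that pairing, with simplified types (the real
      __i915_active_fence_set() also manages the callback list under an
      irq-safe lock; only the atomics are shown here):

          #include <linux/atomic.h>
          #include <linux/dma-fence.h>

          struct active_fence {
                  struct dma_fence __rcu *fence;
          };

          static struct dma_fence **
          __active_fence_slot(struct active_fence *ref)
          {
                  /* Temporarily strip the RCU annotation so the atomic
                   * operations below can be applied to the slot.
                   */
                  return (struct dma_fence ** __force)&ref->fence;
          }

          /* Insertion: atomically claim the slot. The returned previous
           * occupant tells the caller which fence to wait on, preserving
           * the chain of all fences in the timeline.
           */
          static struct dma_fence *
          active_fence_set(struct active_fence *ref, struct dma_fence *f)
          {
                  return xchg(__active_fence_slot(ref), f);
          }

          /* Interrupt handler: clear the slot only if we still own it,
           * so a newly xchg()'d occupant is never overwritten.
           */
          static void
          active_fence_signal(struct active_fence *ref, struct dma_fence *f)
          {
                  cmpxchg(__active_fence_slot(ref), f, NULL);
          }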
      
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112402
      Fixes: 58b4c1a0 ("drm/i915: Reduce nested prepare_remote_context() to a trylock")
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20191127134527.3438410-1-chris@chris-wilson.co.uk
  9. 25 Nov 2019, 3 commits
    • drm/i915/gt: Schedule request retirement when timeline idles · 31177017
      Authored by Chris Wilson
      The major drawback of commit 7e34f4e4 ("drm/i915/gen8+: Add RC6 CTX
      corruption WA") is that it disables RC6 while Skylake (and friends) is
      active, and we do not consider the GPU idle until all outstanding
      requests have been retired and the engine switched over to the kernel
      context. If userspace is idle, this task falls onto our background idle
      worker, which only runs roughly once a second, meaning that userspace has
      to have been idle for a couple of seconds before we enable RC6 again.
      Naturally, this causes us to consume considerably more energy than
      before as powersaving is effectively disabled while a display server
      (here's looking at you Xorg) is running.
      
      Since execlists receive a completion event as each context is
      completed, we can use this interrupt to queue a retire worker bound
      to this engine to clean up idle timelines. We will then immediately
      notice the idle
      engine (without userspace intervention or the aid of the background
      retire worker) and start parking the GPU. Thus during light workloads,
      we will do much more work to idle the GPU faster...  Hopefully with
      commensurate power saving!
      
      v2: Watch context completions and only look at those local to the engine
      when retiring to reduce the amount of excess work we perform.
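
      A hedged sketch of the shape this takes; the worker and helper
      names are hypothetical stand-ins for the engine-local retirement
      added by the patch:

          #include <linux/workqueue.h>

          struct intel_engine_cs;
          /* hypothetical: retire only this engine's timelines */
          void engine_retire_requests(struct intel_engine_cs *engine);

          struct engine_retire {
                  struct work_struct work; /* INIT_WORK() at engine init */
                  struct intel_engine_cs *engine;
          };

          static void engine_retire_worker(struct work_struct *wrk)
          {
                  struct engine_retire *er =
                          container_of(wrk, typeof(*er), work);

                  /* Per the v2 note: look only at timelines local to
                   * this engine to bound the amount of work performed.
                   */
                  engine_retire_requests(er->engine);
          }

          /* Called from the execlists completion interrupt when a
           * context switches out, instead of waiting up to a second for
           * the background retire worker to notice the idle engine.
           */
          static void schedule_engine_retire(struct engine_retire *er)
          {
                  schedule_work(&er->work);
          }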
      
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112315
      References: 7e34f4e4 ("drm/i915/gen8+: Add RC6 CTX corruption WA")
      References: 2248a283 ("drm/i915/gen8+: Add RC6 CTX corruption WA")
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20191125105858.1718307-3-chris@chris-wilson.co.uk
      (cherry picked from commit 4f88f874)
      Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
    • drm/i915/gt: Close race between engine_park and intel_gt_retire_requests · ca1711d1
      Authored by Chris Wilson
      The general concept was that intel_timeline.active_count was locked by
      the intel_timeline.mutex. The exception was for power management, where
      the engine->kernel_context->timeline could be manipulated under the
      global wakeref.mutex.
      
      This was quite solid, as we always manipulated the timeline only while
      we held an engine wakeref.
      
      And then we started retiring requests outside of struct_mutex, only
      using the timelines.active_list and the timeline->mutex. There we
      started manipulating intel_timeline.active_count outside of an engine
      wakeref, and so introduced a race between __engine_park() and
      intel_gt_retire_requests(), a race that could result in the
      engine->kernel_context not being added to the active timelines and so
      losing requests, which caused us to keep the system permanently powered
      up [and unloadable].
      
      The race would be easy to close if we could take the engine wakeref for
      the timeline before we retire -- except timelines are not bound to any
      engine and so we would need to keep all active engines awake. The
      alternative is to guard intel_timeline_enter/intel_timeline_exit for use
      outside of the timeline->mutex.
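
      A hedged sketch of such a guard, assuming active_count becomes an
      atomic_t; the list and lock names stand in for the gt-wide timeline
      bookkeeping and are not the exact fields from the patch:

          #include <linux/atomic.h>
          #include <linux/list.h>
          #include <linux/spinlock.h>

          struct intel_timeline_sketch {
                  atomic_t active_count;
                  struct list_head link;
                  spinlock_t *gt_lock;         /* gt->timelines lock */
                  struct list_head *gt_active; /* gt active_list */
          };

          static void timeline_enter(struct intel_timeline_sketch *tl)
          {
                  /* Fast path: already active, just bump the count,
                   * with no engine wakeref or mutex required.
                   */
                  if (atomic_add_unless(&tl->active_count, 1, 0))
                          return;

                  spin_lock(tl->gt_lock);
                  if (!atomic_fetch_inc(&tl->active_count))
                          list_add_tail(&tl->link, tl->gt_active);
                  spin_unlock(tl->gt_lock);
          }

          static void timeline_exit(struct intel_timeline_sketch *tl)
          {
                  /* Only the final exit takes the lock and removes the
                   * timeline, closing the race with __engine_park().
                   */
                  if (!atomic_dec_and_lock(&tl->active_count, tl->gt_lock))
                          return;

                  list_del(&tl->link);
                  spin_unlock(tl->gt_lock);
          }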
      
      Fixes: e5dadff4 ("drm/i915: Protect request retirement with timeline->mutex")
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Matthew Auld <matthew.auld@intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20191120165514.3955081-1-chris@chris-wilson.co.uk
      (cherry picked from commit a6edbca7)
      Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
    • drm/i915/gt: Schedule request retirement when timeline idles · 4f88f874
      Authored by Chris Wilson
      The major drawback of commit 7e34f4e4 ("drm/i915/gen8+: Add RC6 CTX
      corruption WA") is that it disables RC6 while Skylake (and friends) is
      active, and we do not consider the GPU idle until all outstanding
      requests have been retired and the engine switched over to the kernel
      context. If userspace is idle, this task falls onto our background idle
      worker, which only runs roughly once a second, meaning that userspace has
      to have been idle for a couple of seconds before we enable RC6 again.
      Naturally, this causes us to consume considerably more energy than
      before as powersaving is effectively disabled while a display server
      (here's looking at you Xorg) is running.
      
      Since execlists receive a completion event as each context is
      completed, we can use this interrupt to queue a retire worker bound
      to this engine to clean up idle timelines. We will then immediately
      notice the idle
      engine (without userspace intervention or the aid of the background
      retire worker) and start parking the GPU. Thus during light workloads,
      we will do much more work to idle the GPU faster...  Hopefully with
      commensurate power saving!
      
      v2: Watch context completions and only look at those local to the engine
      when retiring to reduce the amount of excess work we perform.
      
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112315
      References: 7e34f4e4 ("drm/i915/gen8+: Add RC6 CTX corruption WA")
      References: 2248a283 ("drm/i915/gen8+: Add RC6 CTX corruption WA")
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20191125105858.1718307-3-chris@chris-wilson.co.uk
  10. 21 Nov 2019, 2 commits
  11. 20 Nov 2019, 1 commit
  12. 01 Nov 2019, 1 commit
  13. 24 Oct 2019, 1 commit
  14. 04 Oct 2019, 2 commits
  15. 20 Sep 2019, 2 commits
    • drm/i915: Protect timeline->hwsp dereferencing · 9eee0dd7
      Authored by Chris Wilson
      Not only is the signal->timeline volatile, but so is acquiring the
      timeline's HWSP. We must first carefully acquire the timeline from
      the signaling request and only then lock it. With the removal of the
      struct_mutex serialisation of request construction, we can have multiple
      timelines active at once, and so we must avoid using the nested mutex
      lock as it is quite possible for both timelines to be establishing
      semaphores on the other and so deadlock.
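
      A hedged sketch of that careful acquisition (the helper name is
      illustrative; the timeline is assumed to be kref-counted and the
      request's timeline pointer RCU-protected, as in this series):

          #include <linux/kref.h>
          #include <linux/rcupdate.h>

          static struct intel_timeline *
          get_signal_timeline(struct i915_request *signal)
          {
                  struct intel_timeline *tl;

                  rcu_read_lock();
                  tl = rcu_dereference(signal->timeline);
                  if (tl && !kref_get_unless_zero(&tl->kref))
                          tl = NULL; /* being freed: treat as retired */
                  rcu_read_unlock();

                  return tl;
          }

      Only once that reference is held do we take tl->mutex, and never
      while holding our own timeline's mutex, avoiding the deadlock of
      two timelines establishing semaphores on each other.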
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190919111912.21631-3-chris@chris-wilson.co.uk
    • drm/i915: Mark i915_request.timeline as a volatile, rcu pointer · d19d71fc
      Authored by Chris Wilson
      The request->timeline is only valid until the request is retired (i.e.
      before it is completed). Upon retiring the request, the context may be
      unpinned and freed, and along with it the timeline may be freed. We
      therefore need to be very careful when chasing rq->timeline that the
      pointer does not disappear beneath us. The vast majority of users are in
      a protected context, either during request construction or retirement,
      where the timeline->mutex is held and the timeline cannot disappear. It
      is those few off the beaten path (where we access a second timeline) that
      need extra scrutiny -- to be added in the next patch after first adding
      the warnings about dangerous access.
      
      One complication, where we cannot use the timeline->mutex itself, is
      during request submission onto hardware (under spinlocks). Here, we want
      to check on the timeline to finalize the breadcrumb, and so we need to
      impose a second rule to ensure that the request->timeline is indeed
      valid. As we are submitting the request, its context and timeline must
      be pinned, as it will be used by the hardware. Since it is pinned, we
      know the request->timeline must still be valid, and we cannot submit the
      idle barrier until after we release the engine->active.lock, ergo while
      submitting and holding that spinlock, a second thread cannot release the
      timeline.
      
      v2: Don't be lazy inside selftests; hold the timeline->mutex for as long
      as we need it, and tidy up acquiring the timeline with a bit of
      refactoring (i915_active_add_request)
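
      A hedged sketch of a checked accessor in the spirit of what this
      patch adds, letting lockdep verify the "timeline->mutex is held"
      rule at every plain dereference:

          #include <linux/lockdep.h>
          #include <linux/rcupdate.h>

          static inline struct intel_timeline *
          request_timeline(struct i915_request *rq)
          {
                  /* Legal only during construction/retirement, where
                   * the timeline's own mutex pins the pointer.
                   */
                  return rcu_dereference_protected(rq->timeline,
                          lockdep_is_held(&rcu_access_pointer(rq->timeline)->mutex));
          }

      Readers off the beaten path instead wrap rcu_dereference() in an
      explicit rcu_read_lock() section, as scrutinised in the next patch.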
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190919111912.21631-1-chris@chris-wilson.co.uk
  16. 07 Sep 2019, 1 commit
  17. 24 Aug 2019, 1 commit
  18. 17 Aug 2019, 2 commits
  19. 16 Aug 2019, 3 commits
  20. 26 Jun 2019, 2 commits
  21. 22 Jun 2019, 2 commits
  22. 21 Jun 2019, 4 commits
  23. 15 Jun 2019, 1 commit
  24. 06 Jun 2019, 1 commit
  25. 09 Apr 2019, 1 commit
  26. 21 Mar 2019, 1 commit
  27. 02 Mar 2019, 1 commit
    • drm/i915: Keep timeline HWSP allocated until idle across the system · ebece753
      Authored by Chris Wilson
      In preparation for enabling HW semaphores, we need to keep the
      in-flight timeline HWSP alive until its use across the entire system
      has completed, as any other timeline active on the GPU may still
      refer back to the already retired timeline. We therefore have to
      delay both recycling available cachelines and unpinning the old HWSP
      until the next idle point.
      
      An easy option would be to simply keep all used HWSP until the system as
      a whole was idle, i.e. we could release them all at once on parking.
      However, on a busy system, we may never see a global idle point,
      essentially meaning the resource will be leaked until we are forced to
      do a GC pass. We already employ a fine-grained idle detection mechanism
      for vma, which we can reuse here so that each cacheline can be freed
      immediately after the last request using it is retired.
      
      v3: Keep track of the activity of each cacheline.
      v4: cacheline_free() on canceling the seqno tracking
      v5: Finally with a testcase to exercise wraparound
      v6: Pack cacheline into empty bits of page-aligned vaddr
      v7: Use i915_utils to hide the pointer casting around bit manipulation
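
      A hedged sketch of the v6 packing trick: the HWSP vaddr is
      page-aligned, so its low bits are free to carry the cacheline index
      (CACHELINE_BITS is assumed to be 6, i.e. 64 seqno slots per 4K
      page; the i915 tree hides the casting behind ptr_pack_bits() and
      ptr_unpack_bits() in i915_utils.h, per the v7 note):

          #define CACHELINE_BITS 6
          #define CACHELINE_MASK ((1ul << CACHELINE_BITS) - 1)

          /* Pack a small cacheline index into the spare low bits of a
           * page-aligned pointer, saving a separate field.
           */
          static inline void *pack_cacheline(void *vaddr, unsigned long cl)
          {
                  return (void *)((unsigned long)vaddr | (cl & CACHELINE_MASK));
          }

          /* Recover the index and the original aligned pointer. */
          static inline unsigned long unpack_cacheline(void *packed,
                                                       void **vaddr)
          {
                  unsigned long v = (unsigned long)packed;

                  *vaddr = (void *)(v & ~CACHELINE_MASK);
                  return v & CACHELINE_MASK;
          }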
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190301170901.8340-2-chris@chris-wilson.co.uk