提交 · 65fcb8064dd0e54d4674e8e2c6bf6ed7264a29e9 · openanolis / cloud-kernel

03 5月, 2018 1 次提交

drm/i915: Move timeline from GTT to ring · 65fcb806

由 Chris Wilson 提交于 5月 02, 2018

In the future, we want to move a request between engines. To achieve
this, we first realise that we have two timelines in effect here. The
first runs through the GTT is required for ordering vma access, which is
tracked currently by engine. The second is implied by sequential
execution of commands inside the ringbuffer. This timeline is one that
maps to userspace's expectations when submitting requests (i.e. given the
same context, batch A is executed before batch B). As the rings's
timelines map to userspace and the GTT timeline an implementation
detail, move the timeline from the GTT into the ring itself (per-context
in logical-ring-contexts/execlists, or a global per-engine timeline for
the shared ringbuffers in legacy submission.

The two timelines are still assumed to be equivalent at the moment (no
migrating requests between engines yet) and so we can simply move from
one to the other without adding extra ordering.

v2: Reinforce that one isn't allowed to mix the engine execution
timeline with the client timeline from userspace (on the ring).
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180502163839.3248-1-chris@chris-wilson.co.uk

65fcb806

02 5月, 2018 1 次提交

drm/i915: Show ring->start for the ELSP context/request queue · 3a068721

由 Chris Wilson 提交于 5月 02, 2018

Since the advent of execlists, the HW no longer executes from a single
statically assigned ring, but instead switches to a different ring for
each context (logical ringbuffer contexts as it is called). So a good way
to tally the executing context against what we have queued is by
comparing the RING_START register against our requests. Make it so.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180502104150.29874-1-chris@chris-wilson.co.uk

3a068721

30 4月, 2018 2 次提交

drm/i915: Wrap engine->context_pin() and engine->context_unpin() · ab82a063

由 Chris Wilson 提交于 4月 30, 2018

Make life easier in upcoming patches by moving the context_pin and
context_unpin vfuncs into inline helpers.

v2: Fixup mock_engine to mark the context as pinned on use.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180430131503.5375-2-chris@chris-wilson.co.uk

ab82a063

drm/i915: Stop tracking timeline->inflight_seqnos · 52d7f16e

由 Chris Wilson 提交于 4月 30, 2018

In commit 9b6586ae ("drm/i915: Keep a global seqno per-engine"), we
moved from a global inflight counter to per-engine counters in the
hope that will be easy to run concurrently in future. However, with the
advent of the desire to move requests between engines, we do need a
global counter to preserve the semantics that no engine wraps in the
middle of a submit. (Although this semantic is now only required for gen7
semaphore support, which only supports greater-then comparisons!)

v2: Keep a global counter of all requests ever submitted and force the
reset when it wraps.

References: 9b6586ae ("drm/i915: Keep a global seqno per-engine")
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180430131503.5375-1-chris@chris-wilson.co.uk

52d7f16e

26 4月, 2018 1 次提交

drm/i915: Use seqlock in engine stats · 741258cd

由 Tvrtko Ursulin 提交于 4月 26, 2018

We can convert engine stats from a spinlock to seqlock to ensure interrupt
processing is never even a tiny bit delayed by parallel readers.

There is a smidgen bit more cost on the write lock side, and an extremely
unlikely chance that readers will have to retry a few times in face of
heavy interrupt load. But it should be extremely unlikely given how
lightweight read side section is compared to the interrupt processing
side, and also compared to the rest of the code paths which can lead into
it. Furthermore, writer is the ones doing the real, latency sensitive
work, while readers are only informative.
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Suggested-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180426074716.7352-1-tvrtko.ursulin@linux.intel.com

741258cd

24 4月, 2018 3 次提交

drm/i915: Skip printing global offsets for per-engine scratch pages · aaab22bc

由 Chris Wilson 提交于 4月 24, 2018

Knowing the offset of the per-engine scratch/HWS page during boot is not
very informative, so remove the DRM_DEBUG.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180424115236.2022-1-chris@chris-wilson.co.uk

aaab22bc

drm/i915: Don't dump umpteen thousand requests · 56021f48

由 Chris Wilson 提交于 4月 24, 2018

If we have more than a few, possibly several thousand request in the
queue, don't show the central portion, just the first few and the last
being executed and/or queued. The first few should be enough to help
identify a problem in execution, and most often comparing the first/last
in the queue is enough to identify problems in the scheduling.

We may need some fine tuning to set MAX_REQUESTS_TO_SHOW for common
debug scenarios, but for the moment if we can avoiding spending more
than a few seconds dumping the GPU state that will avoid a nasty
livelock (where hangcheck spends so long dumping the state, it fires
again and starts to dump the state again in parallel, ad infinitum).

v2: Remember to print last not the stale rq iter after the loop.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180424081600.27544-1-chris@chris-wilson.co.uk

56021f48

drm/i915: Build request info on stack before printk · 247870ac

由 Chris Wilson 提交于 4月 24, 2018

printk unhelpfully inserts a '\n' between consecutive calls, and since
our drm_printf wrapper may be emitting info a seq_file instead,
KERN_CONT is not an option. To work with any drm_printf destination, we
need to build up the output into a temporary buf on the stack and then
feed the complete line in a single call to printk.

Fixes: b7268c5e ("drm/i915: Pack params to engine->schedule() into a struct")
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180424010839.22860-1-chris@chris-wilson.co.uk

247870ac

19 4月, 2018 2 次提交

drm/i915: Pack params to engine->schedule() into a struct · b7268c5e

由 Chris Wilson 提交于 4月 18, 2018

Today we only want to pass along the priority to engine->schedule(), but
in the future we want to have much more control over the various aspects
of the GPU during a context's execution, for example controlling the
frequency allowed. As we need an ever growing number of parameters for
scheduling, move those into a struct for convenience.

v2: Move the anonymous struct into its own function for legibility and
ye olde gcc.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180418184052.7129-3-chris@chris-wilson.co.uk

b7268c5e

drm/i915: Rename priotree to sched · 0c7112a0

由 Chris Wilson 提交于 4月 18, 2018

Having moved the priotree struct into i915_scheduler.h, identify it as
the scheduling element and rebrand into i915_sched. This becomes more
useful as we start attaching more information we require to propagate
through the scheduler.

v2: Use i915_sched_node for future distinctiveness
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180418184052.7129-2-chris@chris-wilson.co.uk

0c7112a0

13 4月, 2018 1 次提交

drm/i915/cnl: Use mmio access to context status buffer · 61bf9719

由 Mika Kuoppala 提交于 4月 12, 2018

Evidence indicates that Cannonlake HWSP is not coherent
as it should. Revert to using mmio access for now.

Testcase: igt/gem_ctx_switch
References: https://bugs.freedesktop.org/show_bug.cgi?id=105888
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
Acked-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180412145802.23313-1-mika.kuoppala@linux.intel.com

61bf9719

12 4月, 2018 2 次提交

drm/i915: Move a bunch of workaround-related code to its own file · 7d3c425f

由 Oscar Mateo 提交于 4月 10, 2018

This has grown to be a sizable amount of code, so move it to
its own file before we try to refactor anything. For the moment,
we are leaving behind the WA BB code and the WAs that get applied
(incorrectly) in init_clock_gating, but we will deal with it later.

v2: Use intel_ prefix for code that deals with the hardware (Chris)
v3: Rebased
v4:
  - Rebased
  - New license header
v5:
  - Rebased
  - Added some organisational notes to the file (Chris)
v6: Include DOC section in the documentation build (Jani)
Signed-off-by: NOscar Mateo <oscar.mateo@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
[ickle: appease checkpatch, mostly]
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/1523376767-18480-1-git-send-email-oscar.mateo@intel.com

7d3c425f

drm/i915/execlists: Set queue priority from secondary port · 15c83c43

由 Chris Wilson 提交于 4月 11, 2018

We can refine our current execlists->queue_priority if we inspect
ELSP[1] rather than the head of the unsubmitted queue. Currently, we use
the unsubmitted queue and say that if a subsequent request is more
important than the current queue, we will rerun the submission tasklet
to evaluate the need for preemption. However, we only want to preempt if
we need to jump ahead of a currently executing request in ELSP. The
second reason for running the submission tasklet is amalgamate requests
into the active context on ELSP[0] to avoid a stall when ELSP[0] drains.
(Though repeatedly amalgamating requests into the active context and
triggering many lite-restore is off question gain, the goal really is to
put a context into ELSP[1] to cover the interrupt.) So if instead of
looking at the head of the queue, we look at the context in ELSP[1] we
can answer both of the questions more accurately -- we don't need to
rerun the submission tasklet unless our new request is important enough
to feed into, at least, ELSP[1].

v2: Add some comments from the discussion with Tvrtko.
v3: More commentary to cross-reference queue_request()

References: f6322edd ("drm/i915/preemption: Allow preemption between submission ports")
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180411103929.27374-1-chris@chris-wilson.co.uk

15c83c43

27 3月, 2018 1 次提交

drm/i915: Include submission tasklet state in engine dump · 90408713

由 Chris Wilson 提交于 3月 26, 2018

For the off-chance we have an interrupt posted and haven't processed the
CSB.

v2: Include tasklet enable/disable state for good measure.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180326115044.2505-4-chris@chris-wilson.co.ukReviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>

90408713

20 3月, 2018 1 次提交

drm/i915/icl: Update subslice define for ICL 11 · d3d57927

由 Kelvin Gardiner 提交于 3月 16, 2018

ICL 11 has a greater number of maximum subslices. This patch
reflects this.

v2: GEN11 updates to MCR_SELECTOR (Oscar)
v3: Copypasta error in the new defines (Lionel)

Bspec: 21139
BSpec: 21108
Signed-off-by: NKelvin Gardiner <kelvin.gardiner@intel.com>
Reviewed-by: Oscar Mateo <oscar.mateo@intel.com> (v1)
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> (v1)
Signed-off-by: NOscar Mateo <oscar.mateo@intel.com>
Reviewed-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180316121456.11577-3-mika.kuoppala@linux.intel.comSigned-off-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>

d3d57927

15 3月, 2018 3 次提交

drm/i915: move gen8 irq shifts to intel_lrc.c · fa6f071d

由 Daniele Ceraolo Spurio 提交于 3月 14, 2018

The only usage outside the intel_lrc.c file is in the ringbuffer
init, but the irq mask calculated there is then overwritten for
all engines that have a non-zero shift, so we can drop it.

This change is not aimed at code saving but at removing from
intel_engines information that does not apply to all gens that have
the engine. When checking without the temporary WARN_ON, code size
is basically unchanged.

v2: make the irq_shifts array static const
v3: rebase, move irq_shifts array to logical_ring_default_irqs
v4: move array inside the if and use u8 for it (Chris)
Suggested-by: NMichel Thierry <michel.thierry@intel.com>
Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180314182653.26981-4-daniele.ceraolospurio@intel.com

fa6f071d

drm/i915: add a selftest for the mmio_bases table · 74419daa

由 Daniele Ceraolo Spurio 提交于 3月 14, 2018

Check that the entries are in reverse gen order and that all entries
with gen > 0 have an mmio base set.

v2: loop forward, simplify logic, use i915_subtests (Chris)
Suggested-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180314182653.26981-2-daniele.ceraolospurio@intel.com

74419daa

drm/i915: store all mmio bases in intel_engines · 80b216b9

由 Daniele Ceraolo Spurio 提交于 3月 14, 2018

The mmio bases we're currently storing in the intel_engines array are
only valid for a subset of gens, so we need to ignore them and use
different values in some cases. Instead of doing that, we can have a
table of [starting gen, mmio base] pairs for each engine in
intel_engines and select the correct one based on the gen we're running
on in a consistent way.

v2: document that the list goes in reverse order, update starting gen
    for render (Chris)

v3: starting gen for render back to 1 to make our life easier with
    selftests (Chris)

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> #v2
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180314182653.26981-1-daniele.ceraolospurio@intel.com

80b216b9

14 3月, 2018 1 次提交

drm/i915: Check rq->timeline before deference · ab268151

由 Chris Wilson 提交于 3月 14, 2018

Not only is the context suspect to disappearing, but so is it's
timeline. Under a lockless inspection of the requests for
debugging from intel_engine_dump(), the context may already have been
freed and we have to check before chasing the dangling pointer.

[28033.681755] Modules linked in: vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic x86_pkg_temp_thermal intel_powerclamp coretemp snd_hda_intel crct10dif_pclmul crc32_pclmul snd_hda_codec snd_hwdep snd_hda_core ghash_clmulni_intel snd_pcm mei_me mei i915 r8169 mii prime_numbers i2c_hid
[28033.681796] CPU: 3 PID: 3058 Comm: gem_exec_schedu Tainted: G     U           4.16.0-rc5+ #9
[28033.681804] Hardware name: Acer Aspire E5-575G/Ironman_SK  , BIOS V1.12 08/02/2016
[28033.681834] RIP: 0010:print_request+0x2b/0xb0 [i915]
[28033.681840] RSP: 0018:ffffc90004afbc18 EFLAGS: 00010202
[28033.681847] RAX: 6b6b6b6b6b6b6b6b RBX: ffff8801921b5a40 RCX: 0000000000000006
[28033.681854] RDX: ffffc90004afbc60 RSI: ffff8801921b5a40 RDI: 0000000000000004
[28033.681861] RBP: ffffc90004afbd80 R08: 0000000000000000 R09: 0000000000000001
[28033.681868] R10: ffffc90004afbbd0 R11: ffffc90004afbc73 R12: ffffc90004afbc60
[28033.681875] R13: ffffc90004afbd80 R14: ffff8801d40ec670 R15: ffff8801921b5a40
[28033.681883] FS:  00007fbba5f6c8c0(0000) GS:ffff8801e8400000(0000) knlGS:0000000000000000
[28033.681891] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[28033.681897] CR2: 00007fbba5f8f000 CR3: 00000001b2efa002 CR4: 00000000003606e0
[28033.681904] Call Trace:
[28033.681932]  intel_engine_print_registers+0x6a7/0x930 [i915]
[28033.681962]  intel_engine_dump+0x30d/0x740 [i915]
[28033.681971]  ? seq_printf+0x3a/0x50
[28033.681995]  i915_engine_info+0xb8/0xe0 [i915]
[28033.682003]  ? drm_get_color_range_name+0x20/0x20
[28033.682010]  seq_read+0xe1/0x440
[28033.682018]  full_proxy_read+0x51/0x80
[28033.682025]  __vfs_read+0x21/0x130
[28033.682031]  ? do_sys_open+0x134/0x220
[28033.682037]  ? kmem_cache_free+0x177/0x2b0
[28033.682043]  vfs_read+0xa1/0x150
[28033.682049]  SyS_read+0x40/0xa0
[28033.682055]  do_syscall_64+0x6b/0x1b0
[28033.682063]  entry_SYSCALL_64_after_hwframe+0x42/0xb7
[28033.682069] RIP: 0033:0x7fbba4655d11
[28033.682074] RSP: 002b:00007ffd8c49da58 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[28033.682082] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fbba4655d11
[28033.682089] RDX: 000000000000003f RSI: 00005647bfbfc260 RDI: 0000000000000006
[28033.682096] RBP: 000000000000003f R08: 00000000ffffffff R09: 0000000000000000
[28033.682104] R10: 0000000000000000 R11: 0000000000000246 R12: 00005647bfbfc260
[28033.682111] R13: 0000000000000006 R14: 0000000000000000 R15: 00005647bfbfc260
[28033.682119] Code: 41 55 41 54 49 89 d4 55 53 48 89 fd 48 8b 86 c8 00 00 00 48 8b 3d d6 1e 14 e2 48 89 f3 48 2b be a8 02 00 00 48 8b 80 b0 00 00 00 <4c> 8b 68 18 e8 bc 80 02 e1 8b 8b 70 02 00 00 8b b3 28 02 00 00
[28033.682206] RIP: print_request+0x2b/0xb0 [i915] RSP: ffffc90004afbc18
Reported-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180314101630.8933-1-chris@chris-wilson.co.uk

ab268151

10 3月, 2018 1 次提交

drm/i915: Change parameters order in i915_gem_batch_pool_init · c5781351

由 Michal Wajdeczko 提交于 3月 08, 2018

Function i915_gem_batch_pool_init() failed to follow obj-verb
naming schema. Fix that by swapping function parameters.
While here, change license text to SPDX format.

v2: use intel_engine_init_batch_pool (Chris) as proxy (Michal)
Signed-off-by: NMichal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180308095037.18264-3-michal.wajdeczko@intel.com

c5781351

09 3月, 2018 1 次提交

drm/i915: Include ring->emit in debugging · ef5032a0

由 Chris Wilson 提交于 3月 07, 2018

Include ring->emit and ring->space alongside ring->(head,tail) when
printing debug information.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180307134226.25492-4-chris@chris-wilson.co.uk

ef5032a0

07 3月, 2018 2 次提交

drm/i915/icl: new context descriptor support · ac52da6a

由 Daniele Ceraolo Spurio 提交于 3月 02, 2018

Starting from Gen11 the context descriptor format has been updated in
the HW. The hw_id field has been considerably reduced in size and engine
class and instance fields have been added.

There is a slight name clashing issue because the field that we call
hw_id is actually called SW Context ID in the specs for Gen11+.

With the current size of the hw_id field we can have a maximum of 2k
contexts at any time, but we could use the sw_counter field (which is sw
defined) to increase that because the HW requirement is that
engine_id + sw id + sw_counter is a unique number.
GuC uses a similar method to support more contexts but does its tracking
at lrc level. To avoid doing an implementation that will need to be
reworked once GuC support lands, defer it for now and mark it as TODO.

v2: rebased, add documentation, fix GEN11_ENGINE_INSTANCE_SHIFT
v3: rebased, bring back lost code from i915_gem_context.c
v4: make TODO comment more generic
v5: be consistent with bit ordering, add extra checks (Chris)

Cc: Oscar Mateo <oscar.mateo@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: NOscar Mateo <oscar.mateo@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180302161501.28594-3-mika.kuoppala@linux.intel.comSigned-off-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>

ac52da6a

drm/i915/icl: Correctly initialize the Gen11 engines · 5f79e7c6

由 Oscar Mateo 提交于 3月 02, 2018

Gen11 has up to 4 VCS and up to 2 VECS engines, this patch adds mmio
base definitions for all of them.

Bspec: 20944
Bspec: 7021

v2: Set the correct mmio_base in intel_engines_init_mmio; updating the
base mmio values any later would cause incorrect reads in
i915_gem_sanitize (Michel).

Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Ceraolo Spurio, Daniele <daniele.ceraolospurio@intel.com>
Signed-off-by: NOscar Mateo <oscar.mateo@intel.com>
Signed-off-by: NMichel Thierry <michel.thierry@intel.com>
Reviewed-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180302161501.28594-2-mika.kuoppala@linux.intel.comSigned-off-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>

5f79e7c6

28 2月, 2018 1 次提交

drm/i915: Don't deref request->ctx inside unlocked print_request() · 367a35a6

由 Chris Wilson 提交于 2月 28, 2018

Although we protect the request itself, we don't lock inside
intel_engine_dump() and so the request maybe retired as we peek into it.
One consequence is that the request->ctx may be freed before we
dereference it, leading to a use-after-free. Replace the hw_id we are
peeking from inside request->ctx with the request->fence.context, with
which we can still track from which context the request originated
(although to tie to HW reports requires a little more legwork, but is
good enough to follow the GEM traces).

[52640.729670] general protection fault: 0000 [#2] SMP
[52640.729694] Dumping ftrace buffer:
[52640.729701]    (ftrace buffer empty)
[52640.729705] Modules linked in: vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic x86_pkg_\
temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul snd_hda_intel snd_hda_codec snd_hwdep gha\
sh_clmulni_intel snd_hda_core snd_pcm mei_me mei i915 r8169 mii prime_numbers i2c_hid
[52640.729748] CPU: 2 PID: 4335 Comm: gem_exec_schedu Tainted: G     UD W        4.16.0-rc3+ #7
[52640.729759] Hardware name: Acer Aspire E5-575G/Ironman_SK  , BIOS V1.12 08/02/2016
[52640.729803] RIP: 0010:print_request+0x2b/0xb0 [i915]
[52640.729811] RSP: 0018:ffffc90001453c18 EFLAGS: 00010206
[52640.729820] RAX: 6b6b6b6b6b6b6b6b RBX: ffff8801e0292d40 RCX: 0000000000000006
[52640.729829] RDX: ffffc90001453c60 RSI: ffff8801e0292d40 RDI: 0000000000000003
[52640.729838] RBP: ffffc90001453d80 R08: 0000000000000000 R09: 0000000000000001
[52640.729847] R10: ffffc90001453bd0 R11: ffffc90001453c73 R12: ffffc90001453c60
[52640.729856] R13: ffffc90001453d80 R14: ffff8801d5a683c8 R15: ffff8801e0292d40
[52640.729866] FS:  00007f1ee50548c0(0000) GS:ffff8801e8200000(0000) knlGS:0000000000000000
[52640.729876] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[52640.729884] CR2: 00007f1ee5077000 CR3: 00000001d9411004 CR4: 00000000003606e0
[52640.729893] Call Trace:
[52640.729922]  intel_engine_print_registers+0x623/0x890 [i915]
[52640.729948]  intel_engine_dump+0x4a3/0x590 [i915]
[52640.729957]  ? seq_printf+0x3a/0x50
[52640.729977]  i915_engine_info+0xb8/0xe0 [i915]
[52640.729984]  ? drm_mode_gamma_get_ioctl+0xf0/0xf0
[52640.729990]  seq_read+0xd5/0x410
[52640.729997]  full_proxy_read+0x4b/0x70
[52640.730004]  __vfs_read+0x1e/0x120
[52640.730009]  ? do_sys_open+0x134/0x220
[52640.730015]  ? kmem_cache_free+0x174/0x2b0
[52640.730021]  vfs_read+0xa1/0x150
[52640.730026]  SyS_read+0x40/0xa0
[52640.730032]  do_syscall_64+0x65/0x1a0
[52640.730038]  entry_SYSCALL_64_after_hwframe+0x42/0xb7
Reported-by: NMika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180228094732.28462-1-chris@chris-wilson.co.uk

367a35a6

24 2月, 2018 1 次提交

drm/i915/preemption: Allow preemption between submission ports · f6322edd

由 Chris Wilson 提交于 2月 22, 2018

Sometimes we need to boost the priority of an in-flight request, which
may lead to the situation where the second submission port then contains
a higher priority context than the first and so we need to inject a
preemption event. To do so we must always check inside
execlists_dequeue() whether there is a priority inversion between the
ports themselves as well as the head of the priority sorted queue, and we
cannot just skip dequeuing if the queue is empty.

As Michał noted, this doesn't simply extend to handling more than 2-port
submission, as we may need to reorder within the array of executing
requests which themselves are lower priority than the first. A task for
later!
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180222142229.14517-1-chris@chris-wilson.co.ukReviewed-by: NMichał Winiarski <michal.winiarski@intel.com>
Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>

f6322edd

22 2月, 2018 1 次提交

drm/i915: Rename drm_i915_gem_request to i915_request · e61e0f51

由 Chris Wilson 提交于 2月 21, 2018

We want to de-emphasize the link between the request (dependency,
execution and fence tracking) from GEM and so rename the struct from
drm_i915_gem_request to i915_request. That is we may implement the GEM
user interface on top of requests, but they are an abstraction for
tracking execution rather than an implementation detail of GEM. (Since
they are not tied to HW, we keep the i915 prefix as opposed to intel.)

In short, the spatch:
@@

@@
- struct drm_i915_gem_request
+ struct i915_request

A corollary to contracting the type name, we also harmonise on using
'rq' shorthand for local variables where space if of the essence and
repetition makes 'request' unwieldy. For globals and struct members,
'request' is still much preferred for its clarity.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180221095636.6649-1-chris@chris-wilson.co.ukReviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: NMichał Winiarski <michal.winiarski@intel.com>
Acked-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>

e61e0f51

14 2月, 2018 1 次提交

drm/i915: Lock out execlist tasklet while peeking inside for busy-stats · edb76b01

由 Chris Wilson 提交于 2月 13, 2018

In order to prevent a race condition where we may end up overaccounting
the active state and leaving the busy-stats believing the GPU is 100%
busy, lock out the tasklet while we reconstruct the busy state. There is
no direct spinlock guard for the execlists->port[], so we need to
utilise tasklet_disable() as a synchronous barrier to prevent it, the
only writer to execlists->port[], from running at the same time as the
enable.

Fixes: 4900727d ("drm/i915/pmu: Reconstruct active state on starting busy-stats")
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180115092041.13509-1-chris@chris-wilson.co.ukReviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
(cherry picked from commit 99e48bf9)
Signed-off-by: NRodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180213095747.2424-1-tvrtko.ursulin@linux.intel.com

edb76b01

13 2月, 2018 1 次提交

drm/i915: Don't wake the device up to check if the engine is asleep · 7292b9e6

由 Chris Wilson 提交于 2月 12, 2018

If the entire device is powered off, we can safely assume that the
engine is also asleep (and idle).
Reported-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: a091d4ee ("drm/i915: Hold a wakeref for probing the ring registers")
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180212093928.6005-1-chris@chris-wilson.co.uk
(cherry picked from commit 74d00d28)
Signed-off-by: NRodrigo Vivi <rodrigo.vivi@intel.com>

7292b9e6

12 2月, 2018 2 次提交

drm/i915: Hold rpm wakeref for printing the engine's register state · 3ceda3a4

由 Chris Wilson 提交于 2月 12, 2018

When dumping the engine, we print out the current register values. This
requires the rpm wakeref. If the device is alseep, we can assume the
engine is asleep (and the register state is uninteresting) so skip and
only acquire the rpm wakeref if the device is already awake.
Reported-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180212102415.24246-1-chris@chris-wilson.co.uk

3ceda3a4

drm/i915: Don't wake the device up to check if the engine is asleep · 74d00d28

由 Chris Wilson 提交于 2月 12, 2018

If the entire device is powered off, we can safely assume that the
engine is also asleep (and idle).
Reported-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: a091d4ee ("drm/i915: Hold a wakeref for probing the ring registers")
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180212093928.6005-1-chris@chris-wilson.co.uk

74d00d28

09 2月, 2018 1 次提交

drm/i915: Remove redundant check on execlists interrupt · 8e47b4b6

由 Chris Wilson 提交于 2月 08, 2018

Since commit 4a118ecb ("drm/i915: Filter out spurious execlists
context-switch interrupts") we probe execlists->active, and no longer
have to peek at the execlist interrupt to determine if the tasklet still
needs to be run to drain the ELSP.

References: 4a118ecb ("drm/i915: Filter out spurious execlists context-switch interrupts")
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Winiarski <michal.winiarski@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180208151224.16285-1-chris@chris-wilson.co.uk

8e47b4b6

08 2月, 2018 1 次提交

drm/i915: Only allocate preempt context when required · d6376374

由 Chris Wilson 提交于 2月 07, 2018

If we remove some hardcoded assumptions about the preempt context having
a fixed id, reserved from use by normal user contexts, we may only
allocate the i915_gem_context when required. Then the subsequent
decisions on using preemption reduce to having the preempt context
available.

v2: Include an assert that we don't allocate the preempt context twice.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Winiarski <michal.winiarski@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Acked-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180207210544.26351-3-chris@chris-wilson.co.ukReviewed-by: NMichel Thierry <michel.thierry@intel.com>

d6376374

01 2月, 2018 1 次提交

drm/i915/pmu: Reconstruct active state on starting busy-stats · cc4f8fc7

由 Chris Wilson 提交于 1月 11, 2018

We have a hole in our busy-stat accounting if the pmu is enabled during
a long running batch, the pmu will not start accumulating busy-time
until the next context switch. This then fails tests that are only
sampling a single batch.

v2: Count each active port just once (context in/out events are only on
the first and last assignment to a port).
v3: Avoid hardcoding knowledge of 2 submission ports

Fixes: 30e17b78 ("drm/i915: Engine busy time tracking")
Testcase: igt/perf_pmu/busy-start
Testcase: igt/perf_pmu/busy-double-start
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180111073031.14614-1-chris@chris-wilson.co.uk
(cherry picked from commit 4900727d)
Signed-off-by: NRodrigo Vivi <rodrigo.vivi@intel.com>

cc4f8fc7

23 1月, 2018 1 次提交

drm/i915: Downgrade incorrect engine constructor usage warnings to development · ae504be2

由 Tvrtko Ursulin 提交于 1月 19, 2018

Render engine constructor helpers must only be called from the render
engine constructors, but there is no need to burden the production
binaries with warnings which can only be triggered during development.
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NMichel Thierry <michel.thierry@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180119100005.9072-1-tvrtko.ursulin@linux.intel.com

ae504be2

20 1月, 2018 2 次提交

drm/i915/icl: Gen11 render context size · b86aa445

由 Tvrtko Ursulin 提交于 1月 11, 2018

Gen11 removes the Resource Streamer, which frees up a big chunk of
the context image. BSpec indicates 12544 DWORDs (13 pages), plus
one page for PPHWSP.

Please notice that, when looking at the BSpec context image table,
the right filter has to be applied as some rows are excluded for
specific GENs. Also, some rows apply per-subslice (for the
calculation above, we have supposed I915_MAX_SUBSLICES = 8).

v2: Rebase.
v3: Use the right size as per the BSpec.
v4:
  - Rebased on top of the default context size (Rodrigo)
  - Clarify in the commit message where the subslice calculation
    comes from.
v5: s/12538/12544/ (Daniele)

BSpec: 18907

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Acked-by: Ben Widawsky <benjamin.widawsky@intel.com> (older version)
Reviewed-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: NOscar Mateo <oscar.mateo@intel.com>
Signed-off-by: NPaulo Zanoni <paulo.r.zanoni@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1515711307-28979-2-git-send-email-oscar.mateo@intel.com

b86aa445

drm/i915: Return a default RCS context size · 7ab4adbd

由 Oscar Mateo 提交于 1月 11, 2018

Instead of returning whatever size the latest GEN used. This is because
context sizes for new GENs can go up or down, but the only safe thing to
do for missing cases is to use the largest known one, whatever that is.
Suggested-by: NRodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: NOscar Mateo <oscar.mateo@intel.com>
Signed-off-by: NPaulo Zanoni <paulo.r.zanoni@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1515711307-28979-1-git-send-email-oscar.mateo@intel.com

7ab4adbd

15 1月, 2018 1 次提交

drm/i915: Lock out execlist tasklet while peeking inside for busy-stats · 99e48bf9

由 Chris Wilson 提交于 1月 15, 2018

In order to prevent a race condition where we may end up overaccounting
the active state and leaving the busy-stats believing the GPU is 100%
busy, lock out the tasklet while we reconstruct the busy state. There is
no direct spinlock guard for the execlists->port[], so we need to
utilise tasklet_disable() as a synchronous barrier to prevent it, the
only writer to execlists->port[], from running at the same time as the
enable.

Fixes: 4900727d ("drm/i915/pmu: Reconstruct active state on starting busy-stats")
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180115092041.13509-1-chris@chris-wilson.co.ukReviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>

99e48bf9

11 1月, 2018 2 次提交

drm/i915/pmu: Reconstruct active state on starting busy-stats · 4900727d

由 Chris Wilson 提交于 1月 11, 2018

We have a hole in our busy-stat accounting if the pmu is enabled during
a long running batch, the pmu will not start accumulating busy-time
until the next context switch. This then fails tests that are only
sampling a single batch.

v2: Count each active port just once (context in/out events are only on
the first and last assignment to a port).
v3: Avoid hardcoding knowledge of 2 submission ports

Fixes: 30e17b78 ("drm/i915: Engine busy time tracking")
Testcase: igt/perf_pmu/busy-start
Testcase: igt/perf_pmu/busy-double-start
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180111073031.14614-1-chris@chris-wilson.co.uk

4900727d

drm/i915: Whitelist SLICE_COMMON_ECO_CHICKEN1 on Geminilake. · 4636bda8

由 Kenneth Graunke 提交于 1月 05, 2018

Geminilake requires the 3D driver to select whether barriers are
intended for compute shaders, or tessellation control shaders, by
whacking a "Barrier Mode" bit in SLICE_COMMON_ECO_CHICKEN1 when
switching pipelines.  Failure to do this properly can result in GPU
hangs.

Unfortunately, this means it needs to switch mid-batch, so only
userspace can properly set it.  To facilitate this, the kernel needs
to whitelist the register.

The workarounds page currently tags this as applying to Broxton only,
but that doesn't make sense.  The documentation for the register it
references says the bit userspace is supposed to toggle only exists on
Geminilake.  Empirically, the Mesa patch to toggle this bit appears to
fix intermittent GPU hangs in tessellation control shader barrier tests
on Geminilake; we haven't seen those hangs on Broxton.

v2: Mention WA #0862 in the comment (it doesn't have a name).
Signed-off-by: NKenneth Graunke <kenneth@whitecape.org>
Acked-by: NRodrigo Vivi <rodrigo.vivi@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: NRodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180105085905.9298-1-kenneth@whitecape.org
(cherry picked from commit ab062639)
Signed-off-by: NJani Nikula <jani.nikula@intel.com>

4636bda8

06 1月, 2018 1 次提交

drm/i915: Whitelist SLICE_COMMON_ECO_CHICKEN1 on Geminilake. · ab062639

由 Kenneth Graunke 提交于 1月 05, 2018

Geminilake requires the 3D driver to select whether barriers are
intended for compute shaders, or tessellation control shaders, by
whacking a "Barrier Mode" bit in SLICE_COMMON_ECO_CHICKEN1 when
switching pipelines.  Failure to do this properly can result in GPU
hangs.

Unfortunately, this means it needs to switch mid-batch, so only
userspace can properly set it.  To facilitate this, the kernel needs
to whitelist the register.

The workarounds page currently tags this as applying to Broxton only,
but that doesn't make sense.  The documentation for the register it
references says the bit userspace is supposed to toggle only exists on
Geminilake.  Empirically, the Mesa patch to toggle this bit appears to
fix intermittent GPU hangs in tessellation control shader barrier tests
on Geminilake; we haven't seen those hangs on Broxton.

v2: Mention WA #0862 in the comment (it doesn't have a name).
Signed-off-by: NKenneth Graunke <kenneth@whitecape.org>
Acked-by: NRodrigo Vivi <rodrigo.vivi@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: NRodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180105085905.9298-1-kenneth@whitecape.org

ab062639

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功