- 05 August 2016: 2 commits
-
-
Submitted by Chris Wilson
Before suspending (or unloading), we would first wait upon all rendering to be completed and then disable the rings. This latter step is a remnant from DRI1 days when we did not use request tracking for all operations upon the ring. Now that we are sure we are waiting upon the very last operation by the engine, we can forgo clobbering the ring registers, though we do keep the assert that the engine is indeed idle before sleeping.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-5-git-send-email-chris@chris-wilson.co.uk
-
Submitted by Chris Wilson
Since i915_gem_obj_ggtt_pin() is an idiom-breaking curry function for i915_gem_object_ggtt_pin(), spare us the confusion and remove it. Removing it now simplifies later patches to change the i915_vma_pin() (and friends) interface.
v2: Add a redundant GEM_BUG_ON(!view) to i915_gem_obj_lookup_or_create_ggtt_vma()
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-18-git-send-email-chris@chris-wilson.co.uk
-
- 03 August 2016: 15 commits
-
-
Submitted by Chris Wilson
Now that we initialize the state for both legacy and execlists inside intel_engine_cs, we should also clean up that state from the common functions.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470226756-24401-1-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
Submitted by Chris Wilson
Now that emitting requests is identical between legacy and execlists, we can use the same function to build up the ring for submitting to either engine. (With the exception of i915_switch_contexts(), but in time that will also be handled gracefully.)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-30-git-send-email-chris@chris-wilson.co.uk
Link: http://patchwork.freedesktop.org/patch/msgid/1470174640-18242-21-git-send-email-chris@chris-wilson.co.uk
-
Submitted by Chris Wilson
Now that we use the same vfuncs for emitting the batch buffer in both execlists and legacy, the golden render state initialisation is identical between both.
v2: gcc wants so.ggtt_offset initialised (even though it is not used)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-28-git-send-email-chris@chris-wilson.co.uk
Link: http://patchwork.freedesktop.org/patch/msgid/1470174640-18242-19-git-send-email-chris@chris-wilson.co.uk
-
Submitted by Chris Wilson
With the addition of engine->submit_request, we now have a bunch of functions with similar names used at different stages of the execlist submission. Try a different coat of paint, to hopefully reduce confusion between the requests, intel_engine_cs and the actual execlists submission process.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-24-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470174640-18242-15-git-send-email-chris@chris-wilson.co.uk
-
Submitted by Chris Wilson
Move request submission from emit_request into its own common vfunc, called from i915_add_request().
v2: Convert the I915_DISPATCH flags to BIT(x) whilst passing
v3: Rename a few functions to match.
v4: Reenable execlists submission after disabling guc.
v5: Be aware that everyone calls i915_guc_submission_disable()!
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-23-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470174640-18242-14-git-send-email-chris@chris-wilson.co.uk
-
Submitted by Chris Wilson
Space reservation is already safe with respect to the ring->size modulus, but the hardware only expects to see values in the range 0..ring->size-1 (inclusive) and so requires the modulus to prevent us writing the value ring->size instead of 0. As this is only required for the register itself, we can defer the modulus to the register update and not perform it after every command packet. We keep the intel_ring_advance() around in the code to provide demarcation for the end-of-packet (which then can be compared against intel_ring_begin() as the number of dwords emitted must match the reserved space).
v2: Assert that the ring size is a power of two to match assumptions in the code. Simplify the comment before writing the tail value to explain why the modulus is necessary.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470174640-18242-13-git-send-email-chris@chris-wilson.co.uk
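As an illustration of why the power-of-two assumption matters, the wrap can be reduced to a bitwise AND at the single point where the tail value is published. The following standalone C sketch is not the i915 code; ring_advance() and ring_set_tail() are invented names standing in for the advance/tail-update split described above.

#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define RING_SIZE 4096u /* must be a power of two */

struct ring {
    uint32_t tail; /* running byte offset; may reach RING_SIZE */
    uint32_t reg;  /* value last written to the (emulated) tail register */
};

/* Advance the software tail without wrapping after every packet. */
static void ring_advance(struct ring *r, uint32_t bytes)
{
    r->tail += bytes;
    assert(r->tail <= RING_SIZE);
}

/* Publish the tail: only here is it reduced into 0..RING_SIZE-1. */
static void ring_set_tail(struct ring *r)
{
    /* The AND is only a valid modulus because RING_SIZE is a power of two. */
    assert((RING_SIZE & (RING_SIZE - 1)) == 0);
    r->reg = r->tail & (RING_SIZE - 1);
}

int main(void)
{
    struct ring r = { .tail = RING_SIZE - 8 };

    ring_advance(&r, 8); /* the software tail hits RING_SIZE ... */
    ring_set_tail(&r);   /* ... but the register sees 0 */
    printf("tail=%u reg=%u\n", (unsigned)r.tail, (unsigned)r.reg);
    return 0;
}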
-
Submitted by Chris Wilson
Both the ->dispatch_execbuffer and ->emit_bb_start callbacks do exactly the same thing: add MI_BATCHBUFFER_START to the request's ringbuffer - so we need only one vfunc.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-20-git-send-email-chris@chris-wilson.co.uk
Link: http://patchwork.freedesktop.org/patch/msgid/1470174640-18242-10-git-send-email-chris@chris-wilson.co.uk
-
Submitted by Chris Wilson
It is simpler and leads to more readable code through the callstack if the allocation returns the allocated struct through the return value. The importance of this is that it no longer looks like we accidentally allocate requests as a side-effect of calling certain functions.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-19-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470174640-18242-9-git-send-email-chris@chris-wilson.co.uk
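The readability point can be illustrated outside the driver: returning the object (or an encoded error) makes the allocation visible in the assignment at the call site instead of hiding it behind an out-parameter. A minimal user-space sketch; request_alloc(), request_alloc_out() and their fields are invented for illustration (the kernel variant would return ERR_PTR()-encoded errors rather than NULL).

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

struct request {
    unsigned int seqno;
};

/* After: the allocation is explicit at the call site. */
static struct request *request_alloc(unsigned int seqno)
{
    struct request *rq = malloc(sizeof(*rq));

    if (!rq)
        return NULL; /* kernel style would be ERR_PTR(-ENOMEM) */
    rq->seqno = seqno;
    return rq;
}

/* Before: the object appears as a side effect of an "int" function. */
static int request_alloc_out(unsigned int seqno, struct request **out)
{
    *out = request_alloc(seqno);
    return *out ? 0 : -ENOMEM;
}

int main(void)
{
    struct request *rq = request_alloc(1); /* obviously allocates */
    struct request *hidden;

    if (request_alloc_out(2, &hidden) || !rq) /* easy to miss that this allocates */
        return 1;
    printf("%u %u\n", rq->seqno, hidden->seqno);
    free(rq);
    free(hidden);
    return 0;
}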
-
Submitted by Chris Wilson
Rather than passing a complete set of GPU cache domains for either invalidation or for flushing, or even both, just pass a single parameter to engine->emit_flush to determine the required operations.

engine->emit_flush(GPU, 0) -> engine->emit_flush(EMIT_INVALIDATE)
engine->emit_flush(0, GPU) -> engine->emit_flush(EMIT_FLUSH)
engine->emit_flush(GPU, GPU) -> engine->emit_flush(EMIT_FLUSH | EMIT_INVALIDATE)

This allows us to extend the behaviour easily in future, for example if we want just a command barrier without the overhead of flushing.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dave Gordon <david.s.gordon@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470174640-18242-8-git-send-email-chris@chris-wilson.co.uk
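A rough standalone sketch of the single-parameter flag interface described above, reusing the EMIT_INVALIDATE/EMIT_FLUSH names from the commit; the bit values and the emit_flush() body here are illustrative, not the driver's implementation.

#include <stdio.h>

#define EMIT_INVALIDATE (1u << 0)
#define EMIT_FLUSH      (1u << 1)

static void emit_flush(unsigned int mode)
{
    if (mode & EMIT_INVALIDATE)
        printf("invalidate caches before the next batch\n");
    if (mode & EMIT_FLUSH)
        printf("flush caches after the previous batch\n");
    if (!mode)
        printf("command barrier only, no cache work\n"); /* room to grow */
}

int main(void)
{
    emit_flush(EMIT_INVALIDATE);              /* was emit_flush(GPU, 0)   */
    emit_flush(EMIT_FLUSH);                   /* was emit_flush(0, GPU)   */
    emit_flush(EMIT_FLUSH | EMIT_INVALIDATE); /* was emit_flush(GPU, GPU) */
    return 0;
}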
-
Submitted by Chris Wilson
Space for flushing the GPU cache prior to completing the request is preallocated and so cannot fail - the GPU caches will always be flushed along with the completed request. This means we no longer have to track whether the GPU cache is dirty between batches, as we had to with the outstanding_lazy_seqno. With the removal of the duplication in the per-backend entry points for emitting the obsolete lazy flush, we can then further unify engine->emit_flush.
v2: Expand a bit on the legacy of gpu_caches_dirty
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-18-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470174640-18242-7-git-send-email-chris@chris-wilson.co.uk
-
Submitted by Chris Wilson
For more consistent oop-naming, we would use intel_ring_<verb>, so pick intel_ring_pin() and intel_ring_unpin().
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-17-git-send-email-chris@chris-wilson.co.uk
Link: http://patchwork.freedesktop.org/patch/msgid/1470174640-18242-6-git-send-email-chris@chris-wilson.co.uk
-
Submitted by Chris Wilson
The state stored in this struct is not only the information about the buffer object, but the ring used to communicate with the hardware. Using buffer here is overly specific and, for me at least, conflates with the notion of buffer objects themselves.

s/struct intel_ringbuffer/struct intel_ring/
s/enum intel_ring_hangcheck/enum intel_engine_hangcheck/
s/describe_ctx_ringbuf()/describe_ctx_ring()/
s/intel_ring_get_active_head()/intel_engine_get_active_head()/
s/intel_ring_sync_index()/intel_engine_sync_index()/
s/intel_ring_init_seqno()/intel_engine_init_seqno()/
s/ring_stuck()/engine_stuck()/
s/intel_cleanup_engine()/intel_engine_cleanup()/
s/intel_stop_engine()/intel_engine_stop()/
s/intel_pin_and_map_ringbuffer_obj()/intel_pin_and_map_ring()/
s/intel_unpin_ringbuffer()/intel_unpin_ring()/
s/intel_engine_create_ringbuffer()/intel_engine_create_ring()/
s/intel_ring_flush_all_caches()/intel_engine_flush_all_caches()/
s/intel_ring_invalidate_all_caches()/intel_engine_invalidate_all_caches()/
s/intel_ringbuffer_free()/intel_ring_free()/

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-15-git-send-email-chris@chris-wilson.co.uk
Link: http://patchwork.freedesktop.org/patch/msgid/1470174640-18242-4-git-send-email-chris@chris-wilson.co.uk
-
Submitted by Chris Wilson
Perform s/ringbuf/ring/ on the context struct for consistency with the ring/engine split.
v2: Kill an outdated error_ringbuf label
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-14-git-send-email-chris@chris-wilson.co.uk
Link: http://patchwork.freedesktop.org/patch/msgid/1470174640-18242-3-git-send-email-chris@chris-wilson.co.uk
-
Submitted by Chris Wilson
Now that we have disambiguated ring and engine, we can use the clearer and more consistent name for the intel_ringbuffer pointer in the request.

@@
struct drm_i915_gem_request *r;
@@
- r->ringbuf
+ r->ring

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-12-git-send-email-chris@chris-wilson.co.uk
Link: http://patchwork.freedesktop.org/patch/msgid/1470174640-18242-2-git-send-email-chris@chris-wilson.co.uk
-
Submitted by Chris Wilson
Both perform the same actions with more or less indirection, so just unify the code.
v2: Add back a few intel_engine_cs locals
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-11-git-send-email-chris@chris-wilson.co.uk
Link: http://patchwork.freedesktop.org/patch/msgid/1470174640-18242-1-git-send-email-chris@chris-wilson.co.uk
-
- 27 July 2016: 1 commit
-
-
Submitted by Chris Wilson
When we refer to intel_engine_cs, we want to use "engine" so as not to confuse ourselves about ringbuffers.
v2: Rename all the functions as well, plus a few more stray comments.
v3: Split the really long error message strings
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-6-git-send-email-chris@chris-wilson.co.uk
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1469606850-28659-1-git-send-email-chris@chris-wilson.co.uk
-
- 20 July 2016: 5 commits
-
-
Submitted by Mika Kuoppala
Add WaDisableGatherAtSetShaderCommonSlice for all gen9, as stated by the bspec. The bspec said to put this workaround in the per-ctx bb, and the initial implementation and subsequent review were done on that basis. Arun raised a suspicion that it belongs in the indirect bb instead; he conducted a more thorough investigation of the matter and indeed the documentation was wrong.
v2: Move to indirect_ctx wa bb, as it is the correct place (Arun)
References: HSD#2135817
Cc: Arun Siluvery <arun.siluvery@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com> (v1)
Reviewed-by: Arun Siluvery <arun.siluvery@linux.intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1469013973-24104-1-git-send-email-mika.kuoppala@intel.com
-
Submitted by Chris Wilson
Ultimately wraps kref_put(), so adopt its nomenclature for consistency with other subsystems.

s/drm_gem_object_unreference/i915_gem_object_put/

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1469005202-9659-6-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1469017917-15134-5-git-send-email-chris@chris-wilson.co.uk
-
Submitted by Chris Wilson
As these are wrappers around kref_get()/kref_put(), it is preferable to follow the naming convention and use the same verbs get/put in our wrapper names for manipulating a reference to the context.

s/i915_gem_context_reference/i915_gem_context_get/
s/i915_gem_context_unreference/i915_gem_context_put/

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1469005202-9659-3-git-send-email-chris@chris-wilson.co.uk
Link: http://patchwork.freedesktop.org/patch/msgid/1469017917-15134-2-git-send-email-chris@chris-wilson.co.uk
-
Submitted by Chris Wilson
Now that we derive requests from struct fence, swap over to its nomenclature for references. It's shorter and more idiomatic across the kernel.

s/i915_gem_request_reference/i915_gem_request_get/
s/i915_gem_request_unreference/i915_gem_request_put/

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1469005202-9659-2-git-send-email-chris@chris-wilson.co.uk
Link: http://patchwork.freedesktop.org/patch/msgid/1469017917-15134-1-git-send-email-chris@chris-wilson.co.uk
-
Submitted by Chris Wilson
dma-buf provides a generic fence class for interoperation between drivers. Internally we use the request structure as a fence, and so with only a little bit of interfacing we can rebase those requests on top of dma-buf fences. This will allow us, in the future, to pass those fences back to userspace or between drivers.
v2: The fence_context needs to be globally unique, not just unique to this device.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1469002875-2335-4-git-send-email-chris@chris-wilson.co.uk
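The v2 note about a globally unique fence context can be pictured with a shared monotonic counter, so that timelines from different devices never alias. This is a standalone sketch, not the kernel's fence API; timeline_init() and the struct layout are invented for illustration.

#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>

/* One counter shared by every "device"; each timeline gets a unique context. */
static atomic_uint_fast64_t next_fence_context = 1;

struct timeline {
    uint64_t context; /* globally unique, not merely unique per device */
    uint64_t seqno;   /* advances along this timeline */
};

static void timeline_init(struct timeline *tl)
{
    tl->context = (uint64_t)atomic_fetch_add(&next_fence_context, 1);
    tl->seqno = 0;
}

int main(void)
{
    struct timeline a, b;

    timeline_init(&a); /* e.g. device 0, render engine */
    timeline_init(&b); /* e.g. device 1, render engine */
    printf("contexts %llu and %llu never collide\n",
           (unsigned long long)a.context, (unsigned long long)b.context);
    return 0;
}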
-
- 19 July 2016: 1 commit
-
-
Submitted by Daniel Vetter
Fairly minimal; there are still lots of functions without any docs, and which aren't static. But probably we want to first clean this up some more.
- Drop the bogus const. Marking the argument pointers themselves (instead of what they point at) as const provides roughly zero value. And it's confusing, since the data the pointer points at _is_ being changed.
- Remove kerneldoc for static functions. Keep comments where they seem valuable.
- Indent and whitespace fixes.
- Blockquote the bit field definitions of the descriptor for correct layout.
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1468612088-9721-9-git-send-email-daniel.vetter@ffwll.ch
-
- 15 July 2016: 3 commits
-
-
Submitted by Mika Kuoppala
This workaround for bdw and chv is also needed for kbl A0.
References: HSD#1911519, BSID#569
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1465309159-30531-24-git-send-email-mika.kuoppala@intel.com
(cherry picked from commit 066d4628)
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
-
Submitted by Mika Kuoppala
Add this workaround for the A0 and B0 revisions.
References: HSD#2226935
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1465309159-30531-19-git-send-email-mika.kuoppala@intel.com
(cherry picked from commit 0b2d0934)
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
-
Submitted by Mika Kuoppala
Extend the scope of this workaround, already used on skl, so that it also takes effect on kbl.
v2: Fix KBL_REVID_E0 (Matthew)
References: HSD#2132677
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1465309159-30531-12-git-send-email-mika.kuoppala@intel.com
(cherry picked from commit fe905819)
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
-
- 14 July 2016: 5 commits
-
-
Submitted by Tvrtko Ursulin
Created two common helpers, for the engine setup and engine init phases respectively, to help with code sharing.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1468422221-12132-1-git-send-email-tvrtko.ursulin@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-
Submitted by Tvrtko Ursulin
Common code deserves to be put in a separate file from the legacy and execlists implementations, for clarity and ease of maintenance.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-
Submitted by Tvrtko Ursulin
With the unified common engine setup done, and the execlist engine initialization loop clearly split into two phases, we can eliminate the separate legacy engine initialization code.
v2: Fix cleanup path for legacy.
v3: Rename constructors. (Chris Wilson)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-
Submitted by Tvrtko Ursulin
Move the execlist engine setup to vfuncs so that the engine init loop is clearly split into mode-agnostic and mode-specific steps.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-
Submitted by Dave Gordon
intel_lrc.c has a table of "logical rings" (meaning engines), while intel_ringbuffer.c has separately open-coded initialisation for each engine. We can deduplicate this somewhat by using the same first-stage engine-setup function for both modes. So here we expose the function that transfers information from the static table of (all) known engines to the dev_priv->engine array of engines available on this device (adjusting the names along the way), and then embed calls to it in both the LRC and the legacy-mode setup.
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
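The shape of that deduplication - one static table of every engine the driver knows about, filtered into the per-device array by a shared first-stage setup - can be sketched independently of the driver. Everything in this sketch (table contents, capability bits, setup_engines()) is invented for illustration.

#include <stdio.h>

struct engine_info {
    const char *name;
    unsigned int required_cap; /* bit the device must advertise */
};

/* Static table of all known engines, shared by both setup paths. */
static const struct engine_info all_engines[] = {
    { "rcs",  1u << 0 },
    { "bcs",  1u << 1 },
    { "vcs",  1u << 2 },
    { "vecs", 1u << 3 },
};

struct engine {
    char name[8];
    int initialised;
};

/* First-stage setup used by both the legacy and the execlists path. */
static int setup_engines(unsigned int device_caps, struct engine *out, int max)
{
    int count = 0;

    for (unsigned int i = 0; i < sizeof(all_engines) / sizeof(all_engines[0]); i++) {
        if (!(device_caps & all_engines[i].required_cap))
            continue; /* this device does not have that engine */
        snprintf(out[count].name, sizeof(out[count].name), "%s", all_engines[i].name);
        out[count].initialised = 1;
        if (++count == max)
            break;
    }
    return count;
}

int main(void)
{
    struct engine engines[4];
    int n = setup_engines((1u << 0) | (1u << 2), engines, 4);

    for (int i = 0; i < n; i++)
        printf("engine %d: %s\n", i, engines[i].name);
    return 0;
}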
-
- 07 July 2016: 1 commit
-
-
Submitted by Tim Gore
This patch applies WaMediaPoolStateCmdInWABB, which fixes a problem with the restoration of thread counts on resuming from RC6.
References: HSD#2137167
Signed-off-by: Tim Gore <tim.gore@intel.com>
Reviewed-by: Arun Siluvery <arun.siluvery@linux.intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467709290-5941-1-git-send-email-tim.gore@intel.com
-
- 05 July 2016: 1 commit
-
-
Submitted by Chris Wilson
Since drm_i915_private is now a subclass of drm_device, we do not need to chase the drm_i915_private->dev backpointer and can instead simply access drm_i915_private->drm directly.

   text    data     bss     dec     hex filename
1068757    4565     416 1073738  10624a drivers/gpu/drm/i915/i915.ko
1066949    4565     416 1071930  105b3a drivers/gpu/drm/i915/i915.ko

Created by the coccinelle script:

@@
struct drm_i915_private *d;
identifier i;
@@
(
- d->dev->i
+ d->drm.i
|
- d->dev
+ &d->drm
)

and for good measure the dev_priv->dev backpointer was removed entirely.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467711623-2905-4-git-send-email-chris@chris-wilson.co.uk
-
- 04 July 2016: 2 commits
-
-
Submitted by Chris Wilson
Since we now subclass struct drm_device, we can save pointer dances by noting the equivalence of struct drm_device and struct drm_i915_private, i.e. by using to_i915().

   text    data     bss     dec     hex filename
1073824    4562     416 1078802  107612 drivers/gpu/drm/i915/i915.ko
1068976    4562     416 1073954  106322 drivers/gpu/drm/i915/i915.ko

Created by the coccinelle script:

@@
expression E;
identifier p;
@@
- struct drm_i915_private *p = E->dev_private;
+ struct drm_i915_private *p = to_i915(E);

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467628477-25379-1-git-send-email-chris@chris-wilson.co.uk
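The "subclass" relationship above is the usual container_of pattern: the private struct embeds the base struct, and the containing struct is recovered by pointer arithmetic instead of a dev_private backpointer. A minimal user-space sketch; upcast() stands in for to_i915() and the struct layouts are illustrative only.

#include <stddef.h>
#include <stdio.h>

struct drm_device {
    int index;
};

/* The "subclass" embeds the base struct as a member. */
struct drm_i915_private {
    struct drm_device drm;
    int extra_state;
};

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Equivalent in spirit to to_i915(): recover the containing struct. */
static struct drm_i915_private *upcast(struct drm_device *dev)
{
    return container_of(dev, struct drm_i915_private, drm);
}

int main(void)
{
    struct drm_i915_private i915 = { .drm = { .index = 0 }, .extra_state = 42 };
    struct drm_device *dev = &i915.drm;

    /* No dev_private pointer chase: just arithmetic on the embedded member. */
    printf("%d\n", upcast(dev)->extra_state);
    return 0;
}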
-
Submitted by Chris Wilson
Now that we have (near) universal GPU recovery code, we can inject a real hang from userspace and do not need any fakery. Not only does this mean that the testing is far more realistic, but we can simplify the kernel in the process.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Arun Siluvery <arun.siluvery@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467616119-4093-7-git-send-email-chris@chris-wilson.co.uk
-
- 02 July 2016: 4 commits
-
-
Submitted by Chris Wilson
With only a single callsite for intel_engine_cs->irq_get and ->irq_put, we can reduce the code size by moving the common preamble into the caller, and we can also eliminate the reference counting. For completeness, as we are no longer doing reference counting on the irq, rename the get/put vfuncs to enable/disable respectively, and review the use of posting reads. We only require the serialisation with hardware when enabling the interrupt (i.e. so we cannot miss an interrupt by going to sleep before the hardware truly enables it).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-18-git-send-email-chris@chris-wilson.co.uk
-
Submitted by Chris Wilson
The gen2 w/a buffer is stuffed into the same slot as the gen5+ scratch buffer. If we pass in the size we want to allocate for the scratch buffer, both callers can use the same routine.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-11-git-send-email-chris@chris-wilson.co.uk
-
Submitted by Chris Wilson
By using the same address for storing the HWS on every platform, we can remove the platform-specific vfuncs and reduce the get-seqno routine to a single read of a cached memory location.
v2: Fix semaphore_passed() to look at the signalling engine (not the waiter's)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-8-git-send-email-chris@chris-wilson.co.uk
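The "single read of a cached memory location" amounts to the engine writing its latest completed seqno into a fixed slot of a status page that the CPU simply loads. A rough user-space analogue follows; the slot index, struct layout and function names are invented, and the GPU-side write is emulated.

#include <stdint.h>
#include <stdio.h>

#define HWS_SEQNO_SLOT 0x30 /* arbitrary slot index for this sketch */

struct engine {
    volatile uint32_t status_page[256]; /* stands in for the hardware status page */
};

/* What the hardware would do after completing a request (emulated here). */
static void hw_complete(struct engine *e, uint32_t seqno)
{
    e->status_page[HWS_SEQNO_SLOT] = seqno;
}

/* The whole of "get seqno" becomes one load, identical on every platform. */
static uint32_t engine_get_seqno(const struct engine *e)
{
    return e->status_page[HWS_SEQNO_SLOT];
}

int main(void)
{
    struct engine e = { { 0 } };

    hw_complete(&e, 7);
    printf("completed up to %u\n", (unsigned)engine_get_seqno(&e));
    return 0;
}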
-
Submitted by Chris Wilson
One particularly stressful scenario consists of many independent tasks all competing for GPU time and waiting upon the results (e.g. realtime transcoding of many, many streams). One bottleneck in particular is that each client waits on its own results, but every client is woken up after every batchbuffer - hence the thunder of hooves as every client must then do its heavyweight dance to read a coherent seqno to see if it is the lucky one.

Ideally, we only want one client to wake up after the interrupt and check its request for completion. Since the requests must retire in order, we can select the first client on the oldest request to be woken. Once that client has completed its wait, we can then wake up the next client and so on. However, all clients then incur latency as every process in the chain may be delayed for scheduling - this may also then cause some priority inversion. To reduce the latency, when a client is added or removed from the list, we scan the tree for completed seqnos and wake up all the completed waiters in parallel.

Using igt/benchmarks/gem_latency, we can demonstrate this effect. The benchmark measures the number of GPU cycles between completion of a batch and the client waking up from a call to wait-ioctl. With many concurrent waiters, each on a different request, we observe that the wakeup latency before the patch scales nearly linearly with the number of waiters (before external factors kick in, making the scaling much worse). After applying the patch, we can see that only the single waiter for the request is being woken up, providing a constant wakeup latency for every operation. However, the situation is not quite as rosy for many waiters on the same request, though to the best of my knowledge this is much less likely in practice. Here, we can observe that the concurrent waiters incur extra latency from being woken up by the solitary bottom-half, rather than directly by the interrupt. This appears to be scheduler induced (having discounted adverse effects from having an rbtree walk/erase in the wakeup path); each additional wake_up_process() costs approximately 1us on big core. Another effect of performing the secondary wakeups from the first bottom-half is the delay this imposes on high priority threads - rather than immediately returning to userspace and leaving the interrupt handler to wake the others.

To offset the delay incurred with additional waiters on a request, we could use a hybrid scheme that did a quick read in the interrupt handler and dequeued all the completed waiters (incurring the overhead in the interrupt handler, not the best plan either as we then incur GPU submission latency), but we would still have to wake up the bottom-half every time to do the heavyweight slow read. Or we could only kick the waiters on the seqno with the same priority as the current task (i.e. in the realtime waiter scenario, only it is woken up immediately by the interrupt and simply queues the next waiter before returning to userspace, minimising its delay at the expense of the chain, and also reducing contention on its scheduler runqueue). This is effective at avoiding long pauses in the interrupt handler and the extra latency in realtime/high-priority waiters.

v2: Convert from a kworker per engine into a dedicated kthread for the bottom-half.
v3: Rename request members and tweak comments.
v4: Use a per-engine spinlock in the breadcrumbs bottom-half.
v5: Fix race in locklessly checking waiter status and kicking the task on adding a new waiter.
v6: Fix deciding when to force the timer to hide missing interrupts.
v7: Move the bottom-half from the kthread to the first client process.
v8: Reword a few comments
v9: Break the busy loop when the interrupt is unmasked or has fired.
v10: Comments, unnecessary churn, better debugging from Tvrtko
v11: Wake all completed waiters on removing the current bottom-half to reduce the latency of waking up a herd of clients all waiting on the same request.
v12: Rearrange missed-interrupt fault injection so that it works with igt/drv_missed_irq_hang
v13: Rename intel_breadcrumb and friends to intel_wait in preparation for signal handling.
v14: RCU commentary, assert_spin_locked
v15: Hide BUG_ON behind the compiler; report on gem_latency findings.
v16: Sort seqno-groups by priority so that the first waiter has the highest task priority (and so avoid priority inversion).
v17: Add waiters to post-mortem GPU hang state.
v18: Return early for a completed wait after acquiring the spinlock. Avoids adding ourselves to the tree if the wait is already complete, and skips the awkward question of why we don't do completion wakeups for waits earlier than or equal to ourselves.
v19: Prepare for init_breadcrumbs to fail. Later patches may want to allocate during init, so be prepared to propagate back the error code.

Testcase: igt/gem_concurrent_blit
Testcase: igt/benchmarks/gem_latency
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: "Rogozhkin, Dmitry V" <dmitry.v.rogozhkin@intel.com>
Cc: "Gong, Zhipeng" <zhipeng.gong@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Dave Gordon <david.s.gordon@intel.com>
Cc: "Goel, Akash" <akash.goel@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> #v18
Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-6-git-send-email-chris@chris-wilson.co.uk
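A drastically simplified, single-threaded sketch of the "wake only the first waiter" idea above - no rbtree, no kthread, no locking, and all names invented. It only illustrates why keeping waiters sorted by seqno lets one bottom-half filter the interrupt on behalf of everyone else, with each woken waiter handing off to its successor.

#include <stdbool.h>
#include <stdio.h>

/* Waiters kept sorted by the seqno they wait for (oldest first). */
struct waiter {
    unsigned int seqno;
    const char *task;
    bool woken;
};

static struct waiter waiters[] = {
    { 10, "client-A", false },
    { 11, "client-B", false },
    { 12, "client-C", false },
};
static const int nr_waiters = 3;

/*
 * On interrupt, only the first (oldest) outstanding waiter checks the seqno.
 * If its request is complete it is woken and later hands over to the next
 * waiter, instead of every client being woken after every batch.
 */
static void irq_wake_first(unsigned int completed_seqno)
{
    for (int i = 0; i < nr_waiters; i++) {
        struct waiter *w = &waiters[i];

        if (w->woken)
            continue;
        if (completed_seqno < w->seqno)
            return; /* oldest outstanding wait not yet complete: stop */
        w->woken = true;
        printf("wake %s (seqno %u)\n", w->task, w->seqno);
        return; /* w wakes its successor when it finishes its own wait */
    }
}

int main(void)
{
    irq_wake_first(10); /* only client-A wakes */
    irq_wake_first(12); /* client-B next; client-C waits for B's hand-off */
    return 0;
}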
-