提交 · 5d4bac5503fcc67dd7999571e243cee49371aef7 · openeuler / raspberrypi-kernel

23 3月, 2017 1 次提交

drm/i915: Restore marking context objects as dirty on pinning · 5d4bac55

由 Chris Wilson 提交于 3月 22, 2017

Commit e8a9c58f ("drm/i915: Unify active context tracking between
legacy/execlists/guc") converted the legacy intel_ringbuffer submission
to the same context pinning mechanism as execlists - that is to pin the
context until the subsequent request is retired. Previously it used the
vma retirement of the context object to keep itself pinned until the
next request (after i915_vma_move_to_active()). In the conversion, I
missed that the vma retirement was also responsible for marking the
object as dirty. Mark the context object as dirty when pinning
(equivalent to execlists) which ensures that if the context is swapped
out due to mempressure or suspend/hibernation, when it is loaded back in
it does so with the previous state (and not all zero).

Fixes: e8a9c58f ("drm/i915: Unify active context tracking between legacy/execlists/guc")
Reported-by: NDennis Gilmore <dennis@ausil.us>
Reported-by: NMathieu Marquer <mathieu.marquer@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99993
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100181Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: <drm-intel-fixes@lists.freedesktop.org> # v4.11-rc1
Link: http://patchwork.freedesktop.org/patch/msgid/20170322205930.12762-1-chris@chris-wilson.co.ukReviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>

5d4bac55

21 3月, 2017 1 次提交

drm/i915: Remove intel_ring.last_retired_head · fe085f13

由 Chris Wilson 提交于 3月 21, 2017

Storing the position of the breadcrumb of the last retired request as
a separate last_retired_head is superfluous as we always copy that into
head prior to recalculation of the intel_ring.space.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170321102552.24357-1-chris@chris-wilson.co.uk

fe085f13

17 3月, 2017 2 次提交

drm/i915: Assert that the context pin_counts do not overflow · a533b4ba

由 Chris Wilson 提交于 3月 16, 2017

This should be impossible, but let's assert that we do not pin a context
4 billion times before retiring!

v2: Fix the assertion -- the patch had just one job to do!
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170316171628.3228-1-chris@chris-wilson.co.ukReviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>

a533b4ba

drm/i915: Move engine->submit_request selection to a vfunc · ff44ad51

由 Chris Wilson 提交于 3月 16, 2017

It turns out that we may want to restore the original
engine->submit_request (and engine->schedule) callbacks from more than
just the guc <-> execlists transition. Move this to a vfunc so we can
have a common interface.

v2: Move initial selection to intel_engines_init_common(), repaint vfunc
with engine->set_default_submission (and a similar colour for the
helper).
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170316171305.12972-2-chris@chris-wilson.co.uk

ff44ad51

28 2月, 2017 1 次提交

drm/i915: Reduce context alignment · afeddf50

由 Chris Wilson 提交于 2月 27, 2017

No hardware was ever shipped that needed more than 4096 byte alignment
and future hardware will not use this legacy path. So reduce the
alignment to make it easier and quicker to launch workloads.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170227135913.8056-3-chris@chris-wilson.co.uk

afeddf50

20 2月, 2017 1 次提交

drm/i915: Assert that the request->tail is always qword aligned · 944a36d4

由 Chris Wilson 提交于 2月 17, 2017

The hardware requires that the tail pointer only advance in qword units,
so assert that the value we write is aligned to qwords, and similarly
enforce this restriction onto the request->tail.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170217163833.731-1-chris@chris-wilson.co.ukReviewed-by: NMichał Winiarski <michal.winiarski@intel.com>

944a36d4

17 2月, 2017 3 次提交

drm/i915: Consolidate gen8_emit_pipe_control · 9f235dfa

由 Tvrtko Ursulin 提交于 2月 16, 2017

We have a few open coded instances in the execlists code and an
almost suitable helper in intel_ringbuf.c

We can consolidate to a single helper if we change the existing
helper to emit directly to ring buffer memory and move the space
reservation outside it.

v2: Drop memcpy for memset. (Chris Wilson)
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170216122325.31391-2-tvrtko.ursulin@linux.intel.com

9f235dfa

drm/i915: Move common workaround code to intel_engine_cs · 133b4bd7

由 Tvrtko Ursulin 提交于 2月 16, 2017

It is used by all submission backends.
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>

133b4bd7

drm/i915: Make int __intel_ring_space static · 2f35afe9

由 Tvrtko Ursulin 提交于 2月 16, 2017

It is only used within intel_ringbuffer.c
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.oc.uk>

2f35afe9

16 2月, 2017 1 次提交

drm/i915: Restore context and pd for ringbuffer submission after reset · ec62ed3e

由 Chris Wilson 提交于 2月 07, 2017

Following a reset, the context and page directory registers are lost.
However, the queue of requests that we resubmit after the reset may
depend upon them - the registers are restored from a context image, but
that restore may be inhibited and may simply be absent from the request
if it was in the middle of a sequence using the same context. If we
prime the CCID/PD registers with the first request in the queue (even
for the hung request), we prevent invalid memory access for the
following requests (and continually hung engines).

v2: Magic BIT(8), reserved for future use but still appears unused.
v3: Some commentary on handling innocent vs guilty requests
v4: Add a wait for PD_BASE fetch. The reload appears to be instant on my
Ivybridge, but this bit probably exists for a reason.

Fixes: 821ed7df ("drm/i915: Update reset path to fix incomplete requests")
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170207152437.4252-1-chris@chris-wilson.co.ukReviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>
(cherry picked from commit c0dcb203)
Signed-off-by: NJani Nikula <jani.nikula@intel.com>

ec62ed3e

15 2月, 2017 2 次提交

drm/i915: Remove duplicate intel_logical_ring_workarounds_emit · 4ac9659e

由 Tvrtko Ursulin 提交于 2月 14, 2017

intel_ring_workarounds_emit is exactly the same code.
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170214150017.16058-1-tvrtko.ursulin@linux.intel.com

4ac9659e

drm/i915: fix for WaDisableDopClockGating:bdw · 9cc19733

由 Robert Bragg 提交于 2月 12, 2017

This workaround for BDW was incomplete as it also requires EUTC clock
gating to be disabled via UCGCTL1.

v2: read modify write UCGTL1 in broadwell_init_clock_gating (Ville)
Signed-off-by: NRobert Bragg <robert@sixbynine.org>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170212133252.20990-1-robert@sixbynine.orgReviewed-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com>

9cc19733

14 2月, 2017 1 次提交

drm/i915: Emit to ringbuffer directly · 73dec95e

由 Tvrtko Ursulin 提交于 2月 14, 2017

This removes the usage of intel_ring_emit in favour of
directly writing to the ring buffer.

intel_ring_emit was preventing the compiler for optimising
fetch and increment of the current ring buffer pointer and
therefore generating very verbose code for every write.

It had no useful purpose since all ringbuffer operations
are started and ended with intel_ring_begin and
intel_ring_advance respectively, with no bail out in the
middle possible, so it is fine to increment the tail in
intel_ring_begin and let the code manage the pointer
itself.

Useless instruction removal amounts to approximately
two and half kilobytes of saved text on my build.

Not sure if this has any measurable performance
implications but executing a ton of useless instructions
on fast paths cannot be good.

v2:
 * Change return from intel_ring_begin to error pointer by
   popular demand.
 * Move tail increment to intel_ring_advance to enable some
   error checking.

v3:
 * Move tail advance back into intel_ring_begin.
 * Rebase and tidy.

v4:
 * Complete rebase after a few months since v3.

v5:
 * Remove unecessary cast and fix !debug compile. (Chris Wilson)

v6:
 * Make intel_ring_offset take request as well.
 * Fix recording of request postfix plus a sprinkle of asserts.
   (Chris Wilson)

v7:
 * Use intel_ring_offset to get the postfix. (Chris Wilson)
 * Convert GVT code as well.

v8:
 * Rename *out++ to *cs++.

v9:
 * Fix GVT out to cs conversion in GVT.

v10:
 * Rebase for new intel_ring_begin in selftests.
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Zhi Wang <zhi.a.wang@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Acked-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170214113242.29241-1-tvrtko.ursulin@linux.intel.com

73dec95e

11 2月, 2017 1 次提交

drm/i915: Rename conditional GEM execution macros · ae9a043b

由 Chris Wilson 提交于 2月 07, 2017

After a brief discussion, we settled on a naming convention for the
conditional GEM debugging data that should be clearer to the casual
user: GEM_DEBUG
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170207102319.10910-1-chris@chris-wilson.co.ukReviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>

ae9a043b

10 2月, 2017 1 次提交

drm/i915: Always pin contexts into the high GGTT · 72b72ae4

由 Chris Wilson 提交于 2月 10, 2017

Now that we have fast top-down insertion into the drm_mm, we can use it
for frequent runtime operations like insertion of the context object,
whereas before we limited it to the one-off insertion of the pinned
kernel context. Keeping the active context objects out of the mappable
region of the global GTT (except under memory pressure) improves our
ability to allocate mappable aperture region without triggering a GPU
stall.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170210101422.1598-1-chris@chris-wilson.co.ukReviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>

72b72ae4

08 2月, 2017 1 次提交

drm/i915: Restore context and pd for ringbuffer submission after reset · c0dcb203

由 Chris Wilson 提交于 2月 07, 2017

Following a reset, the context and page directory registers are lost.
However, the queue of requests that we resubmit after the reset may
depend upon them - the registers are restored from a context image, but
that restore may be inhibited and may simply be absent from the request
if it was in the middle of a sequence using the same context. If we
prime the CCID/PD registers with the first request in the queue (even
for the hung request), we prevent invalid memory access for the
following requests (and continually hung engines).

v2: Magic BIT(8), reserved for future use but still appears unused.
v3: Some commentary on handling innocent vs guilty requests
v4: Add a wait for PD_BASE fetch. The reload appears to be instant on my
Ivybridge, but this bit probably exists for a reason.

Fixes: 821ed7df ("drm/i915: Update reset path to fix incomplete requests")
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170207152437.4252-1-chris@chris-wilson.co.ukReviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>

c0dcb203

07 2月, 2017 1 次提交

drm/i915: Mark the end of intel_ring_begin() and check in intel_ring_advance() · eca56a35

由 Chris Wilson 提交于 2月 06, 2017

It is required that the caller declare the exact number of dwords they
wish to write into the ring. This is required for two reasons, we need
to allocate sufficient space for the entire command packet and we need
to be sure that the contents are completely written to avoid executing
stale data. The current interface requires for any bug to be caught in
review, the reader has to carefully count the number of
intel_ring_emit() between intel_ring_begin() and intel_ring_advance().
If we record the end of the packet of each intel_ring_begin() we can
also have CI check for us.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170206170502.30944-1-chris@chris-wilson.co.uk

eca56a35

30 1月, 2017 1 次提交

drm/i915/glk: Turn on workarounds that apply to Geminilake too · 9fb5026f

由 Ander Conselvan de Oliveira 提交于 1月 26, 2017

Apply workarounds to Geminilake, and annotate those that are applied
unconditionally when they apply to GLK based on the workaround database.

v2: Fix commit message typos. (David)
v3: Rebase.
Cc: David Weinehall <david.weinehall@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: NAnder Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Reviewed-by: NDavid Weinehall <david.weinehall@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1485422218-9102-1-git-send-email-ander.conselvan.de.oliveira@intel.com

9fb5026f

19 1月, 2017 1 次提交

drm/i915: Remove i915_vma_create from VMA API · a01cb37a

由 Chris Wilson 提交于 1月 16, 2017

With the introduce of i915_vma_instance() for obtaining the VMA
singleton for a (obj, vm, view) tuple, we can remove the
i915_vma_create() in favour of a single entry point. We do incur a
lookup onto an empty tree, but the i915_vma_create() were being called
infrequently and during initialisation, so the small overhead is
negligible.

v2: Drop the i915_ prefix from the now static vma_create() function
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170116152131.18089-4-chris@chris-wilson.co.uk

a01cb37a

18 1月, 2017 1 次提交

drm/i915: Remove WaDisableLSQCROPERFforOCL KBL workaround. · 4fc020d8

由 Francisco Jerez 提交于 1月 12, 2017

The WaDisableLSQCROPERFforOCL workaround has the side effect of
disabling an L3SQ optimization that has huge performance implications
and is unlikely to be necessary for the correct functioning of usual
graphic workloads.  Userspace is free to re-enable the workaround on
demand, and is generally in a better position to determine whether the
workaround is necessary than the DRM is (e.g. only during the
execution of compute kernels that rely on both L3 fences and HDC R/W
requests).

The same workaround seems to apply to BDW (at least to production
stepping G1) and SKL as well (the internal workaround database claims
that it does for all steppings, while the BSpec workaround table only
mentions pre-production steppings), but the DRM doesn't do anything
beyond whitelisting the L3SQCREG4 register so userspace can enable it
when it sees fit.  Do the same on KBL platforms.

Improves performance of the GFXBench4 gl_manhattan31 benchmark by 60%,
and gl_4 (AKA car chase) by 14% on a KBL GT2 running Mesa master --
This is followed by a regression of 35% and 10% respectively for the
same benchmarks and platform caused by my recent patch series
switching userspace to use the dataport constant cache instead of the
sampler to implement uniform pull constant loads, which caused us to
hit more heavily the L3 cache (and on platforms other than KBL had the
opposite effect of improving performance of the same two benchmarks).
The overall effect on KBL of this change combined with the recent
userspace change is respectively 4.6% and 2.6%.  SynMark2 OglShMapPcf
was affected by the constant cache changes (though it improved as it
did on other platforms rather than regressing), but is not
significantly affected by this patch (with statistical significance of
5% and sample size 20).

v2: Drop some more code to avoid unused variable warning.

Fixes: 738fa1b3 ("drm/i915/kbl: Add WaDisableLSQCROPERFforOCL")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99256Signed-off-by: NFrancisco Jerez <currojerez@riseup.net>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: Eero Tamminen <eero.t.tamminen@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: beignet@lists.freedesktop.org
Cc: <stable@vger.kernel.org> # v4.7+
Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>
[Removed double Fixes tag]
Signed-off-by: NMika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1484217894-20505-1-git-send-email-mika.kuoppala@intel.com
(cherry picked from commit 8726f2fa)
Signed-off-by: NJani Nikula <jani.nikula@intel.com>

4fc020d8

12 1月, 2017 1 次提交

drm/i915: Remove WaDisableLSQCROPERFforOCL KBL workaround. · 8726f2fa

由 Francisco Jerez 提交于 1月 12, 2017

The WaDisableLSQCROPERFforOCL workaround has the side effect of
disabling an L3SQ optimization that has huge performance implications
and is unlikely to be necessary for the correct functioning of usual
graphic workloads.  Userspace is free to re-enable the workaround on
demand, and is generally in a better position to determine whether the
workaround is necessary than the DRM is (e.g. only during the
execution of compute kernels that rely on both L3 fences and HDC R/W
requests).

The same workaround seems to apply to BDW (at least to production
stepping G1) and SKL as well (the internal workaround database claims
that it does for all steppings, while the BSpec workaround table only
mentions pre-production steppings), but the DRM doesn't do anything
beyond whitelisting the L3SQCREG4 register so userspace can enable it
when it sees fit.  Do the same on KBL platforms.

Improves performance of the GFXBench4 gl_manhattan31 benchmark by 60%,
and gl_4 (AKA car chase) by 14% on a KBL GT2 running Mesa master --
This is followed by a regression of 35% and 10% respectively for the
same benchmarks and platform caused by my recent patch series
switching userspace to use the dataport constant cache instead of the
sampler to implement uniform pull constant loads, which caused us to
hit more heavily the L3 cache (and on platforms other than KBL had the
opposite effect of improving performance of the same two benchmarks).
The overall effect on KBL of this change combined with the recent
userspace change is respectively 4.6% and 2.6%.  SynMark2 OglShMapPcf
was affected by the constant cache changes (though it improved as it
did on other platforms rather than regressing), but is not
significantly affected by this patch (with statistical significance of
5% and sample size 20).

v2: Drop some more code to avoid unused variable warning.

Fixes: 738fa1b3 ("drm/i915/kbl: Add WaDisableLSQCROPERFforOCL")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99256Signed-off-by: NFrancisco Jerez <currojerez@riseup.net>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: Eero Tamminen <eero.t.tamminen@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: beignet@lists.freedesktop.org
Cc: <stable@vger.kernel.org> # v4.7+
Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>
[Removed double Fixes tag]
Signed-off-by: NMika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1484217894-20505-1-git-send-email-mika.kuoppala@intel.com

8726f2fa

11 1月, 2017 1 次提交

drm/i915: Replace 4096 with PAGE_SIZE or I915_GTT_PAGE_SIZE · f51455d4

由 Chris Wilson 提交于 1月 10, 2017

Start converting over from the byte count to its semantic macro, either
we want to allocate the size of a physical page in main memory or we
want the size of a virtual page in the GTT. 4096 could mean either, but
PAGE_SIZE and I915_GTT_PAGE_SIZE are explicit and should help improve
code comprehension and future changes. In the future, we may want to use
variable GTT page sizes and so have the challenge of knowing which
hardcoded values were used to represent a physical page vs the virtual
page.

v2: Look for a few more 4096s to convert, discover IS_ALIGNED().
v3: 4096ul paranoia, make fence alignment a distinct value of 4096, keep
bdw stolen w/a as 4096 until we know better.
v4: Add asserts that i915_vma_insert() start/end are aligned to GTT page
sizes.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170110144734.26052-1-chris@chris-wilson.co.ukReviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>

f51455d4

07 1月, 2017 1 次提交

drm/i915: Simplify testing for am-I-the-kernel-context? · 984ff29f

由 Chris Wilson 提交于 1月 06, 2017

The kernel context (dev_priv->kernel_context) is unique in that it is
not associated with any user filp - it is the only one with
ctx->file_priv == NULL. This is a simpler test than comparing it against
dev_priv->kernel_context which involves some pointer dancing.

In checking that this is true, we notice that the gvt context is
allocating itself a i915_hw_ppgtt it doesn't use and not flagging that
its file_priv should be invalid.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170106152013.24684-5-chris@chris-wilson.co.uk

984ff29f

24 12月, 2016 1 次提交

drm/i915: request ring to be pinned above GUC_WOPCM_TOP · d3ef1af6

由 Daniele Ceraolo Spurio 提交于 12月 23, 2016

GuC will validate the ring offset and fail if it is in the
[0, GUC_WOPCM_TOP) range. The bias is conditionally applied only
if GuC loading is enabled (we can't check for guc submission enabled as
in other cases because HuC loading requires this fix).

Note that the default context is processed before enable_guc_loading is
sanitized, so we might still apply the bias to its ring even if it is
not needed.

v2: compute the value during ctx init and pass it to
    intel_ring_pin (Chris), updated commit message
Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Cc: Anusha Srivatsa <anusha.srivatsa@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1482537382-28584-1-git-send-email-daniele.ceraolospurio@intel.comReviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>

d3ef1af6

19 12月, 2016 2 次提交

drm/i915: Swap if(enable_execlists) in i915_gem_request_alloc for a vfunc · f73e7399

由 Chris Wilson 提交于 12月 18, 2016

A fairly trivial move of a matching pair of routines (for preparing a
request for construction) onto an engine vfunc. The ulterior motive is
to be able to create a mock request implementation.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161218153724.8439-7-chris@chris-wilson.co.uk

f73e7399

drm/i915: Unify active context tracking between legacy/execlists/guc · e8a9c58f

由 Chris Wilson 提交于 12月 18, 2016

The requests conversion introduced a nasty bug where we could generate a
new request in the middle of constructing a request if we needed to idle
the system in order to evict space for a context. The request to idle
would be executed (and waited upon) before the current one, creating a
minor havoc in the seqno accounting, as we will consider the current
request to already be completed (prior to deferred seqno assignment) but
ring->last_retired_head would have been updated and still could allow
us to overwrite the current request before execution.

We also employed two different mechanisms to track the active context
until it was switched out. The legacy method allowed for waiting upon an
active context (it could forcibly evict any vma, including context's),
but the execlists method took a step backwards by pinning the vma for
the entire active lifespan of the context (the only way to evict was to
idle the entire GPU, not individual contexts). However, to circumvent
the tricky issue of locking (i.e. we cannot take struct_mutex at the
time of i915_gem_request_submit(), where we would want to move the
previous context onto the active tracker and unpin it), we take the
execlists approach and keep the contexts pinned until retirement.
The benefit of the execlists approach, more important for execlists than
legacy, was the reduction in work in pinning the context for each
request - as the context was kept pinned until idle, it could short
circuit the pinning for all active contexts.

We introduce new engine vfuncs to pin and unpin the context
respectively. The context is pinned at the start of the request, and
only unpinned when the following request is retired (this ensures that
the context is idle and coherent in main memory before we unpin it). We
move the engine->last_context tracking into the retirement itself
(rather than during request submission) in order to allow the submission
to be reordered or unwound without undue difficultly.

And finally an ulterior motive for unifying context handling was to
prepare for mock requests.

v2: Rename to last_retired_context, split out legacy_context tracking
for MI_SET_CONTEXT.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161218153724.8439-3-chris@chris-wilson.co.uk

e8a9c58f

07 12月, 2016 1 次提交

drm/i915: add some more "i" in platform names for consistency · 2a307c2e

由 Jani Nikula 提交于 11月 30, 2016

Consistency FTW.
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: NJani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/9ab811dc06570bd3fc05a917ade1bdc9bb805a75.1480520526.git.jani.nikula@intel.com

2a307c2e

02 12月, 2016 2 次提交

drm/i915: Make GEM object create and create from data take dev_priv · 12d79d78

由 Tvrtko Ursulin 提交于 12月 01, 2016

Makes all GEM object constructors consistent.

v2: Fix compilation in GVT code.
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (v1)

12d79d78

drm/i915: Make GEM object alloc/free and stolen created take dev_priv · 187685cb

由 Tvrtko Ursulin 提交于 12月 01, 2016

Where it is more appropriate and also to be consistent with
the direction of the driver.

v2: Leave out object alloc/free inlining. (Joonas Lahtinen)
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>

187685cb

15 11月, 2016 1 次提交

drm/i915: Defer transfer onto execution timeline to actual hw submission · d55ac5bf

由 Chris Wilson 提交于 11月 14, 2016

Defer the transfer from the client's timeline onto the execution
timeline from the point of readiness to the point of actual submission.
For example, in execlists, a request is finally submitted to hardware
when the hardware is ready, and only put onto the hardware queue when
the request is ready. By deferring the transfer, we ensure that the
timeline is maintained in retirement order if we decide to queue the
requests onto the hardware in a different order than fifo.

v2: Rebased onto distinct global/user timeline lock classes.
v3: Play with the position of the spin_lock().
v4: Nesting finally resolved with distinct sw_fence lock classes.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161114204105.29171-4-chris@chris-wilson.co.uk

d55ac5bf

01 11月, 2016 1 次提交

drm/i915: Export a function to flush the context upon pinning · 07c9a21a

由 Chris Wilson 提交于 10月 30, 2016

For legacy contexts we employ an optimisation to only flush the context
when binding into the global GTT. This avoids stalling on the GPU when
reloading an active context. Wrap this detail up into a helper and
export it for a potential third user. (Longer term, context pinning
needs to be reworked as the current handling of switch context pins too
late and so risks eviction and corrupting the request. Plans, plans,
plans.)

v2: Expand the comment explaining the optimisation for avoiding the
stall on active contexts.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20161030132820.32163-1-chris@chris-wilson.co.ukReviewed-by: NMatthew Auld <matthew.auld@intel.com>

07c9a21a

29 10月, 2016 9 次提交

drm/i915: Move the global sync optimisation to the timeline · 85e17f59