提交 · 4f7af1948abcb18b4772fe1bcd84d7d27d96258c · openeuler / Kernel

06 11月, 2019 4 次提交

drm/i915: Support ro ppgtt mapped cmdparser shadow buffers · 4f7af194

由 Jon Bloomfield 提交于 5月 22, 2018

For Gen7, the original cmdparser motive was to permit limited
use of register read/write instructions in unprivileged BB's.
This worked by copying the user supplied bb to a kmd owned
bb, and running it in secure mode, from the ggtt, only if
the scanner finds no unsafe commands or registers.

For Gen8+ we can't use this same technique because running bb's
from the ggtt also disables access to ppgtt space. But we also
do not actually require 'secure' execution since we are only
trying to reduce the available command/register set. Instead we
will copy the user buffer to a kmd owned read-only bb in ppgtt,
and run in the usual non-secure mode.

Note that ro pages are only supported by ppgtt (not ggtt), but
luckily that's exactly what we need.

Add the required paths to map the shadow buffer to ppgtt ro for Gen8+

v2: IS_GEN7/IS_GEN (Mika)
v3: rebase
v4: rebase
v5: rebase
Signed-off-by: NJon Bloomfield <jon.bloomfield@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: NChris Wilson <chris.p.wilson@intel.com>

4f7af194

drm/i915: Add support for mandatory cmdparsing · 311a50e7

由 Jon Bloomfield 提交于 8月 01, 2018

The existing cmdparser for gen7 can be bypassed by specifying
batch_len=0 in the execbuf call. This is safe because bypassing
simply reduces the cmd-set available.

In a later patch we will introduce cmdparsing for gen9, as a
security measure, which must be strictly enforced since without
it we are vulnerable to DoS attacks.

Introduce the concept of 'required' cmd parsing that cannot be
bypassed by submitting zero-length bb's.

v2: rebase (Mika)
v2: rebase (Mika)
v3: fix conflict on engine flags (Mika)
Signed-off-by: NJon Bloomfield <jon.bloomfield@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: NChris Wilson <chris.p.wilson@intel.com>

311a50e7

drm/i915: Remove Master tables from cmdparser · 66d8aba1

由 Jon Bloomfield 提交于 6月 08, 2018

The previous patch has killed support for secure batches
on gen6+, and hence the cmdparsers master tables are
now dead code. Remove them.
Signed-off-by: NJon Bloomfield <jon.bloomfield@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Tyler Hicks <tyhicks@canonical.com>
Reviewed-by: NChris Wilson <chris.p.wilson@intel.com>

66d8aba1

drm/i915: Disable Secure Batches for gen6+ · 44157641

由 Jon Bloomfield 提交于 6月 08, 2018

Retroactively stop reporting support for secure batches
through the api for gen6+ so that older binaries trigger
the fallback path instead.

Older binaries use secure batches pre gen6 to access resources
that are not available to normal usermode processes. However,
all known userspace explicitly checks for HAS_SECURE_BATCHES
before relying on the secure batch feature.

Since there are no known binaries relying on this for newer gens
we can kill secure batches from gen6, via I915_PARAM_HAS_SECURE_BATCHES.

v2: rebase (Mika)
v3: rebase (Mika)
Signed-off-by: NJon Bloomfield <jon.bloomfield@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: NChris Wilson <chris.p.wilson@intel.com>

44157641

22 8月, 2019 1 次提交

drm/i915: Replace i915_vma_put_fence() · 1f7fd484

由 Chris Wilson 提交于 8月 22, 2019

Avoid calling i915_vma_put_fence() by using our alternate paths that
bind a secondary vma avoiding the original fenced vma. For the few
instances where we need to release the fence (i.e. on binding when the
GGTT range becomes invalid), replace the put_fence with a revoke_fence.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NMatthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190822061557.18402-1-chris@chris-wilson.co.uk

1f7fd484

21 8月, 2019 1 次提交

drm/i915: Replace PIN_NONFAULT with calls to PIN_NOEVICT · 6846895f

由 Chris Wilson 提交于 8月 21, 2019

When under severe stress for GTT mappable space, the LRU eviction model
falls off a cliff. We spend all our time scanning the much larger
non-mappable area searching for something within the mappable zone we can
evict. Turn this on its head by only using the full vma for the object if
it is already pinned in the mappable zone or there is sufficient *free*
space to accommodate it (prioritizing speedy reuse). If there is not,
immediately fall back to using small chunks (tilerow for GTT mmap, single
pages for pwrite/relocation) and using random eviction before doing a full
search.

Testcase: igt/gem_concurrent_blt
References: https://bugs.freedesktop.org/show_bug.cgi?id=110848Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NMatthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190821123234.19194-1-chris@chris-wilson.co.uk

6846895f

20 8月, 2019 1 次提交

drm/i915: Serialize insertion into the file->mm.request_list · 44c22f3f

由 Chris Wilson 提交于 8月 20, 2019

Currently, we remove the from per-file request list for throttling and
retirement under a dedicated spinlock, but insertion is governed by
struct_mutex. This needs to be the same lock so that the
retirement/insertion of neighbouring requests (at the tail) doesn't
break the list.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: NMatthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190820080907.4665-1-chris@chris-wilson.co.uk

44c22f3f

19 8月, 2019 1 次提交

drm/i915: Serialize against vma moves · 70d6894d

由 Chris Wilson 提交于 8月 19, 2019

Make sure that when submitting requests, we always serialize against
potential vma moves and clflushes.

Time for a i915_request_await_vma() interface!
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: NMatthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190819112033.30638-1-chris@chris-wilson.co.uk

70d6894d

16 8月, 2019 1 次提交

drm/i915: Protect request retirement with timeline->mutex · e5dadff4

由 Chris Wilson 提交于 8月 15, 2019

Forgo the struct_mutex requirement for request retirement as we have
been transitioning over to only using the timeline->mutex for
controlling the lifetime of a request on that timeline.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NMatthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190815205709.24285-4-chris@chris-wilson.co.uk

e5dadff4

13 8月, 2019 1 次提交

dma-buf: rename reservation_object to dma_resv · 52791eee

由 Christian König 提交于 8月 11, 2019

Be more consistent with the naming of the other DMA-buf objects.
Signed-off-by: NChristian König <christian.koenig@amd.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/323401/

52791eee

10 8月, 2019 2 次提交

drm/i915: Remove redundant user_access_end() from __copy_from_user() error path · e6a9522a

由 Josh Poimboeuf 提交于 7月 25, 2019

Objtool reports:

  drivers/gpu/drm/i915/gem/i915_gem_execbuffer.o: warning: objtool: .altinstr_replacement+0x36: redundant UACCESS disable

__copy_from_user() already does both STAC and CLAC, so the
user_access_end() in its error path adds an extra unnecessary CLAC.

Fixes: 0b2c8f8b ("i915: fix missing user_access_end() in page fault exception case")
Reported-by: NThomas Gleixner <tglx@linutronix.de>
Reported-by: NSedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Tested-by: NNick Desaulniers <ndesaulniers@google.com>
Tested-by: NSedat Dilek <sedat.dilek@gmail.com>
Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://github.com/ClangBuiltLinux/linux/issues/617
Link: https://lkml.kernel.org/r/51a4155c5bc2ca847a9cbe85c1c11918bb193141.1564086017.git.jpoimboe@redhat.com

e6a9522a

drm/i915: Lift timeline into intel_context · 75d0a7f3

由 Chris Wilson 提交于 8月 09, 2019

Move the timeline from being inside the intel_ring to intel_context
itself. This saves much pointer dancing and makes the relations of the
context to its timeline much clearer.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NMatthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190809182518.20486-4-chris@chris-wilson.co.uk

75d0a7f3

09 8月, 2019 3 次提交

drm/i915: Generalise BSD default selection · 1a07e86c

由 Chris Wilson 提交于 8月 09, 2019

For the default I915_EXEC_BSD round robin selector, it may select any
available VCS engine. Make it so.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190809091010.23281-3-chris@chris-wilson.co.uk

1a07e86c

drm/i915: Replace global bsd_dispatch_index with random seed · 6b86f900

由 Chris Wilson 提交于 8月 09, 2019

We keep a global seed for the legacy BSD round-robin selector, but in
our testing of multiple simultaneous client workloads, a random seed
spreads the load more evenly. (As even as an initial round-robin selector
can be!) Removing the global is one less variable we have to find a home
for!

We can simulate multi-client (both same and mixed workloads) using
igt/gem_wsim to work out optimal strategies and then compare our
simulation with the actual transcoder on multi-engine machines. This
fixed round-robin turns out to be one of the worst methods.

No user is advised to use this method; the current suggestion is to use
a virtual engine for agnostic batches, randomised submission or using
the busyness tracking to select the most idle engine at the time of
dispatch. At the present time, intel-media is explicit, but libva still
seems to use it, with the exception of batches that must execute on vcs0.
Oh well.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190809091010.23281-2-chris@chris-wilson.co.uk

6b86f900

drm/i915: Check for a second VCS engine more carefully · d5b2a3a4

由 Chris Wilson 提交于 8月 09, 2019

To use the legacy BSD selector, you must have a second VCS engine, or
else the ABI simply maps the request for another engine onto VCS0.
However, we only checked a single VCS1 location and overlooking the
possibility of a sparse VCS set being mapped to the dense ABI.

v2: num_vcs_engines() turns out to be reusable and futureproof it so we
never have to worry about this silly bit of ABI again!
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190809123153.20574-1-chris@chris-wilson.co.uk

d5b2a3a4

07 8月, 2019 2 次提交

drm/i915: remove unnecessary includes of intel_display_types.h header · 6da4a2c4

由 Jani Nikula 提交于 8月 06, 2019

With its original name intel_drv.h the intel_display_types.h header was
superfluously cargo-cult included all over the place, while it's really
mostly about display internals. Remove the unnecessary includes.
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NJani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/e3d737f0ab87c55969e62c1e077e15c04c238297.1565085692.git.jani.nikula@intel.com

6da4a2c4

drm/i915: rename intel_drv.h to display/intel_display_types.h · 1d455f8d

由 Jani Nikula 提交于 8月 06, 2019

Everything about the file is about display, and mostly about types
related to display. Move under display/ as intel_display_types.h to
reflect the facts.

There's still plenty to clean up, but start off with moving the file
where it logically belongs and naming according to contents.

v2: fix the include guard name in the renamed file
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NJani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190806113933.11799-1-jani.nikula@intel.com

1d455f8d

04 8月, 2019 2 次提交

drm/i915: Replace struct_mutex for batch pool serialisation · b40d7378

由 Chris Wilson 提交于 8月 04, 2019

Switch to tracking activity via i915_active on individual nodes, only
keeping a list of retired objects in the cache, and reaping the cache
when the engine itself idles.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NMatthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190804124826.30272-2-chris@chris-wilson.co.uk

b40d7378

drm/i915: Teach execbuffer to take the engine wakeref not GT · a4e57f90

由 Chris Wilson 提交于 8月 04, 2019

In the next patch, we would like to couple into the engine wakeref to
free the batch pool on idling. The caveat here is that we therefore want
to track the engine wakeref more precisely and to hold it instead of the
broader GT wakeref as we process the ioctl.

v2: Avoid introducing odd semantics for a shortlived timeline->mutex
acquisition interface.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190804124826.30272-1-chris@chris-wilson.co.uk

a4e57f90

02 8月, 2019 1 次提交

drm/i915: Flush extra hard after writing relocations through the GTT · 576f0586

由 Chris Wilson 提交于 7月 30, 2019

Recently discovered in commit bdae33b8 ("drm/i915: Use maximum write
flush for pwrite_gtt") was that we needed to our full write barrier
before changing the GGTT PTE to ensure that our indirect writes through
the GTT landed before the PTE changed (and the writes end up in a
different page). That also applies to our GGTT relocation path.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: stable@vger.kernel.org
Reviewed-by: NPrathap Kumar Valsan <prathap.kumar.valsan@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190730112151.5633-4-chris@chris-wilson.co.uk

576f0586

30 7月, 2019 1 次提交

drm/i915/gt: Provide a local intel_context.vm · f5d974f9

由 Chris Wilson 提交于 7月 30, 2019

Track the currently bound address space used by the HW context. Minor
conversions to use the local intel_context.vm are made, leaving behind
some more surgery required to make intel_context the primary through the
selftests.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190730143209.4549-2-chris@chris-wilson.co.uk

f5d974f9

13 7月, 2019 1 次提交

drm/i915/gt: Use intel_gt as the primary object for handling resets · cb823ed9

由 Chris Wilson 提交于 7月 12, 2019

Having taken the first step in encapsulating the functionality by moving
the related files under gt/, the next step is to start encapsulating by
passing around the relevant structs rather than the global
drm_i915_private. In this step, we pass intel_gt to intel_reset.c
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190712192953.9187-1-chris@chris-wilson.co.uk

cb823ed9

26 6月, 2019 1 次提交

drm/i915/gt: Pass intel_gt to pm routines · 0c91621c

由 Chris Wilson 提交于 6月 25, 2019

Switch from passing the i915 container to newly named struct intel_gt.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NMatthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190625130128.11009-2-chris@chris-wilson.co.uk

0c91621c

21 6月, 2019 1 次提交

drm/i915: Move i915_gem_chipset_flush to intel_gt · baea429d

由 Tvrtko Ursulin 提交于 6月 21, 2019

This aligns better with the rest of restructuring.

v2:
 * Move call out of line. (Chris)
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Suggested-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-24-tvrtko.ursulin@linux.intel.com

baea429d

17 6月, 2019 1 次提交

drm/i915: move modesetting core code under display/ · df0566a6

由 Jani Nikula 提交于 6月 13, 2019

Now that we have a new subdirectory for display code, continue by moving
modesetting core code.

display/intel_frontbuffer.h sticks out like a sore thumb, otherwise this
is, again, a surprisingly clean operation.

v2:
- don't move intel_sideband.[ch] (Ville)
- use tabs for Makefile file lists and sort them

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Acked-by: NRodrigo Vivi <rodrigo.vivi@intel.com>
Acked-by: NMaarten Lankhorst <maarten.lankhorst@linux.intel.com>
Acked-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: NJani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190613084416.6794-3-jani.nikula@intel.com

df0566a6

11 6月, 2019 1 次提交

drm/i915: Pull kref into i915_address_space · e568ac38

由 Chris Wilson 提交于 6月 11, 2019

Make the kref common to both derived structs (i915_ggtt and i915_ppgtt)
so that we can safely reference count an abstract ctx->vm address space.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NMatthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190611091238.15808-1-chris@chris-wilson.co.uk

e568ac38

06 6月, 2019 1 次提交

drm/i915: Move object close under its own lock · 155ab883

由 Chris Wilson 提交于 6月 06, 2019

Use i915_gem_object_lock() to guard the LUT and active reference to
allow us to break free of struct_mutex for handling GEM_CLOSE.

Testcase: igt/gem_close_race
Testcase: igt/gem_exec_parallel
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NMatthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190606112320.9704-1-chris@chris-wilson.co.uk

155ab883

28 5月, 2019 4 次提交

drm/i915: Move GEM object domain management from struct_mutex to local · 6951e589

由 Chris Wilson 提交于 5月 28, 2019

Use the per-object local lock to control the cache domain of the
individual GEM objects, not struct_mutex. This is a huge leap forward
for us in terms of object-level synchronisation; execbuffers are
coordinated using the ww_mutex and pread/pwrite is finally fully
serialised again.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NMatthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-10-chris@chris-wilson.co.uk

6951e589

drm/i915: Move more GEM objects under gem/ · 10be98a7

由 Chris Wilson 提交于 5月 28, 2019

Continuing the theme of separating out the GEM clutter.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-8-chris@chris-wilson.co.uk

10be98a7

drm/i915: Move GEM domain management to its own file · f0e4a063

由 Chris Wilson 提交于 5月 28, 2019

Continuing the decluttering of i915_gem.c, that of the read/write
domains, perhaps the biggest of GEM's follies?
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NMatthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-7-chris@chris-wilson.co.uk

f0e4a063

drm/i915: Pull GEM ioctls interface to its own file · afa13085

由 Chris Wilson 提交于 5月 28, 2019

Declutter i915_drv/gem.h by moving the ioctl API into its own header.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NMatthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-2-chris@chris-wilson.co.uk

afa13085

22 5月, 2019 2 次提交

drm/i915: Allow specification of parallel execbuf · a88b6e4c

由 Chris Wilson 提交于 5月 21, 2019

There is a desire to split a task onto two engines and have them run at
the same time, e.g. scanline interleaving to spread the workload evenly.
Through the use of the out-fence from the first execbuf, we can
coordinate secondary execbuf to only become ready simultaneously with
the first, so that with all things idle the second execbufs are executed
in parallel with the first. The key difference here between the new
EXEC_FENCE_SUBMIT and the existing EXEC_FENCE_IN is that the in-fence
waits for the completion of the first request (so that all of its
rendering results are visible to the second execbuf, the more common
userspace fence requirement).

Since we only have a single input fence slot, userspace cannot mix an
in-fence and a submit-fence. It has to use one or the other! This is not
such a harsh requirement, since by virtue of the submit-fence, the
secondary execbuf inherit all of the dependencies from the first
request, and for the application the dependencies should be common
between the primary and secondary execbuf.
Suggested-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Testcase: igt/gem_exec_fence/parallel
Link: https://github.com/intel/media-driver/pull/546Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190521211134.16117-10-chris@chris-wilson.co.uk

a88b6e4c

drm/i915: Allow a context to define its set of engines · 976b55f0

由 Chris Wilson 提交于 5月 21, 2019

Over the last few years, we have debated how to extend the user API to
support an increase in the number of engines, that may be sparse and
even be heterogeneous within a class (not all video decoders created
equal). We settled on using (class, instance) tuples to identify a
specific engine, with an API for the user to construct a map of engines
to capabilities. Into this picture, we then add a challenge of virtual
engines; one user engine that maps behind the scenes to any number of
physical engines. To keep it general, we want the user to have full
control over that mapping. To that end, we allow the user to constrain a
context to define the set of engines that it can access, order fully
controlled by the user via (class, instance). With such precise control
in context setup, we can continue to use the existing execbuf uABI of
specifying a single index; only now it doesn't automagically map onto
the engines, it uses the user defined engine map from the context.

v2: Fixup freeing of local on success of get_engines()
v3: Allow empty engines[]
v4: s/nengine/num_engines/
v5: Replace 64 limit on num_engines with a note that execbuf is
currently limited to only using the first 64 engines.
v6: Actually use the engines_mutex to guard the ctx->engines.

Testcase: igt/gem_ctx_engines
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190521211134.16117-2-chris@chris-wilson.co.uk

976b55f0

27 4月, 2019 2 次提交

drm/i915: Switch back to an array of logical per-engine HW contexts · 5e2a0419

由 Chris Wilson 提交于 4月 26, 2019

We switched to a tree of per-engine HW context to accommodate the
introduction of virtual engines. However, we plan to also support
multiple instances of the same engine within the GEM context, defeating
our use of the engine as a key to looking up the HW context. Just
allocate a logical per-engine instance and always use an index into the
ctx->engines[]. Later on, this ctx->engines[] may be replaced by a user
specified map.

v2: Add for_each_gem_engine() helper to iterator within the engines lock
v3: intel_context_create_request() helper
v4: s/unsigned long/unsigned int/ 4 billion engines is quite enough.
v5: Push iterator locking to caller
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190426163336.15906-7-chris@chris-wilson.co.uk

5e2a0419

drm/i915: Export intel_context_instance() · fa9f6681

由 Chris Wilson 提交于 4月 26, 2019

We want to pass in a intel_context into intel_context_pin() and that
requires us to first be able to lookup the intel_context!
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190426163336.15906-2-chris@chris-wilson.co.uk

fa9f6681

25 4月, 2019 1 次提交

drm/i915: Explicitly pin the logical context for execbuf · 8f2a1057

由 Chris Wilson 提交于 4月 25, 2019

In order to separate the reservation phase of building a request from
its emission phase, we need to pull some of the request alloc activities
from deep inside i915_request to the surface, GEM_EXECBUFFER.

v2: Be frivolous, use a local drm_i915_private.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190425050143.811-1-chris@chris-wilson.co.uk

8f2a1057

03 4月, 2019 1 次提交

i915, uaccess: Fix redundant CLAC · 8f4faed0

由 Peter Zijlstra 提交于 2月 28, 2019

New tooling noticed this:

 drivers/gpu/drm/i915/i915_gem_execbuffer.o: warning: objtool: .altinstr_replacement+0x3c: redundant UACCESS disable
 drivers/gpu/drm/i915/i915_gem_execbuffer.o: warning: objtool: .altinstr_replacement+0x66: redundant UACCESS disable

You don't need user_access_end() if user_access_begin() fails.
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: NIngo Molnar <mingo@kernel.org>

8f4faed0

22 3月, 2019 1 次提交

drm/i915: Flush pages on acquisition · a679f58d

由 Chris Wilson 提交于 3月 21, 2019

When we return pages to the system, we ensure that they are marked as
being in the CPU domain since any external access is uncontrolled and we
must assume the worst. This means that we need to always flush the pages
on acquisition if we need to use them on the GPU, and from the beginning
have used set-domain. Set-domain is overkill for the purpose as it is a
general synchronisation barrier, but our intent is to only flush the
pages being swapped in. If we move that flush into the pages acquisition
phase, we know then that when we have obj->mm.pages, they are coherent
with the GPU and need only maintain that status without resorting to
heavy handed use of set-domain.

The principle knock-on effect for userspace is through mmap-gtt
pagefaulting. Our uAPI has always implied that the GTT mmap was async
(especially as when any pagefault occurs is unpredicatable to userspace)
and so userspace had to apply explicit domain control itself
(set-domain). However, swapping is transparent to the kernel, and so on
first fault we need to acquire the pages and make them coherent for
access through the GTT. Our use of set-domain here leaks into the uABI
that the first pagefault was synchronous. This is unintentional and
baring a few igt should be unoticed, nevertheless we bump the uABI
version for mmap-gtt to reflect the change in behaviour.

Another implication of the change is that gem_create() is presumed to
create an object that is coherent with the CPU and is in the CPU write
domain, so a set-domain(CPU) following a gem_create() would be a minor
operation that merely checked whether we could allocate all pages for
the object. On applying this change, a set-domain(CPU) causes a clflush
as we acquire the pages. This will have a small impact on mesa as we move
the clflush here on !llc from execbuf time to create, but that should
have minimal performance impact as the same clflush exists but is now
done early and because of the clflush issue, userspace recycles bo and
so should resist allocating fresh objects.

Internally, the presumption that objects are created in the CPU
write-domain and remain so through writes to obj->mm.mapping is more
prevalent than I expected; but easy enough to catch and apply a manual
flush.

For the future, we should push the page flush from the central
set_pages() into the callers so that we can more finely control when it
is applied, but for now doing it one location is easier to validate, at
the cost of sometimes flushing when there is no need.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Antonio Argenziano <antonio.argenziano@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: NMatthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190321161908.8007-1-chris@chris-wilson.co.uk

a679f58d

08 3月, 2019 1 次提交

drm/i915: Move over to intel_context_lookup() · c4d52feb

由 Chris Wilson 提交于 3月 08, 2019

In preparation for an ever growing number of engines and so ever
increasing static array of HW contexts within the GEM context, move the
array over to an rbtree, allocated upon first use.

Unfortunately, this imposes an rbtree lookup at a few frequent callsites,
but we should be able to mitigate those by moving over to using the HW
context as our primary type and so only incur the lookup on the boundary
with the user GEM context and engines.

v2: Check for no HW context in guc_stage_desc_init
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190308132522.21573-4-chris@chris-wilson.co.uk

c4d52feb

06 3月, 2019 1 次提交

drm/i915: Store the BIT(engine->id) as the engine's mask · 8a68d464

由 Chris Wilson 提交于 3月 05, 2019

In the next patch, we are introducing a broad virtual engine to encompass
multiple physical engines, losing the 1:1 nature of BIT(engine->id). To
reflect the broader set of engines implied by the virtual instance, lets
store the full bitmask.

v2: Use intel_engine_mask_t (s/ring_mask/engine_mask/)
v3: Tvrtko voted for moah churn so teach everyone to not mention ring
and use $class$instance throughout.
v4: Comment upon the disparity in bspec for using VCS1,VCS2 in gen8 and
VCS[0-4] in later gen. We opt to keep the code consistent and use
0-index naming throughout.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190305180332.30900-1-chris@chris-wilson.co.uk

8a68d464

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功