提交 · 1ac159e23c2c033f1fcbf7d60286b90335a4e9b2 · openeuler / Kernel

29 5月, 2019 1 次提交

drm/i915: Expand subslice mask · 1ac159e2

由 Stuart Summers 提交于 5月 24, 2019

Currently, the subslice_mask runtime parameter is stored as an
array of subslices per slice. Expand the subslice mask array to
better match what is presented to userspace through the
I915_QUERY_TOPOLOGY_INFO ioctl. The index into this array is
then calculated:
  slice * subslice stride + subslice index / 8

v2: fix spacing in set_sseu_info args
    use set_sseu_info to initialize sseu data when building
    device status in debugfs
    rename variables in intel_engine_types.h to avoid checkpatch
    warnings
v3: update headers in intel_sseu.h
v4: add const to some sseu_dev_info variables
    use sseu->eu_stride for EU stride calculations
v5: address review comments from Tvrtko and Daniele
v6: remove extra space in intel_sseu_get_subslices
    return the correct subslice enable in for_each_instdone
    add GEM_BUG_ON to ensure user doesn't pass invalid ss_mask size
    use printk formatted string for subslice mask
v7: remove string.h header and rebase

Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: NStuart Summers <stuart.summers@intel.com>
Signed-off-by: NManasi Navare <manasi.d.navare@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190524154022.13575-6-stuart.summers@intel.com

1ac159e2

28 5月, 2019 3 次提交

drm/i915: Drop the deferred active reference · c017cf6b

由 Chris Wilson 提交于 5月 28, 2019

An old optimisation to reduce the number of atomics per batch sadly
relies on struct_mutex for coordination. In order to remove struct_mutex
from serialising object/context closing, always taking and releasing an
active reference on first use / last use greatly simplifies the locking.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NMatthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-15-chris@chris-wilson.co.uk

c017cf6b

drm/i915: Move more GEM objects under gem/ · 10be98a7

由 Chris Wilson 提交于 5月 28, 2019

Continuing the theme of separating out the GEM clutter.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-8-chris@chris-wilson.co.uk

10be98a7

drm/i915/guc: Updates for GuC 32.0.3 firmware · ffd5ce22

由 Michal Wajdeczko 提交于 5月 27, 2019

New GuC 32.0.3 firmware made many changes around its ABI that
require driver updates:

* FW release version numbering schema now includes patch number
* FW release version encoding in CSS header
* Boot parameters
* Suspend/resume protocol
* Sample-forcewake command
* Additional Data Structures (ADS)

This commit is a squash of patches 3-8 from series [1].
[1] https://patchwork.freedesktop.org/series/58760/Signed-off-by: NMichal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Anusha Srivatsa <anusha.srivatsa@intel.com>
Cc: Jeff Mcgee <jeff.mcgee@intel.com>
Cc: John Spotswood <john.a.spotswood@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Tomasz Lis <tomasz.lis@intel.com>
Acked-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> # numbering schema
Acked-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> # ccs heaser
Acked-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> # boot params
Acked-by: John Spotswood <john.a.spotswood@intel.com> # suspend/resume
Acked-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> # sample-forcewake
Acked-by: John Spotswood <john.a.spotswood@intel.com> # sample-forcewake
Acked-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> # ADS
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190527183613.17076-4-michal.wajdeczko@intel.com

ffd5ce22

22 5月, 2019 1 次提交

drm/i915: Engine discovery query · c5d3e39c

由 Tvrtko Ursulin 提交于 5月 22, 2019

Engine discovery query allows userspace to enumerate engines, probe their
configuration features, all without needing to maintain the internal PCI
ID based database.

A new query for the generic i915 query ioctl is added named
DRM_I915_QUERY_ENGINE_INFO, together with accompanying structure
drm_i915_query_engine_info. The address of latter should be passed to the
kernel in the query.data_ptr field, and should be large enough for the
kernel to fill out all known engines as struct drm_i915_engine_info
elements trailing the query.

As with other queries, setting the item query length to zero allows
userspace to query minimum required buffer size.

Enumerated engines have common type mask which can be used to query all
hardware engines, versus engines userspace can submit to using the execbuf
uAPI.

Engines also have capabilities which are per engine class namespace of
bits describing features not present on all engine instances.

v2:
 * Fixed HEVC assignment.
 * Reorder some fields, rename type to flags, increase width. (Lionel)
 * No need to allocate temporary storage if we do it engine by engine.
   (Lionel)

v3:
 * Describe engine flags and mark mbz fields. (Lionel)
 * HEVC only applies to VCS.

v4:
 * Squash SFC flag into main patch.
 * Tidy some comments.

v5:
 * Add uabi_ prefix to engine capabilities. (Chris Wilson)
 * Report exact size of engine info array. (Chris Wilson)
 * Drop the engine flags. (Joonas Lahtinen)
 * Added some more reserved fields.
 * Move flags after class/instance.

v6:
 * Do not check engine info array was zeroed by userspace but zero the
   unused fields for them instead.

v7:
 * Simplify length calculation loop. (Lionel Landwerlin)

v8:
 * Remove MBZ comments where not applicable.
 * Rename ABI flags to match engine class define naming.
 * Rename SFC ABI flag to reflect it applies to VCS and VECS.
 * SFC is wired to even _logical_ engine instances.
 * SFC applies to VCS and VECS.
 * HEVC is present on all instances on Gen11. (Tony)
 * Simplify length calculation even more. (Chris Wilson)
 * Move info_ptr assigment closer to loop for clarity. (Chris Wilson)
 * Use vdbox_sfc_access from runtime info.
 * Rebase for RUNTIME_INFO.
 * Refactor for lower indentation.
 * Rename uAPI class/instance to engine_class/instance to avoid C++
   keyword.

v9:
 * Rebase for s/num_rings/num_engines/ in RUNTIME_INFO.

v10:
 * Use new copy_query_item.

v11:
 * Consolidate with struct i915_engine_class_instnace.
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tony Ye <tony.ye@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> # v7
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190522090054.6007-1-tvrtko.ursulin@linux.intel.com

c5d3e39c

08 5月, 2019 1 次提交

drm/i915/hangcheck: Replace hangcheck.seqno with RING_HEAD · 519a0194

由 Chris Wilson 提交于 5月 08, 2019

After realising we need to sample RING_START to detect context switches
from preemption events that do not allow for the seqno to advance, we
can also realise that the seqno itself is just a distance along the ring
and so can be replaced by sampling RING_HEAD.

v2: Bonus comment for the mystery separate CS_STALL before MI_USER_INTERRUPT
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190508080704.24223-1-chris@chris-wilson.co.uk

519a0194

07 5月, 2019 1 次提交

drm/i915: Assert the local engine->wakeref is active · dc58958d

由 Chris Wilson 提交于 5月 03, 2019

Due to the asynchronous tasklet and recursive GT wakeref, it may happen
that we submit to the engine (underneath it's own wakeref) prior to the
central wakeref being marked as taken. Switch to checking the local wakeref
for greater consistency.

Fixes: 79ffac85 ("drm/i915: Invert the GEM wakeref hierarchy")
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190503115225.30831-3-chris@chris-wilson.co.uk

dc58958d

03 5月, 2019 1 次提交

drm/i915/execlists: Flush the tasklet on parking · c34c5bca

由 Chris Wilson 提交于 5月 03, 2019

Tidy up the cleanup sequence by always ensure that the tasklet is
flushed on parking (before we cleanup). The parking provides a
convenient point to ensure that the backend is truly idle.

v2: Do the full check for idleness before parking, to be sure we flush
any residual interrupt.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190503080942.30151-1-chris@chris-wilson.co.uk

c34c5bca

02 5月, 2019 1 次提交

drm/i915: Include fence signaled bit in print_request() · 8c334f24

由 Chris Wilson 提交于 5月 01, 2019

Show the fence flags view of request completion in addition to the
normal hwsp check and whether signaling is enabled.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190501114541.10077-2-chris@chris-wilson.co.uk

8c334f24

01 5月, 2019 1 次提交

drm/i915: Move the engine->destroy() vfunc onto the engine · 45b9c968

由 Chris Wilson 提交于 5月 01, 2019

Make the engine responsible for cleaning itself up!

This removes the i915->gt.cleanup vfunc that has been annoying the
casual reader and myself for the last several years, and helps keep a
future patch to add more cleanup tidy.

v2: Assert that engine->destroy is set after the backend starts
allocating its own state.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190501103204.18632-1-chris@chris-wilson.co.uk

45b9c968

27 4月, 2019 3 次提交

drm/i915: Switch back to an array of logical per-engine HW contexts · 5e2a0419

由 Chris Wilson 提交于 4月 26, 2019

We switched to a tree of per-engine HW context to accommodate the
introduction of virtual engines. However, we plan to also support
multiple instances of the same engine within the GEM context, defeating
our use of the engine as a key to looking up the HW context. Just
allocate a logical per-engine instance and always use an index into the
ctx->engines[]. Later on, this ctx->engines[] may be replaced by a user
specified map.

v2: Add for_each_gem_engine() helper to iterator within the engines lock
v3: intel_context_create_request() helper
v4: s/unsigned long/unsigned int/ 4 billion engines is quite enough.
v5: Push iterator locking to caller
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190426163336.15906-7-chris@chris-wilson.co.uk

5e2a0419

drm/i915: Split engine setup/init into two phases · 11334c6a

由 Chris Wilson 提交于 4月 26, 2019

In the next patch, we require the engine vfuncs setup prior to
initialising the pinned kernel contexts, so split the vfunc setup from
the engine initialisation and call it earlier.

v2: s/setup_xcs/setup_common/ for intel_ring_submission_setup()
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190426163336.15906-6-chris@chris-wilson.co.uk

11334c6a

drm/i915: Export intel_context_instance() · fa9f6681

由 Chris Wilson 提交于 4月 26, 2019

We want to pass in a intel_context into intel_context_pin() and that
requires us to first be able to lookup the intel_context!
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190426163336.15906-2-chris@chris-wilson.co.uk

fa9f6681

26 4月, 2019 2 次提交

drm/i915: Enable render context support for gen4 (Broadwater to Cantiga) · 9ce9bdb0

由 Chris Wilson 提交于 4月 19, 2019

Broadwater and the rest of gen4  do support being able to saving and
reloading context specific registers between contexts, providing isolation
of the basic GPU state (as programmable by userspace). This allows
userspace to assume that the GPU retains their state from one batch to the
next, minimising the amount of state it needs to reload and manually save
across batches.

v2: CONSTANT_BUFFER woes

Running through piglit turned up an interesting issue, a GPU hang inside
the context load. The context image includes the CONSTANT_BUFFER command
that loads an address into a on-gpu buffer, and the context load was
executing that immediately. However, since it was reading from the GTT
there is no guarantee that the GTT retains the same configuration as
when the context was saved, resulting in stray reads and a GPU hang.

Having tried issuing a CONSTANT_BUFFER (to disable the command) from the
ring before saving the context to no avail, we resort to patching out
the instruction inside the context image before loading.

This does impose that gen4 always reissues CONSTANT_BUFFER commands on
each batch, but due to the use of a shared GTT that was and will remain
a requirement.

v3: ECOSKPD to the rescue

Ville found the magic bit in the ECOSKPD to disable saving and restoring
the CONSTANT_BUFFER from the context image, thereby completely avoiding
the GPU hangs from chasing invalid pointers. This appears to be the
default behaviour for gen5, and so we just need to tweak gen4 to match.

v4: Fix spelling of ECOSKPD and discover it already exists
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: NKenneth Graunke <kenneth@whitecape.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20190419172720.5462-1-chris@chris-wilson.co.uk

9ce9bdb0

drm/i915: Enable render context support for Ironlake (gen5) · 1215d28e

由 Chris Wilson 提交于 4月 19, 2019

Ironlake does support being able to saving and reloading context specific
registers between contexts, providing isolation of the basic GPU state
(as programmable by userspace). This allows userspace to assume that the
GPU retains their state from one batch to the next, minimising the
amount of state it needs to reload, or manually save and restore.

v2: Fix off-by-one in reading CXT_SIZE, and add a comment that the
CXT_SIZE and context-layout do not match in bspec, but the difference is
irrelevant as we overallocate the full page anyway (Ville).
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190419111749.3910-2-chris@chris-wilson.co.uk

1215d28e

25 4月, 2019 2 次提交

drm/i915: Invert the GEM wakeref hierarchy · 79ffac85

由 Chris Wilson 提交于 4月 24, 2019

In the current scheme, on submitting a request we take a single global
GEM wakeref, which trickles down to wake up all GT power domains. This
is undesirable as we would like to be able to localise our power
management to the available power domains and to remove the global GEM
operations from the heart of the driver. (The intent there is to push
global GEM decisions to the boundary as used by the GEM user interface.)

Now during request construction, each request is responsible via its
logical context to acquire a wakeref on each power domain it intends to
utilize. Currently, each request takes a wakeref on the engine(s) and
the engines themselves take a chipset wakeref. This gives us a
transition on each engine which we can extend if we want to insert more
powermangement control (such as soft rc6). The global GEM operations
that currently require a struct_mutex are reduced to listening to pm
events from the chipset GT wakeref. As we reduce the struct_mutex
requirement, these listeners should evaporate.

Perhaps the biggest immediate change is that this removes the
struct_mutex requirement around GT power management, allowing us greater
flexibility in request construction. Another important knock-on effect,
is that by tracking engine usage, we can insert a switch back to the
kernel context on that engine immediately, avoiding any extra delay or
inserting global synchronisation barriers. This makes tracking when an
engine and its associated contexts are idle much easier -- important for
when we forgo our assumed execution ordering and need idle barriers to
unpin used contexts. In the process, it means we remove a large chunk of
code whose only purpose was to switch back to the kernel context.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190424200717.1686-5-chris@chris-wilson.co.uk

79ffac85

drm/i915: Move GraphicsTechnology files under gt/ · 112ed2d3

由 Chris Wilson 提交于 4月 24, 2019

Start partitioning off the code that talks to the hardware (GT) from the
uapi layers and move the device facing code under gt/

One casualty is s/intel_ringbuffer.h/intel_engine.h/ with the plan to
subdivide that header and body further (and split out the submission
code from the ringbuffer and logical context handling). This patch aims
to be simple motion so git can fixup inflight patches with little mess.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Acked-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: NJani Nikula <jani.nikula@intel.com>
Acked-by: NRodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190424174839.7141-1-chris@chris-wilson.co.uk

112ed2d3

24 4月, 2019 1 次提交

drm/i915: Store the default sseu setup on the engine · 09407579

由 Chris Wilson 提交于 4月 24, 2019

As we push for better compartmentalisation, it is more convenient to
copy the default sseu configuration from the engine into the derived
logical context, than it is to dig it out from i915->runtime_info.

v2: Use intel_sseu_from_device_info() to describe the converter
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190424095134.30249-1-chris@chris-wilson.co.uk

09407579

11 4月, 2019 2 次提交

drm/i915: Prepare for larger CSB status FIFO size · 7d4c75d9

由 Mika Kuoppala 提交于 4月 05, 2019

Make csb entry count variable in preparation for larger
CSB status FIFO size found on gen11+ hardware.

v2: adapt to hwsp access only (Chris)
    non continuous mmio (Daniele)
v3: entries (Chris), fix macro for checkpatch
v4: num_entries (Chris)
v5: consistency on num_entries

Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190405204657.12887-1-chris@chris-wilson.co.uk

7d4c75d9

drm/i915: Only reset the pinned kernel contexts on resume · 9726920b

由 Chris Wilson 提交于 4月 10, 2019

On resume, we know that the only pinned contexts in danger of seeing
corruption are the kernel context, and so we do not need to walk the
list of all GEM contexts as we tracked them on each engine.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190410190120.830-1-chris@chris-wilson.co.uk

9726920b

27 3月, 2019 3 次提交

drm/i915: take a reference to uncore in the engine and use it · baba6e57

由 Daniele Ceraolo Spurio 提交于 3月 25, 2019

A few advantages:

- Prepares us for the planned split of display uncore from GT uncore

- Improves our engine-centric view of the world in the engine code
  and allows us to avoid jumping back to dev_priv.

- Allows us to wrap accesses to engine register in nice macros that
  automatically pick the right mmio base.
Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190325214940.23632-10-daniele.ceraolospurio@intel.com

baba6e57

drm/i915: intel_wait_for_register_fw to uncore · d2d551c0

由 Daniele Ceraolo Spurio 提交于 3月 25, 2019

The intel_uncore structure is the owner of register access, so
subclass the function to it.

While at it, use a local uncore var and switch to the new read/write
functions where it makes sense.
Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190325214940.23632-8-daniele.ceraolospurio@intel.com

d2d551c0

drm/i915: switch intel_uncore_forcewake_for_reg to intel_uncore · 4319382e

由 Daniele Ceraolo Spurio 提交于 3月 25, 2019

The intel_uncore structure is the owner of FW, so subclass the
function to it.

While at it, use a local uncore var and switch to the new read/write
functions where it makes sense.
Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190325214940.23632-7-daniele.ceraolospurio@intel.com

4319382e

22 3月, 2019 1 次提交

drm/i915: Flush pages on acquisition · a679f58d

由 Chris Wilson 提交于 3月 21, 2019

When we return pages to the system, we ensure that they are marked as
being in the CPU domain since any external access is uncontrolled and we
must assume the worst. This means that we need to always flush the pages
on acquisition if we need to use them on the GPU, and from the beginning
have used set-domain. Set-domain is overkill for the purpose as it is a
general synchronisation barrier, but our intent is to only flush the
pages being swapped in. If we move that flush into the pages acquisition
phase, we know then that when we have obj->mm.pages, they are coherent
with the GPU and need only maintain that status without resorting to
heavy handed use of set-domain.

The principle knock-on effect for userspace is through mmap-gtt
pagefaulting. Our uAPI has always implied that the GTT mmap was async
(especially as when any pagefault occurs is unpredicatable to userspace)
and so userspace had to apply explicit domain control itself
(set-domain). However, swapping is transparent to the kernel, and so on
first fault we need to acquire the pages and make them coherent for
access through the GTT. Our use of set-domain here leaks into the uABI
that the first pagefault was synchronous. This is unintentional and
baring a few igt should be unoticed, nevertheless we bump the uABI
version for mmap-gtt to reflect the change in behaviour.

Another implication of the change is that gem_create() is presumed to
create an object that is coherent with the CPU and is in the CPU write
domain, so a set-domain(CPU) following a gem_create() would be a minor
operation that merely checked whether we could allocate all pages for
the object. On applying this change, a set-domain(CPU) causes a clflush
as we acquire the pages. This will have a small impact on mesa as we move
the clflush here on !llc from execbuf time to create, but that should
have minimal performance impact as the same clflush exists but is now
done early and because of the clflush issue, userspace recycles bo and
so should resist allocating fresh objects.

Internally, the presumption that objects are created in the CPU
write-domain and remain so through writes to obj->mm.mapping is more
prevalent than I expected; but easy enough to catch and apply a manual
flush.

For the future, we should push the page flush from the central
set_pages() into the callers so that we can more finely control when it
is applied, but for now doing it one location is easier to validate, at
the cost of sometimes flushing when there is no need.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Antonio Argenziano <antonio.argenziano@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: NMatthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190321161908.8007-1-chris@chris-wilson.co.uk

a679f58d

21 3月, 2019 2 次提交

drm/i915: Stop storing the context name as the timeline name · 4daffb66

由 Chris Wilson 提交于 3月 21, 2019

The timeline->name is only used for convenience in pretty printing the
i915_request.fence->ops->get_timeline_name() and it is just as
convenient to pull it from the gem_context directly. The few instances
of its use inside GEM_TRACE() has proven more of a nuisance than
helpful, so not worth saving imo.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190321140711.11190-4-chris@chris-wilson.co.uk

4daffb66

drm/i915: use intel_uncore for all forcewake get/put · 3ceea6a1

由 Daniele Ceraolo Spurio 提交于 3月 19, 2019

Now that the internal code all works on intel_uncore, flip the
external-facing interface.

v2: fix GVT.
Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: NPaulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190319183543.13679-4-daniele.ceraolospurio@intel.com

3ceea6a1

08 3月, 2019 4 次提交

drm/i915: Track the pinned kernel contexts on each engine · 9dbfea98

由 Chris Wilson 提交于 3月 08, 2019

Each engine acquires a pin on the kernel contexts (normal and preempt)
so that the logical state is always available on demand. Keep track of
each engines pin by storing the returned pointer on the engine for quick
access.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190308132522.21573-6-chris@chris-wilson.co.uk

9dbfea98

drm/i915: Move over to intel_context_lookup() · c4d52feb

由 Chris Wilson 提交于 3月 08, 2019

In preparation for an ever growing number of engines and so ever
increasing static array of HW contexts within the GEM context, move the
array over to an rbtree, allocated upon first use.

Unfortunately, this imposes an rbtree lookup at a few frequent callsites,
but we should be able to mitigate those by moving over to using the HW
context as our primary type and so only incur the lookup on the boundary
with the user GEM context and engines.

v2: Check for no HW context in guc_stage_desc_init
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190308132522.21573-4-chris@chris-wilson.co.uk

c4d52feb

drm/i915: Remove has-kernel-context · 7d6ce558

由 Chris Wilson 提交于 3月 08, 2019

We can no longer assume execution ordering, and in particular we cannot
assume which context will execute last. One side-effect of this is that
we cannot determine if the kernel-context is resident on the GPU, so
remove the routines that claimed to do so.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190308093657.8640-4-chris@chris-wilson.co.uk

7d6ce558

drm/i915: Reduce presumption of request ordering for barriers · c6eeb479

由 Chris Wilson 提交于 3月 08, 2019

Currently we assume that we know the order in which requests run and so
can determine if we need to reissue a switch-to-kernel-context prior to
idling. That assumption does not hold for the future, so instead of
tracking which barriers have been used, simply determine if we have ever
switched away from the kernel context by using the engine and before
idling ensure that all engines that have been used since the last idle
are synchronously switched back to the kernel context for safety (and
else of shrinking memory while idle).

v2: Use intel_engine_mask_t and ALL_ENGINES
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190308093657.8640-3-chris@chris-wilson.co.uk

c6eeb479

06 3月, 2019 3 次提交

drm/i915: Move find_active_request() to the engine · cf4331dd

由 Chris Wilson 提交于 3月 05, 2019

To find the active request, we need only search along the individual
engine for the right request. This does not require touching any global
GEM state, so move it into the engine compartment.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190305180332.30900-3-chris@chris-wilson.co.uk

cf4331dd

drm/i915: Store the BIT(engine->id) as the engine's mask · 8a68d464

由 Chris Wilson 提交于 3月 05, 2019

In the next patch, we are introducing a broad virtual engine to encompass
multiple physical engines, losing the 1:1 nature of BIT(engine->id). To
reflect the broader set of engines implied by the virtual instance, lets
store the full bitmask.

v2: Use intel_engine_mask_t (s/ring_mask/engine_mask/)
v3: Tvrtko voted for moah churn so teach everyone to not mention ring
and use $class$instance throughout.
v4: Comment upon the disparity in bspec for using VCS1,VCS2 in gen8 and
VCS[0-4] in later gen. We opt to keep the code consistent and use
0-index naming throughout.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190305180332.30900-1-chris@chris-wilson.co.uk

8a68d464

drm/i915: Remove last traces of exec-id (GEM_BUSY) · c8b50242

由 Chris Wilson 提交于 3月 05, 2019

As we allow per-context engine allows the legacy concept of
I915_EXEC_RING no longer applies universally. We are still exposing the
unrelated exec-id in GEM_BUSY, so transition this ioctl (once more
slightly changing its ABI, but no one cares) over to only reporting the
uabi-class (not instance as we can not foreseeably fit those into the
small bitmask).

The only user of the extended ring information from GEM_BUSY is ddx/sna,
which tries to use the non-rcs business information to guide which
engine to use for subsequent operations on foreign bo. All that matters
for it is the decision between rcs and !rcs, so it is unaffected by the
change in higher bits.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190305162643.20243-1-chris@chris-wilson.co.uk

c8b50242

02 3月, 2019 1 次提交

drm/i915: Use HW semaphores for inter-engine synchronisation on gen8+ · e8861964

由 Chris Wilson 提交于 3月 01, 2019

Having introduced per-context seqno, we now have a means to identity
progress across the system without feel of rollback as befell the
global_seqno. That is we can program a MI_SEMAPHORE_WAIT operation in
advance of submission safe in the knowledge that our target seqno and
address is stable.

However, since we are telling the GPU to busy-spin on the target address
until it matches the signaling seqno, we only want to do so when we are
sure that busy-spin will be completed quickly. To achieve this we only
submit the request to HW once the signaler is itself executing (modulo
preemption causing us to wait longer), and we only do so for default and
above priority requests (so that idle priority tasks never themselves
hog the GPU waiting for others).

As might be reasonably expected, HW semaphores excel in inter-engine
synchronisation microbenchmarks (where the 3x reduced latency / increased
throughput more than offset the power cost of spinning on a second ring)
and have significant improvement (can be up to ~10%, most see no change)
for single clients that utilize multiple engines (typically media players
and transcoders), without regressing multiple clients that can saturate
the system or changing the power envelope dramatically.

v3: Drop the older NEQ branch, now we pin the signaler's HWSP anyway.
v4: Tell the world and include it as part of scheduler caps.

Testcase: igt/gem_exec_whisper
Testcase: igt/benchmarks/gem_wsim
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190301170901.8340-3-chris@chris-wilson.co.uk

e8861964

28 2月, 2019 3 次提交

drm/i915: Report engines are idle if already parked · bd2be141

由 Chris Wilson 提交于 2月 27, 2019

If we have parked, then we must have passed an idleness test and still
be idle. We chose not to use this shortcut in the past so that we could
use the idleness test at any time and inspect HW. However, some HW like
Sandybridge, doesn't like being woken up frivolously, so avoid doing so.

References: 0b702dca ("drm/i915: Avoid waking the engines just to check if they are idle")
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190227214159.7946-1-chris@chris-wilson.co.uk

bd2be141

Revert "drm/i915: Avoid waking the engines just to check if they are idle" · 44f8b802

由 Chris Wilson 提交于 2月 27, 2019

This reverts commit 0b702dca.

CI reports that this is not as reliable as it first appears, with
failures starting to sporadically occur in selftests.

Fixes: 0b702dca ("drm/i915: Avoid waking the engines just to check if they are idle")
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190227204654.14907-1-chris@chris-wilson.co.uk

44f8b802

drm/i915: Compute the global scheduler caps · 2d5eaad0

由 Chris Wilson 提交于 2月 26, 2019

Do a pass over all the engines upon starting to determine the global
scheduler capability flags (those that are agreed upon by all).
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190226102404.29153-7-chris@chris-wilson.co.uk

2d5eaad0

27 2月, 2019 1 次提交

drm/i915: Avoid waking the engines just to check if they are idle · 0b702dca

由 Chris Wilson 提交于 2月 27, 2019

Exploit that reads of the ring registers return 0 from the engine when
it is idle and we do not apply forcewake to know that if the engine is
idle then both reads will be identical (and so we interpret the ring as
idle).

The ulterior motive is to try and reduce the number of spurious wakeups
to avoid untimely death, such as:

<3> [85.046836] [drm:fw_domains_get [i915]] *ERROR* render: timed out waiting for forcewake ack request.
<4> [85.051916] ------------[ cut here ]------------
<4> [85.051917] GT thread status wait timed out
<4> [85.051963] WARNING: CPU: 2 PID: 2195 at drivers/gpu/drm/i915/intel_uncore.c:303 __gen6_gt_wait_for_thread_c0+0x6e/0xa0 [i915]
<4> [85.051964] Modules linked in: snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic i915 x86_pkg_temp_thermal coretemp mei_hdcp crct10dif_pclmul crc32_pclmul snd_hda_intel ghash_clmulni_intel snd_hda_codec broadcom bcm_phy_lib i2c_i801 snd_hwdep snd_hda_core tg3 snd_pcm ptp pps_core mei_me mei prime_numbers lpc_ich
<4> [85.051980] CPU: 2 PID: 2195 Comm: drm_read Tainted: G     U            5.0.0-rc8-CI-CI_DRM_5662+ #1
<4> [85.051981] Hardware name: Dell Inc. XPS 8300  /0Y2MRG, BIOS A06 10/17/2011
<4> [85.052012] RIP: 0010:__gen6_gt_wait_for_thread_c0+0x6e/0xa0 [i915]
<4> [85.052015] Code: 8b 92 5c 80 13 00 83 e2 07 75 d5 5b 5d c3 80 3d 5b 6a 1a 00 00 75 f4 48 c7 c7 38 21 31 a0 c6 05 4b 6a 1a 00 01 e8 e2 84 ea e0 <0f> 0b eb dd 80 3d 3a 6a 1a 00 00 75 98 48 c7 c6 08 21 31 a0 48 c7
<4> [85.052016] RSP: 0018:ffffc9000043bd00 EFLAGS: 00010086
<4> [85.052019] RAX: 0000000000000000 RBX: ffff888217c50000 RCX: 0000000000000000
<4> [85.052020] RDX: 0000000000000007 RSI: ffffffff820cb141 RDI: 00000000ffffffff
<4> [85.052022] RBP: 00000013cd30f2fb R08: 0000000000000000 R09: 0000000000000001
<4> [85.052024] R10: ffffc9000043bce0 R11: 0000000000000000 R12: ffff888217c50ee0
<4> [85.052025] R13: 0000000000000001 R14: 00000000ffffffff R15: ffff888218076530
<4> [85.052028] FS:  00007fc79d049980(0000) GS:ffff888227a80000(0000) knlGS:0000000000000000
<4> [85.052029] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4> [85.052031] CR2: 00007f782e2940f8 CR3: 000000022458e006 CR4: 00000000000606e0
<4> [85.052033] Call Trace:
<4> [85.052064]  gen6_read32+0x14e/0x250 [i915]
<4> [85.052096]  intel_engine_is_idle+0x7d/0x180 [i915]
<4> [85.052126]  intel_engines_are_idle+0x29/0x50 [i915]
<4> [85.052153]  i915_drop_caches_set+0x21c/0x290 [i915]
<4> [85.052160]  simple_attr_write+0xb0/0xd0
<4> [85.052165]  full_proxy_write+0x51/0x80
<4> [85.052170]  __vfs_write+0x31/0x190
<4> [85.052176]  ? rcu_read_lock_sched_held+0x6f/0x80
<4> [85.052178]  ? rcu_sync_lockdep_assert+0x29/0x50
<4> [85.052181]  ? __sb_start_write+0x152/0x1f0
<4> [85.052183]  ? __sb_start_write+0x163/0x1f0
<4> [85.052187]  vfs_write+0xbd/0x1b0
<4> [85.052191]  ksys_write+0x50/0xc0
<4> [85.052196]  do_syscall_64+0x55/0x190
<4> [85.052200]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4> [85.052202] RIP: 0033:0x7fc79c9d3281
<4> [85.052204] Code: c3 0f 1f 84 00 00 00 00 00 48 8b 05 59 8d 20 00 c3 0f 1f 84 00 00 00 00 00 8b 05 8a d1 20 00 85 c0 75 16 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 57 f3 c3 0f 1f 44 00 00 41 54 55 49 89 d4 53
<4> [85.052206] RSP: 002b:00007fffa4a0a7f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
<4> [85.052208] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007fc79c9d3281
<4> [85.052210] RDX: 0000000000000005 RSI: 00007fffa4a0a880 RDI: 0000000000000008
<4> [85.052212] RBP: 00007fffa4a0a820 R08: 0000000000000000 R09: 0000000000000000
<4> [85.052213] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fc79c9bc718
<4> [85.052215] R13: 0000000000000003 R14: 00007fc79c9c1628 R15: 00007fc79c9bdd80
<4> [85.052223] irq event stamp: 71630
<4> [85.052226] hardirqs last  enabled at (71629): [<ffffffff8197b64c>] _raw_spin_unlock_irqrestore+0x4c/0x60
<4> [85.052228] hardirqs last disabled at (71630): [<ffffffff8197b4bd>] _raw_spin_lock_irqsave+0xd/0x50
<4> [85.052231] softirqs last  enabled at (70444): [<ffffffff81c0033a>] __do_softirq+0x33a/0x4b9
<4> [85.052234] softirqs last disabled at (70433): [<ffffffff810b51b1>] irq_exit+0xd1/0xe0
<4> [85.052264] WARNING: CPU: 2 PID: 2195 at drivers/gpu/drm/i915/intel_uncore.c:303 __gen6_gt_wait_for_thread_c0+0x6e/0xa0 [i915]
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190227114958.32438-1-chris@chris-wilson.co.uk

0b702dca

26 2月, 2019 2 次提交

drm/i915: Remove i915_request.global_seqno · b300fde8

由 Chris Wilson 提交于 2月 26, 2019

Having weaned the interrupt handling off using a single global execution
queue, we no longer need to emit a global_seqno. Note that we still have
a few assumptions about execution order along engine timelines, but this
removes the most obvious artefact!
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190226094922.31617-3-chris@chris-wilson.co.uk

b300fde8

drm/i915: Remove access to global seqno in the HWSP · 8892f477

由 Chris Wilson 提交于 2月 26, 2019

Stop accessing the HWSP to read the global seqno, and stop tracking the
mirror in the engine's execution timeline -- it is unused.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190226094922.31617-2-chris@chris-wilson.co.uk

8892f477

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功