提交 · 6a45008ab7bb5e13b543de0c141b94aaa71d8397 · openeuler / Kernel

12 10月, 2019 2 次提交

drm/i915/perf: allow for CS OA configs to be created lazily · 6a45008a

由 Lionel Landwerlin 提交于 10月 12, 2019

Here we introduce a mechanism by which the execbuf part of the i915
driver will be able to request that a batch buffer containing the
programming for a particular OA config be created.

We'll execute these OA configuration buffers right before executing a
set of userspace commands so that a particular user batchbuffer be
executed with a given OA configuration.

This mechanism essentially allows the userspace driver to go through
several OA configuration without having to open/close the i915/perf
stream.

v2: No need for locking on object OA config object creation (Chris)
    Flush cpu mapping of OA config (Chris)

v3: Properly deal with the perf_metric lock (Chris/Lionel)

v4: Fix oa config unref/put when not found (Lionel)

v5: Allocate BOs for configurations on the stream instead of globally
    (Lionel)

v6: Fix 64bit division (Chris)

v7: Store allocated config BOs into the stream (Lionel)
Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191012072308.30312-1-chris@chris-wilson.co.uk

6a45008a

drm/i915/perf: Replace global wakeref tracking with engine-pm · a5efcde6

由 Chris Wilson 提交于 10月 11, 2019

As we now have a specific engine to use OA on, exchange the top-level
runtime-pm wakeref with the engine-pm. This still results in the same
top-level runtime-pm, but with more nuances to keep the engine and its
gt awake.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191011190325.10979-1-chris@chris-wilson.co.uk

a5efcde6

11 10月, 2019 2 次提交

drm/i915/perf: Store shortcut to intel_uncore · 52111c46

由 Chris Wilson 提交于 10月 10, 2019

Now that we have the engine stored in i915_perf, we have a means of
accessing intel_gt should we require it. However, we are currently only
using the intel_gt to find the right intel_uncore, so replace our
i915_perf.gt pointer with the more useful i915_perf.uncore.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191010150520.26488-2-chris@chris-wilson.co.uk

52111c46

drm/i915/perf: store the associated engine of a stream · 9a61363a

由 Lionel Landwerlin 提交于 10月 10, 2019

We'll use this information later to verify that a client trying to
reconfigure the stream does so on the right engine. For now, we want to
pull the knowledge of which engine we use into a central property.
Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191010150520.26488-1-chris@chris-wilson.co.uk

9a61363a

08 10月, 2019 3 次提交

drm/i915/perf: drop list of streams · 23b9e41a

由 Lionel Landwerlin 提交于 10月 08, 2019

At some point in time there was the idea that we could have multiple
stream from the same piece of HW but that never materialized and given
the hard time we already have making everything work with the
submission side, there is no real point having this list of 1 element
around.
Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191008140111.5437-1-chris@chris-wilson.co.uk

23b9e41a

drm/i915/perf: Set the exclusive stream under perf->lock · a4c969d1

由 Chris Wilson 提交于 10月 07, 2019

The BKL struct_mutex is no more, the only serialisation we required for
setting the exclusive stream is already managed by ce->pin_mutex in
gen8_configure_all_contexts(). As such, we can manipulate
i915_perf.exclusive_stream underneath our own (already held) perf->lock.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191007140812.10963-2-chris@chris-wilson.co.uk
Link: https://patchwork.freedesktop.org/patch/msgid/20191007210942.18145-2-chris@chris-wilson.co.uk

a4c969d1

drm/i915/perf: Wean ourselves off dev_priv · 8f8b1171

由 Chris Wilson 提交于 10月 07, 2019

Use the local uncore accessors for the GT rather than using the [not-so]
magic global dev_priv mmio routines. In the process, we also teach the
perf stream to use backpointers to the i915_perf rather than digging it
out of dev_priv.

v2: Rebase onto i915_perf_types.h
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> #v1
Link: https://patchwork.freedesktop.org/patch/msgid/20191007140812.10963-1-chris@chris-wilson.co.uk
Link: https://patchwork.freedesktop.org/patch/msgid/20191007210942.18145-1-chris@chris-wilson.co.uk

8f8b1171

04 10月, 2019 3 次提交

drm/i915: Move context management under GEM · a4e7ccda

由 Chris Wilson 提交于 10月 04, 2019

Keep track of the GEM contexts underneath i915->gem.contexts and assign
them their own lock for the purposes of list management.

v2: Focus on lock tracking; ctx->vm is protected by ctx->mutex
v3: Correct split with removal of logical HW ID
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-15-chris@chris-wilson.co.uk

a4e7ccda

drm/i915: Remove logical HW ID · 2935ed53

由 Chris Wilson 提交于 10月 04, 2019

With the introduction of ctx->engines[] we allow multiple logical
contexts to be used on the same engine (e.g. with virtual engines).
According to bspec, aach logical context requires a unique tag in order
for context-switching to occur correctly between them. [Simple
experiments show that it is not so easy to trick the HW into performing
a lite-restore with matching logical IDs, though my memory from early
Broadwell experiments do suggest that it should be generating
lite-restores.]

We only need to keep a unique tag for the active lifetime of the
context, and for as long as we need to identify that context. The HW
uses the tag to determine if it should use a lite-restore (why not the
LRCA?) and passes the tag back for various status identifies. The only
status we need to track is for OA, so when using perf, we assign the
specific context a unique tag.

v2: Calculate required number of tags to fill ELSP.

Fixes: 976b55f0 ("drm/i915: Allow a context to define its set of engines")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111895Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Acked-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-14-chris@chris-wilson.co.uk

2935ed53

drm/i915: Pull i915_vma_pin under the vm->mutex · 2850748e

由 Chris Wilson 提交于 10月 04, 2019

Replace the struct_mutex requirement for pinning the i915_vma with the
local vm->mutex instead. Note that the vm->mutex is tainted by the
shrinker (we require unbinding from inside fs-reclaim) and so we cannot
allocate while holding that mutex. Instead we have to preallocate
workers to do allocate and apply the PTE updates after we have we
reserved their slot in the drm_mm (using fences to order the PTE writes
with the GPU work and with later unbind).

In adding the asynchronous vma binding, one subtle requirement is to
avoid coupling the binding fence into the backing object->resv. That is
the asynchronous binding only applies to the vma timeline itself and not
to the pages as that is a more global timeline (the binding of one vma
does not need to be ordered with another vma, nor does the implicit GEM
fencing depend on a vma, only on writes to the backing store). Keeping
the vma binding distinct from the backing store timelines is verified by
a number of async gem_exec_fence and gem_exec_schedule tests. The way we
do this is quite simple, we keep the fence for the vma binding separate
and only wait on it as required, and never add it to the obj->resv
itself.

Another consequence in reducing the locking around the vma is the
destruction of the vma is no longer globally serialised by struct_mutex.
A natural solution would be to add a kref to i915_vma, but that requires
decoupling the reference cycles, possibly by introducing a new
i915_mm_pages object that is own by both obj->mm and vma->pages.
However, we have not taken that route due to the overshadowing lmem/ttm
discussions, and instead play a series of complicated games with
trylocks to (hopefully) ensure that only one destruction path is called!

v2: Add some commentary, and some helpers to reduce patch churn.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-4-chris@chris-wilson.co.uk

2850748e

25 9月, 2019 1 次提交

drm/i915/selftests: Verify the LRC register layout between init and HW · 7dc56af5

由 Chris Wilson 提交于 9月 24, 2019

Before we submit the first context to HW, we need to construct a valid
image of the register state. This layout is defined by the HW and should
match the layout generated by HW when it saves the context image.
Asserting that this should be equivalent should help avoid any undefined
behaviour and verify that we haven't missed anything important!

Of course, having insisted that the initial register state within the
LRC should match that returned by HW, we need to ensure that it does.

v2: Drop the RELATIVE_MMIO flag from gen11, we ignore it for
constructing the lrc image.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190924145950.3011-1-chris@chris-wilson.co.uk

7dc56af5

31 8月, 2019 1 次提交

drm/i915/perf: Assert locking for i915_init_oa_perf_state() · dffa8feb

由 Chris Wilson 提交于 8月 30, 2019

We use the context->pin_mutex to serialise updates to the OA config and
the registers values written into each new context. Document this
relationship and assert we do hold the context->pin_mutex as used by
gen8_configure_all_contexts() to serialise updates to the OA config
itself.

v2: Add a white-lie for when we call intel_gt_resume() from init.
v3: Lie while we have the context pinned inside atomic reset.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> #v1
Link: https://patchwork.freedesktop.org/patch/msgid/20190830181929.18663-1-chris@chris-wilson.co.uk

dffa8feb

27 8月, 2019 1 次提交

drm/i915/tgl/perf: use the same oa ctx_id format as icl · 45e9c829

由 Michel Thierry 提交于 8月 23, 2019

Compared to Icelake, Tigerlake's MAX_CONTEXT_HW_ID is smaller by one, but
since we just use the upper 32 bits of the lrc_desc, it's guaranteed OA
will use the correct one.

Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: NMichel Thierry <michel.thierry@intel.com>
Signed-off-by: NLucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: NUmesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190823082055.5992-19-lucas.demarchi@intel.com

45e9c829

09 8月, 2019 1 次提交

drm/i915: extract i915_perf.h from i915_drv.h · db94e9f1

由 Jani Nikula 提交于 8月 08, 2019

It used to be handy that we only had a couple of headers, but over time
i915_drv.h has become unwieldy. Extract declarations to a separate
header file corresponding to the implementation module, clarifying the
modularity of the driver.

Ensure the new header is self-contained, and do so with minimal further
includes, using forward declarations as needed. Include the new header
only where needed, and sort the modified include directives while at it
and as needed.

No functional changes.
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NJani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/d7826e365695f691a3ac69a69ff6f2bbdb62700d.1565271681.git.jani.nikula@intel.com

db94e9f1

08 8月, 2019 1 次提交

drm/i915/perf: Refactor oa object to better manage resources · a37f08a8

由 Umesh Nerlige Ramappa 提交于 8月 06, 2019

The oa object manages the oa buffer and must be allocated when the user
intends to read performance counter snapshots. This can be achieved by
making the oa object part of the stream object which is allocated when a
stream is opened by the user.

Attributes in the oa object that are gen-specific are moved to the perf
object so that they can be initialized on driver load.

The split provides a better separation of the objects used in perf
implementation of i915 driver so that resources are allocated and
initialized only when needed.

v2: Fix checkpatch warnings
v3: Addressed Lionel's review comment
v4: Rebase
v5: Fix rebase/merge issue with ratelimit_state_init
Signed-off-by: NUmesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190806233002.984-1-umesh.nerlige.ramappa@intel.com

a37f08a8

06 8月, 2019 1 次提交

drm/i915/gt: Move the [class][inst] lookup for engines onto the GT · 750e76b4

由 Chris Wilson 提交于 8月 06, 2019

To maintain a fast lookup from a GT centric irq handler, we want the
engine lookup tables on the intel_gt. To avoid having multiple copies of
the same multi-dimension lookup table, move the generic user engine
lookup into an rbtree (for fast and flexible indexing).

v2: Split uabi_instance cf uabi_class
v3: Set uabi_class/uabi_instance after collating all engines to provide a
stable uabi across parallel unordered construction.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> #v2
Link: https://patchwork.freedesktop.org/patch/msgid/20190806124300.24945-2-chris@chris-wilson.co.uk

750e76b4

31 7月, 2019 1 次提交

drm/i915: Avoid ce->gem_context->i915 · cb0c43f3

由 Chris Wilson 提交于 7月 30, 2019

My plan for the future is to have kernel contexts not to have a GEM
context backpointer (as they will not belong to any GEM context). In a
few places, we use ce->gem_context to simply obtain the i915 backpointer,
for which we can use ce->engine->i915 instead.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190730163441.16477-1-chris@chris-wilson.co.uk

cb0c43f3

29 7月, 2019 3 次提交

drm/i915/perf: add missing delay for OA muxes configuration · 8f48de49

由 Lionel Landwerlin 提交于 7月 10, 2019

This was dropped from the original patch series, we weren't sure
whether it was needed at the time. More recent tests show it's
definitely needed to have acurate performance data.
Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 19f81df2 ("drm/i915/perf: Add OA unit support for Gen 8+")
Acked-by: NChris Wilson <chris@chris-wilson.co.uk>
[ickle: combine duplicate code and comments]
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190710105524.23017-1-chris@chris-wilson.co.uk
(cherry picked from commit 14bfcd3e)
Signed-off-by: NJani Nikula <jani.nikula@intel.com>

8f48de49

drm/i915/perf: ensure we keep a reference on the driver · 06c12ae3

由 Lionel Landwerlin 提交于 7月 09, 2019

The i915 perf stream has its own file descriptor and is tied to
reference of the driver. We haven't taken care of keep the driver
alive.
Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Suggested-by: NChris Wilson <chris@chris-wilson.co.uk>
Fixes: eec688e1 ("drm/i915: Add i915 perf infrastructure")
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190709123351.5645-2-lionel.g.landwerlin@intel.com
(cherry picked from commit a5af1df7)
Signed-off-by: NJani Nikula <jani.nikula@intel.com>

06c12ae3

drm/i915/perf: fix ICL perf register offsets · 95eef14c

由 Lionel Landwerlin 提交于 6月 10, 2019

We got the wrong offsets (could they have changed?). New values were
computed off an error state by looking up the register offset in the
context image as written by the HW.
Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 1de401c0 ("drm/i915/perf: enable perf support on ICL")
Cc: <stable@vger.kernel.org> # v4.18+
Acked-by: NKenneth Graunke <kenneth@whitecape.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20190610081914.25428-1-lionel.g.landwerlin@intel.com
(cherry picked from commit 8dcfdfb4)
Signed-off-by: NJani Nikula <jani.nikula@intel.com>

95eef14c

26 7月, 2019 1 次提交

drm/i915/perf: Initialise err to 0 before looping over ce->engines · 5cca5038

由 Chris Wilson 提交于 7月 26, 2019

Smatch warning that the loop may be empty causing us to check err before
it had been set. Ensure that it is initialised to 0, just in case.

v2: Refactor the inner loop for better scooping and clarity

Fixes: a9877da2 ("drm/i915/oa: Reconfigure contexts on the fly")
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190726131458.8310-1-chris@chris-wilson.co.uk

5cca5038

19 7月, 2019 1 次提交

proc/sysctl: add shared variables for range check · eec4844f

由 Matteo Croce 提交于 7月 18, 2019

In the sysctl code the proc_dointvec_minmax() function is often used to
validate the user supplied value between an allowed range.  This
function uses the extra1 and extra2 members from struct ctl_table as
minimum and maximum allowed value.

On sysctl handler declaration, in every source file there are some
readonly variables containing just an integer which address is assigned
to the extra1 and extra2 members, so the sysctl range is enforced.

The special values 0, 1 and INT_MAX are very often used as range
boundary, leading duplication of variables like zero=0, one=1,
int_max=INT_MAX in different source files:

    $ git grep -E '\.extra[12].*&(zero|one|int_max)' |wc -l
    248

Add a const int array containing the most commonly used values, some
macros to refer more easily to the correct array member, and use them
instead of creating a local one for every object file.

This is the bloat-o-meter output comparing the old and new binary
compiled with the default Fedora config:

    # scripts/bloat-o-meter -d vmlinux.o.old vmlinux.o
    add/remove: 2/2 grow/shrink: 0/2 up/down: 24/-188 (-164)
    Data                                         old     new   delta
    sysctl_vals                                    -      12     +12
    __kstrtab_sysctl_vals                          -      12     +12
    max                                           14      10      -4
    int_max                                       16       -     -16
    one                                           68       -     -68
    zero                                         128      28    -100
    Total: Before=20583249, After=20583085, chg -0.00%

[mcroce@redhat.com: tipc: remove two unused variables]
  Link: http://lkml.kernel.org/r/20190530091952.4108-1-mcroce@redhat.com
[akpm@linux-foundation.org: fix net/ipv6/sysctl_net_ipv6.c]
[arnd@arndb.de: proc/sysctl: make firmware loader table conditional]
  Link: http://lkml.kernel.org/r/20190617130014.1713870-1-arnd@arndb.de
[akpm@linux-foundation.org: fix fs/eventpoll.c]
Link: http://lkml.kernel.org/r/20190430180111.10688-1-mcroce@redhat.comSigned-off-by: NMatteo Croce <mcroce@redhat.com>
Signed-off-by: NArnd Bergmann <arnd@arndb.de>
Acked-by: NKees Cook <keescook@chromium.org>
Reviewed-by: NAaron Tomlin <atomlin@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

eec4844f

17 7月, 2019 1 次提交

drm/i915/oa: Reconfigure contexts on the fly · a9877da2

由 Chris Wilson 提交于 7月 16, 2019

Avoid a global idle barrier by reconfiguring each context by rewriting
them with MI_STORE_DWORD from the kernel context.

v2: We only need to determine the desired register values once, they are
the same for all contexts.
v3: Don't remove the kernel context from the list of known GEM contexts;
the world is not ready for that yet.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190716213443.9874-1-chris@chris-wilson.co.uk

a9877da2

10 7月, 2019 2 次提交

drm/i915/perf: add missing delay for OA muxes configuration · 14bfcd3e

由 Lionel Landwerlin 提交于 7月 10, 2019

This was dropped from the original patch series, we weren't sure
whether it was needed at the time. More recent tests show it's
definitely needed to have acurate performance data.
Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 19f81df2 ("drm/i915/perf: Add OA unit support for Gen 8+")
Acked-by: NChris Wilson <chris@chris-wilson.co.uk>
[ickle: combine duplicate code and comments]
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190710105524.23017-1-chris@chris-wilson.co.uk

14bfcd3e

drm/i915/perf: ensure we keep a reference on the driver · a5af1df7

由 Lionel Landwerlin 提交于 7月 09, 2019

The i915 perf stream has its own file descriptor and is tied to
reference of the driver. We haven't taken care of keep the driver
alive.
Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Suggested-by: NChris Wilson <chris@chris-wilson.co.uk>
Fixes: eec688e1 ("drm/i915: Add i915 perf infrastructure")
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190709123351.5645-2-lionel.g.landwerlin@intel.com

a5af1df7

27 6月, 2019 1 次提交

drm/i915: Move OA files to separate folder · 5ed7a0cf

由 Michal Wajdeczko 提交于 6月 26, 2019

OA files look to be auto-generated so we can keep them all in
dedicated subdirectory.
Signed-off-by: NMichal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Acked-by: NChris Wilson <chris@chris-wilson.co.uk>
Acked-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: NJani Nikula <jani.nikula@intel.com>
Reviewed-by: NUmesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190626123826.39760-1-michal.wajdeczko@intel.com

5ed7a0cf

25 6月, 2019 1 次提交

drm/i915/perf: fix ICL perf register offsets · 8dcfdfb4

由 Lionel Landwerlin 提交于 6月 10, 2019

We got the wrong offsets (could they have changed?). New values were
computed off an error state by looking up the register offset in the
context image as written by the HW.
Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 1de401c0 ("drm/i915/perf: enable perf support on ICL")
Acked-by: NKenneth Graunke <kenneth@whitecape.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20190610081914.25428-1-lionel.g.landwerlin@intel.com

8dcfdfb4

14 6月, 2019 1 次提交

drm/i915: update rpm_get/put to use the rpm structure · d858d569

由 Daniele Ceraolo Spurio 提交于 6月 13, 2019

The functions where internally already only using the structure, so we
need to just flip the interface.

v2: rebase
Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Acked-by: NImre Deak <imre.deak@intel.com>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190613232156.34940-7-daniele.ceraolospurio@intel.com

d858d569

12 6月, 2019 1 次提交

drm/i915/perf: fix whitelist on Gen10+ · c5cc0bf8

由 Lionel Landwerlin 提交于 6月 02, 2019

Gen10 added an additional NOA_WRITE register (high bits) and we forgot
to whitelist it for userspace.

Fixes: 95690a02 ("drm/i915/perf: enable perf support on CNL")
Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: NKenneth Graunke <kenneth@whitecape.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20190601225845.12600-1-lionel.g.landwerlin@intel.com
(cherry picked from commit bf210f6c)
Signed-off-by: NJani Nikula <jani.nikula@intel.com>

c5cc0bf8

10 6月, 2019 1 次提交

drm/i915/perf: fix whitelist on Gen10+ · bf210f6c

由 Lionel Landwerlin 提交于 6月 02, 2019

Gen10 added an additional NOA_WRITE register (high bits) and we forgot
to whitelist it for userspace.

Fixes: 95690a02 ("drm/i915/perf: enable perf support on CNL")
Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: NKenneth Graunke <kenneth@whitecape.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20190601225845.12600-1-lionel.g.landwerlin@intel.com

bf210f6c

28 5月, 2019 2 次提交

drm/i915: Move more GEM objects under gem/ · 10be98a7

由 Chris Wilson 提交于 5月 28, 2019

Continuing the theme of separating out the GEM clutter.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-8-chris@chris-wilson.co.uk

10be98a7

drm/i915: Move shmem object setup to its own file · 8475355f

由 Chris Wilson 提交于 5月 28, 2019

Split the plain old shmem object into its own file to start decluttering
i915_gem.c

v2: Lose the confusing, hysterical raisins, suffix of _gtt.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NMatthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-4-chris@chris-wilson.co.uk

8475355f

27 4月, 2019 2 次提交

drm/i915: Switch back to an array of logical per-engine HW contexts · 5e2a0419

由 Chris Wilson 提交于 4月 26, 2019

We switched to a tree of per-engine HW context to accommodate the
introduction of virtual engines. However, we plan to also support
multiple instances of the same engine within the GEM context, defeating
our use of the engine as a key to looking up the HW context. Just
allocate a logical per-engine instance and always use an index into the
ctx->engines[]. Later on, this ctx->engines[] may be replaced by a user
specified map.

v2: Add for_each_gem_engine() helper to iterator within the engines lock
v3: intel_context_create_request() helper
v4: s/unsigned long/unsigned int/ 4 billion engines is quite enough.
v5: Push iterator locking to caller
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190426163336.15906-7-chris@chris-wilson.co.uk

5e2a0419

drm/i915: Export intel_context_instance() · fa9f6681

由 Chris Wilson 提交于 4月 26, 2019

We want to pass in a intel_context into intel_context_pin() and that
requires us to first be able to lookup the intel_context!
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190426163336.15906-2-chris@chris-wilson.co.uk

fa9f6681

25 4月, 2019 2 次提交

drm/i915: Pass intel_context to i915_request_create() · 2ccdf6a1

由 Chris Wilson 提交于 4月 24, 2019

Start acquiring the logical intel_context and using that as our primary
means for request allocation. This is the initial step to allow us to
avoid requiring struct_mutex for request allocation along the
perma-pinned kernel context, but it also provides a foundation for
breaking up the complex request allocation to handle different scenarios
inside execbuf.

For the purpose of emitting a request from inside retirement (see the
next patch for engine power management), we also need to lift control
over the timeline mutex to the caller.

v2: Note that the request carries the active reference upon construction.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190424200717.1686-4-chris@chris-wilson.co.uk

2ccdf6a1

drm/i915: Move GraphicsTechnology files under gt/ · 112ed2d3

由 Chris Wilson 提交于 4月 24, 2019

Start partitioning off the code that talks to the hardware (GT) from the
uapi layers and move the device facing code under gt/

One casualty is s/intel_ringbuffer.h/intel_engine.h/ with the plan to
subdivide that header and body further (and split out the submission
code from the ringbuffer and logical context handling). This patch aims
to be simple motion so git can fixup inflight patches with little mess.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Acked-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: NJani Nikula <jani.nikula@intel.com>
Acked-by: NRodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190424174839.7141-1-chris@chris-wilson.co.uk

112ed2d3

24 4月, 2019 1 次提交

drm/i915: Store the default sseu setup on the engine · 09407579

由 Chris Wilson 提交于 4月 24, 2019

As we push for better compartmentalisation, it is more convenient to
copy the default sseu configuration from the engine into the derived
logical context, than it is to dig it out from i915->runtime_info.

v2: Use intel_sseu_from_device_info() to describe the converter
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190424095134.30249-1-chris@chris-wilson.co.uk

09407579

27 3月, 2019 1 次提交

drm/i915: switch intel_wait_for_register to uncore · 97a04e0d

由 Daniele Ceraolo Spurio 提交于 3月 25, 2019

The intel_uncore structure is the owner of register access, so
subclass the function to it.

While at it, use a local uncore var and switch to the new read/write
functions where it makes sense.
Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190325214940.23632-9-daniele.ceraolospurio@intel.com

97a04e0d

22 3月, 2019 1 次提交

drm/i915: Flush pages on acquisition · a679f58d

由 Chris Wilson 提交于 3月 21, 2019

When we return pages to the system, we ensure that they are marked as
being in the CPU domain since any external access is uncontrolled and we
must assume the worst. This means that we need to always flush the pages
on acquisition if we need to use them on the GPU, and from the beginning
have used set-domain. Set-domain is overkill for the purpose as it is a
general synchronisation barrier, but our intent is to only flush the
pages being swapped in. If we move that flush into the pages acquisition
phase, we know then that when we have obj->mm.pages, they are coherent
with the GPU and need only maintain that status without resorting to
heavy handed use of set-domain.

The principle knock-on effect for userspace is through mmap-gtt
pagefaulting. Our uAPI has always implied that the GTT mmap was async
(especially as when any pagefault occurs is unpredicatable to userspace)
and so userspace had to apply explicit domain control itself
(set-domain). However, swapping is transparent to the kernel, and so on
first fault we need to acquire the pages and make them coherent for
access through the GTT. Our use of set-domain here leaks into the uABI
that the first pagefault was synchronous. This is unintentional and
baring a few igt should be unoticed, nevertheless we bump the uABI
version for mmap-gtt to reflect the change in behaviour.

Another implication of the change is that gem_create() is presumed to
create an object that is coherent with the CPU and is in the CPU write
domain, so a set-domain(CPU) following a gem_create() would be a minor
operation that merely checked whether we could allocate all pages for
the object. On applying this change, a set-domain(CPU) causes a clflush
as we acquire the pages. This will have a small impact on mesa as we move
the clflush here on !llc from execbuf time to create, but that should
have minimal performance impact as the same clflush exists but is now
done early and because of the clflush issue, userspace recycles bo and
so should resist allocating fresh objects.

Internally, the presumption that objects are created in the CPU
write-domain and remain so through writes to obj->mm.mapping is more
prevalent than I expected; but easy enough to catch and apply a manual
flush.

For the future, we should push the page flush from the central
set_pages() into the callers so that we can more finely control when it
is applied, but for now doing it one location is easier to validate, at
the cost of sometimes flushing when there is no need.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Antonio Argenziano <antonio.argenziano@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: NMatthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190321161908.8007-1-chris@chris-wilson.co.uk

a679f58d

21 3月, 2019 1 次提交

drm/i915: use intel_uncore for all forcewake get/put · 3ceea6a1

由 Daniele Ceraolo Spurio 提交于 3月 19, 2019

Now that the internal code all works on intel_uncore, flip the
external-facing interface.

v2: fix GVT.
Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: NPaulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190319183543.13679-4-daniele.ceraolospurio@intel.com

3ceea6a1

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功