提交 · 6352219c39c04ed3f9a8d1cf93f87c21753a213e · openeuler / Kernel

04 4月, 2020 1 次提交

drm/i915/perf: Do not clear pollin for small user read buffers · 6352219c

由 Ashutosh Dixit 提交于 4月 02, 2020

It is wrong to block the user thread in the next poll when OA data is
already available which could not fit in the user buffer provided in
the previous read. In several cases the exact user buffer size is not
known. Blocking user space in poll can lead to data loss when the
buffer size used is smaller than the available data.

This change fixes this issue and allows user space to read all OA data
even when using a buffer size smaller than the available data using
multiple non-blocking reads rather than staying blocked in poll till
the next timer interrupt.

v2: Fix ret value for blocking reads (Umesh)
v3: Mistake during patch send (Ashutosh)
v4: Remove -EAGAIN from comment (Umesh)
v5: Improve condition for clearing pollin and return (Lionel)
v6: Improve blocking read loop and other cleanups (Lionel)
v7: Added Cc stable

Testcase: igt/perf/polling-small-buf
Reviewed-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: NAshutosh Dixit <ashutosh.dixit@intel.com>
Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200403010120.3067-1-ashutosh.dixit@intel.com

6352219c

31 3月, 2020 2 次提交

drm/i915/perf: don't read head/tail pointers outside critical section · d16e137e

由 Lionel Landwerlin 提交于 3月 30, 2020

Reading or writing those fields should only happen under
stream->oa_buffer.ptr_lock.
Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: d1df41eb ("drm/i915/perf: rework aging tail workaround")
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Acked-by: NAshutosh Dixit <ashutosh.dixit@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200330091411.37357-1-lionel.g.landwerlin@intel.com

d16e137e

drm/i915/perf: Schedule oa_config after modifying the contexts · d7d50f80

由 Chris Wilson 提交于 3月 27, 2020

We wish that the scheduler emit the context modification commands prior
to enabling the oa_config, for which we must explicitly inform it of the
ordering constraints. This is especially important as we now wait for
the final oa_config setup to be completed and as this wait may be on a
distinct context to the state modifications, we need that command packet
to be always last in the queue.

We borrow the i915_active for its ability to track multiple timelines
and the last dma_fence on each; a flexible dma_resv. Keeping track of
each dma_fence is important for us so that we can efficiently schedule
the requests and reprioritise as required.
Reported-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200327112212.16046-3-chris@chris-wilson.co.uk

d7d50f80

27 3月, 2020 3 次提交

drm/i915/perf: add new open param to configure polling of OA buffer · 4ef10fe0

由 Lionel Landwerlin 提交于 3月 24, 2020

This new parameter let's the application choose how often the OA
buffer should be checked on the CPU side for data availability. Longer
polling period tend to reduce CPU overhead if the application does not
care about somewhat real time data collection.

v2: Allow disabling polling completely with 0 value (Lionel)
v3: Version the new parameter (Joonas)
v4: Rebase (Umesh)
v5: Make poll delay value of 0 invalid (Umesh)
v6:
- Describe poll_oa_period (Ashutosh)
- Fix comment for new poll parameter (Lionel)
- Drop open_flags in read_properties_unlocked (Lionel)
- Rename uapi parameter (Ashutosh)
v7: Reword the comment in uapi (Ashutosh)
Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: NUmesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: NAshutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200324185457.14635-4-umesh.nerlige.ramappa@intel.com

4ef10fe0

drm/i915/perf: move pollin setup to non hw specific code · c51dbc6e

由 Lionel Landwerlin 提交于 3月 24, 2020

This isn't really gen specific stuff, so just move it to the common
code.

v2: Rebase (Umesh)
v3: Remove comment, pollin is a per stream state already (Ashutosh)
Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: NUmesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: NAshutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200324185457.14635-3-umesh.nerlige.ramappa@intel.com

c51dbc6e

drm/i915/perf: rework aging tail workaround · d1df41eb

由 Lionel Landwerlin 提交于 3月 24, 2020

We're about to introduce an options to open the perf stream, giving
the user ability to configure how often it wants the kernel to poll
the OA registers for available data.

Right now the workaround against the OA tail pointer race condition
requires at least twice the internal kernel polling timer to make any
data available.

This changes introduce checks on the OA data written into the circular
buffer to make as much data as possible available on the first
iteration of the polling timer.

v2: Use OA_TAKEN macro without the gtt_offset (Lionel)
v3: (Umesh)
- Rebase
- Change report to report32 from below review
  https://patchwork.freedesktop.org/patch/330704/?series=66697&rev=1
v4: (Ashutosh, Lionel)
- Fix checkpatch errors
- Fix aging_timestamp initialization
- Check for only one valid landed report
- Fix check for unlanded report
v5: (Ashutosh)
- Fix bug in accurately determining landed report.
- Optimize the check for landed reports by going as far as the
  previously determined aged tail.
Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: NUmesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: NAshutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200324185457.14635-2-umesh.nerlige.ramappa@intel.com

d1df41eb

18 3月, 2020 1 次提交

drm/i915/perf: Invalidate OA TLB on when closing perf stream · a639b0c1

由 Umesh Nerlige Ramappa 提交于 3月 09, 2020

On running several back to back perf capture sessions involving closing
and opening the perf stream, invalid OA reports are seen in the
beginning of the OA buffer in some sessions. Fix this by invalidating OA
TLB when the perf stream is closed or disabled on gen12.
Signed-off-by: NUmesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 00a7f0d7 ("drm/i915/tgl: Add perf support on TGL")
Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200309211057.38575-1-umesh.nerlige.ramappa@intel.com

a639b0c1

17 3月, 2020 3 次提交

drm/i915/perf: introduce global sseu pinning · 11ecbddd

由 Lionel Landwerlin 提交于 3月 17, 2020

On Gen11 powergating half the execution units is a functional
requirement when using the VME samplers. Not fullfilling this
requirement can lead to hangs.

This unfortunately plays fairly poorly with the NOA requirements. NOA
requires a stable power configuration to maintain its configuration.

As a result using OA (and NOA feeding into it) so far has required us
to use a power configuration that can work for all contexts. The only
power configuration fullfilling this is powergating half the execution
units.

This makes performance analysis for 3D workloads somewhat pointless.

Failing to find a solution that would work for everybody, this change
introduces a new i915-perf stream open parameter that punts the
decision off to userspace. If this parameter is omitted, the existing
Gen11 behavior remains (half EU array powergating).

This change takes the initiative to move all perf related sseu
configuration into i915_perf.c

v2: Make parameter priviliged if different from default

v3: Fix context modifying its sseu config while i915-perf is enabled

v4: Always consider global sseu a privileged operation (Tvrtko)
Override req_sseu point in intel_sseu_make_rpcs() (Tvrtko)
Remove unrelated changes (Tvrtko)

v5: Some typos (Tvrtko)
Process sseu param in read_properties_unlocked() (Tvrtko)

v6: Actually commit the bits from v5...
Fixup some checkpath warnings

v7: Only compare engine uabi field (Chris)
Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200317132222.2638719-3-lionel.g.landwerlin@intel.com

11ecbddd

drm/i915/perf: remove redundant power configuration register override · 371aba6e

由 Lionel Landwerlin 提交于 3月 17, 2020

The caller of i915_oa_init_reg_state() already sets this.
Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200317132222.2638719-2-lionel.g.landwerlin@intel.com

371aba6e

drm/i915/perf: remove generated code · 9aba9c18

由 Lionel Landwerlin 提交于 3月 17, 2020

A little bit of history :

   Back when i915-perf was introduced (4.13), there was no way to
   dynamically add new OA configurations to i915. Only the generated
   configs baked in at build time were allowed.

   It quickly became obvious that we would need to allow applications
   to upload their own configurations, for instance to be able to test
   new ones, and so by the next stable version (4.14) we added uAPIs
   to allow uploading new configurations.

   When adding that capability, we took the opportunity to remove most
   HW configurations except the TestOa one which is a configuration
   IGT would rely on to verify that the HW is outputting correct
   values. At the time it made sense to have that confiuration in at
   the same time a given HW platform added to the i915-perf driver.

Now that IGT has become the reference point for HW configurations (see
commit 53f8f541ca ("lib: Add i915_perf library"), previously this was
located in the GPUTop repository), the need for having those
configurations in i915-perf is gone.

On the Mesa side, we haven't relied on this test configuration for a
while. The MDAPI library always required 4.14 feature level and always
loaded its configuration into i915.

I'm sure nobody will miss this generated stuff in i915 :)

v2: Fix selftests by creating an empty config

v3: Fix unlocking on allocation error (Dan Carpenter)

v4: Fixup checkpatch warnings

v5: Fix incorrect unlock in error path (Umesh)
Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: NUmesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200317132222.2638719-1-lionel.g.landwerlin@intel.com

9aba9c18

03 3月, 2020 1 次提交

drm/i915/perf: Reintroduce wait on OA configuration completion · 4b4e973d

由 Chris Wilson 提交于 3月 02, 2020

We still need to wait for the initial OA configuration to happen
before we enable OA report writes to the OA buffer.
Reported-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 15d0ace1 ("drm/i915/perf: execute OA configuration from command stream")
Closes: https://gitlab.freedesktop.org/drm/intel/issues/1356
Testcase: igt/perf/stream-open-close
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200302085812.4172450-7-chris@chris-wilson.co.uk

4b4e973d

28 2月, 2020 2 次提交

drm/i915/perf: Manually acquire engine-wakeref around use of kernel_context · d236e2ac

由 Chris Wilson 提交于 2月 27, 2020

The engine->kernel_context is a special case for request emission. Since
it is used as the barrier within the engine's wakeref, we must acquire the
wakeref before submitting a request to the kernel_context.
Reported-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200227085723.1961649-3-chris@chris-wilson.co.uk

d236e2ac

drm/i915/perf: Mark up the racy use of perf->exclusive_stream · a5af081d

由 Chris Wilson 提交于 2月 27, 2020

Inside the general i915_oa_init_reg_state() we avoid using the
perf->mutex. However, we rely on perf->exclusive_stream being valid to
access at that point, and for that we have to control the race with
disabling perf. This relies on the disabling being a heavy barrier that
inspects all active contexts, after marking the perf->exclusive_stream
as not available. This should ensure that there are no more concurrent
accesses to the perf->exclusive_stream as we destroy it.

Mark up the races around the perf->exclusive_stream so that they stand
out much more. (And hopefully we will be running kcsan to start
validating that the only races we have are carefully controlled.)
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200227085723.1961649-2-chris@chris-wilson.co.uk

a5af081d

21 2月, 2020 1 次提交

drm/i915/perf: conversion to struct drm_device based logging macros. · 0bf85735

由 Wambui Karuga 提交于 2月 18, 2020

Manual conversion of instances of printk based drm logging macros to the
struct drm_device based logging macros in i915/i915_perf.c.
Also involves extraction of the struct drm_i915_private device from
various intel types for use in the macros.

Instances of the DRM_DEBUG printk macro were not converted due to the
lack of an analogous struct drm_device based logging macro.

v2: remove instances of DRM_DEBUG that were converted.

References: https://lists.freedesktop.org/archives/dri-devel/2020-January/253381.htmlSigned-off-by: NWambui Karuga <wambui.karugax@gmail.com>
Signed-off-by: NJani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200218173936.19664-1-wambui.karugax@gmail.com

0bf85735

04 2月, 2020 1 次提交

drm/i915: conversion to drm_device logging macros when drm_i915_private is present. · 00376ccf

由 Wambui Karuga 提交于 1月 31, 2020

Converts various instances of the printk drm logging macros to the
struct drm_device based logging macros in the drm/i915 folder using the
following coccinelle script that transforms based on the existence of
the struct drm_i915_private device pointer:
@@
identifier fn, T;
@@

fn(...) {
...
struct drm_i915_private *T = ...;
<+...
(
-DRM_INFO(
+drm_info(&T->drm,
...)
|
-DRM_ERROR(
+drm_err(&T->drm,
...)
|
-DRM_WARN(
+drm_warn(&T->drm,
...)
|
-DRM_DEBUG_KMS(
+drm_dbg_kms(&T->drm,
...)
|
-DRM_DEBUG_DRIVER(
+drm_dbg(&T->drm,
...)
|
-DRM_DEBUG_ATOMIC(
+drm_dbg_atomic(&T->drm,
...)
)
...+>
}

@@
identifier fn, T;
@@

fn(...,struct drm_i915_private *T,...) {
<+...
(
-DRM_INFO(
+drm_info(&T->drm,
...)
|
-DRM_ERROR(
+drm_err(&T->drm,
...)
|
-DRM_WARN(
+drm_warn(&T->drm,
...)
|
-DRM_DEBUG_DRIVER(
+drm_dbg(&T->drm,
...)
|
-DRM_DEBUG_KMS(
+drm_dbg_kms(&T->drm,
...)
|
-DRM_DEBUG_ATOMIC(
+drm_dbg_atomic(&T->drm,
...)
)
...+>
}

Checkpatch warnings were fixed manually.

Instances of the DRM_DEBUG macro were not converted due to lack of a
consensus of an analogous struct drm_device based macro.

References: https://lists.freedesktop.org/archives/dri-devel/2020-January/253381.htmlSigned-off-by: NWambui Karuga <wambui.karugax@gmail.com>
Signed-off-by: NJani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200131093416.28431-2-wambui.karugax@gmail.com

00376ccf

28 1月, 2020 1 次提交

drm/i915/perf: Fix OA context id overlap with idle context id · 6f280b13

由 Umesh Nerlige Ramappa 提交于 1月 23, 2020

Engine context pinned in perf OA was set to same context id as
the idle context. Set the context id to an unused value.

Clear the sw context id field in lrc descriptor before ORing with
ce->tag (Chris)

Closes: https://gitlab.freedesktop.org/drm/intel/issues/756Signed-off-by: NUmesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200124013701.40609-1-umesh.nerlige.ramappa@intel.com

6f280b13

22 1月, 2020 1 次提交

drm/i915: Make WARN* drm specific where uncore or stream ptr is available · a9f236d1

由 Pankaj Bharadiya 提交于 1月 15, 2020

drm specific WARN* calls include device information in the
backtrace, so we know what device the warnings originate from.

Covert all the calls of WARN* with device specific drm_WARN*
variants in functions where intel_uncore/i915_perf_stream  struct
pointer is readily available.

The conversion was done automatically with below coccinelle semantic
patch. checkpatch errors/warnings are fixed manually.

@@
identifier func, T;
@@
func(...) {
...
struct intel_uncore *T = ...;
<...
(
-WARN(
+drm_WARN(&T->i915->drm,
...)
|
-WARN_ON(
+drm_WARN_ON(&T->i915->drm,
...)
|
-WARN_ONCE(
+drm_WARN_ONCE(&T->i915->drm,
...)
|
-WARN_ON_ONCE(
+drm_WARN_ON_ONCE(&T->i915->drm,
...)
)
...>

}

@@
identifier func, T;
@@
func(struct intel_uncore *T,...) {
<...
(
-WARN(
+drm_WARN(&T->i915->drm,
...)
|
-WARN_ON(
+drm_WARN_ON(&T->i915->drm,
...)
|
-WARN_ONCE(
+drm_WARN_ONCE(&T->i915->drm,
...)
|
-WARN_ON_ONCE(
+drm_WARN_ON_ONCE(&T->i915->drm,
...)
)
...>

}

@@
identifier func, T;
@@
func(struct i915_perf_stream *T,...) {
+struct drm_i915_private *i915 = T->perf->i915;
<+...
(
-WARN(
+drm_WARN(&i915->drm,
...)
|
-WARN_ON(
+drm_WARN_ON(&i915->drm,
...)
|
-WARN_ONCE(
+drm_WARN_ONCE(&i915->drm,
...)
|
-WARN_ON_ONCE(
+drm_WARN_ON_ONCE(&i915->drm,
...)
)
...+>

}

command: ls drivers/gpu/drm/i915/*.c | xargs spatch --sp-file <script> \
					--linux-spacing --in-place
Signed-off-by: NPankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com>
Signed-off-by: NJani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200115034455.17658-11-pankaj.laxminarayan.bharadiya@intel.com

a9f236d1

09 1月, 2020 1 次提交

drm/i915: Pin the context as we work on it · feed5c7b

由 Chris Wilson 提交于 1月 09, 2020

Since we now allow the intel_context_unpin() to run unserialised, we
risk our operations under the intel_context_lock_pinned() being run as
the context is unpinned (and thus invalidating our state). We can
atomically acquire the pin, testing to see if it is pinned in the
process, thus ensuring that the state remains consistent during the
course of the whole operation.

Fixes: 84135022 ("drm/i915/gt: Drop mutex serialisation between context pin/unpin")
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200109085142.871563-1-chris@chris-wilson.co.uk

feed5c7b

22 12月, 2019 1 次提交

drm/i915: Remove i915->kernel_context · e6ba7648

由 Chris Wilson 提交于 12月 21, 2019

Allocate only an internal intel_context for the kernel_context, forgoing
a global GEM context for internal use as we only require a separate
address space (for our own protection).

Now having weaned GT from requiring ce->gem_context, we can stop
referencing it entirely. This also means we no longer have to create random
and unnecessary GEM contexts for internal use.

GEM contexts are now entirely for tracking GEM clients, and intel_context
the execution environment on the GPU.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Acked-by: NAndi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191221160324.1073045-1-chris@chris-wilson.co.uk

e6ba7648

20 12月, 2019 1 次提交

drm/i915: Drop GEM context as a direct link from i915_request · 9f3ccd40

由 Chris Wilson 提交于 12月 20, 2019

Keep the intel_context as being the primary state for i915_request, with
the GEM context a backpointer from the low level state for the rarer
cases we need client information. Our goal is to remove such references
to clients from the backend, and leave the HW submission agnostic to
client interfaces and self-contained.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Reviewed-by: NAndi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191220101230.256839-1-chris@chris-wilson.co.uk

9f3ccd40

14 12月, 2019 1 次提交

drm/i915/perf: Register sysctl path globally · 3dc716fd

由 Venkata Sandeep Dhanalakota 提交于 12月 13, 2019

We do not require to register the sysctl paths per instance,
so making registration global.

v2: make sysctl path register and unregister function driver
    specific (Tvrtko and Lucas).

Cc: Sudeep Dutt <sudeep.dutt@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: NVenkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191213155152.69182-1-venkata.s.dhanalakota@intel.com

3dc716fd

10 12月, 2019 2 次提交

drm/i915/perf: Configure OAR for specific context · 177e876a

由 Umesh Nerlige Ramappa 提交于 12月 06, 2019

Gen12 supports saving/restoring render counters per context. Apply OAR
configuration only for the context that is passed in to perf.

v2:
- Fix OACTXCONTROL value to only stop/resume counters.
- Remove gen12_update_reg_state_unlocked as power state is already
  applied by the caller.

v3: (Lionel)
- Move register initialization into the array
- Assume a valid oa_config in enable_metric_set
Signed-off-by: NUmesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Fixes: 00a7f0d7 ("drm/i915/tgl: Add perf support on TGL")
Reviewed-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191206194339.31356-2-umesh.nerlige.ramappa@intel.com
(cherry picked from commit ccdeed49)
Signed-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>

177e876a

drm/i915/perf: Allow non-privileged access when OA buffer is not sampled · 2a264a0f

由 Umesh Nerlige Ramappa 提交于 12月 06, 2019

SAMPLE_OA_REPORT enables sampling of OA reports from the OA buffer.
Since reports from OA buffer had system wide visibility, collecting
samples from the OA buffer was a privileged operation on previous
platforms. Prior to TGL, it was also necessary to sample the OA buffer
to normalize reports from MI REPORT PERF COUNT.

TGL has a dedicated OAR unit to sample perf reports for a specific
render context. This removes the necessity to sample OA buffer.

- If not sampling the OA buffer, allow non-privileged access. An earlier
  patch allows the non-privilege access:
  https://patchwork.freedesktop.org/patch/337716/?series=68582&rev=1
- Clear up the path for non-privileged access in this patch
Signed-off-by: NUmesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Fixes: 00a7f0d7 ("drm/i915/tgl: Add perf support on TGL")
Reviewed-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191206194339.31356-1-umesh.nerlige.ramappa@intel.com
(cherry picked from commit 322d56aa)
Signed-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>

2a264a0f

09 12月, 2019 2 次提交

drm/i915/perf: Configure OAR for specific context · ccdeed49

由 Umesh Nerlige Ramappa 提交于 12月 06, 2019

Gen12 supports saving/restoring render counters per context. Apply OAR
configuration only for the context that is passed in to perf.

v2:
- Fix OACTXCONTROL value to only stop/resume counters.
- Remove gen12_update_reg_state_unlocked as power state is already
  applied by the caller.

v3: (Lionel)
- Move register initialization into the array
- Assume a valid oa_config in enable_metric_set
Signed-off-by: NUmesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Fixes: 00a7f0d7 ("drm/i915/tgl: Add perf support on TGL")
Reviewed-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191206194339.31356-2-umesh.nerlige.ramappa@intel.com

ccdeed49

drm/i915/perf: Allow non-privileged access when OA buffer is not sampled · 322d56aa

由 Umesh Nerlige Ramappa 提交于 12月 06, 2019

SAMPLE_OA_REPORT enables sampling of OA reports from the OA buffer.
Since reports from OA buffer had system wide visibility, collecting
samples from the OA buffer was a privileged operation on previous
platforms. Prior to TGL, it was also necessary to sample the OA buffer
to normalize reports from MI REPORT PERF COUNT.

TGL has a dedicated OAR unit to sample perf reports for a specific
render context. This removes the necessity to sample OA buffer.

- If not sampling the OA buffer, allow non-privileged access. An earlier
  patch allows the non-privilege access:
  https://patchwork.freedesktop.org/patch/337716/?series=68582&rev=1
- Clear up the path for non-privileged access in this patch
Signed-off-by: NUmesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Fixes: 00a7f0d7 ("drm/i915/tgl: Add perf support on TGL")
Reviewed-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191206194339.31356-1-umesh.nerlige.ramappa@intel.com

322d56aa

04 12月, 2019 1 次提交

drm/i915/perf: drop pointless static qualifier in i915_perf_add_config_ioctl() · c415ef2a

由 Mao Wenan 提交于 12月 04, 2019

There is no need to have the 'T *v' variable static
since new value always be assigned before use it.
Reported-by: NHulk Robot <hulkci@huawei.com>
Signed-off-by: NMao Wenan <maowenan@huawei.com>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191204010154.152396-1-maowenan@huawei.com

c415ef2a

25 11月, 2019 1 次提交

drm/i915: Serialise with engine-pm around requests on the kernel_context · de5825be

由 Chris Wilson 提交于 11月 25, 2019

As the engine->kernel_context is used within the engine-pm barrier, we
have to be careful when emitting requests outside of the barrier, as the
strict timeline locking rules do not apply. Instead, we must ensure the
engine_park() cannot be entered as we build the request, which is
simplest by taking an explicit engine-pm wakeref around the request
construction.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191125105858.1718307-1-chris@chris-wilson.co.uk

de5825be

18 11月, 2019 1 次提交

drm/i915/perf: don't forget noa wait after oa config · 7e89d508

由 Lionel Landwerlin 提交于 11月 13, 2019

I'm observing incoherence metric values, changing from run to run.

It appears the patches introducing noa wait & reconfiguration from
command stream switched places in the series multiple times during the
review. This lead to the dependency of one onto the order to go
missing...
Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 15d0ace1 ("drm/i915/perf: execute OA configuration from command stream")
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191113154639.27144-1-lionel.g.landwerlin@intel.com
(cherry picked from commit 93937659)
Signed-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>

7e89d508

16 11月, 2019 1 次提交

drm/i915/perf: Add preemption check while waiting for OA · dd590f68

由 Lionel Landwerlin 提交于 11月 14, 2019

While we're waiting for the OA configuration to apply, let's give a
chance to other contexts that might need to run other workloads.
Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Suggested-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191114140224.21818-1-lionel.g.landwerlin@intel.com

dd590f68

14 11月, 2019 1 次提交

drm/i915/perf: don't forget noa wait after oa config · 93937659

由 Lionel Landwerlin 提交于 11月 13, 2019

I'm observing incoherence metric values, changing from run to run.

It appears the patches introducing noa wait & reconfiguration from
command stream switched places in the series multiple times during the
review. This lead to the dependency of one onto the order to go
missing...
Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 15d0ace1 ("drm/i915/perf: execute OA configuration from command stream")
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191113154639.27144-1-lionel.g.landwerlin@intel.com

93937659

12 11月, 2019 1 次提交

drm/i915/perf: always consider holding preemption a privileged op · 2b3c7f0d

由 Lionel Landwerlin 提交于 11月 11, 2019

The ordering of the checks in the existing code can lead to holding
preemption not being considered as privileged op.
Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 9cd20ef7 ("drm/i915/perf: allow holding preemption on filtered ctx")
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191111095308.2550-1-lionel.g.landwerlin@intel.com
(cherry picked from commit 0b0120d4)
Signed-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>

2b3c7f0d

11 11月, 2019 1 次提交

drm/i915/perf: always consider holding preemption a privileged op · 0b0120d4

由 Lionel Landwerlin 提交于 11月 11, 2019

The ordering of the checks in the existing code can lead to holding
preemption not being considered as privileged op.
Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 9cd20ef7 ("drm/i915/perf: allow holding preemption on filtered ctx")
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191111095308.2550-1-lionel.g.landwerlin@intel.com

0b0120d4

02 11月, 2019 1 次提交

drm/i915/perf: Reverse a ternary to make sparse happy · 9278bbb6

由 Chris Wilson 提交于 11月 01, 2019

Avoid

drivers/gpu/drm/i915/i915_perf.c:2442:85: warning: dubious: x | !y

simply by inverting the predicate and reversing the ternary.

v2: Move the long lines into their own function so there is no confusion
on operator precedence.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: NUmesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191101192116.12647-1-chris@chris-wilson.co.uk

9278bbb6

29 10月, 2019 3 次提交

drm/i915/tgl: Add perf support on TGL · 00a7f0d7

由 Lionel Landwerlin 提交于 10月 25, 2019

The design of the OA unit has been split into several units. We now
have a global unit (OAG) and a render specific unit (OAR). This leads
to some changes on how we program things. Some details :

OAR:
  - has its own set of counter registers, they are per-context
    saved/restored
  - counters are not written to the circular OA buffer
  - a snapshot of the counters can be acquired with
    MI_RECORD_PERF_COUNT, or a single counter can be read with
    MI_STORE_REGISTER_MEM.

OAG:
  - has global counters that increment across context switches
  - counters are written into the circular OA buffer (if requested)

v2: Fix checkpatch warnings on code style (Lucas)
v3: (Umesh)
  - Update register from which tail, status and head are read
  - Update logic to sample context reports
  - Update whitelist mux and b counter regs
v4: Fix a bug when updating context image for new contexts (Umesh)
v5: Squash patch enabling save/restore of counters into context image

    We want this so we can preempt performance queries and keep the
    system responsive even when long running queries are ongoing. We
    avoid doing it for all contexts.

    - use LRI to modify context control (Chris)
    - use MASKED_FIELD to program just the masked bits (Chris)
    - disable save/restore of counters on cleanup (Chris)
v6: Do not use implicit parameters (Chris)

BSpec: 28727, 30021
Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: NUmesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Signed-off-by: NLucas De Marchi <lucas.demarchi@intel.com>
Acked-by: NChris Wilson <chris.p.wilson@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191025193746.47155-2-umesh.nerlige.ramappa@intel.com

00a7f0d7

drm/i915/perf: Add helper macros for comparing with whitelisted registers · fc215230

由 Umesh Nerlige Ramappa 提交于 10月 25, 2019

Add helper macros for range and equality comparisons and use them to
check with whitelisted registers in oa configurations.
Signed-off-by: NUmesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191025193746.47155-1-umesh.nerlige.ramappa@intel.com

fc215230

drm/i915/execlists: Use vfunc to check engine submission mode · 19c17b76

由 Michal Wajdeczko 提交于 10月 28, 2019

While processing CSB there is no need to look at GuC submission
settings, just check if engine is configured for execlists mode.

While today GuC submission is disabled it's settings are still
based on modparam values that might not correctly reflect actual
submission status in case of any fallback. Until that is fully
fixed, use alternate method to confirm that engine really runs in
execlists mode by comparing set_default_submission vfunc.

v2: add other immediate use of new helper
Signed-off-by: NMichal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Reviewed-by: NJanusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191028164520.31772-1-michal.wajdeczko@intel.com

19c17b76

24 10月, 2019 1 次提交

drm/i915/gt: Split intel_ring_submission · 2871ea85

由 Chris Wilson 提交于 10月 24, 2019

Split the legacy submission backend from the common CS ring buffer
handling.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191024100344.5041-1-chris@chris-wilson.co.uk

2871ea85

22 10月, 2019 1 次提交

drm/i915: Drop assertion that ce->pin_mutex guards state updates · 0587152b

由 Chris Wilson 提交于 10月 22, 2019

The actual conditions are that we know the GPU is not accessing the
context, and we hold a pin on the context image to allow CPU access. We
used a fake lock on ce->pin_mutex so that we could try and use lockdep
to assert that access is serialised, but the various different
hardirq/softirq contexts where we need to *fake* holding the pin_mutex
are causing more trouble.

Still it would be nice if we did have a way to reassure ourselves that
the direct update to the context image is serialised with GPU execution.
In the meantime, stop lockdep complaining about false irq inversions.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111923Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Acked-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191022122845.25038-1-chris@chris-wilson.co.uk

0587152b

20 10月, 2019 1 次提交

drm/i915/perf: fix oa config reconfiguration · 8814c6d0

由 Lionel Landwerlin 提交于 10月 20, 2019

The current logic just reapplies the same configuration already stored
into stream->oa_config instead of the newly selected one.
Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 7831e9a9 ("drm/i915/perf: Allow dynamic reconfiguration of the OA stream")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191019214647.27866-1-lionel.g.landwerlin@intel.com

8814c6d0

15 10月, 2019 1 次提交

drm/i915/perf: allow holding preemption on filtered ctx · 9cd20ef7

由 Lionel Landwerlin 提交于 10月 14, 2019

We would like to make use of perf in Vulkan. The Vulkan API is much
lower level than OpenGL, with applications directly exposed to the
concept of command buffers (pretty much equivalent to our batch
buffers). In Vulkan, queries are always limited in scope to a command
buffer. In OpenGL, the lack of command buffer concept meant that
queries' duration could span multiple command buffers.

With that restriction gone in Vulkan, we would like to simplify
measuring performance just by measuring the deltas between the counter
snapshots written by 2 MI_RECORD_PERF_COUNT commands, rather than the
more complex scheme we currently have in the GL driver, using 2
MI_RECORD_PERF_COUNT commands and doing some post processing on the
stream of OA reports, coming from the global OA buffer, to remove any
unrelated deltas in between the 2 MI_RECORD_PERF_COUNT.

Disabling preemption only apply to a single context with which want to
query performance counters for and is considered a privileged
operation, by default protected by CAP_SYS_ADMIN. It is possible to
enable it for a normal user by disabling the paranoid stream setting.

v2: Store preemption setting in intel_context (Chris)

v3: Use priorities to avoid preemption rather than the HW mechanism

v4: Just modify the port priority reporting function

v5: Add nopreempt flag on gem context and always flag requests
appropriately, regarless of OA reconfiguration.

Link: https://gitlab.freedesktop.org/mesa/mesa/merge_requests/932Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191014201404.22468-4-chris@chris-wilson.co.uk

9cd20ef7

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功