提交 · 5451646467436695f90a23274845c8d54d950f19 · openeuler / Kernel

15 10月, 2019 4 次提交

drm/i915/perf: allow holding preemption on filtered ctx · 9cd20ef7

由 Lionel Landwerlin 提交于 10月 14, 2019

We would like to make use of perf in Vulkan. The Vulkan API is much
lower level than OpenGL, with applications directly exposed to the
concept of command buffers (pretty much equivalent to our batch
buffers). In Vulkan, queries are always limited in scope to a command
buffer. In OpenGL, the lack of command buffer concept meant that
queries' duration could span multiple command buffers.

With that restriction gone in Vulkan, we would like to simplify
measuring performance just by measuring the deltas between the counter
snapshots written by 2 MI_RECORD_PERF_COUNT commands, rather than the
more complex scheme we currently have in the GL driver, using 2
MI_RECORD_PERF_COUNT commands and doing some post processing on the
stream of OA reports, coming from the global OA buffer, to remove any
unrelated deltas in between the 2 MI_RECORD_PERF_COUNT.

Disabling preemption only apply to a single context with which want to
query performance counters for and is considered a privileged
operation, by default protected by CAP_SYS_ADMIN. It is possible to
enable it for a normal user by disabling the paranoid stream setting.

v2: Store preemption setting in intel_context (Chris)

v3: Use priorities to avoid preemption rather than the HW mechanism

v4: Just modify the port priority reporting function

v5: Add nopreempt flag on gem context and always flag requests
appropriately, regarless of OA reconfiguration.

Link: https://gitlab.freedesktop.org/mesa/mesa/merge_requests/932Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191014201404.22468-4-chris@chris-wilson.co.uk

9cd20ef7

drm/i915/perf: Allow dynamic reconfiguration of the OA stream · 7831e9a9

由 Chris Wilson 提交于 10月 14, 2019

Introduce a new perf_ioctl command to change the OA configuration of the
active stream. This allows the OA stream to be reconfigured between
batch buffers, giving greater flexibility in sampling. We inject a
request into the OA context to reconfigure the stream asynchronously on
the GPU in between and ordered with execbuffer calls.

Original patch for dynamic reconfiguration by Lionel Landwerlin.

Link: https://gitlab.freedesktop.org/mesa/mesa/merge_requests/932Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191014201404.22468-3-chris@chris-wilson.co.uk

7831e9a9

drm/i915: add support for perf configuration queries · 4f6ccc74

由 Lionel Landwerlin 提交于 10月 14, 2019

Listing configurations at the moment is supported only through sysfs.
This might cause issues for applications wanting to list
configurations from a container where sysfs isn't available.

This change adds a way to query the number of configurations and their
content through the i915 query uAPI.

v2: Fix sparse warnings (Lionel)
    Add support to query configuration using uuid (Lionel)

v3: Fix some inconsistency in uapi header (Lionel)
    Fix unlocking when not locked issue (Lionel)
    Add debug messages (Lionel)

v4: Fix missing unlock (Dan)

v5: Drop lock when copying config content to userspace (Chris)

v6: Drop lock when copying config list to userspace (Chris)
    Fix deadlock when calling i915_perf_get_oa_config() under
    perf.metrics_lock (Lionel)
    Add i915_oa_config_get() (Chris)

Link: https://gitlab.freedesktop.org/mesa/mesa/merge_requests/932Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191014201404.22468-2-chris@chris-wilson.co.uk

4f6ccc74

drm/i915/perf: introduce a versioning of the i915-perf uapi · b8d49f28

由 Lionel Landwerlin 提交于 10月 14, 2019

Reporting this version will help application figure out what level of
the support the running kernel provides.

v2: Add i915_perf_ioctl_version() (Chris)
Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191014201404.22468-1-chris@chris-wilson.co.uk

b8d49f28

21 9月, 2019 1 次提交

drm/i915/tgl: s/ss/eu fuse reading support · 601734f7

由 Daniele Ceraolo Spurio 提交于 9月 13, 2019

Gen12 has dual-subslices (DSS), which compared to gen11 subslices have
some duplicated resources/paths. Although DSS behave similarly to 2
subslices, instead of splitting this and presenting userspace with bits
not directly representative of hardware resources, present userspace
with a subslice_mask made up of DSS bits instead.

v2: GEM_BUG_ON on mask size (Lionel)

Bspec: 29547
Bspec: 12247
Cc: Kelvin Gardiner <kelvin.gardiner@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
CC: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com> #v1
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: NJames Ausmus <james.ausmus@intel.com>
Signed-off-by: NOscar Mateo <oscar.mateo@intel.com>
Signed-off-by: NSudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: NStuart Summers <stuart.summers@intel.com>
Signed-off-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
Acked-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: NLucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190913075137.18476-2-chris@chris-wilson.co.ukSigned-off-by: NChris Wilson <chris@chris-wilson.co.uk>

601734f7

04 7月, 2019 1 次提交

drm/i915: Show support for accurate sw PMU busyness tracking · bf73fc0f

由 Chris Wilson 提交于 7月 03, 2019

Expose whether or not we support the PMU software tracking in our
scheduler capabilities, so userspace can query at runtime.

v2: Use I915_SCHEDULER_CAP_ENGINE_BUSY_STATS for a less ambiguous
capability name.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190703143702.11339-1-chris@chris-wilson.co.uk

bf73fc0f

22 5月, 2019 9 次提交

drm/i915: Engine discovery query · c5d3e39c

由 Tvrtko Ursulin 提交于 5月 22, 2019

Engine discovery query allows userspace to enumerate engines, probe their
configuration features, all without needing to maintain the internal PCI
ID based database.

A new query for the generic i915 query ioctl is added named
DRM_I915_QUERY_ENGINE_INFO, together with accompanying structure
drm_i915_query_engine_info. The address of latter should be passed to the
kernel in the query.data_ptr field, and should be large enough for the
kernel to fill out all known engines as struct drm_i915_engine_info
elements trailing the query.

As with other queries, setting the item query length to zero allows
userspace to query minimum required buffer size.

Enumerated engines have common type mask which can be used to query all
hardware engines, versus engines userspace can submit to using the execbuf
uAPI.

Engines also have capabilities which are per engine class namespace of
bits describing features not present on all engine instances.

v2:
 * Fixed HEVC assignment.
 * Reorder some fields, rename type to flags, increase width. (Lionel)
 * No need to allocate temporary storage if we do it engine by engine.
   (Lionel)

v3:
 * Describe engine flags and mark mbz fields. (Lionel)
 * HEVC only applies to VCS.

v4:
 * Squash SFC flag into main patch.
 * Tidy some comments.

v5:
 * Add uabi_ prefix to engine capabilities. (Chris Wilson)
 * Report exact size of engine info array. (Chris Wilson)
 * Drop the engine flags. (Joonas Lahtinen)
 * Added some more reserved fields.
 * Move flags after class/instance.

v6:
 * Do not check engine info array was zeroed by userspace but zero the
   unused fields for them instead.

v7:
 * Simplify length calculation loop. (Lionel Landwerlin)

v8:
 * Remove MBZ comments where not applicable.
 * Rename ABI flags to match engine class define naming.
 * Rename SFC ABI flag to reflect it applies to VCS and VECS.
 * SFC is wired to even _logical_ engine instances.
 * SFC applies to VCS and VECS.
 * HEVC is present on all instances on Gen11. (Tony)
 * Simplify length calculation even more. (Chris Wilson)
 * Move info_ptr assigment closer to loop for clarity. (Chris Wilson)
 * Use vdbox_sfc_access from runtime info.
 * Rebase for RUNTIME_INFO.
 * Refactor for lower indentation.
 * Rename uAPI class/instance to engine_class/instance to avoid C++
   keyword.

v9:
 * Rebase for s/num_rings/num_engines/ in RUNTIME_INFO.

v10:
 * Use new copy_query_item.

v11:
 * Consolidate with struct i915_engine_class_instnace.
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tony Ye <tony.ye@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> # v7
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190522090054.6007-1-tvrtko.ursulin@linux.intel.com

c5d3e39c

drm/i915: Allow specification of parallel execbuf · a88b6e4c