- 15 10月, 2019 4 次提交
-
-
由 Lionel Landwerlin 提交于
We would like to make use of perf in Vulkan. The Vulkan API is much lower level than OpenGL, with applications directly exposed to the concept of command buffers (pretty much equivalent to our batch buffers). In Vulkan, queries are always limited in scope to a command buffer. In OpenGL, the lack of command buffer concept meant that queries' duration could span multiple command buffers. With that restriction gone in Vulkan, we would like to simplify measuring performance just by measuring the deltas between the counter snapshots written by 2 MI_RECORD_PERF_COUNT commands, rather than the more complex scheme we currently have in the GL driver, using 2 MI_RECORD_PERF_COUNT commands and doing some post processing on the stream of OA reports, coming from the global OA buffer, to remove any unrelated deltas in between the 2 MI_RECORD_PERF_COUNT. Disabling preemption only apply to a single context with which want to query performance counters for and is considered a privileged operation, by default protected by CAP_SYS_ADMIN. It is possible to enable it for a normal user by disabling the paranoid stream setting. v2: Store preemption setting in intel_context (Chris) v3: Use priorities to avoid preemption rather than the HW mechanism v4: Just modify the port priority reporting function v5: Add nopreempt flag on gem context and always flag requests appropriately, regarless of OA reconfiguration. Link: https://gitlab.freedesktop.org/mesa/mesa/merge_requests/932Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191014201404.22468-4-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Introduce a new perf_ioctl command to change the OA configuration of the active stream. This allows the OA stream to be reconfigured between batch buffers, giving greater flexibility in sampling. We inject a request into the OA context to reconfigure the stream asynchronously on the GPU in between and ordered with execbuffer calls. Original patch for dynamic reconfiguration by Lionel Landwerlin. Link: https://gitlab.freedesktop.org/mesa/mesa/merge_requests/932Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191014201404.22468-3-chris@chris-wilson.co.uk
-
由 Lionel Landwerlin 提交于
Listing configurations at the moment is supported only through sysfs. This might cause issues for applications wanting to list configurations from a container where sysfs isn't available. This change adds a way to query the number of configurations and their content through the i915 query uAPI. v2: Fix sparse warnings (Lionel) Add support to query configuration using uuid (Lionel) v3: Fix some inconsistency in uapi header (Lionel) Fix unlocking when not locked issue (Lionel) Add debug messages (Lionel) v4: Fix missing unlock (Dan) v5: Drop lock when copying config content to userspace (Chris) v6: Drop lock when copying config list to userspace (Chris) Fix deadlock when calling i915_perf_get_oa_config() under perf.metrics_lock (Lionel) Add i915_oa_config_get() (Chris) Link: https://gitlab.freedesktop.org/mesa/mesa/merge_requests/932Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191014201404.22468-2-chris@chris-wilson.co.uk
-
由 Lionel Landwerlin 提交于
Reporting this version will help application figure out what level of the support the running kernel provides. v2: Add i915_perf_ioctl_version() (Chris) Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191014201404.22468-1-chris@chris-wilson.co.uk
-
- 04 10月, 2019 1 次提交
-
-
由 Raymond Smith 提交于
Add the DRM_FORMAT_MOD_ARM_16X16_BLOCK_U_INTERLEAVED modifier to denote the 16x16 block u-interleaved format used in Arm Utgard and Midgard GPUs. Changes from v1:- 1. Reserved the upper four bits (out of the 56 bits assigned to each vendor) to denote the category of Arm specific modifiers. Currently, we have two categories ie AFBC and MISC. Changes from v2:- 1. Preserved Ray's authorship 2. Cleanups/changes suggested by Brian 3. Added r-bs of Brian and Qiang Signed-off-by: NRaymond Smith <raymond.smith@arm.com> Reviewed-by: NBrian Starkey <brian.starkey@arm.com> Reviewed-by: NQiang Yu <yuq825@gmail.com> Signed-off-by: NAyan kumar halder <ayan.halder@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191004141222.22337-1-ayan.halder@arm.com
-
- 03 10月, 2019 1 次提交
-
-
由 Marek Olšák 提交于
UMDs need this for correct programming of harvested chips. Signed-off-by: NMarek Olšák <marek.olsak@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 21 9月, 2019 1 次提交
-
-
由 Daniele Ceraolo Spurio 提交于
Gen12 has dual-subslices (DSS), which compared to gen11 subslices have some duplicated resources/paths. Although DSS behave similarly to 2 subslices, instead of splitting this and presenting userspace with bits not directly representative of hardware resources, present userspace with a subslice_mask made up of DSS bits instead. v2: GEM_BUG_ON on mask size (Lionel) Bspec: 29547 Bspec: 12247 Cc: Kelvin Gardiner <kelvin.gardiner@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> CC: Radhakrishna Sripada <radhakrishna.sripada@intel.com> Cc: Michel Thierry <michel.thierry@intel.com> #v1 Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: NJames Ausmus <james.ausmus@intel.com> Signed-off-by: NOscar Mateo <oscar.mateo@intel.com> Signed-off-by: NSudeep Dutt <sudeep.dutt@intel.com> Signed-off-by: NStuart Summers <stuart.summers@intel.com> Signed-off-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Acked-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: NLucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190913075137.18476-2-chris@chris-wilson.co.ukSigned-off-by: NChris Wilson <chris@chris-wilson.co.uk>
-
- 20 9月, 2019 1 次提交
-
-
由 Iago Toral Quiroga 提交于
Extends the user space ioctl for CL submissions so it can include a request to flush the cache once the CL execution has completed. Fixes memory write violation messages reported by the kernel in workloads involving shader memory writes (SSBOs, shader images, scratch, etc) which sometimes also lead to GPU resets during Piglit and CTS workloads. v2: if v3d_job_init() fails we need to kfree() the job instead of v3d_job_put() it (Eric Anholt). v3 (Eric Anholt): - Drop _FLAG suffix from the new flag name. - Add a new param so userspace can tell whether cache flushing is implemented in the kernel. Signed-off-by: NIago Toral Quiroga <itoral@igalia.com> Reviewed-by: NEric Anholt <eric@anholt.net> Signed-off-by: NEric Anholt <eric@anholt.net> Link: https://patchwork.freedesktop.org/patch/msgid/20190919071016.4578-1-itoral@igalia.com
-
- 15 8月, 2019 1 次提交
-
-
由 Lucas Stach 提交于
With softpin we allow the userspace to take control over the GPU virtual address space. The new capability is relected by a bump of the minor DRM version. There are a few restrictions for userspace to take into account: 1. The kernel reserves a bit of the address space to implement zero page faulting and mapping of the kernel internal ring buffer. Userspace can query the kernel for the first usable GPU VM address via ETNAVIV_PARAM_SOFTPIN_START_ADDR. 2. We only allow softpin on GPUs, which implement proper process separation via PPAS. If softpin is not available the softpin start address will be set to ~0. 3. Softpin is all or nothing. A submit using softpin must not use any address fixups via relocs. Signed-off-by: NLucas Stach <l.stach@pengutronix.de> Reviewed-by: NPhilipp Zabel <p.zabel@pengutronix.de> Reviewed-by: NGuido Günther <agx@sigxcpu.org>
-
- 13 8月, 2019 2 次提交
-
-
由 Rob Herring 提交于
The midgard/bifrost GPUs need to allocate GPU heap memory which is allocated on GPU page faults and not pinned in memory. The vendor driver calls this functionality GROW_ON_GPF. This implementation assumes that BOs allocated with the PANFROST_BO_NOEXEC flag are never mmapped or exported. Both of those may actually work, but I'm unsure if there's some interaction there. It would cause the whole object to be pinned in memory which would defeat the point of this. On faults, we map in 2MB at a time in order to utilize huge pages (if enabled). Currently, once we've mapped pages in, they are only unmapped if the BO is freed. Once we add shrinker support, we can unmap pages with the shrinker. Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com> Cc: Boris Brezillon <boris.brezillon@collabora.com> Cc: Robin Murphy <robin.murphy@arm.com> Acked-by: NAlyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: NSteven Price <steven.price@arm.com> Signed-off-by: NRob Herring <robh@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20190808222200.13176-9-robh@kernel.org
-
由 Rob Herring 提交于
Executable buffers have an alignment restriction that they can't cross 16MB boundary as the GPU program counter is 24-bits. This restriction is currently not handled and we just get lucky. As current userspace assumes all BOs are executable, that has to remain the default. So add a new PANFROST_BO_NOEXEC flag to allow userspace to indicate which BOs are not executable. There is also a restriction that executable buffers cannot start or end on a 4GB boundary. This is mostly avoided as there is only 4GB of space currently and the beginning is already blocked out for NULL ptr detection. Add support to handle this restriction fully regardless of the current constraints. For existing userspace, all created BOs remain executable, but the GPU VA alignment will be increased to the size of the BO. This shouldn't matter as there is plenty of GPU VA space. Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com> Cc: Boris Brezillon <boris.brezillon@collabora.com> Cc: Robin Murphy <robin.murphy@arm.com> Reviewed-by: NSteven Price <steven.price@arm.com> Acked-by: NAlyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Signed-off-by: NRob Herring <robh@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20190808222200.13176-6-robh@kernel.org
-
- 09 8月, 2019 1 次提交
-
-
由 Rob Herring 提交于
Add support for madvise and a shrinker similar to other drivers. This allows userspace to mark BOs which can be freed when there is memory pressure. Unlike other implementations, we don't depend on struct_mutex. The driver maintains a list of BOs which can be freed when the shrinker is called. Access to the list is serialized with the shrinker_lock. Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Acked-by: NAlyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Signed-off-by: NRob Herring <robh@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20190805143358.21245-2-robh@kernel.org
-
- 02 8月, 2019 1 次提交
-
-
由 Felix Kuehling 提交于
This memory allocation flag will be used to indicate BOs containing sensitive data that should not be leaked to other processes. Signed-off-by: NFelix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 26 7月, 2019 1 次提交
-
-
由 Steven Price 提交于
Midgard/Bifrost GPUs have a bunch of feature registers providing details of what the hardware supports. Panfrost already reads these, this patch exports them all to user space so that the jobs created by the user space driver can be tuned for the particular hardware implementation. Signed-off-by: NSteven Price <steven.price@arm.com> Reviewed-by: NAlyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Signed-off-by: NRob Herring <robh@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20190724105626.53552-1-steven.price@arm.com
-
- 23 7月, 2019 1 次提交
-
-
由 Noralf Trønnes 提交于
tinydrm drivers announce DRM_MODE_CONNECTOR_VIRTUAL for its SPI drivers. Add a SPI connector type to match the actual connector. X will list the connector as Unknown: X.Org X Server 1.19.2 Release Date: 2017-03-02 <...> [ 53523.905] (II) modeset(0): Output Unknown19-1 has no monitor section [ 53523.908] (II) modeset(0): EDID for output Unknown19-1 [ 53523.910] (II) modeset(0): Printing probed modes for output Unknown19-1 [ 53523.911] (II) modeset(0): Modeline "320x240"x0.0 0.00 320 320 320 320 240 240 240 240 (0.0 kHz eP) [ 53523.911] (II) modeset(0): Output Unknown19-1 connected [ 53523.912] (II) modeset(0): Using exact sizes for initial modes [ 53523.912] (II) modeset(0): Output Unknown19-1 using initial mode 320x240 +0+0 The weston source shows that it will be listed as UNNAMED. v2: Split patch in core and driver changes, expand commit message (Daniel) Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: NSam Ravnborg <sam@ravnborg.org> Acked-by: NDavid Lechner <david@lechnology.com> Signed-off-by: NNoralf Trønnes <noralf@tronnes.org> Link: https://patchwork.freedesktop.org/patch/msgid/20190719155916.62465-2-noralf@tronnes.org
-
- 17 7月, 2019 1 次提交
-
-
由 Emil Velikov 提交于
Currently the AMDGPU_CTX_PRIORITY_* defines are used in both drm_amdgpu_ctx_in::priority and drm_amdgpu_sched_in::priority. Extend the comment to mention the CAP_SYS_NICE or DRM_MASTER requirement is only applicable with the former. Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: NEmil Velikov <emil.velikov@collabora.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 16 7月, 2019 1 次提交
-
-
git://people.freedesktop.org/~thomash/linux由 Dave Airlie 提交于
This reverts commit 031e610a, reversing changes made to 52d2d44e. The mm changes in there we premature and not fully ack or reviewed by core mm folks, I dropped the ball by merging them via this tree, so lets take em all back out. Signed-off-by: NDave Airlie <airlied@redhat.com>
-
- 04 7月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
Expose whether or not we support the PMU software tracking in our scheduler capabilities, so userspace can query at runtime. v2: Use I915_SCHEDULER_CAP_ENGINE_BUSY_STATS for a less ambiguous capability name. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190703143702.11339-1-chris@chris-wilson.co.uk
-
- 22 6月, 2019 1 次提交
-
-
由 Hawking Zhang 提交于
the initial/default value of pa_sc_tile_steering_override need to be exposed to user mode driver Signed-off-by: NHawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 21 6月, 2019 2 次提交
-
-
由 Huang Rui 提交于
New vram type. Signed-off-by: NHuang Rui <ray.huang@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Huang Rui 提交于
Signed-off-by: NHuang Rui <ray.huang@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 18 6月, 2019 2 次提交
-
-
由 Boris Brezillon 提交于
Expose performance counters through 2 driver specific ioctls: one to enable/disable the perfcnt block, and one to dump the counter values. There are discussions to expose global performance monitors (those counters that can't be retrieved on a per-job basis) in a consistent way, but this is likely to take time to settle on something that works for various HW/users. The ioctls are marked unstable so we can get rid of them when the time comes. We initally went for a debugfs-based interface, but this was making the transition to per-FD address space more complicated (we need to specify the namespace the GPU has to use when dumping the perf counters), hence the decision to switch back to driver specific ioctls which are passed the FD they operate on and thus will have a dedicated address space attached to them. Other than that, the implementation is pretty simple: it basically dumps all counters and copy the values to a userspace buffer. The parsing is left to userspace which has to know the specific layout that's used by the GPU (layout differs on a per-revision basis). Signed-off-by: NBoris Brezillon <boris.brezillon@collabora.com> Acked-by: NAlyssa Rosenzweig <alyssa@rosenzweig.io> Signed-off-by: NRob Herring <robh@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20190618081648.17297-5-boris.brezillon@collabora.com
-
由 Thomas Hellstrom 提交于
Add the callbacks necessary to implement emulated coherent memory for surfaces. Add a flag to the gb_surface_create ioctl to indicate that surface memory should be coherent. Also bump the drm minor version to signal the availability of coherent surfaces. Signed-off-by: NThomas Hellstrom <thellstrom@vmware.com> Reviewed-by: NDeepak Rawat <drawat@vmware.com>
-
- 04 6月, 2019 1 次提交
-
-
由 Uma Shankar 提交于
Fixes the following warnings: ./include/drm/drm_mode_config.h:841: warning: Incorrect use of kernel-doc format: * hdr_output_metadata_property: Connector property containing hdr ./include/drm/drm_mode_config.h:918: warning: Function parameter or member 'hdr_output_metadata_property' not described in 'drm_mode_config' ./include/drm/drm_connector.h:1251: warning: Function parameter or member 'hdr_output_metadata' not described in 'drm_connector' ./include/drm/drm_connector.h:1251: warning: Function parameter or member 'hdr_sink_metadata' not described in 'drm_connector' Also adds some property documentation for HDR Metadata Connector Property in connector property create function. v2: Fixed Sean Paul's review comments. v3: Fixed Daniel Vetter's review comments, added the UAPI structure definition section in kernel docs. v4: Fixed Daniel Vetter's review comments. v5: Added structure member references as per Daniel's suggestion. Cc: Shashank Sharma <shashank.sharma@intel.com> Cc: Ville Syrjä <ville.syrjala@linux.intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Maxime Ripard <maxime.ripard@bootlin.com> Cc: Sean Paul <sean@poorly.run> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Cc: "Ville Syrjä" <ville.syrjala@linux.intel.com> Cc: Hans Verkuil <hansverk@cisco.com> Cc: dri-devel@lists.freedesktop.org Cc: linux-fbdev@vger.kernel.org Reviewed-by: Sean Paul <sean@poorly.run> (v1) Signed-off-by: NUma Shankar <uma.shankar@intel.com> [danvet: Fix up markup: () for functions, & for structs. Style guide also recommends to prepend struct for structures.] Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/1559647022-7336-1-git-send-email-uma.shankar@intel.com
-
- 03 6月, 2019 1 次提交
-
-
由 Uma Shankar 提交于
Fixed doc warnings in drm uapi header. All the UAPI structures are now documented in kernel doc. Signed-off-by: NUma Shankar <uma.shankar@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/1559159944-21103-4-git-send-email-uma.shankar@intel.com
-
- 23 5月, 2019 1 次提交
-
-
由 Uma Shankar 提交于
This patch adds a blob property to get HDR metadata information from userspace. This will be send as part of AVI Infoframe to panel. It also implements get() and set() functions for HDR output metadata property.The blob data is received from userspace and saved in connector state, the same is returned as blob in get property call to userspace. v2: Rebase and modified the metadata structure elements as per Ville's POC changes. v3: No Change v4: Addressed Shashank's review comments v5: Rebase. v6: Addressed Brian Starkey's review comments, defined new structure with header for dynamic metadata scalability. Merge get/set property functions for metadata in this patch. v7: Addressed Jonas Karlman review comments and defined separate structure for infoframe to better align with CTA 861.G spec. Added Shashank's RB. v8: Addressed Ville's review comments. Moved sink metadata structure out of uapi headers as suggested by Jonas Karlman. v9: Rebase and addressed Jonas Karlman review comments. v10: Addressed Ville's review comments, dropped the metdata_changed state variable as its not needed anymore. Signed-off-by: NUma Shankar <uma.shankar@intel.com> Reviewed-by: NShashank Sharma <shashank.sharma@intel.com> Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/1558015817-12025-2-git-send-email-uma.shankar@intel.com
-
- 22 5月, 2019 9 次提交
-
-
由 Tvrtko Ursulin 提交于
Engine discovery query allows userspace to enumerate engines, probe their configuration features, all without needing to maintain the internal PCI ID based database. A new query for the generic i915 query ioctl is added named DRM_I915_QUERY_ENGINE_INFO, together with accompanying structure drm_i915_query_engine_info. The address of latter should be passed to the kernel in the query.data_ptr field, and should be large enough for the kernel to fill out all known engines as struct drm_i915_engine_info elements trailing the query. As with other queries, setting the item query length to zero allows userspace to query minimum required buffer size. Enumerated engines have common type mask which can be used to query all hardware engines, versus engines userspace can submit to using the execbuf uAPI. Engines also have capabilities which are per engine class namespace of bits describing features not present on all engine instances. v2: * Fixed HEVC assignment. * Reorder some fields, rename type to flags, increase width. (Lionel) * No need to allocate temporary storage if we do it engine by engine. (Lionel) v3: * Describe engine flags and mark mbz fields. (Lionel) * HEVC only applies to VCS. v4: * Squash SFC flag into main patch. * Tidy some comments. v5: * Add uabi_ prefix to engine capabilities. (Chris Wilson) * Report exact size of engine info array. (Chris Wilson) * Drop the engine flags. (Joonas Lahtinen) * Added some more reserved fields. * Move flags after class/instance. v6: * Do not check engine info array was zeroed by userspace but zero the unused fields for them instead. v7: * Simplify length calculation loop. (Lionel Landwerlin) v8: * Remove MBZ comments where not applicable. * Rename ABI flags to match engine class define naming. * Rename SFC ABI flag to reflect it applies to VCS and VECS. * SFC is wired to even _logical_ engine instances. * SFC applies to VCS and VECS. * HEVC is present on all instances on Gen11. (Tony) * Simplify length calculation even more. (Chris Wilson) * Move info_ptr assigment closer to loop for clarity. (Chris Wilson) * Use vdbox_sfc_access from runtime info. * Rebase for RUNTIME_INFO. * Refactor for lower indentation. * Rename uAPI class/instance to engine_class/instance to avoid C++ keyword. v9: * Rebase for s/num_rings/num_engines/ in RUNTIME_INFO. v10: * Use new copy_query_item. v11: * Consolidate with struct i915_engine_class_instnace. Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jon Bloomfield <jon.bloomfield@intel.com> Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tony Ye <tony.ye@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> # v7 Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190522090054.6007-1-tvrtko.ursulin@linux.intel.com
-
由 Chris Wilson 提交于
There is a desire to split a task onto two engines and have them run at the same time, e.g. scanline interleaving to spread the workload evenly. Through the use of the out-fence from the first execbuf, we can coordinate secondary execbuf to only become ready simultaneously with the first, so that with all things idle the second execbufs are executed in parallel with the first. The key difference here between the new EXEC_FENCE_SUBMIT and the existing EXEC_FENCE_IN is that the in-fence waits for the completion of the first request (so that all of its rendering results are visible to the second execbuf, the more common userspace fence requirement). Since we only have a single input fence slot, userspace cannot mix an in-fence and a submit-fence. It has to use one or the other! This is not such a harsh requirement, since by virtue of the submit-fence, the secondary execbuf inherit all of the dependencies from the first request, and for the application the dependencies should be common between the primary and secondary execbuf. Suggested-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Testcase: igt/gem_exec_fence/parallel Link: https://github.com/intel/media-driver/pull/546Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190521211134.16117-10-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Some users require that when a master batch is executed on one particular engine, a companion batch is run simultaneously on a specific slave engine. For this purpose, we introduce virtual engine bonding, allowing maps of master:slaves to be constructed to constrain which physical engines a virtual engine may select given a fence on a master engine. For the moment, we continue to ignore the issue of preemption deferring the master request for later. Ideally, we would like to then also remove the slave and run something else rather than have it stall the pipeline. With load balancing, we should be able to move workload around it, but there is a similar stall on the master pipeline while it may wait for the slave to be executed. At the cost of more latency for the bonded request, it may be interesting to launch both on their engines in lockstep. (Bubbles abound.) Opens: Also what about bonding an engine as its own master? It doesn't break anything internally, so allow the silliness. v2: Emancipate the bonds v3: Couple in delayed scheduling for the selftests v4: Handle invalid mutually exclusive bonding v5: Mention what the uapi does v6: s/nbond/num_bonds/ Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190521211134.16117-9-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Having allowed the user to define a set of engines that they will want to only use, we go one step further and allow them to bind those engines into a single virtual instance. Submitting a batch to the virtual engine will then forward it to any one of the set in a manner as best to distribute load. The virtual engine has a single timeline across all engines (it operates as a single queue), so it is not able to concurrently run batches across multiple engines by itself; that is left up to the user to submit multiple concurrent batches to multiple queues. Multiple users will be load balanced across the system. The mechanism used for load balancing in this patch is a late greedy balancer. When a request is ready for execution, it is added to each engine's queue, and when an engine is ready for its next request it claims it from the virtual engine. The first engine to do so, wins, i.e. the request is executed at the earliest opportunity (idle moment) in the system. As not all HW is created equal, the user is still able to skip the virtual engine and execute the batch on a specific engine, all within the same queue. It will then be executed in order on the correct engine, with execution on other virtual engines being moved away due to the load detection. A couple of areas for potential improvement left! - The virtual engine always take priority over equal-priority tasks. Mostly broken up by applying FQ_CODEL rules for prioritising new clients, and hopefully the virtual and real engines are not then congested (i.e. all work is via virtual engines, or all work is to the real engine). - We require the breadcrumb irq around every virtual engine request. For normal engines, we eliminate the need for the slow round trip via interrupt by using the submit fence and queueing in order. For virtual engines, we have to allow any job to transfer to a new ring, and cannot coalesce the submissions, so require the completion fence instead, forcing the persistent use of interrupts. - We only drip feed single requests through each virtual engine and onto the physical engines, even if there was enough work to fill all ELSP, leaving small stalls with an idle CS event at the end of every request. Could we be greedy and fill both slots? Being lazy is virtuous for load distribution on less-than-full workloads though. Other areas of improvement are more general, such as reducing lock contention, reducing dispatch overhead, looking at direct submission rather than bouncing around tasklets etc. sseu: Lift the restriction to allow sseu to be reconfigured on virtual engines composed of RENDER_CLASS (rcs). v2: macroize check_user_mbz() v3: Cancel virtual engines on wedging v4: Commence commenting v5: Replace 64b sibling_mask with a list of class:instance v6: Drop the one-element array in the uabi v7: Assert it is an virtual engine in to_virtual_engine() v8: Skip over holes in [class][inst] so we can selftest with (vcs0, vcs2) Link: https://github.com/intel/media-driver/pull/283Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190521211134.16117-6-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
A usecase arose out of handling context recovery in mesa, whereby they wish to recreate a context with fresh logical state but preserving all other details of the original. Currently, they create a new context and iterate over which bits they want to copy across, but it would much more convenient if they were able to just pass in a target context to clone during creation. This essentially extends the setparam during creation to pull the details from a target context instead of the user supplied parameters. The ideal here is that we don't expose control over anything more than can be obtained via CONTEXT_PARAM. That is userspace retains explicit control over all features, and this api is just convenience. For example, you could replace struct context_param p = { .param = CONTEXT_PARAM_VM }; param.ctx_id = old_id; gem_context_get_param(&p.param); new_id = gem_context_create(); param.ctx_id = new_id; gem_context_set_param(&p.param); gem_vm_destroy(param.value); /* drop the ref to VM_ID handle */ with struct create_ext_param p = { { .name = CONTEXT_CREATE_CLONE }, .clone_id = old_id, .flags = CLONE_FLAGS_VM } new_id = gem_context_create_ext(&p); and not have to worry about stray namespace pollution etc. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190521211134.16117-5-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
The SINGLE_TIMELINE flag can be used to create a context such that all engine instances within that context share a common timeline. This can be useful for mixing operations between real and virtual engines, or when using a composite context for a single client API context. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190521211134.16117-4-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Allow the user to specify a local engine index (as opposed to class:index) that they can use to refer to a preset engine inside the ctx->engine[] array defined by an earlier I915_CONTEXT_PARAM_ENGINES. This will be useful for setting SSEU parameters on virtual engines that are local to the context and do not have a valid global class:instance lookup. Note that due to the ambiguity in using class:instance with ctx->engines[], if a user supplied engine map is active the user must specify the engine to alter by its index into the ctx->engines[]. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190521211134.16117-3-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Over the last few years, we have debated how to extend the user API to support an increase in the number of engines, that may be sparse and even be heterogeneous within a class (not all video decoders created equal). We settled on using (class, instance) tuples to identify a specific engine, with an API for the user to construct a map of engines to capabilities. Into this picture, we then add a challenge of virtual engines; one user engine that maps behind the scenes to any number of physical engines. To keep it general, we want the user to have full control over that mapping. To that end, we allow the user to constrain a context to define the set of engines that it can access, order fully controlled by the user via (class, instance). With such precise control in context setup, we can continue to use the existing execbuf uABI of specifying a single index; only now it doesn't automagically map onto the engines, it uses the user defined engine map from the context. v2: Fixup freeing of local on success of get_engines() v3: Allow empty engines[] v4: s/nengine/num_engines/ v5: Replace 64 limit on num_engines with a note that execbuf is currently limited to only using the first 64 engines. v6: Actually use the engines_mutex to guard the ctx->engines. Testcase: igt/gem_ctx_engines Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190521211134.16117-2-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Having hid the partially exposed new ABI from the PR, put it back again for completion of context recovery. A significant part of context recovery is the ability to reuse as much of the old context as is feasible (to avoid expensive reconstruction). The biggest chunk kept hidden at the moment is fine-control over the ctx->ppgtt (the GPU page tables and associated translation tables and kernel maps), so make control over the ctx->ppgtt explicit. This allows userspace to create and share virtual memory address spaces (within the limits of a single fd) between contexts they own, along with the ability to query the contexts for the vm state. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190521211134.16117-1-chris@chris-wilson.co.uk
-
- 17 5月, 2019 1 次提交
-
-
由 James Clarke 提交于
Like GNU/Linux, GNU/kFreeBSD's sys/types.h does not define the uintX_t types, which differs from the BSDs' headers. Thus we should include stdint.h to ensure we have all the required integer types. Signed-off-by: NJames Clarke <jrtc27@jrtc27.com> Signed-off-by: NEric Anholt <eric@anholt.net> Link: https://patchwork.freedesktop.org/patch/msgid/20190115150418.68080-1-jrtc27@jrtc27.comReviewed-by: NEric Anholt <eric@anholt.net>
-
- 02 5月, 2019 1 次提交
-
-
由 Lionel Landwerlin 提交于
Unfortunately userspace users of this API cannot be publicly disclosed yet. This commit effectively disables timeline syncobj ioctls for all drivers. Each driver wishing to support this feature will need to expose DRIVER_SYNCOBJ_TIMELINE. v2: Add uAPI capability check (Christian) Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Christian König <christian.koenig@amd.com> (v1) Cc: Dave Airlie <airlied@redhat.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Christian König <christian.koenig@amd.com> Cc: Chunming Zhou <david1.zhou@amd.com> Reviewed-by: NDave Airlie <airlied@redhat.com> Reviewed-by: NChunming Zhou <david1.zhou@amd.com> Signed-off-by: NDave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190416125750.31370-1-lionel.g.landwerlin@intel.com
-
- 20 4月, 2019 3 次提交
-
-
由 Jordan Crouse 提交于
Add the capability to query information from a submit queue. The first available parameter is for querying the number of GPU faults (hangs) that can be attributed to the queue. This is useful for implementing context robustness. A user context can regularly query the number of faults to see if it is responsible for any and if so it can invalidate itself. This is also helpful for testing by confirming to the user driver if a particular command stream caused a fault (or not as the case may be). Signed-off-by: NJordan Crouse <jcrouse@codeaurora.org> Signed-off-by: NRob Clark <robdclark@chromium.org>
-
由 Rob Clark 提交于
For KHR_robustness, userspace wants to know two things, the count of GPU faults globally, and the count of faults attributed to a given context. This patch providees the former, and the next patch provides the latter. Signed-off-by: NRob Clark <robdclark@chromium.org> Reviewed-by: NJordan Crouse <jcrouse@codeaurora.org>
-
由 Rob Clark 提交于
For now it always returns '0' (false), but once the iommu work is in place to enable per-process pagetables we can update the value returned. Userspace needs to know this to make an informed decision about exposing KHR_robustness. Signed-off-by: NRob Clark <robdclark@chromium.org> Reviewed-by: NJordan Crouse <jcrouse@codeaurora.org>
-