- 04 4月, 2020 1 次提交
-
-
由 Ashutosh Dixit 提交于
It is wrong to block the user thread in the next poll when OA data is already available which could not fit in the user buffer provided in the previous read. In several cases the exact user buffer size is not known. Blocking user space in poll can lead to data loss when the buffer size used is smaller than the available data. This change fixes this issue and allows user space to read all OA data even when using a buffer size smaller than the available data using multiple non-blocking reads rather than staying blocked in poll till the next timer interrupt. v2: Fix ret value for blocking reads (Umesh) v3: Mistake during patch send (Ashutosh) v4: Remove -EAGAIN from comment (Umesh) v5: Improve condition for clearing pollin and return (Lionel) v6: Improve blocking read loop and other cleanups (Lionel) v7: Added Cc stable Testcase: igt/perf/polling-small-buf Reviewed-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: NAshutosh Dixit <ashutosh.dixit@intel.com> Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200403010120.3067-1-ashutosh.dixit@intel.com
-
- 31 3月, 2020 2 次提交
-
-
由 Lionel Landwerlin 提交于
Reading or writing those fields should only happen under stream->oa_buffer.ptr_lock. Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: d1df41eb ("drm/i915/perf: rework aging tail workaround") Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Acked-by: NAshutosh Dixit <ashutosh.dixit@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200330091411.37357-1-lionel.g.landwerlin@intel.com
-
由 Chris Wilson 提交于
We wish that the scheduler emit the context modification commands prior to enabling the oa_config, for which we must explicitly inform it of the ordering constraints. This is especially important as we now wait for the final oa_config setup to be completed and as this wait may be on a distinct context to the state modifications, we need that command packet to be always last in the queue. We borrow the i915_active for its ability to track multiple timelines and the last dma_fence on each; a flexible dma_resv. Keeping track of each dma_fence is important for us so that we can efficiently schedule the requests and reprioritise as required. Reported-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200327112212.16046-3-chris@chris-wilson.co.uk
-
- 27 3月, 2020 3 次提交
-
-
由 Lionel Landwerlin 提交于
This new parameter let's the application choose how often the OA buffer should be checked on the CPU side for data availability. Longer polling period tend to reduce CPU overhead if the application does not care about somewhat real time data collection. v2: Allow disabling polling completely with 0 value (Lionel) v3: Version the new parameter (Joonas) v4: Rebase (Umesh) v5: Make poll delay value of 0 invalid (Umesh) v6: - Describe poll_oa_period (Ashutosh) - Fix comment for new poll parameter (Lionel) - Drop open_flags in read_properties_unlocked (Lionel) - Rename uapi parameter (Ashutosh) v7: Reword the comment in uapi (Ashutosh) Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: NUmesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: NAshutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200324185457.14635-4-umesh.nerlige.ramappa@intel.com
-
由 Lionel Landwerlin 提交于
This isn't really gen specific stuff, so just move it to the common code. v2: Rebase (Umesh) v3: Remove comment, pollin is a per stream state already (Ashutosh) Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: NUmesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: NAshutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200324185457.14635-3-umesh.nerlige.ramappa@intel.com
-
由 Lionel Landwerlin 提交于
We're about to introduce an options to open the perf stream, giving the user ability to configure how often it wants the kernel to poll the OA registers for available data. Right now the workaround against the OA tail pointer race condition requires at least twice the internal kernel polling timer to make any data available. This changes introduce checks on the OA data written into the circular buffer to make as much data as possible available on the first iteration of the polling timer. v2: Use OA_TAKEN macro without the gtt_offset (Lionel) v3: (Umesh) - Rebase - Change report to report32 from below review https://patchwork.freedesktop.org/patch/330704/?series=66697&rev=1 v4: (Ashutosh, Lionel) - Fix checkpatch errors - Fix aging_timestamp initialization - Check for only one valid landed report - Fix check for unlanded report v5: (Ashutosh) - Fix bug in accurately determining landed report. - Optimize the check for landed reports by going as far as the previously determined aged tail. Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: NUmesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: NAshutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200324185457.14635-2-umesh.nerlige.ramappa@intel.com
-
- 18 3月, 2020 1 次提交
-
-
由 Umesh Nerlige Ramappa 提交于
On running several back to back perf capture sessions involving closing and opening the perf stream, invalid OA reports are seen in the beginning of the OA buffer in some sessions. Fix this by invalidating OA TLB when the perf stream is closed or disabled on gen12. Signed-off-by: NUmesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: 00a7f0d7 ("drm/i915/tgl: Add perf support on TGL") Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200309211057.38575-1-umesh.nerlige.ramappa@intel.com
-
- 17 3月, 2020 3 次提交
-
-
由 Lionel Landwerlin 提交于
On Gen11 powergating half the execution units is a functional requirement when using the VME samplers. Not fullfilling this requirement can lead to hangs. This unfortunately plays fairly poorly with the NOA requirements. NOA requires a stable power configuration to maintain its configuration. As a result using OA (and NOA feeding into it) so far has required us to use a power configuration that can work for all contexts. The only power configuration fullfilling this is powergating half the execution units. This makes performance analysis for 3D workloads somewhat pointless. Failing to find a solution that would work for everybody, this change introduces a new i915-perf stream open parameter that punts the decision off to userspace. If this parameter is omitted, the existing Gen11 behavior remains (half EU array powergating). This change takes the initiative to move all perf related sseu configuration into i915_perf.c v2: Make parameter priviliged if different from default v3: Fix context modifying its sseu config while i915-perf is enabled v4: Always consider global sseu a privileged operation (Tvrtko) Override req_sseu point in intel_sseu_make_rpcs() (Tvrtko) Remove unrelated changes (Tvrtko) v5: Some typos (Tvrtko) Process sseu param in read_properties_unlocked() (Tvrtko) v6: Actually commit the bits from v5... Fixup some checkpath warnings v7: Only compare engine uabi field (Chris) Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200317132222.2638719-3-lionel.g.landwerlin@intel.com
-
由 Lionel Landwerlin 提交于
The caller of i915_oa_init_reg_state() already sets this. Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200317132222.2638719-2-lionel.g.landwerlin@intel.com
-
由 Lionel Landwerlin 提交于
A little bit of history : Back when i915-perf was introduced (4.13), there was no way to dynamically add new OA configurations to i915. Only the generated configs baked in at build time were allowed. It quickly became obvious that we would need to allow applications to upload their own configurations, for instance to be able to test new ones, and so by the next stable version (4.14) we added uAPIs to allow uploading new configurations. When adding that capability, we took the opportunity to remove most HW configurations except the TestOa one which is a configuration IGT would rely on to verify that the HW is outputting correct values. At the time it made sense to have that confiuration in at the same time a given HW platform added to the i915-perf driver. Now that IGT has become the reference point for HW configurations (see commit 53f8f541ca ("lib: Add i915_perf library"), previously this was located in the GPUTop repository), the need for having those configurations in i915-perf is gone. On the Mesa side, we haven't relied on this test configuration for a while. The MDAPI library always required 4.14 feature level and always loaded its configuration into i915. I'm sure nobody will miss this generated stuff in i915 :) v2: Fix selftests by creating an empty config v3: Fix unlocking on allocation error (Dan Carpenter) v4: Fixup checkpatch warnings v5: Fix incorrect unlock in error path (Umesh) Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: NUmesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200317132222.2638719-1-lionel.g.landwerlin@intel.com
-
- 03 3月, 2020 1 次提交
-
-
由 Chris Wilson 提交于
We still need to wait for the initial OA configuration to happen before we enable OA report writes to the OA buffer. Reported-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: 15d0ace1 ("drm/i915/perf: execute OA configuration from command stream") Closes: https://gitlab.freedesktop.org/drm/intel/issues/1356 Testcase: igt/perf/stream-open-close Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200302085812.4172450-7-chris@chris-wilson.co.uk
-
- 28 2月, 2020 2 次提交
-
-
由 Chris Wilson 提交于
The engine->kernel_context is a special case for request emission. Since it is used as the barrier within the engine's wakeref, we must acquire the wakeref before submitting a request to the kernel_context. Reported-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200227085723.1961649-3-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Inside the general i915_oa_init_reg_state() we avoid using the perf->mutex. However, we rely on perf->exclusive_stream being valid to access at that point, and for that we have to control the race with disabling perf. This relies on the disabling being a heavy barrier that inspects all active contexts, after marking the perf->exclusive_stream as not available. This should ensure that there are no more concurrent accesses to the perf->exclusive_stream as we destroy it. Mark up the races around the perf->exclusive_stream so that they stand out much more. (And hopefully we will be running kcsan to start validating that the only races we have are carefully controlled.) Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200227085723.1961649-2-chris@chris-wilson.co.uk
-
- 21 2月, 2020 1 次提交
-
-
由 Wambui Karuga 提交于
Manual conversion of instances of printk based drm logging macros to the struct drm_device based logging macros in i915/i915_perf.c. Also involves extraction of the struct drm_i915_private device from various intel types for use in the macros. Instances of the DRM_DEBUG printk macro were not converted due to the lack of an analogous struct drm_device based logging macro. v2: remove instances of DRM_DEBUG that were converted. References: https://lists.freedesktop.org/archives/dri-devel/2020-January/253381.htmlSigned-off-by: NWambui Karuga <wambui.karugax@gmail.com> Signed-off-by: NJani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200218173936.19664-1-wambui.karugax@gmail.com
-
- 04 2月, 2020 1 次提交
-
-
由 Wambui Karuga 提交于
Converts various instances of the printk drm logging macros to the struct drm_device based logging macros in the drm/i915 folder using the following coccinelle script that transforms based on the existence of the struct drm_i915_private device pointer: @@ identifier fn, T; @@ fn(...) { ... struct drm_i915_private *T = ...; <+... ( -DRM_INFO( +drm_info(&T->drm, ...) | -DRM_ERROR( +drm_err(&T->drm, ...) | -DRM_WARN( +drm_warn(&T->drm, ...) | -DRM_DEBUG_KMS( +drm_dbg_kms(&T->drm, ...) | -DRM_DEBUG_DRIVER( +drm_dbg(&T->drm, ...) | -DRM_DEBUG_ATOMIC( +drm_dbg_atomic(&T->drm, ...) ) ...+> } @@ identifier fn, T; @@ fn(...,struct drm_i915_private *T,...) { <+... ( -DRM_INFO( +drm_info(&T->drm, ...) | -DRM_ERROR( +drm_err(&T->drm, ...) | -DRM_WARN( +drm_warn(&T->drm, ...) | -DRM_DEBUG_DRIVER( +drm_dbg(&T->drm, ...) | -DRM_DEBUG_KMS( +drm_dbg_kms(&T->drm, ...) | -DRM_DEBUG_ATOMIC( +drm_dbg_atomic(&T->drm, ...) ) ...+> } Checkpatch warnings were fixed manually. Instances of the DRM_DEBUG macro were not converted due to lack of a consensus of an analogous struct drm_device based macro. References: https://lists.freedesktop.org/archives/dri-devel/2020-January/253381.htmlSigned-off-by: NWambui Karuga <wambui.karugax@gmail.com> Signed-off-by: NJani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200131093416.28431-2-wambui.karugax@gmail.com
-
- 28 1月, 2020 1 次提交
-
-
由 Umesh Nerlige Ramappa 提交于
Engine context pinned in perf OA was set to same context id as the idle context. Set the context id to an unused value. Clear the sw context id field in lrc descriptor before ORing with ce->tag (Chris) Closes: https://gitlab.freedesktop.org/drm/intel/issues/756Signed-off-by: NUmesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200124013701.40609-1-umesh.nerlige.ramappa@intel.com
-
- 22 1月, 2020 1 次提交
-
-
由 Pankaj Bharadiya 提交于
drm specific WARN* calls include device information in the backtrace, so we know what device the warnings originate from. Covert all the calls of WARN* with device specific drm_WARN* variants in functions where intel_uncore/i915_perf_stream struct pointer is readily available. The conversion was done automatically with below coccinelle semantic patch. checkpatch errors/warnings are fixed manually. @@ identifier func, T; @@ func(...) { ... struct intel_uncore *T = ...; <... ( -WARN( +drm_WARN(&T->i915->drm, ...) | -WARN_ON( +drm_WARN_ON(&T->i915->drm, ...) | -WARN_ONCE( +drm_WARN_ONCE(&T->i915->drm, ...) | -WARN_ON_ONCE( +drm_WARN_ON_ONCE(&T->i915->drm, ...) ) ...> } @@ identifier func, T; @@ func(struct intel_uncore *T,...) { <... ( -WARN( +drm_WARN(&T->i915->drm, ...) | -WARN_ON( +drm_WARN_ON(&T->i915->drm, ...) | -WARN_ONCE( +drm_WARN_ONCE(&T->i915->drm, ...) | -WARN_ON_ONCE( +drm_WARN_ON_ONCE(&T->i915->drm, ...) ) ...> } @@ identifier func, T; @@ func(struct i915_perf_stream *T,...) { +struct drm_i915_private *i915 = T->perf->i915; <+... ( -WARN( +drm_WARN(&i915->drm, ...) | -WARN_ON( +drm_WARN_ON(&i915->drm, ...) | -WARN_ONCE( +drm_WARN_ONCE(&i915->drm, ...) | -WARN_ON_ONCE( +drm_WARN_ON_ONCE(&i915->drm, ...) ) ...+> } command: ls drivers/gpu/drm/i915/*.c | xargs spatch --sp-file <script> \ --linux-spacing --in-place Signed-off-by: NPankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com> Signed-off-by: NJani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200115034455.17658-11-pankaj.laxminarayan.bharadiya@intel.com
-
- 09 1月, 2020 1 次提交
-
-
由 Chris Wilson 提交于
Since we now allow the intel_context_unpin() to run unserialised, we risk our operations under the intel_context_lock_pinned() being run as the context is unpinned (and thus invalidating our state). We can atomically acquire the pin, testing to see if it is pinned in the process, thus ensuring that the state remains consistent during the course of the whole operation. Fixes: 84135022 ("drm/i915/gt: Drop mutex serialisation between context pin/unpin") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200109085142.871563-1-chris@chris-wilson.co.uk
-
- 22 12月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
Allocate only an internal intel_context for the kernel_context, forgoing a global GEM context for internal use as we only require a separate address space (for our own protection). Now having weaned GT from requiring ce->gem_context, we can stop referencing it entirely. This also means we no longer have to create random and unnecessary GEM contexts for internal use. GEM contexts are now entirely for tracking GEM clients, and intel_context the execution environment on the GPU. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Andi Shyti <andi.shyti@intel.com> Acked-by: NAndi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191221160324.1073045-1-chris@chris-wilson.co.uk
-
- 20 12月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
Keep the intel_context as being the primary state for i915_request, with the GEM context a backpointer from the low level state for the rarer cases we need client information. Our goal is to remove such references to clients from the backend, and leave the HW submission agnostic to client interfaces and self-contained. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Andi Shyti <andi.shyti@intel.com> Reviewed-by: NAndi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191220101230.256839-1-chris@chris-wilson.co.uk
-
- 14 12月, 2019 1 次提交
-
-
We do not require to register the sysctl paths per instance, so making registration global. v2: make sysctl path register and unregister function driver specific (Tvrtko and Lucas). Cc: Sudeep Dutt <sudeep.dutt@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jani Nikula <jani.nikula@intel.com> Signed-off-by: NVenkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191213155152.69182-1-venkata.s.dhanalakota@intel.com
-
- 10 12月, 2019 2 次提交
-
-
由 Umesh Nerlige Ramappa 提交于
Gen12 supports saving/restoring render counters per context. Apply OAR configuration only for the context that is passed in to perf. v2: - Fix OACTXCONTROL value to only stop/resume counters. - Remove gen12_update_reg_state_unlocked as power state is already applied by the caller. v3: (Lionel) - Move register initialization into the array - Assume a valid oa_config in enable_metric_set Signed-off-by: NUmesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Fixes: 00a7f0d7 ("drm/i915/tgl: Add perf support on TGL") Reviewed-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191206194339.31356-2-umesh.nerlige.ramappa@intel.com (cherry picked from commit ccdeed49) Signed-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
由 Umesh Nerlige Ramappa 提交于
SAMPLE_OA_REPORT enables sampling of OA reports from the OA buffer. Since reports from OA buffer had system wide visibility, collecting samples from the OA buffer was a privileged operation on previous platforms. Prior to TGL, it was also necessary to sample the OA buffer to normalize reports from MI REPORT PERF COUNT. TGL has a dedicated OAR unit to sample perf reports for a specific render context. This removes the necessity to sample OA buffer. - If not sampling the OA buffer, allow non-privileged access. An earlier patch allows the non-privilege access: https://patchwork.freedesktop.org/patch/337716/?series=68582&rev=1 - Clear up the path for non-privileged access in this patch Signed-off-by: NUmesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Fixes: 00a7f0d7 ("drm/i915/tgl: Add perf support on TGL") Reviewed-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191206194339.31356-1-umesh.nerlige.ramappa@intel.com (cherry picked from commit 322d56aa) Signed-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
- 09 12月, 2019 2 次提交
-
-
由 Umesh Nerlige Ramappa 提交于
Gen12 supports saving/restoring render counters per context. Apply OAR configuration only for the context that is passed in to perf. v2: - Fix OACTXCONTROL value to only stop/resume counters. - Remove gen12_update_reg_state_unlocked as power state is already applied by the caller. v3: (Lionel) - Move register initialization into the array - Assume a valid oa_config in enable_metric_set Signed-off-by: NUmesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Fixes: 00a7f0d7 ("drm/i915/tgl: Add perf support on TGL") Reviewed-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191206194339.31356-2-umesh.nerlige.ramappa@intel.com
-
由 Umesh Nerlige Ramappa 提交于
SAMPLE_OA_REPORT enables sampling of OA reports from the OA buffer. Since reports from OA buffer had system wide visibility, collecting samples from the OA buffer was a privileged operation on previous platforms. Prior to TGL, it was also necessary to sample the OA buffer to normalize reports from MI REPORT PERF COUNT. TGL has a dedicated OAR unit to sample perf reports for a specific render context. This removes the necessity to sample OA buffer. - If not sampling the OA buffer, allow non-privileged access. An earlier patch allows the non-privilege access: https://patchwork.freedesktop.org/patch/337716/?series=68582&rev=1 - Clear up the path for non-privileged access in this patch Signed-off-by: NUmesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Fixes: 00a7f0d7 ("drm/i915/tgl: Add perf support on TGL") Reviewed-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191206194339.31356-1-umesh.nerlige.ramappa@intel.com
-
- 04 12月, 2019 1 次提交
-
-
由 Mao Wenan 提交于
There is no need to have the 'T *v' variable static since new value always be assigned before use it. Reported-by: NHulk Robot <hulkci@huawei.com> Signed-off-by: NMao Wenan <maowenan@huawei.com> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191204010154.152396-1-maowenan@huawei.com
-
- 25 11月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
As the engine->kernel_context is used within the engine-pm barrier, we have to be careful when emitting requests outside of the barrier, as the strict timeline locking rules do not apply. Instead, we must ensure the engine_park() cannot be entered as we build the request, which is simplest by taking an explicit engine-pm wakeref around the request construction. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191125105858.1718307-1-chris@chris-wilson.co.uk
-
- 18 11月, 2019 1 次提交
-
-
由 Lionel Landwerlin 提交于
I'm observing incoherence metric values, changing from run to run. It appears the patches introducing noa wait & reconfiguration from command stream switched places in the series multiple times during the review. This lead to the dependency of one onto the order to go missing... Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: 15d0ace1 ("drm/i915/perf: execute OA configuration from command stream") Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191113154639.27144-1-lionel.g.landwerlin@intel.com (cherry picked from commit 93937659) Signed-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
- 16 11月, 2019 1 次提交
-
-
由 Lionel Landwerlin 提交于
While we're waiting for the OA configuration to apply, let's give a chance to other contexts that might need to run other workloads. Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Suggested-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191114140224.21818-1-lionel.g.landwerlin@intel.com
-
- 14 11月, 2019 1 次提交
-
-
由 Lionel Landwerlin 提交于
I'm observing incoherence metric values, changing from run to run. It appears the patches introducing noa wait & reconfiguration from command stream switched places in the series multiple times during the review. This lead to the dependency of one onto the order to go missing... Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: 15d0ace1 ("drm/i915/perf: execute OA configuration from command stream") Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191113154639.27144-1-lionel.g.landwerlin@intel.com
-
- 12 11月, 2019 1 次提交
-
-
由 Lionel Landwerlin 提交于
The ordering of the checks in the existing code can lead to holding preemption not being considered as privileged op. Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: 9cd20ef7 ("drm/i915/perf: allow holding preemption on filtered ctx") Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191111095308.2550-1-lionel.g.landwerlin@intel.com (cherry picked from commit 0b0120d4) Signed-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
- 11 11月, 2019 1 次提交
-
-
由 Lionel Landwerlin 提交于
The ordering of the checks in the existing code can lead to holding preemption not being considered as privileged op. Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: 9cd20ef7 ("drm/i915/perf: allow holding preemption on filtered ctx") Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191111095308.2550-1-lionel.g.landwerlin@intel.com
-
- 02 11月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
Avoid drivers/gpu/drm/i915/i915_perf.c:2442:85: warning: dubious: x | !y simply by inverting the predicate and reversing the ternary. v2: Move the long lines into their own function so there is no confusion on operator precedence. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: NUmesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191101192116.12647-1-chris@chris-wilson.co.uk
-
- 29 10月, 2019 3 次提交
-
-
由 Lionel Landwerlin 提交于
The design of the OA unit has been split into several units. We now have a global unit (OAG) and a render specific unit (OAR). This leads to some changes on how we program things. Some details : OAR: - has its own set of counter registers, they are per-context saved/restored - counters are not written to the circular OA buffer - a snapshot of the counters can be acquired with MI_RECORD_PERF_COUNT, or a single counter can be read with MI_STORE_REGISTER_MEM. OAG: - has global counters that increment across context switches - counters are written into the circular OA buffer (if requested) v2: Fix checkpatch warnings on code style (Lucas) v3: (Umesh) - Update register from which tail, status and head are read - Update logic to sample context reports - Update whitelist mux and b counter regs v4: Fix a bug when updating context image for new contexts (Umesh) v5: Squash patch enabling save/restore of counters into context image We want this so we can preempt performance queries and keep the system responsive even when long running queries are ongoing. We avoid doing it for all contexts. - use LRI to modify context control (Chris) - use MASKED_FIELD to program just the masked bits (Chris) - disable save/restore of counters on cleanup (Chris) v6: Do not use implicit parameters (Chris) BSpec: 28727, 30021 Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: NUmesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Signed-off-by: NLucas De Marchi <lucas.demarchi@intel.com> Acked-by: NChris Wilson <chris.p.wilson@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191025193746.47155-2-umesh.nerlige.ramappa@intel.com
-
由 Umesh Nerlige Ramappa 提交于
Add helper macros for range and equality comparisons and use them to check with whitelisted registers in oa configurations. Signed-off-by: NUmesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191025193746.47155-1-umesh.nerlige.ramappa@intel.com
-
由 Michal Wajdeczko 提交于
While processing CSB there is no need to look at GuC submission settings, just check if engine is configured for execlists mode. While today GuC submission is disabled it's settings are still based on modparam values that might not correctly reflect actual submission status in case of any fallback. Until that is fully fixed, use alternate method to confirm that engine really runs in execlists mode by comparing set_default_submission vfunc. v2: add other immediate use of new helper Signed-off-by: NMichal Wajdeczko <michal.wajdeczko@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com> Reviewed-by: NJanusz Krzysztofik <janusz.krzysztofik@linux.intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191028164520.31772-1-michal.wajdeczko@intel.com
-
- 24 10月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
Split the legacy submission backend from the common CS ring buffer handling. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191024100344.5041-1-chris@chris-wilson.co.uk
-
- 22 10月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
The actual conditions are that we know the GPU is not accessing the context, and we hold a pin on the context image to allow CPU access. We used a fake lock on ce->pin_mutex so that we could try and use lockdep to assert that access is serialised, but the various different hardirq/softirq contexts where we need to *fake* holding the pin_mutex are causing more trouble. Still it would be nice if we did have a way to reassure ourselves that the direct update to the context image is serialised with GPU execution. In the meantime, stop lockdep complaining about false irq inversions. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111923Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Acked-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191022122845.25038-1-chris@chris-wilson.co.uk
-
- 20 10月, 2019 1 次提交
-
-
由 Lionel Landwerlin 提交于
The current logic just reapplies the same configuration already stored into stream->oa_config instead of the newly selected one. Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: 7831e9a9 ("drm/i915/perf: Allow dynamic reconfiguration of the OA stream") Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191019214647.27866-1-lionel.g.landwerlin@intel.com
-
- 15 10月, 2019 1 次提交
-
-
由 Lionel Landwerlin 提交于
We would like to make use of perf in Vulkan. The Vulkan API is much lower level than OpenGL, with applications directly exposed to the concept of command buffers (pretty much equivalent to our batch buffers). In Vulkan, queries are always limited in scope to a command buffer. In OpenGL, the lack of command buffer concept meant that queries' duration could span multiple command buffers. With that restriction gone in Vulkan, we would like to simplify measuring performance just by measuring the deltas between the counter snapshots written by 2 MI_RECORD_PERF_COUNT commands, rather than the more complex scheme we currently have in the GL driver, using 2 MI_RECORD_PERF_COUNT commands and doing some post processing on the stream of OA reports, coming from the global OA buffer, to remove any unrelated deltas in between the 2 MI_RECORD_PERF_COUNT. Disabling preemption only apply to a single context with which want to query performance counters for and is considered a privileged operation, by default protected by CAP_SYS_ADMIN. It is possible to enable it for a normal user by disabling the paranoid stream setting. v2: Store preemption setting in intel_context (Chris) v3: Use priorities to avoid preemption rather than the HW mechanism v4: Just modify the port priority reporting function v5: Add nopreempt flag on gem context and always flag requests appropriately, regarless of OA reconfiguration. Link: https://gitlab.freedesktop.org/mesa/mesa/merge_requests/932Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191014201404.22468-4-chris@chris-wilson.co.uk
-