- 01 6月, 2018 1 次提交
-
-
由 Chris Wilson 提交于
We can reduce our exposure to random neutrinos by resting on the kernel context having flushed out the user contexts to system memory and beyond. The corollary is that we then we require two passes through the idle handler to go to sleep, which on a truly idle system involves an extra pass through the slow and irregular retire work handler. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180531082246.9763-1-chris@chris-wilson.co.uk
-
- 24 5月, 2018 1 次提交
-
-
由 Chris Wilson 提交于
We were not very carefully checking to see if an older request on the engine was an earlier switch-to-kernel-context before deciding to emit a new switch. The end result would be that we could get into a permanent loop of trying to emit a new request to perform the switch simply to flush the existing switch. What we need is a means of tracking the completion of each timeline versus the kernel context, that is to detect if a more recent request has been submitted that would result in a switch away from the kernel context. To realise this, we need only to look in our syncmap on the kernel context and check that we have synchronized against all active rings. v2: Since all ringbuffer clients currently share the same timeline, we do have to use the gem_context to distinguish clients. As a bonus, include all the tracing used to debug the death inside suspend. v3: Test, test, test. Construct a selftest to exercise and assert the expected behaviour that multiple switch-to-contexts do not emit redundant requests. Reported-by: NMika Kuoppala <mika.kuoppala@intel.com> Fixes: a89d1f92 ("drm/i915: Split i915_gem_timeline into individual timelines") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180524081135.15278-1-chris@chris-wilson.co.uk
-
- 18 5月, 2018 2 次提交
-
-
由 Chris Wilson 提交于
To ease the frequent and ugly pointer dance of &request->gem_context->engine[request->engine->id] during request submission, store that pointer as request->hw_context. One major advantage that we will exploit later is that this decouples the logical context state from the engine itself. v2: Set mock_context->ops so we don't crash and burn in selftests. Cleanups from Tvrtko. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Acked-by: NZhenyu Wang <zhenyuw@linux.intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180517212633.24934-3-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Move the knowledge about resetting the current context tracking on the engine from inside i915_gem_context.c into intel_engine_cs.c Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180517212633.24934-2-chris@chris-wilson.co.uk
-
- 16 5月, 2018 1 次提交
-
-
由 Chris Wilson 提交于
When switching to the kernel context, we force the switch to occur after all currently active requests (so that we know the GPU won't switch immediately away and the kernel context remains current as we work). To do so we have to inspect all the timelines and add a fence from the active work to queue our switch afterwards. We can use the tracked set of active rings to shrink our search for active timelines. v2: Use a local to shrink the list_for_each_entry() Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180515143149.4795-1-chris@chris-wilson.co.uk
-
- 03 5月, 2018 2 次提交
-
-
由 Chris Wilson 提交于
We need to move to a more flexible timeline that doesn't assume one fence context per engine, and so allow for a single timeline to be used across a combination of engines. This means that preallocating a fence context per engine is now a hindrance, and so we want to introduce the singular timeline. From the code perspective, this has the notable advantage of clearing up a lot of mirky semantics and some clumsy pointer chasing. By splitting the timeline up into a single entity rather than an array of per-engine timelines, we can realise the goal of the previous patch of tracking the timeline alongside the ring. v2: Tweak wait_for_idle to stop the compiling thinking that ret may be uninitialised. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180502163839.3248-2-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
In the future, we want to move a request between engines. To achieve this, we first realise that we have two timelines in effect here. The first runs through the GTT is required for ordering vma access, which is tracked currently by engine. The second is implied by sequential execution of commands inside the ringbuffer. This timeline is one that maps to userspace's expectations when submitting requests (i.e. given the same context, batch A is executed before batch B). As the rings's timelines map to userspace and the GTT timeline an implementation detail, move the timeline from the GTT into the ring itself (per-context in logical-ring-contexts/execlists, or a global per-engine timeline for the shared ringbuffers in legacy submission. The two timelines are still assumed to be equivalent at the moment (no migrating requests between engines yet) and so we can simply move from one to the other without adding extra ordering. v2: Reinforce that one isn't allowed to mix the engine execution timeline with the client timeline from userspace (on the ring). Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180502163839.3248-1-chris@chris-wilson.co.uk
-
- 30 4月, 2018 1 次提交
-
-
由 Chris Wilson 提交于
Make life easier in upcoming patches by moving the context_pin and context_unpin vfuncs into inline helpers. v2: Fixup mock_engine to mark the context as pinned on use. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180430131503.5375-2-chris@chris-wilson.co.uk
-
- 19 4月, 2018 1 次提交
-
-
由 Chris Wilson 提交于
Today we only want to pass along the priority to engine->schedule(), but in the future we want to have much more control over the various aspects of the GPU during a context's execution, for example controlling the frequency allowed. As we need an ever growing number of parameters for scheduling, move those into a struct for convenience. v2: Move the anonymous struct into its own function for legibility and ye olde gcc. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180418184052.7129-3-chris@chris-wilson.co.uk
-
- 12 4月, 2018 1 次提交
-
-
由 Oscar Mateo 提交于
There are different kind of workarounds (those that modify registers that live in the context image, those that modify global registers, those that whitelist registers, etc...) and they have different requirements in terms of where they are applied and how. Also, by splitting them apart, it should be easier to decide where a new workaround should go. v2: - Add multiple MISSING_CASE - Rebased v3: - Rename mmio_workarounds to gt_workarounds (Chris, Mika) - Create empty placeholders for BDW and CHV GT WAs - Rebased v4: Rebased v5: - Rebased - FORCE_TO_NONPRIV register exists since BDW, so make a path for it to achieve universality, even if empty (Chris) Signed-off-by: NOscar Mateo <oscar.mateo@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> [ickle: appease checkpatch] Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/1523376767-18480-2-git-send-email-oscar.mateo@intel.com
-
- 14 3月, 2018 1 次提交
-
-
由 Jackie Li 提交于
Hardware may have specific restrictions on GuC WOPCM offset and size. On Gen9, the value of the GuC WOPCM size register needs to be larger than the value of GuC WOPCM offset register + a Gen9 specific offset (144KB) for reserved GuC WOPCM. Fail to enforce such a restriction on GuC WOPCM size will lead to GuC firmware execution failures. On the other hand, with current static GuC WOPCM offset and size values (512KB for both offset and size), the GuC WOPCM size verification will fail on Gen9 even if it can be fixed by lowering the GuC WOPCM offset by calculating its value based on HuC firmware size (which is likely less than 200KB on Gen9), so that we can have a GuC WOPCM size value which is large enough to pass the GuC WOPCM size check. This patch updates the reserved GuC WOPCM size for RC6 context on Gen9 to 24KB to strictly align with the Gen9 GuC WOPCM layout. It also adds support to verify the GuC WOPCM size aganist the Gen9 hardware restrictions. To meet all above requirements, let's provide dynamic partitioning of the WOPCM that will be based on platform specific HuC/GuC firmware sizes. v2: - Removed intel_wopcm_init (Ville/Sagar/Joonas) - Renamed and Moved the intel_wopcm_partition into intel_guc (Sagar) - Removed unnecessary function calls (Joonas) - Init GuC WOPCM partition as soon as firmware fetching is completed v3: - Fixed indentation issues (Chris) - Removed layering violation code (Chris/Michal) - Created separat files for GuC wopcm code (Michal) - Used inline function to avoid code duplication (Michal) v4: - Preset the GuC WOPCM top during early GuC init (Chris) - Fail intel_uc_init_hw() as soon as GuC WOPCM partitioning failed v5: - Moved GuC DMA WOPCM register updating code into intel_wopcm.c - Took care of the locking status before writing to GuC DMA Write-Once registers. (Joonas) v6: - Made sure the GuC WOPCM size to be multiple of 4K (4K aligned) v8: - Updated comments and fixed naming issues (Sagar/Joonas) - Updated commit message to include more description about the hardware restriction on GuC WOPCM size (Sagar) v9: - Minor changes variable names and code comments (Sagar) - Added detailed GuC WOPCM layout drawing (Sagar/Michal) - Refined macro definitions to be reader friendly (Michal) - Removed redundent check to valid flag (Michal) - Unified first parameter for exported GuC WOPCM functions (Michal) - Refined the name and parameter list of hardware restriction checking functions (Michal) v10: - Used shorter function name for internal functions (Joonas) - Moved init-ealry function into c file (Joonas) - Consolidated and removed redundant size checks (Joonas/Michal) - Removed unnecessary unlikely() from code which is only called once during boot (Joonas) - More fixes to kernel-doc format and content (Michal) - Avoided the use of PAGE_MASK for 4K pages (Michal) - Added error log messages to error paths (Michal) v11: - Replaced intel_guc_wopcm with more generic intel_wopcm and attached intel_wopcm to drm_i915_private instead intel_guc (Michal) - dynamic calculation of GuC non-wopcm memory start (a.k.a WOPCM Top offset from GuC WOPCM base) (Michal) - Moved WOPCM marco definitions into .c source file (Michal) - Exported WOPCM layout diagram as kernel-doc (Michal) v12: - Updated naming, function kernel-doc to align with new changes (Michal) v13: - Updated the ordering of s-o-b/cc/r-b tags (Sagar) - Corrected one tense error in comment (Sagar) - Corrected typos and removed spurious comments (Joonas) Bspec: 12690 Signed-off-by: NJackie Li <yaodong.li@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com> Cc: Sujaritha Sundaresan <sujaritha.sundaresan@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: John Spotswood <john.a.spotswood@intel.com> Cc: Oscar Mateo <oscar.mateo@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Sagar Arun Kamble <sagar.a.kamble@intel.com> (v8) Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (v9) Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> (v11) Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (v12) Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/1520987574-19351-2-git-send-email-yaodong.li@intel.com
-
- 07 3月, 2018 1 次提交
-
-
由 Daniele Ceraolo Spurio 提交于
Starting from Gen11 the context descriptor format has been updated in the HW. The hw_id field has been considerably reduced in size and engine class and instance fields have been added. There is a slight name clashing issue because the field that we call hw_id is actually called SW Context ID in the specs for Gen11+. With the current size of the hw_id field we can have a maximum of 2k contexts at any time, but we could use the sw_counter field (which is sw defined) to increase that because the HW requirement is that engine_id + sw id + sw_counter is a unique number. GuC uses a similar method to support more contexts but does its tracking at lrc level. To avoid doing an implementation that will need to be reworked once GuC support lands, defer it for now and mark it as TODO. v2: rebased, add documentation, fix GEN11_ENGINE_INSTANCE_SHIFT v3: rebased, bring back lost code from i915_gem_context.c v4: make TODO comment more generic v5: be consistent with bit ordering, add extra checks (Chris) Cc: Oscar Mateo <oscar.mateo@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: NOscar Mateo <oscar.mateo@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180302161501.28594-3-mika.kuoppala@linux.intel.comSigned-off-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
-
- 22 2月, 2018 1 次提交
-
-
由 Chris Wilson 提交于
We want to de-emphasize the link between the request (dependency, execution and fence tracking) from GEM and so rename the struct from drm_i915_gem_request to i915_request. That is we may implement the GEM user interface on top of requests, but they are an abstraction for tracking execution rather than an implementation detail of GEM. (Since they are not tied to HW, we keep the i915 prefix as opposed to intel.) In short, the spatch: @@ @@ - struct drm_i915_gem_request + struct i915_request A corollary to contracting the type name, we also harmonise on using 'rq' shorthand for local variables where space if of the essence and repetition makes 'request' unwieldy. For globals and struct members, 'request' is still much preferred for its clarity. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180221095636.6649-1-chris@chris-wilson.co.ukReviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: NMichał Winiarski <michal.winiarski@intel.com> Acked-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
- 13 2月, 2018 1 次提交
-
-
由 Chris Wilson 提交于
Userspace provides a 64b value for the priority, we need to be careful to preserve the full range before validation to prevent truncation (and letting an illegal value pass). Reported-by: NAntonio Argenziano <antonio.argenziano@intel.com> Fixes: ac14fbd4 ("drm/i915/scheduler: Support user-defined priorities") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Antonio Argenziano <antonio.argenziano@intel.com> Cc: Michal Winiarski <michal.winiarski@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180208085151.11480-1-chris@chris-wilson.co.ukReviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> (cherry picked from commit 11a18f63) Signed-off-by: NRodrigo Vivi <rodrigo.vivi@intel.com>
-
- 08 2月, 2018 4 次提交
-
-
由 Chris Wilson 提交于
The comment is very old and quite misleading now. drivers/gpu/drm/i915/i915_gem_context.c:349: warning: No description found for parameter 'dev_priv' drivers/gpu/drm/i915/i915_gem_context.c:349: warning: No description found for parameter 'file_priv' Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180208111559.32663-1-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Userspace provides a 64b value for the priority, we need to be careful to preserve the full range before validation to prevent truncation (and letting an illegal value pass). Reported-by: NAntonio Argenziano <antonio.argenziano@intel.com> Fixes: ac14fbd4 ("drm/i915/scheduler: Support user-defined priorities") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Antonio Argenziano <antonio.argenziano@intel.com> Cc: Michal Winiarski <michal.winiarski@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180208085151.11480-1-chris@chris-wilson.co.ukReviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
由 Chris Wilson 提交于
If we remove some hardcoded assumptions about the preempt context having a fixed id, reserved from use by normal user contexts, we may only allocate the i915_gem_context when required. Then the subsequent decisions on using preemption reduce to having the preempt context available. v2: Include an assert that we don't allocate the preempt context twice. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Michal Winiarski <michal.winiarski@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Acked-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180207210544.26351-3-chris@chris-wilson.co.ukReviewed-by: NMichel Thierry <michel.thierry@intel.com>
-
由 Chris Wilson 提交于
Rather than having the high level ioctl interface guess the underlying implementation details, having the implementation declare what capabilities it exports. We define an intel_driver_caps, similar to the intel_device_info, which instead of trying to describe the HW gives details on what the driver itself supports. This is then populated by the engine backend for the new scheduler capability field for use elsewhere. v2: Use caps.scheduler for validating CONTEXT_PARAM_SET_PRIORITY (Mika) One less assumption of engine[RCS] \o/ Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tomasz Lis <tomasz.lis@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: NTomasz Lis <tomasz.lis@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180207210544.26351-2-chris@chris-wilson.co.ukReviewed-by: NMichel Thierry <michel.thierry@intel.com>
-
- 13 12月, 2017 1 次提交
-
-
由 Chris Wilson 提交于
If a fence allocation fails in a blocking context, we will sleep on the fence as a last resort. We can therefore allow ourselves to fail and sleep on the fence instead of triggering a system-wide oom. This allows us to throttle malicious clients that are consuming lots of system resources by capping the amount of memory used by fences. Testcase: igt/gem_shrink/execbufX Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171212180652.22061-2-chris@chris-wilson.co.uk
-
- 06 12月, 2017 1 次提交
-
-
由 Michal Wajdeczko 提交于
In the upcoming patch we will change the way how to recognize when GuC is in use. Using helper macros will minimize scope of that changes. While here, update dev_info message. Signed-off-by: NMichal Wajdeczko <michal.wajdeczko@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171206135316.32556-3-michal.wajdeczko@intel.com
-
- 28 11月, 2017 1 次提交
-
-
由 Chris Wilson 提交于
Even though all rendering should have been flushed at the end of the previous requests, add an extra flush after switching to the kernel_context. As the switch to the kernel_context is used when idling the gpu (e.g. suspend), having an extra layer of paranoia to ensure everything is flushed to memory seems sensible. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171126214856.23702-1-chris@chris-wilson.co.ukReviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
- 24 11月, 2017 2 次提交
-
-
由 Chris Wilson 提交于
The legacy i915_switch_context() is only applicable to the legacy ringbuffer submission method, so move it from the general i915_gem_context.c to intel_ringbuffer.c (rename pending!). Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171123152631.31385-2-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
The legacy context switch for ringbuffer submission is multistaged, where each of those stages may fail. However, we were updating global state after some stages, and so we had to force the incomplete request to be submitted because we could not unwind. Save the global state before performing the switches, and so enable us to unwind back to the previous global state should any phase fail. We then must cancel the request instead of submitting it should the construction fail. v2: s/saved_ctx/from_ctx/; s/ctx/to_ctx/ etc. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171123152631.31385-1-chris@chris-wilson.co.uk
-
- 21 11月, 2017 3 次提交
-
-
由 Chris Wilson 提交于
Having disabled the broken semaphores on Sandybridge, there is no need for a modparam any more, so remove it in favour of a simple HAS_LEGACY_SEMAPHORES() guard. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: NMaarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171120205504.21892-5-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Since removing the module parameter to force selection of ringbuffer emission for gen8, the code is defunct. Remove it. To put the difference into perspective, a couple of microbenchmarks (bdw i7-5557u, 20170324): ring execlists exec continuous nops on all rings: 1.491us 2.223us exec sequential nops on each ring: 12.508us 53.682us single nop + sync: 9.272us 30.291us vblank_mode=0 glxgears: ~11000fps ~9000fps Since the earlier submission, gen8 ringbuffer submission has fallen further and further behind in features. So while ringbuffer may hold the throughput crown, in terms of interactive latency, execlists is much better. Alas, we have no convenient metrics for such, other than demonstrating things we can do with execlists but can not using legacy ringbuffer submission. We have made a few improvements to lowlevel execlists throughput, and ringbuffer currently panics on boot! (bdw i7-5557u, 20171026): ring execlists exec continuous nops on all rings: n/a 1.921us exec sequential nops on each ring: n/a 44.621us single nop + sync: n/a 21.953us vblank_mode=0 glxgears: n/a ~18500fps References: https://bugs.freedesktop.org/show_bug.cgi?id=87725Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Once-upon-a-time-Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171120205504.21892-2-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Execlists and legacy ringbuffer submission are no longer feature comparable (execlists now offer greater functionality that should overcome their performance hit) and obsoletes the unsafe module parameter, i.e. comparing the two modes of execution is no longer useful, so remove the debug tool. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> #i915_perf.c Link: https://patchwork.freedesktop.org/patch/msgid/20171120205504.21892-1-chris@chris-wilson.co.uk
-
- 20 11月, 2017 1 次提交
-
-
由 Chris Wilson 提交于
During request construction, after pinning the context we know whether or not we have to emit a context switch. So move this common operation from every caller into i915_gem_request_alloc() itself. v2: Always submit the request if we emitted some commands during request construction, as typically it also involves changes in global state. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171120102002.22254-2-chris@chris-wilson.co.uk
-
- 11 11月, 2017 1 次提交
-
-
由 Chris Wilson 提交于
Take a copy of the HW state after a reset upon module loading by executing a context switch from a blank context to the kernel context, thus saving the default hw state over the blank context image. We can then use the default hw state to initialise any future context, ensuring that each starts with the default view of hw state. v2: Unmap our default state from the GTT after stealing it from the context. This should stop us from accidentally overwriting it via the GTT (and frees up some precious GTT space). Testcase: igt/gem_ctx_isolation Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171110142634.10551-7-chris@chris-wilson.co.uk
-
- 09 11月, 2017 2 次提交
-
-
由 Chris Wilson 提交于
When we close the VMA, we unbind it from the ppgtt and tear down the page directory pointing at it. That may trigger us to return WC pages back to the system, requiring conversion back to WB which itself may sleep. That makes i915_vma_close() unsuitable for use inside the RCU read lock, which we need to hold to iterate the radixtree. The fix is quite simple, we can close all the VMA as we close the ppgtt, we only need to do that instead of closing them during destruction of the LUT. v2: Order between closing the LUT and the ppgtt is important; we use the vma inside the LUT as a means of retrieving the object, and so we must clear the LUT before freeing the VMA when closing the ppgtt. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103638 Fixes: 547da76b ("drm/i915: Hold rcu_read_lock when iterating over the radixtree (vma idr)") Fixes: d1b48c1e ("drm/i915: Replace execbuf vma ht with an idr") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Matthew Auld <matthew.william.auld@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171109085540.32264-1-chris@chris-wilson.co.ukReviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> (cherry picked from commit 94dec871) Signed-off-by: NJani Nikula <jani.nikula@intel.com>
-
由 Chris Wilson 提交于
When we close the VMA, we unbind it from the ppgtt and tear down the page directory pointing at it. That may trigger us to return WC pages back to the system, requiring conversion back to WB which itself may sleep. That makes i915_vma_close() unsuitable for use inside the RCU read lock, which we need to hold to iterate the radixtree. The fix is quite simple, we can close all the VMA as we close the ppgtt, we only need to do that instead of closing them during destruction of the LUT. v2: Order between closing the LUT and the ppgtt is important; we use the vma inside the LUT as a means of retrieving the object, and so we must clear the LUT before freeing the VMA when closing the ppgtt. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103638 Fixes: 547da76b ("drm/i915: Hold rcu_read_lock when iterating over the radixtree (vma idr)") Fixes: d1b48c1e ("drm/i915: Replace execbuf vma ht with an idr") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Matthew Auld <matthew.william.auld@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171109085540.32264-1-chris@chris-wilson.co.ukReviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
-
- 31 10月, 2017 1 次提交
-
-
由 Chris Wilson 提交于
Kasan spotted [IGT] gem_tiled_pread_pwrite: exiting, ret=0 ================================================================== BUG: KASAN: use-after-free in __i915_gem_object_reset_page_iter+0x15c/0x170 [i915] Read of size 8 at addr ffff8801359da310 by task kworker/3:2/182 CPU: 3 PID: 182 Comm: kworker/3:2 Tainted: G U 4.14.0-rc6-CI-Custom_3340+ #1 Hardware name: Intel Corp. Geminilake/GLK RVP1 DDR4 (05), BIOS GELKRVPA.X64.0062.B30.1708222146 08/22/2017 Workqueue: events __i915_gem_free_work [i915] Call Trace: dump_stack+0x68/0xa0 print_address_description+0x78/0x290 ? __i915_gem_object_reset_page_iter+0x15c/0x170 [i915] kasan_report+0x23d/0x350 __asan_report_load8_noabort+0x19/0x20 __i915_gem_object_reset_page_iter+0x15c/0x170 [i915] ? i915_gem_object_truncate+0x100/0x100 [i915] ? lock_acquire+0x380/0x380 __i915_gem_object_put_pages+0x30d/0x530 [i915] __i915_gem_free_objects+0x551/0xbd0 [i915] ? lock_acquire+0x13e/0x380 __i915_gem_free_work+0x4e/0x70 [i915] process_one_work+0x6f6/0x1590 ? pwq_dec_nr_in_flight+0x2b0/0x2b0 worker_thread+0xe6/0xe90 ? pci_mmcfg_check_reserved+0x110/0x110 kthread+0x309/0x410 ? process_one_work+0x1590/0x1590 ? kthread_create_on_node+0xb0/0xb0 ret_from_fork+0x27/0x40 Allocated by task 1801: save_stack_trace+0x1b/0x20 kasan_kmalloc+0xee/0x190 kasan_slab_alloc+0x12/0x20 kmem_cache_alloc+0xdc/0x2e0 radix_tree_node_alloc.constprop.12+0x48/0x330 __radix_tree_create+0x274/0x480 __radix_tree_insert+0xa2/0x610 i915_gem_object_get_sg+0x224/0x670 [i915] i915_gem_object_get_page+0xb5/0x1c0 [i915] i915_gem_pread_ioctl+0x822/0xf60 [i915] drm_ioctl_kernel+0x13f/0x1c0 drm_ioctl+0x6cf/0x980 do_vfs_ioctl+0x184/0xf30 SyS_ioctl+0x41/0x70 entry_SYSCALL_64_fastpath+0x1c/0xb1 Freed by task 37: save_stack_trace+0x1b/0x20 kasan_slab_free+0xaf/0x190 kmem_cache_free+0xbf/0x340 radix_tree_node_rcu_free+0x79/0x90 rcu_process_callbacks+0x46d/0xf40 __do_softirq+0x21c/0x8d3 The buggy address belongs to the object at ffff8801359da0f0 which belongs to the cache radix_tree_node of size 576 The buggy address is located 544 bytes inside of 576-byte region [ffff8801359da0f0, ffff8801359da330) The buggy address belongs to the page: page:ffffea0004d67600 count:1 mapcount:0 mapping: (null) index:0x0 compound_mapcount: 0 flags: 0x8000000000008100(slab|head) raw: 8000000000008100 0000000000000000 0000000000000000 0000000100110011 raw: ffffea0004b52920 ffffea0004b38020 ffff88015b416a80 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff8801359da200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff8801359da280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >ffff8801359da300: fb fb fb fb fb fb fc fc fc fc fc fc fc fc fc fc ^ ffff8801359da380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff8801359da400: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ================================================================== Disabling lock debugging due to kernel taint which looks like the slab containing the radixtree iter was freed as we traversed the tree, taking the rcu read lock across the loop should prevent that (deferring all the frees until the end). Reported-by: NTomi Sarvela <tomi.p.sarvela@intel.com> Fixes: d1b48c1e ("drm/i915: Replace execbuf vma ht with an idr") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171026130032.10677-2-chris@chris-wilson.co.ukReviewed-by: NMatthew Auld <matthew.william.auld@gmail.com> (cherry picked from commit 547da76b) Signed-off-by: NRodrigo Vivi <rodrigo.vivi@intel.com>
-
- 27 10月, 2017 1 次提交
-
-
由 Chris Wilson 提交于
Kasan spotted [IGT] gem_tiled_pread_pwrite: exiting, ret=0 ================================================================== BUG: KASAN: use-after-free in __i915_gem_object_reset_page_iter+0x15c/0x170 [i915] Read of size 8 at addr ffff8801359da310 by task kworker/3:2/182 CPU: 3 PID: 182 Comm: kworker/3:2 Tainted: G U 4.14.0-rc6-CI-Custom_3340+ #1 Hardware name: Intel Corp. Geminilake/GLK RVP1 DDR4 (05), BIOS GELKRVPA.X64.0062.B30.1708222146 08/22/2017 Workqueue: events __i915_gem_free_work [i915] Call Trace: dump_stack+0x68/0xa0 print_address_description+0x78/0x290 ? __i915_gem_object_reset_page_iter+0x15c/0x170 [i915] kasan_report+0x23d/0x350 __asan_report_load8_noabort+0x19/0x20 __i915_gem_object_reset_page_iter+0x15c/0x170 [i915] ? i915_gem_object_truncate+0x100/0x100 [i915] ? lock_acquire+0x380/0x380 __i915_gem_object_put_pages+0x30d/0x530 [i915] __i915_gem_free_objects+0x551/0xbd0 [i915] ? lock_acquire+0x13e/0x380 __i915_gem_free_work+0x4e/0x70 [i915] process_one_work+0x6f6/0x1590 ? pwq_dec_nr_in_flight+0x2b0/0x2b0 worker_thread+0xe6/0xe90 ? pci_mmcfg_check_reserved+0x110/0x110 kthread+0x309/0x410 ? process_one_work+0x1590/0x1590 ? kthread_create_on_node+0xb0/0xb0 ret_from_fork+0x27/0x40 Allocated by task 1801: save_stack_trace+0x1b/0x20 kasan_kmalloc+0xee/0x190 kasan_slab_alloc+0x12/0x20 kmem_cache_alloc+0xdc/0x2e0 radix_tree_node_alloc.constprop.12+0x48/0x330 __radix_tree_create+0x274/0x480 __radix_tree_insert+0xa2/0x610 i915_gem_object_get_sg+0x224/0x670 [i915] i915_gem_object_get_page+0xb5/0x1c0 [i915] i915_gem_pread_ioctl+0x822/0xf60 [i915] drm_ioctl_kernel+0x13f/0x1c0 drm_ioctl+0x6cf/0x980 do_vfs_ioctl+0x184/0xf30 SyS_ioctl+0x41/0x70 entry_SYSCALL_64_fastpath+0x1c/0xb1 Freed by task 37: save_stack_trace+0x1b/0x20 kasan_slab_free+0xaf/0x190 kmem_cache_free+0xbf/0x340 radix_tree_node_rcu_free+0x79/0x90 rcu_process_callbacks+0x46d/0xf40 __do_softirq+0x21c/0x8d3 The buggy address belongs to the object at ffff8801359da0f0 which belongs to the cache radix_tree_node of size 576 The buggy address is located 544 bytes inside of 576-byte region [ffff8801359da0f0, ffff8801359da330) The buggy address belongs to the page: page:ffffea0004d67600 count:1 mapcount:0 mapping: (null) index:0x0 compound_mapcount: 0 flags: 0x8000000000008100(slab|head) raw: 8000000000008100 0000000000000000 0000000000000000 0000000100110011 raw: ffffea0004b52920 ffffea0004b38020 ffff88015b416a80 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff8801359da200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff8801359da280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >ffff8801359da300: fb fb fb fb fb fb fc fc fc fc fc fc fc fc fc fc ^ ffff8801359da380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff8801359da400: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ================================================================== Disabling lock debugging due to kernel taint which looks like the slab containing the radixtree iter was freed as we traversed the tree, taking the rcu read lock across the loop should prevent that (deferring all the frees until the end). Reported-by: NTomi Sarvela <tomi.p.sarvela@intel.com> Fixes: d1b48c1e ("drm/i915: Replace execbuf vma ht with an idr") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171026130032.10677-2-chris@chris-wilson.co.ukReviewed-by: NMatthew Auld <matthew.william.auld@gmail.com>
-
- 25 10月, 2017 1 次提交
-
-
由 Chris Wilson 提交于
During evict, we wish to idle the GPU if we see that the GGTT is full. However, our test for idle in i915_gem_evict_something() and in i915_gem_switch_to_kernel_context() do not match leading to disappointment - we never believe that we are idle and keep trying to flush the GGTT ad infinitum. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103438Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171024220855.30155-2-chris@chris-wilson.co.ukReviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
- 05 10月, 2017 2 次提交
-
-
由 Chris Wilson 提交于
Use a priority stored in the context as the initial value when submitting a request. This allows us to change the default priority on a per-context basis, allowing different contexts to be favoured with GPU time at the expense of lower importance work. The user can adjust the context's priority via I915_CONTEXT_PARAM_PRIORITY, with more positive values being higher priority (they will be serviced earlier, after their dependencies have been resolved). Any prerequisite work for an execbuf will have its priority raised to match the new request as required. Normal users can specify any value in the range of -1023 to 0 [default], i.e. they can reduce the priority of their workloads (and temporarily boost it back to normal if so desired). Privileged users can specify any value in the range of -1023 to 1023, [default is 0], i.e. they can raise their priority above all overs and so potentially starve the system. Note that the existing schedulers are not fair, nor load balancing, the execution is strictly by priority on a first-come, first-served basis, and the driver may choose to boost some requests above the range available to users. This priority was originally based around nice(2), but evolved to allow clients to adjust their priority within a small range, and allow for a privileged high priority range. For example, this can be used to implement EGL_IMG_context_priority https://www.khronos.org/registry/egl/extensions/IMG/EGL_IMG_context_priority.txt EGL_CONTEXT_PRIORITY_LEVEL_IMG determines the priority level of the context to be created. This attribute is a hint, as an implementation may not support multiple contexts at some priority levels and system policy may limit access to high priority contexts to appropriate system privilege level. The default value for EGL_CONTEXT_PRIORITY_LEVEL_IMG is EGL_CONTEXT_PRIORITY_MEDIUM_IMG." so we can map PRIORITY_HIGH -> 1023 [privileged, will failback to 0] PRIORITY_MED -> 0 [default] PRIORITY_LOW -> -1023 They also map onto the priorities used by VkQueue (and a VkQueue is essentially a timeline, our i915_gem_context under full-ppgtt). v2: s/CAP_SYS_ADMIN/CAP_SYS_NICE/ v3: Report min/max user priorities as defines in the uapi, and rebase internal priorities on the exposed values. Testcase: igt/gem_exec_schedule Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171003203453.15692-9-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Add another perma-pinned context for using for preemption at any time. We cannot just reuse the existing kernel context, as first and foremost we need to ensure that we can preempt the kernel context itself, so require a distinct context id. Similar to the kernel context, we may want to interrupt execution and switch to the preempt context at any time, and so it needs to be permanently pinned and available. To compensate for yet another permanent allocation, we shrink the existing context and the new context by reducing their ringbuffer to the minimum. v2: Assert that we never allocate a request from the preemption context. v3: Limit perma-pin to engines that may preempt. v4: Onion cleanup for early driver death v5: Onion ordering in main driver cleanup as well. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMichał Winiarski <michal.winiarski@intel.com> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171003203453.15692-4-chris@chris-wilson.co.uk
-
- 22 9月, 2017 1 次提交
-
-
由 Michal Wajdeczko 提交于
Our global struct with params is named exactly the same way as new preferred name for the drm_i915_private function parameter. To avoid such name reuse lets use different name for the global. v5: pure rename v6: fix Credits-to: Coccinelle @@ identifier n; @@ ( - i915.n + i915_modparams.n ) Signed-off-by: NMichal Wajdeczko <michal.wajdeczko@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Ville Syrjala <ville.syrjala@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Acked-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: NJani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170919193846.38060-1-michal.wajdeczko@intel.com
-
- 18 8月, 2017 1 次提交
-
-
由 Chris Wilson 提交于
This was the competing idea long ago, but it was only with the rewrite of the idr as an radixtree and using the radixtree directly ourselves, along with the realisation that we can store the vma directly in the radixtree and only need a list for the reverse mapping, that made the patch performant enough to displace using a hashtable. Though the vma ht is fast and doesn't require any extra allocation (as we can embed the node inside the vma), it does require a thread for resizing and serialization and will have the occasional slow lookup. That is hairy enough to investigate alternatives and favour them if equivalent in peak performance. One advantage of allocating an indirection entry is that we can support a single shared bo between many clients, something that was done on a first-come first-serve basis for shared GGTT vma previously. To offset the extra allocations, we create yet another kmem_cache for them. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170816085210.4199-5-chris@chris-wilson.co.uk
-
- 15 8月, 2017 1 次提交
-
-
由 Chris Wilson 提交于
When switching between contexts using the aliasing_ppgtt, the VM is shared. We don't need to reload the PD registers unless they are dirty. Martin Peres reported an issue that looks like corruption between Haswell context switches, bisecting to commit f9326be5 ("drm/i915: Rearrange switch_context to load the aliasing ppgtt on first use"). Switching between the same mm (the aliasing_ppgtt is used for all contexts in this case) should be a nop, but appears to trigger some side-effects in the context switch. However, as we know the switch is redundant in this case, we can skip it and continue to ignore the issue until somebody feels strong enough to investigate full-ppgtt on gen7 again! Except.. Martin was using full-ppgtt which is not supported as it doesn't work correctly yet. So whilst the bisect did yield valuable information about the failures, the fix should not have any user impact under default settings, with the exception of a slightly lower throughput on xcs as the VM would always be reloaded. v2: Also remember to set the legacy_active_context following the switch on xcs (commit e8a9c58f ("drm/i915: Unify active context tracking between legacy/execlists/guc")) Fixes: f9326be5 ("drm/i915: Rearrange switch_context to load the aliasing ppgtt on first use") Fixes: e8a9c58f ("drm/i915: Unify active context tracking between legacy/execlists/guc") Reported-by: NMartin Peres <martin.peres@linux.intel.com> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Martin Peres <martin.peres@linux.intel.com> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170812152724.6883-1-chris@chris-wilson.co.uk (cherry picked from commit 12124bea) Signed-off-by: NJani Nikula <jani.nikula@intel.com>
-
- 12 8月, 2017 1 次提交
-
-
由 Chris Wilson 提交于
When switching between contexts using the aliasing_ppgtt, the VM is shared. We don't need to reload the PD registers unless they are dirty. Martin Peres reported an issue that looks like corruption between Haswell context switches, bisecting to commit f9326be5 ("drm/i915: Rearrange switch_context to load the aliasing ppgtt on first use"). Switching between the same mm (the aliasing_ppgtt is used for all contexts in this case) should be a nop, but appears to trigger some side-effects in the context switch. However, as we know the switch is redundant in this case, we can skip it and continue to ignore the issue until somebody feels strong enough to investigate full-ppgtt on gen7 again! Except.. Martin was using full-ppgtt which is not supported as it doesn't work correctly yet. So whilst the bisect did yield valuable information about the failures, the fix should not have any user impact under default settings, with the exception of a slightly lower throughput on xcs as the VM would always be reloaded. v2: Also remember to set the legacy_active_context following the switch on xcs (commit e8a9c58f ("drm/i915: Unify active context tracking between legacy/execlists/guc")) Fixes: f9326be5 ("drm/i915: Rearrange switch_context to load the aliasing ppgtt on first use") Fixes: e8a9c58f ("drm/i915: Unify active context tracking between legacy/execlists/guc") Reported-by: NMartin Peres <martin.peres@linux.intel.com> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Martin Peres <martin.peres@linux.intel.com> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170812152724.6883-1-chris@chris-wilson.co.uk
-
- 27 7月, 2017 1 次提交
-
-
由 Chris Wilson 提交于
Since we make call i915_gem_context_mark_guilty() concurrently when resetting different engines in parallel, we need to make sure that our updates are safe for the unlocked access. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Michel Thierry <michel.thierry@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170721123238.16428-12-chris@chris-wilson.co.ukSigned-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-