- 25 1月, 2020 3 次提交
-
-
由 Michal Wajdeczko 提交于
We should never BUG_ON on any corruption in CTB descriptor as data there can be also modified by the GuC. Instead we can use flag "is_in_error" to indicate that we will not process any further messages over this CTB (until reset). Signed-off-by: NMichal Wajdeczko <michal.wajdeczko@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200120191817.50164-1-michal.wajdeczko@intel.com
-
由 Chris Wilson 提交于
Using a clear page for scratch means that we have relatively benign errors in case it is accidentally used, but that can be rather too benign for debugging. If we poison the scratch, ideally it quickly results in an obvious error. v2: Set each page individually just in case we are using highmem for our scratch page. v3: Pick a new scratch register as MI_STORE_REGISTER_MEM does not work with GPR0 on gen7, unbelievably. v4: Haswell still considers 3DPRIM a privileged register! Suggested-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Matthew Auld <matthew.william.auld@gmail.com> Reviewed-by: NMatthew Auld <matthew.william.auld@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200124115133.53360-1-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Due to the asynchronous nature of releasing our wakerefs, we can signal the main GT wakeref as complete before the individual engines have cleared their own wakerefs. During shutdown we assert that the engines are indeed parked before we release them, but for this to be always true we need to flush their workers as well. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMatthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200124143339.140988-1-chris@chris-wilson.co.uk
-
- 24 1月, 2020 1 次提交
-
-
由 Chris Wilson 提交于
Include the current RC6 residency counter in the error message, so that if we fail to park and manually enter RC6 we can see if the counter has a particularly suspect value (such as 0). Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Andi Shyti <andi.shyti@intel.com> Reviewed-by: NAndi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200123145755.1420622-1-chris@chris-wilson.co.uk
-
- 23 1月, 2020 3 次提交
-
-
由 Chris Wilson 提交于
If we encounter a hang on a virtual engine, as we process the hang the request may already have been moved back to the virtual engine (we are processing the hang on the physical engine). We need to reclaim the request from the virtual engine so that the locking is consistent and local to the real engine on which we will hold the request for error state capturing. v2: Pull the reclamation into execlists_hold() and assert that cannot be called from outside of the reset (i.e. with the tasklet disabled). v3: Added selftest v4: Drop the reference owned by the virtual engine Fixes: 74831738 ("drm/i915/execlists: Offline error capture") Testcase: igt/gem_exec_balancer/hang Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200122140243.495621-2-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Thanks to preempt-to-busy, we leave the request on the HW as we submit the preemption request. This means that the request may complete at any moment as we process HW events, and in particular the request may be retired as we are planning to capture it for a preemption timeout. Be more careful while obtaining the request to capture after a preemption timeout, and check to see if it completed before we were able to put it on the on-hold list. If we do see it did complete just before we capture the request, proclaim the preemption-timeout a false positive and pardon the reset as we should hit an arbitration point momentarily and so be able to process the preemption. Note that even after we move the request to be on hold it may be retired (as the reset to stop the HW comes after), so we do require to hold our own reference as we work on the request for capture (and all of the peeking at state within the request needs to be carefully protected). Fixes: 32ff621f ("drm/i915/gt: Allow temporary suspension of inflight requests") Closes: https://gitlab.freedesktop.org/drm/intel/issues/997Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200122140243.495621-1-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
We have two trace messages that rely on the function name for distinction. However, if gcc inlines the function, the two traces end up with the same function name and are indistinguishable. Add a different message to each to clarify which one we hit, i.e. which phase of engine parking we are processing. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200122124154.483444-1-chris@chris-wilson.co.uk
-
- 22 1月, 2020 1 次提交
-
-
由 Pankaj Bharadiya 提交于
drm specific WARN* calls include device information in the backtrace, so we know what device the warnings originate from. Covert all the calls of WARN* with device specific drm_WARN* variants in functions where drm_i915_private struct pointer is readily available. The conversion was done automatically with below coccinelle semantic patch. checkpatch errors/warnings are fixed manually. @rule1@ identifier func, T; @@ func(...) { ... struct drm_i915_private *T = ...; <+... ( -WARN( +drm_WARN(&T->drm, ...) | -WARN_ON( +drm_WARN_ON(&T->drm, ...) | -WARN_ONCE( +drm_WARN_ONCE(&T->drm, ...) | -WARN_ON_ONCE( +drm_WARN_ON_ONCE(&T->drm, ...) ) ...+> } @rule2@ identifier func, T; @@ func(struct drm_i915_private *T,...) { <+... ( -WARN( +drm_WARN(&T->drm, ...) | -WARN_ON( +drm_WARN_ON(&T->drm, ...) | -WARN_ONCE( +drm_WARN_ONCE(&T->drm, ...) | -WARN_ON_ONCE( +drm_WARN_ON_ONCE(&T->drm, ...) ) ...+> } command: spatch --sp-file <script> --dir drivers/gpu/drm/i915/gt \ --linux-spacing --in-place Signed-off-by: NPankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com> Signed-off-by: NJani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200115034455.17658-7-pankaj.laxminarayan.bharadiya@intel.com
-
- 20 1月, 2020 1 次提交
-
-
由 Tvrtko Ursulin 提交于
In our ABI we have defined I915_ENGINE_CLASS_INVALID_NONE and I915_ENGINE_CLASS_INVALID_VIRTUAL as negative values which creates implicit coupling with type widths used in, also ABI, struct i915_engine_class_instance. One place where we export engine->uabi_class I915_ENGINE_CLASS_INVALID_VIRTUAL is from our our tracepoints. Because the type of the former is u8 in contrast to u16 defined in the ABI, 254 will be returned instead of 65534 which userspace would legitimately expect. Another place is I915_CONTEXT_PARAM_ENGINES. Therefore we need to align the type used to store engine ABI class and instance. v2: * Update the commit message mentioning get_engines and cc stable. (Chris) Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Fixes: 6d06779e ("drm/i915: Load balancing across a virtual engine") Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: <stable@vger.kernel.org> # v5.3+ Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200116134508.25211-1-tvrtko.ursulin@linux.intel.com
-
- 18 1月, 2020 3 次提交
-
-
由 Chris Wilson 提交于
Just in the very unlikely case we have not stopped the GPU before we return the pages being used by the GPU to the system, force a reset. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMatthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200117180309.3249427-1-chris@chris-wilson.co.uk
-
由 Matthew Auld 提交于
If we create a rather large userptr object(e.g 1ULL << 32) we might shift past the type-width of num_pages: (int)num_pages << PAGE_SHIFT, resulting in a totally bogus sg_table, which fortunately will eventually manifest as: gen8_ppgtt_insert_huge:463 GEM_BUG_ON(iter->sg->length < page_size) kernel BUG at drivers/gpu/drm/i915/gt/gen8_ppgtt.c:463! v2: more unsigned long prefer I915_GTT_PAGE_SIZE Fixes: 5cc9ed4b ("drm/i915: Introduce mapping of user pages into video memory (userptr) ioctl") Signed-off-by: NMatthew Auld <matthew.auld@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200117132413.1170563-2-matthew.auld@intel.com
-
由 Chris Wilson 提交于
Since commit 22b7a426 ("drm/i915/execlists: Preempt-to-busy"), we prune the engine->active.requests list prior to preemption, thus removing the trace of the currently executing request. If that request hangs rather than be preempted, we conclude that no active request was on the GPU. Fortunately, this only impacts our debugging, and not our means of hang detection or recovery. v2: Use from to check the current iterator before continuing, and report active as NULL if the current request is already completed. References: 22b7a426 ("drm/i915/execlists: Preempt-to-busy") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200117113259.3023890-1-chris@chris-wilson.co.uk
-
- 17 1月, 2020 8 次提交
-
-
由 Michal Wajdeczko 提交于
As we now have "ct" available almost in all functions we can start using dev variants of logs also for debug. Signed-off-by: NMichal Wajdeczko <michal.wajdeczko@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200117082039.65644-6-michal.wajdeczko@intel.com
-
由 Michal Wajdeczko 提交于
As we now have "ct" available in ct_read function we can switch from generic DRM_ERROR to our custom CT_ERROR. Signed-off-by: NMichal Wajdeczko <michal.wajdeczko@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: NPiotr Piórkowski <piotr.piorkowski@intel.com> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200117082039.65644-5-michal.wajdeczko@intel.com
-
由 Michal Wajdeczko 提交于
Since we only have one RECV buffer we don't need to explicitly pass it to the read function. Signed-off-by: NMichal Wajdeczko <michal.wajdeczko@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200117082039.65644-4-michal.wajdeczko@intel.com
-
由 Michal Wajdeczko 提交于
Since we only have one SEND buffer we don't need to explicitly pass it to the write function. Signed-off-by: NMichal Wajdeczko <michal.wajdeczko@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200117082039.65644-3-michal.wajdeczko@intel.com
-
由 Michal Wajdeczko 提交于
We should never BUG_ON on any corruption in CTB descriptor as data there can be also modified by the GuC. Instead we can use flag "is_in_error" to indicate that we will not process any further messages over this CTB (until reset). While here move descriptor error reporting to the function that actually touches that descriptor. Note that unexpected content of the specific CT messages, that still complies with generic CT message format, shall not trigger disabling whole CTB, as that might just indicate new unsupported message types. v2: drop redundant message (Daniele) Signed-off-by: NMichal Wajdeczko <michal.wajdeczko@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200117082039.65644-2-michal.wajdeczko@intel.com
-
由 Chris Wilson 提交于
Currently, we skip error capture upon forced preemption. We apply forced preemption when there is a higher priority request that should be running but is being blocked, and we skip inline error capture so that the preemption request is not further delayed by a user controlled capture -- extending the denial of service. However, preemption reset is also used for heartbeats and regular GPU hangs. By skipping the error capture, we remove the ability to debug GPU hangs. In order to capture the error without delaying the preemption request further, we can do an out-of-line capture by removing the guilty request from the execution queue and scheduling a worker to dump that request. When removing a request, we need to remove the entire context and all descendants from the execution queue, so that they do not jump past. Closes: https://gitlab.freedesktop.org/drm/intel/issues/738 Fixes: 3a7a92ab ("drm/i915/execlists: Force preemption") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200116184754.2860848-3-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
In order to support out-of-line error capture, we need to remove the active request from HW and put it to one side while a worker compresses and stores all the details associated with that request. (As that compression may take an arbitrary user-controlled amount of time, we want to let the engine continue running on other workloads while the hanging request is dumped.) Not only do we need to remove the active request, but we also have to remove its context and all requests that were dependent on it (both in flight, queued and future submission). Finally once the capture is complete, we need to be able to resubmit the request and its dependents and allow them to execute. v2: Replace stack recursion with a simple list. v3: Check all the parents, not just the first, when searching for a stuck ancestor! References: https://gitlab.freedesktop.org/drm/intel/issues/738Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200116184754.2860848-2-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
If we keep track of when the i915_request.sched.link is on the HW runlist, or in the priority queue we can simplify our interactions with the request (such as during rescheduling). This also simplifies the next patch where we introduce a new in-between list, for requests that are ready but neither on the run list or in the queue. v2: Update i915_sched_node.link explanation for current usage where it is a link on both the queue and on the runlists. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200116184754.2860848-1-chris@chris-wilson.co.uk
-
- 16 1月, 2020 2 次提交
-
-
由 Chris Wilson 提交于
Remove the double space that crept into the fmt stringification. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200116125749.2786743-1-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
We need to allow concurrent intel_context_unpin, which means avoiding doing destructive operations like intel_ring_reset(). This was already fixed for intel_ring_unpin() in commit 0725d9a3 ("drm/i915/gt: Make intel_ring_unpin() safe for concurrent pint"), but I overlooked that execlists_context_unpin() also made the same mistake. Reported-by: NMatthew Brost <matthew.brost@intel.com> Fixes: 84135022 ("drm/i915/gt: Drop mutex serialisation between context pin/unpin") References: 0725d9a3 ("drm/i915/gt: Make intel_ring_unpin() safe for concurrent pint") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200115175829.2761329-1-chris@chris-wilson.co.uk
-
- 15 1月, 2020 2 次提交
-
-
由 Chris Wilson 提交于
In converting over to using set_bit()/test_bit(), when manually inspecting the rq->fence.flags, we need to use BIT(). Fixes: e1c31fb5 ("drm/i915: Merge i915_request.flags with i915_request.fence.flags") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200115122509.2673075-1-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Add a i915_vma to the mock_engine/mock_ring so that the core code can always assume the presence of ring->vma. Fixes: 8ccfc20a ("drm/i915/gt: Mark ring->vma as active while pinned") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200114160030.2468927-1-chris@chris-wilson.co.uk
-
- 14 1月, 2020 6 次提交
-
-
由 Michal Wajdeczko 提交于
While we have function that returns "next fence" that can be used by new CT request, we internally store value of the last used fence. Signed-off-by: NMichal Wajdeczko <michal.wajdeczko@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200111231114.59208-5-michal.wajdeczko@intel.com
-
由 Michal Wajdeczko 提交于
Update GuC CTB action helpers to benefit from new CT_ERROR macro. Signed-off-by: NMichal Wajdeczko <michal.wajdeczko@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200111231114.59208-4-michal.wajdeczko@intel.com
-
由 Michal Wajdeczko 提交于
We should start using dev variants of error logging and to simplify that introduce helper macro that will do any necessary conversions to obtain pointer to device struct. Signed-off-by: NMichal Wajdeczko <michal.wajdeczko@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200111231114.59208-3-michal.wajdeczko@intel.com
-
由 Michal Wajdeczko 提交于
We need CT message size in bytes so just use that in helper var. Signed-off-by: NMichal Wajdeczko <michal.wajdeczko@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200111231114.59208-2-michal.wajdeczko@intel.com
-
由 Chris Wilson 提交于
On suspend, the rc6 residency counters (stored in HW registers) will be lost and cleared. However, we keep track of the rc6 residency to provide a continuous 64b sampling, and if we see the HW value go backwards, we assume it overflowed and add on 32b/40b -- an interesting artifact when sampling across suspend. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200114105648.2172026-2-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Currently, we reset the timer after a pre-eemption event. This has the side-effect that the timeslice runs into the second context after the first is completed after a normal promotion event, causing the second context to be swapped out early and switched for a third context. To be more fair, we want to reset the clock after promotion as well. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200113214546.1990139-1-chris@chris-wilson.co.uk
-
- 11 1月, 2020 5 次提交
-
-
由 Michal Wajdeczko 提交于
uC sanitization is only meaningful if we are running with uC present or enabled. Make this function part of the uc_ops. Signed-off-by: NMichal Wajdeczko <michal.wajdeczko@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200110222723.14724-5-michal.wajdeczko@intel.com
-
由 Michal Wajdeczko 提交于
uC preparation and cleanup steps are only meaningful if we are running with uC enabled. Make these functions part of the uc_ops. Signed-off-by: NMichal Wajdeczko <michal.wajdeczko@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200110222723.14724-4-michal.wajdeczko@intel.com
-
由 Michal Wajdeczko 提交于
Firmware fetching and cleanup steps are only meaningful if we are running with uC enabled. Make these functions part of the uc_ops. Signed-off-by: NMichal Wajdeczko <michal.wajdeczko@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200110222723.14724-3-michal.wajdeczko@intel.com
-
由 Michal Wajdeczko 提交于
Instead of spreading multiple conditionals across the uC code to find out current mode of uC operation, start using predefined set of function pointers that reflect that mode. Begin with pair of init_hw/fini_hw functions that are responsible for uC hardware initialization and cleanup. v2: drop ops_none, use macro to generate ops helpers v3: reuse __uc_check_hw to avoid redundant comment v4: forward declare ops struct vs functions Signed-off-by: NMichal Wajdeczko <michal.wajdeczko@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200110222723.14724-2-michal.wajdeczko@intel.com
-
由 Chris Wilson 提交于
We need to hold the runtime-pm wakeref to update the global PTEs (as they exist behind a PCI BAR). However, some systems invoke ACPI during runtime resume and so require allocations, which is verboten inside the vm->mutex. Ergo, we must not use intel_runtime_pm_get() inside the mutex, but lift the call outside. Closes: https://gitlab.freedesktop.org/drm/intel/issues/958Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: NMatthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200110144418.1415639-1-chris@chris-wilson.co.uk
-
- 10 1月, 2020 5 次提交
-
-
由 Chris Wilson 提交于
In the near future, we will want to start a GPU error capture from a new context, from inside the softirq region of a forced preemption. To do so requires us to break up the monolithic error capture to provide new entry points with finer control; in particular focusing on one engine/gt, and being able to compose an error state from little pieces of HW capture. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Andi Shyti <andi.shyti@intel.com> Acked-by: NAndi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200110123059.1348712-1-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
As we use the active state to keep the vma alive while we are reading its contents during GPU error capture, we need to mark the ring->vma as active during execution if we want to include the rinbuffer in the error state. Reported-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: b1e3177b ("drm/i915: Coordinate i915_active with its own mutex") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200110110402.1231745-3-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
As we use the active state to keep the vma alive while we are reading its contents during GPU error capture, we need to mark the context->state vma as active during execution if we want to include it in the error state. Reported-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: b1e3177b ("drm/i915: Coordinate i915_active with its own mutex") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200110110402.1231745-2-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Currently we first to try to unbind the VMA (and lazily rebind on next use) as an optimisation during restore_ggtt_mappings. Ideally, the only objects in the GGTT upon resume are the pinned kernel objects which can't be unbound and need to be restored. As the unbind interferes with the plan to mark those objects as active for error capture, forgo the optimisation. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMatthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200110110402.1231745-1-chris@chris-wilson.co.uk
-
由 Chen Zhou 提交于
Fix build error: drivers/gpu/drm/i915/gt/intel_ggtt.c: In function ggtt_restore_mappings: drivers/gpu/drm/i915/gt/intel_ggtt.c:1239:3: error: implicit declaration of function wbinvd_on_all_cpus; did you mean wrmsr_on_cpus? [-Werror=implicit-function-declaration] wbinvd_on_all_cpus(); ^~~~~~~~~~~~~~~~~~ wrmsr_on_cpus Reported-by: NHulk Robot <hulkci@huawei.com> Signed-off-by: NChen Zhou <chenzhou10@huawei.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200109012303.153001-1-chenzhou10@huawei.com
-