Commits · 88bfb6dbb61c54008600c3cc6276610393a00d2b · openeuler / Kernel

25 May, 2022 · 1 commit

drm/panfrost: Job should reference MMU not file_priv · 6e516faf

Authored by Steven Price on May 19, 2022

For a while now it's been allowed for an MMU context to outlive its
corresponding panfrost_priv, however the job structure still references
panfrost_priv to get hold of the MMU context. If panfrost_priv has been
freed this is a use-after-free which I've been able to trigger resulting
in a splat.

To fix this, drop the reference to panfrost_priv in the job structure
and add a direct reference to the MMU structure which is what's actually
needed.

Fixes: 7fdc48cc ("drm/panfrost: Make sure MMU context lifetime is not bound to panfrost_priv")
Signed-off-by: Steven Price <steven.price@arm.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220519152003.81081-1-steven.price@arm.com
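The lifetime rule described in this commit can be sketched in plain user-space C. Everything below is a toy model: the `toy_*` names, the struct layouts, and the non-atomic counter are assumptions made for illustration and are not the panfrost structures. The point is only that the job takes its own reference on the MMU context, so closing the file cannot free memory the job still uses.

```c
#include <stdlib.h>

/* Toy MMU context with a plain reference count (real code would use an
 * atomic refcount; this sketch is single-threaded). */
struct toy_mmu {
	int refcount;
};

/* Toy per-file state; it owns one reference on its MMU context. */
struct toy_file_priv {
	struct toy_mmu *mmu;
};

/* The job points at the MMU context directly, not at toy_file_priv. */
struct toy_job {
	struct toy_mmu *mmu;
};

struct toy_mmu *toy_mmu_get(struct toy_mmu *mmu)
{
	mmu->refcount++;
	return mmu;
}

/* Drops one reference; returns 1 if this call freed the context. */
int toy_mmu_put(struct toy_mmu *mmu)
{
	if (--mmu->refcount > 0)
		return 0;
	free(mmu);
	return 1;
}

/* Opening the file creates the MMU context with one reference. */
struct toy_file_priv *toy_open(void)
{
	struct toy_file_priv *priv = malloc(sizeof(*priv));

	if (!priv)
		return NULL;
	priv->mmu = malloc(sizeof(*priv->mmu));
	if (!priv->mmu) {
		free(priv);
		return NULL;
	}
	priv->mmu->refcount = 1;
	return priv;
}

/* Job creation takes a second, independent reference. */
void toy_job_init(struct toy_job *job, struct toy_file_priv *priv)
{
	job->mmu = toy_mmu_get(priv->mmu);
}

/* Closing the file frees priv; the MMU survives while a job holds it. */
int toy_close(struct toy_file_priv *priv)
{
	int freed = toy_mmu_put(priv->mmu);

	free(priv);
	return freed;
}

/* Job completion drops the job's reference. */
int toy_job_done(struct toy_job *job)
{
	int freed = toy_mmu_put(job->mmu);

	job->mmu = NULL;
	return freed;
}
```

In this model, closing the file while a job is in flight leaves the MMU context alive, and the context is released only when the job finishes.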


07 April, 2022 · 1 commit

dma-buf: specify usage while adding fences to dma_resv obj v7 · 73511edf

Authored by Christian König on November 09, 2021

Instead of distinguishing between shared and exclusive fences, specify
the fence usage while adding fences.

Rework all drivers to use this interface instead and deprecate the old one.

v2: some kerneldoc comments suggested by Daniel
v3: fix a missing case in radeon
v4: rebase on nouveau changes, fix lockdep and temporarily disable warning
v5: more documentation updates
v6: separate internal dma_resv changes from this patch, avoids
    temporarily disabling the warning, rebase on upstream changes
v7: fix missed case in lima driver, minimize changes to i915_gem_busy_ioctl
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20220407085946.744568-3-christian.koenig@amd.com


06 April, 2022 · 1 commit

dma-buf/drivers: make reserving a shared slot mandatory v4 · c8d4c18b

Authored by Christian König on November 16, 2021

Audit all the users of dma_resv_add_excl_fence() and make sure they
reserve a shared slot also when only trying to add an exclusive fence.

This is the next step towards handling the exclusive fence like a
shared one.

v2: fix missed case in amdgpu
v3: and two more radeon, rename function
v4: add one more case to TTM, fix i915 after rebase
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20220406075132.3263-2-christian.koenig@amd.com


23 February, 2022 · 1 commit

drm/sched: Add device pointer to drm_gpu_scheduler · 8ab62eda

Authored by Jiawei Gu on February 22, 2022

Add device pointer so scheduler's printing can use
DRM_DEV_ERROR() instead, which makes life easier in multi-GPU
scenarios.

v2: amend all calls of drm_sched_init()
v3: fill dev pointer for all drm_sched_init() calls
Signed-off-by: Jiawei Gu <Jiawei.Gu@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220221095705.5290-1-Jiawei.Gu@amd.com


30 August, 2021 · 3 commits

drm/panfrost: use scheduler dependency tracking · 53516280

Authored by Daniel Vetter on August 05, 2021

Just deletes some code that's now more shared.

Note that thanks to the split into drm_sched_job_init/arm we can now
easily pull the _init() part from under the submission lock way ahead
where we're adding the sync file in-fences as dependencies.

v2: Correctly clean up the partially set up job, now that job_init()
and job_arm() are apart (Emma).

v3: Rebased over renamed functions for adding dependencies
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Acked-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Steven Price <steven.price@arm.com> (v3)
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Cc: Steven Price <steven.price@arm.com>
Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Cc: Emma Anholt <emma@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20210805104705.862416-8-daniel.vetter@ffwll.ch


drm/sched: drop entity parameter from drm_sched_push_job · 0e10e9a1

Authored by Daniel Vetter on August 05, 2021

Originally a job was only bound to the queue when we pushed this, but
now that's done in drm_sched_job_init, making that parameter entirely
redundant.

Remove it.

The same applies to the context parameter in
lima_sched_context_queue_task, simplify that too.

v2:
Rebase on top of msm adopting drm/sched
Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Emma Anholt <emma@anholt.net>
Acked-by: Melissa Wen <mwen@igalia.com>
Reviewed-by: Steven Price <steven.price@arm.com> (v1)
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> (v1)
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Russell King <linux+etnaviv@armlinux.org.uk>
Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
Cc: Qiang Yu <yuq825@gmail.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Cc: Steven Price <steven.price@arm.com>
Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Cc: Emma Anholt <emma@anholt.net>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Nirmoy Das <nirmoy.das@amd.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Chen Li <chenli@uniontech.com>
Cc: Lee Jones <lee.jones@linaro.org>
Cc: Deepak R Varma <mh12gx2825@gmail.com>
Cc: Kevin Wang <kevin1.wang@amd.com>
Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: "Marek Olšák" <marek.olsak@amd.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Cc: Dennis Li <Dennis.Li@amd.com>
Cc: Boris Brezillon <boris.brezillon@collabora.com>
Cc: etnaviv@lists.freedesktop.org
Cc: lima@lists.freedesktop.org
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Cc: Rob Clark <robdclark@gmail.com>
Cc: Sean Paul <sean@poorly.run>
Cc: Melissa Wen <mwen@igalia.com>
Cc: linux-arm-msm@vger.kernel.org
Cc: freedreno@lists.freedesktop.org
Link: https://patchwork.freedesktop.org/patch/msgid/20210805104705.862416-6-daniel.vetter@ffwll.ch


drm/sched: Split drm_sched_job_init · dbe48d03

Authored by Daniel Vetter on August 17, 2021

This is a very confusingly named function, because not just does it
init an object, it arms it and provides a point of no return for
pushing a job into the scheduler. It would be nice if that's a bit
clearer in the interface.

But the real reason is that I want to push the dependency tracking
helpers into the scheduler code, and that means drm_sched_job_init
must be called a lot earlier, without arming the job.

v2:
- don't change .gitignore (Steven)
- don't forget v3d (Emma)

v3: Emma noticed that I leak the memory allocated in
drm_sched_job_init if we bail out before the point of no return in
subsequent driver patches. To be able to fix this change
drm_sched_job_cleanup() so it can handle being called both before and
after drm_sched_job_arm().

Also improve the kerneldoc for this.

v4:
- Fix the drm_sched_job_cleanup logic, I inverted the booleans, as
  usual (Melissa)

- Christian pointed out that drm_sched_entity_select_rq() also needs
  to be moved into drm_sched_job_arm, which made me realize that the
  job->id definitely needs to be moved too.

  Shuffle things to fit between job_init and job_arm.

v5:
Reshuffle the split between init/arm once more, amdgpu abuses
drm_sched.ready to signal gpu reset failures. Also document this
somewhat. (Christian)

v6:
Rebase on top of the msm drm/sched support. Note that the
drm_sched_job_init() call is completely misplaced, and hence also the
split-out drm_sched_entity_push_job(). I've put in a FIXME which the next
patch will address.

v7: Drop the FIXME in msm, after discussions with Rob I agree it shouldn't
be a problem where it is now.
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Melissa Wen <mwen@igalia.com>
Cc: Melissa Wen <melissa.srw@gmail.com>
Acked-by: Emma Anholt <emma@anholt.net>
Acked-by: Steven Price <steven.price@arm.com> (v2)
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> (v5)
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Russell King <linux+etnaviv@armlinux.org.uk>
Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
Cc: Qiang Yu <yuq825@gmail.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Cc: Steven Price <steven.price@arm.com>
Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Adam Borowski <kilobyte@angband.pl>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Cc: Paul Menzel <pmenzel@molgen.mpg.de>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Nirmoy Das <nirmoy.das@amd.com>
Cc: Deepak R Varma <mh12gx2825@gmail.com>
Cc: Lee Jones <lee.jones@linaro.org>
Cc: Kevin Wang <kevin1.wang@amd.com>
Cc: Chen Li <chenli@uniontech.com>
Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: "Marek Olšák" <marek.olsak@amd.com>
Cc: Dennis Li <Dennis.Li@amd.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Cc: Sonny Jiang <sonny.jiang@amd.com>
Cc: Boris Brezillon <boris.brezillon@collabora.com>
Cc: Tian Tao <tiantao6@hisilicon.com>
Cc: etnaviv@lists.freedesktop.org
Cc: lima@lists.freedesktop.org
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Cc: Emma Anholt <emma@anholt.net>
Cc: Rob Clark <robdclark@gmail.com>
Cc: Sean Paul <sean@poorly.run>
Cc: linux-arm-msm@vger.kernel.org
Cc: freedreno@lists.freedesktop.org
Link: https://patchwork.freedesktop.org/patch/msgid/20210817084917.3555822-1-daniel.vetter@ffwll.ch
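The init/arm split described above can be illustrated with a small user-space model. The `toy_*` functions and fields are hypothetical and do not mirror the real drm_sched_job layout; the sketch only shows the ordering the message describes: initialise early, bail out with a cleanup that is safe before arming, and treat arming as the point of no return where the job id is assigned.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Toy job: resources are allocated at init time, identity at arm time. */
struct toy_job {
	void *deps;      /* stand-in for memory allocated during init */
	bool armed;      /* true once past the point of no return */
	unsigned int id; /* assigned only when the job is armed */
};

unsigned int toy_next_id;

/* Early step: allocate, but do not commit to running the job. */
int toy_job_init(struct toy_job *job)
{
	job->deps = malloc(16);
	if (!job->deps)
		return -1;
	job->armed = false;
	job->id = 0;
	return 0;
}

/* Late step: point of no return, the job gets its id here. */
void toy_job_arm(struct toy_job *job)
{
	job->id = ++toy_next_id;
	job->armed = true;
}

/* Safe to call both before and after toy_job_arm(). */
void toy_job_cleanup(struct toy_job *job)
{
	free(job->deps);
	job->deps = NULL;
	job->armed = false;
}

/* Driver-style submit path: a failure between init and arm must not leak. */
int toy_submit(struct toy_job *job, bool deps_ok)
{
	if (toy_job_init(job))
		return -1;

	if (!deps_ok) {
		/* Bail out before arming; cleanup releases the init memory. */
		toy_job_cleanup(job);
		return -1;
	}

	toy_job_arm(job);
	return 0;
}
```

A failed submission leaves the job unarmed with nothing allocated, while a successful one returns an armed job that the caller later releases with `toy_job_cleanup()`.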


26 August, 2021 · 1 commit

drm/panfrost: Use upper/lower_32_bits helpers · e9ae220d

Authored by Alyssa Rosenzweig on August 25, 2021

Use upper_32_bits/lower_32_bits helpers instead of open-coding them.
This is easier to scan quickly compared to bitwise manipulation, and it
is pleasingly symmetric. I noticed this when debugging lock_region,
which had a particularly "creative" way of writing upper_32_bits.

v2: Use helpers for one more call site and add review tag (Steven).
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Rob Herring <robh@kernel.org> (v1)
Reviewed-by: Steven Price <steven.price@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210825153348.4980-1-alyssa.rosenzweig@collabora.com
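As a rough illustration of what these helpers do, the sketch below re-creates them in user-space C. The macro bodies, the `toy_reg_pair` struct, and `toy_split_address()` are assumptions written for this example; the kernel's own definitions may differ in detail.

```c
#include <stdint.h>

/* Illustrative stand-ins for the kernel helpers of the same name. */
#define upper_32_bits(n) ((uint32_t)(((uint64_t)(n)) >> 32))
#define lower_32_bits(n) ((uint32_t)((n) & 0xffffffffu))

/* Hypothetical pair of 32-bit registers holding one 64-bit address. */
struct toy_reg_pair {
	uint32_t lo;
	uint32_t hi;
};

/* Split a 64-bit address into the two halves a register pair expects. */
struct toy_reg_pair toy_split_address(uint64_t addr)
{
	struct toy_reg_pair r = {
		.lo = lower_32_bits(addr),
		.hi = upper_32_bits(addr),
	};

	return r;
}
```

Named helpers make the intent visible at each call site, which is the readability argument the commit message makes against open-coded shifts and masks.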


01 July, 2021 · 10 commits

drm/panfrost: Queue jobs on the hardware · 030761e0

Authored by Steven Price on June 30, 2021

The hardware has a set of '_NEXT' registers that can hold a second job
while the first is executing. Make use of these registers to enqueue a
second job per slot.

v5:
* Fix a comment in panfrost_job_init()

v3:
* Fix the done/err job dequeuing logic to get a valid active state
* Only enable the second slot on GPUs supporting jobchain disambiguation
* Split interrupt handling in sub-functions
Signed-off-by: Steven Price <steven.price@arm.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210630062751.2832545-16-boris.brezillon@collabora.com
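The two-deep queueing idea can be modelled loosely in user-space C. This is a sketch under stated assumptions: `toy_slot` and its functions are hypothetical, job ids stand in for register contents, and the model ignores the jobchain-disambiguation requirement mentioned in the changelog.

```c
/* Toy model of one job slot: 0 means empty, any other value is a job id. */
struct toy_slot {
	int active; /* job currently executing */
	int next;   /* job parked in the '_NEXT' position */
};

/* Queue a job: run it immediately if the slot is idle, otherwise park it
 * behind the running job. Returns -1 when both positions are taken. */
int toy_slot_submit(struct toy_slot *slot, int job)
{
	if (!slot->active) {
		slot->active = job;
		return 0;
	}
	if (!slot->next) {
		slot->next = job;
		return 0;
	}
	return -1;
}

/* The running job finished: the parked job, if any, starts executing.
 * Returns the id of the job that completed. */
int toy_slot_complete(struct toy_slot *slot)
{
	int done = slot->active;

	slot->active = slot->next;
	slot->next = 0;
	return done;
}
```

With a second job already parked, the slot moves straight from one job to the next on completion instead of sitting idle until software submits again.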


drm/panfrost: Kill in-flight jobs on FD close · 30b5d4ed

Authored by Boris Brezillon on June 30, 2021

If the process that submitted these jobs decided to close the FD before
the jobs are done it probably means it doesn't care about the result.

v5:
* Add a panfrost_exception_is_fault() helper and the
  DRM_PANFROST_EXCEPTION_MAX_NON_FAULT value

v4:
* Don't disable/restore irqs when taking the job_lock (not needed since
  this lock is never taken from an interrupt context)

v3:
* Set fence error to ECANCELED when a TERMINATED exception is received
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210630062751.2832545-15-boris.brezillon@collabora.com


drm/panfrost: Don't reset the GPU on job faults unless we really have to · 2905db27

Authored by Boris Brezillon on June 30, 2021

If we can recover from a fault without a reset there's no reason to
issue one.

v3:
* Drop the mention of Valhall requiring a reset on JOB_BUS_FAULT
* Set the fence error to -EINVAL instead of having per-exception
  error codes
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210630062751.2832545-14-boris.brezillon@collabora.com


drm/panfrost: Make sure job interrupts are masked before resetting · 1d0cab54

Authored by Boris Brezillon on June 30, 2021

This is not yet needed because we let active jobs be killed by
the reset and we don't really bother making sure they can be restarted.
But once we start adding soft-stop support, controlling when we deal
with the remaining interrupts and making sure those are handled before
the reset is issued gets tricky if we keep job interrupts active.

Let's prepare for that and mask+flush job IRQs before issuing a reset.

v4:
* Add a comment explaining why we WARN_ON(!job) in the irq handler
* Keep taking the job_lock when evicting stalled jobs

v3:
* New patch
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210630062751.2832545-11-boris.brezillon@collabora.com


drm/panfrost: Simplify the reset serialization logic · a11c4711

Authored by Boris Brezillon on June 30, 2021

Now that we can pass our own workqueue to drm_sched_init(), we can use
an ordered workqueue for both the scheduler timeout tdr and our own
reset work (which we use when the reset is not caused by a fault/timeout
on a specific job, like when we have AS_ACTIVE bit stuck). This
guarantees that the timeout handlers and reset handler can't run
concurrently which drastically simplifies the locking.

v5:
* Don't call cancel_delayed_timeout() in the reset path (those works
  are canceled in drm_sched_stop())

v4:
* Actually pass the reset workqueue to drm_sched_init()
* Don't call cancel_work_sync() in panfrost_reset(). It will deadlock
  since it might be called from the reset work, which is executing and
  cancel_work_sync() will wait for the handler to return. Checking the
  reset pending status should avoid spurious resets

v3:
* New patch
Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210630062751.2832545-10-boris.brezillon@collabora.com


drm/panfrost: Use a threaded IRQ for job interrupts · 070ce765

Authored by Boris Brezillon on June 30, 2021

This should avoid switching to interrupt context when the GPU is under
heavy use.

v3:
* Don't take the job_lock in panfrost_job_handle_irq()
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210630062751.2832545-9-boris.brezillon@collabora.com


drm/panfrost: Expose a helper to trigger a GPU reset · 229f4578

Authored by Boris Brezillon on June 30, 2021

Expose a helper to trigger a GPU reset so we can easily trigger reset
operations outside the job timeout handler.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210630062751.2832545-8-boris.brezillon@collabora.com


drm/panfrost: Drop the pfdev argument passed to panfrost_exception_name() · 6ef2f37f

Authored by Boris Brezillon on June 30, 2021

Currently unused. We'll add it back if we need per-GPU definitions.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210630062751.2832545-6-boris.brezillon@collabora.com


drm/panfrost: Make ->run_job() return an ERR_PTR() when appropriate · 9f4e9110

Authored by Boris Brezillon on June 30, 2021

If the fence creation fails, we can return the error pointer directly.
The core will update the fence error accordingly.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210630062751.2832545-4-boris.brezillon@collabora.com


drm/sched: Allow using a dedicated workqueue for the timeout/fault tdr · 78efe21b

Authored by Boris Brezillon on June 30, 2021

Mali Midgard/Bifrost GPUs have 3 hardware queues but only a global GPU
reset. This leads to extra complexity when we need to synchronize timeout
works with the reset work. One solution to address that is to have an
ordered workqueue at the driver level that will be used by the different
schedulers to queue their timeout work. Thanks to the serialization
provided by the ordered workqueue we are guaranteed that timeout
handlers are executed sequentially, and can thus easily reset the GPU
from the timeout handler without extra synchronization.

v5:
* Add a new paragraph to the timedout_job() method

v3:
* New patch

v4:
* Actually use the timeout_wq to queue the timeout work
Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Christian König <christian.koenig@amd.com>
Cc: Qiang Yu <yuq825@gmail.com>
Cc: Emma Anholt <emma@anholt.net>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210630062751.2832545-3-boris.brezillon@collabora.com


24 June, 2021 · 4 commits

drm/panfrost: Make sure MMU context lifetime is not bound to panfrost_priv · 7fdc48cc

Authored by Boris Brezillon on June 21, 2021

Jobs can be in-flight when the file descriptor is closed (either because
the process did not terminate properly, or because it didn't wait for
all GPU jobs to be finished), and apparently panfrost_job_close() does
not cancel already running jobs. Let's refcount the MMU context object
so its lifetime is no longer bound to the FD lifetime and running jobs
can finish properly without generating spurious page faults.
Reported-by: Icecream95 <ixn@keemail.me>
Fixes: 7282f764 ("drm/panfrost: Implement per FD address spaces")
Cc: <stable@vger.kernel.org>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210621133907.1683899-2-boris.brezillon@collabora.com


drm/panfrost: Fix implicit sync · 7601d53c

Authored by Daniel Vetter on June 22, 2021

Currently this has no practical relevance I think because there's not
many who can pull off a setup with panfrost and another gpu in the
same system. But the rules are that if you're setting an exclusive
fence, indicating a gpu write access in the implicit fencing system,
then you need to wait for all fences, not just the previous exclusive
fence.

panfrost against itself has no problem, because it always sets the
exclusive fence (but that's probably something that will need to be
fixed for vulkan and/or multi-engine gpus, or you'll suffer badly).
Also no problem with that against display.

With the prep work done to switch over to the dependency helpers this
is now a oneliner.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Cc: Steven Price <steven.price@arm.com>
Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Link: https://patchwork.freedesktop.org/patch/msgid/20210622165511.3169559-7-daniel.vetter@ffwll.ch


drm/panfrost: Use xarray and helpers for depedency tracking · 7d7a0fc4

Authored by Daniel Vetter on June 22, 2021

More consistency and prep work for the next patch.

Aside: I wonder whether we shouldn't just move this entire xarray
business into the scheduler so that not everyone has to reinvent the
same wheels. Cc'ing some scheduler people for this too.

v2: Correctly handle sched_lock since Lucas pointed out it's needed.

v3: Rebase, dma_resv_get_excl_unlocked got renamed

v4: Don't leak job references on failure (Steven).
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Lee Jones <lee.jones@linaro.org>
Cc: Steven Price <steven.price@arm.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210622165511.3169559-6-daniel.vetter@ffwll.ch


drm/panfrost: Shrink sched_lock · 94dd80fe

Authored by Daniel Vetter on June 22, 2021

drm/scheduler requires a lock between _init and _push_job, but the
reservation lock dance doesn't. So shrink the critical section a
notch.

v2: Lucas pointed out how this should really work, I got it all wrong
in v1.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Rob Herring <robh@kernel.org>
Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Cc: Steven Price <steven.price@arm.com>
Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210622165511.3169559-5-daniel.vetter@ffwll.ch


06 June, 2021 · 1 commit

dma-buf: rename dma_resv_get_excl_rcu to _unlocked · 6b41323a

Authored by Christian König on June 02, 2021

That describes much better what the function is doing here.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210602111714.212426-6-christian.koenig@amd.com


10 February, 2021 · 2 commits

Revert "drm/scheduler: Job timeout handler returns status (v3)" · e2183fb1

Authored by Maarten Lankhorst on February 10, 2021

This reverts commit c10983e1.

This commit is not meant for drm-misc-next-fixes, and was accidentally
cherry picked over.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>


drm/scheduler: Job timeout handler returns status (v3) · c10983e1

Authored by Luben Tuikov on January 20, 2021

This patch does not change current behaviour.

The driver's job timeout handler now returns
status indicating back to the DRM layer whether
the device (GPU) is no longer available, such as
after it's been unplugged, or whether all is
normal, i.e. current behaviour.

All drivers which make use of the
drm_sched_backend_ops' .timedout_job() callback
have been accordingly renamed and return the
would've-been default value of
DRM_GPU_SCHED_STAT_NOMINAL to restart the task's
timeout timer--this is the old behaviour, and is
preserved by this patch.

v2: Use enum as the status of a driver's job
    timeout callback method.

v3: Return scheduler/device information, rather
    than task information.

Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Russell King <linux+etnaviv@armlinux.org.uk>
Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
Cc: Qiang Yu <yuq825@gmail.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Cc: Steven Price <steven.price@arm.com>
Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Cc: Eric Anholt <eric@anholt.net>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Steven Price <steven.price@arm.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/415095/
(cherry picked from commit a6a1f036)
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>


05 February, 2021 · 1 commit

drm/scheduler: provide scheduler score externally · f2f12eb9

Authored by Christian König on February 02, 2021

Allow multiple schedulers to share the load balancing score.

This is useful when one engine has different hw rings.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-and-Tested-by: Leo Liu <leo.liu@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210204144405.2737-1-christian.koenig@amd.com


29 January, 2021 · 1 commit

drm/scheduler: Job timeout handler returns status (v3) · a6a1f036

Authored by Luben Tuikov on January 20, 2021

This patch does not change current behaviour.

The driver's job timeout handler now returns
status indicating back to the DRM layer whether
the device (GPU) is no longer available, such as
after it's been unplugged, or whether all is
normal, i.e. current behaviour.

All drivers which make use of the
drm_sched_backend_ops' .timedout_job() callback
have been accordingly renamed and return the
would've-been default value of
DRM_GPU_SCHED_STAT_NOMINAL to restart the task's
timeout timer--this is the old behaviour, and is
preserved by this patch.

v2: Use enum as the status of a driver's job
    timeout callback method.

v3: Return scheduler/device information, rather
    than task information.

Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Russell King <linux+etnaviv@armlinux.org.uk>
Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
Cc: Qiang Yu <yuq825@gmail.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Cc: Steven Price <steven.price@arm.com>
Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Cc: Eric Anholt <eric@anholt.net>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Steven Price <steven.price@arm.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/415095/
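The contract described in this message, a timeout callback that reports device status back to the scheduler, might look roughly like the toy model below. The `toy_*` names and the two-value enum are assumptions for illustration; only DRM_GPU_SCHED_STAT_NOMINAL is named in the commit message, and the real enum and callback signature are not reproduced here.

```c
#include <stdbool.h>

/* Hypothetical status values a timeout handler can report. */
enum toy_sched_stat {
	TOY_SCHED_STAT_NOMINAL, /* all is normal: restart the timeout timer */
	TOY_SCHED_STAT_ENODEV,  /* device is gone, e.g. it was unplugged */
};

/* Toy job carrying just enough state for the sketch. */
struct toy_job {
	bool device_present;
};

/* Driver side: the timeout callback reports device status, not job status. */
enum toy_sched_stat toy_timedout_job(const struct toy_job *job)
{
	if (!job->device_present)
		return TOY_SCHED_STAT_ENODEV;

	return TOY_SCHED_STAT_NOMINAL;
}

/* Scheduler side: only re-arm the timeout timer while the device exists. */
bool toy_should_rearm_timer(enum toy_sched_stat stat)
{
	return stat == TOY_SCHED_STAT_NOMINAL;
}
```

A driver that always returns the nominal value keeps the old behaviour, which matches the statement that the patch changes nothing for existing drivers.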


16 November, 2020 · 1 commit

drm/panfrost: Move the GPU reset bits outside the timeout handler · 5bc5cc28

Authored by Boris Brezillon on November 05, 2020

We've fixed many races in panfrost_job_timedout() but some remain.
Instead of trying to fix it again, let's simplify the logic and move
the reset bits to a separate work item scheduled when one of the queues
reports a timeout.

v5:
- Simplify panfrost_scheduler_stop() (Steven Price)
- Always restart the queue in panfrost_scheduler_start() even if
  the status is corrupted (Steven Price)

v4:
- Rework the logic to prevent a race between drm_sched_start()
  (reset work) and drm_sched_job_timedout() (timeout work)
- Drop Steven's R-b
- Add dma_fence annotation to the panfrost_reset() function (Daniel Vetter)

v3:
- Replace the atomic_cmpxchg() by an atomic_xchg() (Robin Murphy)
- Add Steven's R-b

v2:
- Use atomic_cmpxchg() to conditionally schedule the reset work
  (Steven Price)

Fixes: 1a11a88c ("drm/panfrost: Fix job timeout handling")
Cc: <stable@vger.kernel.org>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201105151704.2010667-1-boris.brezillon@collabora.com
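The "schedule the reset only once" idea from the changelog can be sketched with C11 atomics as a user-space analogue of the kernel's atomic_xchg(). The `toy_device` struct and both functions are hypothetical, and a counter stands in for actually queueing a work item.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical device state with only the fields the sketch needs. */
struct toy_device {
	atomic_int reset_pending; /* 0 = idle, 1 = a reset is already queued */
	int resets_scheduled;     /* counts how often the work was queued */
};

/*
 * May be called from any timeout handler. atomic_exchange() returns the
 * previous value, so only the first caller observes 0 and queues the work.
 */
bool toy_schedule_reset(struct toy_device *dev)
{
	if (atomic_exchange(&dev->reset_pending, 1) != 0)
		return false; /* another handler already queued the reset */

	dev->resets_scheduled++; /* stand-in for queueing the reset work */
	return true;
}

/* Called by the reset work once the GPU is usable again. */
void toy_reset_done(struct toy_device *dev)
{
	atomic_store(&dev->reset_pending, 0);
}
```

An unconditional exchange is enough here because every caller wants the flag set to 1 either way; the previous value alone decides who queues the work, which is one plausible reading of why v3 swapped the compare-and-exchange for a plain exchange.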


03 November, 2020 · 1 commit

drm/panfrost: Remove unused variables in panfrost_job_close() · 7d6763ab

Authored by Boris Brezillon on November 01, 2020

Commit a17d609e ("drm/panfrost: Don't corrupt the queue mutex on
open/close") left unused variables behind, thus generating a warning
at compilation time. Remove those variables.

Fixes: a17d609e ("drm/panfrost: Don't corrupt the queue mutex on open/close")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201101173817.831769-1-boris.brezillon@collabora.com


30 October, 2020 · 1 commit

drm/panfrost: Don't corrupt the queue mutex on open/close · a17d609e

Authored by Steven Price on October 29, 2020

The mutex within the panfrost_queue_state should have the lifetime of
the queue, however it was erroneously initialised/destroyed during
panfrost_job_{open,close} which is called every time a client
opens/closes the drm node.

Move the initialisation/destruction to panfrost_job_{init,fini} where it
belongs.

Fixes: 1a11a88c ("drm/panfrost: Fix job timeout handling")
Signed-off-by: Steven Price <steven.price@arm.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201029170047.30564-1-steven.price@arm.com


08 October, 2020 · 1 commit

drm/panfrost: Fix job timeout handling · 1a11a88c

Authored by Boris Brezillon on October 02, 2020

If more than two jobs end up timeout-ing concurrently, only one of them
(the one attached to the scheduler acquiring the lock) is fully handled.
The other one remains in a dangling state where it's no longer part of
the scheduling queue, but still blocks something in scheduler, leading
to repetitive timeouts when new jobs are queued.

Let's make sure all bad jobs are properly handled by the thread
acquiring the lock.

v3:
- Add Steven's R-b
- Don't take the sched_lock when stopping the schedulers

v2:
- Fix the subject prefix
- Stop the scheduler before returning from panfrost_job_timedout()
- Call cancel_delayed_work_sync() after drm_sched_stop() to make sure
  no timeout handlers are in flight when we reset the GPU (Steven Price)
- Make sure we release the reset lock before restarting the
  schedulers (Steven Price)

Fixes: f3ba9122 ("drm/panfrost: Add initial panfrost driver")
Cc: <stable@vger.kernel.org>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201002122506.1374183-1-boris.brezillon@collabora.com

1a11a88c

08 Aug, 2020 2 commits

drm/panfrost: introduce panfrost_devfreq struct · 9bfacfc8

Authored by Clément Péron on Jul 10, 2020

Introduce a proper panfrost_devfreq to deal with devfreq variables.
Reviewed-by: Steven Price <steven.price@arm.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Clément Péron <peron.clem@gmail.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200710095409.407087-5-peron.clem@gmail.com

9bfacfc8

drm/panfrost: don't use pfdevfreq.busy_count to know if hw is idle · eb9dd672

Authored by Clément Péron on Jul 10, 2020

This uses a devfreq variable that will be protected by a spinlock in
future patches. We could introduce a function to access it, but as
devfreq is optional let's just remove it.
Reviewed-by: Steven Price <steven.price@arm.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Clément Péron <peron.clem@gmail.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200710095409.407087-4-peron.clem@gmail.com

eb9dd672

19 Jun, 2020 2 commits

drm/panfrost: Fix runtime PM imbalance on error · 64092598

Authored by Dinghao Liu on May 22, 2020

The caller expects panfrost_job_hw_submit() to increase
runtime PM usage counter. The refcount decrement on the
error branch of WARN_ON() will break the counter balance
and needs to be removed.
Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
Reviewed-by: Steven Price <steven.price@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200522134109.27204-1-dinghao.liu@zju.edu.cn

64092598
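The contract described in this commit, that the submit path always raises the usage counter and the caller always drops it, can be sketched in plain C. The names `toy_dev`, `toy_pm_get`, `toy_pm_put`, `toy_hw_submit` and `toy_job_done` are hypothetical and only model the counting; they are not the kernel's runtime PM API.

```c
#include <stdbool.h>

/* Hypothetical device with a bare usage counter standing in for runtime PM. */
struct toy_dev {
    int pm_usage;
};

static void toy_pm_get(struct toy_dev *d) { d->pm_usage++; }
static void toy_pm_put(struct toy_dev *d) { d->pm_usage--; }

/*
 * Leaves the counter one higher on success and on failure alike,
 * because the caller drops the reference in both cases.
 */
static bool toy_hw_submit(struct toy_dev *d, bool slot_busy)
{
    toy_pm_get(d);
    if (slot_busy)
        return false;   /* no put here: a put would unbalance the counter */
    /* ... program the hardware ... */
    return true;
}

/* Caller side: exactly one put per submit, on completion or timeout. */
static void toy_job_done(struct toy_dev *d)
{
    toy_pm_put(d);
}
```

With an extra `toy_pm_put()` on the failure branch, the later `toy_job_done()` would drive the counter below its starting value.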

drm/panfrost: Fix inbalance of devfreq record_busy/idle() · b99773ef

Authored by Steven Price on May 22, 2020

The calls to panfrost_devfreq_record_busy() and
panfrost_devfreq_record_idle() must be balanced to ensure that the
devfreq utilisation is correctly reported. But there are two cases where
this doesn't work correctly.

In panfrost_job_hw_submit() if pm_runtime_get_sync() fails or the
WARN_ON() fires then no call to panfrost_devfreq_record_busy() is made,
but when the job times out the corresponding _record_idle() call is
still made in panfrost_job_timedout(). Move the call up to ensure that
it always happens.

Secondly panfrost_job_timedout() only makes a single call to
panfrost_devfreq_record_idle() even if it is cleaning up multiple jobs.
Move the call inside the loop to ensure that the number of
_record_idle() calls matches the number of _record_busy() calls.

Fixes: 9e62b885 ("drm/panfrost: Simplify devfreq utilisation tracking")
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200522153653.40754-1-steven.price@arm.com

b99773ef
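Both fixes in this commit come down to keeping one idle record per busy record. A minimal sketch, assuming a simple counter in place of the real devfreq bookkeeping (`toy_devfreq`, `toy_record_busy`, `toy_record_idle`, `toy_submit` and `toy_timedout` are hypothetical names):

```c
/* Hypothetical utilisation tracker: counts jobs currently recorded as busy. */
struct toy_devfreq {
    int busy_count;
};

static void toy_record_busy(struct toy_devfreq *f) { f->busy_count++; }
static void toy_record_idle(struct toy_devfreq *f) { f->busy_count--; }

/* Busy is recorded first, so the early-return failure path is still counted. */
static int toy_submit(struct toy_devfreq *f, int hw_ok)
{
    toy_record_busy(f);
    if (!hw_ok)
        return -1;
    /* ... start the job ... */
    return 0;
}

/* Idle is recorded inside the loop: once per job cleaned up, not once per call. */
static void toy_timedout(struct toy_devfreq *f, int njobs)
{
    for (int i = 0; i < njobs; i++) {
        /* ... clean up job i ... */
        toy_record_idle(f);
    }
}
```

Three submissions followed by one timeout that cleans up all three jobs bring the counter back to zero, including the case where one of the submissions failed.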

20 May, 2020 1 commit

drm/panfrost: remove _unlocked suffix in drm_gem_object_put_unlocked · 496d0cc6

Authored by Emil Velikov on May 15, 2020

Spelling out _unlocked for each and every driver is annoying.
Especially if we consider how many drivers do not know (or need to)
about the horror stories involving struct_mutex.

Just drop the suffix. It makes the API cleaner.

Done via the following script:

__from=drm_gem_object_put_unlocked
__to=drm_gem_object_put
for __file in $(git grep --name-only $__from); do
  sed -i  "s/$__from/$__to/g" $__file;
done

Cc: Rob Herring <robh@kernel.org>
Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Cc: Steven Price <steven.price@arm.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Reviewed-by: Steven Price <steven.price@arm.com>
Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20200515095118.2743122-28-emil.l.velikov@gmail.com

496d0cc6

13 Feb, 2020 1 commit

drm/panfrost: Remove set but not used variable 'bo' · fe154a24

Authored by YueHaibing on Feb 03, 2020

Fixes gcc '-Wunused-but-set-variable' warning:

drivers/gpu/drm/panfrost/panfrost_job.c: In function 'panfrost_job_cleanup':
drivers/gpu/drm/panfrost/panfrost_job.c:278:31: warning:
 variable 'bo' set but not used [-Wunused-but-set-variable]

commit bdefca2d ("drm/panfrost: Add the panfrost_gem_mapping concept")
involved this unused variable.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Reviewed-by: Alyssas Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200203152724.42611-1-yuehaibing@huawei.com

fe154a24

03 Feb, 2020 1 commit

drm/panfrost: Make sure the shrinker does not reclaim referenced BOs · 7e0cf7e9

Authored by Boris Brezillon on Nov 29, 2019

Userspace might tag a BO purgeable while it's still referenced by GPU
jobs. We need to make sure the shrinker does not purge such BOs until
all jobs referencing it are finished.

Fixes: 013b6510 ("drm/panfrost: Add madvise and shrinker support")
Cc: <stable@vger.kernel.org>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20191129135908.2439529-9-boris.brezillon@collabora.com

7e0cf7e9
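The rule in this commit, that a purgeable BO must survive until every job referencing it has finished, can be sketched with a per-object use count that the shrinker checks before purging. The struct and function names below are hypothetical illustrations, not the driver's actual definitions.

```c
#include <stdbool.h>

/* Hypothetical buffer object: a purgeable hint plus a count of GPU jobs using it. */
struct toy_bo {
    bool purgeable;     /* set when userspace says the contents may be dropped */
    int gpu_usecount;   /* number of in-flight jobs referencing this BO */
    bool purged;
};

static void toy_job_acquire(struct toy_bo *bo) { bo->gpu_usecount++; }
static void toy_job_release(struct toy_bo *bo) { bo->gpu_usecount--; }

/* The shrinker skips BOs that are not purgeable or are still in use by a job. */
static bool toy_shrinker_try_purge(struct toy_bo *bo)
{
    if (!bo->purgeable || bo->gpu_usecount > 0)
        return false;
    bo->purged = true;
    return true;
}
```

A BO tagged purgeable while a job is running is refused by `toy_shrinker_try_purge()` and becomes reclaimable only after the last `toy_job_release()`.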

22 Jan, 2020 2 commits

drm/panfrost: Add the panfrost_gem_mapping concept · bdefca2d

Authored by Boris Brezillon on Jan 15, 2020

With the introduction of per-FD address spaces, the same BO can be mapped
in different address spaces if the BO is globally visible (GEM_FLINK)
and opened in different contexts, or if the dmabuf is self-imported. The
current implementation does not take this case into account, and attaches
the mapping directly to the panfrost_gem_object.

Let's create a panfrost_gem_mapping struct and allow multiple mappings
per BO.

The mappings are refcounted which helps solve another problem where
mappings were torn down (GEM handle closed by userspace) while GPU
jobs accessing those BOs were still in-flight. Jobs now keep a
reference on the mappings they use.

v2 (robh):
- Minor review comment clean-ups from Steven
- Use list_is_singular helper
- Just WARN if we add a mapping when madvise state is not WILLNEED.
  With that, drop the use of object_name_lock.

v3 (robh):
- Revert returning list iterator in panfrost_gem_mapping_get()

Fixes: a5efb4c9 ("drm/panfrost: Restructure the GEM object creation")
Fixes: 7282f764 ("drm/panfrost: Implement per FD address spaces")
Cc: <stable@vger.kernel.org>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200116021554.15090-1-robh@kernel.org

bdefca2d
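The refcounting described in this commit, where a job keeps its own reference so a mapping outlives the GEM handle that created it, can be sketched in user-space C. `toy_mapping` and its helpers are hypothetical names, and the `live` counter exists only so the teardown can be observed.

```c
#include <stdlib.h>

/* Hypothetical refcounted mapping; *live counts mappings not yet torn down. */
struct toy_mapping {
    int refcount;
    int *live;
};

/* Create a mapping holding one reference (the GEM handle's). */
static struct toy_mapping *toy_mapping_create(int *live)
{
    struct toy_mapping *m = malloc(sizeof(*m));

    if (!m)
        return NULL;
    m->refcount = 1;
    m->live = live;
    (*live)++;
    return m;
}

/* A job takes an extra reference on every mapping it uses. */
static void toy_mapping_get(struct toy_mapping *m)
{
    m->refcount++;
}

/* Drop one reference; the mapping is torn down only when the last one goes. */
static void toy_mapping_put(struct toy_mapping *m)
{
    if (--m->refcount == 0) {
        (*m->live)--;
        free(m);
    }
}
```

When userspace closes the handle while a job is still in flight, the first `toy_mapping_put()` only lowers the count; the mapping is freed when the job drops its reference.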

drm/panfrost: Prefix interrupt handlers' names · 73896f60

Authored by Ezequiel Garcia on Dec 14, 2019

Currently, the interrupt lines requested by Panfrost
use unmeaningful names, which adds some obscurity
to interrupt introspection (i.e. any tool based
on procfs' interrupts file).

In order to improve this, prefix each requested
interrupt with the module name: panfrost-{gpu,job,mmu}.
Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20191214045952.9452-1-ezequiel@collabora.com

73896f60
