Commits · 12ffa55da60f8355a5e485bc6d612257a303147e · openeuler / Kernel

11 Jun, 2019 · 1 commit

drm/amd: drop use of drmP.h in amdgpu/amdgpu* · fdf2f6c5

Authored by Sam Ravnborg on Jun 10, 2019

Drop use of drmP.h in all files named amdgpu*
in drm/amd/amdgpu/

Fix fallout.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: "David (ChunMing) Zhou" <David1.Zhou@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20190609220757.10862-10-sam@ravnborg.org

25 May, 2019 · 1 commit

drm/amdgpu: suppress repeating tmo report · c3b6c607

Authored by Monk Liu on May 13, 2019

Only report once per timed-out (TMO) job; the timer is
restarted when the job finishes if it was just slow.

Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

21 Dec, 2018 · 1 commit

drm/amdgpu: print process info when job timeout · 0346bfd9

Authored by Trigger Huang on Dec 18, 2018

When a job times out, try to print the related process information
for debugging.

Signed-off-by: Trigger Huang <Trigger.Huang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

06 Nov, 2018 · 1 commit

drm/scheduler: Add drm_sched_job_cleanup · 26efecf9

Authored by Sharat Masetty on Oct 29, 2018

This patch adds a new API to clean up the scheduler job resources. This
is primarily needed in cases where the job was created but was not queued to
the scheduler queue. Additionally, with this change, the layer which
creates the scheduler job also gets to free up the job's resources, and
this entails moving the dma_fence_put(finished_fence) to the driver
ops free handler routines.

Signed-off-by: Sharat Masetty <smasetty@codeaurora.org>
Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

12 Sep, 2018 · 1 commit

drm/amdgpu: Fix SDMA TO after GPU reset v3 · d8de8260

Authored by Andrey Grodzovsky on Sep 10, 2018

After a GPU reset, amdgpu_vm_clear_bo triggers a VM flush,
but job->vm_pd_addr is not set, causing an SDMA timeout (TO).

v2:
Per advice from Christian König, avoid flushing the VM for jobs where
job->vm_pd_addr wasn't explicitly set.

v3:
Shortcut vm_flush_needed early.

Fixes cbd52851 drm/amdgpu: move setting the GART addr into TTM.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

28 Aug, 2018 · 3 commits

drm/amdgpu: add ring soft recovery v4 · 7876fa4f

Authored by Christian König on Aug 21, 2018

Instead of hammering hard on the GPU, try a soft recovery first.

v2: reorder code a bit
v3: increase timeout to 10ms, increment GPU reset counter
v4: squash in compile fix (Christian)

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>

drm/amdgpu: move setting the GART addr into TTM · cbd52851

Authored by Christian König on Aug 21, 2018

Move setting the GART addr for window based copies into the TTM code that
uses it.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: cleanup GPU recovery check a bit (v2) · 12938fad

Authored by Christian König on Aug 21, 2018

Check if we should call the function instead of providing the forced
flag.

v2: rebase on KFD changes (Alex)

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

26 Jul, 2018 · 2 commits

drm/scheduler: remove sched field from the entity · 068c3304

Authored by Nayan Deshmukh on Jul 20, 2018

The scheduler of the entity is decided by the run queue on which
it is queued. This patch spares us the effort of keeping
the rq and sched fields in sync when we start shifting entities
among different rqs.

Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/scheduler: modify API to avoid redundancy · cdc50176

Authored by Nayan Deshmukh on Jul 20, 2018

The entity has a scheduler field, so we don't need the sched argument
in any of the functions where the entity is provided.

Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

19 Jul, 2018 · 1 commit

drm/amdgpu: change ring priority after pushing the job (v2) · b5286801

Authored by Christian König on Jul 16, 2018

Pushing a job can change the ring assignment of an entity.

v2: squash in:
"drm/amdgpu: fix job priority handling" (Christian)

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

18 Jul, 2018 · 2 commits

drm/amdgpu: minor cleanup in amdgpu_job.c · f024e883

Authored by Christian König on Jul 13, 2018

Remove superfluous NULL check, fix coding style a bit, shorten error
messages.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: remove job->adev (v2) · a1917b73

Authored by Christian König on Jul 13, 2018

We can get that from the ring.

v2: squash in "drm/amdgpu: always initialize job->base.sched" (Alex)

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

17 Jul, 2018 · 4 commits

drm/amdgpu: add amdgpu_job_submit_direct helper · ee913fd9

Authored by Christian König on Jul 13, 2018

Make sure that we properly initialize at least the sched member.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: remove job->ring · 3320b8d2

Authored by Christian König on Jul 13, 2018

We can easily get that from the scheduler.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: remove ring parameter from amdgpu_job_submit · 0e28b10f

Authored by Christian König on Jul 13, 2018

We know the ring through the entity anyway.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: remove fence context from the job · eb3961a5

Authored by Christian König on Jul 13, 2018

Can be obtained directly from the fence as well.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

28 Dec, 2017 · 2 commits

drm/amdgpu: rename vm_id to vmid · c4f46f22

Authored by Christian König on Dec 18, 2017

sed -i "s/vm_id/vmid/g" drivers/gpu/drm/amd/amdgpu/*.c
sed -i "s/vm_id/vmid/g" drivers/gpu/drm/amd/amdgpu/*.h

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: separate VMID and PASID handling · 620f774f

Authored by Christian König on Dec 18, 2017

Move both into the new files amdgpu_ids.[ch]. No functional change.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

18 Dec, 2017 · 1 commit

drm/amdgpu: rename amdgpu_gpu_recover · 5f152b5e

Authored by Alex Deucher on Dec 15, 2017

Add device to the name for consistency.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

16 Dec, 2017 · 1 commit

drm/amdgpu: Add gpu_recovery parameter · dcebf026

Authored by Andrey Grodzovsky on Dec 12, 2017

Add a new parameter to control the GPU recovery procedure.

v2:
Add auto logic where reset is disabled for bare metal and enabled
for SR-IOV.
Allow forced reset from debugfs.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
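
The auto logic described in v2 can be illustrated with a small standalone sketch. It assumes the parameter takes -1 for auto, 0 for disabled and 1 for enabled, and that a reset forced from debugfs bypasses the check. The function name, enum and arguments are hypothetical and are not taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed encoding of the gpu_recovery parameter (hypothetical for this sketch). */
enum { GPU_RECOVERY_AUTO = -1, GPU_RECOVERY_OFF = 0, GPU_RECOVERY_ON = 1 };

/*
 * Decide whether a hung job should trigger GPU recovery.
 *   param:  value of the gpu_recovery parameter
 *   sriov:  true when running as an SR-IOV virtual function
 *   forced: true when the reset was requested through debugfs
 */
bool should_recover_gpu(int param, bool sriov, bool forced)
{
    /* This sketch only models the three values listed above. */
    assert(param == GPU_RECOVERY_AUTO || param == GPU_RECOVERY_OFF ||
           param == GPU_RECOVERY_ON);

    if (forced)
        return true;                /* forced reset is always honoured */
    if (param == GPU_RECOVERY_AUTO)
        return sriov;               /* auto: off on bare metal, on for SR-IOV */
    return param == GPU_RECOVERY_ON;
}
```

With these assumptions, `should_recover_gpu(-1, false, false)` is false on bare metal, while the same call with `sriov` set to true enables recovery.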

08 Dec, 2017 · 1 commit

drm: move amd_gpu_scheduler into common location · 1b1f42d8

Authored by Lucas Stach on Dec 06, 2017

This moves and renames the AMDGPU scheduler to a common location in DRM
in order to facilitate re-use by other drivers. This is mostly a
straightforward rename with no code changes.

One notable exception is the function to_drm_sched_fence(), which is no
longer an inline header function to avoid the need to export the
drm_sched_fence_ops_scheduled and drm_sched_fence_ops_finished structures.

Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

07 Dec, 2017 · 1 commit

drm/amdgpu: Get rid of dep_sync as a seperate object. · cebb52b7

Authored by Andrey Grodzovsky on Nov 13, 2017

Instead, mark the fence as explicit in its amdgpu_sync_entry.

v2:
Fix use-after-free bug and add new parameter description.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

05 Dec, 2017 · 4 commits

drm/amdgpu:implement new GPU recover(v3) · 5740682e

Authored by Monk Liu on Oct 25, 2017

1. The new implementation is named amdgpu_gpu_recover, which gives a better hint
of what it does than gpu_reset.

2. gpu_recover unifies bare-metal and SR-IOV; only the ASIC reset
part is implemented differently.

3. gpu_recover increases the karma of the hung job and marks its entity/context
as guilty if the limit is exceeded.

V2:

4. In the scheduler main routine, a job from a guilty context is immediately
fake-signaled after it is popped from the queue, and its fence is set to the
"-ECANCELED" error.

5. In the scheduler recovery routine, all jobs from the guilty entity are
dropped.

6. In the run_job() routine, the real IB submission is skipped if the @skip parameter
is true or VRAM was lost.

V3:

7. Replace the deprecated gpu reset with the new gpu recover.

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
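
Item 3 above (raise the karma of the hung job and mark its entity as guilty once a limit is exceeded) can be sketched in a few lines. The struct layouts, the `sketch_` names and the `hang_limit` argument are simplified stand-ins invented for illustration; they are not the scheduler's real types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the scheduler entity and job. */
struct sketch_entity {
    bool guilty;                  /* set once a job of this entity hung too often */
};

struct sketch_job {
    int karma;                    /* how many hangs this job was blamed for */
    struct sketch_entity *entity; /* submitter; may be NULL if already torn down */
};

/*
 * Blame a hung job: bump its karma and, once the karma exceeds hang_limit,
 * mark the submitting entity as guilty.
 * Returns true if the entity exists and is guilty after the call.
 */
bool sketch_increase_karma(struct sketch_job *job, int hang_limit)
{
    assert(job != NULL);

    job->karma++;
    if (job->karma > hang_limit && job->entity != NULL)
        job->entity->guilty = true;

    return job->entity != NULL && job->entity->guilty;
}
```

With a `hang_limit` of 1, the first hang leaves the entity innocent and the second one marks it guilty.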

amd/scheduler:imple job skip feature(v3) · 48f05f29

Authored by Monk Liu on Oct 25, 2017

Jobs are skipped in two cases:
1) When the entity behind the job is marked guilty, a job
popped from this entity's queue is dropped in the sched_main loop.

2) In job_recovery(), skip scheduling the job if its karma is detected
above the limit; other jobs sharing the same fence context are skipped
as well. This approach is needed because job_recovery() cannot
access job->entity, since the entity may already be dead.

v2:
Some logic fixes.

v3:
When an entity is detected as guilty, don't drop the job in the popping
stage; instead set its fence error to -ECANCELED.

In run_job(), skip the scheduling if either 1) fence->error < 0
or 2) VRAM was lost for this job.
This way we can unify the job skipping logic.

With this feature we can introduce the new gpu recover feature.

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
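
The unified v3 skip logic can be modelled as follows. This is a minimal sketch under stated assumptions: VRAM loss is detected by comparing a counter snapshot stored in the job against the current device counter, and all types and `sketch_` names are hypothetical rather than the driver's own.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the fence and the job. */
struct sketch_fence {
    int error;                      /* 0, or a negative errno value */
};

struct sketch_run_job {
    struct sketch_fence *finished;  /* fence signaled when the job completes */
    unsigned int vram_lost_counter; /* snapshot taken when the job was created */
};

/* Popping stage: do not drop the job, only record the cancellation. */
void sketch_pop_job(struct sketch_run_job *job, bool entity_guilty)
{
    assert(job != NULL && job->finished != NULL);
    if (entity_guilty)
        job->finished->error = -ECANCELED;
}

/* run_job() stage: one place decides whether the real submission is skipped. */
bool sketch_should_skip(const struct sketch_run_job *job,
                        unsigned int current_vram_lost)
{
    assert(job != NULL && job->finished != NULL);
    return job->finished->error < 0 ||
           job->vram_lost_counter != current_vram_lost;
}
```

Because the popping stage only sets the fence error, both the guilty-entity case and the VRAM-lost case are handled by the same check at run time.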

drm/amdgpu: Remove job->s_entity to avoid keeping reference to stale pointer. · a4176cb4

Authored by Andrey Grodzovsky on Oct 24, 2017

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Avoid accessing job->entity after the job is scheduled. · d1f6dc1a

Authored by Andrey Grodzovsky on Oct 19, 2017

Bug: amdgpu_job_free_cb was accessing s_job->s_entity when the allocated
amdgpu_ctx (and the entity inside it) had already been deallocated from
amdgpu_cs_parser_fini.

Fix: Save the job's priority at its creation instead of accessing it from
s_entity later on.

Signed-off-by: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

20 Oct, 2017 · 3 commits

drm/amdgpu:fix duplicated setting job's vram_lost · c70b78a7

Authored by Monk Liu on Oct 16, 2017

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: set -ECANCELED when dropping jobs · 7a0a48dd

Authored by Christian König on Oct 09, 2017

And return the fence error code from the wait functions.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: keep copy of VRAM lost counter in job · 14e47f93

Authored by Christian König on Oct 09, 2017

Instead of reading the current counter from fpriv.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

10 Oct, 2017 · 1 commit

drm/amdgpu: add framework for HW specific priority settings v9 · b2ff0e8a

Authored by Andres Rodriguez on Feb 20, 2017

Add an initial framework for changing the HW priorities of rings. The
framework allows requesting priority changes for the lifetime of an
amdgpu_job. After the job completes the priority will decay to the next
lowest priority for which a request is still valid.

A new ring function set_priority() can now be populated to take care of
the HW specific programming sequence for priority changes.

v2: set priority before emitting IB, and take a ref on amdgpu_job
v3: use AMD_SCHED_PRIORITY_* instead of AMDGPU_CTX_PRIORITY_*
v4: plug amdgpu_ring_restore_priority_cb into amdgpu_job_free_cb
v5: use atomic for tracking job priorities instead of last_job
v6: rename amdgpu_ring_priority_[get/put]() and align parameters
v7: replace spinlocks with mutexes for KIQ compatibility
v8: raise ring priority during cs_ioctl, instead of job_run
v9: priority_get() before push_job()

Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
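
The decay behaviour described above can be illustrated with per-level request counters. This is a single-threaded sketch under stated assumptions: it uses three made-up priority levels, omits the locking and atomics mentioned in v5 and v7, and falls back to the lowest level when no request is left. All `sketch_` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical priority levels, lowest first; NORMAL is the fallback. */
enum sketch_prio {
    SKETCH_PRIO_NORMAL,
    SKETCH_PRIO_HIGH,
    SKETCH_PRIO_REALTIME,
    SKETCH_PRIO_COUNT
};

struct sketch_ring {
    int num_jobs[SKETCH_PRIO_COUNT]; /* outstanding requests per level */
    enum sketch_prio current;        /* level currently programmed */
};

/* Pick the highest level that still has a request, else the fallback. */
static void sketch_update(struct sketch_ring *ring)
{
    int p;

    for (p = SKETCH_PRIO_COUNT - 1; p > SKETCH_PRIO_NORMAL; p--)
        if (ring->num_jobs[p] > 0)
            break;
    ring->current = (enum sketch_prio)p;
}

/* Request a priority for the lifetime of a job. */
void sketch_priority_get(struct sketch_ring *ring, enum sketch_prio prio)
{
    assert(ring != NULL);
    ring->num_jobs[prio]++;
    sketch_update(ring);
}

/* Drop the request when the job completes; the ring priority decays. */
void sketch_priority_put(struct sketch_ring *ring, enum sketch_prio prio)
{
    assert(ring != NULL && ring->num_jobs[prio] > 0);
    ring->num_jobs[prio]--;
    sketch_update(ring);
}
```

If one job requests HIGH and another REALTIME, the ring runs at REALTIME until that job completes, then decays to HIGH, and finally to NORMAL once both requests are gone.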

14 Jul, 2017 · 1 commit

drm/amdgpu: allow flushing VMID0 before IB execution as well · df264f9e

Authored by Christian König on Jun 28, 2017

This allows us to queue IBs which need an up-to-date system domain as well.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>

25 May, 2017 · 6 commits

drm/amdgpu: add dep_sync for amdgpu job · a340c7bc

Authored by Chunming Zhou on May 18, 2017

The fence in dep_sync cannot be optimized.

Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Tested and Reviewed-by: Roger.He <Hongbo.He@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: skip all jobs of guilty vm · 15d73ce6

Authored by Chunming Zhou on May 16, 2017

If the vm is guilty of a GPU reset, skip all its jobs.

Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu:use job* to replace voluntary · 7225f873

Authored by Monk Liu on Apr 26, 2017

That way we know which job caused the hang and
can do per-sched reset/recovery instead of resetting all
scheds.

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu:don't invoke srio-gpu-reset in gpu-reset (v2) · 4fbf87e2

Authored by Monk Liu on May 05, 2017

We don't want to do sriov-gpu-reset in certain
cases, so split those two functions and don't invoke
the SR-IOV one from the bare-metal one.

V2:
Remove the debugfs_gpu_reset routine in the SRIOV case.

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: make pipeline sync be in same place v2 · b9bf33d5

Authored by Chunming Zhou on May 11, 2017

v2: directly return for 'if' case.

Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add sched sync for amdgpu job v2 · df83d1eb

Authored by Chunming Zhou on May 09, 2017

This is an improvement on the previous patch. sched_sync stores fences
that could be skipped as scheduled; when the job is executed, we don't need
a pipeline_sync if all fences in sched_sync are signalled, otherwise we
still insert a pipeline_sync.

v2: handle the error when adding a fence to the sync fails.

Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com> (v1)
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
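
The decision described above (skip the pipeline sync only when every fence collected in sched_sync has already signalled) reduces to a simple scan. The sketch below models a fence as a single flag; the struct and function names are hypothetical and do not reflect the driver's sync container.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a dependency fence tracked in sched_sync. */
struct sketch_dep_fence {
    bool signaled;
};

/*
 * Return true if a pipeline sync must still be inserted before the job,
 * i.e. if at least one tracked fence has not signalled yet.
 */
bool sketch_need_pipeline_sync(const struct sketch_dep_fence *fences,
                               size_t count)
{
    size_t i;

    assert(fences != NULL || count == 0);

    for (i = 0; i < count; i++)
        if (!fences[i].signaled)
            return true;
    return false;
}
```

An empty set of fences needs no pipeline sync, and a single unsignalled fence is enough to require one.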

11 May, 2017 · 1 commit

drm/amdgpu: fix dependency issue · 30514dec

Authored by Chunming Zhou on May 09, 2017

The problem is that executing the jobs in the right order doesn't give you the right result
because consecutive jobs executed on the same engine are pipelined.
In other words, job B does its buffer read before job A has written its result.

Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

29 Apr, 2017 · 1 commit

drm/amdgpu: fix no-vmid job · 6c98d31e

Authored by Chunming Zhou on Apr 21, 2017

[  132.036658] amdgpu 0000:22:00.0: VM IB without ID
[  132.036709] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling IBs (-22)
[  132.036755] [drm:amd_sched_main [amdgpu]] *ERROR* Failed to run job!

The root cause is that the fence is signaled during the sync transfer.

Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

openeuler / Kernel · synced successfully more than 1 year ago
