- 11 Jun, 2019 1 commit
Committed by Sam Ravnborg
Drop use of drmP.h in all files named amdgpu* in drm/amd/amdgpu/.
Fix fallout.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: "David (ChunMing) Zhou" <David1.Zhou@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20190609220757.10862-10-sam@ravnborg.org
-
- 25 May, 2019 1 commit
Committed by Monk Liu
Only report once per timed-out (TMO) job; the timer is restarted once the job finishes if it was just slow.
Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 21 Dec, 2018 1 commit
Committed by Trigger Huang
When a job times out, try to print the related process information for debugging.
Signed-off-by: Trigger Huang <Trigger.Huang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 06 Nov, 2018 1 commit
Committed by Sharat Masetty
This patch adds a new API to clean up the scheduler job resources. This is primarily needed in cases where the job was created but was not queued to the scheduler queue. Additionally, with this change, the layer which creates the scheduler job also gets to free up the job's resources, and this entails moving the dma_fence_put(finished_fence) to the drivers' ops free handler routines.
Signed-off-by: Sharat Masetty <smasetty@codeaurora.org>
Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 12 Sep, 2018 1 commit
Committed by Andrey Grodzovsky
After GPU reset, amdgpu_vm_clear_bo triggers a VM flush, but job->vm_pd_addr is not set, causing an SDMA timeout.
v2: Per advice from Christian König, avoid flushing the VM for jobs where job->vm_pd_addr wasn't explicitly set.
v3: Shortcut vm_flush_needed early.
Fixes: cbd52851 ("drm/amdgpu: move setting the GART addr into TTM")
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 28 Aug, 2018 3 commits
Committed by Christian König
Instead of hammering hard on the GPU, try a soft recovery first.
v2: reorder code a bit
v3: increase timeout to 10ms, increment GPU reset counter
v4: squash in compile fix (Christian)
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
-
Committed by Christian König
Move setting the GART addr for window based copies into the TTM code that uses it.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Committed by Christian König
Check if we should call the function instead of providing the forced flag.
v2: rebase on KFD changes (Alex)
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 26 Jul, 2018 2 commits
Committed by Nayan Deshmukh
The scheduler of the entity is decided by the run queue on which it is queued. This patch saves us the effort required to keep the rq and sched fields in sync when we start shifting entities among different rqs.
Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Committed by Nayan Deshmukh
The entity has a scheduler field, so we don't need the sched argument in any of the functions where the entity is provided.
Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 19 Jul, 2018 1 commit
Committed by Christian König
Pushing a job can change the ring assignment of an entity.
v2: squash in: "drm/amdgpu: fix job priority handling" (Christian)
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 18 Jul, 2018 2 commits
Committed by Christian König
Remove superfluous NULL check, fix coding style a bit, shorten error messages.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Committed by Christian König
We can get that from the ring.
v2: squash in "drm/amdgpu: always initialize job->base.sched" (Alex)
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 17 Jul, 2018 4 commits
Committed by Christian König
Make sure that we properly initialize at least the sched member.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Committed by Christian König
We can easily get that from the scheduler.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Committed by Christian König
We know the ring through the entity anyway.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Committed by Christian König
Can be obtained directly from the fence as well.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 28 Dec, 2017 2 commits
Committed by Christian König
sed -i "s/vm_id/vmid/g" drivers/gpu/drm/amd/amdgpu/*.c
sed -i "s/vm_id/vmid/g" drivers/gpu/drm/amd/amdgpu/*.h
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Committed by Christian König
Move both into the new files amdgpu_ids.[ch]. No functional change.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 18 Dec, 2017 1 commit
Committed by Alex Deucher
Add device to the name for consistency.
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 16 Dec, 2017 1 commit
Committed by Andrey Grodzovsky
Add new parameter to control GPU recovery procedure.
v2: Add auto logic where reset is disabled for bare metal and enabled for SR-IOV. Allow forced reset from debugfs.
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
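The auto logic in this commit can be illustrated with a small Python model. This is only a sketch of the decision described in the message, not the kernel code: the function name `should_recover_gpu` and the value encoding (-1 for auto, 0 for disabled, anything else for enabled) are assumptions made for illustration.

```python
def should_recover_gpu(gpu_recovery: int, is_sriov_vf: bool, forced: bool = False) -> bool:
    """Decide whether a GPU recovery should run (illustrative model only).

    gpu_recovery: assumed encoding, -1 = auto, 0 = disabled, other = enabled.
    is_sriov_vf:  True when running as an SR-IOV virtual function.
    forced:       True for a forced reset, e.g. one requested through debugfs.
    """
    if forced:
        # A forced reset bypasses the parameter entirely.
        return True
    if gpu_recovery == -1:
        # Auto: disabled on bare metal, enabled under SR-IOV.
        return is_sriov_vf
    return gpu_recovery != 0
```

With these assumptions, `should_recover_gpu(-1, is_sriov_vf=False)` declines to recover on bare metal, while the same call with `forced=True` goes ahead.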
-
- 08 Dec, 2017 1 commit
Committed by Lucas Stach
This moves and renames the AMDGPU scheduler to a common location in DRM in order to facilitate re-use by other drivers. This is mostly a straightforward rename with no code changes. One notable exception is the function to_drm_sched_fence(), which is no longer an inline header function, to avoid the need to export the drm_sched_fence_ops_scheduled and drm_sched_fence_ops_finished structures.
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 07 Dec, 2017 1 commit
Committed by Andrey Grodzovsky
Instead mark fence as explicit in its amdgpu_sync_entry.
v2: Fix use after free bug and add new parameter description.
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 05 Dec, 2017 4 commits
Committed by Monk Liu
1. The new implementation is named amdgpu_gpu_recover, which gives a better hint of what it does than gpu_reset.
2. gpu_recover unifies bare-metal and SR-IOV; only the ASIC reset part is implemented differently.
3. gpu_recover increases the hung job's karma and marks its entity/context as guilty if the karma exceeds the limit.
V2:
4. In the scheduler main routine, a job from a guilty context is immediately fake-signaled after it is popped from the queue, and its fence is set with the "-ECANCELED" error.
5. In the scheduler recovery routine, all jobs from the guilty entity are dropped.
6. In the run_job() routine, the real IB submission is skipped if the @skip parameter equals true or VRAM was lost.
V3:
7. Replace the deprecated GPU reset with the new GPU recover.
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
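The karma rule from point 3 can be modelled in a few lines of Python. This is a hedged sketch of the idea only: the `Context` and `Job` classes, the `increase_karma` helper and the `hang_limit` argument are hypothetical names chosen for illustration, not the scheduler's actual structures.

```python
from dataclasses import dataclass


@dataclass
class Context:
    """Stand-in for the entity/context a job belongs to (hypothetical)."""
    guilty: bool = False


@dataclass
class Job:
    """Stand-in for a scheduler job with a hang counter (hypothetical)."""
    ctx: Context
    karma: int = 0


def increase_karma(job: Job, hang_limit: int) -> None:
    """Record one more hang for the job; mark its context guilty past the limit."""
    job.karma += 1
    if job.karma > hang_limit:
        job.ctx.guilty = True
```

In this model a context only becomes guilty once one of its jobs has hung more often than the limit allows, so a single slow job does not immediately poison the whole context when the limit is above zero.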
-
Committed by Monk Liu
Jobs are skipped in two cases:
1) When the entity behind the job is marked guilty, the job popped from this entity's queue is dropped in the sched_main loop.
2) In job_recovery(), skip scheduling the job if its karma is detected above the limit; other jobs sharing the same fence context are skipped as well. This approach is needed because job_recovery() cannot access job->entity, since the entity may already be dead.
v2: some logic fix
v3: when the entity is detected guilty, don't drop the job in the popping stage; instead set its fence error to -ECANCELED. In run_job(), skip the scheduling if either 1) fence->error < 0 or 2) a VRAM LOST occurred on this job. This way we can unify the job skipping logic.
With this feature we can introduce the new GPU recover feature.
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
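The unified skip rule from v3 reduces to two small decisions, sketched here in Python. The helper names `fence_error_for` and `should_skip_submission` are hypothetical; only the -ECANCELED error and the two skip conditions come from the commit message.

```python
import errno


def fence_error_for(entity_guilty: bool) -> int:
    """Fence error set when a job is popped: -ECANCELED for a guilty entity, else 0."""
    return -errno.ECANCELED if entity_guilty else 0


def should_skip_submission(fence_error: int, vram_lost: bool) -> bool:
    """Model of the run_job() check: skip when the fence carries an error or VRAM was lost."""
    return fence_error < 0 or vram_lost
```

Keeping the guilty marking and the skip check separate mirrors the point of v3: the popping stage only tags the fence, and run_job() is the single place that decides whether the submission really happens.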
-
Committed by Andrey Grodzovsky
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Committed by Andrey Grodzovsky
Bug: amdgpu_job_free_cb was accessing s_job->s_entity when the allocated amdgpu_ctx (and the entity inside it) were already deallocated from amdgpu_cs_parser_fini.
Fix: Save the job's priority on its creation instead of accessing it from s_entity later on.
Signed-off-by: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 20 Oct, 2017 3 commits
Committed by Monk Liu
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Committed by Christian König
And return from the wait functions the fence error code.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Committed by Christian König
Instead of reading the current counter from fpriv.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 10 Oct, 2017 1 commit
Committed by Andres Rodriguez
Add an initial framework for changing the HW priorities of rings. The framework allows requesting priority changes for the lifetime of an amdgpu_job. After the job completes, the priority will decay to the next lowest priority for which a request is still valid. A new ring function set_priority() can now be populated to take care of the HW specific programming sequence for priority changes.
v2: set priority before emitting IB, and take a ref on amdgpu_job
v3: use AMD_SCHED_PRIORITY_* instead of AMDGPU_CTX_PRIORITY_*
v4: plug amdgpu_ring_restore_priority_cb into amdgpu_job_free_cb
v5: use atomic for tracking job priorities instead of last_job
v6: rename amdgpu_ring_priority_[get/put]() and align parameters
v7: replace spinlocks with mutexes for KIQ compatibility
v8: raise ring priority during cs_ioctl, instead of job_run
v9: priority_get() before push_job()
Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
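The request-and-decay behaviour described above can be sketched as a small Python model. It is an illustration under stated assumptions, not the driver's implementation: the three-level `Priority` enum, the `RingPriorityModel` class, the per-level request counters and the choice of NORMAL as the resting level are all hypothetical.

```python
from enum import IntEnum


class Priority(IntEnum):
    """Hypothetical priority levels, lowest to highest."""
    LOW = 0
    NORMAL = 1
    HIGH = 2


class RingPriorityModel:
    """Tracks outstanding priority requests for one ring (illustrative only)."""

    def __init__(self) -> None:
        self.requests = {level: 0 for level in Priority}
        self.current = Priority.NORMAL

    def priority_get(self, level: Priority) -> None:
        """A job requests `level`; raise the ring priority if it is higher."""
        self.requests[level] += 1
        if level > self.current:
            self.current = level

    def priority_put(self, level: Priority) -> None:
        """A job releases `level`; decay once no request for it remains."""
        self.requests[level] -= 1
        if self.requests[level] > 0 or level != self.current:
            return
        # Walk downwards to the next level that still has a valid request,
        # stopping at NORMAL, which this model treats as the resting level.
        for candidate in sorted(Priority, reverse=True):
            if candidate > level:
                continue
            if candidate == Priority.NORMAL or self.requests[candidate] > 0:
                self.current = candidate
                return
```

For example, two jobs requesting HIGH keep the ring at HIGH until the second one releases it, at which point the model falls back to NORMAL.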
-
- 14 Jul, 2017 1 commit
Committed by Christian König
This allows us to queue IBs which need an up to date system domain as well.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
-
- 25 May, 2017 6 commits
Committed by Chunming Zhou
The fence in dep_sync cannot be optimized.
Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Tested and Reviewed-by: Roger.He <Hongbo.He@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Committed by Chunming Zhou
If the VM is guilty of a GPU reset, skip all its jobs.
Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Committed by Monk Liu
That way we know which job caused the hang and can do a per-scheduler reset/recovery instead of resetting all schedulers.
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Committed by Monk Liu
We don't want to do an SR-IOV GPU reset in certain cases, so split the two functions and don't invoke the SR-IOV one from the bare-metal one.
V2: remove the debugfs_gpu_reset routine in the SR-IOV case.
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Committed by Chunming Zhou
v2: directly return for 'if' case.
Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Committed by Chunming Zhou
This is an improvement on the previous patch. sched_sync stores the fences that could be skipped as scheduled. When the job is executed, no pipeline_sync is needed if all fences in sched_sync are signalled; otherwise a pipeline_sync is still inserted.
v2: handle error when adding fence to sync failed.
Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com> (v1)
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
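The sched_sync rule in the last commit above comes down to one check at execution time, sketched here in Python. The `Fence` class and the `need_pipeline_sync` helper are hypothetical stand-ins used only to show the decision; they are not the driver's types.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Fence:
    """Minimal stand-in for a fence that is either signalled or still pending."""
    signaled: bool


def need_pipeline_sync(sched_sync: Iterable[Fence]) -> bool:
    """A pipeline sync is needed only if some fence in sched_sync is still pending."""
    return any(not fence.signaled for fence in sched_sync)
```

An empty sched_sync, or one whose fences have all signalled, needs no pipeline sync in this model; a single pending fence is enough to require one.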
-
- 11 May, 2017 1 commit
Committed by Chunming Zhou
The problem is that executing the jobs in the right order doesn't give you the right result, because consecutive jobs executed on the same engine are pipelined. In other words, job B does its buffer read before job A has written its result.
Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 29 Apr, 2017 1 commit
Committed by Chunming Zhou
[ 132.036658] amdgpu 0000:22:00.0: VM IB without ID
[ 132.036709] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling IBs (-22)
[ 132.036755] [drm:amd_sched_main [amdgpu]] *ERROR* Failed to run job!
The root cause is that the fence is signaled during sync transfer.
Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-