- 07 12月, 2017 2 次提交
-
-
由 Xiangliang.Yu 提交于
Do ring clear before ring test, otherwise compute ring test will fail after gpu resetting. Still can't find the root cause, just workaround it. Signed-off-by: NXiangliang.Yu <Xiangliang.Yu@amd.com> Acked-by: NMonk Liu <Monk.Liu@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Pixel Ding 提交于
Retry at drm_dev_register instead of amdgpu_device_init. Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NPixel Ding <Pixel.Ding@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 05 12月, 2017 38 次提交
-
-
由 Monk Liu 提交于
Signed-off-by: NMonk Liu <Monk.Liu@amd.com> Acked-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 pding 提交于
It introduces 900ms latency in exclusive mode which causes failure of driver loading. Host can resize the BAR before guest staring, so the resizing is not necessary here. Signed-off-by: NPixel Ding <Pixel.Ding@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 pding 提交于
Signed-off-by: Npding <Pixel.Ding@amd.com> Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 pding 提交于
Also finalize kfd inside amdgpu_device_fini. kfd device_init needs SRIOV exclusive accessing. Try to gather exclusive accessing to reduce time consuming. Signed-off-by: Npding <Pixel.Ding@amd.com> Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
Just allocate the GART space and fill it. This prevents forcing the BO to be idle. v2: don't unbind/bind at all, just fill the allocated GART space Signed-off-by: NChristian König <christian.koenig@amd.com> Acked-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
We actually don't bind here, but rather allocate GART space if necessary. Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Hawking Zhang 提交于
Signed-off-by: NHawking Zhang <Hawking.Zhang@amd.com> Acked-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
Try to resize BAR0 to let CPU access all of VRAM. v2: rebased, style cleanups, disable mem decode before resize, handle gmc_v9 as well, round size up to power of two. v3: handle gmc_v6 as well, release and reassign all BARs in the driver. v4: rename new function to amdgpu_device_resize_fb_bar, reenable mem decoding only if all resources are assigned. v5: reorder resource release, return -ENODEV instead of BUG_ON(). v6: squash in rebase fix Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Horace Chen 提交于
The previous solution will create a zero buffer on the system domain and then move the zeroes to the VRAM. This will break the original data on the VRAM. Refine the code to create bo on VRAM domain directly and then remove and re-create mem node to the exact position before bo_pin. This can avoid breaking the data and will not cause eviction. Signed-off-by: NHorace Chen <horace.chen@amd.com> Reviewed-by: Nmonk liu <monk.liu@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 pding 提交于
This is caused of that hypervisor fails to handle request, one known issue is MMIO unblocking timeout. In theory we can retry init here. Signed-off-by: Npding <Pixel.Ding@amd.com> Reviewed-by: NXiangliang Yu <Xiangliang.Yu@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 pding 提交于
Reported-by: NSun Gary <Gary.Sun@amd.com> Signed-off-by: Npding <Pixel.Ding@amd.com> Reviewed-by: NXiangliang Yu <Xiangliang.Yu@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Michel Dänzer 提交于
Not used anymore. Signed-off-by: NMichel Dänzer <michel.daenzer@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Acked-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Michel Dänzer 提交于
Hardcoding the maximum numbers could result in spurious error messages from the IRQ state callbacks, e.g. on Polaris 11/12: [drm:dce_v11_0_set_pageflip_irq_state [amdgpu]] *ERROR* invalid pageflip crtc 5 [drm:amdgpu_irq_disable_all [amdgpu]] *ERROR* error disabling interrupt (-22) Signed-off-by: NMichel Dänzer <michel.daenzer@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Acked-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
The GTT manager handles the GART address space anyway, so it is completely pointless to keep the same information around twice. v2: rebased Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NChunming Zhou <david1.zhou@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
Rename amdgpu_gtt_mgr_is_allocated() to amdgpu_gtt_mgr_has_gart_addr() and use that instead. v2: rename the function as well. Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NChunming Zhou <david1.zhou@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Monk Liu 提交于
Signed-off-by: NMonk Liu <Monk.Liu@amd.com> Acked-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Monk Liu 提交于
for SR-IOV when doing gpu reset this routine shouldn't do resource allocating otherwise memory leak Signed-off-by: NMonk Liu <Monk.Liu@amd.com> Acked-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Monk Liu 提交于
1,no sriov check since gpu recover is unified 2,need CPU_ACCESS_REQUIRED flag for VRAM if SRIOV because otherwise after following PIN the first allocated VRAM bo is wasted due to some TTM mgr reason. Signed-off-by: NMonk Liu <Monk.Liu@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Monk Liu 提交于
since now gpu reset is unified with gpu_recover for both bare-metal and SR-IOV: 1)rename in_sriov_reset to in_gpu_reset 2)move lock_reset from adev->virt to adev Signed-off-by: NMonk Liu <Monk.Liu@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Monk Liu 提交于
1,new imple names amdgpu_gpu_recover which gives more hint on what it does compared with gpu_reset 2,gpu_recover unify bare-metal and SR-IOV, only the asic reset part is implemented differently 3,gpu_recover will increase hang job karma and mark its entity/context as guilty if exceeds limit V2: 4,in scheduler main routine the job from guilty context will be immedialy fake signaled after it poped from queue and its fence be set with "-ECANCELED" error 5,in scheduler recovery routine all jobs from the guilty entity would be dropped 6,in run_job() routine the real IB submission would be skipped if @skip parameter equales true or there was VRAM lost occured. V3: 7,replace deprecated gpu reset, use new gpu recover Signed-off-by: NMonk Liu <Monk.Liu@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Monk Liu 提交于
jobs are skipped under two cases 1)when the entity behind this job marked guilty, the job poped from this entity's queue will be dropped in sched_main loop. 2)in job_recovery(), skip the scheduling job if its karma detected above limit, and also skipped as well for other jobs sharing the same fence context. this approach is becuase job_recovery() cannot access job->entity due to entity may already dead. v2: some logic fix v3: when entity detected guilty, don't drop the job in the poping stage, instead set its fence error as -ECANCELED in run_job(), skip the scheduling either:1) fence->error < 0 or 2) there was a VRAM LOST occurred on this job. this way we can unify the job skipping logic. with this feature we can introduce new gpu recover feature. Signed-off-by: NMonk Liu <Monk.Liu@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
That was somehow completely of. Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Rex Zhu 提交于
the variable ref_clock was assigned same value twice in same function. Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NRex Zhu <Rex.Zhu@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Rex Zhu 提交于
Used to set up smu power logging. Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NRex Zhu <Rex.Zhu@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Rex Zhu 提交于
move powerplay and amdgpu shared structures and definitions to kgd_pp_interface.h. This is the interface between the base driver and powerplay. Acked-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NRex Zhu <Rex.Zhu@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Rex Zhu 提交于
Clean up the interface. Acked-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NRex Zhu <Rex.Zhu@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
We always use the BO mem now. v2: minor rebase Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NMichel Dänzer <michel.daenzer@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
Display can't seem to handle this correctly. Signed-off-by: NChristian König <christian.koenig@amd.com> Acked-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
We need to test if any domain fits, not all of them. Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
We always need to bind pinned BOs, not just when the caller requested the address. Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
This allows us to specify multiple possible placements again. Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 pding 提交于
The exclusive mode has real-time limitation in reality, such like being done in 300ms. It's easy observed if running many VF/VMs in single host with heavy CPU workload. If we find the init fails due to exclusive mode timeout, try it again. v2: - rewrite the condition for readable value. v3: - fix typo, add comments for sleep Acked-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: Npding <Pixel.Ding@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 pding 提交于
Reviewed-by: NMonk Liu <monk.liu@amd.com> Signed-off-by: Npding <Pixel.Ding@amd.com> Acked-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Evan Quan 提交于
Signed-off-by: NEvan Quan <evan.quan@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 pding 提交于
Driver can use this interface to check if there's a function level reset done in hypervisor. It's helpful when IRQ handler for reset is not ready, or special handling is required. Acked-by: NAlex Deucher <alexander.deucher@amd.com> Reviewed-by: NMonk Liu <monk.liu@amd.com> Signed-off-by: Npding <Pixel.Ding@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 pding 提交于
MMIO space can be blocked on virtualised device. Add this function to check if MMIO is blocked or not. Todo: need a reliable method such like communation with hypervisor. v2: - add comments inline Signed-off-by: Npding <Pixel.Ding@amd.com> Acked-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 pding 提交于
Normally all waiting get timeout if there's one. Release the lock and return immediately when timeout happens. v2: - set the se_sh to broadcase before return Acked-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: Npding <Pixel.Ding@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 pding 提交于
When this VF stays in exclusive mode for long, other VFs will be impacted. The redundant messages causes exclusive mode timeout when they're redirected. That is a normal use case for cloud service to redirect guest log to virtual serial port. Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: Npding <Pixel.Ding@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-