- 23 4月, 2020 22 次提交
-
-
由 Christian König 提交于
In pp_one_vf mode avoid the extra overhead and read/write the registers without the KIQ. Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NMonk Liu <monk.liu@amd.com> Acked-by: NYintian Tao <yintian.tao@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Dennis Li 提交于
If set error query ready in amdgpu_ras_late_init, which will cause some IP blocks aren't initialized, but their error query is ready. Signed-off-by: NDennis Li <Dennis.Li@amd.com> Reviewed-by: NGuchun Chen <guchun.chen@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Evan Quan 提交于
Make code more readable. Signed-off-by: NEvan Quan <evan.quan@amd.com> Reviewed-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Evan Quan 提交于
This is basically just some code cosmetic. The current design for XGMI setup gput reset is to operate on current device(adev) first and then on other devices from the hive(by another 'for' loop). But actually we can do some sort to the device list(to put current device 1st position) and handle all the devices in a single 'for' loop. V2: added missing hive->hive_lock protection Signed-off-by: NEvan Quan <evan.quan@amd.com> Reviewed-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Evan Quan 提交于
As for XGMI setup, it should be performed on other devices from the hive also. Signed-off-by: NEvan Quan <evan.quan@amd.com> Reviewed-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Evan Quan 提交于
As for XGMI setup, it needs to be performed on all the devices from the same hive. Signed-off-by: NEvan Quan <evan.quan@amd.com> Acked-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Bernard Zhao 提交于
Make the code a bit more readable by using a common error handling pattern. Signed-off-by: NBernard Zhao <bernard@vivo.com> Reviewed-by: Christian König <christian.koenig@amd.com>. Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: NFelix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Kevin Wang 提交于
clean up unused variable: 1. ring_lru_list 2. ring_lru_list_lock related-commit: drm/amdgpu: remove ring lru handling Signed-off-by: NKevin Wang <kevin1.wang@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Dennis Li 提交于
Prefix RAS message printing in gfx/mmhub with PCI device info, which assists the debug in multiple GPU case. Reviewed-by: NGuchun Chen <guchun.chen@amd.com> Signed-off-by: NDennis Li <Dennis.Li@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Jiawei 提交于
disble vblank in dce_vitual_crtc_commit(), which is skipped under sriov before Reviewed-by: NEmily Deng <Emily.Deng@amd.com> Signed-off-by: NJiawei <Jiawei.Gu@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Yong Zhao 提交于
This is convenient for multiple teams to obtain the information. Also, add device info by using dev_info(). Signed-off-by: NYong Zhao <Yong.Zhao@amd.com> Reviewed-by: NDennis Li <Dennis.Li@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Yong Zhao 提交于
Turn off the printing by default because it is not very useful, while adding more details. Signed-off-by: NYong Zhao <Yong.Zhao@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Jonathan Kim 提交于
Vega20 arbitrates pstate at hive level and not device level. Last peer to remote buffer unmap could drop P-State while another process is still remote buffer mapped. With this fix, P-States still needs to be disabled for now as SMU bug was discovered on synchronous P2P transfers. This should be fixed in the next FW update. Signed-off-by: NJonathan Kim <Jonathan.Kim@amd.com> Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 James Zhu 提交于
This reverts commit 3fded222 This is work around for vcn1 only. Currently vcn1 has separate begin_use and idle work handle. Signed-off-by: NJames Zhu <James.Zhu@amd.com> Tested-by: Nchangzhu <Changfeng.Zhu@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Guchun Chen 提交于
When running ras uncorrectable error injection and triggering GPU reset on sGPU, below issue is observed. It's caused by the list uninitialized when accessing. [ 80.047227] BUG: unable to handle page fault for address: ffffffffc0f4f750 [ 80.047300] #PF: supervisor write access in kernel mode [ 80.047351] #PF: error_code(0x0003) - permissions violation [ 80.047404] PGD 12c20e067 P4D 12c20e067 PUD 12c210067 PMD 41c4ee067 PTE 404316061 [ 80.047477] Oops: 0003 [#1] SMP PTI [ 80.047516] CPU: 7 PID: 377 Comm: kworker/7:2 Tainted: G OE 5.4.0-rc7-guchchen #1 [ 80.047594] Hardware name: System manufacturer System Product Name/TUF Z370-PLUS GAMING II, BIOS 0411 09/21/2018 [ 80.047888] Workqueue: events amdgpu_ras_do_recovery [amdgpu] Signed-off-by: NGuchun Chen <guchun.chen@amd.com> Reviewed-by: NJohn Clements <John.Clements@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Kent Russell 提交于
Update the list with supported Arcturus chips, but disable for now until final list is confirmed. Ideally we can poll atombios for FRU support, instead of maintaining this list of chips, but this will enable serial number reading for supported ASICs for the time-being. Signed-off-by: NKent Russell <kent.russell@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Rajneesh Bhardwaj 提交于
Fixes a minor typo in the file. Reviewed-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NRajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Kent Russell 提交于
This reverts commit c12b84d6. The original patch causes a RAS event and subsequent kernel hard-hang when running the KFDMemoryTest.PtraceAccessInvisibleVram on VG20 and Arcturus dmesg output at hang time: [drm] RAS event of type ERREVENT_ATHUB_INTERRUPT detected! amdgpu 0000:67:00.0: GPU reset begin! Evicting PASID 0x8000 queues Started evicting pasid 0x8000 qcm fence wait loop timeout expired The cp might be in an unrecoverable state due to an unsuccessful queues preemption Failed to evict process queues Failed to suspend process 0x8000 Finished evicting pasid 0x8000 Started restoring pasid 0x8000 Finished restoring pasid 0x8000 [drm] UVD VCPU state may lost due to RAS ERREVENT_ATHUB_INTERRUPT amdgpu: [powerplay] Failed to send message 0x26, response 0x0 amdgpu: [powerplay] Failed to set soft min gfxclk ! amdgpu: [powerplay] Failed to upload DPM Bootup Levels! amdgpu: [powerplay] Failed to send message 0x7, response 0x0 amdgpu: [powerplay] [DisableAllSMUFeatures] Failed to disable all smu features! amdgpu: [powerplay] [DisableDpmTasks] Failed to disable all smu features! amdgpu: [powerplay] [PowerOffAsic] Failed to disable DPM! [drm:amdgpu_device_ip_suspend_phase2 [amdgpu]] *ERROR* suspend of IP block <powerplay> failed -5 Signed-off-by: NKent Russell <kent.russell@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Alex Deucher 提交于
Fix screen corruption with firefox. Bug: https://bugzilla.kernel.org/show_bug.cgi?id=207171Reviewed-by: NHuang Rui <ray.huang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 John Clements 提交于
Set MP1 state to prepare for unload before reloading SMU FW Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NJohn Clements <john.clements@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 John Clements 提交于
Added dedicated function to check if particular fw should be skipped from loading. Added dedicated function for SMU FW loading via PSP Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NJohn Clements <john.clements@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Prike Liang 提交于
The system reboot failed as some IP blocks enter power gate before perform hw resource destory. Meanwhile use unify interface to set device CGPG to ungate state can simplify the amdgpu poweroff or reset ungate guard. Fixes: 487eca11 ("drm/amdgpu: fix gfx hang during suspend with video playback (v2)") Signed-off-by: NPrike Liang <Prike.Liang@amd.com> Tested-by: NMengbing Wang <Mengbing.Wang@amd.com> Tested-by: NPaul Menzel <pmenzel@molgen.mpg.de> Acked-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 14 4月, 2020 13 次提交
-
-
由 Jason Yan 提交于
This code is dead, let's remove it. Reported-by: NHulk Robot <hulkci@huawei.com> Signed-off-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Aurabindo Pillai 提交于
Let format prefixes take care of printing the module name through pr_fmt and dev_fmt definitions. Signed-off-by: NAurabindo Pillai <mail@aurabindo.in> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Aurabindo Pillai 提交于
Define dev_fmt macro for informative print messages Signed-off-by: NAurabindo Pillai <mail@aurabindo.in> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Aurabindo Pillai 提交于
amdgpu uses lots of pr_* calls for printing error messages. With this prefix, errors shall be more obvious to the end use regarding its origin, and may help debugging. Prefix format: [xxx.xxxxx] amdgpu: ... Signed-off-by: NAurabindo Pillai <mail@aurabindo.in> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Alex Deucher 提交于
Set up a GPU scheduler based on the ring flag rather than the ring type. Reviewed-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Alex Deucher 提交于
We don't want a GPU scheduler for this ring. Reviewed-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Alex Deucher 提交于
This allows IPs to flag whether a specific ring requires a GPU scheduler or not. E.g., sometimes instances of an IP are asymmetric and have different capabilities. Reviewed-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Evan Quan 提交于
Vram lost counter is wrongly increased by two during baco reset. V2: assumed vram lost for mode1 reset on all ASICs Signed-off-by: NEvan Quan <evan.quan@amd.com> Acked-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Guchun Chen 提交于
Prefix RAS message printing in GFX IP with PCI device info, which assists the debug in multiple GPU case. Signed-off-by: NGuchun Chen <guchun.chen@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Yintian Tao 提交于
If there is no GPU hang, user still can access debugfs through kiq. Signed-off-by: NYintian Tao <yttao@amd.com> Reviewed-by: NMonk Liu <Monk.liu@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Guchun Chen 提交于
Prefix ras related kernel message logging with PCI device info by replacing DRM_INFO/WARN/ERROR with dev_info/warn/err. This can clearly tell user about GPU device information where ras is. And add some other ras message printing to make it more clear and friendly as well. Suggested-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NGuchun Chen <guchun.chen@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Guchun Chen 提交于
Uncorrectable error count printing is missed when issuing UMC UE injection. When going to the error count log function in GPU recover work thread, there is no chance to get correct error count value by last error injection and print, because the error status register is automatically cleared after reading in UMC ecc irq callback. So add such message printing in UMC ecc irq cb to be consistent with other RAS error interrupt cases. Signed-off-by: NGuchun Chen <guchun.chen@amd.com> Reviewed-by: NTao Zhou <tao.zhou1@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Yintian Tao 提交于
Under bare metal, there is no more else to take care of the GPU register access through MMIO. Under Virtualization, to access GPU register is implemented through KIQ during run-time due to world-switch. Therefore, under SR-IOV user can only access debugfs to r/w GPU registers when meets all three conditions below. - amdgpu_gpu_recovery=0 - TDR happened - in_gpu_reset=0 v2: merge amdgpu_virt_can_access_debugfs() into amdgpu_virt_enable_access_debugfs() v3: drop ret variable in amdgpu_virt_enable_access_debugfs() and directly return result Signed-off-by: NYintian Tao <yttao@amd.com> Acked-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 09 4月, 2020 5 次提交
-
-
由 John Clements 提交于
added macro to define timeout Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NJohn Clements <john.clements@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Aurabindo Pillai 提交于
Execution will only reach here if the asserted condition is true. Hence there is no need for the additional check. Signed-off-by: NAurabindo Pillai <mail@aurabindo.in> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Aaron Liu 提交于
Make the fw_write_wait default case true since presumably all new gfx9 asics will have updated firmware. That is using unique WAIT_REG_MEM packet with opration=1. Signed-off-by: NAaron Liu <aaron.liu@amd.com> Tested-by: NAaron Liu <aaron.liu@amd.com> Tested-by: NYuxian Dai <Yuxian.Dai@amd.com> Acked-by: NAlex Deucher <alexander.deucher@amd.com> Acked-by: NHuang Rui <ray.huang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Hawking Zhang 提交于
add indirect access support to registers outside of mmio bar. Signed-off-by: NHawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Hawking Zhang 提交于
all the register access through kiq is redirected to amdgpu_kiq_rreg/amdgpu_kiq_wreg Signed-off-by: NHawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-