- 01 7月, 2022 12 次提交
-
-
由 Philip Yang 提交于
Output user queue eviction and restore event. User queue eviction may be triggered by svm or userptr MMU notifier, TTM eviction, device suspend and CRIU checkpoint and restore. User queue restore may be rescheduled if eviction happens again while restore. Signed-off-by: NPhilip Yang <Philip.Yang@amd.com> Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Philip Yang 提交于
For migration start and end event, output timestamp when migration starts, ends, svm range address and size, GPU id of migration source and destination and svm range attributes, Migration trigger could be prefetch, CPU or GPU page fault and TTM eviction. Signed-off-by: NPhilip Yang <Philip.Yang@amd.com> Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Philip Yang 提交于
Use ktime_get_boottime_ns() as timestamp to correlate with other APIs. Output timestamp when GPU recoverable fault starts and ends to recover the fault, if migration happened or only GPU page table is updated to recover, fault address, if read or write fault. Signed-off-by: NPhilip Yang <Philip.Yang@amd.com> Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Philip Yang 提交于
Process receive event from same process by default. Add a flag to be able to receive event from all processes, this requires super user permission. Event using pid 0 to send the event to all processes, to keep the default behavior of existing SMI events. Signed-off-by: NPhilip Yang <Philip.Yang@amd.com> Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Philip Yang 提交于
Define new system management interface event IDs for migration, GPU recoverable page fault, user queues eviction, restore and unmap from GPU events and corresponding event triggers, those will be implemented in the following patches. Signed-off-by: NPhilip Yang <Philip.Yang@amd.com> Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Jack Xiao 提交于
This reverts commit 8748de87 since drv enabled mes to access registers. Signed-off-by: NJack Xiao <Jack.Xiao@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Jack Xiao 提交于
Enable mes to access registers. v2: squash mes sched ring enablement flag Signed-off-by: NJack Xiao <Jack.Xiao@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Jack Xiao 提交于
Add mes register access routines: 1. read register 2. write register 3. wait register 4. write and wait register Signed-off-by: NJack Xiao <Jack.Xiao@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Jack Xiao 提交于
Add misc op commands in mes11. Signed-off-by: NJack Xiao <Jack.Xiao@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Jonathan Kim 提交于
GFX10 and up have work group processors (WGP) and WGP mode is the native compile mode. KFD and ROCr have no visibility into whether a dispatch is operating in CU or WGP mode. Enforce CU masking to be pairwise continguous in enablement and round robin distribute CUs across the SEs in a pairwise manner to assume WGP mode at all times. Signed-off-by: NJonathan Kim <jonathan.kim@amd.com> Reviewed-by: NFelix Kuehling <felix.kuehling@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Jack Xiao 提交于
Add common interface for mes misc op, including accessing register interface. Signed-off-by: NJack Xiao <Jack.Xiao@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Jack Xiao 提交于
Update MES firmware api for accessing registers. Signed-off-by: NJack Xiao <Jack.Xiao@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 30 6月, 2022 11 次提交
-
-
由 Alex Deucher 提交于
Fixes this issue: drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:5094: warning: expecting prototype for amdgpu_device_gpu_recover_imp(). Prototype was for amdgpu_device_gpu_recover() instead Fixes: cf727044 ("drm/amdgpu: Rename amdgpu_device_gpu_recover_imp back to amdgpu_device_gpu_recover") Reviewed-by: NKent Russell <kent.russell@amd.com> Reported-by: NStephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
[Why] Redundant if-else cases for repeater and non-repeater checks [How] Without changing the core logic, rearranged the code by removing redundant checks Signed-off-by: NChandan Vurdigere Nataraj <chandan.vurdigerenataraj@amd.com> Reviewed-by: NLeo Li <sunpeng.li@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Aurabindo Pillai 提交于
[Why&How] Some userspace expect a backwards compatible modifier on DCN32/321. For hardware with num_pipes more than 16, we expose the most efficient modifier first. As a fall back method, we need to expose slightly inefficient modifier AMD_FMT_MOD_TILE_GFX9_64K_R_X after the best option. Also set the number of packers to fixed value as required per hardware documentation. This value is cached during hardware initialization and can be read through the base driver. Signed-off-by: NAurabindo Pillai <aurabindo.pillai@amd.com> Reviewed-by: NBas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Aurabindo Pillai 提交于
[Why&How] TA firmware is needed to enable HDCP. Changes in v2: Load separate firmware for PSP 13.0.0 Signed-off-by: NAurabindo Pillai <aurabindo.pillai@amd.com> Acked-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Mauro Carvalho Chehab 提交于
This symbol is missing documentation: drivers/gpu/drm/amd/include/amd_shared.h:224: warning: Enum value 'PP_GFX_DCS_MASK' not described in enum 'PP_FEATURE_MASK' Document it. Fixes: 680602d6 ("drm/amd/pm: enable DCS") Signed-off-by: NMauro Carvalho Chehab <mchehab@kernel.org> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Mauro Carvalho Chehab 提交于
There are 4 undocumented fields at struct amdgpu_display_manager. Add documentation for them, fixing those warnings: drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h:544: warning: Function parameter or member 'dmub_outbox_params' not described in 'amdgpu_display_manager' drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h:544: warning: Function parameter or member 'num_of_edps' not described in 'amdgpu_display_manager' drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h:544: warning: Function parameter or member 'disable_hpd_irq' not described in 'amdgpu_display_manager' drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h:544: warning: Function parameter or member 'dmub_aux_transfer_done' not described in 'amdgpu_display_manager' drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h:544: warning: Function parameter or member 'delayed_hpd_wq' not described in 'amdgpu_display_manager' Signed-off-by: NMauro Carvalho Chehab <mchehab@kernel.org> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Tom Rix 提交于
sparse reports drivers/gpu/drm/amd/amdgpu/../display/dc/irq/dcn32/irq_service_dcn32.c:39:20: warning: symbol 'to_dal_irq_source_dcn32' was not declared. Should it be static? to_dal_irq_source_dnc32() is only referenced in irq_service_dnc32.c, so change its storage class specifier to static. Fixes: 0efd4374 ("drm/amd/display: add dcn32 IRQ changes") Reviewed-by: NAurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: NTom Rix <trix@redhat.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Tom Rix 提交于
sparse reports drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link_dp.c:3885:6: warning: symbol 'FORCE_RATE' was not declared. Should it be static? drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link_dp.c:3886:10: warning: symbol 'FORCE_LANE_COUNT' was not declared. Should it be static? Neither of thse variables is used in dc_link_dp.c. Reviewing the commit listed in the fixes tag shows neither was used in the original patch. So remove them. Fixes: 265280b9 ("drm/amd/display: add CLKMGR changes for DCN32/321") Reviewed-by: NAurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: NTom Rix <trix@redhat.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Alex Deucher 提交于
No longer used so drop it. Fixes: ec457f83 ("drm/amd/display: Drop unnecessary detect link code") Reported-by: Nkernel test robot <lkp@intel.com> Reviewed-by: NHarry Wentland <harry.wentland@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Kent Russell 提交于
Change amdggpu to amdgpu and pedning to pending Signed-off-by: NKent Russell <kent.russell@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Souptick Joarder (HPE) 提交于
Kernel test robot throws below warning -> drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c: In function 'dc_link_reduce_mst_payload': drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:3782:32: warning: variable 'ret' set but not used [-Wunused-but-set-variable] 3782 | enum act_return_status ret; Removed the unused ret variable. Reported-by: Nkernel test robot <lkp@intel.com> Signed-off-by: NSouptick Joarder (HPE) <jrdr.linux@gmail.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 28 6月, 2022 12 次提交
-
-
由 Philip Yang 提交于
This reverts commit ab8529b0. This causes KFDTest KFDMemoryTest.MemoryRegister test failed on gfx9. Signed-off-by: NPhilip Yang <Philip.Yang@amd.com> Acked-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Rahul Kumar 提交于
We observed hard hang due to NULL derefrence This issue is seen after running system all the time after two or three days struct dc *dc = plane_state->ctx->dc; Randomly in long run we found plane_state or plane_state->ctx is found NULL which causes exception. BUG: kernel NULL pointer dereference, address: 0000000000000000 PF: supervisor read access in kernel mode PF: error_code(0x0000) - not-present page PGD 1dc7f2067 P4D 1dc7f2067 PUD 222c75067 PMD 0 Oops: 0000 [#1] SMP NOPTI CPU: 5 PID: 29855 Comm: kworker/u16:4 ... ... Workqueue: events_unbound commit_work [drm_kms_helper] RIP: 0010:dcn10_update_pending_status+0x1f/0xee [amdgpu] Code: 41 5f c3 0f 1f 44 00 00 b0 01 c3 0f 1f 44 00 00 41 55 41 54 55 53 48 8b 1f 4c 8b af f8 00 00 00 48 8b 83 88 03 00 00 48 85 db <4c> 8b 20 0f 84 bf 00 00 00 48 89 fd 48 8b bf b8 00 00 00 48 8b 07 RSP: 0018:ffff942941997ab8 EFLAGS: 00010286 RAX: 0000000000000000 RBX: ffff8d7fd98d2000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffff8d7e3e87c708 RDI: ffff8d7f2d8c0690 RBP: ffff8d7f2d8c0000 R08: ffff942941997a34 R09: 00000000ffffffff R10: 0000000000005000 R11: 00000000000000f0 R12: ffff8d7f2d8c0690 R13: ffff8d8035a41680 R14: 00000000000186a0 R15: ffff8d7f2d8c1dd8 FS: 0000000000000000(0000) GS:ffff8d8037340000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000148030000 CR4: 00000000003406e0 Call Trace: dc_commit_state+0x6a2/0x7f0 [amdgpu] amdgpu_dm_atomic_commit_tail+0x460/0x19bb [amdgpu] Tested-by: NRodrigo Siqueira <Rodrigo.Siqueira@amd.com> Reviewed-by: NHarry Wentland <harry.wentland@amd.com> Signed-off-by: NRahul Kumar <rahul.kumar1@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Andrey Grodzovsky 提交于
Align refcount behaviour for amdgpu_job embedded HW fence with classic pointer style HW fences by increasing refcount each time emit is called so amdgpu code doesn't need to make workarounds using amdgpu_job.job_run_counter to keep the HW fence refcount balanced. Also since in the previous patch we resumed setting s_fence->parent to NULL in drm_sched_stop switch to directly checking if job->hw_fence is signaled to short circuit reset if already signed. Signed-off-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Tested-by: NYiqing Yao <yiqing.yao@amd.com> Acked-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Andrey Grodzovsky 提交于
Problem: This patch caused negative refcount as described in [1] because for that case parent fence did not signal by the time of drm_sched_stop and hence kept in pending list the assumption was they will not signal and so fence was put to account for the s_fence->parent refcount but for amdgpu which has embedded HW fence (always same parent fence) drm_sched_fence_release_scheduled was always called and would still drop the count for parent fence once more. For jobs that never signaled this imbalance was masked by refcount bug in amdgpu_fence_driver_clear_job_fences that would not drop refcount on the fences that were removed from fence drive fences array (against prevois insertion into the array in get in amdgpu_fence_emit). Fix: Revert this patch and by setting s_job->s_fence->parent to NULL as before prevent the extra refcount drop in amdgpu when drm_sched_fence_release_scheduled is called on job release. Also - align behaviour in drm_sched_resubmit_jobs_ext with that of drm_sched_main when submitting jobs - take a refcount for the new parent fence pointer and drop refcount for original kref_init for new HW fence creation (or fake new HW fence in amdgpu - see next patch). [1] - https://lore.kernel.org/all/731b7ff1-3cc9-e314-df2a-7c51b76d4db0@amd.com/t/#r00c728fcc069b1276642c325bfa9d82bf8fa21a3Signed-off-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Tested-by: NYiqing Yao <yiqing.yao@amd.com> Acked-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Andrey Grodzovsky 提交于
Problem: After we start handling timed out jobs we assume there fences won't be signaled but we cannot be sure and sometimes they fire late. We need to prevent concurrent accesses to fence array from amdgpu_fence_driver_clear_job_fences during GPU reset and amdgpu_fence_process from a late EOP interrupt. Fix: Before accessing fence array in GPU disable EOP interrupt and flush all pending interrupt handlers for amdgpu device's interrupt line. v2: Switch from irq_get/put to full enable/disable_irq for amdgpu Signed-off-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Acked-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Andrey Grodzovsky 提交于
This function should drop the fence refcount when it extracts the fence from the fence array, just as it's done in amdgpu_fence_process. Signed-off-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Leslie Shi 提交于
Signed-off-by: NLeslie Shi <Yuliang.Shi@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Jack Xiao 提交于
MES requires mc wptr address for usermode queues. Export bo gart address for mc wptr address. Signed-off-by: NJack Xiao <Jack.Xiao@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
[Why] Existing logs doesn't print DP LT failure reason [How] Update the existing log with DP LT failure reason Signed-off-by: NChandan Vurdigere Nataraj <chandan.vurdigerenataraj@amd.com> Reviewed-by: NLeo Li <sunpeng.li@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Evan Quan 提交于
Enable VR0 Hot support for SMU 13.0.0. Signed-off-by: NEvan Quan <evan.quan@amd.com> Acked-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Evan Quan 提交于
Update GFX11 cs related settings. Signed-off-by: NEvan Quan <evan.quan@amd.com> Acked-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Alex Deucher 提交于
Move more stack variable in to dummy vars structure on the heap. Fixes stack frame size errors: drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_32.c: In function 'dml32_ModeSupportAndSystemConfigurationFull': drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_32.c:3833:1: error: the frame size of 2720 bytes is larger than 2048 bytes [-Werror=frame-larger-than=] 3833 | } // ModeSupportAndSystemConfigurationFull | ^ Fixes: dda4fb85 ("drm/amd/display: DML changes for DCN32/321") Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Cc: Rodrigo Siqueira Jordao <Rodrigo.Siqueira@amd.com> Reviewed-by: NRodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 24 6月, 2022 5 次提交
-
-
由 Bas Nieuwenhuizen 提交于
This reverts commit 5089c4a8. This breaks validation and enumeration of display capable modifiers. The early return true means the rest of the validation code never gets executed, and we need that to enumerate the right modifiers to userspace for the format. The modifiers that are in the initial list generated for a plane are the superset for all formats and we need the proper checks in this function to filter some of them out for formats with which they're invalid to be used. Furthermore, the safety contract here is that we validate the incoming modifiers to ensure the kernel can handle them and the display hardware can handle them. This includes e.g. rejecting multi-plane images with DCC. Note that the legacy swizzle mechanism allows encoding more swizzles, and at fb creation time we convert them to modifiers and reject those with no corresponding modifiers. If we are seeing rejections I'm happy to help define modifiers that correspond to those, or if absolutely needed implement a fallback path to allow for less strict validation of the legacy path. However, I'd like to revert this patch, since any of these is going to be a significant rework of the patch, and I'd rather not the regression gets into a release or forgotten in the meantime. Reviewed-by: NAurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: NBas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Colin Ian King 提交于
There is a spelling mistake in a dml_print message. Fix it. Reviewed-by: NHarry Wentland <harry.wentland@amd.com> Signed-off-by: NColin Ian King <colin.i.king@gmail.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Colin Ian King 提交于
There is a spelling mistake in a pr_debug message. Fix it. Signed-off-by: NColin Ian King <colin.i.king@gmail.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Alex Deucher 提交于
This reverts commit 92020e81. This causes stuttering and timeouts with DMCUB for some users so revert it until we understand why and safely enable it to save power. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1887Acked-by: NHarry Wentland <harry.wentland@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com> Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
-
由 Jiang Jian 提交于
there is an unexpected word 'for' in the comments that need to be dropped file - drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c line - 245 * position and also advance the position for for Vega10 changed to: * position and also advance the position for Vega10 Signed-off-by: NJiang Jian <jiangjian@cdjrlc.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-