- 02 8月, 2019 36 次提交
-
-
由 Tao Zhou 提交于
ce can also trigger interrupt, and even both ce and ue error can be found in one ras query, distinguishing between ce and ue in interrupt handler is uncessary. Signed-off-by: NTao Zhou <tao.zhou1@amd.com> Suggested-by: NGuchun Chen <guchun.chen@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Tao Zhou 提交于
we only read error information for correctable error in interrupt handler, gpu reset is unnecessary since there is no data lost in correctable error Signed-off-by: NTao Zhou <tao.zhou1@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Tao Zhou 提交于
the initial value of ecc error count can be adjusted Signed-off-by: NTao Zhou <tao.zhou1@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Tao Zhou 提交于
enable umc ce interrupt and initialize ecc error count Signed-off-by: NTao Zhou <tao.zhou1@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Tao Zhou 提交于
correctable error can also trigger interrupt in some ras blocks Signed-off-by: NTao Zhou <tao.zhou1@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Tao Zhou 提交于
umc error address query can get ce/ue error address and clear error status Signed-off-by: NTao Zhou <tao.zhou1@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Tao Zhou 提交于
use umc_for_each_channel to make code simpler Signed-off-by: NTao Zhou <tao.zhou1@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Tao Zhou 提交于
common function for all umc versions, loop for each umc channel is a frequent used operation in umc block, define it as a macro to simplify code Signed-off-by: NTao Zhou <tao.zhou1@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Tao Zhou 提交于
add initialization for new members of amdgpu_umc structure Signed-off-by: NTao Zhou <tao.zhou1@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Tao Zhou 提交于
expose more parameters and functions of specific umc version to common umc layer, so amdgpu_umc layer and other blocks could access them Signed-off-by: NTao Zhou <tao.zhou1@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Tao Zhou 提交于
clearing MCA_STATUS is enough to reset the whole MCA, writing zero to MCA_ADDR is unnecessary Signed-off-by: NTao Zhou <tao.zhou1@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Kevin Wang 提交于
too frequently to update mertrics table will cause smu internal error. Signed-off-by: NKevin Wang <kevin1.wang@amd.com> Reviewed-by: NEvan Quan <evan.quan@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Nicholas Kazlauskas 提交于
[Why] DRM private objects have no hw_done/flip_done fencing mechanism on their own and cannot be used to sequence commits accordingly. When issuing commits that don't touch the same set of hardware resources like page-flips on different CRTCs we can run into the issue below because of this: 1. Client requests non-blocking Commit #1, has a new dc_state #1, state is swapped, commit tail is deferred to work queue 2. Client requests non-blocking Commit #2, has a new dc_state #2, state is swapped, commit tail is deferred to work queue 3. Commit #2 work starts, commit tail finishes, atomic state is cleared, dc_state #1 is freed 4. Commit #1 work starts, commit tail encounters null pointer deref on dc_state #1 In order to change the DC state as in the private object we need to ensure that we wait for all outstanding commits to finish and that any other pending commits must wait for the current one to finish as well. We do this for MEDIUM and FULL updates. But not for FAST updates, nor would we want to since it would cause stuttering from the delays. FAST updates that go through dm_determine_update_type_for_commit always create a new dc_state and lock the DRM private object if there are any changed planes. We need the old state to validate, but we don't actually need the new state here. [How] If the commit isn't a full update then the use after free can be resolved by simply discarding the new state entirely and retaining the existing one instead. With this change the sequence above can be reexamined. Commit #2 will still free Commit #1's reference, but before this happens we actually added an additional reference as part of Commit #2. If an update comes in during this that needs to change the dc_state it will need to wait on Commit #1 and Commit #2 to finish. Then it'll swap the state, finish the work in commit tail and drop the last reference on Commit #2's dc_state. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204181 Fixes: 004b3938 ("drm/amd/display: Check scaling info when determing update type") Signed-off-by: NNicholas Kazlauskas <nicholas.kazlauskas@amd.com> Acked-by: NAlex Deucher <alexander.deucher@amd.com> Reviewed-by: NDavid Francis <david.francis@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Nicholas Kazlauskas 提交于
[Why] By passing through the dm_determine_update_type_for_commit for atomic commits that can be done asynchronously we are incurring a performance penalty by locking access to the global private object and holding that access until the end of the programming sequence. This is also allocating a new large dc_state on every access in addition to retaining all the references on each stream and plane until the end of the programming sequence. [How] Shift the determination for async update before validation. Return early if it's going to be an async update. Signed-off-by: NNicholas Kazlauskas <nicholas.kazlauskas@amd.com> Acked-by: NAlex Deucher <alexander.deucher@amd.com> Reviewed-by: NDavid Francis <david.francis@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Nicholas Kazlauskas 提交于
[Why] We previously allowed framebuffer swaps as async updates for cursor planes but had to disable them due to a bug in DRM with async update handling and incorrect ref counting. The check to block framebuffer swaps has been added to DRM for a while now, so this check is redundant. The real fix that allows this to properly in DRM has also finally been merged and is getting backported into stable branches, so dropping this now seems to be the right time to do so. [How] Drop the redundant check for old_fb != new_fb. With the proper fix in DRM, this should also fix some cursor stuttering issues with xf86-video-amdgpu since it double buffers the cursor. IGT tests that swap framebuffers (-varying-size for example) should also pass again. Signed-off-by: NNicholas Kazlauskas <nicholas.kazlauskas@amd.com> Acked-by: NAlex Deucher <alexander.deucher@amd.com> Reviewed-by: NDavid Francis <david.francis@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Colin Ian King 提交于
Currenly the error check on variable instance is always false because it is a uint32_t type and this is never less than zero. Fix this by making it an int type. Addresses-Coverity: ("Unsigned compared against 0") Fixes: 7d0e6329 ("drm/amdgpu: update more sdma instances irq support") Signed-off-by: NColin Ian King <colin.king@canonical.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Matt Coffin 提交于
[Why] Before this change, the fan control state on smu_v11 was not able to be changed because the capability check for checking if the fan control capability existed was inverted. [How] The capability check for fan control in smu_v11_0_auto_fan_control was inverted, to correctly check for the absence, instead of presence of fan control capabilities. Reviewed-by: NEvan Quan <evan.quan@amd.com> Signed-off-by: NMatt Coffin <mcoffin13@gmail.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Colin Ian King 提交于
There are a few spelling mistakes "unknow" -> "unknown" and "enabeld" -> "enabled". Fix these. Signed-off-by: NColin Ian King <colin.king@canonical.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Jia-Ju Bai 提交于
In radeon_connector_set_property(), there is an if statement on line 743 to check whether connector->encoder is NULL: if (connector->encoder) When connector->encoder is NULL, it is used on line 755: if (connector->encoder->crtc) Thus, a possible null-pointer dereference may occur. To fix this bug, connector->encoder is checked before being used. This bug is found by a static analysis tool STCheck written by us. Signed-off-by: NJia-Ju Bai <baijiaju1990@gmail.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Colin Ian King 提交于
There are two occurrances of off-by-one upper bound checking of indexes causing potential out-of-bounds array reads. Fix these. Addresses-Coverity: ("Out-of-bounds read") Fixes: cb33363d ("drm/amd/powerplay: add smu feature name support") Fixes: 6b294793 ("drm/amd/powerplay: add smu message name support") Signed-off-by: NColin Ian King <colin.king@canonical.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 KyleMahlkuch 提交于
During kexec some adapters hit an EEH since they are not properly shut down in the radeon_pci_shutdown() function. Adding radeon_suspend_kms() fixes this issue. Signed-off-by: NKyleMahlkuch <kmahlkuc@linux.vnet.ibm.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Jay Cornwall 提交于
Following bitmap layout logic introduced by: "drm/amdgpu: support get_cu_info for Arcturus". v2: squash in fixup for gfx_v9_0.c (Alex) v3: squash in debug print output fix Signed-off-by: NJay Cornwall <Jay.Cornwall@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Le Ma 提交于
This change is because SE/SH layout on Arcturus is 8*1, different from 4*2(or 4*1) on Vega ASICs. Currently the cu bitmap array is 4x4 size, and besides the bitmap is used widely across SW stack. To mostly reduce the scale of impact, we make the cu bitmap array compatible with SE/SH layout on Arcturus. Then the store of cu bits of each shader array for Arcturus will be like below: SE0,SH0 --> bitmap[0][0] SE1,SH0 --> bitmap[1][0] SE2,SH0 --> bitmap[2][0] SE3,SH0 --> bitmap[3][0] SE4,SH0 --> bitmap[0][1] SE5,SH0 --> bitmap[1][1] SE6,SH0 --> bitmap[2][1] SE7,SH0 --> bitmap[3][1] Signed-off-by: NLe Ma <le.ma@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Kent Russell 提交于
The registers used for VG20 are different in that certain performance counters were split off to TXCLK3/4. Vega10/12 doesn't have this, so add a new vg20_get_pcie_usage to reflect this change. Signed-off-by: NKent Russell <kent.russell@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Kent Russell 提交于
These are added for VG20, and are needed for PCIe bandwidth. Signed-off-by: NKent Russell <kent.russell@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Andrey Grodzovsky 提交于
Fixes GPU reset crash. Signed-off-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Felix Kuehling 提交于
Memory used by KFD applications can contain sensitive information that should not be leaked to other processes. The current approach to prevent leaks is to clear VRAM at allocation time. This is not effective because memory can be reused in other ways without being cleared. Synchronously clearing memory on the allocation path also carries a significant performance penalty. Stop clearing memory at allocation time. Instead mark the memory for wipe on release. Signed-off-by: NFelix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Felix Kuehling 提交于
Wipe VRAM memory containing sensitive data when moving or releasing BOs. Clearing the memory is pipelined to minimize any impact on subsequent memory allocation latency. Use of a poison value should help debug future use-after-free bugs. When moving BOs, the existing ttm_bo_pipelined_move ensures that the memory won't be reused before being wiped. When releasing BOs, the BO is fenced with the memory fill operation, which results in queuing the BO for a delayed delete. v2: Move amdgpu_amdkfd_unreserve_memory_limit into amdgpu_bo_release_notify so that KFD can use memory that's still being cleared in the background Signed-off-by: NFelix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Felix Kuehling 提交于
This memory allocation flag will be used to indicate BOs containing sensitive data that should not be leaked to other processes. Signed-off-by: NFelix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Felix Kuehling 提交于
This notifies the driver that a BO is about to be released. Releasing a BO also invokes the move_notify callback from ttm_bo_cleanup_memtype_use, but that happens too late for anything that would add fences to the BO and require a delayed delete. Signed-off-by: NFelix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Leo Li 提交于
Using a static int array will cause errors if the given dm_pp_clk_type is out-of-bounds. For robustness, use a switch table, with a default case to handle all invalid values. v2: 0 is a valid clock type for smu_clk_type. Return SMU_CLK_COUNT instead on invalid mapping. Signed-off-by: NLeo Li <sunpeng.li@amd.com> Reviewed-by: NNicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Nathan Chancellor 提交于
clang warns: drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_pp_smu.c:336:8: warning: implicit conversion from enumeration type 'enum smu_clk_type' to different enumeration type 'enum amd_pp_clock_type' [-Wenum-conversion] dc_to_smu_clock_type(clk_type), ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_pp_smu.c:421:14: warning: implicit conversion from enumeration type 'enum amd_pp_clock_type' to different enumeration type 'enum smu_clk_type' [-Wenum-conversion] dc_to_pp_clock_type(clk_type), ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ There are functions to properly convert between all of these types, use them so there are no longer any warnings. Fixes: a43913ea ("drm/amd/powerplay: add function get_clock_by_type_with_latency for navi10") Fixes: e5e4e223 ("drm/amd/powerplay: add interface to get clock by type with latency for display (v2)") Link: https://github.com/ClangBuiltLinux/linux/issues/586Signed-off-by: NNathan Chancellor <natechancellor@gmail.com> Reviewed-by: NLeo Li <sunpeng.li@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Monk Liu 提交于
previously the ucode loading of PSP was repreated, one executed in phase_1 init/re-init/resume and the other in fw_loading routine Avoid this double loading by clearing ip_blocks.status.hw in suspend or reset prior to the FW loading and any block's hw_init/resume v2: still do the smu fw loading since it is needed by bare-metal v3: drop the change in reinit_early_sriov, just clear all block's status.hw in the head place and set the status.hw after hw_init done is enough Signed-off-by: NMonk Liu <Monk.Liu@amd.com> Reviewed-by: NEmily Deng <Emily.Deng@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Monk Liu 提交于
for SRIOV the SOS fw of PSP is loaded in hypervisor thus guest won't tell the version of it, and judging feature by reading the sos fw version in guest side is completely wrong Signed-off-by: NMonk Liu <Monk.Liu@amd.com> Reviewed-by: NEmily Deng <Emily.Deng@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Monk Liu 提交于
we can simplify all those unnecessary function under SRIOV for vega10 since: 1) PSP L1 policy is by force enabled in SRIOV 2) original logic always set all flags which make itself a dummy step besides, 1) the ih_doorbell_range set should also be skipped for VEGA10 SRIOV. 2) the gfx_common registers should also be skipped for VEGA10 SRIOV. Signed-off-by: NMonk Liu <Monk.Liu@amd.com> Reviewed-by: NEmily Deng <Emily.Deng@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Kevin Wang 提交于
before this change, the pp_feature sysfs show feature enable state by logic feature id, it is not easy to read. this change will sort pp_features show index by asic feature id. before: features high: 0x00000623 low: 0xb3cdaffb 00. DPM_PREFETCHER ( 0) : enabeld 01. DPM_GFXCLK ( 1) : enabeld 02. DPM_UCLK ( 3) : enabeld 03. DPM_SOCCLK ( 4) : enabeld 04. DPM_MP0CLK ( 5) : enabeld 05. DPM_LINK ( 6) : enabeld 06. DPM_DCEFCLK ( 7) : enabeld 07. DS_GFXCLK (10) : enabeld 08. DS_SOCCLK (11) : enabeld 09. DS_LCLK (12) : disabled 10. PPT (23) : enabeld 11. TDC (24) : enabeld 12. THERMAL (33) : enabeld 13. RM (35) : disabled ...... after: features high: 0x00000623 low: 0xb3cdaffb 00. DPM_PREFETCHER ( 0) : enabeld 01. DPM_GFXCLK ( 1) : enabeld 02. DPM_GFX_PACE ( 2) : disabled 03. DPM_UCLK ( 3) : enabeld 04. DPM_SOCCLK ( 4) : enabeld 05. DPM_MP0CLK ( 5) : enabeld 06. DPM_LINK ( 6) : enabeld 07. DPM_DCEFCLK ( 7) : enabeld 08. MEM_VDDCI_SCALING ( 8) : enabeld 09. MEM_MVDD_SCALING ( 9) : enabeld 10. DS_GFXCLK (10) : enabeld 11. DS_SOCCLK (11) : enabeld 12. DS_LCLK (12) : disabled 13. DS_DCEFCLK (13) : enabeld ...... Signed-off-by: NKevin Wang <kevin1.wang@amd.com> Reviewed-by: NKenneth Feng <kenneth.feng@amd.com> Reviewed-by: NEvan Quan <evan.quan@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 01 8月, 2019 4 次提交
-
-
由 Alex Deucher 提交于
Same as navi10. Reviewed-by: NXiaojie Yuan <xiaojie.yuan@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Dennis Li 提交于
some subblocks of gfx fail in inject test, disable them Signed-off-by: NDennis Li <Dennis.Li@amd.com> Reviewed-by: NTao Zhou <tao.zhou1@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Dennis Li 提交于
check gfx error count in both ras querry function and ras interrupt handler. gfx ras is still disabled by default due to known stability issue found in gpu reset. Signed-off-by: NDennis Li <Dennis.Li@amd.com> Reviewed-by: NTao Zhou <tao.zhou1@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Dennis Li 提交于
Add functions for RAS error inject and query error counter Signed-off-by: NDennis Li <Dennis.Li@amd.com> Reviewed-by: NTao Zhou <tao.zhou1@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-