Commits · 12ffa55da60f8355a5e485bc6d612257a303147e · openeuler / Kernel

14 Sep, 2019 · 23 commits

drm/amdgpu: Fix bugs in amdgpu_device_gpu_recover in XGMI case. · 12ffa55d

Authored by Andrey Grodzovsky on Aug 30, 2019

Issue 1:
In the XGMI case, amdgpu_device_lock_adev for the other devices in the hive
was called too late, after their respective schedulers had already been accessed.
So relocate the lock to the point where the other devices are first accessed.

Issue 2:
Using amdgpu_device_ip_need_full_reset to switch the device list from
all devices in the hive to the single 'master' device that owns this reset
call is wrong, because when stopping schedulers we iterate over all the devices
in the hive but when restarting we only reactivate the 'master' device.
Also, in case amdgpu_device_pre_asic_reset concludes that a full reset IS
needed, we then have to stop schedulers for all devices in the hive and not
only the 'master', but with amdgpu_device_ip_need_full_reset we have
already missed the opportunity to do so. So just remove this logic and
always stop and start all schedulers for all devices in the hive.

Also minor cleanup and print fix.

v4: Minor coding style fix.
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

12ffa55d
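
The ordering described in Issue 1 and Issue 2 above can be illustrated with a small toy model. This is only a sketch under stated assumptions: `struct toy_dev`, `toy_sched_set` and `toy_hive_recover` are hypothetical names invented for illustration and are not the driver's real types or functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one GPU in an XGMI hive (not struct amdgpu_device). */
struct toy_dev {
    bool locked;            /* models the per-device reset lock being held */
    bool sched_running;     /* models the device's schedulers */
    int  unlocked_accesses; /* scheduler accesses made without the lock */
};

/* Every scheduler access goes through here so unlocked use can be counted. */
static void toy_sched_set(struct toy_dev *d, bool running)
{
    if (!d->locked)
        d->unlocked_accesses++;
    d->sched_running = running;
}

/* Lock each device before touching it, stop all, then restart all. */
static void toy_hive_recover(struct toy_dev *hive, size_t n)
{
    assert(hive != NULL);

    for (size_t i = 0; i < n; i++) {
        hive[i].locked = true;          /* Issue 1: take the lock first */
        toy_sched_set(&hive[i], false); /* stop schedulers on every device */
    }

    /* ... the ASIC reset itself would happen here ... */

    for (size_t i = 0; i < n; i++) {
        toy_sched_set(&hive[i], true);  /* Issue 2: restart every device */
        hive[i].locked = false;
    }
}
```

Because the stop loop and the restart loop walk the same list, no device can be left stopped, which is the asymmetry Issue 2 describes.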

drm/amdgpu: remove amdgpu_cs_try_evict · 43ce6bab

Authored by Christian König on Aug 30, 2019

Trying to evict things from the current working set doesn't work that
well anymore because of per VM BOs.

Rely on reserving VRAM for page tables to avoid contention.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

43ce6bab

drm/amdgpu: reserve at least 4MB of VRAM for page tables v2 · 9d1b3c78

Authored by Christian König on Aug 30, 2019

This hopefully helps reduce the contention for page tables.

v2: adjust maximum reported VRAM size as well
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

9d1b3c78
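
A minimal sketch of the reservation idea, assuming a single flat VRAM pool: ordinary allocations are capped at the reported size (total minus the reserve), while page-table allocations may use the whole pool. The names `toy_vram`, `toy_reported_vram` and `toy_vram_alloc` are hypothetical, and the fixed 4 MB constant is a simplification of "at least 4MB".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PT_RESERVE (4ULL << 20) /* 4 MB held back for page tables */

struct toy_vram {
    uint64_t total; /* bytes of VRAM on the device */
    uint64_t used;  /* bytes currently allocated */
};

/* Size advertised to ordinary users: the reserve is not counted. */
static uint64_t toy_reported_vram(const struct toy_vram *v)
{
    return v->total > TOY_PT_RESERVE ? v->total - TOY_PT_RESERVE : 0;
}

/* Ordinary allocations stop at the reported size; page tables may go past it. */
static bool toy_vram_alloc(struct toy_vram *v, uint64_t size, bool is_page_table)
{
    assert(v != NULL && v->used <= v->total);

    uint64_t limit = is_page_table ? v->total : toy_reported_vram(v);
    if (v->used > limit || size > limit - v->used)
        return false;
    v->used += size;
    return true;
}
```

With a 16 MB pool, ordinary allocations fail once 12 MB is in use, while a 4 MB page-table allocation still succeeds.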

drm/amdgpu: use moving fence instead of exclusive for VM updates · 629be203

Authored by Christian König on Sep 13, 2019

Make VM updates depend on the moving fence instead of the exclusive one.

This makes it less likely to actually have a dependency.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

629be203

drm/amdgpu: only apply gds clearing workaround when ras is supported · 39857252

Authored by Hawking Zhang on Aug 31, 2019

The gds clearing workaround should only be applied on ASICs that support gfx ras.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

39857252

drm/amdgpu: fix memory leak when ras is not supported on specific ip block · 8bf2485a

Authored by Hawking Zhang on Aug 31, 2019

Free ras_if if ras is not supported.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

8bf2485a

drm/amdgpu: check mmhub_funcs pointer before refering to it · 4ce71be6

Authored by Hawking Zhang on Aug 31, 2019

The mmhub callback functions are not initialized for all ASICs.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

4ce71be6
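
The guard this commit describes is the usual pattern for optional callback tables: check the table pointer, then the individual callback, before calling through it. The sketch below uses hypothetical `toy_*` types rather than the driver's real structures.

```c
#include <assert.h>
#include <stddef.h>

struct toy_adev;

/* Hypothetical optional callback table; some ASICs never set it up. */
struct toy_mmhub_funcs {
    int (*ras_late_init)(struct toy_adev *adev);
};

struct toy_adev {
    const struct toy_mmhub_funcs *mmhub_funcs; /* may be NULL */
    int ras_inited;
};

/* Example callback for an ASIC that does implement the hook. */
static int toy_ras_late_init_impl(struct toy_adev *adev)
{
    adev->ras_inited = 1;
    return 0;
}

static const struct toy_mmhub_funcs toy_funcs_with_ras = {
    .ras_late_init = toy_ras_late_init_impl,
};

/* Call the hook only when both the table and the callback exist. */
static int toy_mmhub_ras_late_init(struct toy_adev *adev)
{
    assert(adev != NULL);

    if (!adev->mmhub_funcs || !adev->mmhub_funcs->ras_late_init)
        return 0; /* nothing to do on this ASIC */
    return adev->mmhub_funcs->ras_late_init(adev);
}
```

Treating a missing callback as success (returning 0) is a design choice of this sketch; it lets common init code call the wrapper unconditionally.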

drm/amdgpu: Remove unnecessary TLB workaround (v2) · 17da41bf

Authored by Felix Kuehling on Aug 29, 2019

This workaround is better handled in user mode in a way that doesn't
require allocating extra memory and breaking userptr BOs.

The TLB bug is a performance bug, not a functional or security bug.
Hence it is safe to remove this kernel part of the workaround to
allow a better workaround using only virtual address alignments in
user mode.

v2: Removed VI_BO_SIZE_ALIGN definition
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

17da41bf

drm/amdgpu: Use optimal mtypes and PTE bits for Arcturus · e0253d08

Authored by Felix Kuehling on Aug 26, 2019

For compute VRAM allocations on Arcturus, use the new RW mtype
for non-coherent local memory, the CC mtype for coherent local
memory, and the PTE_SNOOPED bit for invalidating non-dirty cache
lines on remote XGMI mappings.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Tested-by: Amber Lin <Amber.Lin@amd.com>
Reviewed-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

e0253d08

drm/amdgpu: Determing PTE flags separately for each mapping (v3) · d0ba51b1

Authored by Felix Kuehling on Aug 26, 2019

The same BO can be mapped with different PTE flags by different GPUs.
Therefore determine the PTE flags separately for each mapping instead
of storing them in the KFD buffer object.

Add a helper function to determine the PTE flags, to be extended with
ASIC and memory-type-specific logic in subsequent commits.

v2: Split Arcturus-specific MTYPE changes into separate commit
v3: Fix return type of get_pte_flags to uint64_t
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

d0ba51b1

drm/amdgpu: Support new arcturus mtype · 093e48c0

Authored by Oak Zeng on Jul 26, 2019

Arcturus repurposed mtype WC to RW. Modify gmc functions
to support the new mtype.
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

093e48c0

drm/amdgpu: switch to amdgpu_ras_late_init for nbio v7_4 (v2) · 22e1d14f

Authored by Hawking Zhang on Aug 29, 2019

Call the helper function in the late init phase to handle ras init
for the nbio ip block.

v2: init local var r to 0 in case the function returns failure
on ASICs that don't have a ras_late_init implementation
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

22e1d14f

drm/amdgpu: add ras_late_init callback function for nbio v7_4 (v3) · 9ad1dc29

Authored by Hawking Zhang on Aug 29, 2019

The ras_late_init callback function will be used to do common ras
init in the late init phase.

v2: call ras_late_fini to do cleanup when enabling the interrupt fails

v3: rename sysfs/debugfs node name to pcie_bif_xxx
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

9ad1dc29

drm/amdgpu: add mmhub ras_late_init callback function (v2) · dda79907

Authored by Hawking Zhang on Aug 30, 2019

The function will be called in the late init phase to do mmhub
ras init.

v2: check the ras_late_init function pointer before invoking the
function
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

dda79907

drm/amdgpu: switch to amdgpu_ras_late_init for gmc v9 block (v2) · 2452e778

Authored by Hawking Zhang on Aug 29, 2019

Call the helper function in the late init phase to handle ras init
for the gmc ip block.

v2: call ras_late_fini to do cleanup when enabling the interrupt fails
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

2452e778

drm/amdgpu: switch to amdgpu_ras_late_init for sdma v4 block (v2) · 7d0a31e8

Authored by Hawking Zhang on Aug 29, 2019

Call the helper function in the late init phase to handle ras init
for the sdma ip block.

v2: call ras_late_fini to do cleanup when enabling the interrupt fails
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

7d0a31e8

drm/amdgpu: switch to amdgpu_ras_late_init for gfx v9 block (v2) · 63fa48db

Authored by Hawking Zhang on Aug 29, 2019

Call the helper function in the late init phase to handle ras init
for the gfx ip block.

v2: call ras_late_fini to do cleanup when enabling the interrupt fails
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

63fa48db

drm/amdgpu: add helper function to do common ras_late_init/fini (v3) · b293e891

Authored by Hawking Zhang on Aug 30, 2019

In late_init for ras, the helper function will be used to
1). disable the ras feature if the IP block is masked as disabled
2). send the enable feature command if the IP block is masked as enabled
3). create a debugfs/sysfs node per IP block
4). register the interrupt handler

v2: check ih_info.cb to decide whether to add the interrupt handler

v3: add ras_late_fini to clean up all the ras fs nodes and remove the
interrupt handler
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

b293e891
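
The four steps and the v2/v3 notes above can be sketched as a pair of helpers operating on a toy state record. Everything here is hypothetical: `struct toy_ras_block` and its fields merely model the outcomes the message lists, and the real helper's signature and error handling are not shown in this log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of one IP block's ras state (illustrative only). */
struct toy_ras_block {
    bool enabled_in_mask; /* input: is ras enabled for this block? */
    bool has_irq_cb;      /* input: models a non-NULL ih_info.cb */
    bool feature_on;      /* result of steps 1 and 2 */
    bool fs_nodes;        /* result of step 3 */
    bool irq_handler;     /* result of step 4 */
};

/* v3: undo what late init created (fs nodes and interrupt handler). */
static void toy_ras_late_fini(struct toy_ras_block *b)
{
    assert(b != NULL);
    b->fs_nodes = false;
    b->irq_handler = false;
}

static int toy_ras_late_init(struct toy_ras_block *b)
{
    assert(b != NULL);

    if (!b->enabled_in_mask) {  /* step 1: block masked as disabled */
        b->feature_on = false;
        return 0;
    }
    b->feature_on = true;       /* step 2: send the enable command */
    b->fs_nodes = true;         /* step 3: create debugfs/sysfs nodes */
    if (b->has_irq_cb)          /* step 4, with the v2 callback check */
        b->irq_handler = true;
    return 0;
}
```

Pairing every init with a fini is what lets the per-IP callers in the neighbouring commits clean up when a later step, such as enabling the interrupt, fails.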

drm/amdgpu: poll ras_controller_irq and err_event_athub_irq status · a344db8e

Authored by Hawking Zhang on Jun 05, 2019

On hardware that cannot enable the BIF ring for IH cookies for both
ras_controller_irq and err_event_athub_irq, the driver has to poll the
status register during irq handling and ack the hardware properly when
an interrupt has been triggered.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

a344db8e
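
A rough sketch of the poll-and-ack flow, assuming a single status word with one pending bit per interrupt source. The bit positions, the `toy_bif` structure and the "clear the bit to ack" behaviour are all assumptions made for illustration; the real register layout is not given in this log.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pending bits in a modelled status register. */
#define TOY_RAS_CNTLR_PENDING (1u << 0)
#define TOY_ERR_EVENT_PENDING (1u << 1)

struct toy_bif {
    uint32_t status;                 /* modelled status register */
    unsigned ras_cntlr_events;       /* handled ras_controller interrupts */
    unsigned err_event_athub_events; /* handled err_event_athub interrupts */
};

/* Called from the irq path: check each source, ack it, count it. */
static unsigned toy_poll_ras_irqs(struct toy_bif *bif)
{
    assert(bif != NULL);

    unsigned handled = 0;
    uint32_t status = bif->status; /* one read of the status register */

    if (status & TOY_RAS_CNTLR_PENDING) {
        bif->status &= ~TOY_RAS_CNTLR_PENDING; /* ack (assumed semantics) */
        bif->ras_cntlr_events++;
        handled++;
    }
    if (status & TOY_ERR_EVENT_PENDING) {
        bif->status &= ~TOY_ERR_EVENT_PENDING; /* ack (assumed semantics) */
        bif->err_event_athub_events++;
        handled++;
    }
    return handled;
}
```

Unrelated status bits are left untouched, so polling twice in a row handles each pending event exactly once.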

drm/amdgpu: add ras_controller and err_event_athub interrupt support · 4e644fff

Authored by Hawking Zhang on Jun 05, 2019

The ras controller interrupt and the ras err event athub interrupt are two
dedicated interrupts for RAS support.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

4e644fff

drm/amdgpu/nbio: add functions to query ras specific interrupt status · 4241863a

Authored by Hawking Zhang on May 30, 2019

ras_controller_interrupt and err_event_interrupt are ras-specific interrupts.
Add functions to check their status and ack them if they are generated. Both
functions should only be invoked in the ISR when the BIF ring is disabled or
not even initialized.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

4241863a

drm/amdgpu: switch to new amdgpu_nbio structure · bebc0762

Authored by Hawking Zhang on Aug 23, 2019

No functional change, just switch to the new structures.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

bebc0762

drm/amdgpu: add new amdgpu nbio header file · 078ef4e9

Authored by Hawking Zhang on Aug 23, 2019

More nbio functionality will be added, and nbio could
be treated as an ip block like gfx, sdma, etc.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

078ef4e9

31 Aug, 2019 · 3 commits

drm/amdgpu: Fix undefined dm_ip_block for navi12 · 20c14ee1

Authored by Petr Cvek on Aug 30, 2019

An "if defined" CONFIG_DRM_AMD_DC block is missing for non-DC
configurations, which causes a link error. This patch fixes that.

Bug: https://bugs.freedesktop.org/show_bug.cgi?id=110979
Signed-off-by: Petr Cvek <petrcvekcz@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

20c14ee1
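
The kind of guard the message refers to can be sketched as follows: code that references a symbol only built with a given config option is wrapped in a matching preprocessor block, so configurations without that option never reference it. `TOY_CONFIG_DC` and the `toy_*` helpers are hypothetical stand-ins, not the kernel's real option or functions.

```c
#include <assert.h>
#include <stddef.h>

/* Tiny list of IP block names, standing in for the driver's block table. */
struct toy_ip_list {
    const char *names[4];
    size_t count;
};

static void toy_add_ip(struct toy_ip_list *l, const char *name)
{
    assert(l != NULL && l->count < 4);
    l->names[l->count++] = name;
}

/* The "dm" block is only referenced when the (hypothetical) option is set. */
static void toy_set_ip_blocks(struct toy_ip_list *l, int virtual_display)
{
    toy_add_ip(l, "common");
    if (virtual_display)
        toy_add_ip(l, "virtual_dce");
#if defined(TOY_CONFIG_DC)
    else
        toy_add_ip(l, "dm");
#endif
    toy_add_ip(l, "gfx");
}
```

Built without `TOY_CONFIG_DC`, the `else` branch disappears entirely, so nothing that exists only in DC builds is left for the linker to resolve.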

drm/amdgpu: fix no interrupt issue for renoir emu (v2) · 537e3bbf

Authored by Aaron Liu on Dec 14, 2018

In renoir's vega10_ih model, there is a security change in the mmIH_CHICKEN
register that limits IH to using physical addresses (FBPA, GPA) directly.
Those chicken bits need to be programmed first.
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

537e3bbf

drm/amdgpu: Handle job is NULL use case in amdgpu_device_gpu_recover · 0b2d2c2e

Authored by Andrey Grodzovsky on Aug 27, 2019

This should be checked everywhere job is accessed.
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

0b2d2c2e

30 Aug, 2019 · 12 commits

drm/amdgpu: Enable DC on Renoir · e1c14c43

Authored by Roman Li on Aug 08, 2019

Enable DC support for renoir.
Acked-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Roman Li <Roman.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

e1c14c43

drm/amdgpu: Initialize and update SDMA power gating · 334ffd0d

Authored by Prike Liang on Aug 26, 2019

Init the SDMA HW base configuration and enable the idle INT for Renoir (rn).
Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Aaron Liu <aaron.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

334ffd0d

drm/amdgpu/psp: keep TMR in visible vram region for SRIOV · 12842d02

Authored by Tianci.Yin on Aug 28, 2019

Fix the compute ring test failure in the SR-IOV scenario.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

12842d02

drm/amdgpu: keep the stolen memory in visible vram region · 994dcfaa

Authored by Tianci.Yin on Aug 28, 2019

Stolen memory should be fixed in the visible region.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

994dcfaa

drm/amdgpu: fix spelling mistake "jumpimng" -> "jumping" · 92ead9fa

Authored by Colin Ian King on Aug 29, 2019

There is a spelling mistake in a DRM_DEBUG_DRIVER debug message.
Fix it.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

92ead9fa

drm/amdgpu/virtual_dce: drop error message in hw_init · 53fd9b5a

Authored by Alex Deucher on Aug 28, 2019

No need to add new asic cases. This is a sw display
implementation, so just drop the error message so when
we add new asics, all we have to do is add the virtual
dce IP module.
Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

53fd9b5a

drm/amdgpu/si: fix ASIC tests · 77efe48a

Authored by Jean Delvare on Aug 28, 2019

Comparing adev->family with CHIP constants is not correct.
adev->family can only be compared with AMDGPU_FAMILY constants and
adev->asic_type is the struct member to compare with CHIP constants.
They are separate identification spaces.
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Fixes: 62a37553 ("drm/amdgpu: add si implementation v10")
Cc: Ken Wang <Qingqing.Wang@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: "David (ChunMing) Zhou" <David1.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

77efe48a
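
The mix-up between the two identification spaces can be shown with two toy enums. The enumerators and their numeric values below are illustrative assumptions, not copies of the kernel headers; only the idea (family and chip are different number spaces) comes from the message above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Two separate identification spaces (values are illustrative). */
enum toy_family { TOY_FAMILY_SI = 110, TOY_FAMILY_CI = 120 };
enum toy_chip {
    TOY_CHIP_TAHITI, TOY_CHIP_PITCAIRN, TOY_CHIP_VERDE,
    TOY_CHIP_OLAND, TOY_CHIP_HAINAN
};

struct toy_adev {
    enum toy_family family;
    enum toy_chip asic_type;
};

/* Buggy pattern: a family value compared against a chip constant. */
static bool toy_is_oland_buggy(const struct toy_adev *adev)
{
    assert(adev != NULL);
    return (int)adev->family == (int)TOY_CHIP_OLAND;
}

/* Fixed pattern: chip constants are compared with asic_type. */
static bool toy_is_oland(const struct toy_adev *adev)
{
    assert(adev != NULL);
    return adev->asic_type == TOY_CHIP_OLAND;
}

/* Family constants are compared with family. */
static bool toy_is_si_family(const struct toy_adev *adev)
{
    assert(adev != NULL);
    return adev->family == TOY_FAMILY_SI;
}
```

For an Oland part in the SI family, the buggy test never matches, because the family number and the chip index come from unrelated ranges.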

drm/amd/amdgpu: hide voltage and power sensors on SI and KV parts · 1cdd229b

Authored by Jean Delvare on Aug 28, 2019

The driver does not support these sensors yet and there is no point in
creating sysfs attributes which will always return an error.
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: "David (ChunMing) Zhou" <David1.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

1cdd229b

drm/amdgpu: introduce vram lost for reset (v2) · e3526257

Authored by Monk Liu on Aug 27, 2019

For SOC15/vega10, the BACO and mode1 resets can cause VRAM loss
in the high-end address range. The current KMD VRAM-lost check cannot
catch it because it only checks the very beginning of the visible frame buffer.

v2:
cover NV as well
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

e3526257

drm/amdgpu: enable athub powergating for navi12 · 5ef3b8ac

Authored by Xiaojie Yuan on Aug 27, 2019

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

5ef3b8ac

drm/amdgpu: enable vcn powergating for navi12 · c1653ea0

Authored by Xiaojie Yuan on Aug 27, 2019

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

c1653ea0

drm/amdgpu: correct in_suspend setting for navi series · 317f9cc9

Authored by Hawking Zhang on Aug 27, 2019

The in_suspend flag should be set in pairs in amdgpu_device_suspend/resume,
instead of in the gfx10 ip suspend/resume functions.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

317f9cc9
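
One way to picture "set in pairs" is a flag owned by the device-level entry points, which the per-IP hooks only read. This is a sketch with hypothetical `toy_*` names; where exactly the real functions set and clear the flag is not shown in this log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_adev {
    bool in_suspend;        /* owned by the device-level functions */
    bool seen_by_ip_suspend; /* what the IP suspend hook observed */
    bool seen_by_ip_resume;  /* what the IP resume hook observed */
};

/* Per-IP hooks only read the flag; they never write it. */
static void toy_ip_suspend(struct toy_adev *adev)
{
    adev->seen_by_ip_suspend = adev->in_suspend;
}

static void toy_ip_resume(struct toy_adev *adev)
{
    adev->seen_by_ip_resume = adev->in_suspend;
}

/* The flag is set on the way in ... */
static void toy_device_suspend(struct toy_adev *adev)
{
    assert(adev != NULL);
    adev->in_suspend = true;
    toy_ip_suspend(adev);
}

/* ... and cleared on the way out, so the two always come as a pair. */
static void toy_device_resume(struct toy_adev *adev)
{
    assert(adev != NULL);
    toy_ip_resume(adev);
    adev->in_suspend = false;
}
```

Keeping the writes in one place means every IP block sees a consistent value during both halves of the cycle, instead of depending on which block's hook happened to run.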

27 Aug, 2019 · 2 commits

drm/amdgpu: fix GFXOFF on Picasso and Raven2 · c072b0c2

Authored by Aaron Liu on Aug 27, 2019

For Picasso (adev->pdev->device == 0x15d8) and Raven2 (adev->rev_id >= 0x8),
the firmware is sufficient to support GFXOFF.
Commit 98f58ada returned early for Picasso and Raven2,
which left GFXOFF disabled.

Fixes: 98f58ada ("drm/amdgpu/gfx9: update pg_flags after determining if gfx off is possible")
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

c072b0c2
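
The two identification tests quoted in the message can be folded into a small predicate. This is a hedged sketch: `toy_raven_info` and `toy_gfxoff_allowed` are hypothetical, and the fallback on a firmware check for other parts is an assumption, since the message only states that Picasso and Raven2 firmware is sufficient.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary of the fields the commit message mentions. */
struct toy_raven_info {
    uint16_t pci_device;     /* models adev->pdev->device */
    uint8_t  rev_id;         /* models adev->rev_id */
    bool rlc_fw_new_enough;  /* assumed firmware check for other parts */
};

static bool toy_gfxoff_allowed(const struct toy_raven_info *info)
{
    assert(info != NULL);

    bool is_picasso = info->pci_device == 0x15d8;
    bool is_raven2 = info->rev_id >= 0x8;

    /* Per the message, firmware on these parts is sufficient for GFXOFF. */
    if (is_picasso || is_raven2)
        return true;

    /* Assumption: other parts depend on a firmware version check. */
    return info->rlc_fw_new_enough;
}
```

The point of the fix is that the Picasso/Raven2 case yields "allowed" rather than leaving early with the feature switched off.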

drm/amdgpu: Add APTX quirk for Dell Latitude 5495 · c7b33cfb

Authored by Kai-Heng Feng on Aug 27, 2019

This machine needs ATPX rather than _PR3 to really turn off the dGPU. This can
save ~5W when the dGPU is runtime-suspended.
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

c7b33cfb

openeuler / Kernel: synced successfully almost 2 years ago