Commits · 75737cb4eb78c7f185e4700b4aa20cf7a3381aca · openeuler / raspberrypi-kernel

07 Dec, 2017 2 commits

drm/amdgpu/gfx8: Fix compute ring failure after resetting · 75737cb4

Authored by Xiangliang.Yu on Nov 10, 2017

Clear the ring before the ring test; otherwise the compute ring test
fails after a GPU reset. The root cause is still unknown, so this is
only a workaround.
Signed-off-by: Xiangliang.Yu <Xiangliang.Yu@amd.com>
Acked-by: Monk Liu <Monk.Liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: revise retry init to fully cleanup driver · 1daee8b4

Authored by Pixel Ding on Nov 08, 2017

Retry at drm_dev_register instead of amdgpu_device_init.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Pixel Ding <Pixel.Ding@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

05 Dec, 2017 38 commits

drm/amdgpu:read VRAMLOST from gim · 75bc6099

Authored by Monk Liu on Oct 30, 2017

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: bypass FB resizing for SRIOV VF · 0c03b912

Authored by pding on Nov 07, 2017

FB resizing introduces 900 ms of latency in exclusive mode, which causes
driver loading to fail. The host can resize the BAR before the guest
starts, so the resizing is not necessary here.
Signed-off-by: Pixel Ding <Pixel.Ding@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: release exclusive mode after hw_init · c6332b97

Authored by pding on Nov 06, 2017

Signed-off-by: pding <Pixel.Ding@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdkfd: initialise kfd inside amdgpu_device_init · 1884734a

Authored by pding on Nov 06, 2017

Also finalize kfd inside amdgpu_device_fini. kfd device_init needs
SR-IOV exclusive access, so gather the exclusive accesses together to
reduce the time consumed.
Signed-off-by: pding <Pixel.Ding@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: don't use ttm_bo_move_ttm in amdgpu_ttm_bind v2 · 40575732

Authored by Christian König on Oct 26, 2017

Just allocate the GART space and fill it.

This prevents forcing the BO to be idle.

v2: don't unbind/bind at all, just fill the allocated GART space
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: rename amdgpu_ttm_bind to amdgpu_ttm_alloc_gart · c5835bbb

Authored by Christian König on Oct 27, 2017

We actually don't bind here, but rather allocate GART space if necessary.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: switch to use new SOC15 reg read/write macros for soc15 ih · b2b7e457

Authored by Hawking Zhang on Nov 02, 2017

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: resize VRAM BAR for CPU access v6 · d6895ad3

Authored by Christian König on Feb 28, 2017

Try to resize BAR0 to let CPU access all of VRAM.

v2: rebased, style cleanups, disable mem decode before resize,
    handle gmc_v9 as well, round size up to power of two.
v3: handle gmc_v6 as well, release and reassign all BARs in the driver.
v4: rename new function to amdgpu_device_resize_fb_bar,
    reenable mem decoding only if all resources are assigned.
v5: reorder resource release, return -ENODEV instead of BUG_ON().
v6: squash in rebase fix
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
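
The v2 note above mentions rounding the size up to a power of two, which is needed because PCI BAR sizes are powers of two. The following is a minimal sketch of that rounding step in plain C, not the driver's code; `round_up_pow2` is a hypothetical helper name chosen for illustration.

```c
#include <stdint.h>

/*
 * Hypothetical helper: return the smallest power of two that is
 * greater than or equal to size. Returns 0 for size == 0 or when
 * the result would not fit in 64 bits.
 */
static uint64_t round_up_pow2(uint64_t size)
{
    uint64_t p = 1;

    if (size == 0 || size > (UINT64_C(1) << 63))
        return 0;

    /* Double p until it covers the requested size. */
    while (p < size)
        p <<= 1;

    return p;
}
```

With this helper, a VRAM size that is already a power of two is returned unchanged, while any other size is rounded up to the next one.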

drm/amdgpu: refine SR-IOV firmware VRAM reservation to protect data · 3c738893

Authored by Horace Chen on Nov 01, 2017

The previous solution created a zeroed buffer in the system domain
and then moved the zeroes to VRAM, which destroyed the original data
in VRAM.

Refine the code to create the BO in the VRAM domain directly and then
remove and re-create the mem node at the exact position before bo_pin.
This avoids destroying the data and does not cause eviction.
Signed-off-by: Horace Chen <horace.chen@amd.com>
Reviewed-by: monk liu <monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: retry init if exclusive mode request is failed · 5ffa61c1

Authored by pding on Oct 30, 2017

This happens when the hypervisor fails to handle the request; one known
cause is an MMIO unblocking timeout. In theory we can retry init here.
Signed-off-by: pding <Pixel.Ding@amd.com>
Reviewed-by: Xiangliang Yu <Xiangliang.Yu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: return error when sriov access requests get timeout · f4711033

Authored by pding on Oct 30, 2017

Reported-by: Sun Gary <Gary.Sun@amd.com>
Signed-off-by: pding <Pixel.Ding@amd.com>
Reviewed-by: Xiangliang Yu <Xiangliang.Yu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

amdgpu: Remove AMDGPU_{HPD,CRTC_IRQ,PAGEFLIP_IRQ}_LAST · 8fb0450c

Authored by Michel Dänzer on Oct 24, 2017

Not used anymore.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

amdgpu/dce: Use actual number of CRTCs and HPDs in set_irq_funcs · d794b9f8

Authored by Michel Dänzer on Oct 24, 2017

Hardcoding the maximum numbers could result in spurious error messages
from the IRQ state callbacks, e.g. on Polaris 11/12:

[drm:dce_v11_0_set_pageflip_irq_state [amdgpu]] *ERROR* invalid pageflip crtc 5
[drm:amdgpu_irq_disable_all [amdgpu]] *ERROR* error disabling interrupt (-22)
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
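
To illustrate why a hardcoded maximum produces the errors quoted above, here is a simplified model in plain C. It is a sketch, not the DCE code: `struct display`, `MAX_CRTCS`, and both functions are hypothetical names, and the value of 5 CRTCs is an assumption chosen to match the "crtc 5" index in the log.

```c
#include <errno.h>

#define MAX_CRTCS 6            /* hypothetical hardcoded maximum */

struct display {
    int num_crtc;              /* CRTCs actually present on this ASIC */
    int num_irq_types;         /* number of IRQ types that get iterated */
};

/* State callback: rejects a CRTC index the hardware does not have. */
static int set_pageflip_irq_state(const struct display *d, int crtc)
{
    if (crtc < 0 || crtc >= d->num_crtc)
        return -EINVAL;
    return 0;
}

/* Disable every registered IRQ type; returns how many calls failed. */
static int irq_disable_all(const struct display *d)
{
    int errors = 0;

    for (int i = 0; i < d->num_irq_types; i++)
        if (set_pageflip_irq_state(d, i) < 0)
            errors++;

    return errors;
}
```

When `num_irq_types` is set to `MAX_CRTCS` on a device with only 5 CRTCs, the loop reaches index 5 and the callback fails. Setting `num_irq_types` to `num_crtc` keeps every iterated index valid.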

drm/amdgpu: move GART recovery into GTT manager v2 · c1c7ce8f

Authored by Christian König on Oct 16, 2017

The GTT manager handles the GART address space anyway, so it is
completely pointless to keep the same information around twice.

v2: rebased
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: nuke amdgpu_ttm_is_bound() v2 · 3da917b6

Authored by Christian König on Oct 27, 2017

Rename amdgpu_gtt_mgr_is_allocated() to amdgpu_gtt_mgr_has_gart_addr() and use
that instead.

v2: rename the function as well.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu:fix random missing of FLR NOTIFY · 34a4d2bf

Authored by Monk Liu on Oct 24, 2017

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/sriov:fix memory leak in psp_load_fw · 77a3c96b

Authored by Monk Liu on Sep 19, 2017

For SR-IOV, this routine should not allocate resources during a
GPU reset; otherwise memory is leaked.
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu:cleanup ucode_init_bo · 503846e0

Authored by Monk Liu on Oct 17, 2017

1. No SR-IOV check is needed since GPU recovery is unified.
2. The CPU_ACCESS_REQUIRED flag is needed for VRAM under SR-IOV,
because otherwise, after the following pin, the first allocated
VRAM BO is wasted for some TTM manager reason.
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu:cleanup in_sriov_reset and lock_reset · 13a752e3

Authored by Monk Liu on Oct 17, 2017

GPU reset is now unified with gpu_recover
for both bare-metal and SR-IOV, so:

1) rename in_sriov_reset to in_gpu_reset
2) move lock_reset from adev->virt to adev
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu:implement new GPU recover(v3) · 5740682e

Authored by Monk Liu on Oct 25, 2017

1. The new implementation is named amdgpu_gpu_recover, which gives a
better hint of what it does than gpu_reset.

2. gpu_recover unifies bare-metal and SR-IOV; only the ASIC reset
part is implemented differently.

3. gpu_recover increases the karma of the hung job and marks its
entity/context as guilty if the karma exceeds the limit.

V2:

4. In the scheduler main routine, a job from a guilty context is
immediately fake-signaled after it is popped from the queue, and its
fence is set with the "-ECANCELED" error.

5. In the scheduler recovery routine, all jobs from the guilty entity
are dropped.

6. In the run_job() routine, the real IB submission is skipped if the
@skip parameter is true or if VRAM was lost.

V3:

7. Replace the deprecated gpu reset with the new gpu recover.
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
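
The karma bookkeeping described in this commit message can be pictured with a small model. This is only a sketch under stated assumptions: `struct hung_ctx`, `struct hung_job`, and `increase_karma` are hypothetical names, and "exceeds limit" is read here as a strict greater-than comparison.

```c
#include <stdbool.h>

struct hung_ctx {
    bool guilty;               /* set once a job of this context hung too often */
};

struct hung_job {
    int karma;                 /* how many times this job was found hanging */
    struct hung_ctx *ctx;      /* context the job was submitted from */
};

/*
 * Called for the job that hung: bump its karma and mark the owning
 * context guilty once the karma exceeds the limit.
 */
static void increase_karma(struct hung_job *job, int limit)
{
    job->karma++;
    if (job->karma > limit)
        job->ctx->guilty = true;
}
```

With a limit of 1, the first hang leaves the context usable and the second hang marks it guilty, after which its jobs would be cancelled rather than resubmitted.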

amd/scheduler:imple job skip feature(v3) · 48f05f29

Authored by Monk Liu on Oct 25, 2017

Jobs are skipped in two cases:
1) When the entity behind a job is marked guilty, the job
popped from this entity's queue is dropped in the sched_main loop.

2) In job_recovery(), skip scheduling the job if its karma is
above the limit, and also skip other jobs sharing the
same fence context. This approach is needed because job_recovery()
cannot access job->entity, since the entity may already be dead.

v2:
some logic fixes

v3:
When an entity is found guilty, don't drop the job at the popping
stage; instead set its fence error to -ECANCELED.

In run_job(), skip the scheduling if either 1) fence->error < 0
or 2) a VRAM loss occurred on this job.
This way the job skipping logic is unified.

This feature makes it possible to introduce the new GPU recover feature.
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
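
The v3 behaviour described above (mark the fence instead of dropping the job, then decide in run_job()) can be sketched as follows. The types and function names (`struct sched_fence_model`, `struct sched_job_model`, `mark_cancelled`, `should_skip`) are hypothetical stand-ins for illustration, not the scheduler's real structures.

```c
#include <errno.h>
#include <stdbool.h>

struct sched_fence_model {
    int error;                 /* 0, or a negative errno value */
};

struct sched_job_model {
    struct sched_fence_model *finished;
    bool vram_lost;            /* VRAM was lost since the job was created */
};

/* Popping stage: a job from a guilty entity is kept but marked. */
static void mark_cancelled(struct sched_job_model *job)
{
    job->finished->error = -ECANCELED;
}

/* run_job(): one unified check decides whether to skip the submission. */
static bool should_skip(const struct sched_job_model *job)
{
    return job->finished->error < 0 || job->vram_lost;
}
```

Keeping the job and recording the error on its fence means both skip reasons are handled in one place.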

drm/amdgpu: fix indentation in amdgpu_display.h · 3a393cf9

Authored by Christian König on Oct 23, 2017

That was somehow completely off.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: delete duplicated code. · 433f1aa7

Authored by Rex Zhu on Oct 20, 2017

The variable ref_clock was assigned the same
value twice in the same function.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add new pp function point notify_smu_memory_info · d668942b

Authored by Rex Zhu on Sep 15, 2017

Used to set up smu power logging.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add header kgd_pp_interface.h · c79563a3

Authored by Rex Zhu on Sep 29, 2017

Move the structures and definitions shared by powerplay
and amdgpu to kgd_pp_interface.h. This
is the interface between the base driver
and powerplay.
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: move struct amd_powerplay to amdgpu.h · 11dc9364

Authored by Rex Zhu on Sep 29, 2017

Clean up the interface.
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: remove extra parameter from amdgpu_ttm_bind() v2 · 4ff23be3

Authored by Christian König on Oct 16, 2017

We always use the BO mem now.

v2: minor rebase
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: don't wait interruptible while binding GART space · 2a018f28

Authored by Christian König on Oct 25, 2017

Display can't seem to handle this correctly.
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: fix pin domain compatibility check · f5318959

Authored by Christian König on Oct 23, 2017

We need to test if any domain fits, not all of them.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
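
The difference between "any domain fits" and "all of them" comes down to how a bitmask is compared. The sketch below contrasts the two checks in plain C; the flag values and the names `DOMAIN_GTT`, `DOMAIN_VRAM`, `fits_all`, and `fits_any` are hypothetical and are not taken from the driver.

```c
#include <stdbool.h>
#include <stdint.h>

#define DOMAIN_GTT  (1u << 1)  /* hypothetical flag values */
#define DOMAIN_VRAM (1u << 2)

/* Too strict: the current placement must cover every requested domain. */
static bool fits_all(uint32_t current, uint32_t requested)
{
    return (current & requested) == requested;
}

/* What the message asks for: one matching domain is enough. */
static bool fits_any(uint32_t current, uint32_t requested)
{
    return (current & requested) != 0;
}
```

For a BO already pinned in VRAM and a request that allows VRAM or GTT, the strict check rejects the pin while the "any" check accepts it. A request for GTT alone is rejected by both.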

drm/amdgpu: always bind pinned BOs · ead282a4

Authored by Christian König on Oct 20, 2017

We always need to bind pinned BOs, not just when the caller requested the
address.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: use the actual placement for pin accounting · 5e91fb57

Authored by Christian König on Oct 20, 2017

This allows us to specify multiple possible placements again.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: retry init if it fails due to exclusive mode timeout (v3) · 8840a387

Authored by pding on Oct 23, 2017

In practice, exclusive mode has a real-time limit, such as having to
finish within 300 ms. The timeout is easy to observe when running many
VFs/VMs on a single host under heavy CPU load.

If we find that init fails due to an exclusive mode timeout, try it again.

v2:
 - rewrite the condition for readable value.

v3:
 - fix typo, add comments for sleep
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: pding <Pixel.Ding@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
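
The retry idea can be sketched as a bounded loop that repeats init only when the failure is a timeout. This is an illustration, not the driver's code: the retry budget, the use of `-ETIMEDOUT` as the timeout code, and the names `init_with_retry` and `flaky_init` are all assumptions, and the sleep between attempts mentioned in v3 is omitted.

```c
#include <errno.h>

#define MAX_INIT_RETRIES 3     /* hypothetical retry budget */

typedef int (*init_fn)(void *arg);

/*
 * Run init up to MAX_INIT_RETRIES times, retrying only on a timeout.
 * Any other result (success or a different error) ends the loop.
 * *attempts reports how many times init was called.
 */
static int init_with_retry(init_fn init, void *arg, int *attempts)
{
    int ret;

    *attempts = 0;
    do {
        ret = init(arg);
        (*attempts)++;
    } while (ret == -ETIMEDOUT && *attempts < MAX_INIT_RETRIES);

    return ret;
}

/* Demo init: times out until the countdown in *arg reaches zero. */
static int flaky_init(void *arg)
{
    int *timeouts_left = arg;

    if (*timeouts_left > 0) {
        (*timeouts_left)--;
        return -ETIMEDOUT;
    }
    return 0;
}
```

Bounding the number of attempts keeps a persistently failing host from stalling the guest forever, while a transient timeout under heavy load gets a second chance.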

drm/amdgpu/virt: implement wait_reset callbacks for vi/ai · b5914238

Authored by pding on Oct 24, 2017

Reviewed-by: Monk Liu <monk.liu@amd.com>
Signed-off-by: pding <Pixel.Ding@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/powerplay: describe the PCIE link speed in right GT/s · 7413d2fa

Authored by Evan Quan on Oct 26, 2017

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/virt: add wait_reset virt ops · b636176e

Authored by pding on Oct 24, 2017

The driver can use this interface to check whether a function level
reset has been done in the hypervisor. This is helpful when the IRQ
handler for reset is not ready, or when special handling is required.
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Monk Liu <monk.liu@amd.com>
Signed-off-by: pding <Pixel.Ding@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/virt: add function to check MMIO (v2) · a16f8f11

Authored by pding on Oct 24, 2017

MMIO space can be blocked on a virtualised device. Add this
function to check whether MMIO is blocked or not.

Todo: need a reliable method, such as communication
with the hypervisor.

v2:
 - add comments inline
Signed-off-by: pding <Pixel.Ding@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: avoid soft lockup when waiting for RLC serdes (v2) · 1366b2d0

Authored by pding on Oct 23, 2017

Normally, if one wait times out, all of the waits time out.
Release the lock and return immediately when a timeout happens.

v2:
 - set the se_sh to broadcast before return
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: pding <Pixel.Ding@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: change redundant init logs to debug level · 9953b72f

Authored by pding on Oct 26, 2017

When this VF stays in exclusive mode for a long time, other VFs are
impacted.

The redundant messages cause an exclusive mode timeout when they are
redirected. Redirecting the guest log to a virtual serial port is a
normal use case for cloud services.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: pding <Pixel.Ding@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>