Commits · 6a52d4641c3a237aabb74077ae1290974b6c1d72 · openeuler / Kernel

12 Feb, 2020 · 1 commit

drm/amdgpu: limit GDS clearing workaround in cold boot sequence · ea6f0931

Committed by Guchun Chen on Feb 09, 2020

GDS clear workaround will cause gfx failure in suspend/resume case.

[   98.679559] [drm:amdgpu_device_ip_late_init [amdgpu]] *ERROR* late_init of IP block <gfx_v9_0> failed -110
[   98.679561] PM: dpm_run_callback(): pci_pm_resume+0x0/0xa0 returns -110
[   98.679562] PM: Device 0000:03:00.0 failed to resume async: error -110

This workaround is specific to a HW bug, a GDS ECC error that
exists at cold boot, so bypass the workaround in the
suspend/resume case after boot.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

31 Jan, 2020 · 1 commit

drm/amdgpu: Enable DISABLE_BARRIER_WAITCNT for Arcturus · 18c6b74e

Committed by Joseph Greathouse on Jan 27, 2020

In previous gfx9 parts, S_BARRIER shader instructions are implicitly
S_WAITCNT 0 instructions as well. This setting turns off that
mechanism in Arcturus and beyond. With this, shaders must follow the
ISA guide insofar as putting in explicit S_WAITCNT operations even
after an S_BARRIER.

v2: Fix patch title to list component

Signed-off-by: Joseph Greathouse <Joseph.Greathouse@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

28 Jan, 2020 · 1 commit

drm/amdgpu: attempt to enable gfxoff on more raven1 boards (v2) · 7af2a577

Committed by Alex Deucher on Jan 15, 2020

Switch to a blacklist so we can disable specific boards
that are problematic.

v2: make the blacklist non-raven specific.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
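
The blacklist approach described in this commit can be illustrated with a small lookup table. This is only a sketch, not the driver's code: the struct name, the function name, and every ID except the AMD PCI vendor ID 0x1002 are placeholders chosen for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical quirk entry: identifies one board by its PCI IDs. */
struct gfxoff_quirk {
	uint16_t vendor;
	uint16_t device;
	uint16_t subsys_vendor;
	uint16_t subsys_device;
	uint8_t revision;
};

/* Illustrative entries only; apart from vendor 0x1002 these are placeholders. */
static const struct gfxoff_quirk gfxoff_quirk_list[] = {
	{ 0x1002, 0x0001, 0x1111, 0x2222, 0xc8 },
	{ 0x1002, 0x0001, 0x3333, 0x4444, 0xc8 },
};

/*
 * Return true when the board matches a blacklist entry, meaning gfxoff
 * should stay disabled. Any board not listed is allowed to try gfxoff.
 */
static bool should_disable_gfxoff(const struct gfxoff_quirk *board)
{
	size_t n = sizeof(gfxoff_quirk_list) / sizeof(gfxoff_quirk_list[0]);
	size_t i;

	for (i = 0; i < n; i++) {
		const struct gfxoff_quirk *q = &gfxoff_quirk_list[i];

		if (q->vendor == board->vendor &&
		    q->device == board->device &&
		    q->subsys_vendor == board->subsys_vendor &&
		    q->subsys_device == board->subsys_device &&
		    q->revision == board->revision)
			return true;
	}
	return false;
}
```

With a blacklist, the default for an unknown board is "enabled", so adding support for a new board needs no table change; only problematic boards need an entry.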

23 Jan, 2020 · 7 commits

drm/amdgpu: remove unnecessary conversion to bool · a9d4fe2f

Committed by Nirmoy Das on Jan 20, 2020

Better clean that up before some automation starts to complain about it.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add RAS support for the gfx block of Arcturus · 4c461d89

Committed by Dennis Li on Jan 16, 2020

Implement functions to do the RAS error injection and
query the EDC counters.

Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: abstract EDC counter clear to a separated function · 504c5e72

Committed by Dennis Li on Jan 16, 2020

1. Add an IP prefix to the IP-related code.
2. Refactor the code that clears the EDC counters.

Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: refine the security check for RAS functions · 5e66403e

Committed by Dennis Li on Jan 16, 2020

Avoid calling RAS-related functions when the RAS feature isn't
supported in hardware: check the supported features instead
of checking the asic type.

v2: reuse amdgpu_ras_is_supported function, instead of introducing
a new flag for hardware ras feature.

Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: read gfx register using RREG32_KIQ macro · e3cd0360

Committed by chen gong on Jan 14, 2020

Reading the CP_MEM_SLP_CNTL register with the RREG32_SOC15 macro leads to
a hang when the GPU is in the "gfxoff" state.
This patch does a uniform substitution with the RREG32_KIQ macro.

Signed-off-by: chen gong <curry.gong@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: provide a generic function interface for reading/writing register by KIQ · d33a99c4

Committed by chen gong on Jan 15, 2020

Move the amdgpu_virt_kiq_rreg/amdgpu_virt_kiq_wreg functions to amdgpu_gfx.c
and rename them to amdgpu_kiq_rreg/amdgpu_kiq_wreg. Make them generic and
flexible.

Signed-off-by: chen gong <curry.gong@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: modify packet size for pm4 flush tlbs · 36a1707a

Committed by Alex Sierra on Jan 13, 2020

[Why]
PM4 packet size for flush message was oversized.

[How]
Packet size adjusted to allocate flush + fence packets.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

17 Jan, 2020 · 4 commits

drm/amdgpu: only set cp active field for kiq queue · 0e5b7a95

Committed by Huang Rui on Jan 10, 2020

The mec ucode will set the CP_HQD_ACTIVE bit while the queue is mapped by
the MAP_QUEUES packet, so we only need to set the cp active field for the
kiq queue.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: replace kcq enable/disable functions on gfx_v9 · 4f01f1e5

Committed by Alex Sierra on Jan 09, 2020

[Why]
There are HW-independent functions that enable and disable kcq. These functions use
the kiq_pm4_funcs implementation.

[How]
Remove the local kcq enable and disable functions and replace them with the generic
kcq enable/disable functions under amdgpu_gfx.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: implement tlbs invalidate on gfx9 gfx10 · 58e508b6

Committed by Alex Sierra on Jan 09, 2020

A tlbs invalidate function pointer is added to the kiq_pm4_funcs struct.
This way, a tlb flush can be done through the kiq member.
TLB invalidation is implemented for gfx9 and gfx10.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: kiq pm4 function implementation for gfx_v9 · f167ea6a

Committed by Alex Sierra on Jan 09, 2020

Functions implemented from kiq_pm4_funcs struct members
for gfx_v9 version.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

14 Jan, 2020 · 3 commits

drm/amdgpu: Match TC hash settings to DF settings (v2) · 22d39fe7

Committed by Joseph Greathouse on Jan 09, 2020

On Arcturus, data fabric hashing is set by the VBIOS, and
affects which addresses map to which memory channels. The
gfx core's caches also need to know this mapping, but the
hash settings for these caches are set by the driver.

This change queries the DF to understand how the VBIOS
configured DF, then matches the TC hash configuration bits
to do the same thing.

v2: squash in warning fix

Signed-off-by: Joseph Greathouse <Joseph.Greathouse@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/gfx9: remove unused sdma headers · d44394a9

Committed by Alex Deucher on Jan 08, 2020

All of the sdma stuff these were used for moved to
the sdma code, so remove them.

Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: read sdma edc counter to clear the counters · 5e62db9d

Committed by Hawking Zhang on Jan 08, 2020

SDMA edc counter registers were added to the gfx edc counters
array. When querying gfx error counters in that array, there
is no way to differentiate the sdma instance number for different
asics, which results in a NULL pointer access when trying to
read the sdma register base address for instances greater
than 2 on Vega20.
In addition, this also results in wrong gfx error counters,
since the sdma edc counters were actually added in.
Therefore, sdma edc counter registers should be separated
from the gfx edc counter register array and only get initialized
when the driver tries to enable sdma ras.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

08 Jan, 2020 · 1 commit

drm/amdgpu/gfx: simplify old firmware warning · 48ccd5ff

Committed by Alex Deucher on Jan 06, 2020

Put it on one line to avoid whitespace issues when
printing in the log.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

29 Dec, 2019 · 1 commit

drm/amdgpu: enable gfxoff for raven1 refresh · e0c63812

Committed by changzhu on Dec 12, 2019

When the smu version is larger than 0x41e2b, it will load
raven_kicker_rlc.bin. To enable gfxoff for raven_kicker_rlc.bin, it
needs to avoid adev->pm.pp_feature &= ~PP_GFXOFF_MASK when it loads
raven_kicker_rlc.bin.

Signed-off-by: changzhu <Changfeng.Zhu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

24 Dec, 2019 · 3 commits

drm/amdgpu/gfx: Add mmSDMA2-7_EDC_COUNTER to support Arcturus · 57cb635b

Committed by James Zhu on Dec 16, 2019

Add mmSDMA2-7_EDC_COUNTER to support Arcturus

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Yong Zhao <Yong.Zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/gfx: Add mmCOMPUTE_STATIC_THREAD_MGMT_SE4-7 to support Arcturus · 107ab061

Committed by James Zhu on Dec 16, 2019

Add mmCOMPUTE_STATIC_THREAD_MGMT_SE4-7 to support Arcturus

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Yong Zhao <Yong.Zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/gfx: Replace ARRAY_SIZE with size variable · d8c61373

Committed by James Zhu on Dec 16, 2019

Replace ARRAY_SIZE with size variables to support
different ASICs.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Yong Zhao <Yong.Zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
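
The idea of carrying an explicit size next to a table pointer, rather than applying ARRAY_SIZE to one fixed array, might look roughly like the sketch below. The table contents, the enum, and the function names are hypothetical and exist only for illustration; they are not taken from the driver.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register tables; the offsets are placeholders. */
static const uint32_t regs_small[] = { 0x10, 0x14 };
static const uint32_t regs_large[] = { 0x10, 0x14, 0x18, 0x1c, 0x20 };

enum asic_kind { ASIC_SMALL, ASIC_LARGE };

/* A table pointer paired with its element count. */
struct reg_table {
	const uint32_t *regs;
	size_t count;
};

/* Pick the table for the ASIC and record its size once, at selection time. */
static struct reg_table select_table(enum asic_kind kind)
{
	struct reg_table t;

	if (kind == ASIC_LARGE) {
		t.regs = regs_large;
		t.count = sizeof(regs_large) / sizeof(regs_large[0]);
	} else {
		t.regs = regs_small;
		t.count = sizeof(regs_small) / sizeof(regs_small[0]);
	}
	return t;
}

/* Loops use the size variable, so the same code serves every ASIC. */
static uint32_t sum_offsets(enum asic_kind kind)
{
	struct reg_table t = select_table(kind);
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i < t.count; i++)
		sum += t.regs[i];
	return sum;
}
```

Because the loop bound is a variable, a new ASIC with a longer or shorter table only needs another branch in the selection step.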

19 Dec, 2019 · 1 commit

drm/amdgpu: enable gfxoff for raven1 refresh · aaff8b44

Committed by changzhu on Dec 12, 2019

When the smu version is larger than 0x41e2b, it will load
raven_kicker_rlc.bin. To enable gfxoff for raven_kicker_rlc.bin, it
needs to avoid adev->pm.pp_feature &= ~PP_GFXOFF_MASK when it loads
raven_kicker_rlc.bin.

Signed-off-by: changzhu <Changfeng.Zhu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

06 Dec, 2019 · 1 commit

drm/amdgpu/gfx: Improvement on EDC GPR workarounds · f83f5a1e

Committed by James Zhu on Dec 03, 2019

SPI limits total CS waves in flight per SE to no more than 32 * num_cu and
we need to stuff 40 waves on a CU to completely clean the SGPR. This is
accomplished in the WR by cleaning the SE in two steps, half of the CUs per
step.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
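
The arithmetic behind the two-step cleaning can be checked with a short sketch. It takes the two figures from the commit message (a per-SE cap of 32 * num_cu waves and 40 waves needed per CU) and reads them as "40 waves on every CU at once exceeds the cap, 40 waves on half the CUs does not". The function names and the example CU count are assumptions for illustration.

```c
#include <stdbool.h>

/* From the commit message: SPI caps CS waves in flight per SE at 32 * num_cu. */
static unsigned int se_wave_limit(unsigned int num_cu)
{
	return 32u * num_cu;
}

/* From the commit message: 40 waves per CU are needed to clean the SGPRs. */
static unsigned int waves_required(unsigned int cus_in_step)
{
	return 40u * cus_in_step;
}

/* True when cleaning cus_in_step CUs at once stays within the SE cap. */
static bool step_fits(unsigned int num_cu, unsigned int cus_in_step)
{
	return waves_required(cus_in_step) <= se_wave_limit(num_cu);
}
```

For a hypothetical SE with 16 CUs the cap is 512 waves: cleaning all 16 CUs at once would need 640 waves and does not fit, while cleaning 8 CUs per step needs 320 waves and does.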

04 Dec, 2019 · 1 commit

drm/amdgpu: fix calltrace during kmd unload(v3) · 747d4f71

Committed by Monk Liu on Nov 26, 2019

issue:
kernel would report a warning from a double unpin
during the driver unloading on the CSB bo

why:
we unpin it during hw_fini, and there will be another
unpin in sw_fini on CSB bo.

fix:
actually we don't need to pin/unpin it during
hw_init/fini since it is created with kernel pinned,
we only need to fill the CSB again during hw_init
to prevent CSB/VRAM loss after S3

v2:
get_csb in init_rlc so hw_init() will make CSIB content
back even after reset or s3

v3:
use bo_create_kernel instead of bo_create_reserved for CSB
otherwise the bo_free_kernel() on CSB is not aligned and
would lead to its internal reserve pending there forever

take care of gfx7/8 as well

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

03 Dec, 2019 · 3 commits

drm/amdgpu: fix calltrace during kmd unload(v3) · 82a829dc

Committed by Monk Liu on Nov 26, 2019

issue:
kernel would report a warning from a double unpin
during the driver unloading on the CSB bo

why:
we unpin it during hw_fini, and there will be another
unpin in sw_fini on CSB bo.

fix:
actually we don't need to pin/unpin it during
hw_init/fini since it is created with kernel pinned,
we only need to fill the CSB again during hw_init
to prevent CSB/VRAM loss after S3

v2:
get_csb in init_rlc so hw_init() will make CSIB content
back even after reset or s3

v3:
use bo_create_kernel instead of bo_create_reserved for CSB
otherwise the bo_free_kernel() on CSB is not aligned and
would lead to its internal reserve pending there forever

take care of gfx7/8 as well

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/gfx: Increase dispatch packet number · 45317d5f

Committed by James Zhu on Nov 26, 2019

For Arcturus, increase dispatch packet number to stress scheduler.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/gfx: Clear more EDC cnt · 2255d7f3

Committed by James Zhu on Nov 26, 2019

Clear SDMA and HDP EDC counters in GPR workarounds.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

27 Nov, 2019 · 1 commit

drm/amdgpu: apply gpr/gds workaround before enabling GFX EDC mode · be3e73ea

Committed by Hawking Zhang on Nov 20, 2019

gfx memory should be initialized before enabling
the DED and FUE fields in mmGB_EDC_MODE.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

23 Nov, 2019 · 4 commits

drm/amdgpu: Update Arcturus golden registers · 57fb0ab2

Committed by Jay Cornwall on Nov 20, 2019

Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: disable gfxoff on original raven · 8fc41344

Committed by Alex Deucher on Nov 15, 2019

There are still combinations of sbios and firmware that
are not stable.

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=204689
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Update Arcturus golden registers · 6e04b224

Committed by Jay Cornwall on Nov 20, 2019

Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: define soc15_ras_field_entry for reuse · 46f71969

Committed by Dennis Li on Nov 19, 2019

The struct soc15_ras_field_entry will be reused by
other IPs, such as mmhub and gc.

v2: rename ras_subblock_regs to gc_ras_fields_vg20,
because future asics may have a different table.

Signed-off-by: Dennis Li <dennis.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

21 Nov, 2019 · 1 commit

drm/amdgpu: disable gfxoff on original raven · 941a0a79

Committed by Alex Deucher on Nov 15, 2019

There are still combinations of sbios and firmware that
are not stable.

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=204689
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

19 Nov, 2019 · 1 commit

drm/amdgpu: disable gfxoff on original raven · 3f2a06ac

Committed by Alex Deucher on Nov 15, 2019

There are still combinations of sbios and firmware that
are not stable.

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=204689
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

09 Nov, 2019 · 1 commit

drm/amdgpu: allow direct upload save restore list for raven2 · eebc7f4d

Committed by changzhu on Nov 07, 2019

modprobe gets stuck in atombios on raven2 if the gfx driver is not
allowed to upload the save restore list directly.
So allow direct upload of the save restore list for raven2
temporarily.

Signed-off-by: changzhu <Changfeng.Zhu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

07 Nov, 2019 · 4 commits

drm/amdgpu/renoir: move gfxoff handling into gfx9 module · 77a31602

Committed by Alex Deucher on Oct 29, 2019

To properly handle the option parsing ordering.

Reviewed-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add warning for GRBM 1-cycle delay issue in gfx9 · 440a7a54

Committed by changzhu on Nov 05, 2019

Add a warning in gfx9 asking for a firmware update
in case the firmware is too old to implement the
dummy read in cp firmware.

Signed-off-by: changzhu <Changfeng.Zhu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/renoir: move gfxoff handling into gfx9 module · ad4d81dc

Committed by Alex Deucher on Oct 29, 2019

To properly handle the option parsing ordering.

Reviewed-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: change read of GPU clock counter on Vega10 VF · f88e2d1f

Committed by Eric Huang on Nov 05, 2019

Using the unified VBIOS causes a performance drop in the sriov environment.
The fix is to switch to another register instead.

Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

openeuler / Kernel · synced successfully more than 1 year ago
