Commits · 4f8bc72fbf10f2dc8bca74d5da08b3a981b2e5cd · openeuler / Kernel

21 Mar, 2019 · 6 commits

drm/amdgpu: free up the first paging queue v2 · 4f8bc72f

Committed by Christian König on Dec 05, 2018

We need the first paging queue to handle page faults.

v2: handle any number of SDMA instances gracefully

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: re-enable retry faults · f11a13ec

Committed by Christian König on Nov 05, 2018

Now that we have rerouted faults to the other IH
ring, we can enable retries again.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdkfd/sriov:Put the pre and post reset in exclusive mode v2 · f81e8d53

Committed by Wentao Lou on Mar 13, 2019

Add amdgpu_amdkfd_pre_reset and amdgpu_amdkfd_post_reset inside amdgpu_device_reset_sriov.

Signed-off-by: Wentao Lou <Wentao.Lou@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Respect DRM framebuffer info for video surfaces · 1791e54f

Committed by Nicholas Kazlauskas on Mar 13, 2019

[Why]
Incorrect hardcoded assumptions are made regarding luma and chroma
alignment. The actual values set for the DRM framebuffer should be used
when programming the address.

[How]
Respect the given pitch for both luma and chroma planes - it's not like
we can force the alignment to anything else at this point anyway.

Use the FB offset for the chroma planes directly. DRM already
provides this to us so there's no need to calculate it manually.

While we don't actually use the chroma surface size parameters on Raven,
these should have technically been fb->width / 2 and fb->height / 2
since the chroma plane is half the size of the luma plane for NV12.

Leave a TODO indicating that those should be set based on the actual
surface format instead since this is only correct for YUV420 formats.

Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
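
As a rough illustration of the sizing rule described in this commit message, the sketch below derives the chroma plane description for a YUV420 format such as NV12: the dimensions are half the luma dimensions, while pitch and offset are taken as given. The `fb_info` and `plane_desc` structs and the `nv12_chroma_plane` helper are hypothetical stand-ins for this example, not the driver's actual types.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the framebuffer fields the message refers to. */
struct fb_info {
    uint32_t width;       /* luma width in pixels */
    uint32_t height;      /* luma height in pixels */
    uint32_t pitches[2];  /* bytes per row: [0] luma, [1] chroma */
    uint32_t offsets[2];  /* byte offset of each plane in the buffer */
};

struct plane_desc {
    uint32_t width, height, pitch, offset;
};

/* Only valid for 4:2:0 formats, where chroma is subsampled by 2 in
 * both directions. Pitch and offset are used as provided, not recomputed. */
static struct plane_desc nv12_chroma_plane(const struct fb_info *fb)
{
    struct plane_desc p;

    assert(fb != NULL);
    p.width = fb->width / 2;
    p.height = fb->height / 2;
    p.pitch = fb->pitches[1];
    p.offset = fb->offsets[1];
    return p;
}
```

Other subsampling schemes would need different divisors, which is what the TODO in the message points at.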

drm/amdgpu: Wait for newly allocated PTs to be idle · 98ae7f98

Committed by Felix Kuehling on Mar 13, 2019

When page tables are updated by the CPU, synchronize with the
allocation and initialization of newly allocated page tables.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: more descriptive message if HMM not enabled · 194f87dd

Committed by Philip Yang on Mar 04, 2019

With an old kernel config file, CONFIG_ZONE_DEVICE is not selected,
so CONFIG_HMM and CONFIG_HMM_MIRROR are not enabled. The current driver
error message, "Failed to register MMU notifier", is not clear in that
case. Inform the user with a more descriptive message on how to fix the
missing kernel config option.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109808
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

20 Mar, 2019 · 34 commits

drm/amdgpu: support userptr cross VMAs case with HMM · 5aeaccca

Committed by Philip Yang on Mar 04, 2019

A userptr may cross two VMAs if a forked child process (one that does not
call exec after fork) mallocs a buffer, frees it, and then mallocs a larger
buffer. The kernel then creates a new VMA adjacent to the old VMA that was
cloned from the parent process, so some pages of the userptr are in the
first VMA and the rest are in the second VMA.

HMM expects a range to have only one VMA, so loop over all VMAs in the
address range and create multiple ranges to handle this case. See
is_mergeable_anon_vma in mm/mmap.c for details.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
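
The "one range per VMA" idea can be sketched in plain userspace C as follows. This is only an illustration of the splitting logic, assuming the VMAs are given as a sorted array of half-open intervals; `struct vma_span`, `struct addr_range` and `split_by_vma` are hypothetical names and do not correspond to the kernel's `vm_area_struct` or HMM APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct vma_span   { uint64_t start, end; };  /* [start, end), sorted by start */
struct addr_range { uint64_t start, end; };  /* [start, end) */

/* Split [start, end) into one range per VMA it touches.
 * Returns the number of ranges written to out, or -1 if part of the
 * address range is not covered by any VMA or out is too small. */
static int split_by_vma(const struct vma_span *vmas, size_t nvmas,
                        uint64_t start, uint64_t end,
                        struct addr_range *out, size_t max_out)
{
    uint64_t cur = start;
    size_t n = 0;

    assert(start <= end);
    for (size_t i = 0; i < nvmas && cur < end; i++) {
        if (vmas[i].end <= cur)
            continue;               /* VMA lies entirely before the cursor */
        if (vmas[i].start > cur)
            return -1;              /* hole between VMAs */
        if (n == max_out)
            return -1;              /* no room for another range */
        out[n].start = cur;
        out[n].end = vmas[i].end < end ? vmas[i].end : end;
        cur = out[n].end;
        n++;
    }
    return cur < end ? -1 : (int)n;
}
```

With two adjacent VMAs, a buffer that straddles their boundary comes back as two ranges that meet at that boundary.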

drm/amdkfd: support concurrent userptr update for HMM · 386a68e7

Committed by Philip Yang on Mar 04, 2019

A userptr restore may race with a concurrent userptr invalidation after
hmm_vma_fault adds the range to the hmm->ranges list. In that case, call
hmm_vma_range_done to remove the range from the hmm->ranges list first,
then reschedule the restore worker. Otherwise hmm_vma_fault will add the
same range to the list again, which causes a loop in the list because
range->next points to the range itself.

Add function untrack_invalid_user_pages to reduce code duplication.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: stop evicting busy PDs/PTs · 1bd4e4ca

Committed by Christian König on Nov 07, 2018

Otherwise we won't be able to cleanly handle page faults.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: wait for VM to become idle during flush · 56753e73

Committed by Christian König on Jan 10, 2019

Make sure that not only the entities are flushed, but that
we also wait for the HW to finish all processing.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: remove non-sense NULL ptr check · 3119e7f4

Committed by Christian König on Jan 10, 2019

Having a dead pointer in the IDR is a bug; silently returning
is the worst we can do.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: remove chash · 04ed8459

Committed by Christian König on Nov 07, 2018

Remove the chash implementation for now since it isn't used any more.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: use ring/hash for fault handling on GMC9 v3 · c1a8abd9

Committed by Christian König on Nov 07, 2018

Further testing showed that the idea with the chash doesn't work as expected.
In particular, we can't predict when we can remove the entries from the hash again.

So replace the chash with a ring buffer/hash mix where entries in the container
age automatically based on their timestamp.

v2: use ring buffer / hash mix
v3: check the timeout to make sure all entries age

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> (v2)
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
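
A minimal sketch of what such a ring buffer / hash mix could look like, written as standalone C rather than driver code. It assumes non-zero timestamps that strictly increase between calls; every name and constant here (`fault_filter`, `RING_ORDER`, `TIMEOUT`, the hash multiplier) is made up for illustration and is not taken from the patch. Entries are never removed explicitly: they stop matching once their timestamp falls outside the timeout window, and their ring slot is simply overwritten later.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_ORDER 3                    /* 8 slots, illustrative only */
#define RING_SIZE  (1u << RING_ORDER)
#define TIMEOUT    100u                 /* entry lifetime in timestamp units */

struct fault_entry {
    uint64_t key;
    uint64_t timestamp;                 /* 0 means the slot was never used */
    uint32_t next;                      /* older entry in the same hash bucket */
};

struct fault_filter {
    struct fault_entry ring[RING_SIZE];
    uint32_t bucket[RING_SIZE];         /* newest ring index per hash bucket */
    uint32_t write;                     /* next ring slot to overwrite */
};

static uint32_t hash_key(uint64_t key)
{
    return (uint32_t)((key * 0x9E3779B97F4A7C15ull) >> (64 - RING_ORDER));
}

/* Returns true if a live entry for key already exists (duplicate fault),
 * otherwise records the key and returns false. */
static bool fault_filter_check(struct fault_filter *f, uint64_t key, uint64_t now)
{
    uint64_t oldest_live = now > TIMEOUT ? now - TIMEOUT : 1;
    uint32_t h = hash_key(key);
    struct fault_entry *e = &f->ring[f->bucket[h]];

    assert(now != 0);
    /* Walk the bucket chain from newest to oldest while entries are live. */
    while (e->timestamp >= oldest_live) {
        uint64_t prev = e->timestamp;

        if (e->key == key)
            return true;
        e = &f->ring[e->next];
        if (e->timestamp >= prev)
            break;                      /* slot was reused: end of this chain */
    }

    /* Overwrite the oldest ring slot and make it the head of the bucket. */
    e = &f->ring[f->write];
    e->key = key;
    e->timestamp = now;
    e->next = f->bucket[h];
    f->bucket[h] = f->write;
    f->write = (f->write + 1) % RING_SIZE;
    return false;
}
```

A repeated key inside the timeout window is reported as a duplicate; once the window has passed, the same key is accepted again without any explicit removal step.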

drm/amdgpu: limit the number of IVs processed at once · 8c65fe5f

Committed by Christian König on Mar 05, 2019

Only process a maximum of 32 IVs before writing back the RPTR. This improves
HW handling when we get close to an overflow in the ring buffer.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
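
The batching described here can be sketched as a simple loop over a power-of-two ring. This is a standalone model, not the driver's interrupt handler: `struct ih_ring`, its `hw_rptr` field (standing in for the RPTR register) and `ih_process` are hypothetical, and only the limit of 32 comes from the commit message.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_IVS_PER_PASS 32u

struct ih_ring {
    uint32_t rptr;        /* software read pointer, in entries */
    uint32_t wptr;        /* write pointer advanced by the hardware */
    uint32_t mask;        /* ring size - 1, size must be a power of two */
    uint32_t hw_rptr;     /* stand-in for the RPTR register */
    uint32_t writebacks;  /* number of RPTR write-backs, for illustration */
};

/* Drain the ring, publishing the read pointer after at most
 * MAX_IVS_PER_PASS entries so the producer sees free space early. */
static uint32_t ih_process(struct ih_ring *ih)
{
    uint32_t total = 0;

    assert(((ih->mask + 1) & ih->mask) == 0);
    while (ih->rptr != ih->wptr) {
        uint32_t count = MAX_IVS_PER_PASS;

        while (ih->rptr != ih->wptr && count--) {
            /* an IV would be decoded and dispatched here */
            ih->rptr = (ih->rptr + 1) & ih->mask;
            total++;
        }
        ih->hw_rptr = ih->rptr;   /* write back the RPTR */
        ih->writebacks++;
    }
    return total;
}
```

With 100 pending entries, this model writes the read pointer back four times (after 32, 64, 96 and 100 entries) instead of once at the end.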

drm/amdgpu: enable IH ring 1&2 for Vega20 as well · b51cd19e

Committed by Christian König on Mar 04, 2019

That doesn't seem to have any negative effects.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: enable IH doorbell for ring 1&2 on Vega · 1ae64cec

Committed by Christian König on Feb 27, 2019

The doorbells should already be reserved, just enable them.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: change Vega IH ring 1 config · 0133690e

Committed by Christian König on Feb 27, 2019

Disable overflow and enable full drain. This makes fault handling on ring 1
much more reliable since we don't generate back pressure any more.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Only clear dumb buffers if ring is enabled · 46846ba2

Committed by Nicholas Kazlauskas on Mar 11, 2019

The buffers should be cleared when possible but we also don't want
buffer creation to fail in the rare case where the ring isn't ready
during the call. This could happen during some suspend/resume sequences.

Cc: Christian König <ckoenig.leichtzumerken@gmail.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Clear VRAM for DRM dumb_create buffers · 95b13468

Committed by Nicholas Kazlauskas on Mar 08, 2019

The dumb_create API isn't intended for high performance rendering
and it's more useful for userspace (e.g. IGT) to have the buffers precleared.

The bonus here is that we also won't needlessly leak whatever was
previously in VRAM, but it also probably wasn't sensitive if it was
going through this API.

Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: fix semicolon.cocci warnings · 289d513b

Committed by kbuild test robot on Mar 06, 2019

drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:405:2-3: Unneeded semicolon
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:435:2-3: Unneeded semicolon

Remove unneeded semicolon.

Generated by: scripts/coccinelle/misc/semicolon.cocci

CC: xinhui pan <xinhui.pan@amd.com>
Signed-off-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add new ras workflow control flags · 108c6a63

Committed by xinhui pan on Mar 11, 2019

Add a RAS post-init function that does some initialization after all
IPs have finished their late init.

Add a new member, flags, which will control the RAS workflow.
For now, the vbios enables RAS for us on boot. That might change in the
future, so there should be a flag from the vbios to tell us whether RAS
is enabled on boot. It looks like there is no such info now.

Other bits of the flags are reserved to control other parts of RAS.

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: let ras initialization a little noticeable · 5d0f903f

Committed by xinhui pan on Mar 12, 2019

Add DRM info output if RAS is initialized successfully.
Add a RAS atomfirmware sanity check.

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Fix lockdep warning more gracely · 163def43

Committed by xinhui pan on Mar 11, 2019

lockdep needs a static key.
Previously we set the ignore bit to avoid the warning.
Now call sysfs_attr_init to initialize the static key.

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-and-Tested-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Fix ras debugfs data parse · b076296b

Committed by xinhui pan on Mar 11, 2019

sscanf accepts any non-zero char, so when the data is a structure
the parser unexpectedly returns an invalid-argument error.

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add new member hw_supported · 5caf466a

Committed by xinhui pan on Mar 11, 2019

Currently, it is not clear how RAS is supported. Both software and
hardware can set the supported member. That is confusing.

Fix it by adding a new member, hw_supported.

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Fix warning when lockdep is enabled · 2b9505e3

Committed by xinhui pan on Mar 11, 2019

Set the ignore bit to satisfy lockdep.

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Fix NULL pointer when ta is missing · 54eb4ed6

Committed by xinhui pan on Mar 11, 2019

The TA is optional, so check whether the TA firmware is loaded.

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: fix ras parameter descriptions · 2f3940e9

Committed by Evan Quan on Mar 07, 2019

The modinfo output wrongly shows two parameters
for each feature (see below). This patch fixes the
incorrect output.

parm:           amdgpu_ras_enable:Enable RAS features on the GPU (0 = disable, 1 = enable, -1 = auto (default))
parm:           ras_enable:int
parm:           amdgpu_ras_mask:Mask of RAS features to enable (default 0xffffffff), only valid when ras_enable == 1
parm:           ras_mask:uint

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: xinhui pan <xinhui.pan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: export both supported and enabled ras features · 1febb00e

Committed by xinhui pan on Mar 07, 2019

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: lookup vbios table to check ecc capability · b404ae82

Committed by xinhui pan on Mar 07, 2019

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: query sram ecc/ecc availability from atombios · f49ea9f8

Committed by Hawking Zhang on Mar 07, 2019

Query sram ecc capability via amdgpu_atomfirmware_ecc_default_enabled.
Query ecc availability via amdgpu_atomfirmware_sram_ecc_supported.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add atomfirmware helper function to query sram ecc caps · 8b6da23f

Committed by Hawking Zhang on Mar 07, 2019

The sram ecc capability can be read from the firmware_capability field in the firmwareinfo table.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add atomfirmware helper function to query ecc status · 511c4348

Committed by Hawking Zhang on Mar 07, 2019

The default ecc status (enabled or disabled) can be read from the umc_config field in the umc_info table.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: update atomfirmware header with ecc related members · ed606ca3

Committed by Hawking Zhang on Mar 07, 2019

Add new umc_info structures and new firmware_capability defines.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: handle ras resume · acbbee01

Committed by xinhui pan on Mar 07, 2019

Suspend puts the IRQ, so resume needs to get the IRQ back.
At the same time, skip the other RAS initialization.

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdkfd: add RAS ECC event support (v3) · 9b54d201

Committed by Eric Huang on Jan 11, 2019

The RAS ECC event is combined with the GPU reset event, because
ECC interrupts are caused by an uncorrectable error that triggers
a GPU reset.

v2: Fix misleading-indentation warning
v3: fix build with CONFIG_HSA_AMD disabled

Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdkfd: add RAS capabilities in topology for Vega20 (v2) · 0dee45a2

Committed by Eric Huang on Jan 11, 2019

This is to work together with HSA_CAPABILITY in libhsakmt.

v2: squash in NULL pointer check

Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add human readable debugfs control support (v2) · 96ebb307

Committed by xinhui pan on Mar 01, 2019

Currently, the debugfs control node can't parse bash-like commands.
Now add such support for any tester that uses scripts.

v2: squash in fixes for input validation

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: skip gpu reset when ras error occured · 138352e5

Committed by xinhui pan on Mar 01, 2019

GPU reset is not stable on Vega20 A1.

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add ioctl query for enabled ras features (v2) · 5cb77114

Committed by xinhui pan on Dec 17, 2018

Add a query for userspace to check which RAS features
are enabled.

v2: squash in warning fix

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

openeuler / Kernel: synced successfully more than 1 year ago
