提交 · 01d468d9a420152e4a1270992e69a37ea0c98e04 · openeuler / Kernel

03 3月, 2022 1 次提交

drm/amdgpu: Modify .ras_fini function pointer parameter · 01d468d9

由 yipechai 提交于 2月 17, 2022

Modify .ras_fini function pointer parameter so that
we can remove redundant intermediate calls in some
ras blocks.
Signed-off-by: Nyipechai <YiPeng.Chai@amd.com>
Reviewed-by: NTao Zhou <tao.zhou1@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

01d468d9

18 2月, 2022 2 次提交

drm/amdgpu: Optimize xxx_ras_late_init function of each ras block · caae42f0

由 yipechai 提交于 2月 14, 2022

1. Move calling ras block instance members from module internal
   function to the top calling xxx_ras_late_init.
2. Module internal function calls can only use parameter variables
   of xxx_ras_late_init instead of ras block instance members.
Signed-off-by: Nyipechai <YiPeng.Chai@amd.com>
Reviewed-by: NTao Zhou <tao.zhou1@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

caae42f0

drm/amdgpu: Modify .ras_late_init function pointer parameter · 4e9b1fa5

由 yipechai 提交于 2月 14, 2022

Modify .ras_late_init function pointer parameter so that
it can remove redundant intermediate calls in some ras blocks.
Signed-off-by: Nyipechai <YiPeng.Chai@amd.com>
Reviewed-by: NTao Zhou <tao.zhou1@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

4e9b1fa5

15 2月, 2022 1 次提交

drm/amdgpu: Optimize amdgpu_gfx_ras_late_init/amdgpu_gfx_ras_fini function code · 31106508

由 yipechai 提交于 2月 08, 2022

Optimize amdgpu_gfx_ras_late_init/amdgpu_gfx_ras_fini function code.
Signed-off-by: Nyipechai <YiPeng.Chai@amd.com>
Reviewed-by: NTao Zhou <tao.zhou1@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

31106508

15 1月, 2022 2 次提交

drm/amdgpu: Modify gfx block to fit for the unified ras block data and ops · 8b0fb0e9

由 yipechai 提交于 1月 04, 2022

1.Modify gfx block to fit for the unified ras block data and ops.
2.Change amdgpu_gfx_ras_funcs to amdgpu_gfx_ras, and the corresponding variable name remove _funcs suffix.
3.Remove the const flag of gfx ras variable so that gfx ras block can be able to be inserted into amdgpu device ras block link list.
4.Invoke amdgpu_ras_register_ras_block function to register gfx ras block into amdgpu device ras block link list.
5.Remove the redundant code about gfx in amdgpu_ras.c after using the unified ras block.
6.Fill unified ras block .name .block .ras_late_init and .ras_fini for all of gfx versions. If .ras_late_init and .ras_fini had been defined by the selected gfx version, the defined functions will take effect; if not defined, default fill with amdgpu_gfx_ras_late_init and amdgpu_gfx_ras_fini.
Signed-off-by: Nyipechai <YiPeng.Chai@amd.com>
Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: NJohn Clements <john.clements@amd.com>
Reviewed-by: NTao Zhou <tao.zhou1@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

8b0fb0e9

drm/amd/pm: do not expose implementation details to other blocks out of power · bc143d8b

由 Evan Quan 提交于 11月 22, 2021

Those implementation details(whether swsmu supported, some ppt_funcs supported,
accessing internal statistics ...)should be kept internally. It's not a good
practice and even error prone to expose implementation details.
Signed-off-by: NEvan Quan <evan.quan@amd.com>
Reviewed-by: NLijo Lazar <lijo.lazar@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

bc143d8b

05 10月, 2021 2 次提交

drm/amdgpu: During s0ix don't wait to signal GFXOFF · 1d617c02

由 Lijo Lazar 提交于 10月 01, 2021

In the rare event when GFX IP suspend coincides with a s0ix entry, don't
schedule a delayed work, instead signal PMFW immediately to allow GFXOFF
entry. GFXOFF is a prerequisite for s0ix entry. PMFW needs to be
signaled about GFXOFF status before amd-pmc module passes OS HINT
to PMFW telling that everything is ready for a safe s0ix entry.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1712Signed-off-by: NLijo Lazar <lijo.lazar@amd.com>
Reviewed-by: NAlex Deucher <alexander.deucher@amd.com>
Reviewed-by: NMario Limonciello <mario.limonciell@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

1d617c02

drm/amdgpu: During s0ix don't wait to signal GFXOFF · d04287d0

由 Lijo Lazar 提交于 10月 01, 2021

d04287d0

21 8月, 2021 2 次提交

drm/amdgpu: Cancel delayed work when GFXOFF is disabled · 32bc8f83

由 Michel Dänzer 提交于 8月 17, 2021

schedule_delayed_work does not push back the work if it was already
scheduled before, so amdgpu_device_delay_enable_gfx_off ran ~100 ms
after the first time GFXOFF was disabled and re-enabled, even if GFXOFF
was disabled and re-enabled again during those 100 ms.

This resulted in frame drops / stutter with the upcoming mutter 41
release on Navi 14, due to constantly enabling GFXOFF in the HW and
disabling it again (for getting the GPU clock counter).

To fix this, call cancel_delayed_work_sync when the disable count
transitions from 0 to 1, and only schedule the delayed work on the
reverse transition, not if the disable count was already 0. This makes
sure the delayed work doesn't run at unexpected times, and allows it to
be lock-free.

v2:
* Use cancel_delayed_work_sync & mutex_trylock instead of
  mod_delayed_work.
v3:
* Make amdgpu_device_delay_enable_gfx_off lock-free (Christian König)
v4:
* Fix race condition between amdgpu_gfx_off_ctrl incrementing
  adev->gfx.gfx_off_req_count and amdgpu_device_delay_enable_gfx_off
  checking for it to be 0 (Evan Quan)

Cc: stable@vger.kernel.org
Reviewed-by: NEvan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> # v3
Acked-by: Christian König <christian.koenig@amd.com> # v3
Signed-off-by: NMichel Dänzer <mdaenzer@redhat.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

32bc8f83

drm/amdgpu: Cancel delayed work when GFXOFF is disabled · 90a92662

由 Michel Dänzer 提交于 8月 17, 2021

schedule_delayed_work does not push back the work if it was already
scheduled before, so amdgpu_device_delay_enable_gfx_off ran ~100 ms
after the first time GFXOFF was disabled and re-enabled, even if GFXOFF
was disabled and re-enabled again during those 100 ms.

This resulted in frame drops / stutter with the upcoming mutter 41
release on Navi 14, due to constantly enabling GFXOFF in the HW and
disabling it again (for getting the GPU clock counter).

To fix this, call cancel_delayed_work_sync when the disable count
transitions from 0 to 1, and only schedule the delayed work on the
reverse transition, not if the disable count was already 0. This makes
sure the delayed work doesn't run at unexpected times, and allows it to
be lock-free.

v2:
* Use cancel_delayed_work_sync & mutex_trylock instead of
  mod_delayed_work.
v3:
* Make amdgpu_device_delay_enable_gfx_off lock-free (Christian König)
v4:
* Fix race condition between amdgpu_gfx_off_ctrl incrementing
  adev->gfx.gfx_off_req_count and amdgpu_device_delay_enable_gfx_off
  checking for it to be 0 (Evan Quan)

Cc: stable@vger.kernel.org
Reviewed-by: NEvan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> # v3
Acked-by: Christian König <christian.koenig@amd.com> # v3
Signed-off-by: NMichel Dänzer <mdaenzer@redhat.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

90a92662

17 8月, 2021 1 次提交

drm/amd/amdgpu: remove unnecessary RAS context field · 893cf382

由 Candice Li 提交于 8月 13, 2021

Delete ras_if->name in the RAS ctx structure and remove related lines.
Signed-off-by: NCandice Li <candice.li@amd.com>
Reviewed-by: NJohn Clements <john.clements@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

893cf382

20 5月, 2021 1 次提交

drm/amdgpu: Conditionally reset RAS counters on boot · 8f6368a9

由 John Clements 提交于 5月 17, 2021

Only clear RAS error counters if perestent EDC harvesting is not supported
Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: NJohn Clements <john.clements@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

8f6368a9

10 4月, 2021 5 次提交

drm/amdgpu: split gfx callbacks into ras and non-ras ones · 719a9b33

由 Hawking Zhang 提交于 3月 19, 2021

gfx ras is only available in cerntain ip generations.
Signed-off-by: NHawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: NDennis Li <Dennis.Li@amd.com>
Reviewed-by: NJohn Clements <John.Clements@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

719a9b33

drm/amd/pm: unify the interface for gfx state setting · 2d64d23e

由 Evan Quan 提交于 3月 25, 2021

No need to have special handling for swSMU supported ASICs.
Signed-off-by: NEvan Quan <evan.quan@amd.com>
Reviewed-by: NLijo Lazar <lijo.lazar@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

2d64d23e

drm/amdgpu: add the sched_score to amdgpu_ring_init · c107171b

由 Christian König 提交于 2月 02, 2021

Allow separate ring to share the same scheduler score.

No functional change.
Signed-off-by: NChristian König <christian.koenig@amd.com>
Reviewed-and-Tested-by: NLeo Liu <leo.liu@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

c107171b

drm/amdgpu: wrap kiq ring ops with kiq spinlock · 5a8cd98e

由 Nirmoy Das 提交于 3月 12, 2021

KIQ ring is being operated by kfd as well as amdgpu.
KFD is using kiq lock, we should the same from amdgpu side
as well.
Signed-off-by: NNirmoy Das <nirmoy.das@amd.com>
Acked-by: NChristian König <christian.koenig@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

5a8cd98e

drm/amdgpu: add codes to capture invalid hardware access when recovery · 56b53c0b

由 Dennis Li 提交于 3月 10, 2021

When recovery thread has begun GPU reset, there should be not other
threads to access hardware, otherwise system randomly hang.

v2 (chk): rewritten from scratch, use trylock and lockdep instead of
hand wiring the logic.

v3: add in_irq check

v4: change to check in_task
Signed-off-by: NDennis Li <Dennis.Li@amd.com>
Signed-off-by: NChristian König <christian.koenig@amd.com>
Reviewed-by: NChristian König <christian.koenig@amd.com>
Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

56b53c0b

24 3月, 2021 1 次提交

drm/amdgpu: harvest edc status when connected to host via xGMI · 761d86d3

由 Dennis Li 提交于 2月 04, 2021

When connected to a host via xGMI, system fatal errors may trigger
warm reset, driver has no change to query edc status before reset.
Therefore in this case, driver should harvest previous error loging
registers during boot, instead of only resetting them.

v2:
1. IP's ras_manager object is created when its ras feature is enabled,
so change to query edc status after amdgpu_ras_late_init called

2. change to enable watchdog timer after finishing gfx edc init
Signed-off-by: NDennis Li <Dennis.Li@amd.com>
Reivewed-by: NHawking Zhang <hawking.zhang@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

761d86d3

10 2月, 2021 1 次提交

drm/amdgpu: enable only one high prio compute queue · 8c0225d7

由 Nirmoy Das 提交于 2月 01, 2021

For high priority compute to work properly we need to enable
wave limiting on gfx pipe. Wave limiting is done through writing
into mmSPI_WCL_PIPE_PERCENT_GFX register. Enable only one high
priority compute queue to avoid race condition between multiple
high priority compute queues writing that register simultaneously.
Signed-off-by: NNirmoy Das <nirmoy.das@amd.com>
Acked-by: NChristian König <christian.koenig@amd.com>
Reviewed-by: NAlex Deucher <alexander.deucher@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

8c0225d7

14 11月, 2020 2 次提交

drm/amd/pm: add gfx_state_change_set() for rn gfx power switch (v2) · 8279bb4e

由 Prike Liang 提交于 10月 20, 2020

The gfx_state_change_set() funtion can support set GFX power
change status to D0/D3.

v2: make sure to register callback (Alex)
Signed-off-by: NPrike Liang <Prike.Liang@amd.com>
Reviewed-by: NAlex Deucher <alexander.deucher@amd.com>
Reviewed-by: NHuang Rui <ray.huang@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

8279bb4e

drm/amdgpu: add amdgpu_gfx_state_change_set() set gfx power change entry (v2) · d90a53d6

由 Prike Liang 提交于 9月 09, 2020

The new amdgpu_gfx_state_change_set() funtion can support set GFX power
change status to D0/D3.

v2: squash in warning fix (Alex)
Signed-off-by: NPrike Liang <Prike.Liang@amd.com>
Acked-by: NHuang Rui <ray.huang@amd.com>
Reviewed-by: NAlex Deucher <alexander.deucher@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

d90a53d6

13 11月, 2020 1 次提交

drm/amdgpu: fix compute queue priority if num_kcq is less than 4 · 3f66bf40

由 Nirmoy Das 提交于 11月 09, 2020

Compute queues are configurable with module param, num_kcq.
amdgpu_gfx_is_high_priority_compute_queue was setting 1st 4 queues to
high priority queue leaving a null drm scheduler in
adev->gpu_sched[hw_ip]["normal_prio"].sched if num_kcq < 5.

This patch tries to fix it by alternating compute queue priority between
normal and high priority.

Fixes: 33abcb1f (drm/amdgpu: set compute queue priority at mqd_init)
Signed-off-by: NNirmoy Das <nirmoy.das@amd.com>
Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

3f66bf40

17 10月, 2020 1 次提交

drm/amdgpu: move amdgpu_num_kcq handling to a helper · a3bab325

由 Alex Deucher 提交于 10月 16, 2020

Add a helper so we can set per asic default values. Also,
the module parameter is currently clamped to 8, but clamp it
per asic just in case some asics have different limits in the
future. Enable the option on gfx6,7 as well for consistency.
Acked-by: NNirmoy Das <nirmoy.das@amd.com>
Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

a3bab325

16 9月, 2020 1 次提交

drm/amdgpu: Avoid accessing HW when suspending SW state · bf36b52e

由 Andrey Grodzovsky 提交于 7月 29, 2020

At this point the ASIC is already post reset by the HW/PSP
so the HW not in proper state to be configured for suspension,
some blocks might be even gated and so best is to avoid touching it.

v2: Rename in_dpc to more meaningful name
Signed-off-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: NAlex Deucher <alexander.deucher@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

bf36b52e

25 8月, 2020 2 次提交

drm/amdgpu: refine message print for devices of hive · aac89168

由 Dennis Li 提交于 8月 20, 2020

Using dev_xxx instead of DRM_xxx/pr_xxx to indicate which device
of a hive is the message for.
Reviewed-by: NChristian König <christian.koenig@amd.com>
Signed-off-by: NDennis Li <Dennis.Li@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

aac89168

drm/amdgpu: refine codes to avoid reentering GPU recovery · 53b3f8f4

由 Dennis Li 提交于 8月 19, 2020

if other threads have holden the reset lock, recovery will
fail to try_lock. Therefore we introduce atomic hive->in_reset
and adev->in_gpu_reset, to avoid reentering GPU recovery.

v2:
drop "? true : false" in the definition of amdgpu_in_reset
Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: NDennis Li <Dennis.Li@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

53b3f8f4

15 8月, 2020 2 次提交

drm/amdgpu: revert "fix system hang issue during GPU reset" · f1403342

由 Christian König 提交于 8月 12, 2020

The whole approach wasn't thought through till the end.

We already had a reset lock like this in the past and it caused the same problems like this one.

Completely revert the patch for now and add individual trylock protection to the hardware access functions as necessary.

This reverts commit df9c8d1a.
Signed-off-by: NChristian König <christian.koenig@amd.com>
Acked-by: NAlex Deucher <alexander.deucher@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

f1403342

drm/amdgpu: reconfigure spm golden settings on Navi1x after GFXOFF exit(v3) · 425a78f4

由 Tianci.Yin 提交于 7月 20, 2020

On Navi1x, the SPM golden settings are lost after GFXOFF
enter/exit, so reconfigure the golden settings after GFXOFF
exit.
Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: NTianci.Yin <tianci.yin@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

425a78f4

05 8月, 2020 1 次提交

drm/amdgpu: introduce a new parameter to configure how many KCQ we want(v5) · a300de40

由 Monk Liu 提交于 7月 27, 2020

what:
the MQD's save and restore of KCQ (kernel compute queue)
cost lots of clocks during world switch which impacts a lot
to multi-VF performance

how:
introduce a paramter to control the number of KCQ to avoid
performance drop if there is no kernel compute queue needed

notes:
this paramter only affects gfx 8/9/10

v2:
refine namings

v3:
choose queues for each ring to that try best to cross pipes evenly.

v4:
fix indentation
some cleanupsin the gfx_compute_queue_acquire()

v5:
further fix on indentations
more cleanupsin gfx_compute_queue_acquire()

TODO:
in the future we will let hypervisor driver to set this paramter
automatically thus no need for user to configure it through
modprobe in virtual machine
Signed-off-by: NMonk Liu <Monk.Liu@amd.com>
Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com>
Acked-by: NChristian König <christian.koenig@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

a300de40

28 7月, 2020 1 次提交

drm/amdgpu: fix system hang issue during GPU reset · df9c8d1a

由 Dennis Li 提交于 7月 08, 2020

when GPU hang, driver has multi-paths to enter amdgpu_device_gpu_recover,
the atomic adev->in_gpu_reset and hive->in_reset are used to avoid
re-entering GPU recovery.

During GPU reset and resume, it is unsafe that other threads access GPU,
which maybe cause GPU reset failed. Therefore the new rw_semaphore
adev->reset_sem is introduced, which protect GPU from being accessed by
external threads during recovery.

v2:
1. add rwlock for some ioctls, debugfs and file-close function.
2. change to use dqm->is_resetting and dqm_lock for protection in kfd
driver.
3. remove try_lock and change adev->in_gpu_reset as atomic, to avoid
re-enter GPU recovery for the same GPU hang.

v3:
1. change back to use adev->reset_sem to protect kfd callback
functions, because dqm_lock couldn't protect all codes, for example:
free_mqd must be called outside of dqm_lock;

[ 1230.176199] Hardware name: Supermicro SYS-7049GP-TRT/X11DPG-QT, BIOS 3.1 05/23/2019
[ 1230.177221] Call Trace:
[ 1230.178249]  dump_stack+0x98/0xd5
[ 1230.179443]  amdgpu_virt_kiq_reg_write_reg_wait+0x181/0x190 [amdgpu]
[ 1230.180673]  gmc_v9_0_flush_gpu_tlb+0xcc/0x310 [amdgpu]
[ 1230.181882]  amdgpu_gart_unbind+0xa9/0xe0 [amdgpu]
[ 1230.183098]  amdgpu_ttm_backend_unbind+0x46/0x180 [amdgpu]
[ 1230.184239]  ? ttm_bo_put+0x171/0x5f0 [ttm]
[ 1230.185394]  ttm_tt_unbind+0x21/0x40 [ttm]
[ 1230.186558]  ttm_tt_destroy.part.12+0x12/0x60 [ttm]
[ 1230.187707]  ttm_tt_destroy+0x13/0x20 [ttm]
[ 1230.188832]  ttm_bo_cleanup_memtype_use+0x36/0x80 [ttm]
[ 1230.189979]  ttm_bo_put+0x1be/0x5f0 [ttm]
[ 1230.191230]  amdgpu_bo_unref+0x1e/0x30 [amdgpu]
[ 1230.192522]  amdgpu_amdkfd_free_gtt_mem+0xaf/0x140 [amdgpu]
[ 1230.193833]  free_mqd+0x25/0x40 [amdgpu]
[ 1230.195143]  destroy_queue_cpsch+0x1a7/0x270 [amdgpu]
[ 1230.196475]  pqm_destroy_queue+0x105/0x260 [amdgpu]
[ 1230.197819]  kfd_ioctl_destroy_queue+0x37/0x70 [amdgpu]
[ 1230.199154]  kfd_ioctl+0x277/0x500 [amdgpu]
[ 1230.200458]  ? kfd_ioctl_get_clock_counters+0x60/0x60 [amdgpu]
[ 1230.201656]  ? tomoyo_file_ioctl+0x19/0x20
[ 1230.202831]  ksys_ioctl+0x98/0xb0
[ 1230.204004]  __x64_sys_ioctl+0x1a/0x20
[ 1230.205174]  do_syscall_64+0x5f/0x250
[ 1230.206339]  entry_SYSCALL_64_after_hwframe+0x49/0xbe

2. remove try_lock and introduce atomic hive->in_reset, to avoid
re-enter GPU recovery.

v4:
1. remove an unnecessary whitespace change in kfd_chardev.c
2. remove comment codes in amdgpu_device.c
3. add more detailed comment in commit message
4. define a wrap function amdgpu_in_reset

v5:
1. Fix some style issues.
Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com>
Suggested-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com>
Suggested-by: NChristian König <christian.koenig@amd.com>
Suggested-by: NFelix Kuehling <Felix.Kuehling@amd.com>
Suggested-by: NLijo Lazar <Lijo.Lazar@amd.com>
Suggested-by: NLuben Tukov <luben.tuikov@amd.com>
Signed-off-by: NDennis Li <Dennis.Li@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

df9c8d1a

22 7月, 2020 1 次提交

drm/amdgpu: add read amdgpu_gfxoff status in debugfs · 443c7f3c

由 Jinzhou.Su 提交于 7月 07, 2020

 Add interface for SMU12 device, used by UMR.

v2: fix code style
Signed-off-by: NJinzhou.Su <Jinzhou.Su@amd.com>
Reviewed-by: NEvan Quan <evan.quan@amd.com>
Reviewed-by: NHuang Rui <ray.huang@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

443c7f3c

02 5月, 2020 2 次提交

drm/amdgpu: Rename amdgpu_gfx_kcq_queue_mask_transform() · 5c180eb9

由 Yong Zhao 提交于 3月 04, 2020

Rename it to amdgpu_queue_mask_bit_to_set_resource_bit() to be more
specific about its functionality. KFD will use it later.
Signed-off-by: NYong Zhao <Yong.Zhao@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

5c180eb9

drm/amdgpu: update the method to set kcq queue mask · 3ab6fe4b

由 Likun Gao 提交于 10月 24, 2019

Use a common method to set queue mask before set kiq resource.
The value of queue mask must suitablt for the designated form.
Signed-off-by: NLikun Gao <Likun.Gao@amd.com>
Reviewed-by: NHuang Rui <ray.huang@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

3ab6fe4b

24 4月, 2020 2 次提交

drm/amdgpu: protect ring overrun · 04e4e2e9

由 Yintian Tao 提交于 4月 23, 2020

Wait for the oldest sequence on the ring
to be signaled in order to make sure there
will be no command overrun.

v2: fix coding stype and remove abs operation
v3: remove the initialization of variable r
Signed-off-by: NYintian Tao <yttao@amd.com>
Reviewed-by: NChristian König <christian.koenig@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

04e4e2e9

drm/amdgpu: request reg_val_offs each kiq read reg · 54208194

由 Yintian Tao 提交于 4月 22, 2020

According to the current kiq read register method,
there will be race condition when using KIQ to read
register if multiple clients want to read at same time
just like the expample below:
1. client-A start to read REG-0 throguh KIQ
2. client-A poll the seqno-0
3. client-B start to read REG-1 through KIQ
4. client-B poll the seqno-1
5. the kiq complete these two read operation
6. client-A to read the register at the wb buffer and
   get REG-1 value

Therefore, use amdgpu_device_wb_get() to request reg_val_offs
for each kiq read register.

v2: fix the error remove
v3: fix the print typo
v4: remove unused variables
Signed-off-by: NYintian Tao <yttao@amd.com>
Reviewed-by: NChristian König <christian.koenig@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

54208194

14 4月, 2020 1 次提交

drm/amdgpu/kiq: add no_scheduler flag to KIQ · a783910d

由 Alex Deucher 提交于 4月 09, 2020

We don't want a GPU scheduler for this ring.
Reviewed-by: NChristian König <christian.koenig@amd.com>
Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

a783910d

09 4月, 2020 1 次提交

drm/amdgpu: rework sched_list generation · 1c6d567b

由 Nirmoy Das 提交于 4月 01, 2020

Generate HW IP's sched_list in amdgpu_ring_init() instead of
amdgpu_ctx.c. This makes amdgpu_ctx_init_compute_sched(),
ring.has_high_prio and amdgpu_ctx_init_sched() unnecessary.
This patch also stores sched_list for all HW IPs in one big
array in struct amdgpu_device which makes amdgpu_ctx_init_entity()
much more leaner.

v2:
fix a coding style issue
do not use drm hw_ip const to populate amdgpu_ring_type enum

v3:
remove ctx reference and move sched array and num_sched to a struct
use num_scheds to detect uninitialized scheduler list

v4:
use array_index_nospec for user space controlled variables
fix possible checkpatch.pl warnings
Signed-off-by: NNirmoy Das <nirmoy.das@amd.com>
Reviewed-by: NChristian König <christian.koenig@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

1c6d567b

11 3月, 2020 1 次提交

drm/amdgpu: call ras_debugfs_create_all in debugfs_init · 204eaac6

由 Tao Zhou 提交于 3月 06, 2020

and remove each ras IP's own debugfs creation

this is required to fix ras when the driver does not use the drm load
and unload callbacks due to ordering issues with the drm device node.
Signed-off-by: NTao Zhou <tao.zhou1@amd.com>
Signed-off-by: NStanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: NAlex Deucher <alexander.deucher@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

204eaac6

10 3月, 2020 1 次提交

drm/amdgpu: set compute queue priority at mqd_init · 33abcb1f

由 Nirmoy Das 提交于 2月 27, 2020

We were changing compute ring priority while rings were being used
before every job submission which is not recommended. This patch
sets compute queue priority at mqd initialization for gfx8, gfx9 and
gfx10.

Policy: make queue 0 of each pipe as high priority compute queue

High/normal priority compute sched lists are generated from set of high/normal
priority compute queues. At context creation, entity of compute queue
get a sched list from high or normal priority depending on ctx->priority
Signed-off-by: NNirmoy Das <nirmoy.das@amd.com>
Acked-by: NAlex Deucher <alexander.deucher@amd.com>
Acked-by: NChristian König <christian.koenig@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

33abcb1f

27 2月, 2020 1 次提交

drm/amdgpu: use amdgpu_ring_test_helper when possible · c6fc97f9

由 Nirmoy Das 提交于 2月 25, 2020

amdgpu_ring_test_helper already handles ring->sched.ready correctly
Signed-off-by: NNirmoy Das <nirmoy.das@amd.com>
Reviewed-by: NChristian König <christian.koenig@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

c6fc97f9

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功