Commits · d09ef243035b75a6d403ebfeb7e87fa20d7e25c6 · openeuler / Kernel

16 Nov, 2022 1 commit

drm/amdgpu: clarify DC checks · d09ef243

Committed by Alex Deucher on Jul 19, 2022

There are several places where we don't want to check
if a particular asic could support DC, but rather, if
DC is enabled.  Set a flag if DC is enabled and check
for that rather than if a device supports DC or not.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
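The difference between "could support DC" and "DC is enabled" can be sketched in plain C. This is a minimal userspace model under stated assumptions, not the driver code: `struct toy_gpu_device`, `asic_supports_dc`, `dc_enabled`, `toy_init_display` and `toy_uses_dc` are hypothetical names chosen for illustration.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the device structure; not the real amdgpu_device. */
struct toy_gpu_device {
	bool asic_supports_dc; /* capability: the hardware could use DC */
	bool dc_enabled;       /* state: DC was actually brought up */
};

/* Set the flag once, at init time, only if DC is both supported and requested. */
static void toy_init_display(struct toy_gpu_device *dev, bool dc_requested)
{
	dev->dc_enabled = dev->asic_supports_dc && dc_requested;
}

/* Later code paths ask "is DC enabled?" rather than "could this ASIC support DC?". */
static bool toy_uses_dc(const struct toy_gpu_device *dev)
{
	return dev->dc_enabled;
}
```

A device can report the capability while the flag stays clear, which is the case the capability check alone would get wrong.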

d09ef243

01 Nov, 2022 1 commit

drm/amdgpu: Disable GPU reset on SRIOV before remove pci. · 2103c421

Committed by Gavin Wan on Oct 26, 2022

The recent change introduced a bug in the SRIOV environment. It caused
unloading amdgpu to fail on the guest VM. The reason is that the VF
FLR was requested while unloading the amdgpu driver, but the VF FLR
sequence of SRIOV is wrong while removing the PCI device.

For SRIOV, the guest driver should not trigger the whole XGMI hive
to do the reset. The host driver controls how the device is reset.

Fixes: f5c7e779 ("drm/amdgpu: Adjust removal control flow for smu v13_0_2")
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Signed-off-by: Gavin Wan <Gavin.Wan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

2103c421

24 Sep, 2022 1 commit

drm: POC drm on dyndbg - use in core, 2 helpers, 3 drivers. · f158936b

Committed by Jim Cromie on Sep 11, 2022

Use DECLARE_DYNDBG_CLASSMAP across DRM:

 - in .c files, since macro defines/initializes a record

 - in drivers, $mod_{drv,drm,param}.c
   ie where param setup is done, since a classmap is param related

 - in drm/drm_print.c
   since existing __drm_debug param is defined there,
   and we ifdef it, and provide an elaborated alternative.

 - in drm_*_helper modules:
   dp/drm_dp - 1st item in makefile target
   drivers/gpu/drm/drm_crtc_helper.c - random pick iirc.

Since these modules all use identical CLASSMAP declarations (ie: names
and .class_id's) they will all respond together to "class DRM_UT_*"
query-commands:

  :#> echo class DRM_UT_KMS +p > /proc/dynamic_debug/control

NOTES:

This changes __drm_debug from int to ulong, so BIT() is usable on it.

DRM's enum drm_debug_category values need to sync with the index of
their respective class-names here.  Then .class_id == category, and
dyndbg's class FOO mechanisms will enable drm_dbg(DRM_UT_KMS, ...).

Though DRM needs consistent categories across all modules, that's not
generally needed; modules X and Y could define FOO differently (ie a
different NAME => class_id mapping), and changes are made according to
each module's private class-map.

No callsites are actually selected by this patch, since none are
class'd yet.

Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Link: https://lore.kernel.org/r/20220912052852.1123868-3-jim.cromie@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
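The two notes above (the debug mask becoming an unsigned long so that BIT() applies, and class names being listed so that index == class_id == category) can be illustrated with a small userspace model. This is only a sketch, not the dyndbg implementation: the enum `toy_debug_category`, the three-entry name table and the helpers `toy_class_name_to_id`, `toy_enable_class` and `toy_category_enabled` are assumptions made for the example.

```c
#include <stddef.h>
#include <string.h>

#define BIT(nr) (1UL << (nr))

/* Model of a category enum whose values are plain indexes (0, 1, 2, ...). */
enum toy_debug_category { TOY_UT_CORE, TOY_UT_DRIVER, TOY_UT_KMS, TOY_UT_COUNT };

/* Class names listed in the same order as the enum, so index == class_id. */
static const char *const toy_class_names[TOY_UT_COUNT] = {
	"DRM_UT_CORE", "DRM_UT_DRIVER", "DRM_UT_KMS",
};

/* unsigned long, so BIT() can be applied to it directly. */
static unsigned long toy_debug_mask;

/* Map a class name to its id; returns -1 if the name is unknown. */
static int toy_class_name_to_id(const char *name)
{
	for (size_t i = 0; i < TOY_UT_COUNT; i++)
		if (strcmp(toy_class_names[i], name) == 0)
			return (int)i;
	return -1;
}

/* Enable a class by name, loosely mirroring a "class FOO +p" query. */
static int toy_enable_class(const char *name)
{
	int id = toy_class_name_to_id(name);

	if (id < 0)
		return -1;
	toy_debug_mask |= BIT(id);
	return 0;
}

/* A debug call in category "cat" fires only if its bit is set. */
static int toy_category_enabled(enum toy_debug_category cat)
{
	return (toy_debug_mask & BIT(cat)) != 0;
}
```

Because the name table and the enum share one ordering, enabling "DRM_UT_KMS" by name sets exactly the bit that a call tagged with the KMS category tests.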

f158936b

21 Sep, 2022 1 commit

drm/amdgpu: bump minor for gang submit · 69743405

Committed by Christian König on Sep 20, 2022

Since that has now landed, bump the minor to let userspace know about it.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

69743405

20 Sep, 2022 2 commits

drm/amdgpu: Fixed psp fence and memory issues when removing amdgpu device · 83d29a5f

Committed by YiPeng Chai on Sep 08, 2022

V3:
Fixed psp fence and memory issues for the asic
using smu v13_0_2 when removing amdgpu device.

[Why]:
1. psp_suspend->psp_free_shared_bufs->
       psp_ta_free_shared_buf->
           amdgpu_bo_free_kernel->
             ...->amdgpu_bo_release_notify->
                    amdgpu_fill_buffer
   psp will free vram memory used by psp when psp_suspend
   is called. But for the asic using smu v13_0_2, because
   psp_suspend is called before adev->shutdown is set to
   true when removing the first hive device, amdgpu_fill_buffer
   will be called, which will cause fence issues when evicting
   all vram resources in amdgpu_vram_mgr_fini.
2. Since psp_hw_fini is not called after calling psp_suspend
   and psp_suspend only calls psp_ring_stop, the psp ring memory
   will not be released when amdgpu device is removed.

[How]:
1. Set shutdown to true before calling amdgpu_device_gpu_recover,
   then amdgpu_fill_buffer will not be called when psp_suspend is
   called.
2. Free psp ring memory in psp_sw_fini.

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

83d29a5f

drm/amdgpu: Adjust removal control flow for smu v13_0_2 · f5c7e779

Committed by YiPeng Chai on Sep 07, 2022

Adjust removal control flow for smu v13_0_2:
   During amdgpu uninstallation, when removing the first
device, the kernel needs to first send a mode1reset message
to all gpu devices. Otherwise, smu initialization will fail
the next time amdgpu is installed.

V2:
1. Update commit comments.
2. Remove the global variable amdgpu_device_remove_cnt
   and add a variable to the structure amdgpu_hive_info.
3. Use hive to detect the first removed device instead of
   a global variable.

V3:
 1. Update commit comments.
 2. Split a patch into multiple patches.
 3. The current patch does:
    a. Add a work mode of AMDGPU_RESET_FOR_DEVICE_REMOVE into
       the existing gpu recover path, which makes all devices
       in the hive list only have a HW reset but no resume (except
       the base IP).
    b. Call AMDGPU_RESET_FOR_DEVICE_REMOVE and
       AMDGPU_NEED_FULL_RESET mode of amdgpu_device_gpu_recover
       in amdgpu_pci_remove when removing the first device in
       hive list.
    c. When removing the first device, the key function call
       sequence of the IP blocks is as follows:
.suspend->mode1reset->.resume(basic ip)->.hw_fini->.early_fini->.sw_fini.
   ^                           |
   |-<----------<---------<----|
       The first three steps result from a call to
       amdgpu_device_gpu_recover. These three steps are
       executed in a loop until all devices in the hive list
       have been iterated.
       The steps starting from .hw_fini only apply to the
       first device. Since .suspend has been called before,
       except for the resumed phase1 basic ip blocks, the .hw_fini
       of all other ip blocks of the current device will do nothing.
    d. When removing other devices, the calling sequence is the
       same as legacy:
          .hw_fini -> .early_fini -> .sw_fini.
       Since .suspend has been called when removing the first device,
       except for the resumed phase1 basic ip blocks, the .hw_fini
       of all other ip blocks of the current device will do nothing.

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
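The call sequence described in points c and d can be traced with a toy model. This sketch follows the commit text literally (a suspend, mode1reset, resume-of-basic-IP round per hive device, triggered only by the first removal) and is not the driver's control flow; `toy_trace_step`, `toy_remove_device` and the step names are hypothetical.

```c
#include <stdio.h>
#include <string.h>

#define TOY_TRACE_LEN 256

/* Append one step name to the trace, separated by single spaces. */
static void toy_trace_step(char *trace, const char *step)
{
	size_t used = strlen(trace);

	snprintf(trace + used, TOY_TRACE_LEN - used, "%s%s", used ? " " : "", step);
}

/*
 * Model the removal of device number "index" (0 = first removed) from a
 * hive with "hive_size" devices. Only the first removal walks the hive.
 */
static void toy_remove_device(char *trace, int index, int hive_size)
{
	if (index == 0) {
		for (int i = 0; i < hive_size; i++) {
			toy_trace_step(trace, "suspend");
			toy_trace_step(trace, "mode1reset");
			toy_trace_step(trace, "resume_basic");
		}
	}
	/* Legacy teardown, run for every removed device. */
	toy_trace_step(trace, "hw_fini");
	toy_trace_step(trace, "early_fini");
	toy_trace_step(trace, "sw_fini");
}
```

For a two-device hive, the first removal records two reset rounds followed by the legacy teardown, while the second removal records the legacy teardown only.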

f5c7e779

08 Sep, 2022 1 commit

drm/amdgpu: TA unload messages are not actually sent to psp when amdgpu is uninstalled · fac53471

Committed by YiPeng Chai on Aug 18, 2022

V1:
  The psp_cmd_submit_buf function is called by psp_hw_fini to send
TA unload messages to psp to terminate ras, asd and tmr. But when
amdgpu is uninstalled, drm_dev_unplug is called earlier than
psp_hw_fini in amdgpu_pci_remove. The calling order is as follows:
static void amdgpu_pci_remove(struct pci_dev *pdev) {
	drm_dev_unplug
	......
	amdgpu_driver_unload_kms->amdgpu_device_fini_hw->...
		->.hw_fini->psp_hw_fini->...
		->psp_ta_unload->psp_cmd_submit_buf
	......
}
The program will return when calling drm_dev_enter in psp_cmd_submit_buf.

So the call to drm_dev_enter in psp_cmd_submit_buf should be
removed, so that the TA unload messages can be sent to the psp
when amdgpu is uninstalled.

V2:
1. Restore psp_cmd_submit_buf to its original code.
2. Move drm_dev_unplug call after amdgpu_driver_unload_kms in
   amdgpu_pci_remove.
3. Since amdgpu_device_fini_hw is called by amdgpu_driver_unload_kms,
   remove the unplug check to release device mmio resource in
   amdgpu_device_fini_hw before calling drm_dev_unplug.

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
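Why the ordering matters can be shown with a toy model of an "enter" guard that refuses once the device is marked unplugged. This is a sketch under assumptions, not the DRM API: `struct toy_dev`, `toy_dev_enter`, `toy_cmd_submit` and the two `toy_remove_*` functions are hypothetical stand-ins for the behaviour described in V1 and V2.

```c
#include <stdbool.h>

/* Toy device: "unplugged" mimics the state set by an unplug call. */
struct toy_dev {
	bool unplugged;
	int cmds_sent;
};

/* Mimics the enter check: refuses once the device is unplugged. */
static bool toy_dev_enter(const struct toy_dev *dev)
{
	return !dev->unplugged;
}

/* Guarded submit: returns without sending if the device cannot be entered. */
static void toy_cmd_submit(struct toy_dev *dev)
{
	if (!toy_dev_enter(dev))
		return;
	dev->cmds_sent++;
}

/* Original order: unplug first, so the unload message is never sent. */
static void toy_remove_unplug_first(struct toy_dev *dev)
{
	dev->unplugged = true;
	toy_cmd_submit(dev); /* stands in for the TA unload during hw_fini */
}

/* V2 order: tear down the hardware first, then unplug. */
static void toy_remove_unplug_last(struct toy_dev *dev)
{
	toy_cmd_submit(dev);
	dev->unplugged = true;
}
```

Keeping the guard and moving the unplug later, as V2 does, lets the unload message go out without weakening the check for other callers.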

fac53471

30 Aug, 2022 1 commit

drm/amdgpu: add missing pci_disable_device() in amdgpu_pmops_runtime_resume() · 6b11af6d

Committed by Yang Yingliang on Aug 26, 2022

Add missing pci_disable_device() if amdgpu_device_resume() fails.

Fixes: 8e4d5d43 ("drm/amdgpu: Handling of amdgpu_device_resume return value for graceful teardown")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
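The error-path pattern behind this fix (undo an earlier enable when a later step fails) can be sketched with stubs. This is a userspace illustration only: the `toy_*` names are hypothetical, the failure is injected by the caller, and the sketch does not reproduce the conditions under which the real resume path enables the device.

```c
/* Toy PCI device that only tracks its enable count. */
struct toy_pci_dev {
	int enable_cnt;
};

static int toy_pci_enable_device(struct toy_pci_dev *pdev)
{
	pdev->enable_cnt++;
	return 0;
}

static void toy_pci_disable_device(struct toy_pci_dev *pdev)
{
	pdev->enable_cnt--;
}

/* Stand-in for the device resume step; returns the injected result. */
static int toy_device_resume(int injected_ret)
{
	return injected_ret;
}

/* Resume path: undo the enable if the later resume step fails. */
static int toy_runtime_resume(struct toy_pci_dev *pdev, int injected_ret)
{
	int ret = toy_pci_enable_device(pdev);

	if (ret)
		return ret;

	ret = toy_device_resume(injected_ret);
	if (ret) {
		toy_pci_disable_device(pdev); /* the cleanup that was missing */
		return ret;
	}
	return 0;
}
```

With the cleanup in place, a failed resume leaves the enable count balanced instead of leaking one enable per failed attempt.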

6b11af6d

25 Jul, 2022 3 commits

drm/amd/display: Add visualconfirm module parameter · 792a0cdd

Committed by Leo Li on Jul 06, 2022

[Why]

Being able to configure visual confirm at boot or in cmdline is helpful
when debugging.

[How]

Add a module parameter to configure DC visual confirm, which works the
same way as the equivalent debugfs entry.

Signed-off-by: Leo Li <sunpeng.li@amd.com>
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

792a0cdd

drm/amdgpu: bump driver version for IP discovery info in HW INFO · 465576ca

Committed by Alex Deucher on May 20, 2022

So userspace knows when it is available.

Proposed mesa patch:
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17411/diffs?commit_id=c8a63590dfd0d64e6e6a634dcfed993f135dd075

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

465576ca

drm/amdgpu: Fix comment typo · c19a23fa

Committed by Jason Wang on Jul 16, 2022

The word `to' is duplicated in the comment; remove one.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Jason Wang <wangborong@cdjrlc.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

c19a23fa

19 Jul, 2022 1 commit

drm/amdgpu: drop runpm from amdgpu_device structure · 9c913f38

Committed by Guchun Chen on Jul 14, 2022

It's redundant now that rpm_mode is used to indicate the
runtime power management mode.

Suggested-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

9c913f38

08 Jun, 2022 1 commit

drm/amdgpu: Add peer-to-peer support among PCIe connected AMD GPUs · 08a2fd23

Committed by Ramesh Errabolu on May 26, 2022

Add support for peer-to-peer communication among AMD GPUs over the PCIe
bus. Support REQUIRES enablement of config HSA_AMD_P2P.

Signed-off-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

08a2fd23

27 May, 2022 2 commits

drm/amdgpu: bump minor version number · 08cffb3e

Committed by Christian König on May 06, 2022

Increase the minor version number to indicate that the new flags are
available.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

08cffb3e

drm/amdgpu: add beige goby PCI ID · 62e9bd20

Committed by Alex Deucher on May 24, 2022

Add a beige goby PCI ID.

Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

62e9bd20

19 May, 2022 2 commits

drm/amd: Don't reset dGPUs if the system is going to s2idle · 7123d39d

Committed by Mario Limonciello on May 17, 2022

An A+A configuration on ASUS ROG Strix G513QY proves that the ASIC
reset for handling aborted suspend can't work with s2idle.

This functionality was introduced in commit daf8de08 ("drm/amdgpu:
always reset the asic in suspend (v2)"). A few other commits have
gone on top of the ASIC reset, but this still doesn't work on the A+A
configuration in s2idle.

Avoid doing the reset on dGPUs specifically when using s2idle.

Fixes: daf8de08 ("drm/amdgpu: always reset the asic in suspend (v2)")
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2008
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

7123d39d

drm/amd: Don't reset dGPUs if the system is going to s2idle · 0223e516

Committed by Mario Limonciello on May 17, 2022

An A+A configuration on ASUS ROG Strix G513QY proves that the ASIC
reset for handling aborted suspend can't work with s2idle.

Avoid doing the reset on dGPUs specifically when using s2idle.

0223e516

06 May, 2022 1 commit

Revert "drm/amdgpu: disable runpm if we are the primary adapter" · 5a90c24a

Committed by Alex Deucher on May 04, 2022

This reverts commit b95dc06a.

This workaround is no longer necessary.  We have a better workaround
in commit f95af4a9 ("drm/amdgpu: don't runtime suspend if there are displays attached (v3)").

Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

5a90c24a

04 May, 2022 2 commits

drm/amd/amdgpu: add more fw load type to fit new ASICs · a76be7bb

Committed by Chengming Gui on Feb 17, 2022

Align the exported fw load types with those used internally.

Signed-off-by: Chengming Gui <Jack.Gui@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

a76be7bb

drm/amdgpu: add mes_kiq module parameter v2 · 928fe236

Committed by Jack Xiao on Apr 14, 2021

The mes_kiq parameter is used to enable the mes kiq pipe.
This module parameter is unnecessary or enabled by default
in the final version.

v2: reword commit message.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

928fe236

28 Apr, 2022 1 commit

drm/amdgpu: don't runtime suspend if there are displays attached (v3) · f95af4a9

Committed by Alex Deucher on Dec 28, 2021

We normally runtime suspend when there are displays attached if they
are in the DPMS off state, however, if something wakes the GPU
we send a hotplug event on resume (in case any displays were connected
while the GPU was in suspend) which can cause userspace to light
up the displays again soon after they were turned off.

Prior to
commit 087451f3 ("drm/amdgpu: use generic fb helpers instead of setting up AMD own's."),
the driver took a runtime pm reference when the fbdev emulation was
enabled because we didn't implement proper shadowing support for
vram access when the device was off so the device never runtime
suspended when there was a console bound.  Once that commit landed,
we now utilize the core fb helper implementation which properly
handles the emulation, so runtime pm now suspends in cases where it did
not before.  Ultimately, we need to sort out why runtime suspend is not
working in this case for some users, but this should restore similar
behavior to before.

v2: move check into runtime_suspend
v3: wake ups -> wakeups in comment, retain pm_runtime behavior in
    runtime_idle callback

Fixes: 087451f3 ("drm/amdgpu: use generic fb helpers instead of setting up AMD own's.")
Link: https://lore.kernel.org/r/20220403132322.51c90903@darkstar.example.org/
Tested-by: Michele Ballabio <ballabio.m@gmail.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

f95af4a9

22 Apr, 2022 1 commit

drm/amdgpu: don't runtime suspend if there are displays attached (v3) · 4020c228

Committed by Alex Deucher on Dec 28, 2021

We normally runtime suspend when there are displays attached if they
are in the DPMS off state, however, if something wakes the GPU
we send a hotplug event on resume (in case any displays were connected
while the GPU was in suspend) which can cause userspace to light
up the displays again soon after they were turned off.

Prior to
commit 087451f3 ("drm/amdgpu: use generic fb helpers instead of setting up AMD own's."),
the driver took a runtime pm reference when the fbdev emulation was
enabled because we didn't implement proper shadowing support for
vram access when the device was off so the device never runtime
suspended when there was a console bound. Once that commit landed,
we now utilize the core fb helper implementation which properly
handles the emulation, so runtime pm now suspends in cases where it did
not before. Ultimately, we need to sort out why runtime suspend is not
working in this case for some users, but this should restore similar
behavior to before.

v2: move check into runtime_suspend
v3: wake ups -> wakeups in comment, retain pm_runtime behavior in
runtime_idle callback

Fixes: 087451f3 ("drm/amdgpu: use generic fb helpers instead of setting up AMD own's.")
Link: https://lore.kernel.org/r/20220403132322.51c90903@darkstar.example.org/
Tested-by: Michele Ballabio <ballabio.m@gmail.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

4020c228

14 Apr, 2022 1 commit

drm/amdgpu: Ensure HDA function is suspended before ASIC reset · 887f75cf

Committed by Kai-Heng Feng on Apr 07, 2022

DP/HDMI audio on AMD PRO VII stops working after S3:
[  149.450391] amdgpu 0000:63:00.0: amdgpu: MODE1 reset
[  149.450395] amdgpu 0000:63:00.0: amdgpu: GPU mode1 reset
[  149.450494] amdgpu 0000:63:00.0: amdgpu: GPU psp mode1 reset
[  149.983693] snd_hda_intel 0000:63:00.1: refused to change power state from D0 to D3hot
[  150.003439] amdgpu 0000:63:00.0: refused to change power state from D0 to D3hot
...
[  155.432975] snd_hda_intel 0000:63:00.1: CORB reset timeout#2, CORBRP = 65535

The offending commit is daf8de08 ("drm/amdgpu: always reset the asic in
suspend (v2)"). Commit 34452ac3 ("drm/amdgpu: don't use BACO for
reset in S3 ") doesn't help, so the issue is something different.

The assumption is that for HDA to fully resume to D0, it first needs to
be successfully put into D3. This guess proved correct: moving
amdgpu_asic_reset() to the noirq callback, so that it is called after the
HDA function is in D3, fixes the issue.

Fixes: daf8de08 ("drm/amdgpu: always reset the asic in suspend (v2)")
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

887f75cf

09 Apr, 2022 1 commit

drm/amdgpu: expand cg_flags from u32 to u64 · 25faeddc

Committed by Evan Quan on Mar 25, 2022

With this, we can support more CG flags.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
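Why a wider field allows more flags can be shown in a few lines of C. This is a minimal sketch, not driver code: the flag names `TOY_CG_FLAG_LOW` and `TOY_CG_FLAG_HIGH` and both helpers are hypothetical, and the bit positions were chosen only to straddle the 32-bit boundary.

```c
#include <stdint.h>

/* Hypothetical flag positions: one fits in 32 bits, one needs bit 32. */
#define TOY_CG_FLAG_LOW  (1ULL << 0)
#define TOY_CG_FLAG_HIGH (1ULL << 32)

/* Returns nonzero if "flag" survives being stored in a 32-bit field. */
static int toy_fits_in_u32(uint64_t flag)
{
	uint32_t narrow = (uint32_t)flag; /* truncates bits 32..63 */

	return narrow == flag;
}

/* Returns nonzero if "flag" is set in a 64-bit flags field. */
static int toy_flag_set_u64(uint64_t cg_flags, uint64_t flag)
{
	return (cg_flags & flag) != 0;
}
```

A flag at bit 32 or above is silently lost in a 32-bit field, whereas a 64-bit field keeps it, doubling the number of flags that can be represented.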

25faeddc

08 Apr, 2022 1 commit

drm/amdgpu: Ensure HDA function is suspended before ASIC reset · 9e051720

Committed by Kai-Heng Feng on Apr 07, 2022

DP/HDMI audio on AMD PRO VII stops working after S3:
[  149.450391] amdgpu 0000:63:00.0: amdgpu: MODE1 reset
[  149.450395] amdgpu 0000:63:00.0: amdgpu: GPU mode1 reset
[  149.450494] amdgpu 0000:63:00.0: amdgpu: GPU psp mode1 reset
[  149.983693] snd_hda_intel 0000:63:00.1: refused to change power state from D0 to D3hot
[  150.003439] amdgpu 0000:63:00.0: refused to change power state from D0 to D3hot
...
[  155.432975] snd_hda_intel 0000:63:00.1: CORB reset timeout#2, CORBRP = 65535

The offending commit is daf8de08 ("drm/amdgpu: always reset the asic in
suspend (v2)"). Commit 34452ac3 ("drm/amdgpu: don't use BACO for
reset in S3 ") doesn't help, so the issue is something different.

The assumption is that for HDA to fully resume to D0, it first needs to
be successfully put into D3. This guess proved correct: moving
amdgpu_asic_reset() to the noirq callback, so that it is called after the
HDA function is in D3, fixes the issue.

Fixes: daf8de08 ("drm/amdgpu: always reset the asic in suspend (v2)")
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

9e051720

26 Mar, 2022 1 commit

drm/amdkfd: Fix Incorrect VMIDs passed to HWS · b7dfbd2e

Committed by Tushar Patel on Mar 17, 2022

Compute-only GPUs have more than 8 VMIDs allocated to KFD. Fix
this by passing the correct number of VMIDs to HWS.

v2: squash in warning fix (Alex)

Signed-off-by: Tushar Patel <tushar.patel@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

b7dfbd2e

05 Mar, 2022 1 commit

drm/amdgpu/vcn: Add vcn firmware log · 11eb648d

Committed by Ruijing Dong on Mar 02, 2022

vcn fwlog is for debugging purposes only;
by default, it is disabled.

Signed-off-by: Ruijing Dong <ruijing.dong@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

11eb648d

03 Mar, 2022 1 commit

drm/amdgpu: Bump minor version for hot plug tests enabling. · 5aa06147

Committed by Andrey Grodzovsky on Mar 01, 2022

This will allow enabling the tests only after the latest fix,
after which the tests passed on my system.

I tested on NV21 standalone and on Vega 10 and Polaris as
a pair with DRI_PRIME.

It's possible there might still be issues on ASICs I don't
have in my possession, but that is the point of finally enabling
the tests: if other people encounter errors during testing,
they will report them and I will be able to fix them.

The related merge request for enabling the libdrm test suite is at
https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/227

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

5aa06147

25 Feb, 2022 2 commits

drm/amdgpu: Exclude PCI reset method for now. · 2656fd23

Committed by Andrey Grodzovsky on Feb 24, 2022

According to my recent investigation of the state of PCI
reset, it's not working. The reason is that the kernel PCI
code rejects SBR when there is more than one PF under the
same bridge, which we always have (at least the AUDIO PF,
but usually more). That is because SBR will reset all the PFs
and devices under the same bridge as you, and you
cannot assume they support SBR.
Once we enable FLR support we can re-enable this option, as
FLR is doable on a single PF and doesn't have this
restriction.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

2656fd23

drm/amdgpu: Add use_xgmi_p2p module parameter · 158a05a0

Committed by Alex Sierra on Feb 23, 2022

This parameter controls xGMI p2p communication, which is enabled by
default. However, it can be disabled by setting it to 0. In case xGMI
p2p is disabled in a dGPU, PCIe p2p interface will be used instead.
This parameter is ignored in GPUs that do not support xGMI
p2p configuration.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Acked-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

158a05a0

24 Feb, 2022 3 commits

drm/amd: Check if ASPM is enabled from PCIe subsystem · 7294863a

Committed by Mario Limonciello on Feb 01, 2022

commit 0064b0ce ("drm/amd/pm: enable ASPM by default") enabled ASPM
by default, but it turns out that this caused a regression on a
variety of hardware configurations.

* PPC64LE hardware does not support ASPM at a hardware level.
  CONFIG_PCIEASPM is often disabled on these architectures.
* Some dGPUs on ALD platforms don't work with ASPM enabled and PCIe subsystem
  disables it

Check with the PCIe subsystem to see whether ASPM has been enabled
or not.

Fixes: 0064b0ce ("drm/amd/pm: enable ASPM by default")
Link: https://wiki.raptorcs.com/w/images/a/ad/P9_PHB_version1.0_27July2018_pub.pdf
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1723
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1739
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1885
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1907
Tested-by: koba.ko@canonical.com
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

7294863a

drm/amdgpu: drop testing module parameter · b784f42c

Committed by Alex Deucher on Feb 18, 2022

This test is not particularly useful now that GTT and GART
are decoupled in the driver.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

b784f42c

drm/amdgpu: drop benchmark module parameter · 0b1a6348

Committed by Alex Deucher on Feb 18, 2022

Now that we expose the benchmarks via debugfs, there is no
longer a need for the module parameter.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

0b1a6348

23 Feb, 2022 1 commit

drm/amdgpu: Fix typo in *whether* in comment · cec2cc7b

Committed by Paul Menzel on Feb 19, 2022

Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

cec2cc7b

18 Feb, 2022 2 commits

drm/amd: Refactor `amdgpu_aspm` to be evaluated per device · 0ab5d711

Committed by Mario Limonciello on Feb 16, 2022

Evaluating `pcie_aspm_enabled` as part of driver probe has the implication
that if one PCIe bridge with an AMD GPU connected doesn't support ASPM
then none of them do.  This is an invalid assumption as the PCIe core will
configure ASPM for individual PCIe bridges.

Create a new helper function that can be called by individual dGPUs to
react to the `amdgpu_aspm` module parameter without having negative results
for other dGPUs on the PCIe bus.

Suggested-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
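The per-device evaluation described above can be modelled as a helper that takes the device as an argument instead of consulting a value computed once at probe time. This is a sketch under assumptions rather than the helper the commit adds: `struct toy_gpu`, `link_aspm_enabled` and `toy_should_use_aspm` are hypothetical names, and the tri-state meaning of the parameter (-1 auto, 0 off, 1 on) is assumed for the example.

```c
#include <stdbool.h>

/* Toy device: records whether the PCIe core enabled ASPM on its link. */
struct toy_gpu {
	bool link_aspm_enabled;
};

/*
 * Per-device decision. "aspm_param" models a tri-state module parameter:
 * -1 = auto (follow the PCIe core), 0 = force off, 1 = force on.
 */
static bool toy_should_use_aspm(const struct toy_gpu *gpu, int aspm_param)
{
	switch (aspm_param) {
	case 0:
		return false;
	case 1:
		return true;
	case -1:
		return gpu->link_aspm_enabled;
	default:
		return false;
	}
}
```

In auto mode, two GPUs behind different bridges can now get different answers, which a single driver-wide result could not express.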

0ab5d711

drm/amd: Check if ASPM is enabled from PCIe subsystem · cba07cce

Committed by Mario Limonciello on Feb 01, 2022

commit 0064b0ce ("drm/amd/pm: enable ASPM by default") enabled ASPM
by default, but it turns out that this caused a regression on a
variety of hardware configurations.

* PPC64LE hardware does not support ASPM at a hardware level.
  CONFIG_PCIEASPM is often disabled on these architectures.
* Some dGPUs on ALD platforms don't work with ASPM enabled and PCIe subsystem
  disables it

Check with the PCIe subsystem to see whether ASPM has been enabled
or not.

Fixes: 0064b0ce ("drm/amd/pm: enable ASPM by default")
Link: https://wiki.raptorcs.com/w/images/a/ad/P9_PHB_version1.0_27July2018_pub.pdf
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1723
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1739
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1885
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1907
Tested-by: koba.ko@canonical.com
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

cba07cce

17 Feb, 2022 1 commit

drm/amdgpu: make cyan skillfish support code more consistent · dfcc3e8c

Committed by Alex Deucher on Feb 14, 2022

Since this is an existing asic, adjust the code to follow
the same logic as previously so the driver state is consistent.

No functional change intended.

Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

dfcc3e8c

08 Feb, 2022 1 commit

drm/amdgpu: drop experimental flag on aldebaran · 3786a9bc

Committed by Alex Deucher on Feb 03, 2022

These have been at production level for a while. Drop
the flag.

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

3786a9bc

03 Feb, 2022 2 commits

drm/amd: avoid suspend on dGPUs w/ s2idle support when runtime PM enabled · e55a3aea

Committed by Mario Limonciello on Jan 25, 2022

dGPUs connected to Intel systems configured for suspend to idle
will not have the power rails cut at suspend and resetting the GPU
may lead to problematic behaviors.

Fixes: e25443d2 ("drm/amdgpu: add a dev_pm_ops prepare callback (v2)")
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1879
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

e55a3aea

drm/amd: Only run s3 or s0ix if system is configured properly · 04ef8604

Committed by Mario Limonciello on Jan 25, 2022

This will cause misconfigured systems to not run the GPU suspend
routines.

* In APUs that are properly configured system will go into s2idle.
* In APUs that are intended to be S3 but user selects
  s2idle the GPU will stay fully powered for the suspend.
* In APUs that are intended to be s2idle and system misconfigured
  the GPU will stay fully powered for the suspend.
* In systems that are intended to be s2idle, but AMD dGPU is also
  present, the dGPU will go through S3

Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

04ef8604

openeuler / Kernel synced successfully about 2 years ago
