Commits · dfd0287bd3920e132a8dae2a0ec3d92eaff5f2dd · openeuler / Kernel

30 Nov, 2022 1 commit

drm/amdgpu: Fix potential double free and null pointer dereference · dfd0287b

Authored by Liang He on Nov 22, 2022

In amdgpu_get_xgmi_hive(), we should not call kfree() after
kobject_put(), because the put itself ends up calling kfree().

In amdgpu_device_ip_init(), we need to check the returned *hive*,
which can be NULL, before we dereference it.

Signed-off-by: Liang He <windhl@126.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
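
The two fixes described above can be illustrated with a small userspace sketch. It is not the kernel code: `struct hive`, `hive_get()`, `hive_put()`, `hive_release()` and `hive_id_or_error()` are hypothetical stand-ins for a kobject-backed object, assuming only what the message states, namely that the final put frees the object and that the lookup may return NULL.

```c
#include <stdlib.h>

/* Hypothetical userspace stand-in for a kobject-backed XGMI hive. */
struct hive {
    int refcount;
    unsigned long long hive_id;
};

static int hives_freed; /* counts release calls, for illustration only */

/* Plays the role of the kobject release callback: the only place that frees. */
static void hive_release(struct hive *h)
{
    hives_freed++;
    free(h);
}

/* Plays the role of kobject_put(): drops a reference, releases on zero. */
static void hive_put(struct hive *h)
{
    if (h && --h->refcount == 0)
        hive_release(h);
}

/* Returns a new hive or NULL. simulate_add_failure mimics a failed init/add step. */
static struct hive *hive_get(unsigned long long id, int simulate_add_failure)
{
    struct hive *h = calloc(1, sizeof(*h));

    if (!h)
        return NULL;
    h->refcount = 1;
    h->hive_id = id;

    if (simulate_add_failure) {
        hive_put(h); /* frees h through the release callback */
        /* No free(h) here: that would be the double free. */
        return NULL;
    }
    return h;
}

/* Caller side: check the returned pointer before dereferencing it. */
static int hive_id_or_error(unsigned long long id, int fail,
                            unsigned long long *out)
{
    struct hive *h = hive_get(id, fail);

    if (!h)
        return -1;
    *out = h->hive_id;
    hive_put(h);
    return 0;
}
```

With this shape, the error path releases the object exactly once, and a caller that receives NULL returns an error instead of touching the pointer.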

dfd0287b

20 Sep, 2022 1 commit

drm/amd/display: clean up some inconsistent indentings · 7f89f997

Authored by Yang Li on Sep 15, 2022

Clean up some inconsistent indentation.

Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=2178
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

7f89f997

14 Sep, 2022 1 commit

drm/amdgpu: Use per device reset_domain for XGMI on sriov configuration · 46c67660

Authored by shaoyunl on Sep 06, 2022

For SRIOV configurations, the host driver controls the reset method (either
FLR or a heavier chain reset). The host notifies each guest individually with
an FLR message if an individual GPU within the hive needs to be reset. So on
the guest side, there is no need to use hive->reset_domain to replace the
original per-device reset_domain.

Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

46c67660

26 Aug, 2022 1 commit

drm/amdgpu: skip set_topology_info for VF · 7c55b598

Authored by Vignesh Chander on Aug 16, 2022

Skip set_topology_info, as the xgmi TA will now block it
and the host needs to program it.

Signed-off-by: Vignesh Chander <Vignesh.Chander@amd.com>
Reviewed-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

7c55b598

23 Aug, 2022 1 commit

drm/amdgpu: Move psp_xgmi_terminate call from amdgpu_xgmi_remove_device to psp_hw_fini · d8adafc7

Authored by YiPeng Chai on Aug 12, 2022

V1:
The amdgpu_xgmi_remove_device function sends an unload command
to psp through the psp ring to terminate xgmi, but the psp ring has
already been destroyed in psp_hw_fini.

V2:
1. Change the commit title.
2. Restore amdgpu_xgmi_remove_device to its original calling location.
   Move psp_xgmi_terminate call from amdgpu_xgmi_remove_device to
   psp_hw_fini.

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

d8adafc7

20 Aug, 2022 1 commit

drm/amdgpu: Move psp_xgmi_terminate call from amdgpu_xgmi_remove_device to psp_hw_fini · 9d705d77

Authored by YiPeng Chai on Aug 12, 2022

V1:
The amdgpu_xgmi_remove_device function sends an unload command
to psp through the psp ring to terminate xgmi, but the psp ring has
already been destroyed in psp_hw_fini.

V2:
1. Change the commit title.
2. Restore amdgpu_xgmi_remove_device to its original calling location.
   Move psp_xgmi_terminate call from amdgpu_xgmi_remove_device to
   psp_hw_fini.

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

9d705d77

16 Mar, 2022 1 commit

drm/amdgpu: drop xmgi23 error query/reset support · a03b2886

Authored by Hawking Zhang on Mar 10, 2022

xgmi_ras is only initialized when the host-to-GPU interface
is PCIe. In that case, xgmi23 is disabled and protected
by security firmware, and host access will result in a
security violation.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

a03b2886

03 Mar, 2022 4 commits

drm/amdgpu: Remove redundant .ras_fini initialization in some ras blocks · 80e0c2cb

Authored by yipechai on Feb 17, 2022

1. Define amdgpu_ras_block_late_fini_default in amdgpu_ras.c as the
   common .ras_fini function, which is called when the
   .ras_fini of a ras block isn't initialized.
2. Remove the code that uses amdgpu_ras_block_late_fini to
   initialize .ras_fini in ras blocks.

Signed-off-by: yipechai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
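
The fallback described in item 1 is a common function-pointer pattern. The sketch below shows it in plain C; `struct ras_block`, `ras_block_late_fini_default()`, `xgmi_custom_fini()` and `ras_fini_all()` are hypothetical names chosen for illustration and are not the driver's actual types or signatures.

```c
#include <stddef.h>

struct ras_block;
typedef void (*ras_fini_fn)(struct ras_block *blk);

/* Hypothetical, simplified ras block: ras_fini may be left NULL. */
struct ras_block {
    const char *name;
    ras_fini_fn ras_fini;
    int default_fini_ran; /* bookkeeping for the example only */
    int custom_fini_ran;
};

/* Common teardown used when a block does not provide its own ras_fini. */
static void ras_block_late_fini_default(struct ras_block *blk)
{
    blk->default_fini_ran++;
}

/* A block-specific teardown, for blocks that need extra work. */
static void xgmi_custom_fini(struct ras_block *blk)
{
    blk->custom_fini_ran++;
}

/* Tear down every block, falling back to the default when ras_fini is unset. */
static void ras_fini_all(struct ras_block *blocks, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (blocks[i].ras_fini)
            blocks[i].ras_fini(&blocks[i]);
        else
            ras_block_late_fini_default(&blocks[i]);
    }
}
```

Because the caller supplies the default, individual blocks no longer need to assign the common function to `.ras_fini` themselves.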

80e0c2cb

drm/amdgpu: Remove redundant calls of amdgpu_ras_block_late_fini in xgmi ras block · f148c143

Authored by yipechai on Feb 14, 2022

Remove redundant calls of amdgpu_ras_block_late_fini in xgmi ras block.

Signed-off-by: yipechai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

f148c143

drm/amdgpu: Optimize xxx_ras_fini function of each ras block · 667c7091

Authored by yipechai on Feb 17, 2022

1. Move the variables of ras block instance members from
   specific xxx_ras_fini to general ras_fini call.
2. Function calls inside the modules only use parameters
   passed from xxx_ras_fini instead of ras block instance
   members.

Signed-off-by: yipechai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

667c7091

drm/amdgpu: Modify .ras_fini function pointer parameter · 01d468d9

Authored by yipechai on Feb 17, 2022

Modify .ras_fini function pointer parameter so that
we can remove redundant intermediate calls in some
ras blocks.

Signed-off-by: yipechai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

01d468d9

18 Feb, 2022 2 commits

drm/amdgpu: Optimize xxx_ras_late_init function of each ras block · caae42f0

Authored by yipechai on Feb 14, 2022

1. Move calling ras block instance members from module internal
   function to the top calling xxx_ras_late_init.
2. Module internal function calls can only use parameter variables
   of xxx_ras_late_init instead of ras block instance members.

Signed-off-by: yipechai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

caae42f0

drm/amdgpu: Modify .ras_late_init function pointer parameter · 4e9b1fa5

Authored by yipechai on Feb 14, 2022

Modify .ras_late_init function pointer parameter so that
it can remove redundant intermediate calls in some ras blocks.

Signed-off-by: yipechai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

4e9b1fa5

15 Feb, 2022 2 commits

drm/amdgpu: Optimize amdgpu_xgmi_ras_late_init/amdgpu_xgmi_ras_fini function code · 892a57a9

Authored by yipechai on Feb 08, 2022

Optimize amdgpu_xgmi_ras_late_init/amdgpu_xgmi_ras_fini function code.

Signed-off-by: yipechai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

892a57a9

drm/amdgpu: Optimize xxx_ras_late_init/xxx_ras_late_fini for each ras block · bdb3489c

Authored by yipechai on Jan 30, 2022

1. Define amdgpu_ras_block_late_init to create sysfs nodes
   and interrupt handlers.
2. Define amdgpu_ras_block_late_fini to remove sysfs nodes
   and interrupt handlers.
3. Replace ras block variable members in struct
   amdgpu_ras_block_object with struct ras_common_if, which
   makes it easy to associate each ras block instance
   with each ras block functional interface.
4. Add .ras_cb to struct amdgpu_ras_block_object.
5. Change each ras block to fit the change to struct
   amdgpu_ras_block_object.

Signed-off-by: yipechai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

bdb3489c

10 Feb, 2022 3 commits

drm/amdgpu: Rework reset domain to be refcounted. · cfbb6b00

Authored by Andrey Grodzovsky on Jan 21, 2022

The reset domain now contains the register access semaphore,
so it needs to stay around as long as any device
in a hive needs it, and it cannot be bound to the XGMI
hive life cycle.
Address this by making the reset domain refcounted and pointed
to by each member of the hive and by the hive itself.

v4:

Fix crash on boot with XGMI hive by adding type to reset_domain.
XGMI will only create a new reset_domain if the previous one was of
single device type, meaning it's the first boot. Otherwise it will take
a refcount on the existing reset_domain from the amdgpu device.

Add a wrapper around reset_domain->refcount get/put
and a wrapper around send to reset wq (Lijo)

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Link: https://www.spinics.net/lists/amd-gfx/msg74121.html
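
As a rough illustration of the lifetime rule above, the following userspace sketch models a refcounted reset domain shared by a hive and its member devices. All names (`struct reset_domain`, `reset_domain_get()`, `reset_domain_put()`, `hive_init_domain()`, `device_join_hive()`) are hypothetical, plain integers stand in for the kernel's refcounting, and the create-or-reuse decision is one reading of the v4 note rather than the driver's actual logic.

```c
#include <stdlib.h>

enum reset_domain_type {
    RESET_DOMAIN_SINGLE_DEVICE,
    RESET_DOMAIN_XGMI_HIVE,
};

/* Hypothetical refcounted reset domain. */
struct reset_domain {
    int refcount;
    enum reset_domain_type type;
};

struct device { struct reset_domain *domain; };
struct hive   { struct reset_domain *domain; };

static struct reset_domain *reset_domain_create(enum reset_domain_type type)
{
    struct reset_domain *d = calloc(1, sizeof(*d));

    if (d) {
        d->refcount = 1;
        d->type = type;
    }
    return d;
}

/* Wrapper around taking a reference. */
static void reset_domain_get(struct reset_domain *d)
{
    d->refcount++;
}

/* Wrapper around dropping a reference; returns 1 if the domain was freed. */
static int reset_domain_put(struct reset_domain *d)
{
    if (--d->refcount == 0) {
        free(d);
        return 1;
    }
    return 0;
}

/* Create a hive-type domain only if the device still has a single-device one;
 * otherwise take a reference on the device's existing hive-type domain. */
static int hive_init_domain(struct hive *hive, struct device *dev)
{
    if (dev->domain->type == RESET_DOMAIN_XGMI_HIVE) {
        reset_domain_get(dev->domain);
        hive->domain = dev->domain;
        return 0;
    }
    hive->domain = reset_domain_create(RESET_DOMAIN_XGMI_HIVE);
    return hive->domain ? 0 : -1;
}

/* Point a member device at the hive's domain, dropping its old reference. */
static void device_join_hive(struct device *dev, struct hive *hive)
{
    if (dev->domain == hive->domain)
        return;
    reset_domain_get(hive->domain);
    reset_domain_put(dev->domain);
    dev->domain = hive->domain;
}
```

In this model the domain outlives the hive: when the hive drops its reference, the domain stays allocated until the last member device has dropped its own.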

cfbb6b00

drm/amdgpu: Drop hive->in_reset · 681260df

Authored by Andrey Grodzovsky on Dec 15, 2021

Since we serialize all resets, there is no need to protect against
concurrent resets.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://www.spinics.net/lists/amd-gfx/msg74115.html

681260df

drm/amdgpu: Introduce reset domain · a4c63caf

Authored by Andrey Grodzovsky on Nov 30, 2021

Define a reset_domain struct such that
all the entities that go through reset
together will be serialized against one
another. Do it for both single device and
XGMI hive cases.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Suggested-by: Christian König <ckoenig.leichtzumerken@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://www.spinics.net/lists/amd-gfx/msg74111.html

a4c63caf

19 Jan, 2022 1 commit

drm/amdgpu: Fix the code style warnings in hdp xgmi mca and umc · 71b6c4a2

Authored by yipechai on Jan 14, 2022

drm/amdgpu: Fix the code style warnings in hdp xgmi mca and umc:
1. WARNING: missing space after struct definition.
2. WARNING: please, no space before tabs.
3. WARNING: line length of xxx exceeds 100 columns.
4. ERROR: "foo* bar" should be "foo *bar".
5. ERROR: space required before the open parenthesis '('.
6. ERROR: space prohibited after that open parenthesis '('.

Signed-off-by: yipechai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

71b6c4a2

15 Jan, 2022 2 commits

drm/amdgpu: Adjust error inject function code style in amdgpu_ras.c · 22d4ba53

Authored by yipechai on Jan 05, 2022

1. Move the xgmi-specific error inject function from amdgpu_ras.c to the xgmi block.
2. Support using psp_ras_trigger_error as the default error inject function in
   amdgpu_ras.c. If .ras_error_inject isn't defined in a ras block, the default
   error inject function takes effect.

v2: squash in warning fix (Alex)

Signed-off-by: yipechai <YiPeng.Chai@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

22d4ba53

drm/amdgpu: Modify xgmi block to fit for the unified ras block data and ops · 6c245386

Authored by yipechai on Jan 04, 2022

1. Modify gmc block to fit for the unified ras block data and ops.
2. Change amdgpu_xgmi_ras_funcs to amdgpu_xgmi_ras, and remove the _funcs
   suffix from the corresponding variable name.
3. Remove the const flag of the gmc ras variable so that the gmc ras block can
   be inserted into the amdgpu device ras block link list.
4. Invoke the amdgpu_ras_register_ras_block function to register the gmc ras
   block into the amdgpu device ras block link list.
5. Remove the redundant code about gmc in amdgpu_ras.c after using the unified
   ras block.

Signed-off-by: yipechai <YiPeng.Chai@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

6c245386

12 Jan, 2022 1 commit

drm/amdgpu: use default_groups in kobj_type · 7ff61cdc

Authored by Greg Kroah-Hartman on Jan 06, 2022

There are currently 2 ways to create a set of sysfs files for a
kobj_type, through the default_attrs field, and the default_groups
field.  Move the amdgpu sysfs code to use default_groups field which has
been the preferred way since aa30f47c ("kobject: Add support for
default attribute groups to kobj_type") so that we can soon get rid of
the obsolete default_attrs field.

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Cc: John Clements <john.clements@amd.com>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Jonathan Kim <jonathan.kim@amd.com>
Cc: Kevin Wang <kevin1.wang@amd.com>
Cc: shaoyunl <shaoyun.liu@amd.com>
Cc: Tao Zhou <tao.zhou1@amd.com>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

7ff61cdc

14 Dec, 2021 1 commit

drm/amdgpu: check df_funcs and its callback pointers · cace4bff

Authored by Hawking Zhang on Nov 25, 2021

in case they are not available in the early phase

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

cace4bff

23 Nov, 2021 1 commit

drm/amd/amdgpu: fix potential memleak · 7b833d68

Authored by Bernard Zhao on Nov 14, 2021

In function amdgpu_get_xgmi_hive, when kobject_init_and_add fails,
there is a potential memory leak if kobject_put is not called.

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Bernard Zhao <bernard@vivo.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

7b833d68

18 Nov, 2021 1 commit

drm/amd/amdgpu: fix potential memleak · 27dfaedc

Authored by Bernard Zhao on Nov 14, 2021

In function amdgpu_get_xgmi_hive, when kobject_init_and_add fails,
there is a potential memory leak if kobject_put is not called.

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Bernard Zhao <bernard@vivo.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

27dfaedc

06 Nov, 2021 1 commit

drm/amdgpu: correct xgmi ras error count reset · 7513c9ff

Authored by Tao Zhou on Nov 04, 2021

The error count reset for xgmi3x16 pcs was missing.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

7513c9ff

27 Aug, 2021 1 commit

drm/amdgpu: Add support for RAS XGMI err query · 3c4ff2dc

Authored by John Clements on Aug 26, 2021

Update XGMI RAS to support error query on aldebaran

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

3c4ff2dc

25 Aug, 2021 1 commit

drm/amdgpu: Update RAS XGMI Error Query · f24d991b

Authored by John Clements on Aug 24, 2021

Resolve bug querying error on unsupported ASIC

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

f24d991b

19 Aug, 2021 1 commit

drm/amdgpu: get extended xgmi topology data · 44357a1b

Authored by Jonathan Kim on Aug 03, 2021

The TA has a limit to the amount of data that can be retrieved from
GET_TOPOLOGY.  For setups that exceed this limit, the xGMI topology
needs to be re-initialized and data needs to be re-fetched from the
extended link records by setting a flag in the shared command buffer.

The number of hops and the number of links must be accumulated by the
driver. Other data points are all fetched from the first request.
Because the TA has already exceeded its link record limit, it
cannot hold bidirectional information.  Otherwise the driver would
have to do more than two fetches so the driver has to reflect the
topology information in the opposite direction.

v2: squashed with internal reviewed fix
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Hawking Zhang <hawking.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

44357a1b

17 Aug, 2021 1 commit

drm/amd/amdgpu: remove unnecessary RAS context field · 893cf382

Authored by Candice Li on Aug 13, 2021

Delete ras_if->name in the RAS ctx structure and remove related lines.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

893cf382

23 Jul, 2021 1 commit

drm/amdkfd: report xgmi bandwidth between direct peers to the kfd · 3f46c4e9

Authored by Jonathan Kim on May 12, 2021

Report the min/max bandwidth in megabytes to the kfd for direct
xgmi connections only.  Indirect peers will report 0 since
the indirect route is unknown.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
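
A minimal sketch of the reporting rule above, under stated assumptions: a direct peer is taken to be one hop away, the minimum is taken to be the bandwidth of a single link and the maximum that of all links combined, and the per-link figure is supplied by the caller. None of these details come from the commit message, and `xgmi_bandwidth_mbytes()` is a hypothetical helper.

```c
#include <stdint.h>

/*
 * Returns the bandwidth in megabytes between two peers, or 0 when they are
 * not directly connected (the indirect route is unknown).
 *
 * Assumptions for illustration: num_hops == 1 means "direct peer",
 * want_min selects one link, otherwise all links are counted.
 */
static uint32_t xgmi_bandwidth_mbytes(unsigned int num_hops,
                                      unsigned int num_links,
                                      uint32_t per_link_mbytes,
                                      int want_min)
{
    if (num_hops != 1 || num_links == 0)
        return 0;

    return (want_min ? 1u : num_links) * per_link_mbytes;
}
```

For example, with four links of 50000 megabytes each, a direct peer would report 50000 as the minimum and 200000 as the maximum, while a peer two hops away would report 0 for both.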

3f46c4e9

10 Apr, 2021 3 commits

drm/amdgpu: move xgmi ras functions to xgmi_ras_funcs · 52137ca8

Authored by Hawking Zhang on Mar 18, 2021

xgmi ras is not managed by the gpu driver when the gpu is
connected to the cpu through xgmi. Move all xgmi ras
functions to xgmi_ras_funcs so the gpu driver only
initializes xgmi ras functions when it manages
xgmi ras.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: John Clements <John.Clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

52137ca8

drm/amdgpu: Convert sysfs sprintf/snprintf family to sysfs_emit · 36000c7a

Authored by Tian Tao on Mar 24, 2021

Fix the following coccicheck warning:
drivers/gpu//drm/amd/amdgpu/amdgpu_ras.c:434:9-17: WARNING:
use scnprintf or sprintf
drivers/gpu//drm/amd/amdgpu/amdgpu_xgmi.c:220:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu//drm/amd/amdgpu/amdgpu_xgmi.c:249:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu//drm/amd/amdgpu/df_v3_6.c:208:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu//drm/amd/amdgpu/amdgpu_psp.c:2973:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu//drm/amd/amdgpu/amdgpu_vram_mgr.c:75:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu//drm/amd/amdgpu/amdgpu_vram_mgr.c:112:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu//drm/amd/amdgpu/amdgpu_vram_mgr.c:58:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu//drm/amd/amdgpu/amdgpu_vram_mgr.c:93:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu//drm/amd/amdgpu/amdgpu_vram_mgr.c:125:9-17: WARNING:
use scnprintf or sprintf
drivers/gpu//drm/amd/amdgpu/amdgpu_gtt_mgr.c:52:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu//drm/amd/amdgpu/amdgpu_gtt_mgr.c:71:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu//drm/amd/amdgpu/amdgpu_device.c:140:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu//drm/amd/amdgpu/amdgpu_device.c:164:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu//drm/amd/amdgpu/amdgpu_device.c:186:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu//drm/amd/amdgpu/amdgpu_device.c:208:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu//drm/amd/amdgpu/amdgpu_atombios.c:1916:8-16: WARNING:
use scnprintf or sprintf
Signed-off-by: Tian Tao <tiantao6@hisilicon.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

36000c7a

drm/amd/pm: label these APIs used internally as static · c6ce68e6

Authored by Evan Quan on Mar 19, 2021

Also drop unnecessary header file and declarations.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

c6ce68e6

24 Mar, 2021 2 commits

drm/amdgpu: Reset the devices in the XGMI hive duirng probe · e3c1b071

Authored by shaoyunl on Feb 16, 2021

In a passthrough configuration, the hypervisor triggers the SBR (Secondary Bus Reset) on the
devices without synchronizing them with each other. This can cause a device hang, because in an
XGMI configuration all the devices within the hive need to be reset within a limited time slot.
This series of patches tries to solve the issue by cooperating with a new SMU that does only
minimal housekeeping in response to the SBR request, does not perform the real reset, and leaves
that to the driver. The driver needs to do the whole sw init and a minimal HW init to bring up
the SMU and trigger the reset (possibly BACO) on all the ASICs at the same time.

Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

e3c1b071

drm/amdgpu: mask the xgmi number of hops reported from psp to kfd · 4ac5617c

Authored by Jonathan Kim on Jan 27, 2021

The psp supplies the link type in the upper 2 bits of the psp xgmi node
information num_hops field. With a new link type, Aldebaran has these
bits set to a non-zero value (1 = xGMI3) so the KFD topology will report
the incorrect IO link weights without proper masking.
The actual number of hops is located in the 3 least significant bits of
this field, so mask it off accordingly before passing it to the KFD.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Amber Lin <amber.lin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
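
The masking described above comes down to two bit operations. The sketch below assumes the num_hops field is 8 bits wide, so that the "upper 2 bits" are bits 7..6; the macro and function names are hypothetical and not taken from the driver.

```c
#include <stdint.h>

/* Assumed layout of an 8-bit num_hops field (names are hypothetical):
 *   bits 7..6  link type (for example 1 = xGMI3)
 *   bits 2..0  actual number of hops
 */
#define XGMI_NUM_HOPS_MASK   0x07u
#define XGMI_LINK_TYPE_SHIFT 6
#define XGMI_LINK_TYPE_MASK  0x03u

/* Hop count with the link type bits masked off. */
static inline uint8_t xgmi_hops(uint8_t raw)
{
    return raw & XGMI_NUM_HOPS_MASK;
}

/* Link type taken from the upper two bits. */
static inline uint8_t xgmi_link_type(uint8_t raw)
{
    return (raw >> XGMI_LINK_TYPE_SHIFT) & XGMI_LINK_TYPE_MASK;
}
```

Under that layout, a raw value of 0x42 decodes to link type 1 and 2 hops. Without the mask, the same value would be read as 66 hops, which is how the link weights ended up wrong.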

4ac5617c

10 Feb, 2021 1 commit

drm/amdgpu: optimize list operation in amdgpu_xgmi · be8901c2

Authored by Kevin Wang on Feb 03, 2021

Simplify the list operation.

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

be8901c2

13 Nov, 2020 1 commit

drm/amdgpu: check hive pointer before access · a9f5f98f

Authored by Hawking Zhang on Sep 05, 2020

in case it is an invalid one

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

a9f5f98f

10 Oct, 2020 1 commit

drm/amdgpu: Fix inconsistent of format with argument type in amdgpu_xgmi.c · 73e34336

Authored by Ye Bin on Oct 09, 2020

Fix the following warning:
[drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c:249]: (warning) %d in format
string (no. 1) requires 'int' but the argument type is 'unsigned int'.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Ye Bin <yebin10@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
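
The class of warning quoted above is resolved by matching the conversion specifier to the argument type. A minimal userspace sketch follows; `format_count()` is a hypothetical helper and does not reproduce the line the warning refers to.

```c
#include <stdio.h>

/* Formats an unsigned counter. %u matches 'unsigned int'; using %d here would
 * trigger the format/argument mismatch warning and misprint large values. */
static int format_count(char *buf, size_t len, unsigned int count)
{
    return snprintf(buf, len, "%u\n", count);
}
```

The difference becomes visible for values above INT_MAX, which `%d` would typically print as negative numbers.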

73e34336

25 Aug, 2020 1 commit

drm/amdgpu: Get DRM dev from adev by inline-f · 4a580877

Authored by Luben Tuikov on Aug 24, 2020

Add a static inline adev_to_drm() to obtain
the DRM device pointer from an amdgpu_device pointer.

Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

4a580877

openeuler / Kernel: synced successfully about 2 years ago
