- 03 3月, 2022 1 次提交
-
-
由 yipechai 提交于
Modify .ras_fini function pointer parameter so that we can remove redundant intermediate calls in some ras blocks. Signed-off-by: Nyipechai <YiPeng.Chai@amd.com> Reviewed-by: NTao Zhou <tao.zhou1@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 18 2月, 2022 1 次提交
-
-
由 yipechai 提交于
Modify .ras_late_init function pointer parameter so that it can remove redundant intermediate calls in some ras blocks. Signed-off-by: Nyipechai <YiPeng.Chai@amd.com> Reviewed-by: NTao Zhou <tao.zhou1@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 15 1月, 2022 2 次提交
-
-
由 yipechai 提交于
1.Modify gfx block to fit for the unified ras block data and ops. 2.Change amdgpu_gfx_ras_funcs to amdgpu_gfx_ras, and the corresponding variable name remove _funcs suffix. 3.Remove the const flag of gfx ras variable so that gfx ras block can be able to be inserted into amdgpu device ras block link list. 4.Invoke amdgpu_ras_register_ras_block function to register gfx ras block into amdgpu device ras block link list. 5.Remove the redundant code about gfx in amdgpu_ras.c after using the unified ras block. 6.Fill unified ras block .name .block .ras_late_init and .ras_fini for all of gfx versions. If .ras_late_init and .ras_fini had been defined by the selected gfx version, the defined functions will take effect; if not defined, default fill with amdgpu_gfx_ras_late_init and amdgpu_gfx_ras_fini. Signed-off-by: Nyipechai <YiPeng.Chai@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: NJohn Clements <john.clements@amd.com> Reviewed-by: NTao Zhou <tao.zhou1@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Evan Quan 提交于
Those implementation details(whether swsmu supported, some ppt_funcs supported, accessing internal statistics ...)should be kept internally. It's not a good practice and even error prone to expose implementation details. Signed-off-by: NEvan Quan <evan.quan@amd.com> Reviewed-by: NLijo Lazar <lijo.lazar@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 02 9月, 2021 1 次提交
-
-
由 Nirmoy Das 提交于
Currently AMDGPU_RING_PRIO_MAX is redefinition of a max gfx hwip priority, this won't work well when we will have a hwip with different set of priorities than gfx. Also, HW ring priorities are different from ring priorities. Create a global enum for ring priority levels which each HWIP can use to define its own priority levels. Signed-off-by: NNirmoy Das <nirmoy.das@amd.com> Reviewed-by: NLijo Lazar <lijo.lazar@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 10 4月, 2021 1 次提交
-
-
由 Hawking Zhang 提交于
gfx ras is only available in cerntain ip generations. Signed-off-by: NHawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: NDennis Li <Dennis.Li@amd.com> Reviewed-by: NJohn Clements <John.Clements@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 24 3月, 2021 3 次提交
-
-
由 Dennis Li 提交于
When connected to a host via xGMI, system fatal errors may trigger warm reset, driver has no change to query edc status before reset. Therefore in this case, driver should harvest previous error loging registers during boot, instead of only resetting them. v2: 1. IP's ras_manager object is created when its ras feature is enabled, so change to query edc status after amdgpu_ras_late_init called 2. change to enable watchdog timer after finishing gfx edc init Signed-off-by: NDennis Li <Dennis.Li@amd.com> Reivewed-by: NHawking Zhang <hawking.zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Dennis Li 提交于
SQ's watchdog timer monitors forward progress, a mask of which waves caused the watchdog timeout is recorded into ras status registers and then trigger a system fatal error event. v2: 1. change *query_timeout_status to *query_sq_timeout_status. 2. move query_sq_timeout_status into amdgpu_ras_do_recovery. 3. add module parameters to enable/disable fatal error event and modify the watchdog timer. v3: 1. remove unused parameters of *enable_watchdog_timer Signed-off-by: NDennis Li <Dennis.Li@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Dennis Li 提交于
add edc counter/status reset and query functions for gfx block of aldebaran. v2: change to clear edc counter explicitly aldebaran hardware will not clear edc counter after driver reading them, so driver should clear them explicitly. Signed-off-by: NDennis Li <Dennis.Li@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 10 2月, 2021 1 次提交
-
-
由 Nirmoy Das 提交于
For high priority compute to work properly we need to enable wave limiting on gfx pipe. Wave limiting is done through writing into mmSPI_WCL_PIPE_PERCENT_GFX register. Enable only one high priority compute queue to avoid race condition between multiple high priority compute queues writing that register simultaneously. Signed-off-by: NNirmoy Das <nirmoy.das@amd.com> Acked-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 14 11月, 2020 1 次提交
-
-
由 Prike Liang 提交于
The new amdgpu_gfx_state_change_set() funtion can support set GFX power change status to D0/D3. v2: squash in warning fix (Alex) Signed-off-by: NPrike Liang <Prike.Liang@amd.com> Acked-by: NHuang Rui <ray.huang@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 13 11月, 2020 1 次提交
-
-
由 Nirmoy Das 提交于
Compute queues are configurable with module param, num_kcq. amdgpu_gfx_is_high_priority_compute_queue was setting 1st 4 queues to high priority queue leaving a null drm scheduler in adev->gpu_sched[hw_ip]["normal_prio"].sched if num_kcq < 5. This patch tries to fix it by alternating compute queue priority between normal and high priority. Fixes: 33abcb1f (drm/amdgpu: set compute queue priority at mqd_init) Signed-off-by: NNirmoy Das <nirmoy.das@amd.com> Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 17 10月, 2020 1 次提交
-
-
由 Alex Deucher 提交于
Add a helper so we can set per asic default values. Also, the module parameter is currently clamped to 8, but clamp it per asic just in case some asics have different limits in the future. Enable the option on gfx6,7 as well for consistency. Acked-by: NNirmoy Das <nirmoy.das@amd.com> Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 16 10月, 2020 1 次提交
-
-
由 Evan Quan 提交于
Enable Navi1X MGCG perfmon setting. Signed-off-by: NEvan Quan <evan.quan@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 23 9月, 2020 1 次提交
-
-
由 Stanley.Yang 提交于
GCEA/MMHUB EA error should not result to DF freeze, this is fixed in next generation, but for some reasons the GCEA/MMHUB EA error will result to DF freeze in previous generation, diver should avoid to indicate GCEA/MMHUB EA error as hw fatal error in kernel message by read GCEA/MMHUB err status registers. Changed from V1: make query_ras_error_status function more general make read mmhub er status register more friendly Changed from V2: move ras error status query function into do_recovery workqueue Changed from V3: remove useless code from V2, print GCEA error status instance number Signed-off-by: NStanley.Yang <Stanley.Yang@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 15 8月, 2020 1 次提交
-
-
由 Tianci.Yin 提交于
On Navi1x, the SPM golden settings are lost after GFXOFF enter/exit, so reconfiguration is needed. Make the configuration code as an interface for future use. Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: NLuben Tuikov <luben.tuikov@amd.com> Reviewed-by: NFeifei Xu <Feifei.Xu@amd.com> Signed-off-by: NTianci.Yin <tianci.yin@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 22 7月, 2020 1 次提交
-
-
由 Jinzhou.Su 提交于
Add interface for SMU12 device, used by UMR. v2: fix code style Signed-off-by: NJinzhou.Su <Jinzhou.Su@amd.com> Reviewed-by: NEvan Quan <evan.quan@amd.com> Reviewed-by: NHuang Rui <ray.huang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 04 6月, 2020 1 次提交
-
-
由 Likun Gao 提交于
Add support for GC 10.3. v2: Squash in gb_addr_config fix (Alex) v3: Add num_pkrs support (Alex) Signed-off-by: NLikun Gao <Likun.Gao@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 02 5月, 2020 1 次提交
-
-
由 Yong Zhao 提交于
Rename it to amdgpu_queue_mask_bit_to_set_resource_bit() to be more specific about its functionality. KFD will use it later. Signed-off-by: NYong Zhao <Yong.Zhao@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 24 4月, 2020 1 次提交
-
-
由 Yintian Tao 提交于
According to the current kiq read register method, there will be race condition when using KIQ to read register if multiple clients want to read at same time just like the expample below: 1. client-A start to read REG-0 throguh KIQ 2. client-A poll the seqno-0 3. client-B start to read REG-1 through KIQ 4. client-B poll the seqno-1 5. the kiq complete these two read operation 6. client-A to read the register at the wb buffer and get REG-1 value Therefore, use amdgpu_device_wb_get() to request reg_val_offs for each kiq read register. v2: fix the error remove v3: fix the print typo v4: remove unused variables Signed-off-by: NYintian Tao <yttao@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 09 4月, 2020 1 次提交
-
-
由 Nirmoy Das 提交于
Generate HW IP's sched_list in amdgpu_ring_init() instead of amdgpu_ctx.c. This makes amdgpu_ctx_init_compute_sched(), ring.has_high_prio and amdgpu_ctx_init_sched() unnecessary. This patch also stores sched_list for all HW IPs in one big array in struct amdgpu_device which makes amdgpu_ctx_init_entity() much more leaner. v2: fix a coding style issue do not use drm hw_ip const to populate amdgpu_ring_type enum v3: remove ctx reference and move sched array and num_sched to a struct use num_scheds to detect uninitialized scheduler list v4: use array_index_nospec for user space controlled variables fix possible checkpatch.pl warnings Signed-off-by: NNirmoy Das <nirmoy.das@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 10 3月, 2020 1 次提交
-
-
由 Nirmoy Das 提交于
We were changing compute ring priority while rings were being used before every job submission which is not recommended. This patch sets compute queue priority at mqd initialization for gfx8, gfx9 and gfx10. Policy: make queue 0 of each pipe as high priority compute queue High/normal priority compute sched lists are generated from set of high/normal priority compute queues. At context creation, entity of compute queue get a sched list from high or normal priority depending on ctx->priority Signed-off-by: NNirmoy Das <nirmoy.das@amd.com> Acked-by: NAlex Deucher <alexander.deucher@amd.com> Acked-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 05 3月, 2020 1 次提交
-
-
由 Hawking Zhang 提交于
GFX ras error counters are dirty ones after cold reboot Read operation is needed to reset them to 0 Signed-off-by: NHawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Reviewed-by: NTao Zhou <tao.zhou1@amd.com> Reviewed-by: NGuchun Chen <guchun.chen@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 29 2月, 2020 1 次提交
-
-
由 Yong Zhao 提交于
The two members will be used by KFD later. Signed-off-by: NYong Zhao <Yong.Zhao@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 23 1月, 2020 1 次提交
-
-
由 chen gong 提交于
Move amdgpu_virt_kiq_rreg/amdgpu_virt_kiq_wreg function to amdgpu_gfx.c, and rename them to amdgpu_kiq_rreg/amdgpu_kiq_wreg.Make it generic and flexible. Signed-off-by: Nchen gong <curry.gong@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Reviewed-by: NHuang Rui <ray.huang@amd.com> Acked-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 17 1月, 2020 1 次提交
-
-
由 Alex Sierra 提交于
tlbs invalidate pointer function added to kiq_pm4_funcs struct. This way, tlb flush can be done through kiq member. TLBs invalidatation implemented for gfx9 and gfx10. Signed-off-by: NAlex Sierra <alex.sierra@amd.com> Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 19 12月, 2019 1 次提交
-
-
由 Nirmoy Das 提交于
This sched array can be passed on to entity creation routine instead of manually creating such sched array on every context creation. v2: squash in missing break fix Signed-off-by: NNirmoy Das <nirmoy.das@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 23 11月, 2019 1 次提交
-
-
由 Xiaojie Yuan 提交于
1. no need to allocate an extra member for 'mqd_backup' array 2. backup/restore mqd to/from the correct 'mqd_backup' array slot v2: warning fix (Alex) Signed-off-by: NXiaojie Yuan <xiaojie.yuan@amd.com> Acked-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 19 11月, 2019 1 次提交
-
-
由 Xiaojie Yuan 提交于
1. no need to allocate an extra member for 'mqd_backup' array 2. backup/restore mqd to/from the correct 'mqd_backup' array slot v2: warning fix (Alex) Signed-off-by: NXiaojie Yuan <xiaojie.yuan@amd.com> Acked-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 07 11月, 2019 2 次提交
-
-
由 changzhu 提交于
The GRBM register interface is now capable of bursting 1 cycle per register wr->wr, wr->rd much faster than previous muticycle per transaction done interface. This has caused a problem where status registers requiring HW to update have a 1 cycle delay, due to the register update having to go through GRBM. For cp ucode, it has realized dummy read in cp firmware.It covers the use of WAIT_REG_MEM operation 1 case only.So it needs to call gfx_v10_0_wait_reg_mem in gfx10. Besides it also needs to add warning to update firmware in case firmware is too old to have function to realize dummy read in cp firmware. For sdma ucode, it hasn't realized dummy read in sdma firmware. sdma is moved to gfxhub in gfx10. So it needs to add dummy read in driver between amdgpu_ring_emit_wreg and amdgpu_ring_emit_reg_wait for sdma_v5_0. Signed-off-by: Nchangzhu <Changfeng.Zhu@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 changzhu 提交于
The GRBM register interface is now capable of bursting 1 cycle per register wr->wr, wr->rd much faster than previous muticycle per transaction done interface. This has caused a problem where status registers requiring HW to update have a 1 cycle delay, due to the register update having to go through GRBM. For cp ucode, it has realized dummy read in cp firmware.It covers the use of WAIT_REG_MEM operation 1 case only.So it needs to call gfx_v10_0_wait_reg_mem in gfx10. Besides it also needs to add warning to update firmware in case firmware is too old to have function to realize dummy read in cp firmware. For sdma ucode, it hasn't realized dummy read in sdma firmware. sdma is moved to gfxhub in gfx10. So it needs to add dummy read in driver between amdgpu_ring_emit_wreg and amdgpu_ring_emit_reg_wait for sdma_v5_0. Signed-off-by: Nchangzhu <Changfeng.Zhu@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 26 10月, 2019 1 次提交
-
-
由 Nirmoy Das 提交于
Signed-off-by: NNirmoy Das <nirmoy.das@amd.com> Acked-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 03 10月, 2019 6 次提交
-
-
由 Marek Olšák 提交于
UMDs need this for correct programming of harvested chips. Signed-off-by: NMarek Olšák <marek.olsak@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Tao Zhou 提交于
gfx_ras_late_init can get the info by itself Signed-off-by: NTao Zhou <tao.zhou1@amd.com> Reviewed-by: NGuchun Chen <guchun.chen@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Tao Zhou 提交于
gfx_ras_fini can be shared among all generations of gfx Signed-off-by: NTao Zhou <tao.zhou1@amd.com> Reviewed-by: NGuchun Chen <guchun.chen@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Tao Zhou 提交于
gfx ras ecc common functions could be reused among all gfx generations Signed-off-by: NTao Zhou <tao.zhou1@amd.com> Reviewed-by: NGuchun Chen <guchun.chen@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Marek Olšák 提交于
Never used. Signed-off-by: NMarek Olšák <marek.olsak@amd.com> Acked-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Marek Olšák 提交于
UMDs need this for correct programming of harvested chips. Signed-off-by: NMarek Olšák <marek.olsak@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 14 9月, 2019 1 次提交
-
-
由 Hawking Zhang 提交于
amdgpu_gfx_ras_late_init is used to init gfx specfic ras debugfs/sysfs node and gfx specific interrupt handler. It can be shared among gfx generations Signed-off-by: NHawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: NTao Zhou <tao.zhou1@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 01 8月, 2019 1 次提交
-
-
由 Dennis Li 提交于
Add functions for RAS error inject and query error counter Signed-off-by: NDennis Li <Dennis.Li@amd.com> Reviewed-by: NTao Zhou <tao.zhou1@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-