- 10 2月, 2022 3 次提交
-
-
由 Andrey Grodzovsky 提交于
Use reset domain wq also for non TDR gpu recovery trigers such as sysfs and RAS. We must serialize all possible GPU recoveries to gurantee no concurrency there. For TDR call the original recovery function directly since it's already executed from within the wq. For others just use a wrapper to qeueue work and wait on it to finish. v2: Rename to amdgpu_recover_work_struct Signed-off-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Link: https://www.spinics.net/lists/amd-gfx/msg74113.html
-
由 Andrey Grodzovsky 提交于
Before we initialize schedulers we must know which reset domain are we in - for single device there iis a single domain per device and so single wq per device. For XGMI the reset domain spans the entire XGMI hive and so the reset wq is per hive. Signed-off-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Link: https://www.spinics.net/lists/amd-gfx/msg74112.html
-
由 Andrey Grodzovsky 提交于
Defined a reset_domain struct such that all the entities that go through reset together will be serialized one against another. Do it for both single device and XGMI hive cases. Signed-off-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Suggested-by: NDaniel Vetter <daniel.vetter@ffwll.ch> Suggested-by: NChristian König <ckoenig.leichtzumerken@gmail.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Link: https://www.spinics.net/lists/amd-gfx/msg74111.html
-
- 19 1月, 2022 1 次提交
-
-
由 Jingwen Chen 提交于
[Why] This fixes 892deb48 ("drm/amdgpu: Separate vf2pf work item init from virt data exchange"). we should read pf2vf data based at mman.fw_vram_usage_va after gmc sw_init. commit 892deb48 breaks this logic. [How] calling amdgpu_virt_exchange_data in amdgpu_virt_init_data_exchange to set the right base in the right sequence. v2: call amdgpu_virt_init_data_exchange after gmc sw_init to make data exchange workqueue run v3: clean up the code logic v4: add some comment and make the code more readable Fixes: 892deb48 ("drm/amdgpu: Separate vf2pf work item init from virt data exchange") Signed-off-by: NJingwen Chen <Jingwen.Chen2@amd.com> Reviewed-by: NHorace Chen <horace.chen@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 15 1月, 2022 2 次提交
-
-
由 Alex Deucher 提交于
Rather than opting into GPU recovery support, default to on, and opt out if it's not working on a particular GPU. This avoids the need to add new asics to this list since this is a core feature. Reviewed-by: NEvan Quan <evan.quan@amd.com> Reviewed-by: NGuchun Chen <guchun.chen@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
Add yellow carp to devices which support recovery Signed-off-by: NCHANDAN VURDIGERE NATARAJ <chandan.vurdigerenataraj@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 12 1月, 2022 5 次提交
-
-
由 Prike Liang 提交于
In some APU project we needn't always assign flags to identify each other, so we may not need return an error. Signed-off-by: NPrike Liang <Prike.Liang@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Reviewed-by: NHuang Rui <ray.huang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Tom St Denis 提交于
The function amdgpu_mm_wreg_mmio_rlc() is used by debugfs to write to MMIO registers. It didn't support registers beyond the BAR mapped MMIO space. This adds pcie indirect write support. Signed-off-by: NTom St Denis <tom.stdenis@amd.com> Acked-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Nirmoy Das 提交于
Get rid off pin/unpin of gart BO at resume/suspend and instead pin only once and try to recover gart content at resume time. This is much more stable in case there is OOM situation at 2nd call to amdgpu_device_evict_resources() while evicting GART table. v3: remove gart recovery from other places v2: pin gart at amdgpu_gart_table_vram_alloc() Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NNirmoy Das <nirmoy.das@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Nirmoy Das 提交于
Do not allow exported amdgpu_gtt_mgr_*() to accept any ttm_resource_manager pointer. Also there is no need to force other module to call a ttm function just to eventually call gtt_mgr functions. v4: remove unused adev. v3: upcast mgr from ttm resopurce manager instead of getting it from adev. v2: pass adev's gtt_mgr instead of adev. Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NNirmoy Das <nirmoy.das@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Leslie Shi 提交于
Patch: 3efb17ae7e92 ("drm/amdgpu: Call amdgpu_device_unmap_mmio() if device is unplugged to prevent crash in GPU initialization failure") makes call to amdgpu_device_unmap_mmio() conditioned on device unplugged. This patch unmaps MMIO mappings even when device is not unplugged. v2: Add condition of drm_dev_enter() to deleted unmaps in patch "drm/amdgpu: Unmap all MMIO mappings" Signed-off-by: NLeslie Shi <Yuliang.Shi@amd.com> Reviewed-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 08 1月, 2022 1 次提交
-
-
由 Mario Limonciello 提交于
This codepath should be running in both s0ix and s3, but only does currently because s3 and s0ix are both set in the s0ix case. Signed-off-by: NMario Limonciello <mario.limonciello@amd.com> Acked-by: NEvan Quan <evan.quan@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 30 12月, 2021 3 次提交
-
-
由 Alex Deucher 提交于
Chips with no display hardware should return false for DC support. v2: drop Arcturus and Aldebaran Fixes: f7f12b25 ("drm/amdgpu: default to true in amdgpu_device_asic_has_dc_support") Reviewed-by: NEvan Quan <evan.quan@amd.com> Reviewed-by: NGuchun Chen <guchun.chen@amd.com> Reported-by: NTareque Md.Hanif <tarequemd.hanif@yahoo.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Surbhi Kakarya 提交于
If the event guard is enabled and VF doesn't receive an ack from PF for full access, the guest driver load crashes. This is caused due to the call to ttm_device_clear_dma_mappings with non-initialized mman during driver tear down. This patch adds the necessary condition to check if the mman initialization passed or not and takes the path based on the condition output. Signed-off-by: NSurbhi Kakarya <Surbhi.Kakarya@amd.com> Acked-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Leslie Shi 提交于
drm/amdgpu: Call amdgpu_device_unmap_mmio() if device is unplugged to prevent crash in GPU initialization failure [Why] In amdgpu_driver_load_kms, when amdgpu_device_init returns error during driver modprobe, it will start the error handle path immediately and call into amdgpu_device_unmap_mmio as well to release mapped VRAM. However, in the following release callback, driver stills visits the unmapped memory like vcn.inst[i].fw_shared_cpu_addr in vcn_v3_0_sw_fini. So a kernel crash occurs. [How] call amdgpu_device_unmap_mmio() if device is unplugged to prevent invalid memory address in vcn_v3_0_sw_fini() when GPU initialization failure. Signed-off-by: NLeslie Shi <Yuliang.Shi@amd.com> Reviewed-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: NGuchun Chen <guchun.chen@amd.com> Acked-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 29 12月, 2021 2 次提交
-
-
由 sashank saye 提交于
For Aldebaran chip passthrough case we need to intimate SMU about special handling for SBR.On older chips we send LightSBR to SMU, enabling the same for Aldebaran. Slight difference, compared to previous chips, is on Aldebaran, SMU would do a heavy reset on SBR. Hence, the word Heavy instead of Light SBR is used for SMU to differentiate. Reviewed by: Shaoyun.liu <Shaoyun.liu@amd.com> Signed-off-by: Nsashank saye <sashank.saye@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Victor Skvortsov 提交于
Driver needs to call get_xgmi_info() before ip_init to determine whether it needs to handle a pending hive reset. Signed-off-by: NVictor Skvortsov <victor.skvortsov@amd.com> Reviewed-by: NDavid Nieto <david.nieto@amd.com> Reviewed by: shaoyun.liu <Shaoyun.lui@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 28 12月, 2021 1 次提交
-
-
由 Alex Deucher 提交于
Chips with no display hardware should return false for DC support. v2: drop Arcturus and Aldebaran Fixes: f7f12b25 ("drm/amdgpu: default to true in amdgpu_device_asic_has_dc_support") Reviewed-by: NEvan Quan <evan.quan@amd.com> Reviewed-by: NGuchun Chen <guchun.chen@amd.com> Reported-by: NTareque Md.Hanif <tarequemd.hanif@yahoo.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 18 12月, 2021 1 次提交
-
-
由 Huang Rui 提交于
The job embedded fence donesn't initialize the flags at dma_fence_init(). Then we will go a wrong way in amdgpu_fence_get_timeline_name callback and trigger a null pointer panic once we enabled the trace event here. So introduce new amdgpu_fence object to indicate the job embedded fence. [ 156.131790] BUG: kernel NULL pointer dereference, address: 00000000000002a0 [ 156.131804] #PF: supervisor read access in kernel mode [ 156.131811] #PF: error_code(0x0000) - not-present page [ 156.131817] PGD 0 P4D 0 [ 156.131824] Oops: 0000 [#1] PREEMPT SMP PTI [ 156.131832] CPU: 6 PID: 1404 Comm: sdma0 Tainted: G OE 5.16.0-rc1-custom #1 [ 156.131842] Hardware name: Gigabyte Technology Co., Ltd. Z170XP-SLI/Z170XP-SLI-CF, BIOS F20 11/04/2016 [ 156.131848] RIP: 0010:strlen+0x0/0x20 [ 156.131859] Code: 89 c0 c3 0f 1f 80 00 00 00 00 48 01 fe eb 0f 0f b6 07 38 d0 74 10 48 83 c7 01 84 c0 74 05 48 39 f7 75 ec 31 c0 c3 48 89 f8 c3 <80> 3f 00 74 10 48 89 f8 48 83 c0 01 80 38 00 75 f7 48 29 f8 c3 31 [ 156.131872] RSP: 0018:ffff9bd0018dbcf8 EFLAGS: 00010206 [ 156.131880] RAX: 00000000000002a0 RBX: ffff8d0305ef01b0 RCX: 000000000000000b [ 156.131888] RDX: ffff8d03772ab924 RSI: ffff8d0305ef01b0 RDI: 00000000000002a0 [ 156.131895] RBP: ffff9bd0018dbd60 R08: ffff8d03002094d0 R09: 0000000000000000 [ 156.131901] R10: 000000000000005e R11: 0000000000000065 R12: ffff8d03002094d0 [ 156.131907] R13: 000000000000001f R14: 0000000000070018 R15: 0000000000000007 [ 156.131914] FS: 0000000000000000(0000) GS:ffff8d062ed80000(0000) knlGS:0000000000000000 [ 156.131923] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 156.131929] CR2: 00000000000002a0 CR3: 000000001120a005 CR4: 00000000003706e0 [ 156.131937] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 156.131942] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 156.131949] Call Trace: [ 156.131953] <TASK> [ 156.131957] ? trace_event_raw_event_dma_fence+0xcc/0x200 [ 156.131973] ? ring_buffer_unlock_commit+0x23/0x130 [ 156.131982] dma_fence_init+0x92/0xb0 [ 156.131993] amdgpu_fence_emit+0x10d/0x2b0 [amdgpu] [ 156.132302] amdgpu_ib_schedule+0x2f9/0x580 [amdgpu] [ 156.132586] amdgpu_job_run+0xed/0x220 [amdgpu] v2: fix mismatch warning between the prototype and function name (Ray, kernel test robot) Signed-off-by: NHuang Rui <ray.huang@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 17 12月, 2021 2 次提交
-
-
由 Victor Skvortsov 提交于
We want to be able to call virt data exchange conditionally after gmc sw init to reserve bad pages as early as possible. Since this is a conditional call, we will need to call it again unconditionally later in the init sequence. Refactor the data exchange function so it can be called multiple times without re-initializing the work item. v2: Cleaned up the code. Kept the original call to init_exchange_data() inside early init to initialize the work item, afterwards call exchange_data() when needed. Signed-off-by: NVictor Skvortsov <victor.skvortsov@amd.com> Reviewed By: Shaoyun.liu <Shaoyun.liu@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Huang Rui 提交于
The job embedded fence donesn't initialize the flags at dma_fence_init(). Then we will go a wrong way in amdgpu_fence_get_timeline_name callback and trigger a null pointer panic once we enabled the trace event here. So introduce new amdgpu_fence object to indicate the job embedded fence. [ 156.131790] BUG: kernel NULL pointer dereference, address: 00000000000002a0 [ 156.131804] #PF: supervisor read access in kernel mode [ 156.131811] #PF: error_code(0x0000) - not-present page [ 156.131817] PGD 0 P4D 0 [ 156.131824] Oops: 0000 [#1] PREEMPT SMP PTI [ 156.131832] CPU: 6 PID: 1404 Comm: sdma0 Tainted: G OE 5.16.0-rc1-custom #1 [ 156.131842] Hardware name: Gigabyte Technology Co., Ltd. Z170XP-SLI/Z170XP-SLI-CF, BIOS F20 11/04/2016 [ 156.131848] RIP: 0010:strlen+0x0/0x20 [ 156.131859] Code: 89 c0 c3 0f 1f 80 00 00 00 00 48 01 fe eb 0f 0f b6 07 38 d0 74 10 48 83 c7 01 84 c0 74 05 48 39 f7 75 ec 31 c0 c3 48 89 f8 c3 <80> 3f 00 74 10 48 89 f8 48 83 c0 01 80 38 00 75 f7 48 29 f8 c3 31 [ 156.131872] RSP: 0018:ffff9bd0018dbcf8 EFLAGS: 00010206 [ 156.131880] RAX: 00000000000002a0 RBX: ffff8d0305ef01b0 RCX: 000000000000000b [ 156.131888] RDX: ffff8d03772ab924 RSI: ffff8d0305ef01b0 RDI: 00000000000002a0 [ 156.131895] RBP: ffff9bd0018dbd60 R08: ffff8d03002094d0 R09: 0000000000000000 [ 156.131901] R10: 000000000000005e R11: 0000000000000065 R12: ffff8d03002094d0 [ 156.131907] R13: 000000000000001f R14: 0000000000070018 R15: 0000000000000007 [ 156.131914] FS: 0000000000000000(0000) GS:ffff8d062ed80000(0000) knlGS:0000000000000000 [ 156.131923] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 156.131929] CR2: 00000000000002a0 CR3: 000000001120a005 CR4: 00000000003706e0 [ 156.131937] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 156.131942] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 156.131949] Call Trace: [ 156.131953] <TASK> [ 156.131957] ? trace_event_raw_event_dma_fence+0xcc/0x200 [ 156.131973] ? ring_buffer_unlock_commit+0x23/0x130 [ 156.131982] dma_fence_init+0x92/0xb0 [ 156.131993] amdgpu_fence_emit+0x10d/0x2b0 [amdgpu] [ 156.132302] amdgpu_ib_schedule+0x2f9/0x580 [amdgpu] [ 156.132586] amdgpu_job_run+0xed/0x220 [amdgpu] v2: fix mismatch warning between the prototype and function name (Ray, kernel test robot) Signed-off-by: NHuang Rui <ray.huang@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 15 12月, 2021 2 次提交
-
-
由 Yann Dirson 提交于
Those are not today pulled by the sphinx doc, but better be ready. Signed-off-by: NYann Dirson <ydirson@free.fr> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Guchun Chen 提交于
Updated for consistency when accessing drm_device from amdgpu driver. Signed-off-by: NGuchun Chen <guchun.chen@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 14 12月, 2021 6 次提交
-
-
由 Philip Yang 提交于
If host and amdgpu IOMMU is not enabled or IOMMU is pass through mode, set adev->ram_is_direct_mapped flag which will be used to optimize memory usage for multi GPU mappings. Signed-off-by: NPhilip Yang <Philip.Yang@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Lang Yu 提交于
It is useful to maintain error context when debugging SW/FW issues. Introduce amdgpu_device_halt() for this purpose. It will bring hardware to a kind of halt state, so that no one can touch it any more. Compare to a simple hang, the system will keep stable at least for SSH access. Then it should be trivial to inspect the hardware state and see what's going on. v2: - Set adev->no_hw_access earlier to avoid potential crashes.(Christian) Suggested-by: NChristian Koenig <christian.koenig@amd.com> Suggested-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: NLang Yu <lang.yu@amd.com> Reviewed-by: NChristian Koenig <christian.koenig@amd.co> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Lang Yu 提交于
We found some headaches on ASICs don't need that, so remove that for them. Suggested-by: NLijo Lazar <lijo.lazar@amd.com> Signed-off-by: NLang Yu <lang.yu@amd.com> Reviewed-by: NKevin Wang <kevinyang.wang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Isabella Basso 提交于
This fixes various warnings relating to erroneous docstring syntax, of which some are listed below: warning: Function parameter or member 'adev' not described in 'amdgpu_atomfirmware_ras_rom_addr' ... warning: expecting prototype for amdgpu_atpx_validate_functions(). Prototype was for amdgpu_atpx_validate() instead ... warning: Excess function parameter 'mem' description in 'amdgpu_preempt_mgr_new' ... warning: Cannot understand * @kfd_get_cu_occupancy - Collect number of waves in-flight on this device ... warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst Signed-off-by: NIsabella Basso <isabbasso@riseup.net> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Zhigang Luo 提交于
For SRIOV VF, the XGMI topology was not recovered after reset. This change added code to SRIOV VF reset function to update XGMI topology for SRIOV VF after reset. Signed-off-by: NZhigang Luo <zhigang.luo@amd.com> Reviewed-by: NShaoyun Liu <shaoyun.liu@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Zhigang Luo 提交于
On SRIOV, host driver can support FLR(function level reset) on individual VF within the hive which might bring the individual device back to normal without the necessary to execute the hive reset. If the FLR failed , host driver will trigger the hive reset, each guest VF will get reset notification before the real hive reset been executed. The VF device can handle the reset request individually in it's reset work handler. This change updated gpu recover sequence to skip reset other device in the same hive for SRIOV VF. Signed-off-by: NZhigang Luo <zhigang.luo@amd.com> Reviewed-by: NShaoyun Liu <shaoyun.liu@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 02 12月, 2021 5 次提交
-
-
由 shaoyunl 提交于
This change revert previous commits: 9f4f2c1a ("drm/amd/amdgpu: fix the kfd pre_reset sequence in sriov") 271fd38c ("drm/amdgpu: move kfd post_reset out of reset_sriov function") This change moves the amdgpu_amdkfd_pre_reset to an earlier place in amdgpu_device_reset_sriov, presumably to address the sequence issue that the first patch was originally meant to fix. Some register access(GRBM_GFX_CNTL) only be allowed on full access mode. Move kfd_pre_reset and kfd_post_reset back inside reset_sriov function. Fixes: 9f4f2c1a ("drm/amd/amdgpu: fix the kfd pre_reset sequence in sriov") Fixes: 271fd38c ("drm/amdgpu: move kfd post_reset out of reset_sriov function") Signed-off-by: Nshaoyunl <shaoyun.liu@amd.com> Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Flora Cui 提交于
since vkms support atomic KMS interface Signed-off-by: NFlora Cui <flora.cui@amd.com> Reviewed-by: NGuchun Chen <guchun.chen@amd.com> Acked-by: NAlex Deucher <aleander.deucher@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 shaoyunl 提交于
This change revert previous commits: 9f4f2c1a ("drm/amd/amdgpu: fix the kfd pre_reset sequence in sriov") 271fd38c ("drm/amdgpu: move kfd post_reset out of reset_sriov function") This change moves the amdgpu_amdkfd_pre_reset to an earlier place in amdgpu_device_reset_sriov, presumably to address the sequence issue that the first patch was originally meant to fix. Some register access(GRBM_GFX_CNTL) only be allowed on full access mode. Move kfd_pre_reset and kfd_post_reset back inside reset_sriov function. Fixes: 9f4f2c1a ("drm/amd/amdgpu: fix the kfd pre_reset sequence in sriov") Fixes: 271fd38c ("drm/amdgpu: move kfd post_reset out of reset_sriov function") Signed-off-by: Nshaoyunl <shaoyun.liu@amd.com> Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Stanley.Yang 提交于
v2: still need call ras_disable_all_featrures to handle ras initilization failure case. Function amdgpu_device_fini_hw is called before amdgpu_device_fini_sw, so ras ta will unload before send ras disable command, ras dsiable operation must before hw fini. Signed-off-by: NStanley.Yang <Stanley.Yang@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com> -
由 Flora Cui 提交于
since vkms support atomic KMS interface Signed-off-by: NFlora Cui <flora.cui@amd.com> Reviewed-by: NGuchun Chen <guchun.chen@amd.com> Acked-by: NAlex Deucher <aleander.deucher@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 25 11月, 2021 2 次提交
-
-
由 shaoyunl 提交于
Fixes: 9f4f2c1a ("drm/amd/amdgpu: fix the kfd pre_reset sequence in sriov") For sriov XGMI configuration, the host driver will handle the hive reset, so in guest side, the reset_sriov only be called once on one device. This will make kfd post_reset unblanced with kfd pre_reset since kfd pre_reset already been moved out of reset_sriov function. Move kfd post_reset out of reset_sriov function to make them balance . Signed-off-by: Nshaoyunl <shaoyun.liu@amd.com> Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 shaoyunl 提交于
Fixes: 9f4f2c1a ("drm/amd/amdgpu: fix the kfd pre_reset sequence in sriov") For sriov XGMI configuration, the host driver will handle the hive reset, so in guest side, the reset_sriov only be called once on one device. This will make kfd post_reset unblanced with kfd pre_reset since kfd pre_reset already been moved out of reset_sriov function. Move kfd post_reset out of reset_sriov function to make them balance . Signed-off-by: Nshaoyunl <shaoyun.liu@amd.com> Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 23 11月, 2021 1 次提交
-
-
由 Evan Quan 提交于
Just bail out if the target IP block is already in the desired powergate/ungate state. This can avoid some duplicate settings which sometimes may cause unexpected issues. Link: https://lore.kernel.org/all/YV81vidWQLWvATMM@zn.tnic/ Bug: https://bugzilla.kernel.org/show_bug.cgi?id=214921 Bug: https://bugzilla.kernel.org/show_bug.cgi?id=215025 Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1789Signed-off-by: NEvan Quan <evan.quan@amd.com> Tested-by: NBorislav Petkov <bp@suse.de> Reviewed-by: NLijo Lazar <lijo.lazar@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 18 11月, 2021 2 次提交
-
-
由 Evan Quan 提交于
Just bail out if the target IP block is already in the desired powergate/ungate state. This can avoid some duplicate settings which sometimes may cause unexpected issues. Link: https://lore.kernel.org/all/YV81vidWQLWvATMM@zn.tnic/ Bug: https://bugzilla.kernel.org/show_bug.cgi?id=214921 Bug: https://bugzilla.kernel.org/show_bug.cgi?id=215025 Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1789 Fixes: bf756fb8 ("drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend") Signed-off-by: NEvan Quan <evan.quan@amd.com> Tested-by: NBorislav Petkov <bp@suse.de> Reviewed-by: NLijo Lazar <lijo.lazar@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
-
由 Evan Quan 提交于
With the shadow buffer support from generic framebuffer emulation, it's possible now to have runpm kicked when no update for console. Signed-off-by: NEvan Quan <evan.quan@amd.com> Acked-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 10 11月, 2021 1 次提交
-
-
由 shaoyunl 提交于
The KFD pre_reset should be called before reset been executed, it will hold the lock to prevent other rocm process to sent the packlage to hiq during host execute the real reset on the HW Signed-off-by: Nshaoyunl <shaoyun.liu@amd.com> Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-