- 13 4月, 2019 1 次提交
-
-
由 wentalou 提交于
shadow was added into shadow_list by amdgpu_bo_create_shadow. meanwhile, shadow->tbo.mem was not fully configured. tbo.mem would be fully configured by amdgpu_vm_sdma_map_table until calling amdgpu_vm_clear_bo. If sriov TDR occurred between amdgpu_bo_create_shadow and amdgpu_vm_sdma_map_table, amdgpu_device_recover_vram would deal with shadow without tbo.mem.start. Signed-off-by: NWentao Lou <Wentao.Lou@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 04 4月, 2019 1 次提交
-
-
由 wentalou 提交于
amdgpu_bo_restore_shadow would assign zero to r if succeeded. r would remain zero if there is only one node in shadow_list. current code would always return failure when r <= 0. restart the timeout for each wait was a rather problematic bug as well. The value of tmo SHOULD be changed, otherwise we wait tmo jiffies on each loop. Signed-off-by: NWentao Lou <Wentao.Lou@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 28 3月, 2019 1 次提交
-
-
由 Chengming Gui 提交于
use pcie_bandwidth_available to get real link state to update pcie table. v2: fix incorrect initialized return value v3: expand the fetching method about the link width to all asics. Signed-off-by: NChengming Gui <Jack.Gui@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 08 2月, 2019 1 次提交
-
-
由 Harish Kasiviswanathan 提交于
The new Vega series GPU cards have in-built bridges. To get the pcie speed and width supported by the platform walk the hierarchy and get the slowest link. Signed-off-by: NHarish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Acked-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 26 1月, 2019 2 次提交
-
-
由 Andrey Grodzovsky 提交于
Decauple sched threads stop and start and ring mirror list handling from the policy of what to do about the guilty jobs. When stoppping the sched thread and detaching sched fences from non signaled HW fenes wait for all signaled HW fences to complete before rerunning the jobs. v2: Fix resubmission of guilty job into HW after refactoring. v4: Full restart for all the jobs, not only from guilty ring. Extract karma increase into standalone function. v5: Rework waiting for signaled jobs without relying on the job struct itself as those might already be freed for non 'guilty' job's schedulers. Expose karma increase to drivers. v6: Use list_for_each_entry_safe_continue and drm_sched_process_job in case fence already signaled. Call drm_sched_increase_karma only once for amdgpu and add documentation. v7: Wait only for the latest job's fence. Suggested-by: NChristian Koenig <Christian.Koenig@amd.com> Signed-off-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 wentalou 提交于
sriov would meet guest driver load failure, if calling amdgpu_asic_reset in amdgpu_device_init. sriov should skip asic_reset in device_init. Signed-off-by: NWentao Lou <Wentao.Lou@amd.com> Acked-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 24 1月, 2019 1 次提交
-
-
由 Daniel Vetter 提交于
Having the probe helper stuff (which pretty much everyone needs) in the drm_crtc_helper.h file (which atomic drivers should never need) is confusing. Split them out. To make sure I actually achieved the goal here I went through all drivers. And indeed, all atomic drivers are now free of drm_crtc_helper.h includes. v2: Make it compile. There was so much compile fail on arm drivers that I figured I'll better not include any of the acks on v1. v3: Massive rebase because i915 has lost a lot of drmP.h includes, but not all: Through drm_crtc_helper.h > drm_modeset_helper.h -> drmP.h there was still one, which this patch largely removes. Which means rolling out lots more includes all over. This will also conflict with ongoing drmP.h cleanup by others I expect. v3: Rebase on top of atomic bochs. v4: Review from Laurent for bridge/rcar/omap/shmob/core bits: - (re)move some of the added includes, use the better include files in other places (all suggested from Laurent adopted unchanged). - sort alphabetically v5: Actually try to sort them, and while at it, sort all the ones I touch. v6: Rebase onto i915 changes. v7: Rebase once more. Acked-by: NHarry Wentland <harry.wentland@amd.com> Acked-by: NSam Ravnborg <sam@ravnborg.org> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Acked-by: NRodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: NBenjamin Gaignard <benjamin.gaignard@linaro.org> Acked-by: NJani Nikula <jani.nikula@intel.com> Acked-by: NNeil Armstrong <narmstrong@baylibre.com> Acked-by: NOleksandr Andrushchenko <oleksandr_andrushchenko@epam.com> Acked-by: NCK Hu <ck.hu@mediatek.com> Acked-by: NAlex Deucher <alexander.deucher@amd.com> Acked-by: NSam Ravnborg <sam@ravnborg.org> Reviewed-by: NLaurent Pinchart <laurent.pinchart@ideasonboard.com> Acked-by: NLiviu Dudau <liviu.dudau@arm.com> Signed-off-by: NDaniel Vetter <daniel.vetter@intel.com> Cc: linux-arm-kernel@lists.infradead.org Cc: virtualization@lists.linux-foundation.org Cc: etnaviv@lists.freedesktop.org Cc: linux-samsung-soc@vger.kernel.org Cc: intel-gfx@lists.freedesktop.org Cc: linux-mediatek@lists.infradead.org Cc: linux-amlogic@lists.infradead.org Cc: linux-arm-msm@vger.kernel.org Cc: freedreno@lists.freedesktop.org Cc: nouveau@lists.freedesktop.org Cc: spice-devel@lists.freedesktop.org Cc: amd-gfx@lists.freedesktop.org Cc: linux-renesas-soc@vger.kernel.org Cc: linux-rockchip@lists.infradead.org Cc: linux-stm32@st-md-mailman.stormreply.com Cc: linux-tegra@vger.kernel.org Cc: xen-devel@lists.xen.org Link: https://patchwork.freedesktop.org/patch/msgid/20190117210334.13234-1-daniel.vetter@ffwll.ch
-
- 15 1月, 2019 6 次提交
-
-
由 Alex Deucher 提交于
To deal with situations like kexec or GPU VM passthrough where the device may have been used previously without a proper GPU reset between. v2: rebase bug: https://bugs.freedesktop.org/show_bug.cgi?id=108585 bug: https://bugs.freedesktop.org/show_bug.cgi?id=108754Reviewed-by: NEvan Quan <evan.quan@amd.com> Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Tom St Denis 提交于
v2: Move locks around in other functions so that this function can stand on its own. Also only hold the hive specific lock for add/remove device instead of the driver global lock so you can't add/remove devices in parallel from one hive. v3: add reset_lock Acked-by: Shaoyun.liu < Shaoyun.liu@amd.com> Signed-off-by: NTom St Denis <tom.stdenis@amd.com> Reviewed-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Emily Deng 提交于
When init fail, send rel init, req_fini and rel_fini to host for the finishing routine. Signed-off-by: NEmily Deng <Emily.Deng@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 wentalou 提交于
distinguish ip_reinit_early_sriov and ip_reinit_late_sriov by different log RE-INIT-early and RE-INIT-late Signed-off-by: NWentao Lou <Wentao.Lou@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Emily Deng 提交于
The pfvf exchange need be in exclusive mode. And add pfvf exchange in gpu reset. Signed-off-by: NEmily Deng <Emily.Deng@amd.com> Reviewed-By: NXiangliang Yu <Xiangliang.Yu@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Emily Deng 提交于
For virtual display feature, no need to pin cursor bo. Signed-off-by: NEmily Deng <Emily.Deng@amd.com> Reviewed-by: NHuang Rui <ray.huang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 12 1月, 2019 1 次提交
-
-
由 Daniel Vetter 提交于
It's not a core function, and the matching atomic functions are also not in the core. Plus the suspend/resume helper is also already there. Needs a tiny bit of open-coding, but less midlayer beats that I think. v2: Rebase onto ast (which gained a new user). Cc: Sam Bobroff <sbobroff@linux.ibm.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Reviewed-by: NSean Paul <sean@poorly.run> Signed-off-by: NDaniel Vetter <daniel.vetter@intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Maxime Ripard <maxime.ripard@bootlin.com> Cc: Sean Paul <sean@poorly.run> Cc: David Airlie <airlied@linux.ie> Cc: Ben Skeggs <bskeggs@redhat.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: "David (ChunMing) Zhou" <David1.Zhou@amd.com> Cc: Rex Zhu <Rex.Zhu@amd.com> Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Cc: Huang Rui <ray.huang@amd.com> Cc: Shaoyun Liu <Shaoyun.Liu@amd.com> Cc: Monk Liu <Monk.Liu@amd.com> Cc: nouveau@lists.freedesktop.org Cc: amd-gfx@lists.freedesktop.org Link: https://patchwork.freedesktop.org/patch/msgid/20181217194303.14397-4-daniel.vetter@ffwll.ch
-
- 03 1月, 2019 2 次提交
-
-
由 Emily Deng 提交于
The pfvf exchange need be in exclusive mode. And add pfvf exchange in gpu reset. Signed-off-by: NEmily Deng <Emily.Deng@amd.com> Reviewed-By: NXiangliang Yu <Xiangliang.Yu@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Emily Deng 提交于
For virtual display feature, no need to pin cursor bo. Signed-off-by: NEmily Deng <Emily.Deng@amd.com> Reviewed-by: NHuang Rui <ray.huang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 15 12月, 2018 1 次提交
-
-
由 wentalou 提交于
XGMI hive put kfd_pre_reset into amdgpu_device_lock_adev, but outside req_full_gpu of sriov. It would make sriov hang during reset. Signed-off-by: NWentao Lou <Wentao.Lou@amd.com> Reviewed-by: NShaoyun Liu <Shaoyun.Liu@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 13 12月, 2018 1 次提交
-
-
由 Andrey Grodzovsky 提交于
I retested Bonaire (gfx7 dGPU) and it works fine. Signed-off-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 06 12月, 2018 1 次提交
-
-
由 Alex Deucher 提交于
SI does not use doorbells, move asic doorbell init later asic check. Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=108920Reviewed-by: NOak Zeng <Oak.Zeng@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 04 12月, 2018 2 次提交
-
-
由 Andrey Grodzovsky 提交于
Use per hive wq to concurrently send reset commands to all nodes in the hive. v2: Switch to system_highpri_wq after dropping dedicated queue. Fix non XGMI code path KASAN error. Stop the hive reset for each node loop if there is a reset failure on any of the nodes. Signed-off-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Acked-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Andrey Grodzovsky 提交于
XGMI hive has some resources allocted on device init which needs to be deallocated when the device is unregistered. v2: Remove creation of dedicated wq for XGMI hive reset. v3: Use the gmc.xgmi.supported flag Signed-off-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 01 12月, 2018 1 次提交
-
-
由 Oak Zeng 提交于
When paging queue is enabled, it use the second page of doorbell. The AMDGPU_DOORBELL64_MAX_ASSIGNMENT definition assumes all the kernel doorbells are in the first page. So with paging queue enabled, the total kernel doorbell range should be original num_doorbell plus one page (0x400 in dword), not *2. Signed-off-by: NOak Zeng <ozeng@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 29 11月, 2018 4 次提交
-
-
由 Andrey Grodzovsky 提交于
For XGMI hive case do reset in steps where each step iterates over all devs in hive. This especially important for asic reset since all PSP FW in hive must come up within a limited time (around 1 sec) to properply negotiate the link. Do this by refactoring amdgpu_device_gpu_recover and amdgpu_device_reset into pre_asic_reset, asic_reset and post_asic_reset functions where is part is exectued for all the GPUs in the hive before going to the next step. v2: Update names for amdgpu_device_lock/unlock functions. v3: Introduce per hive locking to avoid multiple resets for GPUs in same hive. v4: Remove delayed_workqueue()/ttm_bo_unlock_delayed_workqueue() - they are copy & pasted over from radeon and on amdgpu there isn't any reason for that any more. Signed-off-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Andrey Grodzovsky 提交于
This is prep work for updating each PSP FW in hive after GPU reset. Split into build topology SW state and update each PSP FW in the hive. Save topology and count of XGMI devices for reuse. v2: Create seperate header for XGMI. Signed-off-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Acked-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Oak Zeng 提交于
ASIC specific doorbell layout is used instead of enum definition Signed-off-by: NOak Zeng <ozeng@amd.com> Suggested-by: NFelix Kuehling <Felix.Kuehling@amd.com> Suggested-by: NAlex Deucher <alexander.deucher@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Oak Zeng 提交于
Also call functioin amdgpu_device_doorbell_init after amdgpu_device_ip_early_init because the former depends on the later to set up asic-specific init_doorbell_index function Signed-off-by: NOak Zeng <ozeng@amd.com> Suggested-by: NFelix Kuehling <Felix.Kuehling@amd.com> Suggested-by: NAlex Deucher <alexander.deucher@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 21 11月, 2018 1 次提交
-
-
由 Philip Yang 提交于
Because increase SDMA_DOORBELL_RANGE to add new SDMA doorbell for paging queue will break SRIOV, instead we can reserve and map two doorbell pages for amdgpu, paging queues doorbell index use same index as SDMA gfx queues index but on second page. For Vega20, after we change doorbell layout to increase SDMA doorbell for 8 SDMA RLC queues later, we could use new doorbell index for paging queue. Signed-off-by: NPhilip Yang <Philip.Yang@amd.com> Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 07 11月, 2018 1 次提交
-
-
由 Hawking Zhang 提交于
Setup and tear down xgmi as part of psp. v2: - make psp_xgmi_terminate static - squash in: drm/amdgpu: only issue xgmi cmd when it is enabled drm/amdgpu/psp: terminate xgmi ta in suspend and hw_fini phase Signed-off-by: NHawking Zhang <Hawking.Zhang@amd.com> Acked-by: NAlex Deucher <alexander.deucher@amd.com> Reviewed-by: NHuang Rui <ray.huang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 06 11月, 2018 2 次提交
-
-
由 Rex Zhu 提交于
There is no functional changes, Use function arguments for SRIOV special variables which is hardcode in those functions. so we can share those functions in baremetal. Reviewed-by: NMonk Liu <Monk.Liu@amd.com> Signed-off-by: NRex Zhu <Rex.Zhu@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Andrey Grodzovsky 提交于
After testing looks like these subset of ASICs has GPU reset working for the most part. Enable reset due to job timeout. v2: Switch from GFX version to ASIC type. v3: Fix identation Signed-off-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Acked-by: NAlex Deucher <alexander.deucher@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 01 11月, 2018 2 次提交
-
-
由 Christian König 提交于
This is still completely breaking my Raven system. This reverts commit cdf2f910fa969adca1b0e3ad2b487821233dc038. Revert until we sort out the sbios and firmware combinations that work correctly. bug: https://bugs.freedesktop.org/show_bug.cgi?id=108606 Cc: stable@vger.kernel.org # v4.19 Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Andrey Grodzovsky 提交于
Problem: During GPU recover DAL would hang in amdgpu_pm_compute_clocks->amdgpu_fence_wait_empty Fix: Turns out there was a typo introduced by 3320b8d2 drm/amdgpu: remove job->ring which caused skipping amdgpu_fence_driver_force_completion and so the hangged job was never force signaled and this would cause the hang later in DAL. Signed-off-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 23 10月, 2018 1 次提交
-
-
由 Emily Deng 提交于
Need to check adev->powerplay.pp_funcs. Signed-off-by: NEmily Deng <Emily.Deng@amd.com> Reviewed-by: NHuang Rui <ray.huang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 11 10月, 2018 5 次提交
-
-
由 Rex Zhu 提交于
Extract the function of fw loading out of powerplay. Do fw loading between hw_init/resuem_phase1 and phase2 Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NRex Zhu <Rex.Zhu@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Rex Zhu 提交于
We need to do some IPs earlier to deal with ordering issues similar to how resume is split into two phases. Will do fw loading via smu/psp between the two phases. Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NRex Zhu <Rex.Zhu@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Rex Zhu 提交于
1. one is for create/free bo when init/fini 2. one is for fill the bo before fw loading the ucode bo only need to be created when load driver and free when driver unload. when resume/reset, driver only need to re-fill the bo if the bo is allocated in vram. Suggested by Christian. v2: Return error when bo create failed. Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NRex Zhu <Rex.Zhu@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Rex Zhu 提交于
Fix cg/pg unexpected set in hw init failed case. Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NRex Zhu <Rex.Zhu@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Rex Zhu 提交于
1. only call late_init when hw_init successful, so check status.hw instand of status.valid in late_init. 2. set status.late_initialized true if late_init was not implemented. Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NRex Zhu <Rex.Zhu@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 10 10月, 2018 2 次提交
-
-
由 Rex Zhu 提交于
Move in_suspend flag to adev from gfx, so can be used in other ip blocks, also keep consistent with gpu_in_reset flag. Reviewed-by: NEvan Quan <evan.quan@amd.com> Signed-off-by: NRex Zhu <Rex.Zhu@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Evan Quan 提交于
MGPU fan boost feature is enabled only when two or more dGPUs in the system. Signed-off-by: NEvan Quan <evan.quan@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-