- 20 3月, 2019 6 次提交
-
-
由 xinhui pan 提交于
add ras post init function. Do some initialization after all IP have finished their late init. Add new member flags which will control the ras work flow. For now, vbios enable ras for us on boot. That might change in the future. So there should be a flag from vbios to tell us if ras is enabled or not on boot. Looks like there is no such info now. Other bits of the flags are reserved to control other parts of ras. Signed-off-by: Nxinhui pan <xinhui.pan@amd.com> Reviewed-by: NEvan Quan <evan.quan@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 xinhui pan 提交于
Mark vram pages with errors as bad and prevent the driver from using them. Signed-off-by: Nxinhui pan <xinhui.pan@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 xinhui pan 提交于
add obj management. add feature control. add debugfs infrastructure. add sysfs infrastructure. add IH infrastructure. add recovery infrastructure. It is a framework. Other IPs need call amdgpu_ras_xxx function instead of psp_ras_xxx functions. v2: squash in warning fixes Signed-off-by: Nxinhui pan <xinhui.pan@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Andrey Grodzovsky 提交于
Problem: Using SDMA for TLB invalidation in certain ASICs exposed a problem of IB pool not being ready while SDMA already up on Init and already shutt down while SDMA still running on Fini. This caused IB allocation failure. Temproary fix was commited into a bringup branch but this is the generic fix. Fix: Init IB pool rigth after GMC is ready but before SDMA is ready. Do th opposite for Fini. v2: Remove restriction on SDMA early init and move amdgpu_ib_pool_fini Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 shaoyunl 提交于
Driver vote low to high pstate switch whenever there is an outstanding XGMI mapping request. Driver vote high to low pstate when all the outstanding XGMI mapping is terminated. Signed-off-by: Nshaoyunl <shaoyun.liu@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Likun Gao 提交于
Move pp_feature from the struct of amd_powerplay to amdgpu_device. Add pp_feature limit for overdrive interface. v2: put pp_feature into struct amdgpu_pm. v3: merge feature_mask with pp_feature. Signed-off-by: NLikun Gao <Likun.Gao@amd.com> Reviewed-by: NKevin Wang <kevin1.wang@amd.com> Suggested-by: NAlex Deucher <alexander.deucher@amd.com> Suggested-by: NHuang Rui <ray.huang@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Reviewed-by: NHuang Rui <ray.huang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 08 2月, 2019 1 次提交
-
-
由 Harish Kasiviswanathan 提交于
The new Vega series GPU cards have in-built bridges. To get the pcie speed and width supported by the platform walk the hierarchy and get the slowest link. Signed-off-by: NHarish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Acked-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 26 1月, 2019 2 次提交
-
-
由 Andrey Grodzovsky 提交于
Decauple sched threads stop and start and ring mirror list handling from the policy of what to do about the guilty jobs. When stoppping the sched thread and detaching sched fences from non signaled HW fenes wait for all signaled HW fences to complete before rerunning the jobs. v2: Fix resubmission of guilty job into HW after refactoring. v4: Full restart for all the jobs, not only from guilty ring. Extract karma increase into standalone function. v5: Rework waiting for signaled jobs without relying on the job struct itself as those might already be freed for non 'guilty' job's schedulers. Expose karma increase to drivers. v6: Use list_for_each_entry_safe_continue and drm_sched_process_job in case fence already signaled. Call drm_sched_increase_karma only once for amdgpu and add documentation. v7: Wait only for the latest job's fence. Suggested-by: NChristian Koenig <Christian.Koenig@amd.com> Signed-off-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 wentalou 提交于
sriov would meet guest driver load failure, if calling amdgpu_asic_reset in amdgpu_device_init. sriov should skip asic_reset in device_init. Signed-off-by: NWentao Lou <Wentao.Lou@amd.com> Acked-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 24 1月, 2019 1 次提交
-
-
由 Daniel Vetter 提交于
Having the probe helper stuff (which pretty much everyone needs) in the drm_crtc_helper.h file (which atomic drivers should never need) is confusing. Split them out. To make sure I actually achieved the goal here I went through all drivers. And indeed, all atomic drivers are now free of drm_crtc_helper.h includes. v2: Make it compile. There was so much compile fail on arm drivers that I figured I'll better not include any of the acks on v1. v3: Massive rebase because i915 has lost a lot of drmP.h includes, but not all: Through drm_crtc_helper.h > drm_modeset_helper.h -> drmP.h there was still one, which this patch largely removes. Which means rolling out lots more includes all over. This will also conflict with ongoing drmP.h cleanup by others I expect. v3: Rebase on top of atomic bochs. v4: Review from Laurent for bridge/rcar/omap/shmob/core bits: - (re)move some of the added includes, use the better include files in other places (all suggested from Laurent adopted unchanged). - sort alphabetically v5: Actually try to sort them, and while at it, sort all the ones I touch. v6: Rebase onto i915 changes. v7: Rebase once more. Acked-by: NHarry Wentland <harry.wentland@amd.com> Acked-by: NSam Ravnborg <sam@ravnborg.org> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Acked-by: NRodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: NBenjamin Gaignard <benjamin.gaignard@linaro.org> Acked-by: NJani Nikula <jani.nikula@intel.com> Acked-by: NNeil Armstrong <narmstrong@baylibre.com> Acked-by: NOleksandr Andrushchenko <oleksandr_andrushchenko@epam.com> Acked-by: NCK Hu <ck.hu@mediatek.com> Acked-by: NAlex Deucher <alexander.deucher@amd.com> Acked-by: NSam Ravnborg <sam@ravnborg.org> Reviewed-by: NLaurent Pinchart <laurent.pinchart@ideasonboard.com> Acked-by: NLiviu Dudau <liviu.dudau@arm.com> Signed-off-by: NDaniel Vetter <daniel.vetter@intel.com> Cc: linux-arm-kernel@lists.infradead.org Cc: virtualization@lists.linux-foundation.org Cc: etnaviv@lists.freedesktop.org Cc: linux-samsung-soc@vger.kernel.org Cc: intel-gfx@lists.freedesktop.org Cc: linux-mediatek@lists.infradead.org Cc: linux-amlogic@lists.infradead.org Cc: linux-arm-msm@vger.kernel.org Cc: freedreno@lists.freedesktop.org Cc: nouveau@lists.freedesktop.org Cc: spice-devel@lists.freedesktop.org Cc: amd-gfx@lists.freedesktop.org Cc: linux-renesas-soc@vger.kernel.org Cc: linux-rockchip@lists.infradead.org Cc: linux-stm32@st-md-mailman.stormreply.com Cc: linux-tegra@vger.kernel.org Cc: xen-devel@lists.xen.org Link: https://patchwork.freedesktop.org/patch/msgid/20190117210334.13234-1-daniel.vetter@ffwll.ch
-
- 15 1月, 2019 6 次提交
-
-
由 Alex Deucher 提交于
To deal with situations like kexec or GPU VM passthrough where the device may have been used previously without a proper GPU reset between. v2: rebase bug: https://bugs.freedesktop.org/show_bug.cgi?id=108585 bug: https://bugs.freedesktop.org/show_bug.cgi?id=108754Reviewed-by: NEvan Quan <evan.quan@amd.com> Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Tom St Denis 提交于
v2: Move locks around in other functions so that this function can stand on its own. Also only hold the hive specific lock for add/remove device instead of the driver global lock so you can't add/remove devices in parallel from one hive. v3: add reset_lock Acked-by: Shaoyun.liu < Shaoyun.liu@amd.com> Signed-off-by: NTom St Denis <tom.stdenis@amd.com> Reviewed-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Emily Deng 提交于
When init fail, send rel init, req_fini and rel_fini to host for the finishing routine. Signed-off-by: NEmily Deng <Emily.Deng@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 wentalou 提交于
distinguish ip_reinit_early_sriov and ip_reinit_late_sriov by different log RE-INIT-early and RE-INIT-late Signed-off-by: NWentao Lou <Wentao.Lou@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Emily Deng 提交于
The pfvf exchange need be in exclusive mode. And add pfvf exchange in gpu reset. Signed-off-by: NEmily Deng <Emily.Deng@amd.com> Reviewed-By: NXiangliang Yu <Xiangliang.Yu@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Emily Deng 提交于
For virtual display feature, no need to pin cursor bo. Signed-off-by: NEmily Deng <Emily.Deng@amd.com> Reviewed-by: NHuang Rui <ray.huang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 12 1月, 2019 1 次提交
-
-
由 Daniel Vetter 提交于
It's not a core function, and the matching atomic functions are also not in the core. Plus the suspend/resume helper is also already there. Needs a tiny bit of open-coding, but less midlayer beats that I think. v2: Rebase onto ast (which gained a new user). Cc: Sam Bobroff <sbobroff@linux.ibm.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Reviewed-by: NSean Paul <sean@poorly.run> Signed-off-by: NDaniel Vetter <daniel.vetter@intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Maxime Ripard <maxime.ripard@bootlin.com> Cc: Sean Paul <sean@poorly.run> Cc: David Airlie <airlied@linux.ie> Cc: Ben Skeggs <bskeggs@redhat.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: "David (ChunMing) Zhou" <David1.Zhou@amd.com> Cc: Rex Zhu <Rex.Zhu@amd.com> Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Cc: Huang Rui <ray.huang@amd.com> Cc: Shaoyun Liu <Shaoyun.Liu@amd.com> Cc: Monk Liu <Monk.Liu@amd.com> Cc: nouveau@lists.freedesktop.org Cc: amd-gfx@lists.freedesktop.org Link: https://patchwork.freedesktop.org/patch/msgid/20181217194303.14397-4-daniel.vetter@ffwll.ch
-
- 03 1月, 2019 2 次提交
-
-
由 Emily Deng 提交于
The pfvf exchange need be in exclusive mode. And add pfvf exchange in gpu reset. Signed-off-by: NEmily Deng <Emily.Deng@amd.com> Reviewed-By: NXiangliang Yu <Xiangliang.Yu@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Emily Deng 提交于
For virtual display feature, no need to pin cursor bo. Signed-off-by: NEmily Deng <Emily.Deng@amd.com> Reviewed-by: NHuang Rui <ray.huang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 15 12月, 2018 1 次提交
-
-
由 wentalou 提交于
XGMI hive put kfd_pre_reset into amdgpu_device_lock_adev, but outside req_full_gpu of sriov. It would make sriov hang during reset. Signed-off-by: NWentao Lou <Wentao.Lou@amd.com> Reviewed-by: NShaoyun Liu <Shaoyun.Liu@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 13 12月, 2018 1 次提交
-
-
由 Andrey Grodzovsky 提交于
I retested Bonaire (gfx7 dGPU) and it works fine. Signed-off-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 06 12月, 2018 1 次提交
-
-
由 Alex Deucher 提交于
SI does not use doorbells, move asic doorbell init later asic check. Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=108920Reviewed-by: NOak Zeng <Oak.Zeng@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 04 12月, 2018 2 次提交
-
-
由 Andrey Grodzovsky 提交于
Use per hive wq to concurrently send reset commands to all nodes in the hive. v2: Switch to system_highpri_wq after dropping dedicated queue. Fix non XGMI code path KASAN error. Stop the hive reset for each node loop if there is a reset failure on any of the nodes. Signed-off-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Acked-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Andrey Grodzovsky 提交于
XGMI hive has some resources allocted on device init which needs to be deallocated when the device is unregistered. v2: Remove creation of dedicated wq for XGMI hive reset. v3: Use the gmc.xgmi.supported flag Signed-off-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 01 12月, 2018 1 次提交
-
-
由 Oak Zeng 提交于
When paging queue is enabled, it use the second page of doorbell. The AMDGPU_DOORBELL64_MAX_ASSIGNMENT definition assumes all the kernel doorbells are in the first page. So with paging queue enabled, the total kernel doorbell range should be original num_doorbell plus one page (0x400 in dword), not *2. Signed-off-by: NOak Zeng <ozeng@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 29 11月, 2018 4 次提交
-
-
由 Andrey Grodzovsky 提交于
For XGMI hive case do reset in steps where each step iterates over all devs in hive. This especially important for asic reset since all PSP FW in hive must come up within a limited time (around 1 sec) to properply negotiate the link. Do this by refactoring amdgpu_device_gpu_recover and amdgpu_device_reset into pre_asic_reset, asic_reset and post_asic_reset functions where is part is exectued for all the GPUs in the hive before going to the next step. v2: Update names for amdgpu_device_lock/unlock functions. v3: Introduce per hive locking to avoid multiple resets for GPUs in same hive. v4: Remove delayed_workqueue()/ttm_bo_unlock_delayed_workqueue() - they are copy & pasted over from radeon and on amdgpu there isn't any reason for that any more. Signed-off-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Andrey Grodzovsky 提交于
This is prep work for updating each PSP FW in hive after GPU reset. Split into build topology SW state and update each PSP FW in the hive. Save topology and count of XGMI devices for reuse. v2: Create seperate header for XGMI. Signed-off-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Acked-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Oak Zeng 提交于
ASIC specific doorbell layout is used instead of enum definition Signed-off-by: NOak Zeng <ozeng@amd.com> Suggested-by: NFelix Kuehling <Felix.Kuehling@amd.com> Suggested-by: NAlex Deucher <alexander.deucher@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Oak Zeng 提交于
Also call functioin amdgpu_device_doorbell_init after amdgpu_device_ip_early_init because the former depends on the later to set up asic-specific init_doorbell_index function Signed-off-by: NOak Zeng <ozeng@amd.com> Suggested-by: NFelix Kuehling <Felix.Kuehling@amd.com> Suggested-by: NAlex Deucher <alexander.deucher@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 21 11月, 2018 1 次提交
-
-
由 Philip Yang 提交于
Because increase SDMA_DOORBELL_RANGE to add new SDMA doorbell for paging queue will break SRIOV, instead we can reserve and map two doorbell pages for amdgpu, paging queues doorbell index use same index as SDMA gfx queues index but on second page. For Vega20, after we change doorbell layout to increase SDMA doorbell for 8 SDMA RLC queues later, we could use new doorbell index for paging queue. Signed-off-by: NPhilip Yang <Philip.Yang@amd.com> Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 07 11月, 2018 1 次提交
-
-
由 Hawking Zhang 提交于
Setup and tear down xgmi as part of psp. v2: - make psp_xgmi_terminate static - squash in: drm/amdgpu: only issue xgmi cmd when it is enabled drm/amdgpu/psp: terminate xgmi ta in suspend and hw_fini phase Signed-off-by: NHawking Zhang <Hawking.Zhang@amd.com> Acked-by: NAlex Deucher <alexander.deucher@amd.com> Reviewed-by: NHuang Rui <ray.huang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 06 11月, 2018 2 次提交
-
-
由 Rex Zhu 提交于
There is no functional changes, Use function arguments for SRIOV special variables which is hardcode in those functions. so we can share those functions in baremetal. Reviewed-by: NMonk Liu <Monk.Liu@amd.com> Signed-off-by: NRex Zhu <Rex.Zhu@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Andrey Grodzovsky 提交于
After testing looks like these subset of ASICs has GPU reset working for the most part. Enable reset due to job timeout. v2: Switch from GFX version to ASIC type. v3: Fix identation Signed-off-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Acked-by: NAlex Deucher <alexander.deucher@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 01 11月, 2018 2 次提交
-
-
由 Christian König 提交于
This is still completely breaking my Raven system. This reverts commit cdf2f910fa969adca1b0e3ad2b487821233dc038. Revert until we sort out the sbios and firmware combinations that work correctly. bug: https://bugs.freedesktop.org/show_bug.cgi?id=108606 Cc: stable@vger.kernel.org # v4.19 Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Andrey Grodzovsky 提交于
Problem: During GPU recover DAL would hang in amdgpu_pm_compute_clocks->amdgpu_fence_wait_empty Fix: Turns out there was a typo introduced by 3320b8d2 drm/amdgpu: remove job->ring which caused skipping amdgpu_fence_driver_force_completion and so the hangged job was never force signaled and this would cause the hang later in DAL. Signed-off-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 23 10月, 2018 1 次提交
-
-
由 Emily Deng 提交于
Need to check adev->powerplay.pp_funcs. Signed-off-by: NEmily Deng <Emily.Deng@amd.com> Reviewed-by: NHuang Rui <ray.huang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 11 10月, 2018 4 次提交
-
-
由 Rex Zhu 提交于
Extract the function of fw loading out of powerplay. Do fw loading between hw_init/resuem_phase1 and phase2 Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NRex Zhu <Rex.Zhu@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Rex Zhu 提交于
We need to do some IPs earlier to deal with ordering issues similar to how resume is split into two phases. Will do fw loading via smu/psp between the two phases. Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NRex Zhu <Rex.Zhu@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Rex Zhu 提交于
1. one is for create/free bo when init/fini 2. one is for fill the bo before fw loading the ucode bo only need to be created when load driver and free when driver unload. when resume/reset, driver only need to re-fill the bo if the bo is allocated in vram. Suggested by Christian. v2: Return error when bo create failed. Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NRex Zhu <Rex.Zhu@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Rex Zhu 提交于
Fix cg/pg unexpected set in hw init failed case. Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NRex Zhu <Rex.Zhu@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-