提交 · 506acd7ee64ce1c94d60c38c7721e1c0133fdae2 · openanolis / cloud-kernel

21 11月, 2018 10 次提交

drm/amd/powerplay: Enable/Disable NBPSTATE on On/OFF of UVD · 506acd7e

由 Akshu Agrawal 提交于 9月 24, 2018

commit 51ef434a upstream.

We observe black lines (underflow) on display when playing a
4K video with UVD. On Disabling Low memory P state this issue is
not seen.
Multiple runs of power measurement shows no imapct.
Signed-off-by: NAkshu Agrawal <akshu.agrawal@amd.com>
Signed-off-by: NSatyajit Sahu <satyajit.sahu@amd.com>
Acked-by: NAlex Deucher <alexander.deucher@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

506acd7e

drm/amdgpu: Suppress keypresses from ACPI_VIDEO events · c0845493

由 Lyude Paul 提交于 9月 21, 2018

commit 582f58de upstream.

Currently we return NOTIFY_DONE for any event which we don't think is
ours. However, many laptops will send more then just an ATIF event and
will also send an ACPI_VIDEO_NOTIFY_PROBE event as well. Since we don't
check for this, we return NOTIFY_DONE which causes a keypress for the
ACPI event to be propogated to userspace. This is the equivalent of
someone pressing the display key on a laptop every time there's a
hotplug event.

So, check for ACPI_VIDEO_NOTIFY_PROBE events and suppress keypresses
from them.
Signed-off-by: NLyude Paul <lyude@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

c0845493

drm/amdgpu: add missing CHIP_HAINAN in amdgpu_ucode_get_load_type · 3f43692a

由 Alex Deucher 提交于 8月 28, 2018

commit d9997b64 upstream.

This caused a confusing error message, but there is functionally
no problem since the default method is DIRECT.
Reviewed-by: NMichel Dänzer <michel.daenzer@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

3f43692a

drm/amdgpu: Fix typo in amdgpu_vmid_mgr_init · 633322af

由 Rex Zhu 提交于 10月 12, 2018

commit 3df27645 upstream.

fix a typo in for loop: i->j
Reviewed-by: NChristian König <christian.koenig@amd.com>
Signed-off-by: NRex Zhu <Rex.Zhu@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

633322af

drm/amdgpu: fix integer overflow test in amdgpu_bo_list_create() · 6fc72c1b

由 Dan Carpenter 提交于 8月 10, 2018

[ Upstream commit ff30e9e8 ]

We accidentally left out the size of the amdgpu_bo_list struct.  It
could lead to memory corruption on 32 bit systems.  You'd have to
pick the absolute maximum and set "num_entries == 59652323" then size
would wrap to 16 bytes.

Fixes: 920990cb ("drm/amdgpu: allocate the bo_list array after the list")
Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: NHuang Rui <ray.huang@amd.com>
Reviewed-by: NBas Nieuwenhuizen <basni@chromium.org>
Signed-off-by: NHuang Rui <ray.huang@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

6fc72c1b

drm/amdgpu: Fix SDMA TO after GPU reset v3 · 3b6ff8eb

由 Andrey Grodzovsky 提交于 9月 10, 2018

[ Upstream commit d8de8260a45aae8f74af77eae9a162bdc0ed48d2 ]

After GPU reset amdgpu_vm_clear_bo triggers VM flush
but job->vm_pd_addr is not set causing SDMA TO.

v2:
Per advise by Christian König avoid flushing VM for jobs where
job->vm_pd_addr wasn't explicitly set.

v3:
Shortcut vm_flush_needed early.

Fixes cbd5285 drm/amdgpu: move setting the GART addr into TTM.
Signed-off-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: NChristian König <christian.koenig@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

3b6ff8eb

drm/amd/display: fix gamma not being applied · a1d0566c

由 SivapiriyanKumarasamy 提交于 9月 12, 2018

[ Upstream commit 30049754 ]

[WHY]
Previously night light forced a full update by
applying a  transfer function update regardless of if it was changed.
This logic was removed,

Now gamma surface updates are only applied when there is also a plane
info update, this does not work in cases such as using the night light
slider.

[HOW]
When moving the night light slider we will perform a full update if
the gamma has changed and there is a surface, even when the surface
has not changed. Also get stream updates in setgamma prior to
update planes and stream.
Signed-off-by: NSivapiriyanKumarasamy <sivapiriyan.kumarasamy@amd.com>
Reviewed-by: NAnthony Koo <Anthony.Koo@amd.com>
Acked-by: NBhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

a1d0566c

drm/amd/display: Raise dispclk value for dce120 by 15% · ce46da15

由 Nicholas Kazlauskas 提交于 9月 12, 2018

[ Upstream commit 481f576c ]

[Why]

The DISPCLK value was previously requested to be 15% higher for all
ASICs that went through the dce110 bandwidth code path. As part of a
refactoring of dce_clocks and the dce110 set bandwidth codepath this
was removed for power saving considerations.

That change caused display corruption under certain hardware
configurations with Vega10.

[How]

The 15% DISPCLK increase is brought back but only on dce110 for now.
This is should be a temporary workaround until the root cause is sorted
out for why this occurs on Vega (or other ASICs, if reported).
Tested-by: NNick Sarnie <sarnex@gentoo.org>
Signed-off-by: NNicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: NHarry Wentland <Harry.Wentland@amd.com>
Acked-by: NBhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

ce46da15

drm/amdgpu/powerplay: fix missing break in switch statements · 17a5d018

由 Colin Ian King 提交于 10月 08, 2018

[ Upstream commit 14b284832e7dea6f54f0adfd7bed105548b94e57 ]

There are several switch statements that are missing break statements.
Add missing breaks to handle any fall-throughs corner cases.

Detected by CoverityScan, CID#1457175 ("Missing break in switch")

Fixes: 18aafc59 ("drm/amd/powerplay: implement fw related smu interface for iceland.")
Acked-by: NHuang Rui <ray.huang@amd.com>
Signed-off-by: NColin Ian King <colin.king@canonical.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

17a5d018

drm/amd/display: fix bug of accessing invalid memory · 139dde1e

由 Su Sung Chung 提交于 9月 20, 2018

[ Upstream commit 43c3ff27a47d83d153c4adc088243ba594582bf5 ]

[Why]
A loop inside of build_evenly_distributed_points function that traverse through
the array of points become an infinite loop when m_GammaUpdates does not
get assigned to any value.

[How]
In DMColor, clear m_gammaIsValid bit just before writting all Zeromem for
m_GammaUpdates, to prevent calling build_evenly_distributed_points
before m_GammaUpdates gets assigned to some value.
Signed-off-by: NSu Sung Chung <Su.Chung@amd.com>
Reviewed-by: NAric Cyr <Aric.Cyr@amd.com>
Acked-by: NBhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

139dde1e

04 10月, 2018 2 次提交

drm/amdkfd: Fix incorrect use of process->mm · 11b29c9e

由 Felix Kuehling 提交于 10月 02, 2018

This mm_struct pointer should never be dereferenced. If running in
a user thread, just use current->mm. If running in a kernel worker
use get_task_mm to get a safe reference to the mm_struct.
Reviewed-by: NOded Gabbay <oded.gabbay@gmail.com>
Acked-by: NChristian König <christian.koenig@amd.com>
Signed-off-by: NFelix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

11b29c9e

drm/amd/display: Signal hw_done() after waiting for flip_done() · 987bf116

由 Shirish S 提交于 9月 24, 2018

In amdgpu_dm_commit_tail(), wait until flip_done() is signaled before
we signal hw_done().

[Why]

This is to temporarily address a paging error that occurs when a
nonblocking commit contends with another commit, particularly in a
mirrored display configuration where at least 2 CRTCs are updated.
The error occurs in drm_atomic_helper_wait_for_flip_done(), when we
attempt to access the contents of new_crtc_state->commit.

Here's the sequence for a mirrored 2 display setup (irrelevant steps
left out for clarity):

**THREAD 1**                        | **THREAD 2**
                                    |
Initialize atomic state for flip    |
                                    |
Queue worker                        |
                                   ...

                                    | Do work for flip
                                    |
                                    | Signal hw_done() on CRTC 1
                                    | Signal hw_done() on CRTC 2
                                    |
                                    | Wait for flip_done() on CRTC 1

                                <---- **PREEMPTED BY THREAD 1**

Initialize atomic state for cursor  |
update (1)                          |
                                    |
Do cursor update work on both CRTCs |
                                    |
Clear atomic state (2)              |
**DONE**                            |
                                   ...
                                    |
                                    | Wait for flip_done() on CRTC 2
                                    | *ERROR*
                                    |

The issue starts with (1). When the atomic state is initialized, the
current CRTC states are duplicated to be the new_crtc_states, and
referenced to be the old_crtc_states. (The new_crtc_states are to be
filled with update data.)

Some things to note:

* Due to the mirrored configuration, the cursor updates on both CRTCs.

* At this point, the pflip IRQ has already been handled, and flip_done
  signaled on all CRTCs. The cursor commit can therefore continue.

* The old_crtc_states used by the cursor update are the **same states**
  as the new_crtc_states used by the flip worker.

At (2), the old_crtc_state is freed (*), and the cursor commit
completes. We then context switch back to the flip worker, where we
attempt to access the new_crtc_state->commit object. This is
problematic, as this state has already been freed.

(*) Technically, 'state->crtcs[i].state' is freed, which was made to
    reference old_crtc_state in drm_atomic_helper_swap_state()

[How]

By moving hw_done() after wait_for_flip_done(), we're guaranteed that
the new_crtc_state (from the flip worker's perspective) still exists.
This is because any other commit will be blocked, waiting for the
hw_done() signal.

Note that both the i915 and imx drivers have this sequence flipped
already, masking this problem.
Signed-off-by: NShirish S <shirish.s@amd.com>
Signed-off-by: NLeo Li <sunpeng.li@amd.com>
Reviewed-by: NHarry Wentland <harry.wentland@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

987bf116

27 9月, 2018 3 次提交

drm/amd/display: Fix Edid emulation for linux · fbbdadf2

由 Bhawanpreet Lakha 提交于 9月 26, 2018

[Why]
EDID emulation didn't work properly for linux, as we stop programming
if nothing is connected physically.

[How]
We get a flag from DRM when we want to do edid emulation. We check if
this flag is true and nothing is connected physically, if so we only
program the front end using VIRTUAL_SIGNAL.
Signed-off-by: NBhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Reviewed-by: NHarry Wentland <Harry.Wentland@amd.com>
Acked-by: NLeo Li <sunpeng.li@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

fbbdadf2

drm/amd/display: Fix Vega10 lightup on S3 resume · 599760d6

由 Roman Li 提交于 9月 26, 2018

[Why]
There have been a few reports of Vega10 display remaining blank
after S3 resume. The regression is caused by workaround for mode
change on Vega10 - skip set_bandwidth if stream count is 0.
As a result we skipped dispclk reset on suspend, thus on resume
we may skip the clock update assuming it hasn't been changed.
On some systems it causes display blank or 'out of range'.

[How]
Revert "drm/amd/display: Fix Vega10 black screen after mode change"
Verified that it hadn't cause mode change regression.
Signed-off-by: NRoman Li <Roman.Li@amd.com>
Reviewed-by: NSun peng Li <Sunpeng.Li@amd.com>
Acked-by: NLeo Li <sunpeng.li@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

599760d6

drm/amdgpu: Fix vce work queue was not cancelled when suspend · 61ea6f58

由 Rex Zhu 提交于 9月 27, 2018

The vce cancel_delayed_work_sync never be called.
driver call the function in error path.

This caused the A+A suspend hang when runtime pm enebled.
As we will visit the smu in the idle queue. this will cause
smu hang because the dgpu has been suspend, and the dgpu also
will be waked up. As the smu has been hang, so the dgpu resume
will failed.
Reviewed-by: NChristian König <christian.koenig@amd.com>
Reviewed-by: NFeifei Xu <Feifei.Xu@amd.com>
Signed-off-by: NRex Zhu <Rex.Zhu@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

61ea6f58

20 9月, 2018 4 次提交

drm/amdkfd: Fix ATS capablity was not reported correctly on some APUs · 44d8cc6f

由 Yong Zhao 提交于 9月 12, 2018

Because CRAT_CU_FLAGS_IOMMU_PRESENT was not set in some BIOS crat, we
need to workaround this.

For future compatibility, we also overwrite the bit in capability according
to the value of needs_iommu_device.
Acked-by: NAlex Deucher <alexander.deucher@amd.com>
Signed-off-by: NYong Zhao <Yong.Zhao@amd.com>
Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: NFelix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

44d8cc6f

drm/amdkfd: Change the control stack MTYPE from UC to NC on GFX9 · 15426dbb

由 Yong Zhao 提交于 9月 12, 2018

CWSR fails on Raven if the control stack is MTYPE_UC, which is used
for regular GART mappings. As a workaround we map it using MTYPE_NC.

The MEC firmware expects the control stack at one page offset from the
start of the MQD so it is part of the MQD allocation on GFXv9. AMDGPU
added a memory allocation flag just for this purpose.
Acked-by: NAlex Deucher <alexander.deucher@amd.com>
Signed-off-by: NYong Zhao <yong.zhao@amd.com>
Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: NFelix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

15426dbb

drm/amdgpu: Fix SDMA HQD destroy error on gfx_v7 · caaa4c8a

由 Amber Lin 提交于 9月 12, 2018

A wrong register bit was examinated for checking SDMA status so it reports
false failures. This typo only appears on gfx_v7. gfx_v8 checks the correct
bit.
Acked-by: NAlex Deucher <alexander.deucher@amd.com>
Signed-off-by: NAmber Lin <Amber.Lin@amd.com>
Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: NFelix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

caaa4c8a

drm/amdgpu: add new polaris pci id · 30f3984e

由 Alex Deucher 提交于 9月 18, 2018

Add new pci id.
Reviewed-by: NRex Zhu <Rex.Zhu@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

30f3984e

12 9月, 2018 1 次提交

drm/amdgpu: fix error handling in amdgpu_cs_user_fence_chunk · 0165de98

由 Christian König 提交于 9月 10, 2018

Slowly leaking memory one page at a time :)
Signed-off-by: NChristian König <christian.koenig@amd.com>
Reviewed-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

0165de98

11 9月, 2018 3 次提交

drm/amdgpu: move PSP init prior to IH in gpu reset · 3a74987b

由 Emily Deng 提交于 9月 10, 2018

since we use PSP to program IH regs now
Signed-off-by: NMonk Liu <Monk.Liu@amd.com>
Acked-by: NChristian König <christian.koenig@amd.com>
Reviewed-by: NHuang Rui <ray.huang@amd.com>
Signed-off-by: NEmily Deng <Emily.Deng@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

3a74987b

drm/amdgpu: Fix SDMA hang in prt mode v2 · 68ebc13e

由 Tao Zhou 提交于 9月 07, 2018

Fix SDMA hang in prt mode, clear XNACK_WATERMARK in reg SDMA0_UTCL1_WATERMK to avoid the issue

Affected ASICs: VEGA10 VEGA12 RV1 RV2

v2: add reg clear for SDMA1
Signed-off-by: NTao Zhou <tao.zhou1@amd.com>
Tested-by: NYukun Li <yukun1.li@amd.com>
Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com>
Acked-by: NChristian König <christian.koenig@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

68ebc13e

drm/amdgpu: fix amdgpu_mn_unlock() in the CS error path · b463d4e5

由 Christian König 提交于 9月 03, 2018

Avoid unlocking a lock we never locked.
Signed-off-by: NChristian König <christian.koenig@amd.com>
Reviewed-by: NJunwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

b463d4e5

29 8月, 2018 1 次提交

drm/amdgpu: Need to set moved to true when evict bo · 6ddd9769

由 Emily Deng 提交于 8月 28, 2018

Fix the VMC page fault when the running sequence is as below:
1.amdgpu_gem_create_ioctl
2.ttm_bo_swapout->amdgpu_vm_bo_invalidate, as not called
amdgpu_vm_bo_base_init, so won't called
list_add_tail(&base->bo_list, &bo->va). Even the bo was evicted,
it won't set the bo_base->moved.
3.drm_gem_open_ioctl->amdgpu_vm_bo_base_init, here only called
list_move_tail(&base->vm_status, &vm->evicted), but not set the
bo_base->moved.
4.amdgpu_vm_bo_map->amdgpu_vm_bo_insert_map, as the bo_base->moved is
not set true, the function amdgpu_vm_bo_insert_map will call
list_move(&bo_va->base.vm_status, &vm->moved)
5.amdgpu_cs_ioctl won't validate the swapout bo, as it is only in the
moved list, not in the evict list. So VMC page fault occurs.
Signed-off-by: NEmily Deng <Emily.Deng@amd.com>
Reviewed-by: NChristian König <christian.koenig@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

6ddd9769

28 8月, 2018 8 次提交

drm/amdgpu: Remove duplicated power source update · 2f4e7db0

由 Rex Zhu 提交于 8月 23, 2018

when ac/dc switch, driver will be notified by acpi event.
then the power source will be updated. so don't need to
get power source when set power state.
Reviewed-by: NAlex Deucher <alexander.deucher@amd.com>
Signed-off-by: NRex Zhu <Rex.Zhu@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

2f4e7db0

drm/amd/display: Fix memory leak caused by missed dc_sink_release · e7603dad

由 SivapiriyanKumarasamy 提交于 8月 15, 2018

[Why]
There is currently an intermittent hang from a memory leak in
DTN stress testing. It is caused by unfreed memory during driver
disable.

[How]
Do a dc_sink_release in the case that skips it incorrectly.
Signed-off-by: NSivapiriyanKumarasamy <sivapiriyan.kumarasamy@amd.com>
Reviewed-by: NAric Cyr <Aric.Cyr@amd.com>
Acked-by: NLeo Li <sunpeng.li@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

e7603dad

drm/amdgpu: fix holding mn_lock while allocating memory · 4a2de54d

由 Christian König 提交于 8月 24, 2018

We can't hold the mn_lock while allocating memory.
Signed-off-by: NChristian König <christian.koenig@amd.com>
Acked-by: NChunming Zhou <david1.zhou@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

4a2de54d

drm/amdgpu: Power on uvd block when hw_fini · 72ef23de

由 Rex Zhu 提交于 8月 23, 2018

when hw_fini/suspend, smu only need to power on uvd block
if uvd pg is supported, don't need to call uvd to do hw_init.

v2: fix typo in patch descriptions and comments.
Reviewed-by: NAlex Deucher <alexander.deucher@amd.com>
Tested-by: NMichel Dänzer <michel.daenzer@amd.com>
Signed-off-by: NRex Zhu <Rex.Zhu@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

72ef23de

drm/amdgpu: Update power state at the end of smu hw_init. · 2ab4d0e7

由 Rex Zhu 提交于 8月 24, 2018

For SI/Kv, the power state is managed by function
amdgpu_pm_compute_clocks.

when dpm enabled, we should call amdgpu_pm_compute_clocks
to update current power state instand of set boot state.

this change can fix the oops when kfd driver was enabled on Kv.
Reviewed-by: NAlex Deucher <alexander.deucher@amd.com>
Tested-by: NMichel Dänzer <michel.daenzer@amd.com>
Signed-off-by: NRex Zhu <Rex.Zhu@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

2ab4d0e7

drm/amdgpu: Fix vce initialize failed on Kaveri/Mullins · 6d39df14

由 Rex Zhu 提交于 8月 23, 2018

Forgot to add vce pg support via smu for Kaveri/Mullins.

Fixes: 561a5c83eadd ("drm/amd/pp: Unify powergate_uvd/vce/mmhub
                       to set_powergating_by_smu")

v2: refine patch descriptions suggested by Michel
Reviewed-by: NAlex Deucher <alexander.deucher@amd.com>
Tested-by: NMichel Dänzer <michel.daenzer@amd.com>
Signed-off-by: NRex Zhu <Rex.Zhu@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

6d39df14

drm/amdgpu: Enable/disable gfx PG feature in rlc safe mode · 8ef23364

由 Rex Zhu 提交于 8月 24, 2018

This is required by gfx hw and can fix the rlc hang when
do s3 stree test on Cz/St.
Reviewed-by: NAlex Deucher <alexander.deucher@amd.com>
Signed-off-by: NHang Zhou <hang.zhou@amd.com>
Signed-off-by: NRex Zhu <Rex.Zhu@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

8ef23364

drm/amdgpu: Adjust the VM size based on system memory size v2 · fca5d959

由 Felix Kuehling 提交于 8月 21, 2018

Set the VM size based on system memory size between the ASIC-specific
limits given by min_vm_size and max_bits. GFXv9 GPUs will keep their
default VM size of 256TB (48 bit). Only older GPUs will adjust VM size
depending on system memory size.

This makes more VM space available for ROCm applications on GFXv8 GPUs
that want to map all available VRAM and system memory in their SVM
address space.

v2:
* Clarify comment
* Round up memory size before >> 30
* Round up automatic vm_size to power of two
Signed-off-by: NFelix Kuehling <Felix.Kuehling@amd.com>
Acked-by: NJunwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: NHuang Rui <ray.huang@amd.com>
Reviewed-by: NChristian König <christian.koenig@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

fca5d959

25 8月, 2018 1 次提交

drm/amd/display: Fix bug use wrong pp interface · a296b162

由 Rex Zhu 提交于 8月 16, 2018

Used wrong pp interface, the original interface is
exposed by dpm on SI and paritial CI.

Pointed out by Francis David <david.francis@amd.com>

v2: dal only need to set min_dcefclk and min_fclk to smu.
    so use display_clock_voltage_request interface,
    instand of update all display configuration.
Acked-by: NAlex Deucher <alexander.deucher@amd.com>
Signed-off-by: NRex Zhu <Rex.Zhu@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

a296b162

23 8月, 2018 5 次提交

drm/amdgpu: Fix page fault and kasan warning on pci device remove. · eb7e5cfc

由 Andrey Grodzovsky 提交于 8月 22, 2018

Problem:
When executing echo 1 > /sys/class/drm/card0/device/remove kasan warning
as bellow and page fault happen because adev->gart.pages already freed by the
time amdgpu_gart_unbind is called.

BUG: KASAN: user-memory-access in amdgpu_gart_unbind+0x98/0x180 [amdgpu]
Write of size 8 at addr 0000000000003648 by task bash/1828
CPU: 2 PID: 1828 Comm: bash Tainted: G        W  O      4.18.0-rc1-dev+ #29
Hardware name: Gigabyte Technology Co., Ltd. AX370-Gaming/AX370-Gaming-CF, BIOS F3 06/19/2017
Call Trace:
dump_stack+0x71/0xab
kasan_report+0x109/0x390
amdgpu_gart_unbind+0x98/0x180 [amdgpu]
ttm_tt_unbind+0x43/0x60 [ttm]
ttm_bo_move_ttm+0x83/0x1c0 [ttm]
ttm_bo_handle_move_mem+0xb97/0xd00 [ttm]
ttm_bo_evict+0x273/0x530 [ttm]
ttm_mem_evict_first+0x29c/0x360 [ttm]
ttm_bo_force_list_clean+0xfc/0x210 [ttm]
ttm_bo_clean_mm+0xe7/0x160 [ttm]
amdgpu_ttm_fini+0xda/0x1d0 [amdgpu]
amdgpu_bo_fini+0xf/0x60 [amdgpu]
gmc_v8_0_sw_fini+0x36/0x70 [amdgpu]
amdgpu_device_fini+0x2d0/0x7d0 [amdgpu]
amdgpu_driver_unload_kms+0x6a/0xd0 [amdgpu]
drm_dev_unregister+0x79/0x180 [drm]
amdgpu_pci_remove+0x2a/0x60 [amdgpu]
pci_device_remove+0x5b/0x100
device_release_driver_internal+0x236/0x360
pci_stop_bus_device+0xbf/0xf0
pci_stop_and_remove_bus_device_locked+0x16/0x30
remove_store+0xda/0xf0
kernfs_fop_write+0x186/0x220
__vfs_write+0xcc/0x330
vfs_write+0xe6/0x250
ksys_write+0xb1/0x140
do_syscall_64+0x77/0x1e0
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f66ebbb32c0

Fix:
Split gmc_v{6,7,8,9}_0_gart_fini to postpone amdgpu_gart_fini to after
memory managers are shut down since gart unbind happens
as part of this procedure
Signed-off-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: NJunwei Zhang <Jerry.Zhang@amd.com>
Acked-by: NHuang Rui <ray.huang@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

eb7e5cfc

amdgpu: fix multi-process hang issue · 2f40c6ea

由 Emily Deng 提交于 8月 22, 2018

SWDEV-146499: hang during multi vulkan process testing

cause:
the second frame's PREAMBLE_IB have clear-state
and LOAD actions, those actions ruin the pipeline
that is still doing process in the previous frame's
work-load IB.

fix:
need insert pipeline sync if have context switch for
SRIOV (because only SRIOV will report PREEMPTION flag
to UMD)
Signed-off-by: NMonk Liu <Monk.Liu@amd.com>
Signed-off-by: NEmily Deng <Emily.Deng@amd.com>
Reviewed-by: NChristian König <christian.koenig@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

2f40c6ea

drm/amdgpu: fix preamble handling · d98ff24e

由 Christian König 提交于 8月 21, 2018

At this point the command submission can still be interrupted.
Signed-off-by: NChristian König <christian.koenig@amd.com>
Acked-by: NAlex Deucher <alexander.deucher@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

d98ff24e

drm/amdgpu: fix VM clearing for the root PD · 8604ffcb

由 Christian König 提交于 8月 16, 2018

We need to figure out the address after validating the BO, not before.
Signed-off-by: NChristian König <christian.koenig@amd.com>
Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: NJunwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: NHuang Rui <ray.huang@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

8604ffcb

mm, oom: distinguish blockable mode for mmu notifiers · 93065ac7

由 Michal Hocko 提交于 8月 21, 2018

There are several blockable mmu notifiers which might sleep in
mmu_notifier_invalidate_range_start and that is a problem for the
oom_reaper because it needs to guarantee a forward progress so it cannot
depend on any sleepable locks.

Currently we simply back off and mark an oom victim with blockable mmu
notifiers as done after a short sleep.  That can result in selecting a new
oom victim prematurely because the previous one still hasn't torn its
memory down yet.

We can do much better though.  Even if mmu notifiers use sleepable locks
there is no reason to automatically assume those locks are held.  Moreover
majority of notifiers only care about a portion of the address space and
there is absolutely zero reason to fail when we are unmapping an unrelated
range.  Many notifiers do really block and wait for HW which is harder to
handle and we have to bail out though.

This patch handles the low hanging fruit.
__mmu_notifier_invalidate_range_start gets a blockable flag and callbacks
are not allowed to sleep if the flag is set to false.  This is achieved by
using trylock instead of the sleepable lock for most callbacks and
continue as long as we do not block down the call chain.

I think we can improve that even further because there is a common pattern
to do a range lookup first and then do something about that.  The first
part can be done without a sleeping lock in most cases AFAICS.

The oom_reaper end then simply retries if there is at least one notifier
which couldn't make any progress in !blockable mode.  A retry loop is
already implemented to wait for the mmap_sem and this is basically the
same thing.

The simplest way for driver developers to test this code path is to wrap
userspace code which uses these notifiers into a memcg and set the hard
limit to hit the oom.  This can be done e.g.  after the test faults in all
the mmu notifier managed memory and set the hard limit to something really
small.  Then we are looking for a proper process tear down.

[akpm@linux-foundation.org: coding style fixes]
[akpm@linux-foundation.org: minor code simplification]
Link: http://lkml.kernel.org/r/20180716115058.5559-1-mhocko@kernel.orgSigned-off-by: NMichal Hocko <mhocko@suse.com>
Acked-by: Christian König <christian.koenig@amd.com> # AMD notifiers
Acked-by: Leon Romanovsky <leonro@mellanox.com> # mlx and umem_odp
Reported-by: NDavid Rientjes <rientjes@google.com>
Cc: "David (ChunMing) Zhou" <David1.Zhou@amd.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Doug Ledford <dledford@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Mike Marciniszyn <mike.marciniszyn@intel.com>
Cc: Dennis Dalessandro <dennis.dalessandro@intel.com>
Cc: Sudeep Dutt <sudeep.dutt@intel.com>
Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
Cc: Dimitri Sivanich <sivanich@sgi.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

93065ac7

22 8月, 2018 2 次提交

drm/amd/display: Don't build DCN1 when kcov is enabled · 9d1d02ff

由 Leo (Sunpeng) Li 提交于 8月 16, 2018

DCN1 contains code that utilizes fp math. When
CONFIG_KCOV_INSTRUMENT_ALL and CONFIG_KCOV_ENABLE_COMPARISONS are
enabled, build errors are found. See this earlier patch for details:

https://lists.freedesktop.org/archives/dri-devel/2018-August/186131.html

As a short term solution, disable CONFIG_DRM_AMD_DC_DCN1_0 when
KCOV_INSTRUMENT_ALL and KCOV_ENABLE_COMPARISONS are enabled. In
addition, make it a fully derived config, taking into account
CONFIG_X86.
Acked-by: NAlex Deucher <alexander.deucher@amd.com>
Acked-by: NArnd Bergmann <arnd@arndb.de>
Reviewed-by: NMichel Dänzer <michel.daenzer@amd.com>
Signed-off-by: NLeo (Sunpeng) Li <sunpeng.li@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

9d1d02ff

Revert "drm/amdgpu/display: Replace CONFIG_DRM_AMD_DC_DCN1_0 with CONFIG_X86" · dc37a9a0

由 Leo (Sunpeng) Li 提交于 8月 16, 2018

This reverts commit 8624c3c4dbfe24fc6740687236a2e196f5f4bfb0.

We need CONFIG_DRM_AMD_DC_DCN1_0 to guard code that is using fp math.
Acked-by: NAlex Deucher <alexander.deucher@amd.com>
Reviewed-by: NMichel Dänzer <michel.daenzer@amd.com>
Signed-off-by: NLeo (Sunpeng) Li <sunpeng.li@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

dc37a9a0

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功