Commits · bc21fe9a5844c5bc8f7ec319b11d2671a94eb867 · openeuler / Kernel

08 Dec, 2022 1 commit

drm/amdgpu/sdma_v4_0: turn off SDMA ring buffer in the s2idle suspend · bc21fe9a

Committed by Prike Liang on Dec 01, 2022

The SDMA s0ix save process requires turning off the SDMA ring buffer to
avoid in-flight SDMA requests; otherwise an SDMA page fault occurs,
caused by page requests from in-flight SDMA ring accesses during the
SDMA restore phase.

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2248
Cc: stable@vger.kernel.org # 6.0,5.15+
Fixes: f8f4e2a5 ("drm/amdgpu: skipping SDMA hw_init and hw_fini for S0ix.")
Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

bc21fe9a

02 Dec, 2022 1 commit

drm/amdgpu: enable Vangogh VCN indirect sram mode · 9a8cc8ca

Committed by Leo Liu on Nov 29, 2022

So that the PSP is used to initialize the HW.

Fixes: 0c2c02b6 ("drm/amdgpu/vcn: add firmware support for dimgrey_cavefish")
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

9a8cc8ca

01 Dec, 2022 2 commits

drm/amdgpu: temporarily disable broken Clang builds due to blown stack-frame · 6f6cb171

Committed by Lee Jones on Nov 25, 2022

Patch series "Fix a bunch of allmodconfig errors", v2.

Since b339ec9c ("kbuild: Only default to -Werror if COMPILE_TEST")
WERROR now defaults to COMPILE_TEST meaning that it's enabled for
allmodconfig builds.  This leads to some interesting build failures when
using Clang, each resolved in this set.

With this set applied, I am able to obtain a successful allmodconfig Arm
build.


This patch (of 2):

calculate_bandwidth() is presently broken on all !(X86_64 || SPARC64 ||
ARM64) architectures built with Clang (all released versions), whereby the
stack frame gets blown up to well over 5k.  This would cause an immediate
kernel panic on most architectures.  We'll revert this when the following
bug report has been resolved:
https://github.com/llvm/llvm-project/issues/41896.

Link: https://lkml.kernel.org/r/20221125120750.3537134-1-lee@kernel.org
Link: https://lkml.kernel.org/r/20221125120750.3537134-2-lee@kernel.org
Signed-off-by: Lee Jones <lee@kernel.org>
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: David Airlie <airlied@gmail.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Lee Jones <lee@kernel.org>
Cc: Leo Li <sunpeng.li@amd.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Tom Rix <trix@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

6f6cb171

drm/i915: fix TLB invalidation for Gen12 video and compute engines · 04aa6437

Committed by Andrzej Hajda on Nov 14, 2022

In case of Gen12 video and compute engines, TLB_INV registers are masked -
to modify one bit, corresponding bit in upper half of the register must
be enabled, otherwise nothing happens.
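
The masked-register behaviour described above can be modelled in a few lines of C. This is only an illustrative sketch, not the driver code: masked_reg_write() and masked_bit_enable() are hypothetical names, and the model assumes a 32-bit register whose upper 16 bits act as per-bit write enables for the lower 16 bits.

```c
#include <stdint.h>

/*
 * Hypothetical model of a 32-bit "masked" register: the upper 16 bits of
 * the written value select which of the lower 16 bits may change.
 */
static uint32_t masked_reg_write(uint32_t reg, uint32_t val)
{
	uint32_t mask = val >> 16;      /* write-enable bits */
	uint32_t bits = val & 0xffffu;  /* requested values for the low half */

	/* Keep unmasked bits as they were, take masked bits from the write. */
	return (reg & ~mask & 0xffffu) | (bits & mask);
}

/* Build a write value that sets `bit` and also enables its mask bit. */
static uint32_t masked_bit_enable(uint32_t bit)
{
	return (bit << 16) | bit;
}
```

Writing a bare bit without its mask bit leaves the modelled register unchanged, which matches the "otherwise nothing happens" case in the commit message.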

CVE: CVE-2022-4139
Suggested-by: Chris Wilson <chris.p.wilson@intel.com>
Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Fixes: 7938d615 ("drm/i915: Flush TLBs before releasing backing store")
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

04aa6437

29 Nov, 2022 4 commits

drm/i915: Never return 0 if not all requests retired · 12b8b046

Committed by Janusz Krzysztofik on Nov 21, 2022

Users of intel_gt_retire_requests_timeout() expect 0 return value on
success.  However, we have no protection from passing back 0 potentially
returned by a call to dma_fence_wait_timeout() when it succeeds right
after its timeout has expired.

Replace 0 with -ETIME before potentially using the timeout value as return
code, so -ETIME is returned if there are still some requests not retired
after timeout, 0 otherwise.
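
A minimal sketch of the return-value rule described above, written as plain C rather than the actual driver code. retire_result() is a hypothetical helper; it assumes `timeout` holds the value left over from waiting (0 meaning it expired) and `active_count` is the number of requests still not retired.

```c
#include <errno.h>

/*
 * Hypothetical stand-in for the tail of the retire function.
 * 0 is reserved for "all requests retired"; an expired timeout with
 * requests still pending is reported as -ETIME instead of 0.
 */
static long retire_result(long timeout, unsigned int active_count)
{
	if (!active_count)
		return 0;                       /* everything retired: success */

	return timeout ? timeout : -ETIME;  /* never 0 while requests remain */
}
```

Any non-zero leftover value is passed through unchanged in this sketch, so only the ambiguous 0 case is rewritten.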

v3: Use conditional expression, more compact but also better reflecting
    intention standing behind the change.

v2: Move the added lines down so flush_submission() is not affected.

Fixes: f33a8a51 ("drm/i915: Merge wait_for_timelines with retire_request")
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Cc: stable@vger.kernel.org # v5.5+
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221121145655.75141-3-janusz.krzysztofik@linux.intel.com
(cherry picked from commit f301a29f)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

12b8b046

drm/i915: Fix negative value passed as remaining time · a8899b87

Committed by Janusz Krzysztofik on Nov 21, 2022

Commit b97060a9 ("drm/i915/guc: Update intel_gt_wait_for_idle to work
with GuC") extended the API of intel_gt_retire_requests_timeout() with an
extra argument 'remaining_timeout', intended for passing back unconsumed
portion of requested timeout when 0 (success) is returned. However, when
request retirement happens to succeed despite an error returned by a call
to dma_fence_wait_timeout(), that error code (a negative value) is passed
back instead of remaining time. If we then pass that negative value
forward as requested timeout to intel_uc_wait_for_idle(), an explicit BUG
will be triggered.

If request retirement succeeds but an error code is passed back via
remaining_timeout, we may have no clue on how much of the initial timeout
might have been left for spending it on waiting for GuC to become idle.
OTOH, since all pending requests have been successfully retired, that
error code has been already ignored by intel_gt_retire_requests_timeout(),
then we shouldn't fail.

Assume no more time has been left on error and pass 0 timeout value to
intel_uc_wait_for_idle() to give it a chance to return success if GuC is
already idle.
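
The caller-side rule above amounts to clamping a negative value to zero before using it as a timeout. A hedged sketch, with sanitize_remaining_timeout() as a hypothetical helper name that does not appear in the driver:

```c
/*
 * Hypothetical helper: a negative value passed back via remaining_timeout
 * is an error code, not a duration, so treat it as "no time left".
 */
static long sanitize_remaining_timeout(long remaining)
{
	return remaining < 0 ? 0 : remaining;
}
```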

v3: Don't fail on any error passed back via remaining_timeout.

v2: Fix the issue on the caller side, not the provider.

Fixes: b97060a9 ("drm/i915/guc: Update intel_gt_wait_for_idle to work with GuC")
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Cc: stable@vger.kernel.org # v5.15+
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221121145655.75141-2-janusz.krzysztofik@linux.intel.com
(cherry picked from commit f235dbd5)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

a8899b87

drm/i915: Remove non-existent pipes from bigjoiner pipe mask · 3c1ea6a5

Committed by Ville Syrjälä on Nov 18, 2022

bigjoiner_pipes() doesn't consider that:
- RKL only has three pipes
- some pipes may be fused off

This means that intel_atomic_check_bigjoiner() won't reject
all configurations that would need a non-existent pipe.
Instead we just keep on rolling without actually having
reserved the slave pipe we need.

It's possible that we don't outright explode anywhere due to
this since eg. for_each_intel_crtc_in_pipe_mask() will only
walk the crtcs we've registered even though the passed in
pipe_mask asks for more of them. But clearly the thing won't
do what is expected of it when the required pipes are not
present.

Fix the problem by consulting the device info pipe_mask already
in bigjoiner_pipes().
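
The idea of the fix can be sketched as a simple bitmask intersection. This is an illustration under assumptions, not the i915 implementation: the enum, the BIT() macro and usable_bigjoiner_pipes() are defined locally for the example, and the pipe masks are made-up values.

```c
#include <stdint.h>

#define BIT(n) (1u << (n))

enum pipe { PIPE_A, PIPE_B, PIPE_C, PIPE_D };

/*
 * Hypothetical helper: restrict the pipes a bigjoiner configuration may
 * use to those the device actually has (missing or fused-off pipes are
 * simply absent from device_pipe_mask).
 */
static uint8_t usable_bigjoiner_pipes(uint8_t wanted, uint8_t device_pipe_mask)
{
	return wanted & device_pipe_mask;
}
```

With a three-pipe device mask, a request that includes PIPE_D is trimmed, so a later check can reject the configuration instead of proceeding with a pipe that does not exist.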

Cc: stable@vger.kernel.org
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221118185201.10469-1-ville.syrjala@linux.intel.com
Reviewed-by: Arun R Murthy <arun.r.murthy@intel.com>
(cherry picked from commit f1c87a94)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

3c1ea6a5

drm/i915/mtl: Fix dram info readout · 2f383054

Committed by Radhakrishna Sripada on Nov 17, 2022

MEM_SS_INFO_GLOBAL Register info read from the hardware is cached in val. However
the variable is being modified when determining the DRAM type thereby clearing out
the channels and qgv info extracted later in the function xelpdp_get_dram_info. Preserve
the register value and use extracted fields in the switch statement.
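
The pattern described above, keeping the cached register value intact and decoding each field into its own variable, might look like the following sketch. The field masks and shifts here are invented for illustration and do not reflect the real MEM_SS_INFO_GLOBAL layout; decode_dram_info() and struct dram_fields are hypothetical names.

```c
#include <stdint.h>

/* Hypothetical field layout, for illustration only. */
#define DDR_TYPE_MASK       0x0000000fu
#define NUM_CHANNELS_MASK   0x000000f0u
#define NUM_CHANNELS_SHIFT  4
#define NUM_QGV_MASK        0x00000f00u
#define NUM_QGV_SHIFT       8

struct dram_fields {
	uint32_t type;
	uint32_t channels;
	uint32_t qgv_points;
};

/* Decode every field from the same untouched register value. */
static struct dram_fields decode_dram_info(uint32_t val)
{
	struct dram_fields f;

	f.type = val & DDR_TYPE_MASK;
	f.channels = (val & NUM_CHANNELS_MASK) >> NUM_CHANNELS_SHIFT;
	f.qgv_points = (val & NUM_QGV_MASK) >> NUM_QGV_SHIFT;

	return f;
}
```

Because `val` is never overwritten, the order in which the fields are extracted no longer matters.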

Fixes: 825477e7 ("drm/i915/mtl: Obtain SAGV values from MMIO instead of GT pcode mailbox")
Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221117213015.584417-1-radhakrishna.sripada@intel.com
(cherry picked from commit ec35c41d)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

2f383054

23 Nov, 2022 12 commits

drm/amdgpu/vcn: re-use original vcn0 doorbell value · ecb41b71

Committed by Jane Jian on Nov 16, 2022

The root cause is that S2A needs to use the deduct offset flag.
After setting this flag, the vcn0 doorbell value works,
so return it to what it was before.
Signed-off-by: Jane Jian <Jane.Jian@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

ecb41b71

drm/amdgpu: Partially revert "drm/amdgpu: update drm_display_info correctly when the edid is read" · 602ad43c

Committed by Alex Deucher on Nov 21, 2022

This partially reverts 20543be9.

Calling drm_connector_update_edid_property() in
amdgpu_connector_free_edid() causes a noticeable pause in
the system every 10 seconds on polled outputs so revert this
part of the change.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2257
Cc: Claudio Suarez <cssk@net-c.es>
Acked-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

602ad43c

drm/amd/display: No display after resume from WB/CB · a6e1775d

Committed by Tsung-hua Lin on Nov 09, 2022

[why]
The first MST sideband message returns AUX_RET_ERROR_HPD_DISCON
on certain Intel platforms. An AUX transaction is considered a failure
if HPD is unexpectedly pulled low. The AUX transaction actually succeeds
in such a case, hence do not return an error.

[how]
Do not return an error when AUX_RET_ERROR_HPD_DISCON is detected
on the first sideband message.

v2: squash in fix (Alex)
Reviewed-by: Jerry Zuo <Jerry.Zuo@amd.com>
Acked-by: Brian Chang <Brian.Chang@amd.com>
Signed-off-by: Tsung-hua Lin <Tsung-hua.Lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

a6e1775d

drm/amdgpu: fix use-after-free during gpu recovery · 3cb93f39

Committed by Stanley.Yang on Nov 16, 2022

[Why]
    [  754.862560] refcount_t: underflow; use-after-free.
    [  754.862898] Call Trace:
    [  754.862903]  <TASK>
    [  754.862913]  amdgpu_job_free_cb+0xc2/0xe1 [amdgpu]
    [  754.863543]  drm_sched_main.cold+0x34/0x39 [amd_sched]

[How]
    The fw_fence may not be initialized; check whether dma_fence_init
    has been performed before freeing the job.
Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

3cb93f39

drm/amd/pm: update driver if header for smu_13_0_7 · f2e1aa26

Committed by lyndonli on Nov 21, 2022

update driver if header for smu_13_0_7
Signed-off-by: lyndonli <Lyndon.Li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.0.x

f2e1aa26

drm/amd/display: Fix rotated cursor offset calculation · a26a54fb

Committed by David Galiffi on Nov 10, 2022

[Why]
Underflow is observed when the cursor is still enabled while the cursor
rectangle is outside the bounds of its surface viewport.

[How]
Update parameters used to determine when cursor should be disabled.
Reviewed-by: Martin Leung <Martin.Leung@amd.com>
Acked-by: Brian Chang <Brian.Chang@amd.com>
Signed-off-by: David Galiffi <David.Galiffi@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

a26a54fb

drm/amd/display: Use new num clk levels struct for max mclk index · e667ee3b

Committed by Dillon Varone on Nov 11, 2022

[WHY?]
When calculating watermark and dlg values, the max mclk level index and
associated speed are needed to find the correlated dummy latency value.
Currently the incorrect index is given due to a clock manager refactor.

[HOW?]
Use num_memclk_level from num_entries_per_clk struct for getting the correct max
mem speed.
Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Brian Chang <Brian.Chang@amd.com>
Signed-off-by: Dillon Varone <Dillon.Varone@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

e667ee3b

drm/amd/display: Avoid setting pixel rate divider to N/A · 2a5dd86a

Committed by Taimur Hassan on Nov 11, 2022

[Why]
Pixel rate divider values should never be set to N/A (0xF) as the K1/K2
field is only 1/2 bits wide.

[How]
Set valid divider values for virtual and FRL/DP2 cases.
Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: Brian Chang <Brian.Chang@amd.com>
Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

2a5dd86a

drm/amd/display: Use viewport height for subvp mall allocation size · dd2c028c

Committed by Dillon Varone on Nov 10, 2022

[WHY?]
MALL allocation size depends on the viewport height, not the addressable
vertical lines, which will not match when scaling.

[HOW?]
Base MALL allocation size calculations off viewport height.
Reviewed-by: Alvin Lee <Alvin.Lee2@amd.com>
Reviewed-by: Martin Leung <Martin.Leung@amd.com>
Acked-by: Brian Chang <Brian.Chang@amd.com>
Signed-off-by: Dillon Varone <Dillon.Varone@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

dd2c028c

drm/amd/display: Update soc bounding box for dcn32/dcn321 · 5d82c82f

Committed by Dillon Varone on Nov 07, 2022

[Description]
New values for soc bounding box and dummy pstate.
Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Brian Chang <Brian.Chang@amd.com>
Signed-off-by: Dillon Varone <Dillon.Varone@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.0.x

5d82c82f

drm/amd/dc/dce120: Fix audio register mapping, stop triggering KASAN · 44035ec2

Committed by Lyude Paul on Nov 14, 2022

There's been a very long running bug that seems to have been neglected for
a while, where amdgpu consistently triggers a KASAN error at start:

  BUG: KASAN: global-out-of-bounds in read_indirect_azalia_reg+0x1d4/0x2a0 [amdgpu]
  Read of size 4 at addr ffffffffc2274b28 by task modprobe/1889

After digging through amd's rather creative method for accessing registers,
I eventually discovered the problem likely has to do with the fact that on
my dce120 GPU there are supposedly 7 sets of audio registers. But we only
define a register mapping for 6 sets.

So, fix this and fix the KASAN warning finally.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: stable@vger.kernel.org
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

44035ec2

drm/amdgpu/psp: don't free PSP buffers on suspend · 4f2bea62

Committed by Alex Deucher on Nov 16, 2022

We can reuse the same buffers on resume.

v2: squash in S4 fix from Shikai

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2213
Reviewed-by: Christian König <christian.koenig@amd.com>
Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

4f2bea62

22 Nov, 2022 10 commits

drm/amd/amdgpu: reserve vm invalidation engine for firmware · 91abf28a

Committed by Jack Xiao on Nov 16, 2022

If MES is enabled, reserve VM invalidation engine 5 for firmware.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.0.x

91abf28a

drm/amdgpu: Enable Aldebaran devices to report CU Occupancy · b9ab82da

Committed by Ramesh Errabolu on Nov 16, 2022

Allow user to know number of compute units (CU) that are in use at any
given moment. Enable access to the method kgd_gfx_v9_get_cu_occupancy
that computes CU occupancy.
Signed-off-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

b9ab82da

drm/amdgpu: fix userptr HMM range handling v2 · 4458da0b

Committed by Christian König on Nov 10, 2022

The basic problem here is that it's not allowed to page fault while
holding the reservation lock.

So it can happen that multiple processes try to validate a userptr
at the same time.

Work around that by putting the HMM range object into the mutex
protected bo list for now.

v2: make sure range is set to NULL in case of an error
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
CC: stable@vger.kernel.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

4458da0b

drm/amdgpu: always register an MMU notifier for userptr · b39df63b

Committed by Christian König on Nov 09, 2022

Since switching to HMM we always need that because we no longer grab
references to the pages.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
CC: stable@vger.kernel.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

b39df63b

drm/amdgpu/dm/mst: Fix uninitialized var in pre_compute_mst_dsc_configs_for_state() · 85ef1679

Committed by Lyude Paul on Nov 18, 2022

Coverity noticed this one, so let's fix it.

Fixes: ba891436 ("drm/amdgpu/mst: Stop ignoring error codes and deadlocking")
Signed-off-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Cc: stable@vger.kernel.org # v5.6+

85ef1679

drm/amdgpu/dm/dp_mst: Don't grab mst_mgr->lock when computing DSC state · d60b82aa

Committed by Lyude Paul on Nov 14, 2022

Now that we've fixed the issue with using the incorrect topology manager,
we're actually grabbing the topology manager's lock - and consequently
deadlocking. Luckily for us though, there's actually nothing in AMD's DSC
state computation code that really should need this lock. The one exception
is the mutex_lock() in dm_dp_mst_is_port_support_mode(), however we grab no
locks beneath &mgr->lock there so that should be fine to leave be.

Gitlab issue: https://gitlab.freedesktop.org/drm/amd/-/issues/2171
Signed-off-by: Lyude Paul <lyude@redhat.com>
Fixes: 8c20a1ed ("drm/amd/display: MST DSC compute fair share")
Cc: <stable@vger.kernel.org> # v5.6+
Reviewed-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

d60b82aa

drm/amdgpu/dm/mst: Use the correct topology mgr pointer in amdgpu_dm_connector · dfbc0041

Committed by Lyude Paul on Nov 14, 2022

This bug hurt me. Basically, it appears that we've been grabbing the
entirely wrong mutex in the MST DSC computation code for amdgpu! While
we've been grabbing:

  amdgpu_dm_connector->mst_mgr

That's zero-initialized memory, because the only connectors we'll ever
actually be doing DSC computations for are MST ports. Which have mst_mgr
zero-initialized, and instead have the correct topology mgr pointer located
at:

  amdgpu_dm_connector->mst_port->mgr;

I'm a bit impressed that until now, this code has managed not to crash
anyone's systems! It does seem to cause a warning in LOCKDEP though:

  [   66.637670] DEBUG_LOCKS_WARN_ON(lock->magic != lock)

This was causing the problems that appeared to have been introduced by:

  commit 4d07b0bc ("drm/display/dp_mst: Move all payload info into the atomic state")

This wasn't actually where they came from though. Presumably, before the
only thing we were doing with the topology mgr pointer was attempting to
grab mst_mgr->lock. Since the above commit however, we grab much more
information from mst_mgr including the atomic MST state and respective
modesetting locks.

This patch also implies that up until now, it's quite likely we could be
susceptible to race conditions when going through the MST topology state
for DSC computations since we technically will not have grabbed any lock
when going through it.

So, let's fix this by adjusting all the respective code paths to look at
the right pointer and skip things that aren't actual MST connectors from a
topology.

Gitlab issue: https://gitlab.freedesktop.org/drm/amd/-/issues/2171
Signed-off-by: Lyude Paul <lyude@redhat.com>
Fixes: 8c20a1ed ("drm/amd/display: MST DSC compute fair share")
Cc: <stable@vger.kernel.org> # v5.6+
Reviewed-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

dfbc0041

drm/display/dp_mst: Fix drm_dp_mst_add_affected_dsc_crtcs() return code · 2f3a1273

Committed by Lyude Paul on Nov 14, 2022

It looks like we're accidentally dropping a pretty important return code
here. For some reason, we just return -EINVAL if we fail to get the MST
topology state. This is wrong: error codes are important and should never
be squashed without being handled, which here seems to have the potential
to cause a deadlock.
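
The difference between squashing and propagating an error code can be shown with a small stand-alone sketch. All names here are hypothetical and the lookup is simulated; the point is only that the caller should see the original code (for example -EDEADLK, which callers may need in order to back off and retry) rather than a blanket -EINVAL.

```c
#include <errno.h>

/* Hypothetical lookup: returns 0 on success or a negative errno. */
static int get_topology_state(int simulated_result)
{
	return simulated_result;
}

/* Squashing variant: every failure is reported as -EINVAL. */
static int add_affected_squashed(int simulated_result)
{
	if (get_topology_state(simulated_result))
		return -EINVAL;   /* original error code is lost */
	return 0;
}

/* Propagating variant: the caller sees the real error code. */
static int add_affected_propagated(int simulated_result)
{
	int ret = get_topology_state(simulated_result);

	if (ret)
		return ret;       /* e.g. -EDEADLK reaches the caller intact */
	return 0;
}
```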
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Wayne Lin <Wayne.Lin@amd.com>
Fixes: 8ec04671 ("drm/dp_mst: Add helper to trigger modeset on affected DSC MST CRTCs")
Cc: <stable@vger.kernel.org> # v5.6+
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

2f3a1273

drm/amdgpu/mst: Stop ignoring error codes and deadlocking · ba891436

Committed by Lyude Paul on Nov 14, 2022

It appears that amdgpu makes the mistake of completely ignoring the return
values from the DP MST helpers, and instead just returns a simple
true/false. In this case, it seems to have come back to bite us because as
a result of simply returning false from
compute_mst_dsc_configs_for_state(), amdgpu had no way of telling when a
deadlock happened from these helpers. This could definitely result in some
kernel splats.

V2:
* Address Wayne's comments (fix another bunch of spots where we weren't
  passing down return codes)
Signed-off-by: Lyude Paul <lyude@redhat.com>
Fixes: 8c20a1ed ("drm/amd/display: MST DSC compute fair share")
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: <stable@vger.kernel.org> # v5.6+
Reviewed-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

ba891436

drm/amd/display: Align dcn314_smu logging with other DCNs · 3ca68238

Committed by Roman Li on Nov 14, 2022

[Why]
Assert on non-OK response from SMU is unnecessary.
It was replaced with respective log message on other asics
in the past with commit:
"drm/amd/display: Removing assert statements for Linux"

[How]
Remove assert and add dbg logging as on other DCNs.
Signed-off-by: Roman Li <roman.li@amd.com>
Reviewed-by: Saaem Rizvi <SyedSaaem.Rizvi@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

3ca68238

21 Nov, 2022 2 commits

drm/i915: Fix warn in intel_display_power_*_domain() functions · ebbaa439

Committed by Imre Deak on Nov 14, 2022

The intel_display_power_*_domain() functions should always warn if a
default domain is returned as a fallback, fix this up. Spotted by Ville.

Fixes: 979e1b32 ("drm/i915: Sanitize the port -> DDI/AUX power domain mapping for each platform")
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Jouni Högander <jouni.hogander@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221114122251.21327-2-imre.deak@intel.com
(cherry picked from commit 10b85f0e)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

ebbaa439

drm/i915/ttm: never purge busy objects · 00a6c36c

Committed by Matthew Auld on Nov 15, 2022

In i915_gem_madvise_ioctl() we immediately purge the object if it is not
currently used, like when the mm.pages are NULL.  With shmem the pages
might still be hanging around or are perhaps swapped out. Similarly with
ttm we might still have the pages hanging around on the ttm resource,
like with lmem or shmem, but here we need to be extra careful since
async unbinds are possible as well as in-progress kernel moves. In
i915_ttm_purge() we expect the pipeline-gutting to nuke the ttm resource
for us, however if it's busy the memory is only moved to a ghost object,
which then leads to broken behaviour when for example clearing the
i915_tt->filp, since the actual ttm_tt is still alive and populated,
even though it's been moved to the ghost object.  When we later destroy
the ghost object we hit the following, since the filp is now NULL:

[  +0.006982] #PF: supervisor read access in kernel mode
[  +0.005149] #PF: error_code(0x0000) - not-present page
[  +0.005147] PGD 11631d067 P4D 11631d067 PUD 115972067 PMD 0
[  +0.005676] Oops: 0000 [#1] PREEMPT SMP NOPTI
[  +0.012962] Workqueue: events ttm_device_delayed_workqueue [ttm]
[  +0.006022] RIP: 0010:i915_ttm_tt_unpopulate+0x3a/0x70 [i915]
[  +0.005879] Code: 89 fb 48 85 f6 74 11 8b 55 4c 48 8b 7d 30 45 31 c0 31 c9 e8 18 6a e5 e0 80 7d 60 00 74 20 48 8b 45 68
8b 55 08 4c 89 e7 5b 5d <48> 8b 40 20 83 e2 01 41 5c 89 d1 48 8b 70
 30 e9 42 b2 ff ff 4c 89
[  +0.018782] RSP: 0000:ffffc9000bf6fd70 EFLAGS: 00010202
[  +0.005244] RAX: 0000000000000000 RBX: ffff8883e12ae380 RCX: 0000000000000000
[  +0.007150] RDX: 000000008000000e RSI: ffffffff823559b4 RDI: ffff8883e12ae3c0
[  +0.007142] RBP: ffff888103b65d48 R08: 0000000000000001 R09: 0000000000000001
[  +0.007144] R10: 0000000000000001 R11: ffff88829c2c8040 R12: ffff8883e12ae3c0
[  +0.007148] R13: 0000000000000001 R14: ffff888115184140 R15: ffff888115184248
[  +0.007154] FS:  0000000000000000(0000) GS:ffff88844db00000(0000) knlGS:0000000000000000
[  +0.008108] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  +0.005763] CR2: 0000000000000020 CR3: 000000013fdb4004 CR4: 00000000003706e0
[  +0.007152] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  +0.007145] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  +0.007154] Call Trace:
[  +0.002459]  <TASK>
[  +0.002126]  ttm_tt_unpopulate.part.0+0x17/0x70 [ttm]
[  +0.005068]  ttm_bo_tt_destroy+0x1c/0x50 [ttm]
[  +0.004464]  ttm_bo_cleanup_memtype_use+0x25/0x40 [ttm]
[  +0.005244]  ttm_bo_cleanup_refs+0x90/0x2c0 [ttm]
[  +0.004721]  ttm_bo_delayed_delete+0x235/0x250 [ttm]
[  +0.004981]  ttm_device_delayed_workqueue+0x13/0x40 [ttm]
[  +0.005422]  process_one_work+0x248/0x560
[  +0.004028]  worker_thread+0x4b/0x390
[  +0.003682]  ? process_one_work+0x560/0x560
[  +0.004199]  kthread+0xeb/0x120
[  +0.003163]  ? kthread_complete_and_exit+0x20/0x20
[  +0.004815]  ret_from_fork+0x1f/0x30

v2:
 - Just use ttm_bo_wait() directly (Niranjana)
 - Add testcase reference

Testcase: igt@gem_madvise@dontneed-evict-race
Fixes: 213d5092 ("drm/i915/ttm: Introduce a TTM i915 gem object backend")
Reported-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Andrzej Hajda <andrzej.hajda@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Cc: <stable@vger.kernel.org> # v5.15+
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Acked-by: Nirmoy Das <Nirmoy.Das@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221115104620.120432-1-matthew.auld@intel.com
(cherry picked from commit 5524b5e5)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

00a6c36c

19 Nov, 2022 1 commit

drm/amdgpu: handle gang submit before VMID · b09d6acb

Committed by Christian König on Nov 18, 2022

Otherwise it can happen that not all gang members can get a VMID
assigned and we deadlock.
Signed-off-by: Christian König <christian.koenig@amd.com>
Tested-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Fixes: 68ce8b24 ("drm/amdgpu: add gang submit backend v2")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221118153023.312582-1-christian.koenig@amd.com

b09d6acb

18 Nov, 2022 1 commit

gpu: host1x: Avoid trying to use GART on Tegra20 · c2418f91

Committed by Robin Murphy on Oct 20, 2022

Since commit c7e3ca51 ("iommu/tegra: gart: Do not register with
bus") quite some time ago, the GART driver has effectively disabled
itself to avoid issues with the GPU driver expecting it to work in ways
that it doesn't. As of commit 57365a04 ("iommu: Move bus setup to
IOMMU device registration") that bodge no longer works, but really the
GPU driver should be responsible for its own behaviour anyway. Make the
workaround explicit.
Reported-by: Jon Hunter <jonathanh@nvidia.com>
Suggested-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>

c2418f91

16 Nov, 2022 6 commits

drm/display: Don't assume dual mode adaptors support i2c sub-addressing · 5954acba

Committed by Simon Rettberg on Oct 06, 2022

Current dual mode adaptor ("DP++") detection code assumes that all
adaptors support i2c sub-addressing for read operations from the
DP-HDMI adaptor ID buffer. It has been observed that multiple
adaptors do not in fact support this, and always return data starting
at register 0. On affected adaptors, the code fails to read the proper
registers that would identify the device as a type 2 adaptor, and
handles those as type 1, limiting the TMDS clock to 165MHz, even if
the according register would announce a higher TMDS clock.
Fix this by always reading the ID buffer starting from offset 0, and
discarding any bytes before the actual offset of interest.
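
The read strategy described above, always reading from offset 0 and throwing away the leading bytes, can be sketched as follows. This is not the DRM helper itself: the transport callback type, read_at_offset(), the 256-byte bound and the fake adaptor are all assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ID_BUF_MAX 256

/* Hypothetical transport: always returns data starting at register 0. */
typedef int (*read_from_zero_fn)(void *ctx, uint8_t *dst, size_t len);

/*
 * Read `len` bytes located at `offset` without relying on sub-addressing:
 * fetch offset + len bytes from the start, then discard the prefix.
 */
static int read_at_offset(read_from_zero_fn rd, void *ctx,
			  size_t offset, uint8_t *dst, size_t len)
{
	uint8_t tmp[ID_BUF_MAX];
	int ret;

	if (offset > ID_BUF_MAX || len > ID_BUF_MAX - offset)
		return -1;

	ret = rd(ctx, tmp, offset + len);
	if (ret)
		return ret;

	memcpy(dst, tmp + offset, len);
	return 0;
}

/* Fake adaptor for demonstration: register i holds the value i. */
static int fake_adaptor_read(void *ctx, uint8_t *dst, size_t len)
{
	(void)ctx;
	for (size_t i = 0; i < len; i++)
		dst[i] = (uint8_t)i;
	return 0;
}
```

The cost is a slightly longer transfer per read, in exchange for working with adaptors that ignore the requested start register.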

We tried finding authoritative documentation on whether or not this is
allowed behaviour, but since all the official VESA docs are paywalled,
the best we could come up with was the spec sheet for Texas Instruments'
SNx5DP149 chip family.[1] It explicitly mentions that sub-addressing is
supported for register writes, but *not* for reads (See NOTE in
section 8.5.3). Unless TI openly decided to violate the VESA spec, one
could take that as a hint that sub-addressing is in fact not mandated
by VESA.
The other two adaptors affected used the PS8409(A) and the LT8611,
according to the data returned from their ID buffers.

[1] https://www.ti.com/lit/ds/symlink/sn75dp149.pdf

Cc: stable@vger.kernel.org
Signed-off-by: Simon Rettberg <simon.rettberg@rz.uni-freiburg.de>
Reviewed-by: Rafael Gieschke <rafael.gieschke@rz.uni-freiburg.de>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221006113314.41101987@computer
Acked-by: Jani Nikula <jani.nikula@intel.com>

5954acba

drm/amd/pm: fix SMU13 runpm hang due to unintentional workaround · 4b14841c

Committed by Evan Quan on Nov 08, 2022

The workaround designed for some specific ASICs is wrongly applied
to SMU13 ASICs. That leads to runpm hangs.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

4b14841c

drm/amd/pm: enable runpm support over BACO for SMU13.0.7 · df7c013e

Committed by Evan Quan on Nov 08, 2022

Enable SMU13.0.7 runpm support.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.0.x

df7c013e

drm/amd/pm: enable runpm support over BACO for SMU13.0.0 · 8652da45

Committed by Evan Quan on Nov 08, 2022

Enable SMU13.0.0 runpm support.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.0.x

8652da45

drm/amdgpu: there is no vbios fb on devices with no display hw (v2) · f8794f31

Committed by Alex Deucher on Nov 11, 2022

If we enable virtual display functionality on parts with
no display hardware we can end up trying to check for and
reserve the vbios FB area on devices where it doesn't exist.
Check if display hardware is actually present on the hardware
before trying to reserve the memory.

v2: move the check into common code
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

f8794f31

drm/amdkfd: Fix a memory limit issue · 6f9eea43

Committed by Eric Huang on Nov 14, 2022

This resolves a regression in which the application fails to
allocate VRAM because no free memory is reported. The cause
is the added check of vram_pin_size against the memory limit:
the application pins the memory for Peerdirect, and KFD
should not count it in the memory limit. Removing
vram_pin_size from the check resolves it.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

6f9eea43

openeuler / Kernel · synced successfully almost 2 years ago