提交 · 6e3f3c239ee547c5b55a85f467c92a6ba7eee83a · openeuler / Kernel

13 6月, 2022 2 次提交

drm/i915/gt: Fix memory leaks in per-gt sysfs · 6e3f3c23

由 Ashutosh Dixit 提交于 5月 25, 2022

All kmalloc'd kobjects need a kobject_put() to free memory. For example in
previous code, kobj_gt_release() never gets called. The requirement of
kobject_put() now results in a slightly different code organization.

v2: s/gtn/gt/ (Andi)

Fixes: b770bcfa ("drm/i915/gt: create per-tile sysfs interface")
Signed-off-by: NAshutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: NAndi Shyti <andi.shyti@linux.intel.com>
Acked-by: NAndrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/a6f6686517c85fba61a0c45097f5bb4fe7e257fb.1653484574.git.ashutosh.dixit@intel.com
(cherry picked from commit 69d6bf5c)
Signed-off-by: NJani Nikula <jani.nikula@intel.com>

6e3f3c23

drm/i915/reset: Fix error_state_read ptr + offset use · c9b576d0

由 Alan Previn 提交于 3月 10, 2022

Fix our pointer offset usage in error_state_read
when there is no i915_gpu_coredump but buf offset
is non-zero.

This fixes a kernel page fault can happen when
multiple tests are running concurrently in a loop
and one is producing engine resets and consuming
the i915 error_state dump while the other is
forcing full GT resets. (takes a while to trigger).

The dmesg call trace:

[ 5590.803000] BUG: unable to handle page fault for address:
               ffffffffa0b0e000
[ 5590.803009] #PF: supervisor read access in kernel mode
[ 5590.803013] #PF: error_code(0x0000) - not-present page
[ 5590.803016] PGD 5814067 P4D 5814067 PUD 5815063 PMD 109de4067
               PTE 0
[ 5590.803022] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 5590.803026] CPU: 5 PID: 13656 Comm: i915_hangman Tainted: G U
                    5.17.0-rc5-ups69-guc-err-capt-rev6+ #136
[ 5590.803033] Hardware name: Intel Corporation Alder Lake Client
                    Platform/AlderLake-M LP4x RVP, BIOS ADLPFWI1.R00.
                    3031.A02.2201171222	01/17/2022
[ 5590.803039] RIP: 0010:memcpy_erms+0x6/0x10
[ 5590.803045] Code: fe ff ff cc eb 1e 0f 1f 00 48 89 f8 48 89 d1
                     48 c1 e9 03 83 e2 07 f3 48 a5 89 d1 f3 a4 c3
                     66 0f 1f 44 00 00 48 89 f8 48 89 d1 <f3> a4
                     c3 0f 1f 80 00 00 00 00 48 89 f8 48 83 fa 20
                     72 7e 40 38 fe
[ 5590.803054] RSP: 0018:ffffc90003a8fdf0 EFLAGS: 00010282
[ 5590.803057] RAX: ffff888107ee9000 RBX: ffff888108cb1a00
               RCX: 0000000000000f8f
[ 5590.803061] RDX: 0000000000001000 RSI: ffffffffa0b0e000
               RDI: ffff888107ee9071
[ 5590.803065] RBP: 0000000000000000 R08: 0000000000000001
               R09: 0000000000000001
[ 5590.803069] R10: 0000000000000001 R11: 0000000000000002
               R12: 0000000000000019
[ 5590.803073] R13: 0000000000174fff R14: 0000000000001000
               R15: ffff888107ee9000
[ 5590.803077] FS: 00007f62a99bee80(0000) GS:ffff88849f880000(0000)
               knlGS:0000000000000000
[ 5590.803082] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 5590.803085] CR2: ffffffffa0b0e000 CR3: 000000010a1a8004
               CR4: 0000000000770ee0
[ 5590.803089] PKRU: 55555554
[ 5590.803091] Call Trace:
[ 5590.803093] <TASK>
[ 5590.803096] error_state_read+0xa1/0xd0 [i915]
[ 5590.803175] kernfs_fop_read_iter+0xb2/0x1b0
[ 5590.803180] new_sync_read+0x116/0x1a0
[ 5590.803185] vfs_read+0x114/0x1b0
[ 5590.803189] ksys_read+0x63/0xe0
[ 5590.803193] do_syscall_64+0x38/0xc0
[ 5590.803197] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 5590.803201] RIP: 0033:0x7f62aaea5912
[ 5590.803204] Code: c0 e9 b2 fe ff ff 50 48 8d 3d 5a b9 0c 00 e8 05
                     19 02 00 0f 1f 44 00 00 f3 0f 1e fa 64 8b 04 25
                     18 00 00 00 85 c0 75 10 0f 05 <48> 3d 00 f0 ff
                     ff 77 56 c3 0f 1f 44 00 00 48 83 ec 28 48 89 54 24
[ 5590.803213] RSP: 002b:00007fff5b659ae8 EFLAGS: 00000246
               ORIG_RAX: 0000000000000000
[ 5590.803218] RAX: ffffffffffffffda RBX: 0000000000100000
               RCX: 00007f62aaea5912
[ 5590.803221] RDX: 000000000008b000 RSI: 00007f62a8c4000f
               RDI: 0000000000000006
[ 5590.803225] RBP: 00007f62a8bcb00f R08: 0000000000200010
               R09: 0000000000101000
[ 5590.803229] R10: 0000000000000001 R11: 0000000000000246
               R12: 0000000000000006
[ 5590.803233] R13: 0000000000075000 R14: 00007f62a8acb010
               R15: 0000000000200000
[ 5590.803238] </TASK>
[ 5590.803240] Modules linked in: i915 ttm drm_buddy drm_dp_helper
                        drm_kms_helper syscopyarea sysfillrect sysimgblt
                        fb_sys_fops prime_numbers nfnetlink br_netfilter
                        overlay mei_pxp mei_hdcp x86_pkg_temp_thermal
                        coretemp kvm_intel snd_hda_codec_hdmi snd_hda_intel
                        snd_intel_dspcfg snd_hda_codec snd_hwdep
                        snd_hda_core snd_pcm mei_me mei fuse ip_tables
                        x_tables crct10dif_pclmul e1000e crc32_pclmul ptp
                        i2c_i801 ghash_clmulni_intel i2c_smbus pps_core
                        [last unloa ded: ttm]
[ 5590.803277] CR2: ffffffffa0b0e000
[ 5590.803280] ---[ end trace 0000000000000000 ]---

Fixes: 0e39037b ("drm/i915: Cache the error string")
Signed-off-by: NAlan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: NJohn Harrison <John.C.Harrison@Intel.com>
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220311004311.514198-2-alan.previn.teres.alexis@intel.com
(cherry picked from commit 3304033a)
Signed-off-by: NJani Nikula <jani.nikula@intel.com>

c9b576d0

10 6月, 2022 1 次提交

drm: imx: fix compiler warning with gcc-12 · 7aefd8b5

由 Linus Torvalds 提交于 6月 08, 2022

Gcc-12 correctly warned about this code using a non-NULL pointer as a
truth value:

  drivers/gpu/drm/imx/ipuv3-crtc.c: In function ‘ipu_crtc_disable_planes’:
  drivers/gpu/drm/imx/ipuv3-crtc.c:72:21: error: the comparison will always evaluate as ‘true’ for the address of ‘plane’ will never be NULL [-Werror=address]
     72 |                 if (&ipu_crtc->plane[1] && plane == &ipu_crtc->plane[1]->base)
        |                     ^

due to the extraneous '&' address-of operator.

Philipp Zabel points out that The mistake had no adverse effect since
the following condition doesn't actually dereference the NULL pointer,
but the intent of the code was obviously to check for it, not to take
the address of the member.

Fixes: eb8c8880 ("drm/imx: add deferred plane disabling")
Acked-by: NPhilipp Zabel <p.zabel@pengutronix.de>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

7aefd8b5

09 6月, 2022 2 次提交

drm/ast: Support multiple outputs · 477277c7

由 Thomas Zimmermann 提交于 6月 07, 2022

Systems with AST graphics can have multiple output; typically VGA
plus some other port. Record detected output chips in a bitmask and
initialize each output on its own.

Assume a VGA output by default and use SIL164 and DP501 if available.
For ASTDP assume that it can run in parallel with VGA.

Tested on AST2100.

v3:
	* define a macro for each BIT(ast_tx_chip) (Patrik)
v2:
	* make VGA/SIL164/DP501 mutually exclusive
Signed-off-by: NThomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: NPatrik Jakobsson <patrik.r.jakobsson@gmail.com>
Fixes: a59b0264 ("drm/ast: Initialize encoder and connector for VGA in helper function")
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Javier Martinez Canillas <javierm@redhat.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: dri-devel@lists.freedesktop.org
Link: https://patchwork.freedesktop.org/patch/msgid/20220607092008.22123-2-tzimmermann@suse.de
(cherry picked from commit 7f35680a)
Signed-off-by: NThomas Zimmermann <tzimmermann@suse.de>

477277c7

drm/amdgpu/mes: only invalid/prime icache when finish loading both pipe MES FWs. · 431d0712

由 Yifan Zhang 提交于 6月 03, 2022

invalid/prime icahce operation takes effect both pipes cuconrrently,
therefore CP_MES_IC_BASE_LO/HI and CP_MES_MDBASE_LO/HI both have to be
set before prime icache. Otherwise MES hardware gets garbage data in
above regsters and causes page fault

[  470.873200] amdgpu 0000:33:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:217 vmid:0 pasid:0, for process  pid 0 thread  pid 0)
[  470.873222] amdgpu 0000:33:00.0: amdgpu:   in page starting at address 0x000092cb89b00000 from client 10
[  470.873234] amdgpu 0000:33:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000BB3
[  470.873242] amdgpu 0000:33:00.0: amdgpu:      Faulty UTCL2 client ID: CPC (0x5)
[  470.873247] amdgpu 0000:33:00.0: amdgpu:      MORE_FAULTS: 0x1
[  470.873251] amdgpu 0000:33:00.0: amdgpu:      WALKER_ERROR: 0x1
[  470.873256] amdgpu 0000:33:00.0: amdgpu:      PERMISSION_FAULTS: 0xb
[  470.873260] amdgpu 0000:33:00.0: amdgpu:      MAPPING_ERROR: 0x1
[  470.873264] amdgpu 0000:33:00.0: amdgpu:      RW: 0x0
Signed-off-by: NYifan Zhang <yifan1.zhang@amd.com>
Acked-by: NAlex Deucher <alexander.deucher@amd.com>
Reviewed-by: NTim Huang <Tim.Huang@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

431d0712

08 6月, 2022 7 次提交

drm/amdgpu/jpeg2: Add jpeg vmid update under IB submit · 578eb317

由 Mohammad Zafar Ziya 提交于 6月 07, 2022

Add jpeg vmid update under IB submit
Signed-off-by: NMohammad Zafar Ziya <Mohammadzafar.ziya@amd.com>
Acked-by: NChristian König <christian.koenig@amd.com>
Reviewed-by: NLijo Lazar <lijo.lazar@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

578eb317

drm/amdgpu: always flush the TLB on gfx8 · 84205d00

由 Christian König 提交于 6月 03, 2022

The TLB on GFX8 stores each block of 8 PTEs where any of the valid bits
are set.

Fixes: 5255e146 ("drm/amdgpu: rework TLB flushing")
Reviewed-by: NAlex Deucher <alexander.deucher@amd.com>
Tested-by: NMichal Kubecek <mkubecek@suse.cz>
Signed-off-by: NChristian König <christian.koenig@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

84205d00

drm/amdgpu: fix limiting AV1 to the first instance on VCN3 · 1d2afeb7

由 Christian König 提交于 6月 03, 2022

The job is not yet initialized here.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2037Reviewed-by: NAlex Deucher <alexander.deucher@amd.com>
Tested-by: NPierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: NChristian König <christian.koenig@amd.com>
Fixes: cdc7893f ("drm/amdgpu: use job and ib structures directly in CS parsers")
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

1d2afeb7

drm/amdkfd:Fix fw version for 10.3.6 · a956a11e

由 Jesse Zhang 提交于 6月 07, 2022

fix fw error when loading fw for 10.3.6
Signed-off-by: NJesse Zhang <Jesse.Zhang@amd.com>
Reviewed-by: NAlex Deucher <alexander.deucher@amd.com>
Reviewed-by: NMario Limonciello <mario.limonciello@amd.com>
Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 5.18.x

a956a11e

drm/amdgpu: Add MODE register to wave debug info in gfx11 · b3f9234e

由 Joseph Greathouse 提交于 6月 06, 2022

All other chips, from gfx6-gfx10, now include the MODE register at the
end of the wave debug state. This appears to have been missed in gfx11,
so this patch adds in MODE to the debug state for gfx11.
Signed-off-by: NJoseph Greathouse <Joseph.Greathouse@amd.com>
Reviewed-by: NAlex Deucher <alexander.deucher@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

b3f9234e

Revert "drm/amd/display: Pass the new context into disable OTG WA" · 8b8ce2b9

由 Nicholas Kazlauskas 提交于 5月 17, 2022

This reverts commit 8440f575.

Causes a hang when hotplugging DP, shutting down system, or
enabling dual eDP.
Reviewed-by: NDmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Acked-by: NHamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: NNicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Tested-by: NDaniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

8b8ce2b9

Revert "drm/amdgpu: Ensure the DMA engine is deactivated during set ups" · 41782d70

由 Guchun Chen 提交于 6月 06, 2022

This reverts commit b992a190.

This causes regression in GPU reset related test.

Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: ricetons@gmail.com
Signed-off-by: NGuchun Chen <guchun.chen@amd.com>
Reviewed-by: NAlex Deucher <alexander.deucher@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

41782d70

07 6月, 2022 2 次提交

drm/atomic: Force bridge self-refresh-exit on CRTC switch · e54a4424

由 Brian Norris 提交于 2月 28, 2022

It's possible to change which CRTC is in use for a given
connector/encoder/bridge while we're in self-refresh without fully
disabling the connector/encoder/bridge along the way. This can confuse
the bridge encoder/bridge, because
(a) it needs to track the SR state (trying to perform "active"
    operations while the panel is still in SR can be Bad(TM)); and
(b) it tracks the SR state via the CRTC state (and after the switch, the
    previous SR state is lost).

Thus, we need to either somehow carry the self-refresh state over to the
new CRTC, or else force an encoder/bridge self-refresh transition during
such a switch.

I choose the latter, so we disable the encoder (and exit PSR) before
attaching it to the new CRTC (where we can continue to assume a clean
(non-self-refresh) state).

This fixes PSR issues seen on Rockchip RK3399 systems with
drivers/gpu/drm/bridge/analogix/analogix_dp_core.c.

Change in v2:

- Drop "->enable" condition; this could possibly be "->active" to
  reflect the intended hardware state, but it also is a little
  over-specific. We want to make a transition through "disabled" any
  time we're exiting PSR at the same time as a CRTC switch.
  (Thanks Liu Ying)

Cc: Liu Ying <victor.liu@oss.nxp.com>
Cc: <stable@vger.kernel.org>
Fixes: 1452c25b ("drm: Add helpers to kick off self refresh mode in drivers")
Signed-off-by: NBrian Norris <briannorris@chromium.org>
Reviewed-by: NSean Paul <seanpaul@chromium.org>
Signed-off-by: NDouglas Anderson <dianders@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20220228122522.v2.2.Ic15a2ef69c540aee8732703103e2cff51fb9c399@changeid

e54a4424

drm/bridge: analogix_dp: Support PSR-exit to disable transition · ca871659

由 Brian Norris 提交于 2月 28, 2022

Most eDP panel functions only work correctly when the panel is not in
self-refresh. In particular, analogix_dp_bridge_disable() tends to hit
AUX channel errors if the panel is in self-refresh.

Given the above, it appears that so far, this driver assumes that we are
never in self-refresh when it comes time to fully disable the bridge.
Prior to commit 846c7dfc ("drm/atomic: Try to preserve the crtc
enabled state in drm_atomic_remove_fb, v2."), this tended to be true,
because we would automatically disable the pipe when framebuffers were
removed, and so we'd typically disable the bridge shortly after the last
display activity.

However, that is not guaranteed: an idle (self-refresh) display pipe may
be disabled, e.g., when switching CRTCs. We need to exit PSR first.

Stable notes: this is definitely a bugfix, and the bug has likely
existed in some form for quite a while. It may predate the "PSR helpers"
refactor, but the code looked very different before that, and it's
probably not worth rewriting the fix.

Cc: <stable@vger.kernel.org>
Fixes: 6c836d96 ("drm/rockchip: Use the helpers for PSR")
Signed-off-by: NBrian Norris <briannorris@chromium.org>
Reviewed-by: NSean Paul <seanpaul@chromium.org>
Signed-off-by: NDouglas Anderson <dianders@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20220228122522.v2.1.I161904be17ba14526f78536ccd78b85818449b51@changeid

ca871659

04 6月, 2022 7 次提交

drm/amdgpu: suppress the compile warning about 64 bit type · 12d6c18c

由 Evan Quan 提交于 5月 30, 2022

Suppress the compile warning below:
drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c:1292
gfx_v11_0_rlc_backdoor_autoload_copy_ucode() warn: should '1 << id' be a 64 bit type?
Reported-by: Nkernel test robot <lkp@intel.com>
Reported-by: NDan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: NEvan Quan <evan.quan@amd.com>
Reviewed-by: NGuchun Chen <guchun.chen@amd.com>
Acked-by: NChristian König <christian.koenig@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

12d6c18c

drm/amd/pm: suppress compile warnings about possible unaligned accesses · 4513edf7

由 Evan Quan 提交于 5月 30, 2022

Suppress the following compile warnings:
>> drivers/gpu/drm/amd/amdgpu/../pm/swsmu/inc/smu_v11_0_pptable.h:163:17:
warning: field smc_pptable within 'struct smu_11_0_powerplay_table' is
less aligned than 'PPTable_t' and is usually due to 'struct smu_11_0_powerplay_table'
being packed, which can lead to unaligned accesses [-Wunaligned-access]
         PPTable_t smc_pptable;                        //PPTable_t in smu11_driver_if.h
                   ^
   1 warning generated.
--
>> drivers/gpu/drm/amd/amdgpu/../pm/swsmu/inc/smu_v11_0_7_pptable.h:193:17:
warning: field smc_pptable within 'struct smu_11_0_7_powerplay_table' is
less aligned than 'PPTable_t' and is usually due to 'struct smu_11_0_7_powerplay_table'
being packed, which can lead to unaligned accesses [-Wunaligned-access]
         PPTable_t smc_pptable;                        //PPTable_t in smu11_driver_if.h
                   ^
   1 warning generated.
--
>> drivers/gpu/drm/amd/amdgpu/../pm/swsmu/inc/smu_v13_0_pptable.h:161:12:
warning: field smc_pptable within 'struct smu_13_0_powerplay_table' is less aligned than
'PPTable_t' and is usually due to 'struct smu_13_0_powerplay_table' being packed, which
can lead to unaligned accesses [-Wunaligned-access]
Signed-off-by: NEvan Quan <evan.quan@amd.com>
Reported-by: Nkernel test robot <lkp@intel.com>
Reviewed-by: NAlex Deucher <alexander.deucher@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

4513edf7

drm/amdkfd: Fix partial migration bugs · 88467db6

由 Philip Yang 提交于 6月 03, 2022

Migration range from system memory to VRAM, if system page can not be
locked or unmapped, we do partial migration and leave some pages in
system memory. Several bugs found to copy pages and update GPU mapping
for this situation:

1. copy to vram should use migrate->npage which is total pages of range
as npages, not migrate->cpages which is number of pages can be migrated.

2. After partial copy, set VRAM res cursor as j + 1, j is number of
system pages copied plus 1 page to skip copy.

3. copy to ram, should collect all continuous VRAM pages and copy
together.

4. Call amdgpu_vm_update_range, should pass in offset as bytes, not
as number of pages.
Signed-off-by: NPhilip Yang <Philip.Yang@amd.com>
Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

88467db6

drm/amdkfd: add pinned BOs to kfd_bo_list · 4fac4fcf

由 Lang Yu 提交于 5月 31, 2022

The kfd_bo_list is used to restore process BOs after
evictions. As page tables could be destroyed during
evictions, we should also update pinned BOs' page tables
during restoring to make sure they are valid.

So for pinned BOs,
1, Validate them and update their page tables.
2, Don't add eviction fence for them.

v2:
 - Don't handle pinned ones specially in BO validation.(Felix)
Signed-off-by: NLang Yu <Lang.Yu@amd.com>
Acked-by: NChristian König <christian.koenig@amd.com>
Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

4fac4fcf

drm/amdgpu: Update PDEs flush TLB if PTB/PDB moved · 4d1e5f12

由 Philip Yang 提交于 6月 01, 2022

Flush TLBs when existing PDEs are updated because a PTB or PDB moved,
but avoids unnecessary TLB flushes when new PDBs or PTBs are added to
the page table, which commonly happens when memory is mapped for the
first time.
Suggested-by: NChristian König <christian.koenig@amd.com>
Signed-off-by: NPhilip Yang <Philip.Yang@amd.com>
Reviewed-by: NChristian König <christian.koenig@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

4d1e5f12

drm/amdgpu: enable tmz by default for GC 10.3.7 · 37101730

由 Sunil Khatri 提交于 5月 30, 2022

Add IP GC 10.3.7 in the list of target to have
tmz enabled by default.
Signed-off-by: NSunil Khatri <sunil.khatri@amd.com>
Reviewed-by: NAlexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 5.18.x

37101730

drm/amdkfd: Add GC 10.3.6 and 10.3.7 KFD definitions · 7c4f4f19

由 Mario Limonciello 提交于 5月 31, 2022

Loading amdgpu on GC 10.3.7 shows an ERR level message:
`kfd kfd: amdgpu: GC IP 0a0307 not supported in kfd`

Add these targets to match yellow carp structures.
Reported-by: NDavid Chang <david.chang@amd.com>
Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com>
Tested-by: NJesse(Jie) Zhang <Jesse.Zhang@amd.com>
Signed-off-by: NMario Limonciello <mario.limonciello@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 5.18.x

7c4f4f19

03 6月, 2022 3 次提交

drm/amd/pm: use bitmap_{from,to}_arr32 where appropriate · 525d6515

由 Yury Norov 提交于 4月 28, 2022

The smu_v1X_0_set_allowed_mask() uses bitmap_copy() to convert
bitmap to 32-bit array. This may be wrong due to endiannes issues.
Fix it by switching to bitmap_{from,to}_arr32.

CC: Alexander Gordeev <agordeev@linux.ibm.com>
CC: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
CC: Christian Borntraeger <borntraeger@linux.ibm.com>
CC: Claudio Imbrenda <imbrenda@linux.ibm.com>
CC: David Hildenbrand <david@redhat.com>
CC: Heiko Carstens <hca@linux.ibm.com>
CC: Janosch Frank <frankja@linux.ibm.com>
CC: Rasmus Villemoes <linux@rasmusvillemoes.dk>
CC: Sven Schnelle <svens@linux.ibm.com>
CC: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: NYury Norov <yury.norov@gmail.com>

525d6515

drm/i915/pmu: replace cpumask_weight with cpumask_empty where appropriate · a37e94fe

由 Yury Norov 提交于 1月 23, 2022

i915_pmu_cpu_online() calls cpumask_weight() to check if any bit of a
given cpumask is set. We can do it more efficiently with cpumask_empty()
because cpumask_empty() stops traversing the cpumask as soon as it finds
first set bit, while cpumask_weight() counts all bits unconditionally.
Signed-off-by: NYury Norov <yury.norov@gmail.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>

a37e94fe

LoongArch: Add writecombine support for drm · 439057ec

由 Huacai Chen 提交于 5月 31, 2022

LoongArch maintains cache coherency in hardware, but its WUC attribute
(Weak-ordered UnCached, which is similar to WC) is out of the scope of
cache coherency machanism. This means WUC can only used for write-only
memory regions.

Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: dri-devel@lists.freedesktop.org
Reviewed-by: NWANG Xuerui <git@xen0n.name>
Reviewed-by: NJiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: NHuacai Chen <chenhuacai@loongson.cn>

439057ec

02 6月, 2022 16 次提交

drm/msm/dpu: Move min BW request and full BW disable back to mdss · b9364eed

由 Douglas Anderson 提交于 5月 31, 2022

In commit a670ff57 ("drm/msm/dpu: always use mdp device to scale
bandwidth") we fully moved interconnect stuff to the DPU driver. This
had no change for sc7180 but _did_ have an impact for other SoCs. It
made them match the sc7180 scheme.

Unfortunately, the sc7180 scheme seems like it was a bit broken.
Specifically the interconnect needs to be on for more than just the
DPU driver's AXI bus. In the very least it also needs to be on for the
DSI driver's AXI bus. This can be seen fairly easily by doing this on
a ChromeOS sc7180-trogdor class device:

set_power_policy --ac_screen_dim_delay=5 --ac_screen_off_delay=10
sleep 10
cd /sys/bus/platform/devices/ae94000.dsi/power
echo on > control

When you do that, you'll get a warning splat in the logs about
"gcc_disp_hf_axi_clk status stuck at 'off'".

One could argue that perhaps what I have done above is "illegal" and
that it can't happen naturally in the system because in normal system
usage the DPU is pretty much always on when DSI is on. That being
said:
* In official ChromeOS builds (admittedly a 5.4 kernel with backports)
we have seen that splat at bootup.
* Even though we don't use "autosuspend" for these components, we
don't use the "put_sync" variants. Thus plausibly the DSI could stay
"runtime enabled" past when the DPU is enabled. Techncially we
shouldn't do that if the DPU's suspend ends up yanking our clock.

Let's change things such that the "bare minimum" request for the
interconnect happens in the mdss driver again. That means that all of
the children can assume that the interconnect is on at the minimum
bandwidth. We'll then let the DPU request the higher amount that it
wants.

It should be noted that this isn't as hacky of a solution as it might
initially appear. Specifically:
* Since MDSS and DPU individually get their own references to the
interconnect then the framework will actually handle aggregating
them. The two drivers are _not_ clobbering each other.
* When the Qualcomm interconnect driver aggregates it takes the max of
all the peaks. Thus having MDSS request a peak, as we're doing here,
won't actually change the total interconnect bandwidth (it won't be
added to the request for the DPU). This perhaps explains why the
"average" requested in MDSS was historically 0 since that one
_would_ be added in.

NOTE also that in the downstream ChromeOS 5.4 and 5.15 kernels, we're
also seeing some RPMH hangs that are addressed by this fix. These
hangs are showing up in the field and on _some_ devices with enough
stress testing of suspend/resume. Specifically right at suspend time
with a stack crawl that looks like this (from chromeos-5.15 tree):
rpmh_write_batch+0x19c/0x240
qcom_icc_bcm_voter_commit+0x210/0x420
qcom_icc_set+0x28/0x38
apply_constraints+0x70/0xa4
icc_set_bw+0x150/0x24c
dpu_runtime_resume+0x50/0x1c4
pm_generic_runtime_resume+0x30/0x44
__genpd_runtime_resume+0x68/0x7c
genpd_runtime_resume+0x12c/0x20c
__rpm_callback+0x98/0x138
rpm_callback+0x30/0x88
rpm_resume+0x370/0x4a0
__pm_runtime_resume+0x80/0xb0
dpu_kms_enable_commit+0x24/0x30
msm_atomic_commit_tail+0x12c/0x630
commit_tail+0xac/0x150
drm_atomic_helper_commit+0x114/0x11c
drm_atomic_commit+0x68/0x78
drm_atomic_helper_disable_all+0x158/0x1c8
drm_atomic_helper_suspend+0xc0/0x1c0
drm_mode_config_helper_suspend+0x2c/0x60
msm_pm_prepare+0x2c/0x40
pm_generic_prepare+0x30/0x44
genpd_prepare+0x80/0xd0
device_prepare+0x78/0x17c
dpm_prepare+0xb0/0x384
dpm_suspend_start+0x34/0xc0

We don't completely understand all the mechanisms in play, but the
hang seemed to come and go with random factors. It's not terribly
surprising that the hang is gone after this patch since the line of
code that was failing is no longer present in the kernel.

Fixes: a670ff57 ("drm/msm/dpu: always use mdp device to scale bandwidth")
Fixes: c33b7c03 ("drm/msm/dpu: add support for clk and bw scaling for display")
Signed-off-by: NDouglas Anderson <dianders@chromium.org>
Reviewed-by: NAbhinav Kumar <quic_abhinavk@quicinc.com>
Tested-by: Jessica Zhang <quic_jesszhan@quicinc.com> # RB3 (sdm845) and
Reviewed-by: NStephen Boyd <swboyd@chromium.org>
Reviewed-by: NDmitry Baryshkov <dmitry.baryshkov@linaro.org>
Patchwork: https://patchwork.freedesktop.org/patch/487884/
Link: https://lore.kernel.org/r/20220531160059.v2.1.Ie7f6d4bf8cce28131da31a43354727e417cae98d@changeidSigned-off-by: NAbhinav Kumar <quic_abhinavk@quicinc.com>

b9364eed

drm/msm/dpu: Fix pointer dereferenced before checking · 8caad14e

由 Haowen Bai 提交于 5月 30, 2022

The phys_enc->wb_idx is dereferencing before null checking, so move
it after checking.
Signed-off-by: NHaowen Bai <baihaowen@meizu.com>
Reviewed-by: NDmitry Baryshkov <dmitry.baryshkov@linaro.org>
Fixes: d7d0e73f ("drm/msm/dpu: introduce the dpu_encoder_phys_* for
Reviewed-by: NAbhinav Kumar <quic_abhinavk@quicinc.com>
Patchwork: https://patchwork.freedesktop.org/patch/487606/
Link: https://lore.kernel.org/r/1653877196-23114-1-git-send-email-baihaowen@meizu.comSigned-off-by: NAbhinav Kumar <quic_abhinavk@quicinc.com>

8caad14e

drm/msm/dpu: Remove unused code · fb0af2da

由 Jiapeng Chong 提交于 5月 24, 2022

Eliminate the follow clang warning:

drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c:544:33: warning: variable
‘mode’ set but not used [-Wunused-but-set-variable].
Reported-by: NAbaci Robot <abaci@linux.alibaba.com>
Signed-off-by: NJiapeng Chong <jiapeng.chong@linux.alibaba.com>
Fixes: 3177589c("drm/msm/dpu: encoder: drop unused mode_fixup callback")
Reviewed-by: NAbhinav Kumar <quic_abhinavk@quicinc.com>
Patchwork: https://patchwork.freedesktop.org/patch/487136/
Link: https://lore.kernel.org/r/20220524081413.37895-1-jiapeng.chong@linux.alibaba.comSigned-off-by: NAbhinav Kumar <quic_abhinavk@quicinc.com>

fb0af2da

drm/msm/disp/dpu1: remove superfluous init · 6daf7e4a

由 Vinod Koul 提交于 5月 25, 2022

Commit 58dca981 ("drm/msm/disp/dpu1: Add support for DSC in
encoder") added dsc_common_mode variable which was set to zero but then
again programmed, so drop the superfluous init.

Fixes: 58dca981 ("drm/msm/disp/dpu1: Add support for DSC in encoder")
Reported-by: Nkernel test robot <yujie.liu@intel.com>
Reviewed-by: NDmitry Baryshkov <dmitry.baryshkov@linaro.org>
Reviewed-by: NAbhinav Kumar <quic_abhinavk@quicinc.com>
Signed-off-by: NVinod Koul <vkoul@kernel.org>
Patchwork: https://patchwork.freedesktop.org/patch/487208/
Link: https://lore.kernel.org/r/20220525073912.2706505-1-vkoul@kernel.orgSigned-off-by: NAbhinav Kumar <quic_abhinavk@quicinc.com>

6daf7e4a

drm/msm/dp: Always clear mask bits to disable interrupts at dp_ctrl_reset_irq_ctrl() · 993a2adc

由 Kuogee Hsieh 提交于 5月 17, 2022

dp_catalog_ctrl_reset() will software reset DP controller. But it will
not reset programmable registers to default value. DP driver still have
to clear mask bits to interrupt status registers to disable interrupts
after software reset of controller.

At current implementation, dp_ctrl_reset_irq_ctrl() will software reset dp
controller but did not call dp_catalog_ctrl_enable_irq(false) to clear hpd
related interrupt mask bits to disable hpd related interrupts due to it
mistakenly think hpd related interrupt mask bits will be cleared by software
reset of dp controller automatically. This mistake may cause system to crash
during suspending procedure due to unexpected irq fired and trigger event
thread to access dp controller registers with controller clocks are disabled.

This patch fixes system crash during suspending problem by removing "enable"
flag condition checking at dp_ctrl_reset_irq_ctrl() so that hpd related
interrupt mask bits are cleared to prevent unexpected from happening.

Changes in v2:
-- add more details commit text

Changes in v3:
-- add synchrons_irq()
-- add atomic_t suspended

Changes in v4:
-- correct Fixes's commit ID
-- remove synchrons_irq()

Changes in v5:
-- revise commit text

Changes in v6:
-- add event_lock to protect "suspended"

Changes in v7:
-- delete "suspended" flag

Fixes: 989ebe7b ("drm/msm/dp: do not initialize phy until plugin interrupt received")
Signed-off-by: NKuogee Hsieh <quic_khsieh@quicinc.com>
Reviewed-by: NStephen Boyd <swboyd@chromium.org>
Patchwork: https://patchwork.freedesktop.org/patch/486591/
Link: https://lore.kernel.org/r/1652804494-19650-1-git-send-email-quic_khsieh@quicinc.comSigned-off-by: NAbhinav Kumar <quic_abhinavk@quicinc.com>

993a2adc

drm/amdkfd: Use mmget_not_zero in MMU notifier · fa582c6f

由 Philip Yang 提交于 5月 26, 2022

MMU notifier callback may pass in mm with mm->mm_users==0 when process
is exiting, use mmget_no_zero to avoid accessing invalid mm in deferred
list work after mm is gone.
Signed-off-by: NPhilip Yang <Philip.Yang@amd.com>
Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

fa582c6f

drm/amdgpu: Resolve RAS GFX error count issue after cold boot on Arcturus · 2a460963

由 Candice Li 提交于 6月 01, 2022

Adjust the sequence for ras late init and separate ras reset error status
from query status.

v2: squash in fix from Candice
Signed-off-by: NCandice Li <candice.li@amd.com>
Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

2a460963

drm/amdgpu: fix ras supported check · 28caf8c4

由 Stanley.Yang 提交于 5月 31, 2022

Fix aldebaran ras supported check on SRIOV guest side,
the previous check conditicon block all ras feature
on baremetal
Signed-off-by: NStanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

28caf8c4

drm/amd/display: remove stale config guards · fd843d03

由 Aurabindo Pillai 提交于 4月 14, 2022

This code should be executed.
Signed-off-by: NAurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

fd843d03

drm/amdgpu: make gfx_v11_0_rlc_stop static · 8365ed22

由 sunliming 提交于 5月 29, 2022

This symbol is not used outside of gfx_v11_0.c, so marks it static.

Fixes the following w1 warning:

drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c:1945:6: warning: no previous
prototype for function 'gfx_v11_0_rlc_stop' [-Wmissing-prototypes].
Reported-by: Nkernel test robot <lkp@intel.com>
Signed-off-by: Nsunliming <sunliming@kylinos.cn>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

8365ed22

drm/amdgpu: fix a missing break in gfx_v11_0_handle_priv_fault · 418214dd

由 sunliming 提交于 5月 29, 2022

Fixes the following w1 warning:

drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c:5873:2: warning: unannotated
fall-through between switch labels [-Wimplicit-fallthrough].
Reported-by: Nkernel test robot <lkp@intel.com>
Signed-off-by: Nsunliming <sunliming@kylinos.cn>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

418214dd

drm/amdgpu: fix aper_base for APU · ae969b62

由 Roman Li 提交于 5月 25, 2022

[Why]
Wrong fb offset results in dmub f/w errors and white screen.
[drm:dc_dmub_srv_wait_idle [amdgpu]] *ERROR* Error waiting for DMUB idle: status=3

[How]
Read aper_base from mmhub because GC is off by default

v2: use BAR for passthrough (Alex)
Signed-off-by: NRoman Li <Roman.Li@amd.com>
Reviewed-by: NAlex Deucher <alexander.deucher@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

ae969b62

drm/amdgpu: update VCN codec support for Yellow Carp · 97e50305

由 Alex Deucher 提交于 5月 26, 2022

Supports AV1.  Mesa already has support for this and
doesn't rely on the kernel caps for yellow carp, so
this was already working from an application perspective.

Fixes: 55439817 ("amdgpu/nv.c - Added video codec support for Yellow Carp")
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2002Reviewed-by: NLeo Liu <leo.liu@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

97e50305

drm/amdgpu: make program_imu_rlc_ram static · 11594fa1

由 Jiapeng Chong 提交于 5月 25, 2022

This symbol is not used outside of imu_v11_0.c, so marks it
static.

Fixes the following w1 warning:

drivers/gpu/drm/amd/amdgpu/imu_v11_0.c:302:6: warning: no previous
prototype for ‘program_imu_rlc_ram’ [-Wmissing-prototypes].
Reported-by: NAbaci Robot <abaci@linux.alibaba.com>
Signed-off-by: NJiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

11594fa1

drm/amd/display: 3.2.187 · 06754184

由 Aric Cyr 提交于 5月 15, 2022

This version brings along the following fixes:

* Changes to DP LT fallback behavior to more closely match the DP standard
* Added new interfaces for lut pipeline
* Restore ref_dtblck value when clk struct is cleared in init_clocks
* Fixes DMUB outbox trace in S4
* Fixes lingering DIO FIFO errors when DIO no longer enabled
* Reads Golden Settings Table from VBIOS
Acked-by: NJasdeep Dhillon <jdhillon@amd.com>
Signed-off-by: NAric Cyr <aric.cyr@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

06754184

drm/amd/display: Fix possible infinite loop in DP LT fallback · 583ad888

由 Ilya 提交于 2月 07, 2022

[Why]
It's possible for some fallback scenarios to result in infinite looping
during link training.

[How]
This change modifies DP LT fallback behavior to more closely match the
DP standard. Keep track of the link rate during the EQ_FAIL fallback,
and use it as the maximum link rate for the CR sequence.
Reviewed-by: NWenjing Liu <Wenjing.Liu@amd.com>
Tested-by: NDaniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: NIlya <Ilya.Bakoulin@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

583ad888

openeuler / Kernel 接近 2 年 前同步成功

openeuler / Kernel
接近 2 年前同步成功