提交 · 3ede8f72a9a2825efca23a3552e80a1202ea88fd · openeuler / Kernel

03 6月, 2021 4 次提交

nvme-tcp: allow selecting the network interface for connections · 3ede8f72

由 Martin Belanger 提交于 5月 20, 2021

In our application, we need a way to force TCP connections to go out a
specific IP interface instead of letting Linux select the interface
based on the routing tables.

Add the 'host-iface' option to allow specifying the interface to use.
When the option host-iface is specified, the driver uses the specified
interface to set the option SO_BINDTODEVICE on the TCP socket before
connecting.

This new option is needed in addtion to the existing host-traddr for
the following reasons:

Specifying an IP interface by its associated IP address is less
intuitive than specifying the actual interface name and, in some cases,
simply doesn't work. That's because the association between interfaces
and IP addresses is not predictable. IP addresses can be changed or can
change by themselves over time (e.g. DHCP). Interface names are
predictable [1] and will persist over time. Consider the following
configuration.

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state ...
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 100.0.0.100/24 scope global lo
       valid_lft forever preferred_lft forever
2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc ...
    link/ether 08:00:27:21:65:ec brd ff:ff:ff:ff:ff:ff
    inet 100.0.0.100/24 scope global enp0s3
       valid_lft forever preferred_lft forever
3: enp0s8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc ...
    link/ether 08:00:27:4f:95:5c brd ff:ff:ff:ff:ff:ff
    inet 100.0.0.100/24 scope global enp0s8
       valid_lft forever preferred_lft forever

The above is a VM that I configured with the same IP address
(100.0.0.100) on all interfaces. Doing a reverse lookup to identify the
unique interface associated with 100.0.0.100 does not work here. And
this is why the option host_iface is required. I understand that the
above config does not represent a standard host system, but I'm using
this to prove a point: "We can never know how users will configure
their systems". By te way, The above configuration is perfectly fine
by Linux.

The current TCP implementation for host_traddr performs a
bind()-before-connect(). This is a common construct to set the source
IP address on a TCP socket before connecting. This has no effect on how
Linux selects the interface for the connection. That's because Linux
uses the Weak End System model as described in RFC1122 [2]. On the other
hand, setting the Source IP Address has benefits and should be supported
by linux-nvme. In fact, setting the Source IP Address is a mandatory
FedGov requirement (e.g. connection to a RADIUS/TACACS+ server).
Consider the following configuration.

$ ip addr list dev enp0s8
3: enp0s8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc ...
    link/ether 08:00:27:4f:95:5c brd ff:ff:ff:ff:ff:ff
    inet 192.168.56.101/24 brd 192.168.56.255 scope global enp0s8
       valid_lft 426sec preferred_lft 426sec
    inet 192.168.56.102/24 scope global secondary enp0s8
       valid_lft forever preferred_lft forever
    inet 192.168.56.103/24 scope global secondary enp0s8
       valid_lft forever preferred_lft forever
    inet 192.168.56.104/24 scope global secondary enp0s8
       valid_lft forever preferred_lft forever

Here we can see that several addresses are associated with interface
enp0s8. By default, Linux always selects the default IP address,
192.168.56.101, as the source address when connecting over interface
enp0s8. Some users, however, want the ability to specify a different
source address (e.g., 192.168.56.102, 192.168.56.103, ...). The option
host_traddr can be used as-is to perform this function.

In conclusion, I believe that we need 2 options for TCP connections.
One that can be used to specify an interface (host-iface). And one that
can be used to set the source address (host-traddr). Users should be
allowed to use one or the other, or both, or none. Of course, the
documentation for host_traddr will need some clarification. It should
state that when used for TCP connection, this option only sets the
source address. And the documentation for host_iface should say that
this option is only available for TCP connections.

References:
[1] https://www.freedesktop.org/wiki/Software/systemd/PredictableNetworkInterfaceNames/
[2] https://tools.ietf.org/html/rfc1122

Tested both IPv4 and IPv6 connections.
Signed-off-by: NMartin Belanger <martin.belanger@dell.com>
Reviewed-by: NSagi Grimberg <sagi@grimberg.me>
Reviewed-by: NHannes Reinecke <hare@suse.de>
Signed-off-by: NChristoph Hellwig <hch@lst.de>

3ede8f72

nvme-pci: look for StorageD3Enable on companion ACPI device instead · e21e0243

由 Mario Limonciello 提交于 5月 28, 2021

The documentation around the StorageD3Enable property hints that it
should be made on the PCI device. This is where newer AMD systems set
the property and it's required for S0i3 support.

So rather than look for nodes of the root port only present on Intel
systems, switch to the companion ACPI device for all systems.
David Box from Intel indicated this should work on Intel as well.

Link: https://lore.kernel.org/linux-nvme/YK6gmAWqaRmvpJXb@google.com/T/#m900552229fa455867ee29c33b854845fce80ba70
Link: https://docs.microsoft.com/en-us/windows-hardware/design/component-guidelines/power-management-for-storage-hardware-devices-intro
Fixes: df4f9bc4 ("nvme-pci: add support for ACPI StorageD3Enable property")
Suggested-by: NLiang Prike <Prike.Liang@amd.com>
Acked-by: NRaul E Rangel <rrangel@chromium.org>
Signed-off-by: NMario Limonciello <mario.limonciello@amd.com>
Reviewed-by: NDavid E. Box <david.e.box@linux.intel.com>
Signed-off-by: NChristoph Hellwig <hch@lst.de>

e21e0243

nvme: extend and modify the APST configuration algorithm · ebd8a93a

由 Alexey Bogoslavsky 提交于 4月 28, 2021

The algorithm that was used until now for building the APST configuration
table has been found to produce entries with excessively long ITPT
(idle time prior to transition) for devices declaring relatively long
entry and exit latencies for non-operational power states. This leads
to unnecessary waste of power and, as a result, failure to pass
mandatory power consumption tests on Chromebook platforms.

The new algorithm is based on two predefined ITPT values and two
predefined latency tolerances. Based on these values, as well as on
exit and entry latencies reported by the device, the algorithm looks
for up to 2 suitable non-operational power states to use as primary
and secondary APST transition targets. The predefined values are
supplied to the nvme driver as module parameters:

 - apst_primary_timeout_ms (default: 100)
 - apst_secondary_timeout_ms (default: 2000)
 - apst_primary_latency_tol_us (default: 15000)
 - apst_secondary_latency_tol_us (default: 100000)

The algorithm echoes the approach used by Intel's and Microsoft's drivers
on Windows. The specific default parameter values are also based on those
drivers. Yet, this patch doesn't introduce the ability to dynamically
regenerate the APST table in the event of switching the power source from
AC to battery and back. Adding this functionality may be considered in the
future. In the meantime, the timeouts and tolerances reflect a compromise
between values used by Microsoft for AC and battery scenarios.

In most NVMe devices the new algorithm causes them to implement a more
aggressive power saving policy. While beneficial in most cases, this
sometimes comes at the price of a higher IO processing latency in certain
scenarios as well as at the price of a potential impact on the drive's
endurance (due to more frequent context saving when entering deep non-
operational states). So in order to provide a fallback for systems where
these regressions cannot be tolerated, the patch allows to revert to
the legacy behavior by setting either apst_primary_timeout_ms or
apst_primary_latency_tol_us parameter to 0. Eventually (and possibly after
fine tuning the default values of the module parameters) the legacy behavior
can be removed.

TESTING.

The new algorithm has been extensively tested. Initially, simulations were
used to compare APST tables generated by old and new algorithms for a wide
range of devices. After that, power consumption, performance and latencies
were measured under different workloads on devices from multiple vendors
(WD, Intel, Samsung, Hynix, Kioxia). Below is the description of the tests
and the findings.

General observations.
The effect the patch has on the APST table varies depending on the entry and
exit latencies advertised by the devices. For some devices, the effect is
negligible (e.g. Kioxia KBG40ZNS), for some significant, making the
transitions to PS3 and PS4 much quicker (e.g. WD SN530, Intel 760P), or making
the sleep deeper, PS4 rather than PS3 after a similar amount of time (e.g.
SK Hynix BC511). For some devices (e.g. Samsung PM991) the effect is mixed:
the initial transition happens after a longer idle time, but takes the device
to a lower power state.

Workflows.
In order to evaluate the patch's effect on the power consumption and latency,
7 workflows were used for each device. The workflows were designed to test
the scenarios where significant differences between the old and new behaviors
are most likely. Each workflow was tested twice: with the new and with the
old APST table generation implementation. Power consumption, performance and
latency were measured in the process. The following workflows were used:
1) Consecutive write at the maximum rate with IO depth of 2, with no pauses
2) Repeated pattern of 1000 consecutive writes of 4K packets followed by 50ms
   idle time
3) Repeated pattern of 1000 consecutive writes of 4K packets followed by 150ms
   idle time
4) Repeated pattern of 1000 consecutive writes of 4K packets followed by 500ms
   idle time
5) Repeated pattern of 1000 consecutive writes of 4K packets followed by 1.5s
   idle time
6) Repeated pattern of 1000 consecutive writes of 4K packets followed by 5s
   idle time
7) Repeated pattern of a single random read of a 4K packet followed by 150ms
   idle time

Power consumption
Actual power consumption measurements produced predictable results in
accordance with the APST mechanism's theory of operation.
Devices with long entry and exit latencies such as WD SN530 showed huge
improvement on scenarios 4,5 and 6 of up to 62%. Devices such as Kioxia
KBG40ZNS where the resulting APST table looks virtually identical with
both legacy and new algorithms, showed little or no change in the average power
consumption on all workflows. Devices with extra short latencies such as
Samsung PM991 showed moderate increase in power consumption of up to 18% in
worst case scenarios.
In addition, on Intel and Samsung devices a more complex impact was observed
on scenarios 3, 4 and 7. Our understanding is that due to longer stay in deep
non-operational states between the writes the devices start performing background
operations leading to an increase of power consumption. With the old APST tables
part of these operations are delayed until the scenario is over and a longer idle
period begins, but eventually this extra power is consumed anyway.

Performance.
In terms of performance measured on sustained write or read scenarios, the
effect of the patch is minimal as in this case the device doesn't enter low power
states.

Latency
As expected, in devices where the patch causes a more aggressive power saving
policy (e.g. WD SN530, Intel 760P), an increase in latency was observed in
certain scenarios. Workflow number 7, specifically designed to simulate the
worst case scenario as far as latency is concerned, indeed shows a sharp
increase in average latency (~2ms -> ~53ms on Intel 760P and 0.6 -> 10ms on
WD SN530). The latency increase on other workloads and other devices is much
milder or non-existent.
Signed-off-by: NAlexey Bogoslavsky <alexey.bogoslavsky@wdc.com>
Signed-off-by: NChristoph Hellwig <hch@lst.de>

ebd8a93a

nvme: remove redundant initialization of variable ret · 13ce7e62

由 Colin Ian King 提交于 5月 13, 2021

The variable ret is being initialized with a value that is never read,
it is being updated later on. The assignment is redundant and can be
removed.
Signed-off-by: NColin Ian King <colin.king@canonical.com>
Signed-off-by: NChristoph Hellwig <hch@lst.de>

13ce7e62

24 5月, 2021 2 次提交

rsxx: Use struct_size() in vmalloc() · 81840358

由 Gustavo A. R. Silva 提交于 5月 13, 2021

Make use of the struct_size() helper instead of an open-coded version,
in order to avoid any potential type mistakes or integer overflows
that, in the worst scenario, could lead to heap overflows.

This code was detected with the help of Coccinelle and, audited and
fixed manually.
Signed-off-by: NGustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/20210513203730.GA212128@embeddedorSigned-off-by: NJens Axboe <axboe@kernel.dk>

81840358

aoe: remove unnecessary mutex_init() · 65a8db39

由 Yang Yingliang 提交于 5月 11, 2021

The mutex ktio_spawn_lock is initialized statically.
It is unnecessary to initialize by mutex_init().
Reported-by: NHulk Robot <hulkci@huawei.com>
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20210511113440.3772053-1-yangyingliang@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk>

65a8db39

21 5月, 2021 2 次提交

xen-pciback: reconfigure also from backend watch handler · c81d3d24

由 Jan Beulich 提交于 5月 18, 2021

When multiple PCI devices get assigned to a guest right at boot, libxl
incrementally populates the backend tree. The writes for the first of
the devices trigger the backend watch. In turn xen_pcibk_setup_backend()
will set the XenBus state to Initialised, at which point no further
reconfigures would happen unless a device got hotplugged. Arrange for
reconfigure to also get triggered from the backend watch handler.
Signed-off-by: NJan Beulich <jbeulich@suse.com>
Cc: stable@vger.kernel.org
Reviewed-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/2337cbd6-94b9-4187-9862-c03ea12e0c61@suse.comSigned-off-by: NJuergen Gross <jgross@suse.com>

c81d3d24

xen-pciback: redo VF placement in the virtual topology · 4ba50e7c

由 Jan Beulich 提交于 5月 18, 2021

The commit referenced below was incomplete: It merely affected what
would get written to the vdev-<N> xenstore node. The guest would still
find the function at the original function number as long as
__xen_pcibk_get_pci_dev() wouldn't be in sync. The same goes for AER wrt
__xen_pcibk_get_pcifront_dev().

Undo overriding the function to zero and instead make sure that VFs at
function zero remain alone in their slot. This has the added benefit of
improving overall capacity, considering that there's only a total of 32
slots available right now (PCI segment and bus can both only ever be
zero at present).

Fixes: 8a5248fe ("xen PV passthru: assign SR-IOV virtual functions to separate virtual slots")
Signed-off-by: NJan Beulich <jbeulich@suse.com>
Cc: stable@vger.kernel.org
Reviewed-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/8def783b-404c-3452-196d-3f3fd4d72c9e@suse.comSigned-off-by: NJuergen Gross <jgross@suse.com>

4ba50e7c

20 5月, 2021 16 次提交

platform/x86: touchscreen_dmi: Add info for the Chuwi Hi10 Pro (CWI529) tablet · e68671e9

由 Hans de Goede 提交于 5月 20, 2021

Add touchscreen info for the Chuwi Hi10 Pro (CWI529) tablet. This includes
info for getting the firmware directly from the UEFI, so that the user does
not need to manually install the firmware in /lib/firmware/silead.

This change will make the touchscreen on these devices work OOTB,
without requiring any manual setup.
Signed-off-by: NHans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20210520093228.7439-1-hdegoede@redhat.com

e68671e9

dma-buf: fix unintended pin/unpin warnings · 7e008b02

由 Christian König 提交于 5月 17, 2021

DMA-buf internal users call the pin/unpin functions without having a
dynamic attachment. Avoid the warning and backtrace in the logs.
Signed-off-by: NChristian König <christian.koenig@amd.com>
Bugs: https://gitlab.freedesktop.org/drm/intel/-/issues/3481
Fixes: c545781e ("dma-buf: doc polish for pin/unpin")
Reviewed-by: NAlex Deucher <alexander.deucher@amd.com>
Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
CC: stable@kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/20210517115705.2141-1-christian.koenig@amd.com

7e008b02

drm/amdgpu: stop touching sched.ready in the backend · a2b4785f

由 Christian König 提交于 5月 18, 2021

This unfortunately comes up in regular intervals and breaks
GPU reset for the engine in question.

The sched.ready flag controls if an engine can't get working
during hw_init, but should never be set to false during hw_fini.

v2: squash in unused variable fix (Alex)
Signed-off-by: NChristian König <christian.koenig@amd.com>
Reviewed-by: NAlex Deucher <alexander.deucher@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

a2b4785f

drm/amd/amdgpu: fix a potential deadlock in gpu reset · 9c2876d5

由 Lang Yu 提交于 5月 17, 2021

When amdgpu_ib_ring_tests failed, the reset logic called
amdgpu_device_ip_suspend twice, then deadlock occurred.
Deadlock log:

[  805.655192] amdgpu 0000:04:00.0: amdgpu: ib ring test failed (-110).
[  806.290952] [drm] free PSP TMR buffer

[  806.319406] ============================================
[  806.320315] WARNING: possible recursive locking detected
[  806.321225] 5.11.0-custom #1 Tainted: G        W  OEL
[  806.322135] --------------------------------------------
[  806.323043] cat/2593 is trying to acquire lock:
[  806.323825] ffff888136b1cdc8 (&adev->dm.dc_lock){+.+.}-{3:3}, at: dm_suspend+0xb8/0x1d0 [amdgpu]
[  806.325668]
               but task is already holding lock:
[  806.326664] ffff888136b1cdc8 (&adev->dm.dc_lock){+.+.}-{3:3}, at: dm_suspend+0xb8/0x1d0 [amdgpu]
[  806.328430]
               other info that might help us debug this:
[  806.329539]  Possible unsafe locking scenario:

[  806.330549]        CPU0
[  806.330983]        ----
[  806.331416]   lock(&adev->dm.dc_lock);
[  806.332086]   lock(&adev->dm.dc_lock);
[  806.332738]
                *** DEADLOCK ***

[  806.333747]  May be due to missing lock nesting notation

[  806.334899] 3 locks held by cat/2593:
[  806.335537]  #0: ffff888100d3f1b8 (&attr->mutex){+.+.}-{3:3}, at: simple_attr_read+0x4e/0x110
[  806.337009]  #1: ffff888136b1fd78 (&adev->reset_sem){++++}-{3:3}, at: amdgpu_device_lock_adev+0x42/0x94 [amdgpu]
[  806.339018]  #2: ffff888136b1cdc8 (&adev->dm.dc_lock){+.+.}-{3:3}, at: dm_suspend+0xb8/0x1d0 [amdgpu]
[  806.340869]
               stack backtrace:
[  806.341621] CPU: 6 PID: 2593 Comm: cat Tainted: G        W  OEL    5.11.0-custom #1
[  806.342921] Hardware name: AMD Celadon-CZN/Celadon-CZN, BIOS WLD0C23N_Weekly_20_12_2 12/23/2020
[  806.344413] Call Trace:
[  806.344849]  dump_stack+0x93/0xbd
[  806.345435]  __lock_acquire.cold+0x18a/0x2cf
[  806.346179]  lock_acquire+0xca/0x390
[  806.346807]  ? dm_suspend+0xb8/0x1d0 [amdgpu]
[  806.347813]  __mutex_lock+0x9b/0x930
[  806.348454]  ? dm_suspend+0xb8/0x1d0 [amdgpu]
[  806.349434]  ? amdgpu_device_indirect_rreg+0x58/0x70 [amdgpu]
[  806.350581]  ? _raw_spin_unlock_irqrestore+0x47/0x50
[  806.351437]  ? dm_suspend+0xb8/0x1d0 [amdgpu]
[  806.352437]  ? rcu_read_lock_sched_held+0x4f/0x80
[  806.353252]  ? rcu_read_lock_sched_held+0x4f/0x80
[  806.354064]  mutex_lock_nested+0x1b/0x20
[  806.354747]  ? mutex_lock_nested+0x1b/0x20
[  806.355457]  dm_suspend+0xb8/0x1d0 [amdgpu]
[  806.356427]  ? soc15_common_set_clockgating_state+0x17d/0x19 [amdgpu]
[  806.357736]  amdgpu_device_ip_suspend_phase1+0x78/0xd0 [amdgpu]
[  806.360394]  amdgpu_device_ip_suspend+0x21/0x70 [amdgpu]
[  806.362926]  amdgpu_device_pre_asic_reset+0xb3/0x270 [amdgpu]
[  806.365560]  amdgpu_device_gpu_recover.cold+0x679/0x8eb [amdgpu]
Signed-off-by: NLang Yu <Lang.Yu@amd.com>
Acked-by: NChristian KÃnig <christian.koenig@amd.com>
Reviewed-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

9c2876d5

drm/amdgpu: update sdma golden setting for Navi12 · 77194d86

由 Guchun Chen 提交于 5月 17, 2021

Current golden setting is out of date.
Signed-off-by: NGuchun Chen <guchun.chen@amd.com>
Reviewed-by: NKenneth Feng <kenneth.feng@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

77194d86

drm/amdgpu: update gc golden setting for Navi12 · 99c45ba5

由 Guchun Chen 提交于 5月 17, 2021

Current golden setting is out of date.
Signed-off-by: NGuchun Chen <guchun.chen@amd.com>
Reviewed-by: NKenneth Feng <kenneth.feng@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

99c45ba5

drm/amdgpu: Fix a use-after-free · 1e5c3738

由 xinhui pan 提交于 5月 18, 2021

looks like we forget to set ttm->sg to NULL.
Hit panic below

[ 1235.844104] general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b7b4b: 0000 [#1] SMP DEBUG_PAGEALLOC NOPTI
[ 1235.989074] Call Trace:
[ 1235.991751]  sg_free_table+0x17/0x20
[ 1235.995667]  amdgpu_ttm_backend_unbind.cold+0x4d/0xf7 [amdgpu]
[ 1236.002288]  amdgpu_ttm_backend_destroy+0x29/0x130 [amdgpu]
[ 1236.008464]  ttm_tt_destroy+0x1e/0x30 [ttm]
[ 1236.013066]  ttm_bo_cleanup_memtype_use+0x51/0xa0 [ttm]
[ 1236.018783]  ttm_bo_release+0x262/0xa50 [ttm]
[ 1236.023547]  ttm_bo_put+0x82/0xd0 [ttm]
[ 1236.027766]  amdgpu_bo_unref+0x26/0x50 [amdgpu]
[ 1236.032809]  amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu+0x7aa/0xd90 [amdgpu]
[ 1236.040400]  kfd_ioctl_alloc_memory_of_gpu+0xe2/0x330 [amdgpu]
[ 1236.046912]  kfd_ioctl+0x463/0x690 [amdgpu]
Signed-off-by: Nxinhui pan <xinhui.pan@amd.com>
Reviewed-by: NChristian König <christian.koenig@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

1e5c3738

drm/amdgpu: add video_codecs query support for aldebaran · ab95cb3e

由 James Zhu 提交于 5月 18, 2021

Add video_codecs query support for aldebaran.
Signed-off-by: NJames Zhu <James.Zhu@amd.com>
Reviewed-by: NLeo Liu <leo.liu@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

ab95cb3e

drm/amd/amdgpu: fix refcount leak · fa7e6abc

由 Jingwen Chen 提交于 5月 17, 2021

[Why]
the gem object rfb->base.obj[0] is get according to num_planes
in amdgpufb_create, but is not put according to num_planes

[How]
put rfb->base.obj[0] in amdgpu_fbdev_destroy according to num_planes
Signed-off-by: NJingwen Chen <Jingwen.Chen2@amd.com>
Acked-by: NChristian König <christian.koenig@amd.com>
Reviewed-by: NAlex Deucher <alexander.deucher@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

fa7e6abc

drm/amd/display: Disconnect non-DP with no EDID · 08003927

由 Chris Park 提交于 5月 04, 2021

[Why]
Active DP dongles return no EDID when dongle
is connected, but VGA display is taken out.
Current driver behavior does not remove the
active display when this happens, and this is
a gap between dongle DTP and dongle behavior.

[How]
For active DP dongles and non-DP scenario,
disconnect sink on detection when no EDID
is read due to timeout.
Signed-off-by: NChris Park <Chris.Park@amd.com>
Reviewed-by: NNicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: NStylon Wang <stylon.wang@amd.com>
Tested-by: NDaniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

08003927

drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang · dbd1003d

由 Changfeng 提交于 5月 14, 2021

There is problem with 3DCGCG firmware and it will cause compute test
hang on picasso/raven1. It needs to disable 3DCGCG in driver to avoid
compute hang.
Signed-off-by: NChangfeng <Changfeng.Zhu@amd.com>
Reviewed-by: NAlex Deucher <alexander.deucher@amd.com>
Reviewed-by: NHuang Rui <ray.huang@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

dbd1003d

drm/amdgpu: Fix GPU TLB update error when PAGE_SIZE > AMDGPU_PAGE_SIZE · d5375156

由 Yi Li 提交于 5月 14, 2021

When PAGE_SIZE is larger than AMDGPU_PAGE_SIZE, the number of GPU TLB
entries which need to update in amdgpu_map_buffer() should be multiplied
by AMDGPU_GPU_PAGES_IN_CPU_PAGE (PAGE_SIZE / AMDGPU_PAGE_SIZE).
Reviewed-by: NChristian König <christian.koenig@amd.com>
Signed-off-by: NYi Li <liyi@loongson.cn>
Signed-off-by: NHuacai Chen <chenhuacai@loongson.cn>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

d5375156

drm/radeon: use the dummy page for GART if needed · 0c8df343

由 Christian König 提交于 5月 12, 2021

Imported BOs don't have a pagelist any more.
Signed-off-by: NChristian König <christian.koenig@amd.com>
Reviewed-by: NAlex Deucher <alexander.deucher@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
Fixes: 0575ff3d ("drm/radeon: stop using pages with drm_prime_sg_to_page_addr_arrays v2")
CC: stable@vger.kernel.org # 5.12

0c8df343

drm/amd/display: Use the correct max downscaling value for DCN3.x family · 84c63d04

由 Nikola Cornij 提交于 5月 06, 2021

[why]
As per spec, DCN3.x can do 6:1 downscaling and DCN2.x can do 4:1. The
max downscaling limit value for DCN2.x is 250, which means it's
calculated as 1000 / 4 = 250. For DCN3.x this then gives 1000 / 6 = 167.

[how]
Set maximum downscaling limit to 167 for DCN3.x
Signed-off-by: NNikola Cornij <nikola.cornij@amd.com>
Reviewed-by: NCharlene Liu <Charlene.Liu@amd.com>
Reviewed-by: NHarry Wentland <Harry.Wentland@amd.com>
Acked-by: NStylon Wang <stylon.wang@amd.com>
Tested-by: NDaniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

84c63d04

RDMA/uverbs: Fix a NULL vs IS_ERR() bug · 463a3f66

由 Dan Carpenter 提交于 5月 14, 2021

The uapi_get_object() function returns error pointers, it never returns
NULL.

Fixes: 149d3845 ("RDMA/uverbs: Add a method to introspect handles in a context")
Link: https://lore.kernel.org/r/YJ6Got+U7lz+3n9a@mwandaSigned-off-by: NDan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: NLeon Romanovsky <leonro@nvidia.com>
Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>

463a3f66

RDMA/mlx5: Fix query DCT via DEVX · cfa3b797

由 Maor Gottlieb 提交于 5月 19, 2021

When executing DEVX command to query QP object, we need to take the QP
type from the mlx5_ib_qp struct which hold the driver specific QP types as
well, such as DC.

Fixes: 34613eb1 ("IB/mlx5: Enable modify and query verbs objects via DEVX")
Link: https://lore.kernel.org/r/6eee15d63f09bb70787488e0cf96216e2957f5aa.1621413654.git.leonro@nvidia.comReviewed-by: NYishai Hadas <yishaih@nvidia.com>
Signed-off-by: NMaor Gottlieb <maorg@nvidia.com>
Signed-off-by: NLeon Romanovsky <leonro@nvidia.com>
Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>

cfa3b797

19 5月, 2021 16 次提交

Revert "i915: fix remap_io_sg to verify the pgprot" · 293837b9

由 Linus Torvalds 提交于 5月 19, 2021

This reverts commit b12d691e.

It turns out this is not ready for primetime yet.  The intentions are
good, but using remap_pfn_range() requires that there is nothing already
mapped in the area, and the i915 code seems to very much intentionally
remap the same area multiple times.

That will then just trigger the

                BUG_ON(!pte_none(*pte));

in mm/memory.c: remap_pte_range().

There are also reports of mapping type inconsistencies, resulting in
warnings and in screen corruption.

Link: https://lore.kernel.org/lkml/20210519024322.GA29704@xsang-OptiPlex-9020/
Link: https://lore.kernel.org/lkml/YKUjvoaKKggAmpIR@sf/
Link: https://lore.kernel.org/lkml/b6b61cf0-5874-f4c0-1fcc-4b3848451c31@redhat.com/Reported-by: Nkernel test robot <oliver.sang@intel.com>
Reported-by: NKalle Valo <kvalo@codeaurora.org>
Reported-by: NHans de Goede <hdegoede@redhat.com>
Reported-by: NSergei Trofimovich <slyfox@gentoo.org>
Acked-by: NChristoph Hellwig <hch@lst.de>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

293837b9

platform/x86: touchscreen_dmi: Add info for the Mediacom Winpad 7.0 W700 tablet · 39a6172e

由 Teava Radu 提交于 5月 04, 2021

Add touchscreen info for the Mediacom Winpad 7.0 W700 tablet.
Tested on 5.11 hirsute.
Note: it's hw clone to Wintron surftab 7.
Signed-off-by: NTeava Radu <rateava@gmail.com>
Signed-off-by: NHans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20210504185746.175461-6-hdegoede@redhat.com

39a6172e

platform/x86: intel_punit_ipc: Append MODULE_DEVICE_TABLE for ACPI · bc1eca60

由 Andy Shevchenko 提交于 5月 19, 2021

The intel_punit_ipc driver might be compiled as a module.
When udev handles the event of the devices appearing
the intel_punit_ipc module is missing.

Append MODULE_DEVICE_TABLE for ACPI case to fix the loading issue.
Signed-off-by: NAndy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20210519101521.79338-1-andriy.shevchenko@linux.intel.comSigned-off-by: NHans de Goede <hdegoede@redhat.com>

bc1eca60

platform/x86: dell-smbios-wmi: Fix oops on rmmod dell_smbios · 3a535874

由 Hans de Goede 提交于 5月 18, 2021

init_dell_smbios_wmi() only registers the dell_smbios_wmi_driver on systems
where the Dell WMI interface is supported. While exit_dell_smbios_wmi()
unregisters it unconditionally, this leads to the following oops:

[  175.722921] ------------[ cut here ]------------
[  175.722925] Unexpected driver unregister!
[  175.722939] WARNING: CPU: 1 PID: 3630 at drivers/base/driver.c:194 driver_unregister+0x38/0x40
...
[  175.723089] Call Trace:
[  175.723094]  cleanup_module+0x5/0xedd [dell_smbios]
...
[  175.723148] ---[ end trace 064c34e1ad49509d ]---

Make the unregister happen on the same condition the register happens
to fix this.

Cc: Mario Limonciello <mario.limonciello@outlook.com>
Fixes: 1a258e67 ("platform/x86: dell-smbios-wmi: Add new WMI dispatcher driver")
Signed-off-by: NHans de Goede <hdegoede@redhat.com>
Reviewed-by: NMario Limonciello <mario.limonciello@outlook.com>
Reviewed-by: NMark Gross <mgross@linux.intel.com>
Link: https://lore.kernel.org/r/20210518125027.21824-1-hdegoede@redhat.com

3a535874

platform/x86: hp-wireless: add AMD's hardware id to the supported list · f048630b

由 Shyam Sundar S K 提交于 5月 14, 2021

Newer AMD based laptops uses AMDI0051 as the hardware id to support the
airplane mode button. Adding this to the supported list.
Signed-off-by: NShyam Sundar S K <Shyam-sundar.S-k@amd.com>
Link: https://lore.kernel.org/r/20210514180047.1697543-1-Shyam-sundar.S-k@amd.comSigned-off-by: NHans de Goede <hdegoede@redhat.com>

f048630b

platform/x86: intel_int0002_vgpio: Only call enable_irq_wake() when using s2idle · b68e182a

由 Hans de Goede 提交于 5月 12, 2021

Commit 871f1f2b ("platform/x86: intel_int0002_vgpio: Only implement
irq_set_wake on Bay Trail") stopped passing irq_set_wake requests on to
the parents IRQ because this was breaking suspend (causing immediate
wakeups) on an Asus E202SA.

This workaround for the Asus E202SA is causing wakeup by USB keyboard to
not work on other devices with Airmont CPU cores such as the Medion Akoya
E1239T. In hindsight the problem with the Asus E202SA has nothing to do
with Silvermont vs Airmont CPU cores, so the differentiation between the
2 types of CPU cores introduced by the previous fix is wrong.

The real issue at hand is s2idle vs S3 suspend where the suspend is
mostly handled by firmware. The parent IRQ for the INT0002 device is shared
with the ACPI SCI and the real problem is that the INT0002 code should not
be messing with the wakeup settings of that IRQ when suspend/resume is
being handled by the firmware.

Note that on systems which support both s2idle and S3 suspend, which
suspend method to use can be changed at runtime.

This patch fixes both the Asus E202SA spurious wakeups issue as well as
the wakeup by USB keyboard not working on the Medion Akoya E1239T issue.

These are both fixed by replacing the old workaround with delaying the
enable_irq_wake(parent_irq) call till system-suspend time and protecting
it with a !pm_suspend_via_firmware() check so that we still do not call
it on devices using firmware-based (S3) suspend such as the Asus E202SA.

Note rather then adding #ifdef CONFIG_PM_SLEEP, this commit simply adds
a "depends on PM_SLEEP" to the Kconfig since this drivers whole purpose
is to deal with wakeup events, so using it without CONFIG_PM_SLEEP makes
no sense.

Cc: Maxim Mikityanskiy <maxtram95@gmail.com>
Fixes: 871f1f2b ("platform/x86: intel_int0002_vgpio: Only implement irq_set_wake on Bay Trail")
Signed-off-by: NHans de Goede <hdegoede@redhat.com>
Reviewed-by: NAndy Shevchenko <andy.shevchenko@gmail.com>
Reviewed-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/20210512125523.55215-2-hdegoede@redhat.com

b68e182a

platform/x86: gigabyte-wmi: add support for B550 Aorus Elite · dac282de

由 Thomas Weißschuh 提交于 5月 11, 2021

Reported as working here:
https://github.com/t-8ch/linux-gigabyte-wmi-driver/issues/1#issuecomment-837210304Signed-off-by: NThomas Weißschuh <linux@weissschuh.net>
Link: https://lore.kernel.org/r/20210510221545.412522-3-linux@weissschuh.netSigned-off-by: NHans de Goede <hdegoede@redhat.com>

dac282de

platform/x86: gigabyte-wmi: add support for X570 UD · 8605d64f

由 Thomas Weißschuh 提交于 5月 11, 2021

Reported as working here:
https://github.com/t-8ch/linux-gigabyte-wmi-driver/issues/4Signed-off-by: NThomas Weißschuh <linux@weissschuh.net>
Link: https://lore.kernel.org/r/20210510221545.412522-2-linux@weissschuh.netSigned-off-by: NHans de Goede <hdegoede@redhat.com>

8605d64f

platform/x86: gigabyte-wmi: streamline dmi matching · 86bf2b8f

由 Thomas Weißschuh 提交于 5月 11, 2021

Streamline dmi matching.
Signed-off-by: NThomas Weißschuh <linux@weissschuh.net>
Link: https://lore.kernel.org/r/20210510221545.412522-1-linux@weissschuh.netSigned-off-by: NHans de Goede <hdegoede@redhat.com>

86bf2b8f

platform/mellanox: mlxbf-tmfifo: Fix a memory barrier issue · 1c0e5701

由 Liming Sun 提交于 5月 07, 2021

The virtio framework uses wmb() when updating avail->idx. It
guarantees the write order, but not necessarily loading order
for the code accessing the memory. This commit adds a load barrier
after reading the avail->idx to make sure all the data in the
descriptor is visible. It also adds a barrier when returning the
packet to virtio framework to make sure read/writes are visible to
the virtio code.

Fixes: 1357dfd7 ("platform/mellanox: Add TmFifo driver for Mellanox BlueField Soc")
Signed-off-by: NLiming Sun <limings@nvidia.com>
Reviewed-by: NVadim Pasternak <vadimp@nvidia.com>
Link: https://lore.kernel.org/r/1620433812-17911-1-git-send-email-limings@nvidia.comSigned-off-by: NHans de Goede <hdegoede@redhat.com>

1c0e5701

platform/surface: dtx: Fix poll function · 9795d823

由 Maximilian Luz 提交于 5月 13, 2021

The poll function should not return -ERESTARTSYS.

Furthermore, locking in this function is completely unnecessary. The
ddev->lock protects access to the main device and controller (ddev->dev
and ddev->ctrl), ensuring that both are and remain valid while being
accessed by clients. Both are, however, never accessed in the poll
function. The shutdown test (via atomic bit flags) be safely done
without locking, so drop locking here entirely.
Reported-by: Nkernel test robot <lkp@intel.com>
Fixes: 1d609992 ("platform/surface: Add DTX driver)
Signed-off-by: NMaximilian Luz <luzmaximilian@gmail.com>
Link: https://lore.kernel.org/r/20210513134437.2431022-1-luzmaximilian@gmail.comSigned-off-by: NHans de Goede <hdegoede@redhat.com>

9795d823

platform/surface: aggregator: Do not mark interrupt as shared · 647e6cc9

由 Maximilian Luz 提交于 5月 05, 2021

Having both IRQF_NO_AUTOEN and IRQF_SHARED set causes
request_threaded_irq() to return with -EINVAL (see comment in flag
validation in that function). As the interrupt is currently not shared
between multiple devices, drop the IRQF_SHARED flag.

Fixes: 507cf5a2 ("platform/surface: aggregator: move to use request_irq by IRQF_NO_AUTOEN flag")
Signed-off-by: NMaximilian Luz <luzmaximilian@gmail.com>
Link: https://lore.kernel.org/r/20210505133635.1499703-1-luzmaximilian@gmail.comSigned-off-by: NHans de Goede <hdegoede@redhat.com>

647e6cc9

drm/i915/gt: Disable HiZ Raw Stall Optimization on broken gen7 · 023dfa96

由 Simon Rettberg 提交于 4月 26, 2021

When resetting CACHE_MODE registers, don't enable HiZ Raw Stall
Optimization on Ivybridge GT1 and Baytrail, as it causes severe glitches
when rendering any kind of 3D accelerated content.
This optimization is disabled on these platforms by default according to
official documentation from 01.org.

Fixes: ef99a60f ("drm/i915/gt: Clear CACHE_MODE prior to clearing residuals")
BugLink: https://gitlab.freedesktop.org/drm/intel/-/issues/3081
BugLink: https://gitlab.freedesktop.org/drm/intel/-/issues/3404
BugLink: https://gitlab.freedesktop.org/drm/intel/-/issues/3071Reviewed-by: NManuel Bentele <development@manuel-bentele.de>
Signed-off-by: NSimon Rettberg <simon.rettberg@rz.uni-freiburg.de>
Reviewed-by: NDave Airlie <airlied@redhat.com>
Signed-off-by: NRodrigo Vivi <rodrigo.vivi@intel.com>
[Rodrigo removed invalid Fixes line]
Link: https://patchwork.freedesktop.org/patch/msgid/20210426161124.2b7fd708@dellnichtsogutkiste
(cherry picked from commit 929b734a)
Signed-off-by: NJani Nikula <jani.nikula@intel.com>

023dfa96

drm/i915/gem: Pin the L-shape quirked object as unshrinkable · 036867e9

由 Chris Wilson 提交于 5月 17, 2021

When instantiating a tiled object on an L-shaped memory machine, we mark
the object as unshrinkable to prevent the shrinker from trying to swap
out the pages. We have to do this as we do not know the swizzling on the
individual pages, and so the data will be scrambled across swap out/in.

Not only do we need to move the object off the shrinker list, we need to
mark the object with shrink_pin so that the counter is consistent across
calls to madvise.

v2: in the madvise ioctl we need to check if the object is currently
shrinkable/purgeable, not if the object type supports shrinking

Fixes: 0175969e ("drm/i915/gem: Use shrinkable status for unknown swizzle quirks")
References: https://gitlab.freedesktop.org/drm/intel/-/issues/3293
References: https://gitlab.freedesktop.org/drm/intel/-/issues/3450Reported-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
Tested-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NMatthew Auld <matthew.auld@intel.com>
Signed-off-by: NMatthew Auld <matthew.auld@intel.com>
Cc: <stable@vger.kernel.org> # v5.12+
Link: https://patchwork.freedesktop.org/patch/msgid/20210517084640.18862-1-matthew.auld@intel.com
(cherry picked from commit 8777d17b)
Signed-off-by: NJani Nikula <jani.nikula@intel.com>

036867e9

nvme-fc: clear q_live at beginning of association teardown · a7d13914

由 James Smart 提交于 5月 10, 2021

The __nvmf_check_ready() routine used to bounce all filesystem io if the
controller state isn't LIVE.  However, a later patch changed the logic so
that it rejection ends up being based on the Q live check.  The FC
transport has a slightly different sequence from rdma and tcp for
shutting down queues/marking them non-live.  FC marks its queue non-live
after aborting all ios and waiting for their termination, leaving a
rather large window for filesystem io to continue to hit the transport.
Unfortunately this resulted in filesystem I/O or applications seeing I/O
errors.

Change the FC transport to mark the queues non-live at the first sign of
teardown for the association (when I/O is initially terminated).

Fixes: 73a53799 ("nvme-fabrics: allow to queue requests for live queues")
Signed-off-by: NJames Smart <jsmart2021@gmail.com>
Reviewed-by: NSagi Grimberg <sagi@grimberg.me>
Reviewed-by: NHimanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: NHannes Reinecke <hare@suse.de>
Signed-off-by: NChristoph Hellwig <hch@lst.de>

a7d13914

nvme-tcp: rerun io_work if req_list is not empty · a0fdd141

由 Keith Busch 提交于 5月 17, 2021

A possible race condition exists where the request to send data is
enqueued from nvme_tcp_handle_r2t()'s will not be observed by
nvme_tcp_send_all() if it happens to be running. The driver relies on
io_work to send the enqueued request when it is runs again, but the
concurrently running nvme_tcp_send_all() may not have released the
send_mutex at that time. If no future commands are enqueued to re-kick
the io_work, the request will timeout in the SEND_H2C state, resulting
in a timeout error like:

  nvme nvme0: queue 1: timeout request 0x3 type 6

Ensure the io_work continues to run as long as the req_list is not empty.

Fixes: db5ad6b7 ("nvme-tcp: try to send request in queue_rq context")
Signed-off-by: NKeith Busch <kbusch@kernel.org>
Reviewed-by: NSagi Grimberg <sagi@grimberg.me>
Signed-off-by: NChristoph Hellwig <hch@lst.de>

a0fdd141

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功