提交 · d7ff441a13a15f24e516c5edadda0dbd946c2cfd · openeuler / Kernel

14 1月, 2022 40 次提交

net: stmmac: platform: fix build warning when with !CONFIG_PM_SLEEP · d7ff441a

由 Joakim Zhang 提交于 1月 14, 2022

stable inclusion
from stable-v5.10.83
commit 98b02755d544ce26ac0a41ff52bf56f2bd79e4c0
bugzilla: 185879 https://gitee.com/openeuler/kernel/issues/I4QUVG

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=98b02755d544ce26ac0a41ff52bf56f2bd79e4c0

--------------------------------

commit 2a48d96f upstream.

Use __maybe_unused for noirq_suspend()/noirq_resume() hooks to avoid
build warning with !CONFIG_PM_SLEEP:

>> drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c:796:12: error: 'stmmac_pltfr_noirq_resume' defined but not used [-Werror=unused-function]
     796 | static int stmmac_pltfr_noirq_resume(struct device *dev)
         |            ^~~~~~~~~~~~~~~~~~~~~~~~~
>> drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c:775:12: error: 'stmmac_pltfr_noirq_suspend' defined but not used [-Werror=unused-function]
     775 | static int stmmac_pltfr_noirq_suspend(struct device *dev)
         |            ^~~~~~~~~~~~~~~~~~~~~~~~~~
   cc1: all warnings being treated as errors

Fixes: 276aae37 ("net: stmmac: fix system hang caused by eee_ctrl_timer during suspend/resume")
Reported-by: Nkernel test robot <lkp@intel.com>
Signed-off-by: NJoakim Zhang <qiangqing.zhang@nxp.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

d7ff441a

shm: extend forced shm destroy to support objects from several IPC nses · e1467432

由 Alexander Mikhalitsyn 提交于 1月 14, 2022

stable inclusion
from stable-v5.10.83
commit a15261d2a1214c9304d17d4b9b819255c7406de5
bugzilla: 185879 https://gitee.com/openeuler/kernel/issues/I4QUVG

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=a15261d2a1214c9304d17d4b9b819255c7406de5

--------------------------------

commit 85b6d246 upstream.

Currently, the exit_shm() function not designed to work properly when
task->sysvshm.shm_clist holds shm objects from different IPC namespaces.

This is a real pain when sysctl kernel.shm_rmid_forced = 1, because it
leads to use-after-free (reproducer exists).

This is an attempt to fix the problem by extending exit_shm mechanism to
handle shm's destroy from several IPC ns'es.

To achieve that we do several things:

1. add a namespace (non-refcounted) pointer to the struct shmid_kernel

2. during new shm object creation (newseg()/shmget syscall) we
   initialize this pointer by current task IPC ns

3. exit_shm() fully reworked such that it traverses over all shp's in
   task->sysvshm.shm_clist and gets IPC namespace not from current task
   as it was before but from shp's object itself, then call
   shm_destroy(shp, ns).

Note: We need to be really careful here, because as it was said before
(1), our pointer to IPC ns non-refcnt'ed.  To be on the safe side we
using special helper get_ipc_ns_not_zero() which allows to get IPC ns
refcounter only if IPC ns not in the "state of destruction".

Q/A

Q: Why can we access shp->ns memory using non-refcounted pointer?
A: Because shp object lifetime is always shorther than IPC namespace
   lifetime, so, if we get shp object from the task->sysvshm.shm_clist
   while holding task_lock(task) nobody can steal our namespace.

Q: Does this patch change semantics of unshare/setns/clone syscalls?
A: No. It's just fixes non-covered case when process may leave IPC
   namespace without getting task->sysvshm.shm_clist list cleaned up.

Link: https://lkml.kernel.org/r/67bb03e5-f79c-1815-e2bf-949c67047418@colorfullife.com
Link: https://lkml.kernel.org/r/20211109151501.4921-1-manfred@colorfullife.com
Fixes: ab602f79 ("shm: make exit_shm work proportional to task activity")
Co-developed-by: NManfred Spraul <manfred@colorfullife.com>
Signed-off-by: NManfred Spraul <manfred@colorfullife.com>
Signed-off-by: NAlexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Andrei Vagin <avagin@gmail.com>
Cc: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Cc: Vasily Averin <vvs@virtuozzo.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

e1467432

s390/mm: validate VMA in PGSTE manipulation functions · 8b56b68d

由 David Hildenbrand 提交于 1月 14, 2022

stable inclusion
from stable-v5.10.83
commit aa20e966d8a1249754da934342cc3793f4638e7f
bugzilla: 185879 https://gitee.com/openeuler/kernel/issues/I4QUVG

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=aa20e966d8a1249754da934342cc3793f4638e7f

--------------------------------

commit fe3d1002 upstream.

We should not walk/touch page tables outside of VMA boundaries when
holding only the mmap sem in read mode. Evil user space can modify the
VMA layout just before this function runs and e.g., trigger races with
page table removal code since commit dd2283f2 ("mm: mmap: zap pages
with read mmap_sem in munmap"). gfn_to_hva() will only translate using
KVM memory regions, but won't validate the VMA.

Further, we should not allocate page tables outside of VMA boundaries: if
evil user space decides to map hugetlbfs to these ranges, bad things will
happen because we suddenly have PTE or PMD page tables where we
shouldn't have them.

Similarly, we have to check if we suddenly find a hugetlbfs VMA, before
calling get_locked_pte().

Fixes: 2d42f947 ("s390/kvm: Add PGSTE manipulation functions")
Signed-off-by: NDavid Hildenbrand <david@redhat.com>
Reviewed-by: NClaudio Imbrenda <imbrenda@linux.ibm.com>
Acked-by: NHeiko Carstens <hca@linux.ibm.com>
Link: https://lore.kernel.org/r/20210909162248.14969-4-david@redhat.comSigned-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

8b56b68d

tty: hvc: replace BUG_ON() with negative return value · 683c7e90

由 Juergen Gross 提交于 1月 14, 2022

stable inclusion
from stable-v5.10.83
commit a94e4a7b77edb1ae20f41c7c53677a9c6cee1fd8
bugzilla: 185879 https://gitee.com/openeuler/kernel/issues/I4QUVG

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=a94e4a7b77edb1ae20f41c7c53677a9c6cee1fd8

--------------------------------

commit e679004d upstream.

Xen frontends shouldn't BUG() in case of illegal data received from
their backends. So replace the BUG_ON()s when reading illegal data from
the ring page with negative return values.
Reviewed-by: NJan Beulich <jbeulich@suse.com>
Signed-off-by: NJuergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20210707091045.460-1-jgross@suse.comSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

683c7e90

xen/netfront: don't trust the backend response data blindly · ddbdd49c

由 Juergen Gross 提交于 1月 14, 2022

stable inclusion
from stable-v5.10.83
commit 1c5f722a8fdf19d383112fb701525e1b6870d8ca
bugzilla: 185879 https://gitee.com/openeuler/kernel/issues/I4QUVG

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=1c5f722a8fdf19d383112fb701525e1b6870d8ca

--------------------------------

commit a884daa6 upstream.

Today netfront will trust the backend to send only sane response data.
In order to avoid privilege escalations or crashes in case of malicious
backends verify the data to be within expected limits. Especially make
sure that the response always references an outstanding request.

Note that only the tx queue needs special id handling, as for the rx
queue the id is equal to the index in the ring page.

Introduce a new indicator for the device whether it is broken and let
the device stop working when it is set. Set this indicator in case the
backend sets any weird data.
Signed-off-by: NJuergen Gross <jgross@suse.com>
Reviewed-by: NJan Beulich <jbeulich@suse.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

ddbdd49c

xen/netfront: disentangle tx_skb_freelist · 89721d05

由 Juergen Gross 提交于 1月 14, 2022

stable inclusion
from stable-v5.10.83
commit 334b0f278761a65718324137e87efe722d536700
bugzilla: 185879 https://gitee.com/openeuler/kernel/issues/I4QUVG

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=334b0f278761a65718324137e87efe722d536700

--------------------------------

commit 21631d2d upstream.

The tx_skb_freelist elements are in a single linked list with the
request id used as link reference. The per element link field is in a
union with the skb pointer of an in use request.

Move the link reference out of the union in order to enable a later
reuse of it for requests which need a populated skb pointer.

Rename add_id_to_freelist() and get_id_from_freelist() to
add_id_to_list() and get_id_from_list() in order to prepare using
those for other lists as well. Define ~0 as value to indicate the end
of a list and place that value into the link for a request not being
on the list.

When freeing a skb zero the skb pointer in the request. Use a NULL
value of the skb pointer instead of skb_entry_is_link() for deciding
whether a request has a skb linked to it.

Remove skb_entry_set_link() and open code it instead as it is really
trivial now.
Signed-off-by: NJuergen Gross <jgross@suse.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

89721d05

xen/netfront: don't read data from request on the ring page · daa245df

由 Juergen Gross 提交于 1月 14, 2022

stable inclusion
from stable-v5.10.83
commit e17ee047eea7122c1d4196ed39032e517dad4152
bugzilla: 185879 https://gitee.com/openeuler/kernel/issues/I4QUVG

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=e17ee047eea7122c1d4196ed39032e517dad4152

--------------------------------

commit 162081ec upstream.

In order to avoid a malicious backend being able to influence the local
processing of a request build the request locally first and then copy
it to the ring page. Any reading from the request influencing the
processing in the frontend needs to be done on the local instance.
Signed-off-by: NJuergen Gross <jgross@suse.com>
Reviewed-by: NJan Beulich <jbeulich@suse.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

daa245df

xen/netfront: read response from backend only once · e4293a7a

由 Juergen Gross 提交于 1月 14, 2022

stable inclusion
from stable-v5.10.83
commit f5e493709800243181e268ddbfae949d2cc37f0b
bugzilla: 185879 https://gitee.com/openeuler/kernel/issues/I4QUVG

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=f5e493709800243181e268ddbfae949d2cc37f0b

--------------------------------

commit 8446066b upstream.

In order to avoid problems in case the backend is modifying a response
on the ring page while the frontend has already seen it, just read the
response into a local buffer in one go and then operate on that buffer
only.
Signed-off-by: NJuergen Gross <jgross@suse.com>
Reviewed-by: NJan Beulich <jbeulich@suse.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

e4293a7a

xen/blkfront: don't trust the backend response data blindly · 5a81bb17

由 Juergen Gross 提交于 1月 14, 2022

stable inclusion
from stable-v5.10.83
commit 1ffb20f0527dab03c17f0182ec6a63b9301af5f1
bugzilla: 185879 https://gitee.com/openeuler/kernel/issues/I4QUVG

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=1ffb20f0527dab03c17f0182ec6a63b9301af5f1

--------------------------------

commit b94e4b14 upstream.

Today blkfront will trust the backend to send only sane response data.
In order to avoid privilege escalations or crashes in case of malicious
backends verify the data to be within expected limits. Especially make
sure that the response always references an outstanding request.

Introduce a new state of the ring BLKIF_STATE_ERROR which will be
switched to in case an inconsistency is being detected. Recovering from
this state is possible only via removing and adding the virtual device
again (e.g. via a suspend/resume cycle).

Make all warning messages issued due to valid error responses rate
limited in order to avoid message floods being triggered by a malicious
backend.
Signed-off-by: NJuergen Gross <jgross@suse.com>
Reviewed-by: NJan Beulich <jbeulich@suse.com>
Acked-by: NRoger Pau Monné <roger.pau@citrix.com>
Link: https://lore.kernel.org/r/20210730103854.12681-4-jgross@suse.comSigned-off-by: NJuergen Gross <jgross@suse.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

5a81bb17

xen/blkfront: don't take local copy of a request from the ring page · 096dfca4

由 Juergen Gross 提交于 1月 14, 2022

stable inclusion
from stable-v5.10.83
commit 8e147855fcf275f30dbc93e1a8f4031724e7ad13
bugzilla: 185879 https://gitee.com/openeuler/kernel/issues/I4QUVG

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=8e147855fcf275f30dbc93e1a8f4031724e7ad13

--------------------------------

commit 8f5a695d upstream.

In order to avoid a malicious backend being able to influence the local
copy of a request build the request locally first and then copy it to
the ring page instead of doing it the other way round as today.
Signed-off-by: NJuergen Gross <jgross@suse.com>
Reviewed-by: NJan Beulich <jbeulich@suse.com>
Acked-by: NRoger Pau Monné <roger.pau@citrix.com>
Link: https://lore.kernel.org/r/20210730103854.12681-3-jgross@suse.comSigned-off-by: NJuergen Gross <jgross@suse.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

096dfca4

xen/blkfront: read response from backend only once · 06a11abb

由 Juergen Gross 提交于 1月 14, 2022

stable inclusion
from stable-v5.10.83
commit 273f04d5d135c5a00f2b8666f51c2fe87b38bcb7
bugzilla: 185879 https://gitee.com/openeuler/kernel/issues/I4QUVG

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=273f04d5d135c5a00f2b8666f51c2fe87b38bcb7

--------------------------------

commit 71b66243 upstream.

In order to avoid problems in case the backend is modifying a response
on the ring page while the frontend has already seen it, just read the
response into a local buffer in one go and then operate on that buffer
only.
Signed-off-by: NJuergen Gross <jgross@suse.com>
Reviewed-by: NJan Beulich <jbeulich@suse.com>
Acked-by: NRoger Pau Monné <roger.pau@citrix.com>
Link: https://lore.kernel.org/r/20210730103854.12681-2-jgross@suse.comSigned-off-by: NJuergen Gross <jgross@suse.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

06a11abb

xen: sync include/xen/interface/io/ring.h with Xen's newest version · 8653296a

由 Juergen Gross 提交于 1月 14, 2022

stable inclusion
from stable-v5.10.83
commit b98284aa3fc520e79e59753855c40f63b8c5389f
bugzilla: 185879 https://gitee.com/openeuler/kernel/issues/I4QUVG

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=b98284aa3fc520e79e59753855c40f63b8c5389f

--------------------------------

commit 629a5d87 upstream.

Sync include/xen/interface/io/ring.h with Xen's newest version in
order to get the RING_COPY_RESPONSE() and RING_RESPONSE_PROD_OVERFLOW()
macros.

Note that this will correct the wrong license info by adding the
missing original copyright notice.
Signed-off-by: NJuergen Gross <jgross@suse.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

8653296a

tracing: Check pid filtering when creating events · 466b4c62

由 Steven Rostedt (VMware) 提交于 1月 14, 2022

stable inclusion
from stable-v5.10.83
commit 406f2d5fe368d440fd4e262188a6640f92804c5d
bugzilla: 185879 https://gitee.com/openeuler/kernel/issues/I4QUVG

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=406f2d5fe368d440fd4e262188a6640f92804c5d

--------------------------------

commit 6cb20650 upstream.

When pid filtering is activated in an instance, all of the events trace
files for that instance has the PID_FILTER flag set. This determines
whether or not pid filtering needs to be done on the event, otherwise the
event is executed as normal.

If pid filtering is enabled when an event is created (via a dynamic event
or modules), its flag is not updated to reflect the current state, and the
events are not filtered properly.

Cc: stable@vger.kernel.org
Fixes: 3fdaf80f ("tracing: Implement event pid filtering")
Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

466b4c62

vhost/vsock: fix incorrect used length reported to the guest · c2f9218b

由 Stefano Garzarella 提交于 1月 14, 2022

stable inclusion
from stable-v5.10.83
commit 4fd0ad08ee332d7b61e0fc7fabead1fb57554065
bugzilla: 185879 https://gitee.com/openeuler/kernel/issues/I4QUVG

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=4fd0ad08ee332d7b61e0fc7fabead1fb57554065

--------------------------------

commit 49d8c5ff upstream.

The "used length" reported by calling vhost_add_used() must be the
number of bytes written by the device (using "in" buffers).

In vhost_vsock_handle_tx_kick() the device only reads the guest
buffers (they are all "out" buffers), without writing anything,
so we must pass 0 as "used length" to comply virtio spec.

Fixes: 433fc58e ("VSOCK: Introduce vhost_vsock.ko")
Cc: stable@vger.kernel.org
Reported-by: NHalil Pasic <pasic@linux.ibm.com>
Suggested-by: NJason Wang <jasowang@redhat.com>
Signed-off-by: NStefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20211122163525.294024-2-sgarzare@redhat.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com>
Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: NHalil Pasic <pasic@linux.ibm.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

c2f9218b

iommu/amd: Clarify AMD IOMMUv2 initialization messages · 8394e9a9

由 Joerg Roedel 提交于 1月 14, 2022

stable inclusion
from stable-v5.10.83
commit fbc0514e1a343f82cfa7afa3aedda9007ccaac9b
bugzilla: 185879 https://gitee.com/openeuler/kernel/issues/I4QUVG

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=fbc0514e1a343f82cfa7afa3aedda9007ccaac9b

--------------------------------

commit 717e88aa upstream.

The messages printed on the initialization of the AMD IOMMUv2 driver
have caused some confusion in the past. Clarify the messages to lower
the confusion in the future.

Cc: stable@vger.kernel.org
Signed-off-by: NJoerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20211123105507.7654-3-joro@8bytes.orgSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

8394e9a9

smb3: do not error on fsync when readonly · 17bdfaa8

由 Steve French 提交于 1月 14, 2022

stable inclusion
from stable-v5.10.83
commit 5655b8bccb8a19a34c83225a1c7bf2adca50c2c3
bugzilla: 185879 https://gitee.com/openeuler/kernel/issues/I4QUVG

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=5655b8bccb8a19a34c83225a1c7bf2adca50c2c3

--------------------------------

[ Upstream commit 71e6864e ]

Linux allows doing a flush/fsync on a file open for read-only,
but the protocol does not allow that.  If the file passed in
on the flush is read-only try to find a writeable handle for
the same inode, if that is not possible skip sending the
fsync call to the server to avoid breaking the apps.
Reported-by: NJulian Sikorski <belegdol@gmail.com>
Tested-by: NJulian Sikorski <belegdol@gmail.com>
Suggested-by: NJeremy Allison <jra@samba.org>
Reviewed-by: NPaulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: NSteve French <stfrench@microsoft.com>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

17bdfaa8

ceph: properly handle statfs on multifs setups · 8bfdeaf3

由 Jeff Layton 提交于 1月 14, 2022

stable inclusion
from stable-v5.10.83
commit c380062d0850d854578ef47ca714f135e6597a99
bugzilla: 185879 https://gitee.com/openeuler/kernel/issues/I4QUVG

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=c380062d0850d854578ef47ca714f135e6597a99

--------------------------------

[ Upstream commit 8cfc0c7e ]

ceph_statfs currently stuffs the cluster fsid into the f_fsid field.
This was fine when we only had a single filesystem per cluster, but now
that we have multiples we need to use something that will vary between
them.

Change ceph_statfs to xor each 32-bit chunk of the fsid (aka cluster id)
into the lower bits of the statfs->f_fsid. Change the lower bits to hold
the fscid (filesystem ID within the cluster).

That should give us a value that is guaranteed to be unique between
filesystems within a cluster, and should minimize the chance of
collisions between mounts of different clusters.

URL: https://tracker.ceph.com/issues/52812Reported-by: NSachin Prabhu <sprabhu@redhat.com>
Signed-off-by: NJeff Layton <jlayton@kernel.org>
Reviewed-by: NXiubo Li <xiubli@redhat.com>
Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

8bfdeaf3

f2fs: set SBI_NEED_FSCK flag when inconsistent node block found · b704b0a9

由 Weichao Guo 提交于 1月 14, 2022

stable inclusion
from stable-v5.10.83
commit 22423c966e02175a45bc010134d734e920b76cdf
bugzilla: 185879 https://gitee.com/openeuler/kernel/issues/I4QUVG

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=22423c966e02175a45bc010134d734e920b76cdf

--------------------------------

[ Upstream commit 6663b138 ]

Inconsistent node block will cause a file fail to open or read,
which could make the user process crashes or stucks. Let's mark
SBI_NEED_FSCK flag to trigger a fix at next fsck time. After
unlinking the corrupted file, the user process could regenerate
a new one and work correctly.
Signed-off-by: NWeichao Guo <guoweichao@oppo.com>
Reviewed-by: NChao Yu <chao@kernel.org>
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

b704b0a9

sched/scs: Reset task stack state in bringup_cpu() · 2b27c2fe

由 Mark Rutland 提交于 1月 14, 2022

stable inclusion
from stable-v5.10.83
commit e6ee7abd6bfe559ad9989004b34c320fd638c526
bugzilla: 185879 https://gitee.com/openeuler/kernel/issues/I4QUVG

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=e6ee7abd6bfe559ad9989004b34c320fd638c526

--------------------------------

[ Upstream commit dce1ca05 ]

To hot unplug a CPU, the idle task on that CPU calls a few layers of C
code before finally leaving the kernel. When KASAN is in use, poisoned
shadow is left around for each of the active stack frames, and when
shadow call stacks are in use. When shadow call stacks (SCS) are in use
the task's saved SCS SP is left pointing at an arbitrary point within
the task's shadow call stack.

When a CPU is offlined than onlined back into the kernel, this stale
state can adversely affect execution. Stale KASAN shadow can alias new
stackframes and result in bogus KASAN warnings. A stale SCS SP is
effectively a memory leak, and prevents a portion of the shadow call
stack being used. Across a number of hotplug cycles the idle task's
entire shadow call stack can become unusable.

We previously fixed the KASAN issue in commit:

  e1b77c92 ("sched/kasan: remove stale KASAN poison after hotplug")

... by removing any stale KASAN stack poison immediately prior to
onlining a CPU.

Subsequently in commit:

  f1a0a376 ("sched/core: Initialize the idle task with preemption disabled")

... the refactoring left the KASAN and SCS cleanup in one-time idle
thread initialization code rather than something invoked prior to each
CPU being onlined, breaking both as above.

We fixed SCS (but not KASAN) in commit:

  63acd42c ("sched/scs: Reset the shadow stack when idle_task_exit")

... but as this runs in the context of the idle task being offlined it's
potentially fragile.

To fix these consistently and more robustly, reset the SCS SP and KASAN
shadow of a CPU's idle task immediately before we online that CPU in
bringup_cpu(). This ensures the idle task always has a consistent state
when it is running, and removes the need to so so when exiting an idle
task.

Whenever any thread is created, dup_task_struct() will give the task a
stack which is free of KASAN shadow, and initialize the task's SCS SP,
so there's no need to specially initialize either for idle thread within
init_idle(), as this was only necessary to handle hotplug cycles.

I've tested this on arm64 with:

* gcc 11.1.0, defconfig +KASAN_INLINE, KASAN_STACK
* clang 12.0.0, defconfig +KASAN_INLINE, KASAN_STACK, SHADOW_CALL_STACK

... offlining and onlining CPUS with:

| while true; do
|   for C in /sys/devices/system/cpu/cpu*/online; do
|     echo 0 > $C;
|     echo 1 > $C;
|   done
| done

Fixes: f1a0a376 ("sched/core: Initialize the idle task with preemption disabled")
Reported-by: NQian Cai <quic_qiancai@quicinc.com>
Signed-off-by: NMark Rutland <mark.rutland@arm.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: NValentin Schneider <valentin.schneider@arm.com>
Tested-by: NQian Cai <quic_qiancai@quicinc.com>
Link: https://lore.kernel.org/lkml/20211115113310.35693-1-mark.rutland@arm.com/Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

2b27c2fe

tcp: correctly handle increased zerocopy args struct size · 59e15d72

由 Arjun Roy 提交于 1月 14, 2022

stable inclusion
from stable-v5.10.83
commit 71e38a0c7cf88c9ea672b8aa9cf978e01fdb17a3
bugzilla: 185879 https://gitee.com/openeuler/kernel/issues/I4QUVG

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=71e38a0c7cf88c9ea672b8aa9cf978e01fdb17a3

--------------------------------

[ Upstream commit e0fecb28 ]

A prior patch increased the size of struct tcp_zerocopy_receive
but did not update do_tcp_getsockopt() handling to properly account
for this.

This patch simply reintroduces content erroneously cut from the
referenced prior patch that handles the new struct size.

Fixes: 18fb76ed ("net-zerocopy: Copy straggler unaligned data for TCP Rx. zerocopy.")
Signed-off-by: NArjun Roy <arjunroy@google.com>
Signed-off-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NSoheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

59e15d72

net: mscc: ocelot: correctly report the timestamping RX filters in ethtool · 8cedf512

由 Vladimir Oltean 提交于 1月 14, 2022

stable inclusion
from stable-v5.10.83
commit 72f2117e450b631d269ad3a5372223febe487e13
bugzilla: 185879 https://gitee.com/openeuler/kernel/issues/I4QUVG

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=72f2117e450b631d269ad3a5372223febe487e13

--------------------------------

[ Upstream commit c49a35ee ]

The driver doesn't support RX timestamping for non-PTP packets, but it
declares that it does. Restrict the reported RX filters to PTP v2 over
L2 and over L4.

Fixes: 4e3b0468 ("net: mscc: PTP Hardware Clock (PHC) support")
Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: NJakub Kicinski <kuba@kernel.org>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

8cedf512

net: mscc: ocelot: don't downgrade timestamping RX filters in SIOCSHWTSTAMP · d6885fa5

由 Vladimir Oltean 提交于 1月 14, 2022

stable inclusion
from stable-v5.10.83
commit 73115a2b38dd09ed28015a4d58b0587742fd7299
bugzilla: 185879 https://gitee.com/openeuler/kernel/issues/I4QUVG

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=73115a2b38dd09ed28015a4d58b0587742fd7299

--------------------------------

[ Upstream commit 8a075464 ]

The ocelot driver, when asked to timestamp all receiving packets, 1588
v1 or NTP, says "nah, here's 1588 v2 for you".

According to this discussion:
https://patchwork.kernel.org/project/netdevbpf/patch/20211104133204.19757-8-martin.kaistra@linutronix.de/#24577647
drivers that downgrade from a wider request to a narrower response (or
even a response where the intersection with the request is empty) are
buggy, and should return -ERANGE instead. This patch fixes that.

Fixes: 4e3b0468 ("net: mscc: PTP Hardware Clock (PHC) support")
Suggested-by: NRichard Cochran <richardcochran@gmail.com>
Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: NRichard Cochran <richardcochran@gmail.com>
Signed-off-by: NJakub Kicinski <kuba@kernel.org>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

d6885fa5

net/smc: Don't call clcsock shutdown twice when smc shutdown · df3673b6

由 Tony Lu 提交于 1月 14, 2022

stable inclusion
from stable-v5.10.83
commit 215167df4512f2e7f3ace6b864a1697fcfeea03d
bugzilla: 185879 https://gitee.com/openeuler/kernel/issues/I4QUVG

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=215167df4512f2e7f3ace6b864a1697fcfeea03d

--------------------------------

[ Upstream commit bacb6c1e ]

When applications call shutdown() with SHUT_RDWR in userspace,
smc_close_active() calls kernel_sock_shutdown(), and it is called
twice in smc_shutdown().

This fixes this by checking sk_state before do clcsock shutdown, and
avoids missing the application's call of smc_shutdown().

Link: https://lore.kernel.org/linux-s390/1f67548e-cbf6-0dce-82b5-10288a4583bd@linux.ibm.com/
Fixes: 606a63c9 ("net/smc: Ensure the active closing peer first closes clcsock")
Signed-off-by: NTony Lu <tonylu@linux.alibaba.com>
Reviewed-by: NWen Gu <guwen@linux.alibaba.com>
Acked-by: NKarsten Graul <kgraul@linux.ibm.com>
Link: https://lore.kernel.org/r/20211126024134.45693-1-tonylu@linux.alibaba.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

df3673b6

net: vlan: fix underflow for the real_dev refcnt · 0bf5cd4a

由 Ziyang Xuan 提交于 1月 14, 2022

stable inclusion
from stable-v5.10.83
commit 6e800ee43218a56acc93676bbb3d93b74779e555
bugzilla: 185879 https://gitee.com/openeuler/kernel/issues/I4QUVG

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=6e800ee43218a56acc93676bbb3d93b74779e555

--------------------------------

[ Upstream commit 01d9cc2d ]

Inject error before dev_hold(real_dev) in register_vlan_dev(),
and execute the following testcase:

ip link add dev dummy1 type dummy
ip link add name dummy1.100 link dummy1 type vlan id 100
ip link del dev dummy1

When the dummy netdevice is removed, we will get a WARNING as following:

=======================================================================
refcount_t: decrement hit 0; leaking memory.
WARNING: CPU: 2 PID: 0 at lib/refcount.c:31 refcount_warn_saturate+0xbf/0x1e0

and an endless loop of:

=======================================================================
unregister_netdevice: waiting for dummy1 to become free. Usage count = -1073741824

That is because dev_put(real_dev) in vlan_dev_free() be called without
dev_hold(real_dev) in register_vlan_dev(). It makes the refcnt of real_dev
underflow.

Move the dev_hold(real_dev) to vlan_dev_init() which is the call-back of
ndo_init(). That makes dev_hold() and dev_put() for vlan's real_dev
symmetrical.

Fixes: 563bcbae ("net: vlan: fix a UAF in vlan_dev_real_dev()")
Reported-by: NPetr Machata <petrm@nvidia.com>
Suggested-by: NJakub Kicinski <kuba@kernel.org>
Signed-off-by: NZiyang Xuan <william.xuanziyang@huawei.com>
Link: https://lore.kernel.org/r/20211126015942.2918542-1-william.xuanziyang@huawei.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

0bf5cd4a

net/sched: sch_ets: don't peek at classes beyond 'nbands' · 6214996e

由 Davide Caratti 提交于 1月 14, 2022

stable inclusion
from stable-v5.10.83
commit ae2659d2c670252759ee9c823c4e039c0e05a6f2
bugzilla: 185879 https://gitee.com/openeuler/kernel/issues/I4QUVG

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=ae2659d2c670252759ee9c823c4e039c0e05a6f2

--------------------------------

[ Upstream commit de6d2592 ]

when the number of DRR classes decreases, the round-robin active list can
contain elements that have already been freed in ets_qdisc_change(). As a
consequence, it's possible to see a NULL dereference crash, caused by the
attempt to call cl->qdisc->ops->peek(cl->qdisc) when cl->qdisc is NULL:

 BUG: kernel NULL pointer dereference, address: 0000000000000018
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 0 P4D 0
 Oops: 0000 [#1] PREEMPT SMP NOPTI
 CPU: 1 PID: 910 Comm: mausezahn Not tainted 5.16.0-rc1+ #475
 Hardware name: Red Hat KVM, BIOS 1.11.1-4.module+el8.1.0+4066+0f1aadab 04/01/2014
 RIP: 0010:ets_qdisc_dequeue+0x129/0x2c0 [sch_ets]
 Code: c5 01 41 39 ad e4 02 00 00 0f 87 18 ff ff ff 49 8b 85 c0 02 00 00 49 39 c4 0f 84 ba 00 00 00 49 8b ad c0 02 00 00 48 8b 7d 10 <48> 8b 47 18 48 8b 40 38 0f ae e8 ff d0 48 89 c3 48 85 c0 0f 84 9d
 RSP: 0000:ffffbb36c0b5fdd8 EFLAGS: 00010287
 RAX: ffff956678efed30 RBX: 0000000000000000 RCX: 0000000000000000
 RDX: 0000000000000002 RSI: ffffffff9b938dc9 RDI: 0000000000000000
 RBP: ffff956678efed30 R08: e2f3207fe360129c R09: 0000000000000000
 R10: 0000000000000001 R11: 0000000000000001 R12: ffff956678efeac0
 R13: ffff956678efe800 R14: ffff956611545000 R15: ffff95667ac8f100
 FS:  00007f2aa9120740(0000) GS:ffff95667b800000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000018 CR3: 000000011070c000 CR4: 0000000000350ee0
 Call Trace:
  <TASK>
  qdisc_peek_dequeued+0x29/0x70 [sch_ets]
  tbf_dequeue+0x22/0x260 [sch_tbf]
  __qdisc_run+0x7f/0x630
  net_tx_action+0x290/0x4c0
  __do_softirq+0xee/0x4f8
  irq_exit_rcu+0xf4/0x130
  sysvec_apic_timer_interrupt+0x52/0xc0
  asm_sysvec_apic_timer_interrupt+0x12/0x20
 RIP: 0033:0x7f2aa7fc9ad4
 Code: b9 ff ff 48 8b 54 24 18 48 83 c4 08 48 89 ee 48 89 df 5b 5d e9 ed fc ff ff 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa <53> 48 83 ec 10 48 8b 05 10 64 33 00 48 8b 00 48 85 c0 0f 85 84 00
 RSP: 002b:00007ffe5d33fab8 EFLAGS: 00000202
 RAX: 0000000000000002 RBX: 0000561f72c31460 RCX: 0000561f72c31720
 RDX: 0000000000000002 RSI: 0000561f72c31722 RDI: 0000561f72c31720
 RBP: 000000000000002a R08: 00007ffe5d33fa40 R09: 0000000000000014
 R10: 0000000000000000 R11: 0000000000000246 R12: 0000561f7187e380
 R13: 0000000000000000 R14: 0000000000000000 R15: 0000561f72c31460
  </TASK>
 Modules linked in: sch_ets sch_tbf dummy rfkill iTCO_wdt intel_rapl_msr iTCO_vendor_support intel_rapl_common joydev virtio_balloon lpc_ich i2c_i801 i2c_smbus pcspkr ip_tables xfs libcrc32c crct10dif_pclmul crc32_pclmul crc32c_intel ahci libahci ghash_clmulni_intel serio_raw libata virtio_blk virtio_console virtio_net net_failover failover sunrpc dm_mirror dm_region_hash dm_log dm_mod
 CR2: 0000000000000018

Ensuring that 'alist' was never zeroed [1] was not sufficient, we need to
remove from the active list those elements that are no more SP nor DRR.

[1] https://lore.kernel.org/netdev/60d274838bf09777f0371253416e8af71360bc08.1633609148.git.dcaratti@redhat.com/

v3: fix race between ets_qdisc_change() and ets_qdisc_dequeue() delisting
    DRR classes beyond 'nbands' in ets_qdisc_change() with the qdisc lock
    acquired, thanks to Cong Wang.

v2: when a NULL qdisc is found in the DRR active list, try to dequeue skb
    from the next list item.
Reported-by: NHangbin Liu <liuhangbin@gmail.com>
Fixes: dcc68b4d ("net: sch_ets: Add a new Qdisc")
Signed-off-by: NDavide Caratti <dcaratti@redhat.com>
Link: https://lore.kernel.org/r/7a5c496eed2d62241620bdbb83eb03fb9d571c99.1637762721.git.dcaratti@redhat.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

6214996e

tls: fix replacing proto_ops · bf532d7b

由 Jakub Kicinski 提交于 1月 14, 2022

stable inclusion
from stable-v5.10.83
commit e3509feb46fa15680a9c8afbcb760e962349c1e2
bugzilla: 185879 https://gitee.com/openeuler/kernel/issues/I4QUVG

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=e3509feb46fa15680a9c8afbcb760e962349c1e2

--------------------------------

[ Upstream commit f3911f73 ]

We replace proto_ops whenever TLS is configured for RX. But our
replacement also overrides sendpage_locked, which will crash
unless TX is also configured. Similarly we plug both of those
in for TLS_HW (NIC crypto offload) even tho TLS_HW has a completely
different implementation for TX.

Last but not least we always plug in something based on inet_stream_ops
even though a few of the callbacks differ for IPv6 (getname, release,
bind).

Use a callback building method similar to what we do for struct proto.

Fixes: c46234eb ("tls: RX path for ktls")
Fixes: d4ffb02d ("net/tls: enable sk_msg redirect to tls socket egress")
Signed-off-by: NJakub Kicinski <kuba@kernel.org>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

bf532d7b

tls: splice_read: fix record type check · 98bc1786

由 Jakub Kicinski 提交于 1月 14, 2022

stable inclusion
from stable-v5.10.83
commit 22156242b1042d5cce74f1bd20db541abdd2ecd7
bugzilla: 185879 https://gitee.com/openeuler/kernel/issues/I4QUVG

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=22156242b1042d5cce74f1bd20db541abdd2ecd7

--------------------------------

[ Upstream commit 520493f6 ]

We don't support splicing control records. TLS 1.3 changes moved
the record type check into the decrypt if(). The skb may already
be decrypted and still be an alert.

Note that decrypt_skb_update() is idempotent and updates ctx->decrypted
so the if() is pointless.

Reorder the check for decryption errors with the content type check
while touching them. This part is not really a bug, because if
decryption failed in TLS 1.3 content type will be DATA, and for
TLS 1.2 it will be correct. Nevertheless its strange to touch output
before checking if the function has failed.

Fixes: fedf201e ("net: tls: Refactor control message handling on recv")
Signed-off-by: NJakub Kicinski <kuba@kernel.org>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

98bc1786

MIPS: use 3-level pgtable for 64KB page size on MIPS_VA_BITS_48 · 0430b8db

由 Huang Pei 提交于 1月 14, 2022

stable inclusion
from stable-v5.10.83
commit 3b6c71c097daff9dd724cd306042045d18fd6b03
bugzilla: 185879 https://gitee.com/openeuler/kernel/issues/I4QUVG

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=3b6c71c097daff9dd724cd306042045d18fd6b03

--------------------------------

[ Upstream commit 41ce097f ]

It hangup when booting Loongson 3A1000 with BOTH
CONFIG_PAGE_SIZE_64KB and CONFIG_MIPS_VA_BITS_48, that it turn
out to use 2-level pgtable instead of 3-level. 64KB page size
with 2-level pgtable only cover 42 bits VA, use 3-level pgtable
to cover all 48 bits VA(55 bits)

Fixes: 1e321fa9 ("MIPS64: Support of at least 48 bits of SEGBITS)
Signed-off-by: NHuang Pei <huangpei@loongson.cn>
Signed-off-by: NThomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

0430b8db

MIPS: loongson64: fix FTLB configuration · 0b34649f

由 Huang Pei 提交于 1月 14, 2022

stable inclusion
from stable-v5.10.83
commit a6a5d853f1e6b731d5c4709001f71b4a35c31f1b
bugzilla: 185879 https://gitee.com/openeuler/kernel/issues/I4QUVG

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=a6a5d853f1e6b731d5c4709001f71b4a35c31f1b

--------------------------------

[ Upstream commit 7db5e9e9 ]

It turns out that 'decode_configs' -> 'set_ftlb_enable' is called under
c->cputype unset, which leaves FTLB disabled on BOTH 3A2000 and 3A3000

Fix it by calling "decode_configs" after c->cputype is initialized

Fixes: da1bd297 ("MIPS: Loongson64: Probe CPU features via CPUCFG")
Signed-off-by: NHuang Pei <huangpei@loongson.cn>
Signed-off-by: NThomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

0b34649f

igb: fix netpoll exit with traffic · f25ea7bd

由 Jesse Brandeburg 提交于 1月 14, 2022

stable inclusion
from stable-v5.10.83
commit 5e823dbee23cc06712d6d39dc7bb38711f407ffc
bugzilla: 185879 https://gitee.com/openeuler/kernel/issues/I4QUVG

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=5e823dbee23cc06712d6d39dc7bb38711f407ffc

--------------------------------

[ Upstream commit eaeace60 ]

Oleksandr brought a bug report where netpoll causes trace
messages in the log on igb.

Danielle brought this back up as still occurring, so we'll try
again.

[22038.710800] ------------[ cut here ]------------
[22038.710801] igb_poll+0x0/0x1440 [igb] exceeded budget in poll
[22038.710802] WARNING: CPU: 12 PID: 40362 at net/core/netpoll.c:155 netpoll_poll_dev+0x18a/0x1a0

As Alex suggested, change the driver to return work_done at the
exit of napi_poll, which should be safe to do in this driver
because it is not polling multiple queues in this single napi
context (multiple queues attached to one MSI-X vector). Several
other drivers contain the same simple sequence, so I hope
this will not create new problems.

Fixes: 16eb8815 ("igb: Refactor clean_rx_irq to reduce overhead and improve performance")
Reported-by: NOleksandr Natalenko <oleksandr@natalenko.name>
Reported-by: NDanielle Ratson <danieller@nvidia.com>
Suggested-by: NAlexander Duyck <alexander.duyck@gmail.com>
Signed-off-by: NJesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: NOleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: NDanielle Ratson <danieller@nvidia.com>
Link: https://lore.kernel.org/r/20211123204000.1597971-1-jesse.brandeburg@intel.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

f25ea7bd

nvmet: use IOCB_NOWAIT only if the filesystem supports it · 0d8af269

由 Maurizio Lombardi 提交于 1月 14, 2022

stable inclusion
from stable-v5.10.83
commit f2a58ff3e3ad6104fde2561407f35a1aba5a5b7b
bugzilla: 185879 https://gitee.com/openeuler/kernel/issues/I4QUVG

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=f2a58ff3e3ad6104fde2561407f35a1aba5a5b7b

--------------------------------

[ Upstream commit c024b226 ]

Submit I/O requests with the IOCB_NOWAIT flag set only if
the underlying filesystem supports it.

Fixes: 50a909db ("nvmet: use IOCB_NOWAIT for file-ns buffered I/O")
Signed-off-by: NMaurizio Lombardi <mlombard@redhat.com>
Reviewed-by: NChaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

0d8af269

net/smc: Fix loop in smc_listen · c2e4f796

由 Guo DaXing 提交于 1月 14, 2022

stable inclusion
from stable-v5.10.83
commit 12ceb52f2cc49583394bad42a39468fde9d8e0cc
bugzilla: 185879 https://gitee.com/openeuler/kernel/issues/I4QUVG

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=12ceb52f2cc49583394bad42a39468fde9d8e0cc

--------------------------------

[ Upstream commit 9ebb0c4b ]

The kernel_listen function in smc_listen will fail when all the available
ports are occupied.  At this point smc->clcsock->sk->sk_data_ready has
been changed to smc_clcsock_data_ready.  When we call smc_listen again,
now both smc->clcsock->sk->sk_data_ready and smc->clcsk_data_ready point
to the smc_clcsock_data_ready function.

The smc_clcsock_data_ready() function calls lsmc->clcsk_data_ready which
now points to itself resulting in an infinite loop.

This patch restores smc->clcsock->sk->sk_data_ready with the old value.

Fixes: a60a2b1e ("net/smc: reduce active tcp_listen workers")
Signed-off-by: NGuo DaXing <guodaxing@huawei.com>
Acked-by: NTony Lu <tonylu@linux.alibaba.com>
Signed-off-by: NKarsten Graul <kgraul@linux.ibm.com>
Signed-off-by: NJakub Kicinski <kuba@kernel.org>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

c2e4f796

net/smc: Fix NULL pointer dereferencing in smc_vlan_by_tcpsk() · 330266c4

由 Karsten Graul 提交于 1月 14, 2022

stable inclusion
from stable-v5.10.83
commit c94cbd262b6aa3b54d73a1ed1f9c0d19df57f4ff
bugzilla: 185879 https://gitee.com/openeuler/kernel/issues/I4QUVG

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=c94cbd262b6aa3b54d73a1ed1f9c0d19df57f4ff

--------------------------------

[ Upstream commit 587acad4 ]

Coverity reports a possible NULL dereferencing problem:

in smc_vlan_by_tcpsk():
6. returned_null: netdev_lower_get_next returns NULL (checked 29 out of 30 times).
7. var_assigned: Assigning: ndev = NULL return value from netdev_lower_get_next.
1623                ndev = (struct net_device *)netdev_lower_get_next(ndev, &lower);
CID 1468509 (#1 of 1): Dereference null return value (NULL_RETURNS)
8. dereference: Dereferencing a pointer that might be NULL ndev when calling is_vlan_dev.
1624                if (is_vlan_dev(ndev)) {

Remove the manual implementation and use netdev_walk_all_lower_dev() to
iterate over the lower devices. While on it remove an obsolete function
parameter comment.

Fixes: cb9d43f6 ("net/smc: determine vlan_id of stacked net_device")
Suggested-by: NJulian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: NKarsten Graul <kgraul@linux.ibm.com>
Signed-off-by: NJakub Kicinski <kuba@kernel.org>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

330266c4

net: phylink: Force retrigger in case of latched link-fail indicator · 322ebb87

由 Russell King (Oracle) 提交于 1月 14, 2022

stable inclusion
from stable-v5.10.83
commit 3d4937c6a328947f980ba0e7a7f03901a2ec2aa0
bugzilla: 185879 https://gitee.com/openeuler/kernel/issues/I4QUVG

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=3d4937c6a328947f980ba0e7a7f03901a2ec2aa0

--------------------------------

[ Upstream commit dbae3388 ]

On mv88e6xxx 1G/2.5G PCS, the SerDes register 4.2001.2 has the following
description:
  This register bit indicates when link was lost since the last
  read. For the current link status, read this register
  back-to-back.

Thus to get current link state, we need to read the register twice.

But doing that in the link change interrupt handler would lead to
potentially ignoring link down events, which we really want to avoid.

Thus this needs to be solved in phylink's resolve, by retriggering
another resolve in the event when PCS reports link down and previous
link was up, and by re-reading PCS state if the previous link was down.

The wrong value is read when phylink requests change from sgmii to
2500base-x mode, and link won't come up. This fixes the bug.

Fixes: 9525ae83 ("phylink: add phylink infrastructure")
Signed-off-by: NRussell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: NMarek Behún <kabel@kernel.org>
Signed-off-by: NJakub Kicinski <kuba@kernel.org>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

322ebb87

net: phylink: Force link down and retrigger resolve on interface change · 27e8408e

由 Russell King (Oracle) 提交于 1月 14, 2022

stable inclusion
from stable-v5.10.83
commit 50162ff3c80fe8db7cfff76ecf54b9e80c5923a3
bugzilla: 185879 https://gitee.com/openeuler/kernel/issues/I4QUVG

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=50162ff3c80fe8db7cfff76ecf54b9e80c5923a3

--------------------------------

[ Upstream commit 80662f4f ]

On PHY state change the phylink_resolve() function can read stale
information from the MAC and report incorrect link speed and duplex to
the kernel message log.

Example with a Marvell 88X3310 PHY connected to a SerDes port on Marvell
88E6393X switch:
- PHY driver triggers state change due to PHY interface mode being
  changed from 10gbase-r to 2500base-x due to copper change in speed
  from 10Gbps to 2.5Gbps, but the PHY itself either hasn't yet changed
  its interface to the host, or the interrupt about loss of SerDes link
  hadn't arrived yet (there can be a delay of several milliseconds for
  this), so we still think that the 10gbase-r mode is up
- phylink_resolve()
  - phylink_mac_pcs_get_state()
    - this fills in speed=10g link=up
  - interface mode is updated to 2500base-x but speed is left at 10Gbps
  - phylink_major_config()
    - interface is changed to 2500base-x
  - phylink_link_up()
    - mv88e6xxx_mac_link_up()
      - .port_set_speed_duplex()
        - speed is set to 10Gbps
    - reports "Link is Up - 10Gbps/Full" to dmesg

Afterwards when the interrupt finally arrives for mv88e6xxx, another
resolve is forced in which we get the correct speed from
phylink_mac_pcs_get_state(), but since the interface is not being
changed anymore, we don't call phylink_major_config() but only
phylink_mac_config(), which does not set speed/duplex anymore.

To fix this, we need to force the link down and trigger another resolve
on PHY interface change event.

Fixes: 9525ae83 ("phylink: add phylink infrastructure")
Signed-off-by: NRussell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: NMarek Behún <kabel@kernel.org>
Signed-off-by: NJakub Kicinski <kuba@kernel.org>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

27e8408e

lan743x: fix deadlock in lan743x_phy_link_status_change() · 54f8e4f0

由 Heiner Kallweit 提交于 1月 14, 2022

stable inclusion
from stable-v5.10.83
commit 95ba8f0d57ce1248eb105fa0a003d57ec98ab730
bugzilla: 185879 https://gitee.com/openeuler/kernel/issues/I4QUVG

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=95ba8f0d57ce1248eb105fa0a003d57ec98ab730

--------------------------------

[ Upstream commit ddb826c2 ]

Usage of phy_ethtool_get_link_ksettings() in the link status change
handler isn't needed, and in combination with the referenced change
it results in a deadlock. Simply remove the call and replace it with
direct access to phydev->speed. The duplex argument of
lan743x_phy_update_flowcontrol() isn't used and can be removed.

Fixes: c10a485c ("phy: phy_ethtool_ksettings_get: Lock the phy for consistency")
Reported-by: NAlessandro B Maurici <abmaurici@gmail.com>
Tested-by: NAlessandro B Maurici <abmaurici@gmail.com>
Signed-off-by: NHeiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: NAndrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/40e27f76-0ba3-dcef-ee32-a78b9df38b0f@gmail.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

54f8e4f0

tcp_cubic: fix spurious Hystart ACK train detections for not-cwnd-limited flows · a22687f0

由 Eric Dumazet 提交于 1月 14, 2022

stable inclusion
from stable-v5.10.83
commit c5e4316d9c02e926ed74f53ecd65a757dcbe0cd9
bugzilla: 185879 https://gitee.com/openeuler/kernel/issues/I4QUVG

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=c5e4316d9c02e926ed74f53ecd65a757dcbe0cd9

--------------------------------

[ Upstream commit 4e1fddc9 ]

While testing BIG TCP patch series, I was expecting that TCP_RR workloads
with 80KB requests/answers would send one 80KB TSO packet,
then being received as a single GRO packet.

It turns out this was not happening, and the root cause was that
cubic Hystart ACK train was triggering after a few (2 or 3) rounds of RPC.

Hystart was wrongly setting CWND/SSTHRESH to 30, while my RPC
needed a budget of ~20 segments.

Ideally these TCP_RR flows should not exit slow start.

Cubic Hystart should reset itself at each round, instead of assuming
every TCP flow is a bulk one.

Note that even after this patch, Hystart can still trigger, depending
on scheduling artifacts, but at a higher CWND/SSTHRESH threshold,
keeping optimal TSO packet sizes.

Tested:

ip link set dev eth0 gro_ipv6_max_size 131072 gso_ipv6_max_size 131072
nstat -n; netperf -H ... -t TCP_RR  -l 5  -- -r 80000,80000 -K cubic; nstat|egrep "Ip6InReceives|Hystart|Ip6OutRequests"

Before:

   8605
Ip6InReceives                   87541              0.0
Ip6OutRequests                  129496             0.0
TcpExtTCPHystartTrainDetect     1                  0.0
TcpExtTCPHystartTrainCwnd       30                 0.0

After:

  8760
Ip6InReceives                   88514              0.0
Ip6OutRequests                  87975              0.0

Fixes: ae27e98a ("[TCP] CUBIC v2.3")
Co-developed-by: NNeal Cardwell <ncardwell@google.com>
Signed-off-by: NNeal Cardwell <ncardwell@google.com>
Signed-off-by: NEric Dumazet <edumazet@google.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Soheil Hassas Yeganeh <soheil@google.com>
Link: https://lore.kernel.org/r/20211123202535.1843771-1-eric.dumazet@gmail.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

a22687f0

drm/amd/display: Set plane update flags for all planes in reset · 2528da71

由 Nicholas Kazlauskas 提交于 1月 14, 2022

stable inclusion
from stable-v5.10.83
commit 3187623096091d8c60231de5ca0e020bfa5e6ee9
bugzilla: 185879 https://gitee.com/openeuler/kernel/issues/I4QUVG

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=3187623096091d8c60231de5ca0e020bfa5e6ee9

--------------------------------

[ Upstream commit 21431f70 ]

[Why]
We're only setting the flags on stream[0]'s planes so this logic fails
if we have more than one stream in the state.

This can cause a page flip timeout with multiple displays in the
configuration.

[How]
Index into the stream_status array using the stream index - it's a 1:1
mapping.

Fixes: cdaae837 ("drm/amd/display: Handle GPU reset for DC block")
Reviewed-by: NHarry Wentland <Harry.Wentland@amd.com>
Acked-by: NQingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: NNicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Tested-by: NDaniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

2528da71

PM: hibernate: use correct mode for swsusp_close() · 42453468

由 Thomas Zeitlhofer 提交于 1月 14, 2022

stable inclusion
from stable-v5.10.83
commit f634c755a0ee16232a450406cbd6266f0f500d2d
bugzilla: 185879 https://gitee.com/openeuler/kernel/issues/I4QUVG

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=f634c755a0ee16232a450406cbd6266f0f500d2d

--------------------------------

[ Upstream commit cefcf24b ]

Commit 39fbef4b ("PM: hibernate: Get block device exclusively in
swsusp_check()") changed the opening mode of the block device to
(FMODE_READ | FMODE_EXCL).

In the corresponding calls to swsusp_close(), the mode is still just
FMODE_READ which triggers the warning in blkdev_flush_mapping() on
resume from hibernate.

So, use the mode (FMODE_READ | FMODE_EXCL) also when closing the
device.

Fixes: 39fbef4b ("PM: hibernate: Get block device exclusively in swsusp_check()")
Signed-off-by: NThomas Zeitlhofer <thomas.zeitlhofer+lkml@ze-it.at>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

42453468

net/ncsi : Add payload to be 32-bit aligned to fix dropped packets · 4bf8df41

由 Kumar Thangavel 提交于 1月 14, 2022

stable inclusion
from stable-v5.10.83
commit 440bd9faad298c3cb31dcc6713070dc718769d47
bugzilla: 185879 https://gitee.com/openeuler/kernel/issues/I4QUVG

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=440bd9faad298c3cb31dcc6713070dc718769d47

--------------------------------

[ Upstream commit ac132852 ]

Update NC-SI command handler (both standard and OEM) to take into
account of payload paddings in allocating skb (in case of payload
size is not 32-bit aligned).

The checksum field follows payload field, without taking payload
padding into account can cause checksum being truncated, leading to
dropped packets.

Fixes: fb4ee675 ("net/ncsi: Add NCSI OEM command support")
Signed-off-by: NKumar Thangavel <thangavel.k@hcl.com>
Acked-by: NSamuel Mendoza-Jonas <sam@mendozajonas.com>
Reviewed-by: NPaul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

4bf8df41

openeuler / Kernel 接近 2 年 前同步成功

openeuler / Kernel
接近 2 年前同步成功