Commits · 4f52733be78f1721bd4d4e9ef342cd91ae843bc3 · openeuler / Kernel

May 28, 2022 · 40 commits

mac80211: fix forwarded mesh frames AC & queue selection · 4f52733b

Committed by Nicolas Escande on May 28, 2022

stable inclusion
from stable-v5.10.104
commit fa65989a48679dd67d8d0fbccd4e204142d4c707
bugzilla: https://gitee.com/openeuler/kernel/issues/I56XAC

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=fa65989a48679dd67d8d0fbccd4e204142d4c707

--------------------------------

commit 859ae701 upstream.

There are two problems with the current code that have been highlighted
with the AQL feature that is now enabled by default.

First problem is in ieee80211_rx_h_mesh_fwding(),
ieee80211_select_queue_80211() is used on received packets to choose
the sending AC queue of the forwarding packet although this function
should only be called on TX packet (it uses ieee80211_tx_info).
This ends with forwarded mesh packets being sent on an unrelated, random AC
queue. To fix that, the AC queue can be inferred directly from skb->priority
which has been extracted from QOS info (see ieee80211_parse_qos()).

Second problem is the value of queue_mapping set on forwarded mesh
frames via skb_set_queue_mapping() is not the AC of the packet but a
hardware queue index. This may or may not work depending on AC to HW
queue mapping which is driver specific.

Both of these issues lead to improper AC selection while forwarding
mesh packets. More importantly, the resulting improper airtime accounting
(which is done on a per-STA, per-AC basis) caused traffic stalls once
AQL was introduced.
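
As a rough illustration of the first fix, the sketch below models how an access category can be derived from skb->priority. It is a userspace Python model, not mac80211 code: the helper names are hypothetical, and the table is assumed to match the 802.1D-priority-to-AC mapping that mac80211 uses.

```python
# Userspace model only; names are illustrative, not the mac80211 API.
AC_VO, AC_VI, AC_BE, AC_BK = 0, 1, 2, 3

# 802.1D user priority (0..7) -> access category. Assumed to mirror the
# mapping mac80211 applies to QoS frames.
PRIORITY_TO_AC = [AC_BE, AC_BK, AC_BK, AC_BE, AC_VI, AC_VI, AC_VO, AC_VO]


def ac_from_priority(priority: int) -> int:
    """Derive the AC from skb->priority; only the low 3 bits carry the TID."""
    return PRIORITY_TO_AC[priority & 7]


def forward_queue_mapping(priority: int) -> int:
    """The forwarded frame's queue_mapping is the AC itself, not a HW queue index."""
    return ac_from_priority(priority)
```

Because the mapping depends only on the priority already parsed from the received frame, it does not need any TX-side state such as ieee80211_tx_info.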

Fixes: cf440128 ("mac80211: fix unnecessary frame drops in mesh fwding")
Fixes: d3c1597b ("mac80211: fix forwarded mesh frame queue mapping")
Co-developed-by: Remi Pommarel <repk@triplefau.lt>
Signed-off-by: Remi Pommarel <repk@triplefau.lt>
Signed-off-by: Nicolas Escande <nico.escande@gmail.com>
Link: https://lore.kernel.org/r/20220214173214.368862-1-nico.escande@gmail.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>


ice: fix concurrent reset and removal of VFs · 60ac7168

Committed by Jacob Keller on May 28, 2022

stable inclusion
from stable-v5.10.104
commit 05ae1f0fe9c6c5ead08b306e665763a352d20716
bugzilla: https://gitee.com/openeuler/kernel/issues/I56XAC

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=05ae1f0fe9c6c5ead08b306e665763a352d20716

--------------------------------

commit fadead80 upstream.

Commit c503e632 ("ice: Stop processing VF messages during teardown")
introduced a driver state flag, ICE_VF_DEINIT_IN_PROGRESS, which is
intended to prevent some issues with concurrently handling messages from
VFs while tearing down the VFs.

This change was motivated by crashes caused while tearing down and
bringing up VFs in rapid succession.

It turns out that the fix actually introduces issues with the VF driver
caused because the PF no longer responds to any messages sent by the VF
during its .remove routine. This results in the VF potentially removing
its DMA memory before the PF has shut down the device queues.

Additionally, the fix doesn't actually resolve concurrency issues within
the ice driver. It is possible for a VF to initiate a reset just prior
to the ice driver removing VFs. This can result in the remove task
concurrently operating while the VF is being reset. This results in
similar memory corruption and panics purportedly fixed by that commit.

Fix this concurrency at its root by protecting both the reset and
removal flows using the existing VF cfg_lock. This ensures that we
cannot remove the VF while any outstanding critical tasks such as a
virtchnl message or a reset are occurring.

This locking change also fixes the root cause originally fixed by commit
c503e632 ("ice: Stop processing VF messages during teardown"), so we
can simply revert it.

Note that I kept these two changes together because simply reverting the
original commit alone would leave the driver vulnerable to worse race
conditions.

Fixes: c503e632 ("ice: Stop processing VF messages during teardown")
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>


ice: Fix race conditions between virtchnl handling and VF ndo ops · a2c3987e

Committed by Brett Creeley on May 28, 2022

stable inclusion
from stable-v5.10.104
commit 41edeeaae51a1064a7e7cdea70623377cb2655cc
bugzilla: https://gitee.com/openeuler/kernel/issues/I56XAC

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=41edeeaae51a1064a7e7cdea70623377cb2655cc

--------------------------------

commit e6ba5273 upstream.

The VF can be configured via the PF's ndo ops at the same time the PF is
receiving/handling virtchnl messages. This has many issues, with
one of them being the ndo op could be actively resetting a VF (i.e.
resetting it to the default state and deleting/re-adding the VF's VSI)
while a virtchnl message is being handled. The following error was seen
because a VF ndo op was used to change a VF's trust setting while the
VIRTCHNL_OP_CONFIG_VSI_QUEUES was ongoing:

[35274.192484] ice 0000:88:00.0: Failed to set LAN Tx queue context, error: ICE_ERR_PARAM
[35274.193074] ice 0000:88:00.0: VF 0 failed opcode 6, retval: -5
[35274.193640] iavf 0000:88:01.0: PF returned error -5 (IAVF_ERR_PARAM) to our request 6

Fix this by making sure the virtchnl handling and VF ndo ops that
trigger VF resets cannot run concurrently. This is done by adding a
struct mutex cfg_lock to each VF structure. For VF ndo ops, the mutex
will be locked around the critical operations and VFR. Since the ndo ops
will trigger a VFR, the virtchnl thread will use mutex_trylock(). This
is done because if any other thread (i.e. VF ndo op) has the mutex, then
that means the current VF message being handled is no longer valid, so
just ignore it.
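
The locking scheme described above can be sketched in userspace. This is a Python model under stated assumptions, not the ice driver: `VF`, `handle_virtchnl_msg` and `ndo_set_trust` are hypothetical names, and a non-blocking `threading.Lock` acquire stands in for mutex_trylock().

```python
import threading


class VF:
    """Toy VF with a per-VF configuration lock (stands in for cfg_lock)."""

    def __init__(self):
        self.cfg_lock = threading.Lock()
        self.handled = []


def handle_virtchnl_msg(vf, opcode):
    """Handle a VF message unless a configuration op currently owns the VF."""
    if not vf.cfg_lock.acquire(blocking=False):  # models mutex_trylock()
        return False  # message is stale because a reset is in flight: ignore it
    try:
        vf.handled.append(opcode)
        return True
    finally:
        vf.cfg_lock.release()


def ndo_set_trust(vf):
    """An ndo op holds the lock across its critical section and the VF reset."""
    with vf.cfg_lock:
        # A message arriving now is dropped instead of racing with the reset.
        return handle_virtchnl_msg(vf, opcode=6)
```

The message path never blocks on the lock, so it cannot stall behind a reset; it simply drops a message that can no longer be valid.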

This issue can be seen using the following commands:

for i in {0..50}; do
        rmmod ice
        modprobe ice

        sleep 1

        echo 1 > /sys/class/net/ens785f0/device/sriov_numvfs
        echo 1 > /sys/class/net/ens785f1/device/sriov_numvfs

        ip link set ens785f1 vf 0 trust on
        ip link set ens785f0 vf 0 trust on

        sleep 2

        echo 0 > /sys/class/net/ens785f0/device/sriov_numvfs
        echo 0 > /sys/class/net/ens785f1/device/sriov_numvfs
        sleep 1
        echo 1 > /sys/class/net/ens785f0/device/sriov_numvfs
        echo 1 > /sys/class/net/ens785f1/device/sriov_numvfs

        ip link set ens785f1 vf 0 trust on
        ip link set ens785f0 vf 0 trust on
done

Fixes: 7c710869 ("ice: Add handlers for VF netdevice operations")
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>


net/smc: fix unexpected SMC_CLC_DECL_ERR_REGRMB error cause by server · d3de39e5

Committed by D. Wythe on May 28, 2022

stable inclusion
from stable-v5.10.104
commit 9bb7237cc740b9f4f8904d1823ed71c71a5e83e8
bugzilla: https://gitee.com/openeuler/kernel/issues/I56XAC

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=9bb7237cc740b9f4f8904d1823ed71c71a5e83e8

--------------------------------

commit 4940a1fd upstream.

The cause of SMC_CLC_DECL_ERR_REGRMB on the server is clear: whether a
new SMC connection can be accepted depends not only on the limit on the
number of connections, but also on the available rtoken entries. Since
rtoken release is triggered by the peer, while the connection count is
decreased locally, many things can happen in the window between the two.

The only thing that needs to be mentioned is that, now that all connection
creation is fully protected by the smc_server_lgr_pending lock, it is
enough to check only the available entries in rtokens_used_mask.
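
To make the check concrete, here is a minimal Python sketch that treats rtokens_used_mask as a plain integer bitmask. The table size and both function names are hypothetical; the kernel uses its own bitmap helpers and a larger limit.

```python
RTOKENS_MAX = 4  # hypothetical small table; the kernel's limit differs


def first_free_rtoken(used_mask: int, size: int = RTOKENS_MAX) -> int:
    """Return the index of the first clear bit, or -1 if the table is full."""
    for i in range(size):
        if not used_mask & (1 << i):
            return i
    return -1


def can_accept_connection(used_mask: int, size: int = RTOKENS_MAX) -> bool:
    """Accept a new connection only when an rtoken entry is actually free."""
    return first_free_rtoken(used_mask, size) >= 0
```

With every bit set (`0b1111`), `can_accept_connection` returns False even if the connection count alone would still allow a new connection.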

Fixes: cd6851f3 ("smc: remote memory buffers (RMBs)")
Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>


net/smc: fix unexpected SMC_CLC_DECL_ERR_REGRMB error generated by client · 5462793f

Committed by D. Wythe on May 28, 2022

stable inclusion
from stable-v5.10.104
commit d7eb662625eb56615f3caec6bac7a6f400080c7a
bugzilla: https://gitee.com/openeuler/kernel/issues/I56XAC

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=d7eb662625eb56615f3caec6bac7a6f400080c7a

--------------------------------

commit 0537f0a2 upstream.

The main reason for this unexpected SMC_CLC_DECL_ERR_REGRMB in the client
is the following execution sequence:

Server Conn A:           Server Conn B:          Client Conn B:

smc_lgr_unregister_conn
                         smc_lgr_register_conn
                         smc_clc_send_accept  ->
                                                 smc_rtoken_add
smcr_buf_unuse
                ->       Client Conn A:
                         smc_rtoken_delete

smc_lgr_unregister_conn() makes the current link available to be assigned
to a new incoming connection while smcr_buf_unuse() has not yet executed,
which means that smc_rtoken_add may fail because of insufficient rtoken
entries. Reversing their execution order avoids this problem.
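
The effect of the ordering can be shown with a toy model. The Python sketch below is a deliberate simplification: it folds the client-side rtoken table and the server-side connection slots into one hypothetical `LinkGroupModel`, which is enough to show why releasing the buffer first closes the window.

```python
class LinkGroupModel:
    """Toy model: one connection slot and one rtoken entry per connection."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.conns = set()
        self.rtokens = set()

    def register_conn(self, name):
        if len(self.conns) >= self.capacity:
            return False
        self.conns.add(name)
        return True

    def unregister_conn(self, name):
        self.conns.discard(name)

    def rtoken_add(self, name):
        if len(self.rtokens) >= self.capacity:
            return False  # would surface as SMC_CLC_DECL_ERR_REGRMB
        self.rtokens.add(name)
        return True

    def buf_unuse(self, name):
        self.rtokens.discard(name)  # the peer deletes the rtoken as a result


def close_then_open(release_buffer_first):
    """Close connection A, then let B arrive; report whether B could be set up."""
    lgr = LinkGroupModel(capacity=1)
    lgr.register_conn("A")
    lgr.rtoken_add("A")
    if release_buffer_first:
        lgr.buf_unuse("A")
        lgr.unregister_conn("A")
    else:
        lgr.unregister_conn("A")
    # Connection B arrives in the window right after A is unregistered.
    ok = lgr.register_conn("B") and lgr.rtoken_add("B")
    if not release_buffer_first:
        lgr.buf_unuse("A")
    return ok
```

In this model `close_then_open(False)` fails and `close_then_open(True)` succeeds, matching the reordering the patch describes.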

Fixes: 3e034725 ("net/smc: common functions for RMBs and send buffers")
Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>


net/smc: fix connection leak · 5388fda2

Committed by D. Wythe on May 28, 2022

stable inclusion
from stable-v5.10.104
commit 2e8d465b83db307f04ad265848f8ab3f78f6918f
bugzilla: https://gitee.com/openeuler/kernel/issues/I56XAC

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=2e8d465b83db307f04ad265848f8ab3f78f6918f

--------------------------------

commit 9f1c50cf upstream.

There's a potential leak under the following execution sequence:

smc_release  				smc_connect_work
if (sk->sk_state == SMC_INIT)
					send_clc_confirim
	tcp_abort();
					...
					sk.sk_state = SMC_ACTIVE
smc_close_active
switch(sk->sk_state) {
...
case SMC_ACTIVE:
	smc_close_final()
	// then wait peer closed

Unfortunately, tcp_abort() may discard CLC CONFIRM messages that are
still in the tcp send buffer, in which case our connection token cannot
be delivered to the server side, which means that we cannot get a
passive close message at all. Therefore, it is impossible for the
connection to be disconnected at all.

This patch takes a very simple approach to avoid this issue: once the
state has changed to SMC_ACTIVE after tcp_abort(), we can actively abort
the smc connection. Considering that the state was SMC_INIT before
tcp_abort(), abandoning the complete disconnection process should not
cause much of a problem.

In fact, this problem may exist as long as the CLC CONFIRM message is
not received by the server. Whether a timer should be added after
smc_close_final() needs to be discussed in the future. But even so, this
patch provides a faster release for connection in above case, it should
also be valuable.

Fixes: 39f41f36 ("net/smc: common release code for non-accepted sockets")
Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
Acked-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>


net: dcb: flush lingering app table entries for unregistered devices · 6161ede4

Committed by Vladimir Oltean on May 28, 2022

stable inclusion
from stable-v5.10.104
commit 6a8a4dc2a279b225783a838e04ecd469df6ff21d
bugzilla: https://gitee.com/openeuler/kernel/issues/I56XAC

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=6a8a4dc2a279b225783a838e04ecd469df6ff21d

--------------------------------

commit 91b0383f upstream.

If I'm not mistaken (and I don't think I am), the way in which the
dcbnl_ops work is that drivers call dcb_ieee_setapp() and this populates
the application table with dynamically allocated struct dcb_app_type
entries that are kept in the module-global dcb_app_list.

However, nobody keeps exact track of these entries, and although
dcb_ieee_delapp() is supposed to remove them, nobody does so when the
interface goes away (example: driver unbinds from device). So the
dcb_app_list will contain lingering entries with an ifindex that no
longer matches any device in dcb_app_lookup().

Reclaim the lost memory by listening for the NETDEV_UNREGISTER event and
flushing the app table entries of interfaces that are now gone.
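
The idea of flushing on unregister can be sketched as follows. This is a Python model with hypothetical names (`AppTable`, `netdevice_event`); the event is represented as a plain string rather than the kernel's notifier constant.

```python
NETDEV_UNREGISTER = "NETDEV_UNREGISTER"  # plain string standing in for the event


class AppTable:
    """Toy global app table keyed by ifindex, like dcb_app_list in spirit."""

    def __init__(self):
        self.entries = []  # (ifindex, selector, protocol, priority)

    def setapp(self, ifindex, selector, protocol, priority):
        self.entries.append((ifindex, selector, protocol, priority))

    def flush_dev(self, ifindex):
        """Drop every entry that belongs to the departing interface."""
        self.entries = [e for e in self.entries if e[0] != ifindex]

    def netdevice_event(self, event, ifindex):
        """Notifier callback: only the unregister event triggers a flush."""
        if event == NETDEV_UNREGISTER:
            self.flush_dev(ifindex)
```

Entries for other interfaces are left alone, and events other than unregister do not touch the table.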

In fact something like this used to be done as part of the initial
commit (blamed below), but it was done in dcbnl_exit() -> dcb_flushapp(),
essentially at module_exit time. That became dead code after commit
7a6b6f51 ("DCB: fix kconfig option") which essentially merged
"tristate config DCB" and "bool config DCBNL" into a single "bool config
DCB", so net/dcb/dcbnl.c could not be built as a module anymore.

Commit 36b9ad80 ("net/dcb: make dcbnl.c explicitly non-modular")
recognized this and deleted dcbnl_exit() and dcb_flushapp() altogether,
leaving us with the version we have today.

Since flushing application table entries can and should be done as soon
as the netdevice disappears, fundamentally the commit that is to blame
is the one that introduced the design of this API.

Fixes: 9ab933ab ("dcbnl: add appliction tlv handlers")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>


net: ipv6: ensure we call ipv6_mc_down() at most once · 2c7dfc5a

Committed by j.nixdorf@avm.de on May 28, 2022

stable inclusion
from stable-v5.10.104
commit f4c63b24dea9cc2043ff845dcca9aaf8109ea38a
bugzilla: https://gitee.com/openeuler/kernel/issues/I56XAC

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=f4c63b24dea9cc2043ff845dcca9aaf8109ea38a

--------------------------------

commit 9995b408 upstream.

There are two reasons for addrconf_notify() to be called with NETDEV_DOWN:
either the network device is actually going down, or IPv6 was disabled
on the interface.

If either of them stays down while the other is toggled, we repeatedly
call the code for NETDEV_DOWN, including ipv6_mc_down(), while never
calling the corresponding ipv6_mc_up() in between. This will cause a
new entry in idev->mc_tomb to be allocated for each multicast group
the interface is subscribed to, which in turn leaks one struct ifmcaddr6
per nontrivial multicast group the interface is subscribed to.

The following reproducer will leak at least $n objects:

ip addr add ff2e::4242/32 dev eth0 autojoin
sysctl -w net.ipv6.conf.eth0.disable_ipv6=1
for i in $(seq 1 $n); do
	ip link set up eth0; ip link set down eth0
done

Joining groups with IPV6_ADD_MEMBERSHIP (unprivileged) or setting the
sysctl net.ipv6.conf.eth0.forwarding to 1 (=> subscribing to ff02::2)
can also be used to create a nontrivial idev->mc_list, which will then
leak objects with the right up-down sequence.

Based on both sources for NETDEV_DOWN events the interface IPv6 state
should be considered:

 - not ready if the network interface is not ready OR IPv6 is disabled
   for it
 - ready if the network interface is ready AND IPv6 is enabled for it

The functions ipv6_mc_up() and ipv6_mc_down() should only be run when this
state changes.

Implement this by remembering when the IPv6 state is ready, and only
run ipv6_mc_down() if it actually changed from ready to not ready.

The other direction (not ready -> ready) already works correctly, as:

 - the interface notification triggered codepath for NETDEV_UP /
   NETDEV_CHANGE returns early if ipv6 is disabled, and
 - the disable_ipv6=0 triggered codepath skips fully initializing the
   interface as long as addrconf_link_ready(dev) returns false
 - calling ipv6_mc_up() repeatedly does not leak anything
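
The "only on a state change" rule can be modelled as a small state machine. The Python sketch below uses hypothetical names (`Idev`, `mc_ready`, `mc_down_calls`) and counts how often the down path runs; it is a model of the idea, not the addrconf code.

```python
class Idev:
    """Toy per-interface IPv6 state with a remembered 'ready' flag."""

    def __init__(self):
        self.link_ready = False
        self.ipv6_disabled = False
        self.mc_ready = False   # hypothetical flag: multicast state is "up"
        self.mc_down_calls = 0  # how often the ipv6_mc_down() side ran

    def _want_ready(self):
        # Ready only if the link is ready AND IPv6 is enabled.
        return self.link_ready and not self.ipv6_disabled

    def _update(self):
        want = self._want_ready()
        if want and not self.mc_ready:
            self.mc_ready = True
        elif not want and self.mc_ready:
            self.mc_ready = False
            self.mc_down_calls += 1  # runs once per ready -> not ready change

    def set_link(self, up):
        self.link_ready = up
        self._update()

    def set_disable_ipv6(self, disabled):
        self.ipv6_disabled = disabled
        self._update()
```

With IPv6 disabled, toggling the link up and down any number of times leaves `mc_down_calls` unchanged, which is the behaviour the reproducer above would otherwise violate.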

Fixes: 3ce62a84 ("ipv6: exit early in addrconf_notify() if IPv6 is disabled")
Signed-off-by: Johannes Nixdorf <j.nixdorf@avm.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>


batman-adv: Don't expect inter-netns unique iflink indices · b51ac826

Committed by Sven Eckelmann on May 28, 2022

stable inclusion
from stable-v5.10.104
commit a9c4a74ad5ae4a23ce4db8cc9a0ead08ce5b60f3
bugzilla: https://gitee.com/openeuler/kernel/issues/I56XAC

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=a9c4a74ad5ae4a23ce4db8cc9a0ead08ce5b60f3

--------------------------------

commit 6c1f41af upstream.

The ifindex doesn't have to be unique for multiple network namespaces on
the same machine.

  $ ip netns add test1
  $ ip -net test1 link add dummy1 type dummy
  $ ip netns add test2
  $ ip -net test2 link add dummy2 type dummy

  $ ip -net test1 link show dev dummy1
  6: dummy1: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
      link/ether 96:81:55:1e:dd:85 brd ff:ff:ff:ff:ff:ff
  $ ip -net test2 link show dev dummy2
  6: dummy2: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
      link/ether 5a:3c:af:35:07:c3 brd ff:ff:ff:ff:ff:ff

But the batman-adv code to walk through the various layers of virtual
interfaces uses this assumption because dev_get_iflink handles it
internally and doesn't return the actual netns of the iflink. And
dev_get_iflink only documents the situation where ifindex == iflink for
physical devices.

But only checking for dev->netdev_ops->ndo_get_iflink is also not an option
because ipoib_get_iflink implements it even when it sometimes returns an
iflink != ifindex and sometimes iflink == ifindex. The caller must
therefore make sure itself to check both netns and iflink + ifindex for
equality. Only when they are equal, a "physical" interface was detected
which should stop the traversal. On the other hand, vxcan_get_iflink can
also return 0 in case there was currently no valid peer. In this case, it
is still necessary to stop.
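
The stopping condition described above can be sketched as a predicate. This is a Python model; `Dev`, `link_netns` and `stops_traversal` are hypothetical names, with `link_netns` standing for the namespace the iflink refers to.

```python
from dataclasses import dataclass


@dataclass
class Dev:
    netns: str       # namespace the device lives in
    ifindex: int
    iflink: int      # value dev_get_iflink() would report
    link_netns: str  # namespace the iflink refers to


def stops_traversal(dev: Dev) -> bool:
    """True when walking down the stack of virtual interfaces should stop."""
    if dev.iflink == 0:  # no valid peer right now
        return True
    # Only the same index in the same namespace means "physical".
    return dev.link_netns == dev.netns and dev.iflink == dev.ifindex
```

Two devices that both have ifindex 6 but live in different namespaces, as in the example above, no longer compare as the same interface.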

Fixes: b7eddd0b ("batman-adv: prevent using any virtual device created on batman-adv as hard-interface")
Fixes: 5ed4a460 ("batman-adv: additional checks for virtual interfaces on top of WiFi")
Reported-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>


batman-adv: Request iflink once in batadv_get_real_netdevice · 25c52d6d

Committed by Sven Eckelmann on May 28, 2022

stable inclusion
from stable-v5.10.104
commit 3dae11d21fc8aa57f389fd32ab884b638a04aff2
bugzilla: https://gitee.com/openeuler/kernel/issues/I56XAC

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=3dae11d21fc8aa57f389fd32ab884b638a04aff2

--------------------------------

commit 6116ba09 upstream.

There is no need to call dev_get_iflink multiple times for the same
net_device in batadv_get_real_netdevice. And since some of the
ndo_get_iflink callbacks are dynamic (for example via RCUs like in
vxcan_get_iflink), it could easily happen that the returned values are not
stable. The pre-checks before __dev_get_by_index are then of course bogus.
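
A small Python sketch of why a single read matters: if the callback can return different values on consecutive calls, a check made on one read says nothing about a later read. `FlakyDev` and `real_ifindex` are hypothetical names used only for this illustration.

```python
import itertools


class FlakyDev:
    """A device whose iflink callback can change between calls."""

    def __init__(self, ifindex, values):
        self.ifindex = ifindex
        self._values = itertools.cycle(values)
        self.calls = 0

    def get_iflink(self):
        self.calls += 1
        return next(self._values)


def real_ifindex(dev):
    """Resolve the underlying index using one snapshot of iflink."""
    iflink = dev.get_iflink()  # single read; every check uses this copy
    if iflink == dev.ifindex:
        return dev.ifindex
    return iflink
```

Because the check and the lookup use the same local copy, they cannot disagree with each other even when the callback's result is unstable.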

Fixes: 5ed4a460 ("batman-adv: additional checks for virtual interfaces on top of WiFi")
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>


batman-adv: Request iflink once in batadv-on-batadv check · 034e1d1a

Committed by Sven Eckelmann on May 28, 2022

stable inclusion
from stable-v5.10.104
commit dcf10d78ff2c38dc5097cb59ae44367db17ed0c0
bugzilla: https://gitee.com/openeuler/kernel/issues/I56XAC

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=dcf10d78ff2c38dc5097cb59ae44367db17ed0c0

--------------------------------

commit 690bb6fb upstream.

There is no need to call dev_get_iflink multiple times for the same
net_device in batadv_is_on_batman_iface. And since some of the
.ndo_get_iflink callbacks are dynamic (for example via RCUs like in
vxcan_get_iflink), it could easily happen that the returned values are not
stable. The pre-checks before __dev_get_by_index are then of course bogus.

Fixes: b7eddd0b ("batman-adv: prevent using any virtual device created on batman-adv as hard-interface")
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>


netfilter: nf_queue: handle socket prefetch · b71c158d

Committed by Florian Westphal on May 28, 2022

stable inclusion
from stable-v5.10.104
commit 81f817f3e559d3e4e56110f6132f8322a97fbc8c
bugzilla: https://gitee.com/openeuler/kernel/issues/I56XAC

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=81f817f3e559d3e4e56110f6132f8322a97fbc8c

--------------------------------

commit 3b836da4 upstream.

In case someone combines bpf socket assign and nf_queue, we will
queue an skb that references a struct sock that did not have its
reference count incremented.

As we leave rcu protection, there is no guarantee that skb->sk is still
valid.

For refcount-less skb->sk case, try to increment the reference count
and then override the destructor.

In case of failure we have two choices: orphan the skb and 'delete'
preselect or let nf_queue() drop the packet.

Do the latter, it should not happen during normal operation.

Fixes: cf7fbe66 ("bpf: Add socket assign support")
Acked-by: Joe Stringer <joe@cilium.io>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>


netfilter: nf_queue: fix possible use-after-free · c6614e9b

Committed by Florian Westphal on May 28, 2022

stable inclusion
from stable-v5.10.104
commit 4d05239203fa38ea8a6f31e228460da4cb17a71a
bugzilla: https://gitee.com/openeuler/kernel/issues/I56XAC

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=4d05239203fa38ea8a6f31e228460da4cb17a71a

--------------------------------

commit c3873070 upstream.

Eric Dumazet says:
  The sock_hold() side seems suspect, because there is no guarantee
  that sk_refcnt is not already 0.

On failure, we cannot queue the packet and need to indicate an
error.  The packet will be dropped by the caller.
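
The difference between an unconditional hold and a conditional one can be modelled in a few lines. This Python sketch uses hypothetical names (`Sock`, `queue_packet`); the errno chosen for the failure case is an assumption made for the example, not something stated in this message.

```python
import errno


class Sock:
    """Toy socket carrying only a reference count."""

    def __init__(self, refcnt):
        self.refcnt = refcnt


def refcount_inc_not_zero(sk):
    """Take a reference only if the object is still alive."""
    if sk.refcnt == 0:
        return False
    sk.refcnt += 1
    return True


def queue_packet(sk):
    """Return 0 on success, or a negative errno so the caller drops the packet."""
    if sk is not None and not refcount_inc_not_zero(sk):
        return -errno.ENOTCONN  # assumed errno, chosen for illustration
    return 0
```

An unconditional increment would turn a count of 0 back into 1 and hand out a reference to a socket that is already being freed; the conditional form refuses instead.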

v2: split skb prefetch hunk into separate change

Fixes: 271b72c7 ("udp: RCU handling for Unicast packets.")
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>


netfilter: nf_queue: don't assume sk is full socket · f9146cf8

Committed by Florian Westphal on May 28, 2022

stable inclusion
from stable-v5.10.104
commit 3b9ba964f77cbac7679379f82a6a08ddbef3bc33
bugzilla: https://gitee.com/openeuler/kernel/issues/I56XAC

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=3b9ba964f77cbac7679379f82a6a08ddbef3bc33

--------------------------------

commit 747670fd upstream.

There is no guarantee that state->sk refers to a full socket.

If refcount transitions to 0, sock_put calls sk_free which then ends up
with garbage fields.

I'd like to thank Oleksandr Natalenko and Jiri Benc for considerable
debug work and pointing out state->sk oddities.

Fixes: ca6fb065 ("tcp: attach SYNACK messages to request sockets instead of listener")
Tested-by: Oleksandr Natalenko <oleksandr@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>


net: fix up skbs delta_truesize in UDP GRO frag_list · 04a05c72

Committed by lena wang on May 28, 2022

stable inclusion
from stable-v5.10.104
commit 4e178ed14bda47942c1ccad3f60b774870b45db9
bugzilla: https://gitee.com/openeuler/kernel/issues/I56XAC

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=4e178ed14bda47942c1ccad3f60b774870b45db9

--------------------------------

commit 224102de upstream.

The truesize of a UDP GRO packet is the sum of the truesize of the main
skb and of the skbs in the main skb's frag_list:
skb_gro_receive_list
        p->truesize += skb->truesize;

Commit 53475c5d ("net: fix use-after-free when UDP GRO with
shared fraglist") introduced a truesize increase for frag_list skbs.
When uncloning an skb, it calls pskb_expand_head and the truesize of
frag_list skbs may increase. This can occur when allocators use
__netdev_alloc_skb and do not jump into __alloc_skb. That flow does not
use ksize(len) to calculate truesize, while pskb_expand_head does.
skb_segment_list
err = skb_unclone(nskb, GFP_ATOMIC);
pskb_expand_head
        if (!skb->sk || skb->destructor == sock_edemux)
                skb->truesize += size - osize;

If we use the increased truesize as delta_truesize, it will be
larger than before, and even larger than the previous total truesize value
if skbs in frag_list are abundant. The main skb truesize will become
smaller, and may even go negative, which is a huge value for an unsigned
int field. The following memory check will then drop this abnormal skb.

To avoid this error we should use the original truesize to segment the
main skb.
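
The arithmetic can be checked with a small model. The Python sketch below uses made-up sizes and a hypothetical function name; it only illustrates how subtracting post-unclone truesize values can wrap a 32-bit unsigned field.

```python
U32 = 0xFFFFFFFF


def segment_main_truesize(total, frag_truesizes_at_merge, growth_on_unclone,
                          use_original):
    """Subtract each frag's delta_truesize from the main skb (u32 arithmetic)."""
    delta = 0
    for ts in frag_truesizes_at_merge:
        # use_original samples truesize before skb_unclone() may enlarge it.
        delta += ts if use_original else ts + growth_on_unclone
    return (total - delta) & U32


# Made-up numbers: a 768-byte main skb plus three 768-byte frags.
TOTAL = 768 + 3 * 768
FIXED = segment_main_truesize(TOTAL, [768] * 3, 512, use_original=True)
BROKEN = segment_main_truesize(TOTAL, [768] * 3, 512, use_original=False)
```

With these numbers `FIXED` is 768, the main skb's own truesize, while `BROKEN` wraps around to a value just below 2^32.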

Fixes: 53475c5d ("net: fix use-after-free when UDP GRO with shared fraglist")
Signed-off-by: lena wang <lena.wang@mediatek.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/1646133431-8948-1-git-send-email-lena.wang@mediatek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>


e1000e: Correct NVM checksum verification flow · da20ea4e

Committed by Sasha Neftin on May 28, 2022

stable inclusion
from stable-v5.10.104
commit eb5e444fe37d467e54d2945c1293f311ce782f67
bugzilla: https://gitee.com/openeuler/kernel/issues/I56XAC

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=eb5e444fe37d467e54d2945c1293f311ce782f67

--------------------------------

commit ffd24fa2 upstream.

Update the MAC type check to e1000_pch_tgp, because for e1000_pch_cnp an
NVM checksum update is still possible.
Emit a more detailed warning message.

Bugzilla: https://bugzilla.opensuse.org/show_bug.cgi?id=1191663
Fixes: 4051f683 ("e1000e: Do not take care about recovery NVM checksum")
Reported-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Naama Meir <naamax.meir@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>


xfrm: enforce validity of offload input flags · 0508dbab

Committed by Leon Romanovsky on May 28, 2022

stable inclusion
from stable-v5.10.104
commit b53d4bfd1a6894e00dc8d654af61a22bb914dde4
bugzilla: https://gitee.com/openeuler/kernel/issues/I56XAC

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=b53d4bfd1a6894e00dc8d654af61a22bb914dde4

--------------------------------

commit 7c76ecd9 upstream.

struct xfrm_user_offload has a flags field that receives user input,
but the kernel didn't check whether valid bits were provided. This led to
a situation where unsanitized input was forwarded directly to the drivers.

For example, the exposed XFRM_OFFLOAD_IPV6 define was used by
strongswan but not implemented in the kernel at all.

As a solution, check and sanitize the input flags so that only
XFRM_OFFLOAD_INBOUND is forwarded to the drivers.
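
One way to picture the validation is the Python sketch below. The two flag values are taken to match the uapi defines; the function name is hypothetical, and raising `ValueError` stands in for returning an error such as -EINVAL.

```python
XFRM_OFFLOAD_IPV6 = 1
XFRM_OFFLOAD_INBOUND = 2


def sanitize_offload_flags(flags: int) -> int:
    """Reject unknown bits, then forward only the bit the drivers implement."""
    known = XFRM_OFFLOAD_IPV6 | XFRM_OFFLOAD_INBOUND
    if flags & ~known:
        # The kernel would return an error here instead of raising.
        raise ValueError("unknown offload flag bits")
    return flags & XFRM_OFFLOAD_INBOUND
```

Accepting XFRM_OFFLOAD_IPV6 while masking it out keeps existing userspace that sets the bit working, without ever passing an unimplemented flag to a driver.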

Fixes: d77e38e6 ("xfrm: Add an IPsec hardware offloading API")
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>


xfrm: fix the if_id check in changelink · adc740bc

Committed by Antony Antony on May 28, 2022

stable inclusion
from stable-v5.10.104
commit 2f0e6d80e8b570aeb7e6eb6db2e2dd9fdbb6236c
bugzilla: https://gitee.com/openeuler/kernel/issues/I56XAC

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=2f0e6d80e8b570aeb7e6eb6db2e2dd9fdbb6236c

--------------------------------

commit 6d0d95a1 upstream.

if_id will always be 0, because it was not yet initialized.

Fixes: 8dce4391 ("xfrm: interface with if_id 0 should return error")
Reported-by: Pavel Machek <pavel@denx.de>
Signed-off-by: Antony Antony <antony.antony@secunet.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>


bpf, sockmap: Do not ignore orig_len parameter · a9971a84

Committed by Eric Dumazet on May 28, 2022

stable inclusion
from stable-v5.10.104
commit 24efaae03b0d093a40e91dce2b820bab03664bca
bugzilla: https://gitee.com/openeuler/kernel/issues/I56XAC

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=24efaae03b0d093a40e91dce2b820bab03664bca

--------------------------------

commit 60ce37b0 upstream.

Currently, sk_psock_verdict_recv() returns skb->len.

This is problematic because tcp_read_sock() might have
passed orig_len < skb->len, due to the presence of TCP urgent data.

This causes an infinite loop in tcp_read_sock().

A follow-up patch will make tcp_read_sock() more robust against bad actors.
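
The contract at issue can be modelled in a few lines. The sketch below is a simplified user-space model with hypothetical names (demo_skb, actor_returns_skb_len, actor_returns_orig_len); it does not reproduce the kernel code and only shows that the value reported back to the caller should not exceed the length the caller offered.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a socket buffer: only its total length matters here. */
struct demo_skb {
	size_t len;
};

/* Problematic pattern: ignores the offered length and reports the whole buffer. */
size_t actor_returns_skb_len(const struct demo_skb *skb, size_t orig_len)
{
	assert(skb != NULL);
	(void)orig_len;
	return skb->len;
}

/* Pattern after the fix: reports exactly what the caller offered. */
size_t actor_returns_orig_len(const struct demo_skb *skb, size_t orig_len)
{
	(void)skb;
	return orig_len;
}
```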

Fixes: ef565928 ("bpf, sockmap: Allow skipping sk_skb parser program")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Jakub Sitnicki <jakub@cloudflare.com>
Tested-by: Jakub Sitnicki <jakub@cloudflare.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/20220302161723.3910001-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

netfilter: fix use-after-free in __nf_register_net_hook() · 95d7ea49

Committed by Eric Dumazet on May 28, 2022

stable inclusion
from stable-v5.10.104
commit 8b0142c4143c1ca297dcf2c0cdd045d65dae2344
bugzilla: https://gitee.com/openeuler/kernel/issues/I56XAC

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=8b0142c4143c1ca297dcf2c0cdd045d65dae2344

--------------------------------

commit 56763f12 upstream.

We must not dereference @new_hooks after nf_hook_mutex has been released,
because other threads might have freed our allocated hooks already.

BUG: KASAN: use-after-free in nf_hook_entries_get_hook_ops include/linux/netfilter.h:130 [inline]
BUG: KASAN: use-after-free in hooks_validate net/netfilter/core.c:171 [inline]
BUG: KASAN: use-after-free in __nf_register_net_hook+0x77a/0x820 net/netfilter/core.c:438
Read of size 2 at addr ffff88801c1a8000 by task syz-executor237/4430

CPU: 1 PID: 4430 Comm: syz-executor237 Not tainted 5.17.0-rc5-syzkaller-00306-g2293be58 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
 print_address_description.constprop.0.cold+0x8d/0x336 mm/kasan/report.c:255
 __kasan_report mm/kasan/report.c:442 [inline]
 kasan_report.cold+0x83/0xdf mm/kasan/report.c:459
 nf_hook_entries_get_hook_ops include/linux/netfilter.h:130 [inline]
 hooks_validate net/netfilter/core.c:171 [inline]
 __nf_register_net_hook+0x77a/0x820 net/netfilter/core.c:438
 nf_register_net_hook+0x114/0x170 net/netfilter/core.c:571
 nf_register_net_hooks+0x59/0xc0 net/netfilter/core.c:587
 nf_synproxy_ipv6_init+0x85/0xe0 net/netfilter/nf_synproxy_core.c:1218
 synproxy_tg6_check+0x30d/0x560 net/ipv6/netfilter/ip6t_SYNPROXY.c:81
 xt_check_target+0x26c/0x9e0 net/netfilter/x_tables.c:1038
 check_target net/ipv6/netfilter/ip6_tables.c:530 [inline]
 find_check_entry.constprop.0+0x7f1/0x9e0 net/ipv6/netfilter/ip6_tables.c:573
 translate_table+0xc8b/0x1750 net/ipv6/netfilter/ip6_tables.c:735
 do_replace net/ipv6/netfilter/ip6_tables.c:1153 [inline]
 do_ip6t_set_ctl+0x56e/0xb90 net/ipv6/netfilter/ip6_tables.c:1639
 nf_setsockopt+0x83/0xe0 net/netfilter/nf_sockopt.c:101
 ipv6_setsockopt+0x122/0x180 net/ipv6/ipv6_sockglue.c:1024
 rawv6_setsockopt+0xd3/0x6a0 net/ipv6/raw.c:1084
 __sys_setsockopt+0x2db/0x610 net/socket.c:2180
 __do_sys_setsockopt net/socket.c:2191 [inline]
 __se_sys_setsockopt net/socket.c:2188 [inline]
 __x64_sys_setsockopt+0xba/0x150 net/socket.c:2188
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f65a1ace7d9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 71 15 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f65a1a7f308 EFLAGS: 00000246 ORIG_RAX: 0000000000000036
RAX: ffffffffffffffda RBX: 0000000000000006 RCX: 00007f65a1ace7d9
RDX: 0000000000000040 RSI: 0000000000000029 RDI: 0000000000000003
RBP: 00007f65a1b574c8 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000020000000 R11: 0000000000000246 R12: 00007f65a1b55130
R13: 00007f65a1b574c0 R14: 00007f65a1b24090 R15: 0000000000022000
 </TASK>

The buggy address belongs to the page:
page:ffffea0000706a00 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1c1a8
flags: 0xfff00000000000(node=0|zone=1|lastcpupid=0x7ff)
raw: 00fff00000000000 ffffea0001c1b108 ffffea000046dd08 0000000000000000
raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: kasan: bad access detected
page_owner tracks the page as freed
page last allocated via order 2, migratetype Unmovable, gfp_mask 0x52dc0(GFP_KERNEL|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_ZERO), pid 4430, ts 1061781545818, free_ts 1061791488993
 prep_new_page mm/page_alloc.c:2434 [inline]
 get_page_from_freelist+0xa72/0x2f50 mm/page_alloc.c:4165
 __alloc_pages+0x1b2/0x500 mm/page_alloc.c:5389
 __alloc_pages_node include/linux/gfp.h:572 [inline]
 alloc_pages_node include/linux/gfp.h:595 [inline]
 kmalloc_large_node+0x62/0x130 mm/slub.c:4438
 __kmalloc_node+0x35a/0x4a0 mm/slub.c:4454
 kmalloc_node include/linux/slab.h:604 [inline]
 kvmalloc_node+0x97/0x100 mm/util.c:580
 kvmalloc include/linux/slab.h:731 [inline]
 kvzalloc include/linux/slab.h:739 [inline]
 allocate_hook_entries_size net/netfilter/core.c:61 [inline]
 nf_hook_entries_grow+0x140/0x780 net/netfilter/core.c:128
 __nf_register_net_hook+0x144/0x820 net/netfilter/core.c:429
 nf_register_net_hook+0x114/0x170 net/netfilter/core.c:571
 nf_register_net_hooks+0x59/0xc0 net/netfilter/core.c:587
 nf_synproxy_ipv6_init+0x85/0xe0 net/netfilter/nf_synproxy_core.c:1218
 synproxy_tg6_check+0x30d/0x560 net/ipv6/netfilter/ip6t_SYNPROXY.c:81
 xt_check_target+0x26c/0x9e0 net/netfilter/x_tables.c:1038
 check_target net/ipv6/netfilter/ip6_tables.c:530 [inline]
 find_check_entry.constprop.0+0x7f1/0x9e0 net/ipv6/netfilter/ip6_tables.c:573
 translate_table+0xc8b/0x1750 net/ipv6/netfilter/ip6_tables.c:735
 do_replace net/ipv6/netfilter/ip6_tables.c:1153 [inline]
 do_ip6t_set_ctl+0x56e/0xb90 net/ipv6/netfilter/ip6_tables.c:1639
 nf_setsockopt+0x83/0xe0 net/netfilter/nf_sockopt.c:101
page last free stack trace:
 reset_page_owner include/linux/page_owner.h:24 [inline]
 free_pages_prepare mm/page_alloc.c:1352 [inline]
 free_pcp_prepare+0x374/0x870 mm/page_alloc.c:1404
 free_unref_page_prepare mm/page_alloc.c:3325 [inline]
 free_unref_page+0x19/0x690 mm/page_alloc.c:3404
 kvfree+0x42/0x50 mm/util.c:613
 rcu_do_batch kernel/rcu/tree.c:2527 [inline]
 rcu_core+0x7b1/0x1820 kernel/rcu/tree.c:2778
 __do_softirq+0x29b/0x9c2 kernel/softirq.c:558

Memory state around the buggy address:
 ffff88801c1a7f00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
 ffff88801c1a7f80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>ffff88801c1a8000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
                   ^
 ffff88801c1a8080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
 ffff88801c1a8100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff

Fixes: 2420b79f ("netfilter: debug: check for sorted array")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

xfrm: fix MTU regression · 1dad6848

Committed by Jiri Bohac on May 28, 2022

stable inclusion
from stable-v5.10.104
commit 4952faa77d8d1c4c146ac077e13d6245738979f4
bugzilla: https://gitee.com/openeuler/kernel/issues/I56XAC

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=4952faa77d8d1c4c146ac077e13d6245738979f4

--------------------------------

commit 6596a022 upstream.

Commit 749439bf ("ipv6: fix udpv6 sendmsg crash caused by too small
MTU") breaks PMTU for xfrm.

A Packet Too Big ICMPv6 message received in response to an ESP
packet will prevent all further communication through the tunnel
if the reported MTU minus the ESP overhead is smaller than 1280.

E.g. in the case of a tunnel-mode ESP with sha256/aes, the overhead
is 92 bytes. Receiving a PTB with an MTU of 1371 or less will result
in all further packets in the tunnel being dropped. A ping through the
tunnel fails with "ping: sendmsg: Invalid argument".
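
The arithmetic behind this example can be checked directly. The sketch below assumes the 92-byte overhead quoted above and the IPv6 minimum MTU of 1280; the function and macro names are hypothetical.

```c
#include <stdbool.h>

#define DEMO_IPV6_MIN_MTU 1280u /* minimum link MTU required by IPv6 */

/* MTU left for the inner packet once the ESP overhead is subtracted. */
unsigned int demo_inner_mtu(unsigned int ptb_mtu, unsigned int esp_overhead)
{
	/* Guard against unsigned underflow when the overhead exceeds the MTU. */
	return ptb_mtu > esp_overhead ? ptb_mtu - esp_overhead : 0;
}

/* True when the inner MTU falls below the IPv6 minimum. */
bool demo_inner_mtu_too_small(unsigned int ptb_mtu, unsigned int esp_overhead)
{
	return demo_inner_mtu(ptb_mtu, esp_overhead) < DEMO_IPV6_MIN_MTU;
}
```

With an overhead of 92 bytes, a reported MTU of 1371 leaves 1279 bytes, one below the minimum, while 1372 leaves exactly 1280.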

Apparently the MTU on the xfrm route is smaller than 1280 and
fails the check inside ip6_setup_cork() added by 749439bf.

We found this by debugging USGv6/ipv6ready failures. Failing
tests are: "Phase-2 Interoperability Test Scenario IPsec" /
5.3.11 and 5.4.11 (Tunnel Mode: Fragmentation).

Commit b515d263 ("xfrm: xfrm_state_mtu should return at least 1280
for ipv6") attempted to fix this but caused another regression in TCP
MSS calculations and had to be reverted.

The patch below fixes the situation by dropping the MTU
check and instead checking for the underflows described in the
749439bf commit message.

Signed-off-by: Jiri Bohac <jbohac@suse.cz>
Fixes: 749439bf ("ipv6: fix udpv6 sendmsg crash caused by too small MTU")
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

mm: Consider __GFP_NOWARN flag for oversized kvmalloc() calls · accb2c04

Committed by Daniel Borkmann on May 28, 2022

stable inclusion
from stable-v5.10.104
commit e93f2be33d4f4c1aa350dd79b6d1179746ff4cb5
bugzilla: https://gitee.com/openeuler/kernel/issues/I56XAC

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=e93f2be33d4f4c1aa350dd79b6d1179746ff4cb5

--------------------------------

commit 0708a0af upstream.

syzkaller was recently triggering an oversized kvmalloc() warning via
xdp_umem_create().

The triggered warning was added back in 7661809d ("mm: don't allow
oversized kvmalloc() calls"). The warning for huge kvmalloc sizes was
added in reaction to a security bug where the size was more than
UINT_MAX but not everything was prepared to handle unsigned long
sizes.

Anyway, the AF_XDP related call trace from this syzkaller report was:

  kvmalloc include/linux/mm.h:806 [inline]
  kvmalloc_array include/linux/mm.h:824 [inline]
  kvcalloc include/linux/mm.h:829 [inline]
  xdp_umem_pin_pages net/xdp/xdp_umem.c:102 [inline]
  xdp_umem_reg net/xdp/xdp_umem.c:219 [inline]
  xdp_umem_create+0x6a5/0xf00 net/xdp/xdp_umem.c:252
  xsk_setsockopt+0x604/0x790 net/xdp/xsk.c:1068
  __sys_setsockopt+0x1fd/0x4e0 net/socket.c:2176
  __do_sys_setsockopt net/socket.c:2187 [inline]
  __se_sys_setsockopt net/socket.c:2184 [inline]
  __x64_sys_setsockopt+0xb5/0x150 net/socket.c:2184
  do_syscall_x64 arch/x86/entry/common.c:50 [inline]
  do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
  entry_SYSCALL_64_after_hwframe+0x44/0xae

Björn mentioned that requests for >2GB allocation can still be valid:

  The structure that is being allocated is the page-pinning accounting.
  AF_XDP has an internal limit of U32_MAX pages, which is *a lot*, but
  still fewer than what memcg allows (PAGE_COUNTER_MAX is a LONG_MAX/
  PAGE_SIZE on 64 bit systems). [...]

  I could just change from U32_MAX to INT_MAX, but as I stated earlier
  that has a hacky feeling to it. [...] From my perspective, the code
  isn't broken, with the memcg limits in consideration. [...]

Linus says:

  [...] Pretty much every time this has come up, the kernel warning has
  shown that yes, the code was broken and there really wasn't a reason
  for doing allocations that big.

  Of course, some people would be perfectly fine with the allocation
  failing, they just don't want the warning. I didn't want __GFP_NOWARN
  to shut it up originally because I wanted people to see all those
  cases, but these days I think we can just say "yeah, people can shut
  it up explicitly by saying 'go ahead and fail this allocation, don't
  warn about it'".

  So enough time has passed that by now I'd certainly be ok with [it].

Thus allow call-sites to silence such userspace triggered splats if the
allocation requests have __GFP_NOWARN. For xdp_umem_pin_pages()'s call
to kvcalloc() this is already the case, so nothing else needed there.
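
The resulting behaviour can be modelled in user space. The sketch below is not the kernel implementation: DEMO_GFP_NOWARN, demo_kvmalloc() and the demo_warnings counter are hypothetical stand-ins, used to show that an oversized request still fails but only warns when the caller has not opted out.

```c
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

#define DEMO_GFP_NOWARN 0x1u /* hypothetical "do not warn" flag */

/* Counts how many warnings the model would have emitted. */
unsigned int demo_warnings;

/*
 * Model of an allocator that refuses requests larger than INT_MAX.
 * The request fails either way; the flag only suppresses the warning.
 */
void *demo_kvmalloc(size_t size, unsigned int flags)
{
	if (size > (size_t)INT_MAX) {
		if (!(flags & DEMO_GFP_NOWARN))
			demo_warnings++;
		return NULL;
	}
	return malloc(size);
}
```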

Fixes: 7661809d ("mm: don't allow oversized kvmalloc() calls")
Reported-by: syzbot+11421fbbff99b989670e@syzkaller.appspotmail.com
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: syzbot+11421fbbff99b989670e@syzkaller.appspotmail.com
Cc: Björn Töpel <bjorn@kernel.org>
Cc: Magnus Karlsson <magnus.karlsson@intel.com>
Cc: Willy Tarreau <w@1wt.eu>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: David S. Miller <davem@davemloft.net>
Link: https://lore.kernel.org/bpf/CAJ+HfNhyfsT5cS_U9EC213ducHs9k9zNxX9+abqC0kTrPbQ0gg@mail.gmail.com
Link: https://lore.kernel.org/bpf/20211201202905.b9892171e3f5b9a60f9da251@linux-foundation.org
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

ntb: intel: fix port config status offset for SPR · 8da228a7

Committed by Dave Jiang on May 28, 2022

stable inclusion
from stable-v5.10.104
commit 912186db092c4be979917a036ee94adbd2eb0b05
bugzilla: https://gitee.com/openeuler/kernel/issues/I56XAC

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=912186db092c4be979917a036ee94adbd2eb0b05

--------------------------------

commit d5081bf5 upstream.

The field offset for port configuration status on SPR has been changed to
bit 14 from ICX where it resides at bit 12. By chance link status detection
continued to work on SPR. This is due to bit 12 being a configuration bit
which is in sync with the status bit. Fix this by checking for an SPR device
and checking the correct status bit.
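
A minimal sketch of a per-generation status check follows. The bit positions come from the description above; the enum, macro, and function names are hypothetical, and the 32-bit register width is an assumption made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions taken from the description above; names are hypothetical. */
#define DEMO_ICX_PORT_CFG_STATUS_BIT 12
#define DEMO_SPR_PORT_CFG_STATUS_BIT 14

enum demo_ntb_gen { DEMO_GEN_ICX, DEMO_GEN_SPR };

/* Return true when the port configuration status bit for this device is set. */
bool demo_port_cfg_status(enum demo_ntb_gen gen, uint32_t reg)
{
	unsigned int bit = (gen == DEMO_GEN_SPR) ? DEMO_SPR_PORT_CFG_STATUS_BIT
						 : DEMO_ICX_PORT_CFG_STATUS_BIT;

	return (reg >> bit) & 1u;
}
```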

Fixes: 26bfe3d0 ("ntb: intel: Add Icelake (gen4) support for Intel NTB")
Tested-by: Jerry Dai <jerry.dai@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

thermal: core: Fix TZ_GET_TRIP NULL pointer dereference · 36443c16

Committed by Nicolas Cavallari on May 28, 2022

stable inclusion
from stable-v5.10.104
commit 1c0b51e62a50e9291764d022ed44549e65d6ab9c
bugzilla: https://gitee.com/openeuler/kernel/issues/I56XAC

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=1c0b51e62a50e9291764d022ed44549e65d6ab9c

--------------------------------

commit 5838a148 upstream.

Do not call get_trip_hyst() from thermal_genl_cmd_tz_get_trip() if
the thermal zone does not define one.
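
The guard can be illustrated with an optional callback in a user-space ops table. This is a sketch under stated assumptions: demo_tz_ops, demo_get_trip() and demo_fixed_temp() are hypothetical names, and reporting a hysteresis of 0 when the callback is absent is a choice made for the example.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical ops table: get_trip_hyst is optional and may be NULL. */
struct demo_tz_ops {
	int (*get_trip_temp)(int trip, int *temp);
	int (*get_trip_hyst)(int trip, int *hyst);
};

/* Example callback for demonstration: every trip sits at 50000 millidegrees. */
int demo_fixed_temp(int trip, int *temp)
{
	(void)trip;
	*temp = 50000;
	return 0;
}

/*
 * Query a trip point. The hysteresis callback is only invoked when the
 * zone provides one; otherwise hysteresis is reported as 0.
 */
int demo_get_trip(const struct demo_tz_ops *ops, int trip, int *temp, int *hyst)
{
	int ret;

	if (!ops || !ops->get_trip_temp)
		return -EINVAL;

	ret = ops->get_trip_temp(trip, temp);
	if (ret)
		return ret;

	*hyst = 0;
	if (ops->get_trip_hyst)
		ret = ops->get_trip_hyst(trip, hyst);

	return ret;
}
```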

Fixes: 1ce50e7d ("thermal: core: genetlink support for events/cmd/sampling")
Signed-off-by: Nicolas Cavallari <nicolas.cavallari@green-communications.fr>
Cc: 5.10+ <stable@vger.kernel.org> # 5.10+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

xen/netfront: destroy queues before real_num_tx_queues is zeroed · 5af73b94

Committed by Marek Marczykowski-Górecki on May 28, 2022

stable inclusion
from stable-v5.10.104
commit a1753d5c29a6fb9a8966dcf04cb4f3b71e303ae8
bugzilla: https://gitee.com/openeuler/kernel/issues/I56XAC

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=a1753d5c29a6fb9a8966dcf04cb4f3b71e303ae8

--------------------------------

commit dcf4ff7a upstream.

xennet_destroy_queues() relies on info->netdev->real_num_tx_queues to
delete queues. Since d7dac083 ("net-sysfs: update the queue counts in
the unregistration path"), unregister_netdev() indirectly sets
real_num_tx_queues to 0. Together, these two facts mean that
xennet_destroy_queues(), called from xennet_remove(), cannot do its job
because it is called after unregister_netdev(). This results in kfree-ing
queues that are still linked in napi, which ultimately crashes:

    BUG: kernel NULL pointer dereference, address: 0000000000000000
    #PF: supervisor read access in kernel mode
    #PF: error_code(0x0000) - not-present page
    PGD 0 P4D 0
    Oops: 0000 [#1] PREEMPT SMP PTI
    CPU: 1 PID: 52 Comm: xenwatch Tainted: G        W         5.16.10-1.32.fc32.qubes.x86_64+ #226
    RIP: 0010:free_netdev+0xa3/0x1a0
    Code: ff 48 89 df e8 2e e9 00 00 48 8b 43 50 48 8b 08 48 8d b8 a0 fe ff ff 48 8d a9 a0 fe ff ff 49 39 c4 75 26 eb 47 e8 ed c1 66 ff <48> 8b 85 60 01 00 00 48 8d 95 60 01 00 00 48 89 ef 48 2d 60 01 00
    RSP: 0000:ffffc90000bcfd00 EFLAGS: 00010286
    RAX: 0000000000000000 RBX: ffff88800edad000 RCX: 0000000000000000
    RDX: 0000000000000001 RSI: ffffc90000bcfc30 RDI: 00000000ffffffff
    RBP: fffffffffffffea0 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000001 R12: ffff88800edad050
    R13: ffff8880065f8f88 R14: 0000000000000000 R15: ffff8880066c6680
    FS:  0000000000000000(0000) GS:ffff8880f3300000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000000000 CR3: 00000000e998c006 CR4: 00000000003706e0
    Call Trace:
     <TASK>
     xennet_remove+0x13d/0x300 [xen_netfront]
     xenbus_dev_remove+0x6d/0xf0
     __device_release_driver+0x17a/0x240
     device_release_driver+0x24/0x30
     bus_remove_device+0xd8/0x140
     device_del+0x18b/0x410
     ? _raw_spin_unlock+0x16/0x30
     ? klist_iter_exit+0x14/0x20
     ? xenbus_dev_request_and_reply+0x80/0x80
     device_unregister+0x13/0x60
     xenbus_dev_changed+0x18e/0x1f0
     xenwatch_thread+0xc0/0x1a0
     ? do_wait_intr_irq+0xa0/0xa0
     kthread+0x16b/0x190
     ? set_kthread_struct+0x40/0x40
     ret_from_fork+0x22/0x30
     </TASK>

Fix this by calling xennet_destroy_queues() from xennet_uninit(),
when real_num_tx_queues is still available. This ensures that queues are
destroyed when real_num_tx_queues is set to 0, regardless of how
unregister_netdev() was called.
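
The ordering problem can be shown with a toy model. Everything below is hypothetical (demo_netdev and the demo_ functions are not driver code); it only demonstrates that a teardown loop bounded by real_num_tx_queues does nothing once that count has been zeroed.

```c
/* Hypothetical model of a device with per-queue state. */
struct demo_netdev {
	unsigned int real_num_tx_queues; /* loop bound used by teardown */
	unsigned int live_queues;        /* queues not yet torn down */
};

/* Tear down one queue per entry in real_num_tx_queues. */
void demo_destroy_queues(struct demo_netdev *dev)
{
	for (unsigned int i = 0; i < dev->real_num_tx_queues; i++) {
		if (dev->live_queues)
			dev->live_queues--;
	}
}

/* Model of unregistration zeroing the queue count. */
void demo_unregister(struct demo_netdev *dev)
{
	dev->real_num_tx_queues = 0;
}

/* Problematic order: the count is already 0, so nothing is destroyed. */
unsigned int demo_remove_late_destroy(struct demo_netdev dev)
{
	demo_unregister(&dev);
	demo_destroy_queues(&dev);
	return dev.live_queues;
}

/* Order after the fix: destroy while the count is still valid. */
unsigned int demo_remove_early_destroy(struct demo_netdev dev)
{
	demo_destroy_queues(&dev);
	demo_unregister(&dev);
	return dev.live_queues;
}
```

Both helpers take the device by value, so each call works on its own copy and returns the number of queues left behind.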

Originally reported at
https://github.com/QubesOS/qubes-issues/issues/7257

Fixes: d7dac083 ("net-sysfs: update the queue counts in the unregistration path")
Cc: stable@vger.kernel.org
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

drm/i915: s/JSP2/ICP2/ PCH · a5e36a7b

Committed by Ville Syrjälä on May 28, 2022

stable inclusion
from stable-v5.10.104
commit ce41d80391967c6b48f7bedf1a381237338e71e1
bugzilla: https://gitee.com/openeuler/kernel/issues/I56XAC

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=ce41d80391967c6b48f7bedf1a381237338e71e1

--------------------------------

commit 08783aa7 upstream.

This JSP2 PCH actually seems to be some special Apple
specific ICP variant rather than a JSP. Make it so. Or at
least all the references to it seem to be some Apple ICL
machines. Didn't manage to find these PCI IDs in any
public chipset docs unfortunately.

The only thing we're losing here with this JSP->ICP change
is Wa_14011294188, but based on the HSD that isn't actually
needed on any ICP based design (including JSP), only TGP
based stuff (including MCC) really need it. The documented
w/a just never made that distinction because Windows didn't
want to differentiate between JSP and MCC (not sure how
they handle hpd/ddc/etc. then though...).

Cc: stable@vger.kernel.org
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Vivek Kasireddy <vivek.kasireddy@intel.com>
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/4226
Fixes: 943682e3 ("drm/i915: Introduce Jasper Lake PCH")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220224132142.12927-1-ville.syrjala@linux.intel.com
Acked-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
Tested-by: Tomas Bzatek <bugs@bzatek.net>
(cherry picked from commit 53581504)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

iommu/amd: Recover from event log overflow · 0588c25f

Committed by Lennert Buytenhek on May 28, 2022

stable inclusion
from stable-v5.10.104
commit 61a895da48443c899083c9eddd9b77484e232707
bugzilla: https://gitee.com/openeuler/kernel/issues/I56XAC

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=61a895da48443c899083c9eddd9b77484e232707

--------------------------------

commit 5ce97f4e upstream.

The AMD IOMMU logs I/O page faults and such to a ring buffer in
system memory, and this ring buffer can overflow.  The AMD IOMMU
spec has the following to say about the interrupt status bit that
signals this overflow condition:

	EventOverflow: Event log overflow. RW1C. Reset 0b. 1 = IOMMU
	event log overflow has occurred. This bit is set when a new
	event is to be written to the event log and there is no usable
	entry in the event log, causing the new event information to
	be discarded. An interrupt is generated when EventOverflow = 1b
	and MMIO Offset 0018h[EventIntEn] = 1b. No new event log
	entries are written while this bit is set. Software Note: To
	resume logging, clear EventOverflow (W1C), and write a 1 to
	MMIO Offset 0018h[EventLogEn].

The AMD IOMMU driver doesn't currently implement this recovery
sequence, meaning that if a ring buffer overflow occurs, logging
of EVT/PPR/GA events will cease entirely.

This patch implements the spec-mandated reset sequence, with the
minor tweak that the hardware seems to want to have a 0 written to
MMIO Offset 0018h[EventLogEn] first, before writing a 1 into this
field, or the IOMMU won't actually resume logging events.
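
The sequence described above can be sketched against a simulated register file. This is an illustration only: the bit positions, the demo_iommu structure, and the demo_ helpers are hypothetical, and plain memory writes stand in for MMIO accesses.

```c
#include <stdint.h>

/* Hypothetical bit positions; the real layout is defined by the AMD IOMMU spec. */
#define DEMO_STATUS_EVT_OVERFLOW (1u << 0)
#define DEMO_CTRL_EVT_LOG_EN     (1u << 2)

/* Simulated register file standing in for MMIO. */
struct demo_iommu {
	uint32_t status;             /* write-1-to-clear bits */
	uint32_t control;
	unsigned int enable_toggles; /* counts 0 -> 1 transitions of EventLogEn */
};

/* Writing 1 to a status bit clears it (W1C semantics). */
void demo_write_status(struct demo_iommu *io, uint32_t val)
{
	io->status &= ~val;
}

/* Plain register write that also records when logging gets switched on. */
void demo_write_control(struct demo_iommu *io, uint32_t val)
{
	if (!(io->control & DEMO_CTRL_EVT_LOG_EN) && (val & DEMO_CTRL_EVT_LOG_EN))
		io->enable_toggles++;
	io->control = val;
}

/* Recovery order described above: clear overflow, write 0, then write 1. */
void demo_restart_event_log(struct demo_iommu *io)
{
	demo_write_status(io, DEMO_STATUS_EVT_OVERFLOW);
	demo_write_control(io, io->control & ~DEMO_CTRL_EVT_LOG_EN);
	demo_write_control(io, io->control | DEMO_CTRL_EVT_LOG_EN);
}
```

The enable_toggles counter exists only so the model can show that the enable bit really goes through a 0 to 1 transition.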

Signed-off-by: Lennert Buytenhek <buytenh@arista.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/YVrSXEdW2rzEfOvk@wantstofly.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

ASoC: ops: Shift tested values in snd_soc_put_volsw() by +min · 7daccf37

Committed by Marek Vasut on May 28, 2022

stable inclusion
from stable-v5.10.104
commit 6951a5888165a38bb7c39a2d18f5668b2f1241c7
bugzilla: https://gitee.com/openeuler/kernel/issues/I56XAC

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=6951a5888165a38bb7c39a2d18f5668b2f1241c7

--------------------------------

commit 9bdd10d5 upstream.

While the $val/$val2 values passed in from userspace are always >= 0
integers, the limits of the control can be signed integers and the $min
can be non-zero and less than zero. To correctly validate $val/$val2
against platform_max, add the $min offset to val first.
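
A minimal sketch of the shifted comparison follows. demo_check_volsw() is a hypothetical helper, not the ASoC function itself, and treating a platform_max of 0 as "no limit" is an assumption of this example.

```c
#include <errno.h>

/*
 * Validate a user-supplied control value. The value from user space is
 * non-negative, so min is added back before comparing with platform_max.
 * A platform_max of 0 is treated here as "no limit".
 */
int demo_check_volsw(long val, int min, int platform_max)
{
	if (platform_max && val + min > platform_max)
		return -EINVAL;
	return 0;
}
```

For a control with min = -5 and platform_max = 5, a user value of 10 maps to 5 and is accepted, while 11 maps to 6 and is rejected.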

Fixes: 817f7c93 ("ASoC: ops: Reject out of bounds values in snd_soc_put_volsw()")
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20220215130645.164025-1-marex@denx.de
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

riscv: Fix config KASAN && DEBUG_VIRTUAL · c1d4c3c1

Committed by Alexandre Ghiti on May 28, 2022

stable inclusion
from stable-v5.10.104
commit dd9dd24fd7cb5310fa1db2b1b03431c96663fa7c
bugzilla: https://gitee.com/openeuler/kernel/issues/I56XAC

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=dd9dd24fd7cb5310fa1db2b1b03431c96663fa7c

--------------------------------

commit c648c4bb upstream.

The __virt_to_phys function is called very early in the boot process (i.e.
in kasan_early_init), so it should not be instrumented by KASAN; otherwise
it bugs.

Fix this by declaring phys_addr.c as non-kasan instrumentable.

Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
Fixes: 8ad8b727 ("riscv: Add KASAN support")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

riscv: Fix config KASAN && SPARSEMEM && !SPARSE_VMEMMAP · dd735047

Committed by Alexandre Ghiti on May 28, 2022

stable inclusion
from stable-v5.10.104
commit 7211aab2881b0a8b6a002ec2eb341b2d3cb9f003
bugzilla: https://gitee.com/openeuler/kernel/issues/I56XAC

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=7211aab2881b0a8b6a002ec2eb341b2d3cb9f003

--------------------------------

commit a3d32803 upstream.

In order to get the pfn of a struct page* when sparsemem is enabled
without vmemmap, the mem_section structures need to be initialized which
happens in sparse_init.

But kasan_early_init calls pfn_to_page way before sparse_init is called,
which then tries to dereference a null mem_section pointer.

Fix this by removing the usage of this function in kasan_early_init.

Fixes: 8ad8b727 ("riscv: Add KASAN support")
Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

riscv/efi_stub: Fix get_boot_hartid_from_fdt() return value · efa42b45

Committed by Sunil V L on May 28, 2022

stable inclusion
from stable-v5.10.104
commit 00fb385f0ac44cfcc8286d27c8841bc12cf5a08f
bugzilla: https://gitee.com/openeuler/kernel/issues/I56XAC

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=00fb385f0ac44cfcc8286d27c8841bc12cf5a08f

--------------------------------

commit dcf0c838 upstream.

The get_boot_hartid_from_fdt() function currently returns U32_MAX
in the failure case, which is not correct because U32_MAX is a valid
hartid value. This patch fixes the issue by returning an error code.
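
The difference between a sentinel return and a separate status can be sketched as follows. demo_get_boot_hartid() is a hypothetical function; its found and value parameters stand in for whatever a device tree lookup would produce.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * The status is the return value and the hart ID travels through an
 * out-parameter, so every uint32_t value, including UINT32_MAX,
 * remains usable as an ID.
 */
int demo_get_boot_hartid(int found, uint32_t value, uint32_t *hartid)
{
	assert(hartid != NULL);

	if (!found)
		return -EINVAL;

	*hartid = value;
	return 0;
}
```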

Cc: <stable@vger.kernel.org>
Fixes: d7071743 ("RISC-V: Add EFI stub support.")
Signed-off-by: Sunil V L <sunilvl@ventanamicro.com>
Reviewed-by: Heinrich Schuchardt <heinrich.schuchardt@canonical.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

ALSA: intel_hdmi: Fix reference to PCM buffer address · d618f0dc

Committed by Nickthink on May 28, 2022

stable inclusion
from stable-v5.10.104
commit 336872601cb8eb2b09bccbae81b7354d5fbd1cca
bugzilla: https://gitee.com/openeuler/kernel/issues/I56XAC

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=336872601cb8eb2b09bccbae81b7354d5fbd1cca

--------------------------------

commit 0aa6b294 upstream.

PCM buffers might be allocated dynamically when the buffer
preallocation failed or a larger buffer is requested, and it's not
guaranteed that substream->dma_buffer points to the actually used
buffer.  The driver needs to refer to substream->runtime->dma_addr
instead for the buffer address.

Signed-off-by: Zhen Ni <nizhen@uniontech.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20220302074241.30469-1-nizhen@uniontech.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

tracing: Add ustring operation to filtering string pointers · e80b3601

Committed by Steven Rostedt on May 28, 2022

stable inclusion
from stable-v5.10.104
commit e57dfaf66f2b74911e45134e51b95759993fa302
bugzilla: https://gitee.com/openeuler/kernel/issues/I56XAC

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=e57dfaf66f2b74911e45134e51b95759993fa302

--------------------------------

[ Upstream commit f37c3bbc ]

Since referencing user space pointers is special, if the user wants to
filter on a field that is a pointer to user space, then they need to
specify it.

Add a ".ustring" attribute to the field name for filters to state that the
field is pointing to user space such that the kernel can take the
appropriate action to read that pointer.

Link: https://lore.kernel.org/all/yt9d8rvmt2jq.fsf@linux.ibm.com/

Fixes: 77360f9b ("tracing: Add test for user space strings when filtering on string pointers")
Tested-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

drm/amdgpu: check vm ready by amdgpu_vm->evicting flag · 0f2e9179

Committed by Qiang Yu on May 28, 2022

stable inclusion
from stable-v5.10.104
commit 4a9d2390f3e2d128b1a73279d16bb1176207a0e2
bugzilla: https://gitee.com/openeuler/kernel/issues/I56XAC

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=4a9d2390f3e2d128b1a73279d16bb1176207a0e2

--------------------------------

[ Upstream commit c1a66c3b ]

The workstation application ANSA/META v21.1.4 produces this error in dmesg
when running the CI test suite provided by ANSA/META:
[drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't update BO_VA (-16)

This is caused by the following sequence:
1. A 256MB buffer is created in invisible VRAM.
2. The CPU maps the buffer and accesses it, which causes a vm_fault and
   an attempt to move it to visible VRAM.
3. Visible VRAM space is forced, and all VRAM BOs are traversed to check
   whether evicting each BO is worthwhile.
4. When checking a VM BO (in invisible VRAM), amdgpu_vm_evictable()
   sets amdgpu_vm->evicting, but later, because the BO is not in visible
   VRAM, it is not actually evicted and so is not added to
   amdgpu_vm->evicted.
5. Before the next CS clears amdgpu_vm->evicting, a user VM ops
   ioctl passes amdgpu_vm_ready() (which checks amdgpu_vm->evicted)
   but fails in amdgpu_vm_bo_update_mapping() (which checks
   amdgpu_vm->evicting) and produces this error log.

This error won't affect functionality as next CS will finish the
waiting VM ops. But we'd better clear the error log by checking
the amdgpu_vm->evicting flag in amdgpu_vm_ready() to stop calling
amdgpu_vm_bo_update_mapping() later.

Another reason is that the amdgpu_vm->evicted list holds all BOs (both
user buffers and page tables), but only the eviction of page table
BOs prevents VM ops. The amdgpu_vm->evicting flag is set only for page
table BOs, so amdgpu_vm_ready() should use the evicting flag instead
of the evicted list.

The side effect of this change is that a previously blocked VM op
(user buffer in the "evicted" list but no page table in it) now
completes immediately.
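
The difference between the two readiness checks can be sketched with a
small user-space model. This is only an illustration under stated
assumptions: struct sim_vm, vm_ready_by_list() and vm_ready_by_flag()
are hypothetical names, and the real amdgpu_vm_ready() works on driver
structures and locking that are not shown here.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two pieces of amdgpu_vm state discussed above. */
struct sim_vm {
	bool evicting;            /* models amdgpu_vm->evicting (page table BOs only) */
	unsigned int evicted_bos; /* models the length of the amdgpu_vm->evicted list */
};

/* Old-style check: ready when the evicted list is empty. */
static bool vm_ready_by_list(const struct sim_vm *vm)
{
	return vm->evicted_bos == 0;
}

/* New-style check: ready when no page table eviction is in progress. */
static bool vm_ready_by_flag(const struct sim_vm *vm)
{
	return !vm->evicting;
}
```

In the state reached after step 4 (evicting set, evicted list empty),
the list-based check reports ready while the flag-based check does not.
With only a user buffer on the evicted list, the result is reversed,
which corresponds to the side effect described above.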

v2: update commit comments.

Acked-by: Paul Menzel <pmenzel@molgen.mpg.de>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Qiang Yu <qiang.yu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

ata: pata_hpt37x: fix PCI clock detection · 3f53e747

Committed by Sergey Shtylyov on May 28, 2022

stable inclusion
from stable-v5.10.104
commit 67e25eb1b4749740e079d94d5f40c2287f4ca1c5
bugzilla: https://gitee.com/openeuler/kernel/issues/I56XAC

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=67e25eb1b4749740e079d94d5f40c2287f4ca1c5

--------------------------------

[ Upstream commit 5f6b0f2d ]

The f_CNT register (at the PCI config. address 0x78) is 16-bit, not
8-bit! The bug was there from the very start... :-(
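
The effect of reading a 16-bit register with an 8-bit access can be
sketched in user space. This is an illustration only: the byte array
standing in for PCI config space, the helper names, and the sample
value are assumptions, not the driver's code.

```c
#include <stdint.h>

#define F_CNT_OFFSET 0x78 /* config offset named in the commit message */

/* 8-bit read: only the low byte of the register is seen. */
static uint8_t cfg_read8(const uint8_t *cfg, unsigned int off)
{
	return cfg[off];
}

/* 16-bit read: PCI config space is little-endian, low byte first. */
static uint16_t cfg_read16(const uint8_t *cfg, unsigned int off)
{
	return (uint16_t)(cfg[off] | ((uint16_t)cfg[off + 1] << 8));
}
```

For a register holding 0x012C, the 8-bit read returns only 0x2C, so any
count above 255 is silently truncated.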

Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Fixes: 669a5db4 ("[libata] Add a bunch of PATA drivers.")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

serial: stm32: prevent TDR register overwrite when sending x_char · d9487024

Committed by Valentin Caron on May 28, 2022

stable inclusion
from stable-v5.10.104
commit 335f11ff74f25dc5e86d89efac9adb2aa03149d4
bugzilla: https://gitee.com/openeuler/kernel/issues/I56XAC

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=335f11ff74f25dc5e86d89efac9adb2aa03149d4

--------------------------------

[ Upstream commit d3d079bd ]

When sending x_char in stm32_usart_transmit_chars(), the driver can
overwrite the value in the TDR register with the value of x_char. If
this happens, the previous value in the TDR register is never sent
over the UART.

This change checks that the previous value in the TDR register has been
sent before writing the x_char value into the register.
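
The idea of gating the write on an "empty" status flag can be sketched
with a simulated register pair. This is a minimal model under stated
assumptions: struct sim_usart, SIM_TXE and sim_send_xchar() are
hypothetical names, and the actual driver accesses memory-mapped
registers rather than a plain struct.

```c
#include <stdbool.h>
#include <stdint.h>

#define SIM_TXE (1u << 7) /* hypothetical "transmit data register empty" bit */

/* Simulated status and transmit data registers. */
struct sim_usart {
	uint32_t isr;
	uint32_t tdr;
};

/* Write x_char only when the previous byte has left TDR. */
static bool sim_send_xchar(struct sim_usart *u, uint8_t x_char)
{
	if (!(u->isr & SIM_TXE))
		return false;   /* previous byte still pending: do not overwrite */

	u->tdr = x_char;
	u->isr &= ~SIM_TXE;     /* TDR is occupied again until the byte is shifted out */
	return true;
}
```

A caller would retry (or poll with a timeout) when the function returns
false, instead of writing x_char unconditionally.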

Fixes: 48a6092f ("serial: stm32-usart: Add STM32 USART Driver")
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Valentin Caron <valentin.caron@foss.st.com>
Link: https://lore.kernel.org/r/20220111164441.6178-2-valentin.caron@foss.st.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

tracing: Add test for user space strings when filtering on string pointers · 0d3baacc

Committed by Steven Rostedt on May 28, 2022

stable inclusion
from stable-v5.10.104
commit c999c5927e96e51c0666fbdd78a9e6dd47fa200b
bugzilla: https://gitee.com/openeuler/kernel/issues/I56XAC

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=c999c5927e96e51c0666fbdd78a9e6dd47fa200b

--------------------------------

[ Upstream commit 77360f9b ]

Pingfan reported that the following causes a fault:

  echo "filename ~ \"cpu\"" > events/syscalls/sys_enter_openat/filter
  echo 1 > events/syscalls/sys_enter_openat/enable

The reason is that the trace event filter treats the user space pointer
defined by "filename" as a normal pointer to compare against the "cpu"
string. The following bug happened:

 kvm-03-guest16 login: [72198.026181] BUG: unable to handle page fault for address: 00007fffaae8ef60
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0001) - permissions violation
 PGD 80000001008b7067 P4D 80000001008b7067 PUD 2393f1067 PMD 2393ec067 PTE 8000000108f47867
 Oops: 0001 [#1] PREEMPT SMP PTI
 CPU: 1 PID: 1 Comm: systemd Kdump: loaded Not tainted 5.14.0-32.el9.x86_64 #1
 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
 RIP: 0010:strlen+0x0/0x20
 Code: 48 89 f9 74 09 48 83 c1 01 80 39 00 75 f7 31 d2 44 0f b6 04 16 44 88 04 11
       48 83 c2 01 45 84 c0 75 ee c3 0f 1f 80 00 00 00 00 <80> 3f 00 74 10 48 89 f8
       48 83 c0 01 80 38 00 75 f7 48 29 f8 c3 31
 RSP: 0018:ffffb5b900013e48 EFLAGS: 00010246
 RAX: 0000000000000018 RBX: ffff8fc1c49ede00 RCX: 0000000000000000
 RDX: 0000000000000020 RSI: ffff8fc1c02d601c RDI: 00007fffaae8ef60
 RBP: 00007fffaae8ef60 R08: 0005034f4ddb8ea4 R09: 0000000000000000
 R10: ffff8fc1c02d601c R11: 0000000000000000 R12: ffff8fc1c8a6e380
 R13: 0000000000000000 R14: ffff8fc1c02d6010 R15: ffff8fc1c00453c0
 FS:  00007fa86123db40(0000) GS:ffff8fc2ffd00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00007fffaae8ef60 CR3: 0000000102880001 CR4: 00000000007706e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 PKRU: 55555554
 Call Trace:
  filter_pred_pchar+0x18/0x40
  filter_match_preds+0x31/0x70
  ftrace_syscall_enter+0x27a/0x2c0
  syscall_trace_enter.constprop.0+0x1aa/0x1d0
  do_syscall_64+0x16/0x90
  entry_SYSCALL_64_after_hwframe+0x44/0xae
 RIP: 0033:0x7fa861d88664

The above happened because the kernel tried to access user space directly
and triggered a "supervisor read access in kernel mode" fault. Worse yet,
the memory might not even be loaded yet, and a SEGFAULT could happen as
well. The same could be true for kernel space accesses.

To be even more robust, test both kernel and user space strings. If the
string fails to read, then simply have the filter fail.

Note, TASK_SIZE is used to determine whether the pointer is in user or
kernel space, and the appropriate strncpy_from_kernel_nofault() or
strncpy_from_user_nofault() function is used to copy the memory. On some
architectures, the comparison to TASK_SIZE may always pick user space or
kernel space. If it gets it wrong, the only consequence is that the filter
will fail to match. In the future, this needs to be fixed so that the event
denotes which should be used. But failing a filter is much better than
panicking the machine, and that can be solved later.
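
The dispatch and fail-closed behaviour can be sketched in user space.
This is a model under stated assumptions, not the kernel code:
SIM_TASK_SIZE is a hypothetical boundary value, classify_addr() and
filter_match_copied() are hypothetical helpers, a negative length
stands in for a failed nofault copy, and strstr() stands in for the
filter's pattern match.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SIM_TASK_SIZE 0x00007ffffffff000ULL /* hypothetical user/kernel split */

enum str_source { STR_USER, STR_KERNEL };

/* Pick which copy routine would be used for this address. */
static enum str_source classify_addr(uint64_t addr)
{
	return addr < SIM_TASK_SIZE ? STR_USER : STR_KERNEL;
}

/*
 * Match against an already-copied string. A negative 'copied' models a
 * failed copy, in which case the filter simply does not match.
 */
static bool filter_match_copied(const char *buf, long copied, const char *needle)
{
	if (copied < 0)
		return false;
	return strstr(buf, needle) != NULL;
}
```

With the faulting address from the oops above (00007fffaae8ef60), the
model selects the user space path, and a failed copy yields "no match"
rather than a fault.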

Link: https://lore.kernel.org/all/20220107044951.22080-1-kernelfans@gmail.com/
Link: https://lkml.kernel.org/r/20220110115532.536088fd@gandalf.local.home

Cc: stable@vger.kernel.org
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Tom Zanussi <zanussi@kernel.org>
Reported-by: Pingfan Liu <kernelfans@gmail.com>
Tested-by: Pingfan Liu <kernelfans@gmail.com>
Fixes: 87a342f5 ("tracing/filters: Support filtering for char * strings")
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

exfat: fix i_blocks for files truncated over 4 GiB · 10d6ec86

Committed by Christophe Vu-Brugier on May 28, 2022

stable inclusion
from stable-v5.10.104
commit db36a94ed66baa56f54393ad672f19b313c04ade
bugzilla: https://gitee.com/openeuler/kernel/issues/I56XAC

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=db36a94ed66baa56f54393ad672f19b313c04ade

--------------------------------

[ Upstream commit 92fba084 ]

In exfat_truncate(), the computation of inode->i_blocks is wrong if
the file is larger than 4 GiB because a 32-bit variable is used as a
mask. This is fixed and simplified by using round_up().

Also fix the same buggy computation in exfat_read_root() and another
(correct) one in exfat_fill_inode(). The latter was fixed another way
last month but can be simplified by using round_up() as well. See:

  commit 0c336d6e ("exfat: fix incorrect loading of i_blocks for
                        large files")
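
The 32-bit mask problem can be reproduced in user space. The sketch
below is an illustration under stated assumptions: the function names
are hypothetical, round_up_pow2() is a user-space stand-in for the
kernel's round_up() for power-of-two alignments, and the cluster size
and block shift in the usage note are sample values.

```c
#include <stdint.h>

/* Round x up to a multiple of align; align must be a power of two. */
static uint64_t round_up_pow2(uint64_t x, uint64_t align)
{
	return (x + align - 1) & ~(align - 1);
}

/*
 * Buggy pattern: the mask is 32-bit, so ~mask is computed in 32 bits
 * and zero-extended, which clears the upper 32 bits of the size.
 */
static uint64_t i_blocks_buggy(uint64_t size, uint32_t cluster_size,
			       unsigned int blkbits)
{
	uint32_t mask = cluster_size - 1;

	return ((size + mask) & ~mask) >> blkbits;
}

/* Fixed pattern: round up in 64 bits before shifting. */
static uint64_t i_blocks_fixed(uint64_t size, uint32_t cluster_size,
			       unsigned int blkbits)
{
	return round_up_pow2(size, cluster_size) >> blkbits;
}
```

For a 5 GiB file with 32 KiB clusters and 512-byte blocks, the fixed
version gives 10485760 blocks while the buggy one gives 2097152, which
is the count for a 1 GiB file. Below 4 GiB both versions agree.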

Fixes: 98d91704 ("exfat: add file operations")
Cc: stable@vger.kernel.org # v5.7+
Suggested-by: Matthew Wilcox <willy@infradead.org>
Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com>
Signed-off-by: Christophe Vu-Brugier <christophe.vu-brugier@seagate.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

exfat: reuse exfat_inode_info variable instead of calling EXFAT_I() · b700ef89

Committed by Christophe Vu-Brugier on May 28, 2022

stable inclusion
from stable-v5.10.104
commit 1b810d5cb6ce6fb75f32094724cd2e3a720a89b2
bugzilla: https://gitee.com/openeuler/kernel/issues/I56XAC

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=1b810d5cb6ce6fb75f32094724cd2e3a720a89b2

--------------------------------

[ Upstream commit 7dee6f57 ]

Also add a local "struct exfat_inode_info *ei" variable to
exfat_truncate() to simplify the code.

Signed-off-by: Christophe Vu-Brugier <christophe.vu-brugier@seagate.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

net: usb: cdc_mbim: avoid altsetting toggling for Telit FN990 · c48e8198

Committed by Daniele Palmas on May 28, 2022

stable inclusion
from stable-v5.10.104
commit 00d5ac05af3a126e1fbd11a3309478b2b3b0296e
bugzilla: https://gitee.com/openeuler/kernel/issues/I56XAC

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=00d5ac05af3a126e1fbd11a3309478b2b3b0296e

--------------------------------

[ Upstream commit 21e8a963 ]

Add the CDC_MBIM_FLAG_AVOID_ALTSETTING_TOGGLE quirk for the Telit FN990
0x1071 composition in order to avoid a bind error.

Signed-off-by: Daniele Palmas <dnlplm@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
