Commits · 9d6803921a16f4d768dc41a75375629828f4d91e · openeuler / Kernel

08 Apr, 2021 2 commits

net: hsr: Reset MAC header for Tx path · 9d680392

Authored by Kurt Kanzenbach on Apr 06, 2021

Reset MAC header in HSR Tx path. This is needed because direct packet
transmission, e.g. by specifying PACKET_QDISC_BYPASS, does not reset the MAC
header.

This has been observed using the following setup:

|$ ip link add name hsr0 type hsr slave1 lan0 slave2 lan1 supervision 45 version 1
|$ ifconfig hsr0 up
|$ ./test hsr0

The test binary is using mmap'ed sockets and is specifying the
PACKET_QDISC_BYPASS socket option.

This patch resolves the following warning on a non-patched kernel:

|[  112.725394] ------------[ cut here ]------------
|[  112.731418] WARNING: CPU: 1 PID: 257 at net/hsr/hsr_forward.c:560 hsr_forward_skb+0x484/0x568
|[  112.739962] net/hsr/hsr_forward.c:560: Malformed frame (port_src hsr0)

The warning can be safely removed, because the other call sites of
hsr_forward_skb() make sure that the skb is prepared correctly.

Fixes: d346a3fa ("packet: introduce PACKET_QDISC_BYPASS socket option")
Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/rds: Avoid potential use after free in rds_send_remove_from_sock · 0c85a7e8

Authored by Aditya Pakki on Apr 06, 2021

In case of rs failure in rds_send_remove_from_sock(), the 'rm' resource
is freed and later accessed under the spinlock, causing a potential
use-after-free. Set the freed pointer to NULL to avoid undefined behavior.

Signed-off-by: Aditya Pakki <pakki001@umn.edu>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

07 Apr, 2021 2 commits

ethtool: fix incorrect datatype in set_eee ops · 63cf3238

Authored by Wong Vee Khee on Apr 06, 2021

The member 'tx_lpi_timer' is defined with __u32 datatype in the ethtool
header file. Hence, we should use ethnl_update_u32() in set_eee ops.

Fixes: fd77be7b ("ethtool: set EEE settings with EEE_SET request")
Cc: <stable@vger.kernel.org> # 5.10.x
Cc: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: Wong Vee Khee <vee.khee.wong@linux.intel.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: increment the tmp aead refcnt before attaching it · 2a2403ca

Authored by Xin Long on Apr 06, 2021

Li Shuang found a NULL pointer dereference crash in her testing:

  [] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
  [] RIP: 0010:tipc_crypto_rcv_complete+0xc8/0x7e0 [tipc]
  [] Call Trace:
  []  <IRQ>
  []  tipc_crypto_rcv+0x2d9/0x8f0 [tipc]
  []  tipc_rcv+0x2fc/0x1120 [tipc]
  []  tipc_udp_recv+0xc6/0x1e0 [tipc]
  []  udpv6_queue_rcv_one_skb+0x16a/0x460
  []  udp6_unicast_rcv_skb.isra.35+0x41/0xa0
  []  ip6_protocol_deliver_rcu+0x23b/0x4c0
  []  ip6_input+0x3d/0xb0
  []  ipv6_rcv+0x395/0x510
  []  __netif_receive_skb_core+0x5fc/0xc40

This is caused by NULL returned by tipc_aead_get(), and then crashed when
dereferencing it later in tipc_crypto_rcv_complete(). This might happen
when tipc_crypto_rcv_complete() is called by two threads at the same time:
the tmp attached by tipc_crypto_key_attach() in one thread may be released
by the one attached by that in the other thread.

This patch is to fix it by incrementing the tmp's refcnt before attaching
it instead of calling tipc_aead_get() after attaching it.
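
The ordering described above (take the caller's reference first, publish the pointer second) can be sketched in user space with C11 atomics. This is only an illustration under stated assumptions: `struct key`, `key_get()`, `key_put()` and `key_attach_held()` are hypothetical names, not the TIPC functions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical refcounted object standing in for the temporary AEAD key. */
struct key {
	atomic_int refcnt;
};

/* Take a reference only while the object is still alive (refcnt > 0). */
int key_get(struct key *k)
{
	int old = atomic_load(&k->refcnt);

	while (old > 0) {
		/* On failure 'old' is reloaded and the loop re-checks it. */
		if (atomic_compare_exchange_weak(&k->refcnt, &old, old + 1))
			return 1;
	}
	return 0;
}

/* Drop a reference; returns 1 when the last reference went away. */
int key_put(struct key *k)
{
	return atomic_fetch_sub(&k->refcnt, 1) == 1;
}

/*
 * Publish 'k' in 'slot'. The caller's own reference is taken before the
 * pointer becomes visible, so a concurrent replacement of the slot can
 * only drop the slot's reference, never the caller's.
 */
int key_attach_held(struct key *_Atomic *slot, struct key *k)
{
	if (!key_get(k))
		return 0;
	atomic_store(slot, k);
	return 1;
}
```

With this order the caller never has to re-acquire a reference after publishing, which is the step that could observe an already released object.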

Fixes: fc1b6d6d ("tipc: introduce TIPC encryption & authentication")
Reported-by: Li Shuang <shuali@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

06 Apr, 2021 3 commits

batman-adv: initialize "struct batadv_tvlv_tt_vlan_data"->reserved field · 08c27f33

Authored by Tetsuo Handa on Apr 05, 2021

KMSAN found uninitialized value at batadv_tt_prepare_tvlv_local_data()
[1], for commit ced72933 ("batman-adv: use CRC32C instead of CRC16
in TT code") inserted 'reserved' field into "struct batadv_tvlv_tt_data"
and commit 7ea7b4a1 ("batman-adv: make the TT CRC logic VLAN
specific") moved that field to "struct batadv_tvlv_tt_vlan_data" but left
that field uninitialized.

[1] https://syzkaller.appspot.com/bug?id=07f3e6dba96f0eb3cabab986adcd8a58b9bdbe9d

Reported-by: syzbot <syzbot+50ee810676e6a089487b@syzkaller.appspotmail.com>
Tested-by: syzbot <syzbot+50ee810676e6a089487b@syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Fixes: ced72933 ("batman-adv: use CRC32C instead of CRC16 in TT code")
Fixes: 7ea7b4a1 ("batman-adv: make the TT CRC logic VLAN specific")
Acked-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net-ipv6: bugfix - raw & sctp - switch to ipv6_can_nonlocal_bind() · 630e4576

Authored by Maciej Żenczykowski on Apr 05, 2021

Found by virtue of ipv6 raw sockets not honouring the per-socket
IP{,V6}_FREEBIND setting.

Based on hits found via:
  git grep '[.]ip_nonlocal_bind'
We fix both raw ipv6 sockets to honour IP{,V6}_FREEBIND and IP{,V6}_TRANSPARENT,
and we fix sctp sockets to honour IP{,V6}_TRANSPARENT (they already honoured
FREEBIND), and not just the ipv6 'ip_nonlocal_bind' sysctl.

The helper is defined as:
  static inline bool ipv6_can_nonlocal_bind(struct net *net, struct inet_sock *inet) {
    return net->ipv6.sysctl.ip_nonlocal_bind || inet->freebind || inet->transparent;
  }
so this change only widens the accepted opt-outs and is thus a clean bugfix.

I'm not entirely sure what 'fixes' tag to add, since this is AFAICT an ancient bug,
but IMHO this should be applied to stable kernels as far back as possible.
As such I'm adding a 'fixes' tag with the commit that originally added the helper,
which happened in 4.19. Backporting to older LTS kernels (at least 4.9 and 4.14)
would presumably require open-coding it or backporting the helper as well.

Other possibly relevant commits:
v4.18-rc6-1502-g83ba4645 net: add helpers checking if socket can be bound to nonlocal address
v4.18-rc6-1431-gd0c1f011 net/ipv6: allow any source address for sendmsg pktinfo with ip_nonlocal_bind
v4.14-rc5-271-gb71d21c2 sctp: full support for ipv6 ip_nonlocal_bind & IP_FREEBIND
v4.7-rc7-1883-g9b974202 sctp: support ipv6 nonlocal bind
v4.1-12247-g35a256fe ipv6: Nonlocal bind

Cc: Lorenzo Colitti <lorenzo@google.com>
Fixes: 83ba4645 ("net: add helpers checking if socket can be bound to nonlocal address")
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Reviewed-By: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

openvswitch: fix send of uninitialized stack memory in ct limit reply · 4d51419d

Authored by Ilya Maximets on Apr 04, 2021

'struct ovs_zone_limit' has more members than initialized in
ovs_ct_limit_get_default_limit().  The rest of the memory is a random
kernel stack content that ends up being sent to userspace.

Fix that by using designated initializer that will clear all
non-specified fields.
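
As a minimal user-space sketch of why a designated initializer helps here: C guarantees that members not named in the initializer are zero-initialized. `struct zone_limit` and `make_default_limit()` below are hypothetical stand-ins, not the openvswitch definitions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for 'struct ovs_zone_limit'; field names are illustrative. */
struct zone_limit {
	int32_t  zone_id;
	uint32_t limit;
	uint32_t count;
};

/* Returns 1 when every byte of the 'count' member is zero. */
int count_is_zero(const struct zone_limit *zl)
{
	static const uint32_t zero;

	return memcmp(&zl->count, &zero, sizeof(zero)) == 0;
}

/*
 * Only 'zone_id' and 'limit' are named. 'count' is still initialized,
 * to zero, instead of keeping whatever was on the stack.
 */
struct zone_limit make_default_limit(uint32_t limit)
{
	struct zone_limit zl = {
		.zone_id = -1,
		.limit = limit,
	};

	return zl;
}
```

Assigning the two fields one by one to an uninitialized local would leave `count` indeterminate, which is the kind of leak the commit message describes.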

Fixes: 11efd5cb ("openvswitch: Support conntrack zone limit")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

03 Apr, 2021 1 commit

net: cls_api: Fix uninitialised struct field bo->unlocked_driver_cb · 990b03b0

Authored by Yunjian Wang on Apr 01, 2021

The 'unlocked_driver_cb' struct field in 'bo' is not being initialized
in tcf_block_offload_init(). The uninitialized 'unlocked_driver_cb'
will be used when calling unlocked_driver_cb(). So initialize 'bo' to
zero to avoid the issue.

Addresses-Coverity: ("Uninitialized scalar variable")
Fixes: 0fdcf78d ("net: use flow_indr_dev_setup_offload()")
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

02 Apr, 2021 3 commits

mptcp: revert "mptcp: provide subflow aware release function" · 0a3cc579

Authored by Paolo Abeni on Apr 01, 2021

This change reverts commit ad98dd37 ("mptcp: provide subflow aware
release function"). The latter introduced a deadlock spotted by
syzkaller and is not needed anymore after the previous commit.

Fixes: ad98dd37 ("mptcp: provide subflow aware release function")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mptcp: forbit mcast-related sockopt on MPTCP sockets · 86581852

Authored by Paolo Abeni on Apr 01, 2021

Unrolling mcast state at msk dismantle time is bug prone, as
syzkaller reported:

======================================================
WARNING: possible circular locking dependency detected
5.11.0-syzkaller #0 Not tainted
------------------------------------------------------
syz-executor905/8822 is trying to acquire lock:
ffffffff8d678fe8 (rtnl_mutex){+.+.}-{3:3}, at: ipv6_sock_mc_close+0xd7/0x110 net/ipv6/mcast.c:323

but task is already holding lock:
ffff888024390120 (sk_lock-AF_INET6){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1600 [inline]
ffff888024390120 (sk_lock-AF_INET6){+.+.}-{0:0}, at: mptcp6_release+0x57/0x130 net/mptcp/protocol.c:3507

which lock already depends on the new lock.

Instead we can simply forbid any mcast-related setsockopt.

Fixes: 717e79c8 ("mptcp: Add setsockopt()/getsockopt() socket operations")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: udp: Add support for getsockopt(..., ..., UDP_GRO, ..., ...); · 98184612

Authored by Norman Maurer on Apr 01, 2021

Support for UDP_GRO was added in the past, but the getsockopt
implementation was missed, which led to an error when we tried to
retrieve the setting for UDP_GRO. This patch adds the missing switch
case for UDP_GRO.
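
From user space, the affected call looks roughly like the sketch below. `query_udp_gro()` is a hypothetical helper; the fallback value 104 for UDP_GRO mirrors the kernel's uapi header and is only used when the libc headers do not provide the macro. Which errno an unpatched kernel reports is not stated above, so the helper simply passes it through.

```c
#include <errno.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

#ifndef UDP_GRO
#define UDP_GRO 104	/* value from the kernel's include/uapi/linux/udp.h */
#endif

/*
 * Returns the UDP_GRO setting of a fresh UDP socket, or a negative
 * errno value if creating the socket or querying the option fails.
 */
int query_udp_gro(void)
{
	int val = 0;
	socklen_t len = sizeof(val);
	int ret;
	int fd = socket(AF_INET, SOCK_DGRAM, 0);

	if (fd < 0)
		return -errno;

	ret = getsockopt(fd, IPPROTO_UDP, UDP_GRO, &val, &len);
	if (ret < 0)
		ret = -errno;	/* saved before close() can change errno */
	else
		ret = val;

	close(fd);
	return ret;
}
```

A negative return on a kernel without this patch, and a non-negative one with it, would be consistent with the behaviour the commit describes.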

Fixes: e20cf8d3 ("udp: implement GRO for plain UDP sockets.")
Signed-off-by: Norman Maurer <norman_maurer@apple.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

01 Apr, 2021 3 commits

xdp: fix xdp_return_frame() kernel BUG throw for page_pool memory model · 622d1369

Authored by Ong Boon Leong on Mar 31, 2021

xdp_return_frame() may be called outside of NAPI context to return
xdpf back to page_pool. xdp_return_frame() calls __xdp_return() with
napi_direct = false. For the page_pool memory model, __xdp_return() calls
xdp_return_frame_no_direct() unconditionally, and the false negative
kernel BUG below was thrown under a preempt-rt build:

[  430.450355] BUG: using smp_processor_id() in preemptible [00000000] code: modprobe/3884
[  430.451678] caller is __xdp_return+0x1ff/0x2e0
[  430.452111] CPU: 0 PID: 3884 Comm: modprobe Tainted: G     U      E     5.12.0-rc2+ #45

Changes in v2:
 - This patch fixes the issue by making sure xdp_return_frame_no_direct() is
   only called if napi_direct = true, as recommended by
   Jesper Dangaard Brouer. Thanks!

Fixes: 2539650f ("xdp: Helpers for disabling napi_direct of xdp_return_frame")
Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/rds: Fix a use after free in rds_message_map_pages · bdc2ab5c

Authored by Lv Yunlong on Mar 30, 2021

In rds_message_map_pages, the rm is freed by rds_message_put(rm).
But rm is still used by rm->data.op_sg in return value.

My patch assigns ERR_CAST(rm->data.op_sg) to err before the rm is
freed to avoid the uaf.
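
The pattern of capturing the error value before dropping the last reference can be sketched as follows. `struct message`, `message_put()` and `map_pages_fail()` are hypothetical user-space stand-ins, not the RDS code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical message object: a reference count plus a stored error code. */
struct message {
	int  refs;
	long op_sg_err;
};

/* Drop one reference and free the object when it was the last one. */
void message_put(struct message *m)
{
	assert(m->refs > 0);
	if (--m->refs == 0)
		free(m);
}

/* Error path: read what is needed from 'm' first, release it afterwards. */
long map_pages_fail(struct message *m)
{
	long err = m->op_sg_err;	/* 'm' is still valid here */

	message_put(m);			/* may free 'm' */
	return err;			/* 'm' is not touched again */
}
```

Reading `m->op_sg_err` after `message_put(m)` instead would be the use-after-free the commit describes.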

Fixes: 7dba9203 ("net/rds: Use ERR_PTR for rds_message_alloc_sgs()")
Signed-off-by: Lv Yunlong <lyl2019@mail.ustc.edu.cn>
Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

neighbour: Disregard DEAD dst in neigh_update · d47ec7a0

Authored by Tong Zhu on Mar 19, 2021

After a short network outage, the dst_entry is timed out and put
in DST_OBSOLETE_DEAD. We are in this code because arp reply comes
from this neighbour after network recovers. There is a potential
race condition that dst_entry is still in DST_OBSOLETE_DEAD.
With that, another neighbour lookup causes more harm than good.

In the best case, all packets in arp_queue are lost. This is
counterproductive to the original goal of finding a better path
for those packets.

I observed a worst case with 4.x kernel where a dst_entry in
DST_OBSOLETE_DEAD state is associated with loopback net_device.
It leads to an ethernet header with all zero addresses.
A packet with all zero source MAC address is quite deadly with
mac80211, ath9k and 802.11 block ack.  It fails
ieee80211_find_sta_by_ifaddr in ath9k (xmit.c). Ath9k flushes tx
queue (ath_tx_complete_aggr). BAW (block ack window) is not
updated. BAW logic is damaged and ath9k transmission is disabled.

Signed-off-by: Tong Zhu <zhutong@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

31 Mar, 2021 5 commits

net: let skb_orphan_partial wake-up waiters. · 9adc89af

Authored by Paolo Abeni on Mar 30, 2021

Currently the mentioned helper can end-up freeing the socket wmem
without waking-up any processes waiting for more write memory.

If the partially orphaned skb is attached to a UDP (or raw) socket,
the lack of wake-up can hang the user-space.

Even for TCP sockets not calling the sk destructor could have bad
effects on TSQ.

Address the issue using skb_orphan to release the sk wmem before
setting the new sock_efree destructor. Additionally bundle the
whole ownership update in a new helper, so that later other
potential users could avoid duplicate code.

v1 -> v2:
 - use skb_orphan() instead of sort of open coding it (Eric)
 - provide a helper for the ownership change (Eric)

Fixes: f6ba8d33 ("netem: fix skb_orphan_partial()")
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sch_htb: fix null pointer dereference on a null new_q · ae81feb7

Authored by Yunjian Wang on Mar 30, 2021

Currently if new_q is null, the null new_q pointer will be
dereferenced when 'q->offload' is true. Fix this by adding
braces around htb_parent_to_leaf_offload() to avoid it.
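
A rough sketch of the guarded shape described above, in plain user-space C. The exact placement of the braces in the real function is not shown in this log, so `replace_leaf()` and `attach_leaf()` are hypothetical and only illustrate keeping the dereferencing call inside the NULL check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical qdisc stand-in. */
struct qdisc {
	int refcnt;
	int attached;
};

/* Dereferences 'q', so it must never be reached with a NULL pointer. */
void attach_leaf(struct qdisc *q)
{
	assert(q != NULL);
	q->attached = 1;
}

/* Returns the resulting reference count, or 0 when there is no new qdisc. */
int replace_leaf(struct qdisc *new_q, int offload)
{
	if (offload) {
		if (new_q) {
			/* Both statements belong to the NULL check. */
			new_q->refcnt++;
			attach_leaf(new_q);
		}
	}
	return new_q ? new_q->refcnt : 0;
}
```

Without the inner braces only the first statement would be guarded, and the second would run with a NULL `new_q`.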

Addresses-Coverity: ("Dereference after null check")
Fixes: d03b195b ("sch_htb: Hierarchical QoS hardware offload")
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: qrtr: Fix memory leak on qrtr_tx_wait failure · 8a03dd92

Authored by Loic Poulain on Mar 30, 2021

qrtr_tx_wait does not check for radix_tree_insert failure, causing
the 'flow' object to be unreferenced after qrtr_tx_wait return. Fix
that by releasing flow on radix_tree_insert failure.

Fixes: 5fdeb0d3 ("net: qrtr: Implement outgoing flow control")
Reported-by: syzbot+739016799a89c530b32a@syzkaller.appspotmail.com
Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: sched: bump refcount for new action in ACT replace mode · 6855e821

Authored by Kumar Kartikeya Dwivedi on Mar 30, 2021

Currently, action creation using ACT API in replace mode is buggy.
When invoking for non-existent action index 42,

	tc action replace action bpf obj foo.o sec <xyz> index 42

kernel creates the action, fills up the netlink response, and then just
deletes the action after notifying userspace.

	tc action show action bpf

doesn't list the action.

This happens due to the following sequence when ovr = 1 (replace mode)
is enabled:

tcf_idr_check_alloc is used to atomically check and either obtain
reference for existing action at index, or reserve the index slot using
a dummy entry (ERR_PTR(-EBUSY)).

This is necessary as pointers to these actions will be held after
dropping the idrinfo lock, so bumping the reference count is necessary
as we need to insert the actions, and notify userspace by dumping their
attributes. Finally, we drop the reference we took using the
tcf_action_put_many call in tcf_action_add. However, for the case where
a new action is created due to free index, its refcount remains one.
This when paired with the put_many call leads to the kernel setting up
the action, notifying userspace of its creation, and then tearing it
down. For existing actions, the refcount is still held so they remain
unaffected.

Fortunately due to rtnl_lock serialization requirement, such an action
with refcount == 1 will not be concurrently deleted by anything else, at
best CLS API can move its refcount up and down by binding to it after it
has been published from tcf_idr_insert_many. Since refcount is at least
one until put_many call, CLS API cannot delete it. Also __tcf_action_put
release path already ensures deterministic outcome (either new action
will be created or existing action will be reused in case CLS API tries
to bind to action concurrently) due to idr lock serialization.

We fix this by making refcount of newly created actions as 2 in ACT API
replace mode. A relaxed store will suffice as visibility is ensured only
after the tcf_idr_insert_many call.
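
The reference accounting can be modelled with a toy counter. This is a simplified sketch, assuming the final put only happens in replace mode; `struct action`, `action_put()` and `act_api_create()` are hypothetical and ignore locking, the idr and netlink entirely.

```c
#include <assert.h>

/* Hypothetical stand-in for a tc action; only the reference count matters. */
struct action {
	int refcnt;
};

/* Drop one reference; a count of zero means the action was torn down. */
void action_put(struct action *a)
{
	assert(a->refcnt > 0);
	a->refcnt--;
}

/*
 * Model of creating a new action through the ACT API.
 * Returns the reference count left once the request has completed.
 */
int act_api_create(struct action *a, int replace_mode, int with_fix)
{
	a->refcnt = 1;			/* reference held for the index slot */
	if (replace_mode && with_fix)
		a->refcnt = 2;		/* extra reference consumed by the put below */

	/* ... insert the action, dump its attributes to userspace ... */

	if (replace_mode)
		action_put(a);		/* models the put_many call */
	return a->refcnt;
}
```

In this model `act_api_create(&a, 1, 0)` ends at zero (the action is gone right after userspace was told about it), while `act_api_create(&a, 1, 1)` ends at one.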

Note that in case of creation or overwriting using CLS API only (i.e.
bind = 1), overwriting existing action object is not allowed, and any
such request is silently ignored (without error).

The refcount bump that occurs in tcf_idr_check_alloc call there for
existing action will pair with tcf_exts_destroy call made from the
owner module for the same action. In case of action creation, there
is no existing action, so no tcf_exts_destroy callback happens.

This means no code changes for CLS API.

Fixes: cae422f3 ("net: sched: use reference counting action init")
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/ncsi: Avoid channel_monitor hrtimer deadlock · 03cb4d05

Authored by Milton Miller on Mar 29, 2021

Calling ncsi_stop_channel_monitor from channel_monitor is a guaranteed
deadlock on SMP because stop calls del_timer_sync on the timer that
invoked channel_monitor as its timer function.

Recognise the inherent race of marking the monitor disabled before
deleting the timer by just returning if enable was cleared.  After
a timeout (the default case -- reset to START when response received)
just mark the monitor.enabled false.

If the channel has an entry on the channel_queue list, or if the
state is not ACTIVE or INACTIVE, then warn and mark the timer stopped
and don't restart, as the locking is broken somehow.

Fixes: 0795fb20 ("net/ncsi: Stop monitor if channel times out or is inactive")
Signed-off-by: Milton Miller <miltonm@us.ibm.com>
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

30 Mar, 2021 3 commits

xfrm/compat: Cleanup WARN()s that can be user-triggered · ef19e111

Authored by Dmitry Safonov on Mar 30, 2021

Replace WARN_ONCE() that can be triggered from userspace with
pr_warn_once(). Those still give user a hint what's the issue.

I've left WARN()s that are not possible to trigger with current
code-base and that would mean that the code has issues:
- relying on current compat_msg_min[type] <= xfrm_msg_min[type]
- expected 4-byte padding size difference between
  compat_msg_min[type] and xfrm_msg_min[type]
- compat_policy[type].len <= xfrma_policy[type].len
(for every type)

Reported-by: syzbot+834ffd1afc7212eb8147@syzkaller.appspotmail.com
Fixes: 5f3eea6b ("xfrm/compat: Attach xfrm dumps to 64=>32 bit translator")
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: netdev@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

net:tipc: Fix a double free in tipc_sk_mcast_rcv · 6bf24dc0

Authored by Lv Yunlong on Mar 28, 2021

In the if (skb_peek(arrvq) == skb) branch, __skb_dequeue(arrvq) is called on
the skb obtained by skb = skb_peek(arrvq). __skb_dequeue() unlinks the skb from
arrvq and returns it, so the returned skb equals skb_peek(arrvq). The skb is
then freed a first time by kfree_skb(__skb_dequeue(arrvq)).

Unfortunately, the same skb is freed a second time by kfree_skb(skb) after
the branch completes.

My patch removes kfree_skb() from the if (skb_peek(arrvq) == skb) branch,
because this skb will be freed by kfree_skb(skb) at the end anyway.
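
The single-release rule can be illustrated with a small mock queue. The names below (`struct buf`, `queue_dequeue()`, `buf_free()`, `deliver()`) are hypothetical; `buf_free()` only counts calls so that a double free would trip an assertion instead of corrupting memory.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical buffer and queue types for the illustration. */
struct buf {
	struct buf *next;
	int frees;
};

struct queue {
	struct buf *head;
};

struct buf *queue_peek(const struct queue *q)
{
	return q->head;
}

/* Unlink and return the head; ownership moves to the caller. */
struct buf *queue_dequeue(struct queue *q)
{
	struct buf *b = q->head;

	if (b) {
		q->head = b->next;
		b->next = NULL;
	}
	return b;
}

/* Mock release: counts calls, so a second release is detectable. */
void buf_free(struct buf *b)
{
	assert(b->frees == 0);
	b->frees++;
}

void deliver(struct queue *arrvq, struct buf *b)
{
	if (queue_peek(arrvq) == b)
		queue_dequeue(arrvq);	/* unlink only, no release here */
	buf_free(b);			/* the single release point */
}
```

Releasing the dequeued buffer inside the branch as well would hit the same object twice, since the dequeued pointer and `b` are identical.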

Fixes: cb1b7280 ("tipc: eliminate race condition at multicast reception")
Signed-off-by: Lv Yunlong <lyl2019@mail.ustc.edu.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: Fix type was not set for devlink port · fb6ec87f

Authored by Maxim Kochetkov on Mar 29, 2021

If the PHY is not available on a DSA port (described in the devicetree but
absent or not detected), the kernel prints a warning after 3700 secs:

[ 3707.948771] ------------[ cut here ]------------
[ 3707.948784] Type was not set for devlink port.
[ 3707.948894] WARNING: CPU: 1 PID: 17 at net/core/devlink.c:8097 0xc083f9d8

We should unregister the devlink port as a user port and
re-register it as an unused port before executing "continue" in case of
dsa_port_setup error.

Fixes: 86f8b1c0 ("net: dsa: Do not make user port errors fatal")
Signed-off-by: Maxim Kochetkov <fido_max@inbox.ru>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

29 Mar, 2021 3 commits

can: isotp: fix msg_namelen values depending on CAN_REQUIRED_SIZE · f522d955

Authored by Oliver Hartkopp on Mar 25, 2021

Since commit f5223e9e ("can: extend sockaddr_can to include j1939
members") the sockaddr_can has been extended in size and a new
CAN_REQUIRED_SIZE macro has been introduced to calculate the protocol
specific needed size.

The ABI for the msg_name and msg_namelen has not been adapted to the
new CAN_REQUIRED_SIZE macro for the other CAN protocols which leads to
a problem when an existing binary reads the (increased) struct
sockaddr_can in msg_name.

Fixes: e057dd3f ("can: add ISO 15765-2:2016 transport protocol")
Reported-by: Richard Weinberger <richard@nod.at>
Acked-by: Kurt Van Dijck <dev.kurt@vandijck-laurijssen.be>
Link: https://lore.kernel.org/linux-can/1135648123.112255.1616613706554.JavaMail.zimbra@nod.at/T/#t
Link: https://lore.kernel.org/r/20210325125850.1620-2-socketcan@hartkopp.net
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

can: bcm/raw: fix msg_namelen values depending on CAN_REQUIRED_SIZE · 9e971474

Authored by Oliver Hartkopp on Mar 25, 2021

Fixes: f5223e9e ("can: extend sockaddr_can to include j1939 members")
Reported-by: Richard Weinberger <richard@nod.at>
Tested-by: Richard Weinberger <richard@nod.at>
Acked-by: Kurt Van Dijck <dev.kurt@vandijck-laurijssen.be>
Link: https://lore.kernel.org/linux-can/1135648123.112255.1616613706554.JavaMail.zimbra@nod.at/T/#t
Link: https://lore.kernel.org/r/20210325125850.1620-1-socketcan@hartkopp.net
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

xfrm: Provide private skb extensions for segmented and hw offloaded ESP packets · c7dbf4c0

Authored by Steffen Klassert on Mar 26, 2021

Commit 94579ac3 ("xfrm: Fix double ESP trailer insertion in IPsec
crypto offload.") added a XFRM_XMIT flag to avoid duplicate ESP trailer
insertion on HW offload. This flag is set on the secpath that is shared
amongst segments. This led to a situation where some segments are
not transformed correctly when segmentation happens at layer 3.

Fix this by using private skb extensions for segmented and hw offloaded
ESP packets.

Fixes: 94579ac3 ("xfrm: Fix double ESP trailer insertion in IPsec crypto offload.")
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

26 Mar, 2021 5 commits

nfc: Avoid endless loops caused by repeated llcp_sock_connect() · 4b5db93e

Authored by Xiaoming Ni on Mar 25, 2021

When sock_wait_state() returns -EINPROGRESS, "sk->sk_state" is
LLCP_CONNECTING. In this case, if llcp_sock_connect() is invoked again,
nfc_llcp_sock_link() adds sk to local->connecting_sockets a second time.
sk->sk_node->next then points to itself, which creates an endless loop
and hangs the system.
To fix it, check whether sk->sk_state is LLCP_CONNECTING in
llcp_sock_connect() to avoid repeated invoking.
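
A minimal sketch of the state check, using a hand-rolled singly linked list. `struct sock_model`, `sock_connect()` and the choice of -EINPROGRESS as the return value are assumptions for the illustration; the log above does not show the real return code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum conn_state { ST_CLOSED, ST_CONNECTING, ST_CONNECTED };

/* Hypothetical socket with an intrusive list link. */
struct sock_model {
	enum conn_state state;
	struct sock_model *next;
};

struct sock_list {
	struct sock_model *head;
};

/*
 * Link 'sk' into the connecting list at most once. A repeated call while
 * the socket is still connecting returns early instead of linking again.
 */
int sock_connect(struct sock_list *connecting, struct sock_model *sk)
{
	if (sk->state == ST_CONNECTED)
		return -EISCONN;
	if (sk->state == ST_CONNECTING)
		return -EINPROGRESS;	/* already on the list */

	sk->next = connecting->head;
	connecting->head = sk;
	sk->state = ST_CONNECTING;
	return 0;
}
```

Without the second check, a repeated call would set `sk->next` to the current head, which is `sk` itself, and any walk over the list would never terminate.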

Fixes: b4011239 ("NFC: llcp: Fix non blocking sockets connections")
Reported-by: "kiyin(尹亮)" <kiyin@tencent.com>
Link: https://www.openwall.com/lists/oss-security/2020/11/01/1
Cc: <stable@vger.kernel.org> #v3.11
Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

nfc: fix memory leak in llcp_sock_connect() · 7574fcdb

Authored by Xiaoming Ni on Mar 25, 2021

In llcp_sock_connect(), kmemdup() is used to allocate memory for
"llcp_sock->service_name". The memory is not released in the sock_unlink
label of the subsequent failure branch.
As a result, a memory leak occurs.

fix CVE-2020-25672

Fixes: d646960f ("NFC: Initial LLCP support")
Reported-by: "kiyin(尹亮)" <kiyin@tencent.com>
Link: https://www.openwall.com/lists/oss-security/2020/11/01/1
Cc: <stable@vger.kernel.org> #v3.3
Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

nfc: fix refcount leak in llcp_sock_connect() · 8a4cd82d

Authored by Xiaoming Ni on Mar 25, 2021

nfc_llcp_local_get() is invoked in llcp_sock_connect(),
but nfc_llcp_local_put() is not invoked in subsequent failure branches.
As a result, refcount leakage occurs.
To fix it, add calling nfc_llcp_local_put().

fix CVE-2020-25671

Fixes: c7aa1225 ("NFC: Take a reference on the LLCP local pointer when creating a socket")
Reported-by: "kiyin(尹亮)" <kiyin@tencent.com>
Link: https://www.openwall.com/lists/oss-security/2020/11/01/1
Cc: <stable@vger.kernel.org> #v3.6
Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

nfc: fix refcount leak in llcp_sock_bind() · c33b1cc6

Authored by Xiaoming Ni on Mar 25, 2021

nfc_llcp_local_get() is invoked in llcp_sock_bind(),
but nfc_llcp_local_put() is not invoked in subsequent failure branches.
As a result, refcount leakage occurs.
To fix it, add calling nfc_llcp_local_put().

fix CVE-2020-25670

Fixes: c7aa1225 ("NFC: Take a reference on the LLCP local pointer when creating a socket")
Reported-by: "kiyin(尹亮)" <kiyin@tencent.com>
Link: https://www.openwall.com/lists/oss-security/2020/11/01/1
Cc: <stable@vger.kernel.org> #v3.6
Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: only unset VLAN filtering when last port leaves last VLAN-aware bridge · 479dc497

Authored by Vladimir Oltean on Mar 24, 2021

DSA is aware of switches with global VLAN filtering since the blamed
commit, but it makes a bad decision when multiple bridges are spanning
the same switch:

ip link add br0 type bridge vlan_filtering 1
ip link add br1 type bridge vlan_filtering 1
ip link set swp2 master br0
ip link set swp3 master br0
ip link set swp4 master br1
ip link set swp5 master br1
ip link set swp5 nomaster
ip link set swp4 nomaster
[138665.939930] sja1105 spi0.1: port 3: dsa_core: VLAN filtering is a global setting
[138665.947514] DSA: failed to notify DSA_NOTIFIER_BRIDGE_LEAVE

When all ports leave br1, DSA blindly attempts to disable VLAN filtering
on the switch, ignoring the fact that br0 still exists and is VLAN-aware
too. It fails while doing that.

This patch checks whether any port exists at all and is under a
VLAN-aware bridge.
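
The check described above can be sketched as a scan over the switch ports. The types and function names here (`struct bridge_model`, `struct port_model`, `may_unset_vlan_filtering()`) are hypothetical and much simpler than the DSA data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical models of a bridge and a switch port. */
struct bridge_model {
	bool vlan_filtering;
};

struct port_model {
	const struct bridge_model *br;	/* NULL when the port is not bridged */
};

/* True if any port of the switch still sits under a VLAN-aware bridge. */
bool any_port_vlan_aware(const struct port_model *ports, size_t n)
{
	assert(ports != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (ports[i].br && ports[i].br->vlan_filtering)
			return true;
	}
	return false;
}

/* Global VLAN filtering may only be turned off when no such port remains. */
bool may_unset_vlan_filtering(const struct port_model *ports, size_t n)
{
	return !any_port_vlan_aware(ports, n);
}
```

In the scenario from the commit message, swp2 and swp3 are still under the VLAN-aware br0 after swp4 and swp5 leave br1, so the sketch would report that filtering must stay enabled.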

Fixes: d371b7c9 ("net: dsa: Unset vlan_filtering when ports leave the bridge")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

24 Mar, 2021 3 commits

xfrm: BEET mode doesn't support fragments for inner packets · 68dc022d

Authored by Xin Long on Mar 19, 2021

BEET mode replaces the IP(6) Headers with new IP(6) Headers when sending
packets. However, when the packet is already a fragment before the
replacement, the kernel currently keeps the fragment flag, replaces the
address field, and then encapsulates it with ESP. On the RX side this
causes the fragments to be reassembled before ESP decapsulation, which is
incorrect.

In Xiumei's testing, these fragments went over an xfrm interface and got
encapped with ESP in the device driver, and the traffic was broken.

I don't have a good way to fix it, other than to warn about it in dmesg.

Reported-by: Xiumei Mu <xmu@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

net: bridge: don't notify switchdev for local FDB addresses · 6ab4c311

Authored by Vladimir Oltean on Mar 22, 2021

As explained in this discussion:
https://lore.kernel.org/netdev/20210117193009.io3nungdwuzmo5f7@skbuf/

the switchdev notifiers for FDB entries managed to have a zero-day bug.
The bridge would not say that this entry is local:

ip link add br0 type bridge
ip link set swp0 master br0
bridge fdb add dev swp0 00:01:02:03:04:05 master local

and the switchdev driver would be more than happy to offload it as a
normal static FDB entry. This is despite the fact that 'local' and
non-'local' entries have completely opposite directions: a local entry
is locally terminated and not forwarded, whereas a static entry is
forwarded and not locally terminated. So, for example, DSA would install
this entry on swp0 instead of installing it on the CPU port as it should.

There is an even sadder part, which is that the 'local' flag is implicit
if 'static' is not specified, meaning that this command produces the
same result of adding a 'local' entry:

bridge fdb add dev swp0 00:01:02:03:04:05 master

I've updated the man pages for 'bridge', and after reading it now, it
should be pretty clear to any user that the commands above were broken
and should have never resulted in the 00:01:02:03:04:05 address being
forwarded (this behavior is coherent with non-switchdev interfaces):
https://patchwork.kernel.org/project/netdevbpf/cover/20210211104502.2081443-1-olteanv@gmail.com/
If you're a user reading this and this is what you want, just use:

bridge fdb add dev swp0 00:01:02:03:04:05 master static

Because switchdev should have given drivers the means from day one to
classify FDB entries as local/non-local, but didn't, it means that all
drivers are currently broken. So we can just as well omit the switchdev
notifications for local FDB entries, which is exactly what this patch
does to close the bug in stable trees. For further development work
where drivers might want to trap the local FDB entries to the host, we
can add a 'bool is_local' to br_switchdev_fdb_call_notifiers(), and
selectively make drivers act upon that bit, while all the others ignore
those entries if the 'is_local' bit is set.

Fixes: 6b26b51b ("net: bridge: Add support for notifying devices about FDB add/del")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/sched: act_ct: clear post_ct if doing ct_clear · 8ca1b090

Authored by Marcelo Ricardo Leitner on Mar 22, 2021

Invalid detection works with two distinct moments: act_ct tries to find
a conntrack entry and set post_ct true, indicating that that was
attempted. Then, when flow dissector tries to dissect CT info and no
entry is there, it knows that it was tried and no entry was found, and
synthesizes/sets
                  key->ct_state = TCA_FLOWER_KEY_CT_FLAGS_TRACKED |
                                  TCA_FLOWER_KEY_CT_FLAGS_INVALID;
mimicking what OVS does.

OVS has this a bit more streamlined, as it recomputes the key after
trying to find a conntrack entry for it.

The issue here is that 'tc action ct clear' did not clear post_ct,
causing a subsequent match on 'ct_state -trk' to fail, due to the
above. The fix, thus, is to clear it.

Reproducer rules:
tc filter add dev enp130s0f0np0_0 ingress prio 1 chain 0 \
	protocol ip flower ip_proto tcp ct_state -trk \
	action ct zone 1 pipe \
	action goto chain 2
tc filter add dev enp130s0f0np0_0 ingress prio 1 chain 2 \
	protocol ip flower \
	action ct clear pipe \
	action goto chain 4
tc filter add dev enp130s0f0np0_0 ingress prio 1 chain 4 \
	protocol ip flower ct_state -trk \
	action mirred egress redirect dev enp130s0f1np1_0

With the fix, the 3rd rule matches, like it does with OVS kernel
datapath.
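A toy userspace model of the interaction may help. It assumes a heavily simplified packet state: struct pkt, the CT_* flag values and all function names below are hypothetical, and the real flags are the TCA_FLOWER_KEY_CT_FLAGS_* constants.

```c
#include <stdbool.h>

/* Hypothetical flag values, not the real uapi constants. */
#define CT_TRACKED 0x1u
#define CT_INVALID 0x2u

struct pkt {
	bool has_ct;	/* a conntrack entry is attached to the packet */
	bool post_ct;	/* act_ct has attempted a lookup */
};

/* 'action ct': attempt a lookup and remember that it was attempted. */
static void act_ct(struct pkt *p, bool entry_found)
{
	p->has_ct = entry_found;
	p->post_ct = true;
}

/* 'action ct clear': drop the entry; with the fix, also forget the attempt. */
static void act_ct_clear(struct pkt *p, bool with_fix)
{
	p->has_ct = false;
	if (with_fix)
		p->post_ct = false;
}

/* Dissector: synthesize state when a lookup was tried but nothing is attached. */
static unsigned int dissect_ct_state(const struct pkt *p)
{
	if (p->has_ct)
		return CT_TRACKED;
	if (p->post_ct)
		return CT_TRACKED | CT_INVALID;
	return 0;
}

/* 'ct_state -trk' matches only if the tracked bit is clear. */
static bool match_untracked(const struct pkt *p)
{
	return !(dissect_ct_state(p) & CT_TRACKED);
}
```

Running a packet through act_ct() and then act_ct_clear() without the fix leaves post_ct set, so the dissector reports tracked+invalid and the chain 4 rule cannot match; with the fix the state is empty again.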

Fixes: 7baf2429 ("net/sched: cls_flower add CT_FLAGS_INVALID flag support")
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Reviewed-by: wenxu <wenxu@ucloud.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>

8ca1b090

23 Mar, 2021 2 commits

net: dsa: don't assign an error value to tag_ops · e0c755a4

Committed by George McCollister on Mar 22, 2021

Use a temporary variable to hold the return value from
dsa_tag_driver_get() instead of assigning it to dst->tag_ops. Leaving
an error value in dst->tag_ops can result in dereferencing an invalid
pointer when a deferred switch configuration happens later.
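The temporary-variable pattern can be sketched in userspace C as follows. The ERR_PTR/IS_ERR/PTR_ERR helpers here are simplified stand-ins for the kernel macros, and tag_driver_get(), tree_setup_tagger() and the structs are hypothetical names used only for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's error-pointer helpers (simplified). */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long err) { return (void *)(intptr_t)err; }
static inline long PTR_ERR(const void *p) { return (long)(intptr_t)p; }
static inline int IS_ERR(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

struct tag_ops { const char *name; };
struct switch_tree { const struct tag_ops *tag_ops; };

static const struct tag_ops known_ops = { "example" };

/* Hypothetical lookup: protocol 1 is known, everything else fails. */
static const struct tag_ops *tag_driver_get(int proto)
{
	if (proto == 1)
		return &known_ops;
	return ERR_PTR(-ENOPROTOOPT);
}

/* Store the result in the tree only after it is known to be valid. */
static int tree_setup_tagger(struct switch_tree *dst, int proto)
{
	const struct tag_ops *ops = tag_driver_get(proto);

	if (IS_ERR(ops))
		return (int)PTR_ERR(ops);	/* dst->tag_ops is left untouched */
	dst->tag_ops = ops;
	return 0;
}
```

Because the error value never reaches dst->tag_ops, a later pass that only checks the field for NULL cannot mistake an encoded errno for a valid pointer.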

Fixes: 357f203b ("net: dsa: keep a copy of the tagging protocol in the DSA switch tree")
Signed-off-by: George McCollister <george.mccollister@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e0c755a4

net: ipconfig: ic_dev can be NULL in ic_close_devs · a50a151e

Committed by Vladimir Oltean on Mar 22, 2021

ic_close_devs contains a generalization of the logic to not close a
network interface if it's the host port for a DSA switch. This logic is
disguised behind an iteration through the lowers of ic_dev in
ic_close_devs.

When no interface for ipconfig can be found, ic_dev is NULL, and
ic_close_devs:
- dereferences a NULL pointer when assigning selected_dev
- would attempt to search through the lower interfaces of a NULL
  net_device pointer

So we should protect against that case.
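A minimal userspace sketch of such a guard, assuming a simplified device model with at most one lower interface per device; struct net_dev, struct ic_device and close_devs() are hypothetical stand-ins, not the ipconfig code itself.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified device model: at most one lower device. */
struct net_dev {
	const char *name;
	struct net_dev *lower;
	bool up;
};

struct ic_device {
	struct net_dev *dev;
};

/*
 * Close every device except the selected one and its lower.
 * ic_dev may be NULL when no interface was found; then all are closed.
 * Returns the number of devices closed.
 */
static int close_devs(struct net_dev **devs, size_t n,
		      const struct ic_device *ic_dev)
{
	/* Guard against the NULL case before touching ic_dev->dev. */
	struct net_dev *selected = ic_dev ? ic_dev->dev : NULL;
	int closed = 0;

	for (size_t i = 0; i < n; i++) {
		bool keep = false;

		/* Only walk the lowers if there is a selected device. */
		if (selected)
			keep = devs[i] == selected ||
			       devs[i] == selected->lower;
		if (!keep) {
			devs[i]->up = false;
			closed++;
		}
	}
	return closed;
}
```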

The "lower_dev" iterator variable was shortened to "lower" in order to
keep the 80 character limit.

Fixes: f68cbaed ("net: ipconfig: avoid use-after-free in ic_close_devs")
Fixes: 46acf7bd ("Revert "net: ipv4: handle DSA enabled master network devices"")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Heiko Thiery <heiko.thiery@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a50a151e

22 Mar, 2021 3 commits

esp: delete NETIF_F_SCTP_CRC bit from features for esp offload · 154deab6

Committed by Xin Long on Mar 19, 2021

Currently, esp4/6_gso_segment() clears the NETIF_F_CSUM_MASK bits before
calling the inner protocol's .gso_segment, because the hardware cannot
compute the inner checksum once the packet has been encrypted.

As a result, UDP/TCP must compute the checksum in their own .gso_segment.
SCTP, however, uses a CRC checksum, so NETIF_F_SCTP_CRC must be cleared
as well to make SCTP compute it in its own .gso_segment.
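The feature masking amounts to a bit-clear operation, sketched below. The F_* bit positions are hypothetical placeholders for the real NETIF_F_* values, and esp_inner_features() is an illustrative name, not a kernel function.

```c
#include <stdint.h>

/* Hypothetical bit positions; the real NETIF_F_* values differ. */
#define F_IP_CSUM   (1ull << 0)
#define F_HW_CSUM   (1ull << 1)
#define F_IPV6_CSUM (1ull << 2)
#define F_SCTP_CRC  (1ull << 3)
#define F_SG        (1ull << 4)
#define F_CSUM_MASK (F_IP_CSUM | F_HW_CSUM | F_IPV6_CSUM)

/*
 * Features handed to the inner protocol's segmentation handler:
 * every checksum offload is stripped, including the SCTP CRC offload,
 * so the inner protocol computes its checksum in software.
 */
static uint64_t esp_inner_features(uint64_t features)
{
	return features & ~(F_CSUM_MASK | F_SCTP_CRC);
}
```

Unrelated offloads such as scatter-gather (F_SG in this sketch) pass through unchanged.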

In Xiumei's testing with SCTP over IPsec/veth, packets kept being
dropped due to the wrong CRC checksum.
Reported-by: Xiumei Mu <xmu@redhat.com>
Fixes: 7862b405 ("esp: Add gso handlers for esp4 and esp6")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

154deab6

net: xfrm: Use sequence counter with associated spinlock · bc8e0adf

Committed by Ahmed S. Darwish on Mar 16, 2021

A sequence counter write section must be serialized or its internal
state can get corrupted. A plain seqcount_t does not contain the
information of which lock must be held to guarantee write side
serialization.

For xfrm_state_hash_generation, use seqcount_spinlock_t instead of plain
seqcount_t. This associates the spinlock used for write
serialization with the sequence counter. It thus enables lockdep to
verify that the write serialization lock is indeed held before entering
the sequence counter write section.

If lockdep is disabled, this lock association is compiled out and has
neither storage size nor runtime overhead.
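The idea of tying a lock to a sequence counter can be illustrated with a single-threaded toy model. It has no memory barriers and no real locking, so it only demonstrates the bookkeeping; all toy_* names are hypothetical, and the explicit check in toy_write_begin() stands in for what lockdep would verify.

```c
#include <stdbool.h>

/* Toy lock: only records whether it is currently held. */
struct toy_lock {
	bool held;
};

/* Toy sequence counter that knows which lock serializes its writers. */
struct toy_seqcount {
	unsigned int seq;
	const struct toy_lock *lock;
};

/* Enter a write section; refuse if the associated lock is not held. */
static int toy_write_begin(struct toy_seqcount *s)
{
	if (!s->lock->held)
		return -1;	/* the situation lockdep would report */
	s->seq++;		/* odd value: write in progress */
	return 0;
}

/* Leave the write section; the counter becomes even again. */
static void toy_write_end(struct toy_seqcount *s)
{
	s->seq++;
}

/* Reader side: sample the counter before reading protected data. */
static unsigned int toy_read_begin(const struct toy_seqcount *s)
{
	return s->seq;
}

/* Reader side: retry if a write was in progress or has happened since. */
static bool toy_read_retry(const struct toy_seqcount *s, unsigned int start)
{
	return (start & 1u) || s->seq != start;
}
```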
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

bc8e0adf

net: xfrm: Localize sequence counter per network namespace · e88add19

Committed by Ahmed S. Darwish on Mar 16, 2021

A sequence counter write section must be serialized or its internal
state can get corrupted. The "xfrm_state_hash_generation" seqcount is
global, but its write serialization lock (net->xfrm.xfrm_state_lock) is
instantiated per network namespace. The write protection is thus
insufficient.

To provide full protection, localize the sequence counter per network
namespace instead. This should be safe as both the seqcount read and
write sections access data exclusively within the network namespace. It
also lays the foundation for transforming "xfrm_state_hash_generation"
data type from seqcount_t to seqcount_LOCKNAME_t in further commits.

Fixes: b65e3d7b ("xfrm: state: add sequence count to detect hash resizes")
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

e88add19

21 Mar, 2021 1 commit

can: isotp: tx-path: zero initialize outgoing CAN frames · b5f020f8

Committed by Oliver Hartkopp on Mar 19, 2021

Commit d4eb538e ("can: isotp: TX-path: ensure that CAN frame flags are
initialized") ensured that the TX flags are properly set for outgoing
CAN frames.

In fact, the root cause of the issue is a missing initialization
of outgoing CAN frames created by isotp. This is no problem on the CAN bus
as the CAN driver only picks the correctly defined content from the struct
can(fd)_frame. But when the outgoing frames are monitored (e.g. with
candump) we potentially leak some bytes in the unused content of
struct can(fd)_frame.
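Zero-initializing a frame before filling it can be sketched as follows. struct toy_can_frame is a hypothetical layout chosen to have no padding, not the real struct can_frame, and build_frame() is an illustrative helper.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 16-byte frame layout without padding. */
struct toy_can_frame {
	uint32_t can_id;
	uint8_t len;
	uint8_t flags;
	uint8_t res0;
	uint8_t res1;
	uint8_t data[8];
};

/*
 * Fill a frame for transmission. The memset() ensures that flags,
 * reserved bytes and unused data bytes never carry stale memory
 * contents to anyone monitoring the outgoing frame.
 */
static void build_frame(struct toy_can_frame *cf, uint32_t id,
			const uint8_t *payload, uint8_t len)
{
	memset(cf, 0, sizeof(*cf));
	if (len > sizeof(cf->data))
		len = sizeof(cf->data);
	cf->can_id = id;
	cf->len = len;
	if (len)
		memcpy(cf->data, payload, len);
}
```

Without the memset(), only the fields that are explicitly assigned would be defined, and the rest of the buffer would expose whatever was in memory before.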

Fixes: e057dd3f ("can: add ISO 15765-2:2016 transport protocol")
Cc: Marc Kleine-Budde <mkl@pengutronix.de>
Link: https://lore.kernel.org/r/20210319100619.10858-1-socketcan@hartkopp.net
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

b5f020f8

20 Mar, 2021 1 commit

selinux: vsock: Set SID for socket returned by accept() · 1f935e8e

Committed by David Brazdil on Mar 19, 2021

For AF_VSOCK, accept() currently returns sockets that are unlabelled.
Other socket families derive the child's SID from the SID of the parent
and the SID of the incoming packet. This is typically done as the
connected socket is placed in the queue that accept() removes from.

Reuse the existing 'security_sk_clone' hook to copy the SID from the
parent (server) socket to the child. There is no packet SID in this
case.

Fixes: d021c344 ("VSOCK: Introduce VM Sockets")
Signed-off-by: David Brazdil <dbrazdil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1f935e8e

openeuler / Kernel synced successfully nearly 2 years ago
