Commits · b8194d785d6f974174a7be07fbe5a726667b07fc · openeuler / Kernel

17 Dec, 2022 2 commits

kvm: rename last argument to kvm_get_dirty_log_protect · b8194d78

Committed by Paolo Bonzini on Oct 23, 2018

mainline inclusion
from mainline-v5.0
commit: 8fe65a82
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I66COX
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=linux-5.10.y&id=8fe65a8299f9e1f40cb95308ab7b3c4ad80bf801

--------------------------------

Once manual dirty log reprotection is enabled, kvm_get_dirty_log_protect's
pointer argument will always be false on exit, because no TLB flush is needed
until the manual re-protection operation.  Rename it from "is_dirty" to "flush",
which more accurately tells the caller what they have to do with it.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

kvm: make KVM_CAP_ENABLE_CAP_VM architecture agnostic · 5694b2e4

Committed by Paolo Bonzini on Feb 16, 2017

mainline inclusion
from mainline-v5.0
commit: e5d83c74
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I66COX
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=linux-5.10.y&id=e5d83c74a5800c2a1fa3ba982c1c4b2b39ae6db2

--------------------------------

The first such capability to be handled in virt/kvm/ will be manual
dirty page reprotection.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

08 Dec, 2022 26 commits

mm/sharepool: Fix a double free problem caused by init_local_group · e7162989

Committed by Chen Jun on Dec 08, 2022

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I64Y5Y
CVE: NA

-------------------------------

If local_group_add_task fails in init_local_group, the IDA frees the
same id twice:

init_local_group
  local_group_add_task    // failed
  goto free_spg

free_spg:
  free_sp_group_locked
    free_sp_group_id      // free spg->id
free_spg_id:
  free_new_spg_id         // double free spg->id

To fix it, return before calling free_new_spg_id.
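
The error path above can be illustrated with a small user-space sketch. The names `release_id`, `free_group`, and `group_init` are hypothetical stand-ins, not the sharepool code. The point is only that once the group teardown has released the id, the function returns instead of falling through to a second release.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for an id allocator; counts releases per id. */
#define MAX_IDS 8
static int id_release_count[MAX_IDS];

static void release_id(int id)
{
	id_release_count[id]++;
}

struct group {
	int id;
};

/* Tearing down the group also releases the id it owns. */
static void free_group(struct group *g)
{
	release_id(g->id);
	free(g);
}

/*
 * Returns 0 on success and stores the group in *out, -1 on failure.
 * 'add_task_fails' simulates the add-task step failing after the
 * group already owns the id.
 */
static int group_init(int id, bool add_task_fails, struct group **out)
{
	struct group *g = malloc(sizeof(*g));

	if (!g) {
		release_id(id);	/* no group owns the id yet */
		return -1;
	}
	g->id = id;

	if (add_task_fails) {
		free_group(g);	/* releases the id exactly once */
		return -1;	/* return here, no second release */
	}

	*out = g;
	return 0;
}
```

After a failed `group_init(3, true, &g)`, the counter for id 3 is 1 rather than 2.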

Signed-off-by: Chen Jun <chenjun102@huawei.com>
Signed-off-by: Guo Mengqi <guomengqi3@huawei.com>
Reviewed-by: chenweilong <chenweilong@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

bpf, test_run: Fix alignment problem in bpf_prog_test_run_skb() · 12f3b19c

Committed by Baisong Zhong on Dec 08, 2022

stable inclusion
from stable-v4.19.267
commit 730fb1ef974a13915bc7651364d8b3318891cd70
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I63UEU
CVE: NA

--------------------------------

commit d3fd203f upstream.

We got a syzkaller report of an aarch64 alignment fault when KFENCE is
enabled. When the size passed in from a user BPF program is an odd
number, such as 399 or 407, struct skb_shared_info is accessed
unaligned, as seen below:

  BUG: KFENCE: use-after-free read in __skb_clone+0x23c/0x2a0 net/core/skbuff.c:1032

  Use-after-free read at 0xffff6254fffac077 (in kfence-#213):
   __lse_atomic_add arch/arm64/include/asm/atomic_lse.h:26 [inline]
   arch_atomic_add arch/arm64/include/asm/atomic.h:28 [inline]
   arch_atomic_inc include/linux/atomic-arch-fallback.h:270 [inline]
   atomic_inc include/asm-generic/atomic-instrumented.h:241 [inline]
   __skb_clone+0x23c/0x2a0 net/core/skbuff.c:1032
   skb_clone+0xf4/0x214 net/core/skbuff.c:1481
   ____bpf_clone_redirect net/core/filter.c:2433 [inline]
   bpf_clone_redirect+0x78/0x1c0 net/core/filter.c:2420
   bpf_prog_d3839dd9068ceb51+0x80/0x330
   bpf_dispatcher_nop_func include/linux/bpf.h:728 [inline]
   bpf_test_run+0x3c0/0x6c0 net/bpf/test_run.c:53
   bpf_prog_test_run_skb+0x638/0xa7c net/bpf/test_run.c:594
   bpf_prog_test_run kernel/bpf/syscall.c:3148 [inline]
   __do_sys_bpf kernel/bpf/syscall.c:4441 [inline]
   __se_sys_bpf+0xad0/0x1634 kernel/bpf/syscall.c:4381

  kfence-#213: 0xffff6254fffac000-0xffff6254fffac196, size=407, cache=kmalloc-512

  allocated by task 15074 on cpu 0 at 1342.585390s:
   kmalloc include/linux/slab.h:568 [inline]
   kzalloc include/linux/slab.h:675 [inline]
   bpf_test_init.isra.0+0xac/0x290 net/bpf/test_run.c:191
   bpf_prog_test_run_skb+0x11c/0xa7c net/bpf/test_run.c:512
   bpf_prog_test_run kernel/bpf/syscall.c:3148 [inline]
   __do_sys_bpf kernel/bpf/syscall.c:4441 [inline]
   __se_sys_bpf+0xad0/0x1634 kernel/bpf/syscall.c:4381
   __arm64_sys_bpf+0x50/0x60 kernel/bpf/syscall.c:4381

To fix the problem, we adjust @size so that (@size + @headroom) is a
multiple of SMP_CACHE_BYTES. This ensures that struct skb_shared_info
is aligned to a cache line.
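
The arithmetic can be sketched in user space as follows. This is an illustration of the rounding, not the patch itself: `CACHE_LINE` is an assumed 64-byte stand-in for SMP_CACHE_BYTES, and `round_up_pow2` and `pad_size` are hypothetical helper names.

```c
#include <stddef.h>

/* Assumed cache-line size for this sketch; the kernel uses SMP_CACHE_BYTES. */
#define CACHE_LINE 64u

/* Round x up to the next multiple of a (a must be a power of two). */
static size_t round_up_pow2(size_t x, size_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/* Smallest padded size >= size such that (padded + headroom) is aligned. */
static size_t pad_size(size_t size, size_t headroom)
{
	return round_up_pow2(size + headroom, CACHE_LINE) - headroom;
}
```

With an assumed headroom of 64 bytes, the odd size 407 from the report is padded to 448, so the data that follows starts on a cache-line boundary.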

Fixes: 1cf1cae9 ("bpf: introduce BPF_PROG_TEST_RUN command")
Signed-off-by: Baisong Zhong <zhongbaisong@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/bpf/20221102081620.1465154-1-zhongbaisong@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

macvlan: enforce a consistent minimal mtu · 8364d44b

Committed by Eric Dumazet on Dec 08, 2022

stable inclusion
from stable-v4.19.267
commit 650137a7c0b2892df2e5b0bc112d7b09a78c93c8
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I63UEU
CVE: NA

--------------------------------

commit b64085b0 upstream.

macvlan should enforce a minimal mtu of 68, even at link creation.

This patch avoids the current behavior (which could lead to crashes
in ipv6 stack if the link is brought up)

$ ip link add macvlan1 link eno1 mtu 8 type macvlan  # This should fail !
$ ip link sh dev macvlan1
5: macvlan1@eno1: <BROADCAST,MULTICAST> mtu 8 qdisc noop
    state DOWN mode DEFAULT group default qlen 1000
    link/ether 02:47:6c:24:74:82 brd ff:ff:ff:ff:ff:ff
$ ip link set macvlan1 mtu 67
Error: mtu less than device minimum.
$ ip link set macvlan1 mtu 68
$ ip link set macvlan1 mtu 8
Error: mtu less than device minimum.
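
The intended behavior shown in the transcript can be summarized by a range check that applies at link creation as well as on later changes. The helper below is a hypothetical user-space sketch; `MIN_MTU` and `check_mtu` are illustrative names, not the macvlan driver code.

```c
#include <errno.h>

/* 68 is the minimal MTU named in the commit message. */
#define MIN_MTU 68

/* Accept or reject a requested MTU, whether at creation or later. */
static int check_mtu(int requested, int max_mtu)
{
	if (requested < MIN_MTU || requested > max_mtu)
		return -EINVAL;
	return 0;
}
```

Under this check, a request for an MTU of 8 or 67 is rejected and 68 is accepted, matching the `ip link set` results above.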

Fixes: 91572088 ("net: use core MTU range checking in core net infra")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

net: macvlan: fix memory leaks of macvlan_common_newlink · aa800fa8

Committed by Chuang Wang on Dec 08, 2022

stable inclusion
from stable-v4.19.267
commit a81b44d1df1f07f00c0dcc0a0b3d2fa24a46289e
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I63UEU
CVE: NA

--------------------------------

[ Upstream commit 23569b56 ]

kmemleak reports memory leaks in macvlan_common_newlink, as follows:

 ip link add link eth0 name .. type macvlan mode source macaddr add
 <MAC-ADDR>

kmemleak reports:

unreferenced object 0xffff8880109bb140 (size 64):
  comm "ip", pid 284, jiffies 4294986150 (age 430.108s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 b8 aa 5a 12 80 88 ff ff  ..........Z.....
    80 1b fa 0d 80 88 ff ff 1e ff ac af c7 c1 6b 6b  ..............kk
  backtrace:
    [<ffffffff813e06a7>] kmem_cache_alloc_trace+0x1c7/0x300
    [<ffffffff81b66025>] macvlan_hash_add_source+0x45/0xc0
    [<ffffffff81b66a67>] macvlan_changelink_sources+0xd7/0x170
    [<ffffffff81b6775c>] macvlan_common_newlink+0x38c/0x5a0
    [<ffffffff81b6797e>] macvlan_newlink+0xe/0x20
    [<ffffffff81d97f8f>] __rtnl_newlink+0x7af/0xa50
    [<ffffffff81d98278>] rtnl_newlink+0x48/0x70
    ...

In the scenario where the macvlan mode is configured as 'source',
macvlan_changelink_sources() is executed to reconfigure the list of
remote source MAC addresses. If register_netdevice() then returns an
error, the resources allocated by macvlan_changelink_sources() are not
cleaned up.

With this patch, macvlan_flush_sources() is executed in the error case
to ensure that the resources are cleaned up.

Fixes: aa5fd0fb ("driver: macvlan: Destroy new macvlan port if macvlan_common_newlink failed.")
Signed-off-by: Chuang Wang <nashuiliang@gmail.com>
Link: https://lore.kernel.org/r/20221109090735.690500-1-nashuiliang@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

ipv6: addrlabel: fix infoleak when sending struct ifaddrlblmsg to network · ebaa932a

Committed by Alexander Potapenko on Dec 08, 2022

stable inclusion
from stable-v4.19.267
commit 6d26d0587abccb9835382a0b53faa7b9b1cd83e3
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I63UEU
CVE: NA

--------------------------------

[ Upstream commit c23fb2c8 ]

When copying a `struct ifaddrlblmsg` to the network, __ifal_reserved
remained uninitialized, resulting in a 1-byte infoleak:

  BUG: KMSAN: kernel-network-infoleak in __netdev_start_xmit ./include/linux/netdevice.h:4841
   __netdev_start_xmit ./include/linux/netdevice.h:4841
   netdev_start_xmit ./include/linux/netdevice.h:4857
   xmit_one net/core/dev.c:3590
   dev_hard_start_xmit+0x1dc/0x800 net/core/dev.c:3606
   __dev_queue_xmit+0x17e8/0x4350 net/core/dev.c:4256
   dev_queue_xmit ./include/linux/netdevice.h:3009
   __netlink_deliver_tap_skb net/netlink/af_netlink.c:307
   __netlink_deliver_tap+0x728/0xad0 net/netlink/af_netlink.c:325
   netlink_deliver_tap net/netlink/af_netlink.c:338
   __netlink_sendskb net/netlink/af_netlink.c:1263
   netlink_sendskb+0x1d9/0x200 net/netlink/af_netlink.c:1272
   netlink_unicast+0x56d/0xf50 net/netlink/af_netlink.c:1360
   nlmsg_unicast ./include/net/netlink.h:1061
   rtnl_unicast+0x5a/0x80 net/core/rtnetlink.c:758
   ip6addrlbl_get+0xfad/0x10f0 net/ipv6/addrlabel.c:628
   rtnetlink_rcv_msg+0xb33/0x1570 net/core/rtnetlink.c:6082
  ...
  Uninit was created at:
   slab_post_alloc_hook+0x118/0xb00 mm/slab.h:742
   slab_alloc_node mm/slub.c:3398
   __kmem_cache_alloc_node+0x4f2/0x930 mm/slub.c:3437
   __do_kmalloc_node mm/slab_common.c:954
   __kmalloc_node_track_caller+0x117/0x3d0 mm/slab_common.c:975
   kmalloc_reserve net/core/skbuff.c:437
   __alloc_skb+0x27a/0xab0 net/core/skbuff.c:509
   alloc_skb ./include/linux/skbuff.h:1267
   nlmsg_new ./include/net/netlink.h:964
   ip6addrlbl_get+0x490/0x10f0 net/ipv6/addrlabel.c:608
   rtnetlink_rcv_msg+0xb33/0x1570 net/core/rtnetlink.c:6082
   netlink_rcv_skb+0x299/0x550 net/netlink/af_netlink.c:2540
   rtnetlink_rcv+0x26/0x30 net/core/rtnetlink.c:6109
   netlink_unicast_kernel net/netlink/af_netlink.c:1319
   netlink_unicast+0x9ab/0xf50 net/netlink/af_netlink.c:1345
   netlink_sendmsg+0xebc/0x10f0 net/netlink/af_netlink.c:1921
  ...

This patch ensures that the reserved field is always initialized.
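
A minimal sketch of the idea, assuming a local stand-in struct (`addrlabel_msg`) that mirrors the field layout of `struct ifaddrlblmsg`; the helper name `fill_msg` is hypothetical. Every field, including the reserved byte, is written before the message leaves the function.

```c
#include <stdint.h>

/* Local stand-in mirroring the layout of struct ifaddrlblmsg. */
struct addrlabel_msg {
	uint8_t  family;
	uint8_t  reserved;	/* the byte that was left uninitialized */
	uint8_t  prefixlen;
	uint8_t  flags;
	uint32_t index;
	uint32_t seq;
};

/* Fill a message whose previous memory contents are unknown. */
static void fill_msg(struct addrlabel_msg *m, uint8_t family,
		     uint8_t prefixlen, uint32_t index, uint32_t seq)
{
	m->family = family;
	m->reserved = 0;	/* explicit initialization closes the leak */
	m->prefixlen = prefixlen;
	m->flags = 0;
	m->index = index;
	m->seq = seq;
}
```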

Reported-by: syzbot+3553517af6020c4f2813f1003fe76ef3cbffe98d@syzkaller.appspotmail.com
Fixes: 2a8cc6c8 ("[IPV6] ADDRCONF: Support RFC3484 configurable address selection policy table.")
Signed-off-by: Alexander Potapenko <glider@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

net: gso: fix panic on frag_list with mixed head alloc types · d6f805ed

Committed by Jiri Benc on Dec 08, 2022

stable inclusion
from stable-v4.19.267
commit bd5362e58721e4d0d1a37796593bd6e51536ce7a
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I63UEU
CVE: NA

--------------------------------

[ Upstream commit 9e4b7a99 ]

Since commit 3dcbdb13 ("net: gso: Fix skb_segment splat when
splitting gso_size mangled skb having linear-headed frag_list"), it is
allowed to change gso_size of a GRO packet. However, that commit assumes
that "checking the first list_skb member suffices; i.e if either of the
list_skb members have non head_frag head, then the first one has too".

It turns out this assumption does not hold. We've seen BUG_ON being hit
in skb_segment when skbs on the frag_list had differing head_frag with
the vmxnet3 driver. This happens because __netdev_alloc_skb and
__napi_alloc_skb can return a skb that is page backed or kmalloced
depending on the requested size. As a result, the last small skb in
the GRO packet can be kmalloced.

There are three different locations where this can be fixed:

(1) We could check head_frag in GRO and not allow GROing skbs with
different head_frag. However, that would lead to performance
regression on normal forward paths with unmodified gso_size, where
!head_frag in the last packet is not a problem.

(2) Set a flag in bpf_skb_net_grow and bpf_skb_net_shrink indicating
that NETIF_F_SG is undesirable. That would need to eat a bit in
sk_buff. Furthermore, that flag can be unset when all skbs on the
frag_list are page backed. To retain good performance,
bpf_skb_net_grow/shrink would have to walk the frag_list.

(3) Walk the frag_list in skb_segment when determining whether
NETIF_F_SG should be cleared. This of course slows things down.

This patch implements (3). To limit the performance impact in
skb_segment, the list is walked only for skbs with SKB_GSO_DODGY set
that have gso_size changed. Normal paths thus will not hit it.

We could check only the last skb but since we need to walk the whole
list anyway, let's stay on the safe side.
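
Option (3) amounts to a full list walk rather than a check of the first element. The sketch below uses a heavily reduced, hypothetical `struct buf` in place of an skb; it only shows why inspecting the first member is not enough.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-in for an skb on a frag_list. */
struct buf {
	bool head_frag;		/* true: page-backed head, false: kmalloc'ed */
	struct buf *next;
};

/* Walk the whole list; a single non-page-backed head gives false. */
static bool all_heads_page_backed(const struct buf *list)
{
	for (const struct buf *b = list; b; b = b->next)
		if (!b->head_frag)
			return false;
	return true;
}
```

For a list whose first two members are page backed and whose last member is kmalloc'ed, checking only the first member would report true, while the walk reports false.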

Fixes: 3dcbdb13 ("net: gso: Fix skb_segment splat when splitting gso_size mangled skb having linear-headed frag_list")
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/e04426a6a91baf4d1081e1b478c82b5de25fdf21.1667407944.git.jbenc@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

tcp/udp: Make early_demux back namespacified. · 41b39f2b

Committed by Kuniyuki Iwashima on Dec 08, 2022

stable inclusion
from stable-v4.19.265
commit 7162f05f1f0f2b9554ea207842a1389c067cc1db
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I63UEU
CVE: NA

--------------------------------

commit 11052589 upstream.

Commit e21145a9 ("ipv4: namespacify ip_early_demux sysctl knob") made
it possible to enable/disable early_demux on a per-netns basis.  Then, we
introduced two knobs, tcp_early_demux and udp_early_demux, to switch it for
TCP/UDP in commit dddb64bc ("net: Add sysctl to toggle early demux for
tcp and udp").  However, the .proc_handler() was wrong and actually
disabled us from changing the behaviour in each netns.

We can execute early_demux if net.ipv4.ip_early_demux is on and each proto
.early_demux() handler is not NULL.  When we toggle (tcp|udp)_early_demux,
the change itself is saved in each netns variable, but the .early_demux()
handler is a global variable, so the handler is switched based on the
init_net's sysctl variable.  Thus, netns (tcp|udp)_early_demux knobs have
nothing to do with the logic.  Whether we CAN execute proto .early_demux()
is always decided by init_net's sysctl knob, and whether we DO it or not is
by each netns ip_early_demux knob.

This patch namespacifies (tcp|udp)_early_demux again.  For now, the users
of the .early_demux() handler are TCP and UDP only, and they are called
directly to avoid retpoline.  So, we can remove the .early_demux() handler
from inet6?_protos and need not dereference them in ip6?_rcv_finish_core().
If another proto needs .early_demux(), we can restore it at that time.

Fixes: dddb64bc ("net: Add sysctl to toggle early demux for tcp and udp")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20220713175207.7727-1-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

ipv6: fix WARNING in ip6_route_net_exit_late() · 05a9ea59

Committed by Zhengchao Shao on Dec 08, 2022

stable inclusion
from stable-v4.19.265
commit 83fbf246ced54dadd7b9adc2a16efeff30ba944d
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I63UEU
CVE: NA

--------------------------------

[ Upstream commit 768b3c74 ]

During the initialization in ip6_route_net_init_late(), if the
ipv6_route or rt6_stats file fails to be created, the initialization
still reports success. Therefore, the ipv6_route or rt6_stats file
cannot be found during removal in ip6_route_net_exit_late(), which
causes a WARNING.

The following is the stack information:
name 'rt6_stats'
WARNING: CPU: 0 PID: 9 at fs/proc/generic.c:712 remove_proc_entry+0x389/0x460
Modules linked in:
Workqueue: netns cleanup_net
RIP: 0010:remove_proc_entry+0x389/0x460
PKRU: 55555554
Call Trace:
<TASK>
ops_exit_list+0xb0/0x170
cleanup_net+0x4ea/0xb00
process_one_work+0x9bf/0x1710
worker_thread+0x665/0x1080
kthread+0x2e4/0x3a0
ret_from_fork+0x1f/0x30
</TASK>

Fixes: cdb18761 ("[NETNS][IPV6] route6 - create route6 proc files for the namespace")
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20221102020610.351330-1-shaozhengchao@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

net, neigh: Fix null-ptr-deref in neigh_table_clear() · d31519a4

Committed by Chen Zhongjin on Dec 08, 2022

stable inclusion
from stable-v4.19.265
commit b736592de2aa53aee2d48d6b129bc0c892007bbe
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I63UEU
CVE: NA

--------------------------------

[ Upstream commit f8017317 ]

When the IPv6 module gets initialized but hits an error in the middle,
the kernel panics with:

KASAN: null-ptr-deref in range [0x0000000000000598-0x000000000000059f]
CPU: 1 PID: 361 Comm: insmod
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
RIP: 0010:__neigh_ifdown.isra.0+0x24b/0x370
RSP: 0018:ffff888012677908 EFLAGS: 00000202
...
Call Trace:
 <TASK>
 neigh_table_clear+0x94/0x2d0
 ndisc_cleanup+0x27/0x40 [ipv6]
 inet6_init+0x21c/0x2cb [ipv6]
 do_one_initcall+0xd3/0x4d0
 do_init_module+0x1ae/0x670
...
Kernel panic - not syncing: Fatal exception

When ipv6 initialization fails, it tries to clean up and calls:

neigh_table_clear()
  neigh_ifdown(tbl, NULL)
    pneigh_queue_purge(&tbl->proxy_queue, dev_net(dev == NULL))
    # dev_net(NULL) triggers null-ptr-deref.

Fix it by passing NULL to pneigh_queue_purge() in neigh_ifdown() if dev
is NULL, to make kernel not panic immediately.
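
The guard can be sketched in user space as below. The reduced `struct net` and `struct net_device`, the local `dev_net` stand-in, and the helper `purge_filter` are assumptions made for illustration; they are not the kernel definitions.

```c
#include <stddef.h>

struct net {
	int id;
};

struct net_device {
	struct net *nd_net;
};

/* Stand-in for dev_net(): dereferences dev, so dev must not be NULL. */
static struct net *dev_net(const struct net_device *dev)
{
	return dev->nd_net;
}

/* Pick the namespace to purge for; NULL is passed through when dev is NULL. */
static struct net *purge_filter(const struct net_device *dev)
{
	return dev ? dev_net(dev) : NULL;
}
```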

Fixes: 66ba215c ("neigh: fix possible DoS due to net iface start/stop loop")
Signed-off-by: Chen Zhongjin <chenzhongjin@huawei.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Link: https://lore.kernel.org/r/20221101121552.21890-1-chenzhongjin@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

tcp: fix indefinite deferral of RTO with SACK reneging · e6d23a4b

Committed by Neal Cardwell on Dec 08, 2022

stable inclusion
from stable-v4.19.264
commit 633da7b30b240000b1d9b690e43848406a0d060f
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I63UEU
CVE: NA

--------------------------------

[ Upstream commit 3d2af9cc ]

This commit fixes a bug that can cause a TCP data sender to repeatedly
defer RTOs when encountering SACK reneging.

The bug is that when we're in fast recovery in a scenario with SACK
reneging, every time we get an ACK we call tcp_check_sack_reneging()
and it can note the apparent SACK reneging and rearm the RTO timer for
srtt/2 into the future. In some SACK reneging scenarios that can
happen repeatedly until the receive window fills up, at which point
the sender can't send any more, the ACKs stop arriving, and the RTO
fires at srtt/2 after the last ACK. But that can take far too long
(O(10 secs)), since the connection is stuck in fast recovery with a
low cwnd that cannot grow beyond ssthresh, even if more bandwidth is
available.

This fix changes the logic in tcp_check_sack_reneging() to only rearm
the RTO timer if data is cumulatively ACKed, indicating forward
progress. This avoids this kind of nearly infinite loop of RTO timer
re-arming. In addition, this meets the goals of
tcp_check_sack_reneging() in handling Windows TCP behavior that looks
temporarily like SACK reneging but is not really.
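
The changed condition can be sketched as a predicate over per-ACK flags. The flag names and values below are hypothetical stand-ins chosen for this illustration, and `should_rearm_rto` is not a kernel function.

```c
#include <stdbool.h>

/* Hypothetical flag bits for this sketch. */
#define F_SACK_RENEGING 0x1u	/* receiver appears to have dropped SACKed data */
#define F_UNA_ADVANCED  0x2u	/* this ACK cumulatively acknowledged new data */

/* Rearm the RTO timer only when the ACK shows forward progress. */
static bool should_rearm_rto(unsigned int ack_flags)
{
	return (ack_flags & F_SACK_RENEGING) && (ack_flags & F_UNA_ADVANCED);
}
```

An ACK that signals apparent reneging without advancing the cumulative ACK point no longer pushes the timer out, so a stream of such ACKs cannot defer the RTO repeatedly.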

Many thanks to Jakub Kicinski and Neil Spring, who reported this issue
and provided critical packet traces that enabled root-causing this
issue. Also, many thanks to Jakub Kicinski for testing this fix.

Fixes: 5ae344c9 ("tcp: reduce spurious retransmits due to transient SACK reneging")
Reported-by: Jakub Kicinski <kuba@kernel.org>
Reported-by: Neil Spring <ntspring@fb.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Tested-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20221021170821.1093930-1-ncardwell.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

net: fix UAF issue in nfqnl_nf_hook_drop() when ops_init() failed · c50c8779

Committed by Zhengchao Shao on Dec 08, 2022

stable inclusion
from stable-v4.19.264
commit 5a2ea549be94924364f6911227d99be86e8cf34a
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I63UEU
CVE: NA

--------------------------------

[ Upstream commit d266935a ]

When the ops_init() interface is invoked to initialize the net, but
ops->init() fails, data is released. However, the ptr pointer in
net->gen is invalid. In this case, when nfqnl_nf_hook_drop() is invoked
to release the net, invalid address access occurs.

The process is as follows:
setup_net()
	ops_init()
		data = kzalloc(...)   ---> alloc "data"
		net_assign_generic()  ---> assign "data" to ptr in net->gen
		...
		ops->init()           ---> failed
		...
		kfree(data);          ---> ptr in net->gen is invalid
	...
	ops_exit_list()
		...
		nfqnl_nf_hook_drop()
			*q = nfnl_queue_pernet(net) ---> q is invalid
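
The hazard in the flow above is a published pointer that outlives its allocation. The commit message does not spell out the change, so the sketch below shows one way to avoid the stale pointer, as an assumption: clear the slot before freeing. `struct net_generic` here is a reduced stand-in, and `ops_init_sketch`, `init_ok`, and `init_fail` are hypothetical names.

```c
#include <stdlib.h>

#define MAX_OPS 4

/* Reduced stand-in for net->gen: one pointer slot per subsystem id. */
struct net_generic {
	void *ptr[MAX_OPS];
};

/* Demo callbacks standing in for ops->init(). */
static int init_ok(void *data)   { (void)data; return 0; }
static int init_fail(void *data) { (void)data; return -1; }

/* Returns 0 on success; on failure the slot must not keep 'data'. */
static int ops_init_sketch(struct net_generic *gen, int id, size_t size,
			   int (*init)(void *data))
{
	void *data = calloc(1, size);

	if (!data)
		return -1;
	gen->ptr[id] = data;	/* published before init, as in the flow above */

	if (init(data) == 0)
		return 0;

	gen->ptr[id] = NULL;	/* unpublish so later lookups see NULL */
	free(data);
	return -1;
}
```

With the slot cleared, an exit handler that looks up the per-namespace pointer sees NULL instead of freed memory and can skip its work.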

The following is the Call Trace information:
BUG: KASAN: use-after-free in nfqnl_nf_hook_drop+0x264/0x280
Read of size 8 at addr ffff88810396b240 by task ip/15855
Call Trace:
<TASK>
dump_stack_lvl+0x8e/0xd1
print_report+0x155/0x454
kasan_report+0xba/0x1f0
nfqnl_nf_hook_drop+0x264/0x280
nf_queue_nf_hook_drop+0x8b/0x1b0
__nf_unregister_net_hook+0x1ae/0x5a0
nf_unregister_net_hooks+0xde/0x130
ops_exit_list+0xb0/0x170
setup_net+0x7ac/0xbd0
copy_net_ns+0x2e6/0x6b0
create_new_namespaces+0x382/0xa50
unshare_nsproxy_namespaces+0xa6/0x1c0
ksys_unshare+0x3a4/0x7e0
__x64_sys_unshare+0x2d/0x40
do_syscall_64+0x35/0x80
entry_SYSCALL_64_after_hwframe+0x46/0xb0
</TASK>

Allocated by task 15855:
kasan_save_stack+0x1e/0x40
kasan_set_track+0x21/0x30
__kasan_kmalloc+0xa1/0xb0
__kmalloc+0x49/0xb0
ops_init+0xe7/0x410
setup_net+0x5aa/0xbd0
copy_net_ns+0x2e6/0x6b0
create_new_namespaces+0x382/0xa50
unshare_nsproxy_namespaces+0xa6/0x1c0
ksys_unshare+0x3a4/0x7e0
__x64_sys_unshare+0x2d/0x40
do_syscall_64+0x35/0x80
entry_SYSCALL_64_after_hwframe+0x46/0xb0

Freed by task 15855:
kasan_save_stack+0x1e/0x40
kasan_set_track+0x21/0x30
kasan_save_free_info+0x2a/0x40
____kasan_slab_free+0x155/0x1b0
slab_free_freelist_hook+0x11b/0x220
__kmem_cache_free+0xa4/0x360
ops_init+0xb9/0x410
setup_net+0x5aa/0xbd0
copy_net_ns+0x2e6/0x6b0
create_new_namespaces+0x382/0xa50
unshare_nsproxy_namespaces+0xa6/0x1c0
ksys_unshare+0x3a4/0x7e0
__x64_sys_unshare+0x2d/0x40
do_syscall_64+0x35/0x80
entry_SYSCALL_64_after_hwframe+0x46/0xb0

Fixes: f875bae0 ("net: Automatically allocate per namespace data.")
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

serial: 8250: Flush DMA Rx on RLSI · 176d2de6

Committed by Ilpo Järvinen on Dec 08, 2022

stable inclusion
from stable-v4.19.267
commit 40f5fa26c11bca5c947d218cc4fe6e0c64932070
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I63UEU
CVE: NA

--------------------------------

commit 1980860e upstream.

Returning true from handle_rx_dma() without flushing DMA first creates
a data ordering hazard. If DMA Rx has handled any character at the
point when RLSI occurs, the non-DMA path handles any pending characters
jumping them ahead of those characters that are pending under DMA.

Fixes: 75df022b ("serial: 8250_dma: Fix RX handling")
Cc: <stable@vger.kernel.org>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20221108121952.5497-5-ilpo.jarvinen@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

serial: 8250: Fall back to non-DMA Rx if IIR_RDI occurs · 80703cbe

Committed by Ilpo Järvinen on Dec 08, 2022

stable inclusion
from stable-v4.19.267
commit 62cda857457c7de0922852d54d69b140bd6eeb7e
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I63UEU
CVE: NA

--------------------------------

commit a931237c upstream.

DW UART sometimes triggers IIR_RDI during DMA Rx when IIR_RX_TIMEOUT
should have been triggered instead. Since IIR_RDI has higher priority
than IIR_RX_TIMEOUT, this causes the Rx to hang in an interrupt loop.
The problem seems to occur at least with some combinations of
small-sized transfers (I've reproduced the problem on Elkhart Lake PSE
UARTs).

If there's already an on-going Rx DMA and IIR_RDI triggers, fall
gracefully back to non-DMA Rx. That is, behave as if IIR_RX_TIMEOUT had
occurred.

8250_omap already treats IIR_RDI in a similar way, so this is
nothing unheard of.

Fixes: 75df022b ("serial: 8250_dma: Fix RX handling")
Cc: <stable@vger.kernel.org>
Co-developed-by: Srikanth Thokala <srikanth.thokala@intel.com>
Signed-off-by: Srikanth Thokala <srikanth.thokala@intel.com>
Co-developed-by: Aman Kumar <aman.kumar@intel.com>
Signed-off-by: Aman Kumar <aman.kumar@intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20221108121952.5497-2-ilpo.jarvinen@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

capabilities: fix potential memleak on error path from vfs_getxattr_alloc() · 779c11cb

Committed by Gaosheng Cui on Dec 08, 2022

stable inclusion
from stable-v4.19.265
commit 90577bcc01c4188416a47269f8433f70502abe98
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I63UEU
CVE: NA

--------------------------------

commit 8cf0a1bc upstream.

In cap_inode_getsecurity(), we use vfs_getxattr_alloc() to allocate
memory for tmpbuf. If that allocation succeeds but the subsequent call
to handler->get(...) fails, the memory is leaked in the logic below:

  |-- ret = (int)vfs_getxattr_alloc(mnt_userns, ...)
    |           /* ^^^ alloc for tmpbuf */
    |-- value = krealloc(*xattr_value, error + 1, flags)
    |           /* ^^^ alloc memory */
    |-- error = handler->get(handler, ...)
    |           /* error! */
    |-- *xattr_value = value
    |           /* xattr_value is &tmpbuf (memory leak!) */

So we will try to free(tmpbuf) after vfs_getxattr_alloc() fails to fix it.
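
The pattern is an "allocate and fill" helper that can hand back an allocation and still return an error. The user-space sketch below uses hypothetical names (`getxattr_alloc_sketch`, `read_cap_sketch`) and `realloc`/`free` in place of the kernel allocators; it only shows that the caller must free the buffer on the error path too.

```c
#include <stdlib.h>

/*
 * Hypothetical stand-in for an "alloc and fill" helper: it may store an
 * allocation in *out and still return a negative error afterwards.
 */
static int getxattr_alloc_sketch(char **out, size_t len, int get_fails)
{
	char *value = realloc(*out, len + 1);

	if (!value)
		return -1;
	*out = value;		/* the caller's pointer now owns the memory */
	if (get_fails)
		return -2;	/* error reported after the allocation */
	value[len] = '\0';
	return (int)len;
}

/* The caller frees the buffer on every path, including the error path. */
static int read_cap_sketch(int get_fails)
{
	char *tmpbuf = NULL;
	int ret = getxattr_alloc_sketch(&tmpbuf, 16, get_fails);

	if (ret < 0) {
		free(tmpbuf);	/* without this, the allocation leaks */
		return ret;
	}
	free(tmpbuf);
	return 0;
}
```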

Cc: stable@vger.kernel.org
Fixes: 8db6c34f ("Introduce v3 namespaced file capabilities")
Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Acked-by: Serge Hallyn <serge@hallyn.com>
[PM: subject line and backtrace tweaks]
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

security: commoncap: fix -Wstringop-overread warning · 7e2d8bfc

Committed by Arnd Bergmann on Dec 08, 2022

stable inclusion
from stable-v4.19.191
commit 2f34dd12fd7a28888286924d74c0313532bc52d8
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I63UEU
CVE: NA

--------------------------------

commit 82e5d8cc upstream.

gcc-11 introduces a harmless warning for cap_inode_getsecurity:

security/commoncap.c: In function ‘cap_inode_getsecurity’:
security/commoncap.c:440:33: error: ‘memcpy’ reading 16 bytes from a region of size 0 [-Werror=stringop-overread]
  440 |                                 memcpy(&nscap->data, &cap->data, sizeof(__le32) * 2 * VFS_CAP_U32);
      |                                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The problem here is that tmpbuf is initialized to NULL, so gcc assumes
it is not accessible unless it gets set by vfs_getxattr_alloc().  This is
a legitimate warning as far as I can tell, but the code is correct since
it correctly handles the error when that function fails.

Add a separate NULL check to tell gcc about it as well.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: James Morris <jamorris@linux.microsoft.com>
Cc: Andrey Zhizhikin <andrey.z@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

ring_buffer: Do not deactivate non-existant pages · ee1cf85b

Committed by Daniil Tatianin on Dec 08, 2022

stable inclusion
from stable-v4.19.267
commit 455ea324770205525cbc13f155806a5346794339
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I63UEU
CVE: NA

--------------------------------

commit 56f4ca0a upstream.

rb_head_page_deactivate() expects cpu_buffer to contain a valid list of
->pages, so verify that the list is actually present before calling it.

Found by Linux Verification Center (linuxtesting.org) with the SVACE
static analysis tool.

Link: https://lkml.kernel.org/r/20221114143129.3534443-1-d-tatianin@yandex-team.ru

Cc: stable@vger.kernel.org
Fixes: 77ae365e ("ring-buffer: make lockless")
Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

ftrace: Fix null pointer dereference in ftrace_add_mod() · 33ffa638

Committed by Xiu Jianfeng on Dec 08, 2022

stable inclusion
from stable-v4.19.267
commit b5bfc61f541d3f092b13dedcfe000d86eb8e133c
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I63UEU
CVE: NA

--------------------------------

commit 19ba6c8a upstream.

The @ftrace_mod is allocated by kzalloc(), so both members {prev,next}
of @ftrace_mod->list are NULL, which is not a valid state in which to
call list_del(). If kstrdup() for @ftrace_mod->{func|module} fails, it
goes to the @out_free label and calls free_ftrace_mod() to destroy
@ftrace_mod; list_del() then writes prev->next and next->prev, where
the null pointer dereference happens.

BUG: kernel NULL pointer dereference, address: 0000000000000008
Oops: 0002 [#1] PREEMPT SMP NOPTI
Call Trace:
 <TASK>
 ftrace_mod_callback+0x20d/0x220
 ? do_filp_open+0xd9/0x140
 ftrace_process_regex.isra.51+0xbf/0x130
 ftrace_regex_write.isra.52.part.53+0x6e/0x90
 vfs_write+0xee/0x3a0
 ? __audit_filter_op+0xb1/0x100
 ? auditd_test_task+0x38/0x50
 ksys_write+0xa5/0xe0
 do_syscall_64+0x3a/0x90
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
Kernel panic - not syncing: Fatal exception

So call INIT_LIST_HEAD() to initialize the list member to fix this issue.
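
The effect of the initialization can be seen in a user-space sketch with a minimal re-implementation of the list primitives. `init_list_head` and `list_del_sketch` imitate the kernel helpers, and `struct mod_entry` with its alloc/free functions is a hypothetical stand-in for the ftrace structure.

```c
#include <stdlib.h>

struct list_head {
	struct list_head *next, *prev;
};

/* Make the head point at itself, the valid "not on any list" state. */
static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* Unlink an entry; dereferences both neighbours, so they must be valid. */
static void list_del_sketch(struct list_head *e)
{
	e->next->prev = e->prev;
	e->prev->next = e->next;
}

struct mod_entry {
	struct list_head list;
	char *func;
};

/* Allocate a zeroed entry and initialize its list member. */
static struct mod_entry *mod_entry_alloc(void)
{
	struct mod_entry *m = calloc(1, sizeof(*m));

	if (m)
		init_list_head(&m->list);	/* otherwise next/prev stay NULL */
	return m;
}

/* Safe even if the entry was never added to any list. */
static void mod_entry_free(struct mod_entry *m)
{
	list_del_sketch(&m->list);
	free(m->func);
	free(m);
}
```

Without the `init_list_head()` call, `list_del_sketch()` on a freshly zeroed entry would write through NULL pointers, which is the failure shown in the trace.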

Link: https://lkml.kernel.org/r/20221116015207.30858-1-xiujianfeng@huawei.com

Cc: stable@vger.kernel.org
Fixes: 673feb9d ("ftrace: Add :mod: caching infrastructure to trace_array")
Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

33ffa638

ftrace: Optimize the allocation for mcount entries · 3d193fb1

Committed by Wang Wensheng on Dec 08, 2022

stable inclusion
from stable-v4.19.267
commit d110bb57a7e9831465aa3abb6c0d1cc658b05fbe
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I63UEU
CVE: NA

--------------------------------

commit bcea02b0 upstream.

If we can't allocate this size, try something smaller with half of the
size. Its order should be decreased by one instead of divided by two.
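
The arithmetic behind this can be sketched as follows. A page allocation of order `n` covers `2^n` pages, so halving the size means subtracting one from the order. The helper names are made up for illustration and are not taken from the patch.

```c
/* Number of pages covered by an allocation of the given order (2^order). */
static unsigned long pages_for_order(unsigned int order)
{
    return 1UL << order;
}

/* Old fallback: halves the order itself, which shrinks the size far too much. */
static unsigned int next_order_halved(unsigned int order)
{
    return order >> 1;
}

/* Intended fallback: one order lower is exactly half the size. */
static unsigned int next_order_decremented(unsigned int order)
{
    return order - 1;
}
```

For an order-6 request (64 pages), decrementing retries with 32 pages, while halving the order drops straight to order 3, which is only 8 pages.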

Link: https://lkml.kernel.org/r/20221109094434.84046-3-wangwensheng4@huawei.com

Cc: <mhiramat@kernel.org>
Cc: <mark.rutland@arm.com>
Cc: stable@vger.kernel.org
Fixes: a7900875 ("ftrace: Allocate the mcount record pages as groups")
Signed-off-by: Wang Wensheng <wangwensheng4@huawei.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

3d193fb1

kprobe: reverse kp->flags when arm_kprobe failed · 04150923

Committed by Li Qiang on Dec 08, 2022

stable inclusion
from stable-v4.19.265
commit d608ed66abfaccc233404be2583ab89c37e560fc
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I63UEU
CVE: NA

--------------------------------

commit 4a6f316d upstream.

In the aggregate kprobe case, when arm_kprobe() fails, we need to set
KPROBE_FLAG_DISABLED in kp->flags again. Otherwise, the 'kp' kprobe will
be considered enabled even though it is not actually enabled.
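
A rough userspace sketch of the rollback described above follows. The `probe` struct, the flag value and the `arm_*` callbacks are hypothetical and only model the idea that both the aggregate probe and the child probe must be marked disabled again when arming fails; this is not the kernel's enable_kprobe() code.

```c
#define PROBE_FLAG_DISABLED 0x2u /* hypothetical flag bit for this sketch */

struct probe {
    unsigned int flags;
};

typedef int (*arm_fn)(struct probe *);

/* Stand-ins for an arming step that succeeds or fails. */
static int arm_ok(struct probe *p)   { (void)p; return 0; }
static int arm_fail(struct probe *p) { (void)p; return -1; }

/* Enable child probe kp, which is serviced by the aggregate probe aggr. */
static int enable_probe(struct probe *kp, struct probe *aggr, arm_fn arm)
{
    int ret;

    kp->flags &= ~PROBE_FLAG_DISABLED;
    aggr->flags &= ~PROBE_FLAG_DISABLED;

    ret = arm(aggr);
    if (ret) {
        /* Arming failed: restore the disabled state on both probes. */
        aggr->flags |= PROBE_FLAG_DISABLED;
        if (aggr != kp)
            kp->flags |= PROBE_FLAG_DISABLED;
    }
    return ret;
}
```

Without the `aggr != kp` branch, a failed arm would leave the child probe looking enabled, which is the inconsistency the commit message describes.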

Link: https://lore.kernel.org/all/20220902155820.34755-1-liq3ea@163.com/

Fixes: 12310e34 ("kprobes: Propagate error from arm_kprobe_ftrace()")
Cc: stable@vger.kernel.org
Signed-off-by: Li Qiang <liq3ea@163.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

04150923

mm: fs: initialize fsdata passed to write_begin/write_end interface · c01f46a9

Committed by Alexander Potapenko on Dec 08, 2022

stable inclusion
from stable-v4.19.267
commit 8a5be2948f350d34b1f6acb9ca3be4c89359a057
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I63UEU
CVE: NA

--------------------------------

commit 1468c6f4 upstream.

Functions implementing the a_ops->write_end() interface accept the `void
*fsdata` parameter that is supposed to be initialized by the corresponding
a_ops->write_begin() (which accepts `void **fsdata`).

However not all a_ops->write_begin() implementations initialize `fsdata`
unconditionally, so it may get passed uninitialized to a_ops->write_end(),
resulting in undefined behavior.

Fix this by initializing fsdata with NULL before the call to
write_begin(), rather than doing so in all possible a_ops implementations.
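
The pattern can be sketched in plain C. The `demo_*` functions are hypothetical and only mimic the shape of the write_begin()/write_end() pair: the callee sets `*fsdata` on one path only, so the caller initializes the pointer to NULL before the call.

```c
#include <stddef.h>

/* Hypothetical write_begin(): sets *fsdata only when private state is needed. */
static int demo_write_begin(int need_private, void **fsdata)
{
    static int private_state;

    if (need_private)
        *fsdata = &private_state;
    return 0;
}

/* Hypothetical write_end(): reports whether it received private state. */
static int demo_write_end(void *fsdata)
{
    return fsdata != NULL;
}

/* Caller in the style of the fix: fsdata starts out as NULL. */
static int demo_perform_write(int need_private)
{
    void *fsdata = NULL; /* defined value even if write_begin skips it */

    if (demo_write_begin(need_private, &fsdata))
        return -1;
    return demo_write_end(fsdata);
}
```

Initializing in the caller means every write_end() implementation sees either NULL or a value its own write_begin() stored, without having to audit each implementation separately.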

This patch covers only the following cases found by running x86 KMSAN
under syzkaller:

 - generic_perform_write()
 - cont_expand_zero() and generic_cont_expand_simple()
 - page_symlink()

Other cases of passing uninitialized fsdata may persist in the codebase.

Link: https://lkml.kernel.org/r/20220915150417.722975-43-glider@google.com
Signed-off-by: Alexander Potapenko <glider@google.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Eric Biggers <ebiggers@google.com>
Cc: Eric Biggers <ebiggers@kernel.org>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Ilya Leoshkevich <iii@linux.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Marco Elver <elver@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

c01f46a9

nfs4: Fix kmemleak when allocate slot failed · 920f74ac

Committed by Zhang Xiaoxu on Dec 08, 2022

stable inclusion
from stable-v4.19.265
commit 86ce0e93cf6fb4d0c447323ac66577c642628b9d
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I63UEU
CVE: NA

--------------------------------

[ Upstream commit 7e843672 ]

If allocating one of the slots fails, all the other allocated slots
should be cleaned up; otherwise, they will leak:

  unreferenced object 0xffff8881115aa100 (size 64):
    comm "mount.nfs", pid 679, jiffies 4294744957 (age 115.037s)
    hex dump (first 32 bytes):
      00 cc 19 73 81 88 ff ff 00 a0 5a 11 81 88 ff ff  ...s......Z.....
      00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    backtrace:
      [<000000007a4c434a>] nfs4_find_or_create_slot+0x8e/0x130
      [<000000005472a39c>] nfs4_realloc_slot_table+0x23f/0x270
      [<00000000cd8ca0eb>] nfs40_init_client+0x4a/0x90
      [<00000000128486db>] nfs4_init_client+0xce/0x270
      [<000000008d2cacad>] nfs4_set_client+0x1a2/0x2b0
      [<000000000e593b52>] nfs4_create_server+0x300/0x5f0
      [<00000000e4425dd2>] nfs4_try_get_tree+0x65/0x110
      [<00000000d3a6176f>] vfs_get_tree+0x41/0xf0
      [<0000000016b5ad4c>] path_mount+0x9b3/0xdd0
      [<00000000494cae71>] __x64_sys_mount+0x190/0x1d0
      [<000000005d56bdec>] do_syscall_64+0x35/0x80
      [<00000000687c9ae4>] entry_SYSCALL_64_after_hwframe+0x46/0xb0

Fixes: abf79bb3 ("NFS: Add a slot table to struct nfs_client for NFSv4.0 transport blocking")
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

920f74ac

kernfs: fix use-after-free in __kernfs_remove · a1a691b4

Committed by Christian A. Ehrhardt on Dec 08, 2022

stable inclusion
from stable-v4.19.264
commit 028cf780743eea79abffa7206b9dcfc080ad3546
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I63UEU
CVE: NA

--------------------------------

commit 4abc9965 upstream.

Syzkaller managed to trigger concurrent calls to
kernfs_remove_by_name_ns() for the same file resulting in
a KASAN detected use-after-free. The race occurs when the root
node is freed during kernfs_drain().

To prevent this acquire an additional reference for the root
of the tree that is removed before calling __kernfs_remove().

Found by syzkaller with the following reproducer (slab_nomerge is
required):

syz_mount_image$ext4(0x0, &(0x7f0000000100)='./file0\x00', 0x100000, 0x0, 0x0, 0x0, 0x0)
r0 = openat(0xffffffffffffff9c, &(0x7f0000000080)='/proc/self/exe\x00', 0x0, 0x0)
close(r0)
pipe2(&(0x7f0000000140)={0xffffffffffffffff, <r1=>0xffffffffffffffff}, 0x800)
mount$9p_fd(0x0, &(0x7f0000000040)='./file0\x00', &(0x7f00000000c0), 0x408, &(0x7f0000000280)={'trans=fd,', {'rfdno', 0x3d, r0}, 0x2c, {'wfdno', 0x3d, r1}, 0x2c, {[{@cache_loose}, {@mmap}, {@loose}, {@loose}, {@mmap}], [{@mask={'mask', 0x3d, '^MAY_EXEC'}}, {@fsmagic={'fsmagic', 0x3d, 0x10001}}, {@dont_hash}]}})

Sample report:

==================================================================
BUG: KASAN: use-after-free in kernfs_type include/linux/kernfs.h:335 [inline]
BUG: KASAN: use-after-free in kernfs_leftmost_descendant fs/kernfs/dir.c:1261 [inline]
BUG: KASAN: use-after-free in __kernfs_remove.part.0+0x843/0x960 fs/kernfs/dir.c:1369
Read of size 2 at addr ffff8880088807f0 by task syz-executor.2/857

CPU: 0 PID: 857 Comm: syz-executor.2 Not tainted 6.0.0-rc3-00363-g7726d4c3 #5
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0x6e/0x91 lib/dump_stack.c:106
 print_address_description mm/kasan/report.c:317 [inline]
 print_report.cold+0x5e/0x5e5 mm/kasan/report.c:433
 kasan_report+0xa3/0x130 mm/kasan/report.c:495
 kernfs_type include/linux/kernfs.h:335 [inline]
 kernfs_leftmost_descendant fs/kernfs/dir.c:1261 [inline]
 __kernfs_remove.part.0+0x843/0x960 fs/kernfs/dir.c:1369
 __kernfs_remove fs/kernfs/dir.c:1356 [inline]
 kernfs_remove_by_name_ns+0x108/0x190 fs/kernfs/dir.c:1589
 sysfs_slab_add+0x133/0x1e0 mm/slub.c:5943
 __kmem_cache_create+0x3e0/0x550 mm/slub.c:4899
 create_cache mm/slab_common.c:229 [inline]
 kmem_cache_create_usercopy+0x167/0x2a0 mm/slab_common.c:335
 p9_client_create+0xd4d/0x1190 net/9p/client.c:993
 v9fs_session_init+0x1e6/0x13c0 fs/9p/v9fs.c:408
 v9fs_mount+0xb9/0xbd0 fs/9p/vfs_super.c:126
 legacy_get_tree+0xf1/0x200 fs/fs_context.c:610
 vfs_get_tree+0x85/0x2e0 fs/super.c:1530
 do_new_mount fs/namespace.c:3040 [inline]
 path_mount+0x675/0x1d00 fs/namespace.c:3370
 do_mount fs/namespace.c:3383 [inline]
 __do_sys_mount fs/namespace.c:3591 [inline]
 __se_sys_mount fs/namespace.c:3568 [inline]
 __x64_sys_mount+0x282/0x300 fs/namespace.c:3568
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x38/0x90 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f725f983aed
Code: 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f725f0f7028 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
RAX: ffffffffffffffda RBX: 00007f725faa3f80 RCX: 00007f725f983aed
RDX: 00000000200000c0 RSI: 0000000020000040 RDI: 0000000000000000
RBP: 00007f725f9f419c R08: 0000000020000280 R09: 0000000000000000
R10: 0000000000000408 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000006 R14: 00007f725faa3f80 R15: 00007f725f0d7000
 </TASK>

Allocated by task 855:
 kasan_save_stack+0x1e/0x40 mm/kasan/common.c:38
 kasan_set_track mm/kasan/common.c:45 [inline]
 set_alloc_info mm/kasan/common.c:437 [inline]
 __kasan_slab_alloc+0x66/0x80 mm/kasan/common.c:470
 kasan_slab_alloc include/linux/kasan.h:224 [inline]
 slab_post_alloc_hook mm/slab.h:727 [inline]
 slab_alloc_node mm/slub.c:3243 [inline]
 slab_alloc mm/slub.c:3251 [inline]
 __kmem_cache_alloc_lru mm/slub.c:3258 [inline]
 kmem_cache_alloc+0xbf/0x200 mm/slub.c:3268
 kmem_cache_zalloc include/linux/slab.h:723 [inline]
 __kernfs_new_node+0xd4/0x680 fs/kernfs/dir.c:593
 kernfs_new_node fs/kernfs/dir.c:655 [inline]
 kernfs_create_dir_ns+0x9c/0x220 fs/kernfs/dir.c:1010
 sysfs_create_dir_ns+0x127/0x290 fs/sysfs/dir.c:59
 create_dir lib/kobject.c:63 [inline]
 kobject_add_internal+0x24a/0x8d0 lib/kobject.c:223
 kobject_add_varg lib/kobject.c:358 [inline]
 kobject_init_and_add+0x101/0x160 lib/kobject.c:441
 sysfs_slab_add+0x156/0x1e0 mm/slub.c:5954
 __kmem_cache_create+0x3e0/0x550 mm/slub.c:4899
 create_cache mm/slab_common.c:229 [inline]
 kmem_cache_create_usercopy+0x167/0x2a0 mm/slab_common.c:335
 p9_client_create+0xd4d/0x1190 net/9p/client.c:993
 v9fs_session_init+0x1e6/0x13c0 fs/9p/v9fs.c:408
 v9fs_mount+0xb9/0xbd0 fs/9p/vfs_super.c:126
 legacy_get_tree+0xf1/0x200 fs/fs_context.c:610
 vfs_get_tree+0x85/0x2e0 fs/super.c:1530
 do_new_mount fs/namespace.c:3040 [inline]
 path_mount+0x675/0x1d00 fs/namespace.c:3370
 do_mount fs/namespace.c:3383 [inline]
 __do_sys_mount fs/namespace.c:3591 [inline]
 __se_sys_mount fs/namespace.c:3568 [inline]
 __x64_sys_mount+0x282/0x300 fs/namespace.c:3568
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x38/0x90 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

Freed by task 857:
 kasan_save_stack+0x1e/0x40 mm/kasan/common.c:38
 kasan_set_track+0x21/0x30 mm/kasan/common.c:45
 kasan_set_free_info+0x20/0x40 mm/kasan/generic.c:370
 ____kasan_slab_free mm/kasan/common.c:367 [inline]
 ____kasan_slab_free mm/kasan/common.c:329 [inline]
 __kasan_slab_free+0x108/0x190 mm/kasan/common.c:375
 kasan_slab_free include/linux/kasan.h:200 [inline]
 slab_free_hook mm/slub.c:1754 [inline]
 slab_free_freelist_hook mm/slub.c:1780 [inline]
 slab_free mm/slub.c:3534 [inline]
 kmem_cache_free+0x9c/0x340 mm/slub.c:3551
 kernfs_put.part.0+0x2b2/0x520 fs/kernfs/dir.c:547
 kernfs_put+0x42/0x50 fs/kernfs/dir.c:521
 __kernfs_remove.part.0+0x72d/0x960 fs/kernfs/dir.c:1407
 __kernfs_remove fs/kernfs/dir.c:1356 [inline]
 kernfs_remove_by_name_ns+0x108/0x190 fs/kernfs/dir.c:1589
 sysfs_slab_add+0x133/0x1e0 mm/slub.c:5943
 __kmem_cache_create+0x3e0/0x550 mm/slub.c:4899
 create_cache mm/slab_common.c:229 [inline]
 kmem_cache_create_usercopy+0x167/0x2a0 mm/slab_common.c:335
 p9_client_create+0xd4d/0x1190 net/9p/client.c:993
 v9fs_session_init+0x1e6/0x13c0 fs/9p/v9fs.c:408
 v9fs_mount+0xb9/0xbd0 fs/9p/vfs_super.c:126
 legacy_get_tree+0xf1/0x200 fs/fs_context.c:610
 vfs_get_tree+0x85/0x2e0 fs/super.c:1530
 do_new_mount fs/namespace.c:3040 [inline]
 path_mount+0x675/0x1d00 fs/namespace.c:3370
 do_mount fs/namespace.c:3383 [inline]
 __do_sys_mount fs/namespace.c:3591 [inline]
 __se_sys_mount fs/namespace.c:3568 [inline]
 __x64_sys_mount+0x282/0x300 fs/namespace.c:3568
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x38/0x90 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

The buggy address belongs to the object at ffff888008880780
 which belongs to the cache kernfs_node_cache of size 128
The buggy address is located 112 bytes inside of
 128-byte region [ffff888008880780, ffff888008880800)

The buggy address belongs to the physical page:
page:00000000732833f8 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8880
flags: 0x100000000000200(slab|node=0|zone=1)
raw: 0100000000000200 0000000000000000 dead000000000122 ffff888001147280
raw: 0000000000000000 0000000000150015 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff888008880680: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb
 ffff888008880700: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
>ffff888008880780: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                                             ^
 ffff888008880800: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb
 ffff888008880880: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
==================================================================
Acked-by: Tejun Heo <tj@kernel.org>
Cc: stable <stable@kernel.org> # -rc3
Signed-off-by: Christian A. Ehrhardt <lk@c--e.de>
Link: https://lore.kernel.org/r/20220913121723.691454-1-lk@c--e.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

a1a691b4

mm,hugetlb: take hugetlb_lock before decrementing h->resv_huge_pages · 7a5b0955

Committed by Rik van Riel on Dec 08, 2022

stable inclusion
from stable-v4.19.264
commit 2b35432d324898ec41beb27031d2a1a864a4d40e
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I63UEU
CVE: NA

--------------------------------

commit 12df140f upstream.

The h->*_huge_pages counters are protected by the hugetlb_lock, but
alloc_huge_page has a corner case where it can decrement the counter
outside of the lock.

This could lead to a corrupted value of h->resv_huge_pages, which we have
observed on our systems.

Take the hugetlb_lock before decrementing h->resv_huge_pages to avoid a
potential race.

Link: https://lkml.kernel.org/r/20221017202505.0e6a4fcd@imladris.surriel.com
Fixes: a88c7695 ("mm: hugetlb: fix hugepage memory leak caused by wrong reserve count")
Signed-off-by: Rik van Riel <riel@surriel.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Glen McCready <gkmccready@meta.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

7a5b0955

mm: /proc/pid/smaps_rollup: fix no vma's null-deref · 33213b46

Committed by Seth Jenkins on Dec 08, 2022

stable inclusion
from stable-v4.19.264
commit dbe863bce7679c7f5ec0e993d834fe16c5e687b5
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I63UEU
CVE: NA

--------------------------------

Commit 258f669e ("mm: /proc/pid/smaps_rollup: convert to single value
seq_file") introduced a null-deref if there are no vma's in the task in
show_smaps_rollup.

Fixes: 258f669e ("mm: /proc/pid/smaps_rollup: convert to single value seq_file")
Signed-off-by: Seth Jenkins <sethjenkins@google.com>
Reviewed-by: Alexey Dobriyan <adobriyan@gmail.com>
Tested-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

33213b46

signal handling: don't use BUG_ON() for debugging · a2f88993

Committed by Xia Fukun on Dec 08, 2022

stable inclusion
from stable-v4.19.267
commit 93d9cef55f8fe463e3b9f6c73c7a32619222c657
category: bugfix
bugzilla: 187828, https://gitee.com/openeuler/kernel/issues/I63UEU
CVE: NA

--------------------------------

[ Upstream commit a382f8fe ]

These are indeed "should not happen" situations, but it turns out recent
changes made the 'task_is_stopped_or_trace()' case trigger (fix for that
exists, is pending more testing), and the BUG_ON() makes it
unnecessarily hard to actually debug for no good reason.

It's been that way for a long time, but let's make it clear: BUG_ON() is
not good for debugging, and should never be used in situations where you
could just say "this shouldn't happen, but we can continue".

Use WARN_ON_ONCE() instead to make sure it gets logged, and then just
continue running.  Instead of making the system basically unusable
because you crashed the machine while potentially holding some very core
locks (eg this function is commonly called while holding 'tasklist_lock'
for writing).
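
The difference in behaviour can be sketched in userspace C. The helper below is a hypothetical imitation of the warn-once idea, not the kernel's WARN_ON_ONCE() macro: it logs the first time an unexpected condition is seen, and the caller then bails out of the operation instead of halting everything.

```c
#include <stdio.h>

/* Counts how many warnings were actually printed (for demonstration). */
static int warn_count;

/* Log the first occurrence per call site; always return the condition. */
static int warn_on_once(int cond, int *warned, const char *what)
{
    if (cond && !*warned) {
        *warned = 1;
        warn_count++;
        fprintf(stderr, "WARNING: %s\n", what);
    }
    return cond;
}

/* Hypothetical handler: complains once, then skips the work and returns. */
static int handle_event(int state_is_unexpected)
{
    static int warned; /* one flag per call site */

    if (warn_on_once(state_is_unexpected, &warned, "unexpected task state"))
        return -1; /* bail out of this operation; the program keeps running */
    return 0;
}
```

Repeated hits on the same bad state produce a single log line, and the caller still gets an error return it can act on.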
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Xia Fukun <xiafukun@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

a2f88993

ida: don't use BUG_ON() for debugging · eaab6483

Committed by Xia Fukun on Dec 08, 2022

stable inclusion
from stable-v4.19.267
commit 33d2f83e3f2c1fdabb365d25bed3aa630041cbc0
category: bugfix
bugzilla: 188002, https://gitee.com/openeuler/kernel/issues/I63UEU
CVE: NA

--------------------------------

commit fc82bbf4 upstream.

This is another old BUG_ON() that just shouldn't exist (see also commit
a382f8fe: "signal handling: don't use BUG_ON() for debugging").

In fact, as Matthew Wilcox points out, this condition shouldn't really
even result in a warning, since a negative id allocation result is just
a normal allocation failure:

  "I wonder if we should even warn here -- sure, the caller is trying to
   free something that wasn't allocated, but we don't warn for
   kfree(NULL)"

and goes on to point out how that current error check is only causing
people to unnecessarily do their own index range checking before freeing
it.

This was noted by Itay Iellin, because the bluetooth HCI socket cookie
code does *not* do that range checking, and ends up just freeing the
error case too, triggering the BUG_ON().

The HCI code requires CAP_NET_RAW, and seems to just result in an ugly
splat, but there really is no reason to BUG_ON() here, and we have
generally striven for allocation models where it's always ok to just do

    free(alloc());

even if the allocation were to fail for some random reason (usually
obviously that "random" reason being some resource limit).
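
The `free(alloc())` model can be sketched with a toy ID allocator. The `demo_id_*` names and the fixed table size are assumptions made for this example; the point is only that freeing a failed (negative) allocation result is a harmless no-op.

```c
#define DEMO_MAX_IDS 4 /* tiny table so exhaustion is easy to reach */

static unsigned char demo_used[DEMO_MAX_IDS];

/* Returns the lowest free ID, or -1 when the table is exhausted. */
static int demo_id_alloc(void)
{
    for (int i = 0; i < DEMO_MAX_IDS; i++) {
        if (!demo_used[i]) {
            demo_used[i] = 1;
            return i;
        }
    }
    return -1;
}

/* Freeing an error value or out-of-range ID is silently ignored. */
static void demo_id_free(int id)
{
    if (id < 0 || id >= DEMO_MAX_IDS)
        return;
    demo_used[id] = 0;
}
```

With this shape, callers can write `demo_id_free(demo_id_alloc());` without their own range check, in the same way that `kfree(NULL)` is allowed.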

Fixes: 88eca020 ("ida: simplified functions for id allocation")
Reported-by: Itay Iellin <ieitayie@gmail.com>
Suggested-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Xia Fukun <xiafukun@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

eaab6483

06 Dec, 2022 1 commit

!272 [openEuler-1.0-LTS] Add MWAIT Cx support for Zhaoxin CPUs. · 75ea48ac

Committed by openeuler-ci-bot on Dec 06, 2022

Merge Pull Request from: @leoliu-oc 
 
When the processor is idle, low-power idle states (C-states) can be used to save power. For Zhaoxin processors, there are two methods to enter idle states. One is the HLT instruction together with the legacy method of I/O reads from the ACPI-defined register (known as P_LVLx); the other is the MWAIT instruction with idle-state hints.

By default, legacy operating systems use HLT and P_LVLx I/O reads to enter idle states on Zhaoxin processors. However, we have verified on some Zhaoxin platforms that the MWAIT instruction is more efficient than P_LVLx I/O reads and HLT, so we add MWAIT Cx support for Zhaoxin processors.

### Issue
https://gitee.com/openeuler/kernel/issues/I62TOM

### Test
N/A

### Known Issue
N/A

### Default config change
N/A

 
 
Link: https://gitee.com/openeuler/kernel/pulls/272
Reviewed-by: Laibin Qiu <qiulaibin@huawei.com> 
Signed-off-by: Xie XiuQi <xiexiuqi@huawei.com>

75ea48ac

05 Dec, 2022 3 commits

Bluetooth: L2CAP: Fix u8 overflow · dde9a0f9

Committed by Sungwoo Kim on Dec 05, 2022

maillist inclusion
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I63D3E
CVE: CVE-2022-45934

--------------------------------

When L2CAP_CONF_REQ packets are sent repeatedly, chan->num_conf_rsp is
incremented each time and eventually wraps around its maximum value
(i.e., 255).
This patch prevents this by adding a boundary check with
L2CAP_MAX_CONF_RSP.
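
The wraparound and the bounded increment can be sketched as follows. `DEMO_MAX_CONF_RSP` and the function names are hypothetical and chosen for illustration; the value of the real limit is not taken from the patch.

```c
#include <stdint.h>

#define DEMO_MAX_CONF_RSP 2 /* assumed bound for this sketch */

/* Unchecked counter: an 8-bit value wraps back to 0 after 255. */
static uint8_t count_unchecked(unsigned int requests)
{
    uint8_t n = 0;

    for (unsigned int i = 0; i < requests; i++)
        n++;
    return n;
}

/* Bounded counter: stops incrementing once the limit is reached. */
static uint8_t count_bounded(unsigned int requests)
{
    uint8_t n = 0;

    for (unsigned int i = 0; i < requests; i++) {
        if (n < DEMO_MAX_CONF_RSP)
            n++;
    }
    return n;
}
```

After 300 requests the unchecked counter holds 44 (300 modulo 256), while the bounded counter stays at the limit no matter how many requests arrive.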

Btmon log:
Bluetooth monitor ver 5.64
= Note: Linux version 6.1.0-rc2 (x86_64)                               0.264594
= Note: Bluetooth subsystem version 2.22                               0.264636
@ MGMT Open: btmon (privileged) version 1.22                  {0x0001} 0.272191
= New Index: 00:00:00:00:00:00 (Primary,Virtual,hci0)          [hci0] 13.877604
@ RAW Open: 9496 (privileged) version 2.22                   {0x0002} 13.890741
= Open Index: 00:00:00:00:00:00                                [hci0] 13.900426
(...)
> ACL Data RX: Handle 200 flags 0x00 dlen 1033             #32 [hci0] 14.273106
        invalid packet size (12 != 1033)
        08 00 01 00 02 01 04 00 01 10 ff ff              ............
> ACL Data RX: Handle 200 flags 0x00 dlen 1547             #33 [hci0] 14.273561
        invalid packet size (14 != 1547)
        0a 00 01 00 04 01 06 00 40 00 00 00 00 00        ........@.....
> ACL Data RX: Handle 200 flags 0x00 dlen 2061             #34 [hci0] 14.274390
        invalid packet size (16 != 2061)
        0c 00 01 00 04 01 08 00 40 00 00 00 00 00 00 04  ........@.......
> ACL Data RX: Handle 200 flags 0x00 dlen 2061             #35 [hci0] 14.274932
        invalid packet size (16 != 2061)
        0c 00 01 00 04 01 08 00 40 00 00 00 07 00 03 00  ........@.......
= bluetoothd: Bluetooth daemon 5.43                                   14.401828
> ACL Data RX: Handle 200 flags 0x00 dlen 1033             #36 [hci0] 14.275753
        invalid packet size (12 != 1033)
        08 00 01 00 04 01 04 00 40 00 00 00              ........@...
Signed-off-by: Sungwoo Kim <iam@sung-woo.kim>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Baisong Zhong <zhongbaisong@huawei.com>
Reviewed-by: Liu Jian <liujian56@huawei.com>
Reviewed-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

dde9a0f9

l2tp: Don't sleep and disable BH under writer-side sk_callback_lock · 08d3ac9e

Committed by Jakub Sitnicki on Dec 05, 2022

mainline inclusion
from mainline-v6.1-rc7
commit af295e85
category: bugfix
bugzilla: 188056, https://gitee.com/src-openeuler/kernel/issues/I62RNU
CVE: CVE-2022-4129

--------------------------------

When holding a reader-writer spin lock we cannot sleep. Calling
setup_udp_tunnel_sock() with write lock held violates this rule, because we
end up calling percpu_down_read(), which might sleep, as syzbot reports
[1]:

 __might_resched.cold+0x222/0x26b kernel/sched/core.c:9890
 percpu_down_read include/linux/percpu-rwsem.h:49 [inline]
 cpus_read_lock+0x1b/0x140 kernel/cpu.c:310
 static_key_slow_inc+0x12/0x20 kernel/jump_label.c:158
 udp_tunnel_encap_enable include/net/udp_tunnel.h:187 [inline]
 setup_udp_tunnel_sock+0x43d/0x550 net/ipv4/udp_tunnel_core.c:81
 l2tp_tunnel_register+0xc51/0x1210 net/l2tp/l2tp_core.c:1509
 pppol2tp_connect+0xcdc/0x1a10 net/l2tp/l2tp_ppp.c:723

Trim the writer-side critical section for sk_callback_lock down to the
minimum, so that it covers only operations on sk_user_data.

Also, when grabbing the sk_callback_lock, we always need to disable BH, as
Eric points out. Failing to do so leads to deadlocks because we acquire
sk_callback_lock in softirq context, which can get stuck waiting on us if:

1) it runs on the same CPU, or

       CPU0
       ----
  lock(clock-AF_INET6);
  <Interrupt>
    lock(clock-AF_INET6);

2) lock ordering leads to priority inversion

       CPU0                    CPU1
       ----                    ----
  lock(clock-AF_INET6);
                               local_irq_disable();
                               lock(&tcp_hashinfo.bhash[i].lock);
                               lock(clock-AF_INET6);
  <Interrupt>
    lock(&tcp_hashinfo.bhash[i].lock);

... as syzbot reports [2,3]. Use the _bh variants for write_(un)lock.

[1] https://lore.kernel.org/netdev/0000000000004e78ec05eda79749@google.com/
[2] https://lore.kernel.org/netdev/000000000000e38b6605eda76f98@google.com/
[3] https://lore.kernel.org/netdev/000000000000dfa31e05eda76f75@google.com/

v2:
- Check and set sk_user_data while holding sk_callback_lock for both
  L2TP encapsulation types (IP and UDP) (Tetsuo)

Cc: Tom Parkin <tparkin@katalix.com>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Fixes: b68777d5 ("l2tp: Serialize access to sk_user_data with sk_callback_lock")
Reported-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot+703d9e154b3b58277261@syzkaller.appspotmail.com
Reported-by: syzbot+50680ced9e98a61f7698@syzkaller.appspotmail.com
Reported-by: syzbot+de987172bb74a381879b@syzkaller.appspotmail.com
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Lu Wei <luwei32@huawei.com>
Reviewed-by: Liu Jian <liujian56@huawei.com>
Reviewed-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Wang Weiyang <wangweiyang2@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

08d3ac9e

l2tp: Serialize access to sk_user_data with sk_callback_lock · 9e00bd66

Committed by Jakub Sitnicki on Dec 05, 2022

mainline inclusion
from mainline-v6.1-rc6
commit b68777d5
category: bugfix
bugzilla: 188056, https://gitee.com/src-openeuler/kernel/issues/I62RNU
CVE: CVE-2022-4129

--------------------------------

sk->sk_user_data has multiple users, which are not compatible with each
other. Writers must synchronize by grabbing the sk->sk_callback_lock.

l2tp currently fails to grab the lock when modifying the underlying tunnel
socket fields. Fix it by adding appropriate locking.

We err on the side of safety and grab the sk_callback_lock also inside the
sk_destruct callback overridden by l2tp, even though there should be no
refs allowing access to the sock at the time when sk_destruct gets called.

v4:
- serialize write to sk_user_data in l2tp sk_destruct

v3:
- switch from sock lock to sk_callback_lock
- document write-protection for sk_user_data

v2:
- update Fixes to point to origin of the bug
- use real names in Reported/Tested-by tags

Cc: Tom Parkin <tparkin@katalix.com>
Fixes: 3557baab ("[L2TP]: PPP over L2TP driver core")
Reported-by: Haowei Yan <g1042620637@gmail.com>
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

conflict:
	net/l2tp/l2tp_core.c
Signed-off-by: Lu Wei <luwei32@huawei.com>
Reviewed-by: Liu Jian <liujian56@huawei.com>
Reviewed-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Wang Weiyang <wangweiyang2@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

9e00bd66

29 Nov, 2022 5 commits

!288 Add support for ConnectX6 Lx and ConnectX6Dx with openEuler inbox driver · 150b3f83

Committed by openeuler-ci-bot on Nov 29, 2022

Merge Pull Request from: @meitingli 
 
Add the upcoming ConnectX-6 Dx, ConnectX-6 LX device ID.
In addition, add "ConnectX Family mlx5Gen Virtual Function" device ID. 
Every new HCA VF will be identified with this device ID. 
Different VF models will be distinguished by their revision id.
 
 
Link: https://gitee.com/openeuler/kernel/pulls/288
Reviewed-by: Laibin Qiu <qiulaibin@huawei.com> 
Signed-off-by: Xie XiuQi <xiexiuqi@huawei.com>

150b3f83

net/mlx5: Update the list of the PCI supported devices · a42e2dc5

Committed by Shani Shapp on Nov 12, 2019

mainline inclusion
from mainline-v5.4
commit b7eca940
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I63MQP
CVE: NA

----------------------------------------------------------------------

Add the upcoming ConnectX-6 LX device ID.

Fixes: 85327a9c ("net/mlx5: Update the list of the PCI supported devices")
Signed-off-by: Shani Shapp <shanish@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Meiting Li <limeiting1@huawei.com>

a42e2dc5

net/mlx5: Update the list of the PCI supported devices · 4975dd57

Committed by Eran Ben Elisha on Jan 27, 2019

mainline inclusion
from mainline-v5.1-rc1
commit 85327a9c
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I63MQP
CVE: NA

----------------------------------------------------------------------

Add the upcoming ConnectX-6 Dx.

In addition, add "ConnectX Family mlx5Gen Virtual Function" device ID.
Every new HCA VF will be identified with this device ID. Different VF
models will be distinguished by their revision id.
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Meiting Li <limeiting1@huawei.com>

4975dd57

drivers: net: slip: fix NPD bug in sl_tx_timeout() · 599915d8

Committed by Duoming Zhou on Nov 29, 2022

stable inclusion
from stable-4.19.239
commit 753b9d220a7d36dac70e7c6d05492d10d6f9dd36
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I62KQZ
CVE: CVE-2022-41858

--------------------------------

[ Upstream commit ec4eb8a8 ]

When a slip driver is detaching, slip_close() cleans up the necessary
resources and sets sl->tty to NULL. Meanwhile, if the packet being
transmitted is blocked, sl_tx_timeout() will be called. Although
slip_close() and sl_tx_timeout() use sl->lock to synchronize, we do not
check whether sl->tty is NULL in sl_tx_timeout(), so a null pointer
dereference can happen.

   (Thread 1)                 |      (Thread 2)
                              | slip_close()
                              |   spin_lock_bh(&sl->lock)
                              |   ...
...                           |   sl->tty = NULL //(1)
sl_tx_timeout()               |   spin_unlock_bh(&sl->lock)
  spin_lock(&sl->lock);       |
  ...                         |   ...
  tty_chars_in_buffer(sl->tty)|
    if (tty->ops->..) //(2)   |
    ...                       |   synchronize_rcu()

We set sl->tty to NULL at position (1) and dereference sl->tty
at position (2).

This patch adds a check in sl_tx_timeout(). If sl->tty is
NULL, sl_tx_timeout() jumps to the out label.
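
The check described above can be illustrated with a minimal userspace sketch. It is not the kernel code: `struct tty_model`, `struct slip_model` and `tx_timeout_model()` are hypothetical stand-ins for the kernel structures and for sl_tx_timeout(), and the locking is only mentioned in a comment.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's tty and slip structures. */
struct tty_model {
    int chars_in_buffer;
};

struct slip_model {
    struct tty_model *tty;   /* set to NULL by the close path */
};

/*
 * Model of the timeout handler. In the kernel this body would run
 * with sl->lock held. Returns -1 if the tty is already gone,
 * otherwise the number of characters still queued.
 */
static int tx_timeout_model(const struct slip_model *sl)
{
    if (sl->tty == NULL)
        return -1;           /* bail out instead of dereferencing NULL */

    return sl->tty->chars_in_buffer;
}
```

Because the close path clears the pointer under the same lock, testing it inside the locked region is sufficient in this model; no additional synchronization is assumed.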
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Reviewed-by: Jiri Slaby <jirislaby@kernel.org>
Link: https://lore.kernel.org/r/20220405132206.55291-1-duoming@zju.edu.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Liu Jian <liujian56@huawei.com>
Reviewed-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

599915d8

staging: rtl8712: fix use after free bugs · 3e0deabf

Committed by Dan Carpenter on Nov 29, 2022

stable inclusion
from stable-v5.10.142
commit 19e3f69d19801940abc2ac37c169882769ed9770
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I62AX2
CVE: CVE-2022-4095

--------------------------------

_Read/Write_MACREG callbacks are NULL so the read/write_macreg_hdl()
functions don't do anything except free the "pcmd" pointer.  It
results in a use after free.  Delete them.

Fixes: 2865d42c ("staging: r8712u: Add the new driver to the mainline kernel")
Cc: stable <stable@kernel.org>
Reported-by: Zheng Wang <hackerzheng666@gmail.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/Yw4ASqkYcUhUfoY2@kili
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Guan Jing <guanjing6@huawei.com>
Reviewed-by: Zhang Qiao <zhangqiao22@huawei.com>
Reviewed-by: Chen Hui <judy.chenhui@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

3e0deabf

Nov 27, 2022 · 1 commit

x86/tsc: use topology_max_packages() in tsc watchdog check · 4f283abb

Committed by Feng Tang on Nov 27, 2022

hulk inclusion
category: bugfix
bugzilla: 187942, https://gitee.com/openeuler/kernel/issues/I5U037
CVE: NA

-------------------------------

Commit b50db709 ("x86/tsc: Disable clocksource watchdog for TSC
on qualified platorms") was introduced to solve the problem that
the TSC clocksource is sometimes wrongly judged as unstable by
watchdogs such as 'jiffies', HPET, etc.

In it, the hardware socket number is a key factor for judging
whether to disable the watchdog for TSC, and 'nr_online_nodes' was
chosen as an estimate because the value is needed in the early boot
phase, before the 'tsc-early' clocksource is registered, when none
of the non-boot CPUs have been brought up yet.

In a recent patch review, Dave Hansen pointed out that there are
many cases where 'nr_online_nodes' could be wrong, such as:
* numa emulation (numa=fake=4 etc.)
* numa=off
* platforms with CPU+DRAM nodes, CPU-less HBM nodes, CPU-less
  persistent memory nodes.

Peter Zijlstra suggested using logical package ids, but they are
only usable after smp_init(), once all CPUs are initialized.

One solution is to skip the watchdog for the 'tsc-early' clocksource
and move the check after smp_init() but before the 'tsc'
clocksource is registered, where topology_max_packages() can
be used as a much more accurate socket count.
Signed-off-by: Feng Tang <feng.tang@intel.com>

Conflict:
	arch/x86/kernel/tsc.c
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

4f283abb

Nov 26, 2022 · 2 commits

scsi: hisi_sas: Set iptt aborted flag when receiving an abnormal CQ · 4cccc16a

Committed by Xingui Yang on Nov 26, 2022

driver inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I62ZXO
CVE: NA

------------------------------------------------

During write I/O, when the SAS PHY switch is tested, the hardware
may report two CQs for one I/O. The first CQ indicates an invalid port
during DPH scheduling; the second CQ indicates that the response frame has
been written to memory but the I/O ended abnormally due to I/O data
underload. So set the iptt aborted flag when receiving an abnormal CQ; the
host will then discard the IPTT frame received from the SAS hard disk.
Signed-off-by: Xingui Yang <yangxingui@huawei.com>
Reviewed-by: kang fenglong <kangfenglong@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

4cccc16a

ext4: fix bug in extents parsing when eh_entries == 0 and eh_depth > 0 · bc9ebdce

Committed by Luís Henriques on Nov 26, 2022

mainline inclusion
from mainline-v6.0-rc7
commit 29a5b8a1
category: bugfix
bugzilla: 187444, https://gitee.com/openeuler/kernel/issues/I6261Z
CVE: NA

--------------------------------

When walking through an inode's extents, the ext4_ext_binsearch_idx() function
assumes that the extent header has been previously validated.  However, there
are no checks that verify that the number of entries (eh->eh_entries) is
non-zero when depth is > 0.  This leads to problems because
EXT_FIRST_INDEX() and EXT_LAST_INDEX() will return garbage and result in this:

[  135.245946] ------------[ cut here ]------------
[  135.247579] kernel BUG at fs/ext4/extents.c:2258!
[  135.249045] invalid opcode: 0000 [#1] PREEMPT SMP
[  135.250320] CPU: 2 PID: 238 Comm: tmp118 Not tainted 5.19.0-rc8+ #4
[  135.252067] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b-rebuilt.opensuse.org 04/01/2014
[  135.255065] RIP: 0010:ext4_ext_map_blocks+0xc20/0xcb0
[  135.256475] Code:
[  135.261433] RSP: 0018:ffffc900005939f8 EFLAGS: 00010246
[  135.262847] RAX: 0000000000000024 RBX: ffffc90000593b70 RCX: 0000000000000023
[  135.264765] RDX: ffff8880038e5f10 RSI: 0000000000000003 RDI: ffff8880046e922c
[  135.266670] RBP: ffff8880046e9348 R08: 0000000000000001 R09: ffff888002ca580c
[  135.268576] R10: 0000000000002602 R11: 0000000000000000 R12: 0000000000000024
[  135.270477] R13: 0000000000000000 R14: 0000000000000024 R15: 0000000000000000
[  135.272394] FS:  00007fdabdc56740(0000) GS:ffff88807dd00000(0000) knlGS:0000000000000000
[  135.274510] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  135.276075] CR2: 00007ffc26bd4f00 CR3: 0000000006261004 CR4: 0000000000170ea0
[  135.277952] Call Trace:
[  135.278635]  <TASK>
[  135.279247]  ? preempt_count_add+0x6d/0xa0
[  135.280358]  ? percpu_counter_add_batch+0x55/0xb0
[  135.281612]  ? _raw_read_unlock+0x18/0x30
[  135.282704]  ext4_map_blocks+0x294/0x5a0
[  135.283745]  ? xa_load+0x6f/0xa0
[  135.284562]  ext4_mpage_readpages+0x3d6/0x770
[  135.285646]  read_pages+0x67/0x1d0
[  135.286492]  ? folio_add_lru+0x51/0x80
[  135.287441]  page_cache_ra_unbounded+0x124/0x170
[  135.288510]  filemap_get_pages+0x23d/0x5a0
[  135.289457]  ? path_openat+0xa72/0xdd0
[  135.290332]  filemap_read+0xbf/0x300
[  135.291158]  ? _raw_spin_lock_irqsave+0x17/0x40
[  135.292192]  new_sync_read+0x103/0x170
[  135.293014]  vfs_read+0x15d/0x180
[  135.293745]  ksys_read+0xa1/0xe0
[  135.294461]  do_syscall_64+0x3c/0x80
[  135.295284]  entry_SYSCALL_64_after_hwframe+0x46/0xb0

This patch simply adds an extra check in __ext4_ext_check(), verifying that
eh_entries is not 0 when eh_depth is > 0.
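
The added condition can be sketched as a standalone predicate. This is an illustration only, not the code in __ext4_ext_check(): `struct extent_header_model` and `header_entries_ok()` are hypothetical names, the fields are plain host-endian integers rather than the on-disk little-endian format, and the other header checks are omitted.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified model of an extent header. */
struct extent_header_model {
    uint16_t eh_entries;   /* number of valid entries in this node */
    uint16_t eh_depth;     /* 0 for a leaf, > 0 for an index node */
};

/*
 * An index node (depth > 0) with zero entries has no child to
 * descend into, so it is treated as corrupted. An empty leaf
 * (depth == 0) is still accepted.
 */
static bool header_entries_ok(const struct extent_header_model *eh)
{
    if (eh->eh_entries == 0 && eh->eh_depth > 0)
        return false;

    return true;
}
```

Rejecting the header at validation time means a later binary search over the index entries never starts from an empty range.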

Link: https://bugzilla.kernel.org/show_bug.cgi?id=215941
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216283
Cc: Baokun Li <libaokun1@huawei.com>
Cc: stable@kernel.org
Signed-off-by: Luís Henriques <lhenriques@suse.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Baokun Li <libaokun1@huawei.com>
Link: https://lore.kernel.org/r/20220822094235.2690-1-lhenriques@suse.de
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

bc9ebdce
