- 07 7月, 2023 40 次提交
-
-
由 Pratyush Yadav 提交于
stable inclusion from stable-v4.19.284 commit 779332447108545ef04682ea29af5f85c0202aee category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I7J5UF CVE: NA -------------------------------- commit 8a02fb71 upstream. Commit 50749f2d ("tcp/udp: Fix memleaks of sk and zerocopy skbs with TX timestamp.") added a call to skb_orphan_frags_rx() to fix leaks with zerocopy skbs. But it ended up adding a leak of its own. When skb_orphan_frags_rx() fails, the function just returns, leaking the skb it just cloned. Free it before returning. This bug was discovered and resolved using Coverity Static Analysis Security Testing (SAST) by Synopsys, Inc. Fixes: 50749f2d ("tcp/udp: Fix memleaks of sk and zerocopy skbs with TX timestamp.") Signed-off-by: NPratyush Yadav <ptyadav@amazon.de> Reviewed-by: NKuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: NWillem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20230522153020.32422-1-ptyadav@amazon.deSigned-off-by: NJakub Kicinski <kuba@kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Kuniyuki Iwashima 提交于
stable inclusion from stable-v4.19.284 commit cc56de054d828935aa37734b479f82fa34b5f9bd category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I7J5UF CVE: NA -------------------------------- commit ad42a35b upstream. syzbot reported [0] a null-ptr-deref in sk_get_rmem0() while using IPPROTO_UDPLITE (0x88): 14:25:52 executing program 1: r0 = socket$inet6(0xa, 0x80002, 0x88) We had a similar report [1] for probably sk_memory_allocated_add() in __sk_mem_raise_allocated(), and commit c915fe13 ("udplite: fix NULL pointer dereference") fixed it by setting .memory_allocated for udplite_prot and udplitev6_prot. To fix the variant, we need to set either .sysctl_wmem_offset or .sysctl_rmem. Now UDP and UDPLITE share the same value for .memory_allocated, so we use the same .sysctl_wmem_offset for UDP and UDPLITE. [0]: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007] CPU: 0 PID: 6829 Comm: syz-executor.1 Not tainted 6.4.0-rc2-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/28/2023 RIP: 0010:sk_get_rmem0 include/net/sock.h:2907 [inline] RIP: 0010:__sk_mem_raise_allocated+0x806/0x17a0 net/core/sock.c:3006 Code: c1 ea 03 80 3c 02 00 0f 85 23 0f 00 00 48 8b 44 24 08 48 8b 98 38 01 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 da 48 c1 ea 03 <0f> b6 14 02 48 89 d8 83 e0 07 83 c0 03 38 d0 0f 8d 6f 0a 00 00 8b RSP: 0018:ffffc90005d7f450 EFLAGS: 00010246 RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffc90004d92000 RDX: 0000000000000000 RSI: ffffffff88066482 RDI: ffffffff8e2ccbb8 RBP: ffff8880173f7000 R08: 0000000000000005 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000030000 R13: 0000000000000001 R14: 0000000000000340 R15: 0000000000000001 FS: 0000000000000000(0000) GS:ffff8880b9800000(0063) knlGS:00000000f7f1cb40 CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033 CR2: 000000002e82f000 CR3: 0000000034ff0000 CR4: 00000000003506f0 Call Trace: <TASK> __sk_mem_schedule+0x6c/0xe0 net/core/sock.c:3077 udp_rmem_schedule net/ipv4/udp.c:1539 [inline] __udp_enqueue_schedule_skb+0x776/0xb30 net/ipv4/udp.c:1581 __udpv6_queue_rcv_skb net/ipv6/udp.c:666 [inline] udpv6_queue_rcv_one_skb+0xc39/0x16c0 net/ipv6/udp.c:775 udpv6_queue_rcv_skb+0x194/0xa10 net/ipv6/udp.c:793 __udp6_lib_mcast_deliver net/ipv6/udp.c:906 [inline] __udp6_lib_rcv+0x1bda/0x2bd0 net/ipv6/udp.c:1013 ip6_protocol_deliver_rcu+0x2e7/0x1250 net/ipv6/ip6_input.c:437 ip6_input_finish+0x150/0x2f0 net/ipv6/ip6_input.c:482 NF_HOOK include/linux/netfilter.h:303 [inline] NF_HOOK include/linux/netfilter.h:297 [inline] ip6_input+0xa0/0xd0 net/ipv6/ip6_input.c:491 ip6_mc_input+0x40b/0xf50 net/ipv6/ip6_input.c:585 dst_input include/net/dst.h:468 [inline] ip6_rcv_finish net/ipv6/ip6_input.c:79 [inline] NF_HOOK include/linux/netfilter.h:303 [inline] NF_HOOK include/linux/netfilter.h:297 [inline] ipv6_rcv+0x250/0x380 net/ipv6/ip6_input.c:309 __netif_receive_skb_one_core+0x114/0x180 net/core/dev.c:5491 __netif_receive_skb+0x1f/0x1c0 net/core/dev.c:5605 netif_receive_skb_internal net/core/dev.c:5691 [inline] netif_receive_skb+0x133/0x7a0 net/core/dev.c:5750 tun_rx_batched+0x4b3/0x7a0 drivers/net/tun.c:1553 tun_get_user+0x2452/0x39c0 drivers/net/tun.c:1989 tun_chr_write_iter+0xdf/0x200 drivers/net/tun.c:2035 call_write_iter include/linux/fs.h:1868 [inline] new_sync_write fs/read_write.c:491 [inline] vfs_write+0x945/0xd50 fs/read_write.c:584 ksys_write+0x12b/0x250 fs/read_write.c:637 do_syscall_32_irqs_on arch/x86/entry/common.c:112 [inline] __do_fast_syscall_32+0x65/0xf0 arch/x86/entry/common.c:178 do_fast_syscall_32+0x33/0x70 arch/x86/entry/common.c:203 entry_SYSENTER_compat_after_hwframe+0x70/0x82 RIP: 0023:0xf7f21579 Code: b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 00 00 00 00 00 00 00 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d b4 26 00 00 00 00 8d b4 26 00 00 00 00 RSP: 002b:00000000f7f1c590 EFLAGS: 00000282 ORIG_RAX: 0000000000000004 RAX: ffffffffffffffda RBX: 00000000000000c8 RCX: 0000000020000040 RDX: 0000000000000083 RSI: 00000000f734e000 RDI: 0000000000000000 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000296 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 </TASK> Modules linked in: Link: https://lore.kernel.org/netdev/CANaxB-yCk8hhP68L4Q2nFOJht8sqgXGGQO2AftpHs0u1xyGG5A@mail.gmail.com/ [1] Fixes: 850cbadd ("udp: use it's own memory accounting schema") Reported-by: syzbot+444ca0907e96f7c5e48b@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=444ca0907e96f7c5e48bSigned-off-by: NKuniyuki Iwashima <kuniyu@amazon.com> Link: https://lore.kernel.org/r/20230523163305.66466-1-kuniyu@amazon.comSigned-off-by: NPaolo Abeni <pabeni@redhat.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Eric Dumazet 提交于
stable inclusion from stable-v4.19.284 commit 0a7c08aeca3e531772b83a224ec9b997c6fc4b2c category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I7J5UF CVE: NA -------------------------------- [ Upstream commit dacab578 ] syzbot triggered the following splat [1], sending an empty message through pppoe_sendmsg(). When VLAN_FLAG_REORDER_HDR flag is set, vlan_dev_hard_header() does not push extra bytes for the VLAN header, because vlan is offloaded. Unfortunately vlan_dev_hard_start_xmit() first reads veth->h_vlan_proto before testing (vlan->flags & VLAN_FLAG_REORDER_HDR). We need to swap the two conditions. [1] BUG: KMSAN: uninit-value in vlan_dev_hard_start_xmit+0x171/0x7f0 net/8021q/vlan_dev.c:111 vlan_dev_hard_start_xmit+0x171/0x7f0 net/8021q/vlan_dev.c:111 __netdev_start_xmit include/linux/netdevice.h:4883 [inline] netdev_start_xmit include/linux/netdevice.h:4897 [inline] xmit_one net/core/dev.c:3580 [inline] dev_hard_start_xmit+0x253/0xa20 net/core/dev.c:3596 __dev_queue_xmit+0x3c7f/0x5ac0 net/core/dev.c:4246 dev_queue_xmit include/linux/netdevice.h:3053 [inline] pppoe_sendmsg+0xa93/0xb80 drivers/net/ppp/pppoe.c:900 sock_sendmsg_nosec net/socket.c:724 [inline] sock_sendmsg net/socket.c:747 [inline] ____sys_sendmsg+0xa24/0xe40 net/socket.c:2501 ___sys_sendmsg+0x2a1/0x3f0 net/socket.c:2555 __sys_sendmmsg+0x411/0xa50 net/socket.c:2641 __do_sys_sendmmsg net/socket.c:2670 [inline] __se_sys_sendmmsg net/socket.c:2667 [inline] __x64_sys_sendmmsg+0xbc/0x120 net/socket.c:2667 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd Uninit was created at: slab_post_alloc_hook+0x12d/0xb60 mm/slab.h:774 slab_alloc_node mm/slub.c:3452 [inline] kmem_cache_alloc_node+0x543/0xab0 mm/slub.c:3497 kmalloc_reserve+0x148/0x470 net/core/skbuff.c:520 __alloc_skb+0x3a7/0x850 net/core/skbuff.c:606 alloc_skb include/linux/skbuff.h:1277 [inline] sock_wmalloc+0xfe/0x1a0 net/core/sock.c:2583 pppoe_sendmsg+0x3af/0xb80 drivers/net/ppp/pppoe.c:867 sock_sendmsg_nosec net/socket.c:724 [inline] sock_sendmsg net/socket.c:747 [inline] ____sys_sendmsg+0xa24/0xe40 net/socket.c:2501 ___sys_sendmsg+0x2a1/0x3f0 net/socket.c:2555 __sys_sendmmsg+0x411/0xa50 net/socket.c:2641 __do_sys_sendmmsg net/socket.c:2670 [inline] __se_sys_sendmmsg net/socket.c:2667 [inline] __x64_sys_sendmmsg+0xbc/0x120 net/socket.c:2667 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd CPU: 0 PID: 29770 Comm: syz-executor.0 Not tainted 6.3.0-rc6-syzkaller-gc478e5b17829 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/30/2023 Fixes: 1da177e4 ("Linux-2.6.12-rc2") Reported-by: Nsyzbot <syzkaller@googlegroups.com> Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Tobias Brunner 提交于
stable inclusion from stable-v4.19.284 commit d45a4270a5581ebceeec47e42d9def945dc51e90 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I7J5UF CVE: NA -------------------------------- [ Upstream commit cf3128a7 ] xfrm_state_find() uses `encap_family` of the current template with the passed local and remote addresses to find a matching state. If an optional tunnel or BEET mode template is skipped in a mixed-family scenario, there could be a mismatch causing an out-of-bounds read as the addresses were not replaced to match the family of the next template. While there are theoretical use cases for optional templates in outbound policies, the only practical one is to skip IPComp states in inbound policies if uncompressed packets are received that are handled by an implicitly created IPIP state instead. Fixes: 1da177e4 ("Linux-2.6.12-rc2") Signed-off-by: NTobias Brunner <tobias@strongswan.org> Acked-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Nick Child 提交于
stable inclusion from stable-v4.19.284 commit 3e750733375cb59cd049de7b38713122cec995c4 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I7J5UF CVE: NA -------------------------------- [ Upstream commit 5dd0dfd5 ] When setting the XPS value of a TX queue, warn the user once if the index of the queue is greater than the number of allocated TX queues. Previously, this scenario went uncaught. In the best case, it resulted in unnecessary allocations. In the worst case, it resulted in out-of-bounds memory references through calls to `netdev_get_tx_queue( dev, index)`. Therefore, it is important to inform the user but not worth returning an error and risk downing the netdevice. Signed-off-by: NNick Child <nnac123@linux.ibm.com> Reviewed-by: NPiotr Raczynski <piotr.raczynski@intel.com> Link: https://lore.kernel.org/r/20230321150725.127229-1-nnac123@linux.ibm.comSigned-off-by: NJakub Kicinski <kuba@kernel.org> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Kuniyuki Iwashima 提交于
stable inclusion from stable-v4.19.284 commit 1c488f4e95b498c977fbeae784983eb4cf6085e8 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I7J5UF CVE: NA -------------------------------- [ Upstream commit e1d09c2c ] KCSAN found a data race around sk->sk_shutdown where unix_release_sock() and unix_shutdown() update it under unix_state_lock(), OTOH unix_poll() and unix_dgram_poll() read it locklessly. We need to annotate the writes and reads with WRITE_ONCE() and READ_ONCE(). BUG: KCSAN: data-race in unix_poll / unix_release_sock write to 0xffff88800d0f8aec of 1 bytes by task 264 on cpu 0: unix_release_sock+0x75c/0x910 net/unix/af_unix.c:631 unix_release+0x59/0x80 net/unix/af_unix.c:1042 __sock_release+0x7d/0x170 net/socket.c:653 sock_close+0x19/0x30 net/socket.c:1397 __fput+0x179/0x5e0 fs/file_table.c:321 ____fput+0x15/0x20 fs/file_table.c:349 task_work_run+0x116/0x1a0 kernel/task_work.c:179 resume_user_mode_work include/linux/resume_user_mode.h:49 [inline] exit_to_user_mode_loop kernel/entry/common.c:171 [inline] exit_to_user_mode_prepare+0x174/0x180 kernel/entry/common.c:204 __syscall_exit_to_user_mode_work kernel/entry/common.c:286 [inline] syscall_exit_to_user_mode+0x1a/0x30 kernel/entry/common.c:297 do_syscall_64+0x4b/0x90 arch/x86/entry/common.c:86 entry_SYSCALL_64_after_hwframe+0x72/0xdc read to 0xffff88800d0f8aec of 1 bytes by task 222 on cpu 1: unix_poll+0xa3/0x2a0 net/unix/af_unix.c:3170 sock_poll+0xcf/0x2b0 net/socket.c:1385 vfs_poll include/linux/poll.h:88 [inline] ep_item_poll.isra.0+0x78/0xc0 fs/eventpoll.c:855 ep_send_events fs/eventpoll.c:1694 [inline] ep_poll fs/eventpoll.c:1823 [inline] do_epoll_wait+0x6c4/0xea0 fs/eventpoll.c:2258 __do_sys_epoll_wait fs/eventpoll.c:2270 [inline] __se_sys_epoll_wait fs/eventpoll.c:2265 [inline] __x64_sys_epoll_wait+0xcc/0x190 fs/eventpoll.c:2265 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x72/0xdc value changed: 0x00 -> 0x03 Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 222 Comm: dbus-broker Not tainted 6.3.0-rc7-02330-gca6270c12e20 #2 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 Fixes: 3c73419c ("af_unix: fix 'poll for write'/ connected DGRAM sockets") Fixes: 1da177e4 ("Linux-2.6.12-rc2") Reported-by: Nsyzbot <syzkaller@googlegroups.com> Signed-off-by: NKuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: NEric Dumazet <edumazet@google.com> Reviewed-by: NMichal Kubiak <michal.kubiak@intel.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Kuniyuki Iwashima 提交于
stable inclusion from stable-v4.19.284 commit 80508eceeb8073fe61dd80de1c07d28de9f4d57e category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I7J5UF CVE: NA -------------------------------- [ Upstream commit 679ed006 ] KCSAN found a data race of sk->sk_receive_queue->qlen where recvmsg() updates qlen under the queue lock and sendmsg() checks qlen under unix_state_sock(), not the queue lock, so the reader side needs READ_ONCE(). BUG: KCSAN: data-race in __skb_try_recv_from_queue / unix_wait_for_peer write (marked) to 0xffff888019fe7c68 of 4 bytes by task 49792 on cpu 0: __skb_unlink include/linux/skbuff.h:2347 [inline] __skb_try_recv_from_queue+0x3de/0x470 net/core/datagram.c:197 __skb_try_recv_datagram+0xf7/0x390 net/core/datagram.c:263 __unix_dgram_recvmsg+0x109/0x8a0 net/unix/af_unix.c:2452 unix_dgram_recvmsg+0x94/0xa0 net/unix/af_unix.c:2549 sock_recvmsg_nosec net/socket.c:1019 [inline] ____sys_recvmsg+0x3a3/0x3b0 net/socket.c:2720 ___sys_recvmsg+0xc8/0x150 net/socket.c:2764 do_recvmmsg+0x182/0x560 net/socket.c:2858 __sys_recvmmsg net/socket.c:2937 [inline] __do_sys_recvmmsg net/socket.c:2960 [inline] __se_sys_recvmmsg net/socket.c:2953 [inline] __x64_sys_recvmmsg+0x153/0x170 net/socket.c:2953 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x72/0xdc read to 0xffff888019fe7c68 of 4 bytes by task 49793 on cpu 1: skb_queue_len include/linux/skbuff.h:2127 [inline] unix_recvq_full net/unix/af_unix.c:229 [inline] unix_wait_for_peer+0x154/0x1a0 net/unix/af_unix.c:1445 unix_dgram_sendmsg+0x13bc/0x14b0 net/unix/af_unix.c:2048 sock_sendmsg_nosec net/socket.c:724 [inline] sock_sendmsg+0x148/0x160 net/socket.c:747 ____sys_sendmsg+0x20e/0x620 net/socket.c:2503 ___sys_sendmsg+0xc6/0x140 net/socket.c:2557 __sys_sendmmsg+0x11d/0x370 net/socket.c:2643 __do_sys_sendmmsg net/socket.c:2672 [inline] __se_sys_sendmmsg net/socket.c:2669 [inline] __x64_sys_sendmmsg+0x58/0x70 net/socket.c:2669 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x72/0xdc value changed: 0x0000000b -> 0x00000001 Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 49793 Comm: syz-executor.0 Not tainted 6.3.0-rc7-02330-gca6270c12e20 #2 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 Fixes: 1da177e4 ("Linux-2.6.12-rc2") Reported-by: Nsyzbot <syzkaller@googlegroups.com> Signed-off-by: NKuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: NEric Dumazet <edumazet@google.com> Reviewed-by: NMichal Kubiak <michal.kubiak@intel.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Eric Dumazet 提交于
stable inclusion from stable-v4.19.284 commit 32b304def00874955ce614261d15ea3836aa7947 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I7J5UF CVE: NA -------------------------------- [ Upstream commit 5bca1d08 ] datagram_poll() runs locklessly, we should add READ_ONCE() annotations while reading sk->sk_err, sk->sk_shutdown and sk->sk_state. Fixes: 1da177e4 ("Linux-2.6.12-rc2") Signed-off-by: NEric Dumazet <edumazet@google.com> Reviewed-by: NKuniyuki Iwashima <kuniyu@amazon.com> Link: https://lore.kernel.org/r/20230509173131.3263780-1-edumazet@google.comSigned-off-by: NJakub Kicinski <kuba@kernel.org> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Paolo Abeni 提交于
stable inclusion from stable-v4.19.284 commit e62806034c409b87e8fa00b829c059ba8face62c category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I7J5UF CVE: NA -------------------------------- [ Upstream commit 77c3c956 ] unlocked version of protocol level close, will be used by MPTCP to allow decouple orphaning and subflow level close. Signed-off-by: NPaolo Abeni <pabeni@redhat.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org> Stable-dep-of: e14cadfd ("tcp: add annotations around sk->sk_shutdown accesses") Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Eric Dumazet 提交于
stable inclusion from stable-v4.19.284 commit 640bce625ccf667a1a80262175a81108889ac41d category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I7J5UF CVE: NA -------------------------------- [ Upstream commit e05a5f51 ] do_recvmmsg() can write to sk->sk_err from multiple threads. As said before, many other points reading or writing sk_err need annotations. Fixes: 34b88a68 ("net: Fix use after free in the recvmmsg exit path") Signed-off-by: NEric Dumazet <edumazet@google.com> Reported-by: Nsyzbot <syzkaller@googlegroups.com> Reviewed-by: NKuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Eric Dumazet 提交于
stable inclusion from stable-v4.19.284 commit 840a647499b093621167de56ffa8756dfc69f242 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I7J5UF CVE: NA -------------------------------- [ Upstream commit a939d149 ] Both netlink_recvmsg() and netlink_native_seq_show() read nlk->cb_running locklessly. Use READ_ONCE() there. Add corresponding WRITE_ONCE() to netlink_dump() and __netlink_dump_start() syzbot reported: BUG: KCSAN: data-race in __netlink_dump_start / netlink_recvmsg write to 0xffff88813ea4db59 of 1 bytes by task 28219 on cpu 0: __netlink_dump_start+0x3af/0x4d0 net/netlink/af_netlink.c:2399 netlink_dump_start include/linux/netlink.h:308 [inline] rtnetlink_rcv_msg+0x70f/0x8c0 net/core/rtnetlink.c:6130 netlink_rcv_skb+0x126/0x220 net/netlink/af_netlink.c:2577 rtnetlink_rcv+0x1c/0x20 net/core/rtnetlink.c:6192 netlink_unicast_kernel net/netlink/af_netlink.c:1339 [inline] netlink_unicast+0x56f/0x640 net/netlink/af_netlink.c:1365 netlink_sendmsg+0x665/0x770 net/netlink/af_netlink.c:1942 sock_sendmsg_nosec net/socket.c:724 [inline] sock_sendmsg net/socket.c:747 [inline] sock_write_iter+0x1aa/0x230 net/socket.c:1138 call_write_iter include/linux/fs.h:1851 [inline] new_sync_write fs/read_write.c:491 [inline] vfs_write+0x463/0x760 fs/read_write.c:584 ksys_write+0xeb/0x1a0 fs/read_write.c:637 __do_sys_write fs/read_write.c:649 [inline] __se_sys_write fs/read_write.c:646 [inline] __x64_sys_write+0x42/0x50 fs/read_write.c:646 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd read to 0xffff88813ea4db59 of 1 bytes by task 28222 on cpu 1: netlink_recvmsg+0x3b4/0x730 net/netlink/af_netlink.c:2022 sock_recvmsg_nosec+0x4c/0x80 net/socket.c:1017 ____sys_recvmsg+0x2db/0x310 net/socket.c:2718 ___sys_recvmsg net/socket.c:2762 [inline] do_recvmmsg+0x2e5/0x710 net/socket.c:2856 __sys_recvmmsg net/socket.c:2935 [inline] __do_sys_recvmmsg net/socket.c:2958 [inline] __se_sys_recvmmsg net/socket.c:2951 [inline] __x64_sys_recvmmsg+0xe2/0x160 net/socket.c:2951 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd value changed: 0x00 -> 0x01 Fixes: 16b304f3 ("netlink: Eliminate kmalloc in netlink dump operation.") Reported-by: Nsyzbot <syzkaller@googlegroups.com> Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Baokun Li 提交于
maillist inclusion category: bugfix bugzilla: 188812,https://gitee.com/openeuler/kernel/issues/I7E0YR CVE: NA Reference: https://www.spinics.net/lists/kernel/msg4844759.html ---------------------------------------- As Honza said, remove_inode_dquot_ref() currently does not release the last dquot reference but instead adds the dquot to tofree_head list. This is because dqput() can sleep while dropping of the last dquot reference (writing back the dquot and calling ->release_dquot()) and that must not happen under dq_list_lock. Now that dqput() queues the final dquot cleanup into a workqueue, remove_inode_dquot_ref() can call dqput() unconditionally and we can significantly simplify it. Here we open code the simplified code of remove_inode_dquot_ref() into remove_dquot_ref() and remove the function put_dquot_list() which is no longer used. Signed-off-by: NBaokun Li <libaokun1@huawei.com> Reviewed-by: NYang Erkun <yangerkun@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Baokun Li 提交于
maillist inclusion category: bugfix bugzilla: 188812,https://gitee.com/openeuler/kernel/issues/I7E0YR CVE: NA Reference: https://www.spinics.net/lists/kernel/msg4844759.html ---------------------------------------- The dquot_mark_dquot_dirty() using dquot references from the inode should be protected by dquot_srcu. quota_off code takes care to call synchronize_srcu(&dquot_srcu) to not drop dquot references while they are used by other users. But dquot_transfer() breaks this assumption. We call dquot_transfer() to drop the last reference of dquot and add it to free_dquots, but there may still be other users using the dquot at this time, as shown in the function graph below: cpu1 cpu2 _________________|_________________ wb_do_writeback CHOWN(1) ... ext4_da_update_reserve_space dquot_claim_block ... dquot_mark_dquot_dirty // try to dirty old quota test_bit(DQ_ACTIVE_B, &dquot->dq_flags) // still ACTIVE if (test_bit(DQ_MOD_B, &dquot->dq_flags)) // test no dirty, wait dq_list_lock ... dquot_transfer __dquot_transfer dqput_all(transfer_from) // rls old dquot dqput // last dqput dquot_release clear_bit(DQ_ACTIVE_B, &dquot->dq_flags) atomic_dec(&dquot->dq_count) put_dquot_last(dquot) list_add_tail(&dquot->dq_free, &free_dquots) // add the dquot to free_dquots if (!test_and_set_bit(DQ_MOD_B, &dquot->dq_flags)) add dqi_dirty_list // add released dquot to dirty_list This can cause various issues, such as dquot being destroyed by dqcache_shrink_scan() after being added to free_dquots, which can trigger a UAF in dquot_mark_dquot_dirty(); or after dquot is added to free_dquots and then to dirty_list, it is added to free_dquots again after dquot_writeback_dquots() is executed, which causes the free_dquots list to be corrupted and triggers a UAF when dqcache_shrink_scan() is called for freeing dquot twice. As Honza said, we need to fix dquot_transfer() to follow the guarantees dquot_srcu should provide. But calling synchronize_srcu() directly from dquot_transfer() is too expensive (and mostly unnecessary). So we add dquot whose last reference should be dropped to the new global dquot list releasing_dquots, and then queue work item which would call synchronize_srcu() and after that perform the final cleanup of all the dquots on releasing_dquots. Fixes: 4580b30e ("quota: Do not dirty bad dquots") Suggested-by: NJan Kara <jack@suse.cz> Signed-off-by: NBaokun Li <libaokun1@huawei.com> Reviewed-by: NYang Erkun <yangerkun@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Baokun Li 提交于
maillist inclusion category: bugfix bugzilla: 188812,https://gitee.com/openeuler/kernel/issues/I7E0YR CVE: NA Reference: https://www.spinics.net/lists/kernel/msg4844759.html ---------------------------------------- Add new helper function dquot_active() to make the code more concise. Signed-off-by: NBaokun Li <libaokun1@huawei.com> Reviewed-by: NYang Erkun <yangerkun@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Baokun Li 提交于
maillist inclusion category: bugfix bugzilla: 188812,https://gitee.com/openeuler/kernel/issues/I7E0YR CVE: NA Reference: https://www.spinics.net/lists/kernel/msg4844759.html ---------------------------------------- Now we have a helper function dquot_dirty() to determine if dquot has DQ_MOD_B bit. dquot_active() can easily be misunderstood as a helper function to determine if dquot has DQ_ACTIVE_B bit. So we avoid this by renaming it to inode_quota_active() and later on we will add the helper function dquot_active() to determine if dquot has DQ_ACTIVE_B bit. Signed-off-by: NBaokun Li <libaokun1@huawei.com> Reviewed-by: NYang Erkun <yangerkun@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Baokun Li 提交于
maillist inclusion category: bugfix bugzilla: 188812,https://gitee.com/openeuler/kernel/issues/I7E0YR CVE: NA Reference: https://www.spinics.net/lists/kernel/msg4844759.html ---------------------------------------- Refactor out dquot_write_dquot() to reduce duplicate code. Signed-off-by: NBaokun Li <libaokun1@huawei.com> Reviewed-by: NYang Erkun <yangerkun@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Chengguang Xu 提交于
mainline inclusion from mainline-v5.3-rc1 commit f44840ad category: bugfix bugzilla: 188812,https://gitee.com/openeuler/kernel/issues/I7E0YR CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f44840ad1f822d9ecee6a3f91f2d17825a361307 -------------------------------- Actually there are four lists for dquot management, so add the description of dqui_dirty_list to comment. Signed-off-by: NChengguang Xu <cgxu519@gmail.com> Signed-off-by: NJan Kara <jack@suse.cz> Signed-off-by: NBaokun Li <libaokun1@huawei.com> Reviewed-by: NYang Erkun <yangerkun@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Chengguang Xu 提交于
mainline inclusion from mainline-v5.5-rc1 commit 05848db2 category: bugfix bugzilla: 188812,https://gitee.com/openeuler/kernel/issues/I7E0YR CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=05848db2083d4f232e84e385845dcd98d5c511b2 -------------------------------- It is meaningless to increase DQST_LOOKUPS number while iterating over dirty/inuse list, so just avoid it. Link: https://lore.kernel.org/r/20190926083408.4269-1-cgxu519@zoho.com.cnSigned-off-by: NChengguang Xu <cgxu519@zoho.com.cn> Signed-off-by: NJan Kara <jack@suse.cz> Signed-off-by: NBaokun Li <libaokun1@huawei.com> Reviewed-by: NYang Erkun <yangerkun@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Nathan Chancellor 提交于
stable inclusion from stable-v4.19.285 commit 85d0254e7ec88b9cf6c853d9f6ea9e3af64d7f81 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I7J5UF CVE: NA -------------------------------- commit 63174f61 upstream. Clang warns: ../kernel/extable.c:37:52: warning: array comparison always evaluates to a constant [-Wtautological-compare] if (main_extable_sort_needed && __stop___ex_table > __start___ex_table) { ^ 1 warning generated. These are not true arrays, they are linker defined symbols, which are just addresses. Using the address of operator silences the warning and does not change the resulting assembly with either clang/ld.lld or gcc/ld (tested with diff + objdump -Dr). Suggested-by: NNick Desaulniers <ndesaulniers@google.com> Signed-off-by: NNathan Chancellor <natechancellor@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NAndrew Morton <akpm@linux-foundation.org> Link: https://github.com/ClangBuiltLinux/linux/issues/892 Link: http://lkml.kernel.org/r/20200219202036.45702-1-natechancellor@gmail.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Min-Hua Chen 提交于
stable inclusion from stable-v4.19.285 commit 90e687756316c6d9b97caf3dd29f04b66ce0aeab category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I7J5UF CVE: NA -------------------------------- [ Upstream commit d91d5808 ] This patch fixes several sparse warnings for fault.c: arch/arm64/mm/fault.c:493:24: sparse: warning: incorrect type in return expression (different base types) arch/arm64/mm/fault.c:493:24: sparse: expected restricted vm_fault_t arch/arm64/mm/fault.c:493:24: sparse: got int arch/arm64/mm/fault.c:501:32: sparse: warning: incorrect type in return expression (different base types) arch/arm64/mm/fault.c:501:32: sparse: expected restricted vm_fault_t arch/arm64/mm/fault.c:501:32: sparse: got int arch/arm64/mm/fault.c:503:32: sparse: warning: incorrect type in return expression (different base types) arch/arm64/mm/fault.c:503:32: sparse: expected restricted vm_fault_t arch/arm64/mm/fault.c:503:32: sparse: got int arch/arm64/mm/fault.c:511:24: sparse: warning: incorrect type in return expression (different base types) arch/arm64/mm/fault.c:511:24: sparse: expected restricted vm_fault_t arch/arm64/mm/fault.c:511:24: sparse: got int arch/arm64/mm/fault.c:670:13: sparse: warning: restricted vm_fault_t degrades to integer arch/arm64/mm/fault.c:670:13: sparse: warning: restricted vm_fault_t degrades to integer arch/arm64/mm/fault.c:713:39: sparse: warning: restricted vm_fault_t degrades to integer Reported-by: Nkernel test robot <lkp@intel.com> Signed-off-by: NMin-Hua Chen <minhuadotchen@gmail.com> Link: https://lore.kernel.org/r/20230502151909.128810-1-minhuadotchen@gmail.comSigned-off-by: NWill Deacon <will@kernel.org> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Dave Hansen 提交于
stable inclusion from stable-v4.19.284 commit 4d19f7698681c59b4c1f25dc343a025d91cc4827 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I7J5UF CVE: NA -------------------------------- commit ce0b15d1 upstream. The INVLPG instruction is used to invalidate TLB entries for a specified virtual address. When PCIDs are enabled, INVLPG is supposed to invalidate TLB entries for the specified address for both the current PCID *and* Global entries. (Note: Only kernel mappings set Global=1.) Unfortunately, some INVLPG implementations can leave Global translations unflushed when PCIDs are enabled. As a workaround, never enable PCIDs on affected processors. I expect there to eventually be microcode mitigations to replace this software workaround. However, the exact version numbers where that will happen are not known today. Once the version numbers are set in stone, the processor list can be tweaked to only disable PCIDs on affected processors with affected microcode. Note: if anyone wants a quick fix that doesn't require patching, just stick 'nopcid' on your kernel command-line. Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com> Reviewed-by: NThomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Signed-off-by: NDaniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Josh Poimboeuf 提交于
stable inclusion from stable-v4.19.284 commit 446e8d258ae5067786338723e73c601edfbd8a0e category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I7J5UF CVE: NA -------------------------------- [ Upstream commit e0b081d1 ] With KCSAN enabled, end_of_stack() can get out-of-lined. Force it inline. Fixes the following warnings: vmlinux.o: warning: objtool: check_stackleak_irqoff+0x2b: call to end_of_stack() leaves .noinstr.text section Signed-off-by: NJosh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/cc1b4d73d3a428a00d206242a68fdf99a934ca7b.1681320026.git.jpoimboe@kernel.orgSigned-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Tony Lindgren 提交于
stable inclusion from stable-v4.19.284 commit c9e080c3005fd183c56ff8f4d75edb5da0765d2c category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I7J5UF CVE: NA -------------------------------- [ Upstream commit 04e82793 ] When we unbind a serial port hardware specific 8250 driver, the generic serial8250 driver takes over the port. After that we see an oops about 10 seconds later. This can produce the following at least on some TI SoCs: Unhandled fault: imprecise external abort (0x1406) Internal error: : 1406 [#1] SMP ARM Turns out that we may still have the serial port hardware specific driver port->pm in use, and serial8250_pm() tries to call it after the port specific driver is gone: serial8250_pm [8250_base] from uart_change_pm+0x54/0x8c [serial_base] uart_change_pm [serial_base] from uart_hangup+0x154/0x198 [serial_base] uart_hangup [serial_base] from __tty_hangup.part.0+0x328/0x37c __tty_hangup.part.0 from disassociate_ctty+0x154/0x20c disassociate_ctty from do_exit+0x744/0xaac do_exit from do_group_exit+0x40/0x8c do_group_exit from __wake_up_parent+0x0/0x1c Let's fix the issue by calling serial8250_set_defaults() in serial8250_unregister_port(). This will set the port back to using the serial8250 default functions, and sets the port->pm to point to serial8250_pm. Signed-off-by: NTony Lindgren <tony@atomide.com> Link: https://lore.kernel.org/r/20230418101407.12403-1-tony@atomide.comSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 void0red 提交于
stable inclusion from stable-v4.19.284 commit 35d67ffad6f5d78dbd800d354f5334c7b71a19e0 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I7J5UF CVE: NA -------------------------------- [ Upstream commit ae5a0ecc ] ACPICA commit 0d5f467d6a0ba852ea3aad68663cbcbd43300fd4 ACPI_ALLOCATE_ZEROED may fails, object_info might be null and will cause null pointer dereference later. Link: https://github.com/acpica/acpica/commit/0d5f467dSigned-off-by: NBob Moore <robert.moore@intel.com> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Armin Wolf 提交于
stable inclusion from stable-v4.19.284 commit 0d528a7c421b1f1772fc1d29370b3b5fc0f42b19 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I7J5UF CVE: NA -------------------------------- [ Upstream commit e5b492c6 ] When removing custom query handlers, the handler might still be used inside the EC query workqueue, causing a kernel oops if the module holding the callback function was already unloaded. Fix this by flushing the EC query workqueue when removing custom query handlers. Tested on a Acer Travelmate 4002WLMi Signed-off-by: NArmin Wolf <W_Armin@gmx.de> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Ben Hutchings 提交于
stable inclusion from stable-v4.19.286 commit 53f16fa73f71abfcac67e71a5513653b8db28b76 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I7J5UF CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=53f16fa73f71abfcac67e71a5513653b8db28b76 -------------------------------- [ Upstream commit 7c5d4801 ] irq_cpu_rmap_release() calls cpu_rmap_put(), which may free the rmap. So we need to clear the pointer to our glue structure in rmap before doing that, not after. Fixes: 4e0473f1 ("lib: cpu_rmap: Avoid use after free on rmap->obj array entries") Signed-off-by: NBen Hutchings <ben@decadent.org.uk> Reviewed-by: NSimon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/ZHo0vwquhOy3FaXc@decadent.org.ukSigned-off-by: NJakub Kicinski <kuba@kernel.org> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NGuo Mengqi <guomengqi3@huawei.com> Reviewed-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Eli Cohen 提交于
stable inclusion from stable-v4.19.284 commit d1308bd0b24cb1d78fa2747d5fa3e055cc628a48 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I7J5UF CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=d1308bd0b24cb1d78fa2747d5fa3e055cc628a48 -------------------------------- [ Upstream commit 4e0473f1 ] When calling irq_set_affinity_notifier() with NULL at the notify argument, it will cause freeing of the glue pointer in the corresponding array entry but will leave the pointer in the array. A subsequent call to free_irq_cpu_rmap() will try to free this entry again leading to possible use after free. Fix that by setting NULL to the array entry and checking that we have non-zero at the array entry when iterating over the array in free_irq_cpu_rmap(). The current code does not suffer from this since there are no cases where irq_set_affinity_notifier(irq, NULL) (note the NULL passed for the notify arg) is called, followed by a call to free_irq_cpu_rmap() so we don't hit and issue. Subsequent patches in this series excersize this flow, hence the required fix. Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: NEli Cohen <elic@nvidia.com> Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com> Reviewed-by: NJacob Keller <jacob.e.keller@intel.com> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NGuo Mengqi <guomengqi3@huawei.com> Reviewed-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Theodore Ts'o 提交于
stable inclusion from stable-v4.19.283 commit 37302d4c2724dc92be5f90a3718eafa29834d586 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I7J5UF CVE: NA Reference:https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=37302d4c2724dc92be5f90a3718eafa29834d586 -------------------------------- commit 4c0b4818 upstream. If there are failures while changing the mount options in __ext4_remount(), we need to restore the old mount options. This commit fixes two problem. The first is there is a chance that we will free the old quota file names before a potential failure leading to a use-after-free. The second problem addressed in this commit is if there is a failed read/write to read-only transition, if the quota has already been suspended, we need to renable quota handling. Cc: stable@kernel.org Link: https://lore.kernel.org/r/20230506142419.984260-2-tytso@mit.eduSigned-off-by: NTheodore Ts'o <tytso@mit.edu> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Conflicts: fs/ext4/super.c Signed-off-by: NBaokun Li <libaokun1@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Damien Le Moal 提交于
stable inclusion from stable-v4.19.282 commit 5cca80d4f3a842340fd0addf9ecaf9d83589bdec category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I7J5UF CVE: NA -------------------------------- [ Upstream commit f0aa59a3 ] Some USB-SATA adapters have broken behavior when an unsupported VPD page is probed: Depending on the VPD page number, a 4-byte header with a valid VPD page number but with a 0 length is returned. Currently, scsi_vpd_inquiry() only checks that the page number is valid to determine if the page is valid, which results in receiving only the 4-byte header for the non-existent page. This error manifests itself very often with page 0xb9 for the Concurrent Positioning Ranges detection done by sd_read_cpr(), resulting in the following error message: sd 0:0:0:0: [sda] Invalid Concurrent Positioning Ranges VPD page Prevent such misleading error message by adding a check in scsi_vpd_inquiry() to verify that the page length is not 0. Signed-off-by: NDamien Le Moal <damien.lemoal@opensource.wdc.com> Link: https://lore.kernel.org/r/20230322022211.116327-1-damien.lemoal@opensource.wdc.comReviewed-by: NBenjamin Block <bblock@linux.ibm.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Lukas Wunner 提交于
stable inclusion from stable-v4.19.283 commit 2a226e8ca95107cd434d967ecfd83409e26a730e category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I7J5UF CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=2a226e8ca95107cd434d967ecfd83409e26a730e -------------------------------- commit f5eff559 upstream. In 2013, commits 2e35afae ("PCI: pciehp: Add reset_slot() method") 608c3881 ("PCI: Add slot reset option to pci_dev_reset()") amended PCIe hotplug to mask Presence Detect Changed events during a Secondary Bus Reset. The reset thus no longer causes gratuitous slot bringdown and bringup. However the commits neglected to serialize reset with code paths reading slot registers. For instance, a slot bringup due to an earlier hotplug event may see the Presence Detect State bit cleared during a concurrent Secondary Bus Reset. In 2018, commit 5b3f7b7d ("PCI: pciehp: Avoid slot access during reset") retrofitted the missing locking. It introduced a reset_lock which serializes a Secondary Bus Reset with other parts of pciehp. Unfortunately the locking turns out to be overzealous: reset_lock is held for the entire enumeration and de-enumeration of hotplugged devices, including driver binding and unbinding. Driver binding and unbinding acquires device_lock while the reset_lock of the ancestral hotplug port is held. A concurrent Secondary Bus Reset acquires the ancestral reset_lock while already holding the device_lock. The asymmetric locking order in the two code paths can lead to AB-BA deadlocks. Michael Haeuptle reports such deadlocks on simultaneous hot-removal and vfio release (the latter implies a Secondary Bus Reset): pciehp_ist() # down_read(reset_lock) pciehp_handle_presence_or_link_change() pciehp_disable_slot() __pciehp_disable_slot() remove_board() pciehp_unconfigure_device() pci_stop_and_remove_bus_device() pci_stop_bus_device() pci_stop_dev() device_release_driver() device_release_driver_internal() __device_driver_lock() # device_lock() SYS_munmap() vfio_device_fops_release() vfio_device_group_close() vfio_device_close() vfio_device_last_close() vfio_pci_core_close_device() vfio_pci_core_disable() # device_lock() __pci_reset_function_locked() pci_reset_bus_function() pci_dev_reset_slot_function() pci_reset_hotplug_slot() pciehp_reset_slot() # down_write(reset_lock) Ian May reports the same deadlock on simultaneous hot-removal and an AER-induced Secondary Bus Reset: aer_recover_work_func() pcie_do_recovery() aer_root_reset() pci_bus_error_reset() pci_slot_reset() pci_slot_lock() # device_lock() pci_reset_hotplug_slot() pciehp_reset_slot() # down_write(reset_lock) Fix by releasing the reset_lock during driver binding and unbinding, thereby splitting and shrinking the critical section. Driver binding and unbinding is protected by the device_lock() and thus serialized with a Secondary Bus Reset. There's no need to additionally protect it with the reset_lock. However, pciehp does not bind and unbind devices directly, but rather invokes PCI core functions which also perform certain enumeration and de-enumeration steps. The reset_lock's purpose is to protect slot registers, not enumeration and de-enumeration of hotplugged devices. That would arguably be the job of the PCI core, not the PCIe hotplug driver. After all, an AER-induced Secondary Bus Reset may as well happen during boot-time enumeration of the PCI hierarchy and there's no locking to prevent that either. Exempting *de-enumeration* from the reset_lock is relatively harmless: A concurrent Secondary Bus Reset may foil config space accesses such as PME interrupt disablement. But if the device is physically gone, those accesses are pointless anyway. If the device is physically present and only logically removed through an Attention Button press or the sysfs "power" attribute, PME interrupts as well as DMA cannot come through because pciehp_unconfigure_device() disables INTx and Bus Master bits. That's still protected by the reset_lock in the present commit. Exempting *enumeration* from the reset_lock also has limited impact: The exempted call to pci_bus_add_device() may perform device accesses through pcibios_bus_add_device() and pci_fixup_device() which are now no longer protected from a concurrent Secondary Bus Reset. Otherwise there should be no impact. In essence, the present commit seeks to fix the AB-BA deadlocks while still retaining a best-effort reset protection for enumeration and de-enumeration of hotplugged devices -- until a general solution is implemented in the PCI core. Link: https://lore.kernel.org/linux-pci/CS1PR8401MB0728FC6FDAB8A35C22BD90EC95F10@CS1PR8401MB0728.NAMPRD84.PROD.OUTLOOK.COM Link: https://lore.kernel.org/linux-pci/20200615143250.438252-1-ian.may@canonical.com Link: https://lore.kernel.org/linux-pci/ce878dab-c0c4-5bd0-a725-9805a075682d@amd.com Link: https://lore.kernel.org/linux-pci/ed831249-384a-6d35-0831-70af191e9bce@huawei.com Link: https://bugzilla.kernel.org/show_bug.cgi?id=215590 Fixes: 5b3f7b7d ("PCI: pciehp: Avoid slot access during reset") Link: https://lore.kernel.org/r/fef2b2e9edf245c049a8c5b94743c0f74ff5008a.1681191902.git.lukas@wunner.deReported-by: NMichael Haeuptle <michael.haeuptle@hpe.com> Reported-by: NIan May <ian.may@canonical.com> Reported-by: NAndrey Grodzovsky <andrey2805@gmail.com> Reported-by: NRahul Kumar <rahul.kumar1@amd.com> Reported-by: NJialin Zhang <zhangjialin11@huawei.com> Tested-by: NAnatoli Antonovitch <Anatoli.Antonovitch@amd.com> Signed-off-by: NLukas Wunner <lukas@wunner.de> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org # v4.19+ Cc: Dan Stein <dstein@hpe.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Alex Michon <amichon@kalrayinc.com> Cc: Xiongfeng Wang <wangxiongfeng2@huawei.com> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Sathyanarayanan Kuppuswamy <sathyanarayanan.kuppuswamy@linux.intel.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Conflicts: drivers/pci/hotplug/pciehp_pci.c Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com> Reviewed-by: NXiongfeng Wang <wangxiongfeng2@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Zhong Jinghua 提交于
mainline inclusion from mainline-v6.3-rc1 commit 9f6ad5d5 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I7JHOA CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=9f6ad5d533d1c71e51bdd06a5712c4fbc8768dfa -------------------------------- In loop_set_status_from_info(), lo->lo_offset and lo->lo_sizelimit should be checked before reassignment, because if an overflow error occurs, the original correct value will be changed to the wrong value, and it will not be changed back. More, the original patch did not solve the problem, the value was set and ioctl returned an error, but the subsequent io used the value in the loop driver, which still caused an alarm: loop_handle_cmd do_req_filebacked loff_t pos = ((loff_t) blk_rq_pos(rq) << 9) + lo->lo_offset; lo_rw_aio cmd->iocb.ki_pos = pos Signed-off-by: NZhong Jinghua <zhongjinghua@huawei.com> Reviewed-by: NChaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20230221095027.3656193-1-zhongjinghua@huaweicloud.comSigned-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NZhong Jinghua <zhongjinghua@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Siddh Raman Pant 提交于
mainline inclusion from mainline-v6.0-rc3 commit c490a0b5 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I7JHOA CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c490a0b5a4f36da3918181a8acdc6991d967c5f3 ---------------------------------------- The userspace can configure a loop using an ioctl call, wherein a configuration of type loop_config is passed (see lo_ioctl()'s case on line 1550 of drivers/block/loop.c). This proceeds to call loop_configure() which in turn calls loop_set_status_from_info() (see line 1050 of loop.c), passing &config->info which is of type loop_info64*. This function then sets the appropriate values, like the offset. loop_device has lo_offset of type loff_t (see line 52 of loop.c), which is typdef-chained to long long, whereas loop_info64 has lo_offset of type __u64 (see line 56 of include/uapi/linux/loop.h). The function directly copies offset from info to the device as follows (See line 980 of loop.c): lo->lo_offset = info->lo_offset; This results in an overflow, which triggers a warning in iomap_iter() due to a call to iomap_iter_done() which has: WARN_ON_ONCE(iter->iomap.offset > iter->pos); Thus, check for negative value during loop_set_status_from_info(). Bug report: https://syzkaller.appspot.com/bug?id=c620fe14aac810396d3c3edc9ad73848bf69a29e Reported-and-tested-by: syzbot+a8e049cd3abd342936b6@syzkaller.appspotmail.com Cc: stable@vger.kernel.org Reviewed-by: NMatthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: NSiddh Raman Pant <code@siddh.me> Reviewed-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220823160810.181275-1-code@siddh.meSigned-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NZhong Jinghua <zhongjinghua@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Zhong Jinghua 提交于
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I7JHOA CVE: NA ---------------------------------------- This reverts commit ba3f75d0. The location of the code merge is wrong, roll back and start again. Signed-off-by: NZhong Jinghua <zhongjinghua@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Yu Kuai 提交于
mainline inclusion from mainline-v6.3-rc6 commit 3723091e category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I7F3M1 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.4-rc7&id=3723091ea1884d599cc8b8bf719d6f42e8d4d8b1 -------------------------------- Currently if disk_scan_partitions() failed, GD_NEED_PART_SCAN will still set, and partition scan will be proceed again when blkdev_get_by_dev() is called. However, this will cause a problem that re-assemble partitioned raid device will creat partition for underlying disk. Test procedure: mdadm -CR /dev/md0 -l 1 -n 2 /dev/sda /dev/sdb -e 1.0 sgdisk -n 0:0:+100MiB /dev/md0 blockdev --rereadpt /dev/sda blockdev --rereadpt /dev/sdb mdadm -S /dev/md0 mdadm -A /dev/md0 /dev/sda /dev/sdb Test result: underlying disk partition and raid partition can be observed at the same time Note that this can still happen in come corner cases that GD_NEED_PART_SCAN can be set for underlying disk while re-assemble raid device. Fixes: e5cfefa9 ("block: fix scan partition for exclusively open device again") Reviewed-by: NJan Kara <jack@suse.cz> Reviewed-by: NMing Lei <ming.lei@redhat.com> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Signed-off-by: NJens Axboe <axboe@kernel.dk> Conflicts: block/genhd.c Signed-off-by: NLi Lingfeng <lilingfeng3@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Christoph Hellwig 提交于
mainline inclusion from mainline-v5.12~10 commit 68e6582e category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I7F3M1 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.4-rc7&id=68e6582e8f2dc32fd2458b9926564faa1fb4560e ---------------------------------------------- The switch to go through blkdev_get_by_dev means we now ignore the return value from bdev_disk_changed in __blkdev_get. Add a manual check to restore the old semantics. Fixes: 4601b4b1 ("block: reopen the device in blkdev_reread_part") Reported-by: NKarel Zak <kzak@redhat.com> Signed-off-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210421160502.447418-1-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk> Conflicts: block/ioctl.c Signed-off-by: NLi Lingfeng <lilingfeng3@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Yu Kuai 提交于
mainline inclusion from mainline-v6.2-rc1 commit a9a236d2 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6Z1UG CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.3&id=a9a236d238a5e8ab2e74ca62c2c7ba5dd435af77 ---------------------------------------- Currently, if user disable wbt through sysfs, 'enable_state' will be 'WBT_STATE_ON_MANUAL', which will be confusing. Add a new state 'WBT_STATE_OFF_MANUAL' to cover that case. Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20221019121518.3865235-4-yukuai1@huaweicloud.comSigned-off-by: NJens Axboe <axboe@kernel.dk> Conflict: block/blk-wbt.c Signed-off-by: NLi Lingfeng <lilingfeng3@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Salman Qazi 提交于
mainline inclusion from mainline-v5.8-rc1 commit 28d65729 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6RQVT CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.3-rc4&id=28d65729b050977d8a9125e6726871e83bd22124 -------------------------------- Flushes bypass the I/O scheduler and get added to hctx->dispatch in blk_mq_sched_bypass_insert. This can happen while a kworker is running hctx->run_work work item and is past the point in blk_mq_sched_dispatch_requests where hctx->dispatch is checked. The blk_mq_do_dispatch_sched call is not guaranteed to end in bounded time, because the I/O scheduler can feed an arbitrary number of commands. Since we have only one hctx->run_work, the commands waiting in hctx->dispatch will wait an arbitrary length of time for run_work to be rerun. A similar phenomenon exists with dispatches from the software queue. The solution is to poll hctx->dispatch in blk_mq_do_dispatch_sched and blk_mq_do_dispatch_ctx and return from the run_work handler and let it rerun. Signed-off-by: NSalman Qazi <sqazi@google.com> Reviewed-by: NMing Lei <ming.lei@redhat.com> Signed-off-by: NJens Axboe <axboe@kernel.dk> Conflicts: block/blk-mq-sched.c Signed-off-by: NLi Lingfeng <lilingfeng3@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
mainline inclusion from mainline-v5.19-rc1 commit 91e8bcd7 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I7JJ9V CVE: NA -------------------------------- The access to cryptd_queue::cpu_queue is synchronized by disabling preemption in cryptd_enqueue_request() and disabling BH in cryptd_queue_worker(). This implies that access is allowed from BH. If cryptd_enqueue_request() is invoked from preemptible context _and_ soft interrupt then this can lead to list corruption since cryptd_enqueue_request() is not protected against access from soft interrupt. Replace get_cpu() in cryptd_enqueue_request() with local_bh_disable() to ensure BH is always disabled. Remove preempt_disable() from cryptd_queue_worker() since it is not needed because local_bh_disable() ensures synchronisation. Fixes: 254eff77 ("crypto: cryptd - Per-CPU thread implementation...") Signed-off-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Conflicts: crypto/cryptd.c Signed-off-by: NGUO Zihua <guozihua@huawei.com> Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Eric Biggers 提交于
stable inclusion from stable-v4.19.226 commit a6f8ba674655f511bbf0e55d4f16c581f7a8a35a category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I7BZ5U CVE: NA ---------------------------------------- commit 5d73d1e3 upstream. extract_crng() and crng_backtrack_protect() load crng_node_pool with a plain load, which causes undefined behavior if do_numa_crng_init() modifies it concurrently. Fix this by using READ_ONCE(). Note: as per the previous discussion https://lore.kernel.org/lkml/20211219025139.31085-1-ebiggers@kernel.org/T/#u, READ_ONCE() is believed to be sufficient here, and it was requested that it be used here instead of smp_load_acquire(). Also change do_numa_crng_init() to set crng_node_pool using cmpxchg_release() instead of mb() + cmpxchg(), as the former is sufficient here but is more lightweight. Fixes: 1e7f583a ("random: make /dev/urandom scalable for silly userspace programs") Cc: stable@vger.kernel.org Signed-off-by: NEric Biggers <ebiggers@google.com> Acked-by: NPaul E. McKenney <paulmck@kernel.org> Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NGONG, Ruiqi <gongruiqi@huaweicloud.com> Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Li Huafei 提交于
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I7EU4Q CVE: NA -------------------------------- We get the following crash caused by a null pointer access: <SNIP> BUG: kernel NULL pointer dereference, address: 0000000000000000 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI CPU: 54 PID: 2469325 Comm: ftracetest Kdump: loaded Tainted: GFS OE 5.10.0+ #12 Hardware name: Huawei RH2288H V3/BC11HGSA0, BIOS 3.35 10/20/2016 RIP: 0010:resume_execution+0x35/0x190 Code: 41 54 55 48 89 fd 53 48 89 f3 48 83 ec 08 4c 8b 6f 60 4c 8b b6 98 00 00 00 48 89 14 24 4c 8b 67 28 4d 89 ef eb 04 49 83 c5 01 <41> 0f b6 7d 00 e8 f1 12 57 00 83 e0 0f 8d 48 ff 83 f9 0a 76 e7 83 RSP: 0018:fffffe16118acec0 EFLAGS: 00010086 RAX: 0000000000000000 RBX: fffffe16118acf58 RCX: 00000000eefdca76 RDX: ffff8f8cffd1f400 RSI: fffffe16118acf58 RDI: ffff8f5500d3c000 RBP: ffff8f5500d3c000 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff970e39c0 R13: 0000000000000000 R14: ffffb55ba0bd7c60 R15: 0000000000000000 FS: 00002aaaaad22740(0000) GS:ffff8f8cffd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000003ac202c003 CR4: 00000000003706e0 DR0: ffffffff98000160 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Call Trace: <#DB> kprobe_debug_handler+0x41/0xd0 exc_debug+0xe5/0x1b0 asm_exc_debug+0x19/0x30 RIP: 0010:copy_from_kernel_nofault.part.0+0x55/0xc0 Code: 85 c0 74 e1 65 48 8b 04 25 40 f0 01 00 83 a8 78 14 00 00 01 48 c7 c0 f2 ff ff ff c3 cc cc cc cc 48 83 fa 03 76 16 31 c0 8b 0e <89> 0f 85 c0 75 d4 48 83 c7 04 48 83 c6 04 48 83 ea 04 48 83 fa 01 RSP: 0018:ffffb55ba0bd7c60 EFLAGS: 00000246 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000076207325 RDX: 0000000000000004 RSI: ffffffff98000160 RDI: ffff8f5500c0902c RBP: ffff8f5500c0902c R08: ffffb55ba0bd7bdc R09: 0000000000000001 R10: ffff8f5548a0faf8 R11: 0000000000000000 R12: ffffffff98000160 R13: 0000000000000000 R14: ffff8f560da128c0 R15: 0000000000000000 </#DB> process_fetch_insn+0xfb/0x720 kprobe_trace_func+0x199/0x2c0 ? kernel_clone+0x5/0x2f0 kprobe_dispatcher+0x3d/0x60 aggr_pre_handler+0x40/0x80 ? kernel_clone+0x1/0x2f0 kprobe_ftrace_handler+0x82/0xf0 ? __se_sys_clone+0x65/0x90 ftrace_ops_assist_func+0x86/0x110 ? rcu_nocb_try_bypass+0x1f3/0x370 0xffffffffc07e60c8 ? kernel_clone+0x1/0x2f0 kernel_clone+0x5/0x2f0 <SNIP> The analysis reveals that kprobe and hardware breakpoints conflict in the use of debug exceptions. If we set a hardware breakpoint on a memory address, and at the same time there is a kprobe event that also goes to get the memory of that address, then when kprobe triggers, it will go to read the memory and trigger hardware breakpoint monitoring, at this time, because kprobe handles debug exceptions earlier than hardware breakpoints, it will cause kprobe to incorrectly consider this exception as a kprobe trigger. kprobe will change the status from KPROBE_HIT_ACTIVE to KPROBE_HIT_SS or KPROBE_HIT_SSDONE before single-step execution, so if the current status is KPROBE_HIT_ACTIVE, its not a debug exception triggered by kprobe. Signed-off-by: NLi Huafei <lihuafei1@huawei.com> Reviewed-by: NYe Weihua <yeweihua4@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-