- 08 5月, 2023 28 次提交
-
-
由 Ziyang Xuan 提交于
stable inclusion from stable-v4.19.281 commit f394f690a30a5ec0413c62777a058eaf3d6e10d5 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I715PM CVE: NA -------------------------------- [ Upstream commit ea30388b ] Syzbot reported a bug as following: ===================================================== BUG: KMSAN: uninit-value in arch_atomic64_inc arch/x86/include/asm/atomic64_64.h:88 [inline] BUG: KMSAN: uninit-value in arch_atomic_long_inc include/linux/atomic/atomic-long.h:161 [inline] BUG: KMSAN: uninit-value in atomic_long_inc include/linux/atomic/atomic-instrumented.h:1429 [inline] BUG: KMSAN: uninit-value in __ip6_make_skb+0x2f37/0x30f0 net/ipv6/ip6_output.c:1956 arch_atomic64_inc arch/x86/include/asm/atomic64_64.h:88 [inline] arch_atomic_long_inc include/linux/atomic/atomic-long.h:161 [inline] atomic_long_inc include/linux/atomic/atomic-instrumented.h:1429 [inline] __ip6_make_skb+0x2f37/0x30f0 net/ipv6/ip6_output.c:1956 ip6_finish_skb include/net/ipv6.h:1122 [inline] ip6_push_pending_frames+0x10e/0x550 net/ipv6/ip6_output.c:1987 rawv6_push_pending_frames+0xb12/0xb90 net/ipv6/raw.c:579 rawv6_sendmsg+0x297e/0x2e60 net/ipv6/raw.c:922 inet_sendmsg+0x101/0x180 net/ipv4/af_inet.c:827 sock_sendmsg_nosec net/socket.c:714 [inline] sock_sendmsg net/socket.c:734 [inline] ____sys_sendmsg+0xa8e/0xe70 net/socket.c:2476 ___sys_sendmsg+0x2a1/0x3f0 net/socket.c:2530 __sys_sendmsg net/socket.c:2559 [inline] __do_sys_sendmsg net/socket.c:2568 [inline] __se_sys_sendmsg net/socket.c:2566 [inline] __x64_sys_sendmsg+0x367/0x540 net/socket.c:2566 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd Uninit was created at: slab_post_alloc_hook mm/slab.h:766 [inline] slab_alloc_node mm/slub.c:3452 [inline] __kmem_cache_alloc_node+0x71f/0xce0 mm/slub.c:3491 __do_kmalloc_node mm/slab_common.c:967 [inline] __kmalloc_node_track_caller+0x114/0x3b0 mm/slab_common.c:988 kmalloc_reserve net/core/skbuff.c:492 [inline] __alloc_skb+0x3af/0x8f0 net/core/skbuff.c:565 alloc_skb include/linux/skbuff.h:1270 [inline] __ip6_append_data+0x51c1/0x6bb0 net/ipv6/ip6_output.c:1684 ip6_append_data+0x411/0x580 net/ipv6/ip6_output.c:1854 rawv6_sendmsg+0x2882/0x2e60 net/ipv6/raw.c:915 inet_sendmsg+0x101/0x180 net/ipv4/af_inet.c:827 sock_sendmsg_nosec net/socket.c:714 [inline] sock_sendmsg net/socket.c:734 [inline] ____sys_sendmsg+0xa8e/0xe70 net/socket.c:2476 ___sys_sendmsg+0x2a1/0x3f0 net/socket.c:2530 __sys_sendmsg net/socket.c:2559 [inline] __do_sys_sendmsg net/socket.c:2568 [inline] __se_sys_sendmsg net/socket.c:2566 [inline] __x64_sys_sendmsg+0x367/0x540 net/socket.c:2566 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd It is because icmp6hdr does not in skb linear region under the scenario of SOCK_RAW socket. Access icmp6_hdr(skb)->icmp6_type directly will trigger the uninit variable access bug. Use a local variable icmp6_type to carry the correct value in different scenarios. Fixes: 14878f75 ("[IPV6]: Add ICMPMsgStats MIB (RFC 4293) [rev 2]") Reported-by: syzbot+8257f4dcef79de670baf@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?id=3d605ec1d0a7f2a269a1a6936ac7f2b85975ee9cSigned-off-by: NZiyang Xuan <william.xuanziyang@huawei.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Waiman Long 提交于
stable inclusion from stable-v4.19.281 commit 30e138e23f43f566b6df87bba51e4d97930671fd category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I715PM CVE: NA -------------------------------- commit ba9182a8 upstream. After a successful cpuset_can_attach() call which increments the attach_in_progress flag, either cpuset_cancel_attach() or cpuset_attach() will be called later. In cpuset_attach(), tasks in cpuset_attach_wq, if present, will be woken up at the end. That is not the case in cpuset_cancel_attach(). So missed wakeup is possible if the attach operation is somehow cancelled. Fix that by doing the wakeup in cpuset_cancel_attach() as well. Fixes: e44193d3 ("cpuset: let hotplug propagation work wait for task attaching") Signed-off-by: NWaiman Long <longman@redhat.com> Reviewed-by: NMichal Koutný <mkoutny@suse.com> Cc: stable@vger.kernel.org # v3.11+ Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Robbie Harwood 提交于
stable inclusion from stable-v4.19.281 commit 96acda7332d7f7f9b71e440d387ddebfe564e2f7 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I715PM CVE: NA -------------------------------- [ Upstream commit 4fc5c74d ] The PE Format Specification (section "The Attribute Certificate Table (Image Only)") states that `dwLength` is to be rounded up to 8-byte alignment when used for traversal. Therefore, the field is not required to be an 8-byte multiple in the first place. Accordingly, pesign has not performed this alignment since version 0.110. This causes kexec failure on pesign'd binaries with "PEFILE: Signature wrapper len wrong". Update the comment and relax the check. Signed-off-by: NRobbie Harwood <rharwood@redhat.com> Signed-off-by: NDavid Howells <dhowells@redhat.com> cc: Jarkko Sakkinen <jarkko@kernel.org> cc: Eric Biederman <ebiederm@xmission.com> cc: Herbert Xu <herbert@gondor.apana.org.au> cc: keyrings@vger.kernel.org cc: linux-crypto@vger.kernel.org cc: kexec@lists.infradead.org Link: https://learn.microsoft.com/en-us/windows/win32/debug/pe-format#the-attribute-certificate-table-image-only Link: https://github.com/rhboot/pesign Link: https://lore.kernel.org/r/20230220171254.592347-2-rharwood@redhat.com/ # v2 Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Eric Dumazet 提交于
stable inclusion from stable-v4.19.281 commit 0638f2b9b220ac33e4657010dd0a130ed0cde7df category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I715PM CVE: NA -------------------------------- [ Upstream commit 1c5950fc ] lena wang reported an issue caused by udpv6_sendmsg() mangling msg->msg_name and msg->msg_namelen, which are later read from ____sys_sendmsg() : /* * If this is sendmmsg() and sending to current destination address was * successful, remember it. */ if (used_address && err >= 0) { used_address->name_len = msg_sys->msg_namelen; if (msg_sys->msg_name) memcpy(&used_address->name, msg_sys->msg_name, used_address->name_len); } udpv6_sendmsg() wants to pretend the remote address family is AF_INET in order to call udp_sendmsg(). A fix would be to modify the address in-place, instead of using a local variable, but this could have other side effects. Instead, restore initial values before we return from udpv6_sendmsg(). Fixes: c71d8ebe ("net: Fix security_socket_sendmsg() bypass problem.") Reported-by: Nlena wang <lena.wang@mediatek.com> Signed-off-by: NEric Dumazet <edumazet@google.com> Reviewed-by: NMaciej Żenczykowski <maze@google.com> Link: https://lore.kernel.org/r/20230412130308.1202254-1-edumazet@google.comSigned-off-by: NJakub Kicinski <kuba@kernel.org> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Rongwei Wang 提交于
stable inclusion from stable-v4.19.281 commit a55f268abdb74ac5633b75a09fefb58458e9d2a2 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I715PM CVE: NA -------------------------------- commit 6fe7d6b9 upstream. The si->lock must be held when deleting the si from the available list. Otherwise, another thread can re-add the si to the available list, which can lead to memory corruption. The only place we have found where this happens is in the swapoff path. This case can be described as below: core 0 core 1 swapoff del_from_avail_list(si) waiting try lock si->lock acquire swap_avail_lock and re-add si into swap_avail_head acquire si->lock but missing si already being added again, and continuing to clear SWP_WRITEOK, etc. It can be easily found that a massive warning messages can be triggered inside get_swap_pages() by some special cases, for example, we call madvise(MADV_PAGEOUT) on blocks of touched memory concurrently, meanwhile, run much swapon-swapoff operations (e.g. stress-ng-swap). However, in the worst case, panic can be caused by the above scene. In swapoff(), the memory used by si could be kept in swap_info[] after turning off a swap. This means memory corruption will not be caused immediately until allocated and reset for a new swap in the swapon path. A panic message caused: (with CONFIG_PLIST_DEBUG enabled) ------------[ cut here ]------------ top: 00000000e58a3003, n: 0000000013e75cda, p: 000000008cd4451a prev: 0000000035b1e58a, n: 000000008cd4451a, p: 000000002150ee8d next: 000000008cd4451a, n: 000000008cd4451a, p: 000000008cd4451a WARNING: CPU: 21 PID: 1843 at lib/plist.c:60 plist_check_prev_next_node+0x50/0x70 Modules linked in: rfkill(E) crct10dif_ce(E)... CPU: 21 PID: 1843 Comm: stress-ng Kdump: ... 5.10.134+ Hardware name: Alibaba Cloud ECS, BIOS 0.0.0 02/06/2015 pstate: 60400005 (nZCv daif +PAN -UAO -TCO BTYPE=--) pc : plist_check_prev_next_node+0x50/0x70 lr : plist_check_prev_next_node+0x50/0x70 sp : ffff0018009d3c30 x29: ffff0018009d3c40 x28: ffff800011b32a98 x27: 0000000000000000 x26: ffff001803908000 x25: ffff8000128ea088 x24: ffff800011b32a48 x23: 0000000000000028 x22: ffff001800875c00 x21: ffff800010f9e520 x20: ffff001800875c00 x19: ffff001800fdc6e0 x18: 0000000000000030 x17: 0000000000000000 x16: 0000000000000000 x15: 0736076307640766 x14: 0730073007380731 x13: 0736076307640766 x12: 0730073007380731 x11: 000000000004058d x10: 0000000085a85b76 x9 : ffff8000101436e4 x8 : ffff800011c8ce08 x7 : 0000000000000000 x6 : 0000000000000001 x5 : ffff0017df9ed338 x4 : 0000000000000001 x3 : ffff8017ce62a000 x2 : ffff0017df9ed340 x1 : 0000000000000000 x0 : 0000000000000000 Call trace: plist_check_prev_next_node+0x50/0x70 plist_check_head+0x80/0xf0 plist_add+0x28/0x140 add_to_avail_list+0x9c/0xf0 _enable_swap_info+0x78/0xb4 __do_sys_swapon+0x918/0xa10 __arm64_sys_swapon+0x20/0x30 el0_svc_common+0x8c/0x220 do_el0_svc+0x2c/0x90 el0_svc+0x1c/0x30 el0_sync_handler+0xa8/0xb0 el0_sync+0x148/0x180 irq event stamp: 2082270 Now, si->lock locked before calling 'del_from_avail_list()' to make sure other thread see the si had been deleted and SWP_WRITEOK cleared together, will not reinsert again. This problem exists in versions after stable 5.10.y. Link: https://lkml.kernel.org/r/20230404154716.23058-1-rongwei.wang@linux.alibaba.com Fixes: a2468cc9 ("swap: choose swap device according to numa node") Tested-by: NYongchen Yin <wb-yyc939293@alibaba-inc.com> Signed-off-by: NRongwei Wang <rongwei.wang@linux.alibaba.com> Cc: Bagas Sanjaya <bagasdotme@gmail.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Aaron Lu <aaron.lu@intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 John Keeping 提交于
stable inclusion from stable-v4.19.281 commit 51ba3ee276a8f3e3b0b588526ebd8e9a9331aa2f category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I715PM CVE: NA -------------------------------- commit ea65b418 upstream. If the compiler decides not to inline this function then preemption tracing will always show an IP inside the preemption disabling path and never the function actually calling preempt_{enable,disable}. Link: https://lore.kernel.org/linux-trace-kernel/20230327173647.1690849-1-john@metanate.com Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: stable@vger.kernel.org Fixes: f904f582 ("sched/debug: Fix preempt_disable_ip recording for preempt_disable()") Signed-off-by: NJohn Keeping <john@metanate.com> Signed-off-by: NSteven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Kan Liang 提交于
stable inclusion from stable-v4.19.281 commit c1412fcad345d0c63874068a556cb1c03b87abb5 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I715PM CVE: NA -------------------------------- [ Upstream commit 24d3ae2f ] The same task check in perf_event_set_output has some potential issues for some usages. For the current perf code, there is a problem if using of perf_event_open() to have multiple samples getting into the same mmap’d memory when they are both attached to the same process. https://lore.kernel.org/all/92645262-D319-4068-9C44-2409EF44888E@gmail.com/ Because the event->ctx is not ready when the perf_event_set_output() is invoked in the perf_event_open(). Besides the above issue, before the commit bd275681 ("perf: Rewrite core context handling"), perf record can errors out when sampling with a hardware event and a software event as below. $ perf record -e cycles,dummy --per-thread ls failed to mmap with 22 (Invalid argument) That's because that prior to the commit a hardware event and a software event are from different task context. The problem should be a long time issue since commit c3f00c70 ("perk: Separate find_get_context() from event initialization"). The task struct is stored in the event->hw.target for each per-thread event. It is a more reliable way to determine whether two events are attached to the same task. The event->hw.target was also introduced several years ago by the commit 50f16a8b ("perf: Remove type specific target pointers"). It can not only be used to fix the issue with the current code, but also back port to fix the issues with an older kernel. Note: The event->hw.target was introduced later than commit c3f00c70. The patch may cannot be applied between the commit c3f00c70 and commit 50f16a8b. Anybody that wants to back-port this at that period may have to find other solutions. Fixes: c3f00c70 ("perf: Separate find_get_context() from event initialization") Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NZhengjun Xing <zhengjun.xing@linux.intel.com> Link: https://lkml.kernel.org/r/20230322202449.512091-1-kan.liang@linux.intel.comSigned-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Jakub Kicinski 提交于
stable inclusion from stable-v4.19.281 commit be71c3c75a488ca1594a98df0754094179ec8146 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I715PM CVE: NA -------------------------------- [ Upstream commit 275b471e ] Commit 0db3dc73 ("[NETPOLL]: tx lock deadlock fix") narrowed down the region under netif_tx_trylock() inside netpoll_send_skb(). (At that point in time netif_tx_trylock() would lock all queues of the device.) Taking the tx lock was problematic because driver's cleanup method may take the same lock. So the change made us hold the xmit lock only around xmit, and expected the driver to take care of locking within ->ndo_poll_controller(). Unfortunately this only works if netpoll isn't itself called with the xmit lock already held. Netpoll code is careful and uses trylock(). The drivers, however, may be using plain lock(). Printing while holding the xmit lock is going to result in rare deadlocks. Luckily we record the xmit lock owners, so we can scan all the queues, the same way we scan NAPI owners. If any of the xmit locks is held by the local CPU we better not attempt any polling. It would be nice if we could narrow down the check to only the NAPIs and the queue we're trying to use. I don't see a way to do that now. Reported-by: NRoman Gushchin <roman.gushchin@linux.dev> Fixes: 0db3dc73 ("[NETPOLL]: tx lock deadlock fix") Signed-off-by: NJakub Kicinski <kuba@kernel.org> Reviewed-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Eric Dumazet 提交于
stable inclusion from stable-v4.19.281 commit 824bc1fd2ec3d389c372bc885170f1943642d010 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I715PM CVE: NA -------------------------------- [ Upstream commit 7d63b671 ] syzbot was able to trigger a panic [1] in icmp_glue_bits(), or more exactly in skb_copy_and_csum_bits() There is no repro yet, but I think the issue is that syzbot manages to lower device mtu to a small value, fooling __icmp_send() __icmp_send() must make sure there is enough room for the packet to include at least the headers. We might in the future refactor skb_copy_and_csum_bits() and its callers to no longer crash when something bad happens. [1] kernel BUG at net/core/skbuff.c:3343 ! invalid opcode: 0000 [#1] PREEMPT SMP KASAN CPU: 0 PID: 15766 Comm: syz-executor.0 Not tainted 6.3.0-rc4-syzkaller-00039-gffe78bbd #0 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014 RIP: 0010:skb_copy_and_csum_bits+0x798/0x860 net/core/skbuff.c:3343 Code: f0 c1 c8 08 41 89 c6 e9 73 ff ff ff e8 61 48 d4 f9 e9 41 fd ff ff 48 8b 7c 24 48 e8 52 48 d4 f9 e9 c3 fc ff ff e8 c8 27 84 f9 <0f> 0b 48 89 44 24 28 e8 3c 48 d4 f9 48 8b 44 24 28 e9 9d fb ff ff RSP: 0018:ffffc90000007620 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 00000000000001e8 RCX: 0000000000000100 RDX: ffff8880276f6280 RSI: ffffffff87fdd138 RDI: 0000000000000005 RBP: 0000000000000000 R08: 0000000000000005 R09: 0000000000000000 R10: 00000000000001e8 R11: 0000000000000001 R12: 000000000000003c R13: 0000000000000000 R14: ffff888028244868 R15: 0000000000000b0e FS: 00007fbc81f1c700(0000) GS:ffff88802ca00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000001b2df43000 CR3: 00000000744db000 CR4: 0000000000150ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <IRQ> icmp_glue_bits+0x7b/0x210 net/ipv4/icmp.c:353 __ip_append_data+0x1d1b/0x39f0 net/ipv4/ip_output.c:1161 ip_append_data net/ipv4/ip_output.c:1343 [inline] ip_append_data+0x115/0x1a0 net/ipv4/ip_output.c:1322 icmp_push_reply+0xa8/0x440 net/ipv4/icmp.c:370 __icmp_send+0xb80/0x1430 net/ipv4/icmp.c:765 ipv4_send_dest_unreach net/ipv4/route.c:1239 [inline] ipv4_link_failure+0x5a9/0x9e0 net/ipv4/route.c:1246 dst_link_failure include/net/dst.h:423 [inline] arp_error_report+0xcb/0x1c0 net/ipv4/arp.c:296 neigh_invalidate+0x20d/0x560 net/core/neighbour.c:1079 neigh_timer_handler+0xc77/0xff0 net/core/neighbour.c:1166 call_timer_fn+0x1a0/0x580 kernel/time/timer.c:1700 expire_timers+0x29b/0x4b0 kernel/time/timer.c:1751 __run_timers kernel/time/timer.c:2022 [inline] Fixes: 1da177e4 ("Linux-2.6.12-rc2") Reported-by: syzbot+d373d60fddbdc915e666@syzkaller.appspotmail.com Signed-off-by: NEric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20230330174502.1915328-1-edumazet@google.comSigned-off-by: NJakub Kicinski <kuba@kernel.org> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Linus Torvalds 提交于
stable inclusion from stable-v4.19.280 commit 178ff87d2a0c2d3d74081e1c2efbb33b3487267d category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I715PM CVE: NA -------------------------------- [ Upstream commit 6015b1ac ] The getaffinity() system call uses 'cpumask_size()' to decide how big the CPU mask is - so far so good. It is indeed the allocation size of a cpumask. But the code also assumes that the whole allocation is initialized without actually doing so itself. That's wrong, because we might have fixed-size allocations (making copying and clearing more efficient), but not all of it is then necessarily used if 'nr_cpu_ids' is smaller. Having checked other users of 'cpumask_size()', they all seem to be ok, either using it purely for the allocation size, or explicitly zeroing the cpumask before using the size in bytes to copy it. See for example the ublk_ctrl_get_queue_affinity() function that uses the proper 'zalloc_cpumask_var()' to make sure that the whole mask is cleared, whether the storage is on the stack or if it was an external allocation. Fix this by just zeroing the allocation before using it. Do the same for the compat version of sched_getaffinity(), which had the same logic. Also, for consistency, make sched_getaffinity() use 'cpumask_bits()' to access the bits. For a cpumask_var_t, it ends up being a pointer to the same data either way, but it's just a good idea to treat it like you would a 'cpumask_t'. The compat case already did that. Reported-by: NRyan Roberts <ryan.roberts@arm.com> Link: https://lore.kernel.org/lkml/7d026744-6bd6-6827-0471-b5e8eae0be3f@arm.com/ Cc: Yury Norov <yury.norov@gmail.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Jiasheng Jiang 提交于
stable inclusion from stable-v4.19.280 commit 0d96bd507ed7e7d565b6d53ebd3874686f123b2e category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I715PM CVE: NA -------------------------------- commit d3aa3e06 upstream. Check alloc_precpu()'s return value and return an error from dm_stats_init() if it fails. Update alloc_dev() to fail if dm_stats_init() does. Otherwise, a NULL pointer dereference will occur in dm_stats_cleanup() even if dm-stats isn't being actively used. Fixes: fd2ed4d2 ("dm: add statistics support") Cc: stable@vger.kernel.org Signed-off-by: NJiasheng Jiang <jiasheng@iscas.ac.cn> Signed-off-by: NMike Snitzer <snitzer@kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Coly Li 提交于
stable inclusion from stable-v4.19.280 commit 84e13235e08941ce37aa9ee238b6dd007170f0fc category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I715PM CVE: NA -------------------------------- commit 9bbf5fee upstream. This is an already known issue that dm-thin volume cannot be used as swap, otherwise a deadlock may happen when dm-thin internal memory demand triggers swap I/O on the dm-thin volume itself. But thanks to commit a666e5c0 ("dm: fix deadlock when swapping to encrypted device"), the limit_swap_bios target flag can also be used for dm-thin to avoid the recursive I/O when it is used as swap. Fix is to simply set ti->limit_swap_bios to true in both pool_ctr() and thin_ctr(). In my test, I create a dm-thin volume /dev/vg/swap and use it as swap device. Then I run fio on another dm-thin volume /dev/vg/main and use large --blocksize to trigger swap I/O onto /dev/vg/swap. The following fio command line is used in my test, fio --name recursive-swap-io --lockmem 1 --iodepth 128 \ --ioengine libaio --filename /dev/vg/main --rw randrw \ --blocksize 1M --numjobs 32 --time_based --runtime=12h Without this fix, the whole system can be locked up within 15 seconds. With this fix, there is no any deadlock or hung task observed after 2 hours of running fio. Furthermore, if blocksize is changed from 1M to 128M, after around 30 seconds fio has no visible I/O, and the out-of-memory killer message shows up in kernel message. After around 20 minutes all fio processes are killed and the whole system is back to being alive. This is exactly what is expected when recursive I/O happens on dm-thin volume when it is used as swap. Depends-on: a666e5c0 ("dm: fix deadlock when swapping to encrypted device") Cc: stable@vger.kernel.org Signed-off-by: NColy Li <colyli@suse.de> Acked-by: NMikulas Patocka <mpatocka@redhat.com> Signed-off-by: NMike Snitzer <snitzer@kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Yipeng Zou 提交于
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6WPFT CVE: NA -------------------------------- Since we fix the issue that irq may lose in gic-v3. It also needs fixed in phytium platform. Signed-off-by: NYipeng Zou <zouyipeng@huawei.com> Reviewed-by: NZhang Jianhua <chris.zjh@huawei.com> Reviewed-by: NLiao Chang <liaochang1@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Yipeng Zou 提交于
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6WPFT CVE: NA -------------------------------- Recently, We have a LPI migration issue on the ARM SMP platform. For example, NIC device generates MSI and sends LPI to CPU0 via ITS, meanwhile irqbalance running on CPU1 set irq affinity of NIC to CPU1, the next interrupt will be sent to CPU2, due to the state of irq is still in progress, kernel does not end up performing irq handler on CPU2, which results in some userland service timeouts, the sequence of events is shown as follows: NIC CPU0 CPU1 Generate IRQ#1 READ_IAR Lock irq_desc Set IRQD_IN_PROGRESS Unlock irq_desc Lock irq_desc Change LPI Affinity Unlock irq_desc Call irq_handler Generate IRQ#2 READ_IAR Lock irq_desc Check IRQD_IN_PROGRESS Unlock irq_desc Return from interrupt#2 Lock irq_desc Clear IRQD_IN_PROGRESS Unlock irq_desc return from interrupt#1 For this scenario, The IRQ#2 will be lost. This does cause some exceptions. For further information, see [1]. This patch introduced a new flow handler which combines fasteoi and edge type as a workaround. An additional loop will be executed if the IRQS_PENDING has been setup. Fixes: cc2d3216 ("irqchip: GICv3: ITS command queue") Signed-off-by: NYipeng Zou <zouyipeng@huawei.com> Reviewed-by: NZhang Jianhua <chris.zjh@huawei.com> Reviewed-by: NLiao Chang <liaochang1@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Yipeng Zou 提交于
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I6WPFT CVE: NA -------------------------------- This reverts commit 2e149805. Signed-off-by: NYipeng Zou <zouyipeng@huawei.com> Reviewed-by: NZhang Jianhua <chris.zjh@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Yipeng Zou 提交于
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I6WPFT CVE: NA -------------------------------- This reverts commit eca55c5a. Signed-off-by: NYipeng Zou <zouyipeng@huawei.com> Reviewed-by: NZhang Jianhua <chris.zjh@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Yipeng Zou 提交于
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I6WPFT CVE: NA -------------------------------- This reverts commit 16073a19. Signed-off-by: NYipeng Zou <zouyipeng@huawei.com> Reviewed-by: NZhang Jianhua <chris.zjh@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Yipeng Zou 提交于
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I6WPFT CVE: NA -------------------------------- This reverts commit 6ea55196. Signed-off-by: NYipeng Zou <zouyipeng@huawei.com> Reviewed-by: NZhang Jianhua <chris.zjh@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Arnd Bergmann 提交于
mainline inclusion from mainline-v6.0-rc1~14 commit b04e75a4 category: bugfix bugzilla: 188707, https://gitee.com/src-openeuler/kernel/issues/I6VK2F CVE: CVE-2023-2007 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b04e75a4a8a81887386a0d2dbf605a48e779d2a0 ---------------------------------------- The dpt_i2o driver was fixed to stop using virt_to_bus() in 2008, but it still has a stale reference in an error handling code path that could never work. I submitted a patch to fix this reference earlier, but Hannes Reinecke suggested that removing the driver may be just as good here. The i2o driver layer was removed in 2015 with commit 4a72a7af ("staging: remove i2o subsystem"), but the even older dpt_i2o scsi driver stayed around. The last non-cleanup patches I could find were from Miquel van Smoorenburg and Mark Salyzyn back in 2008, they might know if there is any chance of the hardware still being used anywhere. Link: https://lore.kernel.org/linux-scsi/CAK8P3a1XfwkTOV7qOs1fTxf4vthNBRXKNu8A5V7TWnHT081NGA@mail.gmail.com/T/ Link: https://lore.kernel.org/r/20220624155226.2889613-3-arnd@kernel.org Cc: Miquel van Smoorenburg <mikevs@xs4all.net> Cc: Mark Salyzyn <salyzyn@android.com> Cc: Hannes Reinecke <hare@suse.de> Signed-off-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com> Signed-off-by: NZhong Jinghua <zhongjinghua@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Christoph Hellwig 提交于
mainline inclusion from mainline-v5.16-rc1 commit 94f3cd7d category: bugfix bugzilla: 188015, https://gitee.com/openeuler/kernel/issues/I6OERX CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/commit/?id=94f3cd7d832c28681f1dea54b4dd8606e5e2bc75 -------------------------------- disks_mutex is intended to serialize md_alloc. Extended it to also cover the kobject_uevent call and getting the sysfs dirent to help reducing error handling complexity. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NSong Liu <songliubraving@fb.com> Signed-off-by: NJens Axboe <axboe@kernel.dk> Conflict: drivers/md/md.c Signed-off-by: NLi Nan <linan122@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Eric Dumazet 提交于
mainline inclusion from mainline-v5.18-rc1 commit 7d959f6e category: bugfix bugzilla: 188015, https://gitee.com/openeuler/kernel/issues/I6OERX CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/commit/?id=7d959f6e978cbbca90e26a192cc39480e977182f -------------------------------- Calling mdelay(1000) from process context, even while a reboot is in progress, does not make sense. Using msleep() allows other threads to make progress. Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: linux-raid@vger.kernel.org Signed-off-by: NSong Liu <song@kernel.org> Signed-off-by: NLi Nan <linan122@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 zhangyue 提交于
mainline inclusion from mainline-v5.16-rc5 commit 07641b5f category: bugfix bugzilla: 188015, https://gitee.com/openeuler/kernel/issues/I6OERX CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/commit/?id=07641b5f32f6991758b08da9b1f4173feeb64f2a -------------------------------- In driver/md/md.c, if the function autorun_array() is called, the problem of double free may occur. In function autorun_array(), when the function do_md_run() returns an error, the function do_md_stop() will be called. The function do_md_run() called function md_run(), but in function md_run(), the pointer mddev->private may be freed. The function do_md_stop() called the function __md_stop(), but in function __md_stop(), the pointer mddev->private also will be freed without judging null. At this time, the pointer mddev->private will be double free, so it needs to be judged null or not. Signed-off-by: Nzhangyue <zhangyue1@kylinos.cn> Signed-off-by: NSong Liu <songliubraving@fb.com> Signed-off-by: NLi Nan <linan122@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Li Nan 提交于
hulk inclusion category: bugfix bugzilla: 188569, https://gitee.com/openeuler/kernel/issues/I6ZG5B CVE: NA -------------------------------- badblocks will loss if we set it as below: # echo 1 1 > bad_blocks # echo 3 1 > bad_blocks # echo 1 5 > bad_blocks # cat bad_blocks 1 3 we will combine badblocks if there is an intersection between p[lo] and p[hi] in badblocks_set(). The end of new badblocks is p[hi]'s end now. but p[lo] may cross p[hi] and new end should be the larger of p[lo] and p[hi]. lo: |------------------------| hi: |--------| Fixes: 9e0e252a ("badblocks: Add core badblock management code") Signed-off-by: NLi Nan <linan122@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Li Nan 提交于
hulk inclusion category: bugfix bugzilla: 188569, https://gitee.com/openeuler/kernel/issues/I6ZG5B CVE: NA -------------------------------- Order of badblocks will be reversed if we set a large area at once. 'hi' remains unchanged while adding continuous badblocks is wrong, the next setting is greater than 'hi', it should be added to the next position. Let 'hi' +1 each cycle. # echo 0 2048 > bad_blocks # cat bad_blocks 1536 512 1024 512 512 512 0 512 Fixes: 9e0e252a ("badblocks: Add core badblock management code") Signed-off-by: NLi Nan <linan122@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Li Nan 提交于
hulk inclusion category: bugfix bugzilla: 188569, https://gitee.com/openeuler/kernel/issues/I6ZG5B CVE: NA -------------------------------- bb->changed and unacked_exist is set and badblocks_update_acked() is involked even if no badblocks changes in badblocks_set(). Only update them when badblocks changes. Fixes: 9e0e252a ("badblocks: Add core badblock management code") Signed-off-by: NLi Nan <linan122@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Li Nan 提交于
hulk inclusion category: bugfix bugzilla: 188553, https://gitee.com/openeuler/kernel/issues/I6TNFX CVE: NA -------------------------------- rdev->del_work has not been queued to md_rdev_misc_wq and flush_workqueue will not flush it if tow threads add and remove same device. sysfs might WARN duplicate filename as below. //T1 //T2 mdadm write super add success remove unbind_rdev_from_array md_ioctl flush_workqueue INIT_WORK queue_work md_add_new_disk duplicate filename dev-xxx Check if there is any kobj with the same name, and return busy if true. Fixes: 5792a285 ("md: avoid a deadlock when removing a device from an md array via sysfs") Signed-off-by: NLi Nan <linan122@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Li Nan 提交于
hulk inclusion category: bugfix bugzilla: 188553, https://gitee.com/openeuler/kernel/issues/I6TNFX CVE: NA -------------------------------- If we want to remove a device, first we delete it from mddev->disks list, then init rdev->del_work to put it (see unbind_rdev_from_array()). flush_rdev_wq() traverses mddev->disks to check if there is any pending rdev->del_work, if so, flush it. Howerver, rdev will not be in the list of mddev->disks if rdev->del_work exists, and flush_workqueue() will never be executed. Replace it with flush_workqueue() to ensure del_work has been completed when adding devices. Fixes: cc1ffe61 ("md: add new workqueue for delete rdev") Signed-off-by: NLi Nan <linan122@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Ido Schimmel 提交于
mainline inclusion from mainline-v6.3-rc6 commit c484fcc0 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I718K0 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c484fcc058bada604d7e4e5228d4affb646ddbc2 --------------------------- When a net device is put administratively up, its 'IFF_UP' flag is set (if not set already) and a 'NETDEV_UP' notification is emitted, which causes the 8021q driver to add VLAN ID 0 on the device. The reverse happens when a net device is put administratively down. When changing the type of a bond to Ethernet, its 'IFF_UP' flag is incorrectly cleared, resulting in the kernel skipping the above process and VLAN ID 0 being leaked [1]. Fix by restoring the flag when changing the type to Ethernet, in a similar fashion to the restoration of the 'IFF_SLAVE' flag. The issue can be reproduced using the script in [2], with example out before and after the fix in [3]. [1] unreferenced object 0xffff888103479900 (size 256): comm "ip", pid 329, jiffies 4294775225 (age 28.561s) hex dump (first 32 bytes): 00 a0 0c 15 81 88 ff ff 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffff81a6051a>] kmalloc_trace+0x2a/0xe0 [<ffffffff8406426c>] vlan_vid_add+0x30c/0x790 [<ffffffff84068e21>] vlan_device_event+0x1491/0x21a0 [<ffffffff81440c8e>] notifier_call_chain+0xbe/0x1f0 [<ffffffff8372383a>] call_netdevice_notifiers_info+0xba/0x150 [<ffffffff837590f2>] __dev_notify_flags+0x132/0x2e0 [<ffffffff8375ad9f>] dev_change_flags+0x11f/0x180 [<ffffffff8379af36>] do_setlink+0xb96/0x4060 [<ffffffff837adf6a>] __rtnl_newlink+0xc0a/0x18a0 [<ffffffff837aec6c>] rtnl_newlink+0x6c/0xa0 [<ffffffff837ac64e>] rtnetlink_rcv_msg+0x43e/0xe00 [<ffffffff839a99e0>] netlink_rcv_skb+0x170/0x440 [<ffffffff839a738f>] netlink_unicast+0x53f/0x810 [<ffffffff839a7fcb>] netlink_sendmsg+0x96b/0xe90 [<ffffffff8369d12f>] ____sys_sendmsg+0x30f/0xa70 [<ffffffff836a6d7a>] ___sys_sendmsg+0x13a/0x1e0 unreferenced object 0xffff88810f6a83e0 (size 32): comm "ip", pid 329, jiffies 4294775225 (age 28.561s) hex dump (first 32 bytes): a0 99 47 03 81 88 ff ff a0 99 47 03 81 88 ff ff ..G.......G..... 81 00 00 00 01 00 00 00 cc cc cc cc cc cc cc cc ................ backtrace: [<ffffffff81a6051a>] kmalloc_trace+0x2a/0xe0 [<ffffffff84064369>] vlan_vid_add+0x409/0x790 [<ffffffff84068e21>] vlan_device_event+0x1491/0x21a0 [<ffffffff81440c8e>] notifier_call_chain+0xbe/0x1f0 [<ffffffff8372383a>] call_netdevice_notifiers_info+0xba/0x150 [<ffffffff837590f2>] __dev_notify_flags+0x132/0x2e0 [<ffffffff8375ad9f>] dev_change_flags+0x11f/0x180 [<ffffffff8379af36>] do_setlink+0xb96/0x4060 [<ffffffff837adf6a>] __rtnl_newlink+0xc0a/0x18a0 [<ffffffff837aec6c>] rtnl_newlink+0x6c/0xa0 [<ffffffff837ac64e>] rtnetlink_rcv_msg+0x43e/0xe00 [<ffffffff839a99e0>] netlink_rcv_skb+0x170/0x440 [<ffffffff839a738f>] netlink_unicast+0x53f/0x810 [<ffffffff839a7fcb>] netlink_sendmsg+0x96b/0xe90 [<ffffffff8369d12f>] ____sys_sendmsg+0x30f/0xa70 [<ffffffff836a6d7a>] ___sys_sendmsg+0x13a/0x1e0 [2] ip link add name t-nlmon type nlmon ip link add name t-dummy type dummy ip link add name t-bond type bond mode active-backup ip link set dev t-bond up ip link set dev t-nlmon master t-bond ip link set dev t-nlmon nomaster ip link show dev t-bond ip link set dev t-dummy master t-bond ip link show dev t-bond ip link del dev t-bond ip link del dev t-dummy ip link del dev t-nlmon [3] Before: 12: t-bond: <NO-CARRIER,BROADCAST,MULTICAST,MASTER,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default qlen 1000 link/netlink 12: t-bond: <BROADCAST,MULTICAST,MASTER,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000 link/ether 46:57:39:a4:46:a2 brd ff:ff:ff:ff:ff:ff After: 12: t-bond: <NO-CARRIER,BROADCAST,MULTICAST,MASTER,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default qlen 1000 link/netlink 12: t-bond: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000 link/ether 66:48:7b:74:b6:8a brd ff:ff:ff:ff:ff:ff Fixes: e36b9d16 ("bonding: clean muticast addresses when device changes type") Fixes: 75c78500 ("bonding: remap muticast addresses without using dev_close() and dev_open()") Fixes: 9ec7eb60 ("bonding: restore IFF_MASTER/SLAVE flags on bond enslave ether type change") Reported-by: NMirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr> Link: https://lore.kernel.org/netdev/78a8a03b-6070-3e6b-5042-f848dab16fb8@alu.unizg.hr/Tested-by: NMirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr> Signed-off-by: NIdo Schimmel <idosch@nvidia.com> Acked-by: NJay Vosburgh <jay.vosburgh@canonical.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NDong Chenchen <dongchenchen2@huawei.com> Reviewed-by: NLiu Jian <liujian56@huawei.com> Reviewed-by: NYue Haibing <yuehaibing@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
- 06 5月, 2023 6 次提交
-
-
由 Mike Snitzer 提交于
mainline inclusion from mainline-v6.4-rc1 commit 3d32aaa7 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6YQZS CVE: CVE-2023-2269 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit?id=3d32aaa7e66d5c1479a3c31d6c2c5d45dd0d3b89 ---------------------------------------- syzkaller found the following problematic rwsem locking (with write lock already held): down_read+0x9d/0x450 kernel/locking/rwsem.c:1509 dm_get_inactive_table+0x2b/0xc0 drivers/md/dm-ioctl.c:773 __dev_status+0x4fd/0x7c0 drivers/md/dm-ioctl.c:844 table_clear+0x197/0x280 drivers/md/dm-ioctl.c:1537 In table_clear, it first acquires a write lock https://elixir.bootlin.com/linux/v6.2/source/drivers/md/dm-ioctl.c#L1520 down_write(&_hash_lock); Then before the lock is released at L1539, there is a path shown above: table_clear -> __dev_status -> dm_get_inactive_table -> down_read https://elixir.bootlin.com/linux/v6.2/source/drivers/md/dm-ioctl.c#L773 down_read(&_hash_lock); It tries to acquire the same read lock again, resulting in the deadlock problem. Fix this by moving table_clear()'s __dev_status() call to after its up_write(&_hash_lock); Cc: stable@vger.kernel.org Reported-by: NZheng Zhang <zheng.zhang@email.ucr.edu> Signed-off-by: NMike Snitzer <snitzer@kernel.org> Conflicts: drivers/md/dm-ioctl.c Signed-off-by: NLi Lingfeng <lilingfeng3@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Frederic Weisbecker 提交于
mainline inclusion from mainline-v5.16-rc4 commit 53e87e3c category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6WCC1 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=53e87e3cdc155f20c3417b689df8d2ac88d79576 -------------------------------- When at least one CPU runs in nohz_full mode, a dedicated timekeeper CPU is guaranteed to stay online and to never stop its tick. Meanwhile on some rare case, the dedicated timekeeper may be running with interrupts disabled for a while, such as in stop_machine. If jiffies stop being updated, a nohz_full CPU may end up endlessly programming the next tick in the past, taking the last jiffies update monotonic timestamp as a stale base, resulting in an tick storm. Here is a scenario where it matters: 0) CPU 0 is the timekeeper and CPU 1 a nohz_full CPU. 1) A stop machine callback is queued to execute somewhere. 2) CPU 0 reaches MULTI_STOP_DISABLE_IRQ while CPU 1 is still in MULTI_STOP_PREPARE. Hence CPU 0 can't do its timekeeping duty. CPU 1 can still take IRQs. 3) CPU 1 receives an IRQ which queues a timer callback one jiffy forward. 4) On IRQ exit, CPU 1 schedules the tick one jiffy forward, taking last_jiffies_update as a base. But last_jiffies_update hasn't been updated for 2 jiffies since the timekeeper has interrupts disabled. 5) clockevents_program_event(), which relies on ktime_get(), observes that the expiration is in the past and therefore programs the min delta event on the clock. 6) The tick fires immediately, goto 3) 7) Tick storm, the nohz_full CPU is drown and takes ages to reach MULTI_STOP_DISABLE_IRQ, which is the only way out of this situation. Solve this with unconditionally updating jiffies if the value is stale on nohz_full IRQ entry. IRQs and other disturbances are expected to be rare enough on nohz_full for the unconditional call to ktime_get() to actually matter. Reported-by: NPaul E. McKenney <paulmck@kernel.org> Signed-off-by: NFrederic Weisbecker <frederic@kernel.org> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Tested-by: NPaul E. McKenney <paulmck@kernel.org> Link: https://lore.kernel.org/r/20211026141055.57358-2-frederic@kernel.org Conflicts: kernel/softirq.c Signed-off-by: NYu Liao <liaoyu15@huawei.com> Reviewed-by: NXiongfeng Wang <wangxiongfeng2@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Nikolay Aleksandrov 提交于
mainline inclusion from mainline-v6.3-rc3 commit e667d469 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6WNGK CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e667d469098671261d558be0cd93dca4d285ce1e --------------------------- syzbot reported a warning[1] where the bond device itself is a slave and we try to enslave a non-ethernet device as the first slave which fails but then in the error path when ether_setup() restores the bond device it also clears all flags. In my previous fix[2] I restored the IFF_MASTER flag, but I didn't consider the case that the bond device itself might also be a slave with IFF_SLAVE set, so we need to restore that flag as well. Use the bond_ether_setup helper which does the right thing and restores the bond's flags properly. Steps to reproduce using a nlmon dev: $ ip l add nlmon0 type nlmon $ ip l add bond1 type bond $ ip l add bond2 type bond $ ip l set bond1 master bond2 $ ip l set dev nlmon0 master bond1 $ ip -d l sh dev bond1 22: bond1: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noqueue master bond2 state DOWN mode DEFAULT group default qlen 1000 (now bond1's IFF_SLAVE flag is gone and we'll hit a warning[3] if we try to delete it) [1] https://syzkaller.appspot.com/bug?id=391c7b1f6522182899efba27d891f1743e8eb3ef [2] commit 7d5cd2ce ("bonding: correctly handle bonding type change on enslave failure") [3] example warning: [ 27.008664] bond1: (slave nlmon0): The slave device specified does not support setting the MAC address [ 27.008692] bond1: (slave nlmon0): Error -95 calling set_mac_address [ 32.464639] bond1 (unregistering): Released all slaves [ 32.464685] ------------[ cut here ]------------ [ 32.464686] WARNING: CPU: 1 PID: 2004 at net/core/dev.c:10829 unregister_netdevice_many+0x72a/0x780 [ 32.464694] Modules linked in: br_netfilter bridge bonding virtio_net [ 32.464699] CPU: 1 PID: 2004 Comm: ip Kdump: loaded Not tainted 5.18.0-rc3+ #47 [ 32.464703] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.1-2.fc37 04/01/2014 [ 32.464704] RIP: 0010:unregister_netdevice_many+0x72a/0x780 [ 32.464707] Code: 99 fd ff ff ba 90 1a 00 00 48 c7 c6 f4 02 66 96 48 c7 c7 20 4d 35 96 c6 05 fa c7 2b 02 01 e8 be 6f 4a 00 0f 0b e9 73 fd ff ff <0f> 0b e9 5f fd ff ff 80 3d e3 c7 2b 02 00 0f 85 3b fd ff ff ba 59 [ 32.464710] RSP: 0018:ffffa006422d7820 EFLAGS: 00010206 [ 32.464712] RAX: ffff8f6e077140a0 RBX: ffffa006422d7888 RCX: 0000000000000000 [ 32.464714] RDX: ffff8f6e12edbe58 RSI: 0000000000000296 RDI: ffffffff96d4a520 [ 32.464716] RBP: ffff8f6e07714000 R08: ffffffff96d63600 R09: ffffa006422d7728 [ 32.464717] R10: 0000000000000ec0 R11: ffffffff9698c988 R12: ffff8f6e12edb140 [ 32.464719] R13: dead000000000122 R14: dead000000000100 R15: ffff8f6e12edb140 [ 32.464723] FS: 00007f297c2f1740(0000) GS:ffff8f6e5d900000(0000) knlGS:0000000000000000 [ 32.464725] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 32.464726] CR2: 00007f297bf1c800 CR3: 00000000115e8000 CR4: 0000000000350ee0 [ 32.464730] Call Trace: [ 32.464763] <TASK> [ 32.464767] rtnl_dellink+0x13e/0x380 [ 32.464776] ? cred_has_capability.isra.0+0x68/0x100 [ 32.464780] ? __rtnl_unlock+0x33/0x60 [ 32.464783] ? bpf_lsm_capset+0x10/0x10 [ 32.464786] ? security_capable+0x36/0x50 [ 32.464790] rtnetlink_rcv_msg+0x14e/0x3b0 [ 32.464792] ? _copy_to_iter+0xb1/0x790 [ 32.464796] ? post_alloc_hook+0xa0/0x160 [ 32.464799] ? rtnl_calcit.isra.0+0x110/0x110 [ 32.464802] netlink_rcv_skb+0x50/0xf0 [ 32.464806] netlink_unicast+0x216/0x340 [ 32.464809] netlink_sendmsg+0x23f/0x480 [ 32.464812] sock_sendmsg+0x5e/0x60 [ 32.464815] ____sys_sendmsg+0x22c/0x270 [ 32.464818] ? import_iovec+0x17/0x20 [ 32.464821] ? sendmsg_copy_msghdr+0x59/0x90 [ 32.464823] ? do_set_pte+0xa0/0xe0 [ 32.464828] ___sys_sendmsg+0x81/0xc0 [ 32.464832] ? mod_objcg_state+0xc6/0x300 [ 32.464835] ? refill_obj_stock+0xa9/0x160 [ 32.464838] ? memcg_slab_free_hook+0x1a5/0x1f0 [ 32.464842] __sys_sendmsg+0x49/0x80 [ 32.464847] do_syscall_64+0x3b/0x90 [ 32.464851] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 32.464865] RIP: 0033:0x7f297bf2e5e7 [ 32.464868] Code: 64 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 00 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10 [ 32.464869] RSP: 002b:00007ffd96c824c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e [ 32.464872] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f297bf2e5e7 [ 32.464874] RDX: 0000000000000000 RSI: 00007ffd96c82540 RDI: 0000000000000003 [ 32.464875] RBP: 00000000640f19de R08: 0000000000000001 R09: 000000000000007c [ 32.464876] R10: 00007f297bffabe0 R11: 0000000000000246 R12: 0000000000000001 [ 32.464877] R13: 00007ffd96c82d20 R14: 00007ffd96c82610 R15: 000055bfe38a7020 [ 32.464881] </TASK> [ 32.464882] ---[ end trace 0000000000000000 ]--- Fixes: 7d5cd2ce ("bonding: correctly handle bonding type change on enslave failure") Reported-by: syzbot+9dfc3f3348729cc82277@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?id=391c7b1f6522182899efba27d891f1743e8eb3efSigned-off-by: NNikolay Aleksandrov <razor@blackwall.org> Reviewed-by: NMichal Kubiak <michal.kubiak@intel.com> Acked-by: NJonathan Toppins <jtoppins@redhat.com> Acked-by: NJay Vosburgh <jay.vosburgh@canonical.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NZiyang Xuan <william.xuanziyang@huawei.com> Reviewed-by: NLiu Jian <liujian56@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Nikolay Aleksandrov 提交于
mainline inclusion from mainline-v6.3-rc3 commit 9ec7eb60 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6WNGK CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9ec7eb60dcbcb6c41076defbc5df7bbd95ceaba5 --------------------------- Add bond_ether_setup helper which is used to fix ether_setup() calls in the bonding driver. It takes care of both IFF_MASTER and IFF_SLAVE flags, the former is always restored and the latter only if it was set. If the bond enslaves non-ARPHRD_ETHER device (changes its type), then releases it and enslaves ARPHRD_ETHER device (changes back) then we use ether_setup() to restore the bond device type but it also resets its flags and removes IFF_MASTER and IFF_SLAVE[1]. Use the bond_ether_setup helper to restore both after such transition. [1] reproduce (nlmon is non-ARPHRD_ETHER): $ ip l add nlmon0 type nlmon $ ip l add bond2 type bond mode active-backup $ ip l set nlmon0 master bond2 $ ip l set nlmon0 nomaster $ ip l add bond1 type bond (we use bond1 as ARPHRD_ETHER device to restore bond2's mode) $ ip l set bond1 master bond2 $ ip l sh dev bond2 37: bond2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000 link/ether be:d7:c5:40:5b:cc brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 1500 (notice bond2's IFF_MASTER is missing) Fixes: e36b9d16 ("bonding: clean muticast addresses when device changes type") Signed-off-by: NNikolay Aleksandrov <razor@blackwall.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Conflicts: drivers/net/bonding/bond_main.c Signed-off-by: NZiyang Xuan <william.xuanziyang@huawei.com> Reviewed-by: NLiu Jian <liujian56@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Zheng Wang 提交于
mainline inclusion from mainline-v6.3-rc4 commit 6b6bc5b8 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6ZWOL CVE: CVE-2023-2483 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6b6bc5b8bd2d4ca9e1efa9ae0f98a0b0687ace75 --------------------------- In emac_probe, &adpt->work_thread is bound with emac_work_thread. Then it will be started by timeout handler emac_tx_timeout or a IRQ handler emac_isr. If we remove the driver which will call emac_remove to make cleanup, there may be a unfinished work. The possible sequence is as follows: Fix it by finishing the work before cleanup in the emac_remove and disable timeout response. CPU0 CPU1 |emac_work_thread emac_remove | free_netdev | kfree(netdev); | |emac_reinit_locked |emac_mac_down |//use netdev Fixes: b9b17deb ("net: emac: emac gigabit ethernet controller driver") Signed-off-by: NZheng Wang <zyytlz.wz@163.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> (cherry picked from commit 6b6bc5b8) Signed-off-by: NLiu Jian <liujian56@huawei.com> Reviewed-by: NYue Haibing <yuehaibing@huawei.com> Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Zhihao Cheng 提交于
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I70MZX CVE: NA -------------------------------- Following process: P1 P2 path_openat link_path_walk may_lookup inode_permission(rcu) ovl_permission acl_permission_check check_acl get_cached_acl_rcu ovl_get_inode_acl realinode = ovl_inode_real(ovl_inode) drop_cache __dentry_kill(ovl_dentry) iput(ovl_inode) ovl_destroy_inode(ovl_inode) dput(oi->__upperdentry) dentry_kill(upperdentry) dentry_unlink_inode upperdentry->d_inode = NULL ovl_inode_upper upperdentry = ovl_i_dentry_upper(ovl_inode) d_inode(upperdentry) // returns NULL IS_POSIXACL(realinode) // NULL pointer dereference , will trigger an null pointer dereference at realinode: [ 205.472797] BUG: kernel NULL pointer dereference, address: 0000000000000028 [ 205.476701] CPU: 2 PID: 2713 Comm: ls Not tainted 6.3.0-12064-g2edfa098e750-dirty #1216 [ 205.478754] RIP: 0010:do_ovl_get_acl+0x5d/0x300 [ 205.489584] Call Trace: [ 205.489812] <TASK> [ 205.490014] ovl_get_inode_acl+0x26/0x30 [ 205.490466] get_cached_acl_rcu+0x61/0xa0 [ 205.490908] generic_permission+0x1bf/0x4e0 [ 205.491447] ovl_permission+0x79/0x1b0 [ 205.491917] inode_permission+0x15e/0x2c0 [ 205.492425] link_path_walk+0x115/0x550 [ 205.493311] path_lookupat.isra.0+0xb2/0x200 [ 205.493803] filename_lookup+0xda/0x240 [ 205.495747] vfs_fstatat+0x7b/0xb0 Fetch a reproducer in [Link]. Fix it by checking realinode whether to be NULL before accessing it. Link: https://bugzilla.kernel.org/show_bug.cgi?id=217404 Fixes: 332f606b ("ovl: enable RCU'd ->get_acl()") Signed-off-by: NZhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
- 05 5月, 2023 2 次提交
-
-
由 Gwangun Jung 提交于
stable inclusion from stable-v4.19.282 commit 6ef8120262dfa63d9ec517d724e6f15591473a78 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6ZISA CVE: CVE-2023-31436 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=6ef8120262dfa63d9ec517d724e6f15591473a78 -------------------------------- [ Upstream commit 30379334 ] If the TCA_QFQ_LMAX value is not offered through nlattr, lmax is determined by the MTU value of the network device. The MTU of the loopback device can be set up to 2^31-1. As a result, it is possible to have an lmax value that exceeds QFQ_MIN_LMAX. Due to the invalid lmax value, an index is generated that exceeds the QFQ_MAX_INDEX(=24) value, causing out-of-bounds read/write errors. The following reports a oob access: [ 84.582666] BUG: KASAN: slab-out-of-bounds in qfq_activate_agg.constprop.0 (net/sched/sch_qfq.c:1027 net/sched/sch_qfq.c:1060 net/sched/sch_qfq.c:1313) [ 84.583267] Read of size 4 at addr ffff88810f676948 by task ping/301 [ 84.583686] [ 84.583797] CPU: 3 PID: 301 Comm: ping Not tainted 6.3.0-rc5 #1 [ 84.584164] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 [ 84.584644] Call Trace: [ 84.584787] <TASK> [ 84.584906] dump_stack_lvl (lib/dump_stack.c:107 (discriminator 1)) [ 84.585108] print_report (mm/kasan/report.c:320 mm/kasan/report.c:430) [ 84.585570] kasan_report (mm/kasan/report.c:538) [ 84.585988] qfq_activate_agg.constprop.0 (net/sched/sch_qfq.c:1027 net/sched/sch_qfq.c:1060 net/sched/sch_qfq.c:1313) [ 84.586599] qfq_enqueue (net/sched/sch_qfq.c:1255) [ 84.587607] dev_qdisc_enqueue (net/core/dev.c:3776) [ 84.587749] __dev_queue_xmit (./include/net/sch_generic.h:186 net/core/dev.c:3865 net/core/dev.c:4212) [ 84.588763] ip_finish_output2 (./include/net/neighbour.h:546 net/ipv4/ip_output.c:228) [ 84.589460] ip_output (net/ipv4/ip_output.c:430) [ 84.590132] ip_push_pending_frames (./include/net/dst.h:444 net/ipv4/ip_output.c:126 net/ipv4/ip_output.c:1586 net/ipv4/ip_output.c:1606) [ 84.590285] raw_sendmsg (net/ipv4/raw.c:649) [ 84.591960] sock_sendmsg (net/socket.c:724 net/socket.c:747) [ 84.592084] __sys_sendto (net/socket.c:2142) [ 84.593306] __x64_sys_sendto (net/socket.c:2150) [ 84.593779] do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80) [ 84.593902] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120) [ 84.594070] RIP: 0033:0x7fe568032066 [ 84.594192] Code: 0e 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 41 89 ca 64 8b 04 25 18 00 00 00 85 c09[ 84.594796] RSP: 002b:00007ffce388b4e8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c Code starting with the faulting instruction =========================================== [ 84.595047] RAX: ffffffffffffffda RBX: 00007ffce388cc70 RCX: 00007fe568032066 [ 84.595281] RDX: 0000000000000040 RSI: 00005605fdad6d10 RDI: 0000000000000003 [ 84.595515] RBP: 00005605fdad6d10 R08: 00007ffce388eeec R09: 0000000000000010 [ 84.595749] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000040 [ 84.595984] R13: 00007ffce388cc30 R14: 00007ffce388b4f0 R15: 0000001d00000001 [ 84.596218] </TASK> [ 84.596295] [ 84.596351] Allocated by task 291: [ 84.596467] kasan_save_stack (mm/kasan/common.c:46) [ 84.596597] kasan_set_track (mm/kasan/common.c:52) [ 84.596725] __kasan_kmalloc (mm/kasan/common.c:384) [ 84.596852] __kmalloc_node (./include/linux/kasan.h:196 mm/slab_common.c:967 mm/slab_common.c:974) [ 84.596979] qdisc_alloc (./include/linux/slab.h:610 ./include/linux/slab.h:731 net/sched/sch_generic.c:938) [ 84.597100] qdisc_create (net/sched/sch_api.c:1244) [ 84.597222] tc_modify_qdisc (net/sched/sch_api.c:1680) [ 84.597357] rtnetlink_rcv_msg (net/core/rtnetlink.c:6174) [ 84.597495] netlink_rcv_skb (net/netlink/af_netlink.c:2574) [ 84.597627] netlink_unicast (net/netlink/af_netlink.c:1340 net/netlink/af_netlink.c:1365) [ 84.597759] netlink_sendmsg (net/netlink/af_netlink.c:1942) [ 84.597891] sock_sendmsg (net/socket.c:724 net/socket.c:747) [ 84.598016] ____sys_sendmsg (net/socket.c:2501) [ 84.598147] ___sys_sendmsg (net/socket.c:2557) [ 84.598275] __sys_sendmsg (./include/linux/file.h:31 net/socket.c:2586) [ 84.598399] do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80) [ 84.598520] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120) [ 84.598688] [ 84.598744] The buggy address belongs to the object at ffff88810f674000 [ 84.598744] which belongs to the cache kmalloc-8k of size 8192 [ 84.599135] The buggy address is located 2664 bytes to the right of [ 84.599135] allocated 7904-byte region [ffff88810f674000, ffff88810f675ee0) [ 84.599544] [ 84.599598] The buggy address belongs to the physical page: [ 84.599777] page:00000000e638567f refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x10f670 [ 84.600074] head:00000000e638567f order:3 entire_mapcount:0 nr_pages_mapped:0 pincount:0 [ 84.600330] flags: 0x200000000010200(slab|head|node=0|zone=2) [ 84.600517] raw: 0200000000010200 ffff888100043180 dead000000000122 0000000000000000 [ 84.600764] raw: 0000000000000000 0000000080020002 00000001ffffffff 0000000000000000 [ 84.601009] page dumped because: kasan: bad access detected [ 84.601187] [ 84.601241] Memory state around the buggy address: [ 84.601396] ffff88810f676800: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 84.601620] ffff88810f676880: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 84.601845] >ffff88810f676900: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 84.602069] ^ [ 84.602243] ffff88810f676980: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 84.602468] ffff88810f676a00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 84.602693] ================================================================== [ 84.602924] Disabling lock debugging due to kernel taint Fixes: 3015f3d2 ("pkt_sched: enable QFQ to support TSO/GSO") Reported-by: NGwangun Jung <exsociety@gmail.com> Signed-off-by: NGwangun Jung <exsociety@gmail.com> Acked-by: Jamal Hadi Salim<jhs@mojatatu.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NZhengchao Shao <shaozhengchao@huawei.com> Reviewed-by: NYue Haibing <yuehaibing@huawei.com> Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Baokun Li 提交于
maillist inclusion category: bugfix bugzilla: 188499, https://gitee.com/openeuler/kernel/issues/I6TNVT CVE: NA Reference: https://patchwork.ozlabs.org/project/linux-ext4/patch/20230412124126.2286716-2-libaokun1@huawei.com/ ---------------------------------------- In our fault injection test, we create an ext4 file, migrate it to non-extent based file, then punch a hole and finally trigger a WARN_ON in the ext4_da_update_reserve_space(): EXT4-fs warning (device sda): ext4_da_update_reserve_space:369: ino 14, used 11 with only 10 reserved data blocks When writing back a non-extent based file, if we enable delalloc, the number of reserved blocks will be subtracted from the number of blocks mapped by ext4_ind_map_blocks(), and the extent status tree will be updated. We update the extent status tree by first removing the old extent_status and then inserting the new extent_status. If the block range we remove happens to be in an extent, then we need to allocate another extent_status with ext4_es_alloc_extent(). use old to remove to add new |----------|------------|------------| old extent_status The problem is that the allocation of a new extent_status failed due to a fault injection, and __es_shrink() did not get free memory, resulting in a return of -ENOMEM. Then do_writepages() retries after receiving -ENOMEM, we map to the same extent again, and the number of reserved blocks is again subtracted from the number of blocks in that extent. Since the blocks in the same extent are subtracted twice, we end up triggering WARN_ON at ext4_da_update_reserve_space() because used > ei->i_reserved_data_blocks. For non-extent based file, we update the number of reserved blocks after ext4_ind_map_blocks() is executed, which causes a problem that when we call ext4_ind_map_blocks() to create a block, it doesn't always create a block, but we always reduce the number of reserved blocks. So we move the logic for updating reserved blocks to ext4_ind_map_blocks() to ensure that the number of reserved blocks is updated only after we do succeed in allocating some new blocks. Fixes: 5f634d06 ("ext4: Fix quota accounting error with fallocate") Reviewed-by: NJan Kara <jack@suse.cz> Signed-off-by: NBaokun Li <libaokun1@huawei.com> Reviewed-by: NYang Erkun <yangerkun@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
- 28 4月, 2023 4 次提交
-
-
由 Ma Wupeng 提交于
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6RKHX CVE: NA -------------------------------- commit ee97b610 ("mm: Count mirrored pages in buddy system") proposes to use vmstat events (PGALLOC/PGFREE) to count free reliable pages. However, this can be achieved by counting NR_FREE_PAGES for all non-movable zones, which will have better compatibility. Therefore, replace it. Signed-off-by: NMa Wupeng <mawupeng1@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Baokun Li 提交于
mainline inclusion from mainline-v6.3-rc8 commit 1ba1199e category: bugfix bugzilla: 188601, https://gitee.com/openeuler/kernel/issues/I6TNTC CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1ba1199ec5747f475538c0d25a32804e5ba1dfde -------------------------------- KASAN report null-ptr-deref: ================================================================== BUG: KASAN: null-ptr-deref in bdi_split_work_to_wbs+0x5c5/0x7b0 Write of size 8 at addr 0000000000000000 by task sync/943 CPU: 5 PID: 943 Comm: sync Tainted: 6.3.0-rc5-next-20230406-dirty #461 Call Trace: <TASK> dump_stack_lvl+0x7f/0xc0 print_report+0x2ba/0x340 kasan_report+0xc4/0x120 kasan_check_range+0x1b7/0x2e0 __kasan_check_write+0x24/0x40 bdi_split_work_to_wbs+0x5c5/0x7b0 sync_inodes_sb+0x195/0x630 sync_inodes_one_sb+0x3a/0x50 iterate_supers+0x106/0x1b0 ksys_sync+0x98/0x160 [...] ================================================================== The race that causes the above issue is as follows: cpu1 cpu2 -------------------------|------------------------- inode_switch_wbs INIT_WORK(&isw->work, inode_switch_wbs_work_fn) queue_rcu_work(isw_wq, &isw->work) // queue_work async inode_switch_wbs_work_fn wb_put_many(old_wb, nr_switched) percpu_ref_put_many ref->data->release(ref) cgwb_release queue_work(cgwb_release_wq, &wb->release_work) // queue_work async &wb->release_work cgwb_release_workfn ksys_sync iterate_supers sync_inodes_one_sb sync_inodes_sb bdi_split_work_to_wbs kmalloc(sizeof(*work), GFP_ATOMIC) // alloc memory failed percpu_ref_exit ref->data = NULL kfree(data) wb_get(wb) percpu_ref_get(&wb->refcnt) percpu_ref_get_many(ref, 1) atomic_long_add(nr, &ref->data->count) atomic64_add(i, v) // trigger null-ptr-deref bdi_split_work_to_wbs() traverses &bdi->wb_list to split work into all wbs. If the allocation of new work fails, the on-stack fallback will be used and the reference count of the current wb is increased afterwards. If cgroup writeback membership switches occur before getting the reference count and the current wb is released as old_wd, then calling wb_get() or wb_put() will trigger the null pointer dereference above. This issue was introduced in v4.3-rc7 (see fix tag1). Both sync_inodes_sb() and __writeback_inodes_sb_nr() calls to bdi_split_work_to_wbs() can trigger this issue. For scenarios called via sync_inodes_sb(), originally commit 7fc5854f ("writeback: synchronize sync(2) against cgroup writeback membership switches") reduced the possibility of the issue by adding wb_switch_rwsem, but in v5.14-rc1 (see fix tag2) removed the "inode_io_list_del_locked(inode, old_wb)" from inode_switch_wbs_work_fn() so that wb->state contains WB_has_dirty_io, thus old_wb is not skipped when traversing wbs in bdi_split_work_to_wbs(), and the issue becomes easily reproducible again. To solve this problem, percpu_ref_exit() is called under RCU protection to avoid race between cgwb_release_workfn() and bdi_split_work_to_wbs(). Moreover, replace wb_get() with wb_tryget() in bdi_split_work_to_wbs(), and skip the current wb if wb_tryget() fails because the wb has already been shutdown. Link: https://lkml.kernel.org/r/20230410130826.1492525-1-libaokun1@huawei.com Fixes: b817525a ("writeback: bdi_writeback iteration must not skip dying ones") Signed-off-by: NBaokun Li <libaokun1@huawei.com> Reviewed-by: NJan Kara <jack@suse.cz> Acked-by: NTejun Heo <tj@kernel.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Andreas Dilger <adilger.kernel@dilger.ca> Cc: Christian Brauner <brauner@kernel.org> Cc: Dennis Zhou <dennis@kernel.org> Cc: Hou Tao <houtao1@huawei.com> Cc: yangerkun <yangerkun@huawei.com> Cc: Zhang Yi <yi.zhang@huawei.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Conflicts: mm/backing-dev.c Signed-off-by: NBaokun Li <libaokun1@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Reviewed-by: NYang Erkun <yangerkun@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Xin Long 提交于
mainline inclusion from mainline-v5.19-rc7 commit 181d8d20 category: bugfix bugzilla: 188712,https://gitee.com/src-openeuler/kernel/issues/I6X47E CVE: CVE-2023-2177 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=181d8d2066c000ba0a0e6940a7ad80f1a0e68e9d --------------------------- A NULL pointer dereference was reported by Wei Chen: BUG: kernel NULL pointer dereference, address: 0000000000000000 RIP: 0010:__list_del_entry_valid+0x26/0x80 Call Trace: <TASK> sctp_sched_dequeue_common+0x1c/0x90 sctp_sched_prio_dequeue+0x67/0x80 __sctp_outq_teardown+0x299/0x380 sctp_outq_free+0x15/0x20 sctp_association_free+0xc3/0x440 sctp_do_sm+0x1ca7/0x2210 sctp_assoc_bh_rcv+0x1f6/0x340 This happens when calling sctp_sendmsg without connecting to server first. In this case, a data chunk already queues up in send queue of client side when processing the INIT_ACK from server in sctp_process_init() where it calls sctp_stream_init() to alloc stream_in. If it fails to alloc stream_in all stream_out will be freed in sctp_stream_init's err path. Then in the asoc freeing it will crash when dequeuing this data chunk as stream_out is missing. As we can't free stream out before dequeuing all data from send queue, and this patch is to fix it by moving the err path stream_out/in freeing in sctp_stream_init() to sctp_stream_free() which is eventually called when freeing the asoc in sctp_association_free(). This fix also makes the code in sctp_process_init() more clear. Note that in sctp_association_init() when it fails in sctp_stream_init(), sctp_association_free() will not be called, and in that case it should go to 'stream_free' err path to free stream instead of 'fail_init'. Fixes: 5bbbbe32 ("sctp: introduce stream scheduler foundations") Reported-by: NWei Chen <harperchen1110@gmail.com> Signed-off-by: NXin Long <lucien.xin@gmail.com> Link: https://lore.kernel.org/r/831a3dc100c4908ff76e5bcc363be97f2778bc0b.1658787066.git.lucien.xin@gmail.comSigned-off-by: NJakub Kicinski <kuba@kernel.org> Conflicts: net/sctp/stream.c Signed-off-by: NDong Chenchen <dongchenchen2@huawei.com> Reviewed-by: NYue Haibing <yuehaibing@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Patrisious Haddad 提交于
mainline inclusion from mainline-v6.3-rc1 commit 8d037973 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6X49E CVE: CVE-2023-2176 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8d037973d48c026224ab285e6a06985ccac6f7bf --------------------------- Refactor rdma_bind_addr function so that it doesn't require that the cma destination address be changed before calling it. So now it will update the destination address internally only when it is really needed and after passing all the required checks. Which in turn results in a cleaner and more sensible call and error handling flows for the functions that call it directly or indirectly. Signed-off-by: NPatrisious Haddad <phaddad@nvidia.com> Reported-by: NWei Chen <harperchen1110@gmail.com> Reviewed-by: NMark Zhang <markzhang@nvidia.com> Link: https://lore.kernel.org/r/3d0e9a2fd62bc10ba02fed1c7c48a48638952320.1672819273.git.leonro@nvidia.comSigned-off-by: NLeon Romanovsky <leon@kernel.org> (cherry picked from commit 8d037973) Signed-off-by: NLiu Jian <liujian56@huawei.com> Conflicts: drivers/infiniband/core/cma.c Reviewed-by: NYue Haibing <yuehaibing@huawei.com> Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-