- 29 3月, 2023 6 次提交
-
-
由 Zhong Jinghua 提交于
hulk inclusion category: bugfix bugzilla: 187268, https://gitee.com/openeuler/kernel/issues/I5N162 ---------------------------------------- This reverts commit 51e35e67. There is a new fix for this problem in the mainline patch, so the patch should return to the mainline solution. mainline patch: d36a9ea5 ("block: fix use-after-free of q->q_usage_counter") Fixes: 51e35e67("block: fix null-deref in percpu_ref_put") Signed-off-by: NZhong Jinghua <zhongjinghua@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Dave Chinner 提交于
mainline inclusion from mainline-v5.17-rc3 commit ebb7fb15 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4KIAO CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ebb7fb1557b1d03b906b668aa2164b51e6b7d19a -------------------------------- Trond Myklebust reported soft lockups in XFS IO completion such as this: watchdog: BUG: soft lockup - CPU#12 stuck for 23s! [kworker/12:1:3106] CPU: 12 PID: 3106 Comm: kworker/12:1 Not tainted 4.18.0-305.10.2.el8_4.x86_64 #1 Workqueue: xfs-conv/md127 xfs_end_io [xfs] RIP: 0010:_raw_spin_unlock_irqrestore+0x11/0x20 Call Trace: wake_up_page_bit+0x8a/0x110 iomap_finish_ioend+0xd7/0x1c0 iomap_finish_ioends+0x7f/0xb0 xfs_end_ioend+0x6b/0x100 [xfs] xfs_end_io+0xb9/0xe0 [xfs] process_one_work+0x1a7/0x360 worker_thread+0x1fa/0x390 kthread+0x116/0x130 ret_from_fork+0x35/0x40 Ioends are processed as an atomic completion unit when all the chained bios in the ioend have completed their IO. Logically contiguous ioends can also be merged and completed as a single, larger unit. Both of these things can be problematic as both the bio chains per ioend and the size of the merged ioends processed as a single completion are both unbound. If we have a large sequential dirty region in the page cache, write_cache_pages() will keep feeding us sequential pages and we will keep mapping them into ioends and bios until we get a dirty page at a non-sequential file offset. These large sequential runs can will result in bio and ioend chaining to optimise the io patterns. The pages iunder writeback are pinned within these chains until the submission chaining is broken, allowing the entire chain to be completed. This can result in huge chains being processed in IO completion context. We get deep bio chaining if we have large contiguous physical extents. We will keep adding pages to the current bio until it is full, then we'll chain a new bio to keep adding pages for writeback. Hence we can build bio chains that map millions of pages and tens of gigabytes of RAM if the page cache contains big enough contiguous dirty file regions. This long bio chain pins those pages until the final bio in the chain completes and the ioend can iterate all the chained bios and complete them. OTOH, if we have a physically fragmented file, we end up submitting one ioend per physical fragment that each have a small bio or bio chain attached to them. We do not chain these at IO submission time, but instead we chain them at completion time based on file offset via iomap_ioend_try_merge(). Hence we can end up with unbound ioend chains being built via completion merging. XFS can then do COW remapping or unwritten extent conversion on that merged chain, which involves walking an extent fragment at a time and running a transaction to modify the physical extent information. IOWs, we merge all the discontiguous ioends together into a contiguous file range, only to then process them individually as discontiguous extents. This extent manipulation is computationally expensive and can run in a tight loop, so merging logically contiguous but physically discontigous ioends gains us nothing except for hiding the fact the fact we broke the ioends up into individual physical extents at submission and then need to loop over those individual physical extents at completion. Hence we need to have mechanisms to limit ioend sizes and to break up completion processing of large merged ioend chains: 1. bio chains per ioend need to be bound in length. Pure overwrites go straight to iomap_finish_ioend() in softirq context with the exact bio chain attached to the ioend by submission. Hence the only way to prevent long holdoffs here is to bound ioend submission sizes because we can't reschedule in softirq context. 2. iomap_finish_ioends() has to handle unbound merged ioend chains correctly. This relies on any one call to iomap_finish_ioend() being bound in runtime so that cond_resched() can be issued regularly as the long ioend chain is processed. i.e. this relies on mechanism #1 to limit individual ioend sizes to work correctly. 3. filesystems have to loop over the merged ioends to process physical extent manipulations. This means they can loop internally, and so we break merging at physical extent boundaries so the filesystem can easily insert reschedule points between individual extent manipulations. Signed-off-by: NDave Chinner <dchinner@redhat.com> Reported-and-tested-by: NTrond Myklebust <trondmy@hammerspace.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org> Signed-off-by: NDarrick J. Wong <djwong@kernel.org> Conflicts: include/linux/iomap.h fs/iomap/buffered-io.c fs/xfs/xfs_aops.c [ 6e552494 ("iomap: remove unused private field from ioend") is not applied. 95c4cd05 ("iomap: Convert to_iomap_page to take a folio") is not applied. 8ffd74e9 ("iomap: Convert bio completions to use folios") is not applied. 044c6449 ("xfs: drop unused ioend private merge and setfilesize code") is not applied. ] Signed-off-by: NZhihao Cheng <chengzhihao1@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Li Huafei 提交于
Offering: HULK hulk inclusion category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6KT9C CVE: CVE-2023-1249 -------------------------------- The coredump_params structure is only used as parameters for function pointer members of some structures, such as linux_binfmt, spufs_calls, etc., and the parameters are of pointer type, so adding members of coredump_params will not affect the memory layout. Also coredump_params is used to hold coredump parameters to be passed to coredump functions of different types of binfmt, the driver will not use the structure. Signed-off-by: NLi Huafei <lihuafei1@huawei.com> Reviewed-by: NXu Kuohai <xukuohai@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Eric W. Biederman 提交于
stable inclusion from stable-v5.10.110 commit 558564db44755dfb3e48b0d64de327d20981e950 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6KT9C CVE: CVE-2023-1249 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=558564db44755dfb3e48b0d64de327d20981e950 -------------------------------- commit 390031c9 upstream. Matthew Wilcox reported that there is a missing mmap_lock in file_files_note that could possibly lead to a user after free. Solve this by using the existing vma snapshot for consistency and to avoid the need to take the mmap_lock anywhere in the coredump code except for dump_vma_snapshot. Update the dump_vma_snapshot to capture vm_pgoff and vm_file that are neeeded by fill_files_note. Add free_vma_snapshot to free the captured values of vm_file. Reported-by: NMatthew Wilcox <willy@infradead.org> Link: https://lkml.kernel.org/r/20220131153740.2396974-1-willy@infradead.org Cc: stable@vger.kernel.org Fixes: a07279c9 ("binfmt_elf, binfmt_elf_fdpic: use a VMA list snapshot") Fixes: 2aa362c4 ("coredump: extend core dump note section to contain file names of mapped files") Reviewed-by: NKees Cook <keescook@chromium.org> Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NLi Huafei <lihuafei1@huawei.com> Reviewed-by: NXu Kuohai <xukuohai@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Eric W. Biederman 提交于
stable inclusion from stable-v5.10.110 commit 936c8be4d1447f36ac4d2a464bd03a5cd659c42f category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6KT9C CVE: CVE-2023-1249 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=936c8be4d1447f36ac4d2a464bd03a5cd659c42f -------------------------------- commit 95c5436a upstream. Move the call of dump_vma_snapshot and kvfree(vma_meta) out of the individual coredump routines into do_coredump itself. This makes the code less error prone and easier to maintain. Make the vma snapshot available to the coredump routines in struct coredump_params. This makes it easier to change and update what is captures in the vma snapshot and will be needed for fixing fill_file_notes. Reviewed-by: NJann Horn <jannh@google.com> Reviewed-by: NKees Cook <keescook@chromium.org> Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NLi Huafei <lihuafei1@huawei.com> Reviewed-by: NXu Kuohai <xukuohai@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Eric Dumazet 提交于
mainline inclusion from mainline-v5.16-rc1 commit f941eadd category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6O293 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.3-rc3&id=f941eadd8d6d4ee2f8c9aeab8e1da5e647533a7d --------------------------- __bpf_prog_run() can run from non IRQ contexts, meaning it could be re entered if interrupted. This calls for the irq safe variant of u64_stats_update_{begin|end}, or risk a deadlock. This patch is a nop on 64bit arches, fortunately. syzbot report: WARNING: inconsistent lock state 5.12.0-rc3-syzkaller #0 Not tainted -------------------------------- inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage. udevd/4013 [HC0[0]:SC0[0]:HE1:SE1] takes: ff7c9dec (&(&pstats->syncp)->seq){+.?.}-{0:0}, at: sk_filter include/linux/filter.h:867 [inline] ff7c9dec (&(&pstats->syncp)->seq){+.?.}-{0:0}, at: do_one_broadcast net/netlink/af_netlink.c:1468 [inline] ff7c9dec (&(&pstats->syncp)->seq){+.?.}-{0:0}, at: netlink_broadcast_filtered+0x27c/0x4fc net/netlink/af_netlink.c:1520 {IN-SOFTIRQ-W} state was registered at: lock_acquire.part.0+0xf0/0x41c kernel/locking/lockdep.c:5510 lock_acquire+0x6c/0x74 kernel/locking/lockdep.c:5483 do_write_seqcount_begin_nested include/linux/seqlock.h:520 [inline] do_write_seqcount_begin include/linux/seqlock.h:545 [inline] u64_stats_update_begin include/linux/u64_stats_sync.h:129 [inline] bpf_prog_run_pin_on_cpu include/linux/filter.h:624 [inline] bpf_prog_run_clear_cb+0x1bc/0x270 include/linux/filter.h:755 run_filter+0xa0/0x17c net/packet/af_packet.c:2031 packet_rcv+0xc0/0x3e0 net/packet/af_packet.c:2104 dev_queue_xmit_nit+0x2bc/0x39c net/core/dev.c:2387 xmit_one net/core/dev.c:3588 [inline] dev_hard_start_xmit+0x94/0x518 net/core/dev.c:3609 sch_direct_xmit+0x11c/0x1f0 net/sched/sch_generic.c:313 qdisc_restart net/sched/sch_generic.c:376 [inline] __qdisc_run+0x194/0x7f8 net/sched/sch_generic.c:384 qdisc_run include/net/pkt_sched.h:136 [inline] qdisc_run include/net/pkt_sched.h:128 [inline] __dev_xmit_skb net/core/dev.c:3795 [inline] __dev_queue_xmit+0x65c/0xf84 net/core/dev.c:4150 dev_queue_xmit+0x14/0x18 net/core/dev.c:4215 neigh_resolve_output net/core/neighbour.c:1491 [inline] neigh_resolve_output+0x170/0x228 net/core/neighbour.c:1471 neigh_output include/net/neighbour.h:510 [inline] ip6_finish_output2+0x2e4/0x9fc net/ipv6/ip6_output.c:117 __ip6_finish_output net/ipv6/ip6_output.c:182 [inline] __ip6_finish_output+0x164/0x3f8 net/ipv6/ip6_output.c:161 ip6_finish_output+0x2c/0xb0 net/ipv6/ip6_output.c:192 NF_HOOK_COND include/linux/netfilter.h:290 [inline] ip6_output+0x74/0x294 net/ipv6/ip6_output.c:215 dst_output include/net/dst.h:448 [inline] NF_HOOK include/linux/netfilter.h:301 [inline] NF_HOOK include/linux/netfilter.h:295 [inline] mld_sendpack+0x2a8/0x7e4 net/ipv6/mcast.c:1679 mld_send_cr net/ipv6/mcast.c:1975 [inline] mld_ifc_timer_expire+0x1e8/0x494 net/ipv6/mcast.c:2474 call_timer_fn+0xd0/0x570 kernel/time/timer.c:1431 expire_timers kernel/time/timer.c:1476 [inline] __run_timers kernel/time/timer.c:1745 [inline] run_timer_softirq+0x2e4/0x384 kernel/time/timer.c:1758 __do_softirq+0x204/0x7ac kernel/softirq.c:345 do_softirq_own_stack include/asm-generic/softirq_stack.h:10 [inline] invoke_softirq kernel/softirq.c:228 [inline] __irq_exit_rcu+0x1d8/0x200 kernel/softirq.c:422 irq_exit+0x10/0x3c kernel/softirq.c:446 __handle_domain_irq+0xb4/0x120 kernel/irq/irqdesc.c:692 handle_domain_irq include/linux/irqdesc.h:176 [inline] gic_handle_irq+0x84/0xac drivers/irqchip/irq-gic.c:370 __irq_svc+0x5c/0x94 arch/arm/kernel/entry-armv.S:205 debug_smp_processor_id+0x0/0x24 lib/smp_processor_id.c:53 rcu_read_lock_held_common kernel/rcu/update.c:108 [inline] rcu_read_lock_sched_held+0x24/0x7c kernel/rcu/update.c:123 trace_lock_acquire+0x24c/0x278 include/trace/events/lock.h:13 lock_acquire+0x3c/0x74 kernel/locking/lockdep.c:5481 rcu_lock_acquire include/linux/rcupdate.h:267 [inline] rcu_read_lock include/linux/rcupdate.h:656 [inline] avc_has_perm_noaudit+0x6c/0x260 security/selinux/avc.c:1150 selinux_inode_permission+0x140/0x220 security/selinux/hooks.c:3141 security_inode_permission+0x44/0x60 security/security.c:1268 inode_permission.part.0+0x5c/0x13c fs/namei.c:521 inode_permission fs/namei.c:494 [inline] may_lookup fs/namei.c:1652 [inline] link_path_walk.part.0+0xd4/0x38c fs/namei.c:2208 link_path_walk fs/namei.c:2189 [inline] path_lookupat+0x3c/0x1b8 fs/namei.c:2419 filename_lookup+0xa8/0x1a4 fs/namei.c:2453 user_path_at_empty+0x74/0x90 fs/namei.c:2733 do_readlinkat+0x5c/0x12c fs/stat.c:417 __do_sys_readlink fs/stat.c:450 [inline] sys_readlink+0x24/0x28 fs/stat.c:447 ret_fast_syscall+0x0/0x2c arch/arm/mm/proc-v7.S:64 0x7eaa4974 irq event stamp: 298277 hardirqs last enabled at (298277): [<802000d0>] no_work_pending+0x4/0x34 hardirqs last disabled at (298276): [<8020c9b8>] do_work_pending+0x9c/0x648 arch/arm/kernel/signal.c:676 softirqs last enabled at (298216): [<8020167c>] __do_softirq+0x584/0x7ac kernel/softirq.c:372 softirqs last disabled at (298201): [<8024dff4>] do_softirq_own_stack include/asm-generic/softirq_stack.h:10 [inline] softirqs last disabled at (298201): [<8024dff4>] invoke_softirq kernel/softirq.c:228 [inline] softirqs last disabled at (298201): [<8024dff4>] __irq_exit_rcu+0x1d8/0x200 kernel/softirq.c:422 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&(&pstats->syncp)->seq); <Interrupt> lock(&(&pstats->syncp)->seq); *** DEADLOCK *** 1 lock held by udevd/4013: #0: 82b09c5c (rcu_read_lock){....}-{1:2}, at: sk_filter_trim_cap+0x54/0x434 net/core/filter.c:139 stack backtrace: CPU: 1 PID: 4013 Comm: udevd Not tainted 5.12.0-rc3-syzkaller #0 Hardware name: ARM-Versatile Express Backtrace: [<81802550>] (dump_backtrace) from [<818027c4>] (show_stack+0x18/0x1c arch/arm/kernel/traps.c:252) r7:00000080 r6:600d0093 r5:00000000 r4:82b58344 [<818027ac>] (show_stack) from [<81809e98>] (__dump_stack lib/dump_stack.c:79 [inline]) [<818027ac>] (show_stack) from [<81809e98>] (dump_stack+0xb8/0xe8 lib/dump_stack.c:120) [<81809de0>] (dump_stack) from [<81804a00>] (print_usage_bug.part.0+0x228/0x230 kernel/locking/lockdep.c:3806) r7:86bcb768 r6:81a0326c r5:830f96a8 r4:86bcb0c0 [<818047d8>] (print_usage_bug.part.0) from [<802bb1b8>] (print_usage_bug kernel/locking/lockdep.c:3776 [inline]) [<818047d8>] (print_usage_bug.part.0) from [<802bb1b8>] (valid_state kernel/locking/lockdep.c:3818 [inline]) [<818047d8>] (print_usage_bug.part.0) from [<802bb1b8>] (mark_lock_irq kernel/locking/lockdep.c:4021 [inline]) [<818047d8>] (print_usage_bug.part.0) from [<802bb1b8>] (mark_lock.part.0+0xc34/0x136c kernel/locking/lockdep.c:4478) r10:83278fe8 r9:82c6d748 r8:00000000 r7:82c6d2d4 r6:00000004 r5:86bcb768 r4:00000006 [<802ba584>] (mark_lock.part.0) from [<802bc644>] (mark_lock kernel/locking/lockdep.c:4442 [inline]) [<802ba584>] (mark_lock.part.0) from [<802bc644>] (mark_usage kernel/locking/lockdep.c:4391 [inline]) [<802ba584>] (mark_lock.part.0) from [<802bc644>] (__lock_acquire+0x9bc/0x3318 kernel/locking/lockdep.c:4854) r10:86bcb768 r9:86bcb0c0 r8:00000001 r7:00040000 r6:0000075a r5:830f96a8 r4:00000000 [<802bbc88>] (__lock_acquire) from [<802bfb90>] (lock_acquire.part.0+0xf0/0x41c kernel/locking/lockdep.c:5510) r10:00000000 r9:600d0013 r8:00000000 r7:00000000 r6:828a2680 r5:828a2680 r4:861e5bc8 [<802bfaa0>] (lock_acquire.part.0) from [<802bff28>] (lock_acquire+0x6c/0x74 kernel/locking/lockdep.c:5483) r10:8146137c r9:00000000 r8:00000001 r7:00000000 r6:00000000 r5:00000000 r4:ff7c9dec [<802bfebc>] (lock_acquire) from [<81381eb4>] (do_write_seqcount_begin_nested include/linux/seqlock.h:520 [inline]) [<802bfebc>] (lock_acquire) from [<81381eb4>] (do_write_seqcount_begin include/linux/seqlock.h:545 [inline]) [<802bfebc>] (lock_acquire) from [<81381eb4>] (u64_stats_update_begin include/linux/u64_stats_sync.h:129 [inline]) [<802bfebc>] (lock_acquire) from [<81381eb4>] (__bpf_prog_run_save_cb include/linux/filter.h:727 [inline]) [<802bfebc>] (lock_acquire) from [<81381eb4>] (bpf_prog_run_save_cb include/linux/filter.h:741 [inline]) [<802bfebc>] (lock_acquire) from [<81381eb4>] (sk_filter_trim_cap+0x26c/0x434 net/core/filter.c:149) r10:a4095dd0 r9:ff7c9dd0 r8:e44be000 r7:8146137c r6:00000001 r5:8611ba80 r4:00000000 [<81381c48>] (sk_filter_trim_cap) from [<8146137c>] (sk_filter include/linux/filter.h:867 [inline]) [<81381c48>] (sk_filter_trim_cap) from [<8146137c>] (do_one_broadcast net/netlink/af_netlink.c:1468 [inline]) [<81381c48>] (sk_filter_trim_cap) from [<8146137c>] (netlink_broadcast_filtered+0x27c/0x4fc net/netlink/af_netlink.c:1520) r10:00000001 r9:833d6b1c r8:00000000 r7:8572f864 r6:8611ba80 r5:8698d800 r4:8572f800 [<81461100>] (netlink_broadcast_filtered) from [<81463e60>] (netlink_broadcast net/netlink/af_netlink.c:1544 [inline]) [<81461100>] (netlink_broadcast_filtered) from [<81463e60>] (netlink_sendmsg+0x3d0/0x478 net/netlink/af_netlink.c:1925) r10:00000000 r9:00000002 r8:8698d800 r7:000000b7 r6:8611b900 r5:861e5f50 r4:86aa3000 [<81463a90>] (netlink_sendmsg) from [<81321f54>] (sock_sendmsg_nosec net/socket.c:654 [inline]) [<81463a90>] (netlink_sendmsg) from [<81321f54>] (sock_sendmsg+0x3c/0x4c net/socket.c:674) r10:00000000 r9:861e5dd4 r8:00000000 r7:86570000 r6:00000000 r5:86570000 r4:861e5f50 [<81321f18>] (sock_sendmsg) from [<813234d0>] (____sys_sendmsg+0x230/0x29c net/socket.c:2350) r5:00000040 r4:861e5f50 [<813232a0>] (____sys_sendmsg) from [<8132549c>] (___sys_sendmsg+0xac/0xe4 net/socket.c:2404) r10:00000128 r9:861e4000 r8:00000000 r7:00000000 r6:86570000 r5:861e5f50 r4:00000000 [<813253f0>] (___sys_sendmsg) from [<81325684>] (__sys_sendmsg net/socket.c:2433 [inline]) [<813253f0>] (___sys_sendmsg) from [<81325684>] (__do_sys_sendmsg net/socket.c:2442 [inline]) [<813253f0>] (___sys_sendmsg) from [<81325684>] (sys_sendmsg+0x58/0xa0 net/socket.c:2440) r8:80200224 r7:00000128 r6:00000000 r5:7eaa541c r4:86570000 [<8132562c>] (sys_sendmsg) from [<80200060>] (ret_fast_syscall+0x0/0x2c arch/arm/mm/proc-v7.S:64) Exception stack(0x861e5fa8 to 0x861e5ff0) 5fa0: 00000000 00000000 0000000c 7eaa541c 00000000 00000000 5fc0: 00000000 00000000 76fbf840 00000128 00000000 0000008f 7eaa541c 000563f8 5fe0: 00056110 7eaa53e0 00036cec 76c9bf44 r6:76fbf840 r5:00000000 r4:00000000 Fixes: 492ecee8 ("bpf: enable program stats") Reported-by: Nsyzbot <syzkaller@googlegroups.com> Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211026214133.3114279-2-eric.dumazet@gmail.com Conflicts: include/linux/filter.h Signed-off-by: NPu Lehui <pulehui@huawei.com> Reviewed-by: NXu Kuohai <xukuohai@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
- 15 3月, 2023 2 次提交
-
-
由 Yu Kuai 提交于
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6L586 CVE: NA -------------------------------- Currently, for bio-based device, 'ios' and 'sectors' is counted while io is started, while 'nsecs' is counted while io is done. This behaviour is obviously wrong, however we can't fix exist kapis because this will require new parameter, which will cause kapi broken. Hence this patch add some new apis, which will make sure io accounting for bio-based device is precise. Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Mike Christie 提交于
mainline inclusion from mainline-v6.2-rc6 commit 6f1d64b1 category: bugfix bugzilla: 188443, https://gitee.com/openeuler/kernel/issues/I6I8YD CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6f1d64b13097e85abda0f91b5638000afc5f9a06 ---------------------------------------- Bug report and analysis from Ding Hui. During iSCSI session logout, if another task accesses the shost ipaddress attr, we can get a KASAN UAF report like this: [ 276.942144] BUG: KASAN: use-after-free in _raw_spin_lock_bh+0x78/0xe0 [ 276.942535] Write of size 4 at addr ffff8881053b45b8 by task cat/4088 [ 276.943511] CPU: 2 PID: 4088 Comm: cat Tainted: G E 6.1.0-rc8+ #3 [ 276.943997] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020 [ 276.944470] Call Trace: [ 276.944943] <TASK> [ 276.945397] dump_stack_lvl+0x34/0x48 [ 276.945887] print_address_description.constprop.0+0x86/0x1e7 [ 276.946421] print_report+0x36/0x4f [ 276.947358] kasan_report+0xad/0x130 [ 276.948234] kasan_check_range+0x35/0x1c0 [ 276.948674] _raw_spin_lock_bh+0x78/0xe0 [ 276.949989] iscsi_sw_tcp_host_get_param+0xad/0x2e0 [iscsi_tcp] [ 276.951765] show_host_param_ISCSI_HOST_PARAM_IPADDRESS+0xe9/0x130 [scsi_transport_iscsi] [ 276.952185] dev_attr_show+0x3f/0x80 [ 276.953005] sysfs_kf_seq_show+0x1fb/0x3e0 [ 276.953401] seq_read_iter+0x402/0x1020 [ 276.954260] vfs_read+0x532/0x7b0 [ 276.955113] ksys_read+0xed/0x1c0 [ 276.955952] do_syscall_64+0x38/0x90 [ 276.956347] entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 276.956769] RIP: 0033:0x7f5d3a679222 [ 276.957161] Code: c0 e9 b2 fe ff ff 50 48 8d 3d 32 c0 0b 00 e8 a5 fe 01 00 0f 1f 44 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 0f 05 <48> 3d 00 f0 ff ff 77 56 c3 0f 1f 44 00 00 48 83 ec 28 48 89 54 24 [ 276.958009] RSP: 002b:00007ffc864d16a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 [ 276.958431] RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007f5d3a679222 [ 276.958857] RDX: 0000000000020000 RSI: 00007f5d3a4fe000 RDI: 0000000000000003 [ 276.959281] RBP: 00007f5d3a4fe000 R08: 00000000ffffffff R09: 0000000000000000 [ 276.959682] R10: 0000000000000022 R11: 0000000000000246 R12: 0000000000020000 [ 276.960126] R13: 0000000000000003 R14: 0000000000000000 R15: 0000557a26dada58 [ 276.960536] </TASK> [ 276.961357] Allocated by task 2209: [ 276.961756] kasan_save_stack+0x1e/0x40 [ 276.962170] kasan_set_track+0x21/0x30 [ 276.962557] __kasan_kmalloc+0x7e/0x90 [ 276.962923] __kmalloc+0x5b/0x140 [ 276.963308] iscsi_alloc_session+0x28/0x840 [scsi_transport_iscsi] [ 276.963712] iscsi_session_setup+0xda/0xba0 [libiscsi] [ 276.964078] iscsi_sw_tcp_session_create+0x1fd/0x330 [iscsi_tcp] [ 276.964431] iscsi_if_create_session.isra.0+0x50/0x260 [scsi_transport_iscsi] [ 276.964793] iscsi_if_recv_msg+0xc5a/0x2660 [scsi_transport_iscsi] [ 276.965153] iscsi_if_rx+0x198/0x4b0 [scsi_transport_iscsi] [ 276.965546] netlink_unicast+0x4d5/0x7b0 [ 276.965905] netlink_sendmsg+0x78d/0xc30 [ 276.966236] sock_sendmsg+0xe5/0x120 [ 276.966576] ____sys_sendmsg+0x5fe/0x860 [ 276.966923] ___sys_sendmsg+0xe0/0x170 [ 276.967300] __sys_sendmsg+0xc8/0x170 [ 276.967666] do_syscall_64+0x38/0x90 [ 276.968028] entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 276.968773] Freed by task 2209: [ 276.969111] kasan_save_stack+0x1e/0x40 [ 276.969449] kasan_set_track+0x21/0x30 [ 276.969789] kasan_save_free_info+0x2a/0x50 [ 276.970146] __kasan_slab_free+0x106/0x190 [ 276.970470] __kmem_cache_free+0x133/0x270 [ 276.970816] device_release+0x98/0x210 [ 276.971145] kobject_cleanup+0x101/0x360 [ 276.971462] iscsi_session_teardown+0x3fb/0x530 [libiscsi] [ 276.971775] iscsi_sw_tcp_session_destroy+0xd8/0x130 [iscsi_tcp] [ 276.972143] iscsi_if_recv_msg+0x1bf1/0x2660 [scsi_transport_iscsi] [ 276.972485] iscsi_if_rx+0x198/0x4b0 [scsi_transport_iscsi] [ 276.972808] netlink_unicast+0x4d5/0x7b0 [ 276.973201] netlink_sendmsg+0x78d/0xc30 [ 276.973544] sock_sendmsg+0xe5/0x120 [ 276.973864] ____sys_sendmsg+0x5fe/0x860 [ 276.974248] ___sys_sendmsg+0xe0/0x170 [ 276.974583] __sys_sendmsg+0xc8/0x170 [ 276.974891] do_syscall_64+0x38/0x90 [ 276.975216] entry_SYSCALL_64_after_hwframe+0x63/0xcd We can easily reproduce by two tasks: 1. while :; do iscsiadm -m node --login; iscsiadm -m node --logout; done 2. while :; do cat \ /sys/devices/platform/host*/iscsi_host/host*/ipaddress; done iscsid | cat --------------------------------+--------------------------------------- |- iscsi_sw_tcp_session_destroy | |- iscsi_session_teardown | |- device_release | |- iscsi_session_release ||- dev_attr_show |- kfree | |- show_host_param_ | ISCSI_HOST_PARAM_IPADDRESS | |- iscsi_sw_tcp_host_get_param | |- r/w tcp_sw_host->session (UAF) |- iscsi_host_remove | |- iscsi_host_free | Fix the above bug by splitting the session removal into 2 parts: 1. removal from iSCSI class which includes sysfs and removal from host tracking. 2. freeing of session. During iscsi_tcp host and session removal we can remove the session from sysfs then remove the host from sysfs. At this point we know userspace is not accessing the kernel via sysfs so we can free the session and host. Link: https://lore.kernel.org/r/20230117193937.21244-2-michael.christie@oracle.comSigned-off-by: NMike Christie <michael.christie@oracle.com> Reviewed-by: NLee Duncan <lduncan@suse.com> Acked-by: NDing Hui <dinghui@sangfor.com.cn> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com> Signed-off-by: NZhong Jinghua <zhongjinghua@huawei.com> conflicts: drivers/scsi/iscsi_tcp.c Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
- 13 3月, 2023 1 次提交
-
-
由 Bibo Mao 提交于
LoongArch inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I6BWFP -------------------------------- KVM adapts to 5.10 kernel based on 4.19 kernel KVM code. Signed-off-by: NXiangLai Li <lixianglai@loongson.cn> Signed-off-by: NBibo Mao <maobibo@loongson.cn> Change-Id: Iea4333d8e0905ab5c04c725defd0e4c421bfe916
-
- 09 3月, 2023 1 次提交
-
-
由 zhangtianyang 提交于
LoongArch inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I6BWFP -------------------------------- Signed-off-by: Nzhangtianyang <zhangtianyang@loongson.cn> Change-Id: Ie79e0c0a3ebcc1c9016b2be1e434ad3d57bf334f (cherry picked from commit d1c564ec)
-
- 08 3月, 2023 2 次提交
-
-
由 Pietro Borrello 提交于
mainline inclusion from mainline-v6.2-rc7 commit ffe2a225 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6I7U2 CVE: CVE-2023-1075 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ffe2a22562444720b05bdfeb999c03e810d84cbb -------------------------------- tls_is_tx_ready() checks that list_first_entry() does not return NULL. This condition can never happen. For empty lists, list_first_entry() returns the list_entry() of the head, which is a type confusion. Use list_first_entry_or_null() which returns NULL in case of empty lists. Fixes: a42055e8 ("net/tls: Add support for async encryption of records for performance") Signed-off-by: NPietro Borrello <borrello@diag.uniroma1.it> Link: https://lore.kernel.org/r/20230128-list-entry-null-check-tls-v1-1-525bbfe6f0d0@diag.uniroma1.itSigned-off-by: NJakub Kicinski <kuba@kernel.org> Conflicts: net/tls/tls_sw.c Signed-off-by: NZiyang Xuan <william.xuanziyang@huawei.com> Reviewed-by: NYue Haibing <yuehaibing@huawei.com> Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Pietro Borrello 提交于
mainline inclusion from mainline-v6.3-rc1 commit 584f3742 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6I7UC CVE: CVE-2023-1076 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=584f3742890e966d2f0a1f3c418c9ead70b2d99e -------------------------------- Add sock_init_data_uid() to explicitly initialize the socket uid. To initialise the socket uid, sock_init_data() assumes a the struct socket* sock is always embedded in a struct socket_alloc, used to access the corresponding inode uid. This may not be true. Examples are sockets created in tun_chr_open() and tap_open(). Fixes: 86741ec2 ("net: core: Add a UID field to struct sock.") Signed-off-by: NPietro Borrello <borrello@diag.uniroma1.it> Reviewed-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NBaisong Zhong <zhongbaisong@huawei.com> Reviewed-by: NLiu Jian <liujian56@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
- 02 3月, 2023 1 次提交
-
-
由 DuanqiangWen 提交于
driver inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I66W4Y CVE: NA workaround if multiple functions on a PCl device belong to the same IOMMU group, they can be directly assigned to only one VM as well, letting multiple functions belong to different IOMMU group. Signed-off-by: NDuanqiangWen <duanqiangwen@net-swift.com>
-
- 28 2月, 2023 14 次提交
-
-
由 Eric Dumazet 提交于
stable inclusion from stable-v5.10.166 commit c60fe70078d6e515f424cb868d07e00411b27fbc category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6H5DD CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=c60fe70078d6e515f424cb868d07e00411b27fbc -------------------------------- [ Upstream commit 3a415d59 ] syzbot reported a nasty crash [1] in net_tx_action() which made little sense until we got a repro. This repro installs a taprio qdisc, but providing an invalid TCA_RATE attribute. qdisc_create() has to destroy the just initialized taprio qdisc, and taprio_destroy() is called. However, the hrtimer used by taprio had already fired, therefore advance_sched() called __netif_schedule(). Then net_tx_action was trying to use a destroyed qdisc. We can not undo the __netif_schedule(), so we must wait until one cpu serviced the qdisc before we can proceed. Many thanks to Alexander Potapenko for his help. [1] BUG: KMSAN: uninit-value in queued_spin_trylock include/asm-generic/qspinlock.h:94 [inline] BUG: KMSAN: uninit-value in do_raw_spin_trylock include/linux/spinlock.h:191 [inline] BUG: KMSAN: uninit-value in __raw_spin_trylock include/linux/spinlock_api_smp.h:89 [inline] BUG: KMSAN: uninit-value in _raw_spin_trylock+0x92/0xa0 kernel/locking/spinlock.c:138 queued_spin_trylock include/asm-generic/qspinlock.h:94 [inline] do_raw_spin_trylock include/linux/spinlock.h:191 [inline] __raw_spin_trylock include/linux/spinlock_api_smp.h:89 [inline] _raw_spin_trylock+0x92/0xa0 kernel/locking/spinlock.c:138 spin_trylock include/linux/spinlock.h:359 [inline] qdisc_run_begin include/net/sch_generic.h:187 [inline] qdisc_run+0xee/0x540 include/net/pkt_sched.h:125 net_tx_action+0x77c/0x9a0 net/core/dev.c:5086 __do_softirq+0x1cc/0x7fb kernel/softirq.c:571 run_ksoftirqd+0x2c/0x50 kernel/softirq.c:934 smpboot_thread_fn+0x554/0x9f0 kernel/smpboot.c:164 kthread+0x31b/0x430 kernel/kthread.c:376 ret_from_fork+0x1f/0x30 Uninit was created at: slab_post_alloc_hook mm/slab.h:732 [inline] slab_alloc_node mm/slub.c:3258 [inline] __kmalloc_node_track_caller+0x814/0x1250 mm/slub.c:4970 kmalloc_reserve net/core/skbuff.c:358 [inline] __alloc_skb+0x346/0xcf0 net/core/skbuff.c:430 alloc_skb include/linux/skbuff.h:1257 [inline] nlmsg_new include/net/netlink.h:953 [inline] netlink_ack+0x5f3/0x12b0 net/netlink/af_netlink.c:2436 netlink_rcv_skb+0x55d/0x6c0 net/netlink/af_netlink.c:2507 rtnetlink_rcv+0x30/0x40 net/core/rtnetlink.c:6108 netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline] netlink_unicast+0xf3b/0x1270 net/netlink/af_netlink.c:1345 netlink_sendmsg+0x1288/0x1440 net/netlink/af_netlink.c:1921 sock_sendmsg_nosec net/socket.c:714 [inline] sock_sendmsg net/socket.c:734 [inline] ____sys_sendmsg+0xabc/0xe90 net/socket.c:2482 ___sys_sendmsg+0x2a1/0x3f0 net/socket.c:2536 __sys_sendmsg net/socket.c:2565 [inline] __do_sys_sendmsg net/socket.c:2574 [inline] __se_sys_sendmsg net/socket.c:2572 [inline] __x64_sys_sendmsg+0x367/0x540 net/socket.c:2572 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd CPU: 0 PID: 13 Comm: ksoftirqd/0 Not tainted 6.0.0-rc2-syzkaller-47461-gac3859c02d7f #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/22/2022 Fixes: 5a781ccb ("tc: Add support for configuring the taprio scheduler") Reported-by: Nsyzbot <syzkaller@googlegroups.com> Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Vinicius Costa Gomes <vinicius.gomes@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NZhengchao Shao <shaozhengchao@huawei.com> Reviewed-by: NYue Haibing <yuehaibing@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Li Lingfeng 提交于
Offering: HULK hulk inclusion category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6BTWC ------------------- Commit 5125c6de8709("[Backport] io_uring: import 5.15-stable io_uring") changes some structs, so we need to fix kabi broken problem. Signed-off-by: NLi Lingfeng <lilingfeng3@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Jens Axboe 提交于
stable inclusion from stable-v5.10.162 commit 788d0824269bef539fe31a785b1517882eafed93 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6BTWC CVE: CVE-2023-0240 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v5.10.167&id=788d0824269bef539fe31a785b1517882eafed93 -------------------------------- No upstream commit exists. This imports the io_uring codebase from 5.15.85, wholesale. Changes from that code base: - Drop IOCB_ALLOC_CACHE, we don't have that in 5.10. - Drop MKDIRAT/SYMLINKAT/LINKAT. Would require further VFS backports, and we don't support these in 5.10 to begin with. - sock_from_file() old style calling convention. - Use compat_get_bitmap() only for CONFIG_COMPAT=y Signed-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NLi Lingfeng <lilingfeng3@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Jens Axboe 提交于
stable inclusion from stable-v5.10.162 commit ed3005032993da7a3fe2e6095436e0bc2e83d011 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6BTWC CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v5.10.168&id=ed3005032993da7a3fe2e6095436e0bc2e83d011 -------------------------------- [ Upstream commit c7aab1a7 ] The only exported helper we have right now is task_work_cancel(), which cancels any task_work from a given task where func matches the queued work item. This is a bit too coarse for some use cases. Add a task_work_cancel_match() that allows to more specifically target individual work items outside of purely the callback function used. task_work_cancel() can be trivially implemented on top of that, hence do so. Reviewed-by: NOleg Nesterov <oleg@redhat.com> Signed-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NLi Lingfeng <lilingfeng3@huawei.com> Acked-by: NZhang Yi <yi.zhang@huawei.com> Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Jens Axboe 提交于
stable inclusion from stable-v5.10.162 commit 1500fed00878fd59b2d6a1832b8d3f7c261a5671 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6BTWC CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v5.10.168&id=1500fed00878fd59b2d6a1832b8d3f7c261a5671 -------------------------------- [ Upstream commit cc440e87 ] Provide a generic helper for setting up an io_uring worker. Returns a task_struct so that the caller can do whatever setup is needed, then call wake_up_new_task() to kick it into gear. Add a kernel_clone_args member, io_thread, which tells copy_process() to mark the task with PF_IO_WORKER. Signed-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NLi Lingfeng <lilingfeng3@huawei.com> Acked-by: NZhang Yi <yi.zhang@huawei.com> Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Jens Axboe 提交于
stable inclusion from stable-v5.10.162 commit 90a2c3821bbfe8435bde901953871576a1bf8c6d category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6BTWC CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v5.10.168&id=90a2c3821bbfe8435bde901953871576a1bf8c6d -------------------------------- [ Upstream commit e296dc49 ] It's available everywhere now, no need to check or add dummy defines. Signed-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NLi Lingfeng <lilingfeng3@huawei.com> Acked-by: NZhang Yi <yi.zhang@huawei.com> Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Jens Axboe 提交于
stable inclusion from stable-v5.10.162 commit 3c295bd2ddaecf3509458c86bf7ba610042f3609 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6BTWC CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v5.10.168&id=3c295bd2ddaecf3509458c86bf7ba610042f3609 -------------------------------- [ Upstream commit 12db8b69 ] Add TIF_NOTIFY_SIGNAL handling in the generic entry code, which if set, will return true if signal_pending() is used in a wait loop. That causes an exit of the loop so that notify_signal tracehooks can be run. If the wait loop is currently inside a system call, the system call is restarted once task_work has been processed. In preparation for only having arch_do_signal() handle syscall restarts if _TIF_SIGPENDING isn't set, rename it to arch_do_signal_or_restart(). Pass in a boolean that tells the architecture specific signal handler if it should attempt to get a signal, or just process a potential syscall restart. For !CONFIG_GENERIC_ENTRY archs, add the TIF_NOTIFY_SIGNAL handling to get_signal(). This is done to minimize the needed architecture changes to support this feature. Signed-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NOleg Nesterov <oleg@redhat.com> Link: https://lore.kernel.org/r/20201026203230.386348-3-axboe@kernel.dkSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Conflict: include/linux/tracehook.h Signed-off-by: NLi Lingfeng <lilingfeng3@huawei.com> Acked-by: NZhang Yi <yi.zhang@huawei.com> Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Jens Axboe 提交于
stable inclusion from stable-v5.10.162 commit 52cfde6bbf64f7f058efa805c6dbb6332d4de6fa category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6BTWC CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v5.10.168&id=52cfde6bbf64f7f058efa805c6dbb6332d4de6fa -------------------------------- [ Upstream commit 5c251e9d ] This is in preparation for maintaining signal_pending() as the decider of whether or not a schedule() loop should be broken, or continue sleeping. This is different than the core signal use cases, which really need to know whether an actual signal is pending or not. task_sigpending() returns non-zero if TIF_SIGPENDING is set. Only core kernel use cases should care about the distinction between the two, make sure those use the task_sigpending() helper. Signed-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NOleg Nesterov <oleg@redhat.com> Link: https://lore.kernel.org/r/20201026203230.386348-2-axboe@kernel.dkSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NLi Lingfeng <lilingfeng3@huawei.com> Acked-by: NZhang Yi <yi.zhang@huawei.com> Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Jens Axboe 提交于
stable inclusion from stable-v5.10.162 commit e86db87191d8f91dfe787758c6dd646c2bf0c335 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6BTWC CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v5.10.168&id=e86db87191d8f91dfe787758c6dd646c2bf0c335 -------------------------------- [ Upstream commit 8fb0f47a ] In an ideal world, when someone is passed an iov_iter and returns X bytes, then X bytes would have been consumed/advanced from the iov_iter. But we have use cases that always consume the entire iterator, a few examples of that are iomap and bdev O_DIRECT. This means we cannot rely on the state of the iov_iter once we've called ->read_iter() or ->write_iter(). This would be easier if we didn't always have to deal with truncate of the iov_iter, as rewinding would be trivial without that. We recently added a commit to track the truncate state, but that grew the iov_iter by 8 bytes and wasn't the best solution. Implement a helper to save enough of the iov_iter state to sanely restore it after we've called the read/write iterator helpers. This currently only works for IOVEC/BVEC/KVEC as that's all we need, support for other iterator types are left as an exercise for the reader. Link: https://lore.kernel.org/linux-fsdevel/CAHk-=wiacKV4Gh-MYjteU0LwNBSGpWrK-Ov25HdqB1ewinrFPg@mail.gmail.com/Signed-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NLi Lingfeng <lilingfeng3@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Eric W. Biederman 提交于
stable inclusion from stable-v5.10.162 commit 57b20530363d127ab6a82e336275769258eb5f37 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6BTWC CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v5.10.168&id=57b20530363d127ab6a82e336275769258eb5f37 -------------------------------- [ Upstream commit 9fe83c43 ] The function close_fd_get_file is explicitly a variant of __close_fd[1]. Now that __close_fd has been renamed close_fd, rename close_fd_get_file to be consistent with close_fd. When __alloc_fd, __close_fd and __fd_install were introduced the double underscore indicated that the function took a struct files_struct parameter. The function __close_fd_get_file never has so the naming has always been inconsistent. This just cleans things up so there are not any lingering mentions or references __close_fd left in the code. [1] 80cd7956 ("binder: fix use-after-free due to ksys_close() during fdget()") Link: https://lkml.kernel.org/r/20201120231441.29911-23-ebiederm@xmission.comSigned-off-by: NEric W. Biederman <ebiederm@xmission.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NLi Lingfeng <lilingfeng3@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Pavel Begunkov 提交于
stable inclusion from stable-v5.10.162 commit ad0b0137953a2c973958dadf6d222e120e278856 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6BTWC Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v5.10.167&id=ad0b0137953a2c973958dadf6d222e120e278856 -------------------------------- [ Upstream commit d32f89da ] Introduce and reuse a helper that acts similarly to __sys_accept4_file() but returns struct file instead of installing file descriptor. Will be used by io_uring. Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Acked-by: NJakub Kicinski <kuba@kernel.org> Signed-off-by: NJens Axboe <axboe@kernel.dk> Acked-by: NDavid S. Miller <davem@davemloft.net> Link: https://lore.kernel.org/r/c57b9e8e818d93683a3d24f8ca50ca038d1da8c4.1629888991.git.asml.silence@gmail.comSigned-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NLi Lingfeng <lilingfeng3@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Jens Axboe 提交于
stable inclusion from stable-v5.10.162 commit 069ac28d92432dd7cdac0a2c141a1b3b8d4330d5 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6BTWC Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v5.10.167&id=069ac28d92432dd7cdac0a2c141a1b3b8d4330d5 -------------------------------- [ Upstream commit b713c195 ] No functional changes in this patch, needed to provide io_uring support for shutdown(2). Cc: netdev@vger.kernel.org Cc: David S. Miller <davem@davemloft.net> Acked-by: NJakub Kicinski <kuba@kernel.org> Signed-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NLi Lingfeng <lilingfeng3@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Jens Axboe 提交于
stable inclusion from stable-v5.10.162 commit 5683caa7350f389d099b72bfdb289d2073286e32 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6BTWC Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v5.10.167&id=5683caa7350f389d099b72bfdb289d2073286e32 -------------------------------- [ Upstream commit 99668f61 ] Now that we support non-blocking path resolution internally, expose it via openat2() in the struct open_how ->resolve flags. This allows applications using openat2() to limit path resolution to the extent that it is already cached. If the lookup cannot be satisfied in a non-blocking manner, openat2(2) will return -1/-EAGAIN. Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NLi Lingfeng <lilingfeng3@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Jens Axboe 提交于
stable inclusion from stable-v5.10.162 commit c1fe7bd3e1aa85865396b464b31f28b094a4353c category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6BTWC Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v5.10.167&id=c1fe7bd3e1aa85865396b464b31f28b094a4353c -------------------------------- [ Upstream commit 6c6ec2b0 ] io_uring always punts opens to async context, since there's no control over whether the lookup blocks or not. Add LOOKUP_CACHED to support just doing the fast RCU based lookups, which we know will not block. If we can do a cached path resolution of the filename, then we don't have to always punt lookups for a worker. During path resolution, we always do LOOKUP_RCU first. If that fails and we terminate LOOKUP_RCU, then fail a LOOKUP_CACHED attempt as well. Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Conflict: fs/namei.c Signed-off-by: NLi Lingfeng <lilingfeng3@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
- 22 2月, 2023 1 次提交
-
-
由 Kefeng Wang 提交于
mainline inclusion from mainline-v6.2-rc7 commit ac86f547 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6BYND Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ac86f547ca1002aec2ef66b9e64d03f45bbbfbb9 -------------------------------- As commit 18365225 ("hwpoison, memcg: forcibly uncharge LRU pages"), hwpoison will forcibly uncharg a LRU hwpoisoned page, the folio_memcg could be NULl, then, mem_cgroup_track_foreign_dirty_slowpath() could occurs a NULL pointer dereference, let's do not record the foreign writebacks for folio memcg is null in mem_cgroup_track_foreign_dirty() to fix it. Link: https://lkml.kernel.org/r/20230129040945.180629-1-wangkefeng.wang@huawei.com Fixes: 97b27821 ("writeback, memcg: Implement foreign dirty flushing") Signed-off-by: NKefeng Wang <wangkefeng.wang@huawei.com> Reported-by: NMa Wupeng <mawupeng1@huawei.com> Tested-by: NMiko Larsson <mikoxyzzz@gmail.com> Acked-by: NMichal Hocko <mhocko@suse.com> Cc: Jan Kara <jack@suse.cz> Cc: Jens Axboe <axboe@kernel.dk> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Ma Wupeng <mawupeng1@huawei.com> Cc: Naoya Horiguchi <naoya.horiguchi@nec.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Conflicts: include/linux/memcontrol.h Signed-off-by: NLu Jialin <lujialin4@huawei.com> Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com> Reviewed-by: Nguozihua 00570089 <guozihua@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
- 08 2月, 2023 2 次提交
-
-
由 Marco Elver 提交于
stable inclusion from stable-v5.10.163 commit 67349025f00d0749e36386cfcfc32c2887f47fdb category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6D0MR CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=67349025f00d0749e36386cfcfc32c2887f47fdb -------------------------------- [ Upstream commit fa69ee5a ] It turns out that usage of skb extensions can cause memory leaks. Ido Schimmel reported: "[...] there are instances that blindly overwrite 'skb->extensions' by invoking skb_copy_header() after __alloc_skb()." Therefore, give up on using skb extensions for KCOV handle, and instead directly store kcov_handle in sk_buff. Fixes: 6370cc3b ("net: add kcov handle to skb extensions") Fixes: 85ce50d3 ("net: kcov: don't select SKB_EXTENSIONS when there is no NET") Fixes: 97f53a08 ("net: linux/skbuff.h: combine SKB_EXTENSIONS + KCOV handling") Link: https://lore.kernel.org/linux-wireless/20201121160941.GA485907@shredder.lan/Reported-by: NIdo Schimmel <idosch@idosch.org> Signed-off-by: NMarco Elver <elver@google.com> Link: https://lore.kernel.org/r/20201125224840.2014773-1-elver@google.comSigned-off-by: NJakub Kicinski <kuba@kernel.org> Stable-dep-of: db0b124f ("igc: Enhance Qbv scheduling by using first flag bit") Signed-off-by: NSasha Levin <sashal@kernel.org> Conflicts: include/linux/skbuff.h Signed-off-by: NZhengchao Shao <shaozhengchao@huawei.com> Reviewed-by: NLiu Jian <liujian56@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Liu Shixin 提交于
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6ADCF CVE: NA -------------------------------- syzbot is reporting GFP_KERNEL allocation with oom_lock held when reporting memcg OOM [1]. If this allocation triggers the global OOM situation then the system can livelock because the GFP_KERNEL allocation with oom_lock held cannot trigger the global OOM killer because __alloc_pages_may_oom() fails to hold oom_lock. The problem mentioned above has been fixed by patch[2]. The is the same problem in memcg_memfs_info feature too. Refer to the patch[2], fix it by removing the allocation from mem_cgroup_print_memfs_info() completely, and pass static buffer when calling from memcg OOM path. Link: https://syzkaller.appspot.com/bug?extid=2d2aeadc6ce1e1f11d45 [1] Link: https://lkml.kernel.org/r/86afb39f-8c65-bec2-6cfc-c5e3cd600c0b@I-love.SAKURA.ne.jp [2] Fixes: 6b1d4d3a ("mm/memcg_memfs_info: show files that having pages charged in mem_cgroup") Signed-off-by: NLiu Shixin <liushixin2@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
- 31 1月, 2023 1 次提交
-
-
由 Yuanzheng Song 提交于
mainline inclusion from mainline-v5.16-rc1 commit 7e6ec49c category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6AW65 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7e6ec49c18988f1b8dab0677271dafde5f8d9a43 -------------------------------- When reading memcg->socket_pressure in mem_cgroup_under_socket_pressure() and writing memcg->socket_pressure in vmpressure() at the same time, the following data-race occurs: BUG: KCSAN: data-race in __sk_mem_reduce_allocated / vmpressure write to 0xffff8881286f4938 of 8 bytes by task 24550 on cpu 3: vmpressure+0x218/0x230 mm/vmpressure.c:307 shrink_node_memcgs+0x2b9/0x410 mm/vmscan.c:2658 shrink_node+0x9d2/0x11d0 mm/vmscan.c:2769 shrink_zones+0x29f/0x470 mm/vmscan.c:2972 do_try_to_free_pages+0x193/0x6e0 mm/vmscan.c:3027 try_to_free_mem_cgroup_pages+0x1c0/0x3f0 mm/vmscan.c:3345 reclaim_high mm/memcontrol.c:2440 [inline] mem_cgroup_handle_over_high+0x18b/0x4d0 mm/memcontrol.c:2624 tracehook_notify_resume include/linux/tracehook.h:197 [inline] exit_to_user_mode_loop kernel/entry/common.c:164 [inline] exit_to_user_mode_prepare+0x110/0x170 kernel/entry/common.c:191 syscall_exit_to_user_mode+0x16/0x30 kernel/entry/common.c:266 ret_from_fork+0x15/0x30 arch/x86/entry/entry_64.S:289 read to 0xffff8881286f4938 of 8 bytes by interrupt on cpu 1: mem_cgroup_under_socket_pressure include/linux/memcontrol.h:1483 [inline] sk_under_memory_pressure include/net/sock.h:1314 [inline] __sk_mem_reduce_allocated+0x1d2/0x270 net/core/sock.c:2696 __sk_mem_reclaim+0x44/0x50 net/core/sock.c:2711 sk_mem_reclaim include/net/sock.h:1490 [inline] ...... net_rx_action+0x17a/0x480 net/core/dev.c:6864 __do_softirq+0x12c/0x2af kernel/softirq.c:298 run_ksoftirqd+0x13/0x20 kernel/softirq.c:653 smpboot_thread_fn+0x33f/0x510 kernel/smpboot.c:165 kthread+0x1fc/0x220 kernel/kthread.c:292 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296 Fix it by using READ_ONCE() and WRITE_ONCE() to read and write memcg->socket_pressure. Link: https://lkml.kernel.org/r/20211025082843.671690-1-songyuanzheng@huawei.comSigned-off-by: NYuanzheng Song <songyuanzheng@huawei.com> Reviewed-by: NMuchun Song <songmuchun@bytedance.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Roman Gushchin <guro@fb.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Alex Shi <alexs@kernel.org> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NCai Xinchen <caixinchen1@huawei.com> Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
- 18 1月, 2023 6 次提交
-
-
由 Qi Zheng 提交于
mainline inclusion from mainline-v6.1-rc7 commit ea4452de category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I69VVC CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=ea4452de2ae987342fadbdd2c044034e6480daad -------------------------------- When we specify __GFP_NOWARN, we only expect that no warnings will be issued for current caller. But in the __should_failslab() and __should_fail_alloc_page(), the local GFP flags alter the global {failslab|fail_page_alloc}.attr, which is persistent and shared by all tasks. This is not what we expected, let's fix it. [akpm@linux-foundation.org: unexport should_fail_ex()] Link: https://lkml.kernel.org/r/20221118100011.2634-1-zhengqi.arch@bytedance.com Fixes: 3f913fc5 ("mm: fix missing handler for __GFP_NOWARN") Signed-off-by: NQi Zheng <zhengqi.arch@bytedance.com> Reported-by: NDmitry Vyukov <dvyukov@google.com> Reviewed-by: NAkinobu Mita <akinobu.mita@gmail.com> Reviewed-by: NJason Gunthorpe <jgg@nvidia.com> Cc: Akinobu Mita <akinobu.mita@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NYe Weihua <yeweihua4@huawei.com> Reviewed-by: Ntong tiangen <tongtiangen@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Li Lingfeng 提交于
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I685FC CVE: NA -------------------------------- Commit 0845c5803f3f("[Backport] io_uring: disable polling pollfree files") adds a new member in file_operations, so we need to fix kabi broken problem. Signed-off-by: NLi Lingfeng <lilingfeng3@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Pavel Begunkov 提交于
stable inclusion from stable-v5.10.141 commit 28d8d2737e82fc29ff9e788597661abecc7f7994 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I685FC CEV: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v5.10.162&id=28d8d2737e82fc29ff9e788597661abecc7f7994 -------------------------------- Older kernels lack io_uring POLLFREE handling. As only affected files are signalfd and android binder the safest option would be to disable polling those files via io_uring and hope there are no users. Fixes: 221c5eb2 ("io_uring: add support for IORING_OP_POLL") Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> conflicts: include/linux/fs.h Signed-off-by: NLi Lingfeng <lilingfeng3@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Ma Wupeng 提交于
mm: oom_kill: fix KABI broken by "oom_kill.c: futex: delay the OOM reaper to allow time for proper futex cleanup" hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I61FDP CVE: NA ------------------------------- Move oom_reaper_timer from task_struct to task_struct_resvd to fix KABI broken. Signed-off-by: NMa Wupeng <mawupeng1@huawei.com> Reviewed-by: NNanyong Sun <sunnanyong@huawei.com> Reviewed-by: Nchenhui <judy.chenhui@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Nico Pache 提交于
mainline inclusion from mainline-v5.18-rc4 commit e4a38402 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I61FDP CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=e4a38402c36e42df28eb1a5394be87e6571fb48a -------------------------------- The pthread struct is allocated on PRIVATE|ANONYMOUS memory [1] which can be targeted by the oom reaper. This mapping is used to store the futex robust list head; the kernel does not keep a copy of the robust list and instead references a userspace address to maintain the robustness during a process death. A race can occur between exit_mm and the oom reaper that allows the oom reaper to free the memory of the futex robust list before the exit path has handled the futex death: CPU1 CPU2 -------------------------------------------------------------------- page_fault do_exit "signal" wake_oom_reaper oom_reaper oom_reap_task_mm (invalidates mm) exit_mm exit_mm_release futex_exit_release futex_cleanup exit_robust_list get_user (EFAULT- can't access memory) If the get_user EFAULT's, the kernel will be unable to recover the waiters on the robust_list, leaving userspace mutexes hung indefinitely. Delay the OOM reaper, allowing more time for the exit path to perform the futex cleanup. Reproducer: https://gitlab.com/jsavitz/oom_futex_reproducer Based on a patch by Michal Hocko. Link: https://elixir.bootlin.com/glibc/glibc-2.35/source/nptl/allocatestack.c#L370 [1] Link: https://lkml.kernel.org/r/20220414144042.677008-1-npache@redhat.com Fixes: 21292580 ("mm: oom: let oom_reap_task and exit_mmap run concurrently") Signed-off-by: NJoel Savitz <jsavitz@redhat.com> Signed-off-by: NNico Pache <npache@redhat.com> Co-developed-by: NJoel Savitz <jsavitz@redhat.com> Suggested-by: NThomas Gleixner <tglx@linutronix.de> Acked-by: NThomas Gleixner <tglx@linutronix.de> Acked-by: NMichal Hocko <mhocko@suse.com> Cc: Rafael Aquini <aquini@redhat.com> Cc: Waiman Long <longman@redhat.com> Cc: Herton R. Krzesinski <herton@redhat.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Ben Segall <bsegall@google.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Daniel Bristot de Oliveira <bristot@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Joel Savitz <jsavitz@redhat.com> Cc: Darren Hart <dvhart@infradead.org> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NMa Wupeng <mawupeng1@huawei.com> Reviewed-by: NNanyong Sun <sunnanyong@huawei.com> Reviewed-by: Nchenhui <judy.chenhui@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Zheng Zucheng 提交于
hulk inclusion category: feature bugzilla: 187196, https://gitee.com/openeuler/kernel/issues/I612CS CVE: NA ------------------------------- Allocate a new task_struct_resvd object for the recently cloned task Signed-off-by: NZheng Zucheng <zhengzucheng@huawei.com> Reviewed-by: NZhang Qiao <zhangqiao22@huawei.com> Reviewed-by: NNanyong Sun <sunnanyong@huawei.com> Reviewed-by: Nchenhui <judy.chenhui@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
- 11 1月, 2023 1 次提交
-
-
由 Alan Stern 提交于
mainline inclusion from mainline-v6.0-rc4 commit 9c6d7788 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I675RE CVE: CVE-2022-4662 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9c6d778800b921bde3bff3cff5003d1650f942d1 -------------------------------- Automatic kernel fuzzing revealed a recursive locking violation in usb-storage: ============================================ WARNING: possible recursive locking detected 5.18.0 #3 Not tainted -------------------------------------------- kworker/1:3/1205 is trying to acquire lock: ffff888018638db8 (&us_interface_key[i]){+.+.}-{3:3}, at: usb_stor_pre_reset+0x35/0x40 drivers/usb/storage/usb.c:230 but task is already holding lock: ffff888018638db8 (&us_interface_key[i]){+.+.}-{3:3}, at: usb_stor_pre_reset+0x35/0x40 drivers/usb/storage/usb.c:230 ... stack backtrace: CPU: 1 PID: 1205 Comm: kworker/1:3 Not tainted 5.18.0 #3 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 Workqueue: usb_hub_wq hub_event Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106 print_deadlock_bug kernel/locking/lockdep.c:2988 [inline] check_deadlock kernel/locking/lockdep.c:3031 [inline] validate_chain kernel/locking/lockdep.c:3816 [inline] __lock_acquire.cold+0x152/0x3ca kernel/locking/lockdep.c:5053 lock_acquire kernel/locking/lockdep.c:5665 [inline] lock_acquire+0x1ab/0x520 kernel/locking/lockdep.c:5630 __mutex_lock_common kernel/locking/mutex.c:603 [inline] __mutex_lock+0x14f/0x1610 kernel/locking/mutex.c:747 usb_stor_pre_reset+0x35/0x40 drivers/usb/storage/usb.c:230 usb_reset_device+0x37d/0x9a0 drivers/usb/core/hub.c:6109 r871xu_dev_remove+0x21a/0x270 drivers/staging/rtl8712/usb_intf.c:622 usb_unbind_interface+0x1bd/0x890 drivers/usb/core/driver.c:458 device_remove drivers/base/dd.c:545 [inline] device_remove+0x11f/0x170 drivers/base/dd.c:537 __device_release_driver drivers/base/dd.c:1222 [inline] device_release_driver_internal+0x1a7/0x2f0 drivers/base/dd.c:1248 usb_driver_release_interface+0x102/0x180 drivers/usb/core/driver.c:627 usb_forced_unbind_intf+0x4d/0xa0 drivers/usb/core/driver.c:1118 usb_reset_device+0x39b/0x9a0 drivers/usb/core/hub.c:6114 This turned out not to be an error in usb-storage but rather a nested device reset attempt. That is, as the rtl8712 driver was being unbound from a composite device in preparation for an unrelated USB reset (that driver does not have pre_reset or post_reset callbacks), its ->remove routine called usb_reset_device() -- thus nesting one reset call within another. Performing a reset as part of disconnect processing is a questionable practice at best. However, the bug report points out that the USB core does not have any protection against nested resets. Adding a reset_in_progress flag and testing it will prevent such errors in the future. Link: https://lore.kernel.org/all/CAB7eexKUpvX-JNiLzhXBDWgfg2T9e9_0Tw4HQ6keN==voRbP0g@mail.gmail.com/ Cc: stable@vger.kernel.org Reported-and-tested-by: NRondreis <linhaoguo86@gmail.com> Signed-off-by: NAlan Stern <stern@rowland.harvard.edu> Link: https://lore.kernel.org/r/YwkflDxvg0KWqyZK@rowland.harvard.eduSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYuyao Lin <linyuyao1@huawei.com> Reviewed-by: NXiongfeng Wang <wangxiongfeng2@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
- 04 1月, 2023 2 次提交
-
-
由 Zhang Yi 提交于
mainline inclusion from mainline-v6.1-rc1 commit fdee117e category: bugfix bugzilla: 187878,https://gitee.com/openeuler/kernel/issues/I5QJH9 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.1-rc2&id=fdee117ee86479fd2644bcd9ac2b2469e55722d1 -------------------------------- Current ll_rw_block() helper is fragile because it assumes that locked buffer means it's under IO which is submitted by some other who holds the lock, it skip buffer if it failed to get the lock, so it's only safe on the readahead path. Unfortunately, now that most filesystems still use this helper mistakenly on the sync metadata read path. There is no guarantee that the one who holds the buffer lock always submit IO (e.g. buffer_migrate_folio_norefs() after commit 88dbcbb3 ("blkdev: avoid migration stalls for blkdev pages"), it could lead to false positive -EIO when submitting reading IO. This patch add some friendly buffer read helpers to prepare replacing ll_rw_block() and similar calls. We can only call bh_readahead_[] helpers for the readahead paths. Link: https://lkml.kernel.org/r/20220901133505.2510834-3-yi.zhang@huawei.comSigned-off-by: NZhang Yi <yi.zhang@huawei.com> Reviewed-by: NJan Kara <jack@suse.cz> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Conflict: fs/buffer.c include/linux/buffer_head.h Signed-off-by: NLi Lingfeng <lilingfeng3@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Li Nan 提交于
hulk inclusion category: bugfix bugzilla: 188176, https://gitee.com/openeuler/kernel/issues/I67294 CVE: NA ------------------------------- Commit 891e2639 ("scsi: iscsi: Stop queueing during ep_disconnect") introduces .unbind_conn to fix the race between __iscsi_conn_send_pdu() and .ep_disconnect, however it also introduces the KABI problem. Considering the issue is only related with offload iscsi driver but not iscsi_tcp, so tried to revert it, however the above commit is just one patch in a patchset, the following patches depends on it and these patches fix problem related with iscsi_tcp. So just reverting it manually by removing .unbind_conn from iscsi_transport. Signed-off-by: NLi Nan <linan122@huawei.com> Reviewed-by: Nzhangyi (F) <yi.zhang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-