1. 15 May, 2023 6 commits
  2. 25 April, 2023 1 commit
    • X
      kabi: Fix kabi breakage without build warning. · 600130a3
      Xie Haocheng committed
      amd inclusion
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I6XNL2
      CVE: NA
      
      -------------------------------------------------
      Error report detail:
      *** ERROR - ABI BREAKAGE WAS DETECTED ***
      
      The following symbols have been changed (this will cause an ABI breakage):
      new kabi:
      0x65d25289	__SCK__tp_func_xdp_exception	vmlinux	EXPORT_SYMBOL_GPL
      0x5e9265ee	__tracepoint_xdp_exception	vmlinux	EXPORT_SYMBOL_GPL
      old kabi:
      0x5e0fbbff	__SCK__tp_func_xdp_exception	vmlinux	EXPORT_SYMBOL_GPL
      0x017cc464	__tracepoint_xdp_exception	vmlinux	EXPORT_SYMBOL_GPL
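
The comparison behind this report can be sketched as a CRC diff between two symbol listings. This is a minimal illustration, assuming the whitespace-separated `CRC symbol module export-type` layout quoted above; `parse_symvers` and `changed_symbols` are hypothetical helper names, not part of any kernel tooling.

```python
def parse_symvers(text):
    """Map symbol name -> CRC from 'crc symbol module export' lines."""
    table = {}
    for line in text.strip().splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        crc, symbol = fields[0], fields[1]
        table[symbol] = int(crc, 16)
    return table


def changed_symbols(old_text, new_text):
    """Return sorted names whose CRC differs between the two listings."""
    old, new = parse_symvers(old_text), parse_symvers(new_text)
    return sorted(s for s in old.keys() & new.keys() if old[s] != new[s])


# Values copied from the error report above.
OLD = """
0x5e0fbbff\t__SCK__tp_func_xdp_exception\tvmlinux\tEXPORT_SYMBOL_GPL
0x017cc464\t__tracepoint_xdp_exception\tvmlinux\tEXPORT_SYMBOL_GPL
"""
NEW = """
0x65d25289\t__SCK__tp_func_xdp_exception\tvmlinux\tEXPORT_SYMBOL_GPL
0x5e9265ee\t__tracepoint_xdp_exception\tvmlinux\tEXPORT_SYMBOL_GPL
"""

for name in changed_symbols(OLD, NEW):
    print("CRC changed:", name)
```

A symbol whose CRC is identical in both listings is not reported, so an unchanged export does not count as a breakage in this sketch.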
      Signed-off-by: Xie Haocheng <haocheng.xie@amd.com>
  3. 24 April, 2023 2 commits
  4. 19 April, 2023 2 commits
  5. 17 April, 2023 1 commit
  6. 13 April, 2023 1 commit
  7. 12 April, 2023 2 commits
  8. 08 April, 2023 3 commits
  9. 07 April, 2023 1 commit
    • Z
      sched: Introduce priority load balance for CFS · 7d655d41
      zhangsong committed
      euleros inclusion
      category: feature
      bugzilla: https://gitee.com/openeuler/kernel/issues/I5HF3M
      CVE: NA
      
      Reference: NA
      
      --------------------------------
      
      Add new sysctl interface:
      `/proc/sys/kernel/sched_prio_load_balance_enabled`
      
       0: default behavior
       1: enable priority load balance for qos scheduler
      
      For task co-location with the qos scheduler, when CFS does load balancing,
      it is reasonable to prefer migrating online (Latency Sensitive) tasks.
      So the CFS load balance can be changed to below:
      
      1) `cfs_tasks` list is owned by online tasks.
      2) Add new `cfs_offline_tasks` list which is owned by offline tasks.
      3) Prefer to migrate the online tasks of `cfs_tasks` list to dst rq.
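
The three points above can be modelled with a toy runqueue. This is only a sketch in Python, not the kernel implementation: the list names `cfs_tasks` and `cfs_offline_tasks` come from the text, while `RunQueue`, `pick_migration_candidates`, the `budget` parameter, and the fallback to offline tasks once the online list is empty are assumptions made for illustration.

```python
from collections import deque


class RunQueue:
    """Toy runqueue with separate lists for online and offline tasks."""

    def __init__(self):
        self.cfs_tasks = deque()          # online (latency-sensitive) tasks
        self.cfs_offline_tasks = deque()  # offline tasks

    def enqueue(self, name, online):
        target = self.cfs_tasks if online else self.cfs_offline_tasks
        target.append(name)


def pick_migration_candidates(src, budget):
    """Detach up to `budget` tasks from src, draining online tasks first.

    Falling back to the offline list is an assumption of this sketch.
    """
    picked = []
    for task_list in (src.cfs_tasks, src.cfs_offline_tasks):
        while task_list and len(picked) < budget:
            picked.append(task_list.popleft())
    return picked


def demo():
    rq = RunQueue()
    rq.enqueue("x", online=False)
    rq.enqueue("a", online=True)
    rq.enqueue("y", online=False)
    rq.enqueue("b", online=True)
    return pick_migration_candidates(rq, budget=3)


print(demo())
```

Keeping two lists means the balancer never has to scan past offline tasks to find an online one, which is the point of giving offline tasks their own list.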
      Signed-off-by: zhangsong <zhangsong34@huawei.com>
      Reviewed-by: Zhang Qiao <zhangqiao22@huawei.com>
      --------------------------------
      V2->V3:
      - remove skip_migrate_task for load balance
      V1->V2:
      - remove setting cpu shares for offline cgroup
      
       Conflicts:
      	kernel/sched/sched.h
  10. 04 April, 2023 2 commits
  11. 29 March, 2023 6 commits
    • Z
      Revert "block: fix null-deref in percpu_ref_put" · 4134b635
      Zhong Jinghua committed
      hulk inclusion
      category: bugfix
      bugzilla: 187268, https://gitee.com/openeuler/kernel/issues/I5N162
      
      ----------------------------------------
      
      This reverts commit 51e35e67.
      
      There is a new fix for this problem in the mainline patch, so the patch
      should return to the mainline solution.
      
      mainline patch:
      d36a9ea5 ("block: fix use-after-free of q->q_usage_counter")
      
      Fixes: 51e35e67 ("block: fix null-deref in percpu_ref_put")
      Signed-off-by: Zhong Jinghua <zhongjinghua@huawei.com>
      Reviewed-by: Hou Tao <houtao1@huawei.com>
      Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
    • D
      xfs, iomap: limit individual ioend chain lengths in writeback · c61374c2
      Dave Chinner committed
      mainline inclusion
      from mainline-v5.17-rc3
      commit ebb7fb15
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I4KIAO
      CVE: NA
      
      Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ebb7fb1557b1d03b906b668aa2164b51e6b7d19a
      
      --------------------------------
      
      Trond Myklebust reported soft lockups in XFS IO completion such as
      this:
      
       watchdog: BUG: soft lockup - CPU#12 stuck for 23s! [kworker/12:1:3106]
       CPU: 12 PID: 3106 Comm: kworker/12:1 Not tainted 4.18.0-305.10.2.el8_4.x86_64 #1
       Workqueue: xfs-conv/md127 xfs_end_io [xfs]
       RIP: 0010:_raw_spin_unlock_irqrestore+0x11/0x20
       Call Trace:
        wake_up_page_bit+0x8a/0x110
        iomap_finish_ioend+0xd7/0x1c0
        iomap_finish_ioends+0x7f/0xb0
        xfs_end_ioend+0x6b/0x100 [xfs]
        xfs_end_io+0xb9/0xe0 [xfs]
        process_one_work+0x1a7/0x360
        worker_thread+0x1fa/0x390
        kthread+0x116/0x130
        ret_from_fork+0x35/0x40
      
      Ioends are processed as an atomic completion unit when all the
      chained bios in the ioend have completed their IO. Logically
      contiguous ioends can also be merged and completed as a single,
      larger unit.  Both of these things can be problematic as both the
      bio chains per ioend and the size of the merged ioends processed as
      a single completion are both unbound.
      
      If we have a large sequential dirty region in the page cache,
      write_cache_pages() will keep feeding us sequential pages and we
      will keep mapping them into ioends and bios until we get a dirty
      page at a non-sequential file offset. These large sequential runs
      will result in bio and ioend chaining to optimise the io
      patterns. The pages under writeback are pinned within these chains
      until the submission chaining is broken, allowing the entire chain
      to be completed. This can result in huge chains being processed
      in IO completion context.
      
      We get deep bio chaining if we have large contiguous physical
      extents. We will keep adding pages to the current bio until it is
      full, then we'll chain a new bio to keep adding pages for writeback.
      Hence we can build bio chains that map millions of pages and tens of
      gigabytes of RAM if the page cache contains big enough contiguous
      dirty file regions. This long bio chain pins those pages until the
      final bio in the chain completes and the ioend can iterate all the
      chained bios and complete them.
      
      OTOH, if we have a physically fragmented file, we end up submitting
      one ioend per physical fragment that each have a small bio or bio
      chain attached to them. We do not chain these at IO submission time,
      but instead we chain them at completion time based on file
      offset via iomap_ioend_try_merge(). Hence we can end up with unbound
      ioend chains being built via completion merging.
      
      XFS can then do COW remapping or unwritten extent conversion on that
      merged chain, which involves walking an extent fragment at a time
      and running a transaction to modify the physical extent information.
      IOWs, we merge all the discontiguous ioends together into a
      contiguous file range, only to then process them individually as
      discontiguous extents.
      
      This extent manipulation is computationally expensive and can run in
      a tight loop, so merging logically contiguous but physically
      discontiguous ioends gains us nothing except for hiding the fact
      that we broke the ioends up into individual physical extents at
      submission and then need to loop over those individual physical
      extents at completion.
      
      Hence we need to have mechanisms to limit ioend sizes and
      to break up completion processing of large merged ioend chains:
      
      1. bio chains per ioend need to be bound in length. Pure overwrites
      go straight to iomap_finish_ioend() in softirq context with the
      exact bio chain attached to the ioend by submission. Hence the only
      way to prevent long holdoffs here is to bound ioend submission
      sizes because we can't reschedule in softirq context.
      
      2. iomap_finish_ioends() has to handle unbound merged ioend chains
      correctly. This relies on any one call to iomap_finish_ioend() being
      bound in runtime so that cond_resched() can be issued regularly as
      the long ioend chain is processed. i.e. this relies on mechanism #1
      to limit individual ioend sizes to work correctly.
      
      3. filesystems have to loop over the merged ioends to process
      physical extent manipulations. This means they can loop internally,
      and so we break merging at physical extent boundaries so the
      filesystem can easily insert reschedule points between individual
      extent manipulations.
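
The merge rule described in point 3, combined with a size bound, can be sketched as follows. This is a simplified Python model under stated assumptions, not the iomap code: the `Ioend` fields, the `max_bytes` cap, and the helper names are hypothetical, and sizes are assumed to be multiples of a 512-byte sector.

```python
from dataclasses import dataclass

SECTOR_SIZE = 512


@dataclass
class Ioend:
    offset: int  # file offset in bytes
    size: int    # length in bytes
    sector: int  # starting disk sector


def can_merge(a, b, max_bytes):
    """Merge only if b follows a both in the file and on disk, within the cap."""
    logically_contiguous = a.offset + a.size == b.offset
    physically_contiguous = a.sector + a.size // SECTOR_SIZE == b.sector
    within_cap = a.size + b.size <= max_bytes
    return logically_contiguous and physically_contiguous and within_cap


def merge_ioends(ioends, max_bytes):
    """Greedily merge neighbouring ioends in file-offset order."""
    merged = []
    for io in sorted(ioends, key=lambda i: i.offset):
        if merged and can_merge(merged[-1], io, max_bytes):
            merged[-1].size += io.size
        else:
            merged.append(Ioend(io.offset, io.size, io.sector))
    return merged


# Four logically contiguous 4 KiB ioends spread over two physical extents.
EXAMPLE = [
    Ioend(0, 4096, 100),
    Ioend(4096, 4096, 108),
    Ioend(8192, 4096, 500),
    Ioend(12288, 4096, 508),
]

for m in merge_ioends(EXAMPLE, max_bytes=1 << 20):
    print(m)
```

Because merging stops at each physical extent boundary, a caller can process one merged ioend per extent and yield between them, instead of walking one unbounded chain.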
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reported-and-tested-by: Trond Myklebust <trondmy@hammerspace.com>
      Reviewed-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Conflicts:
      	include/linux/iomap.h
      	fs/iomap/buffered-io.c
      	fs/xfs/xfs_aops.c
      
      	[ 6e552494 ("iomap: remove unused private field from ioend")
      	  is not applied.
      	  95c4cd05 ("iomap: Convert to_iomap_page to take a folio") is
      	  not applied.
      	  8ffd74e9 ("iomap: Convert bio completions to use folios") is
      	  not applied.
      	  044c6449 ("xfs: drop unused ioend private merge and
      	  setfilesize code") is not applied. ]
      Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
      Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
      Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
    • L
      coredump: fix kabi broken in struct coredump_params · a4edd5e2
      Li Huafei committed
      Offering: HULK
      hulk inclusion
      category: bugfix
      bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6KT9C
      CVE: CVE-2023-1249
      
      --------------------------------
      
      The coredump_params structure is only used as parameters for function
      pointer members of some structures, such as linux_binfmt, spufs_calls,
      etc., and the parameters are of pointer type, so adding members of
      coredump_params will not affect the memory layout.
      
      Also coredump_params is used to hold coredump parameters to be passed to
      coredump functions of different types of binfmt, the driver will not use
      the structure.
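
The layout argument can be checked with a small experiment. The sketch below uses Python's `ctypes` to stand in for the C structures; the field lists are heavily simplified and the `CoredumpParamsOld`/`CoredumpParamsNew` and `make_binfmt` names are hypothetical. It only illustrates that a structure which refers to `coredump_params` through a pointer argument keeps its size when members are appended to `coredump_params`.

```python
import ctypes


class CoredumpParamsOld(ctypes.Structure):
    # Simplified stand-in for the structure before the change.
    _fields_ = [("limit", ctypes.c_ulong)]


class CoredumpParamsNew(ctypes.Structure):
    # Same structure with extra members appended.
    _fields_ = [
        ("limit", ctypes.c_ulong),
        ("vma_count", ctypes.c_int),
        ("vma_data_size", ctypes.c_size_t),
    ]


def make_binfmt(params_type):
    """Build a struct that uses params_type only via a pointer argument."""
    callback = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.POINTER(params_type))

    class Binfmt(ctypes.Structure):
        _fields_ = [("core_dump", callback), ("min_coredump", ctypes.c_ulong)]

    return Binfmt


BinfmtOld = make_binfmt(CoredumpParamsOld)
BinfmtNew = make_binfmt(CoredumpParamsNew)

print(ctypes.sizeof(CoredumpParamsOld), ctypes.sizeof(CoredumpParamsNew))
print(ctypes.sizeof(BinfmtOld), ctypes.sizeof(BinfmtNew))
```

The parameter structure itself grows, but the structure holding the function pointer does not, because a pointer has the same size regardless of what it points to.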
      Signed-off-by: Li Huafei <lihuafei1@huawei.com>
      Reviewed-by: Xu Kuohai <xukuohai@huawei.com>
      Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
      Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
    • E
      coredump: Use the vma snapshot in fill_files_note · d47f6989
      Eric W. Biederman committed
      stable inclusion
      from stable-v5.10.110
      commit 558564db44755dfb3e48b0d64de327d20981e950
      category: bugfix
      bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6KT9C
      CVE: CVE-2023-1249
      
      Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=558564db44755dfb3e48b0d64de327d20981e950
      
      --------------------------------
      
      commit 390031c9 upstream.
      
      Matthew Wilcox reported that there is a missing mmap_lock in
      fill_files_note that could possibly lead to a use after free.
      
      Solve this by using the existing vma snapshot for consistency
      and to avoid the need to take the mmap_lock anywhere in the
      coredump code except for dump_vma_snapshot.
      
      Update the dump_vma_snapshot to capture vm_pgoff and vm_file
      that are needed by fill_files_note.
      
      Add free_vma_snapshot to free the captured values of vm_file.
      Reported-by: Matthew Wilcox <willy@infradead.org>
      Link: https://lkml.kernel.org/r/20220131153740.2396974-1-willy@infradead.org
      Cc: stable@vger.kernel.org
      Fixes: a07279c9 ("binfmt_elf, binfmt_elf_fdpic: use a VMA list snapshot")
      Fixes: 2aa362c4 ("coredump: extend core dump note section to contain file names of mapped files")
      Reviewed-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Li Huafei <lihuafei1@huawei.com>
      Reviewed-by: Xu Kuohai <xukuohai@huawei.com>
      Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
      Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
    • E
      coredump: Snapshot the vmas in do_coredump · e156bdef
      Eric W. Biederman committed
      stable inclusion
      from stable-v5.10.110
      commit 936c8be4d1447f36ac4d2a464bd03a5cd659c42f
      category: bugfix
      bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6KT9C
      CVE: CVE-2023-1249
      
      Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=936c8be4d1447f36ac4d2a464bd03a5cd659c42f
      
      --------------------------------
      
      commit 95c5436a upstream.
      
      Move the call of dump_vma_snapshot and kvfree(vma_meta) out of the
      individual coredump routines into do_coredump itself.  This makes
      the code less error prone and easier to maintain.
      
      Make the vma snapshot available to the coredump routines
      in struct coredump_params.  This makes it easier to
      change and update what is captured in the vma snapshot
      and will be needed for fixing fill_files_note.
      Reviewed-by: Jann Horn <jannh@google.com>
      Reviewed-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Li Huafei <lihuafei1@huawei.com>
      Reviewed-by: Xu Kuohai <xukuohai@huawei.com>
      Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
      Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
    • E
      bpf: Avoid races in __bpf_prog_run() for 32bit arches · a9380593
      Eric Dumazet committed
      mainline inclusion
      from mainline-v5.16-rc1
      commit f941eadd
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I6O293
      
      Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.3-rc3&id=f941eadd8d6d4ee2f8c9aeab8e1da5e647533a7d
      
      ---------------------------
      
      __bpf_prog_run() can run from non IRQ contexts, meaning
      it could be re entered if interrupted.
      
      This calls for the irq safe variant of u64_stats_update_{begin|end},
      or risk a deadlock.
      
      This patch is a nop on 64bit arches, fortunately.
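
The re-entrancy hazard can be illustrated with a toy sequence counter. This Python sketch is not the kernel's seqcount or u64_stats code and does not reproduce the lockdep report; `SeqCount`, `update_stats`, and the `irqs_disabled` flag are hypothetical names. It only shows that, without disabling interrupts, a second update on the same CPU can start while the first write section is still open, and that deferring the interrupt avoids this.

```python
class SeqCount:
    """Toy sequence counter: an odd value means a write is in progress."""

    def __init__(self):
        self.sequence = 0

    def write_begin(self):
        self.sequence += 1

    def write_end(self):
        self.sequence += 1

    def write_in_progress(self):
        return self.sequence % 2 == 1


def update_stats(seq, stats, irqs_disabled, interrupt):
    """Bump a counter inside a write section; `interrupt` models a softirq."""
    seq.write_begin()
    if not irqs_disabled:
        interrupt()  # fires on the same CPU in the middle of the update
    stats["cnt"] += 1
    seq.write_end()
    if irqs_disabled:
        interrupt()  # deferred until the write section is closed


def reentered_open_section(irqs_disabled):
    """Return True if the interrupt began writing inside an open section."""
    seq, stats, observed = SeqCount(), {"cnt": 0}, []

    def softirq():
        observed.append(seq.write_in_progress())
        seq.write_begin()
        stats["cnt"] += 1
        seq.write_end()

    update_stats(seq, stats, irqs_disabled, softirq)
    return observed[0]


print("plain variant re-entered:", reentered_open_section(False))
print("irq-safe variant re-entered:", reentered_open_section(True))
```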
      
      syzbot report:
      
      WARNING: inconsistent lock state
      5.12.0-rc3-syzkaller #0 Not tainted
      --------------------------------
      inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
      udevd/4013 [HC0[0]:SC0[0]:HE1:SE1] takes:
      ff7c9dec (&(&pstats->syncp)->seq){+.?.}-{0:0}, at: sk_filter include/linux/filter.h:867 [inline]
      ff7c9dec (&(&pstats->syncp)->seq){+.?.}-{0:0}, at: do_one_broadcast net/netlink/af_netlink.c:1468 [inline]
      ff7c9dec (&(&pstats->syncp)->seq){+.?.}-{0:0}, at: netlink_broadcast_filtered+0x27c/0x4fc net/netlink/af_netlink.c:1520
      {IN-SOFTIRQ-W} state was registered at:
        lock_acquire.part.0+0xf0/0x41c kernel/locking/lockdep.c:5510
        lock_acquire+0x6c/0x74 kernel/locking/lockdep.c:5483
        do_write_seqcount_begin_nested include/linux/seqlock.h:520 [inline]
        do_write_seqcount_begin include/linux/seqlock.h:545 [inline]
        u64_stats_update_begin include/linux/u64_stats_sync.h:129 [inline]
        bpf_prog_run_pin_on_cpu include/linux/filter.h:624 [inline]
        bpf_prog_run_clear_cb+0x1bc/0x270 include/linux/filter.h:755
        run_filter+0xa0/0x17c net/packet/af_packet.c:2031
        packet_rcv+0xc0/0x3e0 net/packet/af_packet.c:2104
        dev_queue_xmit_nit+0x2bc/0x39c net/core/dev.c:2387
        xmit_one net/core/dev.c:3588 [inline]
        dev_hard_start_xmit+0x94/0x518 net/core/dev.c:3609
        sch_direct_xmit+0x11c/0x1f0 net/sched/sch_generic.c:313
        qdisc_restart net/sched/sch_generic.c:376 [inline]
        __qdisc_run+0x194/0x7f8 net/sched/sch_generic.c:384
        qdisc_run include/net/pkt_sched.h:136 [inline]
        qdisc_run include/net/pkt_sched.h:128 [inline]
        __dev_xmit_skb net/core/dev.c:3795 [inline]
        __dev_queue_xmit+0x65c/0xf84 net/core/dev.c:4150
        dev_queue_xmit+0x14/0x18 net/core/dev.c:4215
        neigh_resolve_output net/core/neighbour.c:1491 [inline]
        neigh_resolve_output+0x170/0x228 net/core/neighbour.c:1471
        neigh_output include/net/neighbour.h:510 [inline]
        ip6_finish_output2+0x2e4/0x9fc net/ipv6/ip6_output.c:117
        __ip6_finish_output net/ipv6/ip6_output.c:182 [inline]
        __ip6_finish_output+0x164/0x3f8 net/ipv6/ip6_output.c:161
        ip6_finish_output+0x2c/0xb0 net/ipv6/ip6_output.c:192
        NF_HOOK_COND include/linux/netfilter.h:290 [inline]
        ip6_output+0x74/0x294 net/ipv6/ip6_output.c:215
        dst_output include/net/dst.h:448 [inline]
        NF_HOOK include/linux/netfilter.h:301 [inline]
        NF_HOOK include/linux/netfilter.h:295 [inline]
        mld_sendpack+0x2a8/0x7e4 net/ipv6/mcast.c:1679
        mld_send_cr net/ipv6/mcast.c:1975 [inline]
        mld_ifc_timer_expire+0x1e8/0x494 net/ipv6/mcast.c:2474
        call_timer_fn+0xd0/0x570 kernel/time/timer.c:1431
        expire_timers kernel/time/timer.c:1476 [inline]
        __run_timers kernel/time/timer.c:1745 [inline]
        run_timer_softirq+0x2e4/0x384 kernel/time/timer.c:1758
        __do_softirq+0x204/0x7ac kernel/softirq.c:345
        do_softirq_own_stack include/asm-generic/softirq_stack.h:10 [inline]
        invoke_softirq kernel/softirq.c:228 [inline]
        __irq_exit_rcu+0x1d8/0x200 kernel/softirq.c:422
        irq_exit+0x10/0x3c kernel/softirq.c:446
        __handle_domain_irq+0xb4/0x120 kernel/irq/irqdesc.c:692
        handle_domain_irq include/linux/irqdesc.h:176 [inline]
        gic_handle_irq+0x84/0xac drivers/irqchip/irq-gic.c:370
        __irq_svc+0x5c/0x94 arch/arm/kernel/entry-armv.S:205
        debug_smp_processor_id+0x0/0x24 lib/smp_processor_id.c:53
        rcu_read_lock_held_common kernel/rcu/update.c:108 [inline]
        rcu_read_lock_sched_held+0x24/0x7c kernel/rcu/update.c:123
        trace_lock_acquire+0x24c/0x278 include/trace/events/lock.h:13
        lock_acquire+0x3c/0x74 kernel/locking/lockdep.c:5481
        rcu_lock_acquire include/linux/rcupdate.h:267 [inline]
        rcu_read_lock include/linux/rcupdate.h:656 [inline]
        avc_has_perm_noaudit+0x6c/0x260 security/selinux/avc.c:1150
        selinux_inode_permission+0x140/0x220 security/selinux/hooks.c:3141
        security_inode_permission+0x44/0x60 security/security.c:1268
        inode_permission.part.0+0x5c/0x13c fs/namei.c:521
        inode_permission fs/namei.c:494 [inline]
        may_lookup fs/namei.c:1652 [inline]
        link_path_walk.part.0+0xd4/0x38c fs/namei.c:2208
        link_path_walk fs/namei.c:2189 [inline]
        path_lookupat+0x3c/0x1b8 fs/namei.c:2419
        filename_lookup+0xa8/0x1a4 fs/namei.c:2453
        user_path_at_empty+0x74/0x90 fs/namei.c:2733
        do_readlinkat+0x5c/0x12c fs/stat.c:417
        __do_sys_readlink fs/stat.c:450 [inline]
        sys_readlink+0x24/0x28 fs/stat.c:447
        ret_fast_syscall+0x0/0x2c arch/arm/mm/proc-v7.S:64
        0x7eaa4974
      irq event stamp: 298277
      hardirqs last  enabled at (298277): [<802000d0>] no_work_pending+0x4/0x34
      hardirqs last disabled at (298276): [<8020c9b8>] do_work_pending+0x9c/0x648 arch/arm/kernel/signal.c:676
      softirqs last  enabled at (298216): [<8020167c>] __do_softirq+0x584/0x7ac kernel/softirq.c:372
      softirqs last disabled at (298201): [<8024dff4>] do_softirq_own_stack include/asm-generic/softirq_stack.h:10 [inline]
      softirqs last disabled at (298201): [<8024dff4>] invoke_softirq kernel/softirq.c:228 [inline]
      softirqs last disabled at (298201): [<8024dff4>] __irq_exit_rcu+0x1d8/0x200 kernel/softirq.c:422
      
      other info that might help us debug this:
       Possible unsafe locking scenario:
      
             CPU0
             ----
        lock(&(&pstats->syncp)->seq);
        <Interrupt>
          lock(&(&pstats->syncp)->seq);
      
       *** DEADLOCK ***
      
      1 lock held by udevd/4013:
       #0: 82b09c5c (rcu_read_lock){....}-{1:2}, at: sk_filter_trim_cap+0x54/0x434 net/core/filter.c:139
      
      stack backtrace:
      CPU: 1 PID: 4013 Comm: udevd Not tainted 5.12.0-rc3-syzkaller #0
      Hardware name: ARM-Versatile Express
      Backtrace:
      [<81802550>] (dump_backtrace) from [<818027c4>] (show_stack+0x18/0x1c arch/arm/kernel/traps.c:252)
       r7:00000080 r6:600d0093 r5:00000000 r4:82b58344
      [<818027ac>] (show_stack) from [<81809e98>] (__dump_stack lib/dump_stack.c:79 [inline])
      [<818027ac>] (show_stack) from [<81809e98>] (dump_stack+0xb8/0xe8 lib/dump_stack.c:120)
      [<81809de0>] (dump_stack) from [<81804a00>] (print_usage_bug.part.0+0x228/0x230 kernel/locking/lockdep.c:3806)
       r7:86bcb768 r6:81a0326c r5:830f96a8 r4:86bcb0c0
      [<818047d8>] (print_usage_bug.part.0) from [<802bb1b8>] (print_usage_bug kernel/locking/lockdep.c:3776 [inline])
      [<818047d8>] (print_usage_bug.part.0) from [<802bb1b8>] (valid_state kernel/locking/lockdep.c:3818 [inline])
      [<818047d8>] (print_usage_bug.part.0) from [<802bb1b8>] (mark_lock_irq kernel/locking/lockdep.c:4021 [inline])
      [<818047d8>] (print_usage_bug.part.0) from [<802bb1b8>] (mark_lock.part.0+0xc34/0x136c kernel/locking/lockdep.c:4478)
       r10:83278fe8 r9:82c6d748 r8:00000000 r7:82c6d2d4 r6:00000004 r5:86bcb768
       r4:00000006
      [<802ba584>] (mark_lock.part.0) from [<802bc644>] (mark_lock kernel/locking/lockdep.c:4442 [inline])
      [<802ba584>] (mark_lock.part.0) from [<802bc644>] (mark_usage kernel/locking/lockdep.c:4391 [inline])
      [<802ba584>] (mark_lock.part.0) from [<802bc644>] (__lock_acquire+0x9bc/0x3318 kernel/locking/lockdep.c:4854)
       r10:86bcb768 r9:86bcb0c0 r8:00000001 r7:00040000 r6:0000075a r5:830f96a8
       r4:00000000
      [<802bbc88>] (__lock_acquire) from [<802bfb90>] (lock_acquire.part.0+0xf0/0x41c kernel/locking/lockdep.c:5510)
       r10:00000000 r9:600d0013 r8:00000000 r7:00000000 r6:828a2680 r5:828a2680
       r4:861e5bc8
      [<802bfaa0>] (lock_acquire.part.0) from [<802bff28>] (lock_acquire+0x6c/0x74 kernel/locking/lockdep.c:5483)
       r10:8146137c r9:00000000 r8:00000001 r7:00000000 r6:00000000 r5:00000000
       r4:ff7c9dec
      [<802bfebc>] (lock_acquire) from [<81381eb4>] (do_write_seqcount_begin_nested include/linux/seqlock.h:520 [inline])
      [<802bfebc>] (lock_acquire) from [<81381eb4>] (do_write_seqcount_begin include/linux/seqlock.h:545 [inline])
      [<802bfebc>] (lock_acquire) from [<81381eb4>] (u64_stats_update_begin include/linux/u64_stats_sync.h:129 [inline])
      [<802bfebc>] (lock_acquire) from [<81381eb4>] (__bpf_prog_run_save_cb include/linux/filter.h:727 [inline])
      [<802bfebc>] (lock_acquire) from [<81381eb4>] (bpf_prog_run_save_cb include/linux/filter.h:741 [inline])
      [<802bfebc>] (lock_acquire) from [<81381eb4>] (sk_filter_trim_cap+0x26c/0x434 net/core/filter.c:149)
       r10:a4095dd0 r9:ff7c9dd0 r8:e44be000 r7:8146137c r6:00000001 r5:8611ba80
       r4:00000000
      [<81381c48>] (sk_filter_trim_cap) from [<8146137c>] (sk_filter include/linux/filter.h:867 [inline])
      [<81381c48>] (sk_filter_trim_cap) from [<8146137c>] (do_one_broadcast net/netlink/af_netlink.c:1468 [inline])
      [<81381c48>] (sk_filter_trim_cap) from [<8146137c>] (netlink_broadcast_filtered+0x27c/0x4fc net/netlink/af_netlink.c:1520)
       r10:00000001 r9:833d6b1c r8:00000000 r7:8572f864 r6:8611ba80 r5:8698d800
       r4:8572f800
      [<81461100>] (netlink_broadcast_filtered) from [<81463e60>] (netlink_broadcast net/netlink/af_netlink.c:1544 [inline])
      [<81461100>] (netlink_broadcast_filtered) from [<81463e60>] (netlink_sendmsg+0x3d0/0x478 net/netlink/af_netlink.c:1925)
       r10:00000000 r9:00000002 r8:8698d800 r7:000000b7 r6:8611b900 r5:861e5f50
       r4:86aa3000
      [<81463a90>] (netlink_sendmsg) from [<81321f54>] (sock_sendmsg_nosec net/socket.c:654 [inline])
      [<81463a90>] (netlink_sendmsg) from [<81321f54>] (sock_sendmsg+0x3c/0x4c net/socket.c:674)
       r10:00000000 r9:861e5dd4 r8:00000000 r7:86570000 r6:00000000 r5:86570000
       r4:861e5f50
      [<81321f18>] (sock_sendmsg) from [<813234d0>] (____sys_sendmsg+0x230/0x29c net/socket.c:2350)
       r5:00000040 r4:861e5f50
      [<813232a0>] (____sys_sendmsg) from [<8132549c>] (___sys_sendmsg+0xac/0xe4 net/socket.c:2404)
       r10:00000128 r9:861e4000 r8:00000000 r7:00000000 r6:86570000 r5:861e5f50
       r4:00000000
      [<813253f0>] (___sys_sendmsg) from [<81325684>] (__sys_sendmsg net/socket.c:2433 [inline])
      [<813253f0>] (___sys_sendmsg) from [<81325684>] (__do_sys_sendmsg net/socket.c:2442 [inline])
      [<813253f0>] (___sys_sendmsg) from [<81325684>] (sys_sendmsg+0x58/0xa0 net/socket.c:2440)
       r8:80200224 r7:00000128 r6:00000000 r5:7eaa541c r4:86570000
      [<8132562c>] (sys_sendmsg) from [<80200060>] (ret_fast_syscall+0x0/0x2c arch/arm/mm/proc-v7.S:64)
      Exception stack(0x861e5fa8 to 0x861e5ff0)
      5fa0:                   00000000 00000000 0000000c 7eaa541c 00000000 00000000
      5fc0: 00000000 00000000 76fbf840 00000128 00000000 0000008f 7eaa541c 000563f8
      5fe0: 00056110 7eaa53e0 00036cec 76c9bf44
       r6:76fbf840 r5:00000000 r4:00000000
      
      Fixes: 492ecee8 ("bpf: enable program stats")
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20211026214133.3114279-2-eric.dumazet@gmail.com
      Conflicts:
      	include/linux/filter.h
      Signed-off-by: Pu Lehui <pulehui@huawei.com>
      Reviewed-by: Xu Kuohai <xukuohai@huawei.com>
      Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
  12. 17 March, 2023 2 commits
  13. 15 March, 2023 2 commits
    • Y
      block: add precise io accounting apis · 80fbdf77
      Yu Kuai committed
      hulk inclusion
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I6L586
      CVE: NA
      
      --------------------------------
      
      Currently, for bio-based devices, 'ios' and 'sectors' are counted when
      io is started, while 'nsecs' is counted when io is done.
      
      This behaviour is obviously wrong; however, we can't fix the existing kapis
      because that would require a new parameter, which would break the kapi.
      Hence this patch adds some new apis, which make sure io accounting
      for bio-based devices is precise.
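
The inconsistency can be seen in a toy model. The counter names `ios`, `sectors`, and `nsecs` come from the text; everything else in this Python sketch is hypothetical, and accounting all three counters at completion is an assumption made for illustration, not a description of the apis this patch adds.

```python
class DiskStats:
    """Toy per-device counters named after the fields in the text."""

    def __init__(self):
        self.ios = 0
        self.sectors = 0
        self.nsecs = 0

    def snapshot(self):
        return (self.ios, self.sectors, self.nsecs)


def split_start(stats, sectors):
    # Existing behaviour: ios and sectors are counted when io starts.
    stats.ios += 1
    stats.sectors += sectors


def split_end(stats, duration_ns):
    # Existing behaviour: nsecs is counted when io is done.
    stats.nsecs += duration_ns


def precise_end(stats, sectors, duration_ns):
    # Assumed alternative: account everything in one place at completion.
    stats.ios += 1
    stats.sectors += sectors
    stats.nsecs += duration_ns


def in_flight_snapshots():
    split, precise = DiskStats(), DiskStats()
    split_start(split, sectors=8)  # io issued, not yet completed
    mid = (split.snapshot(), precise.snapshot())
    split_end(split, duration_ns=5000)
    precise_end(precise, sectors=8, duration_ns=5000)
    done = (split.snapshot(), precise.snapshot())
    return mid, done


mid, done = in_flight_snapshots()
print("while in flight:", mid)
print("after completion:", done)
```

While the io is in flight, the split scheme reports one io with zero time spent, so any average latency derived from that snapshot is skewed; both schemes agree once the io completes.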
      Signed-off-by: Yu Kuai <yukuai3@huawei.com>
      Reviewed-by: Hou Tao <houtao1@huawei.com>
      Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
    • M
      scsi: iscsi_tcp: Fix UAF during logout when accessing the shost ipaddress · 499bf50a
      Mike Christie committed
      mainline inclusion
      from mainline-v6.2-rc6
      commit 6f1d64b1
      category: bugfix
      bugzilla: 188443, https://gitee.com/openeuler/kernel/issues/I6I8YD
      CVE: NA
      
      Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6f1d64b13097e85abda0f91b5638000afc5f9a06
      
      ----------------------------------------
      
      Bug report and analysis from Ding Hui.
      
      During iSCSI session logout, if another task accesses the shost ipaddress
      attr, we can get a KASAN UAF report like this:
      
      [  276.942144] BUG: KASAN: use-after-free in _raw_spin_lock_bh+0x78/0xe0
      [  276.942535] Write of size 4 at addr ffff8881053b45b8 by task cat/4088
      [  276.943511] CPU: 2 PID: 4088 Comm: cat Tainted: G            E      6.1.0-rc8+ #3
      [  276.943997] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
      [  276.944470] Call Trace:
      [  276.944943]  <TASK>
      [  276.945397]  dump_stack_lvl+0x34/0x48
      [  276.945887]  print_address_description.constprop.0+0x86/0x1e7
      [  276.946421]  print_report+0x36/0x4f
      [  276.947358]  kasan_report+0xad/0x130
      [  276.948234]  kasan_check_range+0x35/0x1c0
      [  276.948674]  _raw_spin_lock_bh+0x78/0xe0
      [  276.949989]  iscsi_sw_tcp_host_get_param+0xad/0x2e0 [iscsi_tcp]
      [  276.951765]  show_host_param_ISCSI_HOST_PARAM_IPADDRESS+0xe9/0x130 [scsi_transport_iscsi]
      [  276.952185]  dev_attr_show+0x3f/0x80
      [  276.953005]  sysfs_kf_seq_show+0x1fb/0x3e0
      [  276.953401]  seq_read_iter+0x402/0x1020
      [  276.954260]  vfs_read+0x532/0x7b0
      [  276.955113]  ksys_read+0xed/0x1c0
      [  276.955952]  do_syscall_64+0x38/0x90
      [  276.956347]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
      [  276.956769] RIP: 0033:0x7f5d3a679222
      [  276.957161] Code: c0 e9 b2 fe ff ff 50 48 8d 3d 32 c0 0b 00 e8 a5 fe 01 00 0f 1f 44 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 0f 05 <48> 3d 00 f0 ff ff 77 56 c3 0f 1f 44 00 00 48 83 ec 28 48 89 54 24
      [  276.958009] RSP: 002b:00007ffc864d16a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
      [  276.958431] RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007f5d3a679222
      [  276.958857] RDX: 0000000000020000 RSI: 00007f5d3a4fe000 RDI: 0000000000000003
      [  276.959281] RBP: 00007f5d3a4fe000 R08: 00000000ffffffff R09: 0000000000000000
      [  276.959682] R10: 0000000000000022 R11: 0000000000000246 R12: 0000000000020000
      [  276.960126] R13: 0000000000000003 R14: 0000000000000000 R15: 0000557a26dada58
      [  276.960536]  </TASK>
      [  276.961357] Allocated by task 2209:
      [  276.961756]  kasan_save_stack+0x1e/0x40
      [  276.962170]  kasan_set_track+0x21/0x30
      [  276.962557]  __kasan_kmalloc+0x7e/0x90
      [  276.962923]  __kmalloc+0x5b/0x140
      [  276.963308]  iscsi_alloc_session+0x28/0x840 [scsi_transport_iscsi]
      [  276.963712]  iscsi_session_setup+0xda/0xba0 [libiscsi]
      [  276.964078]  iscsi_sw_tcp_session_create+0x1fd/0x330 [iscsi_tcp]
      [  276.964431]  iscsi_if_create_session.isra.0+0x50/0x260 [scsi_transport_iscsi]
      [  276.964793]  iscsi_if_recv_msg+0xc5a/0x2660 [scsi_transport_iscsi]
      [  276.965153]  iscsi_if_rx+0x198/0x4b0 [scsi_transport_iscsi]
      [  276.965546]  netlink_unicast+0x4d5/0x7b0
      [  276.965905]  netlink_sendmsg+0x78d/0xc30
      [  276.966236]  sock_sendmsg+0xe5/0x120
      [  276.966576]  ____sys_sendmsg+0x5fe/0x860
      [  276.966923]  ___sys_sendmsg+0xe0/0x170
      [  276.967300]  __sys_sendmsg+0xc8/0x170
      [  276.967666]  do_syscall_64+0x38/0x90
      [  276.968028]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
      [  276.968773] Freed by task 2209:
      [  276.969111]  kasan_save_stack+0x1e/0x40
      [  276.969449]  kasan_set_track+0x21/0x30
      [  276.969789]  kasan_save_free_info+0x2a/0x50
      [  276.970146]  __kasan_slab_free+0x106/0x190
      [  276.970470]  __kmem_cache_free+0x133/0x270
      [  276.970816]  device_release+0x98/0x210
      [  276.971145]  kobject_cleanup+0x101/0x360
      [  276.971462]  iscsi_session_teardown+0x3fb/0x530 [libiscsi]
      [  276.971775]  iscsi_sw_tcp_session_destroy+0xd8/0x130 [iscsi_tcp]
      [  276.972143]  iscsi_if_recv_msg+0x1bf1/0x2660 [scsi_transport_iscsi]
      [  276.972485]  iscsi_if_rx+0x198/0x4b0 [scsi_transport_iscsi]
      [  276.972808]  netlink_unicast+0x4d5/0x7b0
      [  276.973201]  netlink_sendmsg+0x78d/0xc30
      [  276.973544]  sock_sendmsg+0xe5/0x120
      [  276.973864]  ____sys_sendmsg+0x5fe/0x860
      [  276.974248]  ___sys_sendmsg+0xe0/0x170
      [  276.974583]  __sys_sendmsg+0xc8/0x170
      [  276.974891]  do_syscall_64+0x38/0x90
      [  276.975216]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      We can easily reproduce by two tasks:
      1. while :; do iscsiadm -m node --login; iscsiadm -m node --logout; done
      2. while :; do cat \
      /sys/devices/platform/host*/iscsi_host/host*/ipaddress; done
      
                  iscsid              |        cat
      --------------------------------+---------------------------------------
      |- iscsi_sw_tcp_session_destroy |
        |- iscsi_session_teardown     |
          |- device_release           |
            |- iscsi_session_release  ||- dev_attr_show
              |- kfree                |  |- show_host_param_
                                      |             ISCSI_HOST_PARAM_IPADDRESS
                                      |    |- iscsi_sw_tcp_host_get_param
                                      |      |- r/w tcp_sw_host->session (UAF)
        |- iscsi_host_remove          |
        |- iscsi_host_free            |
      
      Fix the above bug by splitting the session removal into 2 parts:
      
       1. removal from iSCSI class which includes sysfs and removal from host
          tracking.
      
       2. freeing of session.
      
      During iscsi_tcp host and session removal we can remove the session from
      sysfs then remove the host from sysfs. At this point we know userspace is
      not accessing the kernel via sysfs so we can free the session and host.
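
The effect of the reordering can be shown with a toy model of the race. This Python sketch is a deliberate simplification with hypothetical names (`Host`, `sysfs_read_ipaddress`, `run`); it probes the sysfs attribute after every teardown step and reports what a concurrent reader would hit.

```python
class Host:
    """Toy host whose sysfs attribute dereferences the session."""

    def __init__(self):
        self.in_sysfs = True
        self.session_freed = False


def sysfs_read_ipaddress(host):
    if not host.in_sysfs:
        return "ENOENT"  # attribute is no longer reachable from userspace
    if host.session_freed:
        return "UAF"     # reader would touch freed session memory
    return "ok"


def free_session(host):
    host.session_freed = True


def remove_host_from_sysfs(host):
    host.in_sysfs = False


def run(steps):
    """Apply teardown steps in order, probing the attribute after each one."""
    host = Host()
    results = []
    for step in steps:
        step(host)
        results.append(sysfs_read_ipaddress(host))
    return results


OLD_ORDER = [free_session, remove_host_from_sysfs]
NEW_ORDER = [remove_host_from_sysfs, free_session]

print("old order:", run(OLD_ORDER))
print("new order:", run(NEW_ORDER))
```

In the old order there is a window in which the attribute is still visible but the session is already freed; in the new order the attribute disappears first, so no reader can reach the freed session.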
      
      Link: https://lore.kernel.org/r/20230117193937.21244-2-michael.christie@oracle.com
      Signed-off-by: Mike Christie <michael.christie@oracle.com>
      Reviewed-by: Lee Duncan <lduncan@suse.com>
      Acked-by: Ding Hui <dinghui@sangfor.com.cn>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Zhong Jinghua <zhongjinghua@huawei.com>
      conflicts:
      	drivers/scsi/iscsi_tcp.c
      Reviewed-by: Hou Tao <houtao1@huawei.com>
      Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
  14. 13 March, 2023 1 commit
  15. 08 March, 2023 8 commits