提交 · 67cd4976cb764912e1d1e25e14b9b0c2675c522d · openeuler / Kernel

16 5月, 2023 3 次提交

vfs: add rcu argument to ->get_acl() callback · 67cd4976

由 Miklos Szeredi 提交于 5月 16, 2023

mainline inclusion
from mainline-v5.15-rc1
commit 0cad6246
category: perf
bugzilla: https://gitee.com/openeuler/kernel/issues/I6ZCW0
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0cad6246621b5887d5b33fea84219d2a71f2f99a

--------------------------------

Add a rcu argument to the ->get_acl() callback to allow
get_cached_acl_rcu() to call the ->get_acl() method in the next patch.
Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
[chengzhihao: rename get_acl to get_acl2 to prevent KABI changes, and
 only backport(realize) overlayfs]
Conflicts:
	fs/overlayfs/dir.c
	fs/overlayfs/inode.c
	fs/overlayfs/overlayfs.h
	fs/posix_acl.c
	include/linux/fs.h
Signed-off-by: NZhihao Cheng <chengzhihao1@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

67cd4976

bpf: add missing header file include · b169caaa

由 Linus Torvalds 提交于 5月 16, 2023

stable inclusion
from stable-v5.10.170
commit 12e3119a87627741bd3871c895ce198f21529eb3
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I71N8L

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=12e3119a87627741bd3871c895ce198f21529eb3

--------------------------------

commit f3dd0c53 upstream.

Commit 74e19ef0 ("uaccess: Add speculation barrier to
copy_from_user()") built fine on x86-64 and arm64, and that's the extent
of my local build testing.

It turns out those got the <linux/nospec.h> include incidentally through
other header files (<linux/kvm_host.h> in particular), but that was not
true of other architectures, resulting in build errors

  kernel/bpf/core.c: In function ‘___bpf_prog_run’:
  kernel/bpf/core.c:1913:3: error: implicit declaration of function ‘barrier_nospec’

so just make sure to explicitly include the proper <linux/nospec.h>
header file to make everybody see it.

Fixes: 74e19ef0 ("uaccess: Add speculation barrier to copy_from_user()")
Reported-by: Nkernel test robot <lkp@intel.com>
Reported-by: NViresh Kumar <viresh.kumar@linaro.org>
Reported-by: NHuacai Chen <chenhuacai@loongson.cn>
Tested-by: NGeert Uytterhoeven <geert@linux-m68k.org>
Tested-by: NDave Hansen <dave.hansen@linux.intel.com>
Acked-by: NAlexei Starovoitov <alexei.starovoitov@gmail.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

Conflicts:
	kernel/bpf/core.c
Signed-off-by: NMa Wupeng <mawupeng1@huawei.com>
Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: NNanyong Sun <sunnanyong@huawei.com>
Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

b169caaa

uaccess: Add speculation barrier to copy_from_user() · 4e0f3c2f

由 Dave Hansen 提交于 5月 16, 2023

stable inclusion
from stable-v5.10.170
commit 3b6ce54cfa2c04f0636fd0c985913af8703b408d
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I71N8L
CVE: CVE-2023-0459

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=3b6ce54cfa2c04f0636fd0c985913af8703b408d

--------------------------------

commit 74e19ef0 upstream.

The results of "access_ok()" can be mis-speculated.  The result is that
you can end speculatively:

	if (access_ok(from, size))
		// Right here

even for bad from/size combinations.  On first glance, it would be ideal
to just add a speculation barrier to "access_ok()" so that its results
can never be mis-speculated.

But there are lots of system calls just doing access_ok() via
"copy_to_user()" and friends (example: fstat() and friends).  Those are
generally not problematic because they do not _consume_ data from
userspace other than the pointer.  They are also very quick and common
system calls that should not be needlessly slowed down.

"copy_from_user()" on the other hand uses a user-controller pointer and
is frequently followed up with code that might affect caches.  Take
something like this:

	if (!copy_from_user(&kernelvar, uptr, size))
		do_something_with(kernelvar);

If userspace passes in an evil 'uptr' that *actually* points to a kernel
addresses, and then do_something_with() has cache (or other)
side-effects, it could allow userspace to infer kernel data values.

Add a barrier to the common copy_from_user() code to prevent
mis-speculated values which happen after the copy.

Also add a stub for architectures that do not define barrier_nospec().
This makes the macro usable in generic code.

Since the barrier is now usable in generic code, the x86 #ifdef in the
BPF code can also go away.
Reported-by: NJordy Zomer <jordyzomer@google.com>
Suggested-by: NLinus Torvalds <torvalds@linuxfoundation.org>
Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>   # BPF bits
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

Conflicts:
	lib/usercopy.c
Signed-off-by: NMa Wupeng <mawupeng1@huawei.com>
Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: NNanyong Sun <sunnanyong@huawei.com>
Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

4e0f3c2f

11 5月, 2023 1 次提交

!685 [sync] PR-675: driver/hifc&hinic: Fix compile error in allyesconfigs · 6a895a1b

由 openeuler-ci-bot 提交于 5月 11, 2023

Merge Pull Request from: @openeuler-sync-bot 
 

Origin pull request: 
https://gitee.com/openeuler/kernel/pulls/675 
 
When using allyesconfig to configure the kernel, errors may occur during the linking process when making. 
 
Link:https://gitee.com/openeuler/kernel/pulls/685 

Reviewed-by: Chiqijun <chiqijun@huawei.com> 
Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>

6a895a1b

10 5月, 2023 21 次提交

!721 Backport CVEs and bugfixes · cbe5d727

由 openeuler-ci-bot 提交于 5月 10, 2023

Merge Pull Request from: @zhangjialin11 
 
Pull new CVEs:
CVE-2022-4382
CVE-2023-0458
CVE-2023-2269
CVE-2023-2483
CVE-2023-31436
CVE-2023-2194
CVE-2023-2166
CVE-2023-2176
CVE-2023-2007

fs bugfixes from Baokun Li
bpf bugfixes from Liu Jian 
 
Link:https://gitee.com/openeuler/kernel/pulls/721 

Reviewed-by: Zheng Zengkai <zhengzengkai@huawei.com> 
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

cbe5d727

!694 [sync] PR-625: AMD: Fix build warning. · 692b9167

由 openeuler-ci-bot 提交于 5月 10, 2023

Merge Pull Request from: @openeuler-sync-bot 
 

Origin pull request: 
https://gitee.com/openeuler/kernel/pulls/625 
 
This PR fixes below build warnings:
WARNING: modpost: EXPORT symbol "__SCT__perf_lopwr_cb" [vmlinux] version generation failed, symbol will not be versioned.
WARNING: modpost: EXPORT symbol "__SCT__perf_lopwr_cb" [vmlinux] version generation failed, symbol will not be versioned. 
 
Link:https://gitee.com/openeuler/kernel/pulls/694 

Reviewed-by: Zheng Zengkai <zhengzengkai@huawei.com> 
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

692b9167

USB: gadgetfs: Fix race between mounting and unmounting · 48b73f7f

由 Alan Stern 提交于 5月 10, 2023

mainline inclusion
from mainline-v6.2-rc5
commit d18dcfe9
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I66IZK
CVE: CVE-2022-4382

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d18dcfe9860e842f394e37ba01ca9440ab2178f4

----------------------------------------------------------------------

The syzbot fuzzer and Gerald Lee have identified a use-after-free bug
in the gadgetfs driver, involving processes concurrently mounting and
unmounting the gadgetfs filesystem.  In particular, gadgetfs_fill_super()
can race with gadgetfs_kill_sb(), causing the latter to deallocate
the_device while the former is using it.  The output from KASAN says,
in part:

BUG: KASAN: use-after-free in instrument_atomic_read_write include/linux/instrumented.h:102 [inline]
BUG: KASAN: use-after-free in atomic_fetch_sub_release include/linux/atomic/atomic-instrumented.h:176 [inline]
BUG: KASAN: use-after-free in __refcount_sub_and_test include/linux/refcount.h:272 [inline]
BUG: KASAN: use-after-free in __refcount_dec_and_test include/linux/refcount.h:315 [inline]
BUG: KASAN: use-after-free in refcount_dec_and_test include/linux/refcount.h:333 [inline]
BUG: KASAN: use-after-free in put_dev drivers/usb/gadget/legacy/inode.c:159 [inline]
BUG: KASAN: use-after-free in gadgetfs_kill_sb+0x33/0x100 drivers/usb/gadget/legacy/inode.c:2086
Write of size 4 at addr ffff8880276d7840 by task syz-executor126/18689

CPU: 0 PID: 18689 Comm: syz-executor126 Not tainted 6.1.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022
Call Trace:
 <TASK>
...
 atomic_fetch_sub_release include/linux/atomic/atomic-instrumented.h:176 [inline]
 __refcount_sub_and_test include/linux/refcount.h:272 [inline]
 __refcount_dec_and_test include/linux/refcount.h:315 [inline]
 refcount_dec_and_test include/linux/refcount.h:333 [inline]
 put_dev drivers/usb/gadget/legacy/inode.c:159 [inline]
 gadgetfs_kill_sb+0x33/0x100 drivers/usb/gadget/legacy/inode.c:2086
 deactivate_locked_super+0xa7/0xf0 fs/super.c:332
 vfs_get_super fs/super.c:1190 [inline]
 get_tree_single+0xd0/0x160 fs/super.c:1207
 vfs_get_tree+0x88/0x270 fs/super.c:1531
 vfs_fsconfig_locked fs/fsopen.c:232 [inline]

The simplest solution is to ensure that gadgetfs_fill_super() and
gadgetfs_kill_sb() are serialized by making them both acquire a new
mutex.
Signed-off-by: NAlan Stern <stern@rowland.harvard.edu>
Reported-and-tested-by: syzbot+33d7ad66d65044b93f16@syzkaller.appspotmail.com
Reported-and-tested-by: NGerald Lee <sundaywind2004@gmail.com>
Link: https://lore.kernel.org/linux-usb/CAO3qeMVzXDP-JU6v1u5Ags6Q-bb35kg3=C6d04DjzA9ffa5x1g@mail.gmail.com/
Fixes: e5d82a73 ("vfs: Convert gadgetfs to use the new mount API")
CC: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/Y6XCPXBpn3tmjdCC@rowland.harvard.eduSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NZhangPeng <zhangpeng362@huawei.com>
Reviewed-by: Ntong tiangen <tongtiangen@huawei.com>
Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

48b73f7f

prlimit: do_prlimit needs to have a speculation check · 39fc4e6e

由 Xia Fukun 提交于 5月 09, 2023

stable inclusion
from stable-v5.10.165
commit 9f8e45720e0e7edb661d0082422f662ed243d8d8
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6Z6SU?from=project-issue
CVE: CVE-2023-0458

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=linux-5.10.y&id=9f8e45720e0e7edb661d0082422f662ed243d8d8

--------------------------------

[ Upstream commit 73979060 ]

do_prlimit() adds the user-controlled resource value to a pointer that
will subsequently be dereferenced.  In order to help prevent this
codepath from being used as a spectre "gadget" a barrier needs to be
added after checking the range.
Reported-by: NJordy Zomer <jordyzomer@google.com>
Tested-by: NJordy Zomer <jordyzomer@google.com>
Suggested-by: NLinus Torvalds <torvalds@linuxfoundation.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NXia Fukun <xiafukun@huawei.com>
Reviewed-by: NZhang Qiao <zhangqiao22@huawei.com>
Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

39fc4e6e

ext4: check iomap type only if ext4_iomap_begin() does not fail · a090a59f

由 Baokun Li 提交于 5月 09, 2023

maillist inclusion
category: bugfix
bugzilla: 188724, https://gitee.com/openeuler/kernel/issues/I70Q22

Reference: https://www.spinics.net/lists/kernel/msg4779681.html

----------------------------------------

When ext4_iomap_overwrite_begin() calls ext4_iomap_begin() map blocks may
fail for some reason (e.g. memory allocation failure, bare disk write), and
later because "iomap->type ! = IOMAP_MAPPED" triggers WARN_ON(). When ext4
iomap_begin() returns an error, it is normal that the type of iomap->type
may not match the expectation. Therefore, we only determine if iomap->type
is as expected when ext4_iomap_begin() is executed successfully.

Reported-by: syzbot+08106c4b7d60702dbc14@syzkaller.appspotmail.com
Link: https://lore.kernel.org/all/00000000000015760b05f9b4eee9@google.comReviewed-by: NJan Kara <jack@suse.cz>
Signed-off-by: NBaokun Li <libaokun1@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Reviewed-by: NYang Erkun <yangerkun@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

a090a59f

dm ioctl: fix nested locking in table_clear() to remove deadlock concern · 7d512e8b

由 Mike Snitzer 提交于 5月 09, 2023

mainline inclusion
from mainline-v6.4-rc1
commit 3d32aaa7
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6YQZS
CVE: CVE-2023-2269

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit?id=3d32aaa7e66d5c1479a3c31d6c2c5d45dd0d3b89

----------------------------------------

syzkaller found the following problematic rwsem locking (with write
lock already held):

 down_read+0x9d/0x450 kernel/locking/rwsem.c:1509
 dm_get_inactive_table+0x2b/0xc0 drivers/md/dm-ioctl.c:773
 __dev_status+0x4fd/0x7c0 drivers/md/dm-ioctl.c:844
 table_clear+0x197/0x280 drivers/md/dm-ioctl.c:1537

In table_clear, it first acquires a write lock
https://elixir.bootlin.com/linux/v6.2/source/drivers/md/dm-ioctl.c#L1520
down_write(&_hash_lock);

Then before the lock is released at L1539, there is a path shown above:
table_clear -> __dev_status -> dm_get_inactive_table ->  down_read
https://elixir.bootlin.com/linux/v6.2/source/drivers/md/dm-ioctl.c#L773
down_read(&_hash_lock);

It tries to acquire the same read lock again, resulting in the deadlock
problem.

Fix this by moving table_clear()'s __dev_status() call to after its
up_write(&_hash_lock);

Cc: stable@vger.kernel.org
Reported-by: NZheng Zhang <zheng.zhang@email.ucr.edu>
Signed-off-by: NMike Snitzer <snitzer@kernel.org>

Conflicts:
  drivers/md/dm-ioctl.c
Signed-off-by: NLi Lingfeng <lilingfeng3@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com>
Reviewed-by: NHou Tao <houtao1@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

7d512e8b

net: qcom/emac: Fix use after free bug in emac_remove due to race condition · 17d38bb9

由 Zheng Wang 提交于 5月 09, 2023

mainline inclusion
from mainline-v6.3-rc4
commit 6b6bc5b8
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6ZWOL
CVE: CVE-2023-2483

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6b6bc5b8bd2d4ca9e1efa9ae0f98a0b0687ace75

---------------------------

In emac_probe, &adpt->work_thread is bound with
emac_work_thread. Then it will be started by timeout
handler emac_tx_timeout or a IRQ handler emac_isr.

If we remove the driver which will call emac_remove
  to make cleanup, there may be a unfinished work.

The possible sequence is as follows:

Fix it by finishing the work before cleanup in the emac_remove
and disable timeout response.

CPU0                  CPU1

                    |emac_work_thread
emac_remove         |
free_netdev         |
kfree(netdev);      |
                    |emac_reinit_locked
                    |emac_mac_down
                    |//use netdev
Fixes: b9b17deb ("net: emac: emac gigabit ethernet controller driver")
Signed-off-by: NZheng Wang <zyytlz.wz@163.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
(cherry picked from commit 6b6bc5b8)
Signed-off-by: NLiu Jian <liujian56@huawei.com>
Reviewed-by: NYue Haibing <yuehaibing@huawei.com>
Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

17d38bb9

net: sched: sch_qfq: prevent slab-out-of-bounds in qfq_activate_agg · 86d304ad

由 Gwangun Jung 提交于 5月 09, 2023

stable inclusion
from stable-v5.10.179
commit ddcf35deb8f2a1d9addc74b586cf4c5a1f5d6020
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6ZISA
CVE: CVE-2023-31436

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=ddcf35deb8f2a1d9addc74b586cf4c5a1f5d6020

--------------------------------

[ Upstream commit 30379334 ]

If the TCA_QFQ_LMAX value is not offered through nlattr, lmax is determined by the MTU value of the network device.
The MTU of the loopback device can be set up to 2^31-1.
As a result, it is possible to have an lmax value that exceeds QFQ_MIN_LMAX.

Due to the invalid lmax value, an index is generated that exceeds the QFQ_MAX_INDEX(=24) value, causing out-of-bounds read/write errors.

The following reports a oob access:

[   84.582666] BUG: KASAN: slab-out-of-bounds in qfq_activate_agg.constprop.0 (net/sched/sch_qfq.c:1027 net/sched/sch_qfq.c:1060 net/sched/sch_qfq.c:1313)
[   84.583267] Read of size 4 at addr ffff88810f676948 by task ping/301
[   84.583686]
[   84.583797] CPU: 3 PID: 301 Comm: ping Not tainted 6.3.0-rc5 #1
[   84.584164] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
[   84.584644] Call Trace:
[   84.584787]  <TASK>
[   84.584906] dump_stack_lvl (lib/dump_stack.c:107 (discriminator 1))
[   84.585108] print_report (mm/kasan/report.c:320 mm/kasan/report.c:430)
[   84.585570] kasan_report (mm/kasan/report.c:538)
[   84.585988] qfq_activate_agg.constprop.0 (net/sched/sch_qfq.c:1027 net/sched/sch_qfq.c:1060 net/sched/sch_qfq.c:1313)
[   84.586599] qfq_enqueue (net/sched/sch_qfq.c:1255)
[   84.587607] dev_qdisc_enqueue (net/core/dev.c:3776)
[   84.587749] __dev_queue_xmit (./include/net/sch_generic.h:186 net/core/dev.c:3865 net/core/dev.c:4212)
[   84.588763] ip_finish_output2 (./include/net/neighbour.h:546 net/ipv4/ip_output.c:228)
[   84.589460] ip_output (net/ipv4/ip_output.c:430)
[   84.590132] ip_push_pending_frames (./include/net/dst.h:444 net/ipv4/ip_output.c:126 net/ipv4/ip_output.c:1586 net/ipv4/ip_output.c:1606)
[   84.590285] raw_sendmsg (net/ipv4/raw.c:649)
[   84.591960] sock_sendmsg (net/socket.c:724 net/socket.c:747)
[   84.592084] __sys_sendto (net/socket.c:2142)
[   84.593306] __x64_sys_sendto (net/socket.c:2150)
[   84.593779] do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
[   84.593902] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120)
[   84.594070] RIP: 0033:0x7fe568032066
[   84.594192] Code: 0e 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 41 89 ca 64 8b 04 25 18 00 00 00 85 c09[ 84.594796] RSP: 002b:00007ffce388b4e8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c

Code starting with the faulting instruction
===========================================
[   84.595047] RAX: ffffffffffffffda RBX: 00007ffce388cc70 RCX: 00007fe568032066
[   84.595281] RDX: 0000000000000040 RSI: 00005605fdad6d10 RDI: 0000000000000003
[   84.595515] RBP: 00005605fdad6d10 R08: 00007ffce388eeec R09: 0000000000000010
[   84.595749] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000040
[   84.595984] R13: 00007ffce388cc30 R14: 00007ffce388b4f0 R15: 0000001d00000001
[   84.596218]  </TASK>
[   84.596295]
[   84.596351] Allocated by task 291:
[   84.596467] kasan_save_stack (mm/kasan/common.c:46)
[   84.596597] kasan_set_track (mm/kasan/common.c:52)
[   84.596725] __kasan_kmalloc (mm/kasan/common.c:384)
[   84.596852] __kmalloc_node (./include/linux/kasan.h:196 mm/slab_common.c:967 mm/slab_common.c:974)
[   84.596979] qdisc_alloc (./include/linux/slab.h:610 ./include/linux/slab.h:731 net/sched/sch_generic.c:938)
[   84.597100] qdisc_create (net/sched/sch_api.c:1244)
[   84.597222] tc_modify_qdisc (net/sched/sch_api.c:1680)
[   84.597357] rtnetlink_rcv_msg (net/core/rtnetlink.c:6174)
[   84.597495] netlink_rcv_skb (net/netlink/af_netlink.c:2574)
[   84.597627] netlink_unicast (net/netlink/af_netlink.c:1340 net/netlink/af_netlink.c:1365)
[   84.597759] netlink_sendmsg (net/netlink/af_netlink.c:1942)
[   84.597891] sock_sendmsg (net/socket.c:724 net/socket.c:747)
[   84.598016] ____sys_sendmsg (net/socket.c:2501)
[   84.598147] ___sys_sendmsg (net/socket.c:2557)
[   84.598275] __sys_sendmsg (./include/linux/file.h:31 net/socket.c:2586)
[   84.598399] do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
[   84.598520] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120)
[   84.598688]
[   84.598744] The buggy address belongs to the object at ffff88810f674000
[   84.598744]  which belongs to the cache kmalloc-8k of size 8192
[   84.599135] The buggy address is located 2664 bytes to the right of
[   84.599135]  allocated 7904-byte region [ffff88810f674000, ffff88810f675ee0)
[   84.599544]
[   84.599598] The buggy address belongs to the physical page:
[   84.599777] page:00000000e638567f refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x10f670
[   84.600074] head:00000000e638567f order:3 entire_mapcount:0 nr_pages_mapped:0 pincount:0
[   84.600330] flags: 0x200000000010200(slab|head|node=0|zone=2)
[   84.600517] raw: 0200000000010200 ffff888100043180 dead000000000122 0000000000000000
[   84.600764] raw: 0000000000000000 0000000080020002 00000001ffffffff 0000000000000000
[   84.601009] page dumped because: kasan: bad access detected
[   84.601187]
[   84.601241] Memory state around the buggy address:
[   84.601396]  ffff88810f676800: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   84.601620]  ffff88810f676880: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   84.601845] >ffff88810f676900: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   84.602069]                                               ^
[   84.602243]  ffff88810f676980: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   84.602468]  ffff88810f676a00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   84.602693] ==================================================================
[   84.602924] Disabling lock debugging due to kernel taint

Fixes: 3015f3d2 ("pkt_sched: enable QFQ to support TSO/GSO")
Reported-by: NGwangun Jung <exsociety@gmail.com>
Signed-off-by: NGwangun Jung <exsociety@gmail.com>
Acked-by: Jamal Hadi Salim<jhs@mojatatu.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NZhengchao Shao <shaozhengchao@huawei.com>
Reviewed-by: NYue Haibing <yuehaibing@huawei.com>
Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

86d304ad

i2c: xgene-slimpro: Fix out-of-bounds bug in xgene_slimpro_i2c_xfer() · 86923497

由 Wei Chen 提交于 5月 09, 2023

mainline inclusion
from mainline-v6.3-rc4
commit 92fbb6d1
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6XHPL
CVE: CVE-2023-2194

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=92fbb6d1296f81f41f65effd7f5f8c0f74943d15

--------------------------------

The data->block[0] variable comes from user and is a number between
0-255. Without proper check, the variable may be very large to cause
an out-of-bounds when performing memcpy in slimpro_i2c_blkwr.

Fix this bug by checking the value of writelen.

Fixes: f6505fba ("i2c: add SLIMpro I2C device driver on APM X-Gene platform")
Signed-off-by: NWei Chen <harperchen1110@gmail.com>
Cc: stable@vger.kernel.org
Reviewed-by: NAndi Shyti <andi.shyti@kernel.org>
Signed-off-by: NWolfram Sang <wsa@kernel.org>
Signed-off-by: NYang Jihong <yangjihong1@huawei.com>
Reviewed-by: NZheng Yejian <zhengyejian1@huawei.com>
Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

86923497

ext4: only update i_reserved_data_blocks on successful block allocation · f0af88ce

由 Baokun Li 提交于 5月 09, 2023

maillist inclusion
category: bugfix
bugzilla: 188499, https://gitee.com/openeuler/kernel/issues/I6TNVT
CVE: NA

Reference: https://patchwork.ozlabs.org/project/linux-ext4/patch/20230412124126.2286716-2-libaokun1@huawei.com/

----------------------------------------

In our fault injection test, we create an ext4 file, migrate it to
non-extent based file, then punch a hole and finally trigger a WARN_ON
in the ext4_da_update_reserve_space():

EXT4-fs warning (device sda): ext4_da_update_reserve_space:369:
ino 14, used 11 with only 10 reserved data blocks

When writing back a non-extent based file, if we enable delalloc, the
number of reserved blocks will be subtracted from the number of blocks
mapped by ext4_ind_map_blocks(), and the extent status tree will be
updated. We update the extent status tree by first removing the old
extent_status and then inserting the new extent_status. If the block range
we remove happens to be in an extent, then we need to allocate another
extent_status with ext4_es_alloc_extent().

       use old    to remove   to add new
    |----------|------------|------------|
              old extent_status

The problem is that the allocation of a new extent_status failed due to a
fault injection, and __es_shrink() did not get free memory, resulting in
a return of -ENOMEM. Then do_writepages() retries after receiving -ENOMEM,
we map to the same extent again, and the number of reserved blocks is again
subtracted from the number of blocks in that extent. Since the blocks in
the same extent are subtracted twice, we end up triggering WARN_ON at
ext4_da_update_reserve_space() because used > ei->i_reserved_data_blocks.

For non-extent based file, we update the number of reserved blocks after
ext4_ind_map_blocks() is executed, which causes a problem that when we call
ext4_ind_map_blocks() to create a block, it doesn't always create a block,
but we always reduce the number of reserved blocks. So we move the logic
for updating reserved blocks to ext4_ind_map_blocks() to ensure that the
number of reserved blocks is updated only after we do succeed in allocating
some new blocks.

Fixes: 5f634d06 ("ext4: Fix quota accounting error with fallocate")
Reviewed-by: NJan Kara <jack@suse.cz>
Signed-off-by: NBaokun Li <libaokun1@huawei.com>
Reviewed-by: NYang Erkun <yangerkun@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

f0af88ce

can: af_can: fix NULL pointer dereference in can_rcv_filter · a6b58c2d

由 Oliver Hartkopp 提交于 5月 09, 2023

stable inclusion
from stable-v5.10.159
commit c42221efb1159d6a3c89e96685ee38acdce86b6f
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6WUDS
CVE: CVE-2023-2166

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=c42221efb1159d6a3c89e96685ee38acdce86b6f

--------------------------------

commit 0acc4423 upstream.

Analogue to commit 8aa59e35 ("can: af_can: fix NULL pointer
dereference in can_rx_register()") we need to check for a missing
initialization of ml_priv in the receive path of CAN frames.

Since commit 4e096a18 ("net: introduce CAN specific pointer in the
struct net_device") the check for dev->type to be ARPHRD_CAN is not
sufficient anymore since bonding or tun netdevices claim to be CAN
devices but do not initialize ml_priv accordingly.

Fixes: 4e096a18 ("net: introduce CAN specific pointer in the struct net_device")
Reported-by: syzbot+2d7f58292cb5b29eb5ad@syzkaller.appspotmail.com
Reported-by: NWei Chen <harperchen1110@gmail.com>
Signed-off-by: NOliver Hartkopp <socketcan@hartkopp.net>
Link: https://lore.kernel.org/all/20221206201259.3028-1-socketcan@hartkopp.net
Cc: stable@vger.kernel.org
Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NZiyang Xuan <william.xuanziyang@huawei.com>
Reviewed-by: NYue Haibing <yuehaibing@huawei.com>
Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

a6b58c2d

RDMA/core: Refactor rdma_bind_addr · e51e93cb

由 Patrisious Haddad 提交于 5月 09, 2023

mainline inclusion
from mainline-v6.3-rc1
commit 8d037973
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6X49E
CVE: CVE-2023-2176

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8d037973d48c026224ab285e6a06985ccac6f7bf

---------------------------

Refactor rdma_bind_addr function so that it doesn't require that the
cma destination address be changed before calling it.

So now it will update the destination address internally only when it is
really needed and after passing all the required checks.

Which in turn results in a cleaner and more sensible call and error
handling flows for the functions that call it directly or indirectly.
Signed-off-by: NPatrisious Haddad <phaddad@nvidia.com>
Reported-by: NWei Chen <harperchen1110@gmail.com>
Reviewed-by: NMark Zhang <markzhang@nvidia.com>
Link: https://lore.kernel.org/r/3d0e9a2fd62bc10ba02fed1c7c48a48638952320.1672819273.git.leonro@nvidia.comSigned-off-by: NLeon Romanovsky <leon@kernel.org>
(cherry picked from commit 8d037973)
Signed-off-by: NLiu Jian <liujian56@huawei.com>
Reviewed-by: NYue Haibing <yuehaibing@huawei.com>
Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

e51e93cb

RDMA/cma: Ensure rdma_addr_cancel() happens before issuing more requests · 7767949c

由 Jason Gunthorpe 提交于 5月 09, 2023

mainline inclusion
from mainline-v5.15-rc4
commit 305d568b
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6X49E
CVE: CVE-2023-2176

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=305d568b72f17f674155a2a8275f865f207b3808

---------------------------

The FSM can run in a circle allowing rdma_resolve_ip() to be called twice
on the same id_priv. While this cannot happen without going through the
work, it violates the invariant that the same address resolution
background request cannot be active twice.

       CPU 1                                  CPU 2

rdma_resolve_addr():
  RDMA_CM_IDLE -> RDMA_CM_ADDR_QUERY
  rdma_resolve_ip(addr_handler)  #1

			 process_one_req(): for #1
                          addr_handler():
                            RDMA_CM_ADDR_QUERY -> RDMA_CM_ADDR_BOUND
                            mutex_unlock(&id_priv->handler_mutex);
                            [.. handler still running ..]

rdma_resolve_addr():
  RDMA_CM_ADDR_BOUND -> RDMA_CM_ADDR_QUERY
  rdma_resolve_ip(addr_handler)
    !! two requests are now on the req_list

rdma_destroy_id():
 destroy_id_handler_unlock():
  _destroy_id():
   cma_cancel_operation():
    rdma_addr_cancel()

                          // process_one_req() self removes it
		          spin_lock_bh(&lock);
                           cancel_delayed_work(&req->work);
	                   if (!list_empty(&req->list)) == true

      ! rdma_addr_cancel() returns after process_on_req #1 is done

   kfree(id_priv)

			 process_one_req(): for #2
                          addr_handler():
	                    mutex_lock(&id_priv->handler_mutex);
                            !! Use after free on id_priv

rdma_addr_cancel() expects there to be one req on the list and only
cancels the first one. The self-removal behavior of the work only happens
after the handler has returned. This yields a situations where the
req_list can have two reqs for the same "handle" but rdma_addr_cancel()
only cancels the first one.

The second req remains active beyond rdma_destroy_id() and will
use-after-free id_priv once it inevitably triggers.

Fix this by remembering if the id_priv has called rdma_resolve_ip() and
always cancel before calling it again. This ensures the req_list never
gets more than one item in it and doesn't cost anything in the normal flow
that never uses this strange error path.

Link: https://lore.kernel.org/r/0-v1-3bc675b8006d+22-syz_cancel_uaf_jgg@nvidia.com
Cc: stable@vger.kernel.org
Fixes: e51060f0 ("IB: IP address based RDMA connection manager")
Reported-by: syzbot+dc3dfba010d7671e05f5@syzkaller.appspotmail.com
Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>
(cherry picked from commit 305d568b)
Signed-off-by: NLiu Jian <liujian56@huawei.com>

Conflicts:
	drivers/infiniband/core/cma_priv.h
Reviewed-by: NYue Haibing <yuehaibing@huawei.com>
Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

7767949c

scsi: dpt_i2o: Remove obsolete driver · 49d0287b

由 Arnd Bergmann 提交于 5月 09, 2023

mainline inclusion
from mainline-v6.0-rc1~14
commit b04e75a4
category: bugfix
bugzilla: 188707, https://gitee.com/src-openeuler/kernel/issues/I6VK2F
CVE: CVE-2023-2007

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b04e75a4a8a81887386a0d2dbf605a48e779d2a0

----------------------------------------

The dpt_i2o driver was fixed to stop using virt_to_bus() in 2008, but it
still has a stale reference in an error handling code path that could never
work. I submitted a patch to fix this reference earlier, but Hannes
Reinecke suggested that removing the driver may be just as good here.

The i2o driver layer was removed in 2015 with commit 4a72a7af
("staging: remove i2o subsystem"), but the even older dpt_i2o scsi driver
stayed around.

The last non-cleanup patches I could find were from Miquel van Smoorenburg
and Mark Salyzyn back in 2008, they might know if there is any chance of
the hardware still being used anywhere.

Link: https://lore.kernel.org/linux-scsi/CAK8P3a1XfwkTOV7qOs1fTxf4vthNBRXKNu8A5V7TWnHT081NGA@mail.gmail.com/T/
Link: https://lore.kernel.org/r/20220624155226.2889613-3-arnd@kernel.org
Cc: Miquel van Smoorenburg <mikevs@xs4all.net>
Cc: Mark Salyzyn <salyzyn@android.com>
Cc: Hannes Reinecke <hare@suse.de>
Signed-off-by: NArnd Bergmann <arnd@arndb.de>
Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: NZhong Jinghua <zhongjinghua@huawei.com>
Reviewed-by: NYu Kuai <yukuai3@huawei.com>
Reviewed-by: NHou Tao <houtao1@huawei.com>
Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

49d0287b

writeback, cgroup: fix null-ptr-deref write in bdi_split_work_to_wbs · 79c2ed51

由 Baokun Li 提交于 5月 09, 2023

mainline inclusion
from mainline-v6.3-rc8
commit 1ba1199e
category: bugfix
bugzilla: 188601, https://gitee.com/openeuler/kernel/issues/I6TNTC
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1ba1199ec5747f475538c0d25a32804e5ba1dfde

--------------------------------

KASAN report null-ptr-deref:
==================================================================
BUG: KASAN: null-ptr-deref in bdi_split_work_to_wbs+0x5c5/0x7b0
Write of size 8 at addr 0000000000000000 by task sync/943
CPU: 5 PID: 943 Comm: sync Tainted: 6.3.0-rc5-next-20230406-dirty #461
Call Trace:
 <TASK>
 dump_stack_lvl+0x7f/0xc0
 print_report+0x2ba/0x340
 kasan_report+0xc4/0x120
 kasan_check_range+0x1b7/0x2e0
 __kasan_check_write+0x24/0x40
 bdi_split_work_to_wbs+0x5c5/0x7b0
 sync_inodes_sb+0x195/0x630
 sync_inodes_one_sb+0x3a/0x50
 iterate_supers+0x106/0x1b0
 ksys_sync+0x98/0x160
[...]
==================================================================

The race that causes the above issue is as follows:

           cpu1                     cpu2
-------------------------|-------------------------
inode_switch_wbs
 INIT_WORK(&isw->work, inode_switch_wbs_work_fn)
 queue_rcu_work(isw_wq, &isw->work)
 // queue_work async
  inode_switch_wbs_work_fn
   wb_put_many(old_wb, nr_switched)
    percpu_ref_put_many
     ref->data->release(ref)
     cgwb_release
      queue_work(cgwb_release_wq, &wb->release_work)
      // queue_work async
       &wb->release_work
       cgwb_release_workfn
                            ksys_sync
                             iterate_supers
                              sync_inodes_one_sb
                               sync_inodes_sb
                                bdi_split_work_to_wbs
                                 kmalloc(sizeof(*work), GFP_ATOMIC)
                                 // alloc memory failed
        percpu_ref_exit
         ref->data = NULL
         kfree(data)
                                 wb_get(wb)
                                  percpu_ref_get(&wb->refcnt)
                                   percpu_ref_get_many(ref, 1)
                                    atomic_long_add(nr, &ref->data->count)
                                     atomic64_add(i, v)
                                     // trigger null-ptr-deref

bdi_split_work_to_wbs() traverses &bdi->wb_list to split work into all
wbs.  If the allocation of new work fails, the on-stack fallback will be
used and the reference count of the current wb is increased afterwards.
If cgroup writeback membership switches occur before getting the reference
count and the current wb is released as old_wd, then calling wb_get() or
wb_put() will trigger the null pointer dereference above.

This issue was introduced in v4.3-rc7 (see fix tag1).  Both
sync_inodes_sb() and __writeback_inodes_sb_nr() calls to
bdi_split_work_to_wbs() can trigger this issue.  For scenarios called via
sync_inodes_sb(), originally commit 7fc5854f ("writeback: synchronize
sync(2) against cgroup writeback membership switches") reduced the
possibility of the issue by adding wb_switch_rwsem, but in v5.14-rc1 (see
fix tag2) removed the "inode_io_list_del_locked(inode, old_wb)" from
inode_switch_wbs_work_fn() so that wb->state contains WB_has_dirty_io,
thus old_wb is not skipped when traversing wbs in bdi_split_work_to_wbs(),
and the issue becomes easily reproducible again.

To solve this problem, percpu_ref_exit() is called under RCU protection to
avoid race between cgwb_release_workfn() and bdi_split_work_to_wbs().
Moreover, replace wb_get() with wb_tryget() in bdi_split_work_to_wbs(),
and skip the current wb if wb_tryget() fails because the wb has already
been shutdown.

Link: https://lkml.kernel.org/r/20230410130826.1492525-1-libaokun1@huawei.com
Fixes: b817525a ("writeback: bdi_writeback iteration must not skip dying ones")
Signed-off-by: NBaokun Li <libaokun1@huawei.com>
Reviewed-by: NJan Kara <jack@suse.cz>
Acked-by: NTejun Heo <tj@kernel.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Hou Tao <houtao1@huawei.com>
Cc: yangerkun <yangerkun@huawei.com>
Cc: Zhang Yi <yi.zhang@huawei.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: <stable@vger.kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>

Conflicts:
	mm/backing-dev.c
Signed-off-by: NBaokun Li <libaokun1@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Reviewed-by: NYang Erkun <yangerkun@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

79c2ed51

bpf, sockmap: Fix an infinite loop error when len is 0 in tcp_bpf_recvmsg_parser() · 4be06acd

由 Liu Jian 提交于 5月 09, 2023

mainline inclusion
from mainline-v6.3-rc2
commit d900f3d2
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I65HYE

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d900f3d20cc3169ce42ec72acc850e662a4d4db2

---------------------------

When the buffer length of the recvmsg system call is 0, we got the
flollowing soft lockup problem:

watchdog: BUG: soft lockup - CPU#3 stuck for 27s! [a.out:6149]
CPU: 3 PID: 6149 Comm: a.out Kdump: loaded Not tainted 6.2.0+ #30
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
RIP: 0010:remove_wait_queue+0xb/0xc0
Code: 5e 41 5f c3 cc cc cc cc 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 41 57 <41> 56 41 55 41 54 55 48 89 fd 53 48 89 f3 4c 8d 6b 18 4c 8d 73 20
RSP: 0018:ffff88811b5978b8 EFLAGS: 00000246
RAX: 0000000000000000 RBX: ffff88811a7d3780 RCX: ffffffffb7a4d768
RDX: dffffc0000000000 RSI: ffff88811b597908 RDI: ffff888115408040
RBP: 1ffff110236b2f1b R08: 0000000000000000 R09: ffff88811a7d37e7
R10: ffffed10234fa6fc R11: 0000000000000001 R12: ffff88811179b800
R13: 0000000000000001 R14: ffff88811a7d38a8 R15: ffff88811a7d37e0
FS:  00007f6fb5398740(0000) GS:ffff888237180000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020000000 CR3: 000000010b6ba002 CR4: 0000000000370ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 tcp_msg_wait_data+0x279/0x2f0
 tcp_bpf_recvmsg_parser+0x3c6/0x490
 inet_recvmsg+0x280/0x290
 sock_recvmsg+0xfc/0x120
 ____sys_recvmsg+0x160/0x3d0
 ___sys_recvmsg+0xf0/0x180
 __sys_recvmsg+0xea/0x1a0
 do_syscall_64+0x3f/0x90
 entry_SYSCALL_64_after_hwframe+0x72/0xdc

The logic in tcp_bpf_recvmsg_parser is as follows:

msg_bytes_ready:
	copied = sk_msg_recvmsg(sk, psock, msg, len, flags);
	if (!copied) {
		wait data;
		goto msg_bytes_ready;
	}

In this case, "copied" always is 0, the infinite loop occurs.

According to the Linux system call man page, 0 should be returned in this
case. Therefore, in tcp_bpf_recvmsg_parser(), if the length is 0, directly
return. Also modify several other functions with the same problem.

Fixes: 1f5be6b3 ("udp: Implement udp_bpf_recvmsg() for sockmap")
Fixes: 9825d866 ("af_unix: Implement unix_dgram_bpf_recvmsg()")
Fixes: c5d2177a ("bpf, sockmap: Fix race in ingress receive verdict with redirect to self")
Fixes: 604326b4 ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: NLiu Jian <liujian56@huawei.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Acked-by: NJohn Fastabend <john.fastabend@gmail.com>
Cc: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/20230303080946.1146638-1-liujian56@huawei.comSigned-off-by: NLiu Jian <liujian56@huawei.com>

Conflicts:
	net/ipv4/udp_bpf.c
	net/unix/unix_bpf.c
Reviewed-by: NYue Haibing <yuehaibing@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

4be06acd

bpf, sockmap: Fix double bpf_prog_put on error case in map_link · 220b8487

由 John Fastabend 提交于 5月 09, 2023

mainline inclusion
from mainline-v5.17-rc1
commit 218d747a
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I65HYE

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=218d747a4142f281a256687bb513a135c905867b

---------------------------

sock_map_link() is called to update a sockmap entry with a sk. But, if the
sock_map_init_proto() call fails then we return an error to the map_update
op against the sockmap. In the error path though we need to cleanup psock
and dec the refcnt on any programs associated with the map, because we
refcnt them early in the update process to ensure they are pinned for the
psock. (This avoids a race where user deletes programs while also updating
the map with new socks.)

In current code we do the prog refcnt dec explicitely by calling
bpf_prog_put() when the program was found in the map. But, after commit
'38207a5e' in this error path we've already done the prog to psock
assignment so the programs have a reference from the psock as well. This
then causes the psock tear down logic, invoked by sk_psock_put() in the
error path, to similarly call bpf_prog_put on the programs there.

To be explicit this logic does the prog->psock assignment:

  if (msg_*)
    psock_set_prog(...)

Then the error path under the out_progs label does a similar check and
dec with:

  if (msg_*)
     bpf_prog_put(...)

And the teardown logic sk_psock_put() does ...

  psock_set_prog(msg_*, NULL)

... triggering another bpf_prog_put(...). Then KASAN gives us this splat,
found by syzbot because we've created an inbalance between bpf_prog_inc and
bpf_prog_put calling put twice on the program.

  BUG: KASAN: vmalloc-out-of-bounds in __bpf_prog_put kernel/bpf/syscall.c:1812 [inline]
  BUG: KASAN: vmalloc-out-of-bounds in __bpf_prog_put kernel/bpf/syscall.c:1812 [inline] kernel/bpf/syscall.c:1829
  BUG: KASAN: vmalloc-out-of-bounds in bpf_prog_put+0x8c/0x4f0 kernel/bpf/syscall.c:1829 kernel/bpf/syscall.c:1829
  Read of size 8 at addr ffffc90000e76038 by task syz-executor020/3641

To fix clean up error path so it doesn't try to do the bpf_prog_put in the
error path once progs are assigned then it relies on the normal psock
tear down logic to do complete cleanup.

For completness we also cover the case whereh sk_psock_init_strp() fails,
but this is not expected because it indicates an incorrect socket type
and should be caught earlier.

Fixes: 38207a5e ("bpf, sockmap: Attach map progs to psock early for feature probes")
Reported-by: syzbot+bb73e71cf4b8fd376a4f@syzkaller.appspotmail.com
Signed-off-by: NJohn Fastabend <john.fastabend@gmail.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220104214645.290900-1-john.fastabend@gmail.com
(cherry picked from commit 218d747a)
Signed-off-by: NLiu Jian <liujian56@huawei.com>

Conflicts:
	net/core/sock_map.c
Reviewed-by: NYue Haibing <yuehaibing@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

220b8487

bpf, sockmap: Re-evaluate proto ops when psock is removed from sockmap · cbc0f0ca

由 John Fastabend 提交于 5月 09, 2023

mainline inclusion
from mainline-v5.16-rc5
commit c0d95d33
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I65HYE

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c0d95d3380ee099d735e08618c0d599e72f6c8b0

---------------------------

When a sock is added to a sock map we evaluate what proto op hooks need to
be used. However, when the program is removed from the sock map we have not
been evaluating if that changes the required program layout.

Before the patch listed in the 'fixes' tag this was not causing failures
because the base program set handles all cases. Specifically, the case with
a stream parser and the case with out a stream parser are both handled. With
the fix below we identified a race when running with a proto op that attempts
to read skbs off both the stream parser and the skb->receive_queue. Namely,
that a race existed where when the stream parser is empty checking the
skb->receive_queue from recvmsg at the precies moment when the parser is
paused and the receive_queue is not empty could result in skipping the stream
parser. This may break a RX policy depending on the parser to run.

The fix tag then loads a specific proto ops that resolved this race. But, we
missed removing that proto ops recv hook when the sock is removed from the
sockmap. The result is the stream parser is stopped so no more skbs will be
aggregated there, but the hook and BPF program continues to be attached on
the psock. User space will then get an EBUSY when trying to read the socket
because the recvmsg() handler is now waiting on a stopped stream parser.

To fix we rerun the proto ops init() function which will look at the new set
of progs attached to the psock and rest the proto ops hook to the correct
handlers. And in the above case where we remove the sock from the sock map
the RX prog will no longer be listed so the proto ops is removed.

Fixes: c5d2177a ("bpf, sockmap: Fix race in ingress receive verdict with redirect to self")
Signed-off-by: NJohn Fastabend <john.fastabend@gmail.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211119181418.353932-3-john.fastabend@gmail.com
(cherry picked from commit c0d95d33)
Signed-off-by: NLiu Jian <liujian56@huawei.com>

Conflicts:
net/core/skmsg.c
Reviewed-by: NYue Haibing <yuehaibing@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

cbc0f0ca

bpf, sockmap: Attach map progs to psock early for feature probes · 75922849

由 John Fastabend 提交于 5月 09, 2023

mainline inclusion
from mainline-v5.16-rc5
commit 38207a5e
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I65HYE

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=38207a5e81230d6ffbdd51e5fa5681be5116dcae

---------------------------

When a TCP socket is added to a sock map we look at the programs attached
to the map to determine what proto op hooks need to be changed. Before
the patch in the 'fixes' tag there were only two categories -- the empty
set of programs or a TX policy. In any case the base set handled the
receive case.

After the fix we have an optimized program for receive that closes a small,
but possible, race on receive. This program is loaded only when the map the
psock is being added to includes a RX policy. Otherwise, the race is not
possible so we don't need to handle the race condition.

In order for the call to sk_psock_init() to correctly evaluate the above
conditions all progs need to be set in the psock before the call. However,
in the current code this is not the case. We end up evaluating the
requirements on the old prog state. If your psock is attached to multiple
maps -- for example a tx map and rx map -- then the second update would pull
in the correct maps. But, the other pattern with a single rx enabled map
the correct receive hooks are not used. The result is the race fixed by the
patch in the fixes tag below may still be seen in this case.

To fix we simply set all psock->progs before doing the call into
sock_map_init(). With this the init() call gets the full list of programs
and chooses the correct proto ops on the first iteration instead of
requiring the second update to pull them in. This fixes the race case when
only a single map is used.

Fixes: c5d2177a ("bpf, sockmap: Fix race in ingress receive verdict with redirect to self")
Signed-off-by: NJohn Fastabend <john.fastabend@gmail.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211119181418.353932-2-john.fastabend@gmail.com
(cherry picked from commit 38207a5e)
Signed-off-by: NLiu Jian <liujian56@huawei.com>

Conflicts:
net/core/sock_map.c
Reviewed-by: NYue Haibing <yuehaibing@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

75922849

bpf, sockmap: Fix return codes from tcp_bpf_recvmsg_parser() · d3bb682c

由 John Fastabend 提交于 5月 09, 2023

mainline inclusion
from mainline-v5.17-rc1
commit 5b2c5540
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I65HYE

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5b2c5540b8110eea0d67a78fb0ddb9654c58daeb

---------------------------

Applications can be confused slightly because we do not always return the
same error code as expected, e.g. what the TCP stack normally returns. For
example on a sock err sk->sk_err instead of returning the sock_error we
return EAGAIN. This usually means the application will 'try again'
instead of aborting immediately. Another example, when a shutdown event
is received we should immediately abort instead of waiting for data when
the user provides a timeout.

These tend to not be fatal, applications usually recover, but introduces
bogus errors to the user or introduces unexpected latency. Before
'c5d2177a' we fell back to the TCP stack when no data was available
so we managed to catch many of the cases here, although with the extra
latency cost of calling tcp_msg_wait_data() first.

To fix lets duplicate the error handling in TCP stack into tcp_bpf so
that we get the same error codes.

These were found in our CI tests that run applications against sockmap
and do longer lived testing, at least compared to test_sockmap that
does short-lived ping/pong tests, and in some of our test clusters
we deploy.

Its non-trivial to do these in a shorter form CI tests that would be
appropriate for BPF selftests, but we are looking into it so we can
ensure this keeps working going forward. As a preview one idea is to
pull in the packetdrill testing which catches some of this.

Fixes: c5d2177a ("bpf, sockmap: Fix race in ingress receive verdict with redirect to self")
Signed-off-by: NJohn Fastabend <john.fastabend@gmail.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220104205918.286416-1-john.fastabend@gmail.com
(cherry picked from commit 5b2c5540)
Signed-off-by: NLiu Jian <liujian56@huawei.com>

Conflicts:
net/ipv4/tcp_bpf.c
Reviewed-by: NYue Haibing <yuehaibing@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

d3bb682c

bpf, sockmap: Fix race in ingress receive verdict with redirect to self · 9a9749fb

由 John Fastabend 提交于 5月 09, 2023

mainline inclusion
from mainline-v5.16-rc1
commit c5d2177a
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I65HYE

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c5d2177a72a1659554922728fc407f59950aa929

---------------------------

A socket in a sockmap may have different combinations of programs attached
depending on configuration. There can be no programs in which case the socket
acts as a sink only. There can be a TX program in this case a BPF program is
attached to sending side, but no RX program is attached. There can be an RX
program only where sends have no BPF program attached, but receives are hooked
with BPF. And finally, both TX and RX programs may be attached. Giving us the
permutations:

 None, Tx, Rx, and TxRx

To date most of our use cases have been TX case being used as a fast datapath
to directly copy between local application and a userspace proxy. Or Rx cases
and TxRX applications that are operating an in kernel based proxy. The traffic
in the first case where we hook applications into a userspace application looks
like this:

  AppA  redirect   AppB
   Tx <-----------> Rx
   |                |
   +                +
   TCP <--> lo <--> TCP

In this case all traffic from AppA (after 3whs) is copied into the AppB
ingress queue and no traffic is ever on the TCP recieive_queue.

In the second case the application never receives, except in some rare error
cases, traffic on the actual user space socket. Instead the send happens in
the kernel.

           AppProxy       socket pool
       sk0 ------------->{sk1,sk2, skn}
        ^                      |
        |                      |
        |                      v
       ingress              lb egress
       TCP                  TCP

Here because traffic is never read off the socket with userspace recv() APIs
there is only ever one reader on the sk receive_queue. Namely the BPF programs.

However, we've started to introduce a third configuration where the BPF program
on receive should process the data, but then the normal case is to push the
data into the receive queue of AppB.

       AppB
       recv()                (userspace)
     -----------------------
       tcp_bpf_recvmsg()     (kernel)
         |             |
         |             |
         |             |
       ingress_msgQ    |
         |             |
       RX_BPF          |
         |             |
         v             v
       sk->receive_queue

This is different from the App{A,B} redirect because traffic is first received
on the sk->receive_queue.

Now for the issue. The tcp_bpf_recvmsg() handler first checks the ingress_msg
queue for any data handled by the BPF rx program and returned with PASS code
so that it was enqueued on the ingress msg queue. Then if no data exists on
that queue it checks the socket receive queue. Unfortunately, this is the same
receive_queue the BPF program is reading data off of. So we get a race. Its
possible for the recvmsg() hook to pull data off the receive_queue before the
BPF hook has a chance to read it. It typically happens when an application is
banging on recv() and getting EAGAINs. Until they manage to race with the RX
BPF program.

To fix this we note that before this patch at attach time when the socket is
loaded into the map we check if it needs a TX program or just the base set of
proto bpf hooks. Then it uses the above general RX hook regardless of if we
have a BPF program attached at rx or not. This patch now extends this check to
handle all cases enumerated above, TX, RX, TXRX, and none. And to fix above
race when an RX program is attached we use a new hook that is nearly identical
to the old one except now we do not let the recv() call skip the RX BPF program.
Now only the BPF program pulls data from sk->receive_queue and recv() only
pulls data from the ingress msgQ post BPF program handling.

With this resolved our AppB from above has been up and running for many hours
without detecting any errors. We do this by correlating counters in RX BPF
events and the AppB to ensure data is never skipping the BPF program. Selftests,
was not able to detect this because we only run them for a short period of time
on well ordered send/recvs so we don't get any of the noise we see in real
application environments.

Fixes: 51199405 ("bpf: skb_verdict, support SK_PASS on RX BPF path")
Signed-off-by: NJohn Fastabend <john.fastabend@gmail.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Tested-by: NJussi Maki <joamaki@gmail.com>
Reviewed-by: NJakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/20211103204736.248403-4-john.fastabend@gmail.com
(cherry picked from commit c5d2177a)
Signed-off-by: NLiu Jian <liujian56@huawei.com>

 Conflicts:
	net/ipv4/tcp_bpf.c
Reviewed-by: NYue Haibing <yuehaibing@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

9a9749fb

09 5月, 2023 4 次提交

kabi: Fix kabi breakage without build warning. · f2968b34

由 Xie Haocheng 提交于 4月 25, 2023

amd inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I6XNL2
CVE: NA

-------------------------------------------------
Error report detail:
*** ERROR - ABI BREAKAGE WAS DETECTED ***

The following symbols have been changed (this will cause an ABI breakage):
new kabi:
0x65d25289	__SCK__tp_func_xdp_exception	vmlinux	EXPORT_SYMBOL_GPL
0x5e9265ee	__tracepoint_xdp_exception	vmlinux	EXPORT_SYMBOL_GPL
old kabi:
0x5e0fbbff	__SCK__tp_func_xdp_exception	vmlinux	EXPORT_SYMBOL_GPL
0x017cc464	__tracepoint_xdp_exception	vmlinux	EXPORT_SYMBOL_GPL
Signed-off-by: NXie Haocheng <haocheng.xie@amd.com>
(cherry picked from commit 600130a3)

f2968b34

Revert "kabi: Fix kabi breakage caused by commit ." · 5ce0e68d

由 Xie Haocheng 提交于 4月 25, 2023

amd inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I6XNL2
CVE: NA

-------------------------------------------------

This reverts commit a9cbff64.

This patch could introduce build warnings, should be reverted.
The build warning messages:
WARNING: modpost: EXPORT symbol "__SCT__perf_lopwr_cb" [vmlinux] version generation failed, symbol will not be versioned.
    WARNING: modpost: EXPORT symbol "__SCT__perf_lopwr_cb" [vmlinux] version generation failed, symbol will not be versioned.
Signed-off-by: NXie Haocheng <haocheng.xie@amd.com>
(cherry picked from commit fb465f68)

5ce0e68d

scsi/hifc: Fix compile error in allyesconfigs · 3d7e932d

由 zhoujiadong 提交于 5月 09, 2023

driver inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I6X8PA
CVE: NA

Reference: NA

---------------------------------

When using allyesconfig to configure the kernel,
errors may occur during the linking process when making.
Signed-off-by: Nzhoujiadong <zhoujiadong5@huawei.com>
Reviewed-by: NWulike (Collin) <wulike1@huawei.com>
(cherry picked from commit 3d776c92)

3d7e932d

net/hinic: Fix compile error in allyesconfigs · d33708c3

由 zhoujiadong 提交于 5月 09, 2023

driver inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I6X8PA
CVE: NA

Reference: NA

---------------------------------

When using allyesconfig to configure the kernel,
errors may occur during the linking process when making.
Signed-off-by: Nzhoujiadong <zhoujiadong5@huawei.com>
Reviewed-by: NWulike (Collin) <wulike1@huawei.com>
(cherry picked from commit e1b48bc2)

d33708c3

26 4月, 2023 11 次提交

!631 Backport CVEs and bugfixes · 1a82ccca

由 openeuler-ci-bot 提交于 4月 26, 2023

Merge Pull Request from: @zhangjialin11 
 
Pull new CVEs:
CVE-2023-1855
CVE-2023-2006
CVE-2023-30772
CVE-2023-1872

net bugfixes from Ziyang Xuan
mm cleanup from Ma Wupeng
timer bugfix from Yu Liao
xfs bugfixes from Guo Xuenan 
 
Link:https://gitee.com/openeuler/kernel/pulls/631 

Reviewed-by: Xie XiuQi <xiexiuqi@huawei.com> 
Signed-off-by: Xie XiuQi <xiexiuqi@huawei.com>

1a82ccca

bonding: Fix memory leak when changing bond type to Ethernet · 74aee3ec

由 Ido Schimmel 提交于 4月 26, 2023

mainline inclusion
from mainline-v6.3
commit c484fcc0
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I6WNGK
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c484fcc058bada604d7e4e5228d4affb646ddbc2

---------------------------

When a net device is put administratively up, its 'IFF_UP' flag is set
(if not set already) and a 'NETDEV_UP' notification is emitted, which
causes the 8021q driver to add VLAN ID 0 on the device. The reverse
happens when a net device is put administratively down.

When changing the type of a bond to Ethernet, its 'IFF_UP' flag is
incorrectly cleared, resulting in the kernel skipping the above process
and VLAN ID 0 being leaked [1].

Fix by restoring the flag when changing the type to Ethernet, in a
similar fashion to the restoration of the 'IFF_SLAVE' flag.

The issue can be reproduced using the script in [2], with example out
before and after the fix in [3].

[1]
unreferenced object 0xffff888103479900 (size 256):
  comm "ip", pid 329, jiffies 4294775225 (age 28.561s)
  hex dump (first 32 bytes):
    00 a0 0c 15 81 88 ff ff 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff81a6051a>] kmalloc_trace+0x2a/0xe0
    [<ffffffff8406426c>] vlan_vid_add+0x30c/0x790
    [<ffffffff84068e21>] vlan_device_event+0x1491/0x21a0
    [<ffffffff81440c8e>] notifier_call_chain+0xbe/0x1f0
    [<ffffffff8372383a>] call_netdevice_notifiers_info+0xba/0x150
    [<ffffffff837590f2>] __dev_notify_flags+0x132/0x2e0
    [<ffffffff8375ad9f>] dev_change_flags+0x11f/0x180
    [<ffffffff8379af36>] do_setlink+0xb96/0x4060
    [<ffffffff837adf6a>] __rtnl_newlink+0xc0a/0x18a0
    [<ffffffff837aec6c>] rtnl_newlink+0x6c/0xa0
    [<ffffffff837ac64e>] rtnetlink_rcv_msg+0x43e/0xe00
    [<ffffffff839a99e0>] netlink_rcv_skb+0x170/0x440
    [<ffffffff839a738f>] netlink_unicast+0x53f/0x810
    [<ffffffff839a7fcb>] netlink_sendmsg+0x96b/0xe90
    [<ffffffff8369d12f>] ____sys_sendmsg+0x30f/0xa70
    [<ffffffff836a6d7a>] ___sys_sendmsg+0x13a/0x1e0
unreferenced object 0xffff88810f6a83e0 (size 32):
  comm "ip", pid 329, jiffies 4294775225 (age 28.561s)
  hex dump (first 32 bytes):
    a0 99 47 03 81 88 ff ff a0 99 47 03 81 88 ff ff  ..G.......G.....
    81 00 00 00 01 00 00 00 cc cc cc cc cc cc cc cc  ................
  backtrace:
    [<ffffffff81a6051a>] kmalloc_trace+0x2a/0xe0
    [<ffffffff84064369>] vlan_vid_add+0x409/0x790
    [<ffffffff84068e21>] vlan_device_event+0x1491/0x21a0
    [<ffffffff81440c8e>] notifier_call_chain+0xbe/0x1f0
    [<ffffffff8372383a>] call_netdevice_notifiers_info+0xba/0x150
    [<ffffffff837590f2>] __dev_notify_flags+0x132/0x2e0
    [<ffffffff8375ad9f>] dev_change_flags+0x11f/0x180
    [<ffffffff8379af36>] do_setlink+0xb96/0x4060
    [<ffffffff837adf6a>] __rtnl_newlink+0xc0a/0x18a0
    [<ffffffff837aec6c>] rtnl_newlink+0x6c/0xa0
    [<ffffffff837ac64e>] rtnetlink_rcv_msg+0x43e/0xe00
    [<ffffffff839a99e0>] netlink_rcv_skb+0x170/0x440
    [<ffffffff839a738f>] netlink_unicast+0x53f/0x810
    [<ffffffff839a7fcb>] netlink_sendmsg+0x96b/0xe90
    [<ffffffff8369d12f>] ____sys_sendmsg+0x30f/0xa70
    [<ffffffff836a6d7a>] ___sys_sendmsg+0x13a/0x1e0

[2]
ip link add name t-nlmon type nlmon
ip link add name t-dummy type dummy
ip link add name t-bond type bond mode active-backup

ip link set dev t-bond up
ip link set dev t-nlmon master t-bond
ip link set dev t-nlmon nomaster
ip link show dev t-bond
ip link set dev t-dummy master t-bond
ip link show dev t-bond

ip link del dev t-bond
ip link del dev t-dummy
ip link del dev t-nlmon

[3]
Before:

12: t-bond: <NO-CARRIER,BROADCAST,MULTICAST,MASTER,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default qlen 1000
    link/netlink
12: t-bond: <BROADCAST,MULTICAST,MASTER,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 46:57:39:a4:46:a2 brd ff:ff:ff:ff:ff:ff

After:

12: t-bond: <NO-CARRIER,BROADCAST,MULTICAST,MASTER,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default qlen 1000
    link/netlink
12: t-bond: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 66:48:7b:74:b6:8a brd ff:ff:ff:ff:ff:ff

Fixes: e36b9d16 ("bonding: clean muticast addresses when device changes type")
Fixes: 75c78500 ("bonding: remap muticast addresses without using dev_close() and dev_open()")
Fixes: 9ec7eb60 ("bonding: restore IFF_MASTER/SLAVE flags on bond enslave ether type change")
Reported-by: NMirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>
Link: https://lore.kernel.org/netdev/78a8a03b-6070-3e6b-5042-f848dab16fb8@alu.unizg.hr/Tested-by: NMirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>
Signed-off-by: NIdo Schimmel <idosch@nvidia.com>
Acked-by: NJay Vosburgh <jay.vosburgh@canonical.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NZiyang Xuan <william.xuanziyang@huawei.com>
Reviewed-by: NYue Haibing <yuehaibing@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

74aee3ec

bonding: restore bond's IFF_SLAVE flag if a non-eth dev enslave fails · f878ae86

由 Nikolay Aleksandrov 提交于 4月 26, 2023

mainline inclusion
from mainline-v6.3-rc3
commit e667d469
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I6WNGK
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e667d469098671261d558be0cd93dca4d285ce1e

---------------------------

syzbot reported a warning[1] where the bond device itself is a slave and
we try to enslave a non-ethernet device as the first slave which fails
but then in the error path when ether_setup() restores the bond device
it also clears all flags. In my previous fix[2] I restored the
IFF_MASTER flag, but I didn't consider the case that the bond device
itself might also be a slave with IFF_SLAVE set, so we need to restore
that flag as well. Use the bond_ether_setup helper which does the right
thing and restores the bond's flags properly.

Steps to reproduce using a nlmon dev:
 $ ip l add nlmon0 type nlmon
 $ ip l add bond1 type bond
 $ ip l add bond2 type bond
 $ ip l set bond1 master bond2
 $ ip l set dev nlmon0 master bond1
 $ ip -d l sh dev bond1
 22: bond1: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noqueue master bond2 state DOWN mode DEFAULT group default qlen 1000
 (now bond1's IFF_SLAVE flag is gone and we'll hit a warning[3] if we
  try to delete it)

[1] https://syzkaller.appspot.com/bug?id=391c7b1f6522182899efba27d891f1743e8eb3ef
[2] commit 7d5cd2ce ("bonding: correctly handle bonding type change on enslave failure")
[3] example warning:
 [   27.008664] bond1: (slave nlmon0): The slave device specified does not support setting the MAC address
 [   27.008692] bond1: (slave nlmon0): Error -95 calling set_mac_address
 [   32.464639] bond1 (unregistering): Released all slaves
 [   32.464685] ------------[ cut here ]------------
 [   32.464686] WARNING: CPU: 1 PID: 2004 at net/core/dev.c:10829 unregister_netdevice_many+0x72a/0x780
 [   32.464694] Modules linked in: br_netfilter bridge bonding virtio_net
 [   32.464699] CPU: 1 PID: 2004 Comm: ip Kdump: loaded Not tainted 5.18.0-rc3+ #47
 [   32.464703] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.1-2.fc37 04/01/2014
 [   32.464704] RIP: 0010:unregister_netdevice_many+0x72a/0x780
 [   32.464707] Code: 99 fd ff ff ba 90 1a 00 00 48 c7 c6 f4 02 66 96 48 c7 c7 20 4d 35 96 c6 05 fa c7 2b 02 01 e8 be 6f 4a 00 0f 0b e9 73 fd ff ff <0f> 0b e9 5f fd ff ff 80 3d e3 c7 2b 02 00 0f 85 3b fd ff ff ba 59
 [   32.464710] RSP: 0018:ffffa006422d7820 EFLAGS: 00010206
 [   32.464712] RAX: ffff8f6e077140a0 RBX: ffffa006422d7888 RCX: 0000000000000000
 [   32.464714] RDX: ffff8f6e12edbe58 RSI: 0000000000000296 RDI: ffffffff96d4a520
 [   32.464716] RBP: ffff8f6e07714000 R08: ffffffff96d63600 R09: ffffa006422d7728
 [   32.464717] R10: 0000000000000ec0 R11: ffffffff9698c988 R12: ffff8f6e12edb140
 [   32.464719] R13: dead000000000122 R14: dead000000000100 R15: ffff8f6e12edb140
 [   32.464723] FS:  00007f297c2f1740(0000) GS:ffff8f6e5d900000(0000) knlGS:0000000000000000
 [   32.464725] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 [   32.464726] CR2: 00007f297bf1c800 CR3: 00000000115e8000 CR4: 0000000000350ee0
 [   32.464730] Call Trace:
 [   32.464763]  <TASK>
 [   32.464767]  rtnl_dellink+0x13e/0x380
 [   32.464776]  ? cred_has_capability.isra.0+0x68/0x100
 [   32.464780]  ? __rtnl_unlock+0x33/0x60
 [   32.464783]  ? bpf_lsm_capset+0x10/0x10
 [   32.464786]  ? security_capable+0x36/0x50
 [   32.464790]  rtnetlink_rcv_msg+0x14e/0x3b0
 [   32.464792]  ? _copy_to_iter+0xb1/0x790
 [   32.464796]  ? post_alloc_hook+0xa0/0x160
 [   32.464799]  ? rtnl_calcit.isra.0+0x110/0x110
 [   32.464802]  netlink_rcv_skb+0x50/0xf0
 [   32.464806]  netlink_unicast+0x216/0x340
 [   32.464809]  netlink_sendmsg+0x23f/0x480
 [   32.464812]  sock_sendmsg+0x5e/0x60
 [   32.464815]  ____sys_sendmsg+0x22c/0x270
 [   32.464818]  ? import_iovec+0x17/0x20
 [   32.464821]  ? sendmsg_copy_msghdr+0x59/0x90
 [   32.464823]  ? do_set_pte+0xa0/0xe0
 [   32.464828]  ___sys_sendmsg+0x81/0xc0
 [   32.464832]  ? mod_objcg_state+0xc6/0x300
 [   32.464835]  ? refill_obj_stock+0xa9/0x160
 [   32.464838]  ? memcg_slab_free_hook+0x1a5/0x1f0
 [   32.464842]  __sys_sendmsg+0x49/0x80
 [   32.464847]  do_syscall_64+0x3b/0x90
 [   32.464851]  entry_SYSCALL_64_after_hwframe+0x44/0xae
 [   32.464865] RIP: 0033:0x7f297bf2e5e7
 [   32.464868] Code: 64 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 00 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10
 [   32.464869] RSP: 002b:00007ffd96c824c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
 [   32.464872] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f297bf2e5e7
 [   32.464874] RDX: 0000000000000000 RSI: 00007ffd96c82540 RDI: 0000000000000003
 [   32.464875] RBP: 00000000640f19de R08: 0000000000000001 R09: 000000000000007c
 [   32.464876] R10: 00007f297bffabe0 R11: 0000000000000246 R12: 0000000000000001
 [   32.464877] R13: 00007ffd96c82d20 R14: 00007ffd96c82610 R15: 000055bfe38a7020
 [   32.464881]  </TASK>
 [   32.464882] ---[ end trace 0000000000000000 ]---

Fixes: 7d5cd2ce ("bonding: correctly handle bonding type change on enslave failure")
Reported-by: syzbot+9dfc3f3348729cc82277@syzkaller.appspotmail.com
Link: https://syzkaller.appspot.com/bug?id=391c7b1f6522182899efba27d891f1743e8eb3efSigned-off-by: NNikolay Aleksandrov <razor@blackwall.org>
Reviewed-by: NMichal Kubiak <michal.kubiak@intel.com>
Acked-by: NJonathan Toppins <jtoppins@redhat.com>
Acked-by: NJay Vosburgh <jay.vosburgh@canonical.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NZiyang Xuan <william.xuanziyang@huawei.com>
Reviewed-by: NYue Haibing <yuehaibing@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

f878ae86

bonding: restore IFF_MASTER/SLAVE flags on bond enslave ether type change · a30dde44

由 Nikolay Aleksandrov 提交于 4月 26, 2023

mainline inclusion
from mainline-v6.3-rc3
commit 9ec7eb60
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I6WNGK
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9ec7eb60dcbcb6c41076defbc5df7bbd95ceaba5

---------------------------

Add bond_ether_setup helper which is used to fix ether_setup() calls in the
bonding driver. It takes care of both IFF_MASTER and IFF_SLAVE flags, the
former is always restored and the latter only if it was set.
If the bond enslaves non-ARPHRD_ETHER device (changes its type), then
releases it and enslaves ARPHRD_ETHER device (changes back) then we
use ether_setup() to restore the bond device type but it also resets its
flags and removes IFF_MASTER and IFF_SLAVE[1]. Use the bond_ether_setup
helper to restore both after such transition.

[1] reproduce (nlmon is non-ARPHRD_ETHER):
 $ ip l add nlmon0 type nlmon
 $ ip l add bond2 type bond mode active-backup
 $ ip l set nlmon0 master bond2
 $ ip l set nlmon0 nomaster
 $ ip l add bond1 type bond
 (we use bond1 as ARPHRD_ETHER device to restore bond2's mode)
 $ ip l set bond1 master bond2
 $ ip l sh dev bond2
 37: bond2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether be:d7:c5:40:5b:cc brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 1500
 (notice bond2's IFF_MASTER is missing)

Fixes: e36b9d16 ("bonding: clean muticast addresses when device changes type")
Signed-off-by: NNikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Conflicts:
	drivers/net/bonding/bond_main.c
Signed-off-by: NZiyang Xuan <william.xuanziyang@huawei.com>
Reviewed-by: NYue Haibing <yuehaibing@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

a30dde44

hwmon: (xgene) Fix use after free bug in xgene_hwmon_remove due to race condition · 490a5a46

由 Zheng Wang 提交于 4月 26, 2023

mainline inclusion
from mainline-v6.3-rc3
commit cb090e64
category: bugfix
bugzilla: 188657, https://gitee.com/src-openeuler/kernel/issues/I6T36A
CVE: CVE-2023-1855

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=cb090e64cf25602b9adaf32d5dfc9c8bec493cd1

--------------------------------

In xgene_hwmon_probe, &ctx->workq is bound with xgene_hwmon_evt_work.
Then it will be started.

If we remove the driver which will call xgene_hwmon_remove to clean up,
there may be unfinished work.

The possible sequence is as follows:

Fix it by finishing the work before cleanup in xgene_hwmon_remove.

CPU0                  CPU1

                    |xgene_hwmon_evt_work
xgene_hwmon_remove   |
kfifo_free(&ctx->async_msg_fifo);|
                    |
                    |kfifo_out_spinlocked
                    |//use &ctx->async_msg_fifo
Fixes: 2ca492e2 ("hwmon: (xgene) Fix crash when alarm occurs before driver probe")
Signed-off-by: NZheng Wang <zyytlz.wz@163.com>
Link: https://lore.kernel.org/r/20230310084007.1403388-1-zyytlz.wz@163.comSigned-off-by: NGuenter Roeck <linux@roeck-us.net>
Signed-off-by: NZhao Wenhui <zhaowenhui8@huawei.com>
Reviewed-by: Nsongping yu <yusongping@huawei.com>
Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com>
Reviewed-by: NZhang Qiao <zhangqiao22@huawei.com>
Reviewed-by: NChen Hui <judy.chenhui@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

490a5a46

rxrpc: Fix race between conn bundle lookup and bundle removal [ZDI-CAN-15975] · f539bfdf

由 David Howells 提交于 4月 26, 2023

stable inclusion
from stable-v5.10.157
commit 3535c632e6d16c98f76e615da8dc0cb2750c66cc
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6VK2H
CVE: CVE-2023-2006

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=3535c632e6d16c98f76e615da8dc0cb2750c66cc

--------------------------------

[ Upstream commit 3bcd6c7e ]

After rxrpc_unbundle_conn() has removed a connection from a bundle, it
checks to see if there are any conns with available channels and, if not,
removes and attempts to destroy the bundle.

Whilst it does check after grabbing client_bundles_lock that there are no
connections attached, this races with rxrpc_look_up_bundle() retrieving the
bundle, but not attaching a connection for the connection to be attached
later.

There is therefore a window in which the bundle can get destroyed before we
manage to attach a new connection to it.

Fix this by adding an "active" counter to struct rxrpc_bundle:

 (1) rxrpc_connect_call() obtains an active count by prepping/looking up a
     bundle and ditches it before returning.

 (2) If, during rxrpc_connect_call(), a connection is added to the bundle,
     this obtains an active count, which is held until the connection is
     discarded.

 (3) rxrpc_deactivate_bundle() is created to drop an active count on a
     bundle and destroy it when the active count reaches 0.  The active
     count is checked inside client_bundles_lock() to prevent a race with
     rxrpc_look_up_bundle().

 (4) rxrpc_unbundle_conn() then calls rxrpc_deactivate_bundle().

Fixes: 245500d8 ("rxrpc: Rewrite the client connection manager")
Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-15975
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Tested-by: zdi-disclosures@trendmicro.com
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

Conflicts:
	net/rxrpc/ar-internal.h
	net/rxrpc/conn_client.c
Signed-off-by: NWang Yufen <wangyufen@huawei.com>
Reviewed-by: NYue Haibing <yuehaibing@huawei.com>
Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

f539bfdf

power: supply: da9150: Fix use after free bug in da9150_charger_remove due to race condition · 771a10a7

由 Zheng Wang 提交于 4月 26, 2023

stable inclusion
from stable-v5.10.177
commit 75e2144291e847009fbc0350e10ec588ff96e05a
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6W80A
CVE: CVE-2023-30772

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=75e2144291e847009fbc0350e10ec588ff96e05a

--------------------------------

[ Upstream commit 06615d11 ]

In da9150_charger_probe, &charger->otg_work is bound with
da9150_charger_otg_work. da9150_charger_otg_ncb may be
called to start the work.

If we remove the module which will call da9150_charger_remove
to make cleanup, there may be a unfinished work. The possible
sequence is as follows:

Fix it by canceling the work before cleanup in the da9150_charger_remove

CPU0                  CPUc1

                    |da9150_charger_otg_work
da9150_charger_remove      |
power_supply_unregister  |
device_unregister   |
power_supply_dev_release|
kfree(psy)          |
                    |
                    | 	power_supply_changed(charger->usb);
                    |   //use

Fixes: c1a281e3 ("power: Add support for DA9150 Charger")
Signed-off-by: NZheng Wang <zyytlz.wz@163.com>
Signed-off-by: NSebastian Reichel <sebastian.reichel@collabora.com>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NGuo Mengqi <guomengqi3@huawei.com>
Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com>
Reviewed-by: NWeilong Chen <chenweilong@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

771a10a7

mm: mem_reliable: Fix blank space issue in reliable_report_usage() · ea9084b8

由 Ma Wupeng 提交于 4月 26, 2023

hulk inclusion
category: cleanup
bugzilla: https://gitee.com/openeuler/kernel/issues/I6WKXZ
CVE: NA

--------------------------------

The blank space before kB is needed to align the previous memory report
style.
Signed-off-by: NMa Wupeng <mawupeng1@huawei.com>
Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

ea9084b8

timers/nohz: Last resort update jiffies on nohz_full IRQ entry · 28e38fcc

由 Frederic Weisbecker 提交于 4月 26, 2023

mainline inclusion
from mainline-v5.16-rc4
commit 53e87e3c
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I6WCC1
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=53e87e3cdc155f20c3417b689df8d2ac88d79576

--------------------------------

When at least one CPU runs in nohz_full mode, a dedicated timekeeper CPU
is guaranteed to stay online and to never stop its tick.

Meanwhile on some rare case, the dedicated timekeeper may be running
with interrupts disabled for a while, such as in stop_machine.

If jiffies stop being updated, a nohz_full CPU may end up endlessly
programming the next tick in the past, taking the last jiffies update
monotonic timestamp as a stale base, resulting in an tick storm.

Here is a scenario where it matters:

0) CPU 0 is the timekeeper and CPU 1 a nohz_full CPU.

1) A stop machine callback is queued to execute somewhere.

2) CPU 0 reaches MULTI_STOP_DISABLE_IRQ while CPU 1 is still in
MULTI_STOP_PREPARE. Hence CPU 0 can't do its timekeeping duty. CPU 1
can still take IRQs.

3) CPU 1 receives an IRQ which queues a timer callback one jiffy forward.

4) On IRQ exit, CPU 1 schedules the tick one jiffy forward, taking
last_jiffies_update as a base. But last_jiffies_update hasn't been
updated for 2 jiffies since the timekeeper has interrupts disabled.

5) clockevents_program_event(), which relies on ktime_get(), observes
that the expiration is in the past and therefore programs the min
delta event on the clock.

6) The tick fires immediately, goto 3)

7) Tick storm, the nohz_full CPU is drown and takes ages to reach
MULTI_STOP_DISABLE_IRQ, which is the only way out of this situation.

Solve this with unconditionally updating jiffies if the value is stale
on nohz_full IRQ entry. IRQs and other disturbances are expected to be
rare enough on nohz_full for the unconditional call to ktime_get() to
actually matter.
Reported-by: NPaul E. McKenney <paulmck@kernel.org>
Signed-off-by: NFrederic Weisbecker <frederic@kernel.org>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Tested-by: NPaul E. McKenney <paulmck@kernel.org>
Link: https://lore.kernel.org/r/20211026141055.57358-2-frederic@kernel.org

Conflicts:
kernel/softirq.c
Signed-off-by: NYu Liao <liaoyu15@huawei.com>
Reviewed-by: NXiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

28e38fcc

xfs: don't leak btree cursor when insrec fails after a split · a156b593

由 Darrick J. Wong 提交于 4月 26, 2023

mainline inclusion
from mainline-v5.18-rc2
commit a54f78de
category: bugfix
bugzilla: 187526,https://gitee.com/openeuler/kernel/issues/I6WKVJ

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a54f78def73d847cb060b18c4e4a3d1d26c9ca6d

--------------------------------

The recent patch to improve btree cycle checking caused a regression
when I rebased the in-memory btree branch atop the 5.19 for-next branch,
because in-memory short-pointer btrees do not have AG numbers.  This
produced the following complaint from kmemleak:

unreferenced object 0xffff88803d47dde8 (size 264):
  comm "xfs_io", pid 4889, jiffies 4294906764 (age 24.072s)
  hex dump (first 32 bytes):
    90 4d 0b 0f 80 88 ff ff 00 a0 bd 05 80 88 ff ff  .M..............
    e0 44 3a a0 ff ff ff ff 00 df 08 06 80 88 ff ff  .D:.............
  backtrace:
    [<ffffffffa0388059>] xfbtree_dup_cursor+0x49/0xc0 [xfs]
    [<ffffffffa029887b>] xfs_btree_dup_cursor+0x3b/0x200 [xfs]
    [<ffffffffa029af5d>] __xfs_btree_split+0x6ad/0x820 [xfs]
    [<ffffffffa029b130>] xfs_btree_split+0x60/0x110 [xfs]
    [<ffffffffa029f6da>] xfs_btree_make_block_unfull+0x19a/0x1f0 [xfs]
    [<ffffffffa029fada>] xfs_btree_insrec+0x3aa/0x810 [xfs]
    [<ffffffffa029fff3>] xfs_btree_insert+0xb3/0x240 [xfs]
    [<ffffffffa02cb729>] xfs_rmap_insert+0x99/0x200 [xfs]
    [<ffffffffa02cf142>] xfs_rmap_map_shared+0x192/0x5f0 [xfs]
    [<ffffffffa02cf60b>] xfs_rmap_map_raw+0x6b/0x90 [xfs]
    [<ffffffffa0384a85>] xrep_rmap_stash+0xd5/0x1d0 [xfs]
    [<ffffffffa0384dc0>] xrep_rmap_visit_bmbt+0xa0/0xf0 [xfs]
    [<ffffffffa0384fb6>] xrep_rmap_scan_iext+0x56/0xa0 [xfs]
    [<ffffffffa03850d8>] xrep_rmap_scan_ifork+0xd8/0x160 [xfs]
    [<ffffffffa0385195>] xrep_rmap_scan_inode+0x35/0x80 [xfs]
    [<ffffffffa03852ee>] xrep_rmap_find_rmaps+0x10e/0x270 [xfs]

I noticed that xfs_btree_insrec has a bunch of debug code that return
out of the function immediately, without freeing the "new" btree cursor
that can be returned when _make_block_unfull calls xfs_btree_split.  Fix
the error return in this function to free the btree cursor.
Signed-off-by: NDarrick J. Wong <djwong@kernel.org>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NDave Chinner <dchinner@redhat.com>
Signed-off-by: NDave Chinner <david@fromorbit.com>
Signed-off-by: NGuo Xuenan <guoxuenan@huawei.com>
Reviewed-by: NYang Erkun <yangerkun@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

a156b593

xfs: avoid unnecessary runtime sibling pointer endian conversions · bd1b1e70

由 Dave Chinner 提交于 4月 26, 2023

mainline inclusion
from mainline-v5.18-rc2
commit 5672225e
category: bugfix
bugzilla: 187526,https://gitee.com/openeuler/kernel/issues/I6WKVJ

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5672225e8f2a872a22b0cecedba7a6644af1fb84

--------------------------------

Commit dc04db2a has caused a small aim7 regression, showing a
small increase in CPU usage in __xfs_btree_check_sblock() as a
result of the extra checking.

This is likely due to the endian conversion of the sibling poitners
being unconditional instead of relying on the compiler to endian
convert the NULL pointer at compile time and avoiding the runtime
conversion for this common case.

Rework the checks so that endian conversion of the sibling pointers
is only done if they are not null as the original code did.

.... and these need to be "inline" because the compiler completely
fails to inline them automatically like it should be doing.

$ size fs/xfs/libxfs/xfs_btree.o*
   text	   data	    bss	    dec	    hex	filename
  51874	    240	      0	  52114	   cb92 fs/xfs/libxfs/xfs_btree.o.orig
  51562	    240	      0	  51802	   ca5a fs/xfs/libxfs/xfs_btree.o.inline

Just when you think the tools have advanced sufficiently we don't
have to care about stuff like this anymore, along comes a reminder
that *our tools still suck*.

Fixes: dc04db2a ("xfs: detect self referencing btree sibling pointers")
Reported-by: Nkernel test robot <oliver.sang@intel.com>
Signed-off-by: NDave Chinner <dchinner@redhat.com>
Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NDave Chinner <david@fromorbit.com>
Signed-off-by: NGuo Xuenan <guoxuenan@huawei.com>
Reviewed-by: NYang Erkun <yangerkun@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

bd1b1e70

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功