- 05 12月, 2022 2 次提交
-
-
由 Jakub Sitnicki 提交于
mainline inclusion from mainline-v6.1-rc7 commit af295e85 category: bugfix bugzilla: 188056, https://gitee.com/src-openeuler/kernel/issues/I62RNU CVE: CVE-2022-4129 -------------------------------- When holding a reader-writer spin lock we cannot sleep. Calling setup_udp_tunnel_sock() with write lock held violates this rule, because we end up calling percpu_down_read(), which might sleep, as syzbot reports [1]: __might_resched.cold+0x222/0x26b kernel/sched/core.c:9890 percpu_down_read include/linux/percpu-rwsem.h:49 [inline] cpus_read_lock+0x1b/0x140 kernel/cpu.c:310 static_key_slow_inc+0x12/0x20 kernel/jump_label.c:158 udp_tunnel_encap_enable include/net/udp_tunnel.h:187 [inline] setup_udp_tunnel_sock+0x43d/0x550 net/ipv4/udp_tunnel_core.c:81 l2tp_tunnel_register+0xc51/0x1210 net/l2tp/l2tp_core.c:1509 pppol2tp_connect+0xcdc/0x1a10 net/l2tp/l2tp_ppp.c:723 Trim the writer-side critical section for sk_callback_lock down to the minimum, so that it covers only operations on sk_user_data. Also, when grabbing the sk_callback_lock, we always need to disable BH, as Eric points out. Failing to do so leads to deadlocks because we acquire sk_callback_lock in softirq context, which can get stuck waiting on us if: 1) it runs on the same CPU, or CPU0 ---- lock(clock-AF_INET6); <Interrupt> lock(clock-AF_INET6); 2) lock ordering leads to priority inversion CPU0 CPU1 ---- ---- lock(clock-AF_INET6); local_irq_disable(); lock(&tcp_hashinfo.bhash[i].lock); lock(clock-AF_INET6); <Interrupt> lock(&tcp_hashinfo.bhash[i].lock); ... as syzbot reports [2,3]. Use the _bh variants for write_(un)lock. [1] https://lore.kernel.org/netdev/0000000000004e78ec05eda79749@google.com/ [2] https://lore.kernel.org/netdev/000000000000e38b6605eda76f98@google.com/ [3] https://lore.kernel.org/netdev/000000000000dfa31e05eda76f75@google.com/ v2: - Check and set sk_user_data while holding sk_callback_lock for both L2TP encapsulation types (IP and UDP) (Tetsuo) Cc: Tom Parkin <tparkin@katalix.com> Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Fixes: b68777d5 ("l2tp: Serialize access to sk_user_data with sk_callback_lock") Reported-by: NEric Dumazet <edumazet@google.com> Reported-by: syzbot+703d9e154b3b58277261@syzkaller.appspotmail.com Reported-by: syzbot+50680ced9e98a61f7698@syzkaller.appspotmail.com Reported-by: syzbot+de987172bb74a381879b@syzkaller.appspotmail.com Signed-off-by: NJakub Sitnicki <jakub@cloudflare.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NLu Wei <luwei32@huawei.com> Reviewed-by: NLiu Jian <liujian56@huawei.com> Reviewed-by: NYue Haibing <yuehaibing@huawei.com> Reviewed-by: NYue Haibing <yuehaibing@huawei.com> Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Jakub Sitnicki 提交于
mainline inclusion from mainline-v6.1-rc6 commit b68777d5 category: bugfix bugzilla: 188056, https://gitee.com/src-openeuler/kernel/issues/I62RNU CVE: CVE-2022-4129 -------------------------------- sk->sk_user_data has multiple users, which are not compatible with each other. Writers must synchronize by grabbing the sk->sk_callback_lock. l2tp currently fails to grab the lock when modifying the underlying tunnel socket fields. Fix it by adding appropriate locking. We err on the side of safety and grab the sk_callback_lock also inside the sk_destruct callback overridden by l2tp, even though there should be no refs allowing access to the sock at the time when sk_destruct gets called. v4: - serialize write to sk_user_data in l2tp sk_destruct v3: - switch from sock lock to sk_callback_lock - document write-protection for sk_user_data v2: - update Fixes to point to origin of the bug - use real names in Reported/Tested-by tags Cc: Tom Parkin <tparkin@katalix.com> Fixes: 3557baab ("[L2TP]: PPP over L2TP driver core") Reported-by: NHaowei Yan <g1042620637@gmail.com> Signed-off-by: NJakub Sitnicki <jakub@cloudflare.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> conflict: net/l2tp/l2tp_core.c Signed-off-by: NLu Wei <luwei32@huawei.com> Reviewed-by: NLiu Jian <liujian56@huawei.com> Reviewed-by: NYue Haibing <yuehaibing@huawei.com> Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
- 29 11月, 2022 5 次提交
-
-
由 openeuler-ci-bot 提交于
Merge Pull Request from: @meitingli Add the upcoming ConnectX-6 Dx, ConnectX-6 LX device ID. In addition, add "ConnectX Family mlx5Gen Virtual Function" device ID. Every new HCA VF will be identified with this device ID. Different VF models will be distinguished by their revision id. Link:https://gitee.com/openeuler/kernel/pulls/288 Reviewed-by: Laibin Qiu <qiulaibin@huawei.com> Signed-off-by: Xie XiuQi <xiexiuqi@huawei.com>
-
由 Shani Shapp 提交于
mainline inclusion from mainline-v5.4 commit b7eca940 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I63MQP CVE: NA ---------------------------------------------------------------------- Add the upcoming ConnectX-6 LX device ID. Fixes: 85327a9c ("net/mlx5: Update the list of the PCI supported devices") Signed-off-by: NShani Shapp <shanish@mellanox.com> Reviewed-by: NEran Ben Elisha <eranbe@mellanox.com> Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com> Signed-off-by: NMeiting Li <limeiting1@huawei.com>
-
由 Eran Ben Elisha 提交于
mainline inclusion from mainline-v5.1-rc1 commit 85327a9c category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I63MQP CVE: NA ---------------------------------------------------------------------- Add the upcoming ConnectX-6 Dx. In addition, add "ConnectX Family mlx5Gen Virtual Function" device ID. Every new HCA VF will be identified with this device ID. Different VF models will be distinguished by their revision id. Signed-off-by: NEran Ben Elisha <eranbe@mellanox.com> Reviewed-by: NAya Levin <ayal@mellanox.com> Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com> Signed-off-by: NMeiting Li <limeiting1@huawei.com>
-
由 Duoming Zhou 提交于
stable inclusion from stable-4.19.239 commit 753b9d220a7d36dac70e7c6d05492d10d6f9dd36 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I62KQZ CVE: CVE-2022-41858 -------------------------------- [ Upstream commit ec4eb8a8 ] When a slip driver is detaching, the slip_close() will act to cleanup necessary resources and sl->tty is set to NULL in slip_close(). Meanwhile, the packet we transmit is blocked, sl_tx_timeout() will be called. Although slip_close() and sl_tx_timeout() use sl->lock to synchronize, we don`t judge whether sl->tty equals to NULL in sl_tx_timeout() and the null pointer dereference bug will happen. (Thread 1) | (Thread 2) | slip_close() | spin_lock_bh(&sl->lock) | ... ... | sl->tty = NULL //(1) sl_tx_timeout() | spin_unlock_bh(&sl->lock) spin_lock(&sl->lock); | ... | ... tty_chars_in_buffer(sl->tty)| if (tty->ops->..) //(2) | ... | synchronize_rcu() We set NULL to sl->tty in position (1) and dereference sl->tty in position (2). This patch adds check in sl_tx_timeout(). If sl->tty equals to NULL, sl_tx_timeout() will goto out. Signed-off-by: NDuoming Zhou <duoming@zju.edu.cn> Reviewed-by: NJiri Slaby <jirislaby@kernel.org> Link: https://lore.kernel.org/r/20220405132206.55291-1-duoming@zju.edu.cnSigned-off-by: NJakub Kicinski <kuba@kernel.org> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NLiu Jian <liujian56@huawei.com> Reviewed-by: NYue Haibing <yuehaibing@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Dan Carpenter 提交于
stable inclusion from stable-v5.10.142 commit 19e3f69d19801940abc2ac37c169882769ed9770 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I62AX2 CVE: CVE-2022-4095 -------------------------------- _Read/Write_MACREG callbacks are NULL so the read/write_macreg_hdl() functions don't do anything except free the "pcmd" pointer. It results in a use after free. Delete them. Fixes: 2865d42c ("staging: r8712u: Add the new driver to the mainline kernel") Cc: stable <stable@kernel.org> Reported-by: NZheng Wang <hackerzheng666@gmail.com> Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/Yw4ASqkYcUhUfoY2@kiliSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NGuan Jing <guanjing6@huawei.com> Reviewed-by: NZhang Qiao <zhangqiao22@huawei.com> Reviewed-by: NChen Hui <judy.chenhui@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
- 27 11月, 2022 1 次提交
-
-
由 Feng Tang 提交于
hulk inclusion category: bugfix bugzilla: 187942, https://gitee.com/openeuler/kernel/issues/I5U037 CVE: NA ------------------------------- Commit b50db709 ("x86/tsc: Disable clocksource watchdog for TSC on qualified platorms") was introduced to solve problem that sometimes TSC clocksource is wrongly judged as unstable by watchdog like 'jiffies', HPET, etc. In it, the hardware socket number is a key factor for judging whether to disable the watchdog for TSC, and 'nr_online_nodes' was chosen as an estimation due to it is needed in early boot phase before registering 'tsc-early' clocksource, where all none-boot CPUs are not brought up yet. In recent patch review, Dave Hansen pointed out there are many cases that 'nr_online_nodes' could have issue, like: * numa emulation (numa=fake=4 etc.) * numa=off * platforms with CPU+DRAM nodes, CPU-less HBM nodes, CPU-less persistent memory nodes. Peter Zijlstra suggested to use logical package ids, but it is only usable after smp_init() and all CPUs are initialized. One solution is to skip the watchdog for 'tsc-early' clocksource, and move the check after smp_init(), while before 'tsc' clocksoure is registered, where topology_max_packages() could be used as a much more accurate socket number. Signed-off-by: NFeng Tang <feng.tang@intel.com> Conflict: arch/x86/kernel/tsc.c Signed-off-by: NYu Liao <liaoyu15@huawei.com> Reviewed-by: NXiongfeng Wang <wangxiongfeng2@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
- 26 11月, 2022 2 次提交
-
-
由 Xingui Yang 提交于
driver inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I62ZXO CVE: NA ------------------------------------------------ During the write I/O, when the SAS PHY switch is tested, the hardware may reports two CQs for one IO. the first cq indicates invalid port when DPH scheduling, the second cq indicates that response frame has been written to the memory but the I/O is ended abnormally due to I/O data underload. So set iptt aborted flag when receiving an abnormal CQ, then the host will discards the IPTT frame received from the SAS hard disk. Signed-off-by: NXingui Yang <yangxingui@huawei.com> Reviewed-by: Nkang fenglong <kangfenglong@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Luís Henriques 提交于
mainline inclusion from mainline-v6.0-rc7 commit 29a5b8a1 category: bugfix bugzilla: 187444, https://gitee.com/openeuler/kernel/issues/I6261Z CVE: NA -------------------------------- When walking through an inode extents, the ext4_ext_binsearch_idx() function assumes that the extent header has been previously validated. However, there are no checks that verify that the number of entries (eh->eh_entries) is non-zero when depth is > 0. And this will lead to problems because the EXT_FIRST_INDEX() and EXT_LAST_INDEX() will return garbage and result in this: [ 135.245946] ------------[ cut here ]------------ [ 135.247579] kernel BUG at fs/ext4/extents.c:2258! [ 135.249045] invalid opcode: 0000 [#1] PREEMPT SMP [ 135.250320] CPU: 2 PID: 238 Comm: tmp118 Not tainted 5.19.0-rc8+ #4 [ 135.252067] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b-rebuilt.opensuse.org 04/01/2014 [ 135.255065] RIP: 0010:ext4_ext_map_blocks+0xc20/0xcb0 [ 135.256475] Code: [ 135.261433] RSP: 0018:ffffc900005939f8 EFLAGS: 00010246 [ 135.262847] RAX: 0000000000000024 RBX: ffffc90000593b70 RCX: 0000000000000023 [ 135.264765] RDX: ffff8880038e5f10 RSI: 0000000000000003 RDI: ffff8880046e922c [ 135.266670] RBP: ffff8880046e9348 R08: 0000000000000001 R09: ffff888002ca580c [ 135.268576] R10: 0000000000002602 R11: 0000000000000000 R12: 0000000000000024 [ 135.270477] R13: 0000000000000000 R14: 0000000000000024 R15: 0000000000000000 [ 135.272394] FS: 00007fdabdc56740(0000) GS:ffff88807dd00000(0000) knlGS:0000000000000000 [ 135.274510] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 135.276075] CR2: 00007ffc26bd4f00 CR3: 0000000006261004 CR4: 0000000000170ea0 [ 135.277952] Call Trace: [ 135.278635] <TASK> [ 135.279247] ? preempt_count_add+0x6d/0xa0 [ 135.280358] ? percpu_counter_add_batch+0x55/0xb0 [ 135.281612] ? _raw_read_unlock+0x18/0x30 [ 135.282704] ext4_map_blocks+0x294/0x5a0 [ 135.283745] ? xa_load+0x6f/0xa0 [ 135.284562] ext4_mpage_readpages+0x3d6/0x770 [ 135.285646] read_pages+0x67/0x1d0 [ 135.286492] ? folio_add_lru+0x51/0x80 [ 135.287441] page_cache_ra_unbounded+0x124/0x170 [ 135.288510] filemap_get_pages+0x23d/0x5a0 [ 135.289457] ? path_openat+0xa72/0xdd0 [ 135.290332] filemap_read+0xbf/0x300 [ 135.291158] ? _raw_spin_lock_irqsave+0x17/0x40 [ 135.292192] new_sync_read+0x103/0x170 [ 135.293014] vfs_read+0x15d/0x180 [ 135.293745] ksys_read+0xa1/0xe0 [ 135.294461] do_syscall_64+0x3c/0x80 [ 135.295284] entry_SYSCALL_64_after_hwframe+0x46/0xb0 This patch simply adds an extra check in __ext4_ext_check(), verifying that eh_entries is not 0 when eh_depth is > 0. Link: https://bugzilla.kernel.org/show_bug.cgi?id=215941 Link: https://bugzilla.kernel.org/show_bug.cgi?id=216283 Cc: Baokun Li <libaokun1@huawei.com> Cc: stable@kernel.org Signed-off-by: NLuís Henriques <lhenriques@suse.de> Reviewed-by: NJan Kara <jack@suse.cz> Reviewed-by: NBaokun Li <libaokun1@huawei.com> Link: https://lore.kernel.org/r/20220822094235.2690-1-lhenriques@suse.deSigned-off-by: NTheodore Ts'o <tytso@mit.edu> Signed-off-by: NBaokun Li <libaokun1@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
- 21 11月, 2022 1 次提交
-
-
由 Wang Wensheng 提交于
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I61RA3 CVE: NA ------------------------------- The following two ioctl commands are not in used at all. Delete those implementation. SVM_IOCTL_SET_RC SVM_IOCTL_REMAP_PROC Signed-off-by: NWang Wensheng <wangwensheng4@huawei.com> Reviewed-by: Nchenweilong <chenweilong@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
- 19 11月, 2022 4 次提交
-
-
由 Yuyao Lin 提交于
Offering: HULK hulk inclusion category: bugfix bugzilla: 186339 -------------------------------- This reverts commit 098b0e01. Function timespec64_to_ns() Add the upper and lower limits check in commit c0b52741 ("time: Prevent undefined behaviour in timespec64_to_ns()"), timespec64_to_ktime() only check the upper limits, so revert this patch can fix overflow. Signed-off-by: NYuyao Lin <linyuyao1@huawei.com> Reviewed-by: NWei Li <liwei391@huawei.com> Signed-off-by: NLaibin Qiu <qiulaibin@huawei.com>
-
由 Shin'ichiro Kawasaki 提交于
mainline inclusion from mainline-v5.18-rc1 commit 572299f0 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I61Z7G CVE: NA -------------------------------- When IO requests are made continuously and the target block device handles requests faster than request arrival, the request dispatch loop keeps on repeating to dispatch the arriving requests very long time, more than a minute. Since the loop runs as a workqueue worker task, the very long loop duration triggers workqueue watchdog timeout and BUG [1]. To avoid the very long loop duration, break the loop periodically. When opportunity to dispatch requests still exists, check need_resched(). If need_resched() returns true, the dispatch loop already consumed its time slice, then reschedule the dispatch work and break the loop. With heavy IO load, need_resched() does not return true for 20~30 seconds. To cover such case, check time spent in the dispatch loop with jiffies. If more than 1 second is spent, reschedule the dispatch work and break the loop. [1] [ 609.691437] BUG: workqueue lockup - pool cpus=10 node=1 flags=0x0 nice=-20 stuck for 35s! [ 609.701820] Showing busy workqueues and worker pools: [ 609.707915] workqueue events: flags=0x0 [ 609.712615] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256 refcnt=2 [ 609.712626] pending: drm_fb_helper_damage_work [drm_kms_helper] [ 609.712687] workqueue events_freezable: flags=0x4 [ 609.732943] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256 refcnt=2 [ 609.732952] pending: pci_pme_list_scan [ 609.732968] workqueue events_power_efficient: flags=0x80 [ 609.751947] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256 refcnt=2 [ 609.751955] pending: neigh_managed_work [ 609.752018] workqueue kblockd: flags=0x18 [ 609.769480] pwq 21: cpus=10 node=1 flags=0x0 nice=-20 active=3/256 refcnt=4 [ 609.769488] in-flight: 1020:blk_mq_run_work_fn [ 609.769498] pending: blk_mq_timeout_work, blk_mq_run_work_fn [ 609.769744] pool 21: cpus=10 node=1 flags=0x0 nice=-20 hung=35s workers=2 idle: 67 [ 639.899730] BUG: workqueue lockup - pool cpus=10 node=1 flags=0x0 nice=-20 stuck for 66s! [ 639.909513] Showing busy workqueues and worker pools: [ 639.915404] workqueue events: flags=0x0 [ 639.920197] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256 refcnt=2 [ 639.920215] pending: drm_fb_helper_damage_work [drm_kms_helper] [ 639.920365] workqueue kblockd: flags=0x18 [ 639.939932] pwq 21: cpus=10 node=1 flags=0x0 nice=-20 active=3/256 refcnt=4 [ 639.939942] in-flight: 1020:blk_mq_run_work_fn [ 639.939955] pending: blk_mq_timeout_work, blk_mq_run_work_fn [ 639.940212] pool 21: cpus=10 node=1 flags=0x0 nice=-20 hung=66s workers=2 idle: 67 Fixes: 6e6fcbc2 ("blk-mq: support batching dispatch in case of io") Signed-off-by: NShin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Cc: stable@vger.kernel.org # v5.10+ Link: https://lore.kernel.org/linux-block/20220310091649.zypaem5lkyfadymg@shindev/ Link: https://lore.kernel.org/r/20220318022641.133484-1-shinichiro.kawasaki@wdc.comSigned-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Luiz Augusto von Dentz 提交于
mainline inclusion from mainline-v6.1-rc4 commit 711f8c3f category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I5ZNPH CVE: CVE-2022-42896 -------------------------------- The Bluetooth spec states that the valid range for SPSM is from 0x0001-0x00ff so it is invalid to accept values outside of this range: BLUETOOTH CORE SPECIFICATION Version 5.3 | Vol 3, Part A page 1059: Table 4.15: L2CAP_LE_CREDIT_BASED_CONNECTION_REQ SPSM ranges CVE: CVE-2022-42896 CC: stable@vger.kernel.org Reported-by: NTamás Koczka <poprdi@google.com> Signed-off-by: NLuiz Augusto von Dentz <luiz.von.dentz@intel.com> Reviewed-by: NTedd Ho-Jeong An <tedd.an@intel.com> Conflicts: net/bluetooth/l2cap_core.c Signed-off-by: NZiyang Xuan <william.xuanziyang@huawei.com> Reviewed-by: NYue Haibing <yuehaibing@huawei.com> Reviewed-by: NLiu Jian <liujian56@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Luiz Augusto von Dentz 提交于
stable inclusion from stable-v4.19.265 commit 36919a82f335784d86b4def308739559bb47943d category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I5ZNRS CVE: CVE-2022-42895 -------------------------------- commit b1a2cd50 upstream. On l2cap_parse_conf_req the variable efs is only initialized if remote_efs has been set. CVE: CVE-2022-42895 CC: stable@vger.kernel.org Reported-by: NTamás Koczka <poprdi@google.com> Signed-off-by: NLuiz Augusto von Dentz <luiz.von.dentz@intel.com> Reviewed-by: NTedd Ho-Jeong An <tedd.an@intel.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NBaisong Zhong <zhongbaisong@huawei.com> Reviewed-by: NLiu Jian <liujian56@huawei.com> Reviewed-by: NYue Haibing <yuehaibing@huawei.com> Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
- 15 11月, 2022 1 次提交
-
-
由 Li Lingfeng 提交于
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I60QE9 CVE: NA -------------------------------- As explained in 0eb44012 ("block: fix use after free for bd_holder_dir"), we should make sure the "disk" is still live and then grab a reference to 'bd_holder_dir'. However, the "disk" should be "the claimed slave bdev" rather than "the holding disk". Fixes: 0eb44012 ("block: fix use after free for bd_holder_dir") Signed-off-by: NLi Lingfeng <lilingfeng3@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
- 14 11月, 2022 2 次提交
-
-
由 Yu Kuai 提交于
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I60QE9 CVE: NA -------------------------------- Currently, the caller of bd_link_disk_holer() get 'bdev' by blkdev_get_by_dev(), which will look up 'bdev' by inode number 'dev'. Howerver, it's possible that del_gendisk() can be called currently, and 'bd_holder_dir' can be freed before bd_link_disk_holer() access it, thus use after free is triggered. t1: t2: bdev = blkdev_get_by_dev del_gendisk kobject_put(bd_holder_dir) kobject_free() bd_link_disk_holder Fix the problem by checking disk is still live and grabbing a reference to 'bd_holder_dir' first in bd_link_disk_holder(). Link: https://lore.kernel.org/all/20221103025541.1875809-3-yukuai1@huaweicloud.com/Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Yu Kuai 提交于
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I60QE9 CVE: NA -------------------------------- This reverts commit 01b1ec1d. Official solution will be applied to mainline, and this solution can't fix uaf for 'bd_holder_dir' thoroughly. hence revert this temporary solution. Officail solution will be backported in the next patch. Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
- 08 11月, 2022 22 次提交
-
-
由 Randy Dunlap 提交于
stable inclusion from stable-4.19.238 commit c7daf1b4ad809692d5c26f33c02ed8a031066548 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5X41F CVE: NA -------------------------------- [ Upstream commit f9a40b08 ] initcall_blacklist() should return 1 to indicate that it handled its cmdline arguments. set_debug_rodata() should return 1 to indicate that it handled its cmdline arguments. Print a warning if the option string is invalid. This prevents these strings from being added to the 'init' program's environment as they are not init arguments/parameters. Link: https://lkml.kernel.org/r/20220221050901.23985-1-rdunlap@infradead.orgSigned-off-by: NRandy Dunlap <rdunlap@infradead.org> Reported-by: NIgor Zhbanov <i.zhbanov@omprussia.ru> Cc: Ingo Molnar <mingo@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NLin Yujun <linyujun809@huawei.com> Reviewed-by: NZhang Jianhua <chris.zjh@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Pawan Gupta 提交于
stable inclusion from stable-4.19.238 commit c7daf1b4ad809692d5c26f33c02ed8a031066548 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5X41F CVE: NA -------------------------------- commit 73924ec4 upstream. The mechanism to save/restore MSRs during S3 suspend/resume checks for the MSR validity during suspend, and only restores the MSR if its a valid MSR. This is not optimal, as an invalid MSR will unnecessarily throw an exception for every suspend cycle. The more invalid MSRs, higher the impact will be. Check and save the MSR validity at setup. This ensures that only valid MSRs that are guaranteed to not throw an exception will be attempted during suspend. Fixes: 7a9c2dd0 ("x86/pm: Introduce quirk framework to save/restore extra MSR registers around suspend/resume") Suggested-by: NDave Hansen <dave.hansen@linux.intel.com> Signed-off-by: NPawan Gupta <pawan.kumar.gupta@linux.intel.com> Reviewed-by: NDave Hansen <dave.hansen@linux.intel.com> Acked-by: NBorislav Petkov <bp@suse.de> Cc: stable@vger.kernel.org Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NLin Yujun <linyujun809@huawei.com> Reviewed-by: NZhang Jianhua <chris.zjh@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Pawan Gupta 提交于
stable inclusion from stable-4.19.238 commit c7daf1b4ad809692d5c26f33c02ed8a031066548 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5X41F CVE: NA -------------------------------- commit e2a1256b upstream. After resuming from suspend-to-RAM, the MSRs that control CPU's speculative execution behavior are not being restored on the boot CPU. These MSRs are used to mitigate speculative execution vulnerabilities. Not restoring them correctly may leave the CPU vulnerable. Secondary CPU's MSRs are correctly being restored at S3 resume by identify_secondary_cpu(). During S3 resume, restore these MSRs for boot CPU when restoring its processor state. Fixes: 77243971 ("x86/bugs/intel: Set proper CPU features and setup RDS") Reported-by: NNeelima Krishnan <neelima.krishnan@intel.com> Signed-off-by: NPawan Gupta <pawan.kumar.gupta@linux.intel.com> Tested-by: NNeelima Krishnan <neelima.krishnan@intel.com> Acked-by: NBorislav Petkov <bp@suse.de> Reviewed-by: NDave Hansen <dave.hansen@linux.intel.com> Cc: stable@vger.kernel.org Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NLin Yujun <linyujun809@huawei.com> Reviewed-by: NZhang Jianhua <chris.zjh@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Borislav Petkov 提交于
stable inclusion from stable-4.19.238 commit c7daf1b4ad809692d5c26f33c02ed8a031066548 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5X41F CVE: NA -------------------------------- commit f9e14dbb upstream. When resuming from system sleep state, restore_processor_state() restores the boot CPU MSRs. These MSRs could be emulated by microcode. If microcode is not loaded yet, writing to emulated MSRs leads to unchecked MSR access error: ... PM: Calling lapic_suspend+0x0/0x210 unchecked MSR access error: WRMSR to 0x10f (tried to write 0x0...0) at rIP: ... (native_write_msr) Call Trace: <TASK> ? restore_processor_state x86_acpi_suspend_lowlevel acpi_suspend_enter suspend_devices_and_enter pm_suspend.cold state_store kobj_attr_store sysfs_kf_write kernfs_fop_write_iter new_sync_write vfs_write ksys_write __x64_sys_write do_syscall_64 entry_SYSCALL_64_after_hwframe RIP: 0033:0x7fda13c260a7 To ensure microcode emulated MSRs are available for restoration, load the microcode on the boot CPU before restoring these MSRs. [ Pawan: write commit message and productize it. ] Fixes: e2a1256b ("x86/speculation: Restore speculation related MSRs during S3 resume") Reported-by: NKyle D. Pelton <kyle.d.pelton@intel.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NPawan Gupta <pawan.kumar.gupta@linux.intel.com> Tested-by: NKyle D. Pelton <kyle.d.pelton@intel.com> Cc: stable@vger.kernel.org Link: https://bugzilla.kernel.org/show_bug.cgi?id=215841 Link: https://lore.kernel.org/r/4350dfbf785cd482d3fafa72b2b49c83102df3ce.1650386317.git.pawan.kumar.gupta@linux.intel.comSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NLin Yujun <linyujun809@huawei.com> Reviewed-by: NZhang Jianhua <chris.zjh@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Thomas Pfaff 提交于
stable inclusion from stable-4.19.238 commit c7daf1b4ad809692d5c26f33c02ed8a031066548 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5X41F CVE: NA -------------------------------- commit 8707898e upstream. A kernel hang can be observed when running setserial in a loop on a kernel with force threaded interrupts. The sequence of events is: setserial open("/dev/ttyXXX") request_irq() do_stuff() -> serial interrupt -> wake(irq_thread) desc->threads_active++; close() free_irq() kthread_stop(irq_thread) synchronize_irq() <- hangs because desc->threads_active != 0 The thread is created in request_irq() and woken up, but does not get on a CPU to reach the actual thread function, which would handle the pending wake-up. kthread_stop() sets the should stop condition which makes the thread immediately exit, which in turn leaves the stale threads_active count around. This problem was introduced with commit 519cc865, which addressed a interrupt sharing issue in the PCIe code. Before that commit free_irq() invoked synchronize_irq(), which waits for the hard interrupt handler and also for associated threads to complete. To address the PCIe issue synchronize_irq() was replaced with __synchronize_hardirq(), which only waits for the hard interrupt handler to complete, but not for threaded handlers. This was done under the assumption, that the interrupt thread already reached the thread function and waits for a wake-up, which is guaranteed to be handled before acting on the stop condition. The problematic case, that the thread would not reach the thread function, was obviously overlooked. Make sure that the interrupt thread is really started and reaches thread_fn() before returning from __setup_irq(). This utilizes the existing wait queue in the interrupt descriptor. The wait queue is unused for non-shared interrupts. For shared interrupts the usage might cause a spurious wake-up of a waiter in synchronize_irq() or the completion of a threaded handler might cause a spurious wake-up of the waiter for the ready flag. Both are harmless and have no functional impact. [ tglx: Amended changelog ] Fixes: 519cc865 ("genirq: Synchronize only with single thread on free_irq()") Signed-off-by: NThomas Pfaff <tpfaff@pcs.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NMarc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/552fe7b4-9224-b183-bb87-a8f36d335690@pcs.comSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NLin Yujun <linyujun809@huawei.com> Reviewed-by: NZhang Jianhua <chris.zjh@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Michael Kelley 提交于
stable inclusion from stable-v4.19.261 commit 5f7fd71e5bebf337769f20dd125822ce63266e4d category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZXGL CVE: NA -------------------------------- [ Upstream commit c292a337 ] The IOC_PR_CLEAR and IOC_PR_RELEASE ioctls are non-functional on NVMe devices because the nvme_pr_clear() and nvme_pr_release() functions set the IEKEY field incorrectly. The IEKEY field should be set only when the key is zero (i.e, not specified). The current code does it backwards. Furthermore, the NVMe spec describes the persistent reservation "clear" function as an option on the reservation release command. The current implementation of nvme_pr_clear() erroneously uses the reservation register command. Fix these errors. Note that NVMe version 1.3 and later specify that setting the IEKEY field will return an error of Invalid Field in Command. The fix will set IEKEY when the key is zero, which is appropriate as these ioctls consider a zero key to be "unspecified", and the intention of the spec change is to require a valid key. Tested on a version 1.4 PCI NVMe device in an Azure VM. Fixes: 1673f1f0 ("nvme: move block_device_operations and ns/ctrl freeing to common code") Fixes: 1d277a63 ("NVMe: Add persistent reservation ops") Signed-off-by: NMichael Kelley <mikelley@microsoft.com> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NSasha Levin <sashal@kernel.org> Conflicts: drivers/nvme/host/core.c Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Eric Dumazet 提交于
stable inclusion from stable-v4.19.262 commit f5686a03b138f6330eeda082ee4f96c8109f56f3 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZXGL CVE: NA -------------------------------- [ Upstream commit 62c07983 ] Christophe Leroy reported a ~80ms latency spike happening at first TCP connect() time. This is because __inet_hash_connect() uses get_random_once() to populate a perturbation table which became quite big after commit 4c2c8f03 ("tcp: increase source port perturb table to 2^16") get_random_once() uses DO_ONCE(), which block hard irqs for the duration of the operation. This patch adds DO_ONCE_SLOW() which uses a mutex instead of a spinlock for operations where we prefer to stay in process context. Then __inet_hash_connect() can use get_random_slow_once() to populate its perturbation table. Fixes: 4c2c8f03 ("tcp: increase source port perturb table to 2^16") Fixes: 190cc824 ("tcp: change source port randomizarion at connect() time") Reported-by: NChristophe Leroy <christophe.leroy@csgroup.eu> Link: https://lore.kernel.org/netdev/CANn89iLAEYBaoYajy0Y9UmGFff5GPxDUoG-ErVB2jDdRNQ5Tug@mail.gmail.com/T/#tSigned-off-by: NEric Dumazet <edumazet@google.com> Cc: Willy Tarreau <w@1wt.eu> Tested-by: NChristophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NSasha Levin <sashal@kernel.org> One conflict occurs because the commit 4c2c8f03 ("tcp: increase source port perturb table to 2^16") is integrated but the commit e9261476 ("tcp: dynamically allocate the perturb table used by source port") is not integrated. One conflict occurs because the commit 1027b96e ("once: Fix panic when module unload") is not integrated. Conflicts: net/ipv4/inet_hashtables.c lib/once.c Signed-off-by: NLiu Jian <liujian56@huawei.com> Reviewed-by: NYue Haibing <yuehaibing@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Eric Dumazet 提交于
stable inclusion from stable-v4.19.262 commit 75a578000ae5e511e5d0e8433c94a14d9c99c412 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZXGL CVE: NA -------------------------------- commit 8f905c0e upstream. syzbot reported various issues around early demux, one being included in this changelog [1] sk->sk_rx_dst is using RCU protection without clearly documenting it. And following sequences in tcp_v4_do_rcv()/tcp_v6_do_rcv() are not following standard RCU rules. [a] dst_release(dst); [b] sk->sk_rx_dst = NULL; They look wrong because a delete operation of RCU protected pointer is supposed to clear the pointer before the call_rcu()/synchronize_rcu() guarding actual memory freeing. In some cases indeed, dst could be freed before [b] is done. We could cheat by clearing sk_rx_dst before calling dst_release(), but this seems the right time to stick to standard RCU annotations and debugging facilities. [1] BUG: KASAN: use-after-free in dst_check include/net/dst.h:470 [inline] BUG: KASAN: use-after-free in tcp_v4_early_demux+0x95b/0x960 net/ipv4/tcp_ipv4.c:1792 Read of size 2 at addr ffff88807f1cb73a by task syz-executor.5/9204 CPU: 0 PID: 9204 Comm: syz-executor.5 Not tainted 5.16.0-rc5-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106 print_address_description.constprop.0.cold+0x8d/0x320 mm/kasan/report.c:247 __kasan_report mm/kasan/report.c:433 [inline] kasan_report.cold+0x83/0xdf mm/kasan/report.c:450 dst_check include/net/dst.h:470 [inline] tcp_v4_early_demux+0x95b/0x960 net/ipv4/tcp_ipv4.c:1792 ip_rcv_finish_core.constprop.0+0x15de/0x1e80 net/ipv4/ip_input.c:340 ip_list_rcv_finish.constprop.0+0x1b2/0x6e0 net/ipv4/ip_input.c:583 ip_sublist_rcv net/ipv4/ip_input.c:609 [inline] ip_list_rcv+0x34e/0x490 net/ipv4/ip_input.c:644 __netif_receive_skb_list_ptype net/core/dev.c:5508 [inline] __netif_receive_skb_list_core+0x549/0x8e0 net/core/dev.c:5556 __netif_receive_skb_list net/core/dev.c:5608 [inline] netif_receive_skb_list_internal+0x75e/0xd80 net/core/dev.c:5699 gro_normal_list net/core/dev.c:5853 [inline] gro_normal_list net/core/dev.c:5849 [inline] napi_complete_done+0x1f1/0x880 net/core/dev.c:6590 virtqueue_napi_complete drivers/net/virtio_net.c:339 [inline] virtnet_poll+0xca2/0x11b0 drivers/net/virtio_net.c:1557 __napi_poll+0xaf/0x440 net/core/dev.c:7023 napi_poll net/core/dev.c:7090 [inline] net_rx_action+0x801/0xb40 net/core/dev.c:7177 __do_softirq+0x29b/0x9c2 kernel/softirq.c:558 invoke_softirq kernel/softirq.c:432 [inline] __irq_exit_rcu+0x123/0x180 kernel/softirq.c:637 irq_exit_rcu+0x5/0x20 kernel/softirq.c:649 common_interrupt+0x52/0xc0 arch/x86/kernel/irq.c:240 asm_common_interrupt+0x1e/0x40 arch/x86/include/asm/idtentry.h:629 RIP: 0033:0x7f5e972bfd57 Code: 39 d1 73 14 0f 1f 80 00 00 00 00 48 8b 50 f8 48 83 e8 08 48 39 ca 77 f3 48 39 c3 73 3e 48 89 13 48 8b 50 f8 48 89 38 49 8b 0e <48> 8b 3e 48 83 c3 08 48 83 c6 08 eb bc 48 39 d1 72 9e 48 39 d0 73 RSP: 002b:00007fff8a413210 EFLAGS: 00000283 RAX: 00007f5e97108990 RBX: 00007f5e97108338 RCX: ffffffff81d3aa45 RDX: ffffffff81d3aa45 RSI: 00007f5e97108340 RDI: ffffffff81d3aa45 RBP: 00007f5e97107eb8 R08: 00007f5e97108d88 R09: 0000000093c2e8d9 R10: 0000000000000000 R11: 0000000000000000 R12: 00007f5e97107eb0 R13: 00007f5e97108338 R14: 00007f5e97107ea8 R15: 0000000000000019 </TASK> Allocated by task 13: kasan_save_stack+0x1e/0x50 mm/kasan/common.c:38 kasan_set_track mm/kasan/common.c:46 [inline] set_alloc_info mm/kasan/common.c:434 [inline] __kasan_slab_alloc+0x90/0xc0 mm/kasan/common.c:467 kasan_slab_alloc include/linux/kasan.h:259 [inline] slab_post_alloc_hook mm/slab.h:519 [inline] slab_alloc_node mm/slub.c:3234 [inline] slab_alloc mm/slub.c:3242 [inline] kmem_cache_alloc+0x202/0x3a0 mm/slub.c:3247 dst_alloc+0x146/0x1f0 net/core/dst.c:92 rt_dst_alloc+0x73/0x430 net/ipv4/route.c:1613 ip_route_input_slow+0x1817/0x3a20 net/ipv4/route.c:2340 ip_route_input_rcu net/ipv4/route.c:2470 [inline] ip_route_input_noref+0x116/0x2a0 net/ipv4/route.c:2415 ip_rcv_finish_core.constprop.0+0x288/0x1e80 net/ipv4/ip_input.c:354 ip_list_rcv_finish.constprop.0+0x1b2/0x6e0 net/ipv4/ip_input.c:583 ip_sublist_rcv net/ipv4/ip_input.c:609 [inline] ip_list_rcv+0x34e/0x490 net/ipv4/ip_input.c:644 __netif_receive_skb_list_ptype net/core/dev.c:5508 [inline] __netif_receive_skb_list_core+0x549/0x8e0 net/core/dev.c:5556 __netif_receive_skb_list net/core/dev.c:5608 [inline] netif_receive_skb_list_internal+0x75e/0xd80 net/core/dev.c:5699 gro_normal_list net/core/dev.c:5853 [inline] gro_normal_list net/core/dev.c:5849 [inline] napi_complete_done+0x1f1/0x880 net/core/dev.c:6590 virtqueue_napi_complete drivers/net/virtio_net.c:339 [inline] virtnet_poll+0xca2/0x11b0 drivers/net/virtio_net.c:1557 __napi_poll+0xaf/0x440 net/core/dev.c:7023 napi_poll net/core/dev.c:7090 [inline] net_rx_action+0x801/0xb40 net/core/dev.c:7177 __do_softirq+0x29b/0x9c2 kernel/softirq.c:558 Freed by task 13: kasan_save_stack+0x1e/0x50 mm/kasan/common.c:38 kasan_set_track+0x21/0x30 mm/kasan/common.c:46 kasan_set_free_info+0x20/0x30 mm/kasan/generic.c:370 ____kasan_slab_free mm/kasan/common.c:366 [inline] ____kasan_slab_free mm/kasan/common.c:328 [inline] __kasan_slab_free+0xff/0x130 mm/kasan/common.c:374 kasan_slab_free include/linux/kasan.h:235 [inline] slab_free_hook mm/slub.c:1723 [inline] slab_free_freelist_hook+0x8b/0x1c0 mm/slub.c:1749 slab_free mm/slub.c:3513 [inline] kmem_cache_free+0xbd/0x5d0 mm/slub.c:3530 dst_destroy+0x2d6/0x3f0 net/core/dst.c:127 rcu_do_batch kernel/rcu/tree.c:2506 [inline] rcu_core+0x7ab/0x1470 kernel/rcu/tree.c:2741 __do_softirq+0x29b/0x9c2 kernel/softirq.c:558 Last potentially related work creation: kasan_save_stack+0x1e/0x50 mm/kasan/common.c:38 __kasan_record_aux_stack+0xf5/0x120 mm/kasan/generic.c:348 __call_rcu kernel/rcu/tree.c:2985 [inline] call_rcu+0xb1/0x740 kernel/rcu/tree.c:3065 dst_release net/core/dst.c:177 [inline] dst_release+0x79/0xe0 net/core/dst.c:167 tcp_v4_do_rcv+0x612/0x8d0 net/ipv4/tcp_ipv4.c:1712 sk_backlog_rcv include/net/sock.h:1030 [inline] __release_sock+0x134/0x3b0 net/core/sock.c:2768 release_sock+0x54/0x1b0 net/core/sock.c:3300 tcp_sendmsg+0x36/0x40 net/ipv4/tcp.c:1441 inet_sendmsg+0x99/0xe0 net/ipv4/af_inet.c:819 sock_sendmsg_nosec net/socket.c:704 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:724 sock_write_iter+0x289/0x3c0 net/socket.c:1057 call_write_iter include/linux/fs.h:2162 [inline] new_sync_write+0x429/0x660 fs/read_write.c:503 vfs_write+0x7cd/0xae0 fs/read_write.c:590 ksys_write+0x1ee/0x250 fs/read_write.c:643 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae The buggy address belongs to the object at ffff88807f1cb700 which belongs to the cache ip_dst_cache of size 176 The buggy address is located 58 bytes inside of 176-byte region [ffff88807f1cb700, ffff88807f1cb7b0) The buggy address belongs to the page: page:ffffea0001fc72c0 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x7f1cb flags: 0xfff00000000200(slab|node=0|zone=1|lastcpupid=0x7ff) raw: 00fff00000000200 dead000000000100 dead000000000122 ffff8881413bb780 raw: 0000000000000000 0000000000100010 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected page_owner tracks the page as allocated page last allocated via order 0, migratetype Unmovable, gfp_mask 0x112a20(GFP_ATOMIC|__GFP_NOWARN|__GFP_NORETRY|__GFP_HARDWALL), pid 5, ts 108466983062, free_ts 108048976062 prep_new_page mm/page_alloc.c:2418 [inline] get_page_from_freelist+0xa72/0x2f50 mm/page_alloc.c:4149 __alloc_pages+0x1b2/0x500 mm/page_alloc.c:5369 alloc_pages+0x1a7/0x300 mm/mempolicy.c:2191 alloc_slab_page mm/slub.c:1793 [inline] allocate_slab mm/slub.c:1930 [inline] new_slab+0x32d/0x4a0 mm/slub.c:1993 ___slab_alloc+0x918/0xfe0 mm/slub.c:3022 __slab_alloc.constprop.0+0x4d/0xa0 mm/slub.c:3109 slab_alloc_node mm/slub.c:3200 [inline] slab_alloc mm/slub.c:3242 [inline] kmem_cache_alloc+0x35c/0x3a0 mm/slub.c:3247 dst_alloc+0x146/0x1f0 net/core/dst.c:92 rt_dst_alloc+0x73/0x430 net/ipv4/route.c:1613 __mkroute_output net/ipv4/route.c:2564 [inline] ip_route_output_key_hash_rcu+0x921/0x2d00 net/ipv4/route.c:2791 ip_route_output_key_hash+0x18b/0x300 net/ipv4/route.c:2619 __ip_route_output_key include/net/route.h:126 [inline] ip_route_output_flow+0x23/0x150 net/ipv4/route.c:2850 ip_route_output_key include/net/route.h:142 [inline] geneve_get_v4_rt+0x3a6/0x830 drivers/net/geneve.c:809 geneve_xmit_skb drivers/net/geneve.c:899 [inline] geneve_xmit+0xc4a/0x3540 drivers/net/geneve.c:1082 __netdev_start_xmit include/linux/netdevice.h:4994 [inline] netdev_start_xmit include/linux/netdevice.h:5008 [inline] xmit_one net/core/dev.c:3590 [inline] dev_hard_start_xmit+0x1eb/0x920 net/core/dev.c:3606 __dev_queue_xmit+0x299a/0x3650 net/core/dev.c:4229 page last free stack trace: reset_page_owner include/linux/page_owner.h:24 [inline] free_pages_prepare mm/page_alloc.c:1338 [inline] free_pcp_prepare+0x374/0x870 mm/page_alloc.c:1389 free_unref_page_prepare mm/page_alloc.c:3309 [inline] free_unref_page+0x19/0x690 mm/page_alloc.c:3388 qlink_free mm/kasan/quarantine.c:146 [inline] qlist_free_all+0x5a/0xc0 mm/kasan/quarantine.c:165 kasan_quarantine_reduce+0x180/0x200 mm/kasan/quarantine.c:272 __kasan_slab_alloc+0xa2/0xc0 mm/kasan/common.c:444 kasan_slab_alloc include/linux/kasan.h:259 [inline] slab_post_alloc_hook mm/slab.h:519 [inline] slab_alloc_node mm/slub.c:3234 [inline] kmem_cache_alloc_node+0x255/0x3f0 mm/slub.c:3270 __alloc_skb+0x215/0x340 net/core/skbuff.c:414 alloc_skb include/linux/skbuff.h:1126 [inline] alloc_skb_with_frags+0x93/0x620 net/core/skbuff.c:6078 sock_alloc_send_pskb+0x783/0x910 net/core/sock.c:2575 mld_newpack+0x1df/0x770 net/ipv6/mcast.c:1754 add_grhead+0x265/0x330 net/ipv6/mcast.c:1857 add_grec+0x1053/0x14e0 net/ipv6/mcast.c:1995 mld_send_initial_cr.part.0+0xf6/0x230 net/ipv6/mcast.c:2242 mld_send_initial_cr net/ipv6/mcast.c:1232 [inline] mld_dad_work+0x1d3/0x690 net/ipv6/mcast.c:2268 process_one_work+0x9b2/0x1690 kernel/workqueue.c:2298 worker_thread+0x658/0x11f0 kernel/workqueue.c:2445 Memory state around the buggy address: ffff88807f1cb600: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff88807f1cb680: fb fb fb fb fb fb fc fc fc fc fc fc fc fc fc fc >ffff88807f1cb700: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff88807f1cb780: fb fb fb fb fb fb fc fc fc fc fc fc fc fc fc fc ffff88807f1cb800: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb Fixes: 41063e9d ("ipv4: Early TCP socket demux.") Signed-off-by: NEric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20211220143330.680945-1-eric.dumazet@gmail.comSigned-off-by: NJakub Kicinski <kuba@kernel.org> [cmllamas: fixed trivial merge conflict] Signed-off-by: NCarlos Llamas <cmllamas@google.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Conflicts: net/ipv4/af_inet.c Signed-off-by: NDong Chenchen <dongchenchen2@huawei.com> Reviewed-by: NYue Haibing <yuehaibing@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Jerry Lee 李修賢 提交于
stable inclusion from stable-v4.19.262 commit f2180ad6a43501597d20eacad0c6f146c51d4bbd category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZXGL CVE: NA -------------------------------- commit df3cb754 upstream. When expanding a file system from (16TiB-2MiB) to 18TiB, the operation exits early which leads to result inconsistency between resize2fs and Ext4 kernel driver. === before === ○ → resize2fs /dev/mapper/thin resize2fs 1.45.5 (07-Jan-2020) Filesystem at /dev/mapper/thin is mounted on /mnt/test; on-line resizing required old_desc_blocks = 2048, new_desc_blocks = 2304 The filesystem on /dev/mapper/thin is now 4831837696 (4k) blocks long. [ 865.186308] EXT4-fs (dm-5): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none. [ 912.091502] dm-4: detected capacity change from 34359738368 to 38654705664 [ 970.030550] dm-5: detected capacity change from 34359734272 to 38654701568 [ 1000.012751] EXT4-fs (dm-5): resizing filesystem from 4294966784 to 4831837696 blocks [ 1000.012878] EXT4-fs (dm-5): resized filesystem to 4294967296 === after === [ 129.104898] EXT4-fs (dm-5): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none. [ 143.773630] dm-4: detected capacity change from 34359738368 to 38654705664 [ 198.203246] dm-5: detected capacity change from 34359734272 to 38654701568 [ 207.918603] EXT4-fs (dm-5): resizing filesystem from 4294966784 to 4831837696 blocks [ 207.918754] EXT4-fs (dm-5): resizing filesystem from 4294967296 to 4831837696 blocks [ 207.918758] EXT4-fs (dm-5): Converting file system to meta_bg [ 207.918790] EXT4-fs (dm-5): resizing filesystem from 4294967296 to 4831837696 blocks [ 221.454050] EXT4-fs (dm-5): resized to 4658298880 blocks [ 227.634613] EXT4-fs (dm-5): resized filesystem to 4831837696 Signed-off-by: NJerry Lee <jerrylee@qnap.com> Link: https://lore.kernel.org/r/PU1PR04MB22635E739BD21150DC182AC6A18C9@PU1PR04MB2263.apcprd04.prod.outlook.comSigned-off-by: NTheodore Ts'o <tytso@mit.edu> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Keith Busch 提交于
stable inclusion from stable-v4.19.262 commit 366a2b3110c69f919fb3277acc1a0bb8cd8a8dbd category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZXGL CVE: NA -------------------------------- [ Upstream commit a8eb6c1b ] The firmware revision can change on after a reset so copy the most recent info each time instead of just the first time, otherwise the sysfs firmware_rev entry may contain stale data. Reported-by: NJeff Lien <jeff.lien@wdc.com> Signed-off-by: NKeith Busch <kbusch@kernel.org> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NChaitanya Kulkarni <kch@nvidia.com> Reviewed-by: NChao Leng <lengchao@huawei.com> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Liu Jian 提交于
stable inclusion from stable-v4.19.262 commit 5fe03917bb017d9af68a95f989f1c122eebc69a6 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZXGL CVE: NA -------------------------------- [ Upstream commit 3f8ef65a ] Fixes the below NULL pointer dereference: [...] [ 14.471200] Call Trace: [ 14.471562] <TASK> [ 14.471882] lock_acquire+0x245/0x2e0 [ 14.472416] ? remove_wait_queue+0x12/0x50 [ 14.473014] ? _raw_spin_lock_irqsave+0x17/0x50 [ 14.473681] _raw_spin_lock_irqsave+0x3d/0x50 [ 14.474318] ? remove_wait_queue+0x12/0x50 [ 14.474907] remove_wait_queue+0x12/0x50 [ 14.475480] sk_stream_wait_memory+0x20d/0x340 [ 14.476127] ? do_wait_intr_irq+0x80/0x80 [ 14.476704] do_tcp_sendpages+0x287/0x600 [ 14.477283] tcp_bpf_push+0xab/0x260 [ 14.477817] tcp_bpf_sendmsg_redir+0x297/0x500 [ 14.478461] ? __local_bh_enable_ip+0x77/0xe0 [ 14.479096] tcp_bpf_send_verdict+0x105/0x470 [ 14.479729] tcp_bpf_sendmsg+0x318/0x4f0 [ 14.480311] sock_sendmsg+0x2d/0x40 [ 14.480822] ____sys_sendmsg+0x1b4/0x1c0 [ 14.481390] ? copy_msghdr_from_user+0x62/0x80 [ 14.482048] ___sys_sendmsg+0x78/0xb0 [ 14.482580] ? vmf_insert_pfn_prot+0x91/0x150 [ 14.483215] ? __do_fault+0x2a/0x1a0 [ 14.483738] ? do_fault+0x15e/0x5d0 [ 14.484246] ? __handle_mm_fault+0x56b/0x1040 [ 14.484874] ? lock_is_held_type+0xdf/0x130 [ 14.485474] ? find_held_lock+0x2d/0x90 [ 14.486046] ? __sys_sendmsg+0x41/0x70 [ 14.486587] __sys_sendmsg+0x41/0x70 [ 14.487105] ? intel_pmu_drain_pebs_core+0x350/0x350 [ 14.487822] do_syscall_64+0x34/0x80 [ 14.488345] entry_SYSCALL_64_after_hwframe+0x63/0xcd [...] The test scenario has the following flow: thread1 thread2 ----------- --------------- tcp_bpf_sendmsg tcp_bpf_send_verdict tcp_bpf_sendmsg_redir sock_close tcp_bpf_push_locked __sock_release tcp_bpf_push //inet_release do_tcp_sendpages sock->ops->release sk_stream_wait_memory // tcp_close sk_wait_event sk->sk_prot->close release_sock(__sk); *** lock_sock(sk); __tcp_close sock_orphan(sk) sk->sk_wq = NULL release_sock **** lock_sock(__sk); remove_wait_queue(sk_sleep(sk), &wait); sk_sleep(sk) //NULL pointer dereference &rcu_dereference_raw(sk->sk_wq)->wait While waiting for memory in thread1, the socket is released with its wait queue because thread2 has closed it. This caused by tcp_bpf_send_verdict didn't increase the f_count of psock->sk_redir->sk_socket->file in thread1. We should check if SOCK_DEAD flag is set on wakeup in sk_stream_wait_memory before accessing the wait queue. Suggested-by: NJakub Sitnicki <jakub@cloudflare.com> Signed-off-by: NLiu Jian <liujian56@huawei.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NJohn Fastabend <john.fastabend@gmail.com> Cc: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/bpf/20220823133755.314697-2-liujian56@huawei.comSigned-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Ziyang Xuan 提交于
stable inclusion from stable-v4.19.262 commit dae06957f856eb699f2a504a46891718c9b1e0d3 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZXGL CVE: NA -------------------------------- [ Upstream commit 3fd7bfd2 ] If can_send() fail, it should not update frames_abs counter in bcm_can_tx(). Add the result check for can_send() in bcm_can_tx(). Suggested-by: NMarc Kleine-Budde <mkl@pengutronix.de> Suggested-by: NOliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: NZiyang Xuan <william.xuanziyang@huawei.com> Link: https://lore.kernel.org/all/9851878e74d6d37aee2f1ee76d68361a46f89458.1663206163.git.william.xuanziyang@huawei.comAcked-by: NOliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Khalid Masum 提交于
stable inclusion from stable-v4.19.262 commit 1e8abde895b3ac6a368cbdb372e8800c49e73a28 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZXGL CVE: NA -------------------------------- [ Upstream commit 8a04d2fc ] Currently if ipcomp_alloc_scratches() fails to allocate memory ipcomp_scratches holds obsolete address. So when we try to free the percpu scratches using ipcomp_free_scratches() it tries to vfree non existent vm area. Described below: static void * __percpu *ipcomp_alloc_scratches(void) { ... scratches = alloc_percpu(void *); if (!scratches) return NULL; ipcomp_scratches does not know about this allocation failure. Therefore holding the old obsolete address. ... } So when we free, static void ipcomp_free_scratches(void) { ... scratches = ipcomp_scratches; Assigning obsolete address from ipcomp_scratches if (!scratches) return; for_each_possible_cpu(i) vfree(*per_cpu_ptr(scratches, i)); Trying to free non existent page, causing warning: trying to vfree existent vm area. ... } Fix this breakage by updating ipcomp_scrtches with NULL when scratches is freed Suggested-by: NHerbert Xu <herbert@gondor.apana.org.au> Reported-by: syzbot+5ec9bb042ddfe9644773@syzkaller.appspotmail.com Tested-by: syzbot+5ec9bb042ddfe9644773@syzkaller.appspotmail.com Signed-off-by: NKhalid Masum <khalid.masum.92@gmail.com> Acked-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Eric Dumazet 提交于
stable inclusion from stable-v4.19.262 commit 5c4e1b8939195fe27b05d791577f92445b139a3e category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZXGL CVE: NA -------------------------------- [ Upstream commit aacd467c ] tcp_md5sig_pool_populated can be read while another thread changes its value. The race has no consequence because allocations are protected with tcp_md5sig_mutex. This patch adds READ_ONCE() and WRITE_ONCE() to document the race and silence KCSAN. Reported-by: NAbhishek Shah <abhishek.shah@columbia.edu> Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Neal Cardwell 提交于
stable inclusion from stable-v4.19.262 commit a434d10e7a90e301ea4a1826ee758b53c79d7de8 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZXGL CVE: NA -------------------------------- [ Upstream commit f4ce91ce ] This commit fixes a bug in the tracking of max_packets_out and is_cwnd_limited. This bug can cause the connection to fail to remember that is_cwnd_limited is true, causing the connection to fail to grow cwnd when it should, causing throughput to be lower than it should be. The following event sequence is an example that triggers the bug: (a) The connection is cwnd_limited, but packets_out is not at its peak due to TSO deferral deciding not to send another skb yet. In such cases the connection can advance max_packets_seq and set tp->is_cwnd_limited to true and max_packets_out to a small number. (b) Then later in the round trip the connection is pacing-limited (not cwnd-limited), and packets_out is larger. In such cases the connection would raise max_packets_out to a bigger number but (unexpectedly) flip tp->is_cwnd_limited from true to false. This commit fixes that bug. One straightforward fix would be to separately track (a) the next window after max_packets_out reaches a maximum, and (b) the next window after tp->is_cwnd_limited is set to true. But this would require consuming an extra u32 sequence number. Instead, to save space we track only the most important information. Specifically, we track the strongest available signal of the degree to which the cwnd is fully utilized: (1) If the connection is cwnd-limited then we remember that fact for the current window. (2) If the connection not cwnd-limited then we track the maximum number of outstanding packets in the current window. In particular, note that the new logic cannot trigger the buggy (a)/(b) sequence above because with the new logic a condition where tp->packets_out > tp->max_packets_out can only trigger an update of tp->is_cwnd_limited if tp->is_cwnd_limited is false. This first showed up in a testing of a BBRv2 dev branch, but this buggy behavior highlighted a general issue with the tcp_cwnd_validate() logic that can cause cwnd to fail to increase at the proper rate for any TCP congestion control, including Reno or CUBIC. Fixes: ca8a2263 ("tcp: make cwnd-limited checks measurement-based, and gentler") Signed-off-by: NNeal Cardwell <ncardwell@google.com> Signed-off-by: NKevin(Yudong) Yang <yyd@google.com> Signed-off-by: NYuchung Cheng <ycheng@google.com> Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Baokun Li 提交于
stable inclusion from stable-v4.19.262 commit 947264e00c46de19a016fd81218118c708fed2f3 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZXGL CVE: NA -------------------------------- commit f9c1f248 upstream. I caught a null-ptr-deref bug as follows: ================================================================== KASAN: null-ptr-deref in range [0x0000000000000068-0x000000000000006f] CPU: 1 PID: 1589 Comm: umount Not tainted 5.10.0-02219-dirty #339 RIP: 0010:ext4_write_info+0x53/0x1b0 [...] Call Trace: dquot_writeback_dquots+0x341/0x9a0 ext4_sync_fs+0x19e/0x800 __sync_filesystem+0x83/0x100 sync_filesystem+0x89/0xf0 generic_shutdown_super+0x79/0x3e0 kill_block_super+0xa1/0x110 deactivate_locked_super+0xac/0x130 deactivate_super+0xb6/0xd0 cleanup_mnt+0x289/0x400 __cleanup_mnt+0x16/0x20 task_work_run+0x11c/0x1c0 exit_to_user_mode_prepare+0x203/0x210 syscall_exit_to_user_mode+0x5b/0x3a0 do_syscall_64+0x59/0x70 entry_SYSCALL_64_after_hwframe+0x44/0xa9 ================================================================== Above issue may happen as follows: ------------------------------------- exit_to_user_mode_prepare task_work_run __cleanup_mnt cleanup_mnt deactivate_super deactivate_locked_super kill_block_super generic_shutdown_super shrink_dcache_for_umount dentry = sb->s_root sb->s_root = NULL <--- Here set NULL sync_filesystem __sync_filesystem sb->s_op->sync_fs > ext4_sync_fs dquot_writeback_dquots sb->dq_op->write_info > ext4_write_info ext4_journal_start(d_inode(sb->s_root), EXT4_HT_QUOTA, 2) d_inode(sb->s_root) s_root->d_inode <--- Null pointer dereference To solve this problem, we use ext4_journal_start_sb directly to avoid s_root being used. Cc: stable@kernel.org Signed-off-by: NBaokun Li <libaokun1@huawei.com> Reviewed-by: NJan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20220805123947.565152-1-libaokun1@huawei.comSigned-off-by: NTheodore Ts'o <tytso@mit.edu> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Sasha Levin 提交于
stable inclusion from stable-v4.19.262 commit 6d43e94b8daf009694d709c5919d67067936577a category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZXGL CVE: NA -------------------------------- This reverts commit fd0a6e99b61e6c08fa5cf585d54fd956f70c73a6. Which was upstream commit 97ef77c5. The commit is missing dependencies and breaks NFS tests, remove it for now. Reported-by: NSaeed Mirzamohammadi <saeed.mirzamohammadi@oracle.com> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Tyler Hicks 提交于
stable inclusion from stable-v4.19.261 commit f039564c36ee609bc5327a52895dfee05c8f0f7c category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZXGL CVE: NA -------------------------------- commit 2bdd737c upstream. Use ima_free_rule() to fix memory leaks of allocated ima_rule_entry members, such as .fsname and .keyrings, when an error is encountered during rule parsing. Set the args_p pointer to NULL after freeing it in the error path of ima_lsm_rule_init() so that it isn't freed twice. This fixes a memory leak seen when loading an rule that contains an additional piece of allocated memory, such as an fsname, followed by an invalid conditional: # echo "measure fsname=tmpfs bad=cond" > /sys/kernel/security/ima/policy -bash: echo: write error: Invalid argument # echo scan > /sys/kernel/debug/kmemleak # cat /sys/kernel/debug/kmemleak unreferenced object 0xffff98e7e4ece6c0 (size 8): comm "bash", pid 672, jiffies 4294791843 (age 21.855s) hex dump (first 8 bytes): 74 6d 70 66 73 00 6b a5 tmpfs.k. backtrace: [<00000000abab7413>] kstrdup+0x2e/0x60 [<00000000f11ede32>] ima_parse_add_rule+0x7d4/0x1020 [<00000000f883dd7a>] ima_write_policy+0xab/0x1d0 [<00000000b17cf753>] vfs_write+0xde/0x1d0 [<00000000b8ddfdea>] ksys_write+0x68/0xe0 [<00000000b8e21e87>] do_syscall_64+0x56/0xa0 [<0000000089ea7b98>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: f1b08bbc ("ima: define a new policy condition based on the filesystem name") Fixes: 2b60c0ec ("IMA: Read keyrings= option from the IMA policy") Signed-off-by: NTyler Hicks <tyhicks@linux.microsoft.com> Signed-off-by: NMimi Zohar <zohar@linux.ibm.com> Cc: <stable@vger.kernel.org> # 4.19+ Signed-off-by: NGou Hao <gouhao@uniontech.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Tyler Hicks 提交于
stable inclusion from stable-v4.19.261 commit 3d55a948aaec1ffd4ea329bc6e1a7ecd4f10e64f category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZXGL CVE: NA -------------------------------- commit 465aee77 upstream. Create a function, ima_free_rule(), to free all memory associated with an ima_rule_entry. Use the new function to fix memory leaks of allocated ima_rule_entry members, such as .fsname and .keyrings, when deleting a list of rules. Make the existing ima_lsm_free_rule() function specific to the LSM audit rule array of an ima_rule_entry and require that callers make an additional call to kfree to free the ima_rule_entry itself. This fixes a memory leak seen when loading by a valid rule that contains an additional piece of allocated memory, such as an fsname, followed by an invalid rule that triggers a policy load failure: # echo -e "dont_measure fsname=securityfs\nbad syntax" > \ /sys/kernel/security/ima/policy -bash: echo: write error: Invalid argument # echo scan > /sys/kernel/debug/kmemleak # cat /sys/kernel/debug/kmemleak unreferenced object 0xffff9bab67ca12c0 (size 16): comm "bash", pid 684, jiffies 4295212803 (age 252.344s) hex dump (first 16 bytes): 73 65 63 75 72 69 74 79 66 73 00 6b 6b 6b 6b a5 securityfs.kkkk. backtrace: [<00000000adc80b1b>] kstrdup+0x2e/0x60 [<00000000d504cb0d>] ima_parse_add_rule+0x7d4/0x1020 [<00000000444825ac>] ima_write_policy+0xab/0x1d0 [<000000002b7f0d6c>] vfs_write+0xde/0x1d0 [<0000000096feedcf>] ksys_write+0x68/0xe0 [<0000000052b544a2>] do_syscall_64+0x56/0xa0 [<000000007ead1ba7>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: f1b08bbc ("ima: define a new policy condition based on the filesystem name") Fixes: 2b60c0ec ("IMA: Read keyrings= option from the IMA policy") Signed-off-by: NTyler Hicks <tyhicks@linux.microsoft.com> Signed-off-by: NMimi Zohar <zohar@linux.ibm.com> Cc: <stable@vger.kernel.org> # 4.19+ Signed-off-by: NGou Hao <gouhao@uniontech.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Tyler Hicks 提交于
stable inclusion from stable-v4.19.261 commit 7e290764624acfc807a9dae958b3e4ecc550b50c category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZXGL CVE: NA -------------------------------- commit 9ff8a616 upstream. Ask the LSM to free its audit rule rather than directly calling kfree(). Both AppArmor and SELinux do additional work in their audit_rule_free() hooks. Fix memory leaks by allowing the LSMs to perform necessary work. Fixes: b1694245 ("ima: use the lsm policy update notifier") Signed-off-by: NTyler Hicks <tyhicks@linux.microsoft.com> Cc: Janne Karhunen <janne.karhunen@gmail.com> Cc: Casey Schaufler <casey@schaufler-ca.com> Reviewed-by: NMimi Zohar <zohar@linux.ibm.com> Signed-off-by: NMimi Zohar <zohar@linux.ibm.com> Cc: <stable@vger.kernel.org> # 4.19+ Signed-off-by: NGou Hao <gouhao@uniontech.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Alistair Popple 提交于
stable inclusion from stable-v4.19.261 commit acf4387e553ede843bdacf3616e4b750517b58db category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZXGL CVE: NA -------------------------------- commit 60bae737 upstream. When clearing a PTE the TLB should be flushed whilst still holding the PTL to avoid a potential race with madvise/munmap/etc. For example consider the following sequence: CPU0 CPU1 ---- ---- migrate_vma_collect_pmd() pte_unmap_unlock() madvise(MADV_DONTNEED) -> zap_pte_range() pte_offset_map_lock() [ PTE not present, TLB not flushed ] pte_unmap_unlock() [ page is still accessible via stale TLB ] flush_tlb_range() In this case the page may still be accessed via the stale TLB entry after madvise returns. Fix this by flushing the TLB while holding the PTL. Fixes: 8c3328f1 ("mm/migrate: migrate_vma() unmap page from vma while collecting pages") Link: https://lkml.kernel.org/r/9f801e9d8d830408f2ca27821f606e09aa856899.1662078528.git-series.apopple@nvidia.comSigned-off-by: NAlistair Popple <apopple@nvidia.com> Reported-by: NNadav Amit <nadav.amit@gmail.com> Reviewed-by: N"Huang, Ying" <ying.huang@intel.com> Acked-by: NDavid Hildenbrand <david@redhat.com> Acked-by: NPeter Xu <peterx@redhat.com> Cc: Alex Sierra <alex.sierra@amd.com> Cc: Ben Skeggs <bskeggs@redhat.com> Cc: Felix Kuehling <Felix.Kuehling@amd.com> Cc: huang ying <huang.ying.caritas@gmail.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Karol Herbst <kherbst@redhat.com> Cc: Logan Gunthorpe <logang@deltatee.com> Cc: Lyude Paul <lyude@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Paul Mackerras <paulus@ozlabs.org> Cc: Ralph Campbell <rcampbell@nvidia.com> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Maurizio Lombardi 提交于
stable inclusion from stable-v4.19.261 commit 39a22a4ccd3a6c073ba2257629e752afa4e7ad08 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZXGL CVE: NA -------------------------------- commit dac22531 upstream. A number of drivers call page_frag_alloc() with a fragment's size > PAGE_SIZE. In low memory conditions, __page_frag_cache_refill() may fail the order 3 cache allocation and fall back to order 0; In this case, the cache will be smaller than the fragment, causing memory corruptions. Prevent this from happening by checking if the newly allocated cache is large enough for the fragment; if not, the allocation will fail and page_frag_alloc() will return NULL. Link: https://lkml.kernel.org/r/20220715125013.247085-1-mlombard@redhat.com Fixes: b63ae8ca ("mm/net: Rename and move page fragment handling from net/ to mm/") Signed-off-by: NMaurizio Lombardi <mlombard@redhat.com> Reviewed-by: NAlexander Duyck <alexanderduyck@fb.com> Cc: Chen Lin <chen45464546@163.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-