- 12 4月, 2023 6 次提交
-
-
由 Darrick J. Wong 提交于
mainline inclusion from mainline-v5.19-rc5 commit 732436ef category: bugfix bugzilla: 187164, https://gitee.com/openeuler/kernel/issues/I4KIAO CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=732436ef916b4f338d672ea56accfdb11e8d0732 -------------------------------- We're about to make this logic do a bit more, so convert the macro to a static inline function for better typechecking and fewer shouty macros. No functional changes here. Signed-off-by: NDarrick J. Wong <djwong@kernel.org> Reviewed-by: NDave Chinner <dchinner@redhat.com> conflicts: fs/xfs/libxfs/xfs_bmap.c fs/xfs/libxfs/xfs_bmap_btree.c fs/xfs/libxfs/xfs_inode_fork.c fs/xfs/libxfs/xfs_inode_fork.h fs/xfs/scrub/bmap.c fs/xfs/scrub/symlink.c fs/xfs/xfs_inode.c fs/xfs/xfs_ioctl.c fs/xfs/xfs_qm.c fs/xfs/xfs_reflink.c Signed-off-by: NLong Li <leo.lilong@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Brian Foster 提交于
mainline inclusion from mainline-v5.11-rc4 commit 06058bc4 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4KIAO CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=06058bc40534530e617e5623775c53bb24f032cb -------------------------------- Freed extents are marked busy from the point the freeing transaction commits until the associated CIL context is checkpointed to the log. This prevents reuse and overwrite of recently freed blocks before the changes are committed to disk, which can lead to corruption after a crash. The exception to this rule is that metadata allocation is allowed to reuse busy extents because metadata changes are also logged. As of commit 97d3ac75 ("xfs: exact busy extent tracking"), XFS has allowed modification or complete invalidation of outstanding busy extents for metadata allocations. This implementation assumes that use of the associated extent is imminent, which is not always the case. For example, the trimmed extent might not satisfy the minimum length of the allocation request, or the allocation algorithm might be involved in a search for the optimal result based on locality. generic/019 reproduces a corruption caused by this scenario. First, a metadata block (usually a bmbt or symlink block) is freed from an inode. A subsequent bmbt split on an unrelated inode attempts a near mode allocation request that invalidates the busy block during the search, but does not ultimately allocate it. Due to the busy state invalidation, the block is no longer considered busy to subsequent allocation. A direct I/O write request immediately allocates the block and writes to it. Finally, the filesystem crashes while in a state where the initial metadata block free had not committed to the on-disk log. After recovery, the original metadata block is in its original location as expected, but has been corrupted by the aforementioned dio. This demonstrates that it is fundamentally unsafe to modify busy extent state for extents that are not guaranteed to be allocated. This applies to pretty much all of the code paths that currently trim busy extents for one reason or another. Therefore to address this problem, drop the reuse mechanism from the busy extent trim path. This code already knows how to return partial non-busy ranges of the targeted free extent and higher level code tracks the busy state of the allocation attempt. If a block allocation fails where one or more candidate extents is busy, we force the log and retry the allocation. Signed-off-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org> Signed-off-by: NDarrick J. Wong <djwong@kernel.org> Reviewed-by: NChandan Babu R <chandanrlinux@gmail.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NLong Li <leo.lilong@huawei.com> Reviewed-by: NYang Erkun <yangerkun@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Zheng Yongjun 提交于
mainline inclusion from mainline-v5.10-rc5 commit 1189686e category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4KIAO CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1189686e5440041057f8cc21a7c1d13bb6642cb9 -------------------------------- Replace a comma between expression statements by a semicolon. Signed-off-by: NZheng Yongjun <zhengyongjun3@huawei.com> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NEric Sandeen <sandeen@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NLong Li <leo.lilong@huawei.com> Reviewed-by: NYang Erkun <yangerkun@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Dave Chinner 提交于
mainline inclusion from mainline-v5.17-rc6 commit 941fbdfd category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4KIAO CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=941fbdfd6dd0f1d7961c28123b5460912f678cb5 -------------------------------- xfs_ail_push_all_sync() has a loop like this: while max_ail_lsn { prepare_to_wait(ail_empty) target = max_ail_lsn wake_up(ail_task); schedule() } Which is designed to sleep until the AIL is emptied. When xfs_ail_update_finish() moves the tail of the log, it does: if (list_empty(&ailp->ail_head)) wake_up_all(&ailp->ail_empty); So it will only wake up the sync push waiter when the AIL goes empty. If, by the time the push waiter has woken, the AIL has more in it, it will reset the target, wake the push task and go back to sleep. The problem here is that if the AIL is having items added to it when xfs_ail_push_all_sync() is called, then they may get inserted into the AIL at a LSN higher than the target LSN. At this point, xfsaild_push() will see that the target is X, the item LSNs are (X+N) and skip over them, hence never pushing the out. The result of this the AIL will not get emptied by the AIL push thread, hence xfs_ail_finish_update() will never see the AIL being empty even if it moves the tail. Hence xfs_ail_push_all_sync() never gets woken and hence cannot update the push target to capture the items beyond the current target on the LSN. This is a TOCTOU type of issue so the way to avoid it is to not use the push target at all for sync pushes. We know that a sync push is being requested by the fact the ail_empty wait queue is active, hence the xfsaild can just set the target to max_ail_lsn on every push that we see the wait queue active. Hence we no longer will leave items on the AIL that are beyond the LSN sampled at the start of a sync push. Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NChandan Babu R <chandan.babu@oracle.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org> Signed-off-by: NDarrick J. Wong <djwong@kernel.org> Signed-off-by: NLong Li <leo.lilong@huawei.com> Reviewed-by: NYang Erkun <yangerkun@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Dave Chinner 提交于
mainline inclusion from mainline-v5.17-rc6 commit dbd0f529 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4KIAO CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=dbd0f5299302f8506637592e2373891a748c6990 -------------------------------- AIL flushing can get stuck here: [316649.005769] INFO: task xfsaild/pmem1:324525 blocked for more than 123 seconds. [316649.007807] Not tainted 5.17.0-rc6-dgc+ #975 [316649.009186] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [316649.011720] task:xfsaild/pmem1 state:D stack:14544 pid:324525 ppid: 2 flags:0x00004000 [316649.014112] Call Trace: [316649.014841] <TASK> [316649.015492] __schedule+0x30d/0x9e0 [316649.017745] schedule+0x55/0xd0 [316649.018681] io_schedule+0x4b/0x80 [316649.019683] xfs_buf_wait_unpin+0x9e/0xf0 [316649.021850] __xfs_buf_submit+0x14a/0x230 [316649.023033] xfs_buf_delwri_submit_buffers+0x107/0x280 [316649.024511] xfs_buf_delwri_submit_nowait+0x10/0x20 [316649.025931] xfsaild+0x27e/0x9d0 [316649.028283] kthread+0xf6/0x120 [316649.030602] ret_from_fork+0x1f/0x30 in the situation where flushing gets preempted between the unpin check and the buffer trylock under nowait conditions: blk_start_plug(&plug); list_for_each_entry_safe(bp, n, buffer_list, b_list) { if (!wait_list) { if (xfs_buf_ispinned(bp)) { pinned++; continue; } Here >>>>>> if (!xfs_buf_trylock(bp)) continue; This means submission is stuck until something else triggers a log force to unpin the buffer. To get onto the delwri list to begin with, the buffer pin state has already been checked, and hence it's relatively rare we get a race between flushing and encountering a pinned buffer in delwri submission to begin with. Further, to increase the pin count the buffer has to be locked, so the only way we can hit this race without failing the trylock is to be preempted between the pincount check seeing zero and the trylock being run. Hence to avoid this problem, just invert the order of trylock vs pin check. We shouldn't hit that many pinned buffers here, so optimising away the trylock for pinned buffers should not matter for performance at all. Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NChandan Babu R <chandan.babu@oracle.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org> Signed-off-by: NDarrick J. Wong <djwong@kernel.org> Signed-off-by: NLong Li <leo.lilong@huawei.com> Reviewed-by: NYang Erkun <yangerkun@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Dave Chinner 提交于
mainline inclusion from mainline-v5.17-rc6 commit a9a4bc8c category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4KIAO CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a9a4bc8c76d747aa40b30e2dfc176c781f353a08 -------------------------------- After 963 iterations of generic/530, it deadlocked during recovery on a pinned inode cluster buffer like so: XFS (pmem1): Starting recovery (logdev: internal) INFO: task kworker/8:0:306037 blocked for more than 122 seconds. Not tainted 5.17.0-rc6-dgc+ #975 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:kworker/8:0 state:D stack:13024 pid:306037 ppid: 2 flags:0x00004000 Workqueue: xfs-inodegc/pmem1 xfs_inodegc_worker Call Trace: <TASK> __schedule+0x30d/0x9e0 schedule+0x55/0xd0 schedule_timeout+0x114/0x160 __down+0x99/0xf0 down+0x5e/0x70 xfs_buf_lock+0x36/0xf0 xfs_buf_find+0x418/0x850 xfs_buf_get_map+0x47/0x380 xfs_buf_read_map+0x54/0x240 xfs_trans_read_buf_map+0x1bd/0x490 xfs_imap_to_bp+0x4f/0x70 xfs_iunlink_map_ino+0x66/0xd0 xfs_iunlink_map_prev.constprop.0+0x148/0x2f0 xfs_iunlink_remove_inode+0xf2/0x1d0 xfs_inactive_ifree+0x1a3/0x900 xfs_inode_unlink+0xcc/0x210 xfs_inodegc_worker+0x1ac/0x2f0 process_one_work+0x1ac/0x390 worker_thread+0x56/0x3c0 kthread+0xf6/0x120 ret_from_fork+0x1f/0x30 </TASK> task:mount state:D stack:13248 pid:324509 ppid:324233 flags:0x00004000 Call Trace: <TASK> __schedule+0x30d/0x9e0 schedule+0x55/0xd0 schedule_timeout+0x114/0x160 __down+0x99/0xf0 down+0x5e/0x70 xfs_buf_lock+0x36/0xf0 xfs_buf_find+0x418/0x850 xfs_buf_get_map+0x47/0x380 xfs_buf_read_map+0x54/0x240 xfs_trans_read_buf_map+0x1bd/0x490 xfs_imap_to_bp+0x4f/0x70 xfs_iget+0x300/0xb40 xlog_recover_process_one_iunlink+0x4c/0x170 xlog_recover_process_iunlinks.isra.0+0xee/0x130 xlog_recover_finish+0x57/0x110 xfs_log_mount_finish+0xfc/0x1e0 xfs_mountfs+0x540/0x910 xfs_fs_fill_super+0x495/0x850 get_tree_bdev+0x171/0x270 xfs_fs_get_tree+0x15/0x20 vfs_get_tree+0x24/0xc0 path_mount+0x304/0xba0 __x64_sys_mount+0x108/0x140 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae </TASK> task:xfsaild/pmem1 state:D stack:14544 pid:324525 ppid: 2 flags:0x00004000 Call Trace: <TASK> __schedule+0x30d/0x9e0 schedule+0x55/0xd0 io_schedule+0x4b/0x80 xfs_buf_wait_unpin+0x9e/0xf0 __xfs_buf_submit+0x14a/0x230 xfs_buf_delwri_submit_buffers+0x107/0x280 xfs_buf_delwri_submit_nowait+0x10/0x20 xfsaild+0x27e/0x9d0 kthread+0xf6/0x120 ret_from_fork+0x1f/0x30 We have the mount process waiting on an inode cluster buffer read, inodegc doing unlink waiting on the same inode cluster buffer, and the AIL push thread blocked in writeback waiting for the inode cluster buffer to become unpinned. What has happened here is that the AIL push thread has raced with the inodegc process modifying, committing and pinning the inode cluster buffer here in xfs_buf_delwri_submit_buffers() here: blk_start_plug(&plug); list_for_each_entry_safe(bp, n, buffer_list, b_list) { if (!wait_list) { if (xfs_buf_ispinned(bp)) { pinned++; continue; } Here >>>>>> if (!xfs_buf_trylock(bp)) continue; Basically, the AIL has found the buffer wasn't pinned and got the lock without blocking, but then the buffer was pinned. This implies the processing here was pre-empted between the pin check and the lock, because the pin count can only be increased while holding the buffer locked. Hence when it has gone to submit the IO, it has blocked waiting for the buffer to be unpinned. With all executing threads now waiting on the buffer to be unpinned, we normally get out of situations like this via the background log worker issuing a log force which will unpinned stuck buffers like this. But at this point in recovery, we haven't started the log worker. In fact, the first thing we do after processing intents and unlinked inodes is *start the log worker*. IOWs, we start it too late to have it break deadlocks like this. Avoid this and any other similar deadlock vectors in intent and unlinked inode recovery by starting the log worker before we recover intents and unlinked inodes. This part of recovery runs as though the filesystem is fully active, so we really should have the same infrastructure running as we normally do at runtime. Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org> Reviewed-by: NChandan Babu R <chandan.babu@oracle.com> Signed-off-by: NDarrick J. Wong <djwong@kernel.org> Signed-off-by: NLong Li <leo.lilong@huawei.com> Reviewed-by: NYang Erkun <yangerkun@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
- 04 4月, 2023 19 次提交
-
-
由 openeuler-ci-bot 提交于
Merge Pull Request from: @wang-yufen316 This is achieved by broadcasting ARP or ND packets to all of its slave devices on transmit side. The switch will take further actions based on proper configuration. A new sysctl knob "net.bonding.broadcast_arp_or_nd" is introduced which controls the behaviour of broadcasting. Link:https://gitee.com/openeuler/kernel/pulls/550 Reviewed-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: Jialin Zhang <zhangjialin11@huawei.com> Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
-
由 openeuler-ci-bot 提交于
Merge Pull Request from: @zhangjialin11 Pull new CVEs: CVE-2023-1513 CVE-2022-4269 net bugfixes from Zhengchao Shao ext4 bugfixes from Zhihao Cheng and Baokun Li timer bugfix from Yu Liao nvme bugfix from Li Lingfeng xfs bugfixes from Zhihao Cheng scsi bugfix from Yu Kuai Link:https://gitee.com/openeuler/kernel/pulls/561 Reviewed-by: Zheng Zengkai <zhengzengkai@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
由 openeuler-ci-bot 提交于
Merge Pull Request from: @openeuler-sync-bot Origin pull request: https://gitee.com/openeuler/kernel/pulls/539 fix compile warnning by remove unused function and variable. Link:https://gitee.com/openeuler/kernel/pulls/560 Reviewed-by: Jialin Zhang <zhangjialin11@huawei.com> Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
-
由 Gustavo A. R. Silva 提交于
mainline inclusion from mainline-v5.16-rc1 commit 12929198 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6SFHJ CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=129291980f4901ae68ee3ba4344bdc38cf5f800d -------------------------------- Make use of the struct_size() helper instead of an open-coded version, in order to avoid any potential type mistakes or integer overflows that, in the worst scenario, could lead to heap overflows. Link: https://github.com/KSPP/linux/issues/160Signed-off-by: NGustavo A. R. Silva <gustavoars@kernel.org> Link: https://lore.kernel.org/r/20210929201718.GA342296@embeddedorSigned-off-by: NJakub Kicinski <kuba@kernel.org> Signed-off-by: NZhengchao Shao <shaozhengchao@huawei.com> Reviewed-by: NLiu Jian <liujian56@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Gustavo A. R. Silva 提交于
mainline inclusion from mainline-v5.16-rc1 commit 69508d43 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6SFHJ CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=69508d43334e3b09c344f662272bcf24a5b508ed -------------------------------- Make use of the struct_size() and flex_array_size() helpers instead of an open-coded version, in order to avoid any potential type mistakes or integer overflows that, in the worse scenario, could lead to heap overflows. Link: https://github.com/KSPP/linux/issues/160Signed-off-by: NGustavo A. R. Silva <gustavoars@kernel.org> Link: https://lore.kernel.org/r/20210928193107.GA262595@embeddedorSigned-off-by: NJakub Kicinski <kuba@kernel.org> Conflicts: net/sched/sch_api.c Signed-off-by: NZhengchao Shao <shaozhengchao@huawei.com> Reviewed-by: NLiu Jian <liujian56@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Zhang Yi 提交于
mainline inclusion from mainline-v6.3-rc1 commit 240930fb category: perf bugzilla: https://gitee.com/openeuler/kernel/issues/I6S63P CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=240930fb7e6b52229bdee5b1423bfeab0002fed2 -------------------------------- In the dio write path, we only take shared inode lock for the case of aligned overwriting initialized blocks inside EOF. But for overwriting preallocated blocks, it may only need to split unwritten extents, this procedure has been protected under i_data_sem lock, it's safe to release the exclusive inode lock and take shared inode lock. This could give a significant speed up for multi-threaded writes. Test on Intel Xeon Gold 6140 and nvme SSD with below fio parameters. direct=1 ioengine=libaio iodepth=10 numjobs=10 runtime=60 rw=randwrite size=100G And the test result are: Before: bs=4k IOPS=11.1k, BW=43.2MiB/s bs=16k IOPS=11.1k, BW=173MiB/s bs=64k IOPS=11.2k, BW=697MiB/s After: bs=4k IOPS=41.4k, BW=162MiB/s bs=16k IOPS=41.3k, BW=646MiB/s bs=64k IOPS=13.5k, BW=843MiB/s Signed-off-by: NZhang Yi <yi.zhang@huawei.com> Reviewed-by: NJan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20221226062015.3479416-1-yi.zhang@huaweicloud.comSigned-off-by: NTheodore Ts'o <tytso@mit.edu> Conflicts: fs/ext4/file.c [ 2f632965("iomap: pass a flags argument to iomap_dio_rw") is not applied. ] Signed-off-by: NZhihao Cheng <chengzhihao1@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Baokun Li 提交于
hulk inclusion category: bugfix bugzilla: 188500, https://gitee.com/openeuler/kernel/issues/I6RJ0V CVE: NA -------------------------------- We got a WARNING in ext4_add_complete_io: ================================================================== WARNING: at fs/ext4/page-io.c:231 ext4_put_io_end_defer+0x182/0x250 CPU: 10 PID: 77 Comm: ksoftirqd/10 Tainted: 6.3.0-rc2 #85 RIP: 0010:ext4_put_io_end_defer+0x182/0x250 [ext4] [...] Call Trace: <TASK> ext4_end_bio+0xa8/0x240 [ext4] bio_endio+0x195/0x310 blk_update_request+0x184/0x770 scsi_end_request+0x2f/0x240 scsi_io_completion+0x75/0x450 scsi_finish_command+0xef/0x160 scsi_complete+0xa3/0x180 blk_complete_reqs+0x60/0x80 blk_done_softirq+0x25/0x40 __do_softirq+0x119/0x4c8 run_ksoftirqd+0x42/0x70 smpboot_thread_fn+0x136/0x3c0 kthread+0x140/0x1a0 ret_from_fork+0x2c/0x50 ================================================================== Above issue may happen as follows: cpu1 cpu2 ----------------------------|---------------------------- mount -o dioread_lock ext4_writepages ext4_do_writepages *if (ext4_should_dioread_nolock(inode))* // rsv_blocks is not assigned here mount -o remount,dioread_nolock ext4_journal_start_with_reserve __ext4_journal_start __ext4_journal_start_sb jbd2__journal_start *if (rsv_blocks)* // h_rsv_handle is not initialized here mpage_map_and_submit_extent mpage_map_one_extent dioread_nolock = ext4_should_dioread_nolock(inode) if (dioread_nolock && (map->m_flags & EXT4_MAP_UNWRITTEN)) mpd->io_submit.io_end->handle = handle->h_rsv_handle ext4_set_io_unwritten_flag io_end->flag |= EXT4_IO_END_UNWRITTEN // now io_end->handle is NULL but has EXT4_IO_END_UNWRITTEN flag scsi_finish_command scsi_io_completion scsi_io_completion_action scsi_end_request blk_update_request req_bio_endio bio_endio bio->bi_end_io > ext4_end_bio ext4_put_io_end_defer ext4_add_complete_io // trigger WARN_ON(!io_end->handle && sbi->s_journal); The immediate cause of this problem is that ext4_should_dioread_nolock() function returns inconsistent values in the ext4_do_writepages() and mpage_map_one_extent(). There are four conditions in this function that can be changed at mount time to cause this problem. These four conditions can be divided into two categories: (1) journal_data and EXT4_EXTENTS_FL, which can be changed by ioctl (2) DELALLOC and DIOREAD_NOLOCK, which can be changed by remount The two in the first category have been fixed by commit c8585c6f ("ext4: fix races between changing inode journal mode and ext4_writepages") and commit cb85f4d2 ("ext4: fix race between writepages and enabling EXT4_EXTENTS_FL") respectively. Two cases in the other category have not yet been fixed, and the above issue is caused by this situation. We refer to the fix for the first category, when applying options during remount, we grab s_writepages_rwsem to avoid racing with writepages ops to trigger this problem. Fixes: 6b523df4 ("ext4: use transaction reservation for extent conversion in ext4_end_io") Cc: stable@vger.kernel.org Signed-off-by: NBaokun Li <libaokun1@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Reviewed-by: NYang Erkun <yangerkun@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Yang Guo 提交于
mainline inclusion from mainline-v6.1-rc1 commit af246cc6 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6RQVI Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=af246cc6d0ed11318223606128bb0b09866c4c08 -------------------------------- CNTPCT_LO and CNTVCT_LO are defined by mistake in commit '8b82c4f8', so fix them according to the Arm ARM DDI 0487I.a, Table I2-4 "CNTBaseN memory map" as follows: Offset Register Type Description 0x000 CNTPCT[31:0] RO Physical Count register. 0x004 CNTPCT[63:32] RO 0x008 CNTVCT[31:0] RO Virtual Count register. 0x00C CNTVCT[63:32] RO Fixes: 8b82c4f8 ("clocksource/drivers/arm_arch_timer: Move MMIO timer programming over to CVAL") Cc: stable@vger.kernel.org Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Acked-by: NMarc Zyngier <maz@kernel.org> Signed-off-by: NYang Guo <guoyang2@huawei.com> Signed-off-by: NShaokun Zhang <zhangshaokun@hisilicon.com> Link: https://lore.kernel.org/r/20220927033221.49589-1-zhangshaokun@hisilicon.comSigned-off-by: NDaniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: NYu Liao <liaoyu15@huawei.com> Reviewed-by: NXiongfeng Wang <wangxiongfeng2@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Greg Kroah-Hartman 提交于
stable inclusion from stable-v5.10.169 commit 6416c2108ba54d569e4c98d3b62ac78cb12e7107 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6OOP3 CVE: CVE-2023-1513 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=6416c2108ba54d569e4c98d3b62ac78cb12e7107 -------------------------------- commit 2c10b614 upstream. When calling the KVM_GET_DEBUGREGS ioctl, on some configurations, there might be some unitialized portions of the kvm_debugregs structure that could be copied to userspace. Prevent this as is done in the other kvm ioctls, by setting the whole structure to 0 before copying anything into it. Bonus is that this reduces the lines of code as the explicit flag setting and reserved space zeroing out can be removed. Cc: Sean Christopherson <seanjc@google.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: <x86@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: stable <stable@kernel.org> Reported-by: NXingyuan Mo <hdthky0@gmail.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Message-Id: <20230214103304.3689213-1-gregkh@linuxfoundation.org> Tested-by: NXingyuan Mo <hdthky0@gmail.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NGuo Mengqi <guomengqi3@huawei.com> Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com> Reviewed-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Li Lingfeng 提交于
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6QMTU -------------------------------- Recently, a null-ptr-deref problem occur when submitting IO to nvme disk: [34432.226539] ========================================================== [34432.226579] BUG: KASAN: null-ptr-deref in trace_event_raw_event_nvme_complete_rq+0x13c/0x270 [nvme_core] [34432.226584] Read of size 2 at addr 0000000000000002 by task loop0/32 [34432.226586] [34432.226594] CPU: 0 PID: 3242729 Comm: loop0 Kdump: loaded Tainted: [34432.226598] Hardware name: Huawei TaiShan 2280 V2/BC82AMDD [34432.226602] Call trace: [34432.226610] dump_backtrace+0x0/0x2fc [34432.226615] show_stack+0x20/0x30 [34432.226623] dump_stack+0x104/0x17c [34432.226630] __kasan_report+0x138/0x140 [34432.226634] kasan_report+0x44/0xdc [34432.226639] __asan_load2+0x90/0xd0 [34432.226662] trace_event_raw_event_nvme_complete_rq+0x13c/0x270 [34432.226684] nvme_complete_rq+0x228/0x480 [nvme_core] [34432.226698] nvme_pci_complete_rq+0x184/0x1b4 [nvme] [34432.226706] nvme_irq+0x270/0x500 [nvme] [34432.226714] __handle_irq_event_percpu+0x8c/0x324 [34432.226719] handle_irq_event_percpu+0x88/0x11c [34432.226724] handle_irq_event+0x110/0x2b0 [34432.226729] handle_fasteoi_irq+0x1e4/0x3f4 [34432.226734] __handle_domain_irq+0xbc/0x130 [34432.226739] gic_handle_irq+0x78/0x460 [34432.226743] el1_irq+0xb8/0x140 [34432.226750] __slab_alloc+0x38/0x70 [34432.226756] kmem_cache_alloc+0x6b8/0x904 [34432.226762] mempool_alloc_slab+0x3c/0x60 [34432.226766] mempool_alloc+0xf0/0x440 [34432.226772] bio_alloc_bioset+0x208/0x2f0 [34432.226899] io_submit_init_bio+0x3c/0x190 [ext4] [34432.226991] ext4_bio_write_page+0x540/0xbd0 [ext4] [34432.227082] mpage_submit_page+0xb0/0x120 [ext4] [34432.227173] mpage_process_page_bufs+0x25c/0x2b4 [ext4] [34432.227265] mpage_prepare_extent_to_map+0x3b8/0x75c [ext4] [34432.227356] ext4_writepages+0x454/0xcb4 [ext4] [34432.227361] do_writepages+0xc4/0x1c0 ... This can be reproduced by following steps: 1) modprobe nvme 2) echo nvme:* > /sys/kernel/debug/tracing/set_event 3) dd if=/dev/random of=/dev/nvmexxx bs=1M count=1024 Generating command_id by nvme_cid() in trace event instead of nvme_req(req)->cmd->common.command_id can fix it since nvme_req(req)->cmd can be NULL in sometimes. Fixes: eae0bc99 ("nvme: use command_id instead of req->tag in trace_nvme_complete_rq()") Signed-off-by: NLi Lingfeng <lilingfeng3@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Darrick J. Wong 提交于
mainline inclusion from mainline-v5.18-rc1 commit 85bcfa26 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4KIAO CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=85bcfa26f9a3782be37d4feafd49668b98b8bdbe -------------------------------- On a modern filesystem, we don't allow userspace to allocate blocks for data storage from the per-AG space reservations, the user-controlled reservation pool that prevents ENOSPC in the middle of internal operations, or the internal per-AG set-aside that prevents unwanted filesystem shutdowns due to ENOSPC during a bmap btree split. Since we now consider freespace btree blocks as unavailable for allocation for data storage, we shouldn't report those blocks via statfs either. This makes the numbers that we return via the statfs f_bavail and f_bfree fields a more conservative estimate of actual free space. Signed-off-by: NDarrick J. Wong <djwong@kernel.org> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDave Chinner <dchinner@redhat.com> Signed-off-by: NZhihao Cheng <chengzhihao1@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Darrick J. Wong 提交于
mainline inclusion from mainline-v5.18-rc1 commit c8c56825 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4KIAO CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c8c568259772751a14e969b7230990508de73d9d -------------------------------- xfs_reserve_blocks controls the size of the user-visible free space reserve pool. Given the difference between the current and requested pool sizes, it will try to reserve free space from fdblocks. However, the amount requested from fdblocks is also constrained by the amount of space that we think xfs_mod_fdblocks will give us. If we forget to subtract m_allocbt_blks before calling xfs_mod_fdblocks, it will will return ENOSPC and we'll hang the kernel at mount due to the infinite loop. In commit fd43cf60, we decided that xfs_mod_fdblocks should not hand out the "free space" used by the free space btrees, because some portion of the free space btrees hold in reserve space for future btree expansion. Unfortunately, xfs_reserve_blocks' estimation of the number of blocks that it could request from xfs_mod_fdblocks was not updated to include m_allocbt_blks, so if space is extremely low, the caller hangs. Fix this by creating a function to estimate the number of blocks that can be reserved from fdblocks, which needs to exclude the set-aside and m_allocbt_blks. Found by running xfs/306 (which formats a single-AG 20MB filesystem) with an fstests configuration that specifies a 1k blocksize and a specially crafted log size that will consume 7/8 of the space (17920 blocks, specifically) in that AG. Cc: Brian Foster <bfoster@redhat.com> Fixes: fd43cf60 ("xfs: set aside allocation btree blocks from block reservation") Signed-off-by: NDarrick J. Wong <djwong@kernel.org> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDave Chinner <dchinner@redhat.com> Conflicts: fs/xfs/xfs_mount.h [ 15f04fdc("xfs: remove infinite loop when reserving free block pool") applied earlier. ] Signed-off-by: NZhihao Cheng <chengzhihao1@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Brian Foster 提交于
mainline inclusion from mainline-v5.13-rc1 commit fd43cf60 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4KIAO CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=fd43cf600cf61c66ae0a1021aca2f636115c7fcb -------------------------------- The blocks used for allocation btrees (bnobt and countbt) are technically considered free space. This is because as free space is used, allocbt blocks are removed and naturally become available for traditional allocation. However, this means that a significant portion of free space may consist of in-use btree blocks if free space is severely fragmented. On large filesystems with large perag reservations, this can lead to a rare but nasty condition where a significant amount of physical free space is available, but the majority of actual usable blocks consist of in-use allocbt blocks. We have a record of a (~12TB, 32 AG) filesystem with multiple AGs in a state with ~2.5GB or so free blocks tracked across ~300 total allocbt blocks, but effectively at 100% full because the the free space is entirely consumed by refcountbt perag reservation. Such a large perag reservation is by design on large filesystems. The problem is that because the free space is so fragmented, this AG contributes the 300 or so allocbt blocks to the global counters as free space. If this pattern repeats across enough AGs, the filesystem lands in a state where global block reservation can outrun physical block availability. For example, a streaming buffered write on the affected filesystem continues to allow delayed allocation beyond the point where writeback starts to fail due to physical block allocation failures. The expected behavior is for the delalloc block reservation to fail gracefully with -ENOSPC before physical block allocation failure is a possibility. To address this problem, set aside in-use allocbt blocks at reservation time and thus ensure they cannot be reserved until truly available for physical allocation. This allows alloc btree metadata to continue to reside in free space, but dynamically adjusts reservation availability based on internal state. Note that the logic requires that the allocbt counter is fully populated at reservation time before it is fully effective. We currently rely on the mount time AGF scan in the perag reservation initialization code for this dependency on filesystems where it's most important (i.e. with active perag reservations). Signed-off-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NChandan Babu R <chandanrlinux@gmail.com> Reviewed-by: NAllison Henderson <allison.henderson@oracle.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org> Signed-off-by: NDarrick J. Wong <djwong@kernel.org> Signed-off-by: NZhihao Cheng <chengzhihao1@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Brian Foster 提交于
mainline inclusion from mainline-v5.13-rc1 commit 16eaab83 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4KIAO CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=16eaab839a9273ed156ebfccbd40c15d1e72f3d8 -------------------------------- Introduce an in-core counter to track the sum of all allocbt blocks used by the filesystem. This value is currently tracked per-ag via the ->agf_btreeblks field in the AGF, which also happens to include rmapbt blocks. A global, in-core count of allocbt blocks is required to identify the subset of global ->m_fdblocks that consists of unavailable blocks currently used for allocation btrees. To support this calculation at block reservation time, construct a similar global counter for allocbt blocks, populate it on first read of each AGF and update it as allocbt blocks are used and released. Signed-off-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NChandan Babu R <chandanrlinux@gmail.com> Reviewed-by: NAllison Henderson <allison.henderson@oracle.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org> Signed-off-by: NDarrick J. Wong <djwong@kernel.org> Signed-off-by: NZhihao Cheng <chengzhihao1@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Davide Caratti 提交于
mainline inclusion from mainline-v6.3-rc1 commit ca22da2f category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I64END CVE: CVE-2022-4269 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.2-rc7&id=ca22da2fbd693b54dc8e3b7b54ccc9f7e9ba3640 -------------------------------- William reports kernel soft-lockups on some OVS topologies when TC mirred egress->ingress action is hit by local TCP traffic [1]. The same can also be reproduced with SCTP (thanks Xin for verifying), when client and server reach themselves through mirred egress to ingress, and one of the two peers sends a "heartbeat" packet (from within a timer). Enqueueing to backlog proved to fix this soft lockup; however, as Cong noticed [2], we should preserve - when possible - the current mirred behavior that counts as "overlimits" any eventual packet drop subsequent to the mirred forwarding action [3]. A compromise solution might use the backlog only when tcf_mirred_act() has a nest level greater than one: change tcf_mirred_forward() accordingly. Also, add a kselftest that can reproduce the lockup and verifies TC mirred ability to account for further packet drops after TC mirred egress->ingress (when the nest level is 1). [1] https://lore.kernel.org/netdev/33dc43f587ec1388ba456b4915c75f02a8aae226.1663945716.git.dcaratti@redhat.com/ [2] https://lore.kernel.org/netdev/Y0w%2FWWY60gqrtGLp@pop-os.localdomain/ [3] such behavior is not guaranteed: for example, if RPS or skb RX timestamping is enabled on the mirred target device, the kernel can defer receiving the skb and return NET_RX_SUCCESS inside tcf_mirred_forward(). Reported-by: NWilliam Zhao <wizhao@redhat.com> CC: Xin Long <lucien.xin@gmail.com> Signed-off-by: NDavide Caratti <dcaratti@redhat.com> Reviewed-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: NJamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: NPaolo Abeni <pabeni@redhat.com> Conflicts: net/sched/act_mirred.c tools/testing/selftests/net/forwarding/tc_actions.sh Signed-off-by: NZiyang Xuan <william.xuanziyang@huawei.com> Reviewed-by: NLiu Jian <liujian56@huawei.com> Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Davide Caratti 提交于
mainline inclusion from mainline-v6.3-rc1 commit 78dcdffe category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I64END CVE: CVE-2022-4269 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.2-rc7&id=78dcdffe0418ac8f3f057f26fe71ccf4d8ed851f -------------------------------- with commit e2ca070f ("net: sched: protect against stack overflow in TC act_mirred"), act_mirred protected itself against excessive stack growth using per_cpu counter of nested calls to tcf_mirred_act(), and capping it to MIRRED_RECURSION_LIMIT. However, such protection does not detect recursion/loops in case the packet is enqueued to the backlog (for example, when the mirred target device has RPS or skb timestamping enabled). Change the wording from "recursion" to "nesting" to make it more clear to readers. CC: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: NDavide Caratti <dcaratti@redhat.com> Reviewed-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: NJamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: NPaolo Abeni <pabeni@redhat.com> Signed-off-by: NZiyang Xuan <william.xuanziyang@huawei.com> Reviewed-by: NLiu Jian <liujian56@huawei.com> Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 wenxu 提交于
mainline inclusion from mainline-v5.11-rc1 commit fa6d6399 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I64END CVE: CVE-2022-4269 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.2-rc7&id=fa6d639930ee5cd3f932cc314f3407f07a06582d -------------------------------- This one is prepare for the next patch. Signed-off-by: Nwenxu <wenxu@ucloud.cn> Signed-off-by: NJakub Kicinski <kuba@kernel.org> Conflicts: include/net/sch_generic.h Signed-off-by: NZiyang Xuan <william.xuanziyang@huawei.com> Reviewed-by: NLiu Jian <liujian56@huawei.com> Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Yu Kuai 提交于
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6NBAZ CVE: NA -------------------------------- If alua_rtpg_queue() failed from alua_activate(), then 'qdata' is not freed, which will cause following memleak: unreferenced object 0xffff88810b2c6980 (size 32): comm "kworker/u16:2", pid 635322, jiffies 4355801099 (age 1216426.076s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 40 39 24 c1 ff ff ff ff 00 f8 ea 0a 81 88 ff ff @9$............. backtrace: [<0000000098f3a26d>] alua_activate+0xb0/0x320 [<000000003b529641>] scsi_dh_activate+0xb2/0x140 [<000000007b296db3>] activate_path_work+0xc6/0xe0 [dm_multipath] [<000000007adc9ace>] process_one_work+0x3c5/0x730 [<00000000c457a985>] worker_thread+0x93/0x650 [<00000000cb80e628>] kthread+0x1ba/0x210 [<00000000a1e61077>] ret_from_fork+0x22/0x30 Fix the problem by freeing 'qdata' in error path. Fixes: 625fe857 ("scsi: scsi_dh_alua: Check scsi_device_get() return value") Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Tony Lu 提交于
anolis inclusion from devel-5.10-v5.10.134-12 commit b90e28f7170e1ae40c572f9f80a50bbdc8f8b99f category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I697AN CVE: NA Reference: https://gitee.com/anolis/cloud-kernel/commit/b90e28f7170e1ae40c572f9f80a50bbdc8f8b99f --------------------------- OpenAnolis Bug Tracker:0000282 This is achieved by broadcasting ARP or ND packets to all of its slave devices on transmit side. The switch will take further actions based on proper configuration. A new sysctl knob "net.bonding.broadcast_arp_or_nd" is introduced which controls the behaviour of broadcasting. Signed-off-by: NTony Lu <tonylu@linux.alibaba.com> Acked-by: NDust Li <dust.li@linux.alibaba.com> Signed-off-by: NQiao Ma <mqaio@linux.alibaba.com> Reviewed-by: NShile Zhang <shile.zhang@linux.alibaba.com> Acked-by: NDust Li <dust.li@linux.alibaba.com> Signed-off-by: NWang Yufen <wangyufen@huawei.com>
-
- 03 4月, 2023 3 次提交
-
-
由 openeuler-ci-bot 提交于
Merge Pull Request from: @svishen Currently the vf function reset needs to delay 5000ms for stack recovery. This is too long for product configurations and cause configuration failures. According to the tests, 500ms delay is enough for reset process except PF FLR. So this patch adapts this delay in these scenarios. issue: https://gitee.com/openeuler/kernel/issues/I6SLBO Link:https://gitee.com/openeuler/kernel/pulls/558 Reviewed-by: Jialin Zhang <zhangjialin11@huawei.com> Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
-
由 Sui Jingfeng 提交于
LoongArch inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6RRGJ -------------------------------- fix compile warnning by remove unused function and variable. Signed-off-by: NSui Jingfeng <15330273260@189.cn> Change-Id: I45a9adabcadd4f02e69fab44207aef4883e3fdb9 (cherry picked from commit c24ddfb3)
-
由 Jie Wang 提交于
driver inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6SLBO CVE: NA ---------------------------------------------------------------------- Currently the vf function reset needs to delay 5000ms for stack recovery. This is too long for product configurations and cause configuration failures. According to the tests, 500ms delay is enough for reset process except PF FLR. So this patch adapts this delay in these scenarios. Signed-off-by: NJie Wang <wangjie125@huawei.com>
-
- 01 4月, 2023 1 次提交
-
-
由 openeuler-ci-bot 提交于
Merge Pull Request from: @barry19901226 fix CVE-2023-0266 Link:https://gitee.com/openeuler/kernel/pulls/541 Reviewed-by: Zucheng Zheng <zhengzucheng@huawei.com> Reviewed-by: Jialin Zhang <zhangjialin11@huawei.com> Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
-
- 31 3月, 2023 1 次提交
-
-
由 Clement Lecigne 提交于
stable inclusion from stable-v5.10.162 commit df02234e6b87d2a9a82acd3198e44bdeff8488c6 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6AOWP CVE: CVE-2023-0266 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=df02234e6b87d2a9a82acd3198e44bdeff8488c6 -------------------------------- [ Note: this is a fix that works around the bug equivalently as the two upstream commits: 1fa4445f ("ALSA: control - introduce snd_ctl_notify_one() helper") 56b88b50 ("ALSA: pcm: Move rwsem lock inside snd_ctl_elem_read to prevent UAF") but in a simpler way to fit with older stable trees -- tiwai ] Add missing locking in ctl_elem_read_user/ctl_elem_write_user which can be easily triggered and turned into an use-after-free. Example code paths with SNDRV_CTL_IOCTL_ELEM_READ: 64-bits: snd_ctl_ioctl snd_ctl_elem_read_user [takes controls_rwsem] snd_ctl_elem_read [lock properly held, all good] [drops controls_rwsem] 32-bits (compat): snd_ctl_ioctl_compat snd_ctl_elem_write_read_compat ctl_elem_write_read snd_ctl_elem_read [missing lock, not good] CVE-2023-0266 was assigned for this issue. Signed-off-by: NClement Lecigne <clecigne@google.com> Cc: stable@kernel.org # 5.12 and older Signed-off-by: NTakashi Iwai <tiwai@suse.de> Reviewed-by: NJaroslav Kysela <perex@perex.cz> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NHui Tang <tanghui20@huawei.com>
-
- 29 3月, 2023 10 次提交
-
-
由 openeuler-ci-bot 提交于
Merge Pull Request from: @zhangjialin11 Pull new CVEs: CVE-2023-1281 CVE-2022-48423 CVE-2023-1249 CVE-2022-48425 CVE-2022-48424 CVE-2023-28327 CVE-2023-28466 CVE-2023-1380 block and md/raid6 bugfixes from Zhong Jinghua fs bugfixes from Zhihao Cheng and Baokun Li tty bugfix from Yi Yang mm bugfixes from ZhangPeng and Ze Zuo bpf bugfixes from Pu Lehui and Liu Jian ima bugfix from GUO Zihua softirq and arch bugfixes from Lin Yujun Link:https://gitee.com/openeuler/kernel/pulls/529 Reviewed-by: Zheng Zengkai <zhengzengkai@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
由 Ming Lei 提交于
mainline inclusion from mainline-v6.2-rc1 commit d36a9ea5 category: bugfix bugzilla: 187268, https://gitee.com/openeuler/kernel/issues/I5N162 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d36a9ea5e7766961e753ee38d4c331bbe6ef659b ---------------------------------------- For blk-mq, queue release handler is usually called after blk_mq_freeze_queue_wait() returns. However, the q_usage_counter->release() handler may not be run yet at that time, so this can cause a use-after-free. Fix the issue by moving percpu_ref_exit() into blk_free_queue_rcu(). Since ->release() is called with rcu read lock held, it is agreed that the race should be covered in caller per discussion from the two links. Reported-by: NZhang Wensheng <zhangwensheng@huaweicloud.com> Reported-by: NZhong Jinghua <zhongjinghua@huawei.com> Link: https://lore.kernel.org/linux-block/Y5prfOjyyjQKUrtH@T590/T/#u Link: https://lore.kernel.org/lkml/Y4%2FmzMd4evRg9yDi@fedora/ Cc: Hillf Danton <hdanton@sina.com> Cc: Yu Kuai <yukuai3@huawei.com> Cc: Dennis Zhou <dennis@kernel.org> Fixes: 2b0d3d3e ("percpu_ref: reduce memory footprint of percpu_ref in fast path") Signed-off-by: NMing Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20221215021629.74870-1-ming.lei@redhat.comSigned-off-by: NJens Axboe <axboe@kernel.dk> conflicts: block/blk-sysfs.c block/blk-core.c Signed-off-by: NZhong Jinghua <zhongjinghua@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Ming Lei 提交于
mainline inclusion from mainline-v5.18-rc1 commit ba3e8456 category: bugfix bugzilla: 187268, https://gitee.com/openeuler/kernel/issues/I5N162 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ba3e845665fbbb0252336f27200cd5cf288a3573 ---------------------------------------- After blk_cleanup_queue() returns, disk may not be released yet, so probably bio may still be submitted and ->q_usage_counter may be touched, so far this way seems safe, but not good from API's viewpoint. Move the release q_usage_counter into blk_queue_release(). Signed-off-by: NMing Lei <ming.lei@redhat.com> Reviewed-by: NBart Van Assche <bvanassche@acm.org> Signed-off-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220308055200.735835-12-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk> conflicts: block/blk-core.c block/blk-sysfs.c Signed-off-by: NZhong Jinghua <zhongjinghua@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Zhong Jinghua 提交于
hulk inclusion category: bugfix bugzilla: 187268, https://gitee.com/openeuler/kernel/issues/I5N162 ---------------------------------------- This reverts commit 51e35e67. There is a new fix for this problem in the mainline patch, so the patch should return to the mainline solution. mainline patch: d36a9ea5 ("block: fix use-after-free of q->q_usage_counter") Fixes: 51e35e67("block: fix null-deref in percpu_ref_put") Signed-off-by: NZhong Jinghua <zhongjinghua@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Miaohe Lin 提交于
mainline inclusion from mainline-v6.3-rc1 commit 3109de30 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6POXN CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3109de308987ceae413ee015038d51e2a86c7806 -------------------------------- It's possible that kcompactd_run could fail to run kcompactd for a hot added node and leave pgdat->kcompactd as NULL. So pgdat->kcompactd should be checked here to avoid possible NULL pointer dereference. Link: https://lkml.kernel.org/r/20220418141253.24298-10-linmiaohe@huawei.comSigned-off-by: NMiaohe Lin <linmiaohe@huawei.com> Cc: Charan Teja Kalla <charante@codeaurora.org> Cc: David Hildenbrand <david@redhat.com> Cc: Pintu Kumar <pintu@codeaurora.org> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NZe Zuo <zuoze1@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Zhong Jinghua 提交于
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6OSXU ---------------------------------------- commit "md/raid6: refactor raid5_read_one_chunk" incorrectly merged the code. Repeatedly applying for memory leads to memory leaks. Fix it by removing redundant allocating memory code. Fixes: c13c2cd2 ("md/raid6: refactor raid5_read_one_chunk") Signed-off-by: NZhong Jinghua <zhongjinghua@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Dave Chinner 提交于
mainline inclusion from mainline-v5.17-rc3 commit ebb7fb15 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4KIAO CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ebb7fb1557b1d03b906b668aa2164b51e6b7d19a -------------------------------- Trond Myklebust reported soft lockups in XFS IO completion such as this: watchdog: BUG: soft lockup - CPU#12 stuck for 23s! [kworker/12:1:3106] CPU: 12 PID: 3106 Comm: kworker/12:1 Not tainted 4.18.0-305.10.2.el8_4.x86_64 #1 Workqueue: xfs-conv/md127 xfs_end_io [xfs] RIP: 0010:_raw_spin_unlock_irqrestore+0x11/0x20 Call Trace: wake_up_page_bit+0x8a/0x110 iomap_finish_ioend+0xd7/0x1c0 iomap_finish_ioends+0x7f/0xb0 xfs_end_ioend+0x6b/0x100 [xfs] xfs_end_io+0xb9/0xe0 [xfs] process_one_work+0x1a7/0x360 worker_thread+0x1fa/0x390 kthread+0x116/0x130 ret_from_fork+0x35/0x40 Ioends are processed as an atomic completion unit when all the chained bios in the ioend have completed their IO. Logically contiguous ioends can also be merged and completed as a single, larger unit. Both of these things can be problematic as both the bio chains per ioend and the size of the merged ioends processed as a single completion are both unbound. If we have a large sequential dirty region in the page cache, write_cache_pages() will keep feeding us sequential pages and we will keep mapping them into ioends and bios until we get a dirty page at a non-sequential file offset. These large sequential runs can will result in bio and ioend chaining to optimise the io patterns. The pages iunder writeback are pinned within these chains until the submission chaining is broken, allowing the entire chain to be completed. This can result in huge chains being processed in IO completion context. We get deep bio chaining if we have large contiguous physical extents. We will keep adding pages to the current bio until it is full, then we'll chain a new bio to keep adding pages for writeback. Hence we can build bio chains that map millions of pages and tens of gigabytes of RAM if the page cache contains big enough contiguous dirty file regions. This long bio chain pins those pages until the final bio in the chain completes and the ioend can iterate all the chained bios and complete them. OTOH, if we have a physically fragmented file, we end up submitting one ioend per physical fragment that each have a small bio or bio chain attached to them. We do not chain these at IO submission time, but instead we chain them at completion time based on file offset via iomap_ioend_try_merge(). Hence we can end up with unbound ioend chains being built via completion merging. XFS can then do COW remapping or unwritten extent conversion on that merged chain, which involves walking an extent fragment at a time and running a transaction to modify the physical extent information. IOWs, we merge all the discontiguous ioends together into a contiguous file range, only to then process them individually as discontiguous extents. This extent manipulation is computationally expensive and can run in a tight loop, so merging logically contiguous but physically discontigous ioends gains us nothing except for hiding the fact the fact we broke the ioends up into individual physical extents at submission and then need to loop over those individual physical extents at completion. Hence we need to have mechanisms to limit ioend sizes and to break up completion processing of large merged ioend chains: 1. bio chains per ioend need to be bound in length. Pure overwrites go straight to iomap_finish_ioend() in softirq context with the exact bio chain attached to the ioend by submission. Hence the only way to prevent long holdoffs here is to bound ioend submission sizes because we can't reschedule in softirq context. 2. iomap_finish_ioends() has to handle unbound merged ioend chains correctly. This relies on any one call to iomap_finish_ioend() being bound in runtime so that cond_resched() can be issued regularly as the long ioend chain is processed. i.e. this relies on mechanism #1 to limit individual ioend sizes to work correctly. 3. filesystems have to loop over the merged ioends to process physical extent manipulations. This means they can loop internally, and so we break merging at physical extent boundaries so the filesystem can easily insert reschedule points between individual extent manipulations. Signed-off-by: NDave Chinner <dchinner@redhat.com> Reported-and-tested-by: NTrond Myklebust <trondmy@hammerspace.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org> Signed-off-by: NDarrick J. Wong <djwong@kernel.org> Conflicts: include/linux/iomap.h fs/iomap/buffered-io.c fs/xfs/xfs_aops.c [ 6e552494 ("iomap: remove unused private field from ioend") is not applied. 95c4cd05 ("iomap: Convert to_iomap_page to take a folio") is not applied. 8ffd74e9 ("iomap: Convert bio completions to use folios") is not applied. 044c6449 ("xfs: drop unused ioend private merge and setfilesize code") is not applied. ] Signed-off-by: NZhihao Cheng <chengzhihao1@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Pedro Tammela 提交于
stable inclusion from stable-v5.10.168 commit 4fe9950815e19051b7b8268b4d4c3ac286a741bf category: bugfix bugzilla: 188576, https://gitee.com/src-openeuler/kernel/issues/I6OP9S CVE: CVE-2023-1281 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=4fe9950815e19051b7b8268b4d4c3ac286a741bf --------------------------- [ Upstream commit 42018a32 ] Syzkaller found an issue where a handle greater than 16 bits would trigger a null-ptr-deref in the imperfect hash area update. general protection fault, probably for non-canonical address 0xdffffc0000000015: 0000 [#1] PREEMPT SMP KASAN KASAN: null-ptr-deref in range [0x00000000000000a8-0x00000000000000af] CPU: 0 PID: 5070 Comm: syz-executor456 Not tainted 6.2.0-rc7-syzkaller-00112-gc68f345b #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/21/2023 RIP: 0010:tcindex_set_parms+0x1a6a/0x2990 net/sched/cls_tcindex.c:509 Code: 01 e9 e9 fe ff ff 4c 8b bd 28 fe ff ff e8 0e 57 7d f9 48 8d bb a8 00 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 94 0c 00 00 48 8b 85 f8 fd ff ff 48 8b 9b a8 00 RSP: 0018:ffffc90003d3ef88 EFLAGS: 00010202 RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000000015 RSI: ffffffff8803a102 RDI: 00000000000000a8 RBP: ffffc90003d3f1d8 R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000000 R12: ffff88801e2b10a8 R13: dffffc0000000000 R14: 0000000000030000 R15: ffff888017b3be00 FS: 00005555569af300(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000056041c6d2000 CR3: 000000002bfca000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> tcindex_change+0x1ea/0x320 net/sched/cls_tcindex.c:572 tc_new_tfilter+0x96e/0x2220 net/sched/cls_api.c:2155 rtnetlink_rcv_msg+0x959/0xca0 net/core/rtnetlink.c:6132 netlink_rcv_skb+0x165/0x440 net/netlink/af_netlink.c:2574 netlink_unicast_kernel net/netlink/af_netlink.c:1339 [inline] netlink_unicast+0x547/0x7f0 net/netlink/af_netlink.c:1365 netlink_sendmsg+0x91b/0xe10 net/netlink/af_netlink.c:1942 sock_sendmsg_nosec net/socket.c:714 [inline] sock_sendmsg+0xd3/0x120 net/socket.c:734 ____sys_sendmsg+0x334/0x8c0 net/socket.c:2476 ___sys_sendmsg+0x110/0x1b0 net/socket.c:2530 __sys_sendmmsg+0x18f/0x460 net/socket.c:2616 __do_sys_sendmmsg net/socket.c:2645 [inline] __se_sys_sendmmsg net/socket.c:2642 [inline] __x64_sys_sendmmsg+0x9d/0x100 net/socket.c:2642 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80 Fixes: ee059170 ("net/sched: tcindex: update imperfect hash filters respecting rcu") Signed-off-by: NJamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: NPedro Tammela <pctammela@mojatatu.com> Reported-by: Nsyzbot <syzkaller@googlegroups.com> Reviewed-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NDong Chenchen <dongchenchen2@huawei.com> Reviewed-by: NYue Haibing <yuehaibing@huawei.com> Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Pedro Tammela 提交于
stable inclusion from stable-v5.10.168 commit eb8e9d8572d1d9df17272783ad8a84843ce559d4 category: bugfix bugzilla: 188576, https://gitee.com/src-openeuler/kernel/issues/I6OP9S CVE: CVE-2023-1281 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=eb8e9d8572d1d9df17272783ad8a84843ce559d4 --------------------------- commit ee059170 upstream. The imperfect hash area can be updated while packets are traversing, which will cause a use-after-free when 'tcf_exts_exec()' is called with the destroyed tcf_ext. CPU 0: CPU 1: tcindex_set_parms tcindex_classify tcindex_lookup tcindex_lookup tcf_exts_change tcf_exts_exec [UAF] Stop operating on the shared area directly, by using a local copy, and update the filter with 'rcu_replace_pointer()'. Delete the old filter version only after a rcu grace period elapsed. Fixes: 9b0d4446 ("net: sched: avoid atomic swap in tcf_exts_change") Reported-by: Nvalis <sec@valis.email> Suggested-by: Nvalis <sec@valis.email> Signed-off-by: NJamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: NPedro Tammela <pctammela@mojatatu.com> Link: https://lore.kernel.org/r/20230209143739.279867-1-pctammela@mojatatu.comSigned-off-by: NJakub Kicinski <kuba@kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NDong Chenchen <dongchenchen2@huawei.com> Reviewed-by: NYue Haibing <yuehaibing@huawei.com> Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Sven Schnelle 提交于
stable inclusion from stable-v5.10.173 commit 84ea44dc3e4ecb2632586238014bf6722aa5843b category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6Q4F0 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=84ea44dc3e4ecb2632586238014bf6722aa5843b -------------------------------- [ Upstream commit db4df8e9 ] When specifying an invalid console= device like console=tty3270, tty_driver_lookup_tty() returns the tty struct without checking whether index is a valid number. To reproduce: qemu-system-x86_64 -enable-kvm -nographic -serial mon:stdio \ -kernel ../linux-build-x86/arch/x86/boot/bzImage \ -append "console=ttyS0 console=tty3270" This crashes with: [ 0.770599] BUG: kernel NULL pointer dereference, address: 00000000000000ef [ 0.771265] #PF: supervisor read access in kernel mode [ 0.771773] #PF: error_code(0x0000) - not-present page [ 0.772609] Oops: 0000 [#1] PREEMPT SMP PTI [ 0.774878] RIP: 0010:tty_open+0x268/0x6f0 [ 0.784013] chrdev_open+0xbd/0x230 [ 0.784444] ? cdev_device_add+0x80/0x80 [ 0.784920] do_dentry_open+0x1e0/0x410 [ 0.785389] path_openat+0xca9/0x1050 [ 0.785813] do_filp_open+0xaa/0x150 [ 0.786240] file_open_name+0x133/0x1b0 [ 0.786746] filp_open+0x27/0x50 [ 0.787244] console_on_rootfs+0x14/0x4d [ 0.787800] kernel_init_freeable+0x1e4/0x20d [ 0.788383] ? rest_init+0xc0/0xc0 [ 0.788881] kernel_init+0x11/0x120 [ 0.789356] ret_from_fork+0x22/0x30 Signed-off-by: NSven Schnelle <svens@linux.ibm.com> Reviewed-by: NJiri Slaby <jirislaby@kernel.org> Link: https://lore.kernel.org/r/20221209112737.3222509-2-svens@linux.ibm.comSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NYi Yang <yiyang13@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Reviewed-by: NCai Xinchen <caixinchen1@huawei.com> Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-