- 14 7月, 2023 2 次提交
-
-
由 Zhihao Cheng 提交于
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I7CBCS CVE: NA -------------------------------- Add debug message to notify user that ext4_writepages is stuck in loop caused by ENOSPC. Signed-off-by: NZhihao Cheng <chengzhihao1@huawei.com> (cherry picked from commit 4ae7e703)
-
由 Zhihao Cheng 提交于
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I7CBCS CVE: NA -------------------------------- This reverts commit 07a8109d. When ext4 runs out of space, there could be a potential data lost in ext4_writepages: If there are many preallocated blocks for some files, e4b bitmap is different from block bitmap, and there are more free blocks accounted by block bitmap. ext4_writepages P2 ext4_mb_new_blocks ext4_map_blocks ext4_mb_regular_allocator // No free bits in e4b bitmap ext4_mb_discard_preallocations_should_retry ext4_mb_discard_preallocations ext4_mb_discard_group_preallocations ext4_mb_release_inode_pa // updates e4b bitmap by pa->pa_free mb_free_blocks ext4_mb_new_blocks ext4_mb_regular_allocator // Got e4b bitmap's free bits ext4_mb_regular_allocator // After 3 times retrying, ret ENOSPC ext4_writepages mpage_map_and_submit_extent mpage_map_one_extent // ret ENOSPC if (err == -ENOSPC && EXT4_SB(sb)->s_mb_free_pending) // s_mb_free_pending is 0 *give_up_on_write = true // Abandon writeback, data lost! Fixes: 07a8109d ("ext4: Stop trying writing pages if no free ...") Signed-off-by: NZhihao Cheng <chengzhihao1@huawei.com> (cherry picked from commit 5f142164)
-
- 07 7月, 2023 1 次提交
-
-
由 Baokun Li 提交于
mainline inclusion from mainline-v6.5 commit d13f99632748462c32fc95d729f5e754bab06064 category: bugfix bugzilla: 188906, https://gitee.com/openeuler/kernel/issues/I7E9M5 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d13f99632748462c32fc95d729f5e754bab06064 -------------------------------- Yi found during a review of the patch "ext4: don't BUG on inconsistent journal feature" that when ext4_mark_recovery_complete() returns an error value, the error handling path does not turn off the enabled quotas, which triggers the following kmemleak: ================================================================ unreferenced object 0xffff8cf68678e7c0 (size 64): comm "mount", pid 746, jiffies 4294871231 (age 11.540s) hex dump (first 32 bytes): 00 90 ef 82 f6 8c ff ff 00 00 00 00 41 01 00 00 ............A... c7 00 00 00 bd 00 00 00 0a 00 00 00 48 00 00 00 ............H... backtrace: [<00000000c561ef24>] __kmem_cache_alloc_node+0x4d4/0x880 [<00000000d4e621d7>] kmalloc_trace+0x39/0x140 [<00000000837eee74>] v2_read_file_info+0x18a/0x3a0 [<0000000088f6c877>] dquot_load_quota_sb+0x2ed/0x770 [<00000000340a4782>] dquot_load_quota_inode+0xc6/0x1c0 [<0000000089a18bd5>] ext4_enable_quotas+0x17e/0x3a0 [ext4] [<000000003a0268fa>] __ext4_fill_super+0x3448/0x3910 [ext4] [<00000000b0f2a8a8>] ext4_fill_super+0x13d/0x340 [ext4] [<000000004a9489c4>] get_tree_bdev+0x1dc/0x370 [<000000006e723bf1>] ext4_get_tree+0x1d/0x30 [ext4] [<00000000c7cb663d>] vfs_get_tree+0x31/0x160 [<00000000320e1bed>] do_new_mount+0x1d5/0x480 [<00000000c074654c>] path_mount+0x22e/0xbe0 [<0000000003e97a8e>] do_mount+0x95/0xc0 [<000000002f3d3736>] __x64_sys_mount+0xc4/0x160 [<0000000027d2140c>] do_syscall_64+0x3f/0x90 ================================================================ To solve this problem, we add a "failed_mount10" tag, and call ext4_quota_off_umount() in this tag to release the enabled qoutas. Fixes: 11215630 ("ext4: don't BUG on inconsistent journal feature") Cc: stable@kernel.org Signed-off-by: NZhang Yi <yi.zhang@huawei.com> Signed-off-by: NBaokun Li <libaokun1@huawei.com> Reviewed-by: NJan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230327141630.156875-2-libaokun1@huawei.comSigned-off-by: NTheodore Ts'o <tytso@mit.edu> Conflicts: fs/ext4/super.c Signed-off-by: NBaokun Li <libaokun1@huawei.com> (cherry picked from commit e980e714)
-
- 06 7月, 2023 1 次提交
-
-
由 Zhihao Cheng 提交于
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I7CBCS -------------------------------- Folllowing steps could make ext4_wripages trap into a dead loop: 1. Consume free_clusters until free_clusters > 2 * sbi->s_resv_clusters, and free_clusters > EXT4_FREECLUSTERS_WATERMARK. // eg. free_clusters = 1422, sbi->s_resv_clusters = 512 // nr_cpus = 4, EXT4_FREECLUSTERS_WATERMARK = 512 2. umount && mount. // dirty_clusters = 0 3. Run free_clusters tasks concurrently to write different files, many tasks write(appendant) 4K data by da_write method. And each inode will consume one data block and one extent block in map_block. // There are (free_clusters - EXT4_FREECLUSTERS_WATERMARK = 910) // tasks choosing da_write method, left 512 tasks choose write_begin // method. If tasks which chooses da_write path run first. // dirty_clusters = 910, free_clusters = 1422 // Tasks which choose write_begin path will get ENOSPC: // free_clusters < (nclusters + dirty_clusters + resv_clusters) // 1422 < (1 + 910 + 512) 4. After certain number of map_block iterations in ext4_writepages. // free_clusters = 0, // dirty_clusters = 910 - (1422 / 2) = 199 5. Delete one 4K file. // free_clusters = 1 6. ext4_writepages traps into dead loop: mpage_map_and_submit_extent mpage_map_one_extent // ret = ENOSPC ext4_map_blocks -> ext4_ext_map_blocks -> ext4_mb_new_blocks -> ext4_claim_free_clusters: if (free_clusters >= (nclusters + dirty_clusters)) // false if (err == -ENOSPC && ext4_count_free_clusters(sb)) // true return err *give_up_on_write = true // won't be executed Fix it by terminating ext4_writepages if no free blocks generated. Signed-off-by: NZhihao Cheng <chengzhihao1@huawei.com> (cherry picked from commit 07a8109d)
-
- 05 6月, 2023 1 次提交
-
-
由 Baokun Li 提交于
hulk inclusion category: perf bugzilla: 188836, https://gitee.com/openeuler/kernel/issues/I7AYNZ -------------------------------- This reverts commit 5193a88e. This commit may cause performance degradation, so it is being reverted temporarily. Signed-off-by: NBaokun Li <libaokun1@huawei.com> (cherry picked from commit 6ff958e1)
-
- 02 6月, 2023 1 次提交
-
-
由 Yang Yingliang 提交于
hulk inclusion category: performance bugzilla: https://gitee.com/openeuler/kernel/issues/I7ADMY -------------------------------- In the test of execl, shell1 and shell8 of UnixBench, L3 false sharing occurs between rwsem_try_write_lock_unqueued() and filemap_map_pages(). The offset between address_space.host and address_space.i_mmap_rwsem is 48. It may occur L3 false sharing. Their offsets in struct ext4_inode_info is 696 and 744, so when the address of ext4_inode_info after L3 aligned, it may occur L3 false sharing in the following condition: [0x00 ~ 0x10] false sharing [0x18 ~ 0x40] no false sharing [0x48 ~ 0x80] false sharing Change the offset of 'vfs_inode' from 320 to 360 in ext4_inode_info and make the address of ext4_inode_info L3 aligned, so the offset of host and i_mmap_rwsem in ext4_inode_info is changed to 736 and 784, it can make them in different L3 to avoid false sharing. ./Run -c 96 -i 3 execl Before this patch: System Benchmarks Partial Index BASELINE RESULT INDEX Execl Throughput 43.0 24238.0 5636.8 ======== System Benchmarks Index Score (Partial Only) 5636.8 After this patch: System Benchmarks Partial Index BASELINE RESULT INDEX Execl Throughput 43.0 29363.7 6828.8 ======== System Benchmarks Index Score (Partial Only) 6828.8 Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
- 01 6月, 2023 1 次提交
-
-
由 Tudor Ambarus 提交于
stable inclusion from stable-v5.10.180 commit 0dde3141c527b09b96bef1e7eeb18b8127810ce9 category: bugfix bugzilla: 188791,https://gitee.com/openeuler/kernel/issues/I76XUJ Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=0dde3141c527b09b96bef1e7eeb18b8127810ce9 -------------------------------- commit 4f043518 upstream. When modifying the block device while it is mounted by the filesystem, syzbot reported the following: BUG: KASAN: slab-out-of-bounds in crc16+0x206/0x280 lib/crc16.c:58 Read of size 1 at addr ffff888075f5c0a8 by task syz-executor.2/15586 CPU: 1 PID: 15586 Comm: syz-executor.2 Not tainted 6.2.0-rc5-syzkaller-00205-gc9661827 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/12/2023 Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x1b1/0x290 lib/dump_stack.c:106 print_address_description+0x74/0x340 mm/kasan/report.c:306 print_report+0x107/0x1f0 mm/kasan/report.c:417 kasan_report+0xcd/0x100 mm/kasan/report.c:517 crc16+0x206/0x280 lib/crc16.c:58 ext4_group_desc_csum+0x81b/0xb20 fs/ext4/super.c:3187 ext4_group_desc_csum_set+0x195/0x230 fs/ext4/super.c:3210 ext4_mb_clear_bb fs/ext4/mballoc.c:6027 [inline] ext4_free_blocks+0x191a/0x2810 fs/ext4/mballoc.c:6173 ext4_remove_blocks fs/ext4/extents.c:2527 [inline] ext4_ext_rm_leaf fs/ext4/extents.c:2710 [inline] ext4_ext_remove_space+0x24ef/0x46a0 fs/ext4/extents.c:2958 ext4_ext_truncate+0x177/0x220 fs/ext4/extents.c:4416 ext4_truncate+0xa6a/0xea0 fs/ext4/inode.c:4342 ext4_setattr+0x10c8/0x1930 fs/ext4/inode.c:5622 notify_change+0xe50/0x1100 fs/attr.c:482 do_truncate+0x200/0x2f0 fs/open.c:65 handle_truncate fs/namei.c:3216 [inline] do_open fs/namei.c:3561 [inline] path_openat+0x272b/0x2dd0 fs/namei.c:3714 do_filp_open+0x264/0x4f0 fs/namei.c:3741 do_sys_openat2+0x124/0x4e0 fs/open.c:1310 do_sys_open fs/open.c:1326 [inline] __do_sys_creat fs/open.c:1402 [inline] __se_sys_creat fs/open.c:1396 [inline] __x64_sys_creat+0x11f/0x160 fs/open.c:1396 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7f72f8a8c0c9 Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 f1 19 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f72f97e3168 EFLAGS: 00000246 ORIG_RAX: 0000000000000055 RAX: ffffffffffffffda RBX: 00007f72f8bac050 RCX: 00007f72f8a8c0c9 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000020000280 RBP: 00007f72f8ae7ae9 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 00007ffd165348bf R14: 00007f72f97e3300 R15: 0000000000022000 Replace le16_to_cpu(sbi->s_es->s_desc_size) with sbi->s_desc_size It reduces ext4's compiled text size, and makes the code more efficient (we remove an extra indirect reference and a potential byte swap on big endian systems), and there is no downside. It also avoids the potential KASAN / syzkaller failure, as a bonus. Reported-by: syzbot+fc51227e7100c9294894@syzkaller.appspotmail.com Reported-by: syzbot+8785e41224a3afd04321@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?id=70d28d11ab14bd7938f3e088365252aa923cff42 Link: https://syzkaller.appspot.com/bug?id=b85721b38583ecc6b5e72ff524c67302abbc30f3 Link: https://lore.kernel.org/all/000000000000ece18705f3b20934@google.com/ Fixes: 717d50e4 ("Ext4: Uninitialized Block Groups") Cc: stable@vger.kernel.org Signed-off-by: NTudor Ambarus <tudor.ambarus@linaro.org> Link: https://lore.kernel.org/r/20230504121525.3275886-1-tudor.ambarus@linaro.orgSigned-off-by: NTheodore Ts'o <tytso@mit.edu> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NBaokun Li <libaokun1@huawei.com> Reviewed-by: NYang Erkun <yangerkun@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com> (cherry picked from commit 0d4053b9)
-
- 10 5月, 2023 2 次提交
-
-
由 Baokun Li 提交于
maillist inclusion category: bugfix bugzilla: 188724, https://gitee.com/openeuler/kernel/issues/I70Q22 Reference: https://www.spinics.net/lists/kernel/msg4779681.html ---------------------------------------- When ext4_iomap_overwrite_begin() calls ext4_iomap_begin() map blocks may fail for some reason (e.g. memory allocation failure, bare disk write), and later because "iomap->type ! = IOMAP_MAPPED" triggers WARN_ON(). When ext4 iomap_begin() returns an error, it is normal that the type of iomap->type may not match the expectation. Therefore, we only determine if iomap->type is as expected when ext4_iomap_begin() is executed successfully. Reported-by: syzbot+08106c4b7d60702dbc14@syzkaller.appspotmail.com Link: https://lore.kernel.org/all/00000000000015760b05f9b4eee9@google.comReviewed-by: NJan Kara <jack@suse.cz> Signed-off-by: NBaokun Li <libaokun1@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Reviewed-by: NYang Erkun <yangerkun@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Baokun Li 提交于
maillist inclusion category: bugfix bugzilla: 188499, https://gitee.com/openeuler/kernel/issues/I6TNVT CVE: NA Reference: https://patchwork.ozlabs.org/project/linux-ext4/patch/20230412124126.2286716-2-libaokun1@huawei.com/ ---------------------------------------- In our fault injection test, we create an ext4 file, migrate it to non-extent based file, then punch a hole and finally trigger a WARN_ON in the ext4_da_update_reserve_space(): EXT4-fs warning (device sda): ext4_da_update_reserve_space:369: ino 14, used 11 with only 10 reserved data blocks When writing back a non-extent based file, if we enable delalloc, the number of reserved blocks will be subtracted from the number of blocks mapped by ext4_ind_map_blocks(), and the extent status tree will be updated. We update the extent status tree by first removing the old extent_status and then inserting the new extent_status. If the block range we remove happens to be in an extent, then we need to allocate another extent_status with ext4_es_alloc_extent(). use old to remove to add new |----------|------------|------------| old extent_status The problem is that the allocation of a new extent_status failed due to a fault injection, and __es_shrink() did not get free memory, resulting in a return of -ENOMEM. Then do_writepages() retries after receiving -ENOMEM, we map to the same extent again, and the number of reserved blocks is again subtracted from the number of blocks in that extent. Since the blocks in the same extent are subtracted twice, we end up triggering WARN_ON at ext4_da_update_reserve_space() because used > ei->i_reserved_data_blocks. For non-extent based file, we update the number of reserved blocks after ext4_ind_map_blocks() is executed, which causes a problem that when we call ext4_ind_map_blocks() to create a block, it doesn't always create a block, but we always reduce the number of reserved blocks. So we move the logic for updating reserved blocks to ext4_ind_map_blocks() to ensure that the number of reserved blocks is updated only after we do succeed in allocating some new blocks. Fixes: 5f634d06 ("ext4: Fix quota accounting error with fallocate") Reviewed-by: NJan Kara <jack@suse.cz> Signed-off-by: NBaokun Li <libaokun1@huawei.com> Reviewed-by: NYang Erkun <yangerkun@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
- 19 4月, 2023 9 次提交
-
-
由 Jerry Lee 李修賢 提交于
stable inclusion from stable-v5.10.150 commit 67de22cb0b6c6a6a71a472dbe8f0b3f3eb60565d category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6D0XA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=67de22cb0b6c6a6a71a472dbe8f0b3f3eb60565d -------------------------------- commit df3cb754 upstream. When expanding a file system from (16TiB-2MiB) to 18TiB, the operation exits early which leads to result inconsistency between resize2fs and Ext4 kernel driver. === before === ○ → resize2fs /dev/mapper/thin resize2fs 1.45.5 (07-Jan-2020) Filesystem at /dev/mapper/thin is mounted on /mnt/test; on-line resizing required old_desc_blocks = 2048, new_desc_blocks = 2304 The filesystem on /dev/mapper/thin is now 4831837696 (4k) blocks long. [ 865.186308] EXT4-fs (dm-5): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none. [ 912.091502] dm-4: detected capacity change from 34359738368 to 38654705664 [ 970.030550] dm-5: detected capacity change from 34359734272 to 38654701568 [ 1000.012751] EXT4-fs (dm-5): resizing filesystem from 4294966784 to 4831837696 blocks [ 1000.012878] EXT4-fs (dm-5): resized filesystem to 4294967296 === after === [ 129.104898] EXT4-fs (dm-5): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none. [ 143.773630] dm-4: detected capacity change from 34359738368 to 38654705664 [ 198.203246] dm-5: detected capacity change from 34359734272 to 38654701568 [ 207.918603] EXT4-fs (dm-5): resizing filesystem from 4294966784 to 4831837696 blocks [ 207.918754] EXT4-fs (dm-5): resizing filesystem from 4294967296 to 4831837696 blocks [ 207.918758] EXT4-fs (dm-5): Converting file system to meta_bg [ 207.918790] EXT4-fs (dm-5): resizing filesystem from 4294967296 to 4831837696 blocks [ 221.454050] EXT4-fs (dm-5): resized to 4658298880 blocks [ 227.634613] EXT4-fs (dm-5): resized filesystem to 4831837696 Signed-off-by: NJerry Lee <jerrylee@qnap.com> Link: https://lore.kernel.org/r/PU1PR04MB22635E739BD21150DC182AC6A18C9@PU1PR04MB2263.apcprd04.prod.outlook.comSigned-off-by: NTheodore Ts'o <tytso@mit.edu> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Ye Bin 提交于
stable inclusion from stable-v5.10.150 commit 2189756eabbb5f8366b4302183dfa3e98bfed196 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6D0XA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=2189756eabbb5f8366b4302183dfa3e98bfed196 -------------------------------- commit 27cd4978 upstream. To avoid to 'state->fc_regions_size' mismatch with 'state->fc_regions' when fail to reallocate 'fc_reqions',only update 'state->fc_regions_size' after 'state->fc_regions' is allocated successfully. Cc: stable@kernel.org Signed-off-by: NYe Bin <yebin10@huawei.com> Reviewed-by: NJan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20220921064040.3693255-4-yebin10@huawei.comSigned-off-by: NTheodore Ts'o <tytso@mit.edu> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Ye Bin 提交于
stable inclusion from stable-v5.10.150 commit 2cfb769d60a2a57eb3566765428b6131cd16dcfe category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6D0XA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=2cfb769d60a2a57eb3566765428b6131cd16dcfe -------------------------------- commit 7069d105 upstream. As krealloc may return NULL, in this case 'state->fc_regions' may not be freed by krealloc, but 'state->fc_regions' already set NULL. Then will lead to 'state->fc_regions' memory leak. Cc: stable@kernel.org Signed-off-by: NYe Bin <yebin10@huawei.com> Reviewed-by: NJan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20220921064040.3693255-3-yebin10@huawei.comSigned-off-by: NTheodore Ts'o <tytso@mit.edu> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Ye Bin 提交于
stable inclusion from stable-v5.10.150 commit c9ce7766dc4e88e624c62a68221a3bbe8f06e856 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6D0XA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=c9ce7766dc4e88e624c62a68221a3bbe8f06e856 -------------------------------- commit 9305721a upstream. As krealloc may return NULL, in this case 'state->fc_modified_inodes' may not be freed by krealloc, but 'state->fc_modified_inodes' already set NULL. Then will lead to 'state->fc_modified_inodes' memory leak. Cc: stable@kernel.org Signed-off-by: NYe Bin <yebin10@huawei.com> Reviewed-by: NJan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20220921064040.3693255-2-yebin10@huawei.comSigned-off-by: NTheodore Ts'o <tytso@mit.edu> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Ye Bin 提交于
stable inclusion from stable-v5.10.150 commit d575fb52c46686f169774e409382af4b5990b215 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6D0XA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=d575fb52c46686f169774e409382af4b5990b215 -------------------------------- commit ccbf8eeb upstream. In 'ext4_fc_write_inode' function first call 'ext4_get_inode_loc' get 'iloc', after use it miss release 'iloc.bh'. So just release 'iloc.bh' before 'ext4_fc_write_inode' return. Cc: stable@kernel.org Signed-off-by: NYe Bin <yebin10@huawei.com> Reviewed-by: NJan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20220914100859.1415196-1-yebin10@huawei.comSigned-off-by: NTheodore Ts'o <tytso@mit.edu> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Jinke Han 提交于
stable inclusion from stable-v5.10.150 commit 74d2a398d2d8c54d6468bc1e9da60ed9f3c4739f category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6D0XA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=74d2a398d2d8c54d6468bc1e9da60ed9f3c4739f -------------------------------- commit d1052d23 upstream. In our product environment, we encounter some jbd hung waiting handles to stop while several writters were doing memory reclaim for buffer head allocation in delay alloc write path. Ext4 do buffer head allocation with holding transaction handle which may be blocked too long if the reclaim works not so smooth. According to our bcc trace, the reclaim time in buffer head allocation can reach 258s and the jbd transaction commit also take almost the same time meanwhile. Except for these extreme cases, we often see several seconds delays for cgroup memory reclaim on our servers. This is more likely to happen considering docker environment. One thing to note, the allocation of buffer heads is as often as page allocation or more often when blocksize less than page size. Just like page cache allocation, we should also place the buffer head allocation before startting the handle. Cc: stable@kernel.org Signed-off-by: NJinke Han <hanjinke.666@bytedance.com> Link: https://lore.kernel.org/r/20220903012429.22555-1-hanjinke.666@bytedance.comSigned-off-by: NTheodore Ts'o <tytso@mit.edu> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Lukas Czerner 提交于
stable inclusion from stable-v5.10.150 commit 0e1764ad71abca735418fd596a82067074b59687 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6D0XA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=0e1764ad71abca735418fd596a82067074b59687 -------------------------------- commit 50f094a5 upstream. ea_inodes are using i_version for storing part of the reference count so we really need to leave it alone. The problem can be reproduced by xfstest ext4/026 when iversion is enabled. Fix it by not calling inode_inc_iversion() for EXT4_EA_INODE_FL inodes in ext4_mark_iloc_dirty(). Cc: stable@kernel.org Signed-off-by: NLukas Czerner <lczerner@redhat.com> Reviewed-by: NJan Kara <jack@suse.cz> Reviewed-by: NJeff Layton <jlayton@kernel.org> Reviewed-by: NChristian Brauner (Microsoft) <brauner@kernel.org> Link: https://lore.kernel.org/r/20220824160349.39664-1-lczerner@redhat.comSigned-off-by: NTheodore Ts'o <tytso@mit.edu> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Lalith Rajendran 提交于
stable inclusion from stable-v5.10.150 commit ac66db1a436504159463f811b9e9864c1d2b03f1 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6D0XA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=ac66db1a436504159463f811b9e9864c1d2b03f1 -------------------------------- commit 3b575495 upstream. ext4_lazyinit_thread is not set freezable. Hence when the thread calls try_to_freeze it doesn't freeze during suspend and continues to send requests to the storage during suspend, resulting in suspend failures. Cc: stable@kernel.org Signed-off-by: NLalith Rajendran <lalithkraj@google.com> Link: https://lore.kernel.org/r/20220818214049.1519544-1-lalithkraj@google.comSigned-off-by: NTheodore Ts'o <tytso@mit.edu> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Jan Kara 提交于
stable inclusion from stable-v5.10.150 commit fb98cb61efff3b2a1964939465ccaaf906af1d4f category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6D0XA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=fb98cb61efff3b2a1964939465ccaaf906af1d4f -------------------------------- commit 4bb26f28 upstream. When inode is created and written to using direct IO, there is nothing to clear the EXT4_STATE_MAY_INLINE_DATA flag. Thus when inode gets truncated later to say 1 byte and written using normal write, we will try to store the data as inline data. This confuses the code later because the inode now has both normal block and inline data allocated and the confusion manifests for example as: kernel BUG at fs/ext4/inode.c:2721! invalid opcode: 0000 [#1] PREEMPT SMP KASAN CPU: 0 PID: 359 Comm: repro Not tainted 5.19.0-rc8-00001-g31ba1e3b8305-dirty #15 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-1.fc36 04/01/2014 RIP: 0010:ext4_writepages+0x363d/0x3660 RSP: 0018:ffffc90000ccf260 EFLAGS: 00010293 RAX: ffffffff81e1abcd RBX: 0000008000000000 RCX: ffff88810842a180 RDX: 0000000000000000 RSI: 0000008000000000 RDI: 0000000000000000 RBP: ffffc90000ccf650 R08: ffffffff81e17d58 R09: ffffed10222c680b R10: dfffe910222c680c R11: 1ffff110222c680a R12: ffff888111634128 R13: ffffc90000ccf880 R14: 0000008410000000 R15: 0000000000000001 FS: 00007f72635d2640(0000) GS:ffff88811b000000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000565243379180 CR3: 000000010aa74000 CR4: 0000000000150eb0 Call Trace: <TASK> do_writepages+0x397/0x640 filemap_fdatawrite_wbc+0x151/0x1b0 file_write_and_wait_range+0x1c9/0x2b0 ext4_sync_file+0x19e/0xa00 vfs_fsync_range+0x17b/0x190 ext4_buffered_write_iter+0x488/0x530 ext4_file_write_iter+0x449/0x1b90 vfs_write+0xbcd/0xf40 ksys_write+0x198/0x2c0 __x64_sys_write+0x7b/0x90 do_syscall_64+0x3d/0x90 entry_SYSCALL_64_after_hwframe+0x63/0xcd </TASK> Fix the problem by clearing EXT4_STATE_MAY_INLINE_DATA when we are doing direct IO write to a file. Cc: stable@kernel.org Reported-by: NTadeusz Struk <tadeusz.struk@linaro.org> Reported-by: syzbot+bd13648a53ed6933ca49@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?id=a1e89d09bbbcbd5c4cb45db230ee28c822953984Signed-off-by: NJan Kara <jack@suse.cz> Reviewed-by: NLukas Czerner <lczerner@redhat.com> Tested-by: Tadeusz Struk<tadeusz.struk@linaro.org> Link: https://lore.kernel.org/r/20220727155753.13969-1-jack@suse.czSigned-off-by: NTheodore Ts'o <tytso@mit.edu> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
- 12 4月, 2023 3 次提交
-
-
由 Zhihao Cheng 提交于
maillist inclusion category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6SMBI CVE: NA Reference: https://www.spinics.net/lists/linux-ext4/msg88386.html -------------------------------- Following process makes i_disksize exceed i_size: generic_perform_write copied = iov_iter_copy_from_user_atomic(len) // copied < len ext4_da_write_end | ext4_update_i_disksize | new_i_size = pos + copied; | WRITE_ONCE(EXT4_I(inode)->i_disksize, newsize) // update i_disksize | generic_write_end | copied = block_write_end(copied, len) // copied = 0 | if (unlikely(copied < len)) | if (!PageUptodate(page)) | copied = 0; | if (pos + copied > inode->i_size) // return false if (unlikely(copied == 0)) goto again; if (unlikely(iov_iter_fault_in_readable(i, bytes))) { status = -EFAULT; break; } We get i_disksize greater than i_size here, which could trigger WARNING check 'i_size_read(inode) < EXT4_I(inode)->i_disksize' while doing dio: ext4_dio_write_iter iomap_dio_rw __iomap_dio_rw // return err, length is not aligned to 512 ext4_handle_inode_extension WARN_ON_ONCE(i_size_read(inode) < EXT4_I(inode)->i_disksize) // Oops WARNING: CPU: 2 PID: 2609 at fs/ext4/file.c:319 CPU: 2 PID: 2609 Comm: aa Not tainted 6.3.0-rc2 RIP: 0010:ext4_file_write_iter+0xbc7 Call Trace: vfs_write+0x3b1 ksys_write+0x77 do_syscall_64+0x39 Fix it by updating 'copied' value before updating i_disksize just like ext4_write_inline_data_end() does. Fetch a reproducer in [Link]. Link: https://bugzilla.kernel.org/show_bug.cgi?id=217209 Fixes: 64769240 ("ext4: Add delayed allocation support in data=writeback mode") Signed-off-by: NZhihao Cheng <chengzhihao1@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Zhihao Cheng 提交于
maillist inclusion category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6MMUV CVE: NA Reference: https://www.spinics.net/lists/linux-ext4/msg88237.html -------------------------------- As discussed in [1], 'sbi->s_journal_bdev != sb->s_bdev' will always become true if sbi->s_journal_bdev exists. Filesystem block device and journal block device are both opened with 'FMODE_EXCL' mode, so these two devices can't be same one. Then we can remove the redundant checking 'sbi->s_journal_bdev != sb->s_bdev' if 'sbi->s_journal_bdev' exists. [1] https://lore.kernel.org/lkml/f86584f6-3877-ff18-47a1-2efaa12d18b2@huawei.com/Signed-off-by: NZhihao Cheng <chengzhihao1@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Zhihao Cheng 提交于
maillist inclusion category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6MMUV CVE: NA Reference: https://www.spinics.net/lists/linux-ext4/msg88237.html -------------------------------- Following process makes ext4 load stale buffer heads from last failed mounting in a new mounting operation: mount_bdev ext4_fill_super | ext4_load_and_init_journal | ext4_load_journal | jbd2_journal_load | load_superblock | journal_get_superblock | set_buffer_verified(bh) // buffer head is verified | jbd2_journal_recover // failed caused by EIO | goto failed_mount3a // skip 'sb->s_root' initialization deactivate_locked_super kill_block_super generic_shutdown_super if (sb->s_root) // false, skip ext4_put_super->invalidate_bdev-> // invalidate_mapping_pages->mapping_evict_folio-> // filemap_release_folio->try_to_free_buffers, which // cannot drop buffer head. blkdev_put blkdev_put_whole if (atomic_dec_and_test(&bdev->bd_openers)) // false, systemd-udev happens to open the device. Then // blkdev_flush_mapping->kill_bdev->truncate_inode_pages-> // truncate_inode_folio->truncate_cleanup_folio-> // folio_invalidate->block_invalidate_folio-> // filemap_release_folio->try_to_free_buffers will be skipped, // dropping buffer head is missed again. Second mount: ext4_fill_super ext4_load_and_init_journal ext4_load_journal ext4_get_journal jbd2_journal_init_inode journal_init_common bh = getblk_unmovable bh = __find_get_block // Found stale bh in last failed mounting journal->j_sb_buffer = bh jbd2_journal_load load_superblock journal_get_superblock if (buffer_verified(bh)) // true, skip journal->j_format_version = 2, value is 0 jbd2_journal_recover do_one_pass next_log_block += count_tags(journal, bh) // According to journal_tag_bytes(), 'tag_bytes' calculating is // affected by jbd2_has_feature_csum3(), jbd2_has_feature_csum3() // returns false because 'j->j_format_version >= 2' is not true, // then we get wrong next_log_block. The do_one_pass may exit // early whenoccuring non JBD2_MAGIC_NUMBER in 'next_log_block'. The filesystem is corrupted here, journal is partially replayed, and new journal sequence number actually is already used by last mounting. The invalidate_bdev() can drop all buffer heads even racing with bare reading block device(eg. systemd-udev), so we can fix it by invalidating bdev in error handling path in __ext4_fill_super(). Fetch a reproducer in [Link]. Link: https://bugzilla.kernel.org/show_bug.cgi?id=217171 Fixes: 25ed6e8a ("jbd2: enable journal clients to enable v2 checksumming") Cc: stable@vger.kernel.org # v3.5 Conflicts: fs/ext4/super.c [ a7a79c29("ext4: unify the ext4 super block loading operation") is not applied. 7edfd85b("ext4: Completely separate options parsing and sb setup") is not applied. ] Signed-off-by: NZhihao Cheng <chengzhihao1@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
- 04 4月, 2023 2 次提交
-
-
由 Zhang Yi 提交于
mainline inclusion from mainline-v6.3-rc1 commit 240930fb category: perf bugzilla: https://gitee.com/openeuler/kernel/issues/I6S63P CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=240930fb7e6b52229bdee5b1423bfeab0002fed2 -------------------------------- In the dio write path, we only take shared inode lock for the case of aligned overwriting initialized blocks inside EOF. But for overwriting preallocated blocks, it may only need to split unwritten extents, this procedure has been protected under i_data_sem lock, it's safe to release the exclusive inode lock and take shared inode lock. This could give a significant speed up for multi-threaded writes. Test on Intel Xeon Gold 6140 and nvme SSD with below fio parameters. direct=1 ioengine=libaio iodepth=10 numjobs=10 runtime=60 rw=randwrite size=100G And the test result are: Before: bs=4k IOPS=11.1k, BW=43.2MiB/s bs=16k IOPS=11.1k, BW=173MiB/s bs=64k IOPS=11.2k, BW=697MiB/s After: bs=4k IOPS=41.4k, BW=162MiB/s bs=16k IOPS=41.3k, BW=646MiB/s bs=64k IOPS=13.5k, BW=843MiB/s Signed-off-by: NZhang Yi <yi.zhang@huawei.com> Reviewed-by: NJan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20221226062015.3479416-1-yi.zhang@huaweicloud.comSigned-off-by: NTheodore Ts'o <tytso@mit.edu> Conflicts: fs/ext4/file.c [ 2f632965("iomap: pass a flags argument to iomap_dio_rw") is not applied. ] Signed-off-by: NZhihao Cheng <chengzhihao1@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Baokun Li 提交于
hulk inclusion category: bugfix bugzilla: 188500, https://gitee.com/openeuler/kernel/issues/I6RJ0V CVE: NA -------------------------------- We got a WARNING in ext4_add_complete_io: ================================================================== WARNING: at fs/ext4/page-io.c:231 ext4_put_io_end_defer+0x182/0x250 CPU: 10 PID: 77 Comm: ksoftirqd/10 Tainted: 6.3.0-rc2 #85 RIP: 0010:ext4_put_io_end_defer+0x182/0x250 [ext4] [...] Call Trace: <TASK> ext4_end_bio+0xa8/0x240 [ext4] bio_endio+0x195/0x310 blk_update_request+0x184/0x770 scsi_end_request+0x2f/0x240 scsi_io_completion+0x75/0x450 scsi_finish_command+0xef/0x160 scsi_complete+0xa3/0x180 blk_complete_reqs+0x60/0x80 blk_done_softirq+0x25/0x40 __do_softirq+0x119/0x4c8 run_ksoftirqd+0x42/0x70 smpboot_thread_fn+0x136/0x3c0 kthread+0x140/0x1a0 ret_from_fork+0x2c/0x50 ================================================================== Above issue may happen as follows: cpu1 cpu2 ----------------------------|---------------------------- mount -o dioread_lock ext4_writepages ext4_do_writepages *if (ext4_should_dioread_nolock(inode))* // rsv_blocks is not assigned here mount -o remount,dioread_nolock ext4_journal_start_with_reserve __ext4_journal_start __ext4_journal_start_sb jbd2__journal_start *if (rsv_blocks)* // h_rsv_handle is not initialized here mpage_map_and_submit_extent mpage_map_one_extent dioread_nolock = ext4_should_dioread_nolock(inode) if (dioread_nolock && (map->m_flags & EXT4_MAP_UNWRITTEN)) mpd->io_submit.io_end->handle = handle->h_rsv_handle ext4_set_io_unwritten_flag io_end->flag |= EXT4_IO_END_UNWRITTEN // now io_end->handle is NULL but has EXT4_IO_END_UNWRITTEN flag scsi_finish_command scsi_io_completion scsi_io_completion_action scsi_end_request blk_update_request req_bio_endio bio_endio bio->bi_end_io > ext4_end_bio ext4_put_io_end_defer ext4_add_complete_io // trigger WARN_ON(!io_end->handle && sbi->s_journal); The immediate cause of this problem is that ext4_should_dioread_nolock() function returns inconsistent values in the ext4_do_writepages() and mpage_map_one_extent(). There are four conditions in this function that can be changed at mount time to cause this problem. These four conditions can be divided into two categories: (1) journal_data and EXT4_EXTENTS_FL, which can be changed by ioctl (2) DELALLOC and DIOREAD_NOLOCK, which can be changed by remount The two in the first category have been fixed by commit c8585c6f ("ext4: fix races between changing inode journal mode and ext4_writepages") and commit cb85f4d2 ("ext4: fix race between writepages and enabling EXT4_EXTENTS_FL") respectively. Two cases in the other category have not yet been fixed, and the above issue is caused by this situation. We refer to the fix for the first category, when applying options during remount, we grab s_writepages_rwsem to avoid racing with writepages ops to trigger this problem. Fixes: 6b523df4 ("ext4: use transaction reservation for extent conversion in ext4_end_io") Cc: stable@vger.kernel.org Signed-off-by: NBaokun Li <libaokun1@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Reviewed-by: NYang Erkun <yangerkun@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
- 29 3月, 2023 2 次提交
-
-
由 Ye Bin 提交于
mainline inclusion from mainline-v6.3-rc2 commit f57886ca category: bugfix bugzilla: 188471,https://gitee.com/openeuler/kernel/issues/I6MR1V CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f57886ca1606ba74cc4ec4eb5cbf073934ffa559 -------------------------------- Now, jounral error number maybe cleared even though ext4_commit_super() failed. This may lead to error flag miss, then fsck will miss to check file system deeply. Signed-off-by: NYe Bin <yebin10@huawei.com> Reviewed-by: NJan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230307061703.245965-3-yebin@huaweicloud.comSigned-off-by: NBaokun Li <libaokun1@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Reviewed-by: NYang Erkun <yangerkun@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Ye Bin 提交于
mainline inclusion from mainline-v6.3-rc2 commit eee00237 category: bugfix bugzilla: 188471,https://gitee.com/openeuler/kernel/issues/I6MR1V CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=eee00237fa5ec8f704f7323b54e48cc34e2d9168 -------------------------------- Now, 'es->s_state' maybe covered by recover journal. And journal errno maybe not recorded in journal sb as IO error. ext4_update_super() only update error information when 'sbi->s_add_error_count' large than zero. Then 'EXT4_ERROR_FS' flag maybe lost. To solve above issue just recover 'es->s_state' error flag after journal replay like error info. Signed-off-by: NYe Bin <yebin10@huawei.com> Reviewed-by: NBaokun Li <libaokun1@huawei.com> Reviewed-by: NJan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230307061703.245965-2-yebin@huaweicloud.comSigned-off-by: NBaokun Li <libaokun1@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Reviewed-by: NYang Erkun <yangerkun@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
- 22 3月, 2023 1 次提交
-
-
由 Darrick J. Wong 提交于
mainline inclusion from mainline-v6.3-rc2 commit c993799b category: bugfix bugzilla: 188522,https://gitee.com/openeuler/kernel/issues/I6N7ZP CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c993799baf9c5861f8df91beb80e1611b12efcbd -------------------------------- Apparently syzbot figured out that issuing this FSMAP call: struct fsmap_head cmd = { .fmh_count = ...; .fmh_keys = { { .fmr_device = /* ext4 dev */, .fmr_physical = 0, }, { .fmr_device = /* ext4 dev */, .fmr_physical = 0, }, }, ... }; ret = ioctl(fd, FS_IOC_GETFSMAP, &cmd); Produces this crash if the underlying filesystem is a 1k-block ext4 filesystem: kernel BUG at fs/ext4/ext4.h:3331! invalid opcode: 0000 [#1] PREEMPT SMP CPU: 3 PID: 3227965 Comm: xfs_io Tainted: G W O 6.2.0-rc8-achx Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014 RIP: 0010:ext4_mb_load_buddy_gfp+0x47c/0x570 [ext4] RSP: 0018:ffffc90007c03998 EFLAGS: 00010246 RAX: ffff888004978000 RBX: ffffc90007c03a20 RCX: ffff888041618000 RDX: 0000000000000000 RSI: 00000000000005a4 RDI: ffffffffa0c99b11 RBP: ffff888012330000 R08: ffffffffa0c2b7d0 R09: 0000000000000400 R10: ffffc90007c03950 R11: 0000000000000000 R12: 0000000000000001 R13: 00000000ffffffff R14: 0000000000000c40 R15: ffff88802678c398 FS: 00007fdf2020c880(0000) GS:ffff88807e100000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007ffd318a5fe8 CR3: 000000007f80f001 CR4: 00000000001706e0 Call Trace: <TASK> ext4_mballoc_query_range+0x4b/0x210 [ext4 dfa189daddffe8fecd3cdfd00564e0f265a8ab80] ext4_getfsmap_datadev+0x713/0x890 [ext4 dfa189daddffe8fecd3cdfd00564e0f265a8ab80] ext4_getfsmap+0x2b7/0x330 [ext4 dfa189daddffe8fecd3cdfd00564e0f265a8ab80] ext4_ioc_getfsmap+0x153/0x2b0 [ext4 dfa189daddffe8fecd3cdfd00564e0f265a8ab80] __ext4_ioctl+0x2a7/0x17e0 [ext4 dfa189daddffe8fecd3cdfd00564e0f265a8ab80] __x64_sys_ioctl+0x82/0xa0 do_syscall_64+0x2b/0x80 entry_SYSCALL_64_after_hwframe+0x46/0xb0 RIP: 0033:0x7fdf20558aff RSP: 002b:00007ffd318a9e30 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 00000000000200c0 RCX: 00007fdf20558aff RDX: 00007fdf1feb2010 RSI: 00000000c0c0583b RDI: 0000000000000003 RBP: 00005625c0634be0 R08: 00005625c0634c40 R09: 0000000000000001 R10: 0000000000000000 R11: 0000000000000246 R12: 00007fdf1feb2010 R13: 00005625be70d994 R14: 0000000000000800 R15: 0000000000000000 For GETFSMAP calls, the caller selects a physical block device by writing its block number into fsmap_head.fmh_keys[01].fmr_device. To query mappings for a subrange of the device, the starting byte of the range is written to fsmap_head.fmh_keys[0].fmr_physical and the last byte of the range goes in fsmap_head.fmh_keys[1].fmr_physical. IOWs, to query what mappings overlap with bytes 3-14 of /dev/sda, you'd set the inputs as follows: fmh_keys[0] = { .fmr_device = major(8, 0), .fmr_physical = 3}, fmh_keys[1] = { .fmr_device = major(8, 0), .fmr_physical = 14}, Which would return you whatever is mapped in the 12 bytes starting at physical offset 3. The crash is due to insufficient range validation of keys[1] in ext4_getfsmap_datadev. On 1k-block filesystems, block 0 is not part of the filesystem, which means that s_first_data_block is nonzero. ext4_get_group_no_and_offset subtracts this quantity from the blocknr argument before cracking it into a group number and a block number within a group. IOWs, block group 0 spans blocks 1-8192 (1-based) instead of 0-8191 (0-based) like what happens with larger blocksizes. The net result of this encoding is that blocknr < s_first_data_block is not a valid input to this function. The end_fsb variable is set from the keys that are copied from userspace, which means that in the above example, its value is zero. That leads to an underflow here: blocknr = blocknr - le32_to_cpu(es->s_first_data_block); The division then operates on -1: offset = do_div(blocknr, EXT4_BLOCKS_PER_GROUP(sb)) >> EXT4_SB(sb)->s_cluster_bits; Leaving an impossibly large group number (2^32-1) in blocknr. ext4_getfsmap_check_keys checked that keys[0].fmr_physical and keys[1].fmr_physical are in increasing order, but ext4_getfsmap_datadev adjusts keys[0].fmr_physical to be at least s_first_data_block. This implies that we have to check it again after the adjustment, which is the piece that I forgot. Reported-by: syzbot+6be2b977c89f79b6b153@syzkaller.appspotmail.com Fixes: 4a495624 ("ext4: fix off-by-one fsmap error on 1k block filesystems") Link: https://syzkaller.appspot.com/bug?id=79d5768e9bfe362911ac1a5057a36fc6b5c30002 Cc: stable@vger.kernel.org Signed-off-by: NDarrick J. Wong <djwong@kernel.org> Link: https://lore.kernel.org/r/Y+58NPTH7VNGgzdd@magnoliaSigned-off-by: NTheodore Ts'o <tytso@mit.edu> Signed-off-by: NBaokun Li <libaokun1@huawei.com> Reviewed-by: NZhihao Cheng <chengzhihao1@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
- 15 3月, 2023 2 次提交
-
-
由 Ye Bin 提交于
maillist inclusion category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6K53I Reference: https://patchwork.ozlabs.org/project/linux-ext4/patch/20230116020015.1506120-1-yebin@huaweicloud.com/ -------------------------------- Syzbot found the following issue: EXT4-fs: Warning: mounting with data=journal disables delayed allocation, dioread_nolock, O_DIRECT and fast_commit support! EXT4-fs (loop0): orphan cleanup on readonly fs ------------[ cut here ]------------ WARNING: CPU: 1 PID: 5067 at fs/ext4/mballoc.c:1869 mb_find_extent+0x8a1/0xe30 Modules linked in: CPU: 1 PID: 5067 Comm: syz-executor307 Not tainted 6.2.0-rc1-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022 RIP: 0010:mb_find_extent+0x8a1/0xe30 fs/ext4/mballoc.c:1869 RSP: 0018:ffffc90003c9e098 EFLAGS: 00010293 RAX: ffffffff82405731 RBX: 0000000000000041 RCX: ffff8880783457c0 RDX: 0000000000000000 RSI: 0000000000000041 RDI: 0000000000000040 RBP: 0000000000000040 R08: ffffffff82405723 R09: ffffed10053c9402 R10: ffffed10053c9402 R11: 1ffff110053c9401 R12: 0000000000000000 R13: ffffc90003c9e538 R14: dffffc0000000000 R15: ffffc90003c9e2cc FS: 0000555556665300(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000056312f6796f8 CR3: 0000000022437000 CR4: 00000000003506e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> ext4_mb_complex_scan_group+0x353/0x1100 fs/ext4/mballoc.c:2307 ext4_mb_regular_allocator+0x1533/0x3860 fs/ext4/mballoc.c:2735 ext4_mb_new_blocks+0xddf/0x3db0 fs/ext4/mballoc.c:5605 ext4_ext_map_blocks+0x1868/0x6880 fs/ext4/extents.c:4286 ext4_map_blocks+0xa49/0x1cc0 fs/ext4/inode.c:651 ext4_getblk+0x1b9/0x770 fs/ext4/inode.c:864 ext4_bread+0x2a/0x170 fs/ext4/inode.c:920 ext4_quota_write+0x225/0x570 fs/ext4/super.c:7105 write_blk fs/quota/quota_tree.c:64 [inline] get_free_dqblk+0x34a/0x6d0 fs/quota/quota_tree.c:130 do_insert_tree+0x26b/0x1aa0 fs/quota/quota_tree.c:340 do_insert_tree+0x722/0x1aa0 fs/quota/quota_tree.c:375 do_insert_tree+0x722/0x1aa0 fs/quota/quota_tree.c:375 do_insert_tree+0x722/0x1aa0 fs/quota/quota_tree.c:375 dq_insert_tree fs/quota/quota_tree.c:401 [inline] qtree_write_dquot+0x3b6/0x530 fs/quota/quota_tree.c:420 v2_write_dquot+0x11b/0x190 fs/quota/quota_v2.c:358 dquot_acquire+0x348/0x670 fs/quota/dquot.c:444 ext4_acquire_dquot+0x2dc/0x400 fs/ext4/super.c:6740 dqget+0x999/0xdc0 fs/quota/dquot.c:914 __dquot_initialize+0x3d0/0xcf0 fs/quota/dquot.c:1492 ext4_process_orphan+0x57/0x2d0 fs/ext4/orphan.c:329 ext4_orphan_cleanup+0xb60/0x1340 fs/ext4/orphan.c:474 __ext4_fill_super fs/ext4/super.c:5516 [inline] ext4_fill_super+0x81cd/0x8700 fs/ext4/super.c:5644 get_tree_bdev+0x400/0x620 fs/super.c:1282 vfs_get_tree+0x88/0x270 fs/super.c:1489 do_new_mount+0x289/0xad0 fs/namespace.c:3145 do_mount fs/namespace.c:3488 [inline] __do_sys_mount fs/namespace.c:3697 [inline] __se_sys_mount+0x2d3/0x3c0 fs/namespace.c:3674 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd Add some debug information: mb_find_extent: mb_find_extent block=41, order=0 needed=64 next=0 ex=0/41/1@3735929054 64 64 7 block_bitmap: ff 3f 0c 00 fc 01 00 00 d2 3d 00 00 00 00 00 00 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff Acctually, blocks per group is 64, but block bitmap indicate at least has 128 blocks. Now, ext4_validate_block_bitmap() didn't check invalid block's bitmap if set. To resolve above issue, add check like fsck "Padding at end of block bitmap is not set". Reported-by: syzbot+68223fe9f6c95ad43bed@syzkaller.appspotmail.com Signed-off-by: NYe Bin <yebin10@huawei.com> Reviewed-by: NJan Kara <jack@suse.cz> Signed-off-by: NZhihao Cheng <chengzhihao1@huawei.com> Reviewed-by: NYang Erkun <yangerkun@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Zhang Yi 提交于
maillist inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6D5XF Reference: https://lore.kernel.org/linux-ext4/20230130111138.76tp6pij3yhh4brh@quack3/T/#t -------------------------------- Current _ext4_show_options() do not distinguish MOPT_2 flag, so it mixed extend sbi->s_mount_opt2 options with sbi->s_mount_opt, it could lead to show incorrect options, e.g. show fc_debug_force if we mount with errors=continue mode and miss it if we set. $ mkfs.ext4 /dev/pmem0 $ mount -o errors=remount-ro /dev/pmem0 /mnt $ cat /proc/fs/ext4/pmem0/options | grep fc_debug_force #empty $ mount -o remount,errors=continue /mnt $ cat /proc/fs/ext4/pmem0/options | grep fc_debug_force fc_debug_force $ mount -o remount,errors=remount-ro,fc_debug_force /mnt $ cat /proc/fs/ext4/pmem0/options | grep fc_debug_force #empty Fixes: 995a3ed6 ("ext4: add fast_commit feature and handling for extended mount options") Signed-off-by: NZhang Yi <yi.zhang@huawei.com> Conflict: fs/ext4/super.c Reviewed-by: NZhihao Cheng <chengzhihao1@huawei.com> Reviewed-by: NZhang Xiaoxu <zhangxiaoxu5@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
- 17 2月, 2023 2 次提交
-
-
由 Jan Kara 提交于
stable inclusion from stable-v5.10.146 commit c18383218c3102f8f9ba56fd9abed86ac80a69c3 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6D0VX Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=c18383218c3102f8f9ba56fd9abed86ac80a69c3 -------------------------------- commit 613c5a85 upstream. Currently the Orlov inode allocator searches for free inodes for a directory only in flex block groups with at most inodes_per_group/16 more directory inodes than average per flex block group. However with growing size of flex block group this becomes unnecessarily strict. Scale allowed difference from average directory count per flex block group with flex block group size as we do with other metrics. Tested-by: NStefan Wahren <stefan.wahren@i2se.com> Tested-by: NOjaswin Mujoo <ojaswin@linux.ibm.com> Cc: stable@kernel.org Link: https://lore.kernel.org/all/0d81a7c2-46b7-6010-62a4-3e6cfc1628d6@i2se.com/Signed-off-by: NJan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20220908092136.11770-3-jack@suse.czSigned-off-by: NTheodore Ts'o <tytso@mit.edu> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com> Reviewed-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Theodore Ts'o 提交于
stable inclusion from stable-v5.10.146 commit a968542d7e2471f3c75ed9380110049354e1de44 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6D0VX Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=a968542d7e2471f3c75ed9380110049354e1de44 -------------------------------- commit 80fa46d6 upstream. This patch avoids threads live-locking for hours when a large number threads are competing over the last few free extents as they blocks getting added and removed from preallocation pools. From our bug reporter: A reliable way for triggering this has multiple writers continuously write() to files when the filesystem is full, while small amounts of space are freed (e.g. by truncating a large file -1MiB at a time). In the local filesystem, this can be done by simply not checking the return code of write (0) and/or the error (ENOSPACE) that is set. Over NFS with an async mount, even clients with proper error checking will behave this way since the linux NFS client implementation will not propagate the server errors [the write syscalls immediately return success] until the file handle is closed. This leads to a situation where NFS clients send a continuous stream of WRITE rpcs which result in ERRNOSPACE -- but since the client isn't seeing this, the stream of writes continues at maximum network speed. When some space does appear, multiple writers will all attempt to claim it for their current write. For NFS, we may see dozens to hundreds of threads that do this. The real-world scenario of this is database backup tooling (in particular, github.com/mdkent/percona-xtrabackup) which may write large files (>1TiB) to NFS for safe keeping. Some temporary files are written, rewound, and read back -- all before closing the file handle (the temp file is actually unlinked, to trigger automatic deletion on close/crash.) An application like this operating on an async NFS mount will not see an error code until TiB have been written/read. The lockup was observed when running this database backup on large filesystems (64 TiB in this case) with a high number of block groups and no free space. Fragmentation is generally not a factor in this filesystem (~thousands of large files, mostly contiguous except for the parts written while the filesystem is at capacity.) Signed-off-by: NTheodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com> Reviewed-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
- 09 2月, 2023 2 次提交
-
-
由 Baokun Li 提交于
stable inclusion from stable-v5.10.163 commit 7223d5e75f26352354ea2c0ccf8b579821b52adf category: bugfix bugzilla: 187904,https://gitee.com/openeuler/kernel/issues/I6BJAR CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=7223d5e75f26352354ea2c0ccf8b579821b52adf -------------------------------- commit a71248b1 upstream. I caught a issue as follows: ================================================================== BUG: KASAN: use-after-free in __list_add_valid+0x28/0x1a0 Read of size 8 at addr ffff88814b13f378 by task mount/710 CPU: 1 PID: 710 Comm: mount Not tainted 6.1.0-rc3-next #370 Call Trace: <TASK> dump_stack_lvl+0x73/0x9f print_report+0x25d/0x759 kasan_report+0xc0/0x120 __asan_load8+0x99/0x140 __list_add_valid+0x28/0x1a0 ext4_orphan_cleanup+0x564/0x9d0 [ext4] __ext4_fill_super+0x48e2/0x5300 [ext4] ext4_fill_super+0x19f/0x3a0 [ext4] get_tree_bdev+0x27b/0x450 ext4_get_tree+0x19/0x30 [ext4] vfs_get_tree+0x49/0x150 path_mount+0xaae/0x1350 do_mount+0xe2/0x110 __x64_sys_mount+0xf0/0x190 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd </TASK> [...] ================================================================== Above issue may happen as follows: ------------------------------------- ext4_fill_super ext4_orphan_cleanup --- loop1: assume last_orphan is 12 --- list_add(&EXT4_I(inode)->i_orphan, &EXT4_SB(sb)->s_orphan) ext4_truncate --> return 0 ext4_inode_attach_jinode --> return -ENOMEM iput(inode) --> free inode<12> --- loop2: last_orphan is still 12 --- list_add(&EXT4_I(inode)->i_orphan, &EXT4_SB(sb)->s_orphan); // use inode<12> and trigger UAF To solve this issue, we need to propagate the return value of ext4_inode_attach_jinode() appropriately. Signed-off-by: NBaokun Li <libaokun1@huawei.com> Reviewed-by: NJan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20221102080633.1630225-1-libaokun1@huawei.comSigned-off-by: NTheodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NBaokun Li <libaokun1@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com> (cherry picked from commit 21851e4c)
-
由 Baokun Li 提交于
stable inclusion from stable-v5.10.150 commit f34ab95162763cd7352f46df169296eec28b688d category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6BJ5F CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=f34ab95162763cd7352f46df169296eec28b688d -------------------------------- commit f9c1f248 upstream. I caught a null-ptr-deref bug as follows: ================================================================== KASAN: null-ptr-deref in range [0x0000000000000068-0x000000000000006f] CPU: 1 PID: 1589 Comm: umount Not tainted 5.10.0-02219-dirty #339 RIP: 0010:ext4_write_info+0x53/0x1b0 [...] Call Trace: dquot_writeback_dquots+0x341/0x9a0 ext4_sync_fs+0x19e/0x800 __sync_filesystem+0x83/0x100 sync_filesystem+0x89/0xf0 generic_shutdown_super+0x79/0x3e0 kill_block_super+0xa1/0x110 deactivate_locked_super+0xac/0x130 deactivate_super+0xb6/0xd0 cleanup_mnt+0x289/0x400 __cleanup_mnt+0x16/0x20 task_work_run+0x11c/0x1c0 exit_to_user_mode_prepare+0x203/0x210 syscall_exit_to_user_mode+0x5b/0x3a0 do_syscall_64+0x59/0x70 entry_SYSCALL_64_after_hwframe+0x44/0xa9 ================================================================== Above issue may happen as follows: ------------------------------------- exit_to_user_mode_prepare task_work_run __cleanup_mnt cleanup_mnt deactivate_super deactivate_locked_super kill_block_super generic_shutdown_super shrink_dcache_for_umount dentry = sb->s_root sb->s_root = NULL <--- Here set NULL sync_filesystem __sync_filesystem sb->s_op->sync_fs > ext4_sync_fs dquot_writeback_dquots sb->dq_op->write_info > ext4_write_info ext4_journal_start(d_inode(sb->s_root), EXT4_HT_QUOTA, 2) d_inode(sb->s_root) s_root->d_inode <--- Null pointer dereference To solve this problem, we use ext4_journal_start_sb directly to avoid s_root being used. Cc: stable@kernel.org Signed-off-by: NBaokun Li <libaokun1@huawei.com> Reviewed-by: NJan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20220805123947.565152-1-libaokun1@huawei.comSigned-off-by: NTheodore Ts'o <tytso@mit.edu> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NBaokun Li <libaokun1@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com> (cherry picked from commit 7c7ce1ec)
-
- 17 1月, 2023 1 次提交
-
-
由 Baokun Li 提交于
mainline inclusion from mainline-v6.2-rc1 commit a408f33e category: bugfix bugzilla: 187935,https://gitee.com/src-openeuler/kernel/issues/I5ZR8Z CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a408f33e895e455f16cf964cb5cd4979b658db7b -------------------------------- When online resizing is performed twice consecutively, the error message "Superblock checksum does not match superblock" is displayed for the second time. Here's the reproducer: mkfs.ext4 -F /dev/sdb 100M mount /dev/sdb /tmp/test resize2fs /dev/sdb 5G resize2fs /dev/sdb 6G To solve this issue, we moved the update of the checksum after the es->s_overhead_clusters is updated. Fixes: 026d0d27 ("ext4: reduce computation of overhead during resize") Fixes: de394a86 ("ext4: update s_overhead_clusters in the superblock during an on-line resize") Signed-off-by: NBaokun Li <libaokun1@huawei.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org> Reviewed-by: NJan Kara <jack@suse.cz> Cc: stable@kernel.org Link: https://lore.kernel.org/r/20221117040341.1380702-2-libaokun1@huawei.comSigned-off-by: NTheodore Ts'o <tytso@mit.edu> Conflict: fs/ext4/resize.c Signed-off-by: NBaokun Li <libaokun1@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com> (cherry picked from commit 44942478)
-
- 30 11月, 2022 1 次提交
-
-
由 Luís Henriques 提交于
stable inclusion from stable-v5.10.146 commit 958b0ee23f5ac106e7cc11472b71aa2ea9a033bc category: bugfix bugzilla: 187444, https://gitee.com/openeuler/kernel/issues/I6261Z CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=958b0ee23f5ac106e7cc11472b71aa2ea9a033bc -------------------------------- commit 29a5b8a1 upstream. When walking through an inode extents, the ext4_ext_binsearch_idx() function assumes that the extent header has been previously validated. However, there are no checks that verify that the number of entries (eh->eh_entries) is non-zero when depth is > 0. And this will lead to problems because the EXT_FIRST_INDEX() and EXT_LAST_INDEX() will return garbage and result in this: [ 135.245946] ------------[ cut here ]------------ [ 135.247579] kernel BUG at fs/ext4/extents.c:2258! [ 135.249045] invalid opcode: 0000 [#1] PREEMPT SMP [ 135.250320] CPU: 2 PID: 238 Comm: tmp118 Not tainted 5.19.0-rc8+ #4 [ 135.252067] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b-rebuilt.opensuse.org 04/01/2014 [ 135.255065] RIP: 0010:ext4_ext_map_blocks+0xc20/0xcb0 [ 135.256475] Code: [ 135.261433] RSP: 0018:ffffc900005939f8 EFLAGS: 00010246 [ 135.262847] RAX: 0000000000000024 RBX: ffffc90000593b70 RCX: 0000000000000023 [ 135.264765] RDX: ffff8880038e5f10 RSI: 0000000000000003 RDI: ffff8880046e922c [ 135.266670] RBP: ffff8880046e9348 R08: 0000000000000001 R09: ffff888002ca580c [ 135.268576] R10: 0000000000002602 R11: 0000000000000000 R12: 0000000000000024 [ 135.270477] R13: 0000000000000000 R14: 0000000000000024 R15: 0000000000000000 [ 135.272394] FS: 00007fdabdc56740(0000) GS:ffff88807dd00000(0000) knlGS:0000000000000000 [ 135.274510] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 135.276075] CR2: 00007ffc26bd4f00 CR3: 0000000006261004 CR4: 0000000000170ea0 [ 135.277952] Call Trace: [ 135.278635] <TASK> [ 135.279247] ? preempt_count_add+0x6d/0xa0 [ 135.280358] ? percpu_counter_add_batch+0x55/0xb0 [ 135.281612] ? _raw_read_unlock+0x18/0x30 [ 135.282704] ext4_map_blocks+0x294/0x5a0 [ 135.283745] ? xa_load+0x6f/0xa0 [ 135.284562] ext4_mpage_readpages+0x3d6/0x770 [ 135.285646] read_pages+0x67/0x1d0 [ 135.286492] ? folio_add_lru+0x51/0x80 [ 135.287441] page_cache_ra_unbounded+0x124/0x170 [ 135.288510] filemap_get_pages+0x23d/0x5a0 [ 135.289457] ? path_openat+0xa72/0xdd0 [ 135.290332] filemap_read+0xbf/0x300 [ 135.291158] ? _raw_spin_lock_irqsave+0x17/0x40 [ 135.292192] new_sync_read+0x103/0x170 [ 135.293014] vfs_read+0x15d/0x180 [ 135.293745] ksys_read+0xa1/0xe0 [ 135.294461] do_syscall_64+0x3c/0x80 [ 135.295284] entry_SYSCALL_64_after_hwframe+0x46/0xb0 This patch simply adds an extra check in __ext4_ext_check(), verifying that eh_entries is not 0 when eh_depth is > 0. Link: https://bugzilla.kernel.org/show_bug.cgi?id=215941 Link: https://bugzilla.kernel.org/show_bug.cgi?id=216283 Cc: Baokun Li <libaokun1@huawei.com> Cc: stable@kernel.org Signed-off-by: NLuís Henriques <lhenriques@suse.de> Reviewed-by: NJan Kara <jack@suse.cz> Reviewed-by: NBaokun Li <libaokun1@huawei.com> Link: https://lore.kernel.org/r/20220822094235.2690-1-lhenriques@suse.deSigned-off-by: NTheodore Ts'o <tytso@mit.edu> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NBaokun Li <libaokun1@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
- 21 11月, 2022 3 次提交
-
-
由 Ye Bin 提交于
mainline inclusion from mainline-v5.19-rc3 commit 9b6641dd category: bugfix bugzilla: 186927, https://gitee.com/src-openeuler/kernel/issues/I5YIY6 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9b6641dd95a0c441b277dd72ba22fed8d61f76ad -------------------------------- We got issue as follows: [home]# mount /dev/sda test EXT4-fs (sda): warning: mounting fs with errors, running e2fsck is recommended [home]# dmesg EXT4-fs (sda): warning: mounting fs with errors, running e2fsck is recommended EXT4-fs (sda): Errors on filesystem, clearing orphan list. EXT4-fs (sda): recovery complete EXT4-fs (sda): mounted filesystem with ordered data mode. Quota mode: none. [home]# debugfs /dev/sda debugfs 1.46.5 (30-Dec-2021) Checksum errors in superblock! Retrying... Reason is ext4_orphan_cleanup will reset ‘s_last_orphan’ but not update super block checksum. To solve above issue, defer update super block checksum after ext4_orphan_cleanup. Signed-off-by: NYe Bin <yebin10@huawei.com> Cc: stable@kernel.org Reviewed-by: NJan Kara <jack@suse.cz> Reviewed-by: NRitesh Harjani <ritesh.list@gmail.com> Link: https://lore.kernel.org/r/20220525012904.1604737-1-yebin10@huawei.comSigned-off-by: NTheodore Ts'o <tytso@mit.edu> Signed-off-by: NBaokun Li <libaokun1@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Kiselev, Oleg 提交于
stable inclusion from stable-v5.10.138 commit 80288883294c5b4ed18bae0d8bd9c4a12f297074 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I60QFD Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=80288883294c5b4ed18bae0d8bd9c4a12f297074 -------------------------------- [ Upstream commit 69cb8e9d ] This patch avoids an attempt to resize the filesystem to an unaligned cluster boundary. An online resize to a size that is not integral to cluster size results in the last iteration attempting to grow the fs by a negative amount, which trips a BUG_ON and leaves the fs with a corrupted in-memory superblock. Signed-off-by: NOleg Kiselev <okiselev@amazon.com> Link: https://lore.kernel.org/r/0E92A0AB-4F16-4F1A-94B7-702CC6504FDE@amazon.comSigned-off-by: NTheodore Ts'o <tytso@mit.edu> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Reviewed-by: NWei Li <liwei391@huawei.com>
-
由 Ye Bin 提交于
stable inclusion from stable-v5.10.138 commit 285447b81925ff785533c677d552c97818081181 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I60QFD Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=285447b81925ff785533c677d552c97818081181 -------------------------------- [ Upstream commit b24e77ef ] Now if check directoy entry is corrupted, ext4_empty_dir may return true then directory will be removed when file system mounted with "errors=continue". In order not to make things worse just return false when directory is corrupted. Signed-off-by: NYe Bin <yebin10@huawei.com> Reviewed-by: NJan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20220622090223.682234-1-yebin10@huawei.comSigned-off-by: NTheodore Ts'o <tytso@mit.edu> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Reviewed-by: NWei Li <liwei391@huawei.com>
-
- 18 11月, 2022 3 次提交
-
-
由 Eric Whitney 提交于
stable inclusion from stable-v5.10.137 commit e8c747496f23e2cf152899e35de2f25ce647d72b category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I60PLB Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=e8c747496f23e2cf152899e35de2f25ce647d72b -------------------------------- commit 7f0d8e1d upstream. A race can occur in the unlikely event ext4 is unable to allocate a physical cluster for a delayed allocation in a bigalloc file system during writeback. Failure to allocate a cluster forces error recovery that includes a call to mpage_release_unused_pages(). That function removes any corresponding delayed allocated blocks from the extent status tree. If a new delayed write is in progress on the same cluster simultaneously, resulting in the addition of an new extent containing one or more blocks in that cluster to the extent status tree, delayed block accounting can be thrown off if that delayed write then encounters a similar cluster allocation failure during future writeback. Write lock the i_data_sem in mpage_release_unused_pages() to fix this problem. Ext4's block/cluster accounting code for bigalloc relies on i_data_sem for mutual exclusion, as is found in the delayed write path, and the locking in mpage_release_unused_pages() is missing. Cc: stable@kernel.org Reported-by: NYe Bin <yebin10@huawei.com> Signed-off-by: NEric Whitney <enwlinux@gmail.com> Link: https://lore.kernel.org/r/20220615160530.1928801-1-enwlinux@gmail.comSigned-off-by: NTheodore Ts'o <tytso@mit.edu> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Reviewed-by: NWei Li <liwei391@huawei.com>
-
由 Theodore Ts'o 提交于
stable inclusion from stable-v5.10.137 commit ac8cc061145a54ff8d4e0f17f19f0200aabc21ff category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I60PLB Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=ac8cc061145a54ff8d4e0f17f19f0200aabc21ff -------------------------------- commit de394a86 upstream. When doing an online resize, the on-disk superblock on-disk wasn't updated. This means that when the file system is unmounted and remounted, and the on-disk overhead value is non-zero, this would result in the results of statfs(2) to be incorrect. This was partially fixed by Commits 10b01ee9 ("ext4: fix overhead calculation to account for the reserved gdt blocks"), 85d825db ("ext4: force overhead calculation if the s_overhead_cluster makes no sense"), and eb705421 ("ext4: update the cached overhead value in the superblock"). However, since it was too expensive to forcibly recalculate the overhead for bigalloc file systems at every mount, this didn't fix the problem for bigalloc file systems. This commit should address the problem when resizing file systems with the bigalloc feature enabled. Signed-off-by: NTheodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Reviewed-by: NAndreas Dilger <adilger@dilger.ca> Link: https://lore.kernel.org/r/20220629040026.112371-1-tytso@mit.eduSigned-off-by: NTheodore Ts'o <tytso@mit.edu> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Reviewed-by: NWei Li <liwei391@huawei.com>
-
由 Ye Bin 提交于
stable inclusion from stable-v5.10.137 commit e1682c7171a6c0ff576fe8116b8cba5b8f538b94 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I60PLB Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=e1682c7171a6c0ff576fe8116b8cba5b8f538b94 -------------------------------- commit 51ae846c upstream. We got issue as follows: ------------[ cut here ]------------ WARNING: CPU: 3 PID: 9310 at fs/ext4/inode.c:3441 ext4_iomap_begin+0x182/0x5d0 RIP: 0010:ext4_iomap_begin+0x182/0x5d0 RSP: 0018:ffff88812460fa08 EFLAGS: 00010293 RAX: ffff88811f168000 RBX: 0000000000000000 RCX: ffffffff97793c12 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000003 RBP: ffff88812c669160 R08: ffff88811f168000 R09: ffffed10258cd20f R10: ffff88812c669077 R11: ffffed10258cd20e R12: 0000000000000001 R13: 00000000000000a4 R14: 000000000000000c R15: ffff88812c6691ee FS: 00007fd0d6ff3740(0000) GS:ffff8883af180000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fd0d6dda290 CR3: 0000000104a62000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: iomap_apply+0x119/0x570 iomap_bmap+0x124/0x150 ext4_bmap+0x14f/0x250 bmap+0x55/0x80 do_vfs_ioctl+0x952/0xbd0 __x64_sys_ioctl+0xc6/0x170 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Above issue may happen as follows: bmap write bmap ext4_bmap iomap_bmap ext4_iomap_begin ext4_file_write_iter ext4_buffered_write_iter generic_perform_write ext4_da_write_begin ext4_da_write_inline_data_begin ext4_prepare_inline_data ext4_create_inline_data ext4_set_inode_flag(inode, EXT4_INODE_INLINE_DATA); if (WARN_ON_ONCE(ext4_has_inline_data(inode))) ->trigger bug_on To solved above issue hold inode lock in ext4_bamp. Signed-off-by: NYe Bin <yebin10@huawei.com> Link: https://lore.kernel.org/r/20220617013935.397596-1-yebin10@huawei.comSigned-off-by: NTheodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Reviewed-by: NWei Li <liwei391@huawei.com>
-