提交 · 358a5fb928a9df8b51ec2fcbc7a2272fd2111011 · openeuler / Kernel

29 3月, 2023 2 次提交

ext4: make sure fs error flag setted before clear journal error · 6546b303

由 Ye Bin 提交于 3月 29, 2023

mainline inclusion
from mainline-v6.3-rc2
commit f57886ca
category: bugfix
bugzilla: 188471,https://gitee.com/openeuler/kernel/issues/I6MR1V
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f57886ca1606ba74cc4ec4eb5cbf073934ffa559

--------------------------------

Now, jounral error number maybe cleared even though ext4_commit_super()
failed. This may lead to error flag miss, then fsck will miss to check
file system deeply.
Signed-off-by: NYe Bin <yebin10@huawei.com>
Reviewed-by: NJan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20230307061703.245965-3-yebin@huaweicloud.comSigned-off-by: NBaokun Li <libaokun1@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Reviewed-by: NYang Erkun <yangerkun@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

6546b303

ext4: commit super block if fs record error when journal record without error · 04933ef6

由 Ye Bin 提交于 3月 29, 2023

mainline inclusion
from mainline-v6.3-rc2
commit eee00237
category: bugfix
bugzilla: 188471,https://gitee.com/openeuler/kernel/issues/I6MR1V
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=eee00237fa5ec8f704f7323b54e48cc34e2d9168

--------------------------------

Now, 'es->s_state' maybe covered by recover journal. And journal errno
maybe not recorded in journal sb as IO error. ext4_update_super() only
update error information when 'sbi->s_add_error_count' large than zero.
Then 'EXT4_ERROR_FS' flag maybe lost.
To solve above issue just recover 'es->s_state' error flag after journal
replay like error info.
Signed-off-by: NYe Bin <yebin10@huawei.com>
Reviewed-by: NBaokun Li <libaokun1@huawei.com>
Reviewed-by: NJan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20230307061703.245965-2-yebin@huaweicloud.comSigned-off-by: NBaokun Li <libaokun1@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Reviewed-by: NYang Erkun <yangerkun@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

04933ef6

15 3月, 2023 1 次提交

ext4: fix incorrect options show of original mount_opt and extend mount_opt2 · 2b59ae5e

由 Zhang Yi 提交于 3月 15, 2023

maillist inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I6D5XF

Reference: https://lore.kernel.org/linux-ext4/20230130111138.76tp6pij3yhh4brh@quack3/T/#t

--------------------------------

Current _ext4_show_options() do not distinguish MOPT_2 flag, so it mixed
extend sbi->s_mount_opt2 options with sbi->s_mount_opt, it could lead to
show incorrect options, e.g. show fc_debug_force if we mount with
errors=continue mode and miss it if we set.

  $ mkfs.ext4 /dev/pmem0
  $ mount -o errors=remount-ro /dev/pmem0 /mnt
  $ cat /proc/fs/ext4/pmem0/options | grep fc_debug_force
    #empty
  $ mount -o remount,errors=continue /mnt
  $ cat /proc/fs/ext4/pmem0/options | grep fc_debug_force
    fc_debug_force
  $ mount -o remount,errors=remount-ro,fc_debug_force /mnt
  $ cat /proc/fs/ext4/pmem0/options | grep fc_debug_force
    #empty

Fixes: 995a3ed6 ("ext4: add fast_commit feature and handling for extended mount options")
Signed-off-by: NZhang Yi <yi.zhang@huawei.com>

Conflict:
  fs/ext4/super.c
Reviewed-by: NZhihao Cheng <chengzhihao1@huawei.com>
Reviewed-by: NZhang Xiaoxu <zhangxiaoxu5@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

2b59ae5e

08 2月, 2023 1 次提交

ext4: fix null-ptr-deref in ext4_write_info · 1575b7b3

由 Baokun Li 提交于 2月 07, 2023

stable inclusion
from stable-v5.10.150
commit f34ab95162763cd7352f46df169296eec28b688d
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I6BJ5F
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=f34ab95162763cd7352f46df169296eec28b688d

--------------------------------

commit f9c1f248 upstream.

I caught a null-ptr-deref bug as follows:

==================================================================
KASAN: null-ptr-deref in range [0x0000000000000068-0x000000000000006f]
CPU: 1 PID: 1589 Comm: umount Not tainted 5.10.0-02219-dirty #339
RIP: 0010:ext4_write_info+0x53/0x1b0
[...]
Call Trace:
 dquot_writeback_dquots+0x341/0x9a0
 ext4_sync_fs+0x19e/0x800
 __sync_filesystem+0x83/0x100
 sync_filesystem+0x89/0xf0
 generic_shutdown_super+0x79/0x3e0
 kill_block_super+0xa1/0x110
 deactivate_locked_super+0xac/0x130
 deactivate_super+0xb6/0xd0
 cleanup_mnt+0x289/0x400
 __cleanup_mnt+0x16/0x20
 task_work_run+0x11c/0x1c0
 exit_to_user_mode_prepare+0x203/0x210
 syscall_exit_to_user_mode+0x5b/0x3a0
 do_syscall_64+0x59/0x70
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
 ==================================================================

Above issue may happen as follows:
-------------------------------------
exit_to_user_mode_prepare
 task_work_run
  __cleanup_mnt
   cleanup_mnt
    deactivate_super
     deactivate_locked_super
      kill_block_super
       generic_shutdown_super
        shrink_dcache_for_umount
         dentry = sb->s_root
         sb->s_root = NULL              <--- Here set NULL
        sync_filesystem
         __sync_filesystem
          sb->s_op->sync_fs > ext4_sync_fs
           dquot_writeback_dquots
            sb->dq_op->write_info > ext4_write_info
             ext4_journal_start(d_inode(sb->s_root), EXT4_HT_QUOTA, 2)
              d_inode(sb->s_root)
               s_root->d_inode          <--- Null pointer dereference

To solve this problem, we use ext4_journal_start_sb directly
to avoid s_root being used.

Cc: stable@kernel.org
Signed-off-by: NBaokun Li <libaokun1@huawei.com>
Reviewed-by: NJan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20220805123947.565152-1-libaokun1@huawei.comSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NBaokun Li <libaokun1@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

1575b7b3

07 12月, 2022 1 次提交

ext4: fix super block checksum incorrect after mount · e3682bb9

由 Ye Bin 提交于 12月 07, 2022

mainline inclusion
from mainline-v5.19-rc3
commit 9b6641dd
category: bugfix
bugzilla: 186927, https://gitee.com/src-openeuler/kernel/issues/I5YIY6
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9b6641dd95a0c441b277dd72ba22fed8d61f76ad

--------------------------------

We got issue as follows:
[home]# mount  /dev/sda  test
EXT4-fs (sda): warning: mounting fs with errors, running e2fsck is recommended
[home]# dmesg
EXT4-fs (sda): warning: mounting fs with errors, running e2fsck is recommended
EXT4-fs (sda): Errors on filesystem, clearing orphan list.
EXT4-fs (sda): recovery complete
EXT4-fs (sda): mounted filesystem with ordered data mode. Quota mode: none.
[home]# debugfs /dev/sda
debugfs 1.46.5 (30-Dec-2021)
Checksum errors in superblock!  Retrying...

Reason is ext4_orphan_cleanup will reset ‘s_last_orphan’ but not update
super block checksum.

To solve above issue, defer update super block checksum after
ext4_orphan_cleanup.
Signed-off-by: NYe Bin <yebin10@huawei.com>
Cc: stable@kernel.org
Reviewed-by: NJan Kara <jack@suse.cz>
Reviewed-by: NRitesh Harjani <ritesh.list@gmail.com>
Link: https://lore.kernel.org/r/20220525012904.1604737-1-yebin10@huawei.comSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
Signed-off-by: NBaokun Li <libaokun1@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

e3682bb9

02 11月, 2022 1 次提交

ext4: add helper to check quota inums · 931dd258

由 Baokun Li 提交于 11月 02, 2022

hulk inclusion
category: bugfix
bugzilla: 187821,https://gitee.com/openeuler/kernel/issues/I5X9U0
CVE: NA

--------------------------------

Before quota is enabled, a check on the preset quota inums in
ext4_super_block is added to prevent wrong quota inodes from being loaded.
In addition, when the quota fails to be enabled, the quota type and quota
inum are printed to facilitate fault locating.
Signed-off-by: NBaokun Li <libaokun1@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

931dd258

18 10月, 2022 1 次提交

ext4: ext4_read_bh_lock() should submit IO if the buffer isn't uptodate · 014ed988

由 Zhang Yi 提交于 10月 18, 2022

mainline inclusion
from mainline-v6.1-rc1
commit 0b73284c
category: bugfix
bugzilla: 187414, https://gitee.com/openeuler/kernel/issues/I5W498
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0b73284c564d3ae4feef4bc920292f004acf4980

--------------------------------

Recently we notice that ext4 filesystem would occasionally fail to read
metadata from disk and report error message, but the disk and block
layer looks fine. After analyse, we lockon commit 88dbcbb3
("blkdev: avoid migration stalls for blkdev pages"). It provide a
migration method for the bdev, we could move page that has buffers
without extra users now, but it lock the buffers on the page, which
breaks the fragile metadata read operation on ext4 filesystem,
ext4_read_bh_lock() was copied from ll_rw_block(), it depends on the
assumption of that locked buffer means it is under IO. So it just
trylock the buffer and skip submit IO if it lock failed, after
wait_on_buffer() we conclude IO error because the buffer is not
uptodate.

This issue could be easily reproduced by add some delay just after
buffer_migrate_lock_buffers() in __buffer_migrate_folio() and do
fsstress on ext4 filesystem.

  EXT4-fs error (device pmem1): __ext4_find_entry:1658: inode #73193:
  comm fsstress: reading directory lblock 0
  EXT4-fs error (device pmem1): __ext4_find_entry:1658: inode #75334:
  comm fsstress: reading directory lblock 0

Fix it by removing the trylock logic in ext4_read_bh_lock(), just lock
the buffer and submit IO if it's not uptodate, and also leave over
readahead helper.

Cc: stable@kernel.org
Signed-off-by: NZhang Yi <yi.zhang@huawei.com>
Reviewed-by: NJan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20220831074629.3755110-1-yi.zhang@huawei.comSigned-off-by: NTheodore Ts'o <tytso@mit.edu>

Conflict:
  fs/ext4/super.c
Signed-off-by: NZhang Yi <yi.zhang@huawei.com>
Reviewed-by: NZhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

014ed988

27 9月, 2022 3 次提交

ext4: only allow test_dummy_encryption when supported · b5c0d4fa

由 Eric Biggers 提交于 9月 27, 2022

stable inclusion
from stable-v5.10.121
commit a67100f42665cf7a5ed7821376140f62def0d31e
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5L6CQ

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=a67100f42665cf7a5ed7821376140f62def0d31e

--------------------------------

commit 5f41fdae upstream.

Make the test_dummy_encryption mount option require that the encrypt
feature flag be already enabled on the filesystem, rather than
automatically enabling it.  Practically, this means that "-O encrypt"
will need to be included in MKFS_OPTIONS when running xfstests with the
test_dummy_encryption mount option.  (ext4/053 also needs an update.)

Moreover, as long as the preconditions for test_dummy_encryption are
being tightened anyway, take the opportunity to start rejecting it when
!CONFIG_FS_ENCRYPTION rather than ignoring it.

The motivation for requiring the encrypt feature flag is that:

- Having the filesystem auto-enable feature flags is problematic, as it
  bypasses the usual sanity checks.  The specific issue which came up
  recently is that in kernel versions where ext4 supports casefold but
  not encrypt+casefold (v5.1 through v5.10), the kernel will happily add
  the encrypt flag to a filesystem that has the casefold flag, making it
  unmountable -- but only for subsequent mounts, not the initial one.
  This confused the casefold support detection in xfstests, causing
  generic/556 to fail rather than be skipped.

- The xfstests-bld test runners (kvm-xfstests et al.) already use the
  required mkfs flag, so they will not be affected by this change.  Only
  users of test_dummy_encryption alone will be affected.  But, this
  option has always been for testing only, so it should be fine to
  require that the few users of this option update their test scripts.

- f2fs already requires it (for its equivalent feature flag).
Signed-off-by: NEric Biggers <ebiggers@google.com>
Reviewed-by: NGabriel Krisman Bertazi <krisman@collabora.com>
Link: https://lore.kernel.org/r/20220519204437.61645-1-ebiggers@kernel.orgSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

 Conflicts:
	fs/ext4/super.c
Reviewed-by: NXie XiuQi <xiexiuqi@huawei.com>
Acked-by: NXie XiuQi <xiexiuqi@huawei.com>

b5c0d4fa

ext4: filter out EXT4_FC_REPLAY from on-disk superblock field s_state · 4c575e85

由 Theodore Ts'o 提交于 9月 27, 2022

stable inclusion
from stable-v5.10.121
commit cc5b09cb6dacd4b32640537929ab4ee8fb2b9e04
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5L6CQ

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=cc5b09cb6dacd4b32640537929ab4ee8fb2b9e04

--------------------------------

commit c878bea3 upstream.

The EXT4_FC_REPLAY bit in sbi->s_mount_state is used to indicate that
we are in the middle of replay the fast commit journal.  This was
actually a mistake, since the sbi->s_mount_info is initialized from
es->s_state.  Arguably s_mount_state is misleadingly named, but the
name is historical --- s_mount_state and s_state dates back to ext2.

What should have been used is the ext4_{set,clear,test}_mount_flag()
inline functions, which sets EXT4_MF_* bits in sbi->s_mount_flags.

The problem with using EXT4_FC_REPLAY is that a maliciously corrupted
superblock could result in EXT4_FC_REPLAY getting set in
s_mount_state.  This bypasses some sanity checks, and this can trigger
a BUG() in ext4_es_cache_extent().  As a easy-to-backport-fix, filter
out the EXT4_FC_REPLAY bit for now.  We should eventually transition
away from EXT4_FC_REPLAY to something like EXT4_MF_REPLAY.

Cc: stable@kernel.org
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Link: https://lore.kernel.org/r/20220420192312.1655305-1-phind.uet@gmail.com
Link: https://lore.kernel.org/r/20220517174028.942119-1-tytso@mit.edu
Reported-by: syzbot+c7358a3cd05ee786eb31@syzkaller.appspotmail.com
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
Acked-by: NXie XiuQi <xiexiuqi@huawei.com>

4c575e85

ext4: reject the 'commit' option on ext2 filesystems · 7e6681f6

由 Eric Biggers 提交于 9月 27, 2022

stable inclusion
from stable-v5.10.121
commit d54ac6ca48c1fa44868faffcba7d8aeab8937f2e
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5L6CQ

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=d54ac6ca48c1fa44868faffcba7d8aeab8937f2e

--------------------------------

[ Upstream commit cb8435dc ]

The 'commit' option is only applicable for ext3 and ext4 filesystems,
and has never been accepted by the ext2 filesystem driver, so the ext4
driver shouldn't allow it on ext2 filesystems.

This fixes a failure in xfstest ext4/053.

Fixes: 8dc0aa8c ("ext4: check incompatible mount options while mounting ext2/3")
Signed-off-by: NEric Biggers <ebiggers@google.com>
Reviewed-by: NRitesh Harjani <ritesh.list@gmail.com>
Reviewed-by: NLukas Czerner <lczerner@redhat.com>
Link: https://lore.kernel.org/r/20220510183232.172615-1-ebiggers@kernel.orgSigned-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
Acked-by: NXie XiuQi <xiexiuqi@huawei.com>

7e6681f6

26 7月, 2022 2 次提交

ext4: force overhead calculation if the s_overhead_cluster makes no sense · ec82895d

由 Theodore Ts'o 提交于 7月 26, 2022

stable inclusion
from stable-v5.10.113
commit e1e96e37272156d691203a3725b876787f38c8f2
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5ISAH

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=e1e96e37272156d691203a3725b876787f38c8f2

--------------------------------

commit 85d825db upstream.

If the file system does not use bigalloc, calculating the overhead is
cheap, so force the recalculation of the overhead so we don't have to
trust the precalculated overhead in the superblock.
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
Acked-by: NXie XiuQi <xiexiuqi@huawei.com>

ec82895d

ext4: fix overhead calculation to account for the reserved gdt blocks · 9f203ffb

由 Theodore Ts'o 提交于 7月 26, 2022

stable inclusion
from stable-v5.10.113
commit 4789149b9ea2a1893c62d816742f1a76514fc901
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5ISAH

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=4789149b9ea2a1893c62d816742f1a76514fc901

--------------------------------

commit 10b01ee9 upstream.

The kernel calculation was underestimating the overhead by not taking
into account the reserved gdt blocks.  With this change, the overhead
calculated by the kernel matches the overhead calculation in mke2fs.
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
Acked-by: NXie XiuQi <xiexiuqi@huawei.com>

9f203ffb

13 7月, 2022 1 次提交

fs: introduce a wrapper uuid_to_fsid() · 3bfbb0e9

由 Amir Goldstein 提交于 7月 13, 2022

mainline inclusion
from mainline-5.13-rc1
commit 9591c3a3
category: feature
bugzilla: 187162, https://gitee.com/openeuler/kernel/issues/I5FYKV
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=9591c3a34f7722bd77f42c98d76fd5a5bad465f0

-------------------------------------------------

Some filesystem's use a digest of their uuid for f_fsid.
Create a simple wrapper for this open coded folding.

Filesystems that have a non null uuid but use the block device
number for f_fsid may also consider using this helper.

[JK: Added missing asm/byteorder.h include]
Link: https://lore.kernel.org/r/20210322173944.449469-2-amir73il@gmail.comAcked-by: NDamien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: NChristian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: NAmir Goldstein <amir73il@gmail.com>
Signed-off-by: NJan Kara <jack@suse.cz>
Signed-off-by: NBaokun Li <libaokun1@huawei.com>
Reviewed-by: NZhihao Cheng <chengzhihao1@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

3bfbb0e9

23 5月, 2022 1 次提交

ext4: fix warning when submitting superblock in ext4_commit_super() · 064a71f5

由 Zhang Yi 提交于 5月 21, 2022

hulk inclusion
category: bugfix
bugzilla: 186737, https://gitee.com/openeuler/kernel/issues/I58COJ
CVE: NA

--------------------------------

We have already check the io_error and uptodate flag before submitting
the superblock buffer, and re-set the uptodate flag if it has been
failed to write out. But it was lockless and could be raced by another
ext4_commit_super(), and finally trigger '!uptodate' WARNING when
marking buffer dirty. Fix it by submit buffer directly.
Signed-off-by: NZhang Yi <yi.zhang@huawei.com>
Signed-off-by: NZhihao Cheng <chengzhihao1@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

064a71f5

12 5月, 2022 1 次提交

ext4: fix bug_on in start_this_handle during umount filesystem · fc4bac49

由 Ye Bin 提交于 5月 10, 2022

mainline inclusion
from mainline-v5.18-rc4
commit b98535d0
category: bugfix
bugzilla: 186675, https://gitee.com/openeuler/kernel/issues/I55TUC
CVE: NA

-------------------------------------------------

We got issue as follows:
------------[ cut here ]------------
kernel BUG at fs/jbd2/transaction.c:389!
invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
CPU: 9 PID: 131 Comm: kworker/9:1 Not tainted 5.17.0-862.14.0.6.x86_64-00001-g23f87daf7d74-dirty #197
Workqueue: events flush_stashed_error_work
RIP: 0010:start_this_handle+0x41c/0x1160
RSP: 0018:ffff888106b47c20 EFLAGS: 00010202
RAX: ffffed10251b8400 RBX: ffff888128dc204c RCX: ffffffffb52972ac
RDX: 0000000000000200 RSI: 0000000000000004 RDI: ffff888128dc2050
RBP: 0000000000000039 R08: 0000000000000001 R09: ffffed10251b840a
R10: ffff888128dc204f R11: ffffed10251b8409 R12: ffff888116d78000
R13: 0000000000000000 R14: dffffc0000000000 R15: ffff888128dc2000
FS:  0000000000000000(0000) GS:ffff88839d680000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000001620068 CR3: 0000000376c0e000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 jbd2__journal_start+0x38a/0x790
 jbd2_journal_start+0x19/0x20
 flush_stashed_error_work+0x110/0x2b3
 process_one_work+0x688/0x1080
 worker_thread+0x8b/0xc50
 kthread+0x26f/0x310
 ret_from_fork+0x22/0x30
 </TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---

Above issue may happen as follows:
      umount            read procfs            error_work
ext4_put_super
  flush_work(&sbi->s_error_work);

                      ext4_mb_seq_groups_show
	                ext4_mb_load_buddy_gfp
			  ext4_mb_init_group
			    ext4_mb_init_cache
	                      ext4_read_block_bitmap_nowait
			        ext4_validate_block_bitmap
				  ext4_error
			            ext4_handle_error
			              schedule_work(&EXT4_SB(sb)->s_error_work);

  ext4_unregister_sysfs(sb);
  jbd2_journal_destroy(sbi->s_journal);
    journal_kill_thread
      journal->j_flags |= JBD2_UNMOUNT;

                                          flush_stashed_error_work
				            jbd2_journal_start
					      start_this_handle
					        BUG_ON(journal->j_flags & JBD2_UNMOUNT);

To solve this issue, we call 'ext4_unregister_sysfs() before flushing
s_error_work in ext4_put_super().
Signed-off-by: NYe Bin <yebin10@huawei.com>
Reviewed-by: NJan Kara <jack@suse.cz>
Reviewed-by: NRitesh Harjani <riteshh@linux.ibm.com>
Link: https://lore.kernel.org/r/20220322012419.725457-1-yebin10@huawei.comSigned-off-by: NTheodore Ts'o <tytso@mit.edu>

conflicts:
	fs/ext4/super.c
Signed-off-by: NChenXiaoSong <chenxiaosong2@huawei.com>
Reviewed-by: Nyebin <yebin10@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

fc4bac49

27 4月, 2022 3 次提交

ext4: destroy ext4_fc_dentry_cachep kmemcache on module removal · 2ef3716f

由 Sebastian Andrzej Siewior 提交于 4月 27, 2022

stable inclusion
from stable-v5.10.94
commit d60e9daba29e44e0f277333e46fff90c74509398
bugzilla: https://gitee.com/openeuler/kernel/issues/I531X9

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=d60e9daba29e44e0f277333e46fff90c74509398

--------------------------------

commit ab047d51 upstream.

The kmemcache for ext4_fc_dentry_cachep remains registered after module
removal.

Destroy ext4_fc_dentry_cachep kmemcache on module removal.

Fixes: aa75f4d3 ("ext4: main fast-commit commit path")
Signed-off-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: NLukas Czerner <lczerner@redhat.com>
Reviewed-by: NHarshad Shirwadkar <harshadshirwadkar@gmail.com>
Link: https://lore.kernel.org/r/20211110134640.lyku5vklvdndw6uk@linutronix.de
Link: https://lore.kernel.org/r/YbiK3JetFFl08bd7@linutronix.de
Link: https://lore.kernel.org/r/20211223164436.2628390-1-bigeasy@linutronix.deSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
Acked-by: NXie XiuQi <xiexiuqi@huawei.com>

2ef3716f

ext4: make sure quota gets properly shutdown on error · 9359c4a4

由 Jan Kara 提交于 4月 27, 2022

stable inclusion
from stable-v5.10.94
commit 115b762b48ab83de2898b8c1a38e3799446a97af
bugzilla: https://gitee.com/openeuler/kernel/issues/I531X9

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=115b762b48ab83de2898b8c1a38e3799446a97af

--------------------------------

commit 15fc69bb upstream.

When we hit an error when enabling quotas and setting inode flags, we do
not properly shutdown quota subsystem despite returning error from
Q_QUOTAON quotactl. This can lead to some odd situations like kernel
using quota file while it is still writeable for userspace. Make sure we
properly cleanup the quota subsystem in case of error.
Signed-off-by: NJan Kara <jack@suse.cz>
Cc: stable@kernel.org
Link: https://lore.kernel.org/r/20211007155336.12493-2-jack@suse.czSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
Acked-by: NXie XiuQi <xiexiuqi@huawei.com>

9359c4a4

ext4: make sure to reset inode lockdep class when quota enabling fails · 5b70c51a

由 Jan Kara 提交于 4月 27, 2022

stable inclusion
from stable-v5.10.94
commit 762e4c33e9e5ecdcfedb1752e38c2aac2921df2e
bugzilla: https://gitee.com/openeuler/kernel/issues/I531X9

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=762e4c33e9e5ecdcfedb1752e38c2aac2921df2e

--------------------------------

commit 4013d47a upstream.

When we succeed in enabling some quota type but fail to enable another
one with quota feature, we correctly disable all enabled quota types.
However we forget to reset i_data_sem lockdep class. When the inode gets
freed and reused, it will inherit this lockdep class (i_data_sem is
initialized only when a slab is created) and thus eventually lockdep
barfs about possible deadlocks.

Reported-and-tested-by: syzbot+3b6f9218b1301ddda3e2@syzkaller.appspotmail.com
Signed-off-by: NJan Kara <jack@suse.cz>
Cc: stable@kernel.org
Link: https://lore.kernel.org/r/20211007155336.12493-3-jack@suse.czSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
Acked-by: NXie XiuQi <xiexiuqi@huawei.com>

5b70c51a

02 3月, 2022 1 次提交

ext4: convert DIV_ROUND_UP to DIV_ROUND_UP_ULL · 616abaea

由 Zhang Yi 提交于 3月 02, 2022

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4VAQC?from=project-issue
CVE: NA

--------------------------------

Fix a compile error on arm32 architecture.

 build failed: arm, allmodconfig

 ERROR: modpost: "__aeabi_ldivmod" [fs/ext4/ext4.ko] undefined!
 make[1]: *** [modules-only.symvers] Error 1
 make[1]: *** Deleting file 'modules-only.symvers'
 make: *** [modules] Error 2

Fixes: 356efe60 ("ext4: fix underflow in ext4_max_bitmap_size()")
Signed-off-by: NZhang Yi <yi.zhang@huawei.com>
Reviewed-by: NYang Erkun <yangerkun@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

616abaea

23 2月, 2022 1 次提交

ext4: fix underflow in ext4_max_bitmap_size() · 356efe60

由 Zhang Yi 提交于 2月 23, 2022

hulk inclusion
category: bugfix
bugzilla: 186216, https://gitee.com/openeuler/kernel/issues/I4PW7R
CVE: NA

--------------------------------

The same to commit 1c2d1421 ("ext2: Fix underflow in ext2_max_size()")
in ext2 filesystem, ext4 driver has the same issue with 64K block size
and ^huge_file, fix this issue the same as ext2. This patch also revert
commit 75ca6ad4 ("ext4: fix loff_t overflow in ext4_max_bitmap_size()")
because it's no longer needed.
Signed-off-by: NZhang Yi <yi.zhang@huawei.com>
Signed-off-by: NBaokun Li <libaokun1@huawei.com>
Reviewed-by: NYang Erkun <yangerkun@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

356efe60

26 1月, 2022 1 次提交

ext4: report error to userspace by netlink · 32174988

由 Zhao Minmin 提交于 1月 26, 2022

hulk inclusion
category: feature
bugzilla: 34592 https://gitee.com/openeuler/kernel/issues/I4RF6M
CVE: NA

-------------------------------------------------

Implement the ext3/ext4 file system error report.

This patch is used to implement abnormal alarm of ext3/ext4 filesystem.
You can archieve this by setting "FILESYSTEM_MONITOR" or "FILESYSTEM_ALARM"
on in configuration file. With this setting, alarm will be raised when
ext3/ext4 file system expection occurs.
Signed-off-by: NZhao Minmin <zhaominmin1@huawei.com>
Reviewed-by: NYi Zhang <yi.zhang@huawei.com>
Link: http://hulk.huawei.com/pipermail/kernel.openeuler/2016-March/009711.htmlSigned-off-by: NWang Hui <john.wanghui@huawei.com>
Signed-off-by: NKefeng Wang <wangkefeng.wang@huawei.com>

[yebin: cherry-pick this patch from openeuler, commit 6636f443]
conflicts :
fs/ext4/super.c
fs/ext4/ext4.h
Signed-off-by: NYe Bin <yebin10@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

32174988

22 1月, 2022 1 次提交

ext4: Fix BUG_ON in ext4_bread when write quota data · a6610eed

由 Ye Bin 提交于 1月 22, 2022

mainline inclusion
from mainline-v5.17-rc1
commit 380a0091
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4RP94?from=project-issue
CVE: NA

--------------------------------

We got issue as follows when run syzkaller:
[  167.936972] EXT4-fs error (device loop0): __ext4_remount:6314: comm rep: Abort forced by user
[  167.938306] EXT4-fs (loop0): Remounting filesystem read-only
[  167.981637] Assertion failure in ext4_getblk() at fs/ext4/inode.c:847: '(EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY) || handle != NULL || create == 0'
[  167.983601] ------------[ cut here ]------------
[  167.984245] kernel BUG at fs/ext4/inode.c:847!
[  167.984882] invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
[  167.985624] CPU: 7 PID: 2290 Comm: rep Tainted: G    B             5.16.0-rc5-next-20211217+ #123
[  167.986823] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014
[  167.988590] RIP: 0010:ext4_getblk+0x17e/0x504
[  167.989189] Code: c6 01 74 28 49 c7 c0 a0 a3 5c 9b b9 4f 03 00 00 48 c7 c2 80 9c 5c 9b 48 c7 c6 40 b6 5c 9b 48 c7 c7 20 a4 5c 9b e8 77 e3 fd ff <0f> 0b 8b 04 244
[  167.991679] RSP: 0018:ffff8881736f7398 EFLAGS: 00010282
[  167.992385] RAX: 0000000000000094 RBX: 1ffff1102e6dee75 RCX: 0000000000000000
[  167.993337] RDX: 0000000000000001 RSI: ffffffff9b6e29e0 RDI: ffffed102e6dee66
[  167.994292] RBP: ffff88816a076210 R08: 0000000000000094 R09: ffffed107363fa09
[  167.995252] R10: ffff88839b1fd047 R11: ffffed107363fa08 R12: ffff88816a0761e8
[  167.996205] R13: 0000000000000000 R14: 0000000000000021 R15: 0000000000000001
[  167.997158] FS:  00007f6a1428c740(0000) GS:ffff88839b000000(0000) knlGS:0000000000000000
[  167.998238] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  167.999025] CR2: 00007f6a140716c8 CR3: 0000000133216000 CR4: 00000000000006e0
[  167.999987] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  168.000944] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  168.001899] Call Trace:
[  168.002235]  <TASK>
[  168.007167]  ext4_bread+0xd/0x53
[  168.007612]  ext4_quota_write+0x20c/0x5c0
[  168.010457]  write_blk+0x100/0x220
[  168.010944]  remove_free_dqentry+0x1c6/0x440
[  168.011525]  free_dqentry.isra.0+0x565/0x830
[  168.012133]  remove_tree+0x318/0x6d0
[  168.014744]  remove_tree+0x1eb/0x6d0
[  168.017346]  remove_tree+0x1eb/0x6d0
[  168.019969]  remove_tree+0x1eb/0x6d0
[  168.022128]  qtree_release_dquot+0x291/0x340
[  168.023297]  v2_release_dquot+0xce/0x120
[  168.023847]  dquot_release+0x197/0x3e0
[  168.024358]  ext4_release_dquot+0x22a/0x2d0
[  168.024932]  dqput.part.0+0x1c9/0x900
[  168.025430]  __dquot_drop+0x120/0x190
[  168.025942]  ext4_clear_inode+0x86/0x220
[  168.026472]  ext4_evict_inode+0x9e8/0xa22
[  168.028200]  evict+0x29e/0x4f0
[  168.028625]  dispose_list+0x102/0x1f0
[  168.029148]  evict_inodes+0x2c1/0x3e0
[  168.030188]  generic_shutdown_super+0xa4/0x3b0
[  168.030817]  kill_block_super+0x95/0xd0
[  168.031360]  deactivate_locked_super+0x85/0xd0
[  168.031977]  cleanup_mnt+0x2bc/0x480
[  168.033062]  task_work_run+0xd1/0x170
[  168.033565]  do_exit+0xa4f/0x2b50
[  168.037155]  do_group_exit+0xef/0x2d0
[  168.037666]  __x64_sys_exit_group+0x3a/0x50
[  168.038237]  do_syscall_64+0x3b/0x90
[  168.038751]  entry_SYSCALL_64_after_hwframe+0x44/0xae

In order to reproduce this problem, the following conditions need to be met:
1. Ext4 filesystem with no journal;
2. Filesystem image with incorrect quota data;
3. Abort filesystem forced by user;
4. umount filesystem;

As in ext4_quota_write:
...
         if (EXT4_SB(sb)->s_journal && !handle) {
                 ext4_msg(sb, KERN_WARNING, "Quota write (off=%llu, len=%llu)"
                         " cancelled because transaction is not started",
                         (unsigned long long)off, (unsigned long long)len);
                 return -EIO;
         }
...
We only check handle if NULL when filesystem has journal. There is need
check handle if NULL even when filesystem has no journal.
Signed-off-by: NYe Bin <yebin10@huawei.com>
Reviewed-by: NJan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20211223015506.297766-1-yebin10@huawei.comSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: NYe Bin <yebin10@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

a6610eed

14 1月, 2022 1 次提交

ext4: always panic when errors=panic is specified · 1f3e6263

由 Ye Bin 提交于 1月 14, 2022

mainline inclusion
from mainline-v5.13-rc1
commit ac2f7ca5
category: bugfix
bugzilla: 182973 https://gitee.com/openeuler/kernel/issues/I4DDEL

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ac2f7ca51b0929461ea49918f27c11b680f28995

-------------------------------------------------

Before commit 014c9caa ("ext4: make ext4_abort() use
__ext4_error()"), the following series of commands would trigger a
panic:

1. mount /dev/sda -o ro,errors=panic test
2. mount /dev/sda -o remount,abort test

After commit 014c9caa, remounting a file system using the test
mount option "abort" will no longer trigger a panic.  This commit will
restore the behaviour immediately before commit 014c9caa.
(However, note that the Linux kernel's behavior has not been
consistent; some previous kernel versions, including 5.4 and 4.19
similarly did not panic after using the mount option "abort".)

This also makes a change to long-standing behaviour; namely, the
following series commands will now cause a panic, when previously it
did not:

1. mount /dev/sda -o ro,errors=panic test
2. echo test > /sys/fs/ext4/sda/trigger_fs_error

However, this makes ext4's behaviour much more consistent, so this is
a good thing.

Cc: stable@kernel.org
Fixes: 014c9caa ("ext4: make ext4_abort() use __ext4_error()")
Signed-off-by: NYe Bin <yebin10@huawei.com>
Link: https://lore.kernel.org/r/20210401081903.3421208-1-yebin10@huawei.comSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
Signed-off-by: NZheng Liang <zhengliang6@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

1f3e6263

06 12月, 2021 1 次提交

ext4: fix lazy initialization next schedule time computation in more granular unit · 3b31367d

由 Shaoying Xu 提交于 12月 06, 2021

stable inclusion
from stable-5.10.80
commit aa21b7e3d3205322d11dbced09941702ef308d8d
bugzilla: 185821 https://gitee.com/openeuler/kernel/issues/I4L7CG

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=aa21b7e3d3205322d11dbced09941702ef308d8d

--------------------------------

commit 39fec688 upstream.

Ext4 file system has default lazy inode table initialization setup once
it is mounted. However, it has issue on computing the next schedule time
that makes the timeout same amount in jiffies but different real time in
secs if with various HZ values. Therefore, fix by measuring the current
time in a more granular unit nanoseconds and make the next schedule time
independent of the HZ value.

Fixes: bfff6873 ("ext4: add support for lazy inode table initialization")
Signed-off-by: NShaoying Xu <shaoyi@amazon.com>
Cc: stable@vger.kernel.org
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Link: https://lore.kernel.org/r/20210902164412.9994-2-shaoyi@amazon.comSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Reviewed-by: NWeilong Chen <chenweilong@huawei.com>
Acked-by: NWeilong Chen <chenweilong@huawei.com>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

3b31367d

30 11月, 2021 1 次提交

ext4: fix e2fsprogs checksum failure for mounted filesystem · b939c55f

由 Jan Kara 提交于 11月 30, 2021

mainline inclusion
from mainline-v5.15
commit b2bbb92f
category: bugfix
bugzilla: 179956 https://gitee.com/openeuler/kernel/issues/I4DDEL

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b2bbb92f7042e8075fb036bf97043339576330c3

-------------------------------------------------

Commit 81414b4d ("ext4: remove redundant sb checksum
recomputation") removed checksum recalculation after updating
superblock free space / inode counters in ext4_fill_super() based on
the fact that we will recalculate the checksum on superblock
writeout.

That is correct assumption but until the writeout happens (which can
take a long time) the checksum is incorrect in the buffer cache and if
programs such as tune2fs or resize2fs is called shortly after a file
system is mounted can fail.  So return back the checksum recalculation
and add a comment explaining why.

Fixes: 81414b4d ("ext4: remove redundant sb checksum recomputation")
Cc: stable@kernel.org
Reported-by: NBoyang Xue <bxue@redhat.com>
Signed-off-by: NJan Kara <jack@suse.cz>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Link: https://lore.kernel.org/r/20210812124737.21981-1-jack@suse.czSigned-off-by: NZheng Liang <zhengliang6@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

b939c55f

15 11月, 2021 2 次提交

ext4: fix reserved space counter leakage · e9bfcc13

由 Jeffle Xu 提交于 11月 15, 2021

stable inclusion
from stable-5.10.71
commit 9ccf35492b084ecbc916761a5c6f42599450f013
bugzilla: 182981 https://gitee.com/openeuler/kernel/issues/I4I3KD

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=9ccf35492b084ecbc916761a5c6f42599450f013

--------------------------------

commit 6fed8395 upstream.

When ext4_insert_delayed block receives and recovers from an error from
ext4_es_insert_delayed_block(), e.g., ENOMEM, it does not release the
space it has reserved for that block insertion as it should. One effect
of this bug is that s_dirtyclusters_counter is not decremented and
remains incorrectly elevated until the file system has been unmounted.
This can result in premature ENOSPC returns and apparent loss of free
space.

Another effect of this bug is that
/sys/fs/ext4/<dev>/delayed_allocation_blocks can remain non-zero even
after syncfs has been executed on the filesystem.

Besides, add check for s_dirtyclusters_counter when inode is going to be
evicted and freed. s_dirtyclusters_counter can still keep non-zero until
inode is written back in .evict_inode(), and thus the check is delayed
to .destroy_inode().

Fixes: 51865fda ("ext4: let ext4 maintain extent status tree")
Cc: stable@kernel.org
Suggested-by: NGao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: NJeffle Xu <jefflexu@linux.alibaba.com>
Reviewed-by: NEric Whitney <enwlinux@gmail.com>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Link: https://lore.kernel.org/r/20210823061358.84473-1-jefflexu@linux.alibaba.comSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Acked-by: NWeilong Chen <chenweilong@huawei.com>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

e9bfcc13

ext4: fix loff_t overflow in ext4_max_bitmap_size() · 9614189c

由 Ritesh Harjani 提交于 11月 15, 2021

stable inclusion
from stable-5.10.71
commit d11502fa26913f84883438c55fab0a8cb0d11474
bugzilla: 182981 https://gitee.com/openeuler/kernel/issues/I4I3KD

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=d11502fa26913f84883438c55fab0a8cb0d11474

--------------------------------

commit 75ca6ad4 upstream.

We should use unsigned long long rather than loff_t to avoid
overflow in ext4_max_bitmap_size() for comparison before returning.
w/o this patch sbi->s_bitmap_maxbytes was becoming a negative
value due to overflow of upper_limit (with has_huge_files as true)

Below is a quick test to trigger it on a 64KB pagesize system.

sudo mkfs.ext4 -b 65536 -O ^has_extents,^64bit /dev/loop2
sudo mount /dev/loop2 /mnt
sudo echo "hello" > /mnt/hello 	-> This will error out with
				"echo: write error: File too large"
Signed-off-by: NRitesh Harjani <riteshh@linux.ibm.com>
Reviewed-by: NJan Kara <jack@suse.cz>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Link: https://lore.kernel.org/r/594f409e2c543e90fd836b78188dfa5c575065ba.1622867594.git.riteshh@linux.ibm.comSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Acked-by: NWeilong Chen <chenweilong@huawei.com>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

9614189c

21 10月, 2021 1 次提交

ext4: flush s_error_work before journal destroy in ext4_fill_super · bb39867f

由 yangerkun 提交于 10月 21, 2021

mainline inclusion
from mainline-v5.15-rc4
commit bb9464e0
category: bugfix
bugzilla: 176737 https://gitee.com/openeuler/kernel/issues/I4DDEL
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=bb9464e08309f6befe80866f5be51778ca355ee9

---------------------------

The error path in ext4_fill_super forget to flush s_error_work before
journal destroy, and it may trigger the follow bug since
flush_stashed_error_work can run concurrently with journal destroy
without any protection for sbi->s_journal.

[32031.740193] EXT4-fs (loop66): get root inode failed
[32031.740484] EXT4-fs (loop66): mount failed
[32031.759805] ------------[ cut here ]------------
[32031.759807] kernel BUG at fs/jbd2/transaction.c:373!
[32031.760075] invalid opcode: 0000 [#1] SMP PTI
[32031.760336] CPU: 5 PID: 1029268 Comm: kworker/5:1 Kdump: loaded
4.18.0
[32031.765112] Call Trace:
[32031.765375]  ? __switch_to_asm+0x35/0x70
[32031.765635]  ? __switch_to_asm+0x41/0x70
[32031.765893]  ? __switch_to_asm+0x35/0x70
[32031.766148]  ? __switch_to_asm+0x41/0x70
[32031.766405]  ? _cond_resched+0x15/0x40
[32031.766665]  jbd2__journal_start+0xf1/0x1f0 [jbd2]
[32031.766934]  jbd2_journal_start+0x19/0x20 [jbd2]
[32031.767218]  flush_stashed_error_work+0x30/0x90 [ext4]
[32031.767487]  process_one_work+0x195/0x390
[32031.767747]  worker_thread+0x30/0x390
[32031.768007]  ? process_one_work+0x390/0x390
[32031.768265]  kthread+0x10d/0x130
[32031.768521]  ? kthread_flush_work_fn+0x10/0x10
[32031.768778]  ret_from_fork+0x35/0x40

static int start_this_handle(...)
    BUG_ON(journal->j_flags & JBD2_UNMOUNT); <---- Trigger this

Besides, after we enable fast commit, ext4_fc_replay can add work to
s_error_work but return success, so the latter journal destroy in
ext4_load_journal can trigger this problem too.

Fix this problem with two steps:
1. Call ext4_commit_super directly in ext4_handle_error for the case
   that called from ext4_fc_replay
2. Since it's hard to pair the init and flush for s_error_work, we'd
   better add a extras flush_work before journal destroy in
   ext4_fill_super

Besides, this patch will call ext4_commit_super in ext4_handle_error for
any nojournal case too. But it seems safe since the reason we call
schedule_work was that we should save error info to sb through journal
if available. Conversely, for the nojournal case, it seems useless delay
commit superblock to s_error_work.

Fixes: c92dc856 ("ext4: defer saving error info from atomic context")
Fixes: 2d01ddc8 ("ext4: save error info to sb through journal if available")
Cc: stable@kernel.org
Signed-off-by: Nyangerkun <yangerkun@huawei.com>
Reviewed-by: NJan Kara <jack@suse.cz>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Link: https://lore.kernel.org/r/20210924093917.1953239-1-yangerkun@huawei.comReviewed-by: NZhang Yi <yi.zhang@huawei.com>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

bb39867f

15 10月, 2021 3 次提交

ext4: inline jbd2_journal_[un]register_shrinker() · 3d41354a

由 Theodore Ts'o 提交于 10月 14, 2021

mainline inclusion
from mainline-5.14-rc1
commit 0705e8d1
category: bugfix
bugzilla: 50788 https://gitee.com/openeuler/kernel/issues/I4DDEL
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0705e8d1e2207ceeb83dc6e1751b6b82718b353a

---------------------------

The function jbd2_journal_unregister_shrinker() was getting called
twice when the file system was getting unmounted.  On Power and ARM
platforms this was causing kernel crash when unmounting the file
system, when a percpu_counter was destroyed twice.

Fix this by removing jbd2_journal_[un]register_shrinker() functions,
and inlining the shrinker setup and teardown into
journal_init_common() and jbd2_journal_destroy().  This means that
ext4 and ocfs2 now no longer need to know about registering and
unregistering jbd2's shrinker.

Also, while we're at it, rename the percpu counter from
j_jh_shrink_count to j_checkpoint_jh_count, since this makes it
clearer what this counter is intended to track.

Link: https://lore.kernel.org/r/20210705145025.3363130-1-tytso@mit.edu
Fixes: 4ba3fcdd ("jbd2,ext4: add a shrinker to release checkpointed buffers")
Reported-by: NJon Hunter <jonathanh@nvidia.com>
Reported-by: NSachin Sant <sachinp@linux.vnet.ibm.com>
Tested-by: NSachin Sant <sachinp@linux.vnet.ibm.com>
Tested-by: NJon Hunter <jonathanh@nvidia.com>
Reviewed-by: NJan Kara <jack@suse.cz>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Signed-off-by: NZhang Yi <yi.zhang@huawei.com>
Reviewed-by: NYang Erkun <yangerkun@huawei.com>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

3d41354a

ext4: remove bdev_try_to_free_page() callback · 1a9f8af5

由 Zhang Yi 提交于 10月 14, 2021

mainline inclusion
from mainline-5.14-rc1
commit 3b672e3a
category: bugfix
bugzilla: 50788 https://gitee.com/openeuler/kernel/issues/I4DDEL
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3b672e3aedffc9f092e7e7eae0050a97a8ca508e

---------------------------

After we introduce a jbd2 shrinker to release checkpointed buffer's
journal head, we could free buffer without bdev_try_to_free_page()
under memory pressure. So this patch remove the whole
bdev_try_to_free_page() callback directly. It also remove many
use-after-free issues relate to it together.
Signed-off-by: NZhang Yi <yi.zhang@huawei.com>
Reviewed-by: NJan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20210610112440.3438139-8-yi.zhang@huawei.comSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
Reviewed-by: NYang Erkun <yangerkun@huawei.com>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

1a9f8af5

jbd2,ext4: add a shrinker to release checkpointed buffers · 82c2be92

由 Zhang Yi 提交于 10月 14, 2021

mainline inclusion
from mainline-5.14-rc1
commit 4ba3fcdd
category: bugfix
bugzilla: 50788 https://gitee.com/openeuler/kernel/issues/I4DDEL
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4ba3fcdde7e36af93610ceb3cc38365b14539865

---------------------------

Current metadata buffer release logic in bdev_try_to_free_page() have
a lot of use-after-free issues when umount filesystem concurrently, and
it is difficult to fix directly because ext4 is the only user of
s_op->bdev_try_to_free_page callback and we may have to add more special
refcount or lock that is only used by ext4 into the common vfs layer,
which is unacceptable.

One better solution is remove the bdev_try_to_free_page callback, but
the real problem is we cannot easily release journal_head on the
checkpointed buffer, so try_to_free_buffers() cannot release buffers and
page under memory pressure, which is more likely to trigger
out-of-memory. So we cannot remove the callback directly before we find
another way to release journal_head.

This patch introduce a shrinker to free journal_head on the checkpointed
transaction. After the journal_head got freed, try_to_free_buffers()
could free buffer properly.
Signed-off-by: NZhang Yi <yi.zhang@huawei.com>
Suggested-by: NJan Kara <jack@suse.cz>
Reviewed-by: NJan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20210610112440.3438139-6-yi.zhang@huawei.comSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
Reviewed-by: NYang Erkun <yangerkun@huawei.com>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

82c2be92

13 10月, 2021 1 次提交

ext4: return error code when ext4_fill_flex_info() fails · 9babef7e

由 Yang Yingliang 提交于 10月 13, 2021

stable inclusion
from stable-5.10.50
commit f4e91a4e0d040c7950e647663206644beed95857
bugzilla: 174522 https://gitee.com/openeuler/kernel/issues/I4DNFY

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=f4e91a4e0d040c7950e647663206644beed95857

--------------------------------

commit 8f6840c4 upstream.

After commit c89128a0 ("ext4: handle errors on
ext4_commit_super"), 'ret' may be set to 0 before calling
ext4_fill_flex_info(), if ext4_fill_flex_info() fails ext4_mount()
doesn't return error code, it makes 'root' is null which causes crash
in legacy_get_tree().

Fixes: c89128a0 ("ext4: handle errors on ext4_commit_super")
Reported-by: NHulk Robot <hulkci@huawei.com>
Cc: <stable@vger.kernel.org> # v4.18+
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20210510111051.55650-1-yangyingliang@huawei.comSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Acked-by: NWeilong Chen <chenweilong@huawei.com>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

9babef7e

12 10月, 2021 3 次提交

ext4: cleanup in-core orphan list if ext4_truncate() failed to get a transaction handle · 173849fd

由 Zhang Yi 提交于 10月 12, 2021

mainline inclusion
from mainline-5.14-rc1
commit b9a037b7
category: bugfix
bugzilla: 51864 https://gitee.com/openeuler/kernel/issues/I4DDEL
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b9a037b7f3c401d3c63e0423e56aef606b1ffaaf

---------------------------

In ext4_orphan_cleanup(), if ext4_truncate() failed to get a transaction
handle, it didn't remove the inode from the in-core orphan list, which
may probably trigger below error dump in ext4_destroy_inode() during the
final iput() and could lead to memory corruption on the later orphan
list changes.

 EXT4-fs (sda): Inode 6291467 (00000000b8247c67): orphan list check failed!
 00000000b8247c67: 0001f30a 00000004 00000000 00000023  ............#...
 00000000e24cde71: 00000006 014082a3 00000000 00000000  ......@.........
 0000000072c6a5ee: 00000000 00000000 00000000 00000000  ................
 ...

This patch fix this by cleanup in-core orphan list manually if
ext4_truncate() return error.

Cc: stable@kernel.org
Signed-off-by: NZhang Yi <yi.zhang@huawei.com>
Reviewed-by: NJan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20210507071904.160808-1-yi.zhang@huawei.comSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
Reviewed-by: NYang Erkun <yangerkun@huawei.com>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

173849fd

ext4: fix WARN_ON_ONCE(!buffer_uptodate) after an error writing the superblock · db47795b

由 Ye Bin 提交于 10月 12, 2021

mainline inclusion
from mainline-v5.14-rc1
commit 558d6450
category: bugfix
bugzilla: 172157 https://gitee.com/openeuler/kernel/issues/I4DDEL

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=558d6450c7755aa005d89021204b6cdcae5e848f

-----------------------------------------------

If a writeback of the superblock fails with an I/O error, the buffer
is marked not uptodate.  However, this can cause a WARN_ON to trigger
when we attempt to write superblock a second time.  (Which might
succeed this time, for cerrtain types of block devices such as iSCSI
devices over a flaky network.)

Try to detect this case in flush_stashed_error_work(), and also change
__ext4_handle_dirty_metadata() so we always set the uptodate flag, not
just in the nojournal case.

Before this commit, this problem can be repliciated via:

1. dmsetup  create dust1 --table  '0 2097152 dust /dev/sdc 0 4096'
2. mount  /dev/mapper/dust1  /home/test
3. dmsetup message dust1 0 addbadblock 0 10
4. cd /home/test
5. echo "XXXXXXX" > t

After a few seconds, we got following warning:

[   80.654487] end_buffer_async_write: bh=0xffff88842f18bdd0
[   80.656134] Buffer I/O error on dev dm-0, logical block 0, lost async page write
[   85.774450] EXT4-fs error (device dm-0): ext4_check_bdev_write_error:193: comm kworker/u16:8: Error while async write back metadata
[   91.415513] mark_buffer_dirty: bh=0xffff88842f18bdd0
[   91.417038] ------------[ cut here ]------------
[   91.418450] WARNING: CPU: 1 PID: 1944 at fs/buffer.c:1092 mark_buffer_dirty.cold+0x1c/0x5e
[   91.440322] Call Trace:
[   91.440652]  __jbd2_journal_temp_unlink_buffer+0x135/0x220
[   91.441354]  __jbd2_journal_unfile_buffer+0x24/0x90
[   91.441981]  __jbd2_journal_refile_buffer+0x134/0x1d0
[   91.442628]  jbd2_journal_commit_transaction+0x249a/0x3240
[   91.443336]  ? put_prev_entity+0x2a/0x200
[   91.443856]  ? kjournald2+0x12e/0x510
[   91.444324]  kjournald2+0x12e/0x510
[   91.444773]  ? woken_wake_function+0x30/0x30
[   91.445326]  kthread+0x150/0x1b0
[   91.445739]  ? commit_timeout+0x20/0x20
[   91.446258]  ? kthread_flush_worker+0xb0/0xb0
[   91.446818]  ret_from_fork+0x1f/0x30
[   91.447293] ---[ end trace 66f0b6bf3d1abade ]---
Signed-off-by: NYe Bin <yebin10@huawei.com>
Link: https://lore.kernel.org/r/20210615090537.3423231-1-yebin10@huawei.comSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
Signed-off-by: NYe Bin <yebin10@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

db47795b

ext4: fix possible UAF when remounting r/o a mmp-protected file system · 410be925

由 Theodore Ts'o 提交于 10月 12, 2021

mainline inclusion
from mainline-5.14
commit  61bb4a1c
category: bugfix
bugzilla: 173880 https://gitee.com/openeuler/kernel/issues/I4DDEL

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=61bb4a1c417e5b95d9edb4f887f131de32e419cb

-------------------------------------------------

After commit 618f0031 ("ext4: fix memory leak in
ext4_fill_super"), after the file system is remounted read-only, there
is a race where the kmmpd thread can exit, causing sbi->s_mmp_tsk to
point at freed memory, which the call to ext4_stop_mmpd() can trip
over.

Fix this by only allowing kmmpd() to exit when it is stopped via
ext4_stop_mmpd().

Link: https://lore.kernel.org/r/20210707002433.3719773-1-tytso@mit.eduReported-by: NYe Bin <yebin10@huawei.com>
Bug-Report-Link: <20210629143603.2166962-1-yebin10@huawei.com>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Reviewed-by: NJan Kara <jack@suse.cz>
Signed-off-by: NBaokun Li <libaokun1@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

410be925

06 7月, 2021 1 次提交

ext4: fix memory leak in ext4_fill_super · f0b26977

由 Pavel Skripkin 提交于 6月 30, 2021

mainline inclusion
from mainline-5.14
commit 618f0031
category: bugfix
bugzilla: 167360
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=618f003199c6188e01472b03cdbba227f1dc5f24

-------------------------------------------------

static int kthread(void *_create) will return -ENOMEM
or -EINTR in case of internal failure or
kthread_stop() call happens before threadfn call.

To prevent fancy error checking and make code
more straightforward we moved all cleanup code out
of kmmpd threadfn.

Also, dropped struct mmpd_data at all. Now struct super_block
is a threadfn data and struct buffer_head embedded into
struct ext4_sb_info.

Reported-by: syzbot+d9e482e303930fa4f6ff@syzkaller.appspotmail.com
Signed-off-by: NPavel Skripkin <paskripkin@gmail.com>
Link: https://lore.kernel.org/r/20210430185046.15742-1-paskripkin@gmail.comSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
Signed-off-by: NBaokun Li <libaokun1@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

f0b26977

15 6月, 2021 1 次提交

ext4: fix memory leak in ext4_fill_super · 071edc48

由 Alexey Makhalov 提交于 6月 15, 2021

stable inclusion
from stable-5.10.43
commit 01d349a481f0591230300a9171330136f9159bcd
bugzilla: 109284
CVE: NA

--------------------------------

commit afd09b61 upstream.

Buffer head references must be released before calling kill_bdev();
otherwise the buffer head (and its page referenced by b_data) will not
be freed by kill_bdev, and subsequently that bh will be leaked.

If blocksizes differ, sb_set_blocksize() will kill current buffers and
page cache by using kill_bdev(). And then super block will be reread
again but using correct blocksize this time. sb_set_blocksize() didn't
fully free superblock page and buffer head, and being busy, they were
not freed and instead leaked.

This can easily be reproduced by calling an infinite loop of:

  systemctl start <ext4_on_lvm>.mount, and
  systemctl stop <ext4_on_lvm>.mount

... since systemd creates a cgroup for each slice which it mounts, and
the bh leak get amplified by a dying memory cgroup that also never
gets freed, and memory consumption is much more easily noticed.

Fixes: ce40733c ("ext4: Check for return value from sb_set_blocksize")
Fixes: ac27a0ec ("ext4: initial copy of files from ext3")
Link: https://lore.kernel.org/r/20210521075533.95732-1-amakhalov@vmware.comSigned-off-by: NAlexey Makhalov <amakhalov@vmware.com>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

071edc48

03 6月, 2021 2 次提交

ext4: fix error code in ext4_commit_super · 17707bfe

由 Fengnan Chang 提交于 5月 22, 2021

stable inclusion
from stable-5.10.36
commit 12905cf9e5c418fe8ad45e56d8d79848c6b11558
bugzilla: 51867
CVE: NA

--------------------------------

commit f88f1466 upstream.

We should set the error code when ext4_commit_super check argument failed.
Found in code review.
Fixes: c4be0c1d ("filesystem freeze: add error handling of write_super_lockfs/unlockfs").

Cc: stable@kernel.org
Signed-off-by: NFengnan Chang <changfengnan@vivo.com>
Reviewed-by: NAndreas Dilger <adilger@dilger.ca>
Link: https://lore.kernel.org/r/20210402101631.561-1-changfengnan@vivo.comSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>

Conflicts:
	fs/ext4/super.c
Acked-by: NWeilong Chen <chenweilong@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

17707bfe

ext4: do not set SB_ACTIVE in ext4_orphan_cleanup() · 55862fe6

由 Zhang Yi 提交于 5月 18, 2021

mainline inclusion
from mainline-5.13-rc1
commit 72ffb49a
category: bugfix
bugzilla: 51858
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=72ffb49a7b623c92a37657eda7cc46a06d3e8398

---------------------------

When CONFIG_QUOTA is enabled, if we failed to mount the filesystem due
to some error happens behind ext4_orphan_cleanup(), it will end up
triggering a after free issue of super_block. The problem is that
ext4_orphan_cleanup() will set SB_ACTIVE flag if CONFIG_QUOTA is
enabled, after we cleanup the truncated inodes, the last iput() will put
them into the lru list, and these inodes' pages may probably dirty and
will be write back by the writeback thread, so it could be raced by
freeing super_block in the error path of mount_bdev().

After check the setting of SB_ACTIVE flag in ext4_orphan_cleanup(), it
was used to ensure updating the quota file properly, but evict inode and
trash data immediately in the last iput does not affect the quotafile,
so setting the SB_ACTIVE flag seems not required[1]. Fix this issue by
just remove the SB_ACTIVE setting.

[1] https://lore.kernel.org/linux-ext4/99cce8ca-e4a0-7301-840f-2ace67c551f3@huawei.com/T/#m04990cfbc4f44592421736b504afcc346b2a7c00

Cc: stable@kernel.org
Signed-off-by: NZhang Yi <yi.zhang@huawei.com>
Tested-by: NJan Kara <jack@suse.cz>
Reviewed-by: NJan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20210331033138.918975-1-yi.zhang@huawei.comSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
Signed-off-by: Nyangerkun <yangerkun@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

55862fe6

22 4月, 2021 1 次提交

ext4: shrink race window in ext4_should_retry_alloc() · 560f326e

由 Eric Whitney 提交于 4月 19, 2021

stable inclusion
from stable-5.10.28
commit 4b3139576a20e27fccb9a103ca5503b02e1ac655
bugzilla: 51779

--------------------------------

[ Upstream commit efc61345 ]

When generic/371 is run on kvm-xfstests using 5.10 and 5.11 kernels, it
fails at significant rates on the two test scenarios that disable
delayed allocation (ext3conv and data_journal) and force actual block
allocation for the fallocate and pwrite functions in the test. The
failure rate on 5.10 for both ext3conv and data_journal on one test
system typically runs about 85%. On 5.11, the failure rate on ext3conv
sometimes drops to as low as 1% while the rate on data_journal
increases to nearly 100%.

The observed failures are largely due to ext4_should_retry_alloc()
cutting off block allocation retries when s_mb_free_pending (used to
indicate that a transaction in progress will free blocks) is 0.
However, free space is usually available when this occurs during runs
of generic/371. It appears that a thread attempting to allocate
blocks is just missing transaction commits in other threads that
increase the free cluster count and reset s_mb_free_pending while
the allocating thread isn't running. Explicitly testing for free space
availability avoids this race.

The current code uses a post-increment operator in the conditional
expression that determines whether the retry limit has been exceeded.
This means that the conditional expression uses the value of the
retry counter before it's increased, resulting in an extra retry cycle.
The current code actually retries twice before hitting its retry limit
rather than once.

Increasing the retry limit to 3 from the current actual maximum retry
count of 2 in combination with the change described above reduces the
observed failure rate to less that 0.1% on both ext3conv and
data_journal with what should be limited impact on users sensitive to
the overhead caused by retries.

A per filesystem percpu counter exported via sysfs is added to allow
users or developers to track the number of times the retry limit is
exceeded without resorting to debugging methods. This should provide
some insight into worst case retry behavior.
Signed-off-by: NEric Whitney <enwlinux@gmail.com>
Link: https://lore.kernel.org/r/20210218151132.19678-1-enwlinux@gmail.comSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Acked-by: N Weilong Chen <chenweilong@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

560f326e

openeuler / Kernel 大约 2 年 前同步成功

openeuler / Kernel
大约 2 年前同步成功