1. 14 Jul, 2023 2 commits
    • ext4: Add debug message to notify user space is out of free · ad36cedd
      Zhihao Cheng committed
      hulk inclusion
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I7CBCS
      CVE: NA
      
      --------------------------------
      
      Add a debug message to notify the user that ext4_writepages is stuck in
      a loop caused by ENOSPC.

      Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
      (cherry picked from commit 4ae7e703)
    • Revert "ext4: Stop trying writing pages if no free blocks generated" · b42d3e12
      Zhihao Cheng committed
      hulk inclusion
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I7CBCS
      CVE: NA
      
      --------------------------------
      
      This reverts commit 07a8109d.
      
      When ext4 runs out of space, ext4_writepages can lose data:
      if many blocks are preallocated for some files, the e4b bitmap differs
      from the block bitmap, and the block bitmap accounts for more free
      blocks than the e4b bitmap does.
      
          ext4_writepages                         P2
      ext4_mb_new_blocks                  ext4_map_blocks
       ext4_mb_regular_allocator // No free bits in e4b bitmap
       ext4_mb_discard_preallocations_should_retry
        ext4_mb_discard_preallocations
         ext4_mb_discard_group_preallocations
          ext4_mb_release_inode_pa // updates e4b bitmap by pa->pa_free
           mb_free_blocks
                                           ext4_mb_new_blocks
                                            ext4_mb_regular_allocator
                                            // Got e4b bitmap's free bits
       ext4_mb_regular_allocator  // After 3 times retrying, ret ENOSPC
      
      ext4_writepages
       mpage_map_and_submit_extent
        mpage_map_one_extent // ret ENOSPC
        if (err == -ENOSPC && EXT4_SB(sb)->s_mb_free_pending)
        // s_mb_free_pending is 0
        *give_up_on_write = true  // Abandon writeback, data lost!
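
The two retry conditions involved in this revert can be compared with a small user-space model. This is a simplified sketch, not the kernel code: the function names and plain integer parameters are hypothetical, and only the conditions themselves are taken from the traces in this log (the s_mb_free_pending check above, and the ext4_count_free_clusters check quoted in the message of the reverted commit).

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Check introduced by 07a8109d, as quoted in the trace above:
 *   if (err == -ENOSPC && EXT4_SB(sb)->s_mb_free_pending)
 * Returns true when writeback would be retried instead of abandoned.
 */
bool retry_with_free_pending(int err, unsigned int mb_free_pending)
{
	return err == -ENOSPC && mb_free_pending != 0;
}

/*
 * Check quoted in the message of 07a8109d (the code before that commit):
 *   if (err == -ENOSPC && ext4_count_free_clusters(sb))
 */
bool retry_with_free_clusters(int err, unsigned long long free_clusters)
{
	return err == -ENOSPC && free_clusters != 0;
}
```

In the scenario above, s_mb_free_pending is 0 while the block bitmap still reports free blocks, so the first predicate is false (writeback is abandoned) and the second is true (writeback is retried).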
      
      Fixes: 07a8109d ("ext4: Stop trying writing pages if no free ...")
      Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
      (cherry picked from commit 5f142164)
  2. 06 Jul, 2023 1 commit
    • ext4: Stop trying writing pages if no free blocks generated · 77d99dff
      Zhihao Cheng committed
      hulk inclusion
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I7CBCS
      
      --------------------------------
      
      The following steps could trap ext4_writepages in a dead loop:
      
      1. Consume free_clusters until free_clusters > 2 * sbi->s_resv_clusters,
         and free_clusters > EXT4_FREECLUSTERS_WATERMARK.
         // eg. free_clusters = 1422, sbi->s_resv_clusters = 512
         // nr_cpus = 4, EXT4_FREECLUSTERS_WATERMARK = 512
      2. umount && mount.  // dirty_clusters = 0
      3. Run free_clusters tasks concurrently to write different files; many
         tasks append 4K of data via the da_write method. Each inode
         consumes one data block and one extent block in map_block.
         // (free_clusters - EXT4_FREECLUSTERS_WATERMARK = 910) tasks
         // choose the da_write method; the remaining 512 tasks choose the
         // write_begin method. Suppose the da_write tasks run first:
         // dirty_clusters = 910, free_clusters = 1422
         // Tasks which choose write_begin path will get ENOSPC:
         //  free_clusters < (nclusters + dirty_clusters + resv_clusters)
         //  1422 < (1 + 910 + 512)
      4. After a certain number of map_block iterations in ext4_writepages:
         // free_clusters = 0,
         // dirty_clusters = 910 - (1422 / 2) = 199
      5. Delete one 4K file.  // free_clusters = 1
      6. ext4_writepages traps into dead loop:
          mpage_map_and_submit_extent
           mpage_map_one_extent // ret = ENOSPC
             ext4_map_blocks -> ext4_ext_map_blocks -> ext4_mb_new_blocks ->
             ext4_claim_free_clusters:
               if (free_clusters >= (nclusters + dirty_clusters)) // false
           if (err == -ENOSPC && ext4_count_free_clusters(sb)) // true
             return err
           *give_up_on_write = true // won't be executed
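
The arithmetic in steps 3 and 6 can be checked with a small model. This is a simplified sketch, not the kernel's implementation: the function names are hypothetical, and the conditions are copied from the comments in the steps above.

```c
#include <stdbool.h>

/*
 * Step 3: a write_begin task gets ENOSPC when
 *   free_clusters < nclusters + dirty_clusters + resv_clusters
 */
bool write_begin_gets_enospc(long long free_clusters, long long nclusters,
			     long long dirty_clusters, long long resv_clusters)
{
	return free_clusters < nclusters + dirty_clusters + resv_clusters;
}

/*
 * Step 6: the claim in ext4_claim_free_clusters succeeds only when
 *   free_clusters >= nclusters + dirty_clusters
 */
bool claim_succeeds(long long free_clusters, long long nclusters,
		    long long dirty_clusters)
{
	return free_clusters >= nclusters + dirty_clusters;
}

/*
 * Step 6, continued: with the pre-fix check, writeback keeps retrying as
 * long as the claim fails (ENOSPC) and at least one free cluster is left.
 */
bool keeps_retrying(long long free_clusters, long long nclusters,
		    long long dirty_clusters)
{
	return !claim_succeeds(free_clusters, nclusters, dirty_clusters) &&
	       free_clusters != 0;
}
```

With the numbers from the steps, write_begin_gets_enospc(1422, 1, 910, 512) is true because 1422 < 1423, and keeps_retrying(1, 1, 199) is true because 1 < 200 while one cluster is still free, which corresponds to the dead loop in step 6.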
      
      Fix it by terminating ext4_writepages if no free blocks are generated.

      Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
      (cherry picked from commit 07a8109d)
  3. 10 May, 2023 2 commits
  4. 13 Apr, 2023 2 commits
  5. 12 Apr, 2023 1 commit
    • ext4: Fix i_disksize exceeding i_size problem in paritally written case · 1be2adf6
      Zhihao Cheng committed
      maillist inclusion
      category: bugfix
      bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6SMBI
      CVE: NA
      
      Reference: https://www.spinics.net/lists/linux-ext4/msg88386.html
      
      --------------------------------
      
      The following process makes i_disksize exceed i_size:
      
      generic_perform_write
       copied = iov_iter_copy_from_user_atomic(len) // copied < len
       ext4_da_write_end
       | ext4_update_i_disksize
       |  new_i_size = pos + copied;
       |  WRITE_ONCE(EXT4_I(inode)->i_disksize, newsize) // update i_disksize
       | generic_write_end
       |  copied = block_write_end(copied, len) // copied = 0
       |   if (unlikely(copied < len))
       |    if (!PageUptodate(page))
       |     copied = 0;
       |  if (pos + copied > inode->i_size) // return false
       if (unlikely(copied == 0))
        goto again;
       if (unlikely(iov_iter_fault_in_readable(i, bytes))) {
        status = -EFAULT;
        break;
       }
      
      We get i_disksize greater than i_size here, which could trigger WARNING
      check 'i_size_read(inode) < EXT4_I(inode)->i_disksize' while doing dio:
      
      ext4_dio_write_iter
       iomap_dio_rw
        __iomap_dio_rw // return err, length is not aligned to 512
       ext4_handle_inode_extension
        WARN_ON_ONCE(i_size_read(inode) < EXT4_I(inode)->i_disksize) // Oops
      
       WARNING: CPU: 2 PID: 2609 at fs/ext4/file.c:319
       CPU: 2 PID: 2609 Comm: aa Not tainted 6.3.0-rc2
       RIP: 0010:ext4_file_write_iter+0xbc7
       Call Trace:
        vfs_write+0x3b1
        ksys_write+0x77
        do_syscall_64+0x39
      
      Fix it by updating 'copied' value before updating i_disksize just like
      ext4_write_inline_data_end() does.
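
The ordering problem can be illustrated with a simplified user-space model. This is a sketch under assumptions, not the ext4 code: the struct, the function names, and the `page_uptodate` flag are hypothetical stand-ins for the steps in the trace above.

```c
#include <stdbool.h>

struct inode_model {
	long long i_size;	/* in-memory file size */
	long long i_disksize;	/* size recorded for the on-disk inode */
};

/* A short copy into a page that is not up to date counts as 0 bytes. */
long long effective_copied(long long copied, long long len, bool page_uptodate)
{
	if (copied < len && !page_uptodate)
		return 0;
	return copied;
}

/* Buggy order: i_disksize is updated from the unadjusted 'copied'. */
void write_end_buggy(struct inode_model *in, long long pos, long long len,
		     long long copied, bool page_uptodate)
{
	if (pos + copied > in->i_disksize)
		in->i_disksize = pos + copied;
	copied = effective_copied(copied, len, page_uptodate);
	if (pos + copied > in->i_size)
		in->i_size = pos + copied;
}

/* Fixed order: adjust 'copied' first, then update both sizes. */
void write_end_fixed(struct inode_model *in, long long pos, long long len,
		     long long copied, bool page_uptodate)
{
	copied = effective_copied(copied, len, page_uptodate);
	if (pos + copied > in->i_disksize)
		in->i_disksize = pos + copied;
	if (pos + copied > in->i_size)
		in->i_size = pos + copied;
}
```

For a short copy of 100 out of 4096 bytes into a page that is not up to date, the buggy order leaves i_disksize at 100 while i_size stays 0; the fixed order leaves both unchanged.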
      
      A reproducer is available at [Link].
      
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=217209
      Fixes: 64769240 ("ext4: Add delayed allocation support in data=writeback mode")
      Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
      Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
      Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
  6. 07 Feb, 2023 1 commit
  7. 18 Nov, 2022 2 commits
  8. 03 Nov, 2022 1 commit
  9. 04 Aug, 2022 1 commit
  10. 26 Jul, 2022 2 commits
  11. 06 Jul, 2022 2 commits
  12. 23 May, 2022 1 commit
    • ext4: Fix warning in ext4_da_release_space · 782a6ba7
      Ye Bin committed
      hulk inclusion
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I58KLD
      CVE: NA
      
      ---------------------------
      
      We got the following issue:
      WARNING: CPU: 2 PID: 1936 at fs/ext4/inode.c:1511 ext4_da_release_space+0x1b9/0x266
      Modules linked in:
      CPU: 2 PID: 1936 Comm: dd Not tainted 5.10.0+ #344
      RIP: 0010:ext4_da_release_space+0x1b9/0x266
      RSP: 0018:ffff888127307848 EFLAGS: 00010292
      RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffffffff843f67cc
      RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffed1024e60ed9
      RBP: ffff888124dc8140 R08: 0000000000000083 R09: ffffed1075da6d23
      R10: ffff8883aed36917 R11: ffffed1075da6d22 R12: ffff888124dc83f0
      R13: ffff888124dc844c R14: ffff888124dc8168 R15: 000000000000000c
      FS:  00007f6b7247d740(0000) GS:ffff8883aed00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007ffc1a0b7dd8 CR3: 00000001065ce000 CR4: 00000000000006e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       ext4_es_remove_extent+0x187/0x230
       mpage_release_unused_pages+0x3af/0x470
       ext4_writepages+0xb9b/0x1160
       do_writepages+0xbb/0x1e0
       __filemap_fdatawrite_range+0x1b1/0x1f0
       file_write_and_wait_range+0x80/0xe0
       ext4_sync_file+0x13d/0x800
       vfs_fsync_range+0x75/0x140
       do_fsync+0x4d/0x90
       __x64_sys_fsync+0x1d/0x30
       do_syscall_64+0x33/0x40
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      The above issue may happen as follows:
      	process1                        process2
      ext4_da_write_begin
        ext4_da_reserve_space
          ext4_es_insert_delayed_block[1/1]
                                          ext4_da_write_begin
      				      ext4_es_insert_delayed_block[0/1]
      ext4_writepages
        ****Delayed block allocation failed****
        mpage_release_unused_pages
          ext4_es_remove_extent[1/1]
            ext4_da_release_space [reserved 0]
      
      ext4_da_write_begin
        ext4_es_scan_clu(inode, &ext4_es_is_delonly, lblk)
         ->As there exist [0, 1] extent, so will return true
                                         ext4_writepages
      				   ****Delayed block allocation failed****
                                           mpage_release_unused_pages
      				       ext4_es_remove_extent[0/1]
      				         ext4_da_release_space [reserved 1]
      					   ei->i_reserved_data_blocks [1->0]
      
        ext4_es_insert_delayed_block[1/1]
      
      ext4_writepages
        ****Delayed block allocation failed****
        mpage_release_unused_pages
        ext4_es_remove_extent[1/1]
         ext4_da_release_space [reserved 1]
          ei->i_reserved_data_blocks[0, -1]
          -> ei->i_reserved_data_blocks is already zero but to_free is 1,
          which triggers the warning.
      
      To solve the above issue, introduce i_clu_lock to protect delayed-block
      insertion and block removal in cluster delayed-allocation mode.
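
The warning condition at the end of the trace can be modelled in a few lines. This is a simplified sketch rather than the kernel function: the struct and function names are hypothetical, and clamping `to_free` after the warning is an assumption of this model.

```c
#include <stdbool.h>

struct da_inode_model {
	unsigned int i_reserved_data_blocks;
};

/*
 * Release 'to_free' reserved blocks. Returns true if the release was
 * consistent, false if it asked for more blocks than are reserved
 * (the situation that triggers the warning in the trace above).
 */
bool da_release_space_model(struct da_inode_model *ei, unsigned int to_free)
{
	bool ok = true;

	if (to_free > ei->i_reserved_data_blocks) {
		ok = false;				/* warning would fire here */
		to_free = ei->i_reserved_data_blocks;	/* assumed clamp */
	}
	ei->i_reserved_data_blocks -= to_free;
	return ok;
}
```

Replaying the last two releases from the trace: with one block reserved, the first release of 1 succeeds and leaves 0; the second release of 1 finds nothing reserved and reports the inconsistency.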
      Signed-off-by: Ye Bin <yebin10@huawei.com>
      Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
  13. 21 May, 2022 2 commits
    • ext4: fix warning in ext4_handle_inode_extension · d3b4c686
      Ye Bin committed
      hulk inclusion
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I58A7W?from=project-issue
      CVE: N/A
      
      ---------------------------
      
      We got the following issue:
      EXT4-fs error (device loop0) in ext4_reserve_inode_write:5741: Out of memory
      EXT4-fs error (device loop0): ext4_setattr:5462: inode #13: comm syz-executor.0: mark_inode_dirty error
      EXT4-fs error (device loop0) in ext4_setattr:5519: Out of memory
      EXT4-fs error (device loop0): ext4_ind_map_blocks:595: inode #13: comm syz-executor.0: Can't allocate blocks for non-extent mapped inodes with bigalloc
      ------------[ cut here ]------------
      WARNING: CPU: 1 PID: 4361 at fs/ext4/file.c:301 ext4_file_write_iter+0x11c9/0x1220
      Modules linked in:
      CPU: 1 PID: 4361 Comm: syz-executor.0 Not tainted 5.10.0+ #1
      RIP: 0010:ext4_file_write_iter+0x11c9/0x1220
      RSP: 0018:ffff924d80b27c00 EFLAGS: 00010282
      RAX: ffffffff815a3379 RBX: 0000000000000000 RCX: 000000003b000000
      RDX: ffff924d81601000 RSI: 00000000000009cc RDI: 00000000000009cd
      RBP: 000000000000000d R08: ffffffffbc5a2c6b R09: 0000902e0e52a96f
      R10: ffff902e2b7c1b40 R11: ffff902e2b7c1b40 R12: 000000000000000a
      R13: 0000000000000001 R14: ffff902e0e52aa10 R15: ffffffffffffff8b
      FS:  00007f81a7f65700(0000) GS:ffff902e3bc80000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: ffffffffff600400 CR3: 000000012db88001 CR4: 00000000003706e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       do_iter_readv_writev+0x2e5/0x360
       do_iter_write+0x112/0x4c0
       do_pwritev+0x1e5/0x390
       __x64_sys_pwritev2+0x7e/0xa0
       do_syscall_64+0x37/0x50
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Above issue may happen as follows:
      Assume
      inode.i_size=4096
      EXT4_I(inode)->i_disksize=4096
      
      step 1: set inode->i_size = 8192
      ext4_setattr
        if (attr->ia_size != inode->i_size)
          EXT4_I(inode)->i_disksize = attr->ia_size;
          rc = ext4_mark_inode_dirty
             ext4_reserve_inode_write
                ext4_get_inode_loc
                  __ext4_get_inode_loc
                    sb_getblk --> return -ENOMEM
         ...
         if (!error)  ->will not update i_size
           i_size_write(inode, attr->ia_size);
      Now:
      inode.i_size=4096
      EXT4_I(inode)->i_disksize=8192
      
      step 2: Direct write 4096 bytes
      ext4_file_write_iter
       ext4_dio_write_iter
         iomap_dio_rw ->return error
       if (extend)
         ext4_handle_inode_extension
           WARN_ON_ONCE(i_size_read(inode) < EXT4_I(inode)->i_disksize);
      ->Then trigger warning.
      
      To solve the above issue, if marking the inode dirty fails in
      ext4_setattr, restore 'EXT4_I(inode)->i_disksize' to its old value.
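
The restore-on-failure idea can be sketched as a small model. This is an illustration under assumptions, not the actual patch: the struct and function names are hypothetical, and `mark_dirty_rc` stands in for the return value of ext4_mark_inode_dirty.

```c
#include <errno.h>

struct setattr_inode_model {
	long long i_size;	/* in-memory file size */
	long long i_disksize;	/* size recorded for the on-disk inode */
};

/*
 * Model of the size-extension step from the trace above.
 * On failure, i_disksize is rolled back so that it never exceeds i_size.
 */
int setattr_size_model(struct setattr_inode_model *in, long long new_size,
		       int mark_dirty_rc)
{
	long long old_disksize = in->i_disksize;

	in->i_disksize = new_size;
	if (mark_dirty_rc) {
		/* The fix: restore the old value when marking dirty fails. */
		in->i_disksize = old_disksize;
		return mark_dirty_rc;
	}
	in->i_size = new_size;
	return 0;
}
```

Starting from i_size = i_disksize = 4096, a failed extension to 8192 (for example with -ENOMEM) leaves both fields at 4096, so the WARN_ON_ONCE condition in step 2 stays false.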
      Signed-off-by: Ye Bin <yebin10@huawei.com>
      Signed-off-by: Li Nan <linan122@huawei.com>
      Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
    • ext4: fix race condition between ext4_write and ext4_convert_inline_data · 5e347d13
      Baokun Li committed
      hulk inclusion
      category: bugfix
      bugzilla: 186638, https://gitee.com/openeuler/kernel/issues/I57PM8
      CVE: NA
      
      --------------------------------
      
      Hulk Robot reported a BUG_ON:
       ==================================================================
       EXT4-fs error (device loop3): ext4_mb_generate_buddy:805: group 0,
       block bitmap and bg descriptor inconsistent: 25 vs 31513 free clusters
       kernel BUG at fs/ext4/ext4_jbd2.c:53!
       invalid opcode: 0000 [#1] SMP KASAN PTI
       CPU: 0 PID: 25371 Comm: syz-executor.3 Not tainted 5.10.0+ #1
       RIP: 0010:ext4_put_nojournal fs/ext4/ext4_jbd2.c:53 [inline]
       RIP: 0010:__ext4_journal_stop+0x10e/0x110 fs/ext4/ext4_jbd2.c:116
       [...]
       Call Trace:
        ext4_write_inline_data_end+0x59a/0x730 fs/ext4/inline.c:795
        generic_perform_write+0x279/0x3c0 mm/filemap.c:3344
        ext4_buffered_write_iter+0x2e3/0x3d0 fs/ext4/file.c:270
        ext4_file_write_iter+0x30a/0x11c0 fs/ext4/file.c:520
        do_iter_readv_writev+0x339/0x3c0 fs/read_write.c:732
        do_iter_write+0x107/0x430 fs/read_write.c:861
        vfs_writev fs/read_write.c:934 [inline]
        do_pwritev+0x1e5/0x380 fs/read_write.c:1031
       [...]
       ==================================================================
      
      Above issue may happen as follows:
                 cpu1                     cpu2
      __________________________|__________________________
      do_pwritev
       vfs_writev
        do_iter_write
         ext4_file_write_iter
          ext4_buffered_write_iter
           generic_perform_write
            ext4_da_write_begin
                                 vfs_fallocate
                                  ext4_fallocate
                                   ext4_convert_inline_data
                                    ext4_convert_inline_data_nolock
                                     ext4_destroy_inline_data_nolock
                                      clear EXT4_STATE_MAY_INLINE_DATA
                                     ext4_map_blocks
                                      ext4_ext_map_blocks
                                       ext4_mb_new_blocks
                                        ext4_mb_regular_allocator
                                         ext4_mb_good_group_nolock
                                          ext4_mb_init_group
                                           ext4_mb_init_cache
                                            ext4_mb_generate_buddy  --> error
             ext4_test_inode_state(inode, EXT4_STATE_MAY_INLINE_DATA)
                                      ext4_restore_inline_data
                                       set EXT4_STATE_MAY_INLINE_DATA
             ext4_block_write_begin
            ext4_da_write_end
             ext4_test_inode_state(inode, EXT4_STATE_MAY_INLINE_DATA)
             ext4_write_inline_data_end
              handle=NULL
              ext4_journal_stop(handle)
               __ext4_journal_stop
                ext4_put_nojournal(handle)
                 ref_cnt = (unsigned long)handle
                 BUG_ON(ref_cnt == 0)  ---> BUG_ON
      
      The lock held by ext4_convert_inline_data is xattr_sem, but the lock
      held by generic_perform_write is i_rwsem. Therefore, the two paths can
      run concurrently.

      To solve the above issue, we add inode_lock() for
      ext4_convert_inline_data(). At the same time, move
      ext4_convert_inline_data() in front of ext4_punch_hole() and remove the
      similar handling from ext4_punch_hole().
      
      Fixes: 0c8d414f ("ext4: let fallocate handle inline data correctly")
      Cc: stable@vger.kernel.org
      Reported-by: Hulk Robot <hulkci@huawei.com>
      Signed-off-by: Baokun Li <libaokun1@huawei.com>
      Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
  14. 27 Apr, 2022 3 commits
  15. 22 Jan, 2022 1 commit
  16. 03 Dec, 2021 1 commit
  17. 15 Nov, 2021 5 commits
  18. 19 Oct, 2021 5 commits
  19. 13 Oct, 2021 1 commit
  20. 03 Jul, 2021 1 commit
    • ext4: stop return ENOSPC from ext4_issue_zeroout · 8119d09e
      yangerkun committed
      hulk inclusion
      category: bugfix
      bugzilla: 167373
      CVE: NA
      
      ---------------------------
      
      Our testcase (briefly: fsstress on dm thin-provisioning, where ext4
      sees a 100G volume but the actual size is 10G) triggers a hung task
      bug because ext4_writepages falls into an infinite loop:
      
      static int ext4_writepages(xxx)
      {
          ...
          while (!done && mpd.first_page <= mpd.last_page) {
              ...
              ret = mpage_prepare_extent_to_map(&mpd);
              if (!ret) {
                  ...
                  ret = mpage_map_and_submit_extent(handle, &mpd,
                                                    &give_up_on_write);
                  <----- will return -ENOSPC
                  ...
              }
              ...
              if (ret == -ENOSPC && sbi->s_journal) {
                  <------ we cannot break since we will get ENOSPC forever
                  jbd2_journal_force_commit_nested(sbi->s_journal);
                  ret = 0;
                  continue;
              }
              ...
          }
      }
      
      The ENOSPC comes from the following stack:
      ...
      ext4_ext_map_blocks
        ext4_ext_convert_to_initialized
          ext4_ext_zeroout
            ext4_issue_zeroout
              ...
              submit_bio_wait <-- bio to thinpool will return ENOSPC
      
      Actually, the ENOSPC from thin-provisioning means an EIO from the block
      device. We need to convert the error to EIO to stop confusing ext4.
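
The effect of the conversion on the writeback loop can be sketched with a small model. This is a simplified illustration, not the actual patch: both function names are hypothetical, and the `max_iters` bound exists only so that the model terminates.

```c
#include <errno.h>

/*
 * Hypothetical helper: translate the result of the zeroout I/O before it
 * reaches the writeback loop. -ENOSPC from the thin pool becomes -EIO.
 */
int zeroout_result_model(int bio_ret)
{
	if (bio_ret == -ENOSPC)
		return -EIO;
	return bio_ret;
}

/*
 * Simplified writeback loop: -ENOSPC is retried (the journal commit is
 * modelled as a plain 'continue'), any other result ends the loop.
 * Returns the number of iterations executed, capped at 'max_iters'.
 */
int writepages_iterations(int map_ret, int max_iters)
{
	int iters = 0;

	while (iters < max_iters) {
		iters++;
		if (map_ret == -ENOSPC)
			continue;	/* would retry forever without the cap */
		break;
	}
	return iters;
}
```

A mapping result of -ENOSPC exhausts the iteration cap, whereas the converted result zeroout_result_model(-ENOSPC), which is -EIO, ends the loop after a single pass.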
      Signed-off-by: yangerkun <yangerkun@huawei.com>
      Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
  21. 22 Apr, 2021 1 commit
  22. 13 Apr, 2021 2 commits