1. 23 Aug, 2019 6 commits
    • f2fs: fix to avoid discard command leak · 04f9287a
      Chao Yu committed
       =============================================================================
       BUG discard_cmd (Tainted: G    B      OE  ): Objects remaining in discard_cmd on __kmem_cache_shutdown()
       -----------------------------------------------------------------------------
      
       INFO: Slab 0xffffe1ac481d22c0 objects=36 used=2 fp=0xffff936b4748bf50 flags=0x2ffff0000000100
       Call Trace:
        dump_stack+0x63/0x87
        slab_err+0xa1/0xb0
        __kmem_cache_shutdown+0x183/0x390
        shutdown_cache+0x14/0x110
        kmem_cache_destroy+0x195/0x1c0
        f2fs_destroy_segment_manager_caches+0x21/0x40 [f2fs]
        exit_f2fs_fs+0x35/0x641 [f2fs]
        SyS_delete_module+0x155/0x230
        ? vtime_user_exit+0x29/0x70
        do_syscall_64+0x6e/0x160
        entry_SYSCALL64_slow_path+0x25/0x25
      
       INFO: Object 0xffff936b4748b000 @offset=0
       INFO: Object 0xffff936b4748b070 @offset=112
       kmem_cache_destroy discard_cmd: Slab cache still has objects
       Call Trace:
        dump_stack+0x63/0x87
        kmem_cache_destroy+0x1b4/0x1c0
        f2fs_destroy_segment_manager_caches+0x21/0x40 [f2fs]
        exit_f2fs_fs+0x35/0x641 [f2fs]
        SyS_delete_module+0x155/0x230
        do_syscall_64+0x6e/0x160
        entry_SYSCALL64_slow_path+0x25/0x25
      
      Recovery can cache discard commands, so in the error path of
      fill_super() we need to give them a chance to be handled; otherwise
      they will leak from the discard_cmd slab cache.
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: fix to avoid tagging SBI_QUOTA_NEED_REPAIR incorrectly · 0f1898f9
      Chao Yu committed
      On a quota-disabled image with fault injection, SBI_QUOTA_NEED_REPAIR
      is set incorrectly in the error path of f2fs_evict_inode(); fix it.
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: fix to drop meta/node pages during umount · a8933b6b
      Chao Yu committed
      As reported in bugzilla:
      
      https://bugzilla.kernel.org/show_bug.cgi?id=204193
      
      A null pointer dereference bug is triggered in f2fs under kernel-5.1.3.
      
       kasan_report.cold+0x5/0x32
       f2fs_write_end_io+0x215/0x650
       bio_endio+0x26e/0x320
       blk_update_request+0x209/0x5d0
       blk_mq_end_request+0x2e/0x230
       lo_complete_rq+0x12c/0x190
       blk_done_softirq+0x14a/0x1a0
       __do_softirq+0x119/0x3e5
       irq_exit+0x94/0xe0
       call_function_single_interrupt+0xf/0x20
      
      During umount, we access a NULL sbi->node_inode pointer in
      f2fs_write_end_io():
      
      	f2fs_bug_on(sbi, page->mapping == NODE_MAPPING(sbi) &&
      				page->index != nid_of_node(page));
      
      The reason is that if the disable_checkpoint mount option is on, dirty
      meta pages can remain during umount and are then flushed by the iput()
      of meta_inode. However, node_inode has already been iput()ed before
      meta_inode's iput().
      
      Since checkpoint is disabled, all meta/node data is useless and
      would be dropped at the next mount anyway, so during umount let's adjust
      drop_inode() to give a hint to iput_final() to drop all of that dirty
      data correctly.
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: disallow switching io_bits option during remount · 1f78adfa
      Chao Yu committed
      If the IO alignment feature is turned on by a remount, its mempool
      is never initialized, so we will hit a panic during IO submission
      when accessing the NULL mempool pointer.
      
      This feature should be set only at mount time, so simply deny
      configuring it during remount.
      
      This fixes the bug reported in bugzilla:
      
      https://bugzilla.kernel.org/show_bug.cgi?id=204135
      
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: fix panic of IO alignment feature · c72db71e
      Chao Yu committed
      Since 07173c3e ("block: enable multipage bvecs"), one bio vector
      can store multiple pages, so we can no longer calculate the max IO size
      of a bio as PAGE_SIZE * bio->bi_max_vecs. However, the IO alignment
      feature of f2fs always relies on that assumption, so it may cause a
      panic during IO submission, as in the stack below.
      
       kernel BUG at fs/f2fs/data.c:317!
       RIP: 0010:__submit_merged_bio+0x8b0/0x8c0
       Call Trace:
        f2fs_submit_page_write+0x3cd/0xdd0
        do_write_page+0x15d/0x360
        f2fs_outplace_write_data+0xd7/0x210
        f2fs_do_write_data_page+0x43b/0xf30
        __write_data_page+0xcf6/0x1140
        f2fs_write_cache_pages+0x3ba/0xb40
        f2fs_write_data_pages+0x3dd/0x8b0
        do_writepages+0xbb/0x1e0
        __writeback_single_inode+0xb6/0x800
        writeback_sb_inodes+0x441/0x910
        wb_writeback+0x261/0x650
        wb_workfn+0x1f9/0x7a0
        process_one_work+0x503/0x970
        worker_thread+0x7d/0x820
        kthread+0x1ad/0x210
        ret_from_fork+0x35/0x40
      
      This patch adds one extra condition that checks the space left in the
      bio when trying to merge a page into it, to avoid the panic.
      
      This bug was reported in bugzilla:
      
      https://bugzilla.kernel.org/show_bug.cgi?id=204043
      
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: introduce {page,io}_is_mergeable() for readability · 8896cbdf
      Chao Yu committed
      Wrap merge condition into function for readability, no logic change.
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  2. 17 Aug, 2019 1 commit
    • f2fs: fix livelock in swapfile writes · 75a037f3
      Jaegeuk Kim committed
      This patch fixes livelock in the below call path when writing swap pages.
      
      [46374.617256] c2    701  __switch_to+0xe4/0x100
      [46374.617265] c2    701  __schedule+0x80c/0xbc4
      [46374.617273] c2    701  schedule+0x74/0x98
      [46374.617281] c2    701  rwsem_down_read_failed+0x190/0x234
      [46374.617291] c2    701  down_read+0x58/0x5c
      [46374.617300] c2    701  f2fs_map_blocks+0x138/0x9a8
      [46374.617310] c2    701  get_data_block_dio_write+0x74/0x104
      [46374.617320] c2    701  __blockdev_direct_IO+0x1350/0x3930
      [46374.617331] c2    701  f2fs_direct_IO+0x55c/0x8bc
      [46374.617341] c2    701  __swap_writepage+0x1d0/0x3e8
      [46374.617351] c2    701  swap_writepage+0x44/0x54
      [46374.617360] c2    701  shrink_page_list+0x140/0xe80
      [46374.617371] c2    701  shrink_inactive_list+0x510/0x918
      [46374.617381] c2    701  shrink_node_memcg+0x2d4/0x804
      [46374.617391] c2    701  shrink_node+0x10c/0x2f8
      [46374.617400] c2    701  do_try_to_free_pages+0x178/0x38c
      [46374.617410] c2    701  try_to_free_pages+0x348/0x4b8
      [46374.617419] c2    701  __alloc_pages_nodemask+0x7f8/0x1014
      [46374.617429] c2    701  pagecache_get_page+0x184/0x2cc
      [46374.617438] c2    701  f2fs_new_node_page+0x60/0x41c
      [46374.617449] c2    701  f2fs_new_inode_page+0x50/0x7c
      [46374.617460] c2    701  f2fs_init_inode_metadata+0x128/0x530
      [46374.617472] c2    701  f2fs_add_inline_entry+0x138/0xd64
      [46374.617480] c2    701  f2fs_do_add_link+0xf4/0x178
      [46374.617488] c2    701  f2fs_create+0x1e4/0x3ac
      [46374.617497] c2    701  path_openat+0xdc0/0x1308
      [46374.617507] c2    701  do_filp_open+0x78/0x124
      [46374.617516] c2    701  do_sys_open+0x134/0x248
      [46374.617525] c2    701  SyS_openat+0x14/0x20
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  3. 29 Jul, 2019 1 commit
    • f2fs: use EINVAL for superblock with invalid magic · 38fb6d0e
      Icenowy Zheng committed
      The kernel mount_block_root() function expects -EACCES or -EINVAL for an
      unmountable filesystem when trying to mount the root with different
      filesystem types.
      
      However, in 5.3-rc1 the F2FS code changed to return
      -EFSCORRUPTED (-EUCLEAN) when it cannot find a valid superblock, and
      this error code makes mount_block_root() fail when trying to probe F2FS.
      
      When the magic number of the superblock mismatches, it is highly
      probable that the device is simply not F2FS. In this case returning
      -EINVAL seems to be the better result, and this return value makes
      mount_block_root() probing work again.
      
      Return -EINVAL when the superblock has magic mismatch, -EFSCORRUPTED in
      other cases (the magic matches but the superblock cannot be recognized).
      
      Fixes: 10f966bb ("f2fs: use generic EFSBADCRC/EFSCORRUPTED")
      Signed-off-by: Icenowy Zheng <icenowy@aosc.io>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  4. 27 Jul, 2019 1 commit
  5. 19 Jul, 2019 1 commit
  6. 13 Jul, 2019 3 commits
  7. 12 Jul, 2019 1 commit
  8. 11 Jul, 2019 4 commits
  9. 10 Jul, 2019 1 commit
  10. 03 Jul, 2019 11 commits
  11. 22 Jun, 2019 3 commits
  12. 13 Jun, 2019 1 commit
  13. 04 Jun, 2019 4 commits
    • f2fs: Add option to limit required GC for checkpoint=disable · 4d3aed70
      Daniel Rosenberg committed
      This extends the checkpoint option to allow checkpoint=disable:%u[%].
      This allows you to specify how much of the disk you are willing
      to lose access to while mounting with checkpoint=disable. If the amount
      lost would be higher, the mount will return -EAGAIN. This can be given
      as a percent of total space, or in blocks.
      
      Currently, we need to run garbage collection until the amount of holes
      is smaller than the OVP space. With the new option, f2fs can mark
      space as unusable up front instead of requiring garbage collection until
      the number of holes is small enough.
      Signed-off-by: Daniel Rosenberg <drosen@google.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
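
The checkpoint=disable:%u[%] value from 4d3aed70 could be interpreted roughly as below. This is a user-space sketch under assumptions, not the kernel's option parser: `parse_unusable_cap()` and `check_unusable()` are hypothetical helpers, and the rounding of the percent form is chosen for illustration only.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse the text after "checkpoint=disable:" as either a block count
 * ("512") or a percent of total space ("10%").
 * Returns the cap in blocks, or a negative errno value.
 */
static int64_t parse_unusable_cap(const char *arg, uint32_t total_blocks)
{
	assert(arg != NULL);

	/* Reject signs and whitespace, which strtoul() would accept. */
	if (arg[0] < '0' || arg[0] > '9')
		return -EINVAL;

	char *end;
	errno = 0;
	unsigned long v = strtoul(arg, &end, 10);
	if (errno == ERANGE || v > UINT32_MAX)
		return -EINVAL;

	if (*end == '%') {
		if (v > 100 || end[1] != '\0')
			return -EINVAL;
		return (int64_t)total_blocks * (int64_t)v / 100;
	}
	if (*end != '\0')
		return -EINVAL;
	return (int64_t)v;
}

/* Refuse the mount with -EAGAIN if more space would be lost than allowed. */
static int check_unusable(uint32_t unusable_blocks, int64_t cap_blocks)
{
	return (int64_t)unusable_blocks > cap_blocks ? -EAGAIN : 0;
}
```

For example, with 2000 total blocks, "10%" yields a cap of 200 blocks, so 300 unusable blocks would be refused with -EAGAIN while 100 would be accepted.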
    • f2fs: Fix accounting for unusable blocks · a4c3ecaa
      Daniel Rosenberg committed
      Fixes possible underflows when dealing with unusable blocks.
      Signed-off-by: Daniel Rosenberg <drosen@google.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: Fix root reserved on remount · 9a9aecaa
      Daniel Rosenberg committed
      On a remount, you can currently set root reserved if it was not
      previously set. This can cause an underflow if reserved has been set to
      a very high value, since then root reserved + current reserved could be
      greater than user_block_count. inc_valid_block_count later subtracts out
      these values from user_block_count, causing an underflow.
      Signed-off-by: Daniel Rosenberg <drosen@google.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
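
The underflow described in 9a9aecaa is ordinary unsigned wraparound, which the following user-space sketch reproduces. The helper names are hypothetical, and the clamp shown is just one way to guard against the problem; it is not claimed to be what the patch does.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t block_t; /* unsigned block count, as in this sketch's model */

/* Naive accounting: wraps around if the reservations exceed the total. */
static block_t avail_naive(block_t user_blocks, block_t cur_reserved,
			   block_t root_reserved)
{
	return user_blocks - cur_reserved - root_reserved;
}

/*
 * Hypothetical guard: never let root reserved exceed what is left
 * after the current reservation has been taken out.
 */
static block_t limit_root_reserved(block_t user_blocks, block_t cur_reserved,
				   block_t requested)
{
	assert(cur_reserved <= user_blocks);

	block_t room = user_blocks - cur_reserved;
	return requested > room ? room : requested;
}
```

With 1000 user blocks, 900 currently reserved, and 200 requested for root, the naive subtraction wraps to a huge value, whereas clamping the request to the remaining 100 blocks keeps the result at zero.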
    • f2fs: Lower threshold for disable_cp_again · ae4ad7ea
      Daniel Rosenberg committed
      The existing threshold for allowable holes at checkpoint=disable time is
      too high. The OVP space contains reserved segments, which are always in
      the form of free segments. These must be subtracted from the OVP value.
      
      The current threshold is meant to be the maximum value of holes of a
      single type we can have and still guarantee that we can fill the disk
      without failing to find space for a block of a given type.
      
      If the disk is full, ignoring current reserved, which only helps us,
      the amount of unused blocks is equal to the OVP area. Of that, there
      are reserved segments, which must be free segments, and the rest of the
      OVP area, which can come from either free segments or holes. The maximum
      possible amount of holes is OVP-reserved.
      
      Now, consider the disk when mounting with checkpoint=disable.
      We must be able to fill all available free space with either data or
      node blocks. When we start with checkpoint=disable, holes are locked to
      their current type. Say we have H of one type of hole, and H+X of the
      other. We can fill H of that space with arbitrary typed blocks via SSR.
      For the remaining H+X blocks, we may not have any of a given block type
      left at all. For instance, if we were to fill the disk entirely with
      blocks of the type with fewer holes, the H+X blocks of the opposite type
      would not be used. If H+X > OVP-reserved, there would be more holes than
      could possibly exist, and we would have failed to find a suitable block
      earlier on, leading to a crash in update_sit_entry.
      
      If H+X <= OVP-reserved, then the holes end up effectively masked by the OVP
      region in this case.
      Signed-off-by: Daniel Rosenberg <drosen@google.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
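
The threshold argued for in ae4ad7ea reduces to a single comparison: the larger of the two hole counts (H+X in the message) must not exceed OVP minus reserved. The sketch below states that condition in user-space C. `holes_within_threshold()` is a hypothetical name, and all quantities are assumed to be in the same unit, which glosses over the segment/block conversions a real implementation would need.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t block_t;

/*
 * Return true if the holes can be masked by the OVP region, i.e.
 * max(data_holes, node_holes) <= ovp - reserved.
 */
static bool holes_within_threshold(block_t data_holes, block_t node_holes,
				   block_t ovp, block_t reserved)
{
	assert(reserved <= ovp); /* reserved segments are part of the OVP area */

	block_t limit = ovp - reserved;
	block_t larger = data_holes > node_holes ? data_holes : node_holes;

	return larger <= limit;
}
```

For instance, with OVP of 50 and 10 reserved, hole counts of 10 and 30 are acceptable, while 10 and 45 exceed the limit of 40.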
  14. 31 May, 2019 2 commits