1. 13 Jul, 2019 3 commits
  2. 11 Jul, 2019 1 commit
  3. 03 Jul, 2019 5 commits
  4. 22 Jun, 2019 1 commit
    • E
      f2fs: separate f2fs i_flags from fs_flags and ext4 i_flags · 36098557
      Eric Biggers committed
      f2fs copied all the on-disk i_flags from ext4, and along with it the
      assumption that the on-disk i_flags are the same as the bits used by
      FS_IOC_GETFLAGS and FS_IOC_SETFLAGS.  This is problematic because
      reserving an on-disk inode flag in either filesystem's i_flags or in
      these ioctls effectively reserves it in all the other places too.  In
      fact, most of the "f2fs i_flags" are not used by f2fs at all.
      
      Fix this by separating f2fs's i_flags from the ioctl bits and ext4's
      i_flags.
      
      In the process, un-reserve all "f2fs i_flags" that aren't actually
      supported by f2fs.  This included various flags that were not settable
      at all, as well as various flags that were settable by FS_IOC_SETFLAGS
      but didn't actually do anything.
      
      There's a slight chance we'll need to add some flag(s) back to
      FS_IOC_SETFLAGS in order to avoid breaking users who expect f2fs to
      accept some random flag(s).  But hopefully such users don't exist.
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      36098557
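      The separation described above can be pictured as an explicit translation table between on-disk bits and ioctl bits. The following is a minimal sketch, not the f2fs code: every constant, the `flag_map` table, and the function names are hypothetical and chosen only so that the two bit layouts differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical on-disk inode flag bits, private to the filesystem. */
#define DISK_SYNC_FL       0x00000001u
#define DISK_IMMUTABLE_FL  0x00000002u
#define DISK_APPEND_FL     0x00000004u

/* Hypothetical ioctl-visible bits; deliberately different values. */
#define API_SYNC_FL        0x00000008u
#define API_IMMUTABLE_FL   0x00000010u
#define API_APPEND_FL      0x00000020u

struct flag_pair {
	uint32_t disk;
	uint32_t api;
};

/* Only flags listed here are supported; nothing else is reserved. */
static const struct flag_pair flag_map[] = {
	{ DISK_SYNC_FL,      API_SYNC_FL },
	{ DISK_IMMUTABLE_FL, API_IMMUTABLE_FL },
	{ DISK_APPEND_FL,    API_APPEND_FL },
};

#define FLAG_MAP_LEN (sizeof(flag_map) / sizeof(flag_map[0]))

/* Translate on-disk bits to ioctl bits; unmapped bits are dropped. */
static uint32_t disk_to_api(uint32_t disk)
{
	uint32_t api = 0;
	size_t i;

	for (i = 0; i < FLAG_MAP_LEN; i++)
		if (disk & flag_map[i].disk)
			api |= flag_map[i].api;
	return api;
}

/* Translate ioctl bits to on-disk bits; unmapped bits are dropped. */
static uint32_t api_to_disk(uint32_t api)
{
	uint32_t disk = 0;
	size_t i;

	for (i = 0; i < FLAG_MAP_LEN; i++)
		if (api & flag_map[i].api)
			disk |= flag_map[i].disk;
	return disk;
}

/* Usage: a round trip preserves every supported flag. */
void flags_example(void)
{
	uint32_t disk = DISK_SYNC_FL | DISK_APPEND_FL;

	assert(disk_to_api(disk) == (API_SYNC_FL | API_APPEND_FL));
	assert(api_to_disk(disk_to_api(disk)) == disk);
}
```

      With a table like this, adding a flag to one namespace no longer reserves the same bit position in the other.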
  5. 09 May, 2019 5 commits
  6. 17 Apr, 2019 1 commit
  7. 06 Apr, 2019 1 commit
    • D
      f2fs: Fix use of number of devices · 0916878d
      Damien Le Moal committed
      For a single device mount using a zoned block device, the zone
      information for the device is stored in the sbi->devs single entry
      array and sbi->s_ndevs is set to 1. This differs from a single device
      mount using a regular block device which does not allocate sbi->devs
      and sets sbi->s_ndevs to 0.
      
      However, the condition sbi->s_ndevs == 0 is used throughout the code to
      differentiate a single device mount from a multi-device mount, where
      sbi->s_ndevs is always larger than 1. This results in problems with
      single zoned block device volumes, as these are treated as multi-device
      mounts but do not have the start_blk and end_blk information set. One
      of the problems observed is that zone discards are skipped, resulting
      in write commands being issued to full zones or unaligned with a zone
      write pointer.
      
      Fix this problem by simply treating the cases sbi->s_ndevs == 0 (single
      regular block device mount) and sbi->s_ndevs == 1 (single zoned block
      device mount) in the same manner. This is done by introducing the
      helper function f2fs_is_multi_device() and using this helper in place
      of direct tests of sbi->s_ndevs value, improving code readability.
      
      Fixes: 7bb3a371 ("f2fs: Fix zoned block device support")
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      0916878d
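      A minimal sketch of the helper idea from this commit, assuming a reduced stand-in structure: `struct sb_info_sketch` and `is_multi_device()` are hypothetical names, and only the `s_ndevs` field from the message is modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in holding only the field discussed above. */
struct sb_info_sketch {
	int s_ndevs; /* 0: single regular device, 1: single zoned device, >1: multi-device */
};

/* Treat s_ndevs == 0 and s_ndevs == 1 the same way: both are single-device mounts. */
static inline bool is_multi_device(const struct sb_info_sketch *sbi)
{
	return sbi->s_ndevs > 1;
}

/* Usage: the three mount shapes described in the commit message. */
void multi_device_example(void)
{
	struct sb_info_sketch regular = { .s_ndevs = 0 };
	struct sb_info_sketch zoned = { .s_ndevs = 1 };
	struct sb_info_sketch multi = { .s_ndevs = 3 };

	assert(!is_multi_device(&regular));
	assert(!is_multi_device(&zoned));
	assert(is_multi_device(&multi));
}
```

      Callers that previously tested the counter against zero would call the helper instead, so the single zoned device case can no longer be mistaken for a multi-device mount.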
  8. 15 Mar, 2019 1 commit
  9. 13 Mar, 2019 2 commits
  10. 06 Mar, 2019 1 commit
    • C
      f2fs: fix potential data inconsistence of checkpoint · c42d28ce
      Chao Yu committed
      Previously, we changed the lock from cp_rwsem to node_change, which
      solved the deadlock caused by the race condition below:
      
      Thread A			Thread B
      - f2fs_setattr
       - f2fs_lock_op  -- read_lock
       - dquot_transfer
        - __dquot_transfer
         - dquot_acquire
          - commit_dqblk
           - f2fs_quota_write
            - f2fs_write_begin
             - f2fs_write_failed
      				- write_checkpoint
      				 - block_operations
      				  - f2fs_lock_all  -- write_lock
              - f2fs_truncate_blocks
               - f2fs_lock_op  -- read_lock
      
      But it breaks the semantics of cp_rwsem. In other callers, such as:
      - f2fs_file_write_iter -> f2fs_write_begin -> f2fs_write_failed
      - f2fs_direct_IO -> f2fs_write_failed
      
      we allow truncating a dnode without cp_rwsem held, resulting in an
      incorrect sit bitmap update, which can cause further data corruption.
      
      So this patch reverts the previous fix and instead tries to fix the
      deadlock by skipping the call to f2fs_truncate_blocks() in
      f2fs_write_failed() only for quota files, keeping the preallocated
      data/node blocks at the tail of the quota file. We can expect the
      preallocated space to be used to store quota info soon afterwards.
      
      Fixes: af033b2a ("f2fs: guarantee journalled quota data by checkpoint")
      Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
      Signed-off-by: Sheng Yong <shengyong1@huawei.com>
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      c42d28ce
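      The decision this commit describes can be modelled in a few lines. This is only a sketch under simplifying assumptions: `struct inode_sketch`, its fields, and `write_failed_sketch()` are hypothetical, and the real locking around truncation is not modelled at all.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily reduced inode model. */
struct inode_sketch {
	bool is_quota_file;  /* true for a quota system file */
	long i_size;         /* current file size in bytes */
	long allocated_end;  /* end of the space allocated so far */
};

/*
 * Called after a failed write that tried to extend the file up to 'to'.
 * Returns true if the space past i_size was truncated away.
 */
static bool write_failed_sketch(struct inode_sketch *inode, long to)
{
	if (to <= inode->i_size)
		return false; /* nothing was allocated past EOF */

	/*
	 * Quota files keep their preallocated tail: truncating here would
	 * need the lock that the checkpoint path may already be waiting on.
	 */
	if (inode->is_quota_file)
		return false;

	inode->allocated_end = inode->i_size;
	return true;
}

/* Usage: a regular file is trimmed, a quota file is left alone. */
void write_failed_example(void)
{
	struct inode_sketch regular = { false, 4096, 8192 };
	struct inode_sketch quota = { true, 4096, 8192 };

	assert(write_failed_sketch(&regular, 8192));
	assert(regular.allocated_end == 4096);
	assert(!write_failed_sketch(&quota, 8192));
	assert(quota.allocated_end == 8192);
}
```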
  11. 16 Feb, 2019 1 commit
  12. 24 Jan, 2019 1 commit
  13. 09 Jan, 2019 2 commits
  14. 27 Dec, 2018 2 commits
  15. 14 Dec, 2018 1 commit
  16. 27 Nov, 2018 6 commits
    • J
      f2fs: fix m_may_create to make OPU DIO write correctly · f4f0b677
      Jia Zhu committed
      Previously, we added a parameter @map.m_may_create to trigger OPU
      allocation and call f2fs_balance_fs() correctly.
      
      But in get_more_blocks(), @create is overwritten by the code below,
      so f2fs_map_blocks() does not allocate a new block address and
      returns directly. Meanwhile, several functions call
      f2fs_map_blocks() directly without initializing @map.m_may_create.
      CODE:
      	create = dio->op == REQ_OP_WRITE;
      	if (dio->flags & DIO_SKIP_HOLES) {
      		if (fs_startblk <= ((i_size_read(dio->inode) - 1) >>
      						i_blkbits))
      			create = 0;
      	}
      
      This patch fixes it.
      Signed-off-by: Jia Zhu <zhujia13@huawei.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      f4f0b677
    • C
      f2fs: fix out-place-update DIO write · f9d6d059
      Chao Yu committed
      In get_more_blocks(), we may override @create as in the code below:
      
      	create = dio->op == REQ_OP_WRITE;
      	if (dio->flags & DIO_SKIP_HOLES) {
      		if (fs_startblk <= ((i_size_read(dio->inode) - 1) >>
      						i_blkbits))
      			create = 0;
      	}
      
      But in f2fs_map_blocks(), we only trigger f2fs_balance_fs() if @create
      is 1, so in LFS mode a dio overwrite can easily run out of free
      segments, resulting in the panic below.
      
       Call Trace:
        allocate_segment_by_default+0xa8/0x270 [f2fs]
        f2fs_allocate_data_block+0x1ea/0x5c0 [f2fs]
        __allocate_data_block+0x306/0x480 [f2fs]
        f2fs_map_blocks+0x6f6/0x920 [f2fs]
        __get_data_block+0x4f/0xb0 [f2fs]
        get_data_block_dio_write+0x50/0x60 [f2fs]
        do_blockdev_direct_IO+0xcd5/0x21e0
        __blockdev_direct_IO+0x3a/0x3c
        f2fs_direct_IO+0x1ff/0x4a0 [f2fs]
        generic_file_direct_write+0xd9/0x160
        __generic_file_write_iter+0xbb/0x1e0
        f2fs_file_write_iter+0xaf/0x220 [f2fs]
        __vfs_write+0xd0/0x130
        vfs_write+0xb2/0x1b0
        SyS_pwrite64+0x69/0xa0
        ? vtime_user_exit+0x29/0x70
        do_syscall_64+0x6e/0x160
        entry_SYSCALL64_slow_path+0x25/0x25
       RIP: new_curseg+0x36f/0x380 [f2fs] RSP: ffffac570393f7a8
      
      So this patch introduces a parameter map.m_may_create to indicate
      whether f2fs_map_blocks() is called from the write or the read path,
      which gives f2fs_map_blocks() the right hint to trigger OPU allocation
      and call f2fs_balance_fs() correctly.
      
      BTW, it disables physical address preallocation for direct IO in
      f2fs_preallocate_blocks, which is redundant with the OPU allocation in
      f2fs_map_blocks.
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      f9d6d059
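      The role of the extra hint can be sketched as follows. This is a simplified model, not the f2fs implementation: `struct map_sketch`, `needs_new_block()`, and the `lfs_mode` argument are hypothetical names that stand in for the state the message refers to.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical reduced mapping request. */
struct map_sketch {
	bool m_may_create; /* true when called from a write path */
};

/*
 * Decide whether the mapping step must allocate a fresh block.
 * 'create' is the value handed down by the generic DIO layer, which may
 * have been reset to 0 for an overwrite inside i_size.
 */
static bool needs_new_block(const struct map_sketch *map, int create, bool lfs_mode)
{
	if (!map->m_may_create)
		return false; /* read path: never allocate */
	if (create)
		return true;  /* the write extends or fills a hole */
	/* Overwrite: in-place update is not allowed in LFS mode. */
	return lfs_mode;
}

/* Usage: the overwrite case that 'create' alone cannot express. */
void may_create_example(void)
{
	struct map_sketch wr = { .m_may_create = true };
	struct map_sketch rd = { .m_may_create = false };

	assert(needs_new_block(&wr, 0, true));   /* LFS overwrite */
	assert(!needs_new_block(&wr, 0, false)); /* in-place overwrite */
	assert(!needs_new_block(&rd, 0, true));  /* read */
}
```

      The point of the sketch is that the write-or-read hint travels separately from `create`, so the generic layer resetting `create` no longer hides the fact that a write is in progress.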
    • Y
      f2fs: move dir data flush to write checkpoint process · b61ac5b7
      Yunlei He committed
      This patch moves the dir data flush into the write checkpoint process,
      which may reduce the time taken by dir fsync.
      
      pre:
      	-f2fs_do_sync_file enter
      		-file_write_and_wait_range  <- flush & wait
      		-write_checkpoint
      			-do_checkpoint	    <- wait all
      	-f2fs_do_sync_file exit
      
      now:
      	-f2fs_do_sync_file enter
      		-write_checkpoint
      			-block_operations   <- flush dir & no wait
      			-do_checkpoint	    <- wait all
      	-f2fs_do_sync_file exit
      Signed-off-by: Yunlei He <heyunlei@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      b61ac5b7
    • Y
      f2fs: change segment to section in f2fs_ioc_gc_range · 67b0e42b
      Yunlong Song committed
      f2fs_ioc_gc_range advances by blocks_per_seg on each iteration, whereas
      f2fs_gc moves a whole section's blocks each time, so change the step
      from segment to section.
      Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      67b0e42b
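      The effect of the step size can be illustrated with a small loop model. It is a sketch only: `struct geom_sketch`, `blocks_per_sec()`, and `count_gc_calls()` are hypothetical, and the geometry values in the example are arbitrary.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout parameters. */
struct geom_sketch {
	uint32_t blocks_per_seg;
	uint32_t segs_per_sec;
};

static uint64_t blocks_per_sec(const struct geom_sketch *g)
{
	return (uint64_t)g->blocks_per_seg * g->segs_per_sec;
}

/* Count how many GC invocations cover [start, end] with a given step. */
static unsigned count_gc_calls(uint64_t start, uint64_t end, uint64_t step)
{
	unsigned calls = 0;

	assert(step > 0);
	while (start <= end) {
		calls++;       /* one GC pass handles a whole section */
		start += step; /* so the cursor should move by a section */
	}
	return calls;
}

/* Usage: with 4 segments per section, a segment step repeats work. */
void gc_range_example(void)
{
	struct geom_sketch g = { .blocks_per_seg = 512, .segs_per_sec = 4 };

	assert(count_gc_calls(0, 4095, g.blocks_per_seg) == 8);
	assert(count_gc_calls(0, 4095, blocks_per_sec(&g)) == 2);
}
```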
    • C
      f2fs: clean up f2fs_sb_has_##feature_name · 7beb01f7
      Chao Yu committed
      F2FS_HAS_FEATURE() uses F2FS_SB(sb) to get the sbi pointer in order to
      access the .raw_super field. To avoid this unneeded pointer conversion,
      this patch changes F2FS_HAS_FEATURE() to accept the sbi parameter
      directly.
      
      Just do cleanup, no logic change.
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      7beb01f7
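      A sketch of what a feature-test macro taking the sbi pointer directly might look like. The structures, the feature bit values, and the macro names below are hypothetical stand-ins rather than the kernel definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bits. */
#define FEATURE_ENCRYPT   0x0001u
#define FEATURE_PRJQUOTA  0x0010u

/* Hypothetical reduced superblock structures. */
struct raw_super_sketch {
	uint32_t feature;
};

struct sb_info_sketch {
	struct raw_super_sketch *raw_super;
};

/* Takes sbi directly, so no sb -> sbi conversion happens inside. */
#define HAS_FEATURE(sbi, mask) (((sbi)->raw_super->feature & (mask)) != 0)

/* Generates one sb_has_<name>() helper per feature. */
#define FEATURE_FUNCS(name, flag) \
static inline bool sb_has_##name(const struct sb_info_sketch *sbi) \
{ \
	return HAS_FEATURE(sbi, flag); \
}

FEATURE_FUNCS(encrypt, FEATURE_ENCRYPT)
FEATURE_FUNCS(project_quota, FEATURE_PRJQUOTA)

/* Usage: callers that already hold sbi pass it straight through. */
void feature_example(void)
{
	struct raw_super_sketch raw = { .feature = FEATURE_PRJQUOTA };
	struct sb_info_sketch sbi = { .raw_super = &raw };

	assert(sb_has_project_quota(&sbi));
	assert(!sb_has_encrypt(&sbi));
}
```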
    • C
      f2fs: introduce __is_large_section() for cleanup · 2c70c5e3
      Chao Yu committed
      Introduce a wrapper __is_large_section() to clean up the code.
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      2c70c5e3
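      The commit message does not show the wrapper itself. One plausible shape, assuming a "large section" means a section made of more than one segment, is sketched here with a hypothetical structure and function name.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in holding only the layout field needed here. */
struct layout_sketch {
	unsigned int segs_per_sec; /* segments per section */
};

/* Assumption: a section is "large" when it spans several segments. */
static inline bool is_large_section(const struct layout_sketch *l)
{
	return l->segs_per_sec > 1;
}

/* Usage: replaces open-coded comparisons scattered through callers. */
void large_section_example(void)
{
	struct layout_sketch small = { .segs_per_sec = 1 };
	struct layout_sketch large = { .segs_per_sec = 8 };

	assert(!is_large_section(&small));
	assert(is_large_section(&large));
}
```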
  17. 23 Oct, 2018 2 commits
    • C
      f2fs: fix to keep project quota consistent · 78130819
      Chao Yu committed
      This patch makes the changes below to keep project quota data
      consistent after a sudden power cut:
      - update inode.i_projid and project quota atomically under lock_op() in
        f2fs_ioc_setproject()
      - recover inode.i_projid and project quota in recover_inode()
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      78130819
    • C
      f2fs: guarantee journalled quota data by checkpoint · af033b2a
      Chao Yu committed
      For journalled quota mode, let checkpoint flush dquot dirty data
      and quota file data to guarantee that all quota sysfiles are persisted
      in the last checkpoint. This way, we avoid corrupting quota sysfiles
      when encountering SPO (sudden power-off).
      
      The implementation is as below:
      
      1. add a global state SBI_QUOTA_NEED_FLUSH to indicate that there are
      cached dquot metadata changes in the quota subsystem, and a later
      checkpoint should:
       a) flush dquot metadata into the quota file.
       b) flush the quota file to storage to keep file usage consistent.
      
      2. add a global state SBI_QUOTA_NEED_REPAIR to indicate that quota
      operation failed due to -EIO or -ENOSPC, so later,
       a) checkpoint will skip syncing dquot metadata.
       b) CP_QUOTA_NEED_FSCK_FLAG will be set in last cp pack to give a
          hint for fsck repairing.
      
      3. add a global state SBI_QUOTA_SKIP_FLUSH. During checkpoint, if quota
      data is being updated very heavily, it may cause a hung task in
      block_operations(). To avoid this, if the retry count exceeds a
      threshold, just skip flushing and retry in the next checkpoint.
      Signed-off-by: Weichao Guo <guoweichao@huawei.com>
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      [Jaegeuk Kim: avoid warnings and set fsck flag]
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      af033b2a
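      The three states listed above can be read as a small decision procedure run at checkpoint time. The sketch below is one interpretation under stated assumptions: the bit values, `MAX_QUOTA_RETRY`, the `quota_action` enum, and the function name are all hypothetical, and the ordering of the checks is a guess rather than something the message specifies.

```c
#include <assert.h>

/* Hypothetical state bits mirroring the three states in the message. */
#define QUOTA_NEED_FLUSH   (1u << 0)
#define QUOTA_NEED_REPAIR  (1u << 1)

/* Hypothetical retry threshold. */
#define MAX_QUOTA_RETRY 8

enum quota_action {
	QUOTA_NOTHING,        /* no cached quota changes */
	QUOTA_FLUSH,          /* flush dquot metadata, then the quota file */
	QUOTA_SKIP_MARK_FSCK, /* skip syncing, leave a hint for fsck */
	QUOTA_DEFER           /* too many retries: try again next checkpoint */
};

/* Choose what a checkpoint does about quota data. */
static enum quota_action checkpoint_quota_action(unsigned int state, int retries)
{
	if (state & QUOTA_NEED_REPAIR)
		return QUOTA_SKIP_MARK_FSCK;
	if (!(state & QUOTA_NEED_FLUSH))
		return QUOTA_NOTHING;
	if (retries > MAX_QUOTA_RETRY)
		return QUOTA_DEFER;
	return QUOTA_FLUSH;
}

/* Usage: one call per state combination of interest. */
void quota_example(void)
{
	assert(checkpoint_quota_action(0, 0) == QUOTA_NOTHING);
	assert(checkpoint_quota_action(QUOTA_NEED_FLUSH, 0) == QUOTA_FLUSH);
	assert(checkpoint_quota_action(QUOTA_NEED_FLUSH, 9) == QUOTA_DEFER);
	assert(checkpoint_quota_action(QUOTA_NEED_FLUSH | QUOTA_NEED_REPAIR, 0)
	       == QUOTA_SKIP_MARK_FSCK);
}
```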
  18. 17 Oct, 2018 2 commits
  19. 01 Oct, 2018 2 commits
    • C
      f2fs: allow out-place-update for direct IO in LFS mode · f847c699
      Chao Yu committed
      Normally, DIO uses in-place-update, but in LFS mode f2fs doesn't
      allow triggering any in-place-update writes, so we fall back from
      direct write to buffered write, resulting in bad performance for
      large writes.
      
      This patch adds support for triggering out-place-update for direct IO
      to improve its performance.
      
      Note that direct read IO must be excluded during direct write, since
      new data written to a new block address will not be valid until the
      write has finished.
      
      storage: zram
      
      time xfs_io -f -d /mnt/f2fs/file -c "pwrite 0 1073741824" -c "fsync"
      
      Before:
      real	0m13.061s
      user	0m0.327s
      sys	0m12.486s
      
      After:
      real	0m6.448s
      user	0m0.228s
      sys	0m6.212s
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      f847c699
    • C
      f2fs: refactor ->page_mkwrite() flow · 39a86958
      Chao Yu committed
      Thread A				Thread B
      - f2fs_vm_page_mkwrite
      					- f2fs_setattr
      					 - down_write(i_mmap_sem)
      					 - truncate_setsize
      					 - f2fs_truncate
      					 - up_write(i_mmap_sem)
       - f2fs_reserve_block
       reserve NEW_ADDR
       - skip dirty page due to truncation
      
      1. we don't need to reserve a new block address for a truncated page.
      2. dn.data_blkaddr is used outside of node page lock coverage.
      
      Refactor the ->page_mkwrite() flow to fix the above issues:
      - use __do_map_lock() to avoid racing with checkpoint()
      - lock the data page prior to the dnode page
      - cover f2fs_reserve_block with the i_mmap_sem lock
      - wait for page writeback before zeroing the page
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      39a86958