1. 26 Nov, 2019 3 commits
    • f2fs: stop GC when the victim becomes fully valid · 803e74be
      Committed by Jaegeuk Kim
      We must stop GC once the segment becomes fully valid. Otherwise, it can
      produce more dirty segments by moving only part of the valid blocks in the segment.
      
      Ramon sometimes hit a no-free-segment panic and saw this case happen when
      validating the reliable file pinning feature.
      Signed-off-by: Ramon Pantin <pantin@google.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
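The stop condition described in this commit can be sketched as follows. This is a minimal user-space illustration, not the kernel code: `struct victim_seg`, `gc_victim()`, and the `valid_map` array are hypothetical names invented for the example, and "migrating" a block is reduced to decrementing a counter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified view of a victim segment (not a kernel structure). */
struct victim_seg {
    unsigned int valid_blocks;   /* current number of valid blocks */
    unsigned int blocks_per_seg; /* e.g. 512 for a 2MB segment of 4KB blocks */
};

/*
 * Walk the victim's block slots and "migrate" the valid ones.
 * valid_map[i] != 0 means slot i held a valid block when GC started.
 * Returns the number of blocks that were moved out.
 */
static unsigned int gc_victim(struct victim_seg *seg,
                              const unsigned char *valid_map)
{
    unsigned int moved = 0;

    assert(seg->valid_blocks <= seg->blocks_per_seg);
    for (unsigned int off = 0; off < seg->blocks_per_seg; off++) {
        /* Stop: a fully valid segment has nothing to reclaim, and moving
         * part of it would only create another dirty segment. */
        if (seg->valid_blocks == seg->blocks_per_seg)
            break;
        if (!valid_map[off])
            continue;
        seg->valid_blocks--; /* the block leaves the victim */
        moved++;
    }
    return moved;
}
```

With a fully valid segment the loop exits before moving anything, whereas a partially valid one is drained as usual.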
    • f2fs: expose main_blkaddr in sysfs · a4db59ac
      Committed by Jaegeuk Kim
      Expose in /sys/fs/f2fs/<blockdev>/main_blkaddr the block address where the
      main area starts. This allows user mode programs to determine:
      
      - That pinned files that are made exclusively of fully allocated 2MB
        segments will never be unpinned by the file system.
      
      - Where the main area starts. This is required by programs that want to
        verify if a file is made exclusively of 2MB f2fs segments, the alignment
        boundary for segments starts at this address. Testing for 2MB alignment
        relative to the start of the device is incorrect, because for some
        filesystems main_blkaddr is not at a 2MB boundary relative to the start
        of the device.
      
      The entry will be used when validating the reliable file pinning feature
      proposed by "f2fs: support aligned pinned file".
      Signed-off-by: Ramon Pantin <pantin@google.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
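A user-mode check along the lines described above might look like the sketch below. It assumes 4KB blocks (so a 2MB segment is 512 blocks) and that the sysfs entry holds a single unsigned integer; `read_main_blkaddr()` and `is_segment_aligned()` are hypothetical helper names, and the caller supplies the full sysfs path for its own block device.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define BLOCKS_PER_SEG 512u /* assumption: 2MB segment / 4KB block */

/* Read main_blkaddr from a path such as /sys/fs/f2fs/<blockdev>/main_blkaddr.
 * Returns 0 on success, -1 on failure. */
static int read_main_blkaddr(const char *path, uint64_t *out)
{
    FILE *f = fopen(path, "r");
    int ok;

    if (!f)
        return -1;
    ok = fscanf(f, "%" SCNu64, out) == 1;
    fclose(f);
    return ok ? 0 : -1;
}

/* Alignment is measured from the start of the main area,
 * not from the start of the device. */
static bool is_segment_aligned(uint64_t blkaddr, uint64_t main_blkaddr)
{
    assert(blkaddr >= main_blkaddr);
    return (blkaddr - main_blkaddr) % BLOCKS_PER_SEG == 0;
}
```

For example, with a made-up main_blkaddr of 1000, block 1024 is 2MB-aligned relative to the device start but not relative to the main area, while block 1512 is segment-aligned.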
    • f2fs: choose hardlimit when softlimit is larger than hardlimit in f2fs_statfs_project() · 909110c0
      Committed by Chengguang Xu
      Setting a softlimit larger than the hardlimit is meaningless
      for disk quota, but it is currently allowed. In this case,
      users may be confused when they run the df command on a
      directory that has a project quota.
      
      For example, we set a 20M softlimit and a 10M hardlimit on
      block usage for the project quota of test_dir (project id 123).
      
      [root@hades f2fs]# repquota -P -a
      *** Report for project quotas on device /dev/nvme0n1p8
      Block grace time: 7days; Inode grace time: 7days
                              Block limits                File limits
      Project         used    soft    hard  grace    used  soft  hard  grace
      ----------------------------------------------------------------------
      0         --       4       0       0              1     0     0
      123       +-   10248   20480   10240              2     0     0
      
      The result of df command as below:
      
      [root@hades f2fs]# df -h /mnt/f2fs/test
      Filesystem      Size  Used Avail Use% Mounted on
      /dev/nvme0n1p8   20M  11M   10M  51% /mnt/f2fs
      
      Even though it looks like there is another 10M of free space,
      writing new data to the directory test (which inherits the project id)
      fails with errno (-EDQUOT).
      
      After this patch, the df result looks like below.
      
      [root@hades f2fs]# df -h /mnt/f2fs/test
      Filesystem      Size  Used Avail Use% Mounted on
      /dev/nvme0n1p8   10M  10M     0 100% /mnt/f2fs
      Signed-off-by: Chengguang Xu <cgxu519@mykernel.net>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
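The effect on the df numbers can be reproduced with a small sketch. It assumes the limit reported to df is the smaller of the two non-zero limits, which is what the "after" output above implies; `struct df_view`, `min_not_zero_u64()`, and `project_df()` are hypothetical names, and all values are in KB as in the repquota report.

```c
#include <assert.h>
#include <stdint.h>

/* What df would show for a directory under project quota (values in KB). */
struct df_view {
    uint64_t total;
    uint64_t avail;
};

/* Smaller of two limits, treating 0 as "no limit set". */
static uint64_t min_not_zero_u64(uint64_t a, uint64_t b)
{
    if (a == 0)
        return b;
    if (b == 0)
        return a;
    return a < b ? a : b;
}

static struct df_view project_df(uint64_t soft, uint64_t hard, uint64_t used,
                                 uint64_t fs_total, uint64_t fs_avail)
{
    struct df_view v = { fs_total, fs_avail };
    uint64_t limit = min_not_zero_u64(soft, hard);

    assert(fs_avail <= fs_total);
    if (limit && fs_total > limit) {
        v.total = limit;
        /* Usage may already exceed the limit; never report negative space. */
        v.avail = limit > used ? limit - used : 0;
    }
    return v;
}
```

Feeding in the values from the example (soft 20480, hard 10240, used 10248) yields a total of 10240 and 0 available, matching the 10M / 0 / 100% line.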
  2. 20 Nov, 2019 2 commits
  3. 14 Nov, 2019 1 commit
  4. 13 Nov, 2019 1 commit
  5. 08 Nov, 2019 4 commits
    • f2fs: fix potential overflow · 1f0d5c91
      Committed by Chao Yu
      We expect a 64-bit result from the statement below. However, on a
      32-bit machine, the left shift on a pgoff_t variable may overflow.
      Fix it by forcing a type cast.
      
      page->index << PAGE_SHIFT;
      
      Fixes: 26de9b11 ("f2fs: avoid unnecessary updating inode during fsync")
      Fixes: 0a2aa8fb ("f2fs: refactor __exchange_data_block for speed up")
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
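The overflow can be demonstrated in user space with a 32-bit stand-in for pgoff_t. In this sketch, `pgoff32_t`, `offset_truncated()`, and `offset_widened()` are hypothetical names, and PAGE_SHIFT is assumed to be 12 (4KB pages).

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12 /* assumption: 4KB pages */

/* Stand-in for pgoff_t (unsigned long) on a 32-bit machine. */
typedef uint32_t pgoff32_t;

static_assert(sizeof(unsigned int) == 4, "sketch assumes 32-bit unsigned int");

/* The shift is done in 32 bits and only then widened: high bits are lost. */
static uint64_t offset_truncated(pgoff32_t index)
{
    return index << PAGE_SHIFT;
}

/* Widen first, then shift: the full 64-bit byte offset is preserved. */
static uint64_t offset_widened(pgoff32_t index)
{
    return (uint64_t)index << PAGE_SHIFT;
}
```

For page index 0x100000 (a 4GB offset), the first variant wraps to 0 while the second returns 4294967296.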
    • f2fs: fix to update dir's i_pino during cross_rename · 2a60637f
      Committed by Chao Yu
      As Eric reported:
      
      RENAME_EXCHANGE support was just added to fsstress in xfstests:
      
      	commit 65dfd40a97b6bbbd2a22538977bab355c5bc0f06
      	Author: kaixuxia <xiakaixu1987@gmail.com>
      	Date:   Thu Oct 31 14:41:48 2019 +0800
      
      	    fsstress: add EXCHANGE renameat2 support
      
      This is causing xfstest generic/579 to fail due to fsck.f2fs reporting errors.
      I'm not sure what the problem is, but it still happens even with all the
      fs-verity stuff in the test commented out, so that the test just runs fsstress.
      
      generic/579 23s ... 	[10:02:25]
      [    7.745370] run fstests generic/579 at 2019-11-04 10:02:25
      _check_generic_filesystem: filesystem on /dev/vdc is inconsistent
      (see /results/f2fs/results-default/generic/579.full for details)
       [10:02:47]
      Ran: generic/579
      Failures: generic/579
      Failed 1 of 1 tests
      Xunit report: /results/f2fs/results-default/result.xml
      
      Here's the contents of 579.full:
      
      _check_generic_filesystem: filesystem on /dev/vdc is inconsistent
      *** fsck.f2fs output ***
      [ASSERT] (__chk_dots_dentries:1378)  --> Bad inode number[0x24] for '..', parent parent ino is [0xd10]
      
      The root cause is that we forgot to update directory's i_pino during
      cross_rename, fix it.
      
      Fixes: 32f9bc25 ("f2fs: support ->rename2()")
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Tested-by: Eric Biggers <ebiggers@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: support aligned pinned file · f5a53edc
      Committed by Jaegeuk Kim
      This patch supports 2MB-aligned pinned files, which can guarantee no GC at all
      by allocating fully valid 2MB segments.
      
      Check free segments by has_not_enough_free_secs() with a large budget.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: avoid kernel panic on corruption test · bc005a4d
      Committed by Jaegeuk Kim
      xfstests generic/475 triggers kernel warnings/panics while testing a corrupted disk.
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  6. 07 Nov, 2019 2 commits
  7. 26 Oct, 2019 1 commit
    • f2fs: cache global IPU bio · 0b20fcec
      Committed by Chao Yu
      In commit 8648de2c ("f2fs: add bio cache for IPU"), we added
      f2fs_submit_ipu_bio() in __write_data_page() as below:
      
      __write_data_page()
      
      	if (!S_ISDIR(inode->i_mode) && !IS_NOQUOTA(inode)) {
      		f2fs_submit_ipu_bio(sbi, bio, page);
      		....
      	}
      
      in order to avoid below deadlock:
      
      Thread A				Thread B
      - __write_data_page (inode x, page y)
       - f2fs_do_write_data_page
        - set_page_writeback        ---- set writeback flag in page y
        - f2fs_inplace_write_data
       - f2fs_balance_fs
      					 - lock gc_mutex
       - lock gc_mutex
      					  - f2fs_gc
      					   - do_garbage_collect
      					    - gc_data_segment
      					     - move_data_page
      					      - f2fs_wait_on_page_writeback
      					       - wait_on_page_writeback  --- wait writeback of page y
      
      However, the bio submission breaks the merge of IPU IOs.
      
      So in this patch let's add a global bio cache for merged IPU pages,
      then f2fs_wait_on_page_writeback() is able to submit bio if a
      writebacked page is cached in global bio cache.
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  8. 23 Oct, 2019 4 commits
  9. 05 Oct, 2019 1 commit
    • f2fs: fix to update time in lazytime mode · fe1897ea
      Committed by Chao Yu
      generic/018 reports an inconsistent status of atime, the
      testcase is as below:
      - open file with O_SYNC
      - write file to construct fragmented space
      - calc md5 of file
      - record {a,c,m}time
      - defrag file --- do nothing
      - umount & mount
      - check {a,c,m}time
      
      The root cause is that, since f2fs enables lazytime by default, an atime
      update dirties the VFS inode rather than the f2fs inode (which would set
      FI_DIRTY_INODE), so a later f2fs_write_inode() called from VFS fails to
      update the inode page because of our skip:
      
      f2fs_write_inode()
      	if (!is_inode_flag_set(inode, FI_DIRTY_INODE))
      		return 0;
      
      So eventually, after evict(), we lose the last atime forever.
      
      To fix this issue, we need to check whether {a,c,m,cr}time is
      consistent in between inode cache and inode page, and only skip
      f2fs_update_inode() if f2fs inode is not dirty and time is
      consistent as well.
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
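The revised skip condition can be sketched as a pure predicate. This is an illustration under stated assumptions, not the kernel implementation: `struct times`, `struct inode_sketch`, `times_consistent()`, and `can_skip_update()` are hypothetical names, and timestamps are reduced to plain 64-bit integers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* The four timestamps named in the commit message: {a,c,m,cr}time. */
struct times {
    int64_t atime, ctime, mtime, crtime;
};

struct inode_sketch {
    bool dirty;           /* stands in for the FI_DIRTY_INODE flag */
    struct times cached;  /* values held in the in-memory inode */
    struct times on_page; /* values last written to the inode page */
};

static bool times_consistent(const struct times *a, const struct times *b)
{
    return a->atime == b->atime && a->ctime == b->ctime &&
           a->mtime == b->mtime && a->crtime == b->crtime;
}

/* Skip the inode page update only if nothing at all would change. */
static bool can_skip_update(const struct inode_sketch *inode)
{
    assert(inode != NULL);
    return !inode->dirty &&
           times_consistent(&inode->cached, &inode->on_page);
}
```

An inode that is clean in the f2fs sense but whose cached atime has moved ahead of the inode page no longer qualifies for the skip, which is the lazytime case that generic/018 exposed.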
  10. 18 Sep, 2019 1 commit
  11. 16 Sep, 2019 9 commits
    • f2fs: fix to add missing F2FS_IO_ALIGNED() condition · 8223ecc4
      Committed by Chao Yu
      In f2fs_allocate_data_block(), we will reset fio.retry for IO
      alignment feature instead of IO serialization feature.
      
      In addition, spread F2FS_IO_ALIGNED() to check IO alignment
      feature status explicitly.
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: fix to fallback to buffered IO in IO aligned mode · 9720ee80
      Committed by Chao Yu
      In LFS mode, we allow OPU for direct IO. However, we didn't consider the
      IO alignment feature, so direct IO can trigger unaligned IO. Let's
      just fall back to buffered IO to keep correct IO alignment semantics
      in all places.
      
      Fixes: f847c699 ("f2fs: allow out-place-update for direct IO in LFS mode")
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: fix to handle error path correctly in f2fs_map_blocks · 05e36006
      Committed by Chao Yu
      In f2fs_map_blocks(), we should bail out as soon as
      __allocate_data_block() fails.
      
      Fixes: f847c699 ("f2fs: allow out-place-update for direct IO in LFS mode")
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: fix extent corruption during directIO in LFS mode · 86f35dc3
      Committed by Chao Yu
      In LFS mode, por_fsstress testcase reports a bug as below:
      
      [ASSERT] (fsck_chk_inode_blk: 931)  --> ino: 0x12fe has wrong ext: [pgofs:142, blk:215424, len:16]
      
      Since commit f847c699 ("f2fs: allow out-place-update for direct
      IO in LFS mode"), we allow OPU mode for direct IO. However,
      we failed to update the extent cache in __allocate_data_block(), which
      causes the extent field to become inconsistent with the physical block
      address. Fix it.
      
      Fixes: f847c699 ("f2fs: allow out-place-update for direct IO in LFS mode")
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: check all the data segments against all node ones · 1166c1f2
      Committed by Surbhi Palande
      As part of the sanity checking at mount time, we verify that data and node
      segments are assigned distinct segment numbers. This fixes a small bug in
      that verification: we need to check all the data segments against all the
      node segments.
      
      Fixes: 042be0f8 ("f2fs: fix to do sanity check with current segment number")
      Signed-off-by: Surbhi Palande <csurbhi@gmail.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
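Checking every data segment against every node segment amounts to an all-pairs comparison. The sketch below is illustrative only: `node_data_segs_distinct()` is a hypothetical helper operating on plain arrays of segment numbers rather than on the checkpoint structures the kernel uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true only if no node segment number equals any data segment number.
 * Every (node, data) pair is compared, regardless of index. */
static bool node_data_segs_distinct(const unsigned int *node_segs, size_t n_node,
                                    const unsigned int *data_segs, size_t n_data)
{
    assert(node_segs != NULL && data_segs != NULL);
    for (size_t i = 0; i < n_node; i++) {
        for (size_t j = 0; j < n_data; j++) {
            if (node_segs[i] == data_segs[j])
                return false;
        }
    }
    return true;
}
```

A collision such as node segment index 2 matching data segment index 0 is caught by this form, while a loop that skips some index combinations could miss it.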
    • L
      bd7253bc
    • f2fs: fix inode rwsem regression · cb8434f1
      Committed by Goldwyn Rodrigues
      This is similar to 942491c9 ("xfs: fix AIM7 regression")
      Apparently our current rwsem code doesn't like doing the trylock, then
      lock for real scheme.  So change our read/write methods to just do the
      trylock for the RWF_NOWAIT case.
      
      We don't need a check for IOCB_NOWAIT and !direct-IO because it
      is checked in generic_write_checks().
      
      Fixes: b91050a8 ("f2fs: add nowait aio support")
      Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: fix to avoid accessing uninitialized field of inode page in is_alive() · 98194030
      Committed by Chao Yu
      If an inode is newly created, its inode page may not yet be synchronized with
      the inode cache, so fields like .i_inline or .i_extra_isize could be wrong. In
      the call path below, we may access such wrong fields, which results in failing
      to migrate a valid target block.
      
      Thread A				Thread B
      - f2fs_create
       - f2fs_add_link
        - f2fs_add_dentry
         - f2fs_init_inode_metadata
          - f2fs_add_inline_entry
           - f2fs_new_inode_page
           - f2fs_put_page
           : inode page wasn't updated with inode cache
      					- gc_data_segment
      					 - is_alive
      					  - f2fs_get_node_page
      					  - datablock_addr
      					   - offset_in_addr
      					   : access uninitialized fields
      
      Fixes: 7a2af766 ("f2fs: enhance on-disk inode structure scalability")
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: avoid infinite GC loop due to stale atomic files · 743b620c
      Committed by Jaegeuk Kim
      If committing atomic pages fails during f2fs_do_sync_file(), we can end up
      with committed pages while atomic_file is still set, like:
      
      - inmem:    0, atomic IO:    4 (Max.   10), volatile IO:    0 (Max.    0)
      
      If GC selects this block, we can get an infinite loop like this:
      
      f2fs_submit_page_bio: dev = (253,7), ino = 2, page_index = 0x2359a8, oldaddr = 0x2359a8, newaddr = 0x2359a8, rw = READ(), type = COLD_DATA
      f2fs_submit_read_bio: dev = (253,7)/(253,7), rw = READ(), DATA, sector = 18533696, size = 4096
      f2fs_get_victim: dev = (253,7), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 4355, cost = 1, ofs_unit = 1, pre_victim_secno = 4355, prefree = 0, free = 234
      f2fs_iget: dev = (253,7), ino = 6247, pino = 5845, i_mode = 0x81b0, i_size = 319488, i_nlink = 1, i_blocks = 624, i_advise = 0x2c
      f2fs_submit_page_bio: dev = (253,7), ino = 2, page_index = 0x2359a8, oldaddr = 0x2359a8, newaddr = 0x2359a8, rw = READ(), type = COLD_DATA
      f2fs_submit_read_bio: dev = (253,7)/(253,7), rw = READ(), DATA, sector = 18533696, size = 4096
      f2fs_get_victim: dev = (253,7), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 4355, cost = 1, ofs_unit = 1, pre_victim_secno = 4355, prefree = 0, free = 234
      f2fs_iget: dev = (253,7), ino = 6247, pino = 5845, i_mode = 0x81b0, i_size = 319488, i_nlink = 1, i_blocks = 624, i_advise = 0x2c
      
      In that moment, we can observe:
      
      [Before]
      Try to move 5084219 blocks (BG: 384508)
        - data blocks : 4962373 (274483)
        - node blocks : 121846 (110025)
      Skipped : atomic write 4534686 (10)
      
      [After]
      Try to move 5088973 blocks (BG: 384508)
        - data blocks : 4967127 (274483)
        - node blocks : 121846 (110025)
      Skipped : atomic write 4539440 (10)
      
      So, refactor atomic_write flow like this:
      1. start_atomic_write
       - add inmem_list and set atomic_file
      
      2. write()
       - register it in inmem_pages
      
      3. commit_atomic_write
       - if no error, f2fs_drop_inmem_pages()
       - f2fs_commit_inmem_pages() failed
         : __revoked_inmem_pages() was done
       - f2fs_do_sync_file failed
         : abort_atomic_write later
      
      4. abort_atomic_write
       - f2fs_drop_inmem_pages
      
      5. f2fs_drop_inmem_pages
       - clear atomic_file
       - remove inmem_list
      
      Based on this change, when GC fails to move block in atomic_file,
      f2fs_drop_inmem_pages_all() can call f2fs_drop_inmem_pages().
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  12. 09 Sep, 2019 1 commit
    • f2fs: Fix indefinite loop in f2fs_gc() · 957fa478
      Committed by Sahitya Tummala
      Policy - foreground GC, LFS mode and greedy GC mode.
      
      Under this policy, f2fs_gc() loops forever to GC as it doesn't have
      enough free segments to proceed and thus it keeps calling gc_more
      for the same victim segment.  This can happen if the selected victim
      segment could not be GC'd due to a failed blkaddr validity check, i.e.
      is_alive() returns false for the blocks set in the current validity map.
      
      Fix this by not resetting the sbi->cur_victim_sec to NULL_SEGNO, when
      the segment selected could not be GC'd. This helps to select another
      segment for GC and thus helps to proceed forward with GC.
      
      [Note]
      This can happen due to is_alive() as well as atomic_file, both of which
      skip GC.
      Signed-off-by: Sahitya Tummala <stummala@codeaurora.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
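Why keeping cur_victim_sec set helps can be seen in a toy greedy selector. This sketch assumes that victim selection skips the section currently recorded as the victim; `pick_victim()` and the `cost` array are hypothetical, and NULL_SEGNO is modeled here as UINT_MAX.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define NULL_SEGNO UINT_MAX /* modeled value for "no victim recorded" */

/* Greedy choice: lowest cost wins, but the segment recorded in cur_victim
 * is skipped. Returns NULL_SEGNO if no candidate is left. */
static unsigned int pick_victim(const unsigned int *cost, size_t nsegs,
                                unsigned int cur_victim)
{
    unsigned int best = NULL_SEGNO;

    assert(cost != NULL);
    for (size_t s = 0; s < nsegs; s++) {
        if (s == cur_victim)
            continue; /* the segment that just failed GC is not retried */
        if (best == NULL_SEGNO || cost[s] < cost[best])
            best = (unsigned int)s;
    }
    return best;
}
```

If the recorded victim is reset to NULL_SEGNO after a failed pass, the same cheapest segment is chosen again on every retry. Leaving it in place lets the next pass move on to the runner-up.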
  13. 07 Sep, 2019 8 commits
  14. 30 Aug, 2019 1 commit
    • timestamp_truncate: Replace users of timespec64_trunc · 3818c190
      Committed by Deepa Dinamani
      Update the inode timestamp updates to use timestamp_truncate()
      instead of timespec64_trunc().
      
      The change was mostly generated by the following coccinelle
      script.
      
      virtual context
      virtual patch
      
      @r1 depends on patch forall@
      struct inode *inode;
      identifier i_xtime =~ "^i_[acm]time$";
      expression e;
      @@
      
      inode->i_xtime =
      - timespec64_trunc(
      + timestamp_truncate(
      ...,
      - e);
      + inode);
      Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
      Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Acked-by: Jeff Layton <jlayton@kernel.org>
      Cc: adrian.hunter@intel.com
      Cc: dedekind1@gmail.com
      Cc: gregkh@linuxfoundation.org
      Cc: hch@lst.de
      Cc: jaegeuk@kernel.org
      Cc: jlbec@evilplan.org
      Cc: richard@nod.at
      Cc: tj@kernel.org
      Cc: yuchao0@huawei.com
      Cc: linux-f2fs-devel@lists.sourceforge.net
      Cc: linux-ntfs-dev@lists.sourceforge.net
      Cc: linux-mtd@lists.infradead.org
  15. 23 Aug, 2019 1 commit