1. 01 10月, 2016 6 次提交
  2. 13 9月, 2016 1 次提交
  3. 08 9月, 2016 1 次提交
    • C
      f2fs: do in batch synchronously readahead during GC · 7ea984b0
      Chao Yu 提交于
      In order to enhance performance, we try to readahead node page during
      GC, but before loading node page we should get block address of node page
      which is stored in NAT table, so synchronously read of single NAT page
      block our readahead flow.
      
      f2fs_submit_page_bio: dev = (251,0), ino = 2, page_index = 0xa1e, oldaddr = 0xa1e, newaddr = 0xa1e, rw = READ_SYNC(MP), type = META
      f2fs_submit_page_bio: dev = (251,0), ino = 1, page_index = 0x35e9, oldaddr = 0x72d7a, newaddr = 0x72d7a, rw = READAHEAD ^H, type = NODE
      f2fs_submit_page_bio: dev = (251,0), ino = 2, page_index = 0xc1f, oldaddr = 0xc1f, newaddr = 0xc1f, rw = READ_SYNC(MP), type = META
      f2fs_submit_page_bio: dev = (251,0), ino = 1, page_index = 0x389d, oldaddr = 0x72d7d, newaddr = 0x72d7d, rw = READAHEAD ^H, type = NODE
      f2fs_submit_page_bio: dev = (251,0), ino = 1, page_index = 0x3a82, oldaddr = 0x72d7f, newaddr = 0x72d7f, rw = READAHEAD ^H, type = NODE
      f2fs_submit_page_bio: dev = (251,0), ino = 1, page_index = 0x3bfa, oldaddr = 0x72d86, newaddr = 0x72d86, rw = READAHEAD ^H, type = NODE
      
      This patch adds one phase that do readahead NAT pages in batch before
      readahead node page for more effeciently.
      
      f2fs_submit_page_bio: dev = (251,0), ino = 2, page_index = 0x1952, oldaddr = 0x1952, newaddr = 0x1952, rw = READ_SYNC(MP), type = META
      f2fs_submit_page_mbio: dev = (251,0), ino = 2, page_index = 0xc34, oldaddr = 0xc34, newaddr = 0xc34, rw = READ_SYNC(MP), type = META
      f2fs_submit_page_mbio: dev = (251,0), ino = 2, page_index = 0xa33, oldaddr = 0xa33, newaddr = 0xa33, rw = READ_SYNC(MP), type = META
      f2fs_submit_page_mbio: dev = (251,0), ino = 2, page_index = 0xc30, oldaddr = 0xc30, newaddr = 0xc30, rw = READ_SYNC(MP), type = META
      f2fs_submit_page_mbio: dev = (251,0), ino = 2, page_index = 0xc32, oldaddr = 0xc32, newaddr = 0xc32, rw = READ_SYNC(MP), type = META
      f2fs_submit_page_mbio: dev = (251,0), ino = 2, page_index = 0xc26, oldaddr = 0xc26, newaddr = 0xc26, rw = READ_SYNC(MP), type = META
      f2fs_submit_page_mbio: dev = (251,0), ino = 2, page_index = 0xa2b, oldaddr = 0xa2b, newaddr = 0xa2b, rw = READ_SYNC(MP), type = META
      f2fs_submit_page_mbio: dev = (251,0), ino = 2, page_index = 0xc23, oldaddr = 0xc23, newaddr = 0xc23, rw = READ_SYNC(MP), type = META
      f2fs_submit_page_mbio: dev = (251,0), ino = 2, page_index = 0xc24, oldaddr = 0xc24, newaddr = 0xc24, rw = READ_SYNC(MP), type = META
      f2fs_submit_page_mbio: dev = (251,0), ino = 2, page_index = 0xa10, oldaddr = 0xa10, newaddr = 0xa10, rw = READ_SYNC(MP), type = META
      f2fs_submit_page_mbio: dev = (251,0), ino = 2, page_index = 0xc2c, oldaddr = 0xc2c, newaddr = 0xc2c, rw = READ_SYNC(MP), type = META
      f2fs_submit_page_bio: dev = (251,0), ino = 1, page_index = 0x5db7, oldaddr = 0x6be00, newaddr = 0x6be00, rw = READAHEAD ^H, type = NODE
      f2fs_submit_page_bio: dev = (251,0), ino = 1, page_index = 0x5db9, oldaddr = 0x6be17, newaddr = 0x6be17, rw = READAHEAD ^H, type = NODE
      f2fs_submit_page_bio: dev = (251,0), ino = 1, page_index = 0x5dbc, oldaddr = 0x6be1a, newaddr = 0x6be1a, rw = READAHEAD ^H, type = NODE
      f2fs_submit_page_bio: dev = (251,0), ino = 1, page_index = 0x5dc3, oldaddr = 0x6be20, newaddr = 0x6be20, rw = READAHEAD ^H, type = NODE
      f2fs_submit_page_bio: dev = (251,0), ino = 1, page_index = 0x5dc7, oldaddr = 0x6be24, newaddr = 0x6be24, rw = READAHEAD ^H, type = NODE
      f2fs_submit_page_bio: dev = (251,0), ino = 1, page_index = 0x5dc9, oldaddr = 0x6be25, newaddr = 0x6be25, rw = READAHEAD ^H, type = NODE
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      7ea984b0
  4. 30 8月, 2016 1 次提交
  5. 23 7月, 2016 1 次提交
    • Y
      f2fs: get victim segment again after new cp · fe94793e
      Yunlei He 提交于
      Previous selected segment may become free after write_checkpoint,
      if we do garbage collect on this segment, and then new_curseg happen
      to reuse it, it may cause f2fs_bug_on as below.
      
      	panic+0x154/0x29c
      	do_garbage_collect+0x15c/0xaf4
      	f2fs_gc+0x2dc/0x444
      	f2fs_balance_fs.part.22+0xcc/0x14c
      	f2fs_balance_fs+0x28/0x34
      	f2fs_map_blocks+0x5ec/0x790
      	f2fs_preallocate_blocks+0xe0/0x100
      	f2fs_file_write_iter+0x64/0x11c
      	new_sync_write+0xac/0x11c
      	vfs_write+0x144/0x1e4
      	SyS_write+0x60/0xc0
      
      Here, maybe we check sit and ssa type during reset_curseg. So, we check
      segment is stale or not, and select a new victim to avoid this.
      Signed-off-by: NYunlei He <heyunlei@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      fe94793e
  6. 21 7月, 2016 1 次提交
  7. 16 7月, 2016 2 次提交
    • J
      f2fs: use blk_plug in all the possible paths · 9dfa1baf
      Jaegeuk Kim 提交于
      This patch reverts 19a5f5e2 (f2fs: drop any block plugging),
      and adds blk_plug in write paths additionally.
      
      The main reason is that blk_start_plug can be used to wake up from low-power
      mode before submitting further bios.
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      9dfa1baf
    • C
      f2fs: fix to avoid data update racing between GC and DIO · 82e0a5aa
      Chao Yu 提交于
      Datas in file can be operated by GC and DIO simultaneously, so we will
      face race case as below:
      
      For write case:
      Thread A				Thread B
      - generic_file_direct_write
       - invalidate_inode_pages2_range
       - f2fs_direct_IO
        - do_blockdev_direct_IO
         - do_direct_IO
          - get_more_blocks
      					- f2fs_gc
      					 - do_garbage_collect
      					  - gc_data_segment
      					   - move_data_page
      					    - do_write_data_page
      					    migrate data block to new block address
         - dio_bio_submit
         update user data to old block address
      
      For read case:
      Thread A                                Thread B
      - generic_file_direct_write
       - invalidate_inode_pages2_range
       - f2fs_direct_IO
        - do_blockdev_direct_IO
         - do_direct_IO
          - get_more_blocks
      					- f2fs_balance_fs
      					 - f2fs_gc
      					  - do_garbage_collect
      					   - gc_data_segment
      					    - move_data_page
      					     - do_write_data_page
      					     migrate data block to new block address
      					  - write_checkpoint
      					   - do_checkpoint
      					    - clear_prefree_segments
      					     - f2fs_issue_discard
                                                   discard old block adress
         - dio_bio_submit
         update user buffer from obsolete block address
      
      In order to fix this, for one file, we should let DIO and GC getting exclusion
      against with each other.
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      82e0a5aa
  8. 09 7月, 2016 2 次提交
  9. 09 6月, 2016 2 次提交
  10. 08 6月, 2016 1 次提交
  11. 03 6月, 2016 1 次提交
  12. 08 5月, 2016 1 次提交
  13. 28 4月, 2016 1 次提交
    • C
      f2fs: move node pages only in victim section during GC · da011cc0
      Chao Yu 提交于
      For foreground GC, we cache node blocks in victim section and set them
      dirty, then we call sync_node_pages to flush these node pages, but
      meanwhile, those node pages which does not locate in victim section
      will be flushed together, so more bandwidth and continuous free space
      would be occupied.
      
      So for this condition, it's better to leave those unrelated node page
      in cache for further write hit, and let CP or VM to flush them afterward.
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      da011cc0
  14. 27 4月, 2016 1 次提交
  15. 27 2月, 2016 2 次提交
    • C
      f2fs: introduce f2fs_update_data_blkaddr for cleanup · f28b3434
      Chao Yu 提交于
      Add a new help f2fs_update_data_blkaddr to clean up redundant codes.
      Signed-off-by: NChao Yu <chao2.yu@samsung.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      f28b3434
    • C
      f2fs crypto: fix incorrect positioning for GCing encrypted data page · 4356e48e
      Chao Yu 提交于
      For now, flow of GCing an encrypted data page:
      1) try to grab meta page in meta inode's mapping with index of old block
      address of that data page
      2) load data of ciphertext into meta page
      3) allocate new block address
      4) write the meta page into new block address
      5) update block address pointer in direct node page.
      
      Other reader/writer will use f2fs_wait_on_encrypted_page_writeback to
      check and wait on GCed encrypted data cached in meta page writebacked
      in order to avoid inconsistence among data page cache, meta page cache
      and data on-disk when updating.
      
      However, we will use new block address updated in step 5) as an index to
      lookup meta page in inner bio buffer. That would be wrong, and we will
      never find the GCing meta page, since we use the old block address as
      index of that page in step 1).
      
      This patch fixes the issue by adjust the order of step 1) and step 3),
      and in step 1) grab page with index generated in step 3).
      Signed-off-by: NChao Yu <chao2.yu@samsung.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      4356e48e
  16. 23 2月, 2016 8 次提交
    • C
      f2fs: trace old block address for CoWed page · 7a9d7548
      Chao Yu 提交于
      This patch enables to trace old block address of CoWed page for better
      debugging.
      
      f2fs_submit_page_mbio: dev = (1,0), ino = 1, page_index = 0x1d4f0, oldaddr = 0xfe8ab, newaddr = 0xfee90 rw = WRITE_SYNC, type = NODE
      f2fs_submit_page_mbio: dev = (1,0), ino = 1, page_index = 0x1d4f8, oldaddr = 0xfe8b0, newaddr = 0xfee91 rw = WRITE_SYNC, type = NODE
      f2fs_submit_page_mbio: dev = (1,0), ino = 1, page_index = 0x1d4fa, oldaddr = 0xfe8ae, newaddr = 0xfee92 rw = WRITE_SYNC, type = NODE
      
      f2fs_submit_page_mbio: dev = (1,0), ino = 134824, page_index = 0x96, oldaddr = 0xf049b, newaddr = 0x2bbe rw = WRITE, type = DATA
      f2fs_submit_page_mbio: dev = (1,0), ino = 134824, page_index = 0x97, oldaddr = 0xf049c, newaddr = 0x2bbf rw = WRITE, type = DATA
      f2fs_submit_page_mbio: dev = (1,0), ino = 134824, page_index = 0x98, oldaddr = 0xf049d, newaddr = 0x2bc0 rw = WRITE, type = DATA
      
      f2fs_submit_page_mbio: dev = (1,0), ino = 135260, page_index = 0x47, oldaddr = 0xffffffff, newaddr = 0xf2631 rw = WRITE, type = DATA
      f2fs_submit_page_mbio: dev = (1,0), ino = 135260, page_index = 0x48, oldaddr = 0xffffffff, newaddr = 0xf2632 rw = WRITE, type = DATA
      f2fs_submit_page_mbio: dev = (1,0), ino = 135260, page_index = 0x49, oldaddr = 0xffffffff, newaddr = 0xf2633 rw = WRITE, type = DATA
      Signed-off-by: NChao Yu <chao2.yu@samsung.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      7a9d7548
    • C
      f2fs: fix the wrong stat count of calling gc · 17d899df
      Chao Yu 提交于
      With a partition which was formated as multi segments in one section,
      we stated incorrectly for count of gc operation.
      
      e.g., for a partition with segs_per_sec = 4
      
      cat /sys/kernel/debug/f2fs/status
      
      GC calls: 208 (BG: 7)
        - data segments : 104 (52)
        - node segments : 104 (24)
      
      GC called count should be (104 (data segs) + 104 (node segs)) / 4 = 52,
      rather than 208. Fix it.
      Signed-off-by: NChao Yu <chao2.yu@samsung.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      17d899df
    • J
      f2fs: remain last victim segment number ascending order · 4ce53776
      Jaegeuk Kim 提交于
      This patch avoids to remain inefficient victim segment number selected by
      a victim.
      
      For example, if all the dirty segments has same valid blocks, we can get
      the victim segments descending order due to keeping wrong last segment number.
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      4ce53776
    • C
      f2fs: remove unneeded pointer conversion · 81ca7350
      Chao Yu 提交于
      There are redundant pointer conversion in following call stack:
       - at position a, inode was been converted to f2fs_file_info.
       - at position b, f2fs_file_info was been converted to inode again.
      
       - truncate_blocks(inode,..)
        - fi = F2FS_I(inode)		---a
        - ADDRS_PER_PAGE(node_page, fi)
         - addrs_per_inode(fi)
          - inode = &fi->vfs_inode	---b
          - f2fs_has_inline_xattr(inode)
           - fi = F2FS_I(inode)
           - is_inode_flag_set(fi,..)
      
      In order to avoid unneeded conversion, alter ADDRS_PER_PAGE and
      addrs_per_inode to acept parameter with type of inode pointer.
      Signed-off-by: NChao Yu <chao2.yu@samsung.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      81ca7350
    • F
      f2fs: avoid unnecessary search while finding victim in gc · 688159b6
      Fan Li 提交于
      variable nsearched in get_victim_by_default() indicates the number of
      dirty segments we already checked. There are 2 problems about the way
      it updates:
      1. When p.ofs_unit is greater than 1, the victim we find consists
         of multiple segments, possibly more than 1 dirty segment.
         But nsearched always increases by 1.
      2. If segments have been found but not been chosen, nsearched won't
         increase. So even we have checked all dirty segments, nsearched
         may still less than p.max_search.
      All these problems could cause unnecessary search after all dirty
      segments have already been checked.
      Signed-off-by: NFan li <fanofcode.li@samsung.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      688159b6
    • J
      f2fs: use wait_for_stable_page to avoid contention · fec1d657
      Jaegeuk Kim 提交于
      In write_begin, if storage supports stable_page, we don't need to wait for
      writeback to update its contents.
      This patch introduces to use wait_for_stable_page instead of
      wait_on_page_writeback.
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      fec1d657
    • C
      f2fs: enhance foreground GC · 718e53fa
      Chao Yu 提交于
      If we configure section consist of multiple segments, foreground GC will
      do the garbage collection with following approach:
      
      	for each segment in victim section
      		blk_start_plug
      		for each valid block in segment
      			write out by OPU method
      		submit bio cache   <---
      		blk_finish_plug   <---
      
      There are two issue:
      1) for most of the time, 'submit bio cache' will break the merging in
      current bio buffer from writes of next segments, making a smaller bio
      submitting.
      2) block plug only cover IO submitting in one segment, which reduce
      opportunity of merging IOs in plug with multiple segments.
      
      So refactor the code as below structure to strive for biggest
      opportunity of merging IOs:
      
      	blk_start_plug
      	for each segment in victim section
      		for each valid block in segment
      			write out by OPU method
      	submit bio cache
      	blk_finish_plug
      
      Test method:
      1. mkfs.f2fs -s 8 /dev/sdX
      2. touch 32 files
      3. write 2M data into each file
      4. punch 1.5M data from offset 0 for each file
      5. trigger foreground gc through ioctl
      
      Before patch, there are totoally 40 bios submitted.
      f2fs_submit_write_bio: dev = (8,32), WRITE_SYNC, DATA, sector = 65536, size = 122880
      f2fs_submit_write_bio: dev = (8,32), WRITE_SYNC, DATA, sector = 65776, size = 122880
      f2fs_submit_write_bio: dev = (8,32), WRITE_SYNC, DATA, sector = 66016, size = 122880
      f2fs_submit_write_bio: dev = (8,32), WRITE_SYNC, DATA, sector = 66256, size = 122880
      f2fs_submit_write_bio: dev = (8,32), WRITE_SYNC, DATA, sector = 66496, size = 32768
      ----repeat for 8 times
      
      After patch, there are totally 35 bios submitted.
      f2fs_submit_write_bio: dev = (8,32), WRITE_SYNC, DATA, sector = 65536, size = 122880
      ----repeat 34 times
      f2fs_submit_write_bio: dev = (8,32), WRITE_SYNC, DATA, sector = 73696, size = 16384
      Signed-off-by: NChao Yu <chao2.yu@samsung.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      718e53fa
    • J
      f2fs: fix to overcome inline_data floods · 6e17bfbc
      Jaegeuk Kim 提交于
      The scenario is:
      1. create lots of node blocks
      2. sync
      3. write lots of inline_data
      -> got panic due to no free space
      
      In that case, we should flush node blocks when writing inline_data in #3,
      and trigger gc as well.
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      6e17bfbc
  17. 12 1月, 2016 1 次提交
  18. 31 12月, 2015 1 次提交
  19. 05 12月, 2015 1 次提交
  20. 14 10月, 2015 2 次提交
    • J
      f2fs: relocate the tracepoint for background_gc · 84e4214f
      Jaegeuk Kim 提交于
      Once f2fs_gc is done, wait_ms is changed once more.
      So, its tracepoint would be located after it.
      Reported-by: NHe YunLei <heyunlei@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      84e4214f
    • C
      f2fs crypto: fix racing of accessing encrypted page among · 08b39fbd
      Chao Yu 提交于
       different competitors
      
      Since we use different page cache (normally inode's page cache for R/W
      and meta inode's page cache for GC) to cache the same physical block
      which is belong to an encrypted inode. Writeback of these two page
      cache should be exclusive, but now we didn't handle writeback state
      well, so there may be potential racing problem:
      
      a)
      kworker:				f2fs_gc:
       - f2fs_write_data_pages
        - f2fs_write_data_page
         - do_write_data_page
          - write_data_page
           - f2fs_submit_page_mbio
      (page#1 in inode's page cache was queued
      in f2fs bio cache, and be ready to write
      to new blkaddr)
      					 - gc_data_segment
      					  - move_encrypted_block
      					   - pagecache_get_page
      					(page#2 in meta inode's page cache
      					was cached with the invalid datas
      					of physical block located in new
      					blkaddr)
      					   - f2fs_submit_page_mbio
      					(page#1 was submitted, later, page#2
      					with invalid data will be submitted)
      
      b)
      f2fs_gc:
       - gc_data_segment
        - move_encrypted_block
         - f2fs_submit_page_mbio
      (page#1 in meta inode's page cache was
      queued in f2fs bio cache, and be ready
      to write to new blkaddr)
      					user thread:
      					 - f2fs_write_begin
      					  - f2fs_submit_page_bio
      					(we submit the request to block layer
      					to update page#2 in inode's page cache
      					with physical block located in new
      					blkaddr, so here we may read gabbage
      					data from new blkaddr since GC hasn't
      					writebacked the page#1 yet)
      
      This patch fixes above potential racing problem for encrypted inode.
      Signed-off-by: NChao Yu <chao2.yu@samsung.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      08b39fbd
  21. 13 10月, 2015 2 次提交
    • C
      f2fs: support lower priority asynchronous readahead in ra_meta_pages · 26879fb1
      Chao Yu 提交于
      Now, we use ra_meta_pages to reads continuous physical blocks as much as
      possible to improve performance of following reads. However, ra_meta_pages
      uses a synchronous readahead approach by submitting bio with READ, as READ
      is with high priority, it can not be used in the case of preloading blocks,
      and it's not sure when these RAed pages will be used.
      
      This patch supports asynchronous readahead in ra_meta_pages by tagging bio
      with READA flag in order to allow preloading.
      Signed-off-by: NChao Yu <chao2.yu@samsung.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      26879fb1
    • J
      f2fs: set GFP_NOFS for grab_cache_page · a56c7c6f
      Jaegeuk Kim 提交于
      For normal inodes, their pages are allocated with __GFP_FS, which can cause
      filesystem calls when reclaiming memory.
      This can incur a dead lock condition accordingly.
      
      So, this patch addresses this problem by introducing
      f2fs_grab_cache_page(.., bool for_write), which calls
      grab_cache_page_write_begin() with AOP_FLAG_NOFS.
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      a56c7c6f
  22. 10 10月, 2015 1 次提交