1. 24 Feb 2017, 22 commits
    • f2fs: add ovp valid_blocks check for bg gc victim to fg_gc · e93b9865
      Committed by Hou Pengyang
      For foreground GC, the greedy algorithm should be adopted, which is what
      makes this formula work well:
      
      	(2 * (100 / config.overprovision + 1) + 6)
      
      But currently FG_GC gives priority to the victim segments already chosen by
      BG_GC and collects them first. These victims are selected by the
      cost-benefit algorithm, so we can't guarantee that such segments have few
      valid blocks. This may break the f2fs rule and, in the worst case, consume
      all the free segments.
      
      This patch fixes this by adding a filter in check_bg_victims: if a segment's
      number of valid blocks exceeds the overprovision ratio, skip that segment.
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Hou Pengyang <houpengyang@huawei.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
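The formula and the filter described in the commit above can be sketched in plain C. This is only an illustration, not the kernel code: `reserved_segments()`, `skip_bg_victim()`, and the `threshold` parameter are hypothetical names, and the overprovision value is assumed to be a percentage such as 5.0.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Evaluate the formula quoted in the commit message:
 *     (2 * (100 / config.overprovision + 1) + 6)
 * overprovision_pct is ASSUMED to be a percentage (e.g. 5.0 for 5%).
 */
static unsigned int reserved_segments(double overprovision_pct)
{
	assert(overprovision_pct > 0.0);
	return (unsigned int)(2.0 * (100.0 / overprovision_pct + 1.0) + 6.0);
}

/*
 * Hypothetical filter in the spirit of the patch: a victim picked earlier
 * by background GC is skipped by foreground GC when it still holds more
 * valid blocks than a threshold. How the threshold is derived from the
 * overprovision ratio is left to the caller here (an assumption).
 */
static bool skip_bg_victim(unsigned int valid_blocks, unsigned int threshold)
{
	return valid_blocks > threshold;
}
```

With a 5% overprovision value the formula evaluates to 2 * (20 + 1) + 6 = 48. A victim with 3 valid blocks passes any reasonable threshold, while a nearly full one is skipped.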
    • f2fs: do not wait for writeback in write_begin · 86d54795
      Committed by Jaegeuk Kim
      Otherwise we can get livelock like below.
      
      [79880.428136] dbench          D    0 18405  18404 0x00000000
      [79880.428139] Call Trace:
      [79880.428142]  __schedule+0x219/0x6b0
      [79880.428144]  schedule+0x36/0x80
      [79880.428147]  schedule_timeout+0x243/0x2e0
      [79880.428152]  ? update_sd_lb_stats+0x16b/0x5f0
      [79880.428155]  ? ktime_get+0x3c/0xb0
      [79880.428157]  io_schedule_timeout+0xa6/0x110
      [79880.428161]  __lock_page+0xf7/0x130
      [79880.428164]  ? unlock_page+0x30/0x30
      [79880.428167]  pagecache_get_page+0x16b/0x250
      [79880.428171]  grab_cache_page_write_begin+0x20/0x40
      [79880.428182]  f2fs_write_begin+0xa2/0xdb0 [f2fs]
      [79880.428192]  ? f2fs_mark_inode_dirty_sync+0x16/0x30 [f2fs]
      [79880.428197]  ? kmem_cache_free+0x79/0x200
      [79880.428203]  ? __mark_inode_dirty+0x17f/0x360
      [79880.428206]  generic_perform_write+0xbb/0x190
      [79880.428213]  ? file_update_time+0xa4/0xf0
      [79880.428217]  __generic_file_write_iter+0x19b/0x1e0
      [79880.428226]  f2fs_file_write_iter+0x9c/0x180 [f2fs]
      [79880.428231]  __vfs_write+0xc5/0x140
      [79880.428235]  vfs_write+0xb2/0x1b0
      [79880.428238]  SyS_write+0x46/0xa0
      [79880.428242]  entry_SYSCALL_64_fastpath+0x1e/0xad
      
      Fixes: cae96a5c8ab6 ("f2fs: check io submission more precisely")
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: replace __get_victim by dirty_segments in FG_GC · 05eeb118
      Committed by Yunlei He
      The FG_GC process searches for a victim section twice. This causes
      some dirty sections with fewer valid blocks to be skipped by garbage
      collection.
      
      section # 26425 : valid blocks # 3
      142.037567: get_victim_by_default: victim 26425 : valid blocks # 3
      142.037585: f2fs_get_victim: dev = (259,30), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 26425 ofs_unit = 1, pre_victim_secno = 26425, prefree = 0, free = 244
      142.039494: f2fs_get_victim: dev = (259,30), type = Hot DATA, policy = (Background GC, SSR-mode, Greedy), victim = 19022 ofs_unit = 1, pre_victim_secno = 26425, prefree = 0, free = 24
      142.070247: new_curseg: Debug: alloc new segment 26746
      142.244341: f2fs_get_victim: dev = (259,30), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 26054 ofs_unit = 1, pre_victim_secno = 26054, prefree = 0, free = 243
      142.254475: do_garbage_collect: Debug: FG_GC, seg_freed = 1
      142.293131: f2fs_get_victim: dev = (259,30), type = Warm DATA, policy = (Background GC, SSR-mode, Greedy), victim = 23466 ofs_unit = 1, pre_victim_secno = -1, prefree = 0, free = 244
      142.319001: f2fs_get_victim: dev = (259,30), type = Warm DATA, policy = (Background GC, SSR-mode, Greedy), victim = 23467 ofs_unit = 1, pre_victim_secno = -1, prefree = 0, free = 244
      142.368879: get_victim_by_default: victim 26425 : valid blocks # 3
      142.368894: f2fs_get_victim: dev = (259,30), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 26425 ofs_unit = 1, pre_victim_secno = 26425, prefree = 0, free = 244
      142.378127: f2fs_get_victim: dev = (259,30), type = Hot DATA, policy = (Background GC, SSR-mode, Greedy), victim = 19612 ofs_unit = 1, pre_victim_secno = 26425, prefree = 0, free = 24
      142.416917: new_curseg: Debug: alloc new segment 26054
      142.656794: f2fs_get_victim: dev = (259,30), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 25404 ofs_unit = 1, pre_victim_secno = 25404, prefree = 0, free = 243
      142.662139: do_garbage_collect: Debug: FG_GC, seg_freed = 1
      142.684159: new_curseg: Debug: alloc new segment 25197
      142.685059: get_victim_by_default: victim 26425 : valid blocks # 3
      142.685079: f2fs_get_victim: dev = (259,30), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 26425 ofs_unit = 1, pre_victim_secno = 26425, prefree = 0, free = 243
      142.701427: f2fs_get_victim: dev = (259,30), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 26238 ofs_unit = 1, pre_victim_secno = 26238, prefree = 0, free = 243
      142.707105: do_garbage_collect: Debug: FG_GC, seg_freed = 1
      142.802444: f2fs_get_victim: dev = (259,30), type = Warm DATA, policy = (Background GC, SSR-mode, Greedy), victim = 23473 ofs_unit = 1, pre_victim_secno = -1, prefree = 0, free = 244
      142.804422: get_victim_by_default: victim 26425 : valid blocks # 3
      142.804443: f2fs_get_victim: dev = (259,30), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 26425 ofs_unit = 1, pre_victim_secno = 26425, prefree = 0, free = 244
      142.851567: f2fs_get_victim: dev = (259,30), type = Hot DATA, policy = (Background GC, SSR-mode, Greedy), victim = 19092 ofs_unit = 1, pre_victim_secno = 26425, prefree = 0, free = 24
      142.865014: new_curseg: Debug: alloc new segment 26238
      143.082245: f2fs_get_victim: dev = (259,30), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 26307 ofs_unit = 1, pre_victim_secno = 26307, prefree = 0, free = 244
      143.088252: do_garbage_collect: Debug: FG_GC, seg_freed = 1
      143.128307: new_curseg: Debug: alloc new segment 25404
      143.181846: get_victim_by_default: victim 26425 : valid blocks # 3
      143.181872: f2fs_get_victim: dev = (259,30), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 26425 ofs_unit = 1, pre_victim_secno = 26425, prefree = 0, free = 244
      Signed-off-by: Yunlei He <heyunlei@huawei.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: trace victim's cost selectecd by f2fs_gc · 5012de20
      Committed by Jaegeuk Kim
      This patch adds the min_cost of each victim.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: fix multiple f2fs_add_link() calls having same name · 88c5c13a
      Committed by Jaegeuk Kim
      It turns out that a stackable filesystem like sdcardfs in AOSP can trigger
      multiple vfs_create() calls to the lower filesystem. In that case, f2fs will
      add multiple dentries having the same name, which breaks filesystem
      consistency.
      
      Until the upper layer is fixed, let's work around this in f2fs; the
      workaround shows no significant performance regression.
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
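The idea behind the workaround above, checking for an existing entry with the same name before adding a new one, can be illustrated with a toy in-memory directory. This is a sketch under stated assumptions, not the f2fs implementation: `struct toy_dir`, `toy_lookup()`, and `toy_add_link()` are hypothetical names.

```c
#include <errno.h>
#include <stdbool.h>
#include <string.h>

#define TOY_DIR_MAX  16
#define TOY_NAME_MAX 32

/* Hypothetical, fixed-size stand-in for a directory's dentry list. */
struct toy_dir {
	char names[TOY_DIR_MAX][TOY_NAME_MAX];
	int count;
};

/* Return true if an entry called `name` already exists in `dir`. */
static bool toy_lookup(const struct toy_dir *dir, const char *name)
{
	for (int i = 0; i < dir->count; i++)
		if (strcmp(dir->names[i], name) == 0)
			return true;
	return false;
}

/*
 * Add `name` only if it is not present yet. A second create request for
 * the same name is rejected with -EEXIST instead of producing a duplicate
 * entry.
 */
static int toy_add_link(struct toy_dir *dir, const char *name)
{
	size_t len = strlen(name);

	if (len >= TOY_NAME_MAX)
		return -ENAMETOOLONG;
	if (toy_lookup(dir, name))
		return -EEXIST;
	if (dir->count >= TOY_DIR_MAX)
		return -ENOSPC;

	memcpy(dir->names[dir->count], name, len + 1);
	dir->count++;
	return 0;
}
```

The extra lookup costs one scan per create, which is the kind of overhead the commit message describes as not significant.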
    • f2fs: show actual device info in tracepoints · d50aaeec
      Committed by Jaegeuk Kim
      This patch shows actual device information in the tracepoints.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: use SSR for warm node as well · 5b6c6be2
      Committed by Jaegeuk Kim
      We have had node chains, but haven't used them so far due to stale node
      blocks. Now that we have crc|cp_ver in the node footer and assign a random
      cp_ver at format time, we can start to use them again.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: enable inline_xattr by default · 39133a50
      Committed by Chao Yu
      On Android, since SELinux is enabled, a security policy is applied to
      each file and stored in the inode as an xattr entry, so each file takes
      one additional 4 KB node block.
      
      Let's enable inline_xattr by default in order to save storage space.
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: introduce noinline_xattr mount option · 23cf7212
      Committed by Chao Yu
      This patch introduces a new mount option, 'noinline_xattr', so that we can
      disable the inline xattr functionality, which is already set as a default
      mount option.
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: avoid reading NAT page by get_node_info · 25cc5d3b
      Committed by Jaegeuk Kim
      We've not seen this buggy case for a long time, so it's time to avoid this
      unnecessary get_node_info() call, which reads a NAT page to cache the nat
      entry.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: remove build_free_nids() during checkpoint · 9b064f7d
      Committed by Jaegeuk Kim
      Let's avoid build_free_nids() in the checkpoint path.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: change recovery policy of xattr node block · d260081c
      Committed by Chao Yu
      Currently, if we call fsync after updating the xattr data belonging to a
      file, f2fs needs to trigger a checkpoint to keep the xattr data consistent.
      But this policy causes low performance, as a checkpoint blocks most
      foreground operations and causes unneeded and unrelated IOs around it.
      
      This patch reuses the regular file recovery policy for the xattr node
      block: we now write an xattr node block tagged with the fsync flag to the
      warm area instead of the cold area, and during recovery we search the warm
      node chain for fsynced xattr blocks and recover them.
      
      So, for the application IO pattern below, performance improves noticeably:
      - touch file
      - create/update/delete xattr entry in file
      - fsync file
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
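The application IO pattern listed in the commit above (touch, update an xattr, fsync) looks roughly like this from user space on Linux. This is a minimal sketch: the function name `touch_setxattr_fsync()`, the attribute name `user.demo`, and its value are illustrative assumptions, and the call simply reports an error on filesystems that do not support user xattrs.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/xattr.h>
#include <unistd.h>

/*
 * Create (or open) `path`, set one extended attribute on it, then fsync.
 * Returns 0 on success or a negative errno value on failure.
 */
static int touch_setxattr_fsync(const char *path)
{
	const char *value = "demo";	/* illustrative attribute value */
	int ret = 0;
	int fd;

	/* "touch file" */
	fd = open(path, O_CREAT | O_WRONLY, 0644);
	if (fd < 0)
		return -errno;

	/* "create/update xattr entry in file" */
	if (fsetxattr(fd, "user.demo", value, strlen(value), 0) < 0)
		ret = -errno;
	/* "fsync file": the step whose cost the patch aims to reduce */
	else if (fsync(fd) < 0)
		ret = -errno;

	if (close(fd) < 0 && ret == 0)
		ret = -errno;
	return ret;
}
```

Before the patch, the fsync step in this sequence would trigger a checkpoint according to the commit message; the sketch only shows the calling pattern and makes no claim about measured timings.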
    • f2fs: super: constify fscrypt_operations structure · 2ad0ef84
      Committed by Bhumika Goyal
      Declare fscrypt_operations structure as const as it is only stored in
      the s_cop field of a super_block structure. This field is of type const,
      so fscrypt_operations structure having this property can be made const
      too.
      
      File size before: fs/f2fs/super.o
         text	   data	    bss	    dec	    hex	filename
        54131	  31355	    184	  85670	  14ea6	fs/f2fs/super.o
      
      File size after: fs/f2fs/super.o
         text	   data	    bss	    dec	    hex	filename
        54227	  31259	    184	  85670	  14ea6	fs/f2fs/super.o
      Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: show checkpoint version at mount time · 1200abb2
      Committed by Jaegeuk Kim
      If we mounted f2fs successfully, let's show the current checkpoint version.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: fix a typo in f2fs.txt · 6de3f12e
      Committed by Tiezhu Yang
      There is a typo "f2f2" in f2fs.txt; this patch fixes it.
      Signed-off-by: Tiezhu Yang <kernelpatch@126.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: remove preflush for nobarrier case · 7f54f51f
      Committed by Jaegeuk Kim
      This patch removes REQ_PREFLUSH in the nobarrier case.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: check last page index in cached bio to decide submission · 942fd319
      Committed by Jaegeuk Kim
      If the cached bio has the last page's index, then we need to submit it.
      Otherwise, we don't need to submit it and can wait for further IO merges.
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
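The decision rule in the commit above can be modeled with a small C sketch. It is a simplified illustration and not the kernel's bio handling: `struct toy_bio` and `toy_must_submit()` are hypothetical, and "the last page's index" is assumed to mean the index of the final page the writer is working on.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a cached, not yet submitted bio. */
struct toy_bio {
	bool has_pages;           /* false when nothing is cached */
	unsigned long last_index; /* index of the newest page merged into it */
};

/*
 * Submit only when the cached bio already contains the last page of the
 * range being written. Otherwise keep it cached so that later pages can
 * still be merged into the same IO.
 */
static bool toy_must_submit(const struct toy_bio *bio,
			    unsigned long last_page_index)
{
	return bio->has_pages && bio->last_index == last_page_index;
}
```

For example, a cached bio ending at page 7 is submitted when the write ends at page 7, but stays cached while the write still extends to page 9.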
    • f2fs: check io submission more precisely · d68f735b
      Committed by Jaegeuk Kim
      This patch checks IO submission more precisely than the previous rough check.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: call internal __write_data_page directly · f566bae8
      Committed by Jaegeuk Kim
      This patch introduces __write_data_page so that f2fs_write_cache_pages can
      call it directly.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: avoid out-of-order execution of atomic writes · e7c75ab0
      Committed by Jaegeuk Kim
      We need to flush data writes before flushing the last node block write by
      using FUA with PREFLUSH. We don't need to guarantee the preceding node
      writes, since if those are not written, we can't reach the last node block
      when scanning the node block chain during roll-forward recovery.
      Afterwards, f2fs_wait_on_page_writeback guarantees all the IO submission to
      disk, which builds a valid node block chain.
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: move write_node_page above fsync_node_pages · faa24895
      Committed by Jaegeuk Kim
      This patch just moves write_node_page and introduces an inner function.
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: move flush tracepoint · c1b22107
      Committed by Jaegeuk Kim
      This patch moves the tracepoint location for the flush command.
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  2. 23 Feb 2017, 18 commits