1. 27 2月, 2014 1 次提交
  2. 17 2月, 2014 2 次提交
  3. 14 1月, 2014 1 次提交
  4. 08 1月, 2014 1 次提交
    • J
      f2fs: add a sysfs entry to control max_victim_search · b1c57c1c
      Jaegeuk Kim 提交于
      Previously during SSR and GC, the maximum number of retrials to find a victim
      segment was hard-coded by MAX_VICTIM_SEARCH, 4096 by default.
      
      This number makes an effect on IO locality, when SSR mode is activated, which
      results in performance fluctuation on some low-end devices.
      
      If max_victim_search = 4, the victim will be searched like below.
      ("D" represents a dirty segment, and "*" indicates a selected victim segment.)
      
       D1 D2 D3 D4 D5 D6 D7 D8 D9
      [   *       ]
            [   *    ]
                  [         * ]
      	                [ ....]
      
      This patch adds a sysfs entry to control the number dynamically through:
        /sys/fs/f2fs/$dev/max_victim_search
      Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
      b1c57c1c
  5. 23 12月, 2013 6 次提交
    • G
      f2fs: remove the rw_flag domain from f2fs_io_info · 7e8f2308
      Gu Zheng 提交于
      When using the f2fs_io_info in the low level, we still need to merge the
      rw and rw_flag, so use the rw to hold all the io flags directly,
      and remove the rw_flag field.
      
      ps.It is based on the previous patch:
      f2fs: move all the bio initialization into __bio_alloc
      Signed-off-by: NGu Zheng <guz.fnst@cn.fujitsu.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
      7e8f2308
    • J
      f2fs: refactor bio->rw handling · 458e6197
      Jaegeuk Kim 提交于
      This patch introduces f2fs_io_info to mitigate the complex parameter list.
      
      struct f2fs_io_info {
      	enum page_type type;		/* contains DATA/NODE/META/META_FLUSH */
      	int rw;				/* contains R/RS/W/WS */
      	int rw_flag;			/* contains REQ_META/REQ_PRIO */
      }
      
      1. f2fs_write_data_pages
       - DATA
       - WRITE_SYNC is set when wbc->WB_SYNC_ALL.
      
      2. sync_node_pages
       - NODE
       - WRITE_SYNC all the time
      
      3. sync_meta_pages
       - META
       - WRITE_SYNC all the time
       - REQ_META | REQ_PRIO all the time
      
       ** f2fs_submit_merged_bio() handles META_FLUSH.
      
      4. ra_nat_pages, ra_sit_pages, ra_sum_pages
       - META
       - READ_SYNC
      
      Cc: Fan Li <fanofcode.li@samsung.com>
      Cc: Changman Lee <cm224.lee@samsung.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
      458e6197
    • F
      f2fs: merge pages with the same sync_mode flag · 63a0b7cb
      Fan Li 提交于
      Previously f2fs submits most of write requests using WRITE_SYNC, but f2fs_write_data_pages
      submits last write requests by sync_mode flags callers pass.
      
      This causes a performance problem since continuous pages with different sync flags
      can't be merged in cfq IO scheduler(thanks yu chao for pointing it out), and synchronous
      requests often take more time.
      
      This patch makes the following modifies to DATA writebacks:
      
      1. every page will be written back using the sync mode caller pass.
      2. only pages with the same sync mode can be merged in one bio request.
      
      These changes are restricted to DATA pages.Other types of writebacks are modified
      To remain synchronous.
      
      In my test with tiotest, f2fs sequence write performance is improved by about 7%-10% ,
      and this patch has no obvious impact on other performance tests.
      Signed-off-by: NFan Li <fanofcode.li@samsung.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
      63a0b7cb
    • J
      f2fs: add unlikely() macro for compiler more aggressively · 6bacf52f
      Jaegeuk Kim 提交于
      This patch adds unlikely() macro into the most of codes.
      The basic rule is to add that when:
      - checking unusual errors,
      - checking page mappings,
      - and the other unlikely conditions.
      
      Change log from v1:
       - Don't add unlikely for the NULL test and error test: advised by Andi Kleen.
      
      Cc: Chao Yu <chao2.yu@samsung.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Reviewed-by: NChao Yu <chao2.yu@samsung.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
      6bacf52f
    • J
      f2fs: refactor bio-related operations · 93dfe2ac
      Jaegeuk Kim 提交于
      This patch integrates redundant bio operations on read and write IOs.
      
      1. Move bio-related codes to the top of data.c.
      2. Replace f2fs_submit_bio with f2fs_submit_merged_bio, which handles read
         bios additionally.
      3. Introduce __submit_merged_bio to submit the merged bio.
      4. Change f2fs_readpage to f2fs_submit_page_bio.
      5. Introduce f2fs_submit_page_mbio to integrate previous submit_read_page and
         submit_write_page.
      Reviewed-by: NGu Zheng <guz.fnst@cn.fujitsu.com>
      Reviewed-by: Chao Yu <chao2.yu@samsung.com >
      Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
      93dfe2ac
    • J
      f2fs: remove unnecessary condition checks · 031fa8cc
      Jaegeuk Kim 提交于
      This patch removes the unnecessary condition checks on:
      
      fs/f2fs/gc.c:667 do_garbage_collect() warn: 'sum_page' isn't an ERR_PTR
      fs/f2fs/f2fs.h:795 f2fs_put_page() warn: 'page' isn't an ERR_PTR
      Reported-by: NDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
      031fa8cc
  6. 25 10月, 2013 3 次提交
  7. 22 10月, 2013 1 次提交
  8. 24 9月, 2013 1 次提交
    • J
      f2fs: optimize the victim searching loop slightly · a57e564d
      Jin Xu 提交于
      Since the MAX_VICTIM_SEARCH has been enlarged from 20 to 4096,
      the victim searching overhead will be increased much than before,
      especially for SSR that searches victim for use quiet often.
      This patch intends to reduce the overhead a little bit by:
      - make the get_gc_cost a inline routine to reduce function call
        overhead
      - reduce multiplication and division operations
      - reduce unnecessary comparison operation
      Signed-off-by: NJin Xu <jinuxstyle@gmail.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
      a57e564d
  9. 05 9月, 2013 1 次提交
    • J
      f2fs: optimize gc for better performance · a26b7c8a
      Jin Xu 提交于
      This patch improves the gc efficiency by optimizing the victim
      selection policy. With this optimization, the random re-write
      performance could increase up to 20%.
      
      For f2fs, when disk is in shortage of free spaces, gc will selects
      dirty segments and moves valid blocks around for making more space
      available. The gc cost of a segment is determined by the valid blocks
      in the segment. The less the valid blocks, the higher the efficiency.
      The ideal victim segment is the one that has the most garbage blocks.
      
      Currently, it searches up to 20 dirty segments for a victim segment.
      The selected victim is not likely the best victim for gc when there
      are much more dirty segments. Why not searching more dirty segments
      for a better victim? The cost of searching dirty segments is
      negligible in comparison to moving blocks.
      
      In this patch, it enlarges the MAX_VICTIM_SEARCH to 4096 to make
      the search more aggressively for a possible better victim. Since
      it also applies to victim selection for SSR, it will likely improve
      the SSR efficiency as well.
      
      The test case is simple. It creates as many files until the disk full.
      The size for each file is 32KB. Then it writes as many as 100000
      records of 4KB size to random offsets of random files in sync mode.
      The testing was done on a 2GB partition of a SDHC card. Let's see the
      test result of f2fs without and with the patch.
      
      ---------------------------------------
      2GB partition, SDHC
      create 52023 files of size 32768 bytes
      random re-write 100000 records of 4KB
      ---------------------------------------
      | file creation (s) | rewrite time (s) | gc count | gc garbage blocks |
      [no patch]  341         4227             1174          174840
      [patched]   324         2958             645           106682
      
      It's obvious that, with the patch, f2fs finishes the test in 20+% less
      time than without the patch. And internally it does much less gc with
      higher efficiency than before.
      
      Since the performance improvement is related to gc, it might not be so
      obvious for other tests that do not trigger gc as often as this one (
      This is because f2fs selects dirty segments for SSR use most of the
      time when free space is in shortage). The well-known iozone test tool
      was not used for benchmarking the patch becuase it seems do not have
      a test case that performs random re-write on a full disk.
      
      This patch is the revised version based on the suggestion from
      Jaegeuk Kim.
      Signed-off-by: NJin Xu <jinuxstyle@gmail.com>
      [Jaegeuk Kim: suggested simpler solution]
      Reviewed-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
      a26b7c8a
  10. 26 8月, 2013 1 次提交
    • J
      f2fs: reserve the xattr space dynamically · de93653f
      Jaegeuk Kim 提交于
      This patch enables the number of direct pointers inside on-disk inode block to
      be changed dynamically according to the size of inline xattr space.
      
      The number of direct pointers, ADDRS_PER_INODE, can be changed only if the file
      has inline xattr flag.
      
      The number of direct pointers that will be used by inline xattrs is defined as
      F2FS_INLINE_XATTR_ADDRS.
      Current patch assigns F2FS_INLINE_XATTR_ADDRS to 0 temporarily.
      Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
      de93653f
  11. 06 8月, 2013 3 次提交
    • J
      f2fs: fix a deadlock in fsync · a569469e
      Jin Xu 提交于
      This patch fixes a deadlock bug that occurs quite often when there are
      concurrent write and fsync on a same file.
      
      Following is the simplified call trace when tasks get hung.
      
      fsync thread:
      - f2fs_sync_file
       ...
       - f2fs_write_data_pages
       ...
        - update_extent_cache
        ...
         - update_inode
          - wait_on_page_writeback
      
      bdi writeback thread
      - __writeback_single_inode
       - f2fs_write_data_pages
        - mutex_lock(sbi->writepages)
      
      The deadlock happens when the fsync thread waits on a inode page that has
      been added to the f2fs' cached bio sbi->bio[NODE], and unfortunately,
      no one else could be able to submit the cached bio to block layer for
      writeback. This is because the fsync thread already hold a sbi->fs_lock and
      the sbi->writepages lock, causing the bdi thread being blocked when attempt
      to write data pages for the same inode. At the same time, f2fs_gc thread
      does not notice the situation and could not help. Even the sync syscall
      gets blocked.
      
      To fix it, we could submit the cached bio first before waiting on a inode page
      that is being written back.
      Signed-off-by: NJin Xu <jinuxstyle@gmail.com>
      [Jaegeuk Kim: add more cases to use f2fs_wait_on_page_writeback]
      Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
      a569469e
    • N
      f2fs: add sysfs entries to select the gc policy · d2dc095f
      Namjae Jeon 提交于
      Add sysfs entry gc_idle to control the gc policy. Where
      gc_idle = 1 corresponds to selecting a cost benefit approach,
      while gc_idle = 2 corresponds to selecting a greedy approach
      to garbage collection. The selection is mutually exclusive one
      approach will work at any point. If gc_idle = 0, then this
      option is disabled.
      
      Cc: Gu Zheng <guz.fnst@cn.fujitsu.com>
      Signed-off-by: NNamjae Jeon <namjae.jeon@samsung.com>
      Signed-off-by: NPankaj Kumar <pankaj.km@samsung.com>
      Reviewed-by: NGu Zheng <guz.fnst@cn.fujitsu.com>
      [Jaegeuk Kim: change the select_gc_type() flow slightly]
      Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
      d2dc095f
    • N
      f2fs: add sysfs support for controlling the gc_thread · b59d0bae
      Namjae Jeon 提交于
      Add sysfs entries to control the timing parameters for
      f2fs gc thread.
      
      Various Sysfs options introduced are:
      gc_min_sleep_time: Min Sleep time for GC in ms
      gc_max_sleep_time: Max Sleep time for GC in ms
      gc_no_gc_sleep_time: Default Sleep time for GC in ms
      
      Cc: Gu Zheng <guz.fnst@cn.fujitsu.com>
      Signed-off-by: NNamjae Jeon <namjae.jeon@samsung.com>
      Signed-off-by: NPankaj Kumar <pankaj.km@samsung.com>
      Reviewed-by: NGu Zheng <guz.fnst@cn.fujitsu.com>
      [Jaegeuk Kim: fix an umount bug and some minor changes]
      Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
      b59d0bae
  12. 02 7月, 2013 1 次提交
  13. 06 6月, 2013 1 次提交
    • N
      f2fs: reorganise the function get_victim_by_default · b2b3460a
      Namjae Jeon 提交于
      Fix the function get_victim_by_default, where it checks
      for the condition  that p.min_segno != NULL_SEGNO as
      shown:
      
      if (p.min_segno != NULL_SEGNO)
                 goto got_it;
      
      and if above condition is true then
      
      got_it:
              if (p.min_segno != NULL_SEGNO) {
      
      So this condition is being checked twice. Hence move the goto
      statement after the if condition so that duplication of condition
      check is avoided.
      
      Also this function makes a call to get_max_cost() to compute
      the max cost based on the f2fs_sbi_info and victim policy. Since
      get_max_cost depends on on three parameters of victim_sel_policy
      => alloc_mode, gc_mode & ofs_unit, once this victim policy is
      initialised, these value will not change till the execution
      time of get_victim_by_default() & also f2fs_sbi_info structure
      parameters will not change.
      
      Hence making calls to get_max_cost() in while loop does not seems to
      be a good point. Instead we can call it once in begining and store
      the results in local variable, which later can serve our purpose
      for comparing the cost with max cost inside the while loop.
      Signed-off-by: NNamjae Jeon <namjae.jeon@samsung.com>
      Signed-off-by: NPankaj Kumar <pankaj.km@samsung.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
      b2b3460a
  14. 28 5月, 2013 2 次提交
  15. 30 4月, 2013 1 次提交
  16. 26 4月, 2013 2 次提交
    • J
      f2fs: give a chance to merge IOs by IO scheduler · c718379b
      Jaegeuk Kim 提交于
      Previously, background GC submits many 4KB read requests to load victim blocks
      and/or its (i)node blocks.
      
      ...
      f2fs_gc : f2fs_readpage: ino = 1, page_index = 0xb61, blkaddr = 0x3b964ed
      f2fs_gc : block_rq_complete: 8,16 R () 499854968 + 8 [0]
      f2fs_gc : f2fs_readpage: ino = 1, page_index = 0xb6f, blkaddr = 0x3b964ee
      f2fs_gc : block_rq_complete: 8,16 R () 499854976 + 8 [0]
      f2fs_gc : f2fs_readpage: ino = 1, page_index = 0xb79, blkaddr = 0x3b964ef
      f2fs_gc : block_rq_complete: 8,16 R () 499854984 + 8 [0]
      ...
      
      However, by the fact that many IOs are sequential, we can give a chance to merge
      the IOs by IO scheduler.
      In order to do that, let's use blk_plug.
      
      ...
      f2fs_gc : f2fs_iget: ino = 143
      f2fs_gc : f2fs_readpage: ino = 143, page_index = 0x1c6, blkaddr = 0x2e6ee
      f2fs_gc : f2fs_iget: ino = 143
      f2fs_gc : f2fs_readpage: ino = 143, page_index = 0x1c7, blkaddr = 0x2e6ef
      <idle> : block_rq_complete: 8,16 R () 1519616 + 8 [0]
      <idle> : block_rq_complete: 8,16 R () 1519848 + 8 [0]
      <idle> : block_rq_complete: 8,16 R () 1520432 + 96 [0]
      <idle> : block_rq_complete: 8,16 R () 1520536 + 104 [0]
      <idle> : block_rq_complete: 8,16 R () 1521008 + 112 [0]
      <idle> : block_rq_complete: 8,16 R () 1521440 + 152 [0]
      <idle> : block_rq_complete: 8,16 R () 1521688 + 144 [0]
      <idle> : block_rq_complete: 8,16 R () 1522128 + 192 [0]
      <idle> : block_rq_complete: 8,16 R () 1523256 + 328 [0]
      ...
      
      Note that this issue should be addressed in checkpoint, and some readahead
      flows too.
      Reviewed-by: NNamjae Jeon <namjae.jeon@samsung.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
      c718379b
    • J
      f2fs: avoid frequent background GC · 6cb968d9
      Jaegeuk Kim 提交于
      If there is no victim segments selected by background GC, let's wait
      a little bit longer time to collect dirty segments.
      By default, let's give 5 minutes.
      Reviewed-by: NNamjae Jeon <namjae.jeon@samsung.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
      6cb968d9
  17. 23 4月, 2013 1 次提交
  18. 09 4月, 2013 2 次提交
    • J
      f2fs: write checkpoint before starting FG_GC · d64f8047
      Jaegeuk Kim 提交于
      In order to be aware of prefree and free sections during FG_GC, let's start with
      write_checkpoint().
      Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
      d64f8047
    • J
      f2fs: introduce a new global lock scheme · 39936837
      Jaegeuk Kim 提交于
      In the previous version, f2fs uses global locks according to the usage types,
      such as directory operations, block allocation, block write, and so on.
      
      Reference the following lock types in f2fs.h.
      enum lock_type {
      	RENAME,		/* for renaming operations */
      	DENTRY_OPS,	/* for directory operations */
      	DATA_WRITE,	/* for data write */
      	DATA_NEW,	/* for data allocation */
      	DATA_TRUNC,	/* for data truncate */
      	NODE_NEW,	/* for node allocation */
      	NODE_TRUNC,	/* for node truncate */
      	NODE_WRITE,	/* for node write */
      	NR_LOCK_TYPE,
      };
      
      In that case, we lose the performance under the multi-threading environment,
      since every types of operations must be conducted one at a time.
      
      In order to address the problem, let's share the locks globally with a mutex
      array regardless of any types.
      So, let users grab a mutex and perform their jobs in parallel as much as
      possbile.
      
      For this, I propose a new global lock scheme as follows.
      
      0. Data structure
       - f2fs_sb_info -> mutex_lock[NR_GLOBAL_LOCKS]
       - f2fs_sb_info -> node_write
      
      1. mutex_lock_op(sbi)
       - try to get an avaiable lock from the array.
       - returns the index of the gottern lock variable.
      
      2. mutex_unlock_op(sbi, index of the lock)
       - unlock the given index of the lock.
      
      3. mutex_lock_all(sbi)
       - grab all the locks in the array before the checkpoint.
      
      4. mutex_unlock_all(sbi)
       - release all the locks in the array after checkpoint.
      
      5. block_operations()
       - call mutex_lock_all()
       - sync_dirty_dir_inodes()
       - grab node_write
       - sync_node_pages()
      
      Note that,
       the pairs of mutex_lock_op()/mutex_unlock_op() and
       mutex_lock_all()/mutex_unlock_all() should be used together.
      Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
      39936837
  19. 03 4月, 2013 3 次提交
  20. 20 3月, 2013 1 次提交
  21. 19 3月, 2013 1 次提交
  22. 12 2月, 2013 4 次提交
    • J
      f2fs: fix calculation of max. gc cost in the SSR case · b7250d2d
      Jaegeuk Kim 提交于
      In the SSR case, the max gc cost should be the number of pages in a segment.
      Otherwise, f2fs is able to fail getting dirty segments frequently for SSR.
      
      In get_victim_by_default() previously,
      
      while(1) {
         ...
         cost = get_gc_cost(); <- cost is between 0 ~ 512.
         ...
         if (cost == get_max_cost(sbi, &p)) <- max cost is UINT_MAX due to GC_CB type
      	continue;
      
         if (nsearched++ >= MAX_VICTIM_SEARCH)
      	break;
      }
      
      So, if there are a number of fully valid segments in series, f2fs cannot skip
      those segments by comparing the cost and max cost of each segment.
      
      Note that, the cost is the number of valid blocks at the time of the last
      checkpoint.
      Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
      b7250d2d
    • J
      f2fs: clarify and enhance the f2fs_gc flow · 43727527
      Jaegeuk Kim 提交于
      This patch makes clearer the ambiguous f2fs_gc flow as follows.
      
      1. Remove intermediate checkpoint condition during f2fs_gc
       (i.e., should_do_checkpoint() and GC_BLOCKED)
      
      2. Remove unnecessary return values of f2fs_gc because of #1.
       (i.e., GC_NODE, GC_OK, etc)
      
      3. Simplify write_checkpoint() because of #2.
      
      4. Clarify the main f2fs_gc flow.
       o monitor how many freed sections during one iteration of do_garbage_collect().
       o do GC more without checkpoints if we can't get enough free sections.
       o do checkpoint once we've got enough free sections through forground GCs.
      
      5. Adopt thread-logging (Slack-Space-Recycle) scheme more aggressively on data
        log types. See. get_ssr_segement()
      Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
      43727527
    • N
      f2fs: mark gc_thread as NULL when thread creation is failed · 25718423
      Namjae Jeon 提交于
      When gc thread creation is failed, mark gc_thread as NULL to avoid
      crash while trying to stop invalid thread in stop_gc_thread->kthread_stop.
      Instead make it return from:
      	if (!gc_th)
             return;
      Signed-off-by: NNamjae Jeon <namjae.jeon@samsung.com>
      Signed-off-by: NAmit Sahrawat <a.sahrawat@samsung.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
      25718423
    • N
      f2fs: name gc task as per the block device · ec7b1f2d
      Namjae Jeon 提交于
      Currently GC task is started for each f2fs formatted/mounted device.
      But, when we check the task list, using 'ps', there is no distinguishing
      factor between the tasks. So, name the task as per the block device just
      like the flusher threads.
      Also, remove the macro GC_THREAD_NAME and instead use the name: f2fs_gc
      to avoid name length truncation, as the command length is 16
      -> TASK_COMM_LEN 16 and example name like:
      f2fs_gc_task:8:16 -> this exceeds name length
      
      Before Patch for 2 F2FS formatted partitions:
      root  28061  0.0  0.0  0 0 ? S 10:31   0:00 [f2fs_gc_task]
      root  28087  0.0  0.0  0 0 ? S 10:32   0:00 [f2fs_gc_task]
      
      After Patch:
      root  16756  0.0  0.0  0  0 ?  S  14:57   0:00 [f2fs_gc-8:18]
      root  16765  0.0  0.0  0  0 ?  S  14:57   0:00 [f2fs_gc-8:19]
      Signed-off-by: NNamjae Jeon <namjae.jeon@samsung.com>
      Signed-off-by: NAmit Sahrawat <a.sahrawat@samsung.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
      ec7b1f2d