1. 05 Nov, 2014 1 commit
    • J
      f2fs: revisit inline_data to avoid data races and potential bugs · b3d208f9
      Jaegeuk Kim committed
      This patch simplifies inline_data usage with the following rules.
      1. inline_data is set during file creation.
      2. If a write is requested to a range beyond the inline_data area,
       f2fs converts that inode permanently.
      3. A non-inline_data inode is never converted to inline_data.
      4. The inline_data flag should be changed under the inode page lock.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
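The four rules above can be read as a one-way state transition. The following is a minimal user-space sketch, not the f2fs code: `sketch_inode`, `sketch_create`, `sketch_write`, and `SKETCH_MAX_INLINE` are hypothetical names, the capacity value is an arbitrary assumption, and a pthread mutex stands in for the inode page lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <pthread.h>

/* Hypothetical inline capacity; the real limit depends on the inode layout. */
#define SKETCH_MAX_INLINE 3400

struct sketch_inode {
    pthread_mutex_t page_lock; /* stands in for the inode page lock */
    bool inline_data;          /* the inline_data flag */
};

/* Rule 1: the flag is set when the file is created. */
static void sketch_create(struct sketch_inode *ino)
{
    pthread_mutex_init(&ino->page_lock, NULL);
    ino->inline_data = true;
}

/* Returns true if the inode is still inline after the write request. */
static bool sketch_write(struct sketch_inode *ino, size_t pos, size_t len)
{
    bool still_inline;

    pthread_mutex_lock(&ino->page_lock);   /* rule 4: change under lock */
    if (ino->inline_data && pos + len > SKETCH_MAX_INLINE)
        ino->inline_data = false;          /* rule 2: permanent conversion */
    /* Rule 3: no code path ever sets the flag back to true. */
    still_inline = ino->inline_data;
    pthread_mutex_unlock(&ino->page_lock);
    return still_inline;
}
```

Because the flag only ever moves from set to cleared, a reader that sees it cleared under the lock never has to re-check for a conversion back.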
  2. 04 Nov, 2014 4 commits
  3. 01 Oct, 2014 1 commit
  4. 04 Sep, 2014 1 commit
  5. 02 Sep, 2014 1 commit
    • C
      f2fs: reposition unlock_new_inode to prevent accessing invalid inode · b73e5282
      Chao Yu committed
      Because of a race condition on the inode cache, the following scenario can occur:
      [Thread a]				[Thread b]
      					->f2fs_mkdir
      					  ->f2fs_add_link
      					    ->__f2fs_add_link
      					      ->init_inode_metadata failed here
      ->gc_thread_func
        ->f2fs_gc
          ->do_garbage_collect
            ->gc_data_segment
              ->f2fs_iget
                ->iget_locked
                  ->wait_on_inode
      					  ->unlock_new_inode
              ->move_data_page
      					  ->make_bad_inode
      					  ->iput
      
      When we fail in create/symlink/mkdir/mknod/tmpfile, the newly allocated inode
      should be marked bad so that other threads cannot access it. But in the above
      scenario, f2fs is allowed to access the invalid inode before it is marked
      bad.
      This patch fixes the potential problem, which was found by code review.
      
      Changelog from v1:
       o Add a condition check in gc_data_segment(), as suggested by Changman Lee.
       o Use iget_failed to simplify the code.
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  6. 22 Aug, 2014 1 commit
  7. 20 Aug, 2014 1 commit
  8. 25 Jul, 2014 1 commit
  9. 10 Jul, 2014 4 commits
  10. 09 Jul, 2014 2 commits
    • J
      f2fs: do checkpoint for the renamed inode · b2c08299
      Jaegeuk Kim committed
      If an inode is renamed, it should be registered as file_lost_pino so that
      f2fs_sync_file conducts a checkpoint.
      Otherwise, the inode cannot be recovered, because it has no dent_mark in the
      following scenario.
      
      Note that this scenario is from xfstests/322.
      
      1. create "a"
      2. fsync "a"
      3. rename "a" to "b"
      4. fsync "b"
      5. Sudden power-cut
      
      After recovery is done, "b" should be seen.
      However, the result shows "a", since the recovery procedure does not enter
      recover_dentry because there is no dent_mark.
      
      The reason is as follows.
      - The nid of "a" is checkpointed during #2, f2fs_sync_file.
      - The inode page for "b" produced by #3 is written without dent_mark by
      sync_node_pages.
      
      So, this patch fixes the bug by assigning file_lost_pino to the inode of "a".
      If the pino is lost, f2fs_sync_file conducts a checkpoint, and then recovers
      the latest pino and its dentry information for further recovery.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
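The fsync decision described in this commit can be sketched as follows. This is an illustrative model only: `sketch_file`, `sketch_rename`, `sketch_sync_file`, and the `sketch_sync_action` values are hypothetical names, and clearing the flag right after the checkpoint is a simplification of the "recovers the latest pino" step.

```c
#include <assert.h>
#include <stdbool.h>

enum sketch_sync_action {
    SKETCH_WRITE_NODE_ONLY, /* fast path: write the inode's node pages */
    SKETCH_CHECKPOINT       /* slow path: full checkpoint */
};

struct sketch_file {
    bool lost_pino; /* models the file_lost_pino mark */
};

/* A rename makes the stored parent/dentry information stale. */
static void sketch_rename(struct sketch_file *f)
{
    f->lost_pino = true;
}

/* fsync: a lost pino forces a checkpoint so the new name survives a crash. */
static enum sketch_sync_action sketch_sync_file(struct sketch_file *f)
{
    if (f->lost_pino) {
        /* Simplification: assume the checkpoint restores the pino. */
        f->lost_pino = false;
        return SKETCH_CHECKPOINT;
    }
    return SKETCH_WRITE_NODE_ONLY;
}
```

In terms of the scenario above, the fsync of "a" in step 2 takes the fast path, while the fsync of "b" in step 4 takes the checkpoint path because step 3 set the mark.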
    • C
      f2fs: release new entry page correctly in error path of f2fs_rename · dd4d961f
      Chao Yu committed
      This patch corrects the release code for new_page to avoid a BUG_ON in the
      error path of f2fs_rename.
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  11. 08 May, 2014 1 commit
  12. 07 Apr, 2014 1 commit
  13. 20 Mar, 2014 1 commit
  14. 26 Jan, 2014 1 commit
  15. 22 Jan, 2014 1 commit
  16. 23 Dec, 2013 1 commit
  17. 08 Oct, 2013 1 commit
    • J
      f2fs: fix writing incorrect orphan blocks · ccaaca25
      Jaegeuk Kim committed
      Previously, there was an erroneous scenario like the one below.
      thread 1:                       thread 2:
       f2fs_unlink
        - acquire_orphan_inode
          : sbi->n_orphans++           write_checkpoint
                                       - block_operations
                                        : f2fs_lock_all
                                       - do_checkpoint
                                        : write orphan blocks with sbi->n_orphans
                                       - unblock_operations
        - f2fs_lock_op
        - release_orphan_inode
        - f2fs_unlock_op
      
      During the checkpoint by thread 2, f2fs stores a wrong orphan block because
      sbi->n_orphans is wrong.
      To avoid this, we should simply cover acquire_orphan_inode with
      f2fs_lock_op as well.
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
  18. 07 Oct, 2013 1 commit
    • G
      f2fs: use rw_sem instead of fs_lock(locks mutex) · e479556b
      Gu Zheng committed
      The fs_lock is used to block other operations (e.g., recovery) while a checkpoint is in progress,
      and every other operation (besides checkpoint) needs to acquire an fs_lock.
      This is a serious problem: when too many threads acquire fs_lock concurrently,
      they block each other, which may lead to performance problems.
      
      Although some optimization patches have improved the usage of fs_lock,
      the thorough solution is to replace fs_lock with an *rw_sem*.
      The checkpoint routine takes the write semaphore and other operations take the read semaphore, so
      checkpoint still blocks other operations (e.g., recovery), while those operations no longer disturb each other.
      This avoids the problem described above completely.
      Because of a weakness of rw_sem, this change may introduce a potential problem:
      the checkpoint thread might starve if other threads are intensively taking
      the read semaphore for I/O. (Pointed out by Xu Jin.)
      To avoid this, a wait_list is introduced: incoming read semaphore requests
      are placed on the wait_list if the checkpoint thread is waiting for the write semaphore,
      and are woken up when the checkpoint thread releases the write semaphore.
      Thanks to Kim for the previous review and testing. Performance tests of this patch
      from others would be very welcome.
      
      V2:
        - Fix the potential starvation problem.
        - Use a more suitable function name, as suggested by Xu Jin.
      Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
      [Jaegeuk Kim: adjust minor coding standard]
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
  19. 27 Aug, 2013 2 commits
  20. 30 Jul, 2013 2 commits
    • J
      f2fs: fix handling orphan inodes · cbd56e7d
      Jaegeuk Kim committed
      This patch fixes mishandling of the sbi->n_orphans variable.
      
      If users issue many f2fs_unlink() calls, check_orphan_space() can be contended.
      In such a case, sbi->n_orphans can be read incorrectly, so that f2fs_unlink()
      falls into a wrong state that results in the failure of
      add_orphan_inode().
      
      So, let's increment sbi->n_orphans up front, as a reservation, prior to the
      actual orphan inode handling. After that, let's release sbi->n_orphans by
      calling release_orphan_inode or remove_orphan_inode.
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
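The reserve-then-release pattern from this commit can be sketched as a counter that is checked and incremented in one critical section. This is a minimal user-space sketch under assumptions: `sketch_sbi`, `max_orphans`, and the `sketch_*` helpers are hypothetical, and a pthread mutex stands in for the lock that protects the orphan count.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

struct sketch_sbi {
    pthread_mutex_t orphan_mutex;
    unsigned int n_orphans;   /* reserved plus in-use orphan slots */
    unsigned int max_orphans; /* hypothetical capacity */
};

static void sketch_init(struct sketch_sbi *sbi, unsigned int max_orphans)
{
    pthread_mutex_init(&sbi->orphan_mutex, NULL);
    sbi->n_orphans = 0;
    sbi->max_orphans = max_orphans;
}

/* Check and reserve in one step, so no caller acts on a stale count. */
static int sketch_acquire_orphan_inode(struct sketch_sbi *sbi)
{
    int err = 0;

    pthread_mutex_lock(&sbi->orphan_mutex);
    if (sbi->n_orphans >= sbi->max_orphans)
        err = -ENOSPC;
    else
        sbi->n_orphans++;
    pthread_mutex_unlock(&sbi->orphan_mutex);
    return err;
}

/* Give the reservation back when the unlink path no longer needs it. */
static void sketch_release_orphan_inode(struct sketch_sbi *sbi)
{
    pthread_mutex_lock(&sbi->orphan_mutex);
    sbi->n_orphans--;
    pthread_mutex_unlock(&sbi->orphan_mutex);
}
```

A caller that gets 0 from the acquire step owns one slot until it either registers the orphan inode or calls the release helper.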
    • J
      f2fs: update file name in the inode block during f2fs_rename · 1cd14caf
      Jaegeuk Kim committed
      The error is reproducible by:
      0. mkfs.f2fs /dev/sdb1 & mount
      1. touch test1
      2. touch test2
      3. mv test1 test2
      4. umount
      5. dump.f2fs -i 4 /dev/sdb1
      
      After this, when we retrieve the inode->i_name of test2 by dump.f2fs, we get
      test1 instead of test2.
      This is because f2fs didn't update the file name during the f2fs_rename.
      
      So, this patch fixes that.
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
  21. 14 Jun, 2013 1 commit
  22. 11 Jun, 2013 1 commit
    • J
      f2fs: fix i_blocks translation on various types of files · 2d4d9fb5
      Jaegeuk Kim committed
      Basically, an inode tracks the number of allocated blocks in inode->i_blocks,
      which is expressed in units of sectors, not file system blocks.
      However, f2fs has kept i_blocks in units of file system blocks, and f2fs_getattr
      translates it to the number of sectors when fstat is called.
      
      Previously, only f2fs_file_inode_operations had this getattr handler, so this
      patch adds it to all the types of inode_operations.
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
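The unit translation described above amounts to one multiplication. The sketch below assumes a 4096-byte file system block and the 512-byte sector unit that stat reports; `sketch_blocks_to_sectors` and both constants are names introduced for illustration, not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_BLOCK_SIZE  4096u /* assumed file system block size */
#define SKETCH_SECTOR_SIZE 512u  /* unit in which stat reports blocks */

/* Convert a count of file system blocks into a count of sectors. */
static uint64_t sketch_blocks_to_sectors(uint64_t fs_blocks)
{
    return fs_blocks * (SKETCH_BLOCK_SIZE / SKETCH_SECTOR_SIZE);
}
```

Under these assumptions a file occupying 3 file system blocks is reported as 24 sectors, so a getattr handler that skips the conversion would under-report usage by a factor of 8.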
  23. 28 May, 2013 4 commits
  24. 08 May, 2013 1 commit
    • J
      f2fs: avoid deadlock during evict after f2fs_gc · 531ad7d5
      Jaegeuk Kim committed
      o Deadlock case #1
      
      Thread 1:
      - writeback_sb_inodes
       - do_writepages
        - f2fs_write_data_pages
         - write_cache_pages
          - f2fs_write_data_page
           - f2fs_balance_fs
            - wait mutex_lock(gc_mutex)
      
      Thread 2:
      - f2fs_balance_fs
       - mutex_lock(gc_mutex)
       - f2fs_gc
        - f2fs_iget
         - wait iget_locked(inode->i_lock)
      
      Thread 3:
      - do_unlinkat
       - iput
        - lock(inode->i_lock)
         - evict
          - inode_wait_for_writeback
      
      o Deadlock case #2
      
      Thread 1:
      - __writeback_single_inode
       : set I_SYNC
        - do_writepages
         - f2fs_write_data_page
          - f2fs_balance_fs
           - f2fs_gc
            - iput
             - evict
              - inode_wait_for_writeback(I_SYNC)
      
      To avoid this, even when iput is called with a zero reference
      count, we need to stop the eviction procedure if the inode is under writeback.
      So this patch hooks up f2fs_drop_inode, which checks the I_SYNC flag.
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
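The drop decision described above can be modeled as a small predicate. This is a hedged user-space sketch rather than the kernel function: `sketch_vfs_inode`, `SKETCH_I_SYNC`, and `sketch_drop_inode` are hypothetical, and the "no links left" fallback is a simplified stand-in for the generic rule.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_I_SYNC 0x1u /* hypothetical bit: writeback in progress */

struct sketch_vfs_inode {
    unsigned int state; /* state flags such as SKETCH_I_SYNC */
    unsigned int nlink; /* remaining hard links */
};

/* Returns true when the inode may be evicted immediately. */
static bool sketch_drop_inode(const struct sketch_vfs_inode *ino)
{
    /* Under writeback: keep the inode cached instead of evicting it,
     * so eviction never waits on the writeback that triggered it. */
    if (ino->state & SKETCH_I_SYNC)
        return false;
    /* Simplified generic rule: drop once no links remain. */
    return ino->nlink == 0;
}
```

With this check, an unlinked inode that is still being written back stays cached after the final iput and is evicted later, once writeback has finished.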
  25. 29 Apr, 2013 1 commit
  26. 23 Apr, 2013 2 commits
  27. 09 Apr, 2013 1 commit
    • J
      f2fs: introduce a new global lock scheme · 39936837
      Jaegeuk Kim committed
      In the previous version, f2fs used global locks according to usage type,
      such as directory operations, block allocation, block write, and so on.
      
      See the following lock types in f2fs.h.
      enum lock_type {
      	RENAME,		/* for renaming operations */
      	DENTRY_OPS,	/* for directory operations */
      	DATA_WRITE,	/* for data write */
      	DATA_NEW,	/* for data allocation */
      	DATA_TRUNC,	/* for data truncate */
      	NODE_NEW,	/* for node allocation */
      	NODE_TRUNC,	/* for node truncate */
      	NODE_WRITE,	/* for node write */
      	NR_LOCK_TYPE,
      };
      
      In that case, we lose performance in a multi-threaded environment,
      since each type of operation must be conducted one at a time.
      
      To address the problem, let's share the locks globally with a mutex
      array regardless of operation type.
      Users can then grab a mutex and perform their jobs in parallel as much as
      possible.
      
      For this, I propose a new global lock scheme as follows.
      
      0. Data structure
       - f2fs_sb_info -> mutex_lock[NR_GLOBAL_LOCKS]
       - f2fs_sb_info -> node_write
      
      1. mutex_lock_op(sbi)
       - try to get an available lock from the array.
       - returns the index of the acquired lock variable.
      
      2. mutex_unlock_op(sbi, index of the lock)
       - unlock the given index of the lock.
      
      3. mutex_lock_all(sbi)
       - grab all the locks in the array before the checkpoint.
      
      4. mutex_unlock_all(sbi)
       - release all the locks in the array after checkpoint.
      
      5. block_operations()
       - call mutex_lock_all()
       - sync_dirty_dir_inodes()
       - grab node_write
       - sync_node_pages()
      
      Note that
       the pairs of mutex_lock_op()/mutex_unlock_op() and
       mutex_lock_all()/mutex_unlock_all() should be used together.
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
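The lock scheme outlined in steps 0 to 4 can be sketched in user space with an array of pthread mutexes. This is an illustration under assumptions, not the patch itself: the array size, the `sketch_` prefixes, and the fallback of blocking on slot 0 when every slot is busy are all choices made for this sketch.

```c
#include <assert.h>
#include <pthread.h>

#define SKETCH_NR_GLOBAL_LOCKS 8 /* hypothetical array size */

struct sketch_sbi {
    pthread_mutex_t fs_lock[SKETCH_NR_GLOBAL_LOCKS];
};

static void sketch_init(struct sketch_sbi *sbi)
{
    for (int i = 0; i < SKETCH_NR_GLOBAL_LOCKS; i++)
        pthread_mutex_init(&sbi->fs_lock[i], NULL);
}

/* Grab any free lock and return its index for the later unlock. */
static int sketch_mutex_lock_op(struct sketch_sbi *sbi)
{
    for (int i = 0; i < SKETCH_NR_GLOBAL_LOCKS; i++)
        if (pthread_mutex_trylock(&sbi->fs_lock[i]) == 0)
            return i;
    /* Assumption of this sketch: if every slot is busy, wait on slot 0. */
    pthread_mutex_lock(&sbi->fs_lock[0]);
    return 0;
}

static void sketch_mutex_unlock_op(struct sketch_sbi *sbi, int idx)
{
    pthread_mutex_unlock(&sbi->fs_lock[idx]);
}

/* Checkpoint side: take every lock, always in ascending index order. */
static void sketch_mutex_lock_all(struct sketch_sbi *sbi)
{
    for (int i = 0; i < SKETCH_NR_GLOBAL_LOCKS; i++)
        pthread_mutex_lock(&sbi->fs_lock[i]);
}

static void sketch_mutex_unlock_all(struct sketch_sbi *sbi)
{
    for (int i = 0; i < SKETCH_NR_GLOBAL_LOCKS; i++)
        pthread_mutex_unlock(&sbi->fs_lock[i]);
}
```

Up to `SKETCH_NR_GLOBAL_LOCKS` operations can run in parallel, each holding a different slot, while the checkpoint path waits until it owns all of them.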