1. 23 12月, 2013 19 次提交
  2. 29 10月, 2013 1 次提交
  3. 25 10月, 2013 2 次提交
  4. 07 10月, 2013 1 次提交
    • G
      f2fs: use rw_sem instead of fs_lock(locks mutex) · e479556b
      Gu Zheng 提交于
      The fs_locks is used to block other ops(ex, recovery) when doing checkpoint.
      And each other operate routine(besides checkpoint) needs to acquire a fs_lock,
      there is a terrible problem here, if these are too many concurrency threads acquiring
      fs_lock, so that they will block each other and may lead to some performance problem,
      but this is not the phenomenon we want to see.
      Though there are some optimization patches introduced to enhance the usage of fs_lock,
      but the thorough solution is using a *rw_sem* to replace the fs_lock.
      Checkpoint routine takes write_sem, and other ops take read_sem, so that we can block
      other ops(ex, recovery) when doing checkpoint, and other ops will not disturb each other,
      this can avoid the problem described above completely.
      Because of the weakness of rw_sem, the above change may introduce a potential problem
      that the checkpoint thread might get starved if other threads are intensively locking
      the read semaphore for I/O.(Pointed out by Xu Jin)
      In order to avoid this, a wait_list is introduced, the appending read semaphore ops
      will be dropped into the wait_list if checkpoint thread is waiting for write semaphore,
      and will be waked up when checkpoint thread gives up write semaphore.
      Thanks to Kim's previous review and test, and will be very glad to see other guys'
      performance tests about this patch.
      
      V2:
        -fix the potential starvation problem.
        -use more suitable func name suggested by Xu Jin.
      Signed-off-by: NGu Zheng <guz.fnst@cn.fujitsu.com>
      [Jaegeuk Kim: adjust minor coding standard]
      Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
      e479556b
  5. 26 8月, 2013 1 次提交
    • J
      f2fs: reserve the xattr space dynamically · de93653f
      Jaegeuk Kim 提交于
      This patch enables the number of direct pointers inside on-disk inode block to
      be changed dynamically according to the size of inline xattr space.
      
      The number of direct pointers, ADDRS_PER_INODE, can be changed only if the file
      has inline xattr flag.
      
      The number of direct pointers that will be used by inline xattrs is defined as
      F2FS_INLINE_XATTR_ADDRS.
      Current patch assigns F2FS_INLINE_XATTR_ADDRS to 0 temporarily.
      Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
      de93653f
  6. 20 8月, 2013 1 次提交
    • J
      f2fs: fix wrong BUG_ON condition · d59ff4df
      Jaegeuk Kim 提交于
      This patch removes a false-alaramed BUG_ON.
      The previous BUG_ON condition didn't cover the following true scenario.
      
      In f2fs_add_link, 1) get_new_data_page gives an uptodate page successfully,
      and then, 2) init_inode_metadata returns -ENOSPC.
      At this moment, a new clean data page is remained in the page cache, but its
      block address still indicates NEW_ADDR.
      After then, even if sync is called, this clean data page cannot be written to
      the disk due to the clean state.
      
      So this means that get_lock_data_page should make a new empty page when its
      block address is NEW_ADDR and its page is not uptodated.
      Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
      d59ff4df
  7. 12 8月, 2013 1 次提交
  8. 06 8月, 2013 2 次提交
    • J
      f2fs: fix a deadlock in fsync · a569469e
      Jin Xu 提交于
      This patch fixes a deadlock bug that occurs quite often when there are
      concurrent write and fsync on a same file.
      
      Following is the simplified call trace when tasks get hung.
      
      fsync thread:
      - f2fs_sync_file
       ...
       - f2fs_write_data_pages
       ...
        - update_extent_cache
        ...
         - update_inode
          - wait_on_page_writeback
      
      bdi writeback thread
      - __writeback_single_inode
       - f2fs_write_data_pages
        - mutex_lock(sbi->writepages)
      
      The deadlock happens when the fsync thread waits on a inode page that has
      been added to the f2fs' cached bio sbi->bio[NODE], and unfortunately,
      no one else could be able to submit the cached bio to block layer for
      writeback. This is because the fsync thread already hold a sbi->fs_lock and
      the sbi->writepages lock, causing the bdi thread being blocked when attempt
      to write data pages for the same inode. At the same time, f2fs_gc thread
      does not notice the situation and could not help. Even the sync syscall
      gets blocked.
      
      To fix it, we could submit the cached bio first before waiting on a inode page
      that is being written back.
      Signed-off-by: NJin Xu <jinuxstyle@gmail.com>
      [Jaegeuk Kim: add more cases to use f2fs_wait_on_page_writeback]
      Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
      a569469e
    • N
      f2fs: remove redundant code from f2fs_write_begin · df273efc
      Namjae Jeon 提交于
      This code is being used for nobh_write_end() function.
      But since now f2fs_write_end function is added so
      there is no need for this code.
      Signed-off-by: NNamjae Jeon <namjae.jeon@samsung.com>
      Signed-off-by: NPankaj Kumar <pankaj.km@samsung.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
      df273efc
  9. 31 7月, 2013 1 次提交
  10. 30 7月, 2013 2 次提交
  11. 02 7月, 2013 1 次提交
    • J
      f2fs: fix to recover i_size from roll-forward · a1dd3c13
      Jaegeuk Kim 提交于
      If user requests many data writes and fsync together, the last updated i_size
      should be stored to the inode block consistently.
      
      But, previous write_end just marks the inode as dirty and doesn't update its
      metadata into its inode block.
      After that, fsync just writes the inode block with newly updated data index
      excluding inode metadata updates.
      
      So, this patch introduces write_end in which updates inode block too when the
      i_size is changed.
      Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
      a1dd3c13
  12. 14 6月, 2013 1 次提交
  13. 12 6月, 2013 1 次提交
    • J
      f2fs: sync dir->i_size with its block allocation · 699489bb
      Jaegeuk Kim 提交于
      If new dentry block is allocated and its i_size is updated, we should update
      its inode block together in order to sync i_size and its block allocation.
      Otherwise, we can loose additional dentry block due to the unconsistent i_size.
      
      Errorneous Scenario
      -------------------
      
      In the recovery routine,
       - recovery_dentry
       | - __f2fs_add_link
       | | - get_new_data_page
       | | | - i_size_write(new_i_size)
       | | | - mark_inode_dirty_sync(dir)
       | | - update_parent_metadata
       | | | - mark_inode_dirty(dir)
       |
       - write_checkpoint
         - sync_dirty_dir_inodes
           - filemap_flush(dentry_blocks)
             - f2fs_write_data_page
               - skip to write the last dentry block due to index < i_size
      
      In the above flow, new_i_size is not updated to its inode block so that the
      last dentry block will be lost accordingly.
      Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
      699489bb
  14. 28 5月, 2013 5 次提交
    • N
      f2fs: push some variables to debug part · 35b09d82
      Namjae Jeon 提交于
      Some, counters are needed only for the statistical information
      while debugging.
      So, those can be controlled using CONFIG_F2FS_STAT_FS,
      pushing the usage for few variables under this flag.
      Signed-off-by: NNamjae Jeon <namjae.jeon@samsung.com>
      Signed-off-by: NAmit Sahrawat <a.sahrawat@samsung.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
      35b09d82
    • J
      f2fs: avoid RECLAIM_FS-ON-W: deadlock · 6f85b352
      Jaegeuk Kim 提交于
      This patch tries to avoid the following deadlock condition of which the reclaim
      path can trigger f2fs_balance_fs again.
      
      =================================
      [ INFO: inconsistent lock state ]
      ---------------------------------
      inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
      kswapd0/41 [HC0[0]:SC0[0]:HE1:SE1] takes:
       (&sbi->gc_mutex){+.+.?.}, at: f2fs_balance_fs+0xe6/0x100 [f2fs]
      {RECLAIM_FS-ON-W} state was registered at:
        [<ffffffff810aa5a9>] mark_held_locks+0xb9/0x140
        [<ffffffff810aae85>] lockdep_trace_alloc+0x85/0xf0
        [<ffffffff8113ab2c>] __alloc_pages_nodemask+0x7c/0x9b0
        [<ffffffff81175aa8>] alloc_pages_current+0xb8/0x180
        [<ffffffff811319cf>] __page_cache_alloc+0xaf/0xd0
        [<ffffffff8113225c>] find_or_create_page+0x4c/0xb0
        [<ffffffffa021359e>] find_data_page+0x14e/0x210 [f2fs]
        [<ffffffffa021161b>] f2fs_gc+0x9eb/0xd90 [f2fs]
        [<ffffffffa0218fae>] f2fs_balance_fs+0xee/0x100 [f2fs]
        [<ffffffffa020848c>] f2fs_setattr+0x6c/0x200 [f2fs]
        [<ffffffff811ae51b>] notify_change+0x1db/0x3a0
        [<ffffffff8118fbd0>] do_truncate+0x60/0xa0
        [<ffffffff8118fd95>] vfs_truncate+0x185/0x1b0
        [<ffffffff8118fe1c>] do_sys_truncate+0x5c/0xa0
        [<ffffffff8118ffee>] SyS_truncate+0xe/0x10
        [<ffffffff816e2b42>] system_call_fastpath+0x16/0x1b
      Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
      6f85b352
    • J
      f2fs: update inode page after creation · 44a83ff6
      Jaegeuk Kim 提交于
      I found a bug when testing power-off-recovery as follows.
      
      [Bug Scenario]
      1. create a file
      2. fsync the file
      3. reboot w/o any sync
      4. try to recover the file
       - found its fsync mark
       - found its dentry mark
         : try to recover its dentry
          - get its file name
          - get its parent inode number
           : here we got zero value
      
      The reason why we get the wrong parent inode number is that we didn't
      synchronize the inode page with its newly created inode information perfectly.
      
      Especially, previous f2fs stores fi->i_pino and writes it to the cached
      node page in a wrong order, which incurs the zero-valued i_pino during the
      recovery.
      
      So, this patch modifies the creation flow to fix the synchronization order of
      inode page with its inode.
      Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
      44a83ff6
    • J
      f2fs: change get_new_data_page to pass a locked node page · 64aa7ed9
      Jaegeuk Kim 提交于
      This patch is for passing a locked node page to get_dnode_of_data.
      Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
      64aa7ed9
    • J
      f2fs: fix the inconsistent state of data pages · 650495de
      Jaegeuk Kim 提交于
      In get_lock_data_page, if there is a data race between get_dnode_of_data for
      node and grab_cache_page for data, f2fs is able to face with the following
      BUG_ON(dn.data_blkaddr == NEW_ADDR).
      
      kernel BUG at /home/zeus/f2fs_test/src/fs/f2fs/data.c:251!
       [<ffffffffa044966c>] get_lock_data_page+0x1ec/0x210 [f2fs]
      Call Trace:
       [<ffffffffa043b089>] f2fs_readdir+0x89/0x210 [f2fs]
       [<ffffffff811a0920>] ? fillonedir+0x100/0x100
       [<ffffffff811a0920>] ? fillonedir+0x100/0x100
       [<ffffffff811a07f8>] vfs_readdir+0xb8/0xe0
       [<ffffffff811a0b4f>] sys_getdents+0x8f/0x110
       [<ffffffff816d7999>] system_call_fastpath+0x16/0x1b
      
      This bug is able to be occurred when the block address of the data block is
      changed after f2fs_put_dnode().
      In order to avoid that, this patch fixes the lock order of node and data
      blocks in which the node block lock is covered by the data block lock.
      Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
      650495de
  15. 22 5月, 2013 1 次提交
    • L
      mm: change invalidatepage prototype to accept length · d47992f8
      Lukas Czerner 提交于
      Currently there is no way to truncate partial page where the end
      truncate point is not at the end of the page. This is because it was not
      needed and the functionality was enough for file system truncate
      operation to work properly. However more file systems now support punch
      hole feature and it can benefit from mm supporting truncating page just
      up to the certain point.
      
      Specifically, with this functionality truncate_inode_pages_range() can
      be changed so it supports truncating partial page at the end of the
      range (currently it will BUG_ON() if 'end' is not at the end of the
      page).
      
      This commit changes the invalidatepage() address space operation
      prototype to accept range to be invalidated and update all the instances
      for it.
      
      We also change the block_invalidatepage() in the same way and actually
      make a use of the new length argument implementing range invalidation.
      
      Actual file system implementations will follow except the file systems
      where the changes are really simple and should not change the behaviour
      in any way .Implementation for truncate_page_range() which will be able
      to accept page unaligned ranges will follow as well.
      Signed-off-by: NLukas Czerner <lczerner@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Hugh Dickins <hughd@google.com>
      d47992f8