1. 28 May, 2013 29 commits
    • f2fs: optimize several routines in node.h · a06a2416
      Namjae Jeon committed
      Several functions share common code that can be factored out into
      common routines. This patch adds those routines and, to retain the
      same call paths without major changes, adds some macros to access
      them.
      Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
      Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
    • f2fs: remove unneeded initializations in f2fs_parent_dir · 4777f86b
      Namjae Jeon committed
      There is no need to initialize a few pointers in f2fs_parent_dir,
      since their initial values are never checked and they are assigned
      directly before use.
      Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
      Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
    • f2fs: push some variables to debug part · 35b09d82
      Namjae Jeon committed
      Some counters are needed only for statistical information
      while debugging.
      They can therefore be controlled by CONFIG_F2FS_STAT_FS, so this
      patch moves the use of a few variables under this flag.
      Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
      Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
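      The idea of compiling statistics-only counters out of non-debug builds
      can be sketched in userspace C as follows. This is only an illustration,
      not the f2fs code: DEMO_STAT_FS stands in for CONFIG_F2FS_STAT_FS, and
      the struct, field, and macro names are hypothetical.

```c
/* Hypothetical stand-in for the kernel's CONFIG_F2FS_STAT_FS option. */
#ifndef DEMO_STAT_FS
#define DEMO_STAT_FS 1
#endif

struct demo_sb_info {
    unsigned int valid_blocks;   /* needed in every build */
#if DEMO_STAT_FS
    unsigned int bg_gc_calls;    /* statistics only, debug builds */
#endif
};

/* The counter update compiles to nothing when statistics are disabled. */
#if DEMO_STAT_FS
#define demo_stat_inc_bg_gc(sbi) ((sbi)->bg_gc_calls++)
#else
#define demo_stat_inc_bg_gc(sbi) do { } while (0)
#endif

/* Hypothetical operation that does real work and also bumps a statistic. */
static void demo_background_gc(struct demo_sb_info *sbi)
{
    sbi->valid_blocks--;         /* real work, always compiled in */
    demo_stat_inc_bg_gc(sbi);    /* bookkeeping, only with DEMO_STAT_FS */
}
```

      Callers use the macro unconditionally, so the call sites stay free of
      #ifdef blocks while the field itself exists only in debug builds.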
    • f2fs: align data types between on-disk and in-memory block addresses · a9841c4d
      Jaegeuk Kim committed
      The on-disk block address is defined as __le32, but the in-memory block
      address, block_t, is defined as u64.
      
      Let's synchronize them to 32 bits.
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
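      Why a width mismatch between the two types matters can be shown with a
      small userspace sketch. It ignores endianness, and the typedef and
      function names are hypothetical; it only demonstrates that a 64-bit
      value does not survive a round trip through a 32-bit field.

```c
#include <stdint.h>

typedef uint32_t disk_blkaddr_t;   /* stands in for the on-disk __le32 field */
typedef uint64_t wide_block_t;     /* in-memory width before the change */
typedef uint32_t narrow_block_t;   /* in-memory width after the change */

/* Storing a 64-bit address into a 32-bit field drops the upper 32 bits. */
static disk_blkaddr_t store_wide(wide_block_t addr)
{
    return (disk_blkaddr_t)addr;
}

/* Returns 1 if the address survives a round trip through the on-disk field. */
static int round_trips(wide_block_t addr)
{
    return (wide_block_t)store_wide(addr) == addr;
}
```

      With both types at 32 bits, every in-memory address is representable on
      disk, so no such silent truncation can occur.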
    • f2fs: dereferencing an ERR_PTR · f28c06fa
      Dan Carpenter committed
      There is an error path where "dir" is an ERR_PTR.
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
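      For readers unfamiliar with the convention, the following userspace
      sketch emulates error-encoded pointers and shows the check that must
      come before any dereference. The demo_* helpers are hypothetical
      re-implementations modeled on the kernel's ERR_PTR/IS_ERR/PTR_ERR; the
      lookup function and struct are invented for illustration and are not
      the code this commit touches.

```c
#include <errno.h>
#include <stdint.h>

#define DEMO_MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static void *demo_err_ptr(long error)
{
    return (void *)(intptr_t)error;
}

/* Recover the errno value from an error-encoded pointer. */
static long demo_ptr_err(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

/* Error pointers occupy the top DEMO_MAX_ERRNO addresses. */
static int demo_is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-DEMO_MAX_ERRNO;
}

struct demo_dir { int nlink; };

static struct demo_dir demo_root = { .nlink = 2 };

/* Hypothetical lookup: fails with -ENOENT for any inode number other than 1. */
static struct demo_dir *demo_get_dir(int ino)
{
    if (ino != 1)
        return demo_err_ptr(-ENOENT);
    return &demo_root;
}

/* Check for the error encoding before touching any field of "dir". */
static int demo_dir_nlink(int ino)
{
    struct demo_dir *dir = demo_get_dir(ino);

    if (demo_is_err(dir))
        return (int)demo_ptr_err(dir);
    return dir->nlink;
}
```

      Without the demo_is_err() check, the failing path would read a field
      through an address near the top of the address space.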
    • f2fs: use ihold · 6f6fd833
      Jaegeuk Kim committed
      Use the following helper function committed by Al.
      
      commit 7de9c6ee
      Author: Al Viro <viro@zeniv.linux.org.uk>
      Date:   Sat Oct 23 11:11:40 2010 -0400
      
          new helper: ihold()
      
      ...
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
    • f2fs: should not make_bad_inode on f2fs_link failure · 93ff10d6
      Jaegeuk Kim committed
      If -ENOSPC is met during f2fs_link, we should not mark the inode as bad.
      The inode is still alive.
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
    • f2fs: fix to handle do_recover_data errors · 39cf72cf
      Jaegeuk Kim committed
      This patch adds error handling for check_index_in_prev_nodes and its
      caller, do_recover_data.
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
    • f2fs: reuse the locked dnode page and its inode · b292dcab
      Jaegeuk Kim committed
      This patch fixes the following deadlock bug during the recovery.
      
      INFO: task mount:1322 blocked for more than 120 seconds.
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      mount           D ffffffff81125870     0  1322   1266 0x00000000
       ffff8801207e39d8 0000000000000046 ffff88012ab1dee0 0000000000000046
       ffff8801207e3a08 ffff880115903f40 ffff8801207e3fd8 ffff8801207e3fd8
       ffff8801207e3fd8 ffff880115903f40 ffff8801207e39d8 ffff88012fc94520
      Call Trace:
      [<ffffffff81125870>] ? __lock_page+0x70/0x70
      [<ffffffff816a92d9>] schedule+0x29/0x70
      [<ffffffff816a93af>] io_schedule+0x8f/0xd0
      [<ffffffff8112587e>] sleep_on_page+0xe/0x20
      [<ffffffff816a649a>] __wait_on_bit_lock+0x5a/0xc0
      [<ffffffff81125867>] __lock_page+0x67/0x70
      [<ffffffff8106c7b0>] ? autoremove_wake_function+0x40/0x40
      [<ffffffff81126857>] find_lock_page+0x67/0x80
      [<ffffffff8112698f>] find_or_create_page+0x3f/0xb0
      [<ffffffffa03901a8>] ? sync_inode_page+0xa8/0xd0 [f2fs]
      [<ffffffffa038fdf7>] get_node_page+0x67/0x180 [f2fs]
      [<ffffffffa039818b>] recover_fsync_data+0xacb/0xff0 [f2fs]
      [<ffffffff816aaa1e>] ? _raw_spin_unlock+0x3e/0x40
      [<ffffffffa0389634>] f2fs_fill_super+0x7d4/0x850 [f2fs]
      [<ffffffff81184cf9>] mount_bdev+0x1c9/0x210
      [<ffffffffa0388e60>] ? validate_superblock+0x180/0x180 [f2fs]
      [<ffffffffa0387635>] f2fs_mount+0x15/0x20 [f2fs]
      [<ffffffff81185a13>] mount_fs+0x43/0x1b0
      [<ffffffff81145ba0>] ? __alloc_percpu+0x10/0x20
      [<ffffffff811a0796>] vfs_kern_mount+0x76/0x120
      [<ffffffff811a2cb7>] do_mount+0x237/0xa10
      [<ffffffff81140b9b>] ? strndup_user+0x5b/0x80
      [<ffffffff811a3520>] SyS_mount+0x90/0xe0
      [<ffffffff816b3502>] system_call_fastpath+0x16/0x1b
      
      The bug is triggered when check_index_in_prev_nodes tries to get the direct
      node page by calling get_node_page.
      At this point, if the direct node page is already locked by get_dnode_of_data,
      its caller, we deadlock.
      
      This patch adds an additional condition check so that locked direct node
      pages are reused before get_node_page is called.
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
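      The reuse check described above can be sketched in userspace C as
      follows. This is a simplified model under stated assumptions, not the
      f2fs implementation: the page struct, the non-blocking lock helper, and
      the lookup function are all hypothetical, and a failed lock attempt
      stands in for the point where the real code would block forever.

```c
#include <stddef.h>

struct demo_page {
    unsigned int nid;   /* node id this page caches */
    int locked;         /* 1 while some caller holds the page lock */
};

/* Non-blocking stand-in for lock_page(): returns 0 on success, -1 where a
 * real, non-recursive page lock would block because it is already held. */
static int demo_try_lock_page(struct demo_page *page)
{
    if (page->locked)
        return -1;
    page->locked = 1;
    return 0;
}

/* Returns the locked page for "nid". If the caller already holds the lock on
 * that node page, hand the same page back instead of locking it again. */
static struct demo_page *demo_get_node_page(struct demo_page *cache, size_t n,
                                            unsigned int nid,
                                            struct demo_page *held)
{
    if (held && held->nid == nid)
        return held;    /* reuse the already locked page */

    for (size_t i = 0; i < n; i++) {
        if (cache[i].nid == nid)
            return demo_try_lock_page(&cache[i]) == 0 ? &cache[i] : NULL;
    }
    return NULL;
}
```

      Passing the held page down lets the callee detect that it is about to
      lock the same page twice, which is the condition behind the hung mount
      in the trace.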
    • f2fs: fix wrong condition check · b638f0c4
      Jaegeuk Kim committed
      Even though an orphan inode has a zero link_count, f2fs_gc can still select
      the inode for foreground GC.
      
      - f2fs_gc
       - do_garbage_collect
         - gc_data_segment
           : f2fs_iget fails
           : get_valid_blocks() != 0, so GC retries
      --> here we get an infinite loop.
      
      This patch resolves this issue.
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
    • f2fs: add f2fs_readonly() · 77888c1e
      Jaegeuk Kim committed
      Introduce a simple macro function for readability.
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
    • f2fs: avoid RECLAIM_FS-ON-W: deadlock · 6f85b352
      Jaegeuk Kim committed
      This patch tries to avoid the following deadlock condition, in which the
      reclaim path can trigger f2fs_balance_fs again.
      
      =================================
      [ INFO: inconsistent lock state ]
      ---------------------------------
      inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
      kswapd0/41 [HC0[0]:SC0[0]:HE1:SE1] takes:
       (&sbi->gc_mutex){+.+.?.}, at: f2fs_balance_fs+0xe6/0x100 [f2fs]
      {RECLAIM_FS-ON-W} state was registered at:
        [<ffffffff810aa5a9>] mark_held_locks+0xb9/0x140
        [<ffffffff810aae85>] lockdep_trace_alloc+0x85/0xf0
        [<ffffffff8113ab2c>] __alloc_pages_nodemask+0x7c/0x9b0
        [<ffffffff81175aa8>] alloc_pages_current+0xb8/0x180
        [<ffffffff811319cf>] __page_cache_alloc+0xaf/0xd0
        [<ffffffff8113225c>] find_or_create_page+0x4c/0xb0
        [<ffffffffa021359e>] find_data_page+0x14e/0x210 [f2fs]
        [<ffffffffa021161b>] f2fs_gc+0x9eb/0xd90 [f2fs]
        [<ffffffffa0218fae>] f2fs_balance_fs+0xee/0x100 [f2fs]
        [<ffffffffa020848c>] f2fs_setattr+0x6c/0x200 [f2fs]
        [<ffffffff811ae51b>] notify_change+0x1db/0x3a0
        [<ffffffff8118fbd0>] do_truncate+0x60/0xa0
        [<ffffffff8118fd95>] vfs_truncate+0x185/0x1b0
        [<ffffffff8118fe1c>] do_sys_truncate+0x5c/0xa0
        [<ffffffff8118ffee>] SyS_truncate+0xe/0x10
        [<ffffffff816e2b42>] system_call_fastpath+0x16/0x1b
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
    • f2fs: don't do checkpoint if error is occurred · 2c2c149f
      Jaegeuk Kim committed
      If an error occurs during dentry recovery, we should not write a checkpoint.
      Otherwise, erroneous dentry blocks could overwrite existing blocks that
      contain the remaining recovery information.
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
    • f2fs: fix to unlock page before exit · 45856aff
      Jaegeuk Kim committed
      If we get an error after lock_page, we should unlock the page before exiting.
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
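      A common way to guarantee the unlock on every exit path is a single
      cleanup label. The userspace sketch below illustrates that pattern; the
      struct and the demo_* helpers are hypothetical stand-ins, not the
      function this commit changes.

```c
#include <errno.h>

struct io_page {
    int locked;     /* 1 while the page lock is held */
    int uptodate;   /* 0 simulates a failed read */
};

static void demo_lock_page(struct io_page *p)   { p->locked = 1; }
static void demo_unlock_page(struct io_page *p) { p->locked = 0; }

/* Every path after demo_lock_page() leaves through "out", so the page is
 * unlocked on failure as well as on success. */
static int demo_read_page(struct io_page *page)
{
    int err = 0;

    demo_lock_page(page);
    if (!page->uptodate) {
        err = -EIO;
        goto out;   /* an early "return err" here would leak the lock */
    }
    /* ... work on the locked page would go here ... */
out:
    demo_unlock_page(page);
    return err;
}
```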
    • f2fs: remove unnecessary kmap/kunmap operations · 9a55ed65
      Jaegeuk Kim committed
      The allocated page used by the recovery is not in HIGHMEM, so we don't
      need to use kmap/kunmap.
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
    • f2fs: reorganize f2fs_vm_page_mkwrite · 9851e6e1
      Namjae Jeon committed
      A few things can be changed in the default mkwrite function:
      1) Call file_update_time at the start, before acquiring any lock.
      2) Change the condition page_offset(page) >= i_size_read(inode) to
       page_offset(page) > i_size_read(inode).
      3) Move wait_on_page_writeback.
      Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
      Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
    • f2fs: use list_for_each_entry rather than list_for_each_entry_safe · 145b04e5
      majianpeng committed
      We can do this because a global mutex, f2fs_stat_mutex, now protects its
      list operations.
      Signed-off-by: Jianpeng Ma <majianpeng@gmail.com>
      [Jaegeuk Kim: add description]
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
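      As general background on the two iteration styles, the userspace sketch
      below contrasts a read-only walk with a walk that frees entries as it
      goes. It uses a hand-rolled singly linked list with hypothetical names
      rather than the kernel's list.h, and does not model the mutex.

```c
#include <stdlib.h>

struct stat_node {
    int value;
    struct stat_node *next;
};

/* Push a new node at the head; returns the new head (or the old head if
 * allocation fails). */
static struct stat_node *push(struct stat_node *head, int value)
{
    struct stat_node *n = malloc(sizeof(*n));

    if (!n)
        return head;
    n->value = value;
    n->next = head;
    return n;
}

/* Read-only walk: the body never unlinks "pos", so reading pos->next after
 * the body is fine. This corresponds to the plain iteration style. */
static int sum_list(const struct stat_node *head)
{
    int sum = 0;

    for (const struct stat_node *pos = head; pos; pos = pos->next)
        sum += pos->value;
    return sum;
}

/* Destructive walk: "pos" is freed in the body, so the next pointer must be
 * saved first. This corresponds to the _safe iteration style. */
static int free_list(struct stat_node *head)
{
    int freed = 0;

    for (struct stat_node *pos = head, *next; pos; pos = next) {
        next = pos->next;
        free(pos);
        freed++;
    }
    return freed;
}
```

      When the loop body only reads entries and nothing else can modify the
      list during the walk, the plain form is sufficient.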
    • f2fs: remove unecessary variable and code · 81fb5e87
      Haicheng Li committed
      Code cleanup with no behavior change.
      Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
    • f2fs, lockdep: annotate mutex_lock_all() · bfe35965
      Peter Zijlstra committed
      Majianpeng reported a lockdep splat for f2fs. It turns out mutex_lock_all()
      acquires an array of locks (in global/local lock style).
      
      Any such operation is always serialized using cp_mutex, therefore there is no
      fs_lock[] lock-order issue; tell lockdep about this using the
      mutex_lock_nest_lock() primitive.
      Reported-by: majianpeng <majianpeng@gmail.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
    • f2fs: add debug msgs in the recovery routine · f356fe0c
      Jaegeuk Kim committed
      This patch adds some trivial debugging messages in the recovery process.
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
    • f2fs: update inode page after creation · 44a83ff6
      Jaegeuk Kim committed
      I found a bug when testing power-off-recovery as follows.
      
      [Bug Scenario]
      1. create a file
      2. fsync the file
      3. reboot w/o any sync
      4. try to recover the file
       - found its fsync mark
       - found its dentry mark
         : try to recover its dentry
          - get its file name
          - get its parent inode number
           : here we got zero value
      
      We get the wrong parent inode number because the inode page was not fully
      synchronized with the newly created inode's information.
      
      In particular, f2fs previously stored fi->i_pino and wrote it to the cached
      node page in the wrong order, which results in a zero-valued i_pino during
      recovery.
      
      So, this patch modifies the creation flow to fix the order in which the
      inode page is synchronized with its inode.
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
    • f2fs: change get_new_data_page to pass a locked node page · 64aa7ed9
      Jaegeuk Kim committed
      This patch is for passing a locked node page to get_dnode_of_data.
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
    • f2fs: skip get_node_page if locked node page is passed · 1646cfac
      Jaegeuk Kim committed
      If get_dnode_of_data gets a locked node page, let's skip redundant
      get_node_page calls.
      This prepares for a further enhancement.
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
    • f2fs: remove unnecessary por_doing check · 0a364af1
      Jaegeuk Kim committed
      This por_doing check is entirely unrelated to the recovery process.
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
    • f2fs: fix BUG_ON during f2fs_evict_inode(dir) · 74d0b917
      Jaegeuk Kim committed
      During the dentry recovery routine, recover_inode() triggers __f2fs_add_link
      with its directory inode.
      
      A bug is hit in the following scenario.
       1. dir = f2fs_iget(pino)
       2. __f2fs_add_link(dir, name)
       3. iput(dir)
        -> f2fs_evict_inode() hits BUG_ON(atomic_read(fi->dirty_dents))
      
      Kernel BUG at ffffffffa01c0676 [verbose debug info unavailable]
      [<ffffffffa01c0676>] f2fs_evict_inode+0x276/0x300 [f2fs]
      Call Trace:
       [<ffffffff8118ea00>] evict+0xb0/0x1b0
       [<ffffffff8118f1c5>] iput+0x105/0x190
       [<ffffffffa01d2dac>] recover_fsync_data+0x3bc/0x1070 [f2fs]
       [<ffffffff81692e8a>] ? io_schedule+0xaa/0xd0
       [<ffffffff81690acb>] ? __wait_on_bit_lock+0x7b/0xc0
       [<ffffffff8111a0e7>] ? __lock_page+0x67/0x70
       [<ffffffff81165e21>] ? kmem_cache_alloc+0x31/0x140
       [<ffffffff8118a502>] ? __d_instantiate+0x92/0xf0
       [<ffffffff812a949b>] ? security_d_instantiate+0x1b/0x30
       [<ffffffff8118a5b4>] ? d_instantiate+0x54/0x70
      
      This means that we should flush all the dentry pages between iget and iput().
      During the recovery routine, however, that is not allowed for consistency
      reasons, so we have to wait until the whole recovery process finishes.
      Then write_checkpoint flushes all the dirty dentry blocks, and we can
      safely put the stale dir inodes from the dirty_dir_inode_list.
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
    • f2fs: fix por_doing variable coverage · 8c26d7d5
      Jaegeuk Kim committed
      The purpose of sbi->por_doing is to reduce data writes during the
      recovery.
      find_fsync_dnodes() produces some dirty dentry pages, so it should be
      covered by sbi->por_doing too.
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
    • f2fs: remove redundant assignment · addbe45b
      Jaegeuk Kim committed
      We don't need to assign a value redundantly.
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
    • f2fs: fix the inconsistent state of data pages · 650495de
      Jaegeuk Kim committed
      In get_lock_data_page, if there is a data race between get_dnode_of_data for
      the node and grab_cache_page for the data, f2fs can hit the following
      BUG_ON(dn.data_blkaddr == NEW_ADDR).
      
      kernel BUG at /home/zeus/f2fs_test/src/fs/f2fs/data.c:251!
       [<ffffffffa044966c>] get_lock_data_page+0x1ec/0x210 [f2fs]
      Call Trace:
       [<ffffffffa043b089>] f2fs_readdir+0x89/0x210 [f2fs]
       [<ffffffff811a0920>] ? fillonedir+0x100/0x100
       [<ffffffff811a0920>] ? fillonedir+0x100/0x100
       [<ffffffff811a07f8>] vfs_readdir+0xb8/0xe0
       [<ffffffff811a0b4f>] sys_getdents+0x8f/0x110
       [<ffffffff816d7999>] system_call_fastpath+0x16/0x1b
      
      This bug can occur when the block address of the data block is changed
      after f2fs_put_dnode().
      To avoid that, this patch fixes the lock order of node and data blocks so
      that the node block lock is covered by the data block lock.
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
    • f2fs: fix inconsistency of block count during recovery · 65e5cd0a
      Jaegeuk Kim committed
      Currently f2fs recovers the dentry of fsynced files.
      When power-off recovery is conducted, the newly recovered inode should
      increase the node block count as well as the inode block count.
      
      This patch resolves the inconsistency that arises in the following scenario:
      
      1. create a file
      2. write data
      3. fsync
      4. reboot without sync
      5. mount and recover the file
      6. node block count is 1 and inode block count is 2
       : falls into an inconsistent state
      7. unlink the file
       : triggers the following BUG_ON
      
      ------------[ cut here ]------------
      kernel BUG at /home/zeus/f2fs_test/src/fs/f2fs/f2fs.h:716!
      Call Trace:
       [<ffffffffa0344100>] ? get_node_page+0x50/0x1a0 [f2fs]
       [<ffffffffa0344bfc>] remove_inode_page+0x8c/0x100 [f2fs]
       [<ffffffffa03380f0>] ? f2fs_evict_inode+0x180/0x2d0 [f2fs]
       [<ffffffffa033812e>] f2fs_evict_inode+0x1be/0x2d0 [f2fs]
       [<ffffffff811c7a67>] evict+0xa7/0x1a0
       [<ffffffff811c82b5>] iput+0x105/0x190
       [<ffffffff811c2b30>] d_kill+0xe0/0x120
       [<ffffffff811c2c57>] dput+0xe7/0x1e0
       [<ffffffff811acc3d>] __fput+0x19d/0x2d0
       [<ffffffff811acd7e>] ____fput+0xe/0x10
       [<ffffffff81070645>] task_work_run+0xb5/0xe0
       [<ffffffff81002941>] do_notify_resume+0x71/0xb0
       [<ffffffff8175f14a>] int_signal+0x12/0x17
      Reported-and-Tested-by: Chris Fries <C.Fries@motorola.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
  2. 25 May, 2013 11 commits