1. 21 7月, 2016 1 次提交
  2. 08 6月, 2016 1 次提交
  3. 21 5月, 2016 1 次提交
  4. 19 5月, 2016 2 次提交
  5. 17 5月, 2016 2 次提交
  6. 08 5月, 2016 4 次提交
    • C
      f2fs: fix inode cache leak · f61cce5b
      Chao Yu 提交于
      When testing f2fs with inline_dentry option, generic/342 reports:
      VFS: Busy inodes after unmount of dm-0. Self-destruct in 5 seconds.  Have a nice day...
      
      After rmmod f2fs module, kenrel shows following dmesg:
       =============================================================================
       BUG f2fs_inode_cache (Tainted: G           O   ): Objects remaining in f2fs_inode_cache on __kmem_cache_shutdown()
       -----------------------------------------------------------------------------
      
       Disabling lock debugging due to kernel taint
       INFO: Slab 0xf51ca0e0 objects=22 used=1 fp=0xd1e6fc60 flags=0x40004080
       CPU: 3 PID: 7455 Comm: rmmod Tainted: G    B      O    4.6.0-rc4+ #16
       Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
        00000086 00000086 d062fe18 c13a83a0 f51ca0e0 d062fe38 d062fea4 c11c7276
        c1981040 f51ca0e0 00000016 00000001 d1e6fc60 40004080 656a624f 20737463
        616d6572 6e696e69 6e692067 66326620 6e695f73 5f65646f 68636163 6e6f2065
       Call Trace:
        [<c13a83a0>] dump_stack+0x5f/0x8f
        [<c11c7276>] slab_err+0x76/0x80
        [<c11cbfc0>] ? __kmem_cache_shutdown+0x100/0x2f0
        [<c11cbfc0>] ? __kmem_cache_shutdown+0x100/0x2f0
        [<c11cbfe5>] __kmem_cache_shutdown+0x125/0x2f0
        [<c1198a38>] kmem_cache_destroy+0x158/0x1f0
        [<c176b43d>] ? mutex_unlock+0xd/0x10
        [<f8f15aa3>] exit_f2fs_fs+0x4b/0x5a8 [f2fs]
        [<c10f596c>] SyS_delete_module+0x16c/0x1d0
        [<c1001b10>] ? do_fast_syscall_32+0x30/0x1c0
        [<c13c59bf>] ? __this_cpu_preempt_check+0xf/0x20
        [<c10afa7d>] ? trace_hardirqs_on_caller+0xdd/0x210
        [<c10ad50b>] ? trace_hardirqs_off+0xb/0x10
        [<c1001b81>] do_fast_syscall_32+0xa1/0x1c0
        [<c176d888>] sysenter_past_esp+0x45/0x74
       INFO: Object 0xd1e6d9e0 @offset=6624
       kmem_cache_destroy f2fs_inode_cache: Slab cache still has objects
       CPU: 3 PID: 7455 Comm: rmmod Tainted: G    B      O    4.6.0-rc4+ #16
       Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
        00000286 00000286 d062fef4 c13a83a0 f174b000 d062ff14 d062ff28 c1198ac7
        c197fe18 f3c5b980 d062ff20 000d04f2 d062ff0c d062ff0c d062ff14 d062ff14
        f8f20dc0 fffffff5 d062e000 d062ff30 f8f15aa3 d062ff7c c10f596c 73663266
       Call Trace:
        [<c13a83a0>] dump_stack+0x5f/0x8f
        [<c1198ac7>] kmem_cache_destroy+0x1e7/0x1f0
        [<f8f15aa3>] exit_f2fs_fs+0x4b/0x5a8 [f2fs]
        [<c10f596c>] SyS_delete_module+0x16c/0x1d0
        [<c1001b10>] ? do_fast_syscall_32+0x30/0x1c0
        [<c13c59bf>] ? __this_cpu_preempt_check+0xf/0x20
        [<c10afa7d>] ? trace_hardirqs_on_caller+0xdd/0x210
        [<c10ad50b>] ? trace_hardirqs_off+0xb/0x10
        [<c1001b81>] do_fast_syscall_32+0xa1/0x1c0
        [<c176d888>] sysenter_past_esp+0x45/0x74
      
      The reason is: in recovery flow, we use delayed iput mechanism for directory
      which has recovered dentry block. It means the reference of inode will be
      held until last dirty dentry page being writebacked.
      
      But when we mount f2fs with inline_dentry option, during recovery, dirent
      may only be recovered into dir inode page rather than dentry page, so there
      are no chance for us to release inode reference in ->writepage when
      writebacking last dentry page.
      
      We can call paired iget/iput explicityly for inline_dentry case, but for
      non-inline_dentry case, iput will call writeback_single_inode to write all
      data pages synchronously, but during recovery, ->writepages of f2fs skips
      writing all pages, result in losing dirent.
      
      This patch fixes this issue by obsoleting old mechanism, and introduce a
      new dir_list to hold all directory inodes which has recovered datas until
      finishing recovery.
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      f61cce5b
    • J
      f2fs: fix leak of orphan inode objects · 74ef9241
      Jaegeuk Kim 提交于
      When unmounting filesystem, we should release all the ino entries.
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      74ef9241
    • J
      f2fs: inject ENOSPC failures · cb78942b
      Jaegeuk Kim 提交于
      This patch injects ENOSPC failures.
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      cb78942b
    • J
      f2fs: use f2fs_grab_cache_page instead of grab_cache_page · 300e129c
      Jaegeuk Kim 提交于
      This patch converts grab_cache_page to f2fs_grab_cache_page.
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      300e129c
  7. 27 4月, 2016 1 次提交
  8. 15 4月, 2016 1 次提交
  9. 18 3月, 2016 1 次提交
  10. 27 2月, 2016 1 次提交
  11. 26 2月, 2016 1 次提交
    • C
      f2fs: fix incorrect upper bound when iterating inode mapping tree · 80dd9c0e
      Chao Yu 提交于
      1. Inode mapping tree can index page in range of [0, ULONG_MAX], however,
      in some places, f2fs only search or iterate page in ragne of [0, LONG_MAX],
      result in miss hitting in page cache.
      
      2. filemap_fdatawait_range accepts range parameters in unit of bytes, so
      the max range it covers should be [0, LLONG_MAX], if we use [0, LONG_MAX]
      as range for waiting on writeback, big number of pages will not be covered.
      
      This patch corrects above two issues.
      Signed-off-by: NChao Yu <chao2.yu@samsung.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      80dd9c0e
  12. 24 2月, 2016 1 次提交
  13. 23 2月, 2016 10 次提交
    • C
      f2fs: trace old block address for CoWed page · 7a9d7548
      Chao Yu 提交于
      This patch enables to trace old block address of CoWed page for better
      debugging.
      
      f2fs_submit_page_mbio: dev = (1,0), ino = 1, page_index = 0x1d4f0, oldaddr = 0xfe8ab, newaddr = 0xfee90 rw = WRITE_SYNC, type = NODE
      f2fs_submit_page_mbio: dev = (1,0), ino = 1, page_index = 0x1d4f8, oldaddr = 0xfe8b0, newaddr = 0xfee91 rw = WRITE_SYNC, type = NODE
      f2fs_submit_page_mbio: dev = (1,0), ino = 1, page_index = 0x1d4fa, oldaddr = 0xfe8ae, newaddr = 0xfee92 rw = WRITE_SYNC, type = NODE
      
      f2fs_submit_page_mbio: dev = (1,0), ino = 134824, page_index = 0x96, oldaddr = 0xf049b, newaddr = 0x2bbe rw = WRITE, type = DATA
      f2fs_submit_page_mbio: dev = (1,0), ino = 134824, page_index = 0x97, oldaddr = 0xf049c, newaddr = 0x2bbf rw = WRITE, type = DATA
      f2fs_submit_page_mbio: dev = (1,0), ino = 134824, page_index = 0x98, oldaddr = 0xf049d, newaddr = 0x2bc0 rw = WRITE, type = DATA
      
      f2fs_submit_page_mbio: dev = (1,0), ino = 135260, page_index = 0x47, oldaddr = 0xffffffff, newaddr = 0xf2631 rw = WRITE, type = DATA
      f2fs_submit_page_mbio: dev = (1,0), ino = 135260, page_index = 0x48, oldaddr = 0xffffffff, newaddr = 0xf2632 rw = WRITE, type = DATA
      f2fs_submit_page_mbio: dev = (1,0), ino = 135260, page_index = 0x49, oldaddr = 0xffffffff, newaddr = 0xf2633 rw = WRITE, type = DATA
      Signed-off-by: NChao Yu <chao2.yu@samsung.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      7a9d7548
    • S
      f2fs: move sanity checking of cp into get_valid_checkpoint · 984ec63c
      Shawn Lin 提交于
      >From the function name of get_valid_checkpoint, it seems to return
      the valid cp or NULL for caller to check. If no valid one is found,
      f2fs_fill_super will print the err log. But if get_valid_checkpoint
      get one valid(the return value indicate that it's valid, however actually
      it is invalid after sanity checking), then print another similar err
      log. That seems strange. Let's keep sanity checking inside the procedure
      of geting valid cp. Another improvement we gained from this move is
      that even the large volume is supported, we check the cp in advanced
      to skip the following procedure if failing the sanity checking.
      Signed-off-by: NShawn Lin <shawn.lin@rock-chips.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      984ec63c
    • C
      f2fs: split journal cache from curseg cache · b7ad7512
      Chao Yu 提交于
      In curseg cache, f2fs caches two different parts:
       - datas of current summay block, i.e. summary entries, footer info.
       - journal info, i.e. sparse nat/sit entries or io stat info.
      
      With this approach, 1) it may cause higher lock contention when we access
      or update both of the parts of cache since we use the same mutex lock
      curseg_mutex to protect the cache. 2) current summary block with last
      journal info will be writebacked into device as a normal summary block
      when flushing, however, we treat journal info as valid one only in current
      summary, so most normal summary blocks contain junk journal data, it wastes
      remaining space of summary block.
      
      So, in order to fix above issues, we split curseg cache into two parts:
      a) current summary block, protected by original mutex lock curseg_mutex
      b) journal cache, protected by newly introduced r/w semaphore journal_rwsem
      
      When loading curseg cache during ->mount, we store summary info and
      journal info into different caches; When doing checkpoint, we combine
      datas of two cache into current summary block for persisting.
      Signed-off-by: NChao Yu <chao2.yu@samsung.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      b7ad7512
    • C
      f2fs: enhance IO path with block plug · e9f5b8b8
      Chao Yu 提交于
      Try to use block plug in more place as below to let process cache bios
      as much as possbile, in order to reduce lock overhead of queue in IO
      scheduler.
      1) sync_meta_pages
      2) ra_meta_pages
      3) f2fs_balance_fs_bg
      Signed-off-by: NChao Yu <chao2.yu@samsung.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      e9f5b8b8
    • C
      f2fs: introduce f2fs_journal struct to wrap journal info · dfc08a12
      Chao Yu 提交于
      Introduce a new structure f2fs_journal to wrap journal info in struct
      f2fs_summary_block for readability.
      
      struct f2fs_journal {
      	union {
      		__le16 n_nats;
      		__le16 n_sits;
      	};
      	union {
      		struct nat_journal nat_j;
      		struct sit_journal sit_j;
      		struct f2fs_extra_info info;
      	};
      } __packed;
      
      struct f2fs_summary_block {
      	struct f2fs_summary entries[ENTRIES_IN_SUM];
      	struct f2fs_journal journal;
      	struct summary_footer footer;
      } __packed;
      Signed-off-by: NChao Yu <chao2.yu@samsung.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      dfc08a12
    • Y
      f2fs: fix missing skip pages info · d31c7c3f
      Yunlei He 提交于
      fix missing skip pages info in f2fs_writepages trace event.
      Signed-off-by: NYunlei He <heyunlei@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      d31c7c3f
    • C
      f2fs: introduce f2fs_submit_merged_bio_cond · 0c3a5797
      Chao Yu 提交于
      f2fs use single bio buffer per type data (META/NODE/DATA) for caching
      writes locating in continuous block address as many as possible, after
      submitting, these writes may be still cached in bio buffer, so we have
      to flush cached writes in bio buffer by calling f2fs_submit_merged_bio.
      
      Unfortunately, in the scenario of high concurrency, bio buffer could be
      flushed by someone else before we submit it as below reasons:
      a) there is no space in bio buffer.
      b) add a request of different type (SYNC, ASYNC).
      c) add a discontinuous block address.
      
      For this condition, f2fs_submit_merged_bio will be devastating, because
      it could break the following merging of writes in bio buffer, split one
      big bio into two smaller one.
      
      This patch introduces f2fs_submit_merged_bio_cond which can do a
      conditional submitting with bio buffer, before submitting it will judge
      whether:
       - page in DATA type bio buffer is matching with specified page;
       - page in DATA type bio buffer is belong to specified inode;
       - page in NODE type bio buffer is belong to specified inode;
      If there is no eligible page in bio buffer, we will skip submitting step,
      result in gaining more chance to merge consecutive block IOs in bio cache.
      Signed-off-by: NChao Yu <chao2.yu@samsung.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      0c3a5797
    • J
      f2fs: wait on page's writeback in writepages path · fa3d2bdf
      Jaegeuk Kim 提交于
      Likewise f2fs_write_cache_pages, let's do for node and meta pages too.
      Especially, for node blocks, we should do this before marking its fsync
      and dentry flags.
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      fa3d2bdf
    • S
      f2fs: introduce lifetime write IO statistics · 8f1dbbbb
      Shuoran Liu 提交于
      This patch introduces lifetime IO write statistics exposed to the sysfs interface.
      The write IO amount is obtained from block layer, accumulated in the file system and
      stored in the hot node summary of checkpoint.
      Signed-off-by: NShuoran Liu <liushuoran@huawei.com>
      Signed-off-by: NPengyang Hou <houpengyang@huawei.com>
      [Jaegeuk Kim: add sysfs documentation]
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      8f1dbbbb
    • J
      f2fs: use wait_for_stable_page to avoid contention · fec1d657
      Jaegeuk Kim 提交于
      In write_begin, if storage supports stable_page, we don't need to wait for
      writeback to update its contents.
      This patch introduces to use wait_for_stable_page instead of
      wait_on_page_writeback.
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      fec1d657
  14. 12 1月, 2016 1 次提交
  15. 01 1月, 2016 1 次提交
    • J
      f2fs: write pending bios when cp_error is set · 8d4ea29b
      Jaegeuk Kim 提交于
      When testing ioc_shutdown, put_super is able to be hanged by waiting for
      writebacking pages as follows.
      
      INFO: task umount:2723 blocked for more than 120 seconds.
            Tainted: G           O    4.4.0-rc3+ #8
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      umount          D ffff88000859f9d8     0  2723   2110 0x00000000
       ffff88000859f9d8 0000000000000000 0000000000000000 ffffffff81e11540
       ffff880078c225c0 ffff8800085a0000 ffff88007fc17440 7fffffffffffffff
       ffffffff818239f0 ffff88000859fb48 ffff88000859f9f0 ffffffff8182310c
      Call Trace:
       [<ffffffff818239f0>] ? bit_wait+0x50/0x50
       [<ffffffff8182310c>] schedule+0x3c/0x90
       [<ffffffff81827fb9>] schedule_timeout+0x2d9/0x430
       [<ffffffff810e0f8f>] ? mark_held_locks+0x6f/0xa0
       [<ffffffff8111614d>] ? ktime_get+0x7d/0x140
       [<ffffffff818239f0>] ? bit_wait+0x50/0x50
       [<ffffffff8106a655>] ? kvm_clock_get_cycles+0x25/0x30
       [<ffffffff8111617c>] ? ktime_get+0xac/0x140
       [<ffffffff818239f0>] ? bit_wait+0x50/0x50
       [<ffffffff81822564>] io_schedule_timeout+0xa4/0x110
       [<ffffffff81823a25>] bit_wait_io+0x35/0x50
       [<ffffffff818235bd>] __wait_on_bit+0x5d/0x90
       [<ffffffff811b9e8b>] wait_on_page_bit+0xcb/0xf0
       [<ffffffff810d5f90>] ? autoremove_wake_function+0x40/0x40
       [<ffffffff811cf84c>] truncate_inode_pages_range+0x4bc/0x840
       [<ffffffff811cfc3d>] truncate_inode_pages_final+0x4d/0x60
       [<ffffffffc023ced5>] f2fs_evict_inode+0x75/0x400 [f2fs]
       [<ffffffff812639bc>] evict+0xbc/0x190
       [<ffffffff81263d19>] iput+0x229/0x2c0
       [<ffffffffc0241885>] f2fs_put_super+0x105/0x1a0 [f2fs]
       [<ffffffff8124756a>] generic_shutdown_super+0x6a/0xf0
       [<ffffffff812478f7>] kill_block_super+0x27/0x70
       [<ffffffffc0241290>] kill_f2fs_super+0x20/0x30 [f2fs]
       [<ffffffff81247b03>] deactivate_locked_super+0x43/0x70
       [<ffffffff81247f4c>] deactivate_super+0x5c/0x60
       [<ffffffff81268d2f>] cleanup_mnt+0x3f/0x90
       [<ffffffff81268dc2>] __cleanup_mnt+0x12/0x20
       [<ffffffff810ac463>] task_work_run+0x73/0xa0
       [<ffffffff810032ac>] exit_to_usermode_loop+0xcc/0xd0
       [<ffffffff81003e7c>] syscall_return_slowpath+0xcc/0xe0
       [<ffffffff81829ea2>] int_ret_from_sys_call+0x25/0x9f
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      8d4ea29b
  16. 31 12月, 2015 2 次提交
  17. 18 12月, 2015 2 次提交
  18. 17 12月, 2015 2 次提交
  19. 16 12月, 2015 3 次提交
  20. 13 10月, 2015 2 次提交
    • C
      f2fs: support lower priority asynchronous readahead in ra_meta_pages · 26879fb1
      Chao Yu 提交于
      Now, we use ra_meta_pages to reads continuous physical blocks as much as
      possible to improve performance of following reads. However, ra_meta_pages
      uses a synchronous readahead approach by submitting bio with READ, as READ
      is with high priority, it can not be used in the case of preloading blocks,
      and it's not sure when these RAed pages will be used.
      
      This patch supports asynchronous readahead in ra_meta_pages by tagging bio
      with READA flag in order to allow preloading.
      Signed-off-by: NChao Yu <chao2.yu@samsung.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      26879fb1
    • C
      f2fs: don't tag REQ_META for temporary non-meta pages · 2b947003
      Chao Yu 提交于
      In recovery or checkpoint flow, we grab pages temperarily in meta inode's
      mapping for caching temperary data, actually, datas in these pages were
      not meta data of f2fs, but still we tag them with REQ_META flag. However,
      lower device like eMMC may do some optimization for data of such type.
      So in order to avoid wrong optimization, we'd better remove such flag
      for temperary non-meta pages.
      Signed-off-by: NChao Yu <chao2.yu@samsung.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      2b947003