1. 23 Feb, 2016 6 commits
    • f2fs: introduce f2fs_journal struct to wrap journal info · dfc08a12
      Committed by Chao Yu
      Introduce a new structure f2fs_journal to wrap journal info in struct
      f2fs_summary_block for readability.
      
      struct f2fs_journal {
      	union {
      		__le16 n_nats;
      		__le16 n_sits;
      	};
      	union {
      		struct nat_journal nat_j;
      		struct sit_journal sit_j;
      		struct f2fs_extra_info info;
      	};
      } __packed;
      
      struct f2fs_summary_block {
      	struct f2fs_summary entries[ENTRIES_IN_SUM];
      	struct f2fs_journal journal;
      	struct summary_footer footer;
      } __packed;
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
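The effect of the two anonymous unions can be illustrated with a small user-space sketch. The `demo_*` types and sizes below are hypothetical placeholders, since the commit message does not show the layouts of `nat_journal`, `sit_journal` or `f2fs_extra_info`, and `uint16_t` stands in for `__le16` (endianness is ignored here).

```c
#include <stdint.h>

/* Hypothetical stand-ins: the real journal layouts are not shown
 * in the commit message, so these sizes are arbitrary. */
struct demo_nat_journal { uint8_t raw[16]; };
struct demo_sit_journal { uint8_t raw[24]; };
struct demo_extra_info  { uint8_t raw[8]; };

struct demo_journal {
	union {
		uint16_t n_nats; /* entry count when used as a NAT journal */
		uint16_t n_sits; /* entry count when used as a SIT journal */
	};
	union {
		struct demo_nat_journal nat_j;
		struct demo_sit_journal sit_j;
		struct demo_extra_info info;
	};
} __attribute__((packed));

/* Both counters occupy the same two bytes, so reading either name
 * returns whatever was last stored through the other. */
static inline uint16_t demo_journal_count(const struct demo_journal *j)
{
	return j->n_nats;
}
```

In this sketch the wrapper is as large as its counter plus its largest payload member, which is why one field in the summary block can carry either kind of journal.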
    • f2fs: fix missing skip pages info · d31c7c3f
      Committed by Yunlei He
      Fix the missing skip pages info in the f2fs_writepages trace event.
      Signed-off-by: Yunlei He <heyunlei@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: introduce f2fs_submit_merged_bio_cond · 0c3a5797
      Committed by Chao Yu
      f2fs uses a single bio buffer per data type (META/NODE/DATA) to cache
      as many writes to contiguous block addresses as possible. After
      submission, these writes may still be cached in the bio buffer, so we
      have to flush them by calling f2fs_submit_merged_bio.
      
      Unfortunately, under high concurrency, the bio buffer could be flushed
      by someone else before we submit it, for the following reasons:
      a) there is no space left in the bio buffer.
      b) a request of a different type (SYNC, ASYNC) is added.
      c) a discontinuous block address is added.
      
      In this case, f2fs_submit_merged_bio is harmful, because it could
      break the subsequent merging of writes in the bio buffer and split one
      big bio into two smaller ones.
      
      This patch introduces f2fs_submit_merged_bio_cond, which submits the
      bio buffer conditionally. Before submitting, it checks whether:
       - a page in the DATA type bio buffer matches the specified page;
       - a page in the DATA type bio buffer belongs to the specified inode;
       - a page in the NODE type bio buffer belongs to the specified inode;
      If there is no eligible page in the bio buffer, we skip the submitting
      step, which gives more chances to merge consecutive block IOs in the
      bio cache.
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
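The conditional submit described in this commit can be sketched in user-space C as follows. This is only an illustration under stated assumptions: every `demo_*` name is hypothetical, the buffer is a fixed-size array rather than a real bio, and "submitting" just empties the array. It is not the kernel implementation.

```c
#include <stdbool.h>
#include <stddef.h>

enum demo_type { DEMO_DATA, DEMO_NODE, DEMO_META };

struct demo_page {
	unsigned long ino;   /* owning inode number */
	unsigned long index; /* page index within that inode */
};

struct demo_bio_buf {
	enum demo_type type;
	const struct demo_page *pages[8]; /* writes cached so far */
	size_t nr_pages;
	unsigned int submitted;           /* number of flushes issued */
};

/* True if the buffer caches the given page, or any page of inode ino.
 * Pass page == NULL to match on the inode only, ino == 0 to skip it. */
static bool demo_has_merged_page(const struct demo_bio_buf *buf,
				 unsigned long ino,
				 const struct demo_page *page)
{
	for (size_t i = 0; i < buf->nr_pages; i++) {
		const struct demo_page *p = buf->pages[i];

		if (page && p == page)
			return true;
		if (ino && p->ino == ino)
			return true;
	}
	return false;
}

/* Flush only when the buffer holds something the caller cares about. */
static bool demo_submit_merged_cond(struct demo_bio_buf *buf,
				    unsigned long ino,
				    const struct demo_page *page)
{
	if (!demo_has_merged_page(buf, ino, page))
		return false; /* keep the buffer so later writes can merge */
	buf->nr_pages = 0;
	buf->submitted++;
	return true;
}
```

A caller that only needs its own inode's writes on disk would pass that inode number and a NULL page. If the buffer holds nothing for it, the buffer is left untouched and can keep growing.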
    • f2fs: wait on page's writeback in writepages path · fa3d2bdf
      Committed by Jaegeuk Kim
      As in f2fs_write_cache_pages, let's wait on page writeback for node and
      meta pages too. Especially for node blocks, we should do this before
      marking their fsync and dentry flags.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: introduce lifetime write IO statistics · 8f1dbbbb
      Committed by Shuoran Liu
      This patch introduces lifetime write IO statistics exposed through the
      sysfs interface. The write IO amount is obtained from the block layer,
      accumulated in the file system, and stored in the hot node summary of
      the checkpoint.
      Signed-off-by: Shuoran Liu <liushuoran@huawei.com>
      Signed-off-by: Pengyang Hou <houpengyang@huawei.com>
      [Jaegeuk Kim: add sysfs documentation]
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: use wait_for_stable_page to avoid contention · fec1d657
      Committed by Jaegeuk Kim
      In write_begin, if storage supports stable_page, we don't need to wait for
      writeback to update its contents.
      This patch uses wait_for_stable_page instead of wait_on_page_writeback.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  2. 12 Jan, 2016 1 commit
  3. 01 Jan, 2016 1 commit
    • f2fs: write pending bios when cp_error is set · 8d4ea29b
      Committed by Jaegeuk Kim
      When testing ioc_shutdown, put_super can hang while waiting for pages
      under writeback, as follows.
      
      INFO: task umount:2723 blocked for more than 120 seconds.
            Tainted: G           O    4.4.0-rc3+ #8
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      umount          D ffff88000859f9d8     0  2723   2110 0x00000000
       ffff88000859f9d8 0000000000000000 0000000000000000 ffffffff81e11540
       ffff880078c225c0 ffff8800085a0000 ffff88007fc17440 7fffffffffffffff
       ffffffff818239f0 ffff88000859fb48 ffff88000859f9f0 ffffffff8182310c
      Call Trace:
       [<ffffffff818239f0>] ? bit_wait+0x50/0x50
       [<ffffffff8182310c>] schedule+0x3c/0x90
       [<ffffffff81827fb9>] schedule_timeout+0x2d9/0x430
       [<ffffffff810e0f8f>] ? mark_held_locks+0x6f/0xa0
       [<ffffffff8111614d>] ? ktime_get+0x7d/0x140
       [<ffffffff818239f0>] ? bit_wait+0x50/0x50
       [<ffffffff8106a655>] ? kvm_clock_get_cycles+0x25/0x30
       [<ffffffff8111617c>] ? ktime_get+0xac/0x140
       [<ffffffff818239f0>] ? bit_wait+0x50/0x50
       [<ffffffff81822564>] io_schedule_timeout+0xa4/0x110
       [<ffffffff81823a25>] bit_wait_io+0x35/0x50
       [<ffffffff818235bd>] __wait_on_bit+0x5d/0x90
       [<ffffffff811b9e8b>] wait_on_page_bit+0xcb/0xf0
       [<ffffffff810d5f90>] ? autoremove_wake_function+0x40/0x40
       [<ffffffff811cf84c>] truncate_inode_pages_range+0x4bc/0x840
       [<ffffffff811cfc3d>] truncate_inode_pages_final+0x4d/0x60
       [<ffffffffc023ced5>] f2fs_evict_inode+0x75/0x400 [f2fs]
       [<ffffffff812639bc>] evict+0xbc/0x190
       [<ffffffff81263d19>] iput+0x229/0x2c0
       [<ffffffffc0241885>] f2fs_put_super+0x105/0x1a0 [f2fs]
       [<ffffffff8124756a>] generic_shutdown_super+0x6a/0xf0
       [<ffffffff812478f7>] kill_block_super+0x27/0x70
       [<ffffffffc0241290>] kill_f2fs_super+0x20/0x30 [f2fs]
       [<ffffffff81247b03>] deactivate_locked_super+0x43/0x70
       [<ffffffff81247f4c>] deactivate_super+0x5c/0x60
       [<ffffffff81268d2f>] cleanup_mnt+0x3f/0x90
       [<ffffffff81268dc2>] __cleanup_mnt+0x12/0x20
       [<ffffffff810ac463>] task_work_run+0x73/0xa0
       [<ffffffff810032ac>] exit_to_usermode_loop+0xcc/0xd0
       [<ffffffff81003e7c>] syscall_return_slowpath+0xcc/0xe0
       [<ffffffff81829ea2>] int_ret_from_sys_call+0x25/0x9f
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  4. 31 Dec, 2015 2 commits
  5. 18 Dec, 2015 2 commits
  6. 17 Dec, 2015 2 commits
  7. 16 Dec, 2015 3 commits
  8. 13 Oct, 2015 2 commits
    • f2fs: support lower priority asynchronous readahead in ra_meta_pages · 26879fb1
      Committed by Chao Yu
      Currently, we use ra_meta_pages to read as many contiguous physical
      blocks as possible to improve the performance of subsequent reads.
      However, ra_meta_pages uses a synchronous readahead approach by
      submitting bios with READ. As READ has high priority, it is not suitable
      for preloading blocks, since it is not certain when these readahead
      pages will be used.
      
      This patch supports asynchronous readahead in ra_meta_pages by tagging
      bios with the READA flag in order to allow preloading.
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: don't tag REQ_META for temporary non-meta pages · 2b947003
      Committed by Chao Yu
      In the recovery or checkpoint flow, we temporarily grab pages in the meta
      inode's mapping to cache temporary data. The data in these pages is not
      f2fs metadata, but we still tag it with the REQ_META flag. However, a
      lower device such as eMMC may apply some optimization to data of this
      type. So, to avoid a wrong optimization, we'd better remove this flag
      for temporary non-meta pages.
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  9. 10 Oct, 2015 3 commits
    • f2fs: merge meta writes as many possible · 6066d8cd
      Committed by Jaegeuk Kim
      This patch tries to merge as many IOs as possible when the background
      flusher flushes dirty meta pages.
      
      [Before]
      
      ...
      f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 124320, size = 4096
      f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 124560, size = 32768
      f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 95720, size = 987136
      f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 123928, size = 4096
      f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 123944, size = 8192
      f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 123968, size = 45056
      f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 124064, size = 4096
      f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 97648, size = 1007616
      f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 123776, size = 8192
      f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 123800, size = 32768
      f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 124624, size = 4096
      f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 99616, size = 921600
      f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 123608, size = 4096
      f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 123624, size = 77824
      f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 123792, size = 4096
      f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 123864, size = 32768
      ...
      
      [After]
      
      ...
      f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 92168, size = 892928
      f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 93912, size = 753664
      f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 95384, size = 716800
      f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 96784, size = 712704
      f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 104160, size = 364544
      f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 104872, size = 356352
      f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 105568, size = 278528
      f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 106112, size = 319488
      f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 106736, size = 258048
      f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 107240, size = 270336
      f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 107768, size = 180224
      ...
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: introduce a periodic checkpoint flow · 60b99b48
      Committed by Jaegeuk Kim
      This patch introduces a periodic checkpoint feature.
      Note that this does not strictly enforce when checkpoints are triggered;
      it is only meant to help the user experience.
      The default interval is 60 seconds.
      Reviewed-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
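A best-effort, time-based trigger of this kind might be sketched as below. The struct, field and function names are hypothetical. Only the 60-second default comes from the commit message, and how the kernel actually decides when to run the check is not shown here.

```c
#include <stdbool.h>
#include <time.h>

#define DEMO_CP_INTERVAL 60 /* seconds; the default stated in the commit */

struct demo_sb_info {
	time_t last_cp;     /* when the last checkpoint finished */
	time_t cp_interval; /* seconds between periodic checkpoints */
};

/* Best-effort check: a checkpoint is due once the interval has elapsed.
 * Nothing forces the caller to run this at an exact moment. */
static bool demo_checkpoint_due(const struct demo_sb_info *sbi, time_t now)
{
	return now - sbi->last_cp >= sbi->cp_interval;
}
```

Because the check only runs when some background path happens to call it, the real gap between checkpoints can exceed the interval, which matches the commit's note that timing is not strictly enforced.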
    • f2fs: check end_io for metapages before making next checkpoint blocks · a7230d16
      Committed by Jaegeuk Kim
      This patch avoids producing new checkpoint blocks before the previous
      meta pages have been written completely.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  10. 25 Aug, 2015 1 commit
  11. 20 Aug, 2015 1 commit
  12. 15 Aug, 2015 1 commit
    • f2fs: handle error of f2fs_iget correctly · 8c14bfad
      Committed by Chao Yu
      In recover_orphan_inode, whenever f2fs_iget fails, we make the kernel
      panic. This is not reasonable, because f2fs_iget can fail for many
      reasons, including out of memory.
      
      So we change the error handling as below:
      a) when no entry is found for the orphan inode, bug_on to catch bugs;
      b) for other reasons, report the error to the caller.
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
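The two-way error handling in points a) and b) can be sketched in user-space C. The `demo_*` names are hypothetical: a flag stands in for the kernel's bug_on, and the lookup is passed in as a function pointer so the failure cases can be exercised. This illustrates the policy and is not the patch itself.

```c
#include <errno.h>
#include <stdbool.h>

static bool demo_bug_hit; /* stands in for a kernel bug_on firing */

/* Hypothetical lookup: returns 0 on success or a negative errno. */
typedef int (*demo_iget_fn)(unsigned long ino);

static int demo_recover_orphan_inode(demo_iget_fn iget, unsigned long ino)
{
	int err = iget(ino);

	if (err) {
		/* a) a missing entry for an orphan inode indicates a bug */
		if (err == -ENOENT)
			demo_bug_hit = true;
		/* b) every failure, e.g. -ENOMEM, goes back to the caller */
		return err;
	}
	return 0;
}

/* Example lookups used to exercise each path. */
static int demo_iget_ok(unsigned long ino)    { (void)ino; return 0; }
static int demo_iget_nomem(unsigned long ino) { (void)ino; return -ENOMEM; }
static int demo_iget_noent(unsigned long ino) { (void)ino; return -ENOENT; }
```

With this split, a transient failure such as running out of memory becomes an ordinary error return, while only the "entry not found" case is treated as a sign of corruption or a bug.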
  13. 05 Aug, 2015 5 commits
  14. 02 Jun, 2015 1 commit
  15. 29 May, 2015 5 commits
  16. 08 May, 2015 1 commit
  17. 17 Apr, 2015 1 commit
  18. 11 Apr, 2015 2 commits
    • f2fs: cleanup statement about max orphan inodes calc · e0150392
      Committed by Changman Lee
      Using a macro for each value makes the meaning easy to read.
      Signed-off-by: Changman Lee <cm224.lee@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: add cond_resched() to sync_dirty_dir_inodes() · 7ecebe5e
      Committed by Sebastian Andrzej Siewior
      In a preempt-off environment with a lot of FS activity (write/delete), I
      run into a CPU stall:
      
      | NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [kworker/u2:2:59]
      | Modules linked in:
      | CPU: 0 PID: 59 Comm: kworker/u2:2 Tainted: G        W      3.19.0-00010-g10c11c51ffed #153
      | Workqueue: writeback bdi_writeback_workfn (flush-179:0)
      | task: df230000 ti: df23e000 task.ti: df23e000
      | PC is at __submit_merged_bio+0x6c/0x110
      | LR is at f2fs_submit_merged_bio+0x74/0x80
      …
      | [<c00085c4>] (gic_handle_irq) from [<c0012e84>] (__irq_svc+0x44/0x5c)
      | Exception stack(0xdf23fb48 to 0xdf23fb90)
      | fb40:                   deef3484 ffff0001 ffff0001 00000027 deef3484 00000000
      | fb60: deef3440 00000000 de426000 deef34ec deefc440 df23fbb4 df23fbb8 df23fb90
      | fb80: c02191f0 c0218fa0 60000013 ffffffff
      | [<c0012e84>] (__irq_svc) from [<c0218fa0>] (__submit_merged_bio+0x6c/0x110)
      | [<c0218fa0>] (__submit_merged_bio) from [<c02191f0>] (f2fs_submit_merged_bio+0x74/0x80)
      | [<c02191f0>] (f2fs_submit_merged_bio) from [<c021624c>] (sync_dirty_dir_inodes+0x70/0x78)
      | [<c021624c>] (sync_dirty_dir_inodes) from [<c0216358>] (write_checkpoint+0x104/0xc10)
      | [<c0216358>] (write_checkpoint) from [<c021231c>] (f2fs_sync_fs+0x80/0xbc)
      | [<c021231c>] (f2fs_sync_fs) from [<c0221eb8>] (f2fs_balance_fs_bg+0x4c/0x68)
      | [<c0221eb8>] (f2fs_balance_fs_bg) from [<c021e9b8>] (f2fs_write_node_pages+0x40/0x110)
      | [<c021e9b8>] (f2fs_write_node_pages) from [<c00de620>] (do_writepages+0x34/0x48)
      | [<c00de620>] (do_writepages) from [<c0145714>] (__writeback_single_inode+0x50/0x228)
      | [<c0145714>] (__writeback_single_inode) from [<c0146184>] (writeback_sb_inodes+0x1a8/0x378)
      | [<c0146184>] (writeback_sb_inodes) from [<c01463e4>] (__writeback_inodes_wb+0x90/0xc8)
      | [<c01463e4>] (__writeback_inodes_wb) from [<c01465f8>] (wb_writeback+0x1dc/0x28c)
      | [<c01465f8>] (wb_writeback) from [<c0146dd8>] (bdi_writeback_workfn+0x2ac/0x460)
      | [<c0146dd8>] (bdi_writeback_workfn) from [<c003c3fc>] (process_one_work+0x11c/0x3a4)
      | [<c003c3fc>] (process_one_work) from [<c003c844>] (worker_thread+0x17c/0x490)
      | [<c003c844>] (worker_thread) from [<c0041398>] (kthread+0xec/0x100)
      | [<c0041398>] (kthread) from [<c000ed10>] (ret_from_fork+0x14/0x24)
      
      As it turns out, the code loops in sync_dirty_dir_inodes() and waits for
      others to make progress but since it never leaves the CPU there is no
      progress made. At the time of this stall, there is also a rm process
      blocked:
      | rm              R running      0  1989   1774 0x00000000
      | [<c047c55c>] (__schedule) from [<c00486dc>] (__cond_resched+0x30/0x4c)
      | [<c00486dc>] (__cond_resched) from [<c047c8c8>] (_cond_resched+0x4c/0x54)
      | [<c047c8c8>] (_cond_resched) from [<c00e1aec>] (truncate_inode_pages_range+0x1f0/0x5e8)
      | [<c00e1aec>] (truncate_inode_pages_range) from [<c00e1fd8>] (truncate_inode_pages+0x28/0x30)
      | [<c00e1fd8>] (truncate_inode_pages) from [<c00e2148>] (truncate_inode_pages_final+0x60/0x64)
      | [<c00e2148>] (truncate_inode_pages_final) from [<c020c92c>] (f2fs_evict_inode+0x4c/0x268)
      | [<c020c92c>] (f2fs_evict_inode) from [<c0137214>] (evict+0x94/0x140)
      | [<c0137214>] (evict) from [<c01377e8>] (iput+0xc8/0x134)
      | [<c01377e8>] (iput) from [<c01333e4>] (d_delete+0x154/0x180)
      | [<c01333e4>] (d_delete) from [<c0129870>] (vfs_rmdir+0x114/0x12c)
      | [<c0129870>] (vfs_rmdir) from [<c012d644>] (do_rmdir+0x158/0x168)
      | [<c012d644>] (do_rmdir) from [<c012dd90>] (SyS_unlinkat+0x30/0x3c)
      | [<c012dd90>] (SyS_unlinkat) from [<c000ec40>] (ret_fast_syscall+0x0/0x4c)
      
      As explained by Jaegeuk Kim:
      |This inode is the directory (c.f., do_rmdir) causing an infinite loop on
      |sync_dirty_dir_inodes.
      |The sync_dirty_dir_inodes tries to flush dirty dentry pages, but if the
      |inode is under eviction, it submits bios and does it again until eviction
      |is finished.
      
      This patch adds a cond_resched() (as suggested by Jaegeuk) after a BIO
      is submitted so other threads can make progress.
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      [Jaegeuk Kim: change fs/f2fs to f2fs in subject as naming convention]
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
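The shape of the fix, a retry loop that gives up the CPU after each submission, can be loosely sketched in user space. This is an analogy only: `sched_yield()` from POSIX stands in for the kernel's `cond_resched()`, the `demo_*` names are hypothetical, and eviction progress is simulated by a counter rather than by a second task.

```c
#include <sched.h>
#include <stdbool.h>

struct demo_dir_inode {
	int evict_rounds_left; /* simulated eviction work still pending */
};

/* Simulates the evicting task advancing one step each time it is checked.
 * In the real scenario that task only advances if it gets CPU time. */
static bool demo_under_eviction(struct demo_dir_inode *dir)
{
	if (dir->evict_rounds_left > 0) {
		dir->evict_rounds_left--;
		return true;
	}
	return false;
}

/* Retries until eviction is done; returns how many times it yielded. */
static int demo_sync_dirty_dir(struct demo_dir_inode *dir)
{
	int yields = 0;

	while (demo_under_eviction(dir)) {
		/* ...the merged bio would be submitted here... */
		sched_yield(); /* user-space analogue of cond_resched() */
		yields++;
	}
	return yields;
}
```

Without the yield, a loop like this on a non-preemptible single CPU would spin forever, since the task it is waiting for would never get to run.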