1. 17 Apr, 2015 1 commit
  2. 11 Apr, 2015 6 commits
    • C
      f2fs: cleanup statement about max orphan inodes calc · e0150392
      Changman Lee committed
      Through each macro, we can read the meaning easily.
      Signed-off-by: Changman Lee <cm224.lee@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • S
      f2fs: add cond_resched() to sync_dirty_dir_inodes() · 7ecebe5e
      Sebastian Andrzej Siewior committed
      In a preempt-off environment with a lot of FS activity (write/delete), I run
      into a CPU stall:
      
      | NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [kworker/u2:2:59]
      | Modules linked in:
      | CPU: 0 PID: 59 Comm: kworker/u2:2 Tainted: G        W      3.19.0-00010-g10c11c51ffed #153
      | Workqueue: writeback bdi_writeback_workfn (flush-179:0)
      | task: df230000 ti: df23e000 task.ti: df23e000
      | PC is at __submit_merged_bio+0x6c/0x110
      | LR is at f2fs_submit_merged_bio+0x74/0x80
      …
      | [<c00085c4>] (gic_handle_irq) from [<c0012e84>] (__irq_svc+0x44/0x5c)
      | Exception stack(0xdf23fb48 to 0xdf23fb90)
      | fb40:                   deef3484 ffff0001 ffff0001 00000027 deef3484 00000000
      | fb60: deef3440 00000000 de426000 deef34ec deefc440 df23fbb4 df23fbb8 df23fb90
      | fb80: c02191f0 c0218fa0 60000013 ffffffff
      | [<c0012e84>] (__irq_svc) from [<c0218fa0>] (__submit_merged_bio+0x6c/0x110)
      | [<c0218fa0>] (__submit_merged_bio) from [<c02191f0>] (f2fs_submit_merged_bio+0x74/0x80)
      | [<c02191f0>] (f2fs_submit_merged_bio) from [<c021624c>] (sync_dirty_dir_inodes+0x70/0x78)
      | [<c021624c>] (sync_dirty_dir_inodes) from [<c0216358>] (write_checkpoint+0x104/0xc10)
      | [<c0216358>] (write_checkpoint) from [<c021231c>] (f2fs_sync_fs+0x80/0xbc)
      | [<c021231c>] (f2fs_sync_fs) from [<c0221eb8>] (f2fs_balance_fs_bg+0x4c/0x68)
      | [<c0221eb8>] (f2fs_balance_fs_bg) from [<c021e9b8>] (f2fs_write_node_pages+0x40/0x110)
      | [<c021e9b8>] (f2fs_write_node_pages) from [<c00de620>] (do_writepages+0x34/0x48)
      | [<c00de620>] (do_writepages) from [<c0145714>] (__writeback_single_inode+0x50/0x228)
      | [<c0145714>] (__writeback_single_inode) from [<c0146184>] (writeback_sb_inodes+0x1a8/0x378)
      | [<c0146184>] (writeback_sb_inodes) from [<c01463e4>] (__writeback_inodes_wb+0x90/0xc8)
      | [<c01463e4>] (__writeback_inodes_wb) from [<c01465f8>] (wb_writeback+0x1dc/0x28c)
      | [<c01465f8>] (wb_writeback) from [<c0146dd8>] (bdi_writeback_workfn+0x2ac/0x460)
      | [<c0146dd8>] (bdi_writeback_workfn) from [<c003c3fc>] (process_one_work+0x11c/0x3a4)
      | [<c003c3fc>] (process_one_work) from [<c003c844>] (worker_thread+0x17c/0x490)
      | [<c003c844>] (worker_thread) from [<c0041398>] (kthread+0xec/0x100)
      | [<c0041398>] (kthread) from [<c000ed10>] (ret_from_fork+0x14/0x24)
      
      As it turns out, the code loops in sync_dirty_dir_inodes() and waits for
      others to make progress but since it never leaves the CPU there is no
      progress made. At the time of this stall, there is also a rm process
      blocked:
      | rm              R running      0  1989   1774 0x00000000
      | [<c047c55c>] (__schedule) from [<c00486dc>] (__cond_resched+0x30/0x4c)
      | [<c00486dc>] (__cond_resched) from [<c047c8c8>] (_cond_resched+0x4c/0x54)
      | [<c047c8c8>] (_cond_resched) from [<c00e1aec>] (truncate_inode_pages_range+0x1f0/0x5e8)
      | [<c00e1aec>] (truncate_inode_pages_range) from [<c00e1fd8>] (truncate_inode_pages+0x28/0x30)
      | [<c00e1fd8>] (truncate_inode_pages) from [<c00e2148>] (truncate_inode_pages_final+0x60/0x64)
      | [<c00e2148>] (truncate_inode_pages_final) from [<c020c92c>] (f2fs_evict_inode+0x4c/0x268)
      | [<c020c92c>] (f2fs_evict_inode) from [<c0137214>] (evict+0x94/0x140)
      | [<c0137214>] (evict) from [<c01377e8>] (iput+0xc8/0x134)
      | [<c01377e8>] (iput) from [<c01333e4>] (d_delete+0x154/0x180)
      | [<c01333e4>] (d_delete) from [<c0129870>] (vfs_rmdir+0x114/0x12c)
      | [<c0129870>] (vfs_rmdir) from [<c012d644>] (do_rmdir+0x158/0x168)
      | [<c012d644>] (do_rmdir) from [<c012dd90>] (SyS_unlinkat+0x30/0x3c)
      | [<c012dd90>] (SyS_unlinkat) from [<c000ec40>] (ret_fast_syscall+0x0/0x4c)
      
      As explained by Jaegeuk Kim:
      |This inode is the directory (c.f., do_rmdir) causing an infinite loop on
      |sync_dirty_dir_inodes.
      |The sync_dirty_dir_inodes tries to flush dirty dentry pages, but if the
      |inode is under eviction, it submits bios and does it again until eviction
      |is finished.
      
      This patch adds a cond_resched() (as suggested by Jaegeuk) after a BIO
      is submitted so other threads can make progress.
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      [Jaegeuk Kim: change fs/f2fs to f2fs in subject as naming convention]
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
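The stall described above can be reproduced in miniature with a single-threaded simulation. This is only a sketch, not the kernel patch: `struct sim`, `yield_to_evictor()` and `flush_dirty_dirs()` are hypothetical names, and the "scheduler" is a plain function call that stands in for what cond_resched() allows on a non-preemptible kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: the evicting task needs a few CPU slices to finish. */
struct sim {
    int evict_steps_left;   /* slices the blocked rm task still needs */
    int bios_submitted;     /* how often the flusher resubmitted bios */
};

/* Stand-in for the scheduler running the evicting task for one slice. */
static void yield_to_evictor(struct sim *s)
{
    if (s->evict_steps_left > 0)
        s->evict_steps_left--;
}

/*
 * Retry loop modelled on the behaviour described in the commit message:
 * while the inode is under eviction, submit bios and try again.
 * Returns true if the loop finished within max_spins iterations.
 */
static bool flush_dirty_dirs(struct sim *s, bool yield, int max_spins)
{
    assert(max_spins > 0);
    for (int i = 0; i < max_spins; i++) {
        if (s->evict_steps_left == 0)
            return true;            /* eviction done, flush can proceed */
        s->bios_submitted++;        /* models f2fs_submit_merged_bio() */
        if (yield)
            yield_to_evictor(s);    /* models cond_resched() */
    }
    return s->evict_steps_left == 0;
}
```

Without the yield, the evicting task never runs and the loop only ends because the simulation caps the number of spins. With the yield, it ends after exactly as many iterations as the evictor needs slices.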
    • W
      f2fs: fix max orphan inodes calculation · 14b42817
      Wanpeng Li committed
      cp_payload is introduced for sit bitmap to support large volume, and it is
      just after the block of f2fs_checkpoint + nat bitmap, so the first segment
      should include F2FS_CP_PACKS + NR_CURSEG_TYPE + cp_payload + orphan blocks.
      However, the current max orphan inodes calculation does not consider
      cp_payload. This patch fixes it by subtracting cp_payload from the total
      blocks of the first segment when calculating max orphan inodes.
      Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
      Reviewed-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
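The arithmetic described in this commit can be sketched as follows. The constant values (2, 6 and 1020) and the helper name `max_orphan_inodes()` are assumptions made for illustration and should be checked against the f2fs headers. Only the shape of the formula comes from the commit message.

```c
#include <assert.h>

/* Assumed values for illustration only; verify against the f2fs headers. */
#define F2FS_CP_PACKS           2
#define NR_CURSEG_TYPE          6
#define F2FS_ORPHANS_PER_BLOCK  1020

/*
 * Blocks of the first segment not taken by checkpoint packs, current
 * segment summaries, or the cp payload are available for orphan inodes.
 */
static unsigned int max_orphan_inodes(unsigned int blocks_per_seg,
                                      unsigned int cp_payload)
{
    unsigned int reserved = F2FS_CP_PACKS + NR_CURSEG_TYPE + cp_payload;

    assert(reserved <= blocks_per_seg);
    return (blocks_per_seg - reserved) * F2FS_ORPHANS_PER_BLOCK;
}
```

Under these assumed constants, a 512-block segment with no payload yields 514080 orphan slots, and a 4-block payload lowers that to 510000. The second figure is what the pre-fix calculation would have overestimated.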
    • W
      f2fs: fix block_ops trace point · 2bda542d
      Wanpeng Li committed
      Block operations are used to flush all dirty node and dentry blocks in
      the page cache and suspend ordinary writing activities. However, some
      conditions, such as a cp error or a read-only mount, prevent block
      operations from being invoked. The current trace point prints the block_ops
      start prematurely, even if block_ops never gets the opportunity to execute.
      This patch fixes it by moving the block_ops trace point to just before block_ops.
      Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • W
      f2fs: fix the number of orphan inode blocks · 3c642985
      Wanpeng Li committed
      cp_pack_start_sum is calculated in do_checkpoint and is equal to
      cpu_to_le32(1 + cp_payload_blks + orphan_blocks). The number of
      orphan inode blocks is used by recover_orphan_inodes
      to read ahead meta pages and recover inodes. However, the current code
      forgets to subtract the number of cp payload blocks when calculating
      the number of orphan inode blocks. This patch fixes it.
      Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
      Reviewed-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
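The relation stated in this commit message can be inverted to recover the orphan block count. The sketch below uses host-endian integers and hypothetical helper names; the on-disk field is little-endian, and that conversion is left out here.

```c
#include <assert.h>

/* Forward direction, as the commit message describes for do_checkpoint. */
static unsigned int start_sum_from_orphan_blocks(unsigned int cp_payload_blks,
                                                 unsigned int orphan_blocks)
{
    return 1 + cp_payload_blks + orphan_blocks;
}

/*
 * Inverse direction, as needed during orphan recovery: both the leading
 * checkpoint block and the cp payload blocks must be subtracted.
 */
static unsigned int orphan_blocks_from_start_sum(unsigned int cp_pack_start_sum,
                                                 unsigned int cp_payload_blks)
{
    assert(cp_pack_start_sum >= 1 + cp_payload_blks);
    return cp_pack_start_sum - 1 - cp_payload_blks;
}
```

Omitting the `cp_payload_blks` term in the inverse would overcount the orphan blocks by exactly the payload size, which is the bug this commit describes.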
    • W
      f2fs: introduce macro __cp_payload · 55141486
      Wanpeng Li committed
      This patch introduces the macro __cp_payload.
      Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
      Reviewed-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  3. 04 Mar, 2015 1 commit
  4. 12 Feb, 2015 9 commits
    • J
      f2fs: fix sparse warnings · 29e7043f
      Jaegeuk Kim committed
      This patch resolves the following warnings.
      
      include/trace/events/f2fs.h:150:1: warning: expression using sizeof bool
      include/trace/events/f2fs.h:180:1: warning: expression using sizeof bool
      include/trace/events/f2fs.h:990:1: warning: expression using sizeof bool
      include/trace/events/f2fs.h:990:1: warning: expression using sizeof bool
      include/trace/events/f2fs.h:150:1: warning: odd constant _Bool cast (ffffffffffffffff becomes 1)
      include/trace/events/f2fs.h:180:1: warning: odd constant _Bool cast (ffffffffffffffff becomes 1)
      include/trace/events/f2fs.h:990:1: warning: odd constant _Bool cast (ffffffffffffffff becomes 1)
      include/trace/events/f2fs.h:990:1: warning: odd constant _Bool cast (ffffffffffffffff becomes 1)
      
      fs/f2fs/checkpoint.c:27:19: warning: symbol 'inode_entry_slab' was not declared. Should it be static?
      fs/f2fs/checkpoint.c:577:15: warning: cast to restricted __le32
      fs/f2fs/checkpoint.c:592:15: warning: cast to restricted __le32
      
      fs/f2fs/trace.c:19:1: warning: symbol 'pids' was not declared. Should it be static?
      fs/f2fs/trace.c:21:21: warning: symbol 'last_io' was not declared. Should it be static?
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • J
      f2fs: introduce macros to convert bytes and blocks in f2fs · f7ef9b83
      Jaegeuk Kim committed
      This patch adds two macros for transition between byte and block offsets.
      Currently, f2fs only supports 4KB blocks, so use the default size for now.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
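A byte/block conversion pair for 4KB blocks might look like the sketch below. The macro names and the shift constant are illustrative assumptions, not necessarily the ones the patch adds.

```c
#include <assert.h>

/* Assumed: 4KB blocks, so a block offset is a byte offset shifted by 12. */
#define BLKSIZE_BITS 12

static_assert((1U << BLKSIZE_BITS) == 4096, "sketch assumes 4KB blocks");

/* Byte offset to block index (rounds down to the containing block). */
#define BYTES_TO_BLK(bytes) ((bytes) >> BLKSIZE_BITS)

/* Block index to the byte offset of the block's first byte. */
#define BLK_TO_BYTES(blk) ((unsigned long long)(blk) << BLKSIZE_BITS)
```

The cast in `BLK_TO_BYTES` is a choice made for this sketch, so that large block indices do not overflow a 32-bit intermediate before the shift.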
    • C
      f2fs: merge {invalidate,release}page for meta/node/data pages · 487261f3
      Chao Yu committed
      This patch merges ->{invalidate,release}page function for meta/node/data pages.
      
      After this, the duplicated code can be removed.
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • J
      f2fs: keep PagePrivate during releasepage · f68daeeb
      Jaegeuk Kim committed
      If PagePrivate is removed by releasepage, f2fs loses count of dirty pages.
      
      e.g., try_to_release_page will not release page when the page is dirty,
      but our releasepage removes PagePrivate.
      
          [<ffffffff81188d75>] try_to_release_page+0x35/0x50
          [<ffffffff811996f9>] invalidate_inode_pages2_range+0x2f9/0x3b0
          [<ffffffffa02a7f54>] ? truncate_blocks+0x384/0x4d0 [f2fs]
          [<ffffffffa02b7583>] ? f2fs_direct_IO+0x283/0x290 [f2fs]
          [<ffffffffa02b7fb0>] ? get_data_block_fiemap+0x20/0x20 [f2fs]
          [<ffffffff8118aa53>] generic_file_direct_write+0x163/0x170
          [<ffffffff8118ad06>] __generic_file_write_iter+0x2a6/0x350
          [<ffffffff8118adef>] generic_file_write_iter+0x3f/0xb0
          [<ffffffff81203081>] new_sync_write+0x81/0xb0
          [<ffffffff81203837>] vfs_write+0xb7/0x1f0
          [<ffffffff81204459>] SyS_write+0x49/0xb0
          [<ffffffff817c286d>] system_call_fastpath+0x16/0x1b
      Reviewed-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • J
      f2fs: split UMOUNT and FASTBOOT flags · 119ee914
      Jaegeuk Kim committed
      This patch adds FASTBOOT flag into checkpoint as follows.
      
       - CP_UMOUNT_FLAG is set when system is umounted.
       - CP_FASTBOOT_FLAG is set when intermediate checkpoint having node summaries
         was done.
      
      So, if you get CP_UMOUNT_FLAG from checkpoint, the system was umounted cleanly.
      Instead, if there was sudden-power-off, you can get CP_FASTBOOT_FLAG or nothing.
      Reviewed-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
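The flag semantics described in this commit can be read back with a small classifier. The bit values, `struct ckpt_sketch` and `classify_checkpoint()` are assumptions made for illustration. Only the meaning of the two flags comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed bit values for illustration; check the on-disk format header. */
#define CP_UMOUNT_FLAG    0x00000001u
#define CP_FASTBOOT_FLAG  0x00000020u

/* Hypothetical stand-in for the checkpoint's flag word. */
struct ckpt_sketch {
    unsigned int ckpt_flags;
};

enum cp_state {
    CP_STATE_CLEAN_UMOUNT,  /* CP_UMOUNT_FLAG present: unmounted cleanly */
    CP_STATE_FASTBOOT,      /* sudden power-off after a checkpoint that
                               wrote node summaries */
    CP_STATE_NO_HINT        /* sudden power-off, neither flag present */
};

static enum cp_state classify_checkpoint(const struct ckpt_sketch *cp)
{
    assert(cp != NULL);
    if (cp->ckpt_flags & CP_UMOUNT_FLAG)
        return CP_STATE_CLEAN_UMOUNT;
    if (cp->ckpt_flags & CP_FASTBOOT_FLAG)
        return CP_STATE_FASTBOOT;
    return CP_STATE_NO_HINT;
}
```

Checking the umount flag first is a choice made for this sketch: the commit message says that flag alone is enough to conclude the unmount was clean.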
    • J
      f2fs: avoid write_checkpoint if f2fs is mounted readonly · 11504a8e
      Jaegeuk Kim committed
      Do not change any partition when f2fs is changed to readonly mode.
      Reviewed-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • C
      f2fs: merge flags in struct f2fs_sb_info · caf0047e
      Chao Yu committed
      Currently, there are several variables with Boolean type as below:
      
      struct f2fs_sb_info {
      ...
      	int s_dirty;
      	bool need_fsck;
      	bool s_closing;
      ...
      	bool por_doing;
      ...
      }
      
      This causes some issues:
      1. some space in f2fs_sb_info is wasted because the compiler adds alignment
         padding after Boolean type variables.
      2. if we keep adding new flags to f2fs_sb_info, the structure will become
         messy.
      
      So in this patch, we try to:
      1. switch s_dirty to a Boolean type variable since it has only two states, 0/1.
      2. merge s_dirty/need_fsck/s_closing/por_doing variables into s_flag.
      3. introduce an enum type which can indicate different states of sbi.
      4. use the newly introduced universal interfaces is_sbi_flag_set/{set,clear}_sbi_flag
         to operate on flags for sbi.
      
      After that, the above issues will be fixed.
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
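The merge of separate Boolean fields into one flag word can be sketched as below. The interface names is_sbi_flag_set, set_sbi_flag and clear_sbi_flag come from the commit message. The enum member names, the `struct sbi_sketch` type and the width of `s_flag` are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical enum members: one bit position per former Boolean field. */
enum sbi_flag {
    SBI_IS_DIRTY,    /* was s_dirty */
    SBI_IS_CLOSE,    /* was s_closing */
    SBI_NEED_FSCK,   /* was need_fsck */
    SBI_POR_DOING,   /* was por_doing */
    SBI_FLAG_MAX
};

/* Stand-in for the relevant part of struct f2fs_sb_info. */
struct sbi_sketch {
    unsigned int s_flag;
};

static bool is_sbi_flag_set(const struct sbi_sketch *sbi, enum sbi_flag type)
{
    assert(type < SBI_FLAG_MAX);
    return (sbi->s_flag & (1u << type)) != 0;
}

static void set_sbi_flag(struct sbi_sketch *sbi, enum sbi_flag type)
{
    assert(type < SBI_FLAG_MAX);
    sbi->s_flag |= 1u << type;
}

static void clear_sbi_flag(struct sbi_sketch *sbi, enum sbi_flag type)
{
    assert(type < SBI_FLAG_MAX);
    sbi->s_flag &= ~(1u << type);
}
```

Adding a new state then only requires a new enum member before `SBI_FLAG_MAX`, with no new field in the structure.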
    • C
      f2fs: fix to release count of meta page in ->invalidatepage · 1601839e
      Chao Yu committed
      We will encounter an endless loop in the scenario below:
      
      1. increase page count for F2FS_DIRTY_META type in following path:
      ->recover_fsync_data
        ->recover_data
          ->do_recover_data
            ->recover_data_page
              ->change_curseg
                ->write_sum_page
                  ->set_page_dirty
      2. fail in recover_data()
      3. invalidate meta pages in truncate_inode_pages_final without decreasing page
         count.
      4. sync_meta_pages loops forever because the page count will always be non-zero.
      
      message:
      NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s!
      
       [<c1129a37>] pagevec_lookup_tag+0x27/0x30
       [<f0e774c7>] sync_meta_pages+0x87/0x160 [f2fs]
       [<f0e86dd9>] recover_fsync_data+0xeb9/0xf10 [f2fs]
       [<f0e75398>] f2fs_fill_super+0x888/0x980 [f2fs]
       [<c11733ca>] mount_bdev+0x16a/0x1a0
       [<f0e7180f>] f2fs_mount+0x1f/0x30 [f2fs]
       [<c1173da6>] mount_fs+0x36/0x170
       [<c118b6f5>] vfs_kern_mount+0x55/0xe0
       [<c118d63f>] do_mount+0x1df/0x9f0
       [<c118e110>] SyS_mount+0x70/0xb0
       [<c15a0c48>] sysenter_do_call+0x12/0x12
      
      To avoid the page count leak, let's add ->invalidatepage and ->releasepage to
      f2fs_meta_aops, as in f2fs_node_aops, to release the meta page count correctly.
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • J
      f2fs: do checkpoint when umount flag is not set · 85dc2f2c
      Jaegeuk Kim committed
      If the previous checkpoint was done without CP_UMOUNT flag, it needs to do
      checkpoint with CP_UMOUNT for the next fast boot.
      Reviewed-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  5. 10 Jan, 2015 4 commits
  6. 09 Dec, 2014 3 commits
  7. 06 Dec, 2014 1 commit
    • J
      f2fs: call radix_tree_preload before radix_tree_insert · 769ec6e5
      Jaegeuk Kim committed
      This patch tries to fix:
      
       BUG: using smp_processor_id() in preemptible [00000000] code: f2fs_gc-254:0/384
        (radix_tree_node_alloc+0x14/0x74) from [<c033d8a0>] (radix_tree_insert+0x110/0x200)
        (radix_tree_insert+0x110/0x200) from [<c02e8264>] (gc_data_segment+0x340/0x52c)
        (gc_data_segment+0x340/0x52c) from [<c02e8658>] (f2fs_gc+0x208/0x400)
        (f2fs_gc+0x208/0x400) from [<c02e8a98>] (gc_thread_func+0x248/0x28c)
        (gc_thread_func+0x248/0x28c) from [<c0139944>] (kthread+0xa0/0xac)
        (kthread+0xa0/0xac) from [<c0105ef8>] (ret_from_fork+0x14/0x3c)
      
      The reason is that f2fs calls radix_tree_insert under enabled preemption.
      So, before calling it, we need to call radix_tree_preload.
      
      Otherwise, we should use _GFP_WAIT for the radix tree, and use mutex or
      semaphore to cover the radix tree operations.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  8. 20 Nov, 2014 2 commits
    • J
      f2fs: write SSA pages under memory pressure · 857dc4e0
      Jaegeuk Kim committed
      Under memory pressure, we don't need to skip SSA page writes.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • C
      f2fs: introduce struct inode_management to wrap inner fields · 67298804
      Chao Yu committed
      Now in f2fs, we have three inode caches: ORPHAN_INO, APPEND_INO, UPDATE_INO,
      and we manage fields related to inode cache separately in struct f2fs_sb_info
      for each inode cache type.
      This makes the code a bit messy, so this patch introduces a new struct
      inode_management to wrap the inner fields as follows, which makes the code neater.
      
      /* for inner inode cache management */
      struct inode_management {
      	struct radix_tree_root ino_root;	/* ino entry array */
      	spinlock_t ino_lock;			/* for ino entry lock */
      	struct list_head ino_list;		/* inode list head */
      	unsigned long ino_num;			/* number of entries */
      };
      
      struct f2fs_sb_info {
      	...
      	struct inode_management im[MAX_INO_ENTRY];      /* manage inode cache */
      	...
      }
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  9. 07 Nov, 2014 1 commit
  10. 05 Nov, 2014 1 commit
    • J
      f2fs: avoid race condition in handling wait_io · 6a8f8ca5
      Jaegeuk Kim committed
      __submit_merged_bio    f2fs_write_end_io        f2fs_write_end_io
                             wait_io = X              wait_io = x
                             complete(X)              complete(X)
                             wait_io = NULL
      wait_for_completion()
      free(X)
                                                       spin_lock(X)
                                                       kernel panic
      
      In order to avoid this, this patch removes the wait_io facility.
      Instead, we can use wait_on_all_pages_writeback(sbi) to wait for end_ios.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  11. 04 Nov, 2014 1 commit
  12. 01 Oct, 2014 3 commits
  13. 24 Sep, 2014 2 commits
  14. 16 Sep, 2014 2 commits
  15. 10 Sep, 2014 2 commits
  16. 04 Sep, 2014 1 commit