1. 05 Aug, 2015 2 commits
  2. 25 Jul, 2015 2 commits
    • f2fs: call set_page_dirty to attach i_wb for cgroup · 6282adbf
      Jaegeuk Kim committed
      The cgroup attaches inode->i_wb via mark_inode_dirty and when set_page_writeback
      is called, __inc_wb_stat() updates i_wb's stat.
      
      So we need to explicitly call set_page_dirty->__mark_inode_dirty prior to
      writing back any pages.
      
      This patch should resolve the following kernel panic reported by Andreas Reis.
      
      https://bugzilla.kernel.org/show_bug.cgi?id=101801
      
      --- Comment #2 from Andreas Reis <andreas.reis@gmail.com> ---
      BUG: unable to handle kernel NULL pointer dereference at 00000000000000a8
      IP: [<ffffffff8149deea>] __percpu_counter_add+0x1a/0x90
      PGD 2951ff067 PUD 2df43f067 PMD 0
      Oops: 0000 [#1] PREEMPT SMP
      Modules linked in:
      CPU: 7 PID: 10356 Comm: gcc Tainted: G        W       4.2.0-1-cu #1
      Hardware name: Gigabyte Technology Co., Ltd. G1.Sniper M5/G1.Sniper M5, BIOS T01 02/03/2015
      task: ffff880295044f80 ti: ffff880295140000 task.ti: ffff880295140000
      RIP: 0010:[<ffffffff8149deea>]  [<ffffffff8149deea>] __percpu_counter_add+0x1a/0x90
      RSP: 0018:ffff880295143ac8  EFLAGS: 00010082
      RAX: 0000000000000003 RBX: ffffea000a526d40 RCX: 0000000000000001
      RDX: 0000000000000020 RSI: 0000000000000001 RDI: 0000000000000088
      RBP: ffff880295143ae8 R08: 0000000000000000 R09: ffff88008f69bb30
      R10: 00000000fffffffa R11: 0000000000000000 R12: 0000000000000088
      R13: 0000000000000001 R14: ffff88041d099000 R15: ffff880084a205d0
      FS:  00007f8549374700(0000) GS:ffff88042f3c0000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00000000000000a8 CR3: 000000033e1d5000 CR4: 00000000001406e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Stack:
       0000000000000000 ffffea000a526d40 ffff880084a20738 ffff880084a20750
       ffff880295143b48 ffffffff811cc91e ffff880000000000 0000000000000296
       0000000000000000 ffff880417090198 0000000000000000 ffffea000a526d40
      Call Trace:
       [<ffffffff811cc91e>] __test_set_page_writeback+0xde/0x1d0
       [<ffffffff813fee87>] do_write_data_page+0xe7/0x3a0
       [<ffffffff813faeea>] gc_data_segment+0x5aa/0x640
       [<ffffffff813fb0b8>] do_garbage_collect+0x138/0x150
       [<ffffffff813fb3fe>] f2fs_gc+0x1be/0x3e0
       [<ffffffff81405541>] f2fs_balance_fs+0x81/0x90
       [<ffffffff813ee357>] f2fs_unlink+0x47/0x1d0
       [<ffffffff81239329>] vfs_unlink+0x109/0x1b0
       [<ffffffff8123e3d7>] do_unlinkat+0x287/0x2c0
       [<ffffffff8123ebc6>] SyS_unlink+0x16/0x20
       [<ffffffff81942e2e>] entry_SYSCALL_64_fastpath+0x12/0x71
      Code: 41 5e 5d c3 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 55 49
      89 f5 41 54 49 89 fc 53 48 83 ec 08 65 ff 05 e6 d9 b6 7e <48> 8b 47 20 48 63 ca
      65 8b 18 48 63 db 48 01 f3 48 39 cb 7d 0a
      RIP  [<ffffffff8149deea>] __percpu_counter_add+0x1a/0x90
       RSP <ffff880295143ac8>
      CR2: 00000000000000a8
      ---[ end trace 5132449a58ed93a3 ]---
      note: gcc[10356] exited with preempt_count 2
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: handle error cases in move_encrypted_block · 548aedac
      Jaegeuk Kim committed
      This patch fixes some missing error handlers.
      Reviewed-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  3. 02 Jun, 2015 1 commit
  4. 29 May, 2015 4 commits
  5. 11 Apr, 2015 1 commit
  6. 12 Feb, 2015 3 commits
  7. 10 Jan, 2015 1 commit
    • f2fs: reuse inode_entry_slab in gc procedure for using slab more effectively · 06292073
      Chao Yu committed
      There are two slab caches, inode_entry_slab and winode_slab, that use the
      same structure, as shown below:
      
      struct dir_inode_entry {
      	struct list_head list;	/* list head */
      	struct inode *inode;	/* vfs inode pointer */
      };
      
      struct inode_entry {
      	struct list_head list;
      	struct inode *inode;
      };
      
      It is slightly wasteful that the two caches cannot share memory with each
      other.
      So this patch removes the redundant winode_slab slab cache, adopts the more
      general name struct inode_entry for the remaining slab data structure, and
      reuses inode_entry_slab to store both dirty dir items and gc items, which
      uses the slab more effectively.
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  8. 09 Dec, 2014 1 commit
  9. 06 Dec, 2014 1 commit
    • f2fs: call radix_tree_preload before radix_tree_insert · 769ec6e5
      Jaegeuk Kim committed
      This patch tries to fix:
      
       BUG: using smp_processor_id() in preemptible [00000000] code: f2fs_gc-254:0/384
        (radix_tree_node_alloc+0x14/0x74) from [<c033d8a0>] (radix_tree_insert+0x110/0x200)
        (radix_tree_insert+0x110/0x200) from [<c02e8264>] (gc_data_segment+0x340/0x52c)
        (gc_data_segment+0x340/0x52c) from [<c02e8658>] (f2fs_gc+0x208/0x400)
        (f2fs_gc+0x208/0x400) from [<c02e8a98>] (gc_thread_func+0x248/0x28c)
        (gc_thread_func+0x248/0x28c) from [<c0139944>] (kthread+0xa0/0xac)
        (kthread+0xa0/0xac) from [<c0105ef8>] (ret_from_fork+0x14/0x3c)
      
      The reason is that f2fs calls radix_tree_insert with preemption enabled.
      So, before calling it, we need to call radix_tree_preload.
      
      Otherwise, we should use __GFP_WAIT for the radix tree, and use a mutex or
      semaphore to cover the radix tree operations.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  10. 03 Dec, 2014 1 commit
  11. 28 Nov, 2014 1 commit
  12. 20 Nov, 2014 1 commit
    • f2fs: avoid unable to restart gc thread in remount · 6c029932
      Chao Yu committed
      In f2fs_remount, we stop the gc thread and set need_restart_gc to true when
      the new options do not include BG_GC, so that if any error occurs in the
      following procedure, we can restore the state by restarting the gc thread.
      But after that, we fail to restart the gc thread in start_gc_thread, because
      BG_GC is not set in the new options. So we'd better move this condition check
      out of start_gc_thread to fix this issue.
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  13. 05 Nov, 2014 1 commit
  14. 04 Nov, 2014 1 commit
  15. 01 Oct, 2014 2 commits
    • f2fs: check the use of macros on block counts and addresses · 7cd8558b
      Jaegeuk Kim committed
      This patch cleans up the existing and new macros for readability.
      
      The rule is as follows.
      
               ,-----------------------------------------> MAX_BLKADDR -,
               |  ,------------- TOTAL_BLKS ----------------------------,
               |  |                                                     |
               |  ,- seg0_blkaddr   ,----- sit/nat/ssa/main blkaddress  |
      block    |  | (SEG0_BLKADDR)  | | | |   (e.g., MAIN_BLKADDR)      |
      address  0..x................ a b c d .............................
                  |                                                     |
      global seg# 0...................... m .............................
                  |                       |                             |
                  |                       `------- MAIN_SEGS -----------'
                  `-------------- TOTAL_SEGS ---------------------------'
                                          |                             |
       seg#                               0..........xx..................
      
      = Note =
       o GET_SEGNO_FROM_SEG0 : blk address -> global segno
       o GET_SEGNO           : blk address -> segno
       o START_BLOCK         : segno -> starting block address
      Reviewed-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
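The three conversions listed in the note above can be sketched in user-space C. This is a simplified illustration, not the kernel's macros: the struct, its field names, and the function names are hypothetical, and it assumes 512 blocks per segment (a log2 value of 9) and a main area that starts on a segment boundary.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified layout; field names are illustrative only. */
struct layout {
    uint32_t seg0_blkaddr;       /* first block of global segment 0 */
    uint32_t main_blkaddr;       /* first block of the main area */
    uint32_t log_blocks_per_seg; /* assumed 9, i.e. 512 blocks per segment */
};

/* blk address -> global segno (counted from seg0_blkaddr) */
static uint32_t segno_from_seg0(const struct layout *l, uint32_t blkaddr)
{
    assert(blkaddr >= l->seg0_blkaddr);
    return (blkaddr - l->seg0_blkaddr) >> l->log_blocks_per_seg;
}

/* blk address -> segno counted from the start of the main area */
static uint32_t segno_in_main(const struct layout *l, uint32_t blkaddr)
{
    assert(blkaddr >= l->main_blkaddr);
    return (blkaddr - l->main_blkaddr) >> l->log_blocks_per_seg;
}

/* main-area segno -> starting block address */
static uint32_t start_block(const struct layout *l, uint32_t segno)
{
    return l->main_blkaddr + (segno << l->log_blocks_per_seg);
}
```

With seg0_blkaddr = 512 and a main area starting ten segments later, the same block address maps to global segment 10 but main-area segment 0, which is the distinction the diagram draws between the two seg# rows.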
    • f2fs: introduce cp_control structure · 75ab4cb8
      Jaegeuk Kim committed
      This patch adds a new data structure to control checkpoint parameters.
      Currently, it records the reason for the checkpoint, such as is_umount or a
      normal sync.
      Reviewed-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  16. 24 Sep, 2014 1 commit
    • f2fs: fix to search whole dirty segmap when get_victim · 210f41bc
      Chao Yu committed
      In ->get_victim we read the max_search value from dirty_i->nr_dirty without
      holding seglist_lock; after that, nr_dirty can be increased or decreased
      before we take seglist_lock.
      Then in the main loop we attempt to traverse all dirty sections once to find
      a victim section, but it is not accurate to use max_search as the total loop
      count, because we might skip several sections or check sections redundantly
      if nr_dirty has changed in the meantime.
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  17. 16 Sep, 2014 1 commit
  18. 10 Sep, 2014 1 commit
    • f2fs: avoid node page to be written twice in gc_node_segment · 9a01b56b
      Huang Ying committed
      In gc_node_segment, if node page gc is run concurrently with node page
      writeback, and check_valid_map and get_node_page run after page locked
      and before cur_valid_map is updated as below, it is possible for the
      page to be written twice unnecessarily.
      
      			sync_node_pages
      			  try_lock_page
      			  ...
      check_valid_map		  f2fs_write_node_page
      			    ...
      			    write_node_page
      			      do_write_page
      			        allocate_data_block
      				  ...
      				  refresh_sit_entry /* update cur_valid_map */
      				  ...
      			    ...
      			    unlock_page
      get_node_page
      ...
      set_page_dirty
      ...
      f2fs_put_page
        unlock_page
      
      This can be solved by calling check_valid_map again after get_node_page.
      Signed-off-by: Huang, Ying <ying.huang@intel.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  19. 02 Sep, 2014 1 commit
    • f2fs: reposition unlock_new_inode to prevent accessing invalid inode · b73e5282
      Chao Yu committed
      Because of a race condition on the inode cache, the following scenario can occur:
      [Thread a]				[Thread b]
      					->f2fs_mkdir
      					  ->f2fs_add_link
      					    ->__f2fs_add_link
      					      ->init_inode_metadata failed here
      ->gc_thread_func
        ->f2fs_gc
          ->do_garbage_collect
            ->gc_data_segment
              ->f2fs_iget
                ->iget_locked
                  ->wait_on_inode
      					  ->unlock_new_inode
              ->move_data_page
      					  ->make_bad_inode
      					  ->iput
      
      When we fail in create/symlink/mkdir/mknod/tmpfile, the newly allocated inode
      should be marked bad so that other threads do not access it. But in the above
      scenario, f2fs can access the invalid inode before it has been marked bad.
      This patch fixes the potential problem; the issue was found by code review.
      
      change log from v1:
       o Add condition judgment in gc_data_segment() suggested by Changman Lee.
       o use iget_failed to simplify code.
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  20. 22 Aug, 2014 1 commit
  21. 20 Aug, 2014 1 commit
  22. 05 Aug, 2014 1 commit
  23. 10 Mar, 2014 1 commit
  24. 27 Feb, 2014 1 commit
  25. 17 Feb, 2014 2 commits
  26. 14 Jan, 2014 1 commit
  27. 08 Jan, 2014 1 commit
    • f2fs: add a sysfs entry to control max_victim_search · b1c57c1c
      Jaegeuk Kim committed
      Previously during SSR and GC, the maximum number of retries to find a victim
      segment was hard-coded as MAX_VICTIM_SEARCH, 4096 by default.
      
      This number affects IO locality when SSR mode is activated, which results in
      performance fluctuation on some low-end devices.
      
      If max_victim_search = 4, the victim will be searched like below.
      ("D" represents a dirty segment, and "*" indicates a selected victim segment.)
      
       D1 D2 D3 D4 D5 D6 D7 D8 D9
      [   *       ]
            [   *    ]
                  [         * ]
      	                [ ....]
      
      This patch adds a sysfs entry to control the number dynamically through:
        /sys/fs/f2fs/$dev/max_victim_search
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
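The bounded search in the diagram above can be sketched in user-space C. This is only an illustration under stated assumptions: pick_victim is a hypothetical name, the cost array stands in for whatever per-segment cost the real victim selection computes, and the policy for where the next search resumes is left to the caller.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the index of the cheapest entry among at most max_search
 * entries of cost[] starting at start, or n if the window is empty.
 * cost[] holds one hypothetical cost value per dirty segment.
 */
static size_t pick_victim(const unsigned int *cost, size_t n,
                          size_t start, size_t max_search)
{
    size_t best = n;
    size_t searched = 0;

    assert(cost != NULL || n == 0);
    for (size_t i = start; i < n && searched < max_search; i++, searched++) {
        if (best == n || cost[i] < cost[best])
            best = i;
    }
    return best;
}
```

With costs { 300, 120, 400, 250, 90, 500, 10, 60, 70 } and max_search = 4, a search from index 0 settles on index 1 and never sees the globally cheapest entry at index 6. A limit of 4096 would find it, at the price of a longer scan. That is the trade-off the sysfs entry exposes.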
  28. 23 Dec, 2013 4 commits
    • f2fs: remove the rw_flag domain from f2fs_io_info · 7e8f2308
      Gu Zheng committed
      When using f2fs_io_info at the low level, we still need to merge rw and
      rw_flag, so use rw to hold all the io flags directly and remove the
      rw_flag field.
      
      P.S. This is based on the previous patch:
      f2fs: move all the bio initialization into __bio_alloc
      Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
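A small user-space sketch of the idea in this commit, assuming hypothetical flag bits and struct names (io_info_old, io_info_new, and the IO_* values are illustrative and are not the kernel's definitions): if the low-level path always ORs the two fields together, a single field carrying every flag yields the same value with one less member to keep in sync.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bits for illustration; not the kernel's values. */
enum {
    IO_WRITE = 1 << 0,
    IO_SYNC  = 1 << 1,
    IO_META  = 1 << 2,
    IO_PRIO  = 1 << 3,
};

/* Before: direction/sync bits and request flags kept apart. */
struct io_info_old {
    int rw;
    int rw_flag;
};

/* After: one field holds every I/O flag. */
struct io_info_new {
    int rw;
};

/* The old low-level path has to merge the two fields on every submit. */
static int submit_flags_old(const struct io_info_old *fio)
{
    assert(fio != NULL);
    return fio->rw | fio->rw_flag;
}

/* The new low-level path uses the single field as-is. */
static int submit_flags_new(const struct io_info_new *fio)
{
    assert(fio != NULL);
    return fio->rw;
}
```

A caller that previously filled rw with IO_WRITE | IO_SYNC and rw_flag with IO_META | IO_PRIO would, in this sketch, simply store all four bits in rw.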
    • f2fs: refactor bio->rw handling · 458e6197
      Jaegeuk Kim committed
      This patch introduces f2fs_io_info to mitigate the complex parameter list.
      
      struct f2fs_io_info {
      	enum page_type type;		/* contains DATA/NODE/META/META_FLUSH */
      	int rw;				/* contains R/RS/W/WS */
      	int rw_flag;			/* contains REQ_META/REQ_PRIO */
      };
      
      1. f2fs_write_data_pages
       - DATA
       - WRITE_SYNC is set when wbc->WB_SYNC_ALL.
      
      2. sync_node_pages
       - NODE
       - WRITE_SYNC all the time
      
      3. sync_meta_pages
       - META
       - WRITE_SYNC all the time
       - REQ_META | REQ_PRIO all the time
      
       ** f2fs_submit_merged_bio() handles META_FLUSH.
      
      4. ra_nat_pages, ra_sit_pages, ra_sum_pages
       - META
       - READ_SYNC
      
      Cc: Fan Li <fanofcode.li@samsung.com>
      Cc: Changman Lee <cm224.lee@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
    • f2fs: merge pages with the same sync_mode flag · 63a0b7cb
      Fan Li committed
      Previously f2fs submitted most write requests using WRITE_SYNC, but
      f2fs_write_data_pages submitted the last write requests using the sync_mode
      flags passed by the caller.
      
      This causes a performance problem, since contiguous pages with different sync
      flags can't be merged in the cfq IO scheduler (thanks to Yu Chao for pointing
      it out), and synchronous requests often take more time.
      
      This patch makes the following modifications to DATA writebacks:
      
      1. every page is written back using the sync mode the caller passes.
      2. only pages with the same sync mode can be merged into one bio request.
      
      These changes are restricted to DATA pages. Other types of writebacks are
      modified to remain synchronous.
      
      In my test with tiotest, f2fs sequential write performance improved by about
      7%-10%, and this patch has no obvious impact on other performance tests.
      Signed-off-by: Fan Li <fanofcode.li@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
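The merging rule in point 2 can be illustrated with a small user-space model. It is a sketch under assumptions: struct page_req and count_bios are hypothetical names, and a request is taken to join the current bio only when it is block-contiguous with the previous request and carries the same sync mode.

```c
#include <assert.h>
#include <stddef.h>

struct page_req {
    unsigned int blkaddr; /* target block address */
    int sync;             /* 1 = synchronous write, 0 = asynchronous */
};

/*
 * Count the bios needed when a request joins the current bio only if it
 * is block-contiguous with the previous one and has the same sync mode.
 */
static size_t count_bios(const struct page_req *reqs, size_t n)
{
    size_t bios = 0;

    assert(reqs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        int mergeable = i > 0 &&
                        reqs[i].blkaddr == reqs[i - 1].blkaddr + 1 &&
                        reqs[i].sync == reqs[i - 1].sync;
        if (!mergeable)
            bios++;
    }
    return bios;
}
```

Four contiguous pages with one sync mode fit in a single bio in this model, while the same four pages with the last one flagged differently need two. That split is what the commit removes for DATA pages by writing every page with the caller's sync mode.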
    • f2fs: add unlikely() macro for compiler more aggressively · 6bacf52f
      Jaegeuk Kim committed
      This patch adds the unlikely() macro to most of the code.
      The basic rule is to add it when:
      - checking unusual errors,
      - checking page mappings,
      - and the other unlikely conditions.
      
      Change log from v1:
       - Don't add unlikely for the NULL test and error test: advised by Andi Kleen.
      
      Cc: Chao Yu <chao2.yu@samsung.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Reviewed-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
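For readers unfamiliar with the macro, here is a minimal user-space sketch of how unlikely() is typically defined for GCC and Clang and how it annotates a rare branch. The check_blkaddr function is a hypothetical example of an "unusual error" check and does not come from the patch.

```c
#include <assert.h>

/* Branch hint: tell the compiler the condition is expected to be false. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/*
 * Hypothetical range check: return 0 if blkaddr is valid, -1 otherwise.
 * The out-of-range case is assumed to be rare, so it is marked unlikely.
 */
static int check_blkaddr(unsigned int blkaddr, unsigned int max_blkaddr)
{
    if (unlikely(blkaddr >= max_blkaddr))
        return -1;
    return 0;
}
```

The hint changes only how the compiler lays out the branches, not what the function returns, so check_blkaddr behaves the same with or without the annotation.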