1. 05 Aug, 2015 3 commits
  2. 02 Jun, 2015 1 commit
    • writeback: move bandwidth related fields from backing_dev_info into bdi_writeback · a88a341a
      Tejun Heo committed
      Currently, a bdi (backing_dev_info) embeds single wb (bdi_writeback)
      and the role of the separation is unclear.  For cgroup support for
      writeback IOs, a bdi will be updated to host multiple wb's where each
      wb serves writeback IOs of a different cgroup on the bdi.  To achieve
      that, a wb should carry all states necessary for servicing writeback
      IOs for a cgroup independently.
      
      This patch moves bandwidth related fields from backing_dev_info into
      bdi_writeback.
      
      * The moved fields are: bw_time_stamp, dirtied_stamp, written_stamp,
        write_bandwidth, avg_write_bandwidth, dirty_ratelimit,
        balanced_dirty_ratelimit, completions and dirty_exceeded.
      
      * writeback_chunk_size() and over_bground_thresh() now take @wb
        instead of @bdi.
      
      * bdi_writeout_fraction(bdi, ...)	-> wb_writeout_fraction(wb, ...)
        bdi_dirty_limit(bdi, ...)		-> wb_dirty_limit(wb, ...)
        bdi_position_ratio(bdi, ...)		-> wb_position_ratio(wb, ...)
        bdi_update_write_bandwidth(bdi, ...)	-> wb_update_write_bandwidth(wb, ...)
        [__]bdi_update_bandwidth(bdi, ...)	-> [__]wb_update_bandwidth(wb, ...)
        bdi_{max|min}_pause(bdi, ...)		-> wb_{max|min}_pause(wb, ...)
        bdi_dirty_limits(bdi, ...)		-> wb_dirty_limits(wb, ...)
      
      * Init/exits of the relocated fields are moved to bdi_wb_init/exit()
        respectively.  Note that explicit zeroing is dropped in the process
        as wb's are cleared in entirety anyway.
      
      * As there's still only one bdi_writeback per backing_dev_info, all
        uses of bdi->stat[] are mechanically replaced with bdi->wb.stat[]
        introducing no behavior changes.
      
      v2: Typo in description fixed as suggested by Jan.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Wu Fengguang <fengguang.wu@intel.com>
      Cc: Jaegeuk Kim <jaegeuk@kernel.org>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
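The field move described in the writeback commit can be pictured with a small user-space sketch. This is an illustration under stated assumptions, not the kernel code: the field names come from the commit message, but the types are simplified, `completions` is left out, the `INIT_BW` value is hypothetical, and the `now` parameter stands in for whatever time source the kernel uses.

```c
#include <assert.h>
#include <string.h>

#define INIT_BW 100UL  /* hypothetical initial bandwidth value */

struct backing_dev_info;

/* Per-writeback state; after the patch it owns the bandwidth fields. */
struct bdi_writeback {
	struct backing_dev_info *bdi;           /* owning bdi */
	unsigned long bw_time_stamp;
	unsigned long dirtied_stamp;
	unsigned long written_stamp;
	unsigned long write_bandwidth;
	unsigned long avg_write_bandwidth;
	unsigned long dirty_ratelimit;
	unsigned long balanced_dirty_ratelimit;
	int dirty_exceeded;
	/* 'completions' is omitted: its kernel type is not reproduced here */
};

struct backing_dev_info {
	struct bdi_writeback wb;                /* still exactly one wb per bdi */
};

/* Simplified init: the signature is an assumption made for this sketch. */
static inline void bdi_wb_init(struct bdi_writeback *wb,
			       struct backing_dev_info *bdi,
			       unsigned long now)
{
	assert(wb && bdi);
	memset(wb, 0, sizeof(*wb));             /* clears every field at once */
	wb->bdi = bdi;
	wb->bw_time_stamp = now;
	wb->write_bandwidth = INIT_BW;
	wb->avg_write_bandwidth = INIT_BW;
	wb->dirty_ratelimit = INIT_BW;
	wb->balanced_dirty_ratelimit = INIT_BW;
}
```

Because the whole structure is cleared first, fields such as `dirtied_stamp` and `dirty_exceeded` need no explicit zeroing, which matches the note in the commit message. Callers that previously read `bdi->write_bandwidth` would read `bdi->wb.write_bandwidth` in this layout.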
  3. 29 May, 2015 4 commits
  4. 08 May, 2015 1 commit
  5. 11 Apr, 2015 3 commits
  6. 04 Mar, 2015 1 commit
    • f2fs: add core functions for rb-tree extent cache · 429511cd
      Chao Yu committed
      This patch adds core functions, including the slab cache init function and the
      init/lookup/update/shrink/destroy functions for the rb-tree based extent cache.
      
      Thanks to Jaegeuk Kim and Changman Lee, who gave many suggestions on the detailed
      design and implementation of the extent cache.
      
      Todo:
       * register rb-based extent cache shrink with mm shrink interface.
      
      v2:
       o move set_extent_info and __is_{extent,back,front}_mergeable into f2fs.h.
       o introduce __{attach,detach}_extent_node for code readability.
       o add cond_resched() when kmem_cache_alloc/radix_tree_insert fails.
       o fix some coding style and typo issues.
      
      v3:
       o fix oops due to using an unassigned pointer.
       o use list_del to remove extent node in shrink list.
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Changman Lee <cm224.lee@samsung.com>
      [Jaegeuk Kim: add static for some functions and declare in f2fs.h]
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
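The helpers named in the v2 notes of the extent cache commit (`set_extent_info` and `__is_{extent,back,front}_mergeable`) can be sketched in user space as follows. The helper names come from the commit message; the struct layout, field names, and merge rule (two extents merge when they are contiguous both in file offset and in block address) are assumptions made for illustration and may differ from the actual f2fs.h definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical extent descriptor: a run of file blocks mapped to disk blocks. */
struct extent_info {
	uint32_t fofs;  /* start offset in the file, in blocks */
	uint32_t blk;   /* start block address on disk */
	uint32_t len;   /* length in blocks */
};

static inline void set_extent_info(struct extent_info *ei,
				   uint32_t fofs, uint32_t blk, uint32_t len)
{
	ei->fofs = fofs;
	ei->blk = blk;
	ei->len = len;
}

/* True when 'front' starts exactly where 'back' ends, in file and on disk. */
static inline bool __is_extent_mergeable(const struct extent_info *back,
					 const struct extent_info *front)
{
	return back->fofs + back->len == front->fofs &&
	       back->blk + back->len == front->blk;
}

/* Can 'cur' be appended to the extent 'back' that precedes it? */
static inline bool __is_back_mergeable(const struct extent_info *cur,
				       const struct extent_info *back)
{
	return __is_extent_mergeable(back, cur);
}

/* Can 'cur' be prepended to the extent 'front' that follows it? */
static inline bool __is_front_mergeable(const struct extent_info *cur,
					const struct extent_info *front)
{
	return __is_extent_mergeable(cur, front);
}
```

A cache update would typically check both neighbours of a new extent with these predicates and merge when possible, inserting a new rb-tree node only otherwise.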
  7. 12 Feb, 2015 5 commits
    • f2fs: fix accessing wrong indexed data blocks · f1a3b98e
      Jaegeuk Kim committed
      This patch fixes the following test.
      
      This causes:
       attempt to access beyond end of device
       sdb2: rw=16384, want=14413962000, limit=16777216
      
      The reason is:
       - f2fs_write_begin
        - f2fs_convert_inline_inode returns -ENOSPC
        - f2fs_write_failed
         - truncate_blocks
          - truncate_partial_data_page
           - find_data_page
            - get_dnode_of_data returns wrong data index retrieved from inline_data
            - f2fs_submit_page_bio(wrong data index)
             - submit_bio(wrong data index)
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: check node page contents all the time · aaf96075
      Jaegeuk Kim committed
      In get_node_page, if the page is up-to-date, we assumed that the page was not
      reclaimed at all.
      However, it was sometimes reported that its contents were missing.
      So, just to be sure, let's check its mapping and contents.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: merge {invalidate,release}page for meta/node/data pages · 487261f3
      Chao Yu committed
      This patch merges the ->{invalidate,release}page functions for meta/node/data pages.
      
      After this, the duplicated code can be removed.
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: keep PagePrivate during releasepage · f68daeeb
      Jaegeuk Kim committed
      If PagePrivate is removed by releasepage, f2fs loses track of its dirty page count.
      
      E.g., try_to_release_page will not release a page when the page is dirty,
      but our releasepage still removes PagePrivate.
      
          [<ffffffff81188d75>] try_to_release_page+0x35/0x50
          [<ffffffff811996f9>] invalidate_inode_pages2_range+0x2f9/0x3b0
          [<ffffffffa02a7f54>] ? truncate_blocks+0x384/0x4d0 [f2fs]
          [<ffffffffa02b7583>] ? f2fs_direct_IO+0x283/0x290 [f2fs]
          [<ffffffffa02b7fb0>] ? get_data_block_fiemap+0x20/0x20 [f2fs]
          [<ffffffff8118aa53>] generic_file_direct_write+0x163/0x170
          [<ffffffff8118ad06>] __generic_file_write_iter+0x2a6/0x350
          [<ffffffff8118adef>] generic_file_write_iter+0x3f/0xb0
          [<ffffffff81203081>] new_sync_write+0x81/0xb0
          [<ffffffff81203837>] vfs_write+0xb7/0x1f0
          [<ffffffff81204459>] SyS_write+0x49/0xb0
          [<ffffffff817c286d>] system_call_fastpath+0x16/0x1b
      Reviewed-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: merge flags in struct f2fs_sb_info · caf0047e
      Chao Yu committed
      Currently, there are several flag-like variables, mostly of Boolean type, as shown below:
      
      struct f2fs_sb_info {
      ...
      	int s_dirty;
      	bool need_fsck;
      	bool s_closing;
      ...
      	bool por_doing;
      ...
      }
      
      This causes some issues:
      1. some space in f2fs_sb_info is wasted because the compiler adds alignment
         padding after the Boolean type variables.
      2. if we keep adding new flags to f2fs_sb_info, the structure will become
         messy.
      
      So in this patch, we try to:
      1. switch s_dirty to a Boolean type variable since it has only two states, 0/1.
      2. merge the s_dirty/need_fsck/s_closing/por_doing variables into s_flag.
      3. introduce an enum type which can indicate the different states of sbi.
      4. use the newly introduced universal interfaces is_sbi_flag_set/{set,clear}_sbi_flag
         to operate on flags for sbi.
      
      After that, the above issues will be fixed.
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
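The flag merge described in the f2fs_sb_info commit can be sketched roughly as below. The interface names `is_sbi_flag_set`, `set_sbi_flag` and `clear_sbi_flag` and the `s_flag` field come from the commit message; the enum constant names and the bitmask representation are assumptions made for this user-space illustration and are not taken from the patch itself.

```c
#include <assert.h>
#include <stdbool.h>

/* States of the superblock info; these constant names are hypothetical. */
enum sbi_flag {
	SBI_IS_DIRTY,   /* stands in for s_dirty */
	SBI_IS_CLOSE,   /* stands in for s_closing */
	SBI_NEED_FSCK,  /* stands in for need_fsck */
	SBI_POR_DOING,  /* stands in for por_doing */
	MAX_SBI_FLAG
};

struct f2fs_sb_info {
	unsigned long s_flag;  /* one bit per enum sbi_flag value (assumed layout) */
};

static inline bool is_sbi_flag_set(const struct f2fs_sb_info *sbi,
				   unsigned int type)
{
	assert(type < MAX_SBI_FLAG);
	return (sbi->s_flag & (1UL << type)) != 0;
}

static inline void set_sbi_flag(struct f2fs_sb_info *sbi, unsigned int type)
{
	assert(type < MAX_SBI_FLAG);
	sbi->s_flag |= 1UL << type;
}

static inline void clear_sbi_flag(struct f2fs_sb_info *sbi, unsigned int type)
{
	assert(type < MAX_SBI_FLAG);
	sbi->s_flag &= ~(1UL << type);
}
```

With this shape, adding a new state only requires a new enum constant before `MAX_SBI_FLAG`; the structure layout and the accessors stay unchanged.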
  8. 10 Jan, 2015 8 commits
  9. 09 Dec, 2014 1 commit
  10. 06 Dec, 2014 1 commit
    • f2fs: call radix_tree_preload before radix_tree_insert · 769ec6e5
      Jaegeuk Kim committed
      This patch tries to fix:
      
       BUG: using smp_processor_id() in preemptible [00000000] code: f2fs_gc-254:0/384
        (radix_tree_node_alloc+0x14/0x74) from [<c033d8a0>] (radix_tree_insert+0x110/0x200)
        (radix_tree_insert+0x110/0x200) from [<c02e8264>] (gc_data_segment+0x340/0x52c)
        (gc_data_segment+0x340/0x52c) from [<c02e8658>] (f2fs_gc+0x208/0x400)
        (f2fs_gc+0x208/0x400) from [<c02e8a98>] (gc_thread_func+0x248/0x28c)
        (gc_thread_func+0x248/0x28c) from [<c0139944>] (kthread+0xa0/0xac)
        (kthread+0xa0/0xac) from [<c0105ef8>] (ret_from_fork+0x14/0x3c)
      
      The reason is that f2fs calls radix_tree_insert with preemption enabled.
      So, before calling it, we need to call radix_tree_preload.
      
      Otherwise, we should use __GFP_WAIT for the radix tree, and use a mutex or
      semaphore to cover the radix tree operations.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  11. 04 Dec, 2014 2 commits
  12. 26 Nov, 2014 3 commits
  13. 20 Nov, 2014 2 commits
    • f2fs: submit bio for node blocks in the reclaim path · 27c6bd60
      Jaegeuk Kim committed
      If a node page is requested to be written during the reclaim path, we should
      submit the bio to avoid leaving its reclaim pending.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: introduce struct inode_management to wrap inner fields · 67298804
      Chao Yu committed
      Currently in f2fs, we have three inode caches: ORPHAN_INO, APPEND_INO and UPDATE_INO,
      and we manage the fields related to each inode cache type separately in
      struct f2fs_sb_info.
      This makes the code a bit messy, so this patch introduces a new struct
      inode_management to wrap the inner fields as follows, which makes the code neater.
      
      /* for inner inode cache management */
      struct inode_management {
      	struct radix_tree_root ino_root;	/* ino entry array */
      	spinlock_t ino_lock;			/* for ino entry lock */
      	struct list_head ino_list;		/* inode list head */
      	unsigned long ino_num;			/* number of entries */
      };
      
      struct f2fs_sb_info {
      	...
      	struct inode_management im[MAX_INO_ENTRY];      /* manage inode cache */
      	...
      }
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  14. 10 Nov, 2014 1 commit
  15. 07 Nov, 2014 1 commit
  16. 01 Oct, 2014 1 commit
    • f2fs: refactor flush_nat_entries to remove costly reorganizing ops · 309cc2b6
      Jaegeuk Kim committed
      Previously, f2fs tried to reorganize the dirty nat entries into multiple sets
      according to their nid ranges. This can improve the flushing of nat pages; however,
      if there are a lot of cached nat entries, it becomes a bottleneck.
      
      This patch introduces a new set management flow by removing the dirty nat list and
      adding a series of set operations when a nat entry becomes dirty.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  17. 24 Sep, 2014 2 commits
    • f2fs: use MAX_BIO_BLOCKS(sbi) · 90a893c7
      Jaegeuk Kim committed
      This patch cleans up a simple macro.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: fix conditions to remain recovery information in f2fs_sync_file · 88bd02c9
      Jaegeuk Kim committed
      This patch revisits all of the recovery information handled during f2fs_sync_file.
      
      In this patch, three pieces of information are used to make a decision.
      
      a) IS_CHECKPOINTED,	/* is it checkpointed before? */
      b) HAS_FSYNCED_INODE,	/* is the inode fsynced before? */
      c) HAS_LAST_FSYNC,	/* has the latest node fsync mark? */
      
      And, the scenarios for our rule are based on:
      
      [Term] F: fsync_mark, D: dentry_mark
      
      1. inode(x) | CP | inode(x) | dnode(F)
      2. inode(x) | CP | inode(F) | dnode(F)
      3. inode(x) | CP | dnode(F) | inode(x) | inode(F)
      4. inode(x) | CP | dnode(F) | inode(F)
      5. CP | inode(x) | dnode(F) | inode(DF)
      6. CP | inode(DF) | dnode(F)
      7. CP | dnode(F) | inode(DF)
      8. CP | dnode(F) | inode(x) | inode(DF)
      
      For example, #3, the three conditions should be changed as follows.
      
         inode(x) | CP | dnode(F) | inode(x) | inode(F)
      a)    x       o      o          o          o
      b)    x       x      x          x          o
      c)    x       o      o          x          o
      
      If f2fs_sync_file stops   ------^,
       it should write inode(F)    --------------^
      
      So, the need_inode_block_update should return true, since
       c) get_nat_flag(e, HAS_LAST_FSYNC), is false.
      
      For example, #8,
            CP | alloc | dnode(F) | inode(x) | inode(DF)
      a)    o      x        x          x          x
      b)    x               x          x          o
      c)    o               o          x          o
      
      If f2fs_sync_file stops   -------^,
       it should write inode(DF)    --------------^
      
      Note that the roll-forward policy should follow this rule, which means that
      if there are any missing blocks, we don't need to recover that inode.
      Signed-off-by: Huang Ying <ying.huang@intel.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
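The decision described in the f2fs_sync_file commit can be sketched as a small predicate over the three flags. The flag names and `get_nat_flag` come from the commit message, which also states that `need_inode_block_update` returns true when HAS_LAST_FSYNC is false. Everything else here is an assumption for illustration: the simplified signature (taking the entry directly), the `set_nat_flag` helper shape, the one-byte flag storage, and the way conditions a) and b) are combined.

```c
#include <assert.h>
#include <stdbool.h>

enum {
	IS_CHECKPOINTED,    /* a) is it checkpointed before? */
	HAS_FSYNCED_INODE,  /* b) is the inode fsynced before? */
	HAS_LAST_FSYNC,     /* c) has the latest node fsync mark? */
};

/* Hypothetical minimal nat entry: only the flag byte is modelled. */
struct nat_entry {
	unsigned char flag;
};

static inline void set_nat_flag(struct nat_entry *ne, unsigned int type, bool set)
{
	unsigned char mask = (unsigned char)(1u << type);

	if (set)
		ne->flag |= mask;
	else
		ne->flag &= (unsigned char)~mask;
}

static inline bool get_nat_flag(const struct nat_entry *ne, unsigned int type)
{
	return (ne->flag & (1u << type)) != 0;
}

/*
 * Assumed rule: the inode block can be skipped only if the latest node
 * carries the fsync mark (c) and the inode is already recoverable, i.e.
 * checkpointed (a) or fsynced before (b). Otherwise it must be written.
 */
static inline bool need_inode_block_update(const struct nat_entry *e)
{
	if (get_nat_flag(e, HAS_LAST_FSYNC) &&
	    (get_nat_flag(e, IS_CHECKPOINTED) ||
	     get_nat_flag(e, HAS_FSYNCED_INODE)))
		return false;
	return true;
}
```

For scenario #3 at the point where f2fs_sync_file stops, the entry has a) set and b), c) clear, so the predicate returns true and inode(F) gets written; once inode(F) is on disk and all three flags are set, it returns false.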