1. 10 Oct 2015, 1 commit
    • f2fs: do not skip dentry block writes · 90b803e6
      Committed by Jaegeuk Kim
      Previously, we skipped dentry block writes when wbc is SYNC_NONE, there is no
      memory pressure, and the number of dirty pages is pretty small.

      However, we did not skip normal data writes, so the skipping had little impact
      on overall performance.
      Moreover, because some writes are skipped, kworker can fall into an infinite
      loop trying to write blocks when many dir inodes have only one dentry block.

      So, this patch removes the skipping of these writes.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  2. 15 Aug 2015, 1 commit
  3. 12 Aug 2015, 1 commit
    • f2fs: remove inmem radix tree · decd36b6
      Committed by Chao Yu
      Previously, we used a radix tree to index all registered page entries for an
      atomic file, but now we only use the radix tree to see whether the current
      page is indexed or not, since the other user of the radix tree is gone in commit
      042b7816 ("f2fs: remove unnecessary call to invalidate inmemory pages").

      So in this patch, we use a more efficient way:
      introduce a macro ATOMIC_WRITTEN_PAGE and set it as the page private value
      to indicate page indexing status. This saves memory and lookup time.
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
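The idea of marking a page through its private value, instead of looking it up in a radix tree, can be sketched in userspace C as follows. The `struct page` here is a hypothetical stand-in with a single `private_data` field, the marker value is chosen only for illustration, and the helper names are assumptions; none of them are taken from the patch itself.

```c
#include <stdbool.h>

/* Illustrative marker value (an assumption); any value that cannot
 * collide with a real private pointer would do for this sketch. */
#define ATOMIC_WRITTEN_PAGE 0x0000ffffUL

/* Stand-in for the kernel's struct page: only the private field matters. */
struct page {
	unsigned long private_data;
};

/* Mark a page as registered for an atomic write. */
static void set_atomic_written(struct page *page)
{
	page->private_data = ATOMIC_WRITTEN_PAGE;
}

/* Clear the marker once the page is committed or dropped. */
static void clear_atomic_written(struct page *page)
{
	page->private_data = 0;
}

/* Constant-time check that needs no separate index structure. */
static bool is_atomic_written(const struct page *page)
{
	return page->private_data == ATOMIC_WRITTEN_PAGE;
}
```

The check is a single comparison on data the page already carries, which is where the memory and lookup savings described in the message would come from.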
  4. 05 Aug 2015, 1 commit
  5. 02 Jun 2015, 2 commits
    • writeback: separate out include/linux/backing-dev-defs.h · 66114cad
      Committed by Tejun Heo
      With the planned cgroup writeback support, backing-dev related
      declarations will be more widely used across block and cgroup;
      unfortunately, including backing-dev.h from include/linux/blkdev.h
      makes cyclic include dependency quite likely.
      
      This patch separates out backing-dev-defs.h which only has the
      essential definitions and updates blkdev.h to include it.  c files
      which need access to more backing-dev details now include
      backing-dev.h directly.  This takes backing-dev.h off the common
      include dependency chain making it a lot easier to use it across block
      and cgroup.
      
      v2: fs/fat build failure fixed.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Cc: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Jens Axboe <axboe@fb.com>
    • writeback: move bandwidth related fields from backing_dev_info into bdi_writeback · a88a341a
      Committed by Tejun Heo
      Currently, a bdi (backing_dev_info) embeds single wb (bdi_writeback)
      and the role of the separation is unclear.  For cgroup support for
      writeback IOs, a bdi will be updated to host multiple wb's where each
      wb serves writeback IOs of a different cgroup on the bdi.  To achieve
      that, a wb should carry all states necessary for servicing writeback
      IOs for a cgroup independently.
      
      This patch moves bandwidth related fields from backing_dev_info into
      bdi_writeback.
      
      * The moved fields are: bw_time_stamp, dirtied_stamp, written_stamp,
        write_bandwidth, avg_write_bandwidth, dirty_ratelimit,
        balanced_dirty_ratelimit, completions and dirty_exceeded.
      
      * writeback_chunk_size() and over_bground_thresh() now take @wb
        instead of @bdi.
      
      * bdi_writeout_fraction(bdi, ...)	-> wb_writeout_fraction(wb, ...)
        bdi_dirty_limit(bdi, ...)		-> wb_dirty_limit(wb, ...)
        bdi_position_ratio(bdi, ...)		-> wb_position_ratio(wb, ...)
        bdi_update_write_bandwidth(bdi, ...)	-> wb_update_write_bandwidth(wb, ...)
        [__]bdi_update_bandwidth(bdi, ...)	-> [__]wb_update_bandwidth(wb, ...)
        bdi_{max|min}_pause(bdi, ...)		-> wb_{max|min}_pause(wb, ...)
        bdi_dirty_limits(bdi, ...)		-> wb_dirty_limits(wb, ...)
      
      * Init/exits of the relocated fields are moved to bdi_wb_init/exit()
        respectively.  Note that explicit zeroing is dropped in the process
        as wb's are cleared in entirety anyway.
      
      * As there's still only one bdi_writeback per backing_dev_info, all
        uses of bdi->stat[] are mechanically replaced with bdi->wb.stat[]
        introducing no behavior changes.
      
      v2: Typo in description fixed as suggested by Jan.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Wu Fengguang <fengguang.wu@intel.com>
      Cc: Jaegeuk Kim <jaegeuk@kernel.org>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
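The shape of this change can be sketched with two simplified structures. This is only an illustration under stated assumptions: the structs carry a handful of the moved fields, the `bdi_wb_init()` signature is simplified, and `wb_write_bandwidth()` is a hypothetical accessor that does not appear in the patch.

```c
#include <string.h>

/* Simplified stand-in: per-writeback state, holding a few of the
 * bandwidth fields that the message lists as moved. */
struct bdi_writeback {
	unsigned long bw_time_stamp;
	unsigned long write_bandwidth;
	unsigned long avg_write_bandwidth;
	unsigned long dirty_ratelimit;
	int dirty_exceeded;
};

/* The bdi still embeds exactly one wb, so bdi->field becomes bdi->wb.field. */
struct backing_dev_info {
	struct bdi_writeback wb;
};

/* Clearing the whole wb makes explicit per-field zeroing unnecessary.
 * The init_bw parameter is an assumption made for this sketch. */
static void bdi_wb_init(struct bdi_writeback *wb, unsigned long init_bw)
{
	memset(wb, 0, sizeof(*wb));
	wb->write_bandwidth = init_bw;
	wb->avg_write_bandwidth = init_bw;
}

/* Hypothetical accessor: helpers now take the wb rather than the bdi. */
static unsigned long wb_write_bandwidth(const struct bdi_writeback *wb)
{
	return wb->write_bandwidth;
}
```

Because every helper takes a `struct bdi_writeback *`, a bdi could later host several wb's without the helpers changing again, which matches the motivation given in the message.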
  6. 29 May 2015, 1 commit
  7. 11 Apr 2015, 1 commit
  8. 12 Feb 2015, 3 commits
    • f2fs: use spinlock for segmap_lock instead of rwlock · 1a118ccf
      Committed by Chao Yu
      rwlock can provide better concurrency when there are many more readers than
      writers, because readers can hold the rwlock simultaneously.

      However, the segmap_lock rwlock in struct free_segmap_info now has only one
      reader, 'mount', via the call path below:
      ->f2fs_fill_super
        ->build_segment_manager
          ->build_dirty_segmap
            ->init_dirty_segmap
              ->find_next_inuse
                read_lock
                ...
                read_unlock
      
      Since there is no other reader for this lock, rwlock cannot improve our
      concurrency, so we do not need the rwlock_t type for segmap_lock. Let's replace
      it with the spinlock_t type.
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: avoid variable length array · 60a3b782
      Committed by Jaegeuk Kim
      Instead of using variable length arrays, this patch preallocates memory for
      them.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: merge flags in struct f2fs_sb_info · caf0047e
      Committed by Chao Yu
      Currently, there are several variables with Boolean type as below:
      
      struct f2fs_sb_info {
      ...
      	int s_dirty;
      	bool need_fsck;
      	bool s_closing;
      ...
      	bool por_doing;
      ...
      }
      
      This has some issues:
      1. some space in f2fs_sb_info is wasted because the compiler adds alignment
         padding after the Boolean variables.
      2. if we keep adding new flags to f2fs_sb_info, the structure will become
         messy.

      So in this patch, we:
      1. switch s_dirty to a Boolean variable, since it has only two states, 0/1.
      2. merge the s_dirty/need_fsck/s_closing/por_doing variables into s_flag.
      3. introduce an enum type that indicates the different states of sbi.
      4. use the newly introduced universal interfaces
         is_sbi_flag_set/{set,clear}_sbi_flag to operate on sbi flags.

      After that, the above issues are fixed.
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
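A minimal userspace sketch of the merged-flag scheme might look like the following. The enum constant names and the reduced `struct f2fs_sb_info` are assumptions made for illustration; only `s_flag`, `is_sbi_flag_set`, `set_sbi_flag` and `clear_sbi_flag` are named in the message.

```c
#include <stdbool.h>

/* One bit index per former variable; these constant names are assumed. */
enum sbi_flag {
	SBI_IS_DIRTY,   /* was s_dirty */
	SBI_NEED_FSCK,  /* was need_fsck */
	SBI_IS_CLOSE,   /* was s_closing */
	SBI_POR_DOING,  /* was por_doing */
};

/* Reduced stand-in: the real structure has many more members. */
struct f2fs_sb_info {
	unsigned long s_flag;   /* all state bits packed into one word */
};

static bool is_sbi_flag_set(const struct f2fs_sb_info *sbi, unsigned int type)
{
	return (sbi->s_flag >> type) & 1UL;
}

static void set_sbi_flag(struct f2fs_sb_info *sbi, unsigned int type)
{
	sbi->s_flag |= 1UL << type;
}

static void clear_sbi_flag(struct f2fs_sb_info *sbi, unsigned int type)
{
	sbi->s_flag &= ~(1UL << type);
}
```

Adding a new state then only means adding an enum constant, without growing the structure or introducing new padding.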
  9. 10 Nov 2014, 1 commit
  10. 04 Nov 2014, 1 commit
  11. 07 Oct 2014, 1 commit
    • f2fs: support atomic writes · 88b88a66
      Committed by Jaegeuk Kim
      This patch introduces a very limited functionality for atomic write support.
      In order to support atomic write, this patch adds two ioctls:
       o F2FS_IOC_START_ATOMIC_WRITE
       o F2FS_IOC_COMMIT_ATOMIC_WRITE
      
      The database engine should be aware of the following sequence.
      1. open
       -> ioctl(F2FS_IOC_START_ATOMIC_WRITE);
      2. writes
        : all the written data will be treated as atomic pages.
      3. commit
       -> ioctl(F2FS_IOC_COMMIT_ATOMIC_WRITE);
        : this flushes all the data blocks to the disk, which the f2fs recovery
        procedure will expose as all or nothing.
      4. repeat from #2.

      The IO patterns should be:
      
        ,- START_ATOMIC_WRITE                  ,- COMMIT_ATOMIC_WRITE
       CP | D D D D D D | FSYNC | D D D D | FSYNC ...
                            `- COMMIT_ATOMIC_WRITE
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
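From the database engine's side, one round of the open / start / write / commit sequence could be sketched as below. The ioctl numbers are assumptions written out so the example is self-contained; they should be checked against the f2fs header of the kernel in use. `atomic_write_file()` is a hypothetical helper, and on a filesystem without these ioctls it simply returns an error.

```c
#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumed ioctl definitions; verify against your kernel's f2fs header. */
#define F2FS_IOCTL_MAGIC             0xf5
#define F2FS_IOC_START_ATOMIC_WRITE  _IO(F2FS_IOCTL_MAGIC, 1)
#define F2FS_IOC_COMMIT_ATOMIC_WRITE _IO(F2FS_IOCTL_MAGIC, 2)

/* Hypothetical helper: write buf to path as one atomic unit.
 * Returns 0 on success and -1 on any failure. */
static int atomic_write_file(const char *path, const void *buf, size_t len)
{
	/* 1. open, then tell f2fs that the following writes are atomic */
	int fd = open(path, O_RDWR | O_CREAT, 0644);

	if (fd < 0)
		return -1;
	if (ioctl(fd, F2FS_IOC_START_ATOMIC_WRITE) < 0)
		goto fail;

	/* 2. writes: the data is treated as atomic pages */
	if (write(fd, buf, len) != (ssize_t)len)
		goto fail;

	/* 3. commit: flush the data blocks as an all-or-nothing unit */
	if (ioctl(fd, F2FS_IOC_COMMIT_ATOMIC_WRITE) < 0)
		goto fail;

	return close(fd);
fail:
	close(fd);
	return -1;
}
```

A long-running engine would keep the descriptor open and repeat steps 2 and 3 rather than reopening the file for every transaction.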
  12. 01 Oct 2014, 2 commits
  13. 24 Sep 2014, 5 commits
  14. 16 Sep 2014, 1 commit
    • f2fs: give an option to enable in-place-updates during fsync to users · c1ce1b02
      Committed by Jaegeuk Kim
      If the user writes F2FS_IPU_FSYNC:4 to /sys/fs/f2fs/ipu_policy, only
      f2fs_sync_file tries in-place-updates.
      If the number of dirty pages is over /sys/fs/f2fs/min_fsync_blocks, it
      keeps the out-of-place manner. Otherwise, it triggers in-place-updates.
      
      This may be used by storage showing very high random write performance.
      
      For example, it can be used when,
      
      Seq. writes (Data) + wait + Seq. writes (Node)
      
      is pretty much slower than,
      
      Rand. writes (Data)
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
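The fsync decision described above can be sketched as a small predicate. This assumes that the policy is compared as a plain value of 4 and that the alternative to an in-place update is the usual f2fs out-of-place write; `fsync_uses_ipu()` is a hypothetical name, not a function from the patch.

```c
#include <stdbool.h>

#define F2FS_IPU_FSYNC 4   /* policy value named in the commit message */

/* Hypothetical helper: should this fsync update blocks in place? */
static bool fsync_uses_ipu(unsigned int ipu_policy,
			   unsigned int dirty_pages,
			   unsigned int min_fsync_blocks)
{
	/* Only the fsync policy enables this path. */
	if (ipu_policy != F2FS_IPU_FSYNC)
		return false;

	/* Over the threshold: keep the default write path. */
	return dirty_pages <= min_fsync_blocks;
}
```

With `min_fsync_blocks` set to 8, for example, an fsync with 8 dirty pages would update in place while one with 9 would not.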
  15. 10 Sep 2014, 3 commits
    • f2fs: refactor flush_sit_entries codes for reducing SIT writes · 184a5cd2
      Committed by Chao Yu
      In commit aec71382 ("f2fs: refactor flush_nat_entries codes for reducing NAT
      writes"), we described the issue as below:
      
      "Although building NAT journal in cursum reduce the read/write work for NAT
      block, but previous design leave us lower performance when write checkpoint
      frequently for these cases:
      1. if journal in cursum has already full, it's a bit of waste that we flush all
         nat entries to page for persistence, but not to cache any entries.
      2. if journal in cursum is not full, we fill nat entries to journal until
         journal is full, then flush the left dirty entries to disk without merge
         journaled entries, so these journaled entries may be flushed to disk at next
         checkpoint but lost chance to flushed last time."
      
      Actually, we have the same problem in using SIT journal area.
      
      In this patch, we first update the sit journal with as many dirty entries as
      possible. Second, if there is no space in the sit journal, we remove all
      entries from the journal and walk through the whole dirty entry bitmap of sit,
      accounting dirty sit entries located in the same SIT block to a sit entry set.
      All entry sets are linked to the list sit_entry_set in sm_info, sorted in
      ascending order by the count of entries in each set. Later we flush entries
      from the sets with the fewest entries into the journal, as many as we can, and
      then flush the dense sets with merged entries to disk.

      In this way we can use the sit journal area more effectively and reduce SIT
      updates, which improves performance and extends the lifetime of the flash
      device.

      In my testing environment, this patch clearly reduces SIT block updates.
      
      virtual machine + hard disk:
      fsstress -p 20 -n 400 -l 5
      		sit page num	cp count	sit pages/cp
      based		2006.50		1349.75		1.486
      patched		1566.25		1463.25		1.070
      
      Our latency of merging op is small when handling a great number of dirty SIT
      entries in flush_sit_entries:
      latency(ns)	dirty sit count
      36038		2151
      49168		2123
      37174		2232
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
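The grouping and ordering step can be sketched in userspace C as follows. This is a simplified illustration, not the kernel code: the two capacity constants are illustrative values, dirty segment numbers are assumed to be unique, sets are kept in a fixed array instead of a linked list, and `flush_sit_sets()` is a hypothetical name.

```c
#include <stdlib.h>

#define ENTRIES_PER_BLOCK 55   /* illustrative: SIT entries per SIT block */
#define JOURNAL_CAPACITY   6   /* illustrative: entries the journal holds */
#define MAX_SETS          64   /* fixed bound, for this sketch only */

struct sit_entry_set {
	unsigned int block;    /* SIT block the dirty entries belong to */
	unsigned int count;    /* number of dirty entries in that block */
};

/* Order sets by ascending entry count. */
static int cmp_set(const void *a, const void *b)
{
	const struct sit_entry_set *x = a, *y = b;

	return (x->count > y->count) - (x->count < y->count);
}

/*
 * Group dirty segment numbers by SIT block, put the smallest sets into the
 * journal while they fit, and count every remaining set as one block write.
 * Returns the number of SIT blocks written, or -1 if MAX_SETS is exceeded.
 */
static int flush_sit_sets(const unsigned int *dirty_segnos, size_t n,
			  unsigned int *journaled)
{
	struct sit_entry_set sets[MAX_SETS];
	size_t nsets = 0, i, j;
	unsigned int space = JOURNAL_CAPACITY;
	int blocks_written = 0;

	for (i = 0; i < n; i++) {
		unsigned int block = dirty_segnos[i] / ENTRIES_PER_BLOCK;

		/* Find the set for this block, or create it. */
		for (j = 0; j < nsets && sets[j].block != block; j++)
			;
		if (j == nsets) {
			if (nsets == MAX_SETS)
				return -1;
			sets[nsets].block = block;
			sets[nsets].count = 0;
			nsets++;
		}
		sets[j].count++;
	}

	qsort(sets, nsets, sizeof(sets[0]), cmp_set);

	*journaled = 0;
	for (i = 0; i < nsets; i++) {
		if (sets[i].count <= space) {
			space -= sets[i].count;       /* sparse set: journal */
			*journaled += sets[i].count;
		} else {
			blocks_written++;             /* dense set: one block */
		}
	}
	return blocks_written;
}
```

Journaling the sparse sets first means each block that does get written carries as many merged entries as possible, which is the effect the measurements above attribute to the patch.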
    • f2fs: remove unneeded sit_i in macro SIT_BLOCK_OFFSET/START_SEGNO · d3a14afd
      Committed by Chao Yu
      sit_i in macro SIT_BLOCK_OFFSET/START_SEGNO is not used, remove it.
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: add BUG cases to initiate fsck.f2fs · 05796763
      Committed by Jaegeuk Kim
      This patch replaces BUG cases with f2fs_bug_on so that information for
      fsck.f2fs remains. It also implements some void functions to initiate
      fsck.f2fs.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  16. 04 Sep 2014, 1 commit
  17. 20 Aug 2014, 1 commit
  18. 31 Jul 2014, 1 commit
  19. 16 Jul 2014, 1 commit
  20. 20 Mar 2014, 1 commit
  21. 18 Mar 2014, 2 commits
  22. 24 Feb 2014, 1 commit
  23. 17 Feb 2014, 1 commit
  24. 20 Jan 2014, 1 commit
  25. 23 Dec 2013, 5 commits
    • f2fs: introduce sysfs entry to control in-place-update policy · 216fbd64
      Committed by Jaegeuk Kim
      This patch introduces new sysfs entries for users to control the policy of
      in-place-updates, namely IPU, in f2fs.
      
      Sometimes f2fs suffers from performance degradation due to its out-of-place
      update policy that produces many additional node block writes.
      If the storage performance depends heavily on the amount of data written
      rather than on IO patterns, we'd better drop this out-of-place update policy.

      This patch suggests 5 policies and their triggering conditions as follows.

      [sysfs entry name = ipu_policy]

      0: F2FS_IPU_FORCE       all the time,
      1: F2FS_IPU_SSR         if SSR mode is activated,
      2: F2FS_IPU_UTIL        if FS utilization is over threshold,
      3: F2FS_IPU_SSR_UTIL    if SSR mode is activated and FS utilization is over
                              threshold,
      4: F2FS_IPU_DISABLE     disable IPU. (=default option)
      
      [sysfs entry name = min_ipu_util]
      
      This parameter controls the threshold to trigger in-place-updates.
      The number indicates percentage of the filesystem utilization, and used by
      F2FS_IPU_UTIL and F2FS_IPU_SSR_UTIL policies.
      
      For more details, see need_inplace_update() in segment.h.
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
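The five policies and their triggering conditions can be sketched as a single switch. This is a simplified stand-in named after need_inplace_update(): the `struct ipu_state` input and the function signature are assumptions made so the example is self-contained, and "over threshold" is read here as strictly greater than `min_ipu_util`.

```c
#include <stdbool.h>

enum {
	F2FS_IPU_FORCE,     /* 0: all the time */
	F2FS_IPU_SSR,       /* 1: if SSR mode is activated */
	F2FS_IPU_UTIL,      /* 2: if FS utilization is over threshold */
	F2FS_IPU_SSR_UTIL,  /* 3: both of the above */
	F2FS_IPU_DISABLE,   /* 4: disable IPU (default) */
};

/* Hypothetical inputs; in the kernel these come from filesystem state. */
struct ipu_state {
	int policy;                 /* value written to ipu_policy */
	bool ssr_active;            /* is SSR mode activated? */
	unsigned int util;          /* filesystem utilization, in percent */
	unsigned int min_ipu_util;  /* value written to min_ipu_util */
};

static bool need_inplace_update(const struct ipu_state *s)
{
	switch (s->policy) {
	case F2FS_IPU_FORCE:
		return true;
	case F2FS_IPU_SSR:
		return s->ssr_active;
	case F2FS_IPU_UTIL:
		return s->util > s->min_ipu_util;
	case F2FS_IPU_SSR_UTIL:
		return s->ssr_active && s->util > s->min_ipu_util;
	case F2FS_IPU_DISABLE:
	default:
		return false;
	}
}
```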
    • f2fs: add unlikely() macro for compiler optimization · cfb271d4
      Committed by Chao Yu
      As we know, some of our branch conditions will rarely be true. So we can add
      'unlikely' to let the compiler optimize this code; this way we can drop
      unneeded 'jump' assembly code to improve performance.

      change log:
       o add *unlikely* in as many places as possible across all source files at
         once, as suggested by Jaegeuk Kim.
      Suggested-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
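For readers unfamiliar with the hint, a minimal userspace sketch is shown below. It relies on the GCC/Clang builtin `__builtin_expect`; `read_block()` is a hypothetical function used only to show where the hint goes.

```c
#include <stddef.h>

/* Branch hint: tell the compiler the condition is expected to be false. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical example: the bounds error is expected to be rare. */
static int read_block(const char *buf, size_t len, size_t idx)
{
	if (unlikely(idx >= len))
		return -1;      /* cold path, can be laid out out of line */
	return buf[idx];        /* hot path, falls straight through */
}
```

The hint changes only code layout and prediction, not the result, so a wrongly placed `unlikely` costs performance rather than correctness.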
    • f2fs: remove the own bi_private allocation · 187b5b8b
      Committed by Jaegeuk Kim
      Previously f2fs always allocated its own bi_private data structure, even
      when it was not used. Can we remove this bi_private allocation?

      This patch removes the additional bi_private allocation.

      1. Retrieve f2fs_sb_info from its page->mapping->host->i_sb.
       - This removes the use cases of bi_private in end_io.

      2. Use bi_private only when we really need it.
       - The bi_private is used only when the checkpoint procedure is conducted.
       - When conducting the checkpoint, f2fs submits a META_FLUSH bio and waits
         for its completion.
       - Since nothing else depends on bi_private now, let's just use the
         bi_private pointer as the completion pointer.
      Reviewed-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
    • f2fs: correct type of wait in struct bio_private · aac44046
      Committed by Chao Yu
      The void *wait in bio_private is used to wait for completion of the checkpoint
      bio, so there is no need to type it as void *; declare it as a completion type
      instead.
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      [Jaegeuk Kim: add description]
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
    • f2fs: bug fix on bit overflow from 32bits to 64bits · f9a4e6df
      Committed by Jaegeuk Kim
      This patch fixes some bit overflows by the shift operations.
      
      Dan Carpenter reported potential bugs on bit overflows as follows.
      
      fs/f2fs/segment.c:910 submit_write_page()
      	warn: should 'blk_addr << ((sbi)->log_blocksize - 9)' be a 64 bit type?
      fs/f2fs/checkpoint.c:429 get_valid_checkpoint()
      	warn: should '1 << ()' be a 64 bit type?
      fs/f2fs/data.c:408 f2fs_readpage()
      	warn: should 'blk_addr << ((sbi)->log_blocksize - 9)' be a 64 bit type?
      fs/f2fs/data.c:457 submit_read_page()
      	warn: should 'blk_addr << ((sbi)->log_blocksize - 9)' be a 64 bit type?
      fs/f2fs/data.c:525 get_data_block_ro()
      	warn: should 'i << blkbits' be a 64 bit type?
      Bug-Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
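The class of bug in these warnings can be reproduced in a few lines of userspace C. The two function names are hypothetical and the types are assumptions chosen to mirror the `blk_addr << ((sbi)->log_blocksize - 9)` expression; the point is only that the shift must happen after widening to 64 bits.

```c
#include <stdint.h>

/* Problematic form: the shift is evaluated in 32 bits, so bits shifted
 * past bit 31 are lost before the result is widened. */
static uint64_t sector_of_block_32(uint32_t blk_addr, unsigned int log_blocksize)
{
	return blk_addr << (log_blocksize - 9);
}

/* Fixed form: widen to 64 bits first, then shift. */
static uint64_t sector_of_block_64(uint32_t blk_addr, unsigned int log_blocksize)
{
	return (uint64_t)blk_addr << (log_blocksize - 9);
}
```

With 4 KiB blocks (`log_blocksize` of 12) the shift is 3, so a block address of 0x20000000 wraps to sector 0 in the first form, while the second yields 0x100000000.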