1. 07 May, 2014 5 commits
    • G
      f2fs: add the flush_merge handle in the remount flow · 876dc59e
      Gu Zheng committed
      Handle the flush_merge option in the *remount* path so that users can
      enable flush_merge at runtime, for example when the underlying device
      handles the cache_flush command relatively slowly.
      Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
      876dc59e
    • J
      f2fs: avoid to conduct roll-forward due to the remained garbage blocks · 1e87a78d
      Jaegeuk Kim committed
      During roll-forward recovery, f2fs always scans the next chain of direct
      node blocks. However, garbage blocks can remain in that chain when discard
      is not supported or when SSR is triggered.
      This occasionally leads to recovering wrong inodes that were already in use,
      or to BUG_ONs caused by reallocating node ids, as follows.
      
      When mounting this f2fs image:
      http://linuxtesting.org/downloads/f2fs_fault_image.zip
      a BUG_ON is triggered in the f2fs driver (the messages below were generated on
      kernel 3.13.2; output on other kernels is similar):
      
      kernel BUG at fs/f2fs/node.c:215!
       Call Trace:
       [<ffffffffa032ebad>] recover_inode_page+0x1fd/0x3e0 [f2fs]
       [<ffffffff811446e7>] ? __lock_page+0x67/0x70
       [<ffffffff81089990>] ? autoremove_wake_function+0x50/0x50
       [<ffffffffa0337788>] recover_fsync_data+0x1398/0x15d0 [f2fs]
       [<ffffffff812b9e5c>] ? selinux_d_instantiate+0x1c/0x20
       [<ffffffff811cb20b>] ? d_instantiate+0x5b/0x80
       [<ffffffffa0321044>] f2fs_fill_super+0xb04/0xbf0 [f2fs]
       [<ffffffff811b861e>] ? mount_bdev+0x7e/0x210
       [<ffffffff811b8769>] mount_bdev+0x1c9/0x210
       [<ffffffffa0320540>] ? validate_superblock+0x210/0x210 [f2fs]
       [<ffffffffa031cf8d>] f2fs_mount+0x1d/0x30 [f2fs]
       [<ffffffff811b9497>] mount_fs+0x47/0x1c0
       [<ffffffff81166e00>] ? __alloc_percpu+0x10/0x20
       [<ffffffff811d4032>] vfs_kern_mount+0x72/0x110
       [<ffffffff811d6763>] do_mount+0x493/0x910
       [<ffffffff811615cb>] ? strndup_user+0x5b/0x80
       [<ffffffff811d6c70>] SyS_mount+0x90/0xe0
       [<ffffffff8166f8d9>] system_call_fastpath+0x16/0x1b
      
      Found by Linux File System Verification project (linuxtesting.org).
      Reported-by: Andrey Tsyvarev <tsyvarev@ispras.ru>
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
      1e87a78d
    • G
      f2fs: enable flush_merge only in f2fs is not read-only · b270ad6f
      Gu Zheng committed
      Enable flush_merge only when f2fs is not mounted read-only, and show the
      mount option only in that case as well.
      Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
      b270ad6f
    • G
      197d4647
    • G
      f2fs: put the bio when issue_flush completed · a4ed23f2
      Gu Zheng committed
      Put the bio once the flush command has been issued. This also fixes the
      following kmemleak report:
      unreferenced object 0xffff8800270c73c0 (size 200):
        comm "f2fs_flush-7:0", pid 27161, jiffies 4312127988 (age 988.503s)
        hex dump (first 32 bytes):
          00 00 00 00 00 00 00 00 40 07 81 19 01 88 ff ff  ........@.......
          01 00 00 00 00 00 00 f0 11 14 00 00 00 00 00 00  ................
        backtrace:
          [<ffffffff81559866>] kmemleak_alloc+0x72/0x96
          [<ffffffff81156f7e>] slab_post_alloc_hook+0x28/0x2a
          [<ffffffff811595b1>] kmem_cache_alloc+0xec/0x157
          [<ffffffff8111924d>] mempool_alloc_slab+0x15/0x17
          [<ffffffff81119513>] mempool_alloc+0x71/0x138
          [<ffffffff81193548>] bio_alloc_bioset+0x93/0x18c
          [<ffffffffa040f857>] issue_flush_thread+0x8d/0x145 [f2fs]
          [<ffffffff8107ac16>] kthread+0xba/0xc2
          [<ffffffff81571b2c>] ret_from_fork+0x7c/0xb0
          [<ffffffffffffffff>] 0xffffffffffffffff
      Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
      a4ed23f2
  2. 07 Apr, 2014 1 commit
    • J
      f2fs: introduce f2fs_issue_flush to avoid redundant flush issue · 6b4afdd7
      Jaegeuk Kim committed
      Some storage devices show relatively high latencies when completing cache_flush
      commands, even though their normal IO speed is fairly high. In such a case,
      f2fs needs to merge cache_flush commands as much as possible to avoid
      issuing them redundantly.
      So, this patch introduces a mount option, "-o flush_merge", to mitigate this
      overhead.

      If this option is enabled by the user, F2FS merges the cache_flush commands and
      then issues just one cache_flush on behalf of them. Once the single command is
      finished, F2FS sends a completion signal to all the pending threads.

      Note that this option is intended for workloads with very intensive concurrent
      fsync calls on storage that handles cache_flush commands slowly.
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
      6b4afdd7
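The merging described in the commit message above can be modelled in user space. The following is a minimal single-threaded sketch, not the kernel implementation: `flush_waiter`, `flush_queue`, `queue_flush()` and `run_merged_flush()` are hypothetical names, there is no locking, and the device flush is replaced by a counter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: one waiter per fsync caller that needs a cache flush. */
struct flush_waiter {
	int done;                     /* set once the merged flush completed */
	struct flush_waiter *next;
};

/* Hypothetical: pending waiters plus a counter standing in for the device. */
struct flush_queue {
	struct flush_waiter *head;
	unsigned flushes_issued;
};

/* Register a caller that wants its data flushed. */
static void queue_flush(struct flush_queue *q, struct flush_waiter *w)
{
	assert(q != NULL && w != NULL);
	w->done = 0;
	w->next = q->head;
	q->head = w;
}

/*
 * Detach every pending waiter, issue ONE flush on behalf of all of them,
 * then mark each waiter as completed. Returns the number of waiters served.
 */
static unsigned run_merged_flush(struct flush_queue *q)
{
	struct flush_waiter *w, *next;
	unsigned completed = 0;

	assert(q != NULL);
	w = q->head;
	if (w == NULL)
		return 0;             /* nothing pending, no flush needed */

	q->head = NULL;               /* take the whole batch */
	q->flushes_issued++;          /* stands in for a single cache_flush */

	for (; w != NULL; w = next) {
		next = w->next;
		w->next = NULL;
		w->done = 1;          /* stands in for the completion signal */
		completed++;
	}
	return completed;
}
```

With three callers queued before the worker runs, this model issues one flush instead of three. A real implementation would also need synchronization between the callers and the flushing thread.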
  3. 02 Apr, 2014 2 commits
  4. 01 Apr, 2014 1 commit
  5. 20 Mar, 2014 1 commit
  6. 18 Mar, 2014 1 commit
  7. 10 Mar, 2014 2 commits
  8. 17 Feb, 2014 3 commits
  9. 22 Jan, 2014 1 commit
  10. 20 Jan, 2014 1 commit
  11. 14 Jan, 2014 1 commit
  12. 08 Jan, 2014 1 commit
    • J
      f2fs: improve write performance under frequent fsync calls · fb5566da
      Jaegeuk Kim committed
      When considering a bunch of data writes with very frequent fsync calls, we
      can expect the following performance regression.
      
      N: Node IO, D: Data IO, IO scheduler: cfq
      
      Issue    pending IOs
      	 D1 D2 D3 D4
       D1         D2 D3 D4 N1
       D2            D3 D4 N1 N2
       N1            D3 D4 N2 D1
       --> N1 can be selected by cfq because N and D have the same priority.
           Then D3 and D4 would be delayed, resulting in performance degradation.
      
      So, when processing the fsync call, it is better to give higher priority to
      data IOs than node IOs by assigning WRITE and WRITE_SYNC respectively.
      This patch improves random write performance with frequent fsync calls by up
      to 10%.
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
      fb5566da
  13. 23 Dec, 2013 20 commits
    • G
      f2fs: remove the rw_flag domain from f2fs_io_info · 7e8f2308
      Gu Zheng committed
      When using f2fs_io_info at the low level, we still need to merge rw and
      rw_flag, so use rw to hold all the IO flags directly and remove the
      rw_flag field.

      P.S. This is based on the previous patch:
      f2fs: move all the bio initialization into __bio_alloc
      Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
      7e8f2308
    • J
      f2fs: introduce a new direct_IO write path · bfad7c2d
      Jaegeuk Kim committed
      Previously, f2fs did not support high-performance direct IO: it sent every
      write request through the buffered write path, resulting in significant
      performance degradation due to memory operations such as copy_from_user.

      This patch introduces a new direct IO path in which every write request is
      processed by the generic blockdev_direct_IO() with an enhanced get_block function.

      The get_data_block() in f2fs handles:
      1. if the original data blocks are allocated, then give them to blockdev.
      2. otherwise,
        a. preallocate requested block addresses
        b. do not use extent cache for better performance
        c. give the block addresses to blockdev

      This policy implies that:
      - newly allocated data are sequentially written to the disk
      - updated data are randomly written to the disk.
      - f2fs gives consistency on its file meta, not file data.
      Reviewed-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
      bfad7c2d
    • J
      f2fs: introduce sysfs entry to control in-place-update policy · 216fbd64
      Jaegeuk Kim committed
      This patch introduces new sysfs entries for users to control the policy of
      in-place-updates, namely IPU, in f2fs.

      Sometimes f2fs suffers from performance degradation due to its out-of-place
      update policy, which produces many additional node block writes.
      If the storage performance depends heavily on the amount of data written
      rather than on IO patterns, we'd better drop this out-of-place update policy.

      This patch suggests 5 policies and their triggering conditions as follows.

      [sysfs entry name = ipu_policy]

      0: F2FS_IPU_FORCE       all the time,
      1: F2FS_IPU_SSR         if SSR mode is activated,
      2: F2FS_IPU_UTIL        if FS utilization is over threshold,
      3: F2FS_IPU_SSR_UTIL    if SSR mode is activated and FS utilization is over
                              threshold,
      4: F2FS_IPU_DISABLE     disable IPU. (=default option)

      [sysfs entry name = min_ipu_util]

      This parameter controls the threshold to trigger in-place-updates.
      The number indicates the percentage of filesystem utilization, and is used by
      the F2FS_IPU_UTIL and F2FS_IPU_SSR_UTIL policies.

      For more details, see need_inplace_update() in segment.h.
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
      216fbd64
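The five policies listed in the commit message above amount to a small decision table. The sketch below is only an illustration of that table, assuming that "over threshold" means strictly greater than `min_ipu_util`. The function name `want_inplace_update()` and its parameters are hypothetical and do not mirror the signature of the kernel's `need_inplace_update()`.

```c
#include <assert.h>
#include <stdbool.h>

/* Values follow the ipu_policy numbering quoted in the commit message. */
enum ipu_policy {
	F2FS_IPU_FORCE = 0,
	F2FS_IPU_SSR = 1,
	F2FS_IPU_UTIL = 2,
	F2FS_IPU_SSR_UTIL = 3,
	F2FS_IPU_DISABLE = 4,
};

/*
 * Hypothetical helper: decide whether a block should be updated in place.
 *   ssr_active    whether SSR mode is currently activated
 *   util_percent  filesystem utilization, 0..100
 *   min_ipu_util  threshold taken from the min_ipu_util sysfs entry
 */
static bool want_inplace_update(enum ipu_policy policy, bool ssr_active,
				int util_percent, int min_ipu_util)
{
	assert(util_percent >= 0 && util_percent <= 100);

	switch (policy) {
	case F2FS_IPU_FORCE:
		return true;
	case F2FS_IPU_SSR:
		return ssr_active;
	case F2FS_IPU_UTIL:
		return util_percent > min_ipu_util;
	case F2FS_IPU_SSR_UTIL:
		return ssr_active && util_percent > min_ipu_util;
	case F2FS_IPU_DISABLE:
	default:
		return false;
	}
}
```

For instance, with `min_ipu_util` set to 70, policy 2 would choose in-place updates at 80% utilization but not at 50%, while policy 3 would additionally require SSR mode to be active.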
    • J
      f2fs: refactor bio->rw handling · 458e6197
      Jaegeuk Kim committed
      This patch introduces f2fs_io_info to simplify the complex parameter list.
      
      struct f2fs_io_info {
      	enum page_type type;		/* contains DATA/NODE/META/META_FLUSH */
      	int rw;				/* contains R/RS/W/WS */
      	int rw_flag;			/* contains REQ_META/REQ_PRIO */
      };
      
      1. f2fs_write_data_pages
       - DATA
       - WRITE_SYNC is set when wbc->sync_mode is WB_SYNC_ALL.
      
      2. sync_node_pages
       - NODE
       - WRITE_SYNC all the time
      
      3. sync_meta_pages
       - META
       - WRITE_SYNC all the time
       - REQ_META | REQ_PRIO all the time
      
       ** f2fs_submit_merged_bio() handles META_FLUSH.
      
      4. ra_nat_pages, ra_sit_pages, ra_sum_pages
       - META
       - READ_SYNC
      
      Cc: Fan Li <fanofcode.li@samsung.com>
      Cc: Changman Lee <cm224.lee@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
      458e6197
    • F
      f2fs: merge pages with the same sync_mode flag · 63a0b7cb
      Fan Li committed
      Previously, f2fs submitted most write requests using WRITE_SYNC, but
      f2fs_write_data_pages submitted the last write requests using the sync_mode
      flag passed by the caller.

      This causes a performance problem, since contiguous pages with different sync
      flags can't be merged in the cfq IO scheduler (thanks to Yu Chao for pointing
      this out), and synchronous requests often take more time.

      This patch makes the following changes to DATA writebacks:

      1. every page will be written back using the sync mode the caller passes.
      2. only pages with the same sync mode can be merged in one bio request.

      These changes are restricted to DATA pages. Other types of writeback are
      modified to remain synchronous.

      In my test with tiotest, f2fs sequential write performance improved by about
      7%-10%, and this patch has no obvious impact on other performance tests.
      Signed-off-by: Fan Li <fanofcode.li@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
      63a0b7cb
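The merge rule in point 2 of the commit message above can be illustrated with a small predicate. This is a sketch under stated assumptions rather than the code from the patch: `pending_bio`, `can_merge()` and the `SKETCH_*` constants are hypothetical, and the contiguity check assumes a page can only join a bio whose last block directly precedes it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's WRITE and WRITE_SYNC values. */
enum sketch_rw { SKETCH_WRITE = 1, SKETCH_WRITE_SYNC = 2 };

/* Hypothetical view of the bio currently being built. */
struct pending_bio {
	bool     active;      /* is there a bio under construction? */
	uint32_t last_block;  /* last block address already in the bio */
	int      rw;          /* sync mode the bio was started with */
};

/*
 * A page may join the pending bio only if it is the next block on disk
 * AND it is written with the same sync mode as the bio.
 */
static bool can_merge(const struct pending_bio *bio, uint32_t blk_addr, int rw)
{
	assert(bio != NULL);
	return bio->active &&
	       blk_addr == bio->last_block + 1 &&
	       rw == bio->rw;
}
```

When `can_merge()` returns false, a caller would submit the pending bio and start a new one with the incoming page's sync mode.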
    • J
      f2fs: refactor bio-related operations · 93dfe2ac
      Jaegeuk Kim committed
      This patch consolidates redundant bio operations on read and write IOs.

      1. Move bio-related code to the top of data.c.
      2. Replace f2fs_submit_bio with f2fs_submit_merged_bio, which handles read
         bios additionally.
      3. Introduce __submit_merged_bio to submit the merged bio.
      4. Change f2fs_readpage to f2fs_submit_page_bio.
      5. Introduce f2fs_submit_page_mbio to integrate previous submit_read_page and
         submit_write_page.
      Reviewed-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
      Reviewed-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
      93dfe2ac
    • J
      f2fs: remove the own bi_private allocation · 187b5b8b
      Jaegeuk Kim committed
      Previously, f2fs allocated its own bi_private data structure all the time, even
      though we don't use it. But, can we remove this bi_private allocation?

      This patch removes this additional bi_private allocation.

      1. Retrieve f2fs_sb_info from its page->mapping->host->i_sb.
       - This removes the use cases of bi_private in end_io.

      2. Use bi_private only when we really need it.
       - The bi_private is used only when the checkpoint procedure is conducted.
       - When conducting the checkpoint, f2fs submits a META_FLUSH bio and waits for
         its completion.
       - Since nothing else depends on bi_private now, let's just use the
         bi_private pointer as the completion pointer.
      Reviewed-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
      187b5b8b
    • J
      f2fs: bug fix on bit overflow from 32bits to 64bits · f9a4e6df
      Jaegeuk Kim committed
      This patch fixes some bit overflows caused by shift operations.

      Dan Carpenter reported potential bit overflow bugs as follows.
      
      fs/f2fs/segment.c:910 submit_write_page()
      	warn: should 'blk_addr << ((sbi)->log_blocksize - 9)' be a 64 bit type?
      fs/f2fs/checkpoint.c:429 get_valid_checkpoint()
      	warn: should '1 << ()' be a 64 bit type?
      fs/f2fs/data.c:408 f2fs_readpage()
      	warn: should 'blk_addr << ((sbi)->log_blocksize - 9)' be a 64 bit type?
      fs/f2fs/data.c:457 submit_read_page()
      	warn: should 'blk_addr << ((sbi)->log_blocksize - 9)' be a 64 bit type?
      fs/f2fs/data.c:525 get_data_block_ro()
      	warn: should 'i << blkbits' be a 64 bit type?
      Bug-Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
      f9a4e6df
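The warnings quoted in the commit message above all concern a shift performed in 32 bits before the result is stored in a 64-bit variable. The sketch below shows the effect in portable C. The helper names and the `sector64_t` typedef are hypothetical; `block_t` is modelled as `uint32_t` because it is a 32-bit type in f2fs.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t block_t;     /* 32-bit block address */
typedef uint64_t sector64_t;  /* hypothetical stand-in for a 64-bit sector_t */

/*
 * Problematic form: the shift is evaluated in 32 bits, so the upper bits
 * are lost before the value is widened to 64 bits.
 */
static sector64_t blk_to_sector_wrapping(block_t blk_addr, unsigned log_blocksize)
{
	assert(log_blocksize >= 9 && log_blocksize - 9 < 32);
	return (block_t)(blk_addr << (log_blocksize - 9));
}

/* Fixed form: widen to 64 bits first, then shift. */
static sector64_t blk_to_sector_wide(block_t blk_addr, unsigned log_blocksize)
{
	assert(log_blocksize >= 9 && log_blocksize - 9 < 32);
	return (sector64_t)blk_addr << (log_blocksize - 9);
}
```

With 4 KiB blocks (`log_blocksize` of 12) the shift is 3 bits, so block address 0x20000000 wraps to sector 0 in the first helper, while the second helper yields 0x100000000.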
    • C
      f2fs: send REQ_META or REQ_PRIO when reading meta area · 03232305
      Changman Lee committed
      Let's send REQ_META or REQ_PRIO when reading meta areas such as NAT and
      SIT.
      Signed-off-by: Changman Lee <cm224.lee@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
      03232305
    • J
      f2fs: add detailed information of bio types in the tracepoints · a709f4a2
      Jaegeuk Kim committed
      This patch adds more detailed bio type information to the tracepoints.
      So, we can now see REQ_META and REQ_PRIO too.
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
      a709f4a2
    • C
      f2fs: read contiguous sit entry pages by merging for mount performance · 74de593a
      Chao Yu committed
      Previously we read SIT entry pages one by one, which lost the chance to read
      contiguous pages together. So we now read pages as contiguously as
      possible for better mount performance.

      change log:
       o merge conditions and use 'continue' or 'break' instead of 'goto' as Gu Zheng
         suggested.
       o add mark_page_accessed() before releasing the page to delay VM reclaiming.
       o remove '*order' to simplify the function as Jaegeuk Kim suggested.
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      [Jaegeuk Kim: fix a bug on the block address calculation]
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
      74de593a
    • C
      f2fs: adds a tracepoint for f2fs_submit_read_bio · d4d288bc
      Chao Yu committed
      This patch adds a tracepoint for f2fs_submit_read_bio.
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      [Jaegeuk Kim: integrate tracepoints of f2fs_submit_read(_write)_bio]
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
      d4d288bc
    • C
      f2fs: adds a tracepoint for submit_read_page · 87b8872d
      Chao Yu committed
      This patch adds a tracepoint for submit_read_page.
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      [Jaegeuk Kim: integrate tracepoints of f2fs_submit_read(_write)_page]
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
      87b8872d
    • J
      f2fs: introduce a bio array for per-page write bios · 1ff7bd3b
      Jaegeuk Kim committed
      f2fs has three bio types, NODE, DATA, and META, and manages some data
      structures for each bio type.

      The code is a little bit messy, so this patch introduces a bio array
      which groups the individual data structures as follows.
      
      struct f2fs_bio_info {
      	struct bio *bio;		/* bios to merge */
      	sector_t last_block_in_bio;	/* last block number */
      	struct mutex io_mutex;		/* mutex for bio */
      };
      
      struct f2fs_sb_info {
      	...
      	struct f2fs_bio_info write_io[NR_PAGE_TYPE];	/* for write bios */
      	...
      };
      
      The code changes from this new data structure are trivial.
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
      1ff7bd3b
    • J
      f2fs: use sbi->write_mutex for write bios · 971767ca
      Jaegeuk Kim committed
      This patch removes an unnecessary semaphore (i.e., sbi->bio_sem).
      There is no reason to use the semaphore when f2fs submits read and write IOs.
      Instead, let's use a write mutex and cover the sbi->bio[] by the lock.

      Change log from v1:
       o split write_mutex as suggested by Chao Yu
      
      Chao described,
      "All DATA/NODE/META bio buffers in superblock is protected by
      'sbi->write_mutex', but each bio buffer area is independent, So we
      should split write_mutex to three for DATA/NODE/META."
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
      971767ca
    • J
      f2fs: clean up the do_submit_bio flow · 7d5e5109
      Jaegeuk Kim committed
      This patch introduces PAGE_TYPE_OF_BIO() and cleans up do_submit_bio() with it.
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
      7d5e5109
    • J
      f2fs: add a tracepoint for f2fs_issue_discard · 1661d07c
      Jaegeuk Kim committed
      This patch adds a tracepoint for f2fs_issue_discard.
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
      1661d07c
    • J
      f2fs: introduce f2fs_issue_discard() to clean up · 37208879
      Jaegeuk Kim committed
      Change log from v1:
       o fix 32bit drops reported by Dan Carpenter

      This patch adds f2fs_issue_discard() to clean up blkdev_issue_discard() flows.

      Dan Carpenter reported:
      "block_t is a 32 bit type and sector_t is a 64 bit type.  The upper 32
      bits of the sector_t are not used because the shift will wrap."
      Bug-Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
      37208879
    • J
      f2fs: add key functions for small discards · b2955550
      Jaegeuk Kim committed
      This patch adds key functions to activate the small discard feature.

      Note that this procedure is conducted during the checkpoint only.

      In flush_sit_entries(), when a new dirty sit entry is flushed, f2fs calls
      add_discard_addrs(), which searches for candidates to be discarded.
      A candidate must be marked *invalidated* now while the previous checkpoint
      still recognizes it as *valid*.

      At the end of a checkpoint procedure, f2fs issues discards based on the
      discard entry list.
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
      b2955550
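The candidate rule in the commit message above (valid in the previous checkpoint, invalidated now) can be sketched with two validity bitmaps. This is an illustration only, not the kernel's add_discard_addrs(): the function `collect_discards()`, the `discard_range` struct, and the use of a single 64-bit word per bitmap are all assumptions made to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical output record: a run of consecutive blocks to discard. */
struct discard_range {
	uint32_t blkaddr;  /* first block address of the run */
	int len;           /* number of consecutive blocks */
};

/*
 * Bit i of each bitmap describes block (start_blkaddr + i).
 * A block is a candidate when the previous checkpoint saw it as valid
 * and it is invalid now. Consecutive candidates are grouped into runs.
 * Returns the number of runs written to out (at most max_out).
 */
static size_t collect_discards(uint64_t ckpt_valid, uint64_t cur_valid,
			       uint32_t start_blkaddr,
			       struct discard_range *out, size_t max_out)
{
	uint64_t cand = ckpt_valid & ~cur_valid;
	size_t n = 0;
	int bit = 0;

	assert(out != NULL || max_out == 0);

	while (bit < 64 && n < max_out) {
		if (!((cand >> bit) & 1)) {
			bit++;
			continue;
		}
		int begin = bit;
		while (bit < 64 && ((cand >> bit) & 1))
			bit++;
		out[n].blkaddr = start_blkaddr + (uint32_t)begin;
		out[n].len = bit - begin;
		n++;
	}
	return n;
}
```

Grouping consecutive candidates into one range is what keeps the number of discard commands small when many neighbouring blocks were invalidated since the last checkpoint.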
    • J
      f2fs: add a slab cache entry for small discards · 7fd9e544
      Jaegeuk Kim committed
      This patch adds a slab cache entry for small discards.
      
      Each entry consists of:
      
      struct discard_entry {
      	struct list_head list;	/* list head */
      	block_t blkaddr;	/* block address to be discarded */
      	int len;		/* # of consecutive blocks of the discard */
      };
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
      7fd9e544