Commits · 940a6d34b31b96f0748a4b688a551a0890b2b229 · openanolis / cloud-kernel

23 Dec, 2013 21 commits

f2fs: introduce a new direct_IO write path · bfad7c2d

Committed by Jaegeuk Kim on Dec 16, 2013

Previously, f2fs did not support high-performance direct IO: every write
request went through the buffered write path, which degraded performance badly
because of memory operations such as copy_from_user.

This patch introduces a new direct IO path in which every write request is
processed by the generic blockdev_direct_IO() with an enhanced get_block
function.

The get_data_block() in f2fs handles:
1. if the original data blocks are allocated, then give them to blockdev.
2. otherwise,
  a. preallocate requested block addresses
  b. do not use extent cache for better performance
  c. give the block addresses to blockdev

This policy implies that:
- newly allocated data are sequentially written to the disk.
- updated data are randomly written to the disk.
- f2fs gives consistency on its file meta, not file data.

Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

bfad7c2d
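
The two cases handled by get_data_block() in the commit above can be illustrated with a small userspace sketch. This is not the kernel implementation: `struct file_map`, `NULL_ADDR` as used here, and `get_data_block_sketch()` are hypothetical stand-ins, and the "allocator" is just a counter that hands out the next free block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t block_t;
#define NULL_ADDR ((block_t)0)	/* "no block allocated" marker (assumed) */

/* Hypothetical per-file block map; not an f2fs structure. */
struct file_map {
	block_t *blkaddr;	/* block address for each file block index */
	size_t nblocks;		/* number of entries in blkaddr */
	block_t next_free;	/* next block the toy allocator hands out */
};

/*
 * Return the on-disk block for file block `index`.
 * Case 1: already allocated, so reuse it (an update lands wherever the
 *         block already lives, i.e. "randomly").
 * Case 2: not allocated, so preallocate the next free block (new data
 *         lands sequentially).
 */
block_t get_data_block_sketch(struct file_map *map, size_t index)
{
	assert(index < map->nblocks);

	if (map->blkaddr[index] == NULL_ADDR)
		map->blkaddr[index] = map->next_free++;
	return map->blkaddr[index];
}
```

Asking twice for the same file block returns the same address, while first-time requests receive consecutive addresses regardless of the file offset order.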

f2fs: introduce sysfs entry to control in-place-update policy · 216fbd64

Committed by Jaegeuk Kim on Nov 07, 2013

This patch introduces new sysfs entries for users to control the policy of
in-place-updates, namely IPU, in f2fs.

Sometimes f2fs suffers from performance degradation due to its out-of-place
update policy, which produces many additional node block writes.
If the storage performance depends heavily on the amount of data written
rather than on IO patterns, we had better drop this out-of-place update policy.

This patch suggests 5 policies and their triggering conditions as follows.

[sysfs entry name = ipu_policy]

0: F2FS_IPU_FORCE       all the time,
1: F2FS_IPU_SSR         if SSR mode is activated,
2: F2FS_IPU_UTIL        if FS utilization is over threshold,
3: F2FS_IPU_SSR_UTIL    if SSR mode is activated and FS utilization is over
                        threshold,
4: F2FS_IPU_DISABLE     disable IPU. (=default option)

[sysfs entry name = min_ipu_util]

This parameter controls the threshold to trigger in-place-updates.
The number indicates the percentage of filesystem utilization, and is used by
the F2FS_IPU_UTIL and F2FS_IPU_SSR_UTIL policies.

For more details, see need_inplace_update() in segment.h.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

216fbd64
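
The policy table in the commit above can be read as a small decision function. The following is a minimal userspace sketch under stated assumptions, not the kernel's need_inplace_update(): `struct fs_state` and its fields are hypothetical, and "over threshold" is assumed to mean strictly greater than min_ipu_util.

```c
#include <assert.h>
#include <stdbool.h>

/* Policy values as listed in the commit message. */
enum ipu_policy {
	F2FS_IPU_FORCE = 0,
	F2FS_IPU_SSR,
	F2FS_IPU_UTIL,
	F2FS_IPU_SSR_UTIL,
	F2FS_IPU_DISABLE,
};

/* Hypothetical snapshot of filesystem state (not a kernel structure). */
struct fs_state {
	enum ipu_policy policy;	/* value of the ipu_policy entry */
	int min_ipu_util;	/* value of the min_ipu_util entry, in percent */
	int utilization;	/* current FS utilization, in percent */
	bool ssr_active;	/* true if SSR mode is activated */
};

/* Return true if a data block should be updated in place. */
bool want_inplace_update(const struct fs_state *st)
{
	/* assumption: "over threshold" means strictly greater than */
	bool over = st->utilization > st->min_ipu_util;

	assert(st->utilization >= 0 && st->utilization <= 100);

	switch (st->policy) {
	case F2FS_IPU_FORCE:
		return true;
	case F2FS_IPU_SSR:
		return st->ssr_active;
	case F2FS_IPU_UTIL:
		return over;
	case F2FS_IPU_SSR_UTIL:
		return st->ssr_active && over;
	case F2FS_IPU_DISABLE:
	default:
		return false;
	}
}
```

With policy 3 (F2FS_IPU_SSR_UTIL), both conditions must hold; with the default policy 4, the function never selects an in-place update.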

f2fs: refactor bio->rw handling · 458e6197

Committed by Jaegeuk Kim on Dec 11, 2013

This patch introduces f2fs_io_info to mitigate the complex parameter list.

struct f2fs_io_info {
	enum page_type type;		/* contains DATA/NODE/META/META_FLUSH */
	int rw;				/* contains R/RS/W/WS */
	int rw_flag;			/* contains REQ_META/REQ_PRIO */
};

1. f2fs_write_data_pages
 - DATA
 - WRITE_SYNC is set when wbc->sync_mode is WB_SYNC_ALL.

2. sync_node_pages
 - NODE
 - WRITE_SYNC all the time

3. sync_meta_pages
 - META
 - WRITE_SYNC all the time
 - REQ_META | REQ_PRIO all the time

 ** f2fs_submit_merged_bio() handles META_FLUSH.

4. ra_nat_pages, ra_sit_pages, ra_sum_pages
 - META
 - READ_SYNC

Cc: Fan Li <fanofcode.li@samsung.com>
Cc: Changman Lee <cm224.lee@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

458e6197

f2fs: merge pages with the same sync_mode flag · 63a0b7cb

Committed by Fan Li on Dec 09, 2013

Previously, f2fs submitted most write requests using WRITE_SYNC, but
f2fs_write_data_pages submitted the last write requests with the sync_mode
flags passed by its callers.

This causes a performance problem since contiguous pages with different sync
flags can't be merged in the cfq IO scheduler (thanks to Yu Chao for pointing
this out), and synchronous requests often take more time.

This patch makes the following modifications to DATA writebacks:

1. every page is written back using the sync mode the caller passes.
2. only pages with the same sync mode can be merged into one bio request.

These changes are restricted to DATA pages. Other types of writebacks are
modified to remain synchronous.

In my test with tiotest, f2fs sequential write performance improved by about
7%-10%, and this patch has no obvious impact on other performance tests.

Signed-off-by: Fan Li <fanofcode.li@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

63a0b7cb

f2fs: refactor bio-related operations · 93dfe2ac

Committed by Jaegeuk Kim on Nov 30, 2013

This patch integrates redundant bio operations on read and write IOs.

1. Move bio-related code to the top of data.c.
2. Replace f2fs_submit_bio with f2fs_submit_merged_bio, which handles read
   bios additionally.
3. Introduce __submit_merged_bio to submit the merged bio.
4. Change f2fs_readpage to f2fs_submit_page_bio.
5. Introduce f2fs_submit_page_mbio to integrate previous submit_read_page and
   submit_write_page.

Reviewed-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

93dfe2ac

f2fs: remove the own bi_private allocation · 187b5b8b

Committed by Jaegeuk Kim on Nov 30, 2013

Previously, f2fs allocated its own bi_private data structure all the time, even
when it was not used. Can we remove this bi_private allocation?

This patch removes that additional bi_private allocation.

1. Retrieve f2fs_sb_info from its page->mapping->host->i_sb.
 - This removes the use cases of bi_private in end_io.

2. Use bi_private only when we really need it.
 - The bi_private is used only when the checkpoint procedure is conducted.
 - When conducting the checkpoint, f2fs submits a META_FLUSH bio and waits for
   its completion.
 - Since nothing else depends on bi_private now, let's just use the
   bi_private pointer as the completion pointer.

Reviewed-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

187b5b8b

f2fs: bug fix on bit overflow from 32bits to 64bits · f9a4e6df

Committed by Jaegeuk Kim on Nov 28, 2013

This patch fixes some bit overflows by the shift operations.

Dan Carpenter reported potential bugs on bit overflows as follows.

fs/f2fs/segment.c:910 submit_write_page()
	warn: should 'blk_addr << ((sbi)->log_blocksize - 9)' be a 64 bit type?
fs/f2fs/checkpoint.c:429 get_valid_checkpoint()
	warn: should '1 << ()' be a 64 bit type?
fs/f2fs/data.c:408 f2fs_readpage()
	warn: should 'blk_addr << ((sbi)->log_blocksize - 9)' be a 64 bit type?
fs/f2fs/data.c:457 submit_read_page()
	warn: should 'blk_addr << ((sbi)->log_blocksize - 9)' be a 64 bit type?
fs/f2fs/data.c:525 get_data_block_ro()
	warn: should 'i << blkbits' be a 64 bit type?

Bug-Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

f9a4e6df
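
The class of bug reported above can be reproduced in plain C. The sketch below assumes a 32-bit `unsigned int`, a 32-bit `block_t`, and a 64-bit `sector_t`; the function names are illustrative and are not taken from the patch. When the shift is evaluated in 32 bits, the high bits are lost before the result is widened; casting first keeps them.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t block_t;	/* 32-bit block address */
typedef uint64_t sector_t;	/* 64-bit sector number (assumed width) */

/* Buggy shape: the shift happens in 32 bits, then the result is widened. */
sector_t blk_to_sector_buggy(block_t blk_addr, unsigned int shift)
{
	return blk_addr << shift;
}

/* Fixed shape: widen to 64 bits first, then shift. */
sector_t blk_to_sector(block_t blk_addr, unsigned int shift)
{
	return (sector_t)blk_addr << shift;
}
```

For example, with 4KB blocks and 512-byte sectors the shift is 3, so block address 0x20000000 maps to sector 0x100000000, which does not fit in 32 bits and wraps to 0 in the buggy version.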

f2fs: send REQ_META or REQ_PRIO when reading meta area · 03232305

Committed by Changman Lee on Nov 24, 2013

Let's send REQ_META or REQ_PRIO when reading the meta area, such as NAT/SIT,
etc.

Signed-off-by: Changman Lee <cm224.lee@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

03232305

f2fs: add detailed information of bio types in the tracepoints · a709f4a2

Committed by Jaegeuk Kim on Nov 24, 2013

This patch adds more detailed information about bio types to the tracepoints.
So, we can now see REQ_META and REQ_PRIO too.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

a709f4a2

f2fs: read contiguous sit entry pages by merging for mount performance · 74de593a

Committed by Chao Yu on Nov 22, 2013

Previously, we read SIT entry pages one by one, which lost the chance to read
contiguous pages together. So we now read pages as contiguously as possible
for better mount performance.

change log:
 o merge judgements/use 'Continue' or 'Break' instead of 'Goto' as Gu Zheng
   suggested.
 o add mark_page_accessed() before releasing the page to delay VM reclaiming.
 o remove '*order' to simplify the function as Jaegeuk Kim suggested.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
[Jaegeuk Kim: fix a bug on the block address calculation]
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

74de593a

f2fs: adds a tracepoint for f2fs_submit_read_bio · d4d288bc

Committed by Chao Yu on Nov 24, 2013

This patch adds a tracepoint for f2fs_submit_read_bio.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
[Jaegeuk Kim: integrate tracepoints of f2fs_submit_read(_write)_bio]
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

d4d288bc

f2fs: adds a tracepoint for submit_read_page · 87b8872d

Committed by Chao Yu on Nov 20, 2013

This patch adds a tracepoint for submit_read_page.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
[Jaegeuk Kim: integrate tracepoints of f2fs_submit_read(_write)_page]
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

87b8872d

f2fs: introduce a bio array for per-page write bios · 1ff7bd3b

Committed by Jaegeuk Kim on Nov 19, 2013

f2fs has three bio types, NODE, DATA, and META, and manages some data
structures for each bio type.

The code is a little bit messy, so this patch introduces a bio array
which groups the individual data structures as follows.

struct f2fs_bio_info {
	struct bio *bio;		/* bios to merge */
	sector_t last_block_in_bio;	/* last block number */
	struct mutex io_mutex;		/* mutex for bio */
};

struct f2fs_sb_info {
	...
	struct f2fs_bio_info write_io[NR_PAGE_TYPE];	/* for write bios */
	...
};

The code changes from this new data structure are trivial.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

1ff7bd3b

f2fs: use sbi->write_mutex for write bios · 971767ca

Committed by Jaegeuk Kim on Nov 18, 2013

This patch removes an unnecessary semaphore (i.e., sbi->bio_sem).
There is no reason to use the semaphore when f2fs submits read and write IOs.
Instead, let's use a write mutex and cover the sbi->bio[] by the lock.

Change log from v1:
 o split write_mutex suggested by Chao Yu

Chao described,
"All DATA/NODE/META bio buffers in superblock is protected by
'sbi->write_mutex', but each bio buffer area is independent, So we
should split write_mutex to three for DATA/NODE/META."

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

971767ca

f2fs: clean up the do_submit_bio flow · 7d5e5109

Committed by Jaegeuk Kim on Nov 18, 2013

This patch introduces PAGE_TYPE_OF_BIO() and cleans up do_submit_bio() with it.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

7d5e5109

f2fs: add a tracepoint for f2fs_issue_discard · 1661d07c

Committed by Jaegeuk Kim on Nov 12, 2013

This patch adds a tracepoint for f2fs_issue_discard.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

1661d07c

f2fs: introduce f2fs_issue_discard() to clean up · 37208879

Committed by Jaegeuk Kim on Nov 12, 2013

Change log from v1:
 o fix 32bit drops reported by Dan Carpenter

This patch adds f2fs_issue_discard() to clean up blkdev_issue_discard() flows.

Dan Carpenter reported:
"block_t is a 32 bit type and sector_t is a 64 bit type.  The upper 32
bits of the sector_t are not used because the shift will wrap."

Bug-Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

37208879

f2fs: add key functions for small discards · b2955550

Committed by Jaegeuk Kim on Nov 12, 2013

This patch adds key functions to activate the small discard feature.

Note that this procedure is conducted during the checkpoint only.

In flush_sit_entries(), when a new dirty sit entry is flushed, f2fs calls
add_discard_addrs(), which searches for candidates to be discarded.
A candidate must be marked *invalidated* now while the previous checkpoint
still recognizes it as *valid*.

At the end of a checkpoint procedure, f2fs issues discards based on the
discard entry list.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

b2955550
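
The candidate rule described above (valid at the previous checkpoint, invalid now) can be sketched in userspace as a scan for runs of such blocks. This is only an illustration: `struct discard_range` and `collect_discards()` are hypothetical names, and plain `bool` arrays stand in for the kernel's per-segment bitmaps.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one discard entry. */
struct discard_range {
	unsigned int start;	/* first block offset of the run */
	unsigned int len;	/* number of consecutive blocks */
};

/*
 * Collect runs of blocks that were valid at the previous checkpoint
 * (ckpt_valid[i]) but are invalid now (!cur_valid[i]).
 * Returns the number of ranges written to out (at most max_out).
 */
size_t collect_discards(const bool *ckpt_valid, const bool *cur_valid,
			unsigned int nblocks,
			struct discard_range *out, size_t max_out)
{
	size_t n = 0;
	unsigned int i = 0;

	assert(ckpt_valid && cur_valid && out);

	while (i < nblocks && n < max_out) {
		unsigned int start;

		/* skip blocks that are not discard candidates */
		if (!ckpt_valid[i] || cur_valid[i]) {
			i++;
			continue;
		}
		/* extend the run while the candidate condition holds */
		start = i;
		while (i < nblocks && ckpt_valid[i] && !cur_valid[i])
			i++;
		out[n].start = start;
		out[n].len = i - start;
		n++;
	}
	return n;
}
```

Blocks that were never valid at the last checkpoint are skipped even if they are invalid now, since there is nothing on the device that the checkpoint still refers to.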

f2fs: add a slab cache entry for small discards · 7fd9e544

Committed by Jaegeuk Kim on Nov 15, 2013

This patch adds a slab cache entry for small discards.

Each entry consists of:

struct discard_entry {
	struct list_head list;	/* list head */
	block_t blkaddr;	/* block address to be discarded */
	int len;		/* # of consecutive blocks of the discard */
};

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

7fd9e544

f2fs: improve searching speed of __next_free_blkoff · e81c93cf

Committed by Changman Lee on Nov 15, 2013

Finding a zero bit in the result of an OR operation between ckpt_valid_map
and cur_valid_map is faster than finding a zero bit in each bitmap.

Signed-off-by: Changman Lee <cm224.lee@samsung.com>
[Jaegeuk Kim: adjust changed function name]
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

e81c93cf
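
The idea in the commit above is that a block offset is free only if its bit is zero in both bitmaps, so one scan over the OR of the two maps is enough. The sketch below is a simplified userspace version: `next_free_blkoff()` is a hypothetical name, and ordinary LSB-first bit order within `unsigned long` words is assumed for brevity, which differs from the reversed byte order f2fs uses on disk.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/*
 * Return the first offset >= start whose bit is zero in both bitmaps,
 * or nbits if there is none.
 */
size_t next_free_blkoff(const unsigned long *ckpt_map,
			const unsigned long *cur_map,
			size_t nbits, size_t start)
{
	const size_t bpl = sizeof(unsigned long) * CHAR_BIT;
	size_t w, b;

	assert(ckpt_map && cur_map);

	for (w = start / bpl; w * bpl < nbits; w++) {
		/* a zero bit in the OR is free in both bitmaps */
		unsigned long free_bits = ~(ckpt_map[w] | cur_map[w]);

		for (b = 0; b < bpl; b++) {
			size_t pos = w * bpl + b;

			if (pos < start)
				continue;
			if (pos >= nbits)
				return nbits;
			if (free_bits & (1UL << b))
				return pos;
		}
	}
	return nbits;
}
```

A word whose OR is all ones is rejected after a single combined pass, instead of scanning each bitmap separately and reconciling the two results.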

f2fs: introduce __find_rev_next(_zero)_bit · 9a7f143a

Committed by Changman Lee on Nov 15, 2013

When f2fs_set_bit is used, the MSB and LSB within a byte are reversed.
In that case we can use __find_rev_next_bit or __find_rev_next_zero_bit.

Signed-off-by: Changman Lee <cm224.lee@samsung.com>
[Jaegeuk Kim: change the function names]
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

9a7f143a
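
To make the reversed bit order concrete, here is a bit-at-a-time userspace sketch. `rev_set_bit()` and `find_rev_next_bit()` are simplified, hypothetical stand-ins (the kernel helpers are named in the commit above and are implemented differently); the only property assumed is that bit 0 of the bitmap is the most significant bit of byte 0.

```c
#include <assert.h>
#include <stddef.h>

/* Set bit nr, counting from the MSB of each byte (reversed order). */
void rev_set_bit(unsigned int nr, unsigned char *addr)
{
	addr[nr >> 3] |= (unsigned char)(1u << (7 - (nr & 0x07)));
}

/*
 * Return the first set bit at or after offset in a reversed-order
 * bitmap of `size` bits, or size if no bit is set.
 */
size_t find_rev_next_bit(const unsigned char *addr, size_t size,
			 size_t offset)
{
	size_t i;

	assert(addr);

	for (i = offset; i < size; i++)
		if (addr[i >> 3] & (1u << (7 - (i & 0x07))))
			return i;
	return size;
}
```

Setting bit 10 touches byte 1 and produces the value 0x20 there, whereas an LSB-first helper would produce 0x04; this mismatch is why the generic find-bit routines cannot be used directly on such bitmaps.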

11 Nov, 2013 1 commit

f2fs: issue more large discard command · 29e59c14

Committed by Changman Lee on Nov 11, 2013

o Changes from v1
  Use find_next(_zero)_bit suggested by jg.kim

When f2fs issues a discard command and the segments are contiguous,
let's gather the adjacent segments and issue one larger discard.

** blktrace **
179,1    0     5859    42.619023770   971  C   D 131072 + 2097152 [0]
179,1    0    33665   108.840475468   971  C   D 2228224 + 2494464 [0]
179,1    0    33671   109.131616427   971  C   D 14909440 + 344064 [0]
179,1    0    33677   109.137100677   971  C   D 15261696 + 4096 [0]

Signed-off-by: Changman Lee <cm224.lee@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

29e59c14

08 Nov, 2013 1 commit

f2fs: cleanup waiting routine for writeback pages in cp · fb51b5ef

Committed by Changman Lee on Nov 07, 2013

Use the general method supported by the kernel.

 o changes from v1
   If any waiter exists at end io, wake it up.

Signed-off-by: Changman Lee <cm224.lee@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

fb51b5ef

06 Nov, 2013 1 commit

f2fs: avoid to use a NULL point in destroy_segment_manager · 3b03f724

Committed by Chao Yu on Nov 06, 2013

Avoid using a NULL pointer in destroy_segment_manager after memory allocation
for f2fs_sm_info fails.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

3b03f724

30 Oct, 2013 1 commit

f2fs: change the method of calculating the number summary blocks · 9a47938b

Committed by Fan Li on Oct 29, 2013

npages_for_summary_flush uses (SUMMARY_SIZE + 1) as the size of a f2fs_summary
while its actual size is SUMMARY_SIZE. So the result is sometimes bigger than
the actual number by one, which prevents the checkpoint from being written to
disk contiguously, and sometimes summary blocks can't be compacted as they
should be.
Besides, when writing summary blocks into pages, if the remaining space in a
page isn't big enough for one f2fs_summary, it is left unused; the current code
does not seem to take this into account.

Signed-off-by: Fan Li <fanofcode.li@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

9a47938b
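
The two points in the commit above (use the true entry size, and count whole entries per page so that leftover bytes stay unused) can be sketched as a per-page calculation. This is a hedged illustration rather than the patched npages_for_summary_flush(): the function name and every parameter are hypothetical, and the layout assumed here is that only the first page reserves extra bytes in front of the entries while every page ends with a footer.

```c
#include <assert.h>

/*
 * Number of pages needed to store `count` fixed-size entries.
 * Entries never straddle pages, so each page holds a whole number of
 * entries and any remaining bytes in it are simply left unused.
 */
int summary_pages_needed(int count, int page_bytes, int first_reserved,
			 int footer_bytes, int entry_bytes)
{
	int in_first, in_rest;

	assert(count >= 0 && entry_bytes > 0);

	/* whole entries that fit in the first page and in later pages */
	in_first = (page_bytes - first_reserved - footer_bytes) / entry_bytes;
	in_rest = (page_bytes - footer_bytes) / entry_bytes;
	assert(in_first > 0 && in_rest > 0);

	if (count <= in_first)
		return 1;
	/* round up the remaining entries to whole pages */
	return 1 + (count - in_first + in_rest - 1) / in_rest;
}
```

With a 100-byte page, 20 reserved bytes, a 5-byte footer, and 7-byte entries, the first page holds 10 entries and each later page holds 13, so 23 entries need two pages and 24 need three. Using an entry size of 8 instead of 7 in the same formula lowers the per-page capacity and can over-count by one page, which is the symptom the commit describes.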

29 Oct, 2013 1 commit

f2fs: add an option to avoid unnecessary BUG_ONs · 5d56b671

Committed by Jaegeuk Kim on Oct 29, 2013

If you want to remove unnecessary BUG_ONs, you can just turn off F2FS_CHECK_FS
in your kernel config.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

5d56b671

28 Oct, 2013 1 commit

f2fs: remove unnecessary segment bitmap updates · 4625d6aa

Committed by Changman Lee on Oct 25, 2013

Only one dirty type is set in __locate_dirty_segment, so we already know the
dirty type of the segment and don't need to check other dirty types.

Signed-off-by: Changman Lee <cm224.lee@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

4625d6aa

25 Oct, 2013 4 commits

f2fs: remove redundant set_page_dirty from write_compacted_summaries · e8d61a74

Committed by Chao Yu on Oct 24, 2013

Previously, set_page_dirty was called every time after writing one summary
info into the compacted summary page.
To avoid redundant set_page_dirty calls, we now call set_page_dirty only before
releasing the page.

Signed-off-by: Yu Chao <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

e8d61a74

f2fs: introduce f2fs_balance_fs_bg for some background jobs · 4660f9c0

Committed by Jaegeuk Kim on Oct 24, 2013

This patch merges some background jobs into this new function.

Signed-off-by: Changman Lee <cm224.lee@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

4660f9c0

f2fs: reclaim prefree segments periodically · 81eb8d6e

Committed by Jaegeuk Kim on Oct 24, 2013

Previously, f2fs postponed reclaiming prefree segments into free segments
as much as possible.
However, if a user writes and deletes a bunch of data without any sync or fsync
calls, some flash storage devices can suffer from garbage collection.

So, this patch adds the reclaiming code to f2fs_write_node_pages and the
background GC thread.

If there are a lot of prefree segments, let's do a checkpoint so that f2fs
submits discard commands for the prefree regions to the flash storage.

Signed-off-by: Changman Lee <cm224.lee@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

81eb8d6e

f2fs: clean up several status-related operations · dcdfff65

Committed by Jaegeuk Kim on Oct 22, 2013

This patch cleans up improper definitions that update some status information.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

dcdfff65

22 Oct, 2013 2 commits

f2fs: no need to check other dirty_segmap when the seg has been found · 435f2a1b

Committed by Haicheng Li on Oct 18, 2013

Because one dirty seg can only be mapped to one dirty_type; otherwise, it's a
bug.

Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com>
[Jaegeuk Kim: modify a comment related to this patch]
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

435f2a1b

f2fs: use true and false for boolean value · cffbfa66

Committed by Haicheng Li on Oct 18, 2013

Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

cffbfa66

18 Oct, 2013 1 commit

f2fs: avoid wait if IO end up when do_checkpoint for better performance · e2340887

Committed by Gu Zheng on Oct 14, 2013

Previously, do_checkpoint() called congestion_wait() to wait for the pages
(previously submitted node/meta/data pages) to be written back.
congestion_wait() waits for a fixed period (e.g. HZ / 50), and there was no
additional wake-up mechanism if IO finished before that period elapsed.
Yuan Zhong found a situation in which the pages had already been written back
but the checkpoint thread was still waiting for congestion_wait() to exit.

So here we store the checkpoint task in f2fs_sb when doing a checkpoint. It
waits for IO to complete if there is IO going on, and the end IO path wakes up
the checkpoint task when IO finishes.

Thanks to Yuan Zhong for the preliminary work on this problem.

Reported-by: Yuan Zhong <yuan.mark.zhong@samsung.com>
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

e2340887

24 Sep, 2013 1 commit

f2fs: avoid allocating failure in bio_alloc · cc7b1bb1

Committed by Chao Yu on Sep 22, 2013

This patch adds the macro MAX_BIO_BLOCKS to limit the value of npages in
f2fs_bio_alloc. This avoids an allocation failure in bio_alloc caused by
npages being larger than BIO_MAX_PAGES.

Signed-off-by: Yu Chao <chao2.yu@samsung.com>
Reviewed-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

cc7b1bb1

19 Aug, 2013 1 commit

f2fs: fix a compound statement label error · 7b405275

Committed by Gu Zheng on Aug 19, 2013

An error "label at end of compound statement" occurs if CONFIG_F2FS_STAT_FS
is disabled:
fs/f2fs/segment.c:556:1: error: label at end of compound statement
So clean up the 'out' label to fix it.

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

7b405275
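
The error class fixed above arises when the only statement after a label is compiled out, leaving the label directly in front of the closing brace, which C standards before C23 reject. The sketch below shows the problematic shape in a comment and one possible way to avoid it; `ENABLE_STATS` and `update()` are hypothetical names, and this is not the code from fs/f2fs/segment.c.

```c
#include <assert.h>

/*
 * Problematic shape, kept in a comment because it fails to build on
 * pre-C23 compilers when ENABLE_STATS is undefined:
 *
 *	void update(int *value, int *counter, int skip)
 *	{
 *		if (skip)
 *			goto out;
 *		*value += 1;
 *	out:
 *	#ifdef ENABLE_STATS
 *		(*counter)++;
 *	#endif
 *	}	<- the label is immediately followed by the closing brace
 */

/* One possible fix: drop the label and guard the work with the condition. */
void update(int *value, int *counter, int skip)
{
	assert(value && counter);

	if (!skip)
		*value += 1;
#ifdef ENABLE_STATS
	(*counter)++;
#else
	(void)counter;	/* unused when statistics are compiled out */
#endif
}
```

Another common workaround is to put an empty statement (a lone semicolon) after the label, but removing the label keeps the function free of an otherwise pointless jump.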

12 Aug, 2013 1 commit

f2fs: clean up the needless end 'return' of void function · 41dfde13

Committed by Gu Zheng on Aug 09, 2013

Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

41dfde13

06 Aug, 2013 1 commit

f2fs: fix a deadlock in fsync · a569469e

Committed by Jin Xu on Aug 05, 2013

This patch fixes a deadlock bug that occurs quite often when there are
concurrent write and fsync operations on the same file.

Following is the simplified call trace when tasks get hung.

fsync thread:
- f2fs_sync_file
 ...
 - f2fs_write_data_pages
 ...
  - update_extent_cache
  ...
   - update_inode
    - wait_on_page_writeback

bdi writeback thread
- __writeback_single_inode
 - f2fs_write_data_pages
  - mutex_lock(sbi->writepages)

The deadlock happens when the fsync thread waits on an inode page that has
been added to f2fs' cached bio sbi->bio[NODE], and unfortunately,
no one else is able to submit the cached bio to the block layer for
writeback. This is because the fsync thread already holds sbi->fs_lock and
the sbi->writepages lock, causing the bdi thread to block when it attempts
to write data pages for the same inode. At the same time, the f2fs_gc thread
does not notice the situation and cannot help. Even the sync syscall
gets blocked.

To fix it, we can submit the cached bio first before waiting on an inode page
that is being written back.

Signed-off-by: Jin Xu <jinuxstyle@gmail.com>
[Jaegeuk Kim: add more cases to use f2fs_wait_on_page_writeback]
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

a569469e

30 Jul, 2013 1 commit

f2fs: move bio_private allocation out of f2fs_bio_alloc() · d8207f69

Committed by Gu Zheng on Jul 25, 2013

bio->bi_private is not always needed. In the data read path, end_read_io
does not need bio_private, so move the bio_private allocation out of
f2fs_bio_alloc(): allocate it in submit_write_page() and skip it in
f2fs_readpage().

Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

d8207f69

02 Jul, 2013 1 commit

f2fs: remove reusing any prefree segments · 763bfe1b

Committed by Jaegeuk Kim on Jun 27, 2013

This patch removes check_prefree_segments, which was initially designed to
enhance performance by narrowing the range of LBA usage across the whole block
device.

When allocating a new segment, f2fs previously tried to find suitable prefree
segments and, if it found one, reused that segment for further
data or node block allocation.

However, I found that this was a totally wrong approach, since the prefree
segments hold several data or node blocks that will be used by the roll-forward
mechanism after a sudden power-off.

Let's assume the following scenario.

/* write 8MB with fsync */
for (i = 0; i < 2048; i++) {
	offset = i * 4096;
	pwrite(fd, buf, 4096, offset);	/* write 4KB at offset */
	fsync(fd);
}

In this case, the naive segment allocation sequence will be like:
 data segment: x, x+1, x+2, x+3
 node segment: y, y+1, y+2, y+3.

But, if we can reuse prefree segments, the sequence can be like:
 data segment: x, x+1, y, y+1
 node segment: y, y+1, y+2, y+3.
This is because y, y+1, and y+2 became prefree segments one by one, and those
are reused by data allocation.

After conducting this workload, we should consider how to recover the latest
inode with its data.
If we reuse prefree segments such as y or y+1, we lose the old node blocks,
so f2fs cannot even start roll-forward recovery.

Therefore, I suggest that we remove the reuse of prefree segments.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

763bfe1b

openanolis / cloud-kernel: synced successfully more than 1 year ago