Commits · b7ad7512b84b26f1c0ec823647a387627c138d32 · openanolis / cloud-kernel

23 Feb, 2016 (40 commits)

f2fs: split journal cache from curseg cache · b7ad7512

Committed by Chao Yu on Feb 19, 2016

In the curseg cache, f2fs caches two different parts:
 - data of the current summary block, i.e. summary entries and footer info.
 - journal info, i.e. sparse nat/sit entries or io stat info.

This approach has two drawbacks: 1) it may cause higher lock contention
when we access or update both parts of the cache, since the same mutex lock
curseg_mutex protects the whole cache. 2) the current summary block, with
the last journal info, is written back to the device as a normal summary
block when flushing. However, journal info is treated as valid only in the
current summary, so most normal summary blocks contain junk journal data,
which wastes the remaining space of the summary block.

So, in order to fix the above issues, we split the curseg cache into two parts:
a) current summary block, protected by the original mutex lock curseg_mutex
b) journal cache, protected by the newly introduced r/w semaphore journal_rwsem

When loading the curseg cache during ->mount, we store summary info and
journal info in different caches; when doing a checkpoint, we combine the
data of the two caches into the current summary block for persisting.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: enhance IO path with block plug · e9f5b8b8

Committed by Chao Yu on Feb 14, 2016

Use block plug in more places, as listed below, to let the process cache
bios as much as possible, in order to reduce the lock overhead of the queue
in the IO scheduler.
1) sync_meta_pages
2) ra_meta_pages
3) f2fs_balance_fs_bg
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: introduce f2fs_journal struct to wrap journal info · dfc08a12

Committed by Chao Yu on Feb 14, 2016

Introduce a new structure f2fs_journal to wrap journal info in struct
f2fs_summary_block for readability.

struct f2fs_journal {
	union {
		__le16 n_nats;
		__le16 n_sits;
	};
	union {
		struct nat_journal nat_j;
		struct sit_journal sit_j;
		struct f2fs_extra_info info;
	};
} __packed;

struct f2fs_summary_block {
	struct f2fs_summary entries[ENTRIES_IN_SUM];
	struct f2fs_journal journal;
	struct summary_footer footer;
} __packed;
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
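As a rough illustration of the layout in dfc08a12, the userspace sketch below mirrors the two structures with opaque byte arrays and checks that entries, journal and footer add up to one block. The sizes used (4096-byte block, 512 entries of 7 bytes, 5-byte footer) are assumptions that should be checked against the f2fs on-disk headers; the `*_sketch` names are hypothetical and the packed attribute is GCC/Clang specific.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed on-disk sizes; verify them against the f2fs headers. */
#define BLOCK_SIZE_SKETCH  4096
#define ENTRIES_IN_SUM     512
#define SUMMARY_SIZE       7
#define SUM_FOOTER_SIZE    5
#define SUM_JOURNAL_SIZE \
	(BLOCK_SIZE_SKETCH - SUM_FOOTER_SIZE - SUMMARY_SIZE * ENTRIES_IN_SUM)

/* Opaque stand-ins for struct f2fs_summary and struct summary_footer. */
struct summary_sketch { uint8_t raw[SUMMARY_SIZE]; } __attribute__((packed));
struct footer_sketch  { uint8_t raw[SUM_FOOTER_SIZE]; } __attribute__((packed));

/* Mirrors f2fs_journal: a 2-byte count followed by one of three payloads. */
struct journal_sketch {
	union {
		uint16_t n_nats;
		uint16_t n_sits;
	};
	union {
		uint8_t nat_j[SUM_JOURNAL_SIZE - 2];
		uint8_t sit_j[SUM_JOURNAL_SIZE - 2];
		uint8_t info[SUM_JOURNAL_SIZE - 2];
	};
} __attribute__((packed));

/* Mirrors f2fs_summary_block: entries, then journal, then footer. */
struct summary_block_sketch {
	struct summary_sketch entries[ENTRIES_IN_SUM];
	struct journal_sketch journal;
	struct footer_sketch footer;
} __attribute__((packed));

/* Size of the journal area inside the sketched block. */
size_t journal_bytes(void)
{
	return sizeof(struct journal_sketch);
}

/* Byte offset at which the journal starts, right after the entries. */
size_t journal_offset(void)
{
	/* The whole block must fill exactly one on-disk block. */
	assert(sizeof(struct summary_block_sketch) == BLOCK_SIZE_SKETCH);
	return offsetof(struct summary_block_sketch, journal);
}
```

Wrapping the journal in its own structure means the journal area can be copied or cached as a single object instead of field by field.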

f2fs crypto: avoid unneeded memory allocation when {en/de}crypting symlink · 922ec355

Committed by Chao Yu on Feb 15, 2016

This patch aligns f2fs with the ext4 code: it removes unneeded memory
allocation in the symlink creation/access paths.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs crypto: handle unexpected lack of encryption keys · ae108668

Committed by Chao Yu on Feb 14, 2016

This patch syncs f2fs with commit abdd438b ("ext4 crypto: handle
unexpected lack of encryption keys") from ext4.

Fix up attempts by users to try to write to a file when they don't
have access to the encryption key.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs crypto: make sure the encryption info is initialized on opendir(2) · ed3360ab

Committed by Chao Yu on Feb 14, 2016

This patch syncs f2fs with commit 6bc445e0 ("ext4 crypto: make
sure the encryption info is initialized on opendir(2)") from ext4.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: support revoking atomic written pages · 28bc106b

Committed by Chao Yu on Feb 06, 2016

f2fs supports atomic write with the following semantics:
1. open db file
2. ioctl start atomic write
3. (write db file) * n
4. ioctl commit atomic write
5. close db file

With this flow we can avoid the file becoming corrupted on an abnormal
power cut, because we hold the data of the transaction in referenced pages
linked in the inmem_pages list of the inode, without setting them dirty, so
this data won't be persisted unless we commit it in step 4.

But we should still hold the journal db file in memory by using volatile
write, because our 'atomic write support' semantics are incomplete: in
step 4, we could fail to submit all dirty data of the transaction. Once
partial dirty data has been committed to storage, then after a checkpoint
and an abnormal power cut, the db file will be corrupted forever.

So this patch tries to improve the atomic write flow by adding a revoking
flow: once an internal error occurs while committing, it gives another
chance to revoke the partially submitted data of the current transaction,
which makes the committing operation behave more like an atomic one.

If we're not lucky and the revoking operation fails, EAGAIN will be
reported to the user, suggesting either recovery with the held journal file
or retrying the current transaction.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
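The commit-then-revoke idea in 28bc106b can be sketched in userspace as follows. This is a toy model, not the f2fs code: `page_sketch`, the injected failure flags and the use of -EIO as the original commit error are all assumptions made for illustration; only the -EAGAIN result on a failed revoke comes from the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one page of the transaction. */
struct page_sketch {
	int disk;          /* what storage currently holds */
	int pending;       /* data staged by the transaction */
	bool fail_write;   /* inject an error when committing this page */
	bool fail_revoke;  /* inject an error when restoring this page */
};

/*
 * Commit n staged pages in order. If a write fails, restore the pages
 * that were already written. Returns 0 on success, the original error
 * (-EIO in this model) if the commit failed but was fully revoked, and
 * -EAGAIN if the revoke itself failed and partial data remains.
 */
int commit_with_revoke(struct page_sketch *pages, int *saved, size_t n)
{
	size_t done;
	int err = 0;

	assert(pages != NULL && saved != NULL);

	for (done = 0; done < n; done++) {
		if (pages[done].fail_write) {
			err = -EIO;
			break;
		}
		saved[done] = pages[done].disk;   /* remember the old data */
		pages[done].disk = pages[done].pending;
	}
	if (!err)
		return 0;

	/* Revoke, newest first, whatever part of the transaction landed. */
	while (done-- > 0) {
		if (pages[done].fail_revoke)
			return -EAGAIN;
		pages[done].disk = saved[done];
	}
	return err;
}
```

A caller that receives -EAGAIN in this model knows that storage may hold a partial transaction, which corresponds to the case where the commit message suggests recovering from the held journal file or retrying.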

f2fs: split drop_inmem_pages from commit_inmem_pages · 29b96b54

Committed by Chao Yu on Feb 06, 2016

Split drop_inmem_pages from commit_inmem_pages for code readability,
and prepare for the following modification.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: avoid garbage lenghs in dentries · 7d9dfa1d

Committed by Jaegeuk Kim on Feb 12, 2016

This patch eliminates garbage name lengths in dentries in order
to provide correct readdir results.

For example, if a valid dentry consists of:
 bitmap : 1   1 1 1
 len    : 32  0 x 0,

readdir can start with the second bit_pos having len = 0.
Or, it can start with the third bit_pos having garbage.

In both cases, we should avoid trying to fill dentries.
So, this patch not only removes any garbage length, but also avoids entering
the zero-length case in readdir.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
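The readdir behaviour described in 7d9dfa1d can be pictured with the small userspace walk below. It is a sketch under assumptions: the 8-byte slot length and the names `dentry_slots` and `count_emitted` are illustrative and are not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SLOT_LEN_SKETCH 8   /* assumed bytes of name stored per dentry slot */

/* Number of slots a name of the given length occupies (rounded up). */
static size_t dentry_slots(size_t name_len)
{
	return (name_len + SLOT_LEN_SKETCH - 1) / SLOT_LEN_SKETCH;
}

/*
 * Walk a dentry block from bit position `start` and count the entries
 * that would be emitted. A set bit whose length is zero is a
 * continuation slot, so it is skipped instead of being filled.
 */
size_t count_emitted(const bool *bitmap, const unsigned short *len,
		     size_t nslots, size_t start)
{
	size_t pos = start, emitted = 0;

	assert(bitmap != NULL && len != NULL);

	while (pos < nslots) {
		if (!bitmap[pos] || len[pos] == 0) {
			pos++;          /* nothing valid starts here */
			continue;
		}
		emitted++;
		pos += dentry_slots(len[pos]);
	}
	return emitted;
}
```

With the example from the commit message (one 32-byte name spanning four slots, continuation lengths cleared to 0), starting at position 0 emits one entry and starting at position 1 or 2 emits none.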

f2fs crypto: sync with ext4's fname padding · a263669f

Committed by Jaegeuk Kim on Feb 10, 2016

This patch fixes the wrong adoption of fname padding.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: use correct errno · 60b286c4

Committed by Jaegeuk Kim on Feb 09, 2016

This patch fixes a misused error number.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs crypto: add missing locking for keyring_key access · 745e8490

Committed by Jaegeuk Kim on Feb 05, 2016

This patch adopts:
	ext4 crypto: add missing locking for keyring_key access
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs crypto: check for too-short encrypted file names · 1dafa51d

Committed by Jaegeuk Kim on Feb 05, 2016

This patch adopts:
	ext4 crypto: check for too-short encrypted file names

An encrypted file name should never be shorter than 16 bytes, the
AES block size.  The 3.10 crypto layer will oops and crash the kernel
if ciphertext shorter than the block size is passed to it.

Fortunately, in modern kernels the crypto layer will not crash the
kernel in this scenario, but nevertheless, it represents a corrupted
directory, and we should detect it and mark the file system as
corrupted so that e2fsck can fix this.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs crypto: f2fs_page_crypto() doesn't need a encryption context · ce855a3b

Committed by Jaegeuk Kim on Feb 05, 2016

This patch adopts:
	ext4 crypto: ext4_page_crypto() doesn't need a encryption context

Since ext4_page_crypto() doesn't need an encryption context (at least
not any more), this allows us to simplify a number of function signatures
and also allows us to avoid needing to allocate a context in
ext4_block_write_begin().  It also means we no longer need a separate
ext4_decrypt_one() function.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs crypto: fix spelling typo in comment · 0fac2d50

Committed by Jaegeuk Kim on Feb 05, 2016

This patch adopts:
	ext4 crypto: fix spelling typo in comment
Signed-off-by: Laurent Navet <laurent.navet@gmail.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs crypto: replace some BUG_ON()'s with error checks · 66aa3e12

Committed by Jaegeuk Kim on Feb 05, 2016

This patch adopts:
	ext4 crypto: replace some BUG_ON()'s with error checks
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: increase i_size to avoid missing data · 8ef2af45

Committed by Jaegeuk Kim on Feb 08, 2016

When finsert is done by dirtying pages, we should increase i_size right away.
Otherwise, the moved page can be dropped by the following
filemap_write_and_wait_range before i_size is updated.
In particular, this can happen via
	if ((page->index >= end_index + 1) || !offset)
		goto out;
in f2fs_write_data_page.

This should resolve the below xfstests/091 failure reported by Dave.

$ diff -u tests/generic/091.out /home/dave/src/xfstests-dev/results//f2fs/generic/091.out.bad
--- tests/generic/091.out       2014-01-20 16:57:33.000000000 +1100
+++ /home/dave/src/xfstests-dev/results//f2fs/generic/091.out.bad       2016-02-08 15:21:02.701375087 +1100
@@ -1,7 +1,18 @@
 QA output created by 091
 fsx -N 10000 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W
-fsx -N 10000 -o 8192 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W
-fsx -N 10000 -o 32768 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W
-fsx -N 10000 -o 8192 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W
-fsx -N 10000 -o 32768 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W
-fsx -N 10000 -o 128000 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -W
+mapped writes DISABLED
+skipping insert range behind EOF
+skipping insert range behind EOF
+truncating to largest ever: 0x11e00
+dowrite: write: Invalid argument
+LOG DUMP (7 total operations):
+1(  1 mod 256): SKIPPED (no operation)
+2(  2 mod 256): SKIPPED (no operation)
+3(  3 mod 256): FALLOC   0x2e0f2 thru 0x3134a  (0x3258 bytes) PAST_EOF
+4(  4 mod 256): SKIPPED (no operation)
+5(  5 mod 256): SKIPPED (no operation)
+6(  6 mod 256): TRUNCATE UP    from 0x0 to 0x11e00
+7(  7 mod 256): WRITE    0x73400 thru 0x79fff  (0x6c00 bytes) HOLE
+Log of operations saved to "/mnt/test/junk.fsxops"; replay with --replay-ops
+Correct content saved for comparison
+(maybe hexdump "/mnt/test/junk" vs "/mnt/test/junk.fsxgood")
Reported-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
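To see why a stale i_size lets writeback drop the moved page in 8ef2af45, the quoted check can be modelled in userspace as below. The 4 KiB page size, the `index < end_index` fast path and the name `page_skipped_by_writeback` are assumptions of this sketch; the offsets in the usage note are taken from the fsx log above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT_SKETCH 12   /* assumes 4 KiB pages */
#define PAGE_SIZE_SKETCH  (UINT64_C(1) << PAGE_SHIFT_SKETCH)

/*
 * Returns true when a dirty page at `index` would be skipped by a
 * writeback check of the quoted form, given the current i_size.
 */
bool page_skipped_by_writeback(int64_t i_size, uint64_t index)
{
	uint64_t end_index, offset;

	assert(i_size >= 0);
	end_index = (uint64_t)i_size >> PAGE_SHIFT_SKETCH;
	offset = (uint64_t)i_size & (PAGE_SIZE_SKETCH - 1);

	if (index < end_index)
		return false;   /* page lies fully inside the file */

	/* Page is past EOF, or i_size ends exactly on its boundary. */
	return index >= end_index + 1 || offset == 0;
}
```

With i_size still at 0x11e00, a page at index 0x73 (the start of the 0x73400 write in the log) is skipped; once i_size covers the write (0x7a000), the same page is written out.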

f2fs: preallocate blocks for buffered aio writes · 24b84912

Committed by Jaegeuk Kim on Feb 03, 2016

This patch preallocates data blocks for buffered aio writes.
With this patch, we can avoid redundant locking and unlocking of node pages
given consecutive aio requests.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: move dio preallocation into f2fs_file_write_iter · b439b103

Committed by Jaegeuk Kim on Feb 03, 2016

This patch moves preallocation code for direct IOs into f2fs_file_write_iter.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: fix missing skip pages info · d31c7c3f

Committed by Yunlei He on Feb 04, 2016

Fix missing skip pages info in the f2fs_writepages trace event.
Signed-off-by: Yunlei He <heyunlei@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: introduce f2fs_submit_merged_bio_cond · 0c3a5797

Committed by Chao Yu on Jan 18, 2016

f2fs uses a single bio buffer per data type (META/NODE/DATA) to cache as
many writes located at contiguous block addresses as possible. After
submitting, these writes may still be cached in the bio buffer, so we have
to flush the cached writes by calling f2fs_submit_merged_bio.

Unfortunately, under high concurrency, the bio buffer could be
flushed by someone else before we submit it, for the reasons below:
a) there is no space left in the bio buffer.
b) a request of a different type (SYNC, ASYNC) is added.
c) a discontinuous block address is added.

In this case, calling f2fs_submit_merged_bio is harmful, because
it could break the subsequent merging of writes in the bio buffer and
split one big bio into two smaller ones.

This patch introduces f2fs_submit_merged_bio_cond, which submits the bio
buffer conditionally. Before submitting, it checks whether:
 - a page in the DATA type bio buffer matches the specified page;
 - a page in the DATA type bio buffer belongs to the specified inode;
 - a page in the NODE type bio buffer belongs to the specified inode;
If there is no eligible page in the bio buffer, we skip the submitting step,
which gives more chance to merge consecutive block IOs in the bio cache.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
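The conditional flush in 0c3a5797 can be sketched in userspace as below. This is a simplified model with hypothetical names (`bio_buffer_sketch`, `buffer_add`, `submit_cond`): it keeps a single buffer and ignores the DATA/NODE distinction, and shows only the "flush only if my page or inode is cached" decision.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BUF_PAGES 16   /* arbitrary capacity for the sketch */

struct cached_page {
	unsigned long ino;     /* inode that owns the page */
	unsigned long index;   /* page index within that inode */
};

struct bio_buffer_sketch {
	struct cached_page pages[BUF_PAGES];
	size_t nr;             /* pages currently cached */
	unsigned int submits;  /* how many times the buffer was flushed */
};

/* Cache one page; returns false when the buffer is full. */
bool buffer_add(struct bio_buffer_sketch *b, unsigned long ino,
		unsigned long index)
{
	if (b->nr == BUF_PAGES)
		return false;
	b->pages[b->nr].ino = ino;
	b->pages[b->nr].index = index;
	b->nr++;
	return true;
}

/* True if a page of `ino` is cached; a non-NULL `index` must match too. */
static bool buffer_has(const struct bio_buffer_sketch *b, unsigned long ino,
		       const unsigned long *index)
{
	for (size_t i = 0; i < b->nr; i++) {
		if (b->pages[i].ino != ino)
			continue;
		if (index == NULL || b->pages[i].index == *index)
			return true;
	}
	return false;
}

/* Flush only when the caller's page or inode is actually in the buffer. */
bool submit_cond(struct bio_buffer_sketch *b, unsigned long ino,
		 const unsigned long *index)
{
	assert(b != NULL);
	if (!buffer_has(b, ino, index))
		return false;   /* leave the buffer open for more merging */
	b->nr = 0;
	b->submits++;
	return true;
}
```

A caller whose pages were already flushed by someone else gets `false` back and leaves the cached writes of other inodes untouched, so they can keep merging.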

f2fs: fix conflict on page->private usage · d48dfc20

Committed by Jaegeuk Kim on Jan 29, 2016

This patch fixes a conflict on the page->private value between f2fs_trace_pid
and atomic pages.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: flush bios to handle cp_error in put_super · 17c19120

Committed by Jaegeuk Kim on Jan 29, 2016

Sometimes, if cp_error is set, pages under writeback remain, resulting in a
kernel hang in put_super.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: wait on page's writeback in writepages path · fa3d2bdf

Committed by Jaegeuk Kim on Jan 28, 2016

As in f2fs_write_cache_pages, let's wait on page writeback for node and meta
pages too. In particular, for node blocks, we should do this before marking
their fsync and dentry flags.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: speed up handling holes in fiemap · da85985c

Committed by Chao Yu on Jan 26, 2016

This patch makes f2fs_map_blocks support returning the next potential
page offset, which skips the hole region in the indirect tree of the inode,
and uses it to speed up fiemap when handling a big hole.

Test method:
xfs_io -f /mnt/f2fs/file  -c "pwrite 1099511627776 4096"
time xfs_io -f /mnt/f2fs/file -c "fiemap -v"

Before:
time xfs_io -f /mnt/f2fs/file -c "fiemap -v"
/mnt/f2fs/file:
 EXT: FILE-OFFSET              BLOCK-RANGE      TOTAL FLAGS
   0: [0..2147483647]:         hole             2147483648
   1: [2147483648..2147483655]: 81920..81927         8   0x1

real    3m3.518s
user    0m0.000s
sys     3m3.456s

After:
time xfs_io -f /mnt/f2fs/file -c "fiemap -v"
/mnt/f2fs/file:
 EXT: FILE-OFFSET              BLOCK-RANGE      TOTAL FLAGS
   0: [0..2147483647]:         hole             2147483648
   1: [2147483648..2147483655]: 81920..81927         8   0x1

real    0m0.008s
user    0m0.000s
sys     0m0.008s
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: introduce get_next_page_offset to speed up SEEK_DATA · 3cf45747

Committed by Chao Yu on Jan 26, 2016

When seeking data in ->llseek, if we encounter a big hole which covers
several dnode pages, we try to seek data from the index of the first page
of the next dnode page, so at most we can skip searching
(ADDRS_PER_BLOCK - 1) pages.

However, this is still not efficient: if an indirect/double-indirect
pointer is NULL, there are no dnode pages in the subtree that pointer
points to, so it is not necessary to search the whole region.

This patch introduces get_next_page_offset to calculate the next page offset
based on the current searching level and the max searching level returned
from get_dnode_of_data. With this, we can skip searching the entire area
whose indirect or double-indirect node block does not exist.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
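The skipping arithmetic behind 3cf45747 can be illustrated with the userspace sketch below. It is not the kernel function: the constants (923 addresses in the inode, 1018 addresses or node ids per block), the level convention and the name `next_page_offset` are assumptions chosen to make the idea concrete.

```c
#include <assert.h>

/* Assumed geometry; the real values depend on the on-disk format. */
#define ADDRS_PER_INODE_SKETCH 923ULL
#define ADDRS_PER_BLOCK_SKETCH 1018ULL
#define NIDS_PER_BLOCK_SKETCH  1018ULL

/*
 * max_level: depth of the path needed to reach pgofs (0 = address stored
 *            in the inode itself, 3 = via the double-indirect node).
 * cur_level: depth at which the lookup found a missing node.
 * Returns the first page offset after the region the missing node covers.
 */
unsigned long long next_page_offset(int cur_level, int max_level,
				    unsigned long long pgofs)
{
	const unsigned long long direct_blks = ADDRS_PER_BLOCK_SKETCH;
	const unsigned long long indirect_blks =
		ADDRS_PER_BLOCK_SKETCH * NIDS_PER_BLOCK_SKETCH;
	unsigned long long skipped = ADDRS_PER_BLOCK_SKETCH;
	unsigned long long base = 0;
	int level;

	assert(max_level >= 0 && max_level <= 3 && cur_level <= max_level);

	if (max_level == 0)
		return pgofs + 1;   /* nothing below the inode to skip */

	/* Each level above the leaf multiplies the covered range. */
	for (level = max_level; level > cur_level; level--)
		skipped *= NIDS_PER_BLOCK_SKETCH;

	/* First page offset served by paths of this depth. */
	base += ADDRS_PER_INODE_SKETCH;
	if (max_level >= 2)
		base += 2 * direct_blks;
	if (max_level >= 3)
		base += 2 * indirect_blks;

	assert(pgofs >= base);
	return ((pgofs - base) / skipped + 1) * skipped + base;
}
```

Under these assumptions a missing direct node skips 1018 pages at once, while a missing indirect node skips 1018 * 1018 pages, which is the kind of jump that removes the long scans over large holes.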

f2fs: remove unneeded pointer conversion · 81ca7350

Committed by Chao Yu on Jan 26, 2016

There is a redundant pointer conversion in the following call stack:
 - at position a, inode is converted to f2fs_file_info.
 - at position b, f2fs_file_info is converted to inode again.

 - truncate_blocks(inode,..)
  - fi = F2FS_I(inode)		---a
  - ADDRS_PER_PAGE(node_page, fi)
   - addrs_per_inode(fi)
    - inode = &fi->vfs_inode	---b
    - f2fs_has_inline_xattr(inode)
     - fi = F2FS_I(inode)
     - is_inode_flag_set(fi,..)

In order to avoid the unneeded conversion, alter ADDRS_PER_PAGE and
addrs_per_inode to accept a parameter of inode pointer type.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
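The round trip that 81ca7350 removes comes from the usual embedded-structure pattern, sketched below in userspace. All names and the numeric values (923 addresses, 50 reserved for inline xattrs) are assumptions for illustration; `to_fi` plays the role that F2FS_I plays in the call stack above.

```c
#include <assert.h>
#include <stddef.h>

struct inode_sketch {
	unsigned long i_ino;
};

struct file_info_sketch {
	unsigned long flags;
	struct inode_sketch vfs_inode;   /* embedded VFS inode */
};

#define INLINE_XATTR_FLAG_SKETCH  0x1UL
#define ADDRS_FULL_SKETCH         923   /* assumed */
#define INLINE_XATTR_ADDRS_SKETCH 50    /* assumed */

/* Recover the containing structure from the embedded inode. */
struct file_info_sketch *to_fi(struct inode_sketch *inode)
{
	assert(inode != NULL);
	return (struct file_info_sketch *)
		((char *)inode - offsetof(struct file_info_sketch, vfs_inode));
}

/*
 * Takes the inode directly, so a caller that already holds an inode does
 * not convert inode -> fi only for the callee to convert fi -> inode.
 */
unsigned int addrs_per_inode_sketch(struct inode_sketch *inode)
{
	if (to_fi(inode)->flags & INLINE_XATTR_FLAG_SKETCH)
		return ADDRS_FULL_SKETCH - INLINE_XATTR_ADDRS_SKETCH;
	return ADDRS_FULL_SKETCH;
}
```

Both conversions are cheap pointer arithmetic, so the change is mainly about readability and a consistent parameter type rather than measurable speed.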

f2fs: simplify __allocate_data_blocks · 5b8db7fa

Committed by Chao Yu on Jan 26, 2016

This patch uses the existing function f2fs_map_blocks to simplify the
implementation of __allocate_data_blocks.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: simplify f2fs_map_blocks · 4fe71e88

Committed by Chao Yu on Jan 26, 2016

In f2fs_map_blocks, we use duplicated code to handle the first block mapping
and the following block mappings, which is unnecessary. This patch simplifies
f2fs_map_blocks to avoid the duplicated code.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: introduce lifetime write IO statistics · 8f1dbbbb

Committed by Shuoran Liu on Jan 27, 2016

This patch introduces lifetime write IO statistics exposed through the sysfs
interface. The write IO amount is obtained from the block layer, accumulated
in the file system and stored in the hot node summary of the checkpoint.
Signed-off-by: Shuoran Liu <liushuoran@huawei.com>
Signed-off-by: Pengyang Hou <houpengyang@huawei.com>
[Jaegeuk Kim: add sysfs documentation]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: give scheduling point in shrinking path · 6fe2bc95

Committed by Jaegeuk Kim on Jan 20, 2016

We need to give the task a chance to be rescheduled while shrinking slab
entries.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: improve shrink performance of extent nodes · 201ef5e0

Committed by Hou Pengyang on Jan 26, 2016

In the worst case, we need to scan the whole radix tree and every rb-tree to
free the victim extent_nodes when shrinking.

Pengyang initially introduced a victim_list to record the victim extent_nodes,
and freed these extent_nodes by just scanning a list.

Later, Chao Yu enhanced the original patch to improve the memory footprint by
removing the victim list.

The policy of lru list shrinking becomes:
1) lock lru list's lock
2) trylock extent tree's lock
3) remove extent node from lru list
4) unlock lru list's lock
5) do shrink
6) repeat 1) to 5)
Signed-off-by: Hou Pengyang <houpengyang@huawei.com>
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: don't set cached_en if it will be freed · 42926744

Committed by Jaegeuk Kim on Jan 26, 2016

If en has an empty list pointer, it will be freed soon, so we don't need to
set cached_en to it.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: move extent_node list operations being coupled with rbtree operation · 43a2fa18

Committed by Jaegeuk Kim on Jan 26, 2016

This patch moves extent_node list operations to be handled together with
its rbtree operations.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: reconstruct the code to free an extent_node · a03f01f2

Committed by Hou Pengyang on Jan 26, 2016

There are three steps to free an extent node:
1) list_del_init, 2) __detach_extent_node, 3) kmem_cache_free

In the f2fs_destroy_extent_tree path, a node is freed in the order 1->2->3,
but in the f2fs_update_extent_tree_range path, the order is 2->1->3.

This patch makes the order 1->2->3 everywhere.
This makes sense because the next patch introduces a victim list in the
shrink_extent_tree path, where we can check whether an extent_node is in the
victim list by checking list_empty(). So it is necessary to put 1) first.
Signed-off-by: Hou Pengyang <houpengyang@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
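The fixed 1->2->3 order from a03f01f2 can be sketched in userspace with a minimal circular list. The list helpers below are hand-written stand-ins for the kernel's list API, `in_tree` stands in for the rb-tree linkage, and malloc/free stand in for the slab cache; all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct list_node {
	struct list_node *prev, *next;
};

void list_init(struct list_node *n)
{
	n->prev = n->next = n;
}

bool list_is_empty(const struct list_node *n)
{
	return n->next == n;
}

void list_add_node(struct list_node *n, struct list_node *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

/* Unlink and re-initialise, so list_is_empty(n) is true afterwards. */
void list_del_init_node(struct list_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}

struct extent_node_sketch {
	struct list_node list;   /* LRU linkage */
	bool in_tree;            /* stands in for the rb-tree linkage */
};

struct extent_node_sketch *node_create(struct list_node *lru)
{
	struct extent_node_sketch *en = malloc(sizeof(*en));

	if (en == NULL)
		return NULL;
	list_init(&en->list);
	list_add_node(&en->list, lru);
	en->in_tree = true;
	return en;
}

/* Free in the fixed order: 1) unlink, 2) detach from tree, 3) free. */
void node_release(struct extent_node_sketch *en)
{
	list_del_init_node(&en->list);       /* step 1 */
	assert(list_is_empty(&en->list));    /* observable before step 2 */
	en->in_tree = false;                 /* step 2 */
	free(en);                            /* step 3 */
}
```

Because step 1 always runs first, any code that later inspects the node's list linkage sees an empty list as soon as the node is on its way out.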

f2fs: use wq_has_sleeper for cp_wait wait_queue · 7c506896

Committed by Jaegeuk Kim on Jan 26, 2016

We need to use wq_has_sleeper, which includes smp_mb, to handle cp_wait
concurrency.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: avoid unnecessary search while finding victim in gc · 688159b6

Committed by Fan Li on Feb 03, 2016

The variable nsearched in get_victim_by_default() indicates the number of
dirty segments we have already checked. There are 2 problems with the way
it is updated:
1. When p.ofs_unit is greater than 1, the victim we find consists
   of multiple segments, possibly more than 1 dirty segment.
   But nsearched always increases by 1.
2. If segments have been found but not chosen, nsearched won't
   increase. So even if we have checked all dirty segments, nsearched
   may still be less than p.max_search.
Both problems could cause unnecessary searching after all dirty
segments have already been checked.
Signed-off-by: Fan li <fanofcode.li@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
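The first counting problem in 688159b6 can be reproduced with the toy scan below. It is a simplified linear model, not get_victim_by_default(): `segments_scanned` and its parameters are hypothetical, and max_search is taken to be the total number of dirty segments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Scan the dirty map unit by unit and stop once nsearched reaches the
 * number of dirty segments. Returns how many segment slots were looked
 * at. With count_every_dirty == false, nsearched grows by 1 per unit
 * (the old accounting); with true, by the dirty segments in the unit.
 */
size_t segments_scanned(const bool *dirty, size_t nsegs, size_t ofs_unit,
			bool count_every_dirty)
{
	size_t max_search = 0, nsearched = 0, scanned = 0, start, i;

	assert(dirty != NULL && ofs_unit > 0);

	for (i = 0; i < nsegs; i++)
		max_search += dirty[i];

	for (start = 0; start < nsegs && nsearched < max_search;
	     start += ofs_unit) {
		size_t in_unit = 0;

		for (i = start; i < start + ofs_unit && i < nsegs; i++) {
			in_unit += dirty[i];
			scanned++;
		}
		if (in_unit)
			nsearched += count_every_dirty ? in_unit : 1;
	}
	return scanned;
}
```

With 16 segments, units of 4 and three dirty segments all in the first unit, per-unit counting walks all 16 slots, while per-segment counting stops after the first 4.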

f2fs: delete unnecessary wait for page writeback · 85ead818

Committed by Yunlei He on Feb 03, 2016

There is no need to wait for writeback of an inline file page, because no
one uses it, so this patch deletes the unnecessary wait.
Signed-off-by: Yunlei He <heyunlei@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: use wait_for_stable_page to avoid contention · fec1d657

Committed by Jaegeuk Kim on Jan 20, 2016

In write_begin, if storage supports stable_page, we don't need to wait for
writeback to update its contents.
This patch uses wait_for_stable_page instead of
wait_on_page_writeback.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: enhance foreground GC · 718e53fa

Committed by Chao Yu on Jan 23, 2016

If a section is configured to consist of multiple segments, foreground GC
does the garbage collection with the following approach:

	for each segment in victim section
		blk_start_plug
		for each valid block in segment
			write out by OPU method
		submit bio cache   <---
		blk_finish_plug   <---

There are two issues:
1) most of the time, 'submit bio cache' breaks the merging in the
current bio buffer with writes of the next segments, so a smaller bio is
submitted.
2) the block plug only covers IO submission in one segment, which reduces
the opportunity to merge IOs in the plug across multiple segments.

So refactor the code into the structure below to maximize the
opportunity of merging IOs:

	blk_start_plug
	for each segment in victim section
		for each valid block in segment
			write out by OPU method
	submit bio cache
	blk_finish_plug

Test method:
1. mkfs.f2fs -s 8 /dev/sdX
2. touch 32 files
3. write 2M data into each file
4. punch 1.5M data from offset 0 for each file
5. trigger foreground gc through ioctl

Before the patch, a total of 40 bios are submitted.
f2fs_submit_write_bio: dev = (8,32), WRITE_SYNC, DATA, sector = 65536, size = 122880
f2fs_submit_write_bio: dev = (8,32), WRITE_SYNC, DATA, sector = 65776, size = 122880
f2fs_submit_write_bio: dev = (8,32), WRITE_SYNC, DATA, sector = 66016, size = 122880
f2fs_submit_write_bio: dev = (8,32), WRITE_SYNC, DATA, sector = 66256, size = 122880
f2fs_submit_write_bio: dev = (8,32), WRITE_SYNC, DATA, sector = 66496, size = 32768
----repeat for 8 times

After the patch, a total of 35 bios are submitted.
f2fs_submit_write_bio: dev = (8,32), WRITE_SYNC, DATA, sector = 65536, size = 122880
----repeat 34 times
f2fs_submit_write_bio: dev = (8,32), WRITE_SYNC, DATA, sector = 73696, size = 16384
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
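The bio counts in the trace of 718e53fa can be reproduced with simple arithmetic, sketched below. The model assumes 4 KiB pages, so a 122880-byte bio holds 30 pages and each group of five bios before the patch carries 4 * 122880 + 32768 = 524288 bytes, i.e. 128 pages per segment; it also assumes all target blocks are contiguous. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Bios needed for `pages` contiguous pages with at most `cap` per bio. */
static size_t bios_for(size_t pages, size_t cap)
{
	assert(cap > 0);
	return (pages + cap - 1) / cap;
}

/* Old flow: the bio cache is submitted after every segment. */
size_t bios_flush_per_segment(const size_t *valid, size_t nsegs, size_t cap)
{
	size_t total = 0;

	for (size_t i = 0; i < nsegs; i++)
		total += bios_for(valid[i], cap);
	return total;
}

/* New flow: one submit per section, so writes merge across segments. */
size_t bios_flush_per_section(const size_t *valid, size_t nsegs, size_t cap)
{
	size_t pages = 0;

	for (size_t i = 0; i < nsegs; i++)
		pages += valid[i];
	return bios_for(pages, cap);
}
```

For 8 segments of 128 valid pages and 30 pages per bio, the per-segment flow gives 8 * 5 = 40 bios and the per-section flow gives 35, matching the totals reported above.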
