Commits · 0f7b2abd188089a44f60e2bf8521d1363ada9e12 · openanolis / cloud-kernel

29 Jul, 2014 1 commit

f2fs: add nobarrier mount option · 0f7b2abd

Authored by Jaegeuk Kim on Jul 23, 2014

This patch adds a mount option, nobarrier, to f2fs.
The assumption here is that the file system keeps the IO ordering, but
doesn't care about cache flushes inside the storage device.

Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

12 Jul, 2014 1 commit

f2fs: remove the unused stat_lock · 4b2868aa

Authored by Gu Zheng on Jul 11, 2014

Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

10 Jul, 2014 4 commits

f2fs: arguments cleanup of finding file flow functions · eee6160f

Authored by Gu Zheng on Jun 24, 2014

Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: refactor flush_nat_entries codes for reducing NAT writes · aec71382

Authored by Chao Yu on Jun 24, 2014

Although building the NAT journal in cursum reduces the read/write work for NAT
blocks, the previous design gives lower performance when checkpoints are written
frequently in these cases:
1. if the journal in cursum is already full, it is a bit of a waste that we flush
   all nat entries to pages for persistence without caching any entries.
2. if the journal in cursum is not full, we fill nat entries into the journal until
   the journal is full, then flush the remaining dirty entries to disk without
   merging the journaled entries, so these journaled entries may be flushed to
   disk at the next checkpoint, having lost the chance to be flushed last time.

In this patch we merge dirty entries located in the same NAT block into a nat
entry set, and link all sets into a list, sorted in ascending order by each set's
entry count. Later we flush entries in sparse sets into the journal, as many as
we can, and then flush the merged entries to disk. In this way we not only gain
performance, but also extend the lifetime of the flash device.

In my testing environment, this patch clearly helps to reduce NAT block
writes. In the hard disk test case, the time cost of fsstress is stably
reduced by about 5%.

1. virtual machine + hard disk:
fsstress -p 20 -n 200 -l 5
		node num	cp count	nodes/cp
based		4599.6		1803.0		2.551
patched		2714.6		1829.6		1.483

2. virtual machine + 32g micro SD card:
fsstress -p 20 -n 200 -l 1 -w -f chown=0 -f creat=4 -f dwrite=0
-f fdatasync=4 -f fsync=4 -f link=0 -f mkdir=4 -f mknod=4 -f rename=5
-f rmdir=5 -f symlink=0 -f truncate=4 -f unlink=5 -f write=0 -S

		node num	cp count	nodes/cp
based		84.5		43.7		1.933
patched		49.2		40.0		1.23

The latency of the merging operation is acceptable even in an extreme case such
as merging a great number of dirty nats:
latency(ns)	dirty nat count
3089219		24922
5129423		27422
4000250		24523

change log from v1:
 o fix wrong logic in add_nat_entry when grabbing a new nat entry set.
 o switch to creating the slab cache in create_node_manager_caches.
 o use GFP_ATOMIC instead of GFP_NOFS to avoid potential long latency.

change log from v2:
 o make the comment position more appropriate, as suggested by Jaegeuk Kim.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
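
The flush policy described in aec71382 can be sketched in a few lines of user-space Python. This is only an illustration of the idea (group dirty entries per NAT block, sort the sets by entry count, journal the sparse sets, write the rest as whole blocks), not the kernel code. The function name `plan_nat_flush` and the two constants are assumptions chosen for the example.

```python
from collections import defaultdict

NAT_ENTRIES_PER_BLOCK = 455  # assumed number of NAT entries per NAT block
JOURNAL_CAPACITY = 38        # assumed number of free NAT journal slots


def plan_nat_flush(dirty_nids, journal_free=JOURNAL_CAPACITY):
    """Return (nids placed in the journal, NAT blocks flushed to disk)."""
    # Merge dirty entries that live in the same NAT block into one set.
    sets = defaultdict(list)
    for nid in dirty_nids:
        sets[nid // NAT_ENTRIES_PER_BLOCK].append(nid)

    # Sparse sets first: ascending by entry count, ties broken by block number.
    ordered = sorted(sets.items(), key=lambda item: (len(item[1]), item[0]))

    journaled, flushed_blocks = [], []
    for block, nids in ordered:
        if len(nids) <= journal_free:
            # The whole set still fits in the journal.
            journaled.extend(nids)
            journal_free -= len(nids)
        else:
            # Too dense for the journal: write the merged set as one block.
            flushed_blocks.append(block)
    return journaled, flushed_blocks
```

Journaling the sparse sets first means that a NAT block is written only when many of its entries are dirty, which is where the reduction in NAT block writes comes from.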

f2fs: clean up an unused parameter and assignment · a014e037

Authored by Jaegeuk Kim on Jun 20, 2014

This patch cleans up some simple unnecessary code.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: introduce f2fs_do_tmpfile for code consistency · b97a9b5d

Authored by Jaegeuk Kim on Jun 20, 2014

This patch adds f2fs_do_tmpfile to eliminate the redundant init_inode_metadata
flow.
Through this, we can provide consistent lock usage, e.g., fi->i_sem, and
this will enable better debugging.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

09 Jul, 2014 2 commits

f2fs: check lower bound nid value in check_nid_range · d6b7d4b3

Authored by Chao Yu on Jun 12, 2014

This patch adds lower bound verification for nid in check_nid_range, so that
reserved nids such as 0, node, and meta passed by the caller can be checked there.

check_nid_range can then be used in f2fs_nfs_get_inode to simplify the
code.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
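
As a rough illustration of the check added in d6b7d4b3, the sketch below rejects nids below a lower bound as well as nids at or above the upper bound. It is a user-space model, not the kernel function. The value of `ROOT_INO` (taking 0, the node inode and the meta inode to occupy nids 0 to 2) and the `max_nid` parameter are assumptions for the example.

```python
ROOT_INO = 3  # assumed first non-reserved nid: 0, node (1) and meta (2) are reserved


def check_nid_range(nid, max_nid, root_ino=ROOT_INO):
    """Return True when nid lies in the usable range [root_ino, max_nid)."""
    return root_ino <= nid < max_nid
```

With a single range check like this, a caller such as an NFS file handle decoder no longer needs its own comparisons against the reserved nids.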

f2fs: remove unused variables in f2fs_sm_info · 8bc6f60e

Authored by Chao Yu on Jun 11, 2014

Remove unused variables in struct f2fs_sm_info.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

08 Jun, 2014 1 commit

f2fs: support f2fs_fiemap · 9ab70134

Authored by Jaegeuk Kim on Jun 08, 2014

This patch links f2fs_fiemap to the generic function using get_block.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

04 Jun, 2014 2 commits

f2fs: fix to recover data written by dio · b6fe5873

Authored by Jaegeuk Kim on Jun 04, 2014

If data is overwritten through dio, f2fs previously did not leave the fsync mark
because no additional node writes occur.

Note that this patch should resolve xfstests:311.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: large volume support · 1dbe4152

Authored by Changman Lee on May 12, 2014

f2fs's cp has one page which consists of struct f2fs_checkpoint and the
version bitmaps of sit and nat. To support lots of segments, we need more
blocks for the sit bitmap. So let's arrange the sit bitmap as follows:
+-----------------+------------+
| f2fs_checkpoint | sit bitmap |
| + nat bitmap    |            |
+-----------------+------------+
0                 4k        N blocks

Signed-off-by: Changman Lee <cm224.lee@samsung.com>
[Jaegeuk Kim: simple code change for readability]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

07 May, 2014 10 commits

f2fs: fix to truncate inline data in inode page when setattr · 8aa6f1c5

Authored by Chao Yu on Apr 29, 2014

Previously we did not truncate inline data in the inode page on setattr, so the
following case could still read inline data that had already been truncated:

1. write inline data
2. ftruncate size to 0
3. ftruncate size to max inline data size
4. read from offset 0

This patch introduces truncate_inline_data() to fix this problem.

change log from v1:
 o fix a bug and do not truncate first page data after truncating inline data.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
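
The four reproduction steps from 8aa6f1c5 can be replayed from user space with a short script. This is a sketch under assumptions: `INLINE_SIZE` is a stand-in for the maximum inline data size, which is not stated here, and the file is created in the default temporary directory rather than on an f2fs mount. On a correct file system the final read returns only zero bytes, since the old data was truncated away.

```python
import os
import tempfile

INLINE_SIZE = 3000  # assumed stand-in for the maximum inline data size


def read_after_truncate_cycle(directory=None):
    """Write, truncate to 0, extend again, then read back from offset 0."""
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        os.write(fd, b"x" * 100)         # 1. write a small amount of data
        os.ftruncate(fd, 0)              # 2. truncate size to 0
        os.ftruncate(fd, INLINE_SIZE)    # 3. extend to the assumed inline size
        os.lseek(fd, 0, os.SEEK_SET)
        return os.read(fd, INLINE_SIZE)  # 4. read from offset 0
    finally:
        os.close(fd)
        os.unlink(path)
```

To exercise f2fs specifically, pass a directory on an f2fs mount with inline data enabled as `directory`; any `x` byte in the result would indicate stale inline data.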

f2fs: readahead multi pages of directory for performance · 817202d9

Authored by Chao Yu on Apr 28, 2014

We have no readahead mechanism in the ->iterate() path like the one in the
->read() path, which causes low performance when we read a large directory.
This patch adds readahead in f2fs_readdir() for better performance.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

f2fs: introduce f2fs_seek_block to support SEEK_{DATA, HOLE} in llseek · 267378d4

Authored by Chao Yu on Apr 23, 2014

In this patch we introduce f2fs_seek_block to support SEEK_{DATA,HOLE} of
lseek(2).

change log from v1:
 o fix a bug when lseek starts from the middle of a page, and fix the wrong
   calculation of the PGOFS_OF_NEXT_DNODE macro.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
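
For readers unfamiliar with SEEK_DATA and SEEK_HOLE, the sketch below shows how an application uses them through lseek(2) on a sparse file. It is a generic user-space example, not specific to f2fs. The helper name, the offsets and the use of a temporary file are assumptions, and the exact offsets returned depend on the file system (one without hole support may report data at offset 0 and the only hole at end of file).

```python
import os
import tempfile


def find_data_and_hole(data_offset=512 * 1024, size=1024 * 1024):
    """Create a sparse file and locate its first data and the following hole."""
    fd, path = tempfile.mkstemp()
    try:
        os.ftruncate(fd, size)                  # file starts as one large hole
        os.pwrite(fd, b"payload", data_offset)  # a single small data extent
        # First offset at or after 0 that contains data.
        first_data = os.lseek(fd, 0, os.SEEK_DATA)
        # First hole at or after the start of the data extent.
        next_hole = os.lseek(fd, data_offset, os.SEEK_HOLE)
        return first_data, next_hole
    finally:
        os.close(fd)
        os.unlink(path)
```

Tools that copy sparse files rely on this pair of calls to skip holes instead of reading zeros.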

f2fs: introduce help function {create,destroy}_flush_cmd_control · 2163d198

Authored by Gu Zheng on Apr 27, 2014

Introduce helper functions {create,destroy}_flush_cmd_control to clean up
the create/destroy flush merge operations.

Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

f2fs: introduce struct flush_cmd_control to wrap the flush_merge fields · a688b9d9

Authored by Gu Zheng on Apr 27, 2014

Split the flush_merge fields from sm_i, and use the new struct flush_cmd_control
to wrap them, so that we can ignore these fields if flush_merge is disabled; it
also makes the structs neater.

Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

f2fs: adjust free mem size to flush dentry blocks · 6fb03f3a

Authored by Jaegeuk Kim on Apr 16, 2014

If many dirty dentry blocks are cached but the flush condition is not reached,
we can fall into a livelock in balance_dirty_pages.
So, let's consider the memory size for the condition.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

f2fs: add available_nids to fix handling max_nid correctly · 7ee0eeab

Authored by Jaegeuk Kim on Apr 18, 2014

This patch introduces available_nids for alloc_nids() and fixes max_nid for
build_free_nids() and scan_nat_pages().

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

f2fs: add the flush_merge handle in the remount flow · 876dc59e

Authored by Gu Zheng on Apr 11, 2014

Add *remount* handling of the flush_merge option, so that users
can enable flush_merge at runtime, for example when the underlying device
handles the cache_flush command relatively slowly.

Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

f2fs: remove costly dirty_dir_inode operations · ed57c27f

Authored by Jaegeuk Kim on Apr 15, 2014

This patch removes list operations in handling dirty dir inodes.
Previously, F2FS traversed the whole list of dirty dir inodes to check whether
an inode was already on it, resulting in heavy CPU overhead.

So this patch removes such traversals by adding FI_DIRTY_DIR to
indicate whether the inode is on the list or not.
Through this simple flag, we can remove redundant operations gracefully.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
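
The idea in ed57c27f, replacing a list scan with a per-inode flag, can be modelled as follows. This is a simplified sketch, not the kernel data structures: the `Inode` class, its `dirty_dir` attribute (standing in for FI_DIRTY_DIR) and the helper name are all hypothetical.

```python
class Inode:
    """Minimal stand-in for an in-memory inode."""

    def __init__(self, ino):
        self.ino = ino
        self.dirty_dir = False  # models the FI_DIRTY_DIR flag


def add_dirty_dir(dirty_list, inode):
    """Add inode to the dirty list once; return False if it was already there."""
    if inode.dirty_dir:
        # Constant-time membership test instead of walking dirty_list.
        return False
    inode.dirty_dir = True
    dirty_list.append(inode)
    return True
```

The cost of the duplicate check no longer grows with the number of dirty directory inodes.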

f2fs: avoid to conduct roll-forward due to the remained garbage blocks · 1e87a78d

Authored by Jaegeuk Kim on Apr 15, 2014

f2fs always scans the next chain of direct node blocks.
But some garbage blocks can remain due to no discard support or
SSR triggers.
This occasionally causes recovery of wrong inodes that were in use, or BUG_ONs
due to reallocating node ids, as follows.

When mounting this f2fs image:
http://linuxtesting.org/downloads/f2fs_fault_image.zip
BUG_ON is triggered in the f2fs driver (messages below are generated on
kernel 3.13.2; for other kernels the output is similar):

kernel BUG at fs/f2fs/node.c:215!
 Call Trace:
 [<ffffffffa032ebad>] recover_inode_page+0x1fd/0x3e0 [f2fs]
 [<ffffffff811446e7>] ? __lock_page+0x67/0x70
 [<ffffffff81089990>] ? autoremove_wake_function+0x50/0x50
 [<ffffffffa0337788>] recover_fsync_data+0x1398/0x15d0 [f2fs]
 [<ffffffff812b9e5c>] ? selinux_d_instantiate+0x1c/0x20
 [<ffffffff811cb20b>] ? d_instantiate+0x5b/0x80
 [<ffffffffa0321044>] f2fs_fill_super+0xb04/0xbf0 [f2fs]
 [<ffffffff811b861e>] ? mount_bdev+0x7e/0x210
 [<ffffffff811b8769>] mount_bdev+0x1c9/0x210
 [<ffffffffa0320540>] ? validate_superblock+0x210/0x210 [f2fs]
 [<ffffffffa031cf8d>] f2fs_mount+0x1d/0x30 [f2fs]
 [<ffffffff811b9497>] mount_fs+0x47/0x1c0
 [<ffffffff81166e00>] ? __alloc_percpu+0x10/0x20
 [<ffffffff811d4032>] vfs_kern_mount+0x72/0x110
 [<ffffffff811d6763>] do_mount+0x493/0x910
 [<ffffffff811615cb>] ? strndup_user+0x5b/0x80
 [<ffffffff811d6c70>] SyS_mount+0x90/0xe0
 [<ffffffff8166f8d9>] system_call_fastpath+0x16/0x1b

Found by Linux File System Verification project (linuxtesting.org).
Reported-by: Andrey Tsyvarev <tsyvarev@ispras.ru>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

07 Apr, 2014 1 commit

f2fs: introduce f2fs_issue_flush to avoid redundant flush issue · 6b4afdd7

Authored by Jaegeuk Kim on Apr 02, 2014

Some storage devices show relatively high latencies when completing cache_flush
commands, even though their normal IO speed is fairly high. In such
a case, cache_flush commands need to be merged as much as possible to avoid
issuing them redundantly.
So, this patch introduces a mount option, "-o flush_merge", to mitigate this
overhead.

If this option is enabled by the user, F2FS merges the cache_flush commands and
then issues just one cache_flush on behalf of them. Once the single command is
finished, F2FS sends a completion signal to all the pending threads.

Note that this option can be used under a workload consisting of very intensive
concurrent fsync calls, while the storage handles cache_flush commands slowly.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
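
The merging behaviour described in 6b4afdd7 can be illustrated with a single-threaded model: every request queued so far is served by one device flush, and each waiter then receives the same result. This is a sketch only. The function name, the dictionary-based requests and the `flush_device` callback are assumptions, and the real mechanism involves kernel threads and completions rather than a plain list.

```python
def issue_merged_flush(pending, flush_device):
    """Serve all queued flush requests with one device flush; return the count."""
    batch = list(pending)
    pending.clear()
    if not batch:
        return 0
    result = flush_device()      # one cache flush on behalf of the whole batch
    for request in batch:
        request["ret"] = result  # every waiter sees the same result
        request["done"] = True   # models the completion signal
    return len(batch)
```

With N concurrent fsync callers queued, the device sees one flush instead of N, which is where the benefit on slow-flushing storage comes from.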

01 Apr, 2014 1 commit

f2fs: avoid unnecessary bio submit when wait page writeback · df0f8dc0

Authored by Chao Yu on Mar 22, 2014

This patch introduces is_merged_page() to check whether the current page is
merged in the f2fs bio cache. When the page is not in the cache, we can avoid
submitting the bio cache, which gives more chances to merge pages.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

20 Mar, 2014 3 commits

f2fs: skip unnecessary node writes during fsync · 479f40c4

Authored by Jaegeuk Kim on Mar 20, 2014

If multiple redundant fsync calls are triggered, we don't need to keep writing
the file's node pages with the fsync mark.

So, this patch adds FI_NEED_FSYNC to track whether the latest node block is
written with the fsync mark or not.
If the mark was set, a new fsync doesn't need to write a node block.
Otherwise, we should write a new node block with the mark for roll-forward
recovery.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

f2fs: introduce fi->i_sem to protect fi's info · d928bfbf

Authored by Jaegeuk Kim on Mar 20, 2014

This patch introduces fi->i_sem to protect fi's info, which includes xattr_ver,
pino, and i_nlink.
This makes it possible to remove i_mutex during f2fs_sync_file, resulting in a
performance improvement when a number of fsync calls are triggered from many
concurrent threads.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

f2fs: throttle the memory footprint with a sysfs entry · cdfc41c1

Authored by Jaegeuk Kim on Mar 19, 2014

This patch introduces ram_thresh, a sysfs entry, which controls the memory
footprint used by the free nid list and the nat cache.

Previously, the free nid list was controlled by MAX_FREE_NIDS, while the nat
cache was managed by NM_WOUT_THRESHOLD.
However, this approach cannot be applied dynamically according to the system.

So, this patch adds ram_thresh, with which users can specify the threshold in
units of 1 / 1024 of the total ram size.
For example, if the total ram size is 4GB and the value is set to 10 by default,
f2fs tries to control the number of free nids and nat caches not to consume over
10 * (4GB / 1024) = 40MB.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
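
The ram_thresh arithmetic in cdfc41c1 is easy to check by hand or in code. Evaluating the expression exactly as written in the message, 10 * (4GB / 1024), gives 40MB, since 4GB / 1024 is 4MB. The helper below is a sketch of that formula only; its name is hypothetical, and how the kernel divides this budget between the free nid list and the nat cache is not modelled.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def ram_budget_bytes(total_ram_bytes, ram_thresh=10):
    """Memory budget for ram_thresh expressed in units of 1/1024 of total RAM."""
    return ram_thresh * (total_ram_bytes // 1024)
```

For example, `ram_budget_bytes(4 * GIB) // MIB` evaluates the 4GB case from the message, and `ram_budget_bytes(1 * GIB) // MIB` shows the per-gigabyte budget at the default setting.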

18 Mar, 2014 2 commits

f2fs: introduce get_dirty_dents for readability · f8b2c1f9

Authored by Jaegeuk Kim on Mar 18, 2014

The get_dirty_dents function gives us the number of dirty dentry pages.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

f2fs: introduce f2fs_has_xattr_block for better readability · 4bc8e9bc

Authored by Chao Yu on Mar 17, 2014

This patch introduces a helper function, f2fs_has_xattr_block, for better
readability.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

12 Mar, 2014 1 commit

f2fs: introduce f2fs_has_inline_xattr for better readability · 987c7c31

Authored by Chao Yu on Mar 12, 2014

This patch introduces a helper function, f2fs_has_inline_xattr, for better
readability.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

10 Mar, 2014 1 commit

f2fs: remove the unused ctor argument of f2fs_kmem_cache_create() · e8512d2e

Authored by Gu Zheng on Mar 07, 2014

Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

27 Feb, 2014 4 commits

f2fs: use existing macro to clean up some codes · 695fd1ed

Authored by Chao Yu on Feb 27, 2014

This patch uses the existing macros F2FS_INODE/NEXT_FREE_BLKADDR to clean up
some code.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

f2fs: readahead contiguous SSA blocks for f2fs_gc · 81c1a0f1

Authored by Chao Yu on Feb 27, 2014

If there are multiple segments in one section, f2fs_gc reads their SSA blocks,
which have contiguous addresses, one by one. This may hurt performance, so let's
read ahead SSA blocks by merging multiple read requests.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

f2fs: add an sysfs entry to control the directory level · ab9fa662

Authored by Jaegeuk Kim on Feb 27, 2014

This patch adds a sysfs entry to control dir_level, which is used by large
directories.

The description of this entry is:

 dir_level                    This parameter controls the directory level to
			      support large directory. If a directory has a
			      number of files, it can reduce the file lookup
			      latency by increasing this dir_level value.
			      Otherwise, it needs to decrease this value to
			      reduce the space overhead. The default value is 0.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

f2fs: introduce large directory support · 38431545

Authored by Jaegeuk Kim on Feb 27, 2014

This patch introduces an i_dir_level field to support large directories.

Previously, f2fs maintained multi-level hash tables to find a dentry quickly
among a bunch of child dentries in a directory, and the hash tables form
the tree structure shown below.

In Documentation/filesystems/f2fs.txt,

----------------------
A : bucket
B : block
N : MAX_DIR_HASH_DEPTH
----------------------

level #0   | A(2B)
           |
level #1   | A(2B) - A(2B)
           |
level #2   | A(2B) - A(2B) - A(2B) - A(2B)
     .     |   .       .       .       .
level #N/2 | A(2B) - A(2B) - A(2B) - A(2B) - A(2B) - ... - A(2B)
     .     |   .       .       .       .
level #N   | A(4B) - A(4B) - A(4B) - A(4B) - A(4B) - ... - A(4B)

But if we can guess that a directory will hold a large number of child files,
we don't need to traverse the tree from level #0 to #N all the time.
Since the lower level tables contain a relatively small number of dentries,
the miss ratio for the target dentry is likely to be high.

In order to avoid that, we can configure the hash tables sparsely from level #0
like this.

level #0   | A(2B) - A(2B) - A(2B) - A(2B)

level #1   | A(2B) - A(2B) - A(2B) - A(2B) - A(2B) - ... - A(2B)
     .     |   .       .       .       .
level #N/2 | A(2B) - A(2B) - A(2B) - A(2B) - A(2B) - ... - A(2B)
     .     |   .       .       .       .
level #N   | A(4B) - A(4B) - A(4B) - A(4B) - A(4B) - ... - A(4B)

With this structure, we can skip the ineffective tree searches in lower level
hash tables.

This patch adds just a facility for this by introducing i_dir_level in
f2fs_inode.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
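
The two diagrams in 38431545 differ only in how many buckets each level starts with. A small sketch makes the effect of the directory level visible. The formula used here (bucket count doubling per level and capped at the midpoint of the hash depth) and the value of `MAX_DIR_HASH_DEPTH` are assumptions taken from the general f2fs directory layout description, not from this commit.

```python
MAX_DIR_HASH_DEPTH = 63  # assumed maximum hash depth (N in the diagrams)


def dir_buckets(level, dir_level=0):
    """Number of buckets in hash level `level` for a given directory level."""
    half = MAX_DIR_HASH_DEPTH // 2
    if level + dir_level < half:
        # Each level doubles the bucket count; dir_level shifts the start.
        return 1 << (level + dir_level)
    # Past the midpoint the bucket count no longer grows.
    return 1 << (half - 1)
```

With `dir_level=0` the first three levels hold 1, 2 and 4 buckets, as in the first diagram. With `dir_level=2`, level #0 already starts at 4 buckets, as in the second diagram, so the sparsely populated early levels are skipped.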

24 Feb, 2014 3 commits

f2fs: implement a lock-free stat_show · 8b8343fa

Authored by Jaegeuk Kim on Feb 24, 2014

stat_show only shows the current status of f2fs.
So, we can remove all the locks in it.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

f2fs: introduce a radix_tree for the free_nid list · 8a7ed66a

Authored by Jaegeuk Kim on Feb 21, 2014

This patch introduces a radix tree for the list of free_nids, which improves
the performance of free nid management.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

f2fs: introduce help macro on_build_free_nids() · f978f5a0

Authored by Gu Zheng on Feb 21, 2014

Introduce the helper macro on_build_free_nids(), which just uses build_lock
to judge whether building free nids is in progress, so that we can remove
the on_build_free_nids field from f2fs_sb_info.

Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
[Jaegeuk Kim: remove an unnecessary white line removal]
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

17 Feb, 2014 3 commits

f2fs: show counts of checkpoint in status · 942e0be6

Authored by Changman Lee on Feb 13, 2014

This patch shows the checkpoint counts in f2fs's status.

Signed-off-by: Changman Lee <cm224.lee@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

f2fs: introduce ra_meta_pages to readahead CP/NAT/SIT pages · 662befda

Authored by Chao Yu on Feb 07, 2014

This patch helps us clean up the readahead code by merging the
ra_{sit,nat}_pages functions into ra_meta_pages.
Additionally, the new function is used to read ahead cp blocks in
recover_orphan_inodes.

Change log from v1:
 o fix a dead loop bug pointed out by Jaegeuk Kim.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

f2fs: clean up redundant function call · 1fe54f9d

Authored by Jaegeuk Kim on Feb 07, 2014

This patch integrates inode_[inc|dec]_dirty_dents with inc_page_count to remove
redundant calls.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

openanolis / cloud-kernel synced successfully more than 1 year ago