Commits · 6b4afdd794783fe515b50838aa36591e3feea990 · openeuler / raspberrypi-kernel

07 Apr, 2014 1 commit

f2fs: introduce f2fs_issue_flush to avoid redundant flush issue · 6b4afdd7

Committed by Jaegeuk Kim on Apr 02, 2014

Some storage devices show relatively high latencies to complete cache_flush
commands, even though their normal IO speed is quite high. In such a case,
cache_flush commands need to be merged as much as possible to avoid issuing
them redundantly.
So, this patch introduces a mount option, "-o flush_merge", to mitigate this
overhead.

If this option is enabled by the user, F2FS merges the cache_flush commands and
then issues just one cache_flush on behalf of them. Once the single command is
finished, F2FS sends a completion signal to all the pending threads.

Note that this option is useful under a workload consisting of very intensive
concurrent fsync calls, while the storage handles cache_flush commands slowly.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

6b4afdd7
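
The merging behaviour described in this commit message can be pictured with a small user-space model. The sketch below is only an illustration in Python, not the kernel implementation: the `FlushMerger` class and its method names are hypothetical, and real threads and wait queues are replaced by plain lists.

```python
class FlushMerger:
    """Toy model: callers queue flush requests; one device flush serves them all."""

    def __init__(self):
        self.pending = []        # requests waiting for a cache_flush
        self.completed = []      # requests that have been signalled as done
        self.device_flushes = 0  # cache_flush commands actually issued

    def submit(self, request_id):
        # An fsync caller registers its request instead of flushing directly.
        self.pending.append(request_id)

    def dispatch(self):
        # Issue a single cache_flush on behalf of every pending request,
        # then signal completion to all of them.
        if not self.pending:
            return 0
        batch, self.pending = self.pending, []
        self.device_flushes += 1
        self.completed.extend(batch)
        return len(batch)
```

In this model, five calls to `submit()` followed by one `dispatch()` leave `device_flushes` at 1, which is the saving the option aims for when the device handles cache_flush slowly.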

02 Apr, 2014 4 commits

f2fs: fix to cover io->bio with io_rwsem · ce23447f

Committed by Jaegeuk Kim on Apr 02, 2014

In f2fs_wait_on_page_writeback, io->bio should be covered by io_rwsem.
Otherwise, the bio pointer can become a dangling pointer due to data races.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

ce23447f

f2fs: fix error path when fail to read inline data · d54c795b

Committed by Chao Yu on Mar 29, 2014

We should unlock the page in the ->readpage() path, and should also unlock and
release the page in the error path of ->write_begin(), to avoid a deadlock or
memory leak.
So let's add release code to fix the problem when we fail to read inline data.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

d54c795b

f2fs: use list_for_each_entry{_safe} for simplyfying code · 2d7b822a

Committed by Chao Yu on Mar 29, 2014

This patch uses list_for_each_entry{_safe} instead of list_for_each{_safe} to
simplify the code.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

2d7b822a

f2fs: avoid free slab cache under spinlock · cf0ee0f0

Committed by Chao Yu on Apr 02, 2014

Move kmem_cache_free out of the spinlock-protected region for better performance.

Change log from v1:
 o remove spinlock protection for kmem_cache_free in destroy_node_manager,
   as suggested by Jaegeuk Kim.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

cf0ee0f0

01 Apr, 2014 3 commits

f2fs: avoid unneeded lookup when xattr name length is too long · 6e452d69

Committed by Chao Yu on Mar 22, 2014

f2fs_setxattr already limits the attribute name length, so we should also
check it in f2fs_getxattr to avoid a useless lookup caused by an invalid name
length.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

6e452d69

f2fs: avoid unnecessary bio submit when wait page writeback · df0f8dc0

Committed by Chao Yu on Mar 22, 2014

This patch introduces is_merged_page() to check whether the current page is
merged in the f2fs bio cache. When the page is not in the cache, we can avoid
submitting the bio cache, which leaves more chances to merge pages.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

df0f8dc0

f2fs: return -EIO when node id is not matched · 3bb5e2c8

Committed by Jaegeuk Kim on Apr 01, 2014

During the cleaning of node segments, F2FS can get erroneous node blocks due to
a data race between the node page lock and its valid bitmap operations.
In that case, it needs to return an error to skip copying such an obsolete block.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

3bb5e2c8

20 Mar, 2014 8 commits

f2fs: avoid RECLAIM_FS-ON-W warning · 808a1d74

Committed by Jaegeuk Kim on Mar 20, 2014

This patch should resolve the following possible bug.

RECLAIM_FS-ON-W at:
 mark_held_locks+0xb9/0x140
 lockdep_trace_alloc+0x85/0xf0
 __kmalloc+0x53/0x1d0
 read_all_xattrs+0x3d1/0x3f0 [f2fs]
 f2fs_getxattr+0x4f/0x100 [f2fs]
 f2fs_get_acl+0x4c/0x290 [f2fs]
 get_acl+0x4f/0x80
 posix_acl_create+0x72/0x180
 f2fs_init_acl+0x29/0xcc [f2fs]
 __f2fs_add_link+0x259/0x710 [f2fs]
 f2fs_create+0xad/0x1c0 [f2fs]
 vfs_create+0xed/0x150
 do_last+0xd36/0xed0
 path_openat+0xc5/0x680
 do_filp_open+0x43/0xa0
 do_sys_open+0x13c/0x230
 SyS_creat+0x1e/0x20
 system_call_fastpath+0x16/0x1b
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

808a1d74

f2fs: skip unnecessary node writes during fsync · 479f40c4

Committed by Jaegeuk Kim on Mar 20, 2014

If multiple redundant fsync calls are triggered, we don't need to keep writing
the node pages with the fsync mark.

So, this patch adds FI_NEED_FSYNC to track whether the latest node block is
written with the fsync mark or not.
If the mark was set, a new fsync doesn't need to write a node block.
Otherwise, we should write a new node block with the mark for roll-forward
recovery.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

479f40c4
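
The skip logic in this commit message can be modelled in a few lines. The Python sketch below is an assumption-laden illustration rather than the kernel code: `InodeModel` and its `fsync_marked` boolean are hypothetical stand-ins for the per-inode state that the message says is tracked, and `node_writes` simply counts how many node blocks would be written.

```python
class InodeModel:
    """Toy model of skipping node writes on repeated fsync calls."""

    def __init__(self):
        self.fsync_marked = False  # was the latest node block written with the mark?
        self.node_writes = 0       # node blocks written so far

    def modify(self):
        # Any change to the inode makes the previously written node block stale.
        self.fsync_marked = False

    def fsync(self):
        # Skip the write when the latest node block already carries the mark.
        if self.fsync_marked:
            return False
        self.node_writes += 1
        self.fsync_marked = True
        return True
```

Calling `fsync()` three times in a row after a single `modify()` writes only one node block in this model; the second and third calls return `False`.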

f2fs: introduce fi->i_sem to protect fi's info · d928bfbf

Committed by Jaegeuk Kim on Mar 20, 2014

This patch introduces fi->i_sem to protect fi's info, which includes xattr_ver,
pino, and i_nlink.
This makes it possible to remove i_mutex during f2fs_sync_file, resulting in a
performance improvement when a number of fsync calls are triggered from many
concurrent threads.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

d928bfbf

f2fs: change reclaim rate in percentage · 58c41035

Committed by Jaegeuk Kim on Mar 19, 2014

It is more reasonable to determine the reclaiming rate of prefree segments
according to the volume size, which is set to 5% by default.
For example, if the volume is 128GB, the prefree segments are reclaimed
when their total size reaches 6.4GB.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

58c41035
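
The 128GB example in this commit message is a plain percentage calculation. The sketch below reproduces it in Python; the function name `prefree_reclaim_threshold` is hypothetical, and decimal gigabytes (10^9 bytes) are assumed so that the numbers match the message.

```python
GB = 10**9  # decimal gigabyte, assumed here to match the "6.4GB" figure


def prefree_reclaim_threshold(volume_bytes, percent=5):
    """Return the prefree size in bytes at which reclaim would start."""
    return volume_bytes * percent // 100


if __name__ == "__main__":
    limit = prefree_reclaim_threshold(128 * GB)
    print(limit / GB, "GB")
```

Because the threshold scales with the volume, a 16GB volume under the same 5% setting would start reclaiming at 0.8GB in this model.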

f2fs: remove unnecessary threshold · a5f42010

Committed by Jaegeuk Kim on Mar 19, 2014

The NM_WOUT_THRESHOLD is now obsolete since f2fs now controls the nat cache on
the basis of the memory footprint.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

a5f42010

f2fs: throttle the memory footprint with a sysfs entry · cdfc41c1

Committed by Jaegeuk Kim on Mar 19, 2014

This patch introduces ram_thresh, a sysfs entry, which controls the memory
footprint used by the free nid list and the nat cache.

Previously, the free nid list was controlled by MAX_FREE_NIDS, while the nat
cache was managed by NM_WOUT_THRESHOLD.
However, this approach cannot be applied dynamically according to the system.

So, this patch adds ram_thresh, with which users can specify the threshold in
units of 1 / 1024 of the total ram size.
For example, if the total ram size is 4GB and the value is set to 10 by default,
f2fs tries to control the number of free nids and nat caches not to consume over
10 * (4GB / 1024) = 40MB.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

cdfc41c1
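
The formula quoted in this commit message, threshold × (total RAM / 1024), can be checked directly. The Python sketch below only evaluates that expression; `footprint_limit` is a hypothetical helper name, binary units are assumed, and how the kernel splits the budget between free nids and nat caches is not modelled.

```python
GiB = 1 << 30
MiB = 1 << 20


def footprint_limit(total_ram_bytes, ram_thresh=10):
    """Evaluate ram_thresh * (total RAM / 1024), in bytes."""
    return ram_thresh * (total_ram_bytes // 1024)


if __name__ == "__main__":
    print(footprint_limit(4 * GiB) // MiB, "MiB")
```

With 4 GiB of RAM, one unit of the threshold is 4 MiB, so the default value of 10 evaluates to 40 MiB under this formula.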

f2fs: avoid to drop nat entries due to the negative nr_shrink · 40bb0058

Committed by Jaegeuk Kim on Mar 19, 2014

try_to_free_nats should not receive a negative nr_shrink.
Otherwise, its while loop can drop all the nat entries.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

40bb0058
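
Why a negative count is dangerous follows from how a countdown loop terminates. The Python sketch below assumes a loop of the form "while nr_shrink is non-zero and the cache is not empty", which is inferred from the commit message and not copied from the kernel; both function names and the guard are hypothetical.

```python
def free_entries_unguarded(cache, nr_shrink):
    """Countdown loop: a negative nr_shrink never reaches zero."""
    freed = 0
    while nr_shrink and cache:
        cache.pop()
        nr_shrink -= 1
        freed += 1
    return freed


def free_entries_guarded(cache, nr_shrink):
    """One possible guard: refuse non-positive shrink counts up front."""
    if nr_shrink <= 0:
        return 0
    return free_entries_unguarded(cache, nr_shrink)
```

In the unguarded version a count of -3 is truthy and only moves further from zero, so the loop stops solely when the cache is empty. The guarded version leaves the cache untouched for the same input.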

f2fs: call f2fs_wait_on_page_writeback instead of native function · 3cb5ad15

Committed by Jaegeuk Kim on Mar 18, 2014

If a page is under writeback, f2fs can face a deadlock during writepages.
This is caused by merging IOs inside f2fs, so when this is detected, let's submit
the merged IOs, which is implemented by f2fs_wait_on_page_writeback.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

3cb5ad15

18 Mar, 2014 8 commits

f2fs: introduce nr_pages_to_write for segment alignment · 50c8cdb3

Committed by Jaegeuk Kim on Mar 18, 2014

This patch introduces nr_pages_to_write to align page writes to the segment
or other operational unit size, which can be tuned according to the system
environment.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

50c8cdb3

f2fs: increase pages_skipped when skipping writepages · d3baf95d

Committed by Jaegeuk Kim on Mar 18, 2014

This patch increases pages_skipped when skipping writepages.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

d3baf95d

f2fs: avoid small data writes by skipping writepages · 87d6f890

Committed by Jaegeuk Kim on Mar 18, 2014

This patch introduces nr_pages_to_skip(sbi, type) to determine whether
writepages can be skipped.
The dentry, node, and meta pages can be controlled by F2FS without breaking the
FS consistency.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

87d6f890

f2fs: introduce get_dirty_dents for readability · f8b2c1f9

Committed by Jaegeuk Kim on Mar 18, 2014

The get_dirty_dents gives us the number of dirty dentry pages.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

f8b2c1f9

f2fs: fix incorrect parsing with option string · 04c09388

Committed by Chao Yu on Mar 18, 2014

Previously, 'background_gc={on***,off***}' was parsed as a correct option;
with this patch we can fix this trivial bug in the mount process.

Change log from v1:
 o need to check the length of the parameter, as suggested by Jaegeuk Kim.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

04c09388
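
The bug class described here is a prefix-only comparison that accepts trailing garbage. The Python sketch below contrasts a loose parser with one that also checks the length, as the change log suggests; both function names are hypothetical, and slicing stands in for a bounded string comparison.

```python
def parse_background_gc_loose(value):
    """Prefix-only match: 'on***' and 'off***' are wrongly accepted."""
    if value[:2] == "on":
        return True
    if value[:3] == "off":
        return False
    raise ValueError(f"invalid background_gc value: {value!r}")


def parse_background_gc_strict(value):
    """Prefix match plus a length check, so only 'on' and 'off' pass."""
    if len(value) == 2 and value[:2] == "on":
        return True
    if len(value) == 3 and value[:3] == "off":
        return False
    raise ValueError(f"invalid background_gc value: {value!r}")
```

With these definitions, `parse_background_gc_loose("on***")` returns `True`, while the strict variant raises `ValueError` for the same string and still accepts the two valid spellings.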

f2fs: avoid to return incorrect errno of read_normal_summaries · e4fc5fbf

Committed by Chao Yu on Mar 17, 2014

We should return the error number of read_normal_summaries instead of -EINVAL
when read_normal_summaries fails.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

e4fc5fbf

f2fs: introduce f2fs_has_xattr_block for better readability · 4bc8e9bc

Committed by Chao Yu on Mar 17, 2014

This patch introduces a helper function f2fs_has_xattr_block for better
readability.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

4bc8e9bc

f2fs: print type for each segment in segment_info's show · 90aa6dc9

Committed by Chao Yu on Mar 17, 2014

The original segment_info's show looks out-of-format:
cat /proc/fs/f2fs/loop0/segment_info
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 512
512 512 512 512 512 512 512 0 0 512
348 0 263 0 0 512 0 0 512 512
512 512 0 512 512 512 512 512 512 512
512 512 511 328 512 512 512 512 512 512
512 512 512 512 512 512 512 0 0 175

Let's fix this and show type for each segment.
cat /proc/fs/f2fs/loop0/segment_info
format: segment_type|valid_blocks
segment_type(0:HD, 1:WD, 2:CD, 3:HN, 4:WN, 5:CN)
0    2|0   1|0   0|0   0|0   0|0   0|0   0|0   0|0   0|0   0|0
10   0|0   0|0   0|0   0|0   0|0   0|0   0|0   0|0   0|0   0|0
20   0|0   0|0   0|0   0|0   0|0   0|0   0|0   0|0   0|0   0|0
30   0|0   0|0   0|0   0|0   0|0   0|0   0|0   0|0   0|0   0|0
40   0|0   0|0   0|0   0|0   0|0   0|0   0|0   0|0   0|0   0|0
50   3|0   3|0   3|0   3|0   3|0   3|0   3|0   0|0   3|0   3|0
60   3|0   3|0   3|0   3|0   3|0   3|0   3|0   3|0   3|0   3|512
70   3|512 3|512 3|512 3|512 3|512 3|512 3|512 3|0   3|0   3|512
80   3|0   3|0   3|0   3|0   3|0   3|512 3|0   3|0   3|512 3|512
90   3|512 0|512 3|274 0|512 0|512 0|512 0|512 0|512 0|512 3|512
100  3|512 0|512 3|511 0|328 3|512 0|512 0|512 3|512 0|512 0|512
110  0|512 0|512 0|512 0|512 0|512 0|512 0|512 5|0   4|0   3|512
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

90aa6dc9
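
The new layout shown in this commit message (a row index, then ten `segment_type|valid_blocks` cells per row) can be reproduced with a short formatter. The Python sketch below infers the column widths from the sample output only; `format_segment_info` is a hypothetical name and the widths are assumptions, not taken from the kernel source.

```python
def format_segment_info(segments, per_row=10):
    """Format (segment_type, valid_blocks) pairs, ten cells per row.

    Each row starts with the index of its first segment, left-aligned in
    five columns; each cell pads valid_blocks to three columns.
    """
    lines = []
    for start in range(0, len(segments), per_row):
        row = segments[start:start + per_row]
        cells = [f"{seg_type}|{valid:<3}" for seg_type, valid in row]
        lines.append((f"{start:<5}" + " ".join(cells)).rstrip())
    return lines


if __name__ == "__main__":
    sample = [(2, 0), (1, 0)] + [(0, 0)] * 8 + [(3, 512)] * 10
    for line in format_segment_info(sample):
        print(line)
```

Prefixing every row with its starting segment number is what makes a given segment easy to locate, which the earlier plain grid of block counts did not allow.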

12 Mar, 2014 2 commits

f2fs: check upper bound of ino value in f2fs_nfs_get_inode · 910bb12d

Committed by Chao Yu on Mar 12, 2014

An upper bound check of ino should be added to f2fs_nfs_get_inode, so that the
unneeded processing before do_read_inode in f2fs_iget can be avoided when ino is
invalid.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

910bb12d

f2fs: introduce f2fs_has_inline_xattr for better readability · 987c7c31

Committed by Chao Yu on Mar 12, 2014

This patch introduces a helper function f2fs_has_inline_xattr for better
readability.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

987c7c31

11 Mar, 2014 1 commit

f2fs: recover inline xattr data in roll-forward process · 28cdce04

Committed by Chao Yu on Mar 11, 2014

Previously, we did not recover the inline xattr data of an inode after a
power-cut, so inline xattr data could be lost.
We should recover the data during the roll-forward process.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

28cdce04

10 Mar, 2014 4 commits

f2fs: optimize restore_node_summary slightly · d653788a

Committed by Gu Zheng on Mar 07, 2014

Previously, we used ra_sum_pages to pre-read as many contiguous pages as
possible, and if we failed to allocate more pages, an ENOMEM error was
reported upstream, even though some pages had already been allocated. In fact,
we can use the available pages to do part of the job, and continue with the
rest in the following cycle, reporting ENOMEM upstream only if we really
cannot allocate any page.

Another fix is to skip the remaining pages if an EIO occurs when reading a
page from page_list.
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Reviewed-by: Chao Yu <chao2.yu@samsung.com>
[Jaegeuk Kim: modify the flow for neater code]
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

d653788a
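
The "use what was allocated, fail only on zero" policy from this commit message can be sketched as a loop over batches. The Python model below is an illustration under stated assumptions: `read_in_batches` and the `alloc` callback are hypothetical, `alloc(n)` returns how many of the `n` requested pages it could provide, and `MemoryError` stands in for ENOMEM.

```python
def read_in_batches(total_pages, max_batch, alloc):
    """Process total_pages in rounds, tolerating partial allocations.

    Returns the number of rounds needed. Raises MemoryError only when a
    round cannot allocate a single page.
    """
    done = 0
    rounds = 0
    while done < total_pages:
        want = min(max_batch, total_pages - done)
        got = alloc(want)
        if got == 0:
            raise MemoryError("no page could be allocated")
        done += got  # work with whatever was allocated, retry for the rest
        rounds += 1
    return rounds
```

With an allocator that never yields more than three pages, ten pages requested in batches of eight complete in four rounds in this model, where a strict all-or-nothing policy would have failed on the first round.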

f2fs: format segment_info's show for better legibility · 46c04366

Committed by Gu Zheng on Mar 07, 2014

The original segment_info's show is a bit out-of-format:

[root@guz Demoes]# cat /proc/fs/f2fs/loop0/segment_info
0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
......
0 0 0 0 0 0 0 0 0 0
0 0 1 0 0 1 [root@guz Demoes]#

so we fix it here for better legibility.
[root@guz Demoes]# cat /proc/fs/f2fs/loop0/segment_info
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
......
0 0 0 0 0 0 0 0 0 0
0 0 1 0 0 1
[root@guz Demoes]#
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

46c04366

f2fs: remove the unused ctor argument of f2fs_kmem_cache_create() · e8512d2e

Committed by Gu Zheng on Mar 07, 2014

Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

e8512d2e

f2fs: update start nid only once each circle · b6ce391e

Committed by Gu Zheng on Mar 07, 2014

Integrated a couple of minor changes for better readability suggested by
Chao Yu.
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

b6ce391e

05 Mar, 2014 1 commit

f2fs: fix wrong kernel coding style · 20f70751

Committed by Jaegeuk Kim on Mar 05, 2014

This patch includes a simple fix to adjust coding style.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

20f70751

03 Mar, 2014 1 commit

f2fs: fix to write node pages with WRITE_SYNC · c81bf1c8

Committed by Jaegeuk Kim on Mar 03, 2014

This patch fixes a performance regression of dbench reported by
Alex <hbx7d@yandex.com>.

This issue was revealed by Phoronix test results:
http://www.phoronix.com/scan.php?page=article&item=linux_314_ssdfs&num=2

It turns out that we need to assign WRITE_SYNC to the node writes, if
fsync is triggered.

The performance numbers, measured by Alex, are as follows.
1. 355MB/s       ext4
2. 225MB/s       f2fs : WRITE for node writes
3. 525MB/s       f2fs : WRITE_SYNC for node writes

Reported-And-Tested-by: Alex <hbx7d@yandex.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

c81bf1c8

28 Feb, 2014 1 commit

f2fs: fix dirty page accounting when redirty · 9cf3c389

Committed by Chao Yu on Feb 28, 2014

We should de-account the dirty counters for a page when it is redirtied in
->writepage().

Wu Fengguang described in 'commit 971767ca':
"writeback: fix dirtied pages accounting on redirty
De-account the accumulative dirty counters on page redirty.

Page redirties (very common in ext4) will introduce mismatch between
counters (a) and (b)

a) NR_DIRTIED, BDI_DIRTIED, tsk->nr_dirtied
b) NR_WRITTEN, BDI_WRITTEN

This will introduce systematic errors in balanced_rate and result in
dirty page position errors (ie. the dirty pages are no longer balanced
around the global/bdi setpoints)."
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

9cf3c389

27 Feb, 2014 5 commits

f2fs: use existing macro to clean up some codes · 695fd1ed

Committed by Chao Yu on Feb 27, 2014

This patch uses the existing macros F2FS_INODE/NEXT_FREE_BLKADDR to clean up
some code.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

695fd1ed

f2fs: readahead contiguous SSA blocks for f2fs_gc · 81c1a0f1

Committed by Chao Yu on Feb 27, 2014

If there are multiple segments in one section, f2fs_gc reads their SSA blocks,
which have contiguous addresses, one by one. This may hurt performance, so let's
read ahead the SSA blocks by merging the read requests.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

81c1a0f1

f2fs: add an sysfs entry to control the directory level · ab9fa662

Committed by Jaegeuk Kim on Feb 27, 2014

This patch adds a sysfs entry to control dir_level used by large directories.

The description of this entry is:

 dir_level                    This parameter controls the directory level to
			      support large directory. If a directory has a
			      number of files, it can reduce the file lookup
			      latency by increasing this dir_level value.
			      Otherwise, it needs to decrease this value to
			      reduce the space overhead. The default value is 0.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

ab9fa662

f2fs: introduce large directory support · 38431545

Committed by Jaegeuk Kim on Feb 27, 2014

This patch introduces an i_dir_level field to support large directories.

Previously, f2fs maintained multi-level hash tables to find a dentry quickly
from a bunch of child dentries in a directory, and the hash tables form the
tree structure shown below.

In Documentation/filesystems/f2fs.txt,

----------------------
A : bucket
B : block
N : MAX_DIR_HASH_DEPTH
----------------------

level #0   | A(2B)
           |
level #1   | A(2B) - A(2B)
           |
level #2   | A(2B) - A(2B) - A(2B) - A(2B)
     .     |   .       .       .       .
level #N/2 | A(2B) - A(2B) - A(2B) - A(2B) - A(2B) - ... - A(2B)
     .     |   .       .       .       .
level #N   | A(4B) - A(4B) - A(4B) - A(4B) - A(4B) - ... - A(4B)

But if we can guess that a directory will handle a large number of child files,
we don't need to traverse the tree from level #0 to #N all the time.
Since the lower level tables contain a relatively small number of dentries,
the miss ratio of the target dentry is likely to be high.

In order to avoid that, we can configure the hash tables sparsely from level #0
like this.

level #0   | A(2B) - A(2B) - A(2B) - A(2B)

level #1   | A(2B) - A(2B) - A(2B) - A(2B) - A(2B) - ... - A(2B)
     .     |   .       .       .       .
level #N/2 | A(2B) - A(2B) - A(2B) - A(2B) - A(2B) - ... - A(2B)
     .     |   .       .       .       .
level #N   | A(4B) - A(4B) - A(4B) - A(4B) - A(4B) - ... - A(4B)

With this structure, we can skip the ineffective tree searches in lower level
hash tables.

This patch adds just a facility for this by introducing i_dir_level in
f2fs_inode.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

38431545
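
The two diagrams in this commit message differ only in how many buckets level #0 starts with. The Python sketch below models that by shifting the level by a `dir_level` offset; the function name, the value 63 for `MAX_DIR_HASH_DEPTH`, and the cap applied to the upper levels are assumptions made for illustration, not taken from this page.

```python
MAX_DIR_HASH_DEPTH = 63  # assumed value, for illustration only


def dir_buckets(level, dir_level=0, max_depth=MAX_DIR_HASH_DEPTH):
    """Number of buckets at a hash level when level #0 is widened by dir_level."""
    half = max_depth // 2
    if level + dir_level < half:
        return 1 << (level + dir_level)
    return 1 << (half - 1)  # assumed cap for the upper levels


if __name__ == "__main__":
    for dir_level in (0, 2):
        print(dir_level, [dir_buckets(lv, dir_level) for lv in range(4)])
```

With `dir_level=0` the first levels hold 1, 2, and 4 buckets as in the first diagram; with `dir_level=2`, level #0 already holds 4 buckets as in the second, so a lookup in a large directory skips the small, mostly-missing tables.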

f2fs: remove costly bit operations for f2fs_find_entry · 5d0c6671

Committed by Jaegeuk Kim on Feb 27, 2014

It turns out that a bit operation like find_next_bit is not always fast enough
for f2fs_find_entry.
Instead, it is simpler and faster to traverse each dentry.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

5d0c6671

24 Feb, 2014 1 commit

f2fs: implement a lock-free stat_show · 8b8343fa

Committed by Jaegeuk Kim on Feb 24, 2014

The stat_show is just to show the current status of f2fs.
So, we can remove all the locks therein.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

8b8343fa