提交 · 1b3e27a92ab60452b8fbb35e3ba691ac34f2c0fb · openanolis / cloud-kernel

11 4月, 2015 3 次提交

f2fs: preserve extent info for extent cache · 0bdee482

由 Chao Yu 提交于 3月 19, 2015

This patch tries to preserve last extent info in extent tree cache into on-disk
inode, so this can help us to reuse the last extent info next time for
performance.
Signed-off-by: NChao Yu <chao2.yu@samsung.com>
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

0bdee482

f2fs: initialize extent tree with on-disk extent info of inode · 028a41e8

由 Chao Yu 提交于 3月 19, 2015

With normal extent info cache, we records largest extent mapping between logical
block and physical block into extent info, and we persist extent info in on-disk
inode.

When we enable extent tree cache, if extent info of on-disk inode is exist, and
the extent is not a small fragmented mapping extent. We'd better to load the
extent info into extent tree cache when inode is loaded. By this way we can have
more chance to hit extent tree cache rather than taking more time to read dnode
page for block address.
Signed-off-by: NChao Yu <chao2.yu@samsung.com>
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

028a41e8

f2fs: avoid punch_hole overhead when releasing volatile data · 3c6c2beb

由 Jaegeuk Kim 提交于 3月 17, 2015

This patch is to avoid some punch_hole overhead when releasing volatile data.
If volatile data was not written yet, we just can make the first page as zero.
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

3c6c2beb

04 3月, 2015 2 次提交

f2fs: enable rb-tree extent cache · 1dcc336b

由 Chao Yu 提交于 2月 05, 2015

This patch enables rb-tree based extent cache in f2fs.

When we mount with "-o extent_cache", f2fs will try to add recently accessed
page-block mappings into rb-tree based extent cache as much as possible, instead
of original one extent info cache.

By this way, f2fs can support more effective cache between dnode page cache and
disk. It will supply high hit ratio in the cache with fewer memory when dnode
page cache are reclaimed in environment of low memory.

Storage: Sandisk sd card 64g
1.append write file (offset: 0, size: 128M);
2.override write file (offset: 2M, size: 1M);
3.override write file (offset: 4M, size: 1M);
...
4.override write file (offset: 48M, size: 1M);
...
5.override write file (offset: 112M, size: 1M);
6.sync
7.echo 3 > /proc/sys/vm/drop_caches
8.read file (size:128M, unit: 4k, count: 32768)
(time dd if=/mnt/f2fs/128m bs=4k count=32768)

Extent Hit Ratio:
		before		patched
Hit Ratio	121 / 1071	1071 / 1071

Performance:
		before		patched
real    	0m37.051s	0m35.556s
user    	0m0.040s	0m0.026s
sys     	0m2.990s	0m2.251s

Memory Cost:
		before		patched
Tree Count:	0		1 (size: 24 bytes)
Node Count:	0		45 (size: 1440 bytes)

v3:
 o retest and given more details of test result.
Signed-off-by: NChao Yu <chao2.yu@samsung.com>
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

1dcc336b

f2fs: move ext_lock out of struct extent_info · 0c872e2d

由 Chao Yu 提交于 2月 05, 2015

Move ext_lock out of struct extent_info, then in the following patches we can
use variables with struct extent_info type as a parameter to pass pure data.
Signed-off-by: NChao Yu <chao2.yu@samsung.com>
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

0c872e2d

10 1月, 2015 2 次提交

f2fs: get rid of kzalloc in __recover_inline_status · 9e5ba77f

由 Chao Yu 提交于 1月 06, 2015

We use kzalloc to allocate memory in __recover_inline_status, and use this
all-zero memory to check the inline date content of inode page by comparing
them. This is low effective and not needed, let's check inline date content
directly.
Signed-off-by: NChao Yu <chao2.yu@samsung.com>
[Jaegeuk Kim: make the code more neat]
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

9e5ba77f

f2fs: change atomic and volatile write policies · 1e84371f

由 Jaegeuk Kim 提交于 12月 09, 2014

This patch adds two new ioctls to release inmemory pages grabbed by atomic
writes.
 o f2fs_ioc_abort_volatile_write
  - If transaction was failed, all the grabbed pages and data should be written.
 o f2fs_ioc_release_volatile_write
  - This is to enhance the performance of PERSIST mode in sqlite.

In order to avoid huge memory consumption which causes OOM, this patch changes
volatile writes to use normal dirty pages, instead blocked flushing to the disk
as long as system does not suffer from memory pressure.
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

1e84371f

09 12月, 2014 1 次提交

f2fs: count inline_xx in do_read_inode · 9d1015dd

由 Jaegeuk Kim 提交于 12月 05, 2014

In do_read_inode, if we failed __recover_inline_status, the inode has inline
flag without increasing its count.
Later, f2fs_evict_inode will decrease the count, which causes -1.
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

9d1015dd

05 11月, 2014 1 次提交

f2fs: revisit inline_data to avoid data races and potential bugs · b3d208f9

由 Jaegeuk Kim 提交于 10月 23, 2014

This patch simplifies the inline_data usage with the following rule.
1. inline_data is set during the file creation.
2. If new data is requested to be written ranges out of inline_data,
 f2fs converts that inode permanently.
3. There is no cases which converts non-inline_data inode to inline_data.
4. The inline_data flag should be changed under inode page lock.
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

b3d208f9

04 11月, 2014 3 次提交

f2fs: fix counting inline_data inode numbers · e7a2bf22

由 Jaegeuk Kim 提交于 10月 14, 2014

This patch fixes wrongly counting inline_data inode numbers.
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

e7a2bf22

f2fs: add stat info for inline_dentry inodes · 3289c061

由 Jaegeuk Kim 提交于 10月 13, 2014

This patch adds status information for inline_dentry inodes.
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

3289c061

f2fs: use highmem for directory pages · a78186eb

由 Jaegeuk Kim 提交于 10月 17, 2014

This patch fixes to use highmem for directory pages.
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

a78186eb

08 10月, 2014 1 次提交

f2fs: support volatile operations for transient data · 02a1335f

由 Jaegeuk Kim 提交于 10月 06, 2014

This patch adds support for volatile writes which keep data pages in memory
until f2fs_evict_inode is called by iput.

For instance, we can use this feature for the sqlite database as follows.
While supporting atomic writes for main database file, we can keep its journal
data temporarily in the page cache by the following sequence.

1. open
 -> ioctl(F2FS_IOC_START_VOLATILE_WRITE);
2. writes
 : keep all the data in the page cache.
3. flush to the database file with atomic writes
  a. ioctl(F2FS_IOC_START_ATOMIC_WRITE);
  b. writes
  c. ioctl(F2FS_IOC_COMMIT_ATOMIC_WRITE);
4. close
 -> drop the cached data
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

02a1335f

07 10月, 2014 1 次提交

f2fs: support atomic writes · 88b88a66

由 Jaegeuk Kim 提交于 10月 06, 2014

This patch introduces a very limited functionality for atomic write support.
In order to support atomic write, this patch adds two ioctls:
 o F2FS_IOC_START_ATOMIC_WRITE
 o F2FS_IOC_COMMIT_ATOMIC_WRITE

The database engine should be aware of the following sequence.
1. open
 -> ioctl(F2FS_IOC_START_ATOMIC_WRITE);
2. writes
  : all the written data will be treated as atomic pages.
3. commit
 -> ioctl(F2FS_IOC_COMMIT_ATOMIC_WRITE);
  : this flushes all the data blocks to the disk, which will be shown all or
  nothing by f2fs recovery procedure.
4. repeat to #2.

The IO pattens should be:

  ,- START_ATOMIC_WRITE                  ,- COMMIT_ATOMIC_WRITE
 CP | D D D D D D | FSYNC | D D D D | FSYNC ...
                      `- COMMIT_ATOMIC_WRITE
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

88b88a66

01 10月, 2014 1 次提交

f2fs: call f2fs_unlock_op after error was handled · 44c16156

由 Jaegeuk Kim 提交于 9月 25, 2014

This patch relocates f2fs_unlock_op in every directory operations to be called
after any error was processed.
Otherwise, the checkpoint can be entered with valid node ids without its
dentry when -ENOSPC is occurred.
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

44c16156

16 9月, 2014 1 次提交

f2fs: expand counting dirty pages in the inode page cache · a7ffdbe2

由 Jaegeuk Kim 提交于 9月 12, 2014

Previously f2fs only counts dirty dentry pages, but there is no reason not to
expand the scope.

This patch changes the names on the management of dirty pages and to count
dirty pages in each inode info as well.
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

a7ffdbe2

10 9月, 2014 1 次提交
- J
  f2fs: need fsck.f2fs when f2fs_bug_on is triggered · 9850cf4a
  由 Jaegeuk Kim 提交于 9月 02, 2014
```
If any f2fs_bug_on is triggered, fsck.f2fs is needed.
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
```
  9850cf4a
04 9月, 2014 1 次提交

f2fs: introduce F2FS_I_SB, F2FS_M_SB, and F2FS_P_SB · 4081363f

由 Jaegeuk Kim 提交于 9月 02, 2014

This patch adds three inline functions to clean up dirty casting codes.
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

4081363f

05 8月, 2014 1 次提交

f2fs: invalidate xattr node page when evict inode · 002a41ca

由 Chao Yu 提交于 8月 04, 2014

When inode is evicted, all the page cache belong to this inode should be
released including the xattr node page. But previously we didn't do this, this
patch fixed this issue.

v2:
 o reposition invalidate_mapping_pages() to the right place suggested by
Jaegeuk Kim.
Signed-off-by: NChao Yu <chao2.yu@samsung.com>
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

002a41ca

29 7月, 2014 1 次提交

f2fs: add info of appended or updated data writes · fff04f90

由 Jaegeuk Kim 提交于 7月 25, 2014

This patch introduces a inode number list in which represents inodes having
appended data writes or updated data writes after last checkpoint.
This will be used at fsync to determine whether the recovery information
should be written or not.
Reviewed-by: NChao Yu <chao2.yu@samsung.com>
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

fff04f90

25 7月, 2014 1 次提交

f2fs: avoid use invalid mapping of node_inode when evict meta inode · dbf20cb2

由 Chao Yu 提交于 7月 25, 2014

Andrey Tsyvarev reported:
"Using memory error detector reveals the following use-after-free error
in 3.15.0:

AddressSanitizer: heap-use-after-free in f2fs_evict_inode
Read of size 8 by thread T22279:
  [<ffffffffa02d8702>] f2fs_evict_inode+0x102/0x2e0 [f2fs]
  [<ffffffff812359af>] evict+0x15f/0x290
  [<     inlined    >] iput+0x196/0x280 iput_final
  [<ffffffff812369a6>] iput+0x196/0x280
  [<ffffffffa02dc416>] f2fs_put_super+0xd6/0x170 [f2fs]
  [<ffffffff81210095>] generic_shutdown_super+0xc5/0x1b0
  [<ffffffff812105fd>] kill_block_super+0x4d/0xb0
  [<ffffffff81210a86>] deactivate_locked_super+0x66/0x80
  [<ffffffff81211c98>] deactivate_super+0x68/0x80
  [<ffffffff8123cc88>] mntput_no_expire+0x198/0x250
  [<     inlined    >] SyS_umount+0xe9/0x1a0 SYSC_umount
  [<ffffffff8123f1c9>] SyS_umount+0xe9/0x1a0
  [<ffffffff81cc8df9>] system_call_fastpath+0x16/0x1b

Freed by thread T3:
  [<ffffffffa02dc337>] f2fs_i_callback+0x27/0x30 [f2fs]
  [<     inlined    >] rcu_process_callbacks+0x2d6/0x930 __rcu_reclaim
  [<     inlined    >] rcu_process_callbacks+0x2d6/0x930 rcu_do_batch
  [<     inlined    >] rcu_process_callbacks+0x2d6/0x930 invoke_rcu_callbacks
  [<     inlined    >] rcu_process_callbacks+0x2d6/0x930 __rcu_process_callbacks
  [<ffffffff810fd266>] rcu_process_callbacks+0x2d6/0x930
  [<ffffffff8107cce2>] __do_softirq+0x142/0x380
  [<ffffffff8107cf50>] run_ksoftirqd+0x30/0x50
  [<ffffffff810b2a87>] smpboot_thread_fn+0x197/0x280
  [<ffffffff810a8238>] kthread+0x148/0x160
  [<ffffffff81cc8d4c>] ret_from_fork+0x7c/0xb0

Allocated by thread T22276:
  [<ffffffffa02dc7dd>] f2fs_alloc_inode+0x2d/0x170 [f2fs]
  [<ffffffff81235e2a>] iget_locked+0x10a/0x230
  [<ffffffffa02d7495>] f2fs_iget+0x35/0xa80 [f2fs]
  [<ffffffffa02e2393>] f2fs_fill_super+0xb53/0xff0 [f2fs]
  [<ffffffff81211bce>] mount_bdev+0x1de/0x240
  [<ffffffffa02dbce0>] f2fs_mount+0x10/0x20 [f2fs]
  [<ffffffff81212a85>] mount_fs+0x55/0x220
  [<ffffffff8123c026>] vfs_kern_mount+0x66/0x200
  [<     inlined    >] do_mount+0x2b4/0x1120 do_new_mount
  [<ffffffff812400d4>] do_mount+0x2b4/0x1120
  [<     inlined    >] SyS_mount+0xb2/0x110 SYSC_mount
  [<ffffffff812414a2>] SyS_mount+0xb2/0x110
  [<ffffffff81cc8df9>] system_call_fastpath+0x16/0x1b

The buggy address ffff8800587866c8 is located 48 bytes inside
  of 680-byte region [ffff880058786698, ffff880058786940)

Memory state around the buggy address:
  ffff880058786100: ffffffff ffffffff ffffffff ffffffff
  ffff880058786200: ffffffff ffffffff ffffffrr rrrrrrrr
  ffff880058786300: rrrrrrrr rrffffff ffffffff ffffffff
  ffff880058786400: ffffffff ffffffff ffffffff ffffffff
  ffff880058786500: ffffffff ffffffff ffffffff fffffffr
 >ffff880058786600: rrrrrrrr rrrrrrrr rrrfffff ffffffff
                                                ^
  ffff880058786700: ffffffff ffffffff ffffffff ffffffff
  ffff880058786800: ffffffff ffffffff ffffffff ffffffff
  ffff880058786900: ffffffff rrrrrrrr rrrrrrrr rrrr....
  ffff880058786a00: ........ ........ ........ ........
  ffff880058786b00: ........ ........ ........ ........
Legend:
  f - 8 freed bytes
  r - 8 redzone bytes
  . - 8 allocated bytes
  x=1..7 - x allocated bytes + (8-x) redzone bytes

Investigation shows, that f2fs_evict_inode, when called for
'meta_inode', uses invalidate_mapping_pages() for 'node_inode'.
But 'node_inode' is deleted before 'meta_inode' in f2fs_put_super via
iput().

It seems that in common usage scenario this use-after-free is benign,
because 'node_inode' remains partially valid data even after
kmem_cache_free().
But things may change if, while 'meta_inode' is evicted in one f2fs
filesystem, another (mounted) f2fs filesystem requests inode from cache,
and formely
'node_inode' of the first filesystem is returned."

Nids for both meta_inode and node_inode are reservation, so it's not necessary
for us to invalidate pages which will never be allocated.
To fix this issue, let's skipping needlessly invalidating pages for
{meta,node}_inode in f2fs_evict_inode.
Reported-by: NAndrey Tsyvarev <tsyvarev@ispras.ru>
Tested-by: NAndrey Tsyvarev <tsyvarev@ispras.ru>
Signed-off-by: NGu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: NChao Yu <chao2.yu@samsung.com>
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

dbf20cb2

09 7月, 2014 1 次提交

f2fs: check lower bound nid value in check_nid_range · d6b7d4b3

由 Chao Yu 提交于 6月 12, 2014

This patch add lower bound verification for nid in check_nid_range, so nids
reserved like 0, node, meta passed by caller could be checked there.

And then check_nid_range could be used in f2fs_nfs_get_inode for simplifying
code.
Signed-off-by: NChao Yu <chao2.yu@samsung.com>
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

d6b7d4b3

07 5月, 2014 2 次提交

f2fs: deactivate inode page if the inode is evicted · 8198899b

由 Jaegeuk Kim 提交于 4月 30, 2014

If the inode page is clean during its inode eviction, it'd better drop the page
to reduce further memory pressure.
Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>

8198899b

f2fs: atomically set inode->i_flags in f2fs_set_inode_flags() · 8abfb36a

由 Zhang Zhen 提交于 4月 15, 2014

Use set_mask_bits() to atomically set i_flags instead of clearing out the
S_IMMUTABLE, S_APPEND, etc. flags and then setting them from the
FS_IMMUTABLE_FL, FS_APPEND_FL, etc. flags, since this opens up a race
where an immutable file has the immutable flag cleared for a brief
window of time.
Signed-off-by: NZhang Zhen <zhenzhang.zhang@huawei.com>
Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>

8abfb36a

04 4月, 2014 1 次提交

mm + fs: store shadow entries in page cache · 91b0abe3

由 Johannes Weiner 提交于 4月 03, 2014

Reclaim will be leaving shadow entries in the page cache radix tree upon
evicting the real page.  As those pages are found from the LRU, an
iput() can lead to the inode being freed concurrently.  At this point,
reclaim must no longer install shadow pages because the inode freeing
code needs to ensure the page tree is really empty.

Add an address_space flag, AS_EXITING, that the inode freeing code sets
under the tree lock before doing the final truncate.  Reclaim will check
for this flag before installing shadow pages.
Signed-off-by: NJohannes Weiner <hannes@cmpxchg.org>
Reviewed-by: NRik van Riel <riel@redhat.com>
Reviewed-by: NMinchan Kim <minchan@kernel.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Bob Liu <bob.liu@oracle.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jan Kara <jack@suse.cz>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Luigi Semenzato <semenzato@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Metin Doslu <metin@citusdata.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Ozgun Erdogan <ozgun@citusdata.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Roman Gushchin <klamm@yandex-team.ru>
Cc: Ryan Mallon <rmallon@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

91b0abe3

18 3月, 2014 1 次提交

f2fs: introduce get_dirty_dents for readability · f8b2c1f9

由 Jaegeuk Kim 提交于 3月 18, 2014

The get_dirty_dents gives us the number of dirty dentry pages.
Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>

f8b2c1f9

27 2月, 2014 1 次提交

f2fs: introduce large directory support · 38431545

由 Jaegeuk Kim 提交于 2月 27, 2014

This patch introduces an i_dir_level field to support large directory.

Previously, f2fs maintains multi-level hash tables to find a dentry quickly
from a bunch of chiild dentries in a directory, and the hash tables consist of
the following tree structure as below.

In Documentation/filesystems/f2fs.txt,

----------------------
A : bucket
B : block
N : MAX_DIR_HASH_DEPTH
----------------------

level #0   | A(2B)
           |
level #1   | A(2B) - A(2B)
           |
level #2   | A(2B) - A(2B) - A(2B) - A(2B)
     .     |   .       .       .       .
level #N/2 | A(2B) - A(2B) - A(2B) - A(2B) - A(2B) - ... - A(2B)
     .     |   .       .       .       .
level #N   | A(4B) - A(4B) - A(4B) - A(4B) - A(4B) - ... - A(4B)

But, if we can guess that a directory will handle a number of child files,
we don't need to traverse the tree from level #0 to #N all the time.
Since the lower level tables contain relatively small number of dentries,
the miss ratio of the target dentry is likely to be high.

In order to avoid that, we can configure the hash tables sparsely from level #0
like this.

level #0   | A(2B) - A(2B) - A(2B) - A(2B)

level #1   | A(2B) - A(2B) - A(2B) - A(2B) - A(2B) - ... - A(2B)
     .     |   .       .       .       .
level #N/2 | A(2B) - A(2B) - A(2B) - A(2B) - A(2B) - ... - A(2B)
     .     |   .       .       .       .
level #N   | A(4B) - A(4B) - A(4B) - A(4B) - A(4B) - ... - A(4B)

With this structure, we can skip the ineffective tree searches in lower level
hash tables.

This patch adds just a facility for this by introducing i_dir_level in
f2fs_inode.
Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>

38431545

17 2月, 2014 1 次提交

f2fs: update_inode_page should be done all the time · 744602cf

由 Jaegeuk Kim 提交于 1月 24, 2014

In order to make fs consistency, update_inode_page should not be failed all
the time. Otherwise, it is possible to lose some metadata in the inode like
a link count.
Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>

744602cf

20 1月, 2014 1 次提交

f2fs: clean checkpatch warnings · 6c311ec6

由 Chris Fries 提交于 1月 17, 2014

Fixed a variety of trivial checkpatch warnings.  The only delta should
be some minor formatting on log strings that were split / too long.
Signed-off-by: NChris Fries <cfries@motorola.com>
Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>

6c311ec6

14 1月, 2014 1 次提交

f2fs: remove the needless parameter of f2fs_wait_on_page_writeback · 5514f0aa

由 Yuan Zhong 提交于 1月 10, 2014

"boo sync" parameter is never referenced in f2fs_wait_on_page_writeback.
We should remove this parameter.
Signed-off-by: NYuan Zhong <yuan.mark.zhong@samsung.com>
Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>

5514f0aa

06 1月, 2014 1 次提交

f2fs: add the number of inline_data files to status info · 0dbdc2ae

由 Jaegeuk Kim 提交于 11月 26, 2013

This patch adds the number of inline_data files into the status information.
Note that the number is reset whenever the filesystem is newly mounted.
Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>

0dbdc2ae

26 12月, 2013 1 次提交

f2fs: introduce F2FS_INODE macro to get f2fs_inode · 58bfaf44

由 Jaegeuk Kim 提交于 12月 26, 2013

This patch introduces F2FS_INODE that returns struct f2fs_inode * from the inode
page.
By using this macro, we can remove unnecessary casting codes like below.

   struct f2fs_inode *ri = &F2FS_NODE(inode_page)->i;
-> struct f2fs_inode *ri = F2FS_INODE(inode_page);
Reviewed-by: NChao Yu <chao2.yu@samsung.com>
Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>

58bfaf44

29 10月, 2013 1 次提交

f2fs: add an option to avoid unnecessary BUG_ONs · 5d56b671

由 Jaegeuk Kim 提交于 10月 29, 2013

If you want to remove unnecessary BUG_ONs, you can just turn off F2FS_CHECK_FS
in your kernel config.
Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>

5d56b671

18 10月, 2013 1 次提交

f2fs: fix to store and retrieve i_rdev correctly · 3d1e3807

由 Jaegeuk Kim 提交于 10月 08, 2013

When storing i_rdev, we should check its file type.
Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>

3d1e3807

07 10月, 2013 1 次提交

f2fs: use rw_sem instead of fs_lock(locks mutex) · e479556b

由 Gu Zheng 提交于 9月 27, 2013

The fs_locks is used to block other ops(ex, recovery) when doing checkpoint.
And each other operate routine(besides checkpoint) needs to acquire a fs_lock,
there is a terrible problem here, if these are too many concurrency threads acquiring
fs_lock, so that they will block each other and may lead to some performance problem,
but this is not the phenomenon we want to see.
Though there are some optimization patches introduced to enhance the usage of fs_lock,
but the thorough solution is using a *rw_sem* to replace the fs_lock.
Checkpoint routine takes write_sem, and other ops take read_sem, so that we can block
other ops(ex, recovery) when doing checkpoint, and other ops will not disturb each other,
this can avoid the problem described above completely.
Because of the weakness of rw_sem, the above change may introduce a potential problem
that the checkpoint thread might get starved if other threads are intensively locking
the read semaphore for I/O.(Pointed out by Xu Jin)
In order to avoid this, a wait_list is introduced, the appending read semaphore ops
will be dropped into the wait_list if checkpoint thread is waiting for write semaphore,
and will be waked up when checkpoint thread gives up write semaphore.
Thanks to Kim's previous review and test, and will be very glad to see other guys'
performance tests about this patch.

V2:
  -fix the potential starvation problem.
  -use more suitable func name suggested by Xu Jin.
Signed-off-by: NGu Zheng <guz.fnst@cn.fujitsu.com>
[Jaegeuk Kim: adjust minor coding standard]
Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>

e479556b

26 8月, 2013 1 次提交

f2fs: add flags for inline xattrs · 444c580f

由 Jaegeuk Kim 提交于 8月 08, 2013

This patch adds basic inode flags for inline xattrs, F2FS_INLINE_XATTR,
and add a mount option, inline_xattr, which is enabled when xattr is set.

If the mount option is enabled, all the files are marked with the inline_xattrs
flag.
Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>

444c580f

19 8月, 2013 1 次提交

f2fs: avoid writing inode redundantly when creating a file · 92c4342f

由 Jin Xu 提交于 8月 15, 2013

In f2fs_write_inode, updating inode after f2fs_balance_fs is not
a optimized way in the case that f2fs_gc is performed ahead. The
inode page will be unnecessarily written out twice, one of which
is in f2fs_gc->...->sync_node_pages and the other is in
update_inode_page.

Let's update the inode page in prior to f2fs_balance_fs to avoid
this.

To reproduce it,
$ touch file (before this step, should make the device need f2fs_gc)
$ sync (or wait the bdi to write dirty inode)
Signed-off-by: NJin Xu <jinuxstyle@gmail.com>
Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>

92c4342f

06 8月, 2013 1 次提交

f2fs: fix a deadlock in fsync · a569469e

由 Jin Xu 提交于 8月 05, 2013

This patch fixes a deadlock bug that occurs quite often when there are
concurrent write and fsync on a same file.

Following is the simplified call trace when tasks get hung.

fsync thread:
- f2fs_sync_file
 ...
 - f2fs_write_data_pages
 ...
  - update_extent_cache
  ...
   - update_inode
    - wait_on_page_writeback

bdi writeback thread
- __writeback_single_inode
 - f2fs_write_data_pages
  - mutex_lock(sbi->writepages)

The deadlock happens when the fsync thread waits on a inode page that has
been added to the f2fs' cached bio sbi->bio[NODE], and unfortunately,
no one else could be able to submit the cached bio to block layer for
writeback. This is because the fsync thread already hold a sbi->fs_lock and
the sbi->writepages lock, causing the bdi thread being blocked when attempt
to write data pages for the same inode. At the same time, f2fs_gc thread
does not notice the situation and could not help. Even the sync syscall
gets blocked.

To fix it, we could submit the cached bio first before waiting on a inode page
that is being written back.
Signed-off-by: NJin Xu <jinuxstyle@gmail.com>
[Jaegeuk Kim: add more cases to use f2fs_wait_on_page_writeback]
Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>

a569469e

30 7月, 2013 1 次提交

f2fs: introduce help function F2FS_NODE() · 45590710

由 Gu Zheng 提交于 7月 15, 2013

Introduce help function F2FS_NODE() to simplify the conversion of node_page to
f2fs_node.
Signed-off-by: NGu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>

45590710

14 6月, 2013 1 次提交

f2fs: avoid freqeunt write_inode calls · b3783873

由 Jaegeuk Kim 提交于 6月 10, 2013

If update_inode is called, we don't need to do write_inode.
So, let's use a *dirty* flag for each inode.
Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>

b3783873

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功