Commits · afcb7ca01f47b0481e0b248d1542d0934fa70767 · openeuler / Kernel

29 Apr, 2013 1 commit

f2fs: check truncation of mapping after lock_page · afcb7ca0

Committed by Jaegeuk Kim on Apr 26, 2013

We call lock_page when we need to update a page after readpage.
Between grabbing and locking the page, the page can be truncated by another thread.
So, after lock_page, we should check whether the page was truncated or not.
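
The check described above might look like the following minimal user-space sketch. `struct page_sketch`, `struct mapping_sketch`, and all helper names are hypothetical stand-ins for the kernel's page and mapping types, not the code in this patch. A truncated page is modeled as one whose `mapping` no longer matches the mapping it was looked up in.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct address_space and struct page. */
struct mapping_sketch { int id; };
struct page_sketch {
	struct mapping_sketch *mapping;	/* NULL once the page is truncated */
	bool locked;
};

static void lock_page_sketch(struct page_sketch *p)   { p->locked = true; }
static void unlock_page_sketch(struct page_sketch *p) { p->locked = false; }

/*
 * Lock the page, then confirm it still belongs to the mapping it was
 * looked up in. Returns true with the page locked, or false with the
 * page unlocked so that the caller can look it up again.
 */
static bool lock_and_check_page(struct page_sketch *p, struct mapping_sketch *m)
{
	lock_page_sketch(p);
	if (p->mapping != m) {		/* truncated between grab and lock */
		unlock_page_sketch(p);
		return false;
	}
	return true;
}
```

A caller would typically retry the lookup when `lock_and_check_page()` returns false.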
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

26 Apr, 2013 1 commit

f2fs: give a chance to merge IOs by IO scheduler · c718379b

Committed by Jaegeuk Kim on Apr 24, 2013

Previously, background GC submitted many 4KB read requests to load victim blocks
and/or their (i)node blocks.

...
f2fs_gc : f2fs_readpage: ino = 1, page_index = 0xb61, blkaddr = 0x3b964ed
f2fs_gc : block_rq_complete: 8,16 R () 499854968 + 8 [0]
f2fs_gc : f2fs_readpage: ino = 1, page_index = 0xb6f, blkaddr = 0x3b964ee
f2fs_gc : block_rq_complete: 8,16 R () 499854976 + 8 [0]
f2fs_gc : f2fs_readpage: ino = 1, page_index = 0xb79, blkaddr = 0x3b964ef
f2fs_gc : block_rq_complete: 8,16 R () 499854984 + 8 [0]
...

However, since many of these IOs are sequential, we can give the IO scheduler
a chance to merge them.
In order to do that, let's use blk_plug.

...
f2fs_gc : f2fs_iget: ino = 143
f2fs_gc : f2fs_readpage: ino = 143, page_index = 0x1c6, blkaddr = 0x2e6ee
f2fs_gc : f2fs_iget: ino = 143
f2fs_gc : f2fs_readpage: ino = 143, page_index = 0x1c7, blkaddr = 0x2e6ef
<idle> : block_rq_complete: 8,16 R () 1519616 + 8 [0]
<idle> : block_rq_complete: 8,16 R () 1519848 + 8 [0]
<idle> : block_rq_complete: 8,16 R () 1520432 + 96 [0]
<idle> : block_rq_complete: 8,16 R () 1520536 + 104 [0]
<idle> : block_rq_complete: 8,16 R () 1521008 + 112 [0]
<idle> : block_rq_complete: 8,16 R () 1521440 + 152 [0]
<idle> : block_rq_complete: 8,16 R () 1521688 + 144 [0]
<idle> : block_rq_complete: 8,16 R () 1522128 + 192 [0]
<idle> : block_rq_complete: 8,16 R () 1523256 + 328 [0]
...

Note that this issue should be addressed in checkpoint, and some readahead
flows too.
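
Why holding requests back helps can be illustrated with a minimal user-space sketch of back merging. This is not the kernel's blk_plug implementation. `struct rq_sketch`, `struct plug_sketch`, `plug_add()`, and the list size are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PLUGGED 16	/* assumed capacity of the sketch's plug list */

struct rq_sketch {
	unsigned long long sector;	/* first sector of the request */
	unsigned int nr_sectors;	/* length in 512-byte sectors */
};

struct plug_sketch {
	struct rq_sketch rq[MAX_PLUGGED];
	size_t count;
};

/*
 * Queue a read on the plug list. If it starts exactly where the previous
 * request ends, extend that request instead of adding a new one.
 * Returns 0 on success, -1 if the list is full.
 */
static int plug_add(struct plug_sketch *plug, unsigned long long sector,
		    unsigned int nr_sectors)
{
	if (plug->count > 0) {
		struct rq_sketch *last = &plug->rq[plug->count - 1];

		if (last->sector + last->nr_sectors == sector) {
			last->nr_sectors += nr_sectors;	/* back merge */
			return 0;
		}
	}
	if (plug->count == MAX_PLUGGED)
		return -1;
	plug->rq[plug->count].sector = sector;
	plug->rq[plug->count].nr_sectors = nr_sectors;
	plug->count++;
	return 0;
}
```

Fed the three 8-sector reads from the first trace (499854968, 499854976, 499854984), the sketch ends up with a single 24-sector request.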
Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

23 Apr, 2013 3 commits

f2fs: add tracepoints to debug the block allocation · c01e2853

Committed by Namjae Jeon on Apr 23, 2013

Add tracepoints to debug the block allocation & fallocate.
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Pankaj Kumar <pankaj.km@samsung.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
[Jaegeuk: enhance information]
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

f2fs: add tracepoints for truncate operation · 51dd6249

Committed by Namjae Jeon on Apr 20, 2013

Add tracepoints for tracing the truncate operations,
such as truncating node/data blocks, f2fs_truncate, etc.

Tracepoints are added at the entry and exit of each operation
to trace its success or failure.
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Pankaj Kumar <pankaj.km@samsung.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
[Jaegeuk: combine and modify the tracepoint structures]
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

f2fs: add tracepoints for sync & inode operations · a2a4a7e4

Committed by Namjae Jeon on Apr 20, 2013

Add tracepoints in f2fs for tracing the syncing
operations like filesystem sync, file sync enter/exit.
It will help to trace the code under debugging scenarios.

Also add tracepoints for tracing the various inode operations
like building inode, eviction of inode, link/unlink of
inodes.
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Pankaj Kumar <pankaj.km@samsung.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
[Jaegeuk: combine and modify the tracepoint structures]
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

09 Apr, 2013 2 commits

f2fs: introduce a new global lock scheme · 39936837

Committed by Jaegeuk Kim on Nov 22, 2012

In the previous version, f2fs used global locks according to the usage types,
such as directory operations, block allocation, block write, and so on.

See the following lock types in f2fs.h.
enum lock_type {
	RENAME,		/* for renaming operations */
	DENTRY_OPS,	/* for directory operations */
	DATA_WRITE,	/* for data write */
	DATA_NEW,	/* for data allocation */
	DATA_TRUNC,	/* for data truncate */
	NODE_NEW,	/* for node allocation */
	NODE_TRUNC,	/* for node truncate */
	NODE_WRITE,	/* for node write */
	NR_LOCK_TYPE,
};

In that case, we lose performance in a multi-threaded environment,
since operations of each type must be conducted one at a time.

In order to address the problem, let's share the locks globally with a mutex
array regardless of any types.
So, let users grab a mutex and perform their jobs in parallel as much as
possible.

For this, I propose a new global lock scheme as follows.

0. Data structure
 - f2fs_sb_info -> mutex_lock[NR_GLOBAL_LOCKS]
 - f2fs_sb_info -> node_write

1. mutex_lock_op(sbi)
 - try to get an available lock from the array.
 - returns the index of the acquired lock variable.

2. mutex_unlock_op(sbi, index of the lock)
 - unlock the given index of the lock.

3. mutex_lock_all(sbi)
 - grab all the locks in the array before the checkpoint.

4. mutex_unlock_all(sbi)
 - release all the locks in the array after checkpoint.

5. block_operations()
 - call mutex_lock_all()
 - sync_dirty_dir_inodes()
 - grab node_write
 - sync_node_pages()

Note that,
 the pairs of mutex_lock_op()/mutex_unlock_op() and
 mutex_lock_all()/mutex_unlock_all() should be used together.
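
The scheme above can be sketched in user space as follows. This is a minimal illustration, not the patch itself. It assumes C11 `atomic_flag` spinlocks in place of kernel mutexes, an array size of 8, and hypothetical names (`struct sbi_sketch`, `try_lock_op()`, `lock_op()`, and so on).

```c
#include <assert.h>
#include <stdatomic.h>

#define NR_GLOBAL_LOCKS 8	/* assumed array size */

struct sbi_sketch {
	atomic_flag lock[NR_GLOBAL_LOCKS];
};

static void sbi_init(struct sbi_sketch *sbi)
{
	for (int i = 0; i < NR_GLOBAL_LOCKS; i++)
		atomic_flag_clear(&sbi->lock[i]);
}

/* One pass over the array: returns the index taken, or -1 if all are held. */
static int try_lock_op(struct sbi_sketch *sbi)
{
	for (int i = 0; i < NR_GLOBAL_LOCKS; i++)
		if (!atomic_flag_test_and_set(&sbi->lock[i]))
			return i;
	return -1;
}

/* Keep scanning until some lock in the array becomes available. */
static int lock_op(struct sbi_sketch *sbi)
{
	int idx;

	while ((idx = try_lock_op(sbi)) < 0)
		;	/* busy-wait; a kernel mutex would sleep instead */
	return idx;
}

static void unlock_op(struct sbi_sketch *sbi, int idx)
{
	atomic_flag_clear(&sbi->lock[idx]);
}

/* Take every lock in index order, as is done before a checkpoint. */
static void lock_all(struct sbi_sketch *sbi)
{
	for (int i = 0; i < NR_GLOBAL_LOCKS; i++)
		while (atomic_flag_test_and_set(&sbi->lock[i]))
			;
}

static void unlock_all(struct sbi_sketch *sbi)
{
	for (int i = 0; i < NR_GLOBAL_LOCKS; i++)
		atomic_flag_clear(&sbi->lock[i]);
}
```

In this sketch, ordinary operations run in parallel as long as a free slot exists, while `lock_all()` waits until every in-flight operation has released its slot.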
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

f2fs: move f2fs_balance_fs from truncate to punch_hole · 1127a3d4

Committed by Jason Hrycay on Apr 08, 2013

Move the f2fs_balance_fs out of the truncate_hole function and only
perform that in punch_hole use case.  The commit:

  ed60b1644e7f7e5dd67d21caf7e4425dff05dad0

intended to do this but moved it into truncate_hole to cover more
cases.  However, a deadlock scenario is possible when deleting an inode
entry under specific conditions:

 f2fs_delete_entry()
     mutex_lock_op(sbi, DENTRY_OPS);
     truncate_hole()
         f2fs_balance_fs()
             mutex_lock(&sbi->gc_mutex);
             f2fs_gc()
                 write_checkpoint()
                     block_operations()
                         mutex_lock_op(sbi, DENTRY_OPS);

Let's move it into the punch_hole case to cover the original intent of
avoiding it during fallocate's expand_inode_data case.

Change-Id: I29f8ea1056b0b88b70ba8652d901b6e8431bb27e
Signed-off-by: Jason Hrycay <jason.hrycay@motorola.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

27 Mar, 2013 2 commits

f2fs: fix to give correct parent inode number for roll forward · 953a3e27

Committed by Jaegeuk Kim on Mar 21, 2013

When we recover fsync'ed data during power-off recovery, we should guarantee
that the parent inode number is correct for each direct inode block.

So, let's make the following rules.

- The fsync should do a checkpoint for all the inodes that have had hard
links.

- So, only normal files can be recovered by roll-forward.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

f2fs: do not skip writing file meta during fsync · 0ff153a2

Committed by Jaegeuk Kim on Mar 20, 2013

This patch removes the data_version check flow during the fsync call.
The original purpose of data_version was to avoid writing inode
pages redundantly when fsync is called repeatedly.
However, when the user modifies file meta and then calls fsync, we should not
skip the fsync procedure.
So, let's remove this condition check and hope that the user triggers fsync in
the right manner.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

20 Mar, 2013 1 commit

f2fs: fix to call WRITE_FLUSH at the end of fsync · ae51fb31

Committed by Jaegeuk Kim on Mar 16, 2013

The fsync call should be ended after flushing the in-device caches.
Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

18 Mar, 2013 1 commit

f2fs: introduce readahead mode of node pages · 266e97a8

Committed by Jaegeuk Kim on Feb 26, 2013

Previously, f2fs read several node pages ahead when get_dnode_of_data was called
with the RDONLY_NODE flag.
And, this flag was set by the following functions.
- get_data_block_ro
- get_lock_data_page
- do_write_data_page
- truncate_blocks
- truncate_hole

However, this readahead mechanism was initially introduced for use by
get_data_block_ro to enhance sequential read performance.

So, let's clarify all the cases with the additional modes as follows.

enum {
	ALLOC_NODE,	/* allocate a new node page if needed */
	LOOKUP_NODE,	/* look up a node without readahead */
	LOOKUP_NODE_RA,	/*
			 * look up a node with readahead called
			 * by get_data_block_ro.
			 */
};
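
How a lookup routine might branch on these modes is sketched below. The enum values mirror the ones above, but `enum lookup_mode_sketch` and the two helper predicates are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Mirrors the modes listed above; the type name is hypothetical. */
enum lookup_mode_sketch {
	ALLOC_NODE,	/* allocate a new node page if needed */
	LOOKUP_NODE,	/* look up a node without readahead */
	LOOKUP_NODE_RA,	/* look up a node with readahead */
};

/* Only the sequential-read path asks for node page readahead. */
static bool should_readahead(enum lookup_mode_sketch mode)
{
	return mode == LOOKUP_NODE_RA;
}

/* Only ALLOC_NODE is allowed to create a missing node page. */
static bool may_allocate(enum lookup_mode_sketch mode)
{
	return mode == ALLOC_NODE;
}
```

With this split, truncate and write paths would pass `LOOKUP_NODE` and no longer trigger readahead as a side effect.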
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com>

28 Feb, 2013 1 commit

more file_inode() open-coded instances · 6131ffaa

Committed by Al Viro on Feb 27, 2013

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

12 Feb, 2013 4 commits

f2fs: add compat_ioctl to provide backward compatibility · e9750824

Committed by Namjae Jeon on Feb 04, 2013

Add compat_ioctl to support backward compatibility, that is, running 32-bit
binaries on a 64-bit kernel.
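
A compat handler of this kind usually translates 32-bit command numbers into native ones and then calls the regular handler. The sketch below illustrates that shape only. The command numbers, the error constant, and both function names are hypothetical and do not correspond to real f2fs ioctl values.

```c
#include <assert.h>

/* Hypothetical command numbers, for illustration only. */
enum {
	CMD_GETFLAGS = 1,
	CMD_SETFLAGS = 2,
	CMD32_GETFLAGS = 101,
	CMD32_SETFLAGS = 102,
};

#define ERR_NO_IOCTL_CMD (-1)	/* stand-in for "command not handled" */

/* Native handler stub: returns pretend flags or success. */
static long native_ioctl_sketch(unsigned int cmd, unsigned long arg)
{
	(void)arg;
	switch (cmd) {
	case CMD_GETFLAGS:
		return 0x10;
	case CMD_SETFLAGS:
		return 0;
	default:
		return ERR_NO_IOCTL_CMD;
	}
}

/* Map each 32-bit command to its native counterpart, then delegate. */
static long compat_ioctl_sketch(unsigned int cmd, unsigned long arg)
{
	switch (cmd) {
	case CMD32_GETFLAGS:
		cmd = CMD_GETFLAGS;
		break;
	case CMD32_SETFLAGS:
		cmd = CMD_SETFLAGS;
		break;
	default:
		return ERR_NO_IOCTL_CMD;
	}
	return native_ioctl_sketch(cmd, arg);
}
```

Commands with no 32-bit mapping are rejected before they reach the native handler.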
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

f2fs: stop repeated checking if cp is needed · facb0205

Committed by Changman Lee on Jan 31, 2013

If it is decided that f2fs should do checkpoint, skip next comparison.
Signed-off-by: Changman Lee <cm224.lee@samsung.com>

f2fs: avoid balance_fs during evict_inode · d4686d56

Committed by Jaegeuk Kim on Jan 31, 2013

1. Background

Previously, if f2fs tried to move data blocks of an *evicting* inode during the
cleaning process, it stopped the process incompletely and then restarted the whole
process, since it needs a locked inode to grab victim data pages in its address
space. In order to get a locked inode, iget_locked() by f2fs_iget() is normally
used, but it waits if the inode is being freed.

So, here is a deadlock scenario.
1. f2fs_evict_inode()       <- inode "A"
  2. f2fs_balance_fs()
    3. f2fs_gc()
      4. gc_data_segment()
        5. f2fs_iget()      <- inode "A" too!

If steps #1 and #5 refer to the same inode "A", step #5 would deadlock since
inode "A" is being freed. In order to resolve this, f2fs_iget_nowait(), which
skips __wait_on_freeing_inode(), was introduced in step #5; it stops f2fs_gc()
so that f2fs_evict_inode() can complete.

1. f2fs_evict_inode()           <- inode "A"
  2. f2fs_balance_fs()
    3. f2fs_gc()
      4. gc_data_segment()
        5. f2fs_iget_nowait()   <- inode "A", then stop f2fs_gc() w/ -ENOENT

2. Problem and Solution

In the above scenario, however, f2fs cannot finish f2fs_evict_inode() if:
 o there are not enough free sections, and
 o f2fs_gc() tries to move data blocks of the *evicting* inode repeatedly.

So, the final solution is to use f2fs_iget() and remove f2fs_balance_fs() in
f2fs_evict_inode().
The f2fs_evict_inode() actually truncates all the data and node blocks, which
means that it doesn't produce any dirty node pages accordingly.
So, we don't need to do f2fs_balance_fs() in practice.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

f2fs: cover global locks for reserve_new_block · bd43df02

Committed by Jaegeuk Kim on Jan 25, 2013

The fill_zero() from fallocate() calls get_new_data_page(), which in turn calls
reserve_new_block().
The reserve_new_block() should be covered by *DATA_NEW*, one of the global locks.
Also, before getting the lock, we should check free sections by calling
f2fs_balance_fs().

If we break this rule, f2fs may lose control of free space management and may
also fall into an infinite loop, as in the following scenario.

[f2fs_sync_fs()]             [fallocate()]
 - write_checkpoint()        - fill_zero()
  - block_operations()        - get_new_data_page()
   : grab NODE_NEW             - get_dnode_of_data()
                                : get locked dirty node page
    - sync_node_pages()
                                : try to grab NODE_NEW for data allocation
     : trylock and skip the dirty node page
   : call sync_node_pages() repeatedly in order to flush all the dirty node
     pages!

In order to avoid this, we should grab another global lock such as DATA_NEW
before calling get_new_data_page() in fill_zero().
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

22 Jan, 2013 1 commit

f2fs: add remap_pages as generic_file_remap_pages · 692bb55d

Committed by Jaegeuk Kim on Jan 17, 2013

This was added for all the file systems before.

See the following commit.

commit id: 0b173bc4

[PATCH] mm: kill vma flag VM_CAN_NONLINEAR

This patch moves actual ptes filling for non-linear file mappings
into special vma operation: ->remap_pages().

File system must implement this method to get non-linear mappings support,
if it uses filemap_fault() then generic_file_remap_pages() can be used.

Now device drivers can implement this method and obtain nonlinear vma support.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

11 Jan, 2013 2 commits

f2fs: move f2fs_balance_fs to punch_hole · 9eaeba70

Committed by Jaegeuk Kim on Jan 11, 2013

The f2fs_fallocate() has two operations: punch_hole and expand_size.

Only in the case of punch_hole, dirty node pages can be produced, so let's
trigger f2fs_balance_fs() in this case only.
Furthermore, let's trigger it at every data truncation routine.
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

f2fs: add f2fs_balance_fs in several interfaces · 7d82db83

Committed by Jaegeuk Kim on Jan 11, 2013

The f2fs_balance_fs() is to check the number of free sections and decide whether
it needs to conduct cleaning or not. If there are not enough free sections, the
cleaning job should be started.

In order to control the number of free sections even under high utilization, f2fs
should call f2fs_balance_fs at all the VFS interfaces that are able to produce
dirty pages.
This patch adds the function calls in the missing interfaces as follows.

1. f2fs_setxattr()
The f2fs_setxattr() produces dirty node pages, so we should call
f2fs_balance_fs() here too, as is done in other VFS interfaces such as
f2fs_lookup(), f2fs_mkdir(), and so on.

2. f2fs_sync_file()
We should guarantee that free sections are available for syncing metadata
during fsync.
Previously, there was no space check before triggering checkpoint and
sync_node_pages.
Therefore, if a bunch of fsync calls are triggered under 100% of FS utilization,
f2fs may be left with no free sections, resulting in BUG_ON().

3. f2fs_sync_fs()
Before calling write_checkpoint(), we should guarantee that there are minimum
free sections.

4. f2fs_write_inode()
f2fs_write_inode() is also able to produce dirty node pages.
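
The role of f2fs_balance_fs() described above can be illustrated with a minimal sketch: check the free-section count and clean only when it is too low. `struct fs_sketch`, the threshold field, and the one-section-per-pass cleaner are all assumptions made for illustration, not the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>

struct fs_sketch {
	unsigned int free_sections;
	unsigned int min_free_sections;	/* assumed threshold */
	unsigned int gc_runs;		/* counts cleaning passes */
};

static bool needs_cleaning(const struct fs_sketch *fs)
{
	return fs->free_sections <= fs->min_free_sections;
}

/* Stand-in for cleaning: pretend each pass reclaims one section. */
static void gc_once(struct fs_sketch *fs)
{
	fs->free_sections++;
	fs->gc_runs++;
}

/* Called on entry to any interface that can produce dirty pages. */
static void balance_fs_sketch(struct fs_sketch *fs)
{
	if (needs_cleaning(fs))
		gc_once(fs);
}
```

Each interface listed above would call `balance_fs_sketch()` before it starts dirtying pages, so that the space check happens ahead of the allocation.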
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

04 Jan, 2013 1 commit

f2fs: fix time update in case of f2fs fallocate · 3af60a49

Committed by Namjae Jeon on Dec 30, 2012

After punching a hole or expanding an inode through fallocate,
the change and modification times of the file are not updated.
So, update the times once fallocate completes without error.
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

26 Dec, 2012 1 commit

f2fs: fix handling errors got by f2fs_write_inode · 398b1ac5

Committed by Jaegeuk Kim on Dec 19, 2012

Ruslan reported that f2fs hangs with an infinite loop in f2fs_sync_file():

	while (sync_node_pages(sbi, inode->i_ino, &wbc) == 0)
		f2fs_write_inode(inode, NULL);

The cause turned out to be that the cold flag is not set even though this inode is
a normal file. Therefore, sync_node_pages() skips writing node blocks, since it
only writes cold node blocks.

The cold flag is stored to the node_footer in node block, and whenever a new
node page is allocated, it is set according to its file type, file or directory.

But, after a sudden power-off, when recovering the inode page, f2fs doesn't recover
its cold flag.

So, let's assign the cold flag in the right places.

One more thing:
If f2fs_write_inode() returns an error due to whatever situations, there would
be no dirty node pages so that sync_node_pages() returns zero.
(i.e., zero means nothing was written.)
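
The second point can be illustrated with a minimal sketch of the loop that stops as soon as the inode write fails. The stub functions and `struct inode_sketch` are hypothetical stand-ins for sync_node_pages() and f2fs_write_inode(), written only to show why the error check ends the loop.

```c
#include <assert.h>

struct inode_sketch {
	int dirty_node_pages;	/* node pages waiting to be written */
	int write_err;		/* error that the inode write will return */
};

/* Returns the number of node pages written; zero means nothing was written. */
static int sync_node_pages_sketch(struct inode_sketch *inode)
{
	int written = inode->dirty_node_pages;

	inode->dirty_node_pages = 0;
	return written;
}

/* Dirties the inode's node page on success; returns write_err on failure. */
static int write_inode_sketch(struct inode_sketch *inode)
{
	if (inode->write_err)
		return inode->write_err;
	inode->dirty_node_pages = 1;
	return 0;
}

static int sync_file_sketch(struct inode_sketch *inode)
{
	while (sync_node_pages_sketch(inode) == 0) {
		int err = write_inode_sketch(inode);

		if (err)
			return err;	/* without this check the loop never ends */
	}
	return 0;
}
```

When the write fails, no node page becomes dirty, so the sync stub keeps returning zero; propagating the error is what breaks the cycle.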
Reported-by: Ruslan N. Marchenko <me@ruff.mobi>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

11 Dec, 2012 4 commits

f2fs: remove unused variable · 705f814e

Committed by Wei Yongjun on Dec 02, 2012

The variables node_page and page_offset are initialized but never used
otherwise, so remove those unused variables.
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>

f2fs: check read only condition before beginning write out · 1fa95b0b

Committed by Namjae Jeon on Dec 01, 2012

If the filesystem is mounted read-only, return immediately
instead of first doing a writeout/wait and only then checking the read-only
condition.
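
The reordering can be sketched as below. `struct sb_sketch` and `sync_file_ro_sketch()` are hypothetical, and returning 0 on a read-only mount is an assumption, since the message does not state the return value.

```c
#include <assert.h>
#include <stdbool.h>

struct sb_sketch {
	bool read_only;
	int writeouts;	/* counts how often the writeout/wait step ran */
};

static int sync_file_ro_sketch(struct sb_sketch *sb)
{
	/* Check first: on a read-only mount there is nothing to write. */
	if (sb->read_only)
		return 0;	/* assumed return value */

	sb->writeouts++;	/* stand-in for the writeout and wait */
	return 0;
}
```

Placing the check first avoids the writeout and wait entirely on a read-only mount.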
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com>

f2fs: adjust kernel coding style · 0a8165d7

Committed by Jaegeuk Kim on Nov 29, 2012

As pointed out by Randy Dunlap, this patch removes all usage of "/**" for comment
blocks. Instead, just use "/*".
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

f2fs: add file operations · fbfa2cc5

Committed by Jaegeuk Kim on Nov 02, 2012

This adds memory operations and file/file_inode operations.

- F2FS supports fallocate(), mmap(), fsync(), and basic ioctl().
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
