提交 · 696c018c7718f5e33e1107da19c4d64a25018878 · openeuler / Kernel

07 6月, 2013 1 次提交

f2fs: fix iget/iput of dir during recovery · 5deb8267

由 Jaegeuk Kim 提交于 6月 05, 2013

It is possible that iput is skipped after iget during the recovery.

In recover_dentry(),
 dir = f2fs_iget();
 ...
 if (de && inode->i_ino == le32_to_cpu(de->ino))
	goto out;

In this case, this dir is not able to be added in dirty_dir_inode_list.
The actual linking is done only when set_page_dirty() is called.

So let's add this newly got inode into the list explicitly, and put it at the
end of the recovery routine.
Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>

5deb8267

28 5月, 2013 4 次提交

f2fs: fix incorrect iputs during the dentry recovery · afc3eda2

由 Jaegeuk Kim 提交于 5月 28, 2013

- iget/iput flow in the dentry recovery process

1. *dir* = f2fs_iget
2. set FI_DELAY_IPUT to *dir*
3. add *dir* to the dirty_dir_list
		   - __f2fs_add_link
		     - recover_dentry)
4. iput *dir* by remove_dirty_dir_inode
		   - sync_dirty_dir_inodes
		     - write_chekcpoint

If *dir*'s i_count is not 1 (i.e., root dir), remove_dirty_dir_inode is called
later and then iput is triggered again due to the FI_DELAY_IPUT flag.
So, let's unset the flag properly once iput is triggered.
Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>

afc3eda2

f2fs: iput only if whole data blocks are flushed · 3b10b1fd

由 Jaegeuk Kim 提交于 5月 27, 2013

If there remains some unwritten blocks from the recovery, we should not call
iput on that directory inode.
Otherwise, we can loose some dentry blocks after the recovery.
Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>

3b10b1fd

f2fs: push some variables to debug part · 35b09d82

由 Namjae Jeon 提交于 5月 23, 2013

Some, counters are needed only for the statistical information
while debugging.
So, those can be controlled using CONFIG_F2FS_STAT_FS,
pushing the usage for few variables under this flag.
Signed-off-by: NNamjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: NAmit Sahrawat <a.sahrawat@samsung.com>
Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>

35b09d82

f2fs: fix BUG_ON during f2fs_evict_inode(dir) · 74d0b917

由 Jaegeuk Kim 提交于 5月 15, 2013

During the dentry recovery routine, recover_inode() triggers __f2fs_add_link
with its directory inode.

In the following scenario, a bug is captured.
 1. dir = f2fs_iget(pino)
 2. __f2fs_add_link(dir, name)
 3. iput(dir)
  -> f2fs_evict_inode() faces with BUG_ON(atomic_read(fi->dirty_dents))

Kernel BUG at ffffffffa01c0676 [verbose debug info unavailable]
[<ffffffffa01c0676>] f2fs_evict_inode+0x276/0x300 [f2fs]
Call Trace:
 [<ffffffff8118ea00>] evict+0xb0/0x1b0
 [<ffffffff8118f1c5>] iput+0x105/0x190
 [<ffffffffa01d2dac>] recover_fsync_data+0x3bc/0x1070 [f2fs]
 [<ffffffff81692e8a>] ? io_schedule+0xaa/0xd0
 [<ffffffff81690acb>] ? __wait_on_bit_lock+0x7b/0xc0
 [<ffffffff8111a0e7>] ? __lock_page+0x67/0x70
 [<ffffffff81165e21>] ? kmem_cache_alloc+0x31/0x140
 [<ffffffff8118a502>] ? __d_instantiate+0x92/0xf0
 [<ffffffff812a949b>] ? security_d_instantiate+0x1b/0x30
 [<ffffffff8118a5b4>] ? d_instantiate+0x54/0x70

This means that we should flush all the dentry pages between iget and iput().
But, during the recovery routine, it is unallowed due to consistency, so we
have to wait the whole recovery process.
And then, write_checkpoint flushes all the dirty dentry blocks, and nicely we
can put the stale dir inodes from the dirty_dir_inode_list.
Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>

74d0b917

29 4月, 2013 1 次提交

f2fs: check truncation of mapping after lock_page · afcb7ca0

由 Jaegeuk Kim 提交于 4月 26, 2013

We call lock_page when we need to update a page after readpage.
Between grab and lock page, the page can be truncated by other thread.
So, we should check the page after lock_page whether it was truncated or not.
Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>

afcb7ca0

26 4月, 2013 1 次提交

f2fs: give a chance to merge IOs by IO scheduler · c718379b

由 Jaegeuk Kim 提交于 4月 24, 2013

Previously, background GC submits many 4KB read requests to load victim blocks
and/or its (i)node blocks.

...
f2fs_gc : f2fs_readpage: ino = 1, page_index = 0xb61, blkaddr = 0x3b964ed
f2fs_gc : block_rq_complete: 8,16 R () 499854968 + 8 [0]
f2fs_gc : f2fs_readpage: ino = 1, page_index = 0xb6f, blkaddr = 0x3b964ee
f2fs_gc : block_rq_complete: 8,16 R () 499854976 + 8 [0]
f2fs_gc : f2fs_readpage: ino = 1, page_index = 0xb79, blkaddr = 0x3b964ef
f2fs_gc : block_rq_complete: 8,16 R () 499854984 + 8 [0]
...

However, by the fact that many IOs are sequential, we can give a chance to merge
the IOs by IO scheduler.
In order to do that, let's use blk_plug.

...
f2fs_gc : f2fs_iget: ino = 143
f2fs_gc : f2fs_readpage: ino = 143, page_index = 0x1c6, blkaddr = 0x2e6ee
f2fs_gc : f2fs_iget: ino = 143
f2fs_gc : f2fs_readpage: ino = 143, page_index = 0x1c7, blkaddr = 0x2e6ef
<idle> : block_rq_complete: 8,16 R () 1519616 + 8 [0]
<idle> : block_rq_complete: 8,16 R () 1519848 + 8 [0]
<idle> : block_rq_complete: 8,16 R () 1520432 + 96 [0]
<idle> : block_rq_complete: 8,16 R () 1520536 + 104 [0]
<idle> : block_rq_complete: 8,16 R () 1521008 + 112 [0]
<idle> : block_rq_complete: 8,16 R () 1521440 + 152 [0]
<idle> : block_rq_complete: 8,16 R () 1521688 + 144 [0]
<idle> : block_rq_complete: 8,16 R () 1522128 + 192 [0]
<idle> : block_rq_complete: 8,16 R () 1523256 + 328 [0]
...

Note that this issue should be addressed in checkpoint, and some readahead
flows too.
Reviewed-by: NNamjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>

c718379b

23 4月, 2013 1 次提交

f2fs: add tracepoints to debug checkpoint request · 2af4bd6c

由 Namjae Jeon 提交于 4月 23, 2013

Add tracepoints to debug checkpoint request.
Signed-off-by: NNamjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: NPankaj Kumar <pankaj.km@samsung.com>
Acked-by: NSteven Rostedt <rostedt@goodmis.org>
[Jaegeuk: change expressions]
Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>

2af4bd6c

09 4月, 2013 1 次提交

f2fs: introduce a new global lock scheme · 39936837

由 Jaegeuk Kim 提交于 11月 22, 2012

In the previous version, f2fs uses global locks according to the usage types,
such as directory operations, block allocation, block write, and so on.

Reference the following lock types in f2fs.h.
enum lock_type {
	RENAME,		/* for renaming operations */
	DENTRY_OPS,	/* for directory operations */
	DATA_WRITE,	/* for data write */
	DATA_NEW,	/* for data allocation */
	DATA_TRUNC,	/* for data truncate */
	NODE_NEW,	/* for node allocation */
	NODE_TRUNC,	/* for node truncate */
	NODE_WRITE,	/* for node write */
	NR_LOCK_TYPE,
};

In that case, we lose the performance under the multi-threading environment,
since every types of operations must be conducted one at a time.

In order to address the problem, let's share the locks globally with a mutex
array regardless of any types.
So, let users grab a mutex and perform their jobs in parallel as much as
possbile.

For this, I propose a new global lock scheme as follows.

0. Data structure
 - f2fs_sb_info -> mutex_lock[NR_GLOBAL_LOCKS]
 - f2fs_sb_info -> node_write

1. mutex_lock_op(sbi)
 - try to get an avaiable lock from the array.
 - returns the index of the gottern lock variable.

2. mutex_unlock_op(sbi, index of the lock)
 - unlock the given index of the lock.

3. mutex_lock_all(sbi)
 - grab all the locks in the array before the checkpoint.

4. mutex_unlock_all(sbi)
 - release all the locks in the array after checkpoint.

5. block_operations()
 - call mutex_lock_all()
 - sync_dirty_dir_inodes()
 - grab node_write
 - sync_node_pages()

Note that,
 the pairs of mutex_lock_op()/mutex_unlock_op() and
 mutex_lock_all()/mutex_unlock_all() should be used together.
Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>

39936837

03 4月, 2013 1 次提交

f2fs: change GC bitmaps to apply the section granularity · 5ec4e49f

由 Jaegeuk Kim 提交于 3月 31, 2013

This patch removes a bitmap for victim segments selected by foreground GC, and
modifies the other bitmap for victim segments selected by background GC.

1) foreground GC bitmap
 : We don't need to manage this, since we just only one previous victim section
   number instead of the whole victim history.
   The f2fs uses the victim section number in order not to allocate currently
   GC'ed section to current active logs.

2) background GC bitmap
 : This bitmap is used to avoid selecting victims repeatedly by background GCs.
   In addition, the victims are able to be selected by foreground GCs, since
   there is no need to read victim blocks during foreground GCs.

   By the fact that the foreground GC reclaims segments in a section unit, it'd
   be better to manage this bitmap based on the section granularity.
Reviewed-by: NNamjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>

5ec4e49f

20 3月, 2013 1 次提交

f2fs: reduce unncessary locking pages during read · 393ff91f

由 Jaegeuk Kim 提交于 3月 08, 2013

This patch reduces redundant locking and unlocking pages during read operations.
In f2fs_readpage, let's use wait_on_page_locked() instead of lock_page.
And then, when we need to modify any data finally, let's lock the page so that
we can avoid lock contention.

[readpage rule]
- The f2fs_readpage returns unlocked page, or released page too in error cases.
- Its caller should handle read error, -EIO, after locking the page, which
  indicates read completion.
- Its caller should check PageUptodate after grab_cache_page.
Signed-off-by: NChangman Lee <cm224.lee@samsung.com>
Reviewed-by: NNamjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>

393ff91f

12 2月, 2013 4 次提交

f2fs: clarify and enhance the f2fs_gc flow · 43727527

由 Jaegeuk Kim 提交于 2月 04, 2013

This patch makes clearer the ambiguous f2fs_gc flow as follows.

1. Remove intermediate checkpoint condition during f2fs_gc
 (i.e., should_do_checkpoint() and GC_BLOCKED)

2. Remove unnecessary return values of f2fs_gc because of #1.
 (i.e., GC_NODE, GC_OK, etc)

3. Simplify write_checkpoint() because of #2.

4. Clarify the main f2fs_gc flow.
 o monitor how many freed sections during one iteration of do_garbage_collect().
 o do GC more without checkpoints if we can't get enough free sections.
 o do checkpoint once we've got enough free sections through forground GCs.

5. Adopt thread-logging (Slack-Space-Recycle) scheme more aggressively on data
  log types. See. get_ssr_segement()
Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>

43727527

f2fs: remove repeated F2FS_SET_SB_DIRT call · 94787d91

由 Changman Lee 提交于 2月 01, 2013

F2FS_SET_SB_DIRT is called in inc_page_count and
it is directly called one more time in the next line.
Signed-off-by: NChangman Lee <cm224.lee@samsung.com>
Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>

94787d91

f2fs: clean up the add_orphan_inode func · a2617dc6

由 majianpeng 提交于 1月 29, 2013

For the code
> prev = list_entry(orphan->list.prev, typeof(*prev), list);
if orphan->list.prev == head, it can't get the right prev.
And we can use the parameter 'this' to add.
Signed-off-by: NJianpeng Ma <majianpeng@gmail.com>
Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>

a2617dc6

f2fs: prevent checkpoint once any IO failure is detected · 577e3495

由 Jaegeuk Kim 提交于 1月 24, 2013

This patch enhances the checkpoint routine to cope with IO errors.

Basically f2fs detects IO errors from end_io_write, and the errors are able to
be occurred during one of data, node, and meta page writes.

In the previous code, when an IO error is occurred during writes, f2fs sets a
flag, CP_ERROR_FLAG, in the raw ckeckpoint buffer which will be written to disk.
Afterwards, write_checkpoint() will check the flag and remount f2fs as a
read-only (ro) mode.

However, even once f2fs is remounted as a ro mode, dirty checkpoint pages are
freely able to be written to disk by flusher or kswapd in background.
In such a case, after cold reboot, f2fs would restore the checkpoint data having
CP_ERROR_FLAG, resulting in disabling write_checkpoint and remounting f2fs as
a ro mode again.

Therefore, let's prevent any checkpoint page (meta) writes once an IO error is
occurred, and remount f2fs as a ro mode right away at that moment.
Reported-by: NOliver Winker <oliver@oli1170.net>
Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
Reviewed-by: NNamjae Jeon <namjae.jeon@samsung.com>

577e3495

22 1月, 2013 1 次提交

f2fs: add __init to functions in init_f2fs_fs · 6e6093a8

由 Namjae Jeon 提交于 1月 17, 2013

Add __init to functions in init_f2fs_fs for code consistency.
Signed-off-by: NNamjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: NAmit Sahrawat <a.sahrawat@samsung.com>
Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>

6e6093a8

04 1月, 2013 1 次提交

f2fs: remove unneeded INIT_LIST_HEAD at few places · 24c366a9

由 Namjae Jeon 提交于 12月 30, 2012

While creating a new entry for addition to the list(orphan inode list
and fsync inode entry list), there is no need to call HEAD initialization
for these entries. So, remove that init part.
Signed-off-by: NNamjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: NAmit Sahrawat <a.sahrawat@samsung.com>
Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>

24c366a9

11 12月, 2012 3 次提交

f2fs: adjust kernel coding style · 0a8165d7

由 Jaegeuk Kim 提交于 11月 29, 2012

As pointed out by Randy Dunlap, this patch removes all usage of "/**" for comment
blocks. Instead, just use "/*".
Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>

0a8165d7

f2fs: fix endian conversion bugs reported by sparse · 25ca923b

由 Jaegeuk Kim 提交于 11月 28, 2012

This patch should resolve the bugs reported by the sparse tool.
Initial reports were written by "kbuild test robot" managed by fengguang.wu.

In my local machines, I've tested also by running:
> make C=2 CF="-D__CHECK_ENDIAN__"

Accordingly, I've found lots of warnings and bugs related to the endian
conversion. And I've fixed all at this moment.
Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>

25ca923b

f2fs: add checkpoint operations · 127e670a

由 Jaegeuk Kim 提交于 11月 02, 2012

This adds functions required by the checkpoint operations.

Basically, f2fs adopts a roll-back model with checkpoint blocks written in the
CP area. The checkpoint procedure includes as follows.

- write_checkpoint()
1. block_operations() freezes VFS calls.
2. submit cached bios.
3. flush_nat_entries() writes NAT pages updated by dirty NAT entries.
4. flush_sit_entries() writes SIT pages updated by dirty SIT entries.
5. do_checkpoint() writes,
  - checkpoint block (#0)
  - orphan inode blocks
  - summary blocks made by active logs
  - checkpoint block (copy of #0)
6. unblock_opeations()

In order to provide an address space for meta pages, f2fs_sb_info has a special
inode, namely meta_inode. This patch also adds the address space operations for
meta_inode.
Signed-off-by: NChul Lee <chur.lee@samsung.com>
Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>

127e670a

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功