Commits · dbe6a5ff4fa78bdfa983458c338831d91b35f315 · openanolis / cloud-kernel

09 Aug, 2013 1 commit

f2fs: fix the use of XATTR_NODE_OFFSET · dbe6a5ff

Committed by Jaegeuk Kim on Aug 09, 2013

This patch fixes the use of XATTR_NODE_OFFSET.

o The offset should not use the several MSB bits that are used for marking node
blocks.

o IS_DNODE should handle XATTR_NODE_OFFSET to avoid potential abnormality
during the fsync call.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

dbe6a5ff
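
To illustrate the first point of the XATTR_NODE_OFFSET fix above, here is a minimal userspace sketch. It assumes a 32-bit field whose low bits carry node marks while the remaining bits carry the node offset. The layout, the bit count, and every name below are hypothetical and are not taken from the f2fs sources. The sketch shows why a special offset must leave the top bits clear: they are shifted out when the field is packed.

```c
#include <stdint.h>

/* Hypothetical layout: the low MARK_BITS bits of a 32-bit field hold
 * node marks, and the remaining bits hold the node offset. */
#define MARK_BITS 2u
#define MARK_MASK ((1u << MARK_BITS) - 1u)

/* Largest offset that still fits once the mark bits are reserved. */
#define SPECIAL_XATTR_OFFSET (UINT32_MAX >> MARK_BITS)

/* Pack an offset and its marks into one field. The top MARK_BITS bits
 * of the offset are shifted out and lost. */
static uint32_t pack_node_field(uint32_t offset, uint32_t marks)
{
	return (offset << MARK_BITS) | (marks & MARK_MASK);
}

/* Recover the offset part of a packed field. */
static uint32_t unpack_offset(uint32_t field)
{
	return field >> MARK_BITS;
}

/* Recover the mark bits of a packed field. */
static uint32_t unpack_marks(uint32_t field)
{
	return field & MARK_MASK;
}
```

Under these assumptions an all-ones offset does not survive a pack/unpack round trip, whereas SPECIAL_XATTR_OFFSET does, and the mark bits stay intact either way.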

08 Aug, 2013 1 commit

f2fs: fix a build failure due to missing the kobject header · c2d715d1

Committed by Jaegeuk Kim on Aug 08, 2013

This patch should resolve the following error reported by kbuild test robot.

All error/warnings:

   In file included from fs/f2fs/dir.c:13:0:
   >> fs/f2fs/f2fs.h:435:17: error: field 's_kobj' has incomplete type
        struct kobject s_kobj;

The failure was caused by the missing kobject header file in dir.c.
So, this patch moves the header file to the right location, f2fs.h.

CC: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

c2d715d1

06 Aug, 2013 2 commits

f2fs: fix a deadlock in fsync · a569469e

Committed by Jin Xu on Aug 05, 2013

This patch fixes a deadlock bug that occurs quite often when there are
concurrent write and fsync calls on the same file.

The following is the simplified call trace when the tasks hang.

fsync thread:
- f2fs_sync_file
 ...
 - f2fs_write_data_pages
 ...
  - update_extent_cache
  ...
   - update_inode
    - wait_on_page_writeback

bdi writeback thread:
- __writeback_single_inode
 - f2fs_write_data_pages
  - mutex_lock(sbi->writepages)

The deadlock happens when the fsync thread waits on an inode page that has
been added to f2fs' cached bio sbi->bio[NODE], and unfortunately
no one else is able to submit the cached bio to the block layer for
writeback. This is because the fsync thread already holds sbi->fs_lock and
the sbi->writepages lock, causing the bdi thread to be blocked when it attempts
to write data pages for the same inode. At the same time, the f2fs_gc thread
does not notice the situation and cannot help. Even the sync syscall
gets blocked.

To fix it, we submit the cached bio first before waiting on an inode page
that is being written back.

Signed-off-by: Jin Xu <jinuxstyle@gmail.com>
[Jaegeuk Kim: add more cases to use f2fs_wait_on_page_writeback]
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

a569469e

f2fs: add sysfs support for controlling the gc_thread · b59d0bae

Committed by Namjae Jeon on Aug 04, 2013

Add sysfs entries to control the timing parameters for
the f2fs gc thread.

The sysfs options introduced are:
gc_min_sleep_time: Min sleep time for GC in ms
gc_max_sleep_time: Max sleep time for GC in ms
gc_no_gc_sleep_time: Default sleep time for GC in ms

Cc: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Pankaj Kumar <pankaj.km@samsung.com>
Reviewed-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
[Jaegeuk Kim: fix an umount bug and some minor changes]
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

b59d0bae

30 Jul, 2013 5 commits

f2fs: fix handling orphan inodes · cbd56e7d

Committed by Jaegeuk Kim on Jul 30, 2013

This patch fixes mishandling of the sbi->n_orphans variable.

If users request lots of f2fs_unlink() calls, check_orphan_space() can be contended.
In such a case, sbi->n_orphans can be read incorrectly, so that f2fs_unlink()
falls into the wrong state, which results in the failure of
add_orphan_inode().

So, let's increment sbi->n_orphans virtually prior to the actual orphan inode
handling. After that, let's release sbi->n_orphans by calling release_orphan_inode
or remove_orphan_inode.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

cbd56e7d
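
The reserve-first idea in the orphan inode fix above can be sketched in userspace as follows. This is only an illustration under stated assumptions: it uses C11 atomics instead of whatever locking f2fs actually uses, and the limit and the function names are hypothetical. A caller reserves a slot in the counter before doing any orphan work and gives it back if the work is abandoned.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define MAX_ORPHANS 4u	/* hypothetical limit, chosen only for the sketch */

/* Stand-in for sbi->n_orphans; static storage starts at zero. */
static atomic_uint n_orphans;

/* Reserve one orphan slot up front. Returns false when the limit has
 * already been reached, so the caller can fail early. */
static bool acquire_orphan_slot(void)
{
	unsigned int cur = atomic_load(&n_orphans);

	while (cur < MAX_ORPHANS) {
		/* On failure, cur is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&n_orphans, &cur, cur + 1))
			return true;
	}
	return false;
}

/* Give a reserved slot back, either because the operation was
 * abandoned or because the orphan entry was removed. */
static void release_orphan_slot(void)
{
	atomic_fetch_sub(&n_orphans, 1);
}
```

Because the check and the increment happen in one atomic step, two concurrent callers cannot both see the last free slot.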

f2fs: update file name in the inode block during f2fs_rename · 1cd14caf

Committed by Jaegeuk Kim on Jul 18, 2013

The error is reproducible by:
0. mkfs.f2fs /dev/sdb1 & mount
1. touch test1
2. touch test2
3. mv test1 test2
4. umount
5. dump.f2fs -i 4 /dev/sdb1

After this, when we retrieve the inode->i_name of test2 by dump.f2fs, we get
test1 instead of test2.
This is because f2fs didn't update the file name during f2fs_rename.

So, this patch fixes that.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

1cd14caf

f2fs: introduce help function F2FS_NODE() · 45590710

Committed by Gu Zheng on Jul 15, 2013

Introduce the helper function F2FS_NODE() to simplify the conversion of node_page to
f2fs_node.

Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

45590710

f2fs: add a help func F2FS_STAT() to get the f2fs_stat_info · 963d4f7d

Committed by Gu Zheng on Jul 12, 2013

Add a helper function F2FS_STAT() to get the f2fs_stat_info.

Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

963d4f7d

f2fs: add proc entry to monitor current usage of segments · 5e176d54

Committed by Jaegeuk Kim on Jun 28, 2013

You can monitor the valid block counts of all segments in:
  /proc/fs/f2fs/sdb1/segment_info.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

5e176d54

02 Jul, 2013 1 commit

f2fs: fix crc endian conversion · 7e586fa0

Committed by Jaegeuk Kim on Jun 19, 2013

While calculating the CRC for the checkpoint block, we use __u32, but when storing
the crc value to the disk, we use __le32.

Let's fix the inconsistency.

Reported-and-Tested-by: Oded Gabbay <ogabbay@advaoptical.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

7e586fa0
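
The CRC endian fix above concerns the difference between a value in CPU byte order and its little-endian on-disk form. A minimal userspace sketch of that conversion is shown below. The helper names are hypothetical stand-ins and are not the kernel's own conversion macros. The point is that the stored bytes are the same on every host, regardless of the CPU's byte order.

```c
#include <stdint.h>

/* Store a CPU-order 32-bit value as four little-endian bytes,
 * independent of the byte order of the host. */
static void put_le32(uint8_t out[4], uint32_t v)
{
	out[0] = (uint8_t)(v & 0xffu);
	out[1] = (uint8_t)((v >> 8) & 0xffu);
	out[2] = (uint8_t)((v >> 16) & 0xffu);
	out[3] = (uint8_t)((v >> 24) & 0xffu);
}

/* Load four little-endian bytes back into a CPU-order value. */
static uint32_t get_le32(const uint8_t in[4])
{
	return (uint32_t)in[0] |
	       ((uint32_t)in[1] << 8) |
	       ((uint32_t)in[2] << 16) |
	       ((uint32_t)in[3] << 24);
}
```

Comparing a computed CRC against a stored one is only meaningful after both are in the same representation, for example by passing the stored bytes through get_le32() first.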

14 Jun, 2013 3 commits

f2fs: recover wrong pino after checkpoint during fsync · 354a3399

Committed by Jaegeuk Kim on Jun 14, 2013

If a file is linked, f2fs loses its parent inode number, so that fsync calls
for the linked file have to do a checkpoint every time.
But, if we can recover its parent inode number after the checkpoint, we can
use the roll-forward mechanism for further fsync calls, which can
improve the fsync performance significantly.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

354a3399

f2fs: make locate_dirty_segment() as static · 8d8451af

Committed by Haicheng Li on Jun 13, 2013

It's used only locally and could be static.

Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

8d8451af

f2fs: avoid freqeunt write_inode calls · b3783873

Committed by Jaegeuk Kim on Jun 10, 2013

If update_inode is called, we don't need to do write_inode.
So, let's use a *dirty* flag for each inode.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

b3783873

12 Jun, 2013 1 commit

f2fs: sync dir->i_size with its block allocation · 699489bb

Committed by Jaegeuk Kim on Jun 07, 2013

If a new dentry block is allocated and its i_size is updated, we should update
its inode block together in order to sync i_size and its block allocation.
Otherwise, we can lose the additional dentry block due to the inconsistent i_size.

Erroneous Scenario
------------------

In the recovery routine,
 - recovery_dentry
 | - __f2fs_add_link
 | | - get_new_data_page
 | | | - i_size_write(new_i_size)
 | | | - mark_inode_dirty_sync(dir)
 | | - update_parent_metadata
 | | | - mark_inode_dirty(dir)
 |
 - write_checkpoint
   - sync_dirty_dir_inodes
     - filemap_flush(dentry_blocks)
       - f2fs_write_data_page
         - skip to write the last dentry block due to index < i_size

In the above flow, new_i_size is not updated to its inode block so that the
last dentry block will be lost accordingly.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

699489bb

11 Jun, 2013 2 commits

f2fs: fix i_blocks translation on various types of files · 2d4d9fb5

Committed by Jaegeuk Kim on Jun 07, 2013

Basically an inode manages the number of allocated blocks with inode->i_blocks,
which is represented in units of sectors, not file system blocks.
But f2fs has used i_blocks in units of file system blocks, and f2fs_getattr
translates it to the number of sectors when fstat is called.

However, previously only f2fs_file_inode_operations had this, so this patch adds
it to all the types of inode_operations.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

2d4d9fb5
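
The translation described in the i_blocks fix above is a unit conversion from file system blocks to 512-byte sectors. A minimal sketch follows, assuming 4 KB file system blocks. The constants and the function name are hypothetical and only illustrate the arithmetic; this is not the f2fs_getattr implementation.

```c
#include <stdint.h>

#define SECTOR_SHIFT 9u		/* 512-byte sectors */
#define FS_BLOCK_SHIFT 12u	/* assumed 4 KB file system blocks */

/* Convert a count of file system blocks into a count of 512-byte
 * sectors, the unit in which stat reports allocated blocks. */
static uint64_t fs_blocks_to_sectors(uint64_t fs_blocks)
{
	return fs_blocks << (FS_BLOCK_SHIFT - SECTOR_SHIFT);
}
```

With these assumed sizes one file system block corresponds to eight sectors, so a file that owns three blocks reports 24 sectors.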

f2fs: support xattr security labels · 8ae8f162

Committed by Jaegeuk Kim on Jun 03, 2013

This patch adds support for security labels to f2fs, which will be used
by Linux Security Modules (LSMs).

Quote from http://en.wikipedia.org/wiki/Linux_Security_Modules:
"Linux Security Modules (LSM) is a framework that allows the Linux kernel to
support a variety of computer security models while avoiding favoritism toward
any single security implementation. The framework is licensed under the terms of
the GNU General Public License and is standard part of the Linux kernel since
Linux 2.6. AppArmor, SELinux, Smack and TOMOYO Linux are the currently accepted
modules in the official kernel.".

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

8ae8f162

07 Jun, 2013 1 commit

f2fs: fix iget/iput of dir during recovery · 5deb8267

Committed by Jaegeuk Kim on Jun 05, 2013

It is possible that iput is skipped after iget during the recovery.

In recover_dentry(),
 dir = f2fs_iget();
 ...
 if (de && inode->i_ino == le32_to_cpu(de->ino))
	goto out;

In this case, this dir cannot be added to dirty_dir_inode_list.
The actual linking is done only when set_page_dirty() is called.

So let's add this newly obtained inode to the list explicitly, and put it at the
end of the recovery routine.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

5deb8267

28 May, 2013 8 commits

f2fs: push some variables to debug part · 35b09d82

Committed by Namjae Jeon on May 23, 2013

Some counters are needed only for the statistical information
while debugging.
So, those can be controlled using CONFIG_F2FS_STAT_FS,
moving the usage of a few variables under this flag.

Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

35b09d82

f2fs: align data types between on-disk and in-memory block addresses · a9841c4d

Committed by Jaegeuk Kim on May 24, 2013

The on-disk block address is defined as __le32, but the in-memory block address,
block_t, is defined as u64.

Let's synchronize them to 32 bits.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

a9841c4d

f2fs: reuse the locked dnode page and its inode · b292dcab

Committed by Jaegeuk Kim on May 22, 2013

This patch fixes the following deadlock bug during the recovery.

INFO: task mount:1322 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mount           D ffffffff81125870     0  1322   1266 0x00000000
 ffff8801207e39d8 0000000000000046 ffff88012ab1dee0 0000000000000046
 ffff8801207e3a08 ffff880115903f40 ffff8801207e3fd8 ffff8801207e3fd8
 ffff8801207e3fd8 ffff880115903f40 ffff8801207e39d8 ffff88012fc94520
Call Trace:
[<ffffffff81125870>] ? __lock_page+0x70/0x70
[<ffffffff816a92d9>] schedule+0x29/0x70
[<ffffffff816a93af>] io_schedule+0x8f/0xd0
[<ffffffff8112587e>] sleep_on_page+0xe/0x20
[<ffffffff816a649a>] __wait_on_bit_lock+0x5a/0xc0
[<ffffffff81125867>] __lock_page+0x67/0x70
[<ffffffff8106c7b0>] ? autoremove_wake_function+0x40/0x40
[<ffffffff81126857>] find_lock_page+0x67/0x80
[<ffffffff8112698f>] find_or_create_page+0x3f/0xb0
[<ffffffffa03901a8>] ? sync_inode_page+0xa8/0xd0 [f2fs]
[<ffffffffa038fdf7>] get_node_page+0x67/0x180 [f2fs]
[<ffffffffa039818b>] recover_fsync_data+0xacb/0xff0 [f2fs]
[<ffffffff816aaa1e>] ? _raw_spin_unlock+0x3e/0x40
[<ffffffffa0389634>] f2fs_fill_super+0x7d4/0x850 [f2fs]
[<ffffffff81184cf9>] mount_bdev+0x1c9/0x210
[<ffffffffa0388e60>] ? validate_superblock+0x180/0x180 [f2fs]
[<ffffffffa0387635>] f2fs_mount+0x15/0x20 [f2fs]
[<ffffffff81185a13>] mount_fs+0x43/0x1b0
[<ffffffff81145ba0>] ? __alloc_percpu+0x10/0x20
[<ffffffff811a0796>] vfs_kern_mount+0x76/0x120
[<ffffffff811a2cb7>] do_mount+0x237/0xa10
[<ffffffff81140b9b>] ? strndup_user+0x5b/0x80
[<ffffffff811a3520>] SyS_mount+0x90/0xe0
[<ffffffff816b3502>] system_call_fastpath+0x16/0x1b

The bug is triggered when check_index_in_prev_nodes tries to get the direct
node page by calling get_node_page.
At this point, if the direct node page is already locked by get_dnode_of_data,
its caller, we get a deadlock condition.

This patch adds an additional condition check for the reuse of locked direct node
pages prior to the get_node_page call.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

b292dcab

f2fs: add f2fs_readonly() · 77888c1e

Committed by Jaegeuk Kim on May 20, 2013

Introduce a simple macro function for readability.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

77888c1e

f2fs, lockdep: annotate mutex_lock_all() · bfe35965

Committed by Peter Zijlstra on May 16, 2013

Majianpeng reported a lockdep splat for f2fs. It turns out mutex_lock_all()
acquires an array of locks (in global/local lock style).

Any such operation is always serialized using cp_mutex, therefore there is no
fs_lock[] lock-order issue; tell lockdep about this using the
mutex_lock_nest_lock() primitive.

Reported-by: majianpeng <majianpeng@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

bfe35965

f2fs: update inode page after creation · 44a83ff6

Committed by Jaegeuk Kim on May 20, 2013

I found a bug when testing power-off-recovery as follows.

[Bug Scenario]
1. create a file
2. fsync the file
3. reboot w/o any sync
4. try to recover the file
 - found its fsync mark
 - found its dentry mark
   : try to recover its dentry
    - get its file name
    - get its parent inode number
     : here we got zero value

The reason why we get the wrong parent inode number is that we didn't
fully synchronize the inode page with its newly created inode information.

In particular, f2fs previously stored fi->i_pino and wrote it to the cached
node page in the wrong order, which led to a zero-valued i_pino during the
recovery.

So, this patch modifies the creation flow to fix the synchronization order of
the inode page with its inode.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

44a83ff6

f2fs: change get_new_data_page to pass a locked node page · 64aa7ed9

Committed by Jaegeuk Kim on May 20, 2013

This patch is for passing a locked node page to get_dnode_of_data.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

64aa7ed9

f2fs: fix BUG_ON during f2fs_evict_inode(dir) · 74d0b917

Committed by Jaegeuk Kim on May 15, 2013

During the dentry recovery routine, recover_inode() triggers __f2fs_add_link
with its directory inode.

In the following scenario, a bug is captured.
 1. dir = f2fs_iget(pino)
 2. __f2fs_add_link(dir, name)
 3. iput(dir)
  -> f2fs_evict_inode() hits BUG_ON(atomic_read(fi->dirty_dents))

Kernel BUG at ffffffffa01c0676 [verbose debug info unavailable]
[<ffffffffa01c0676>] f2fs_evict_inode+0x276/0x300 [f2fs]
Call Trace:
 [<ffffffff8118ea00>] evict+0xb0/0x1b0
 [<ffffffff8118f1c5>] iput+0x105/0x190
 [<ffffffffa01d2dac>] recover_fsync_data+0x3bc/0x1070 [f2fs]
 [<ffffffff81692e8a>] ? io_schedule+0xaa/0xd0
 [<ffffffff81690acb>] ? __wait_on_bit_lock+0x7b/0xc0
 [<ffffffff8111a0e7>] ? __lock_page+0x67/0x70
 [<ffffffff81165e21>] ? kmem_cache_alloc+0x31/0x140
 [<ffffffff8118a502>] ? __d_instantiate+0x92/0xf0
 [<ffffffff812a949b>] ? security_d_instantiate+0x1b/0x30
 [<ffffffff8118a5b4>] ? d_instantiate+0x54/0x70

This means that we should flush all the dentry pages between iget() and iput().
But during the recovery routine this is not allowed for consistency reasons, so we
have to wait for the whole recovery process to finish.
After that, write_checkpoint flushes all the dirty dentry blocks, and we
can safely put the stale dir inodes from the dirty_dir_inode_list.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

74d0b917

29 Apr, 2013 1 commit

f2fs: enhance alloc_nid and build_free_nids flows · 55008d84

Committed by Jaegeuk Kim on Apr 25, 2013

In order to avoid build_free_nid lock contention, let's change the order of
function calls as follows.

At first, check whether there are enough free nids.
 - If available, just get a free nid with spin_lock without any overhead.
 - Otherwise, conduct build_free_nids.
  : scan nat pages, journal nat entries, and nat cache entries.

We should take care not to serve free nids that are still being produced by
build_free_nids.
We can get stable free nids only after build_free_nids is done.

Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

55008d84
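
The fast-path and slow-path ordering described in the alloc_nid commit above can be sketched as follows. This is a single-threaded userspace illustration under stated assumptions. The names alloc_nid and build_free_nids come from the commit message, but the struct, its fields, the batch size, and the stand-in scan are hypothetical, and the spin lock mentioned in the message is omitted.

```c
#include <stdint.h>

#define BATCH 4u	/* hypothetical number of nids found per scan */

struct nid_pool {
	uint32_t free[BATCH];	/* stable free nids, ready to hand out */
	unsigned int count;	/* number of valid entries in free[] */
	uint32_t next_scan;	/* where the stand-in scan continues */
	unsigned int scans;	/* how often the slow path has run */
};

/* Slow path stand-in: collect a complete batch privately and publish
 * it only once the whole scan is done, so that no half-built result
 * is ever served. */
static void build_free_nids(struct nid_pool *p)
{
	uint32_t batch[BATCH];
	unsigned int i;

	for (i = 0; i < BATCH; i++)
		batch[i] = p->next_scan + i;

	/* Publish in reverse so that popping yields ascending nids. */
	for (i = 0; i < BATCH; i++)
		p->free[i] = batch[BATCH - 1 - i];
	p->count = BATCH;
	p->next_scan += BATCH;
	p->scans++;
}

/* Fast path first: take a stable nid if one is available, and fall
 * back to the expensive rebuild only when the pool is empty. */
static uint32_t alloc_nid(struct nid_pool *p)
{
	if (p->count == 0)
		build_free_nids(p);
	return p->free[--p->count];
}
```

In this sketch the rebuild runs once per BATCH allocations, so most callers never touch the slow path.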

26 Apr, 2013 1 commit

f2fs: give a chance to merge IOs by IO scheduler · c718379b

Committed by Jaegeuk Kim on Apr 24, 2013

Previously, background GC submitted many 4KB read requests to load victim blocks
and/or their (i)node blocks.

...
f2fs_gc : f2fs_readpage: ino = 1, page_index = 0xb61, blkaddr = 0x3b964ed
f2fs_gc : block_rq_complete: 8,16 R () 499854968 + 8 [0]
f2fs_gc : f2fs_readpage: ino = 1, page_index = 0xb6f, blkaddr = 0x3b964ee
f2fs_gc : block_rq_complete: 8,16 R () 499854976 + 8 [0]
f2fs_gc : f2fs_readpage: ino = 1, page_index = 0xb79, blkaddr = 0x3b964ef
f2fs_gc : block_rq_complete: 8,16 R () 499854984 + 8 [0]
...

However, since many of the IOs are sequential, we can give the IO scheduler a
chance to merge them.
In order to do that, let's use blk_plug.

...
f2fs_gc : f2fs_iget: ino = 143
f2fs_gc : f2fs_readpage: ino = 143, page_index = 0x1c6, blkaddr = 0x2e6ee
f2fs_gc : f2fs_iget: ino = 143
f2fs_gc : f2fs_readpage: ino = 143, page_index = 0x1c7, blkaddr = 0x2e6ef
<idle> : block_rq_complete: 8,16 R () 1519616 + 8 [0]
<idle> : block_rq_complete: 8,16 R () 1519848 + 8 [0]
<idle> : block_rq_complete: 8,16 R () 1520432 + 96 [0]
<idle> : block_rq_complete: 8,16 R () 1520536 + 104 [0]
<idle> : block_rq_complete: 8,16 R () 1521008 + 112 [0]
<idle> : block_rq_complete: 8,16 R () 1521440 + 152 [0]
<idle> : block_rq_complete: 8,16 R () 1521688 + 144 [0]
<idle> : block_rq_complete: 8,16 R () 1522128 + 192 [0]
<idle> : block_rq_complete: 8,16 R () 1523256 + 328 [0]
...

Note that this issue should be addressed in the checkpoint and some readahead
flows too.

Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

c718379b

09 Apr, 2013 1 commit

f2fs: introduce a new global lock scheme · 39936837

Committed by Jaegeuk Kim on Nov 22, 2012

In the previous version, f2fs used global locks according to the usage types,
such as directory operations, block allocation, block write, and so on.

Refer to the following lock types in f2fs.h.
enum lock_type {
	RENAME,		/* for renaming operations */
	DENTRY_OPS,	/* for directory operations */
	DATA_WRITE,	/* for data write */
	DATA_NEW,	/* for data allocation */
	DATA_TRUNC,	/* for data truncate */
	NODE_NEW,	/* for node allocation */
	NODE_TRUNC,	/* for node truncate */
	NODE_WRITE,	/* for node write */
	NR_LOCK_TYPE,
};

In that case, we lose performance in a multi-threaded environment,
since every type of operation must be conducted one at a time.

In order to address the problem, let's share the locks globally with a mutex
array regardless of the operation type.
So, let users grab a mutex and perform their jobs in parallel as much as
possible.

For this, I propose a new global lock scheme as follows.

0. Data structure
 - f2fs_sb_info -> mutex_lock[NR_GLOBAL_LOCKS]
 - f2fs_sb_info -> node_write

1. mutex_lock_op(sbi)
 - try to get an available lock from the array.
 - returns the index of the acquired lock variable.

2. mutex_unlock_op(sbi, index of the lock)
 - unlock the given index of the lock.

3. mutex_lock_all(sbi)
 - grab all the locks in the array before the checkpoint.

4. mutex_unlock_all(sbi)
 - release all the locks in the array after checkpoint.

5. block_operations()
 - call mutex_lock_all()
 - sync_dirty_dir_inodes()
 - grab node_write
 - sync_node_pages()

Note that
 the pairs of mutex_lock_op()/mutex_unlock_op() and
 mutex_lock_all()/mutex_unlock_all() should be used together.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

39936837
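
The lock-array scheme proposed in the commit above can be approximated in userspace as below. This is a sketch under stated assumptions, not the f2fs code. It swaps kernel mutexes for C11 atomic flags, returns -1 instead of sleeping when every slot is busy, and uses hypothetical names (lock_op, unlock_op, lock_all, unlock_all) and a hypothetical array size.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define NR_GLOBAL_LOCKS 4	/* hypothetical array size */

/* One flag per lock slot; static storage starts out as "not held". */
static atomic_bool slot_held[NR_GLOBAL_LOCKS];

/* Normal operation: grab any free slot and return its index, or -1
 * when every slot is busy (a real implementation would wait). */
static int lock_op(void)
{
	int i;

	for (i = 0; i < NR_GLOBAL_LOCKS; i++)
		if (!atomic_exchange(&slot_held[i], true))
			return i;
	return -1;
}

/* Release the slot that lock_op() returned. */
static void unlock_op(int idx)
{
	atomic_store(&slot_held[idx], false);
}

/* Checkpoint side: take every slot, in index order, so that no normal
 * operation can run concurrently. Busy-waits on held slots. */
static void lock_all(void)
{
	int i;

	for (i = 0; i < NR_GLOBAL_LOCKS; i++)
		while (atomic_exchange(&slot_held[i], true))
			;
}

/* Release every slot after the checkpoint. */
static void unlock_all(void)
{
	int i;

	for (i = 0; i < NR_GLOBAL_LOCKS; i++)
		atomic_store(&slot_held[i], false);
}
```

Up to NR_GLOBAL_LOCKS operations can hold a slot at the same time, while lock_all() excludes all of them at once, which mirrors the pairing rule stated in the note above.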

03 Apr, 2013 1 commit

f2fs: change GC bitmaps to apply the section granularity · 5ec4e49f

Committed by Jaegeuk Kim on Mar 31, 2013

This patch removes a bitmap for victim segments selected by foreground GC, and
modifies the other bitmap for victim segments selected by background GC.

1) foreground GC bitmap
 : We don't need to manage this, since we keep only one previous victim section
   number instead of the whole victim history.
   f2fs uses the victim section number in order not to allocate the section
   currently being GC'ed to the current active logs.

2) background GC bitmap
 : This bitmap is used to avoid selecting victims repeatedly by background GCs.
   In addition, the victims can still be selected by foreground GCs, since
   there is no need to read victim blocks during foreground GCs.

   Since the foreground GC reclaims segments in units of sections, it'd
   be better to manage this bitmap based on the section granularity.

Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

5ec4e49f

27 Mar, 2013 3 commits

f2fs: fix to give correct parent inode number for roll forward · 953a3e27

Committed by Jaegeuk Kim on Mar 21, 2013

When we recover fsync'ed data after power-off-recovery, we should guarantee
that the parent inode number is correct for each direct inode block.

So, let's make the following rules.

- The fsync should do a checkpoint for all the inodes that have experienced hard
links.

- So, only normal files can be recovered by roll-forward.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

953a3e27

f2fs: do not skip writing file meta during fsync · 0ff153a2

Committed by Jaegeuk Kim on Mar 20, 2013

This patch removes the data_version check flow during the fsync call.
The original purpose of data_version was to avoid writing inode
pages redundantly when fsync is called repeatedly.
However, since a user can modify file meta and then call fsync, we should not
skip the fsync procedure.
So, let's remove this condition check and trust that users call fsync
appropriately.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

0ff153a2

f2fs: fix the recovery flow to handle errors correctly · 6ead1142

Committed by Jaegeuk Kim on Mar 20, 2013

We should handle errors during the recovery flow correctly.
For example, if we get -ENOMEM, we should report a mount failure instead of
conducting the remaining mount procedure.

Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

6ead1142

20 Mar, 2013 2 commits

f2fs: fix typo in comments · 111d2495

Committed by Masanari Iida on Mar 19, 2013

Correct spelling typos in comments.

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Acked-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

111d2495

f2fs: avoid BUG_ON from check_nid_range and update return path in do_read_inode · 064e0823

Committed by Namjae Jeon on Mar 17, 2013

In function check_nid_range, there is no need to trigger BUG_ON and stop the kernel.
Instead, it can just check the inode number and report an invalid one with EINVAL.
Update the return path in do_read_inode to use the return value of check_nid_range.

Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com>
[Jaegeuk: replace BUG_ON with WARN_ON]
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

064e0823
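
The change in the check_nid_range commit above replaces a fatal assertion with an error return that the caller propagates. A minimal userspace sketch of that pattern follows. The function name check_nid_range comes from the commit message, while its signature, the max_nid parameter, and read_inode_sketch are assumptions made for illustration. The warning that the commit note mentions is left out.

```c
#include <errno.h>
#include <stdint.h>

/* Report an out-of-range node id as an error instead of halting.
 * max_nid is passed in explicitly here, which is an assumption of
 * this sketch. */
static int check_nid_range(uint32_t nid, uint32_t max_nid)
{
	if (nid >= max_nid)
		return -EINVAL;
	return 0;
}

/* Hypothetical caller: propagate the error from the range check
 * rather than continuing with an invalid inode number. */
static int read_inode_sketch(uint32_t ino, uint32_t max_nid)
{
	int err = check_nid_range(ino, max_nid);

	if (err)
		return err;
	/* The actual inode read is omitted in this sketch. */
	return 0;
}
```

The caller decides how to react to -EINVAL, which keeps a single bad inode number from stopping the whole system.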

19 Mar, 2013 1 commit

f2fs: Fix typo in comments · 434720fa

Committed by Masanari Iida on Mar 19, 2013

Correct spelling typos in comments.

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Acked-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

434720fa

18 Mar, 2013 1 commit

f2fs: introduce readahead mode of node pages · 266e97a8

Committed by Jaegeuk Kim on Feb 26, 2013

Previously, f2fs read several node pages ahead when get_dnode_of_data was called
with the RDONLY_NODE flag.
And, this flag is set by the following functions.
- get_data_block_ro
- get_lock_data_page
- do_write_data_page
- truncate_blocks
- truncate_hole

However, this readahead mechanism was initially introduced for the use of
get_data_block_ro to enhance the sequential read performance.

So, let's clarify all the cases with the additional modes as follows.

enum {
	ALLOC_NODE,	/* allocate a new node page if needed */
	LOOKUP_NODE,	/* look up a node without readahead */
	LOOKUP_NODE_RA,	/*
			 * look up a node with readahead called
			 * by get_data_block_ro.
			 */
};

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com>

266e97a8

12 Feb, 2013 4 commits

f2fs: add compat_ioctl to provide backward compatability · e9750824

Committed by Namjae Jeon on Feb 04, 2013

Add compat_ioctl to provide backward compatibility, i.e. 32-bit binary
execution on a 64-bit kernel.

Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

e9750824

f2fs: clarify and enhance the f2fs_gc flow · 43727527

Committed by Jaegeuk Kim on Feb 04, 2013

This patch clarifies the ambiguous f2fs_gc flow as follows.

1. Remove intermediate checkpoint condition during f2fs_gc
 (i.e., should_do_checkpoint() and GC_BLOCKED)

2. Remove unnecessary return values of f2fs_gc because of #1.
 (i.e., GC_NODE, GC_OK, etc)

3. Simplify write_checkpoint() because of #2.

4. Clarify the main f2fs_gc flow.
 o monitor how many sections are freed during one iteration of do_garbage_collect().
 o do more GC without checkpoints if we can't get enough free sections.
 o do a checkpoint once we've got enough free sections through foreground GCs.

5. Adopt the thread-logging (Slack-Space-Recycle) scheme more aggressively on data
  log types. See get_ssr_segment().

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

43727527

f2fs: make an accessor to get sections for particular block type · 5ac206cf

Committed by Namjae Jeon on Feb 02, 2013

Introduce an accessor to get the sections based upon the block type
(node, dents...) and modify the functions should_do_checkpoint and
has_not_enough_free_secs to use this accessor function to get
the node sections and dent sections.

Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

5ac206cf

f2fs: avoid balanc_fs during evict_inode · d4686d56

Committed by Jaegeuk Kim on Jan 31, 2013

1. Background

Previously, if f2fs tried to move data blocks of an *evicting* inode during the
cleaning process, it stopped the process incompletely and then restarted the whole
process, since it needs a locked inode to grab victim data pages in its address
space. In order to get a locked inode, iget_locked() by f2fs_iget() is normally
used, but it waits if the inode is being freed.

So, here is a deadlock scenario.
1. f2fs_evict_inode()       <- inode "A"
  2. f2fs_balance_fs()
    3. f2fs_gc()
      4. gc_data_segment()
        5. f2fs_iget()      <- inode "A" too!

If steps #1 and #5 treat the same inode "A", step #5 would fall into deadlock since
the inode "A" is being freed. In order to resolve this, f2fs_iget_nowait(), which
skips __wait_on_freeing_inode(), was introduced in step #5, and it stops f2fs_gc()
to complete f2fs_evict_inode().

1. f2fs_evict_inode()           <- inode "A"
  2. f2fs_balance_fs()
    3. f2fs_gc()
      4. gc_data_segment()
        5. f2fs_iget_nowait()   <- inode "A", then stop f2fs_gc() w/ -ENOENT

2. Problem and Solution

In the above scenario, however, f2fs cannot finish f2fs_evict_inode() if:
 o there are not enough free sections, and
 o f2fs_gc() tries to move data blocks of the *evicting* inode repeatedly.

So, the final solution is to use f2fs_iget() and remove f2fs_balance_fs() in
f2fs_evict_inode().
f2fs_evict_inode() actually truncates all the data and node blocks, which
means that it doesn't produce any dirty node pages.
So, we don't need to do f2fs_balance_fs() in practice.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

d4686d56

openanolis / cloud-kernel synced successfully about 1 year ago
