Commits · f28c06fa6f3d3215a1ba5e62ebc5ce7229d7a895 · openeuler / raspberrypi-kernel

28 May, 2013 25 commits

f2fs: dereferencing an ERR_PTR · f28c06fa

Authored by Dan Carpenter on May 23, 2013

There is an error path where "dir" is an ERR_PTR.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

f28c06fa
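The pattern behind this fix can be sketched in userspace C. The macros below are simplified stand-ins for the kernel's ERR_PTR helpers, and `demo_inode`, `demo_iget`, and `demo_use_dir` are hypothetical names used only for illustration; this is not the actual f2fs patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's ERR_PTR helpers (simplified). */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	/* Sanity check for this sketch only: a valid negative errno. */
	assert(error < 0 && error >= -MAX_ERRNO);
	return (void *)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	/* Error pointers occupy the last MAX_ERRNO addresses. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical inode type and lookup, for illustration only. */
struct demo_inode {
	unsigned long ino;
};

static struct demo_inode demo_root = { .ino = 3 };

static struct demo_inode *demo_iget(unsigned long ino)
{
	if (ino != demo_root.ino)
		return ERR_PTR(-ENOENT);
	return &demo_root;
}

/* Returns the inode number, or a negative errno without dereferencing. */
static long demo_use_dir(unsigned long ino)
{
	struct demo_inode *dir = demo_iget(ino);

	if (IS_ERR(dir))
		return PTR_ERR(dir);	/* bail out before touching dir->ino */
	return (long)dir->ino;
}
```

The point of the sketch is only that every error path must test the pointer with IS_ERR() before any field access.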

f2fs: use ihold · 6f6fd833

Authored by Jaegeuk Kim on May 22, 2013

Use the following helper function committed by Al.

commit 7de9c6ee
Author: Al Viro <viro@zeniv.linux.org.uk>
Date:   Sat Oct 23 11:11:40 2010 -0400

    new helper: ihold()

...
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

6f6fd833

f2fs: should not make_bad_inode on f2fs_link failure · 93ff10d6

Authored by Jaegeuk Kim on May 22, 2013

If -ENOSPC is met during f2fs_link, we should not mark the inode as bad.
The inode is still alive.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

93ff10d6

f2fs: fix to handle do_recover_data errors · 39cf72cf

Authored by Jaegeuk Kim on May 22, 2013

This patch adds error handling to check_index_in_prev_nodes and its
caller, do_recover_data.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

39cf72cf

f2fs: reuse the locked dnode page and its inode · b292dcab

Authored by Jaegeuk Kim on May 22, 2013

This patch fixes the following deadlock bug during the recovery.

INFO: task mount:1322 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mount           D ffffffff81125870     0  1322   1266 0x00000000
 ffff8801207e39d8 0000000000000046 ffff88012ab1dee0 0000000000000046
 ffff8801207e3a08 ffff880115903f40 ffff8801207e3fd8 ffff8801207e3fd8
 ffff8801207e3fd8 ffff880115903f40 ffff8801207e39d8 ffff88012fc94520
Call Trace:
[<ffffffff81125870>] ? __lock_page+0x70/0x70
[<ffffffff816a92d9>] schedule+0x29/0x70
[<ffffffff816a93af>] io_schedule+0x8f/0xd0
[<ffffffff8112587e>] sleep_on_page+0xe/0x20
[<ffffffff816a649a>] __wait_on_bit_lock+0x5a/0xc0
[<ffffffff81125867>] __lock_page+0x67/0x70
[<ffffffff8106c7b0>] ? autoremove_wake_function+0x40/0x40
[<ffffffff81126857>] find_lock_page+0x67/0x80
[<ffffffff8112698f>] find_or_create_page+0x3f/0xb0
[<ffffffffa03901a8>] ? sync_inode_page+0xa8/0xd0 [f2fs]
[<ffffffffa038fdf7>] get_node_page+0x67/0x180 [f2fs]
[<ffffffffa039818b>] recover_fsync_data+0xacb/0xff0 [f2fs]
[<ffffffff816aaa1e>] ? _raw_spin_unlock+0x3e/0x40
[<ffffffffa0389634>] f2fs_fill_super+0x7d4/0x850 [f2fs]
[<ffffffff81184cf9>] mount_bdev+0x1c9/0x210
[<ffffffffa0388e60>] ? validate_superblock+0x180/0x180 [f2fs]
[<ffffffffa0387635>] f2fs_mount+0x15/0x20 [f2fs]
[<ffffffff81185a13>] mount_fs+0x43/0x1b0
[<ffffffff81145ba0>] ? __alloc_percpu+0x10/0x20
[<ffffffff811a0796>] vfs_kern_mount+0x76/0x120
[<ffffffff811a2cb7>] do_mount+0x237/0xa10
[<ffffffff81140b9b>] ? strndup_user+0x5b/0x80
[<ffffffff811a3520>] SyS_mount+0x90/0xe0
[<ffffffff816b3502>] system_call_fastpath+0x16/0x1b

The bug is triggered when check_index_in_prev_nodes tries to get the direct
node page by calling get_node_page.
At this point, if the direct node page is already locked by get_dnode_of_data,
its caller, we get a deadlock.

This patch adds a condition check to reuse locked direct node pages prior to
the get_node_page call.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

b292dcab
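The reuse check described in this commit can be illustrated with a small userspace model. All names here (`demo_page`, `demo_trylock`, `demo_get_node`) are hypothetical, and a failed trylock stands in for the point where the kernel's lock_page() would block forever; this is a sketch of the idea, not the f2fs code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a node page guarded by a non-recursive lock. */
struct demo_page {
	unsigned int nid;
	bool locked;
};

/* Returns false where a blocking lock would wait forever on itself. */
static bool demo_trylock(struct demo_page *p)
{
	if (p->locked)
		return false;
	p->locked = true;
	return true;
}

/*
 * Fetch the node page for @nid from @pages (indexed by nid).
 * If the caller already holds a locked page for the same nid, reuse it
 * rather than locking it a second time.
 * Returns NULL where the real code would deadlock.
 */
static struct demo_page *demo_get_node(struct demo_page *pages,
				       unsigned int nid,
				       struct demo_page *held)
{
	if (held && held->nid == nid) {
		assert(held->locked);	/* caller promised it holds the lock */
		return held;
	}
	return demo_trylock(&pages[nid]) ? &pages[nid] : NULL;
}
```

Passing the held page down explicitly keeps the lock non-recursive while still letting the callee detect the "same page" case.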

f2fs: fix wrong condition check · b638f0c4

Authored by Jaegeuk Kim on May 21, 2013

While an orphan inode has zero link_count, f2fs_gc is able to select the inode
for foreground gc.

- f2fs_gc
 - do_garbage_collect
   - gc_data_segment
     : f2fs_iget fails
     : get_valid_blocks() != 0, so it retries
--> here we get an infinite loop.

This patch resolves this issue.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

b638f0c4

f2fs: add f2fs_readonly() · 77888c1e

Authored by Jaegeuk Kim on May 20, 2013

Introduce a simple macro function for readability.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

77888c1e

f2fs: avoid RECLAIM_FS-ON-W: deadlock · 6f85b352

Authored by Jaegeuk Kim on May 20, 2013

This patch tries to avoid the following deadlock condition, in which the reclaim
path can trigger f2fs_balance_fs again.

=================================
[ INFO: inconsistent lock state ]
---------------------------------
inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
kswapd0/41 [HC0[0]:SC0[0]:HE1:SE1] takes:
 (&sbi->gc_mutex){+.+.?.}, at: f2fs_balance_fs+0xe6/0x100 [f2fs]
{RECLAIM_FS-ON-W} state was registered at:
  [<ffffffff810aa5a9>] mark_held_locks+0xb9/0x140
  [<ffffffff810aae85>] lockdep_trace_alloc+0x85/0xf0
  [<ffffffff8113ab2c>] __alloc_pages_nodemask+0x7c/0x9b0
  [<ffffffff81175aa8>] alloc_pages_current+0xb8/0x180
  [<ffffffff811319cf>] __page_cache_alloc+0xaf/0xd0
  [<ffffffff8113225c>] find_or_create_page+0x4c/0xb0
  [<ffffffffa021359e>] find_data_page+0x14e/0x210 [f2fs]
  [<ffffffffa021161b>] f2fs_gc+0x9eb/0xd90 [f2fs]
  [<ffffffffa0218fae>] f2fs_balance_fs+0xee/0x100 [f2fs]
  [<ffffffffa020848c>] f2fs_setattr+0x6c/0x200 [f2fs]
  [<ffffffff811ae51b>] notify_change+0x1db/0x3a0
  [<ffffffff8118fbd0>] do_truncate+0x60/0xa0
  [<ffffffff8118fd95>] vfs_truncate+0x185/0x1b0
  [<ffffffff8118fe1c>] do_sys_truncate+0x5c/0xa0
  [<ffffffff8118ffee>] SyS_truncate+0xe/0x10
  [<ffffffff816e2b42>] system_call_fastpath+0x16/0x1b
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

6f85b352

f2fs: don't do checkpoint if error is occurred · 2c2c149f

Authored by Jaegeuk Kim on May 20, 2013

If we meet an error during the dentry recovery, we should not conduct a checkpoint.
Otherwise, some erroneous dentry blocks overwrite the existing blocks that
contain the remaining recovery information.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

2c2c149f

f2fs: fix to unlock page before exit · 45856aff

Authored by Jaegeuk Kim on May 20, 2013

If we get an error after lock_page, we should unlock the page before exiting.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

45856aff

f2fs: remove unnecessary kmap/kunmap operations · 9a55ed65

Authored by Jaegeuk Kim on May 20, 2013

The allocated page used by the recovery is not in HIGHMEM, so we don't
need to use kmap/kunmap.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

9a55ed65

f2fs: reorganize f2fs_vm_page_mkwrite · 9851e6e1

Authored by Namjae Jeon on April 28, 2013

A few things can be changed in the default mkwrite function:
1) Call file_update_time at the start, before acquiring any lock.
2) Change the condition page_offset(page) >= i_size_read(inode) to
 page_offset(page) > i_size_read.
3) Move wait_on_page_writeback.
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

9851e6e1

f2fs: use list_for_each_entry rather than list_for_each_entry_safe · 145b04e5

Authored by majianpeng on May 14, 2013

We can do this, since we now use a global mutex, f2fs_stat_mutex, to protect its
list operations.
Signed-off-by: Jianpeng Ma <majianpeng@gmail.com>
[Jaegeuk Kim: add description]
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

145b04e5

f2fs: remove unecessary variable and code · 81fb5e87

Authored by Haicheng Li on May 14, 2013

Code cleanup with no behavior change.
Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

81fb5e87

f2fs, lockdep: annotate mutex_lock_all() · bfe35965

Authored by Peter Zijlstra on May 16, 2013

Majianpeng reported a lockdep splat for f2fs. It turns out mutex_lock_all()
acquires an array of locks (in global/local lock style).

Any such operation is always serialized using cp_mutex, therefore there is no
fs_lock[] lock-order issue; tell lockdep about this using the
mutex_lock_nest_lock() primitive.
Reported-by: majianpeng <majianpeng@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

bfe35965

f2fs: add debug msgs in the recovery routine · f356fe0c

Authored by Jaegeuk Kim on May 16, 2013

This patch adds some trivial debugging messages in the recovery process.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

f356fe0c

f2fs: update inode page after creation · 44a83ff6

Authored by Jaegeuk Kim on May 20, 2013

I found a bug when testing power-off-recovery as follows.

[Bug Scenario]
1. create a file
2. fsync the file
3. reboot w/o any sync
4. try to recover the file
 - found its fsync mark
 - found its dentry mark
   : try to recover its dentry
    - get its file name
    - get its parent inode number
     : here we got zero value

The reason why we get the wrong parent inode number is that we didn't
fully synchronize the inode page with its newly created inode information.

In particular, previous f2fs stores fi->i_pino and writes it to the cached
node page in the wrong order, which yields the zero-valued i_pino during
recovery.

So, this patch modifies the creation flow to fix the synchronization order of
the inode page with its inode.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

44a83ff6

f2fs: change get_new_data_page to pass a locked node page · 64aa7ed9

Authored by Jaegeuk Kim on May 20, 2013

This patch is for passing a locked node page to get_dnode_of_data.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

64aa7ed9

f2fs: skip get_node_page if locked node page is passed · 1646cfac

Authored by Jaegeuk Kim on May 20, 2013

If get_dnode_of_data gets a locked node page, let's skip redundant
get_node_page calls.
This is for further enhancement.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

1646cfac

f2fs: remove unnecessary por_doing check · 0a364af1

Authored by Jaegeuk Kim on May 16, 2013

This por_doing check is unrelated to the recovery process.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

0a364af1

f2fs: fix BUG_ON during f2fs_evict_inode(dir) · 74d0b917

Authored by Jaegeuk Kim on May 15, 2013

During the dentry recovery routine, recover_inode() triggers __f2fs_add_link
with its directory inode.

In the following scenario, a bug is triggered.
 1. dir = f2fs_iget(pino)
 2. __f2fs_add_link(dir, name)
 3. iput(dir)
  -> f2fs_evict_inode() hits BUG_ON(atomic_read(fi->dirty_dents))

Kernel BUG at ffffffffa01c0676 [verbose debug info unavailable]
[<ffffffffa01c0676>] f2fs_evict_inode+0x276/0x300 [f2fs]
Call Trace:
 [<ffffffff8118ea00>] evict+0xb0/0x1b0
 [<ffffffff8118f1c5>] iput+0x105/0x190
 [<ffffffffa01d2dac>] recover_fsync_data+0x3bc/0x1070 [f2fs]
 [<ffffffff81692e8a>] ? io_schedule+0xaa/0xd0
 [<ffffffff81690acb>] ? __wait_on_bit_lock+0x7b/0xc0
 [<ffffffff8111a0e7>] ? __lock_page+0x67/0x70
 [<ffffffff81165e21>] ? kmem_cache_alloc+0x31/0x140
 [<ffffffff8118a502>] ? __d_instantiate+0x92/0xf0
 [<ffffffff812a949b>] ? security_d_instantiate+0x1b/0x30
 [<ffffffff8118a5b4>] ? d_instantiate+0x54/0x70

This means that we should flush all the dentry pages between iget and iput().
But, during the recovery routine, this is not allowed for consistency reasons,
so we have to wait for the whole recovery process to finish.
After that, write_checkpoint flushes all the dirty dentry blocks, and we
can safely put the stale dir inodes from the dirty_dir_inode_list.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

74d0b917

f2fs: fix por_doing variable coverage · 8c26d7d5

Authored by Jaegeuk Kim on May 15, 2013

The reason for using sbi->por_doing is to alleviate data writes during the
recovery.
find_fsync_dnodes() produces some dirty dentry pages, so we should
cover it too with sbi->por_doing.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

8c26d7d5

f2fs: remove redundant assignment · addbe45b

Authored by Jaegeuk Kim on May 15, 2013

We don't need to assign a value redundantly.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

addbe45b

f2fs: fix the inconsistent state of data pages · 650495de

Authored by Jaegeuk Kim on May 13, 2013

In get_lock_data_page, if there is a data race between get_dnode_of_data for
node and grab_cache_page for data, f2fs can hit the following
BUG_ON(dn.data_blkaddr == NEW_ADDR).

kernel BUG at /home/zeus/f2fs_test/src/fs/f2fs/data.c:251!
 [<ffffffffa044966c>] get_lock_data_page+0x1ec/0x210 [f2fs]
Call Trace:
 [<ffffffffa043b089>] f2fs_readdir+0x89/0x210 [f2fs]
 [<ffffffff811a0920>] ? fillonedir+0x100/0x100
 [<ffffffff811a0920>] ? fillonedir+0x100/0x100
 [<ffffffff811a07f8>] vfs_readdir+0xb8/0xe0
 [<ffffffff811a0b4f>] sys_getdents+0x8f/0x110
 [<ffffffff816d7999>] system_call_fastpath+0x16/0x1b

This bug can occur when the block address of the data block is
changed after f2fs_put_dnode().
In order to avoid that, this patch fixes the lock order of node and data
blocks so that the node block lock is covered by the data block lock.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

650495de

f2fs: fix inconsistency of block count during recovery · 65e5cd0a

Authored by Jaegeuk Kim on May 14, 2013

Currently f2fs recovers the dentry of fsynced files.
When power-off-recovery is conducted, this newly recovered inode should increase
node block count as well as inode block count.

This patch resolves this inconsistency that results in:

1. create a file
2. write data
3. fsync
4. reboot without sync
5. mount and recover the file
6. node block count is 1 and inode block count is 2
 : fall into the inconsistent state
7. unlink the file
 : trigger the following BUG_ON

------------[ cut here ]------------
kernel BUG at /home/zeus/f2fs_test/src/fs/f2fs/f2fs.h:716!
Call Trace:
 [<ffffffffa0344100>] ? get_node_page+0x50/0x1a0 [f2fs]
 [<ffffffffa0344bfc>] remove_inode_page+0x8c/0x100 [f2fs]
 [<ffffffffa03380f0>] ? f2fs_evict_inode+0x180/0x2d0 [f2fs]
 [<ffffffffa033812e>] f2fs_evict_inode+0x1be/0x2d0 [f2fs]
 [<ffffffff811c7a67>] evict+0xa7/0x1a0
 [<ffffffff811c82b5>] iput+0x105/0x190
 [<ffffffff811c2b30>] d_kill+0xe0/0x120
 [<ffffffff811c2c57>] dput+0xe7/0x1e0
 [<ffffffff811acc3d>] __fput+0x19d/0x2d0
 [<ffffffff811acd7e>] ____fput+0xe/0x10
 [<ffffffff81070645>] task_work_run+0xb5/0xe0
 [<ffffffff81002941>] do_notify_resume+0x71/0xb0
 [<ffffffff8175f14a>] int_signal+0x12/0x17
Reported-and-Tested-by: Chris Fries <C.Fries@motorola.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

65e5cd0a

08 May, 2013 8 commits

f2fs: cover free_nid management with spin_lock · 59bbd474

Authored by Jaegeuk Kim on May 07, 2013

After build_free_nids() searches free nid candidates from nat pages and
current journal blocks, it checks whether any candidate is already allocated,
that is, whether the nat cache holds its nid with an allocated block address.

In this procedure, previously we used
    list_for_each_entry_safe(fnid, next_fnid, &nm_i->free_nid_list, list).
But, this is not covered by free_nid_list_lock, resulting in a null pointer bug.

This patch moves this checking routine inside add_free_nid() in order not to use
the spin_lock.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

59bbd474

f2fs: optimize scan_nat_page() · 23d38844

Authored by Haicheng Li on May 06, 2013

When nm_i->fcnt > 2 * MAX_FREE_NIDS, stop scanning other NAT entries.
Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com>
[Jaegeuk Kim: fix handling the return value of add_free_nid()]
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

23d38844
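The early-exit rule in this commit can be sketched as a bounded scan. `DEMO_MAX_FREE_NIDS`, `demo_scan`, and the convention that a zero block address means "free" are assumptions made for this example, not values or code taken from f2fs.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_FREE_NIDS 4	/* hypothetical value for illustration */

/*
 * Scan @n NAT-like entries and count free ones on top of @fcnt.
 * An entry with block address 0 is treated as free in this sketch.
 * Scanning stops as soon as the free count exceeds twice the cache size.
 * Returns the updated free count.
 */
static unsigned int demo_scan(const unsigned int *blk_addr, size_t n,
			      unsigned int fcnt)
{
	assert(blk_addr != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (fcnt > 2 * DEMO_MAX_FREE_NIDS)
			break;		/* enough free nids cached; stop early */
		if (blk_addr[i] == 0)
			fcnt++;
	}
	return fcnt;
}
```

With 20 free entries and an empty cache, the scan in this sketch stops after collecting 9 nids instead of walking all 20.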

f2fs: code cleanup for scan_nat_page() and build_free_nids() · 8760952d

Authored by Haicheng Li on May 06, 2013

This patch does two cleanups:
1. remove unused variable "fcnt" in build_free_nids().
2. make scan_nat_page() return void and remove useless variable "fcnt".
Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

8760952d

f2fs: bugfix for alloc_nid_failed() · 95630cba

Authored by Haicheng Li on May 06, 2013

Directly drop the free_nid cache when nm_i->fcnt > 2 * MAX_FREE_NIDS.

Since no nmi->free_nid_list_lock spinlock protection covers the span between
sequential calls to alloc_nid() and alloc_nid_failed(), other threads may
already have added new free_nids to the free_nid_list during this
period.

We need to make sure nmi->fcnt is never > 2 * MAX_FREE_NIDS.
Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com>
[Jaegeuk Kim: fit the coding style]
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

95630cba

f2fs: recover when journal contains deleted files · 047184b4

Authored by Chris Fries on May 02, 2013

When recovering a journal file with fsync data for files that have
been deleted, don't bail out on recovery.
Signed-off-by: Chris Fries <C.Fries@motorola.com>
Reviewed-by: Russell Knize <rknize2@motorola.com>
Reviewed-by: Jason Hrycay <jason.hrycay@motorola.com>
[Jaegeuk Kim: fit the coding style]
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

047184b4

f2fs: continue to mount after failing recovery · bde582b2

Authored by Chris Fries on May 02, 2013

When unable to roll forward the journal, we shouldn't bail out and
not mount, we should continue to attempt the mount.  Bad recovery data
is likely unrecoverable at this point, and requiring the user to try
to mount again doesn't solve any issues.
Signed-off-by: Chris Fries <C.Fries@motorola.com>
Reviewed-by: Russell Knize <rknize2@motorola.com>
Reviewed-by: Jason Hrycay <jason.hrycay@motorola.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

bde582b2

f2fs: avoid deadlock during evict after f2fs_gc · 531ad7d5

Authored by Jaegeuk Kim on April 30, 2013

o Deadlock case #1

Thread 1:
- writeback_sb_inodes
 - do_writepages
  - f2fs_write_data_pages
   - write_cache_pages
    - f2fs_write_data_page
     - f2fs_balance_fs
      - wait mutex_lock(gc_mutex)

Thread 2:
- f2fs_balance_fs
 - mutex_lock(gc_mutex)
 - f2fs_gc
  - f2fs_iget
   - wait iget_locked(inode->i_lock)

Thread 3:
- do_unlinkat
 - iput
  - lock(inode->i_lock)
   - evict
    - inode_wait_for_writeback

o Deadlock case #2

Thread 1:
- __writeback_single_inode
 : set I_SYNC
  - do_writepages
   - f2fs_write_data_page
    - f2fs_balance_fs
     - f2fs_gc
      - iput
       - evict
        - inode_wait_for_writeback(I_SYNC)

In order to avoid this, even though iput is called with a zero reference
count, we need to stop the eviction procedure if the inode is under writeback.
So this patch hooks up f2fs_drop_inode, which checks the I_SYNC flag.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

531ad7d5

aio: don't include aio.h in sched.h · a27bb332

Authored by Kent Overstreet on May 07, 2013

Faster kernel compiles by way of fewer unnecessary includes.

[akpm@linux-foundation.org: fix fallout]
[akpm@linux-foundation.org: fix build]
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

a27bb332

30 April, 2013 3 commits

f2fs: modify the number of issued pages to merge IOs · ac5d156c

Authored by Jaegeuk Kim on April 29, 2013

When testing f2fs on an SSD, I found that some 128-page IOs followed by a
1-page IO were issued by f2fs_write_node_pages.
This means that some flows were mishandled, which degrades performance.

Previously, f2fs_write_node_pages determined the number of pages to be written,
nr_to_write, as follows.

1. The bio_get_nr_vecs returns 129 pages.
2. The bio_alloc makes room for 128 pages.
3. The initial 128 pages go into one bio.
4. The existing bio is submitted, and a new bio is prepared for the last 1 page.
5. Finally, sync_node_pages submits the last 1 page bio.

The problem comes from the use of bio_get_nr_vecs, so this patch replaces it
with max_hw_blocks using queue_max_sectors.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

ac5d156c
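The arithmetic in the five steps above can be reproduced with a small helper that splits a page count into bios of bounded capacity. `demo_split_bios` is a hypothetical function written for this illustration; only the numbers 129 and 128 come from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Split @nr_to_write pages into bios holding at most @bio_cap pages each.
 * The size of each bio is stored in @out (up to @out_len entries).
 * Returns the number of bios produced.
 */
static size_t demo_split_bios(unsigned int nr_to_write, unsigned int bio_cap,
			      unsigned int *out, size_t out_len)
{
	size_t n = 0;

	assert(bio_cap > 0);
	while (nr_to_write > 0 && n < out_len) {
		unsigned int chunk = nr_to_write < bio_cap ? nr_to_write
							   : bio_cap;

		out[n++] = chunk;
		nr_to_write -= chunk;
	}
	return n;
}
```

A request of 129 pages against a capacity of 128 yields two bios of 128 and 1 pages, which is the pattern observed in the commit; a request that matches the capacity yields a single full bio.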

f2fs: remove useless #include <linux/proc_fs.h> as we're now using sysfs as debug entry. · b743ba78

Authored by Haicheng Li on April 28, 2013

Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

b743ba78

f2fs: fix inconsistent using of NM_WOUT_THRESHOLD · 6cac3759

Authored by Haicheng Li on April 28, 2013

try_to_free_nats() is usually called with parameter nr_shrink as
	"nm_i->nat_cnt - NM_WOUT_THRESHOLD"
by flush_nat_entries() during the checkpointing process.

However, this is inconsistent with the actual threshold check
	"if (nm_i->nat_cnt < 2 * NM_WOUT_THRESHOLD)",
which will ignore the free_nats requests when
	NM_WOUT_THRESHOLD < nm_i->nat_cnt < 2 * NM_WOUT_THRESHOLD

So fix the threshold check condition.
Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

6cac3759
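A worked example makes the gap visible. `DEMO_THRESHOLD` is a hypothetical value, and `demo_new_check_skips` shows one possible consistent condition, assumed here for illustration; the commit message does not quote the exact replacement.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_THRESHOLD 100	/* hypothetical value, not from the source */

static_assert(DEMO_THRESHOLD > 0, "threshold must be positive");

/* Shrink request as computed by the caller: nat_cnt - threshold. */
static unsigned int demo_nr_shrink(unsigned int nat_cnt)
{
	return nat_cnt > DEMO_THRESHOLD ? nat_cnt - DEMO_THRESHOLD : 0;
}

/* Old check quoted in the commit: skip unless nat_cnt >= 2 * threshold. */
static bool demo_old_check_skips(unsigned int nat_cnt)
{
	return nat_cnt < 2 * DEMO_THRESHOLD;
}

/* One consistent alternative (an assumption): skip only at or below it. */
static bool demo_new_check_skips(unsigned int nat_cnt)
{
	return nat_cnt <= DEMO_THRESHOLD;
}
```

With nat_cnt = 150, the caller asks to shrink 50 entries, yet the old check skips the request; the alternative check lets it through.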

29 April, 2013 3 commits

f2fs: check truncation of mapping after lock_page · afcb7ca0

Authored by Jaegeuk Kim on April 26, 2013

We call lock_page when we need to update a page after readpage.
Between grabbing and locking the page, the page can be truncated by another thread.
So, after lock_page, we should check whether the page was truncated.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

afcb7ca0
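The check described in this commit follows a common "lock, then revalidate" pattern. The sketch below models it in userspace; `demo_mapping`, `demo_page`, and `demo_lock_and_check` are hypothetical names, and a NULL mapping pointer stands in for a truncated page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for an address space and a cached page. */
struct demo_mapping {
	int id;
};

struct demo_page {
	struct demo_mapping *mapping;	/* NULL once the page is truncated */
	bool locked;
};

static void demo_lock_page(struct demo_page *p)
{
	assert(!p->locked);
	p->locked = true;
}

static void demo_unlock_page(struct demo_page *p)
{
	assert(p->locked);
	p->locked = false;
}

/*
 * Lock @page, then confirm it still belongs to @mapping.
 * Returns true with the page locked on success.
 * Returns false with the page unlocked if it was truncated in between,
 * so the caller can look the page up again.
 */
static bool demo_lock_and_check(struct demo_page *page,
				struct demo_mapping *mapping)
{
	demo_lock_page(page);
	if (page->mapping != mapping) {
		demo_unlock_page(page);
		return false;
	}
	return true;
}
```

The revalidation has to happen after the lock is taken, because only the lock stops a concurrent truncation from changing the answer.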

f2fs: enhance alloc_nid and build_free_nids flows · 55008d84

Authored by Jaegeuk Kim on April 25, 2013

In order to avoid build_free_nid lock contention, let's change the order of
function calls as follows.

At first, check whether there are enough free nids.
 - If available, just get a free nid with spin_lock without any overhead.
 - Otherwise, conduct build_free_nids.
  : scan nat pages, journal nat entries, and nat cache entries.

We should take care not to serve free nids that are only intermediately made by
build_free_nids.
We can get stable free nids only after build_free_nids is done.
Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

55008d84
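The fast-path and slow-path ordering described in this commit can be sketched as follows. Locking is omitted, and `demo_nm`, `demo_build_free_nids`, `demo_alloc_nid`, and `DEMO_CACHE` are hypothetical names for this single-threaded illustration rather than the f2fs implementation.

```c
#include <assert.h>

#define DEMO_CACHE 8	/* hypothetical cache size */

/* Hypothetical node manager: a small stack of cached free nids. */
struct demo_nm {
	unsigned int free_nids[DEMO_CACHE];
	unsigned int fcnt;	/* number of cached free nids */
	unsigned int next_scan;	/* next nid the slow path will hand out */
	unsigned int builds;	/* how many times the slow path ran */
};

/* Slow path: refill the whole cache before anything is served from it. */
static void demo_build_free_nids(struct demo_nm *nm)
{
	nm->builds++;
	while (nm->fcnt < DEMO_CACHE)
		nm->free_nids[nm->fcnt++] = nm->next_scan++;
}

/* Fast path first: only build when the cache has nothing to offer. */
static unsigned int demo_alloc_nid(struct demo_nm *nm)
{
	if (nm->fcnt == 0)
		demo_build_free_nids(nm);
	assert(nm->fcnt > 0);
	return nm->free_nids[--nm->fcnt];
}
```

In this sketch, eight allocations are served from one build, and only the ninth triggers a second scan, which is the contention saving the commit aims for.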

f2fs: add a tracepoint on f2fs_new_inode · d70b4f53

Authored by Jaegeuk Kim on April 25, 2013

This can help when debugging the free nid allocation flows.
Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

d70b4f53

26 April, 2013 1 commit

f2fs: check nid == 0 in add_free_nid · 9198aceb

Authored by Jaegeuk Kim on April 25, 2013

It is more obvious that add_free_nid checks whether the free nid is zero or not.
Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

9198aceb