- 28 May 2013, 13 commits
-
-
Committed by majianpeng
We can do this, since we now use a global mutex, f2fs_stat_mutex, to protect its list operations.
Signed-off-by: Jianpeng Ma <majianpeng@gmail.com>
[Jaegeuk Kim: add description]
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Haicheng Li
Code cleanup with no change in behavior.
Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Peter Zijlstra
Majianpeng reported a lockdep splat for f2fs. It turns out mutex_lock_all() acquires an array of locks (in global/local lock style). Any such operation is always serialized using cp_mutex, therefore there is no fs_lock[] lock-order issue; tell lockdep about this using the mutex_lock_nest_lock() primitive.
Reported-by: majianpeng <majianpeng@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Jaegeuk Kim
This patch adds some trivial debugging messages in the recovery process.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Jaegeuk Kim
I found a bug when testing power-off-recovery as follows.
[Bug Scenario]
1. create a file
2. fsync the file
3. reboot w/o any sync
4. try to recover the file
 - found its fsync mark
 - found its dentry mark
   : try to recover its dentry
 - get its file name
 - get its parent inode number
   : here we got zero value
The reason we get the wrong parent inode number is that we didn't correctly synchronize the inode page with its newly created inode information. In particular, f2fs previously stored fi->i_pino and wrote it to the cached node page in the wrong order, which caused the zero-valued i_pino during recovery. So, this patch modifies the creation flow to fix the synchronization order of the inode page with its inode.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Jaegeuk Kim
This patch is for passing a locked node page to get_dnode_of_data.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Jaegeuk Kim
If get_dnode_of_data gets a locked node page, let's skip redundant get_node_page calls. This prepares for further enhancement.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Jaegeuk Kim
This por_doing check is entirely unrelated to the recovery process.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Jaegeuk Kim
During the dentry recovery routine, recover_inode() triggers __f2fs_add_link with its directory inode. In the following scenario, a bug is captured.
1. dir = f2fs_iget(pino)
2. __f2fs_add_link(dir, name)
3. iput(dir)
 -> f2fs_evict_inode() hits BUG_ON(atomic_read(fi->dirty_dents))
Kernel BUG at ffffffffa01c0676 [verbose debug info unavailable]
[<ffffffffa01c0676>] f2fs_evict_inode+0x276/0x300 [f2fs]
Call Trace:
 [<ffffffff8118ea00>] evict+0xb0/0x1b0
 [<ffffffff8118f1c5>] iput+0x105/0x190
 [<ffffffffa01d2dac>] recover_fsync_data+0x3bc/0x1070 [f2fs]
 [<ffffffff81692e8a>] ? io_schedule+0xaa/0xd0
 [<ffffffff81690acb>] ? __wait_on_bit_lock+0x7b/0xc0
 [<ffffffff8111a0e7>] ? __lock_page+0x67/0x70
 [<ffffffff81165e21>] ? kmem_cache_alloc+0x31/0x140
 [<ffffffff8118a502>] ? __d_instantiate+0x92/0xf0
 [<ffffffff812a949b>] ? security_d_instantiate+0x1b/0x30
 [<ffffffff8118a5b4>] ? d_instantiate+0x54/0x70
This means that we should flush all the dentry pages between iget and iput(). During the recovery routine, however, that is not allowed for consistency reasons, so we have to wait for the whole recovery process to finish. After that, write_checkpoint flushes all the dirty dentry blocks, and we can safely put the stale dir inodes from the dirty_dir_inode_list.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Jaegeuk Kim
The reason for using sbi->por_doing is to reduce data writes during recovery. find_fsync_dnodes() produces some dirty dentry pages, so it should be covered by sbi->por_doing too.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Jaegeuk Kim
We don't need to assign a value redundantly.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Jaegeuk Kim
In get_lock_data_page, if there is a data race between get_dnode_of_data for node and grab_cache_page for data, f2fs can hit the following BUG_ON(dn.data_blkaddr == NEW_ADDR).
kernel BUG at /home/zeus/f2fs_test/src/fs/f2fs/data.c:251!
[<ffffffffa044966c>] get_lock_data_page+0x1ec/0x210 [f2fs]
Call Trace:
 [<ffffffffa043b089>] f2fs_readdir+0x89/0x210 [f2fs]
 [<ffffffff811a0920>] ? fillonedir+0x100/0x100
 [<ffffffff811a0920>] ? fillonedir+0x100/0x100
 [<ffffffff811a07f8>] vfs_readdir+0xb8/0xe0
 [<ffffffff811a0b4f>] sys_getdents+0x8f/0x110
 [<ffffffff816d7999>] system_call_fastpath+0x16/0x1b
This bug can occur when the block address of the data block is changed after f2fs_put_dnode(). To avoid that, this patch fixes the lock order of node and data blocks so that the node block lock is covered by the data block lock.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Jaegeuk Kim
Currently f2fs recovers the dentry of fsynced files. When power-off-recovery is conducted, the newly recovered inode should increase the node block count as well as the inode block count. This patch resolves the inconsistency that results from the following sequence:
1. create a file
2. write data
3. fsync
4. reboot without sync
5. mount and recover the file
6. node block count is 1 and inode block count is 2
 : fall into the inconsistent state
7. unlink the file
 : trigger the following BUG_ON
------------[ cut here ]------------
kernel BUG at /home/zeus/f2fs_test/src/fs/f2fs/f2fs.h:716!
Call Trace:
 [<ffffffffa0344100>] ? get_node_page+0x50/0x1a0 [f2fs]
 [<ffffffffa0344bfc>] remove_inode_page+0x8c/0x100 [f2fs]
 [<ffffffffa03380f0>] ? f2fs_evict_inode+0x180/0x2d0 [f2fs]
 [<ffffffffa033812e>] f2fs_evict_inode+0x1be/0x2d0 [f2fs]
 [<ffffffff811c7a67>] evict+0xa7/0x1a0
 [<ffffffff811c82b5>] iput+0x105/0x190
 [<ffffffff811c2b30>] d_kill+0xe0/0x120
 [<ffffffff811c2c57>] dput+0xe7/0x1e0
 [<ffffffff811acc3d>] __fput+0x19d/0x2d0
 [<ffffffff811acd7e>] ____fput+0xe/0x10
 [<ffffffff81070645>] task_work_run+0xb5/0xe0
 [<ffffffff81002941>] do_notify_resume+0x71/0xb0
 [<ffffffff8175f14a>] int_signal+0x12/0x17
Reported-and-Tested-by: Chris Fries <C.Fries@motorola.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
- 08 May 2013, 8 commits
-
-
Committed by Jaegeuk Kim
After build_free_nids() collects free nid candidates from nat pages and current journal blocks, it checks every candidate to see whether it is already allocated, that is, whether the nat cache holds its nid with an allocated block address. Previously this check used list_for_each_entry_safe(fnid, next_fnid, &nm_i->free_nid_list, list), but that walk was not covered by free_nid_list_lock, resulting in a null pointer bug. This patch moves the check inside add_free_nid() so that the spin_lock is not needed.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Haicheng Li
When nm_i->fcnt > 2 * MAX_FREE_NIDS, stop scanning other NAT entries.
Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com>
[Jaegeuk Kim: fix handling the return value of add_free_nid()]
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Haicheng Li
This patch does two cleanups:
1. remove the unused variable "fcnt" in build_free_nids().
2. make scan_nat_page() return void and remove the useless variable "fcnt".
Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Haicheng Li
Directly drop the free_nid cache when nm_i->fcnt > 2 * MAX_FREE_NIDS.
Since there is no nm_i->free_nid_list_lock spinlock protection between sequential calls to alloc_nid() and alloc_nid_failed(), other threads may already have added new free_nid entries to the free_nid_list during this period. We need to make sure nm_i->fcnt never exceeds 2 * MAX_FREE_NIDS.
Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com>
[Jaegeuk Kim: fit the coding style]
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
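The bound described in this message can be modelled in user space with a plain counter. This is only a sketch under stated assumptions, not the f2fs implementation: `fcnt_after_failed_alloc` is a hypothetical helper, the value of `MAX_FREE_NIDS` is a placeholder, and the comparison uses `>=` so that the stated bound holds exactly (the kernel's actual comparison may differ).

```c
/* Placeholder value for illustration only; not taken from the source. */
#define MAX_FREE_NIDS 8u

/*
 * Hypothetical model of handing a nid back after a failed allocation.
 * fcnt is the number of cached free nids before the call. If the cache
 * is already at its cap, the nid is dropped and the count is unchanged;
 * otherwise the nid is cached again and the count grows by one.
 */
unsigned int fcnt_after_failed_alloc(unsigned int fcnt)
{
    if (fcnt >= 2u * MAX_FREE_NIDS)
        return fcnt;        /* drop the entry instead of caching it */
    return fcnt + 1u;       /* put the entry back into the cache */
}
```

With this shape the count saturates at 2 * MAX_FREE_NIDS no matter how many other threads refilled the cache in between.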
-
Committed by Chris Fries
When recovering a journal file with fsync data for files that have been deleted, don't bail out on recovery.
Signed-off-by: Chris Fries <C.Fries@motorola.com>
Reviewed-by: Russell Knize <rknize2@motorola.com>
Reviewed-by: Jason Hrycay <jason.hrycay@motorola.com>
[Jaegeuk Kim: fit the coding style]
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Chris Fries
When we are unable to roll forward the journal, we shouldn't bail out and refuse to mount; we should continue to attempt the mount. Bad recovery data is likely unrecoverable at this point, and requiring the user to try to mount again doesn't solve any issues.
Signed-off-by: Chris Fries <C.Fries@motorola.com>
Reviewed-by: Russell Knize <rknize2@motorola.com>
Reviewed-by: Jason Hrycay <jason.hrycay@motorola.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Jaegeuk Kim
o Deadlock case #1
Thread 1:
 - writeback_sb_inodes
 - do_writepages
 - f2fs_write_data_pages
 - write_cache_pages
 - f2fs_write_data_page
 - f2fs_balance_fs
 - wait mutex_lock(gc_mutex)
Thread 2:
 - f2fs_balance_fs
 - mutex_lock(gc_mutex)
 - f2fs_gc
 - f2fs_iget
 - wait iget_locked(inode->i_lock)
Thread 3:
 - do_unlinkat
 - iput
 - lock(inode->i_lock)
 - evict
 - inode_wait_for_writeback
o Deadlock case #2
Thread 1:
 - __writeback_single_inode
   : set I_SYNC
 - do_writepages
 - f2fs_write_data_page
 - f2fs_balance_fs
 - f2fs_gc
 - iput
 - evict
 - inode_wait_for_writeback(I_SYNC)
To avoid this, even when iput is called with a zero reference count, we need to stop the eviction procedure if the inode is under writeback. So this patch hooks up f2fs_drop_inode, which checks the I_SYNC flag.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Kent Overstreet
Faster kernel compiles by way of fewer unnecessary includes.
[akpm@linux-foundation.org: fix fallout]
[akpm@linux-foundation.org: fix build]
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 30 April 2013, 3 commits
-
-
Committed by Jaegeuk Kim
When testing f2fs on an SSD, I found that f2fs_write_node_pages issued some 128-page IOs, each followed by a 1-page IO. This means some flow was mishandled, which degrades performance. Previously, f2fs_write_node_pages determined the number of pages to be written, nr_to_write, as follows.
1. bio_get_nr_vecs returns 129 pages.
2. bio_alloc makes room for 128 pages.
3. The initial 128 pages go into one bio.
4. The existing bio is submitted, and a new bio is prepared for the last 1 page.
5. Finally, sync_node_pages submits the last 1-page bio.
The problem comes from the use of bio_get_nr_vecs, so this patch replaces it with max_hw_blocks, derived using queue_max_sectors.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
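The 128 + 1 split in the steps above is plain ceiling-division arithmetic. The following is a minimal user-space sketch, not f2fs code: `count_bios` and `last_bio_pages` are hypothetical helper names, and only the figures 129 and 128 come from the message.

```c
#include <stddef.h>

/*
 * Number of bios needed to write nr_pages when each bio holds at most
 * bio_capacity pages (ceiling division). Hypothetical helper.
 */
size_t count_bios(size_t nr_pages, size_t bio_capacity)
{
    if (bio_capacity == 0)
        return 0;
    return (nr_pages + bio_capacity - 1) / bio_capacity;
}

/* Number of pages carried by the final bio. Hypothetical helper. */
size_t last_bio_pages(size_t nr_pages, size_t bio_capacity)
{
    size_t rem;

    if (bio_capacity == 0 || nr_pages == 0)
        return 0;
    rem = nr_pages % bio_capacity;
    return rem ? rem : bio_capacity;
}
```

A request of 129 pages against a 128-page bio needs two bios, the second carrying a single page; a request sized to match the bio capacity avoids the trailing small IO.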
-
Committed by Haicheng Li
Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Haicheng Li
try_to_free_nats() is usually called by flush_nat_entries() during the checkpointing process, with the parameter nr_shrink set to "nm_i->nat_cnt - NM_WOUT_THRESHOLD". However, this is inconsistent with the actual threshold check, "if (nm_i->nat_cnt < 2 * NM_WOUT_THRESHOLD)", which ignores free_nats requests when
 NM_WOUT_THRESHOLD < nm_i->nat_cnt < 2 * NM_WOUT_THRESHOLD
So fix the threshold check condition.
Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
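The gap between the two thresholds can be shown with a small user-space model. This is a sketch under assumptions: the value of `NM_WOUT_THRESHOLD` is a placeholder, both function names are hypothetical, and `should_shrink_fixed` shows only one plausible corrected condition that matches the caller's nr_shrink computation; the message does not quote the final code.

```c
#include <stdbool.h>

/* Placeholder value for illustration only; not taken from the source. */
#define NM_WOUT_THRESHOLD 64u

/*
 * Old behaviour as quoted in the message: the request is ignored
 * whenever nat_cnt < 2 * NM_WOUT_THRESHOLD.
 */
bool should_shrink_old(unsigned int nat_cnt)
{
    return !(nat_cnt < 2u * NM_WOUT_THRESHOLD);
}

/*
 * One plausible corrected check (an assumption): shrink whenever the
 * caller's nr_shrink = nat_cnt - NM_WOUT_THRESHOLD would be positive.
 */
bool should_shrink_fixed(unsigned int nat_cnt)
{
    return nat_cnt > NM_WOUT_THRESHOLD;
}
```

For a count between the two thresholds, such as 100 with the placeholder value above, the old check skips the request while the corrected one honours it.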
-
- 29 April 2013, 3 commits
-
-
Committed by Jaegeuk Kim
We call lock_page when we need to update a page after readpage. Between grabbing and locking the page, it can be truncated by another thread. So, after lock_page we should check whether the page was truncated.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Jaegeuk Kim
In order to avoid build_free_nid lock contention, let's change the order of function calls as follows.
At first, check whether there are enough free nids.
 - If available, just get a free nid with spin_lock without any overhead.
 - Otherwise, conduct build_free_nids.
   : scan nat pages, journal nat entries, and nat cache entries.
We should take care not to serve free nids that build_free_nids has only partially produced. We can get stable free nids only after build_free_nids is done.
Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Jaegeuk Kim
This can help when debugging the free nid allocation flows.
Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
- 26 April 2013, 4 commits
-
-
Committed by Jaegeuk Kim
It is clearer for add_free_nid itself to check whether the free nid is zero.
Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Namjae Jeon
Adding REQ_META to all metadata requests can help improve FS performance if the underlying device supports TAGGING. So, considering the submit_bio path for all f2fs requests, we can add REQ_META for all the META requests. As a precursor to this change we considered the commit 4265900e 'mmc: MMC-4.5 Data Tag Support'.
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Jaegeuk Kim
Previously, background GC submitted many 4KB read requests to load victim blocks and/or their (i)node blocks.
...
f2fs_gc : f2fs_readpage: ino = 1, page_index = 0xb61, blkaddr = 0x3b964ed
f2fs_gc : block_rq_complete: 8,16 R () 499854968 + 8 [0]
f2fs_gc : f2fs_readpage: ino = 1, page_index = 0xb6f, blkaddr = 0x3b964ee
f2fs_gc : block_rq_complete: 8,16 R () 499854976 + 8 [0]
f2fs_gc : f2fs_readpage: ino = 1, page_index = 0xb79, blkaddr = 0x3b964ef
f2fs_gc : block_rq_complete: 8,16 R () 499854984 + 8 [0]
...
However, since many of these IOs are sequential, we can give the IO scheduler a chance to merge them. In order to do that, let's use blk_plug.
...
f2fs_gc : f2fs_iget: ino = 143
f2fs_gc : f2fs_readpage: ino = 143, page_index = 0x1c6, blkaddr = 0x2e6ee
f2fs_gc : f2fs_iget: ino = 143
f2fs_gc : f2fs_readpage: ino = 143, page_index = 0x1c7, blkaddr = 0x2e6ef
<idle> : block_rq_complete: 8,16 R () 1519616 + 8 [0]
<idle> : block_rq_complete: 8,16 R () 1519848 + 8 [0]
<idle> : block_rq_complete: 8,16 R () 1520432 + 96 [0]
<idle> : block_rq_complete: 8,16 R () 1520536 + 104 [0]
<idle> : block_rq_complete: 8,16 R () 1521008 + 112 [0]
<idle> : block_rq_complete: 8,16 R () 1521440 + 152 [0]
<idle> : block_rq_complete: 8,16 R () 1521688 + 144 [0]
<idle> : block_rq_complete: 8,16 R () 1522128 + 192 [0]
<idle> : block_rq_complete: 8,16 R () 1523256 + 328 [0]
...
Note that this issue should be addressed in checkpoint, and in some readahead flows too.
Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Jaegeuk Kim
If no victim segments are selected by background GC, let's wait a little longer for dirty segments to accumulate. By default, let's give 5 minutes.
Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
- 23 April 2013, 8 commits
-
-
Committed by Namjae Jeon
Add tracepoints to debug checkpoint requests.
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Pankaj Kumar <pankaj.km@samsung.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
[Jaegeuk: change expressions]
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Namjae Jeon
Add tracepoints to debug the various page write operations, such as writes of data pages and meta pages.
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Pankaj Kumar <pankaj.km@samsung.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
[Jaegeuk: remove unnecessary tracepoints]
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Namjae Jeon
Add tracepoints to debug the block allocation & fallocate.
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Pankaj Kumar <pankaj.km@samsung.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
[Jaegeuk: enhance information]
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Namjae Jeon
Add tracepoints for tracing the garbage collector threads in f2fs, with the status and type of collection.
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Pankaj Kumar <pankaj.km@samsung.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
[Jaegeuk: modify slightly to show information]
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Namjae Jeon
Add tracepoints for page i/o operations and block allocation tracing during page read operations.
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Pankaj Kumar <pankaj.km@samsung.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
[Jaegeuk: combine and modify the tracepoint structures]
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Namjae Jeon
Add tracepoints for tracing the truncate operations, such as truncating node/data blocks, f2fs_truncate, etc. Tracepoints are added at the entry and exit of each operation to trace its success or failure.
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Pankaj Kumar <pankaj.km@samsung.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
[Jaegeuk: combine and modify the tracepoint structures]
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Namjae Jeon
Add tracepoints in f2fs for tracing the syncing operations, such as filesystem sync and file sync enter/exit. It will help to trace the code in debugging scenarios. Also add tracepoints for tracing the various inode operations, such as building an inode, evicting an inode, and link/unlink of inodes.
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Pankaj Kumar <pankaj.km@samsung.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
[Jaegeuk: combine and modify the tracepoint structures]
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Namjae Jeon
The conditions inside is_multimedia_file are the reverse of what its name suggests, i.e., we need to negate the return value to actually check whether the file is a multimedia file. So, change the code and its callers so that the name and the comparison conditions agree.
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
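A predicate whose return value agrees with its name might look like the user-space sketch below. It is an illustration only and not the f2fs function: the signature, the extension-matching rule, and the use of POSIX `strncasecmp` are all assumptions made for this example.

```c
#include <string.h>
#include <strings.h>   /* strncasecmp (POSIX) */

/*
 * Illustrative predicate: returns 1 when name ends with ".<ext>"
 * (compared case-insensitively) and 0 otherwise, so a true result
 * really means "this is a multimedia file".
 */
int is_multimedia_file(const char *name, const char *ext)
{
    size_t nlen = strlen(name);
    size_t elen = strlen(ext);

    if (elen == 0 || nlen < elen + 2)   /* need "x" + "." + ext */
        return 0;
    if (name[nlen - elen - 1] != '.')
        return 0;
    return strncasecmp(name + nlen - elen, ext, elen) == 0;
}
```

Callers can then write `if (is_multimedia_file(name, "mp3"))` without a negation.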
-
- 22 April 2013, 1 commit
-
-
Committed by Wei Yongjun
Fix to return a negative error code from the error handling case instead of 0, as returned elsewhere in this function.
Introduced by commit c0d39e (f2fs: fix return values from validate superblock).
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Acked-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
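The pattern being fixed, an error path that jumps to cleanup without setting the return value, can be sketched in user space as follows. Both function names are hypothetical stand-ins and `-EINVAL` is chosen only for illustration; the message does not say which error code the real function returns.

```c
#include <errno.h>

/* Hypothetical stand-in for a validation step: nonzero means failure. */
int validate_sketch(int ok)
{
    return ok ? 0 : 1;
}

/*
 * Hypothetical mount-time routine. The error path assigns a negative
 * errno before jumping to the cleanup label, so the caller never sees
 * 0 (success) after a failure.
 */
int fill_super_sketch(int sb_ok)
{
    int err = 0;

    if (validate_sketch(sb_ok)) {
        err = -EINVAL;   /* without this line the function would return 0 */
        goto fail;
    }
    return 0;

fail:
    /* cleanup would go here */
    return err;
}
```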
-