- 23 12月, 2013 8 次提交
-
-
由 Jaegeuk Kim 提交于
This patch integrates redundant bio operations on read and write IOs. 1. Move bio-related codes to the top of data.c. 2. Replace f2fs_submit_bio with f2fs_submit_merged_bio, which handles read bios additionally. 3. Introduce __submit_merged_bio to submit the merged bio. 4. Change f2fs_readpage to f2fs_submit_page_bio. 5. Introduce f2fs_submit_page_mbio to integrate previous submit_read_page and submit_write_page. Reviewed-by: NGu Zheng <guz.fnst@cn.fujitsu.com> Reviewed-by: Chao Yu <chao2.yu@samsung.com > Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
-
由 Chao Yu 提交于
The recover_orphan_inodes() returns no error all the time, so we don't need to check its errors. Signed-off-by: NChao Yu <chao2.yu@samsung.com> [Jaegeuk Kim: add description] Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
-
由 Chao Yu 提交于
Because we will write node summaries when do_checkpoint with umount flag, our number of max orphan blocks should minus NR_CURSEG_NODE_TYPE additional. Signed-off-by: NChao Yu <chao2.yu@samsung.com> Signed-off-by: NShu Tan <shu.tan@samsung.com> Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
-
由 Jaegeuk Kim 提交于
This patch fixes some bit overflows by the shift operations. Dan Carpenter reported potential bugs on bit overflows as follows. fs/f2fs/segment.c:910 submit_write_page() warn: should 'blk_addr << ((sbi)->log_blocksize - 9)' be a 64 bit type? fs/f2fs/checkpoint.c:429 get_valid_checkpoint() warn: should '1 << ()' be a 64 bit type? fs/f2fs/data.c:408 f2fs_readpage() warn: should 'blk_addr << ((sbi)->log_blocksize - 9)' be a 64 bit type? fs/f2fs/data.c:457 submit_read_page() warn: should 'blk_addr << ((sbi)->log_blocksize - 9)' be a 64 bit type? fs/f2fs/data.c:525 get_data_block_ro() warn: should 'i << blkbits' be a 64 bit type? Bug-Reported-by: NDan Carpenter <dan.carpenter@oracle.com> Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
-
由 Gu Zheng 提交于
Fix a potential out of range issue introduced by commit: 22fb72225a f2fs: simplify write_orphan_inodes for better readable Signed-off-by: NGu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
-
由 Changman Lee 提交于
Let's send REQ_META or REQ_PRIO when reading meta area such as NAT/SIT etc. Signed-off-by: NChangman Lee <cm224.lee@samsung.com> Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
-
由 Gu Zheng 提交于
Signed-off-by: NGu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
-
由 Gu Zheng 提交于
Simplify write_orphan_inodes for better readable. Because we hold the orphan_inode_mutex, so it's safe to use list_for_each_entry instead of list_for_each_safe. Signed-off-by: NGu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
-
- 08 11月, 2013 1 次提交
-
-
由 Changman Lee 提交于
use genernal method supported by kernel o changes from v1 If any waiter exists at end io, wake up it. Signed-off-by: NChangman Lee <cm224.lee@samsung.com> Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
-
- 29 10月, 2013 1 次提交
-
-
由 Jaegeuk Kim 提交于
If you want to remove unnecessary BUG_ONs, you can just turn off F2FS_CHECK_FS in your kernel config. Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
-
- 25 10月, 2013 3 次提交
-
-
由 Jaegeuk Kim 提交于
This patch adds a tracepoint for set_page_dirty. Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
-
由 Haicheng Li 提交于
Signed-off-by: NHaicheng Li <haicheng.li@linux.intel.com> Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
-
由 Jaegeuk Kim 提交于
This patch cleans up improper definitions that update some status information. Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
-
- 22 10月, 2013 1 次提交
-
-
由 Gu Zheng 提交于
Introduce the unfailed version of kmem_cache_alloc named f2fs_kmem_cache_alloc to hide the retry routine and make the code a bit cleaner. v2: Fix the wrong use of 'retry' tag pointed out by Gao feng. Use more neat code to remove redundant tag suggested by Haicheng Li. Signed-off-by: NGu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
-
- 18 10月, 2013 2 次提交
-
-
由 Jaegeuk Kim 提交于
This patch enhances the recovery routine not to write any data/node/meta until its completion. If any writes are sent to the disk, it could contaminate the written history that will be used for further recovery. Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
-
由 Gu Zheng 提交于
Previously, do_checkpoint() will call congestion_wait() for waiting the pages (previous submitted node/meta/data pages) to be written back. Because congestion_wait() will set a regular period (e.g. HZ / 50 ) for waiting, and no additional wake up mechanism was introduced if IO ends up before regular period costed. Yuan Zhong found there is a situation that after the pages have been written back, but the checkpoint thread still wait for congestion_wait to exit. So here we store checkpoint task into f2fs_sb when doing checkpoint, it'll wait for IO completes if there's IO going on, and in the end IO path, wake up checkpoint task when IO ends up. Thanks to Yuan Zhong's pre work about this problem. Reported-by: NYuan Zhong <yuan.mark.zhong@samsung.com> Signed-off-by: NGu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
-
- 07 10月, 2013 1 次提交
-
-
由 Gu Zheng 提交于
The fs_locks is used to block other ops(ex, recovery) when doing checkpoint. And each other operate routine(besides checkpoint) needs to acquire a fs_lock, there is a terrible problem here, if these are too many concurrency threads acquiring fs_lock, so that they will block each other and may lead to some performance problem, but this is not the phenomenon we want to see. Though there are some optimization patches introduced to enhance the usage of fs_lock, but the thorough solution is using a *rw_sem* to replace the fs_lock. Checkpoint routine takes write_sem, and other ops take read_sem, so that we can block other ops(ex, recovery) when doing checkpoint, and other ops will not disturb each other, this can avoid the problem described above completely. Because of the weakness of rw_sem, the above change may introduce a potential problem that the checkpoint thread might get starved if other threads are intensively locking the read semaphore for I/O.(Pointed out by Xu Jin) In order to avoid this, a wait_list is introduced, the appending read semaphore ops will be dropped into the wait_list if checkpoint thread is waiting for write semaphore, and will be waked up when checkpoint thread gives up write semaphore. Thanks to Kim's previous review and test, and will be very glad to see other guys' performance tests about this patch. V2: -fix the potential starvation problem. -use more suitable func name suggested by Xu Jin. Signed-off-by: NGu Zheng <guz.fnst@cn.fujitsu.com> [Jaegeuk Kim: adjust minor coding standard] Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
-
- 25 9月, 2013 1 次提交
-
-
由 Russ W. Knize 提交于
Accounting errors from buggy code calling the acquire/release/remove orphan inode interfaces can cause n_orphans to underflow, which will then cause acquire_orphan_inode() to return -ENOSPC on the next operation. This commit guards against that condition. Signed-off-by: NRuss Knize <rknize@motorola.com> Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
-
- 09 8月, 2013 1 次提交
-
-
由 Jaegeuk Kim 提交于
This patch introduces a new inline function, cur_cp_version, to reduce redundant codes. Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
-
- 30 7月, 2013 2 次提交
-
-
由 Jaegeuk Kim 提交于
This patch fixes mishandling of the sbi->n_orphans variable. If users request lots of f2fs_unlink(), check_orphan_space() could be contended. In such the case, sbi->n_orphans can be read incorrectly so that f2fs_unlink() would fall into the wrong state which results in the failure of add_orphan_inode(). So, let's increment sbi->n_orphans virtually prior to the actual orphan inode stuffs. After that, let's release sbi->n_orphans by calling release_orphan_inode or remove_orphan_inode. Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
-
由 Gu Zheng 提交于
As we remove the target single node, so list_for_each is enought, in order to clean up, we use list_for_each_entry instead. Signed-off-by: NGu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
-
- 02 7月, 2013 1 次提交
-
-
由 Jaegeuk Kim 提交于
While calculating CRC for the checkpoint block, we use __u32, but when storing the crc value to the disk, we use __le32. Let's fix the inconsistency. Reported-and-Tested-by: NOded Gabbay <ogabbay@advaoptical.com> Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
-
- 07 6月, 2013 1 次提交
-
-
由 Jaegeuk Kim 提交于
It is possible that iput is skipped after iget during the recovery. In recover_dentry(), dir = f2fs_iget(); ... if (de && inode->i_ino == le32_to_cpu(de->ino)) goto out; In this case, this dir is not able to be added in dirty_dir_inode_list. The actual linking is done only when set_page_dirty() is called. So let's add this newly got inode into the list explicitly, and put it at the end of the recovery routine. Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
-
- 28 5月, 2013 4 次提交
-
-
由 Jaegeuk Kim 提交于
- iget/iput flow in the dentry recovery process 1. *dir* = f2fs_iget 2. set FI_DELAY_IPUT to *dir* 3. add *dir* to the dirty_dir_list - __f2fs_add_link - recover_dentry) 4. iput *dir* by remove_dirty_dir_inode - sync_dirty_dir_inodes - write_chekcpoint If *dir*'s i_count is not 1 (i.e., root dir), remove_dirty_dir_inode is called later and then iput is triggered again due to the FI_DELAY_IPUT flag. So, let's unset the flag properly once iput is triggered. Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
-
由 Jaegeuk Kim 提交于
If there remains some unwritten blocks from the recovery, we should not call iput on that directory inode. Otherwise, we can loose some dentry blocks after the recovery. Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
-
由 Namjae Jeon 提交于
Some, counters are needed only for the statistical information while debugging. So, those can be controlled using CONFIG_F2FS_STAT_FS, pushing the usage for few variables under this flag. Signed-off-by: NNamjae Jeon <namjae.jeon@samsung.com> Signed-off-by: NAmit Sahrawat <a.sahrawat@samsung.com> Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
-
由 Jaegeuk Kim 提交于
During the dentry recovery routine, recover_inode() triggers __f2fs_add_link with its directory inode. In the following scenario, a bug is captured. 1. dir = f2fs_iget(pino) 2. __f2fs_add_link(dir, name) 3. iput(dir) -> f2fs_evict_inode() faces with BUG_ON(atomic_read(fi->dirty_dents)) Kernel BUG at ffffffffa01c0676 [verbose debug info unavailable] [<ffffffffa01c0676>] f2fs_evict_inode+0x276/0x300 [f2fs] Call Trace: [<ffffffff8118ea00>] evict+0xb0/0x1b0 [<ffffffff8118f1c5>] iput+0x105/0x190 [<ffffffffa01d2dac>] recover_fsync_data+0x3bc/0x1070 [f2fs] [<ffffffff81692e8a>] ? io_schedule+0xaa/0xd0 [<ffffffff81690acb>] ? __wait_on_bit_lock+0x7b/0xc0 [<ffffffff8111a0e7>] ? __lock_page+0x67/0x70 [<ffffffff81165e21>] ? kmem_cache_alloc+0x31/0x140 [<ffffffff8118a502>] ? __d_instantiate+0x92/0xf0 [<ffffffff812a949b>] ? security_d_instantiate+0x1b/0x30 [<ffffffff8118a5b4>] ? d_instantiate+0x54/0x70 This means that we should flush all the dentry pages between iget and iput(). But, during the recovery routine, it is unallowed due to consistency, so we have to wait the whole recovery process. And then, write_checkpoint flushes all the dirty dentry blocks, and nicely we can put the stale dir inodes from the dirty_dir_inode_list. Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
-
- 29 4月, 2013 1 次提交
-
-
由 Jaegeuk Kim 提交于
We call lock_page when we need to update a page after readpage. Between grab and lock page, the page can be truncated by other thread. So, we should check the page after lock_page whether it was truncated or not. Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
-
- 26 4月, 2013 1 次提交
-
-
由 Jaegeuk Kim 提交于
Previously, background GC submits many 4KB read requests to load victim blocks and/or its (i)node blocks. ... f2fs_gc : f2fs_readpage: ino = 1, page_index = 0xb61, blkaddr = 0x3b964ed f2fs_gc : block_rq_complete: 8,16 R () 499854968 + 8 [0] f2fs_gc : f2fs_readpage: ino = 1, page_index = 0xb6f, blkaddr = 0x3b964ee f2fs_gc : block_rq_complete: 8,16 R () 499854976 + 8 [0] f2fs_gc : f2fs_readpage: ino = 1, page_index = 0xb79, blkaddr = 0x3b964ef f2fs_gc : block_rq_complete: 8,16 R () 499854984 + 8 [0] ... However, by the fact that many IOs are sequential, we can give a chance to merge the IOs by IO scheduler. In order to do that, let's use blk_plug. ... f2fs_gc : f2fs_iget: ino = 143 f2fs_gc : f2fs_readpage: ino = 143, page_index = 0x1c6, blkaddr = 0x2e6ee f2fs_gc : f2fs_iget: ino = 143 f2fs_gc : f2fs_readpage: ino = 143, page_index = 0x1c7, blkaddr = 0x2e6ef <idle> : block_rq_complete: 8,16 R () 1519616 + 8 [0] <idle> : block_rq_complete: 8,16 R () 1519848 + 8 [0] <idle> : block_rq_complete: 8,16 R () 1520432 + 96 [0] <idle> : block_rq_complete: 8,16 R () 1520536 + 104 [0] <idle> : block_rq_complete: 8,16 R () 1521008 + 112 [0] <idle> : block_rq_complete: 8,16 R () 1521440 + 152 [0] <idle> : block_rq_complete: 8,16 R () 1521688 + 144 [0] <idle> : block_rq_complete: 8,16 R () 1522128 + 192 [0] <idle> : block_rq_complete: 8,16 R () 1523256 + 328 [0] ... Note that this issue should be addressed in checkpoint, and some readahead flows too. Reviewed-by: NNamjae Jeon <namjae.jeon@samsung.com> Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
-
- 23 4月, 2013 1 次提交
-
-
由 Namjae Jeon 提交于
Add tracepoints to debug checkpoint request. Signed-off-by: NNamjae Jeon <namjae.jeon@samsung.com> Signed-off-by: NPankaj Kumar <pankaj.km@samsung.com> Acked-by: NSteven Rostedt <rostedt@goodmis.org> [Jaegeuk: change expressions] Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
-
- 09 4月, 2013 1 次提交
-
-
由 Jaegeuk Kim 提交于
In the previous version, f2fs uses global locks according to the usage types, such as directory operations, block allocation, block write, and so on. Reference the following lock types in f2fs.h. enum lock_type { RENAME, /* for renaming operations */ DENTRY_OPS, /* for directory operations */ DATA_WRITE, /* for data write */ DATA_NEW, /* for data allocation */ DATA_TRUNC, /* for data truncate */ NODE_NEW, /* for node allocation */ NODE_TRUNC, /* for node truncate */ NODE_WRITE, /* for node write */ NR_LOCK_TYPE, }; In that case, we lose the performance under the multi-threading environment, since every types of operations must be conducted one at a time. In order to address the problem, let's share the locks globally with a mutex array regardless of any types. So, let users grab a mutex and perform their jobs in parallel as much as possbile. For this, I propose a new global lock scheme as follows. 0. Data structure - f2fs_sb_info -> mutex_lock[NR_GLOBAL_LOCKS] - f2fs_sb_info -> node_write 1. mutex_lock_op(sbi) - try to get an avaiable lock from the array. - returns the index of the gottern lock variable. 2. mutex_unlock_op(sbi, index of the lock) - unlock the given index of the lock. 3. mutex_lock_all(sbi) - grab all the locks in the array before the checkpoint. 4. mutex_unlock_all(sbi) - release all the locks in the array after checkpoint. 5. block_operations() - call mutex_lock_all() - sync_dirty_dir_inodes() - grab node_write - sync_node_pages() Note that, the pairs of mutex_lock_op()/mutex_unlock_op() and mutex_lock_all()/mutex_unlock_all() should be used together. Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
-
- 03 4月, 2013 1 次提交
-
-
由 Jaegeuk Kim 提交于
This patch removes a bitmap for victim segments selected by foreground GC, and modifies the other bitmap for victim segments selected by background GC. 1) foreground GC bitmap : We don't need to manage this, since we just only one previous victim section number instead of the whole victim history. The f2fs uses the victim section number in order not to allocate currently GC'ed section to current active logs. 2) background GC bitmap : This bitmap is used to avoid selecting victims repeatedly by background GCs. In addition, the victims are able to be selected by foreground GCs, since there is no need to read victim blocks during foreground GCs. By the fact that the foreground GC reclaims segments in a section unit, it'd be better to manage this bitmap based on the section granularity. Reviewed-by: NNamjae Jeon <namjae.jeon@samsung.com> Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
-
- 20 3月, 2013 1 次提交
-
-
由 Jaegeuk Kim 提交于
This patch reduces redundant locking and unlocking pages during read operations. In f2fs_readpage, let's use wait_on_page_locked() instead of lock_page. And then, when we need to modify any data finally, let's lock the page so that we can avoid lock contention. [readpage rule] - The f2fs_readpage returns unlocked page, or released page too in error cases. - Its caller should handle read error, -EIO, after locking the page, which indicates read completion. - Its caller should check PageUptodate after grab_cache_page. Signed-off-by: NChangman Lee <cm224.lee@samsung.com> Reviewed-by: NNamjae Jeon <namjae.jeon@samsung.com> Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
-
- 12 2月, 2013 4 次提交
-
-
由 Jaegeuk Kim 提交于
This patch makes clearer the ambiguous f2fs_gc flow as follows. 1. Remove intermediate checkpoint condition during f2fs_gc (i.e., should_do_checkpoint() and GC_BLOCKED) 2. Remove unnecessary return values of f2fs_gc because of #1. (i.e., GC_NODE, GC_OK, etc) 3. Simplify write_checkpoint() because of #2. 4. Clarify the main f2fs_gc flow. o monitor how many freed sections during one iteration of do_garbage_collect(). o do GC more without checkpoints if we can't get enough free sections. o do checkpoint once we've got enough free sections through forground GCs. 5. Adopt thread-logging (Slack-Space-Recycle) scheme more aggressively on data log types. See. get_ssr_segement() Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
-
由 Changman Lee 提交于
F2FS_SET_SB_DIRT is called in inc_page_count and it is directly called one more time in the next line. Signed-off-by: NChangman Lee <cm224.lee@samsung.com> Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
-
由 majianpeng 提交于
For the code > prev = list_entry(orphan->list.prev, typeof(*prev), list); if orphan->list.prev == head, it can't get the right prev. And we can use the parameter 'this' to add. Signed-off-by: NJianpeng Ma <majianpeng@gmail.com> Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
-
由 Jaegeuk Kim 提交于
This patch enhances the checkpoint routine to cope with IO errors. Basically f2fs detects IO errors from end_io_write, and the errors are able to be occurred during one of data, node, and meta page writes. In the previous code, when an IO error is occurred during writes, f2fs sets a flag, CP_ERROR_FLAG, in the raw ckeckpoint buffer which will be written to disk. Afterwards, write_checkpoint() will check the flag and remount f2fs as a read-only (ro) mode. However, even once f2fs is remounted as a ro mode, dirty checkpoint pages are freely able to be written to disk by flusher or kswapd in background. In such a case, after cold reboot, f2fs would restore the checkpoint data having CP_ERROR_FLAG, resulting in disabling write_checkpoint and remounting f2fs as a ro mode again. Therefore, let's prevent any checkpoint page (meta) writes once an IO error is occurred, and remount f2fs as a ro mode right away at that moment. Reported-by: NOliver Winker <oliver@oli1170.net> Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com> Reviewed-by: NNamjae Jeon <namjae.jeon@samsung.com>
-
- 22 1月, 2013 1 次提交
-
-
由 Namjae Jeon 提交于
Add __init to functions in init_f2fs_fs for code consistency. Signed-off-by: NNamjae Jeon <namjae.jeon@samsung.com> Signed-off-by: NAmit Sahrawat <a.sahrawat@samsung.com> Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
-
- 04 1月, 2013 1 次提交
-
-
由 Namjae Jeon 提交于
While creating a new entry for addition to the list(orphan inode list and fsync inode entry list), there is no need to call HEAD initialization for these entries. So, remove that init part. Signed-off-by: NNamjae Jeon <namjae.jeon@samsung.com> Signed-off-by: NAmit Sahrawat <a.sahrawat@samsung.com> Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
-
- 11 12月, 2012 1 次提交
-
-
由 Jaegeuk Kim 提交于
As pointed out by Randy Dunlap, this patch removes all usage of "/**" for comment blocks. Instead, just use "/*". Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
-