- 17 Feb, 2014 9 commits
-
-
Committed by Jaegeuk Kim
If f2fs entered an erroneous checkpoint status, it should skip writing meta pages instead of redirtying them. Otherwise, the partition cannot be unmounted even though f2fs is in read-only status.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Jaegeuk Kim
When a new directory is allocated and an error occurs, we should truncate the preallocated dentry pages too.

This bug was reported by Andrey Tsyvarev after a while as follows.

  mkdir()->
   f2fs_add_link()->
    init_inode_metadata()->
     f2fs_init_acl()->
      f2fs_get_acl()->
       f2fs_getxattr()->
        read_all_xattrs() fails.

Also, a BUG_ON was triggered after the fault in

  mkdir()->
   f2fs_add_link()->
    init_inode_metadata()->
     remove_inode_page() ->
      f2fs_bug_on(inode->i_blocks != 0 && inode->i_blocks != 1);

But the previous patch did not fully resolve that bug, so the following bug report was also submitted.

  kernel BUG at fs/f2fs/inode.c:274!
  Call Trace:
   [<ffffffff811fde03>] evict+0xa3/0x1a0
   [<ffffffff811fe615>] iput+0xf5/0x180
   [<ffffffffa01c7f63>] f2fs_mkdir+0xf3/0x150 [f2fs]
   [<ffffffff811f2a77>] vfs_mkdir+0xb7/0x160
   [<ffffffff811f36bf>] SyS_mkdir+0x5f/0xc0
   [<ffffffff81680769>] system_call_fastpath+0x16/0x1b

Finally, this patch resolves all the issues as follows.

If an error occurs after make_empty_dir(),
 1. truncate_inode_pages()
    The make_bad_inode() prior to iput() will change i_mode to S_IFREG, which means that f2fs will not decrement fi->dirty_dents during f2fs_evict_inode. But, by calling it here, we can do that.
 2. truncate_blocks()
    Preallocated dentry pages are truncated here to sync i_blocks.
 3. remove_dirty_dir_inode()
    Remove this directory inode from the list.

Reported-and-Tested-by: Andrey Tsyvarev <tsyvarev@ispras.ru>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Jaegeuk Kim
This patch modifies the flow a little to avoid the following build warnings.

  src/fs/f2fs/recovery.c: In function ‘check_index_in_prev_nodes’:
  src/fs/f2fs/recovery.c:288:51: warning: ‘sum.<U5390>.<U52f8>.ofs_in_node’ may be used uninitialized in this function [-Wmaybe-uninitialized]
  src/fs/f2fs/recovery.c:260:23: warning: ‘sum.nid’ may be used uninitialized in this function [-Wmaybe-uninitialized]

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Jaegeuk Kim
This patch adds GET_BLKOFF_FROM_SEG0 to clean up some code.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Jaegeuk Kim
This is the erroneous scenario.

                                 i_size   on-disk i_size   i_blocks
  __f2fs_add_link()              4096     4096             2
   get_new_data_page             8192     4096             3
   -ENOSPC = init_inode_metadata
  checkpoint                     -        4096             3
  POR and reboot

  __f2fs_add_link()              4096     4096             3
   page = get_new_data_page (page->index = 1 by NEW_ADDR)
   add a dentry to the page successfully

  f2fs_rmdir()
   f2fs_empty_dir()              4096     4096             3
   f2fs_unlink() goes, since there is no valid dentry due to i_size = 4096.

But there is still one dentry in page->index = 1.

So this patch moves the code that writes dir->i_size into the on-disk i_size, in order to keep the directory's i_size, on-disk i_size, and i_blocks in sync.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Jaegeuk Kim
This patch modifies the use of bi_private to remove pointer chasing for sbi. Previously, we had a bi_private structure, but it needed a memory allocation. So this patch stores the sbi pointer in bi_private and adds a completion pointer to the sbi. This avoids the memory allocation and makes better use of bi_private.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Jaegeuk Kim
If a new xattr node page was allocated and its inode is fsynced, we should recover the xattr node page during the roll-forward process after a power cut.

But previously, f2fs did not handle that case, resulting in the following kernel panic reported by Tom Li.

  BUG: unable to handle kernel paging request at ffffc9001c861a98
  IP: [<ffffffffa0295236>] check_index_in_prev_nodes+0x86/0x2d0 [f2fs]
  Call Trace:
   [<ffffffff815ece9b>] ? printk+0x48/0x4a
   [<ffffffffa029626a>] recover_fsync_data+0xdca/0xf50 [f2fs]
   [<ffffffffa02873ae>] f2fs_fill_super+0x92e/0x970 [f2fs]
   [<ffffffff8112c9f8>] mount_bdev+0x1b8/0x200
   [<ffffffffa0286a80>] ? f2fs_remount+0x130/0x130 [f2fs]
   [<ffffffffa0285e40>] f2fs_mount+0x10/0x20 [f2fs]
   [<ffffffff8112d4de>] mount_fs+0x3e/0x1b0
   [<ffffffff810ef4eb>] ? __alloc_percpu+0xb/0x10
   [<ffffffff8114761f>] vfs_kern_mount+0x6f/0x120
   [<ffffffff811497b9>] do_mount+0x259/0xa90
   [<ffffffff810ead1d>] ? memdup_user+0x3d/0x80
   [<ffffffff810eadb3>] ? strndup_user+0x53/0x70
   [<ffffffff8114a2c9>] SyS_mount+0x89/0xd0
   [<ffffffff815feae2>] system_call_fastpath+0x16/0x1b

This patch adds a recovery function for xattr node pages.

Reported-by: Tom Li <biergaizi@members.fsf.org>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Jaegeuk Kim
This patch cleans up refresh_sit_entry to handle locate_dirty_segments.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Jaegeuk Kim
To keep the filesystem consistent, update_inode_page should never fail. Otherwise, it is possible to lose some metadata in the inode, such as the link count.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
- 26 Jan, 2014 3 commits
-
-
Committed by Christoph Hellwig
f2fs has some weird mode bit handling, so it still uses the old chmod code for now.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Committed by Christoph Hellwig
Rename the current posix_acl_create to __posix_acl_create and add a fully featured helper to set up the ACLs on file creation that uses get_acl().

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Committed by Christoph Hellwig
Rename the current posix_acl_chmod to __posix_acl_chmod and add a fully featured ACL chmod helper that uses the ->set_acl inode operation.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
- 23 Jan, 2014 1 commit
-
-
Committed by Jaegeuk Kim
If a node page is truncated, we had better drop the page from the node_inode's page cache to reduce the memory footprint.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
- 22 Jan, 2014 5 commits
-
-
Committed by Jaegeuk Kim
This patch adds NODE_MAPPING, which is similar to META_MAPPING introduced by Gu Zheng.

Cc: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Gu Zheng
Since orphan_blocks may be as large as 504, it is neither safe nor rigorous to store such a large array on the kernel stack, as Dan Carpenter said. In fact, grab_meta_page has locked the page in the page cache, and we can use find_get_page() to fetch the page safely downstream, so we can remove the page array directly.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Gu Zheng
Introduce the helper function META_MAPPING() to get the address space of the cached meta blocks.

Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Jaegeuk Kim
This patch moves a function in f2fs_delete_entry for code readability.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Jaegeuk Kim
If a dentry page is updated, we should call mark_inode_dirty to add the inode to the dirty list, so that its dentry pages are flushed to disk. Otherwise, the inode can be evicted without a flush.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
- 20 Jan, 2014 1 commit
-
-
Committed by Chris Fries
Fixed a variety of trivial checkpatch warnings. The only delta should be some minor formatting on log strings that were split / too long.

Signed-off-by: Chris Fries <cfries@motorola.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
- 16 Jan, 2014 2 commits
-
-
Committed by Changman Lee
When doing sync_meta_pages with META_FLUSH during a checkpoint, we override rw with WRITE_FLUSH_FUA. At this time, we should also set REQ_META|REQ_PRIO.

Signed-off-by: Changman Lee <cm224.lee@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Jaegeuk Kim
This patch should resolve the following bug.

  =========================================================
  [ INFO: possible irq lock inversion dependency detected ]
  3.13.0-rc5.f2fs+ #6 Not tainted
  ---------------------------------------------------------
  kswapd0/41 just changed the state of lock:
   (&sbi->gc_mutex){+.+.-.}, at: [<ffffffffa030503e>] f2fs_balance_fs+0xae/0xd0 [f2fs]
  but this lock took another, RECLAIM_FS-READ-unsafe lock in the past:
   (&sbi->cp_rwsem){++++.?}

  and interrupts could create inverse lock ordering between them.

  other info that might help us debug this:
  Chain exists of:
    &sbi->gc_mutex --> &sbi->cp_mutex --> &sbi->cp_rwsem

   Possible interrupt unsafe locking scenario:

         CPU0                    CPU1
         ----                    ----
    lock(&sbi->cp_rwsem);
                                 local_irq_disable();
                                 lock(&sbi->gc_mutex);
                                 lock(&sbi->cp_mutex);
    <Interrupt>
      lock(&sbi->gc_mutex);

   *** DEADLOCK ***

This bug is due to the f2fs_balance_fs call in f2fs_write_data_page. If f2fs_write_data_page is triggered by wbc->for_reclaim via kswapd, it should not call f2fs_balance_fs, which tries to get a mutex already grabbed by the original syscall flow.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
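The rule described above (skip balancing when the write comes from memory reclaim) can be sketched in user-space C as follows. This is only an illustration: `demo_wbc`, `demo_balance_fs`, `demo_write_data_page`, and `balance_calls` are hypothetical stand-ins, not the kernel's types or functions, and no real locking is modelled.

```c
#include <stdbool.h>

/* Hypothetical subset of a writeback control block; only for_reclaim is kept. */
struct demo_wbc {
    bool for_reclaim;   /* true when the write was triggered by reclaim */
};

static int balance_calls;   /* counts how often balancing ran (demo only) */

/* Stand-in for the balancing step that would need the GC mutex. */
static void demo_balance_fs(void)
{
    balance_calls++;
}

/* Write one page; balance only when not called from the reclaim path. */
static int demo_write_data_page(const struct demo_wbc *wbc)
{
    /* ... the page itself would be written here ... */
    if (!wbc->for_reclaim)
        demo_balance_fs();
    return 0;
}
```

With this shape, a reclaim-driven write never reaches the step that takes the mutex, which is the ordering problem the lockdep report points at.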
-
- 14 Jan, 2014 5 commits
-
-
Committed by Changman Lee
Support for f2fs-tools/tools/f2stat to monitor /sys/kernel/debug/f2fs/status.

Signed-off-by: Changman Lee <cm224.lee@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Gu Zheng
With the two previous changes, all the long-running operations have been moved out of the protection region, so here we can use a spinlock rather than a mutex (orphan_inode_mutex) for lower overhead.

Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Gu Zheng
Move the allocation of a new orphan node out of the lock protection region.

Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Gu Zheng
Move grabbing of the orphan block pages out of the protection region, and grab all the orphan block pages ahead of time.

Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Reviewed-by: Chao Yu <chao2.yu@samsung.com>
[Jaegeuk Kim: remove unnecessary code pointed out by Chao Yu]
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Yuan Zhong
The "bool sync" parameter is never referenced in f2fs_wait_on_page_writeback. We should remove this parameter.

Signed-off-by: Yuan Zhong <yuan.mark.zhong@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
- 08 Jan, 2014 2 commits
-
-
Committed by Jaegeuk Kim
Previously during SSR and GC, the maximum number of retries to find a victim segment was hard-coded as MAX_VICTIM_SEARCH, 4096 by default.

This number affects IO locality when SSR mode is activated, which results in performance fluctuation on some low-end devices.

If max_victim_search = 4, the victim will be searched as below.
("D" represents a dirty segment, and "*" indicates a selected victim segment.)

  D1 D2 D3 D4 D5 D6 D7 D8 D9
  [ * ] [ * ] [ * ] [ ....]

This patch adds a sysfs entry to control the number dynamically through:
  /sys/fs/f2fs/$dev/max_victim_search

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
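One reading of the diagram above is that each search inspects a window of at most max_victim_search dirty segments and picks the best candidate inside it, then continues from where it stopped. The sketch below illustrates that reading only; `dirty_seg`, its `cost` field, and `pick_victim` are hypothetical names, and the real f2fs cost functions and bitmap scan are not modelled.

```c
#include <stddef.h>

/* Hypothetical stand-in for a dirty segment: only a cost is tracked. */
struct dirty_seg {
    unsigned int cost;   /* lower cost means a better victim */
};

/*
 * Examine at most max_search dirty segments starting at *cursor and
 * return the index of the cheapest one, or -1 if none was examined.
 * *cursor is advanced so the next call continues with the next window,
 * wrapping to 0 once the end of the array has been reached.
 */
static int pick_victim(const struct dirty_seg *segs, size_t nsegs,
                       size_t *cursor, size_t max_search)
{
    int best = -1;
    size_t searched = 0;

    while (*cursor < nsegs && searched < max_search) {
        if (best < 0 || segs[*cursor].cost < segs[best].cost)
            best = (int)*cursor;
        (*cursor)++;
        searched++;
    }
    if (*cursor >= nsegs)
        *cursor = 0;
    return best;
}
```

A small max_search keeps successive victims close together, which is the locality effect the message describes. A large value searches more widely for a cheaper victim.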
-
Committed by Jaegeuk Kim
When considering a bunch of data writes with very frequent fsync calls, we can think of the following performance regression.

  N: Node IO, D: Data IO, IO scheduler: cfq

  Issue    pending IOs
           D1 D2 D3 D4
  D1          D2 D3 D4 N1
  D2             D3 D4 N1 N2
  N1             D3 D4 N2 D1
  --> N1 can be selected by cfq because of the same priority of N and D.
      Then D3 and D4 would be delayed, resulting in performance degradation.

So, when processing the fsync call, it is better to give higher priority to data IOs than node IOs by assigning WRITE and WRITE_SYNC respectively.

This patch improves the random write performance with frequent fsync calls by up to 10%.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
- 06 Jan, 2014 9 commits
-
-
Committed by Chao Yu
Here is a case where inline data could be read from a page other than the first one.

1. write inline data
2. lseek to offset 4096
3. read 4096 bytes from offset 4096
   (read_inline_data reads the inline data into a non-first page, and VFS has previously added this page to the page cache)
4. ftruncate to offset 8192
5. read 4096 bytes from offset 4096
   (we meet this updated page with inline data in the cache)

So we should leave this page with initialized data and the uptodate flag for this case.

Change log from v1:
 o fix a deadlock bug

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Chao Yu
Change log from v1:
 o reduce unneeded memset in __f2fs_convert_inline_data

>From 58796be2bd2becbe8d52305210fb2a64e7dd80b6 Mon Sep 17 00:00:00 2001
From: Chao Yu <chao2.yu@samsung.com>
Date: Mon, 30 Dec 2013 09:21:33 +0800
Subject: [PATCH] f2fs: avoid to left uninitialized data in page when read inline data

We left uninitialized data in the tail of the page when reading an inline data page. So let's initialize the rest of the page, excluding the inline data region.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
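The idea of initializing everything outside the inline data region can be sketched as below. This is a user-space illustration, not the patch itself: `fill_page_from_inline` is a hypothetical name, and `DEMO_PAGE_SIZE` assumes a 4096-byte page.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_PAGE_SIZE 4096   /* assumption: 4 KiB page */

/*
 * Copy inline_len bytes of inline data to the start of a page buffer and
 * zero the remainder, so no stale bytes are left in the tail of the page.
 * Returns 0 on success, -1 if the data does not fit in one page.
 */
static int fill_page_from_inline(unsigned char *page,
                                 const unsigned char *inline_data,
                                 size_t inline_len)
{
    if (inline_len > DEMO_PAGE_SIZE)
        return -1;
    memcpy(page, inline_data, inline_len);
    /* Only the tail is cleared; the inline region was just overwritten. */
    memset(page + inline_len, 0, DEMO_PAGE_SIZE - inline_len);
    return 0;
}
```

Clearing only the tail, rather than the whole page before the copy, matches the "reduce unneeded memset" note in the change log.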
-
Committed by shifei10.ge
truncate_partial_nodes puts pages incorrectly in the following two cases. Note that the value of the argument 'depth' can only be 2 or 3. Please see truncate_inode_blocks() and truncate_partial_nodes().

1) An error occurs in the first 'for' loop
   When an error occurs with depth = 2, pages[0] is invalid, so this page doesn't need to be put. There is no problem. However, when depth is 3, the pages are not put correctly where pages[0] is valid and pages[1] is invalid. In this case, depth is set to 2 (see the statement depth = i + 1), followed by 'goto fail'. At label 'fail', for (i = depth - 3; i >= 0; i--) cannot meet the condition because i = -1, so pages[0] can't be put.

2) An error happens in the second 'for' loop
   Now we have pages[0] with depth = 2, or pages[0] and pages[1] with depth = 3. When an error is detected, we need 'goto fail' to put those pages. When depth is 2, at label 'fail', for (i = depth - 3; i >= 0; i--) can't meet the condition because i = -1, so pages[0] can't be put. When depth is 3, at label 'fail', for (i = depth - 3; i >= 0; i--) can only put pages[0]; pages[1] also can't be put.

Note that 'depth' has been changed before the first 'goto fail' (see the statement depth = i + 1), so passing this modified 'depth' to the tracepoint, trace_f2fs_truncate_partial_nodes, is also incorrect.

Signed-off-by: Shifei Ge <shifei10.ge@samsung.com>
[Jaegeuk Kim: modify the description and fix one bug]
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
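The failure mode above comes from deriving the cleanup range from a depth value that was modified on the way to the error label. A common way to avoid it is to unwind from the index of the last successful acquisition. The sketch below shows that pattern in user-space C; `page_ref`, `get_ref`, `put_ref`, `acquire_chain`, and the `fail_at` hook are hypothetical and do not reproduce the actual fix. Here `count` plays the role of depth - 1, the number of pages held for a given depth in the description.

```c
#include <stddef.h>

/* Hypothetical page with a reference count; not the kernel's struct page. */
struct page_ref {
    int refs;
};

static void put_ref(struct page_ref *p)
{
    p->refs--;
}

/* Take a reference on pool[i], or fail for the slot named by fail_at. */
static struct page_ref *get_ref(struct page_ref *pool, int i, int fail_at)
{
    if (i == fail_at)
        return NULL;
    pool[i].refs++;
    return &pool[i];
}

/*
 * Acquire 'count' pages in order. On failure, release exactly the pages
 * acquired so far by walking back from the failing index.
 * Returns 0 on success (the caller owns the references), -1 on failure.
 */
static int acquire_chain(struct page_ref *pool, struct page_ref **pages,
                         int count, int fail_at)
{
    int i;

    for (i = 0; i < count; i++) {
        pages[i] = get_ref(pool, i, fail_at);
        if (!pages[i])
            goto fail;
    }
    return 0;
fail:
    while (--i >= 0)
        put_ref(pages[i]);
    return -1;
}
```

Because the unwind loop uses the loop index rather than a recomputed depth, every page taken before the failure is put exactly once, for both depth values.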
-
Committed by Jaegeuk Kim
get_dnode_of_data nullifies the inode and node page when an error occurs.

There are two cases that pass the inode page into get_dnode_of_data().

1. make_empty_dir()
    -> get_new_data_page()
      -> f2fs_reserve_block(ipage)
        -> get_dnode_of_data()

2. f2fs_convert_inline_data()
    -> __f2fs_convert_inline_data()
      -> f2fs_reserve_block(ipage)
        -> get_dnode_of_data()

This patch adds correct error handling code for when get_dnode_of_data() returns an error.

First, f2fs_reserve_block() calls f2fs_put_dnode() whenever reserve_new_block returns an error. So, the rule of f2fs_reserve_block() is to nullify the inode page when there is any error internally.

Finally, the two callers of f2fs_reserve_block() should call f2fs_put_dnode() appropriately if they get an error after a successful f2fs_reserve_block().

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Jaegeuk Kim
This patch adds an inline_data recovery routine with the following policy.

  [prev.] [next] of inline_data flag
     o       o  -> recover inline_data
     o       x  -> remove inline_data, and then recover data blocks
     x       o  -> remove inline_data, and then recover inline_data
     x       x  -> recover data blocks

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
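The policy table above maps two flags to a sequence of steps, which can be written as a small decision function. The sketch below is illustrative only: the `recover_action` flags and `inline_recovery_plan` are hypothetical names, and the actual recovery work is not modelled.

```c
#include <stdbool.h>

/* Hypothetical action flags; the names are illustrative only. */
enum recover_action {
    REMOVE_INLINE  = 1 << 0,   /* remove existing inline_data first */
    RECOVER_INLINE = 1 << 1,   /* then recover inline_data */
    RECOVER_BLOCKS = 1 << 2    /* then recover data blocks */
};

/*
 * Return the set of actions for a given pair of inline_data flags,
 * following the four rows of the policy table.
 */
static int inline_recovery_plan(bool prev_inline, bool next_inline)
{
    if (prev_inline && next_inline)
        return RECOVER_INLINE;
    if (prev_inline && !next_inline)
        return REMOVE_INLINE | RECOVER_BLOCKS;
    if (!prev_inline && next_inline)
        return REMOVE_INLINE | RECOVER_INLINE;
    return RECOVER_BLOCKS;
}
```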
-
Committed by Jaegeuk Kim
This patch adds the number of inline_data files to the status information. Note that the number is reset whenever the filesystem is newly mounted.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Jaegeuk Kim
Change log from v1:
 o handle NULL pointer of grab_cache_page_write_begin() pointed out by Chao Yu.

This patch refactors f2fs_convert_inline_data to check a couple of conditions internally when deciding whether it needs to convert inline_data or not.

So, the new f2fs_convert_inline_data initially checks:
1) f2fs_has_inline_data(), and
2) the data size to be changed.

If the inode has inline_data but the size to fill is less than MAX_INLINE_DATA, then we don't need to convert the inline_data with data allocation.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
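The two checks above amount to an early-exit predicate. The sketch below shows one way to write it; `demo_inode` and `needs_inline_conversion` are hypothetical names, the limit is passed in rather than taken from the real MAX_INLINE_DATA, and treating a size exactly equal to the limit as still fitting is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical inode summary; only the field the check needs. */
struct demo_inode {
    bool has_inline_data;
};

/*
 * Return true when inline data must be moved to a regular data block
 * before the file is changed to to_size bytes.
 */
static bool needs_inline_conversion(const struct demo_inode *inode,
                                    size_t to_size, size_t max_inline)
{
    if (!inode->has_inline_data)
        return false;   /* nothing inline, nothing to convert */
    if (to_size <= max_inline)
        return false;   /* assumed: still fits inside the inode block */
    return true;
}
```

Keeping both checks inside one function means callers can invoke it unconditionally, which is the refactoring the message describes.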
-
Committed by Jaegeuk Kim
In f2fs_write_begin(), if f2fs_convert_inline_data() returns an error like -ENOSPC, f2fs should call f2fs_put_page(). Otherwise, the page remains locked, resulting in the following bug.

  [<ffffffff8114657e>] sleep_on_page+0xe/0x20
  [<ffffffff81146567>] __lock_page+0x67/0x70
  [<ffffffff81157d08>] truncate_inode_pages_range+0x368/0x5d0
  [<ffffffff81157ff5>] truncate_inode_pages+0x15/0x20
  [<ffffffff8115804b>] truncate_pagecache+0x4b/0x70
  [<ffffffff81158082>] truncate_setsize+0x12/0x20
  [<ffffffffa02a1842>] f2fs_setattr+0x72/0x270 [f2fs]
  [<ffffffff811cdae3>] notify_change+0x213/0x400
  [<ffffffff811ab376>] do_truncate+0x66/0xa0
  [<ffffffff811ab541>] vfs_truncate+0x191/0x1b0
  [<ffffffff811ab5bc>] do_sys_truncate+0x5c/0xa0
  [<ffffffff811ab78e>] SyS_truncate+0xe/0x10
  [<ffffffff81756052>] system_call_fastpath+0x16/0x1b
  [<ffffffffffffffff>] 0xffffffffffffffff

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Jaegeuk Kim
In punch_hole(), let's always convert inline_data, for simplicity and to avoid potential deadlock conditions. Doing this is not a big deal.

Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
- 27 Dec, 2013 1 commit
-
-
Committed by Jaegeuk Kim
This patch moves the inline_data check before the call to f2fs_lock_op() in truncate_blocks(), since taking the lock is unnecessary in that case.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
- 26 Dec, 2013 2 commits
-
-
Committed by Huajun Li
Hook inline data read/write, truncate, fallocate, setattr, etc.

A file must meet the following two requirements to be inlined:
1) the file size is not greater than MAX_INLINE_DATA;
2) the file doesn't pre-allocate data blocks by fallocate().

FI_INLINE_DATA will not be set while creating a new regular inode because most files are bigger than ~3.4K. Set FI_INLINE_DATA only when data is submitted to the block layer, rather than while creating a new inode. This also avoids converting data from inline to a normal data block and vice versa.

While writing inline data to the inode block, the first data block should be released if the file has a block indexed by i_addr[0].

On the other hand, when a file operation is applied to a file with inline data, we need to test whether this file can remain inline after the operation. Otherwise, it should be converted into a normal file by reserving a new data block, copying the inline data to this new block, and clearing the FI_INLINE_DATA flag. Because reserving a new data block here will make use of i_addr[0], if we save inline data in i_addr[0..872], then the first 4 bytes would be overwritten. This problem can be avoided simply by not using i_addr[0] for inline data.

Signed-off-by: Huajun Li <huajun.li@intel.com>
Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com>
Signed-off-by: Weihong Xu <weihong.xu@intel.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
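The figures in the message above can be checked with a little arithmetic. The sketch below assumes the layout exactly as described in the text: address slots i_addr[0..872], 4 bytes per slot (from "the first 4 bytes would be overwritten"), and i_addr[0] left unused for inline data. The `DEMO_*` constants and function names are hypothetical, and the real MAX_INLINE_DATA definition may be derived differently.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: slots i_addr[0..872] as named in the text, i.e. 873 slots. */
#define DEMO_ADDR_SLOTS     873
/* i_addr[0] stays reserved for the first data block address. */
#define DEMO_RESERVED_SLOTS 1

/* Bytes available for inline data when i_addr[0] is skipped. */
static size_t demo_max_inline_bytes(void)
{
    return (DEMO_ADDR_SLOTS - DEMO_RESERVED_SLOTS) * sizeof(uint32_t);
}

/* Byte offset of the inline area from the start of the i_addr array. */
static size_t demo_inline_offset(void)
{
    return DEMO_RESERVED_SLOTS * sizeof(uint32_t);
}
```

Under these assumptions the inline area is 872 × 4 = 3488 bytes, which is about 3.4 KiB and matches the "~3.4K" figure in the message.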
-
Committed by Huajun Li
Functions to implement inline data read/write, and to move inline data to a normal data block when the file size exceeds the inline data limit.

Signed-off-by: Huajun Li <huajun.li@intel.com>
Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com>
Signed-off-by: Weihong Xu <weihong.xu@intel.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-