- 29 Oct, 2013 1 commit
-
-
Committed by Jaegeuk Kim
This config adds an option to remove the many BUG_ONs that can potentially degrade performance. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
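A minimal sketch of how such a compile-time switch might look, written as userspace C. The names `SKETCH_CHECK_FS`, `sketch_bug_on` and `checked_div` are hypothetical and do not come from the commit; the kernel patch may be structured differently.

```c
#include <assert.h>

/* SKETCH_CHECK_FS is a hypothetical stand-in for the new config option. */
#ifdef SKETCH_CHECK_FS
/* Checks enabled: a violated condition aborts, much like BUG_ON. */
#define sketch_bug_on(cond) assert(!(cond))
#else
/* Checks disabled: the condition is type-checked but never evaluated. */
#define sketch_bug_on(cond) ((void)sizeof(cond))
#endif

/* Example caller: the sanity check costs nothing when the option is off. */
static int checked_div(int total, int parts)
{
	sketch_bug_on(parts == 0);
	return parts ? total / parts : 0;
}
```

With the option off, the hot path keeps only the explicit fallback in `checked_div`; with it on, the same call site traps immediately.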
-
- 28 Oct, 2013 3 commits
-
-
Committed by Jaegeuk Kim
The deadlock is found through the following scenario.
  sys_mkdir()
   -> f2fs_add_link()
    -> __f2fs_add_link()
     -> init_inode_metadata() : lock_page(inode);
      -> f2fs_init_acl()
       -> f2fs_set_acl()
        -> f2fs_setxattr(..., NULL) : This NULL page incurs a deadlock at update_inode_page().
So, as with f2fs_init_security(), this patch adds a parameter to pass the locked inode page to f2fs_setxattr(). Found by Linux File System Verification project (linuxtesting.org).
Reported-by: Alexey Khoroshilov <khoroshilov@ispras.ru> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Jaegeuk Kim
This patch cleans up some of the ACL code. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Changman Lee
Only one dirty type is set in __locate_dirty_segment, and the dirty type of the segment is known there, so we don't need to check the other dirty types. Signed-off-by: Changman Lee <cm224.lee@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
- 25 Oct, 2013 8 commits
-
-
Committed by Jaegeuk Kim
This patch adds a tracepoint for f2fs_vm_page_mkwrite. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Jaegeuk Kim
This patch adds a tracepoint for set_page_dirty. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Chao Yu
Previously, set_page_dirty was called every time one summary entry was written into the compacted summary page. To avoid redundant set_page_dirty calls, we now call set_page_dirty only once, before releasing the page. Signed-off-by: Yu Chao <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Jaegeuk Kim
This patch adds a control method in sysfs to reclaim prefree segments. Signed-off-by: Changman Lee <cm224.lee@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Jaegeuk Kim
This patch merges some background jobs into this new function. Signed-off-by: Changman Lee <cm224.lee@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Jaegeuk Kim
Previously, f2fs postponed reclaiming prefree segments into free segments for as long as possible. However, if a user writes and deletes a large amount of data without any sync or fsync calls, some flash storage devices can suffer from garbage collection. So, this patch adds the reclaiming code to f2fs_write_node_pages and the background GC thread. If there are a lot of prefree segments, let's do a checkpoint so that f2fs submits discard commands for the prefree regions to the flash storage. Signed-off-by: Changman Lee <cm224.lee@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Haicheng Li
Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Jaegeuk Kim
This patch cleans up improper definitions that update some status information. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
- 22 Oct, 2013 3 commits
-
-
Committed by Gu Zheng
Introduce a non-failing version of kmem_cache_alloc, named f2fs_kmem_cache_alloc, to hide the retry routine and make the code a bit cleaner. v2: Fix the wrong use of the 'retry' tag pointed out by Gao feng. Use neater code to remove the redundant tag, as suggested by Haicheng Li. Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
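The retry-hiding wrapper described above can be sketched in userspace C roughly as follows. This is an illustration only: `sketch_alloc_nofail`, `sketch_flaky_alloc` and the injected-failure counter are hypothetical, and `sched_yield()` merely stands in for whatever the kernel helper does between attempts.

```c
#include <assert.h>
#include <sched.h>
#include <stdlib.h>

/* Hypothetical allocator hook so the retry path can be exercised. */
typedef void *(*sketch_alloc_fn)(size_t size);

/* Number of failures the demo allocator still has to inject. */
static int sketch_failures_left = 2;

/* Demo allocator: fails a fixed number of times, then behaves like malloc. */
static void *sketch_flaky_alloc(size_t size)
{
	if (sketch_failures_left > 0) {
		sketch_failures_left--;
		return NULL;
	}
	return malloc(size);
}

/* Never returns NULL: the retry loop is hidden from every caller. */
static void *sketch_alloc_nofail(sketch_alloc_fn alloc, size_t size)
{
	void *p;

	assert(alloc != NULL);
	while ((p = alloc(size)) == NULL)
		sched_yield();	/* give other threads a chance, then retry */
	return p;
}
```

Callers then write a single allocation call and drop their own `retry:` labels, which is the cleanup the commit message refers to.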
-
Committed by Haicheng Li
One dirty segment can only be mapped to one dirty_type; anything else is a bug. Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com> [Jaegeuk Kim: modify a comment related to this patch] Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Haicheng Li
Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
- 18 Oct, 2013 5 commits
-
-
Committed by Jaegeuk Kim
This patch enhances the recovery routine so that it does not write any data/node/meta until it completes. If any writes are sent to the disk, they could contaminate the written history that will be used for further recovery. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Gu Zheng
Previously, do_checkpoint() called congestion_wait() to wait for the previously submitted node/meta/data pages to be written back. congestion_wait() waits for a fixed period (e.g., HZ / 50), and there was no additional wake-up mechanism if the I/O finished before that period elapsed. Yuan Zhong found a situation in which the pages had already been written back but the checkpoint thread was still waiting for congestion_wait() to return. So we now store the checkpoint task in f2fs_sb while a checkpoint is in progress: it waits for I/O completion if there is I/O in flight, and the end-I/O path wakes up the checkpoint task when the I/O finishes. Thanks to Yuan Zhong for the preliminary work on this problem. Reported-by: Yuan Zhong <yuan.mark.zhong@samsung.com> Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Gu Zheng
Introduce the function read_raw_super_block() to hide reading the raw super block and the retry routine used if the first sb is invalid. Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
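As a rough illustration of the "try the first copy, fall back to the second" idea, here is a userspace sketch. It assumes the two superblock copies are already in memory and validates them by a magic number only; `struct sketch_sb`, `SKETCH_MAGIC` and `sketch_read_raw_super` are hypothetical names, and the real function reads blocks from the device and performs more checks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative magic value used only to tell valid copies from invalid ones. */
#define SKETCH_MAGIC 0xF2F52010u

/* Hypothetical, heavily reduced superblock. */
struct sketch_sb {
	uint32_t magic;
};

/*
 * Return the first valid copy out of two, or NULL when both are invalid.
 * The caller no longer has to spell out the retry on the second copy.
 */
static const struct sketch_sb *sketch_read_raw_super(const struct sketch_sb copies[2])
{
	int i;

	assert(copies != NULL);
	for (i = 0; i < 2; i++)
		if (copies[i].magic == SKETCH_MAGIC)
			return &copies[i];
	return NULL;
}
```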
-
Committed by Jaegeuk Kim
This patch removes the logic previously introduced to address the starvation on cp_rwsem. One potential bug in that logic is that the wait list should be protected by a spin_lock, but the previous code broke this rule. Moreover, the current rwsem actually handles this starvation issue reasonably well, so the logic was not needed in the first place. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Jaegeuk Kim
When storing i_rdev, we should check its file type. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
- 08 Oct, 2013 2 commits
-
-
Committed by Jaegeuk Kim
Previously, there was an erroneous scenario like the one below.
  thread 1:                          thread 2:
  f2fs_unlink
   - acquire_orphan_inode
     : sbi->n_orphans++
                                     write_checkpoint
                                      - block_operations
                                        : f2fs_lock_all
                                      - do_checkpoint
                                        : write orphan blocks with sbi->n_orphans
                                      - unblock_operations
   - f2fs_lock_op
   - release_orphan_inode
   - f2fs_unlock_op
During the checkpoint by thread 2, f2fs stores a wrong orphan block according to the wrong sbi->n_orphans. To avoid this, we should simply cover acquire_orphan_inode with f2fs_lock_op as well. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Jaegeuk Kim
During the f2fs_put_super procedure, we don't need to conduct a checkpoint every time: if the superblock is clean, it can be skipped. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
- 07 Oct, 2013 2 commits
-
-
Committed by Kelly Anderson
The current f2fs code returns an error if the xattr or acl options are passed when remounting. This matters in a typical scenario where f2fs is mounted as a "ro" root file system by the boot loader and the init process then wants to remount it "rw" with the "remount,rw" option. Signed-off-by: Kelly Anderson <kelly@xilka.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Gu Zheng
The fs_locks are used to block other operations (e.g., recovery) while a checkpoint is running, and every operation other than checkpoint needs to acquire an fs_lock. The problem is that when too many threads acquire fs_locks concurrently, they block each other, which can hurt performance. Although some optimization patches have improved the usage of fs_lock, the thorough solution is to replace the fs_lock with an *rw_sem*. The checkpoint routine takes the write semaphore and other operations take the read semaphore, so checkpoint still blocks other operations (e.g., recovery) while those operations no longer disturb each other, which avoids the problem described above completely.
Because of a weakness of rw_sem, this change may introduce a potential problem: the checkpoint thread might starve if other threads are intensively taking the read semaphore for I/O (pointed out by Xu Jin). To avoid this, a wait_list is introduced: pending read semaphore operations are put on the wait_list while the checkpoint thread is waiting for the write semaphore, and are woken up when the checkpoint thread releases the write semaphore.
Thanks to Kim for the earlier review and testing; performance test results from others for this patch would be very welcome.
V2:
  - fix the potential starvation problem.
  - use a more suitable function name, as suggested by Xu Jin.
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> [Jaegeuk Kim: adjust minor coding standard] Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
- 25 Sep, 2013 4 commits
-
-
Committed by Russ W. Knize
During recovery, orphan inodes are deleted via truncate_hole(). These orphans are added by recover_dentry() via f2fs_delete_entry(). However, f2fs_delete_entry() adds them via add_orphan_inode() without calling acquire_orphan_inode() first. This prevents the counters from being incremented properly, which causes them to underflow when remove_orphan_inode() is called later on. Signed-off-by: Russ Knize <rknize@motorola.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Russ Knize
f2fs_initxattrs() is called internally from within F2FS and should not call functions that are used by VFS handlers. This avoids certain deadlocks:
  - vfs_create()
   - f2fs_create() <-- takes an fs_lock
    - f2fs_add_link()
     - __f2fs_add_link()
      - init_inode_metadata()
       - f2fs_init_security()
        - security_inode_init_security()
         - f2fs_initxattrs()
          - f2fs_setxattr() <-- also takes an fs_lock
If the caller happens to grab the same fs_lock from the pool in both places, they will deadlock. There are also deadlocks involving multiple threads and mutexes:
  - f2fs_write_begin()
   - f2fs_balance_fs() <-- takes gc_mutex
    - f2fs_gc()
     - write_checkpoint()
      - block_operations()
       - mutex_lock_all() <-- blocks trying to grab all fs_locks
  - f2fs_mkdir() <-- takes an fs_lock
   - __f2fs_add_link()
    - f2fs_init_security()
     - security_inode_init_security()
      - f2fs_initxattrs()
       - f2fs_setxattr()
        - f2fs_balance_fs() <-- blocks trying to take gc_mutex
Signed-off-by: Russ Knize <Russ.Knize@motorola.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Russ W. Knize
Accounting errors from buggy code calling the acquire/release/remove orphan inode interfaces can cause n_orphans to underflow, which will then cause acquire_orphan_inode() to return -ENOSPC on the next operation. This commit guards against that condition. Signed-off-by: Russ Knize <rknize@motorola.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Chao Yu
Previously, recover_fsync_data still wrote a checkpoint when there was nothing to recover on a normally unmounted image. This may reduce mount performance and flash memory lifetime, so let's remove it. Signed-off-by: Tan Shu <shu.tan@samsung.com> Signed-off-by: Yu Chao <chao2.yu@samsung.com> Reviewed-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
- 24 Sep, 2013 3 commits
-
-
Committed by Chao Yu
This patch adds the macro MAX_BIO_BLOCKS to limit the value of npages in f2fs_bio_alloc. This avoids allocation failures in bio_alloc caused by npages being larger than BIO_MAX_PAGES. Signed-off-by: Yu Chao <chao2.yu@samsung.com> Reviewed-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
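The limit amounts to clamping the requested page count before it reaches the allocator. A minimal sketch, assuming a limit of 256 pages purely for illustration (`SKETCH_BIO_MAX_PAGES` and `sketch_bio_npages` are hypothetical names, and the real macro may take other device limits into account):

```c
#include <assert.h>

/* Assumed allocator limit, for illustration only. */
#define SKETCH_BIO_MAX_PAGES 256u

/* Clamp the requested page count so the allocation cannot be oversized. */
static unsigned int sketch_bio_npages(unsigned int requested, unsigned int limit)
{
	assert(limit > 0);
	return requested < limit ? requested : limit;
}
```

A caller that still has pages left after one clamped bio simply submits it and allocates another.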
-
Committed by Jin Xu
Since MAX_VICTIM_SEARCH has been enlarged from 20 to 4096, the victim searching overhead is much higher than before, especially for SSR, which searches for a victim quite often. This patch intends to reduce the overhead a little bit by:
  - making get_gc_cost an inline routine to reduce function call overhead
  - reducing multiplication and division operations
  - reducing unnecessary comparison operations
Signed-off-by: Jin Xu <jinuxstyle@gmail.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Yu Chao
There is a performance problem: when all sbi->fs_lock entries are held, all the following threads may get the same next_lock value from sbi->next_lock_num in mutex_lock_op and wait for the same lock (fs_lock[next_lock]), which may reduce performance. So we move the sbi->next_lock_num++ before taking the lock; this spreads the following threads evenly across the locks when all sbi->fs_lock entries are held. v1-->v2: Drop the needless spin_lock as Jaegeuk suggested. Suggested-by: Jaegeuk Kim <jaegeuk.kim@samsung.com> Signed-off-by: Yu Chao <chao2.yu@samsung.com> Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
- 05 Sep, 2013 2 commits
-
-
Committed by Jin Xu
This patch improves the gc efficiency by optimizing the victim selection policy. With this optimization, the random re-write performance could increase by up to 20%.
For f2fs, when the disk is short of free space, gc selects dirty segments and moves valid blocks around to make more space available. The gc cost of a segment is determined by the number of valid blocks in the segment: the fewer the valid blocks, the higher the efficiency. The ideal victim segment is the one that has the most garbage blocks.
Currently, it searches up to 20 dirty segments for a victim segment. The selected victim is unlikely to be the best victim for gc when there are many more dirty segments. Why not search more dirty segments for a better victim? The cost of searching dirty segments is negligible in comparison to moving blocks.
This patch enlarges MAX_VICTIM_SEARCH to 4096 to make the search more aggressive for a possibly better victim. Since it also applies to victim selection for SSR, it will likely improve SSR efficiency as well.
The test case is simple. It creates files until the disk is full. The size of each file is 32KB. Then it writes 100000 records of 4KB size to random offsets of random files in sync mode. The testing was done on a 2GB partition of an SDHC card.
Let's see the test result of f2fs without and with the patch.
  ---------------------------------------
  2GB partition, SDHC
  create 52023 files of size 32768 bytes
  random re-write 100000 records of 4KB
  ---------------------------------------
             | file creation (s) | rewrite time (s) | gc count | gc garbage blocks |
  [no patch]   341                 4227               1174       174840
  [patched]    324                 2958               645        106682
It's obvious that, with the patch, f2fs finishes the test in 20+% less time than without the patch. Internally, it also does much less gc, with higher efficiency than before.
Since the performance improvement is related to gc, it might not be so obvious for other tests that do not trigger gc as often as this one (this is because f2fs selects dirty segments for SSR use most of the time when free space is short). The well-known iozone test tool was not used for benchmarking the patch because it does not seem to have a test case that performs random re-write on a full disk.
This patch is the revised version based on the suggestion from Jaegeuk Kim.
Signed-off-by: Jin Xu <jinuxstyle@gmail.com> [Jaegeuk Kim: suggested simpler solution] Reviewed-by: Jaegeuk Kim <jaegeuk.kim@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
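The effect of the search limit can be illustrated with a greedy scan that takes the number of valid blocks as the cost, as the message describes. This is a simplified sketch, not the f2fs selection code: the array of per-segment valid-block counts, the function name and the return convention are assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Search limit named in the commit message. */
#define SKETCH_MAX_VICTIM_SEARCH 4096u

/*
 * valid[i] is the valid-block count of dirty segment i.
 * Scan at most max_search segments and return the index of the cheapest
 * one (fewest valid blocks), or -1 when there is nothing to scan.
 */
static long sketch_pick_victim(const unsigned int *valid, size_t nr_dirty,
			       size_t max_search)
{
	long best = -1;
	size_t i;

	assert(valid != NULL || nr_dirty == 0);
	for (i = 0; i < nr_dirty && i < max_search; i++)
		if (best < 0 || valid[i] < valid[best])
			best = (long)i;
	return best;
}
```

With a small limit the scan stops before it ever sees the cheapest segment; raising the limit lets it reach that segment at the price of a longer, but comparatively cheap, loop.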
-
Committed by Jaegeuk Kim
Previously, we experienced bio traces as follows when running a simple sequential write test.
  f2fs_do_submit_bio: type = NODE, io = no sync, sector = 500104928, size = 4K
  f2fs_do_submit_bio: type = NODE, io = no sync, sector = 499922208, size = 368K
  f2fs_do_submit_bio: type = NODE, io = no sync, sector = 499914752, size = 140K
  -> total 512K
The first one is to write an indirect node block, and the others are to write direct node blocks. The reason why there are two separate bios for direct node blocks is:
  0. initial state
  ------------------ ------------------
  |                | |xxxxxxxx        |
  ------------------ ------------------
  1. write 368K
  ------------------ ------------------
  |                | |xxxxxxxxWWWWWWWW|
  ------------------ ------------------
  2. write 140K
  ------------------ ------------------
  |WWWWWWW         | |xxxxxxxxWWWWWWWW|
  ------------------ ------------------
This is because f2fs_write_node_pages tries to write just 512K in total, so that we can lose the chance to merge more bios nicely. After this patch is applied, we can get the following bio traces.
  f2fs_do_submit_bio: type = NODE, io = no sync, sector = 500103168, size = 8K
  f2fs_do_submit_bio: type = NODE, io = no sync, sector = 500111368, size = 4K
  f2fs_do_submit_bio: type = NODE, io = no sync, sector = 500107272, size = 512K
  f2fs_do_submit_bio: type = NODE, io = no sync, sector = 500108296, size = 512K
  f2fs_do_submit_bio: type = NODE, io = no sync, sector = 500109320, size = 500K
And finally, we can improve the sequential write performance, from 458.775 MB/s to 479.945 MB/s on SSD.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
- 03 Sep, 2013 2 commits
-
-
Committed by Jaegeuk Kim
The current f2fs stores all the block counts as 32-bit numbers, which is able to cover about 15TB volume. But in the calculation of utilization, f2fs multiplies the count by 100, which can induce overflow. This patch fixes this. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
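The overflow is ordinary 32-bit wrap-around: multiplying a large block count by 100 exceeds 2^32 before the division happens. One common fix, sketched below, is to widen to 64 bits for the intermediate product; whether the patch does exactly this is an assumption, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Buggy form: the product valid * 100 wraps modulo 2^32 for large counts. */
static uint32_t sketch_utilization_buggy(uint32_t valid, uint32_t total)
{
	assert(total != 0);
	return valid * 100u / total;
}

/* Widened form: the intermediate product is computed in 64 bits. */
static uint32_t sketch_utilization(uint32_t valid, uint32_t total)
{
	assert(total != 0);
	return (uint32_t)((uint64_t)valid * 100u / total);
}
```

For example, with 3,000,000,000 valid blocks out of 4,000,000,000, the widened version yields 75 percent, while the 32-bit product wraps and the result is far too small.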
-
Committed by Jaegeuk Kim
Previously, f2fs conducted SSR when free_sections() < overprovision_sections. But it could then fall back to SSR even though there were a lot of prefree segments. So, let's consider the number of prefree segments too when triggering SSR. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
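One way to read the change is that space which is about to become free should count against the SSR trigger. The sketch below assumes the prefree amount is already expressed in sections and is simply added to the free count; the function name, parameter names and that exact formula are assumptions rather than the patch itself.

```c
#include <assert.h>

/*
 * Decide whether to fall back to SSR.
 * Old condition (from the message): free_secs < overprov_secs.
 * Sketch of the new idea: prefree space counts as nearly free.
 */
static int sketch_need_ssr(unsigned int free_secs, unsigned int prefree_secs,
			   unsigned int overprov_secs)
{
	return free_secs + prefree_secs < overprov_secs;
}
```

With 2 free sections, 10 prefree sections and 5 overprovision sections, the old condition would start SSR while this one waits for the prefree space to be reclaimed.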
-
- 27 Aug, 2013 2 commits
-
-
Committed by Gu Zheng
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Jaegeuk Kim
f2fs_set_link updates the parent inode number, so we should sync this to the inode block. Otherwise, the data can be lost after a sudden power-off. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
- 26 Aug, 2013 3 commits
-
-
Committed by Jaegeuk Kim
0. modified inode structure
  --------------------------------------
  metadata (e.g., i_mtime, i_ctime, etc)
  --------------------------------------
  direct pointers [0 ~ 873]
  inline xattrs (200 bytes by default)
  indirect pointers [0 ~ 4]
  --------------------------------------
  node footer
  --------------------------------------
1. setxattr flow
  - read_all_xattrs copies all the xattrs from inline and xattr node block.
  - handle xattr entries
  - write_all_xattrs copies modified xattrs into inline and xattr node block.
2. getxattr flow
  - read_all_xattrs copies all the xattrs from inline and xattr node block.
  - check target entries
3. Usage
  # mount -t f2fs -o inline_xattr $DEV $MNT
Once mounted with the inline_xattr option, f2fs marks all the newly created files to reserve an amount of inline xattr space explicitly inside the inode block. Without the mount option, f2fs will not touch any existing files or newly created files.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Jaegeuk Kim
The truncate_xattr_node function will be used by inline xattr. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-
Committed by Jaegeuk Kim
__find_xattr searches for the wanted xattr entry starting from base_addr. If it is not found, the returned entry is the last empty xattr entry, which can be newly allocated. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
-