Commits · c33ec32692e1f2f4650f7bf5bb1108bb346b82a4 · openanolis / cloud-kernel

16 Jan, 2014 1 commit

f2fs: avoid f2fs_balance_fs call during pageout · c33ec326

Committed by Jaegeuk Kim on Jan 16, 2014

This patch should resolve the following bug.

=========================================================
[ INFO: possible irq lock inversion dependency detected ]
3.13.0-rc5.f2fs+ #6 Not tainted
---------------------------------------------------------
kswapd0/41 just changed the state of lock:
 (&sbi->gc_mutex){+.+.-.}, at: [<ffffffffa030503e>] f2fs_balance_fs+0xae/0xd0 [f2fs]
but this lock took another, RECLAIM_FS-READ-unsafe lock in the past:
 (&sbi->cp_rwsem){++++.?}

and interrupts could create inverse lock ordering between them.

other info that might help us debug this:
Chain exists of:
  &sbi->gc_mutex --> &sbi->cp_mutex --> &sbi->cp_rwsem

 Possible interrupt unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&sbi->cp_rwsem);
                               local_irq_disable();
                               lock(&sbi->gc_mutex);
                               lock(&sbi->cp_mutex);
  <Interrupt>
    lock(&sbi->gc_mutex);

 *** DEADLOCK ***

This bug is due to the f2fs_balance_fs call in f2fs_write_data_page.
If f2fs_write_data_page is triggered with wbc->for_reclaim set via kswapd, it
should not call f2fs_balance_fs, which tries to take a mutex already held by
the original syscall flow.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
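
The rule described in this message can be illustrated with a small user-space sketch. It is not the kernel code: struct wbc_sketch, balance_fs_sketch() and write_data_page_sketch() are hypothetical stand-ins for struct writeback_control, f2fs_balance_fs() and f2fs_write_data_page(), and the counter exists only to make the behaviour observable.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's struct writeback_control. */
struct wbc_sketch {
	bool for_reclaim;	/* set when writeback is driven by memory reclaim */
};

/* Counts how often the mock balance routine ran; illustration only. */
static int balance_calls;

/* Mock of f2fs_balance_fs(); the real one may take gc_mutex. */
static void balance_fs_sketch(void)
{
	balance_calls++;
}

/* Mock of the tail of f2fs_write_data_page(). */
static int write_data_page_sketch(const struct wbc_sketch *wbc)
{
	/* ... the page itself would be written out here ... */

	/* On the reclaim path, skip balancing so kswapd never waits on gc_mutex. */
	if (!wbc->for_reclaim)
		balance_fs_sketch();
	return 0;
}
```

In this sketch a normal writeback call still triggers the balance step, while a call with for_reclaim set returns without touching it.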

c33ec326

14 Jan, 2014 5 commits

f2fs: add delimiter to seperate name and value in debug phrase · 499046ab

Committed by Changman Lee on Jan 13, 2014

Support for f2fs-tools/tools/f2stat to monitor
/sys/kernel/debug/f2fs/status
Signed-off-by: Changman Lee <cm224.lee@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

499046ab

f2fs: use spinlock rather than mutex for better speed · 17b692f6

Committed by Gu Zheng on Jan 10, 2014

With the two previous changes, all long-running operations are moved out
of the protected region, so here we can use a spinlock rather than a mutex
(orphan_inode_mutex) for lower overhead.
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

17b692f6

f2fs: move alloc new orphan node out of lock protection region · c1ef3725

Committed by Gu Zheng on Jan 10, 2014

Move the allocation of a new orphan node out of the lock-protected region.
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

c1ef3725

f2fs: move grabing orphan pages out of protection region · 4531929e

Committed by Gu Zheng on Jan 10, 2014

Move the grabbing of orphan block pages out of the protection region, and grab
all the orphan block pages ahead of time.
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Reviewed-by: Chao Yu <chao2.yu@samsung.com>
[Jaegeuk Kim: remove unnecessary code pointed out by Chao Yu]
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

4531929e

f2fs: remove the needless parameter of f2fs_wait_on_page_writeback · 5514f0aa

Committed by Yuan Zhong on Jan 10, 2014

The "bool sync" parameter is never referenced in f2fs_wait_on_page_writeback.
We should remove this parameter.
Signed-off-by: Yuan Zhong <yuan.mark.zhong@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

5514f0aa

09 Jan, 2014 1 commit

f2fs: update documents and a MAINTAINERS entry · 3bac380c

Committed by Jaegeuk Kim on Jan 09, 2014

This patch adds some missing descriptions of sysfs entries in
 - Documentation/ABI/testing/sysfs-fs-f2fs
 - Documentation/filesystems/f2fs.txt.

It also adds a maintained documentation entry for F2FS in MAINTAINERS.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

3bac380c

08 Jan, 2014 2 commits

f2fs: add a sysfs entry to control max_victim_search · b1c57c1c

Committed by Jaegeuk Kim on Jan 08, 2014

Previously during SSR and GC, the maximum number of retries to find a victim
segment was hard-coded as MAX_VICTIM_SEARCH, 4096 by default.

This number affects IO locality when SSR mode is activated, which
results in performance fluctuation on some low-end devices.

If max_victim_search = 4, the victim will be searched like below.
("D" represents a dirty segment, and "*" indicates a selected victim segment.)

 D1 D2 D3 D4 D5 D6 D7 D8 D9
[   *       ]
      [   *    ]
            [         * ]
	                [ ....]

This patch adds a sysfs entry to control the number dynamically through:
  /sys/fs/f2fs/$dev/max_victim_search
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
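
The bounded search pictured in this message can be sketched as a small user-space function. This is an illustration under stated assumptions, not the f2fs implementation: pick_victim_sketch(), the dirty and cost arrays, and the cursor are hypothetical names, the cost is an arbitrary number where lower is better, and wrap-around at the end of the segment range is left out.

```c
#include <stddef.h>

#define NO_VICTIM ((size_t)-1)

/*
 * Scan segments starting at *cursor and stop after examining at most
 * max_search dirty ones. Returns the index of the cheapest dirty segment
 * seen, or NO_VICTIM if none was found. *cursor is advanced so that the
 * next call continues where this one stopped.
 */
static size_t pick_victim_sketch(const int *dirty, const unsigned *cost,
				 size_t nsegs, size_t *cursor,
				 unsigned max_search)
{
	size_t victim = NO_VICTIM;
	unsigned examined = 0;
	size_t i = *cursor;

	while (i < nsegs && examined < max_search) {
		if (dirty[i]) {
			if (victim == NO_VICTIM || cost[i] < cost[victim])
				victim = i;
			examined++;
		}
		i++;
	}
	*cursor = i;
	return victim;
}
```

With max_search set to 4, each call looks at no more than four dirty segments, so successive victims stay close to each other instead of being picked from the whole device.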

b1c57c1c

f2fs: improve write performance under frequent fsync calls · fb5566da

Committed by Jaegeuk Kim on Jan 08, 2014

When considering a bunch of data writes with very frequent fsync calls, we
can expect the following performance regression.

N: Node IO, D: Data IO, IO scheduler: cfq

Issue    pending IOs
	 D1 D2 D3 D4
 D1         D2 D3 D4 N1
 D2            D3 D4 N1 N2
 N1            D3 D4 N2 D1
 --> N1 can be selected by cfq because of the same priority of N and D.
     Then D3 and D4 would be delayed, resulting in performance degradation.

So, when processing the fsync call, it is better to give higher priority to data
IOs than to node IOs by assigning WRITE and WRITE_SYNC respectively.
This patch improves the random write performance with frequent fsync calls by up
to 10%.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

fb5566da

06 Jan, 2014 9 commits

f2fs: avoid to read inline data except first page · 04a17fb1

Committed by Chao Yu on Dec 30, 2013

Here is a case in which inline data could be read into a page other than the
first page.

1. write inline data
2. lseek to offset 4096
3. read 4096 bytes from offset 4096
	(read_inline_data reads the inline data into a non-first page,
	and the VFS has previously added this page to the page cache)
4. ftruncate to offset 8192
5. read 4096 bytes from offset 4096
	(we meet this updated page with inline data in the cache)

So we should leave this page with initialized data and the uptodate flag
for this case.

Change log from v1:
 o fix a deadlock bug
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

04a17fb1

f2fs: avoid to left uninitialized data in page when read inline data · 18309aaa

Committed by Chao Yu on Dec 30, 2013

Change log from v1:
 o reduce unneeded memset in __f2fs_convert_inline_data

>From 58796be2bd2becbe8d52305210fb2a64e7dd80b6 Mon Sep 17 00:00:00 2001
From: Chao Yu <chao2.yu@samsung.com>
Date: Mon, 30 Dec 2013 09:21:33 +0800
Subject: [PATCH] f2fs: avoid to left uninitialized data in page when read
 inline data

We left uninitialized data in the tail of the page when reading an inline data
page. So let's initialize the rest of the page, excluding the inline data region.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

18309aaa

f2fs: fix truncate_partial_nodes bug · a225dca3

Committed by shifei10.ge on Oct 29, 2013

The truncate_partial_nodes puts pages incorrectly in the following two cases.
Note that the value of the argument 'depth' can only be 2 or 3.
Please see truncate_inode_blocks() and truncate_partial_nodes().

1) An error occurs in the first 'for' loop
  When the error occurs with depth = 2, pages[0] is invalid, so this page doesn't
  need to be put, and there is no problem. However, when depth is 3, the pages
  are not put correctly if pages[0] is valid and pages[1] is invalid.
  In this case, depth is set to 2 (see the statement depth = i + 1), and then
  'goto fail'.
  In label 'fail', for (i = depth - 3; i >= 0; i--) cannot meet the condition
  because i = -1, so pages[0] can't be put.

2) An error occurs in the second 'for' loop
  Now we've got pages[0] with depth = 2, or we've got pages[0] and pages[1]
  with depth = 3. When an error is detected, we need 'goto fail' to put these
  pages.
  When depth is 2, in label 'fail', for (i = depth - 3; i >= 0; i--) can't
  meet the condition because i = -1, so pages[0] can't be put.
  When depth is 3, in label 'fail', for (i = depth - 3; i >= 0; i--) can
  only put pages[0], so pages[1] can't be put either.

Note that 'depth' has been changed before the first 'goto fail' (see the
statement depth = i + 1), so passing this modified 'depth' to the tracepoint,
trace_f2fs_truncate_partial_nodes, is also incorrect.
Signed-off-by: Shifei Ge <shifei10.ge@samsung.com>
[Jaegeuk Kim: modify the description and fix one bug]
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

a225dca3

f2fs: handle errors correctly during f2fs_reserve_block · a8865372

Committed by Jaegeuk Kim on Dec 27, 2013

get_dnode_of_data() nullifies the inode and node pages when an error occurs.

There are two cases that pass an inode page into get_dnode_of_data().

1. make_empty_dir()
    -> get_new_data_page()
      -> f2fs_reserve_block(ipage)
        -> get_dnode_of_data()

2. f2fs_convert_inline_data()
    -> __f2fs_convert_inline_data()
      -> f2fs_reserve_block(ipage)
        -> get_dnode_of_data()

This patch adds correct error handling code for when get_dnode_of_data() returns
an error.

First, f2fs_reserve_block() calls f2fs_put_dnode() whenever reserve_new_block
returns an error.
So, the rule of f2fs_reserve_block() is to nullify the inode page whenever any
error occurs internally.

Finally, the two callers of f2fs_reserve_block() should call f2fs_put_dnode()
appropriately if they hit an error after a successful f2fs_reserve_block().
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

a8865372

f2fs: add inline_data recovery routine · 1e1bb4ba

Committed by Jaegeuk Kim on Dec 26, 2013

This patch adds an inline_data recovery routine with the following policy.

[prev.] [next] of inline_data flag
   o       o  -> recover inline_data
   o       x  -> remove inline_data, and then recover data blocks
   x       o  -> remove inline_data, and then recover inline_data
   x       x  -> recover data blocks
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
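
The four-row policy table in this message can be encoded as a small decision function. The sketch below only restates the table; struct recover_plan, enum recover_target and inline_recovery_plan() are hypothetical names, and the actual recovery work is not modelled.

```c
#include <stdbool.h>

enum recover_target { RECOVER_INLINE_DATA, RECOVER_DATA_BLOCKS };

struct recover_plan {
	bool remove_inline_first;	/* the "remove inline_data" step in the table */
	enum recover_target target;	/* what is recovered afterwards */
};

/* prev_inline / next_inline are the [prev.] and [next] inline_data flags. */
static struct recover_plan inline_recovery_plan(bool prev_inline, bool next_inline)
{
	struct recover_plan plan;

	/* The removal step appears only in the rows where the flags differ. */
	plan.remove_inline_first = (prev_inline != next_inline);
	/* The [next] flag decides what ends up being recovered. */
	plan.target = next_inline ? RECOVER_INLINE_DATA : RECOVER_DATA_BLOCKS;
	return plan;
}
```

Reading the table this way shows that the removal step is tied to a change of the flag, while the recovery target follows the newer state.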

1e1bb4ba

f2fs: add the number of inline_data files to status info · 0dbdc2ae

Committed by Jaegeuk Kim on Nov 26, 2013

This patch adds the number of inline_data files into the status information.
Note that the number is reset whenever the filesystem is newly mounted.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

0dbdc2ae

f2fs: refactor f2fs_convert_inline_data · 9e09fc85

Committed by Jaegeuk Kim on Dec 27, 2013

Change log from v1:
 o handle NULL pointer of grab_cache_page_write_begin() pointed out by Chao Yu.

This patch refactors f2fs_convert_inline_data to check a couple of conditions
internally for deciding whether it needs to convert inline_data or not.

So, the new f2fs_convert_inline_data initially checks:
1) f2fs_has_inline_data(), and
2) the data size to be changed.

If the inode has inline_data but the size to fill is less than MAX_INLINE_DATA,
then we don't need to convert the inline_data with data allocation.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
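
The two checks listed in this message amount to a simple predicate. The sketch below is an assumption-laden illustration: needs_inline_conversion() is a hypothetical helper, the inline limit is passed in as max_inline rather than taken from MAX_INLINE_DATA, and treating a size exactly equal to the limit as still fitting is a choice made here, not something the message states.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Returns true when inline data would have to be moved to a regular data
 * block before growing the file contents to to_size bytes.
 */
static bool needs_inline_conversion(bool has_inline_data, size_t to_size,
				    size_t max_inline)
{
	if (!has_inline_data)
		return false;	/* check 1: nothing inline to convert */
	if (to_size <= max_inline)
		return false;	/* check 2: the data still fits inline */
	return true;
}
```

Keeping both checks inside one function means callers can invoke it unconditionally and let it decide whether any block allocation is needed.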

9e09fc85

f2fs: call f2fs_put_page at the error case · 26f466f4

Committed by Jaegeuk Kim on Dec 27, 2013

In f2fs_write_begin(), if f2fs_convert_inline_data() returns an error like
-ENOSPC, f2fs should call f2fs_put_page().
Otherwise, the page remains locked, resulting in the following bug.

[<ffffffff8114657e>] sleep_on_page+0xe/0x20
[<ffffffff81146567>] __lock_page+0x67/0x70
[<ffffffff81157d08>] truncate_inode_pages_range+0x368/0x5d0
[<ffffffff81157ff5>] truncate_inode_pages+0x15/0x20
[<ffffffff8115804b>] truncate_pagecache+0x4b/0x70
[<ffffffff81158082>] truncate_setsize+0x12/0x20
[<ffffffffa02a1842>] f2fs_setattr+0x72/0x270 [f2fs]
[<ffffffff811cdae3>] notify_change+0x213/0x400
[<ffffffff811ab376>] do_truncate+0x66/0xa0
[<ffffffff811ab541>] vfs_truncate+0x191/0x1b0
[<ffffffff811ab5bc>] do_sys_truncate+0x5c/0xa0
[<ffffffff811ab78e>] SyS_truncate+0xe/0x10
[<ffffffff81756052>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

26f466f4

f2fs: convert inline_data for punch_hole · 8230a0a4

Committed by Jaegeuk Kim on Dec 27, 2013

In punch_hole(), let's convert inline_data all the time for simplicity and
to avoid potential deadlock conditions.
Doing this is not a big deal.
Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

8230a0a4

27 Dec, 2013 1 commit

f2fs: don't need to get f2fs_lock_op for the inline_data test · f185ff97

Committed by Jaegeuk Kim on Dec 27, 2013

This patch moves the inline_data check ahead of the f2fs_lock_op() call
in truncate_blocks(), since taking the lock is unnecessary for that check.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

f185ff97

26 Dec, 2013 8 commits

f2fs: update f2fs Documentation · e4024e86

Committed by Huajun Li on Nov 10, 2013

This patch describes the inline_data support in the f2fs documentation.
Signed-off-by: Huajun Li <huajun.li@intel.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

e4024e86

f2fs: handle inline data operations · 9ffe0fb5

Committed by Huajun Li on Nov 10, 2013

Hook inline data read/write, truncate, fallocate, setattr, etc.

Files need to meet the following 2 requirements to be inlined:
 1) file size is not greater than MAX_INLINE_DATA;
 2) file doesn't pre-allocate data blocks by fallocate().

FI_INLINE_DATA will not be set while creating a new regular inode because
most of the files are bigger than ~3.4K. Set FI_INLINE_DATA only when
data is submitted to the block layer, rather than while creating a new
inode; this also avoids converting data from inline to normal data block
and vice versa.

While writing inline data to the inode block, the first data block should be
released if the file has a block indexed by i_addr[0].

On the other hand, when a file operation is applied to a file with inline
data, we need to test whether the file can remain inline after the
operation; otherwise it should be converted into a normal file by reserving
a new data block, copying the inline data to this new block, and clearing the
FI_INLINE_DATA flag. Because reserving a new data block here will make use
of i_addr[0], if we saved inline data in i_addr[0..872], the first
4 bytes would be overwritten. This problem can be avoided simply by
not using i_addr[0] for inline data.
Signed-off-by: Huajun Li <huajun.li@intel.com>
Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com>
Signed-off-by: Weihong Xu <weihong.xu@intel.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

9ffe0fb5

f2fs: key functions to handle inline data · e18c65b2

Committed by Huajun Li on Nov 10, 2013

Functions to implement inline data read/write, and to move inline data to a
normal data block when the file size exceeds the inline data limit.
Signed-off-by: Huajun Li <huajun.li@intel.com>
Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com>
Signed-off-by: Weihong Xu <weihong.xu@intel.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

e18c65b2

f2fs: convert max_orphans to a field of f2fs_sb_info · 0d47c1ad

Committed by Gu Zheng on Dec 26, 2013

Previously, we needed to calculate the max orphan number whenever we tried to
acquire an orphan inode, but it is a stable value once the super block has been
initialized. So converting it to a field of f2fs_sb_info and using it directly
when needed seems a better choice.
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

0d47c1ad

f2fs: check the blocksize before calling generic_direct_IO path · 944fcfc1

Committed by Jaegeuk Kim on Dec 26, 2013

f2fs supports a 4KB block size. If a user requests a direct write of less than
4KB of data, it allocates a new 4KB data block.
However, f2fs doesn't zero the untouched area inside the newly allocated data
block.

This causes an error during the xfstests #263 test as follows.

263 12s ... [failed, exit status 1] - output mismatch (see 263.out.bad)
	--- 263.out	2013-03-09 03:37:15.043967603 +0900
	+++ 263.out.bad	2013-12-27 04:20:39.230203114 +0900
	@@ -1,3 +1,976 @@
	QA output created by 263
	fsx -N 10000 -o 8192 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z
	-fsx -N 10000 -o 128000 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z
	+fsx -N 10000 -o 8192 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z
	+truncating to largest ever: 0x12a00
	+truncating to largest ever: 0x75400
	+fallocating to largest ever: 0x79cbf
	...
	(Run 'diff -u 263.out 263.out.bad' to see the entire diff)
	Ran: 263
	Failures: 263
	Failed 1 of 1 tests

It turns out that, when the test tries to write 2KB of data with dio, the new dio
path allocates a 4KB data block without zero-filling the remaining 2KB
area. As a result, the output file contains garbage data for that region.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
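
A block-size check of the kind named in the commit title could look like the sketch below. It is only an illustration: dio_write_is_block_aligned() and BLKSIZE_SKETCH are hypothetical names, the 4KB value is taken from the message, and what the caller does with an unaligned request (for example, falling back to the buffered path) is an assumption that is not modelled here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BLKSIZE_SKETCH 4096u	/* block size stated in the commit message */

/* True when both the file offset and the length are multiples of 4KB. */
static bool dio_write_is_block_aligned(uint64_t offset, size_t len)
{
	const uint64_t mask = BLKSIZE_SKETCH - 1;

	return (offset & mask) == 0 && ((uint64_t)len & mask) == 0;
}
```

For the failing case described above, a 2KB write at offset 0 would be rejected by this predicate, so it would never reach a path that allocates a block without zeroing the remainder.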

944fcfc1

f2fs: should put the dnode when NEW_ADDR is detected · 1ec79083

Committed by Jaegeuk Kim on Dec 26, 2013

When get_dnode_of_data() in get_data_block() returns a successful dnode, we
should put the dnode.
But, previously, if its data block address was equal to NEW_ADDR, we didn't do
that, resulting in a deadlock condition.
So, this patch separates this case from the original error conditions, and then
calls f2fs_put_dnode before finishing the function.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

1ec79083

f2fs: introduce F2FS_INODE macro to get f2fs_inode · 58bfaf44

Committed by Jaegeuk Kim on Dec 26, 2013

This patch introduces F2FS_INODE that returns struct f2fs_inode * from the inode
page.
By using this macro, we can remove unnecessary casting code like below.

   struct f2fs_inode *ri = &F2FS_NODE(inode_page)->i;
-> struct f2fs_inode *ri = F2FS_INODE(inode_page);
Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
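
The before/after lines in this message suggest an accessor layered on top of the node accessor. The user-space sketch below mimics that shape with simplified, hypothetical types (page_sketch, f2fs_node_sketch, f2fs_inode_sketch) and helper names; the real structures and the real definition of F2FS_INODE live in the f2fs headers and may differ.

```c
/* Simplified, hypothetical stand-ins for the kernel structures. */
struct f2fs_inode_sketch {
	unsigned int i_size;
};

struct f2fs_node_sketch {
	struct f2fs_inode_sketch i;	/* inode body at the start of a node block */
};

struct page_sketch {
	void *data;			/* stands in for the mapped page contents */
};

/* Counterpart of F2FS_NODE(page): view the page contents as a node block. */
static inline struct f2fs_node_sketch *NODE_SKETCH(struct page_sketch *page)
{
	return (struct f2fs_node_sketch *)page->data;
}

/* Counterpart of F2FS_INODE(page): go straight to the inode inside it. */
static inline struct f2fs_inode_sketch *INODE_SKETCH(struct page_sketch *page)
{
	return &NODE_SKETCH(page)->i;
}
```

Callers then write INODE_SKETCH(page) instead of repeating the address-of and member access at every use site.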

58bfaf44

f2fs: check filename length in recover_dentry · d96b1431

Committed by Chao Yu on Dec 23, 2013

In the current flow, f2fs_find_entry returns NULL in recover_dentry when
name.len is bigger than F2FS_NAME_LEN, and then we still add this inode
into its dir entry.
To avoid this situation, we must check the filename length before we use it.

Another point is that we can remove the filename length check
in f2fs_find_entry, because f2fs_lookup is called beforehand to ensure the
validity of the filename length.

V2:
 o add WARN_ON() as Jaegeuk Kim suggested.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
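
The length check described in this message can be sketched as a guard that runs before the name is used. Everything here is illustrative: check_recovered_name_len() is a hypothetical helper, NAME_LEN_SKETCH assumes that F2FS_NAME_LEN is 255, and returning -ENAMETOOLONG is an assumed choice of error code.

```c
#include <errno.h>
#include <stddef.h>

#define NAME_LEN_SKETCH 255	/* assumed value of F2FS_NAME_LEN */

/* Returns 0 when the recovered name may be used, a negative errno otherwise. */
static int check_recovered_name_len(size_t name_len)
{
	if (name_len > NAME_LEN_SKETCH)
		return -ENAMETOOLONG;
	return 0;
}
```

A recovery routine would call such a guard first and skip the directory entry when it fails, rather than discovering the problem through a NULL lookup result.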

d96b1431

23 Dec, 2013 13 commits

f2fs: avoid to set wrong pino of inode when rename dir · deead090

Committed by Chao Yu on Dec 21, 2013

When we rename a dir to a new name that did not previously exist,
we set the pino of the parent inode to the ino of the child inode in
f2fs_set_link. This destroys the consistency of pino and should be fixed.

Thanks to Shu Tan for the previous work.
Signed-off-by: Shu Tan <shu.tan@samsung.com>
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

deead090

f2fs: update several comments · 4f4124d0

Committed by Chao Yu on Dec 21, 2013

Update several comments:
1. use f2fs_{un}lock_op instead of mutex_{un}lock_op.
2. update comment of get_data_block().
3. update description of node offset.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

4f4124d0

f2fs: remove the rw_flag domain from f2fs_io_info · 7e8f2308

Committed by Gu Zheng on Dec 20, 2013

When using the f2fs_io_info in the low level, we still need to merge the
rw and rw_flag, so use the rw to hold all the io flags directly,
and remove the rw_flag field.

PS. It is based on the previous patch:
f2fs: move all the bio initialization into __bio_alloc
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

7e8f2308

f2fs: move all the bio initialization into __bio_alloc · 940a6d34

Committed by Gu Zheng on Dec 20, 2013

Move all the bio initialization into __bio_alloc, and some minor cleanups are
also added.

v3:
  Use 'bool' rather than 'int' as Kim suggested.

v2:
  Use 'is_read' rather than 'rw' as Yu Chao suggested.
  Remove the needless initialization of bio->bi_private.
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

940a6d34

f2fs: add description about small_discards in document · ba0697ec

Committed by Jaegeuk Kim on Dec 19, 2013

This patch adds a description about small_discards in the f2fs document.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

ba0697ec

f2fs: write dirty meta pages collectively · 5459aa97

Committed by Jaegeuk Kim on Dec 17, 2013

This patch enhances writing dirty meta pages collectively in the background.
During file data writes, it is better to avoid writing small dirty meta pages
frequently.
So let's give f2fs a chance to collect a number of dirty meta pages for a while.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

5459aa97

f2fs: introduce a new direct_IO write path · bfad7c2d

Committed by Jaegeuk Kim on Dec 16, 2013

Previously, f2fs didn't support high-performance direct IO: it sent every
write request through the buffered write path, resulting in significant
performance degradation due to memory operations like copy_from_user.

This patch introduces a new direct IO path in which every write request is
processed by the generic blockdev_direct_IO() with an enhanced get_block function.

The get_data_block() in f2fs handles:
1. if the original data blocks are allocated, then give them to blockdev.
2. otherwise,
  a. preallocate requested block addresses
  b. do not use extent cache for better performance
  c. give the block addresses to blockdev

This policy implies that:
- newly allocated data are sequentially written to the disk
- updated data are randomly written to the disk.
- f2fs gives consistency on its file meta, not file data.
Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

bfad7c2d

f2fs: introduce sysfs entry to control in-place-update policy · 216fbd64

Committed by Jaegeuk Kim on Nov 07, 2013

This patch introduces new sysfs entries for users to control the policy of
in-place-updates, namely IPU, in f2fs.

Sometimes f2fs suffers from performance degradation due to its out-of-place
update policy, which produces many additional node block writes.
If the storage performance depends heavily on the amount of data written
rather than on IO patterns, we'd better drop this out-of-place update policy.

This patch suggests 5 policies and their triggering conditions as follows.

[sysfs entry name = ipu_policy]

0: F2FS_IPU_FORCE       all the time,
1: F2FS_IPU_SSR         if SSR mode is activated,
2: F2FS_IPU_UTIL        if FS utilization is over threshold,
3: F2FS_IPU_SSR_UTIL    if SSR mode is activated and FS utilization is over
                        threshold,
4: F2FS_IPU_DISABLE     disable IPU. (=default option)

[sysfs entry name = min_ipu_util]

This parameter controls the threshold to trigger in-place-updates.
The number indicates the percentage of filesystem utilization, and is used by
the F2FS_IPU_UTIL and F2FS_IPU_SSR_UTIL policies.

For more details, see need_inplace_update() in segment.h.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
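
The five policies and their triggering conditions can be written down as one decision function. The sketch below follows the list in this message, but it is not the code in segment.h: the enum and function names are hypothetical, the SSR state and the utilization percentage are passed in as plain parameters, and "over threshold" is read as strictly greater than min_ipu_util.

```c
#include <stdbool.h>

/* Numeric values follow the ipu_policy list in the commit message. */
enum ipu_policy_sketch {
	IPU_FORCE = 0,
	IPU_SSR = 1,
	IPU_UTIL = 2,
	IPU_SSR_UTIL = 3,
	IPU_DISABLE = 4,
};

/* Decide whether a block should be updated in place under the given policy. */
static bool need_inplace_update_sketch(enum ipu_policy_sketch policy,
				       bool ssr_active, unsigned util_pct,
				       unsigned min_ipu_util)
{
	switch (policy) {
	case IPU_FORCE:
		return true;
	case IPU_SSR:
		return ssr_active;
	case IPU_UTIL:
		return util_pct > min_ipu_util;
	case IPU_SSR_UTIL:
		return ssr_active && util_pct > min_ipu_util;
	case IPU_DISABLE:
	default:
		return false;
	}
}
```

With the default policy (4), the function always answers no, which matches the out-of-place behaviour f2fs had before this patch.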

216fbd64

f2fs: missing kmem_cache_destroy for discard_entry · 5dcd8a71

Committed by Changman Lee on Dec 11, 2013

insmod f2fs.ko fails after a first insmod and rmmod.

$ sudo insmod fs/f2fs/f2fs.ko
insmod: error inserting 'fs/f2fs/f2fs.ko': -1 Cannot allocate memory

-- dmesg --
kmem_cache_sanity_check (free_nid): Cache name already exists.
Signed-off-by: Changman Lee <cm224.lee@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

5dcd8a71

f2fs: fix the location of tracepoint · 76130cca

Committed by Jaegeuk Kim on Dec 11, 2013

We need to get a trace before submit_bio, since its bi_sector is remapped during
submit_bio.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

76130cca

f2fs: refactor bio->rw handling · 458e6197

Committed by Jaegeuk Kim on Dec 11, 2013

This patch introduces f2fs_io_info to mitigate the complex parameter list.

struct f2fs_io_info {
	enum page_type type;		/* contains DATA/NODE/META/META_FLUSH */
	int rw;				/* contains R/RS/W/WS */
	int rw_flag;			/* contains REQ_META/REQ_PRIO */
};

1. f2fs_write_data_pages
 - DATA
 - WRITE_SYNC is set when wbc->sync_mode is WB_SYNC_ALL.

2. sync_node_pages
 - NODE
 - WRITE_SYNC all the time

3. sync_meta_pages
 - META
 - WRITE_SYNC all the time
 - REQ_META | REQ_PRIO all the time

 ** f2fs_submit_merged_bio() handles META_FLUSH.

4. ra_nat_pages, ra_sit_pages, ra_sum_pages
 - META
 - READ_SYNC

Cc: Fan Li <fanofcode.li@samsung.com>
Cc: Changman Lee <cm224.lee@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

458e6197

f2fs: merge pages with the same sync_mode flag · 63a0b7cb

Committed by Fan Li on Dec 09, 2013

Previously f2fs submitted most write requests using WRITE_SYNC, but
f2fs_write_data_pages submitted the last write requests with the sync_mode flags
passed by callers.

This causes a performance problem, since contiguous pages with different sync
flags can't be merged in the cfq IO scheduler (thanks to Yu Chao for pointing
it out), and synchronous requests often take more time.

This patch makes the following modifications to DATA writebacks:

1. every page is written back using the sync mode the caller passes.
2. only pages with the same sync mode can be merged in one bio request.

These changes are restricted to DATA pages. Other types of writebacks are
modified to remain synchronous.

In my test with tiotest, f2fs sequential write performance improved by about
7%-10%, and this patch has no obvious impact on other performance tests.
Signed-off-by: Fan Li <fanofcode.li@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
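
The merge rule in point 2 can be sketched with a toy model of a pending bio. This is a simplified illustration, not the f2fs bio code: struct pending_bio_sketch, enum sync_mode_sketch and add_page_sketch() are hypothetical, and the requirement that block addresses be consecutive is an assumption added to make the model self-consistent.

```c
#include <stdbool.h>
#include <stdint.h>

enum sync_mode_sketch { MODE_WRITE, MODE_WRITE_SYNC };

struct pending_bio_sketch {
	bool active;			/* a bio is currently being built */
	enum sync_mode_sketch mode;	/* sync mode of the pages in it */
	uint64_t last_blk;		/* last block address added */
	unsigned submitted;		/* number of bios flushed so far */
};

/* Add one page; flush the pending bio first if the page cannot be merged. */
static void add_page_sketch(struct pending_bio_sketch *bio, uint64_t blk,
			    enum sync_mode_sketch mode)
{
	if (bio->active && (bio->mode != mode || blk != bio->last_blk + 1)) {
		bio->submitted++;	/* stands in for submitting the bio */
		bio->active = false;
	}
	if (!bio->active) {
		bio->active = true;
		bio->mode = mode;
	}
	bio->last_blk = blk;
}
```

In this model, a run of pages with one sync mode accumulates in a single bio, and a page with a different mode starts a new one even when its block address is adjacent.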

63a0b7cb

f2fs: add unlikely() macro for compiler more aggressively · 6bacf52f

Committed by Jaegeuk Kim on Dec 06, 2013

This patch adds the unlikely() macro to most of the code.
The basic rule is to add it when:
- checking unusual errors,
- checking page mappings,
- and the other unlikely conditions.

Change log from v1:
 - Don't add unlikely for the NULL test and error test: advised by Andi Kleen.

Cc: Chao Yu <chao2.yu@samsung.com>
Cc: Andi Kleen <andi@firstfloor.org>
Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
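
For readers unfamiliar with the macro, the sketch below shows how a branch hint of this kind is typically applied to a page-mapping check, one of the cases listed in this message. It assumes a GCC- or Clang-compatible compiler for __builtin_expect; struct page_sketch and page_belongs_to() are hypothetical and stand in for the kernel's page and its mapping test.

```c
/* Branch hint: tells the compiler the condition is expected to be false. */
#ifndef unlikely
#define unlikely(x) __builtin_expect(!!(x), 0)
#endif

/* Hypothetical stand-in for a page that remembers its owning mapping. */
struct page_sketch {
	const void *mapping;
};

/* Returns 1 when the page still belongs to the expected mapping, else 0. */
static int page_belongs_to(const struct page_sketch *page, const void *mapping)
{
	/* A changed mapping is rare, so the mismatch branch is marked unlikely. */
	if (unlikely(page->mapping != mapping))
		return 0;
	return 1;
}
```

The hint does not change the result of the comparison; it only lets the compiler lay out the common path as the fall-through case.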

6bacf52f

openanolis / cloud-kernel synced successfully more than 1 year ago