提交 · 54d71856428961124be26301b7997f2ad23be520 · openanolis / cloud-kernel

29 8月, 2015 1 次提交

f2fs: avoid accessing NULL pointer in f2fs_drop_largest_extent · 54d71856

由 Chao Yu 提交于 8月 28, 2015

If extent cache is disable, we will encounter oops when triggering direct
IO as below:

BUG: unable to handle kernel NULL pointer dereference at 0000000c
IP: [<f0b9c61e>] f2fs_drop_largest_extent+0xe/0x30 [f2fs]
*pdpt = 000000002bb9a001 *pde = 0000000000000000
Oops: 0000 [#1] SMP
Modules linked in: f2fs(O) fuse bnep rfcomm bluetooth nfsd dm_crypt nfs_acl auth_rpcgss oid_registry nfs binfmt_misc fscache lockd
sunrpc grace snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq snd_timer
snd_seq_device snd soundcore joydev psmouse hid_generic i2c_piix4 serio_raw ppdev mac_hid parport_pc lp parport ext4 jbd2 mbcache
usbhid hid e1000
CPU: 3 PID: 3608 Comm: dd Tainted: G           O    4.2.0-rc4 #12
Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
task: ef161600 ti: ebd5e000 task.ti: ebd5e000
EIP: 0060:[<f0b9c61e>] EFLAGS: 00010202 CPU: 3
EIP is at f2fs_drop_largest_extent+0xe/0x30 [f2fs]
EAX: 00000000 EBX: ddebc000 ECX: 00000000 EDX: 00000000
ESI: ebd5fdf8 EDI: 00000000 EBP: ebd5fd58 ESP: ebd5fd58
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
CR0: 80050033 CR2: 0000000c CR3: 2c24ee40 CR4: 000006f0
Stack:
 ebd5fda4 f0b8c005 00000000 00000001 00000000 f0b8c430 c816cd68 ddebc000
 ddebc088 00001000 00000555 00000555 ffffffff c160bb00 00055501 00000000
 00000000 00000100 00000000 ebd5fe20 f0b8c430 00000046 ef161600 00001000
Call Trace:
 [<f0b8c005>] __allocate_data_block+0x1a5/0x260 [f2fs]
 [<f0b8c430>] ? f2fs_direct_IO+0x370/0x440 [f2fs]
 [<c160bb00>] ? down_read+0x30/0x50
 [<f0b8c430>] f2fs_direct_IO+0x370/0x440 [f2fs]
 [<c113e115>] generic_file_direct_write+0xa5/0x260
 [<c10b53f8>] ? current_fs_time+0x18/0x50
 [<c113e38b>] __generic_file_write_iter+0xbb/0x210
 [<c113e50f>] ? generic_file_write_iter+0x2f/0x320
 [<c113e63c>] generic_file_write_iter+0x15c/0x320
 [<f0b77f29>] f2fs_file_write_iter+0x39/0x80 [f2fs]
 [<c11984d9>] __vfs_write+0xa9/0xe0
 [<c1199227>] vfs_write+0x97/0x180
 [<c119955b>] SyS_write+0x5b/0xd0
 [<c160dcd0>] sysenter_do_call+0x12/0x12
Code: 10 8b 50 1c 89 53 14 eb ca 8d 74 26 00 85 f6 74 86 eb a6 0f 0b 90 8d b4 26 00 00 00 00 55 89 e5 3e 8d 74 26 00 8b 80 d4 02 00
00 <8b> 48 0c 39 d1 77 0e 03 48 14 39 ca 73 07 c7 40 14 00 00 00 00
EIP: [<f0b9c61e>] f2fs_drop_largest_extent+0xe/0x30 [f2fs] SS:ESP 0068:ebd5fd58
CR2: 000000000000000c
---[ end trace a38c07026a1afffd ]---

This is because when extent cache is disable, extent_tree pointer in struct
f2fs_inode_info should be NULL, but in f2fs_drop_largest_extent we access
this NULL pointer directly without checking state of extent cache, then,
the oops occurs. Let's fix it by checking state of extent cache before
accessing.
Signed-off-by: NChao Yu <chao2.yu@samsung.com>
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

54d71856

27 8月, 2015 1 次提交

f2fs: update extent tree in batches · 19b2c30d

由 Chao Yu 提交于 8月 26, 2015

This patch introduce a new helper f2fs_update_extent_tree_range which can
do extent mapping update at a specified range.

The main idea is:
1) punch all mapping info in extent node(s) which are at a specified range;
2) try to merge new extent mapping with adjacent node, or failing that,
   insert the mapping into extent tree as a new node.

In order to see the benefit, I add a function for stating time stamping
count as below:

uint64_t rdtsc(void)
{
	uint32_t lo, hi;
	__asm__ __volatile__ ("rdtsc" : "=a" (lo), "=d" (hi));
	return (uint64_t)hi << 32 | lo;
}

My test environment is: ubuntu, intel i7-3770, 16G memory, 256g micron ssd.

truncation path:	update extent cache from truncate_data_blocks_range
non-truncataion path:	update extent cache from other paths
total:			all update paths

a) Removing 128MB file which has one extent node mapping whole range of
file:
1. dd if=/dev/zero of=/mnt/f2fs/128M bs=1M count=128
2. sync
3. rm /mnt/f2fs/128M

Before:
		total		count		average
truncation:	7651022		32768		233.49

Patched:
		total		count		average
truncation:	3321		33		100.64

b) fsstress:
fsstress -d /mnt/f2fs -l 5 -n 100 -p 20
Test times:		5 times.

Before:
		total		count		average
truncation:	5812480.6	20911.6		277.95
non-truncation:	7783845.6	13440.8		579.12
total:		13596326.2	34352.4		395.79

Patched:
		total		count		average
truncation:	1281283.0	3041.6		421.25
non-truncation:	7355844.4	13662.8		538.38
total:		8637127.4	16704.4		517.06

1) For the updates in truncation path:
 - we can see updating in batches leads total tsc and update count reducing
   explicitly;
 - besides, for a single batched updating, punching multiple extent nodes
   in a loop, result in executing more operations, so our average tsc
   increase intensively.
2) For the updates in non-truncation path:
 - there is a little improvement, that is because for the scenario that we
   just need to update in the head or tail of extent node, new interface
   optimize to update info in extent node directly, rather than removing
   original extent node for updating and then inserting that updated one
   into cache as new node.
Signed-off-by: NChao Yu <chao2.yu@samsung.com>
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

19b2c30d

25 8月, 2015 6 次提交

f2fs: fix to release inode correctly · 13ec7297

由 Chao Yu 提交于 8月 24, 2015

In following call stack, if unfortunately we lose all chances to truncate
inode page in remove_inode_page, eventually we will add the nid allocated
previously into free nid cache, this nid is with NID_NEW status and with
NEW_ADDR in its blkaddr pointer:

 - f2fs_create
  - f2fs_add_link
   - __f2fs_add_link
    - init_inode_metadata
     - new_inode_page
      - new_node_page
       - set_node_addr(, NEW_ADDR)
     - f2fs_init_acl   failed
     - remove_inode_page  failed
  - handle_failed_inode
   - remove_inode_page  failed
   - iput
    - f2fs_evict_inode
     - remove_inode_page  failed
     - alloc_nid_failed   cache a nid with valid blkaddr: NEW_ADDR

This may not only cause resource leak of previous inode, but also may cause
incorrect use of the previous blkaddr which is located in NO.nid node entry
when this nid is reused by others.

This patch tries to add this inode to orphan list if we fail to truncate
inode, so that we can obtain a second chance to release it in orphan
recovery flow.
Signed-off-by: NChao Yu <chao2.yu@samsung.com>
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

13ec7297

f2fs: handle f2fs_truncate error correctly · b0154891

由 Chao Yu 提交于 8月 24, 2015

This patch fixes to return error number of f2fs_truncate, so that we
can handle the error correctly in callers.
Signed-off-by: NChao Yu <chao2.yu@samsung.com>
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

b0154891

f2fs: avoid unneeded initializing when converting inline dentry · 4ec17d68

由 Chao Yu 提交于 8月 24, 2015

When converting inline dentry, we will zero out target dentry page before
duplicating data of inline dentry into target page, it become overhead
since inline dentry size is not small.

So this patch tries to remove unneeded initializing in the space of target
dentry page.
Signed-off-by: NChao Yu <chao2.yu@samsung.com>
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

4ec17d68

f2fs: atomically set inode->i_flags · 6a678857

由 Zhang Zhen 提交于 8月 24, 2015

According to commit 5f16f322 ("ext4: atomically set inode->i_flags in
ext4_set_inode_flags()").
Signed-off-by: NZhang Zhen <zhenzhang.zhang@huawei.com>
Reviewed-by: NChao Yu <chao2.yu@samsung.com>
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

6a678857

f2fs: fix wrong pointer access during try_to_free_nids · f7409d0f

由 Jaegeuk Kim 提交于 8月 21, 2015

If we release the lock in list_for_each_entry_safe, we can lose the tmp
pointer by alloc_nid.
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

f7409d0f

f2fs: use __GFP_NOFAIL to avoid infinite loop · 80c54505

由 Jaegeuk Kim 提交于 8月 20, 2015

__GFP_NOFAIL can avoid retrying the whole path of kmem_cache_alloc and
bio_alloc.
And, it also fixes the use cases of GFP_ATOMIC correctly.
Suggested-by: NChao Yu <chao2.yu@samsung.com>
Reviewed-by: NChao Yu <chao2.yu@samsung.com>
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

80c54505

22 8月, 2015 8 次提交

f2fs: lookup neighbor extent nodes for merging later · dac2ddef

由 Chao Yu 提交于 8月 19, 2015

In __lookup_extent_tree_ret we will not try to find neighbor nodes if
we find the target node, in this condition, we will lost the chance to
merge the new mapping with exist extent node later.

So our extent cache of inode will be fragmented after overwrite exist
file, we can see the number of extent node increases intensively in
following test case:

dd if=/dev/zero of=/mnt/f2fs/4m bs=4K count=1024

Extent Cache:
  - Hit Count: L1-1:0 L1-2:0 L2:0
  - Hit Ratio: 0% (0 / 3072)
  - Inner Struct Count: tree: 1, node: 1

dd if=/dev/zero of=/mnt/f2fs/4m bs=4K count=1024 conv=notrunc

Extent Cache:
  - Hit Count: L1-1:2048 L1-2:0 L2:0
  - Hit Ratio: 33% (2048 / 6144)
  - Inner Struct Count: tree: 1, node: 961

This patch fixes to lookup neighbors of target node for further
merging.
Signed-off-by: NChao Yu <chao2.yu@samsung.com>
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

dac2ddef

f2fs: split __insert_extent_tree_ret for readability · ef05e221

由 Chao Yu 提交于 8月 19, 2015

This patch splits __insert_extent_tree_ret into __try_merge_extent_node &
__insert_extent_tree for code readability.
Signed-off-by: NChao Yu <chao2.yu@samsung.com>
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

ef05e221

f2fs: kill dead code in __insert_extent_tree · a6f78345

由 Chao Yu 提交于 8月 19, 2015

After commit 0f825ee6 ("f2fs: add new interfaces for extent tree"),
f2fs_init_extent_tree becomes the only caller of __insert_extent_tree, and
in f2fs_init_extent_tree, we will only insert extent node in an empty tree,
so __try_{back,front}_merge in __insert_extent_tree will never be called.

This patch removes these dead codes, besides, rename __insert_extent_tree
to __init_extent_tree for readability.
Signed-off-by: NChao Yu <chao2.yu@samsung.com>
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

a6f78345

f2fs: adjust showing of extent cache stat · 029e13cc

由 Chao Yu 提交于 8月 19, 2015

This patch alters to replace total hit stat with rbtree hit stat,
and then adjust showing of extent cache stat:

Hit Count:
L1-1: for largest node hit count;
L1-2: for last cached node hit count;
L2: for extent node hit after lookuping in rbtree.

Hit Ratio:
ratio (hit count / total lookup count)

Inner Struct Count:
tree count, node count.

Before:
Extent Hit Ratio: 0 / 2

Extent Tree Count: 3

Extent Node Count: 2

Patched:
Exten Cacache:
  - Hit Count: L1-1:4871 L1-2:2074 L2:208
  - Hit Ratio: 1% (7153 / 550751)
  - Inner Struct Count: tree: 26560, node: 11824
Signed-off-by: NChao Yu <chao2.yu@samsung.com>
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

029e13cc

f2fs: add largest/cached stat in extent cache · 91c481ff

由 Chao Yu 提交于 8月 19, 2015

This patch adds to stat the hit count of largest/cached node for showing
in debugfs.
Signed-off-by: NChao Yu <chao2.yu@samsung.com>
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

91c481ff

f2fs: fix incorrect mapping for bmap · e2b4e2bc

由 Chao Yu 提交于 8月 19, 2015

The test step is like below:
1. touch file
2. truncate -s $((1024*1024)) file
3. fallocate -o 0 -l $((1024*1024)) file
4. fibmap.f2fs file

Our result of fibmap.f2fs showed below is not correct:

file_pos   start_blk     end_blk        blks
       0    -937166132    -937166132           1
    4096    -937166132    -937166132           1
    8192    -937166132    -937166132           1
   12288    -937166132    -937166132           1
   16384    -937166132    -937166132           1
   20480    -937166132    -937166132           1
...
 1040384    -937166132    -937166132           1
 1044480    -937166132    -937166132           1

This is because f2fs_map_blocks will return with no error when meeting
a hole or preallocated block, the caller __get_data_block will map the
uninitialized variable value to bh->b_blocknr.

Unfortunately generic_block_bmap will neither check the return value of
get_data() nor check mapping info of buffer_head, result in returning
the random block address.

After fixing the issue, our result shows correctly:

file_pos   start_blk     end_blk        blks
       0           0           0         256
Signed-off-by: NChao Yu <chao2.yu@samsung.com>
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

e2b4e2bc

f2fs: fix to update cached_en of extent tree properly · f8b703da

由 Fan Li 提交于 8月 18, 2015

In f2fs_lookup_extent_tree, et->cached_en was read and updated with only
read lock held,
it could cause __lookup_extent_tree within return entirely wrong
extent_node, if other
thread update et->cached_en just before __lookup_extent_tree return.

However, there are two things about this patch that need to be noticed:
1. It does no good to arrange the order of concurrent read/write, the result
would still
be random in such case.
2. It's built on this assumption: the mix up of reads and writes on a single
pointer would
not make the pointer partially wrong at any time. Please let me know if I'm
wrong, thx.
Signed-off-by: NFan li <fanofcode.li@samsung.com>
Reviewed-by: NChao Yu <chao2.yu@samsung.com>
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

f8b703da

f2fs: fix typo · 217940d4

由 Junesung Lee 提交于 8月 18, 2015

Fix typo.
Signed-off-by: NJunesung Lee <junesoung412@gmail.com>
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

217940d4

21 8月, 2015 11 次提交

f2fs: check the node block address of newly allocated nid · 24928634

由 Jaegeuk Kim 提交于 8月 16, 2015

This patch adds a routine which checks the block address of newly allocated nid.
If an nid has already allocated by other thread due to subtle data races, it
will result in filesystem corruption.
So, it needs to check whether its block address was already allocated or not
in prior to nid allocation as the last chance.
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

24928634

f2fs: go out for insert_inode_locked failure · a21c20f0

由 Jaegeuk Kim 提交于 8月 16, 2015

We should not call unlock_new_inode when insert_inode_locked failed.
Reviewed-by: NChao Yu <chao2.yu@samsung.com>
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

a21c20f0

f2fs: retry gc if one section is not successfully reclaimed · 5ee5293c

由 Jaegeuk Kim 提交于 8月 15, 2015

If FG_GC failed to reclaim one section, let's retry with another section
from the start, since we can get anoterh good candidate.
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

5ee5293c

f2fs: fix to cover lock_op for update_inode_page · 2286c020

由 Jaegeuk Kim 提交于 8月 15, 2015

Previously, update_inode_page is not called under f2fs_lock_op.
Instead we should call with f2fs_write_inode.
Reviewed-by: NChao Yu <chao2.yu@samsung.com>
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

2286c020

f2fs: reuse nids more aggressively · 26834466

由 Jaegeuk Kim 提交于 8月 14, 2015

If we can reuse nids as many as possible, we can mitigate producing obsolete
node pages in the page cache.
Reviewed-by: NChao Yu <chao2.yu@samsung.com>
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

26834466

f2fs: avoid garbage collecting already moved node blocks · 26d58599

由 Jaegeuk Kim 提交于 8月 14, 2015

If node blocks were already moved, we don't need to move them again.
Reviewed-by: NChao Yu <chao2.yu@samsung.com>
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

26d58599

f2fs: handle failed bio allocation · 740432f8

由 Jaegeuk Kim 提交于 8月 14, 2015

As the below comment of bio_alloc_bioset, f2fs can allocate multiple bios at the
same time. So, we can't guarantee that bio is allocated all the time.

"
 *   When @bs is not NULL, if %__GFP_WAIT is set then bio_alloc will always be
 *   able to allocate a bio. This is due to the mempool guarantees. To make this
 *   work, callers must never allocate more than 1 bio at a time from this pool.
 *   Callers that need to allocate more than 1 bio must always submit the
 *   previously allocated bio for IO before attempting to allocate a new one.
 *   Failure to do so can cause deadlocks under memory pressure.
"
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

740432f8

f2fs: increase the number of max hard links · a6db67f0

由 Jaegeuk Kim 提交于 8月 10, 2015

This patch increases the number of maximum hard links for one file.
Reviewed-by: NChao Yu <chao2.yu@samsung.com>
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

a6db67f0

f2fs: skip checkpoint if there is no dirty and prefree segments · 798c1b16

由 Jaegeuk Kim 提交于 8月 11, 2015

We should avoid needless checkpoints when there is no dirty and prefree segment.
Reviewed-by: NChao Yu <chao2.yu@samsung.com>
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

798c1b16

f2fs: shrink free_nids entries · 31696580

由 Chao Yu 提交于 7月 28, 2015

This patch introduces __count_free_nids/try_to_free_nids and registers
them in slab shrinker for shrinking under memory pressure.
Signed-off-by: NChao Yu <chao2.yu@samsung.com>
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

31696580

f2fs: avoid clear valid page · 206e61be

由 Chao Yu 提交于 8月 12, 2015

In f2fs_delete_entry, if last dirent is remove from the dentry page,
we will try to punch that page since it has no valid date in it.

But truncate_hole which is used for punching could fail because of
no memory or IO error, if that happened, we'd better skip clearing
this valid dentry page.
Signed-off-by: NChao Yu <chao2.yu@samsung.com>
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

206e61be

20 8月, 2015 1 次提交

f2fs: do not write any node pages related to orphan inodes · 315df839

由 Jaegeuk Kim 提交于 8月 11, 2015

We should not write node pages when deleting orphan inodes.
In order to do that, we can eaisly set POR_DOING flag earlier before entering
orphan inode routine.
Reviewed-by: NChao Yu <chao2.yu@samsung.com>
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

315df839

15 8月, 2015 3 次提交

f2fs: avoid a build warning · 4c278394

由 Jaegeuk Kim 提交于 8月 11, 2015

If F2FS_CHECK_FS is turned off, we can get a build warning for unused variable.
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

4c278394

f2fs: handle error of f2fs_iget correctly · 8c14bfad

由 Chao Yu 提交于 8月 07, 2015

In recover_orphan_inode, whenever f2fs_iget fail, we will make kernel panic,
but it's not reasonable, because f2fs_iget can fail due to a lot of reasons
including out of memory.

So we change error handling method as below:
a) when finding no entry for the orphan inode, bug_on for catching bugs;
b) for other reasons, report it to caller.
Signed-off-by: NChao Yu <chao2.yu@samsung.com>
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

8c14bfad

f2fs: do not assign a new segment for dio under space shortage · 47e70ca4

由 Jaegeuk Kim 提交于 8月 11, 2015

If there is not enough free segment, we should not assign a new segment
explicitly. Otherwise, we can run out of free segment.
Reviewed-by: NChao Yu <chao2.yu@samsung.com>
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

47e70ca4

12 8月, 2015 2 次提交

f2fs: remove inmem radix tree · decd36b6

由 Chao Yu 提交于 8月 07, 2015

Previously, we use radix tree to index all registered page entries for
atomic file, but now we only use radix tree to see whether current page
is indexed or not, since the other user of radix tree is gone in commit
042b7816 ("f2fs: remove unnecessary call to invalidate inmemory pages").

So in this patch, we try to use one more efficient way:
Introducing a macro ATOMIC_WRITTEN_PAGE, and setting it as page private
value to indicate page indexing status. By using this way, we can save
memory and lookup time.
Signed-off-by: NChao Yu <chao2.yu@samsung.com>
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

decd36b6

f2fs: report EINVAL for unalignment direct IO · c15e8599

由 Chao Yu 提交于 8月 07, 2015

We run ltp testcase with f2fs and obtain a TFAIL in diotest4, the result in
detail is as fallow:

dio04

<<<test_start>>>
tag=dio04 stime=1432278894
cmdline="diotest4"
contacts=""
analysis=exit
<<<test_output>>>
diotest4    1  TPASS  :  Negative Offset
diotest4    2  TPASS  :  removed
diotest4    3  TFAIL  :  diotest4.c:129: write allows odd count.returns 1: Success
diotest4    4  TFAIL  :  diotest4.c:183: Odd count of read and write
diotest4    5  TPASS  :  Read beyond the file size
......

the result of ext4 with same environment:

dio04

<<<test_start>>>
tag=dio04 stime=1432259643
cmdline="diotest4"
contacts=""
analysis=exit
<<<test_output>>>
diotest4    1  TPASS  :  Negative Offset
diotest4    2  TPASS  :  removed
diotest4    3  TPASS  :  Odd count of read and write
diotest4    4  TPASS  :  Read beyond the file size
......

The reason is that when triggering DIO in f2fs, we will return zero value
in ->direct_IO if writer's buffer offset, file offset and transfer size is
not alignment to block size of filesystem, resulting in falling back into
buffered write instead of returning -EINVAL.

This patch fixes that problem by returning correct error number for above
case, and removing the judgement condition in check_direct_IO to make sure
the verification will be enabled for direct reader too.

Besides, Jaegeuk Kim pointed out that there is expectional cases we should
always make direct-io falling back into buffered write, such as dio in
encrypted file.
Signed-off-by: NYunlei He <heyunlei@huawei.com>
[Chao Yu make small change and add detail description in commit message]
Signed-off-by: NChao Yu <chao2.yu@samsung.com>
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

c15e8599

11 8月, 2015 1 次提交

f2fs: report error of fill_zero · 6394328a

由 Chao Yu 提交于 8月 07, 2015

fill_zero can fail due to a lot of reason, but previously we do not handle
its return value, so its callers such as punch_hole/f2fs_zero_range may
report success, but actually can fail because of error occurs inside
fill_zero.

This patch fixes to report correct return value of fill_zero.
Signed-off-by: NChao Yu <chao2.yu@samsung.com>
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

6394328a

06 8月, 2015 2 次提交

f2fs: recover invalid/reserved block address for fsynced file · 12a8343e

由 Chao Yu 提交于 8月 05, 2015

When testing with generic/101 in xfstests, error message outputed as below:

    --- tests/generic/101.out
    +++ results//generic/101.out.bad
    @@ -10,10 +10,14 @@
     File foo content after log replay:
     0000000 aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
     *
    -0200000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    +0200000 bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb
     *
     0372000
    ...
    (Run 'diff -u tests/generic/101.out results/generic/101.out.bad'  to see the entire diff)

The test flow is like below:
1. pwrite foo -S 0xaa 0 64K
2. pwrite foo -S 0xbb 64K 61K
3. sync
4. truncate foo 64K
5. truncate foo 125K
6. fsync foo
7. flakey drop writes
8. umount

After this test, we expect the data of recovered file will have the first
64k of data filling with value 0xaa and the next 61k of data filling with
value 0x00 because we have fsynced it before dropping writes in dm.

In f2fs, during recovering, we will only recover the valid block address
in direct node page if it is marked as a fsynced dnode, but block address
which means invalid/reserved (with value NULL_ADDR/NEW_ADDR) will not be
recovered. So, the file recovered shows its incorrect data 0xbb in range of
[61k, 125k].

In this patch, we fix to recover invalid/reserved block during recover flow.
Signed-off-by: NChao Yu <chao2.yu@samsung.com>
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

12a8343e

f2fs: use extent cache to optimize f2fs_reserve_block · 759af1c9

由 Fan Li 提交于 8月 05, 2015

In some cases, we only need the block address when we call
f2fs_reserve_block,
other fields of struct dnode_of_data aren't necessary.
We can try extent cache first for such cases in order to speed up the
process.
Signed-off-by: NFan li <fanofcode.li@samsung.com>
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

759af1c9

05 8月, 2015 4 次提交

f2fs: invalidate temporary meta page · e90c2d28

由 Chao Yu 提交于 7月 28, 2015

To avoid meeting garbage data in next free node block at the end of warm
node chain when doing recovery, we will try to zero out that invalid block.

If the device is not support discard, our way for zeroing out block is:
grabbing a temporary zeroed page in meta inode, then, issue write request
with this page.

But, we forget to release that temporary page, so our memory usage will
increase without gaining any hit ratio benefit, so it's better to free it
for saving memory.
Signed-off-by: NChao Yu <chao2.yu@samsung.com>
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

e90c2d28

f2fs: fix to release inode page correctly · 470f00e9

由 Chao Yu 提交于 7月 14, 2015

In following call path, we will pass a locked and referenced ipage
pointer to get_new_data_page:
 - init_inode_metadata
  - make_empty_dir
   - get_new_data_page

There are two exit paths in get_new_data_page when error occurs:
1) grab_cache_page fails, ipage will not be released;
2) f2fs_reserve_block fails, ipage will be released in callee.

So, it's not consistent for error handling in get_new_data_page.

For f2fs_reserve_block, it's not very easy to change the rule
of error handling, since it's already complicated.

Here we deside to choose an easy way to fix this issue:
If any error occur in get_new_data_page, we will ensure releasing
ipage in this function.

The same issue is in f2fs_convert_inline_dir, fix that too.
Signed-off-by: NChao Yu <chao2.yu@samsung.com>
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

470f00e9

f2fs: unify f2fs_bug_on when check blocks and segment · 7a04f64d

由 Liu Xue 提交于 7月 27, 2015

Replace BUG_ON with f2fs_bug_on to deal with
block and segment validity check failed.
Signed-off-by: NXue Liu <liuxueliu.liu@huawei.com>
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

7a04f64d

f2fs: freeze filesystem when fail to update meta page due to IO error · f3f338ca

由 Chao Yu 提交于 7月 29, 2015

In get_meta_page, we guarantee no failure for the returned page,
but sometimes, IO error from device will incur returning an
non-updated page.

Then, we still use this page as updated one, exception could happen
when using this kind of page.

So in this condition, we'd better freeze fs by making fs readonly and
and stop doing checkpoint.
Signed-off-by: NChao Yu <chao2.yu@samsung.com>
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

f3f338ca

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功