Commits · 1f258ec13b82d3d947b515a007a748ffcbe29f9a · openanolis / cloud-kernel

04 Jul, 2017 2 commits

f2fs: fix to avoid panic when encountering corrupt node · 1f258ec1

Authored by Chao Yu on Jun 07, 2017

With the fault_injection option enabled, generic/361 of fstests fails with
the message below:

Call Trace:
 get_node_page+0x12/0x20 [f2fs]
 f2fs_iget+0x92/0x7d0 [f2fs]
 f2fs_fill_super+0x10fb/0x15e0 [f2fs]
 mount_bdev+0x184/0x1c0
 f2fs_mount+0x15/0x20 [f2fs]
 mount_fs+0x39/0x150
 vfs_kern_mount+0x67/0x110
 do_mount+0x1bb/0xc70
 SyS_mount+0x83/0xd0
 do_syscall_64+0x6e/0x160
 entry_SYSCALL64_slow_path+0x25/0x25

Since mkfs on a loop device inside an f2fs partition can fail silently due
to checkpoint error injection, the root inode page can be corrupted. To
avoid a needless panic, it is better for get_node_page to log a message and
return an error to the caller, and let fsck repair it later.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
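The idea of reporting a corrupt node instead of panicking can be sketched in user-space C. This is only an illustration under stated assumptions: `struct node_footer`, `check_node_page`, and the choice of `-EIO` are hypothetical stand-ins and are not taken from the f2fs source.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the footer of an on-disk node block. */
struct node_footer {
    unsigned int nid;   /* node id recorded in the block itself */
};

/*
 * Return 0 when the block really belongs to the requested nid.
 * On a mismatch, log a message and hand an error back to the caller
 * instead of aborting, so that a later fsck run can repair the image.
 */
static int check_node_page(const struct node_footer *footer, unsigned int nid)
{
    assert(footer != NULL);
    if (footer->nid != nid) {
        fprintf(stderr,
                "inconsistent node block, nid:%u, footer nid:%u\n",
                nid, footer->nid);
        return -EIO;    /* assumed error code for this sketch */
    }
    return 0;
}
```

A caller such as a mount path would then propagate the negative return value upward rather than crash.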

f2fs: don't track newly allocated nat entry in list · febeca6d

Authored by Chao Yu on Jun 05, 2017

We will never persist newly allocated nat entries during checkpoint(), so
we don't need to track such nat entries in the nat dirty list. This avoids:
- more latency while traversing the dirty list;
- sorting nat sets incorrectly due to recording a wrong entry_cnt in the
  nat entry set.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

24 May, 2017 2 commits

f2fs: declare load_free_nid_bitmap static · bd80a4b9

Authored by Hou Pengyang on May 17, 2017

Signed-off-by: Hou Pengyang <houpengyang@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: remove unnecessary read cases in merged IO flow · b9109b0e

Authored by Jaegeuk Kim on May 10, 2017

The merged IO flow doesn't need to care about read IOs.

f2fs_submit_merged_bio -> f2fs_submit_merged_write
f2fs_submit_merged_bios -> f2fs_submit_merged_writes
f2fs_submit_merged_bio_cond -> f2fs_submit_merged_write_cond

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

04 May, 2017 1 commit

f2fs: fix a mount fail for wrong next_scan_nid · e9cdd307

Authored by Yunlei He on Apr 26, 2017

-write_checkpoint
   -do_checkpoint
      -next_free_nid    <--- something wrong with next free nid

-f2fs_fill_super
   -build_node_manager
      -build_free_nids
          -get_current_nat_page
             -__get_meta_page   <--- attempt to access beyond end of device
Signed-off-by: Yunlei He <heyunlei@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

26 Apr, 2017 1 commit

f2fs: seperate read nat page from nat_tree_lock · 66a82d1f

Authored by Yunlei He on Apr 22, 2017

This patch separates the NAT page read IO from nat_tree_lock.

-lock_page
	-get_node_info()
		-current_nat_addr

			......           	->       write_checkpoint

			-get_meta_page

Because we lock the node page, we can be sure that no other thread modifies
this nid concurrently. So we only obtain current_nat_addr under
nat_tree_lock; the node info is always the same in both NAT packs.

Signed-off-by: Yunlei He <heyunlei@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

13 Apr, 2017 1 commit

f2fs: fix not to set fsync/dentry mark · d29fd172

Authored by Jaegeuk Kim on Apr 12, 2017

Otherwise, we can see a stale fsync/dentry mark left by previous calls,
which makes roll-forward recovery give up due to a wrong dentry mark.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

11 Apr, 2017 1 commit

f2fs: guard macro variables with braces · 68afcf2d

Authored by Tomohiro Kusumi on Apr 09, 2017

Add braces around variables used within macros where it makes sense to do
so. Many of the macros in f2fs already do this. This commit avoids any
change that alters line numbers as a result of adding braces, since that
usually affects the binary via __LINE__.

Confirmed no diff in fs/f2fs/f2fs.ko before/after this commit on x86_64,
to make sure this has no functional change and that there has been no
unexpected side effect due to callers' arithmetic within the existing
code.

Signed-off-by: Tomohiro Kusumi <tkusumi@tuxera.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
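Why guarding macro arguments matters can be shown with a small stand-alone example. It assumes that "braces" in the message above refers to parentheses around macro arguments; the macros `DOUBLE_UNGUARDED` and `DOUBLE_GUARDED` are hypothetical and do not appear in f2fs.

```c
#include <assert.h>

/* Unguarded: the argument is pasted in as-is, so operator precedence
 * in the caller's expression can change the result. */
#define DOUBLE_UNGUARDED(x) (x * 2)

/* Guarded: the argument is wrapped, so it is evaluated as one unit. */
#define DOUBLE_GUARDED(x)   ((x) * 2)

static int unguarded_demo(void)
{
    return DOUBLE_UNGUARDED(1 + 2);   /* expands to (1 + 2 * 2) */
}

static int guarded_demo(void)
{
    return DOUBLE_GUARDED(1 + 2);     /* expands to ((1 + 2) * 2) */
}

/* Sanity check of the two expansions described above. */
static void check_expansions(void)
{
    assert(unguarded_demo() == 5);
    assert(guarded_demo() == 6);
}
```

For callers that only pass plain identifiers or constants, both forms expand to equivalent code, which fits the report of an identical f2fs.ko before and after the change.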

25 Mar, 2017 2 commits

f2fs: allow write page cache when writting cp · 59c9081b

Authored by Yunlei He on Mar 13, 2017

This patch allows writing data to normal files while a new checkpoint is
being written.

We relax three limitations for the write_begin path:
1. data allocation
2. node allocation
3. variables in checkpoint

Signed-off-by: Yunlei He <heyunlei@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: fix race condition in between free nid allocator/initializer · 30a61ddf

Authored by Chao Yu on Mar 22, 2017

In the concurrent case below, an allocated nid can be loaded into the free
nid cache and be allocated again.

Thread A				Thread B
- f2fs_create
 - f2fs_new_inode
  - alloc_nid
   - __insert_nid_to_list(ALLOC_NID_LIST)
					- f2fs_balance_fs_bg
					 - build_free_nids
					  - __build_free_nids
					   - scan_nat_page
					    - add_free_nid
					     - __lookup_nat_cache
 - f2fs_add_link
  - init_inode_metadata
   - new_inode_page
    - new_node_page
     - set_node_addr
 - alloc_nid_done
  - __remove_nid_from_list(ALLOC_NID_LIST)
					     - __insert_nid_to_list(FREE_NID_LIST)

This patch makes the nat cache lookup and the free nid list operation
atomic to avoid this race condition.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
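A minimal user-space sketch of the fix's idea follows: the "is this nid already in use?" check and the free-list insertion happen under one lock, so the allocator cannot slip in between them. The arrays, the mutex, and `mark_nid_in_use` are assumptions made for illustration; only the name `add_free_nid` is borrowed from the trace above, and the real kernel data structures differ.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define MAX_NID 64   /* illustrative size */

/* One lock covers both tables, so "check, then insert" is one atomic step. */
static pthread_mutex_t nid_lock = PTHREAD_MUTEX_INITIALIZER;
static bool nid_in_use[MAX_NID];        /* stand-in for the nat cache lookup */
static bool nid_on_free_list[MAX_NID];  /* stand-in for the free nid list */

/* Scanner side: add nid to the free list only if it is not in use. */
static bool add_free_nid(unsigned int nid)
{
    bool added = false;

    assert(nid < MAX_NID);
    pthread_mutex_lock(&nid_lock);
    if (!nid_in_use[nid] && !nid_on_free_list[nid]) {
        nid_on_free_list[nid] = true;
        added = true;
    }
    pthread_mutex_unlock(&nid_lock);
    return added;
}

/* Allocator side: mark nid as in use under the same lock. */
static void mark_nid_in_use(unsigned int nid)
{
    assert(nid < MAX_NID);
    pthread_mutex_lock(&nid_lock);
    nid_in_use[nid] = true;
    nid_on_free_list[nid] = false;
    pthread_mutex_unlock(&nid_lock);
}
```

If the check and the insertion took the lock separately, a nid could be marked in use between the two steps and still end up on the free list, which is the double allocation described above.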

22 Mar, 2017 3 commits

f2fs: more reasonable mem_size calculating of ino_entry · 8f73cbb7

Authored by Kinglong Mee on Mar 18, 2017

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: cover update_free_nid_bitmap with nid_list_lock · 346fe752

Authored by Chao Yu on Mar 13, 2017

free_nid_bitmap and free_nid_count in update_free_nid_bitmap should be
updated atomically; use nid_list_lock to cover them to avoid a race in
concurrent scenarios.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Reviewed-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: drop duplicate radix tree lookup of nat_entry_set · 0b28b71e

Authored by Kinglong Mee on Feb 28, 2017

The nat entry set is already taken from the set list for freeing, so
doing a radix tree lookup again is redundant.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
[Jaegeuk Kim: remove unnecessary f2fs_bug_on]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

20 Mar, 2017 4 commits

f2fs: combine nat_bits and free_nid_bitmap cache · 7041d5d2

Authored by Chao Yu on Mar 08, 2017

Both the nat_bits cache and the free_nid_bitmap cache provide the same
functionality as an intermediate cache between the free nid cache and disk,
but with different granularity for indicating the free nid range and
different persistence policies. The nat_bits cache provides better
persistence, and free_nid_bitmap provides better granularity.

In this patch we combine the advantages of both caches, so the final policy
of the intermediate cache is:
- init: load free nid status from nat_bits into free_nid_bitmap
- lookup: scan free_nid_bitmap before loading NAT blocks
- update: update free_nid_bitmap in real time
- persistence: update and persist nat_bits in checkpoint

This patch also resolves the performance regression reported by lkp-robot.

commit:
  4ac91242 ("f2fs: introduce free nid bitmap")
  d00030cf9cd0bb96fdccc41e33d3c91dcbb672ba ("f2fs: use __set{__clear}_bit_le")
  1382c0f3f9d3f936c8bc42ed1591cf7a593ef9f7 ("f2fs: combine nat_bits and free_nid_bitmap cache")

4ac91242 d00030cf9cd0bb96fdccc41e33 1382c0f3f9d3f936c8bc42ed15
---------------- -------------------------- --------------------------
         %stddev     %change         %stddev     %change         %stddev
             \          |                \          |                \
     77863 ±  0%      +2.1%      79485 ±  1%     +50.8%     117404 ±  0%  aim7.jobs-per-min
    231.63 ±  0%      -2.0%     227.01 ±  1%     -33.6%     153.80 ±  0%  aim7.time.elapsed_time
    231.63 ±  0%      -2.0%     227.01 ±  1%     -33.6%     153.80 ±  0%  aim7.time.elapsed_time.max
    896604 ±  0%      -0.8%     889221 ±  3%     -20.2%     715260 ±  1%  aim7.time.involuntary_context_switches
      2394 ±  1%      +4.6%       2503 ±  1%      +3.7%       2481 ±  2%  aim7.time.maximum_resident_set_size
      6240 ±  0%      -1.5%       6145 ±  1%     -14.1%       5360 ±  1%  aim7.time.system_time
   1111357 ±  3%      +1.9%    1132509 ±  2%      -6.2%    1041932 ±  2%  aim7.time.voluntary_context_switches
...

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Tested-by: Xiaolong Ye <xiaolong.ye@intel.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: skip scanning free nid bitmap of full NAT blocks · 586d1492

Authored by Chao Yu on Mar 01, 2017

This patch accounts free nids for each NAT block and, while scanning the
whole free nid bitmap, checks the count and skips the lookup in full NAT
blocks.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
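The per-block accounting can be sketched as follows. This is a simplified model rather than the f2fs layout: `struct free_nid_cache`, `find_free_nid`, and the sizes `NIDS_PER_BLOCK` and `NR_NAT_BLOCKS` are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define NIDS_PER_BLOCK 8   /* illustrative; one byte of bitmap per block */
#define NR_NAT_BLOCKS  4   /* illustrative */

struct free_nid_cache {
    unsigned char bitmap[NR_NAT_BLOCKS];    /* bit set => nid is free */
    unsigned char free_cnt[NR_NAT_BLOCKS];  /* free nids per block */
};

/* Return the first free nid, or -1 if none is cached. */
static int find_free_nid(const struct free_nid_cache *cache)
{
    assert(cache != NULL);
    for (int blk = 0; blk < NR_NAT_BLOCKS; blk++) {
        if (cache->free_cnt[blk] == 0)
            continue;   /* full block: skip the bit scan entirely */
        for (int bit = 0; bit < NIDS_PER_BLOCK; bit++) {
            if (cache->bitmap[blk] & (1u << bit))
                return blk * NIDS_PER_BLOCK + bit;
        }
    }
    return -1;
}
```

The counter costs a little extra memory per block but lets the scan reject a full block with one comparison instead of walking every bit.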

f2fs: use __set{__clear}_bit_le · 23380b85

Authored by Jaegeuk Kim on Mar 07, 2017

This patch uses __set{__clear}_bit_le for higher speed.

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: declare static functions · 9f7e4a2c

Authored by Jaegeuk Kim on Mar 10, 2017

This avoids a build warning reported by the kbuild test robot.

Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

01 Mar, 2017 1 commit

f2fs: avoid to flush nat journal entries · 900f7362

Authored by Jaegeuk Kim on Feb 27, 2017

This patch adds a missing condition; without it, nat journal entries are
flushed unnecessarily. The problem was introduced by:

    f2fs: add bitmaps for empty or full NAT blocks

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

28 Feb, 2017 5 commits

f2fs: use MAX_FREE_NIDS for the free nids target · f0cdbfe6

Authored by Kinglong Mee on Feb 26, 2017

F2FS already defines MAX_FREE_NIDS as the target maximum of cached free
nids.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: introduce free nid bitmap · 4ac91242

Authored by Chao Yu on Feb 23, 2017

In scenarios of intensive node allocation, free nids run out quickly, and
allocation then has to stop and load free nids by traversing NAT blocks.
In the worst case, if the NAT blocks are not cached in memory, this
generates IOs which slow down foreground operations.

In order to speed up node allocation, this patch introduces a new
free_nid_bitmap array, so there is a bitmap table for each NAT block.
Once a NAT block is loaded, the related bitmap cache is switched on, and
the bitmap is set while traversing the nat entries in the NAT block; later
we can query and update nid usage status completely in memory.

With this implementation, I expect node allocation performance to improve
in the long term after the filesystem image is mounted.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
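As a rough illustration of querying and updating nid status purely in memory, the sketch below keeps one bit per nid. The names `update_free_nid` and `nid_is_free`, the flat single array, and `NR_NIDS` are assumptions for this example; the patch itself keeps a bitmap per NAT block.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_NIDS 64   /* illustrative number of nids tracked */

/* One bit per nid: set means free, clear means in use or unknown. */
static unsigned char free_nid_bitmap[NR_NIDS / 8];

/* Record whether nid is free (true) or in use (false). */
static void update_free_nid(unsigned int nid, bool is_free)
{
    assert(nid < NR_NIDS);
    if (is_free)
        free_nid_bitmap[nid / 8] |= (unsigned char)(1u << (nid % 8));
    else
        free_nid_bitmap[nid / 8] &= (unsigned char)~(1u << (nid % 8));
}

/* Query nid status from memory only; no block read is needed. */
static bool nid_is_free(unsigned int nid)
{
    assert(nid < NR_NIDS);
    return (free_nid_bitmap[nid / 8] & (1u << (nid % 8))) != 0;
}
```

Once the bits have been populated from a loaded block, later lookups touch only this array, which is the IO saving the message describes.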

f2fs: new helper cur_cp_crc() getting crc in f2fs_checkpoint · ced2c7ea

Authored by Kinglong Mee on Feb 25, 2017

There are four places that get the crc value in f2fs_checkpoint; add a
new helper, cur_cp_crc, for them.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
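A helper of this kind might look like the user-space sketch below. It assumes a simplified, hypothetical block layout in which the first four bytes hold the little-endian byte offset of the CRC; `load_le32` and `cp_crc` are illustrative names, and the real struct f2fs_checkpoint has many more fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode a 32-bit little-endian value regardless of host byte order. */
static uint32_t load_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Hypothetical simplified layout: the first four bytes of the block hold
 * the byte offset at which the CRC is stored.
 */
static uint32_t cp_crc(const unsigned char *cp_block, size_t block_size)
{
    size_t crc_offset = load_le32(cp_block);

    /* Refuse offsets that would read past the end of the block. */
    assert(crc_offset + sizeof(uint32_t) <= block_size);
    return load_le32(cp_block + crc_offset);
}
```

Centralizing the offset arithmetic in one function means each call site no longer repeats the pointer cast and byte-order conversion.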

f2fs: show simple call stack in fault injection message · 55523519

Authored by Chao Yu on Feb 25, 2017

Previously, the kernel message showed in which function the injection was
done, but unfortunately most of the callers are the same. To track more
information about the injection path, the upper caller's name needs to be
shown. This patch adds that ability.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: add bitmaps for empty or full NAT blocks · 22ad0b6a

Authored by Jaegeuk Kim on Feb 09, 2017

This patch adds bitmaps to represent empty or full NAT blocks containing
free nid entries.

If we can find a valid crc|cp_ver in the last block of the checkpoint pack,
we'll use these bitmaps when building free nids. In order to avoid
checkpointing burden, up-to-date bitmaps are flushed only at umount time.
So normally we get this gain, but when a power cut happens, we rely on
fsck.f2fs, which recovers this bitmap again.

After this patch, we build free nids from nid #0 at mount time to make more
full NAT blocks, but at runtime we check empty NAT blocks to load free nids
without loading any NAT pages from disk.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

24 Feb, 2017 6 commits

f2fs: avoid reading NAT page by get_node_info · 25cc5d3b

Authored by Jaegeuk Kim on Feb 13, 2017

We've not seen this buggy case for a long time, so it's time to avoid this
unnecessary get_node_info() call, which reads a NAT page to cache the nat
entry.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: change recovery policy of xattr node block · d260081c

Authored by Chao Yu on Feb 08, 2017

Currently, if we call fsync after updating the xattr data belonging to a
file, f2fs needs to trigger a checkpoint to keep the xattr data consistent.
But this policy causes low performance, as the checkpoint blocks most
foreground operations and causes unneeded and unrelated IOs around the
checkpoint.

This patch reuses the regular file recovery policy for the xattr node
block: we write the xattr node block tagged with the fsync flag to the warm
area instead of the cold area, and during recovery we search the warm node
chain for the fsynced xattr block and do the recovery.

So, for the application IO pattern below, performance can be improved
noticeably:
- touch file
- create/update/delete xattr entry in file
- fsync file

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: check last page index in cached bio to decide submission · 942fd319

Authored by Jaegeuk Kim on Feb 01, 2017

If the cached bio has the last page's index, then we need to submit it.
Otherwise, we don't need to submit it and can wait for further IO merges.

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: check io submission more precisely · d68f735b

Authored by Jaegeuk Kim on Feb 03, 2017

This patch checks IO submission more precisely than the previous rough
check.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: avoid out-of-order execution of atomic writes · e7c75ab0

Authored by Jaegeuk Kim on Feb 02, 2017

We need to flush data writes before flushing the last node block write by
using FUA with PREFLUSH. We don't need to guarantee preceding node writes,
since if those are not written, we can't reach the last node block when
scanning the node block chain during roll-forward recovery.
Afterwards, f2fs_wait_on_page_writeback guarantees that all the IO is
submitted to disk, which builds a valid node block chain.

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: move write_node_page above fsync_node_pages · faa24895

Authored by Jaegeuk Kim on Feb 02, 2017

This patch just moves write_node_page and introduces an inner function.

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

23 Feb, 2017 1 commit

f2fs: check in-memory nat version bitmap · 599a09b2

Authored by Chao Yu on Jan 07, 2017

This patch adds a mirror of the nat version bitmap and uses it to detect
in-memory bitmap corruption, which may be caused by bit transitions in
cache or by memory overflow.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
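The mirroring technique can be sketched in a few lines of user-space C. `struct version_bitmap`, `flip_version_bit`, `bitmap_is_consistent`, and `BITMAP_BYTES` are hypothetical names chosen for this example, not the identifiers used in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define BITMAP_BYTES 8   /* illustrative size */

struct version_bitmap {
    unsigned char bits[BITMAP_BYTES];
    unsigned char mirror[BITMAP_BYTES];  /* updated in lockstep with bits */
};

/* Flip one version bit in both copies. */
static void flip_version_bit(struct version_bitmap *vb, unsigned int idx)
{
    assert(idx < BITMAP_BYTES * 8);
    vb->bits[idx / 8] ^= (unsigned char)(1u << (idx % 8));
    vb->mirror[idx / 8] ^= (unsigned char)(1u << (idx % 8));
}

/* Any difference between the copies signals in-memory corruption. */
static bool bitmap_is_consistent(const struct version_bitmap *vb)
{
    return memcmp(vb->bits, vb->mirror, BITMAP_BYTES) == 0;
}
```

A stray write that touches only one of the two copies makes the comparison fail, at the cost of doubling the memory used by the bitmap.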

29 Jan, 2017 1 commit

f2fs: don't cache nat entry if out of memory · 5c9e4184

Authored by Chao Yu on Dec 13, 2016

If we run out of memory, it's better for cache_nat_entry to avoid looping
to allocate memory for caching the nat entry. In low-memory scenarios, I
expect this to avoid unneeded latency in the read path of node blocks.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

26 Nov, 2016 2 commits

f2fs: fix to account total free nid correctly · 04d47e67

Authored by Chao Yu on Nov 17, 2016

Thread A		Thread B		Thread C
- f2fs_create
 - f2fs_new_inode
  - f2fs_lock_op
   - alloc_nid
    alloc last nid
  - f2fs_unlock_op
			- f2fs_create
			 - f2fs_new_inode
			  - f2fs_lock_op
			   - alloc_nid
			    as node count still not
			    be increased, we will
			    loop in alloc_nid
						- f2fs_write_node_pages
						 - f2fs_balance_fs_bg
						  - f2fs_sync_fs
						   - write_checkpoint
						    - block_operations
						     - f2fs_lock_all
 - f2fs_lock_op

While creating a new inode, we do not allocate and account the nid
atomically, so when there are almost no free nids left, we may hit a dead
loop like the stack above.

To avoid that, reuse nm_i::available_nids for accounting free nids and
make nid allocation and counting atomic during node creation.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: fix an infinite loop when flush nodes in cp · d40a43af

Authored by Yunlei He on Nov 16, 2016

Thread A			Thread B

- write_checkpoint
 - block_operations
   -blk_start_plug
    -sync_node_pages		- f2fs_do_sync_file
				 - fsync_node_pages
				  - f2fs_wait_on_page_writeback

Thread A waits for the global F2FS_DIRTY_NODES count to drop to zero. It
has started a plug list, and some requests have been added to this list.
Thread B locks one dirty node page and waits for this page to be written
back. But this page is already in thread A's plug list with the
PG_writeback flag set. Thread A keeps running and its plug list has no
chance to finish, so it looks like a deadlock between the cp and fsync
paths.

This patch adds a wait on page writeback before setting the node page
dirty to avoid this problem.

Signed-off-by: Yunlei He <heyunlei@huawei.com>
Signed-off-by: Pengyang Hou <houpengyang@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

24 Nov, 2016 7 commits

f2fs: use BIO_MAX_PAGES for bio allocation · 664ba972

Authored by Jaegeuk Kim on Oct 18, 2016

We don't need to allocate bio partially in order to maximize sequential writes.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: declare static function for __build_free_nids · 3e7b5bbb

Authored by Jaegeuk Kim on Oct 17, 2016

This patch avoids a build warning.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: don't interrupt free nids building during nid allocation · 3a2ad567

Authored by Chao Yu on Oct 11, 2016

Let build_free_nids support sync/async methods. In the nid allocation flow
we use the synchronous method, so that we can avoid looping in alloc_nid
when free memory is low; in unblock_operations and f2fs_balance_fs_bg we
use the asynchronous method, where a low-memory condition can interrupt us.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: clean up free nid list operations · eb0aa4b8

Authored by Jaegeuk Kim on Oct 12, 2016

This patch cleans up the code to use consistent free nid list ops.

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: split free nid list · b8559dc2

Authored by Chao Yu on Oct 12, 2016

During free nid allocation, in order to do preallocation, we tag a free
nid entry as allocated and still leave it in the free nid list, so other
allocators that want to grab free nids need to traverse the free nid list
for lookup. This becomes an overhead when multiple threads allocate free
nids intensively.

This patch splits the free nid list into two lists, {free,alloc}_nid_list,
to keep free nids and preallocated free nids separately. After that, the
traversal latency is gone. It also splits nid_cnt for separate statistics.

Additionally, introduce __insert_nid_to_list and __remove_nid_from_list for
cleanup.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
[Jaegeuk Kim: modify f2fs_bug_on to avoid needless branches]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: fix sparse warnings · 0c0b471e

Authored by Eric Biggers on Oct 11, 2016

f2fs contained a number of endianness conversion bugs.

Also, one function should have been 'static'.

Found with sparse by running 'make C=2 CF=-D__CHECK_ENDIAN__ fs/f2fs/'

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: fix error handling in fsync_node_pages · 9de69279

Authored by Chao Yu on Oct 11, 2016

In fsync_node_pages, if f2fs was tagged with CP_ERROR_FLAG, make sure the
bio cache is flushed before returning.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

openanolis / cloud-kernel synced successfully more than 1 year ago
