- 27 Jul, 2017 (2 commits)
-
-
By Yunlei He

    recovery file A:                    recovery file B:
    - get_dnode_of_data
      - alloc_nid
                                        - recover_xattr_data
                                          - set_node_addr(sbi, &ni, NEW_ADDR, false);
                                            ---> bug_on, since the nid has been used by file A

During the recovery process, newly allocated node blocks may "reuse" xattr block nids. This patch allocates new nids for xattr blocks in the recovery process to avoid this problem.

Signed-off-by: Yunlei He <heyunlei@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
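A condensed sketch of the fix in recover_xattr_data() follows; helper names follow f2fs conventions of that era, but treat this as an illustration rather than the exact patch:

```c
/* Rather than re-registering the xattr nid recorded on disk -- which a
 * concurrently recovered file may already have claimed via alloc_nid()
 * -- reserve a fresh nid for the recovered xattr block. */
nid_t new_xnid = nid_of_node(page);
struct node_info ni;

if (!alloc_nid(sbi, &new_xnid))			/* reserve an unused nid */
	return -ENOSPC;

get_node_info(sbi, new_xnid, &ni);
ni.ino = inode->i_ino;
set_node_addr(sbi, &ni, NEW_ADDR, false);	/* can no longer collide */
f2fs_i_xnid_write(inode, new_xnid);		/* point the inode at it */
```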
-
By Yunlei He

This patch removes an unused input parameter from new_node_page().

Signed-off-by: Yunlei He <heyunlei@huawei.com>
Signed-off-by: Yong Sheng <shengyong1@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
- 09 Jul, 2017 (1 commit)
-
-
By Chao Yu

This patch adds support for plain user/group quota.

Change notes by Jaegeuk Kim:
- Use the f2fs page cache for quota files in order to consider garbage collection; as a result quota files cannot tolerate sudden power cuts, so the user needs to run quotacheck.
- setattr() calls dquot_transfer, which transfers inode->i_blocks. We can't reclaim that during f2fs_evict_inode(), so we need to count node blocks as well in order to match i_blocks with dquot's space. Note that Chao wrote a patch to count inode->i_blocks without the inode block ("f2fs: don't count inode block in in-memory inode.i_blocks").
- In f2fs_remount, we need to switch to RW prior to dquot_resume.
- Handle the fault_injection case during f2fs_quota_off_umount.
- TODO: project quota.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
- 08 Jul, 2017 (2 commits)
-
-
By Chao Yu

Previously, we counted all blocks an inode consumes, including the inode block, xattr blocks, index blocks, and data blocks, in i_blocks. Other generic filesystems do not count the inode block in i_blocks, so userspace applications or the quota system may derive an incorrect block count from the i_blocks value of an f2fs inode. This patch changes the in-memory inode.i_blocks to count all blocks excluding the inode block; the on-disk i_blocks keeps counting the inode block for backward compatibility.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
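A hedged illustration of the new convention (the sector-conversion helpers are sketched under the 04 Jul, 2017 entry below; the -1/+1 on the inode block is the point here):

```c
/* reading the on-disk inode: strip the inode block from the count */
inode->i_blocks = SECTOR_FROM_BLOCK(le64_to_cpu(ri->i_blocks) - 1);

/* writing it back: re-add the inode block before persisting */
ri->i_blocks = cpu_to_le64(SECTOR_TO_BLOCK(inode->i_blocks) + 1);
```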
-
By Chao Yu

Skip ->writepages prior to ->writepage for {meta,node}_inode during recovery, so the unneeded loop in ->writepages can be avoided. Moreover, check SBI_POR_DOING earlier while writing back pages.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
- 04 Jul, 2017 (3 commits)
-
-
By Chao Yu

Both in memory and on disk, generic filesystems record i_blocks as a count of 512-byte sectors, and VFS submodules such as disk quota follow this rule; f2fs, however, records it as a count of 4096-byte blocks. This difference means that once we use dquot functions that inc/dec i_blocks, the i_blocks of f2fs becomes inconsistent between memory and disk. To resolve this issue, this patch changes the in-memory i_blocks of f2fs to record a sector count instead of a block count, while the on-disk i_blocks keeps recording a block count.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
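The conversion boils down to a three-bit shift, since one 4096-byte block is eight 512-byte sectors. A minimal sketch of the macros (f2fs's actual names are close but may differ slightly):

```c
#include <linux/types.h>

/* 4096-byte block = 8 x 512-byte sectors, so shift by log2(8) = 3 */
#define F2FS_LOG_SECTORS_PER_BLOCK	3

#define SECTOR_FROM_BLOCK(blk)						\
	((sector_t)(blk) << F2FS_LOG_SECTORS_PER_BLOCK)
#define SECTOR_TO_BLOCK(sectors)					\
	((sectors) >> F2FS_LOG_SECTORS_PER_BLOCK)
```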
-
By Chao Yu

With the fault_injection option, generic/361 of fstests complains with the message below:

    Call Trace:
     get_node_page+0x12/0x20 [f2fs]
     f2fs_iget+0x92/0x7d0 [f2fs]
     f2fs_fill_super+0x10fb/0x15e0 [f2fs]
     mount_bdev+0x184/0x1c0
     f2fs_mount+0x15/0x20 [f2fs]
     mount_fs+0x39/0x150
     vfs_kern_mount+0x67/0x110
     do_mount+0x1bb/0xc70
     SyS_mount+0x83/0xd0
     do_syscall_64+0x6e/0x160
     entry_SYSCALL64_slow_path+0x25/0x25

Since mkfs on a loop device in an f2fs partition can fail silently due to checkpoint error injection, the root inode page can be corrupted. To avoid a needless panic, it's better for get_node_page to print a message and return an error to the caller, and let fsck repair the image later.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
By Chao Yu

We never persist newly allocated nat entries during checkpoint(), so we don't need to track such nat entries in the nat dirty list. This avoids:
- extra latency while traversing the dirty list;
- sorting nat sets incorrectly due to recording a wrong entry_cnt in the nat entry set.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
- 24 May, 2017 (2 commits)
-
-
By Hou Pengyang

Signed-off-by: Hou Pengyang <houpengyang@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
By Jaegeuk Kim

The merged IO flow doesn't need to care about read IOs, so rename:

    f2fs_submit_merged_bio      -> f2fs_submit_merged_write
    f2fs_submit_merged_bios     -> f2fs_submit_merged_writes
    f2fs_submit_merged_bio_cond -> f2fs_submit_merged_write_cond

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
- 09 May, 2017 (1 commit)
-
-
By Michal Hocko

Patch series "kvmalloc", v5.

There are many open-coded kmalloc-with-vmalloc-fallback instances in the tree. Most of them are not careful enough, or simply do not care about the underlying semantics of kmalloc/the page allocator, which means that:

a) some vmalloc fallbacks are basically unreachable because the kmalloc part will keep retrying until it succeeds;
b) the page allocator can invoke really disruptive steps like the OOM killer to move forward, which doesn't sound appropriate when we consider that the vmalloc fallback is available.

As can be seen, implementing kvmalloc requires quite intimate knowledge of the page allocator and the memory reclaim internals, which strongly suggests that a helper should be implemented in the memory subsystem proper. Most callers I could find have been converted to use the helper instead. This is patch 6. There are some more relying on __GFP_REPEAT in the networking stack which I have converted as well, and Eric Dumazet was not opposed [2] to converting them either.

[1] http://lkml.kernel.org/r/20170130094940.13546-1-mhocko@kernel.org
[2] http://lkml.kernel.org/r/1485273626.16328.301.camel@edumazet-glaptop3.roam.corp.google.com

This patch (of 9):

Using kmalloc with a vmalloc fallback for larger allocations is a common pattern in kernel code. Yet we do not have any common helper for it, so users have invented their own, some of them quite creatively. Let's just add kv[mz]alloc and make sure it is implemented properly. This implementation makes sure not to apply heavy memory pressure for > PAGE_SIZE requests (__GFP_NORETRY) and not to warn about allocation failures. This also rules out the OOM killer, as the vmalloc fallback is a more appropriate response than a disruptive, user-visible action.

This patch also converts some existing users and removes helpers that were specific to them. In some cases this is not possible (e.g. ext4_kvmalloc, libcfs_kvzalloc) because those seem to be broken and require a GFP_NO{FS,IO} context, which is not vmalloc-compatible in general (note that page table allocation is GFP_KERNEL). Those need to be fixed separately.

While we are at it, document in __vmalloc{_node} which gfp masks are unsupported, because there seems to be a lot of confusion out there. kvmalloc_node will warn about GFP_KERNEL-incompatible flags (those that are not a superset) to catch new abusers; existing ones will have to die slowly.

[sfr@canb.auug.org.au: f2fs fixup]
Link: http://lkml.kernel.org/r/20170320163735.332e64b7@canb.auug.org.au
Link: http://lkml.kernel.org/r/20170306103032.2540-2-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reviewed-by: Andreas Dilger <adilger@dilger.ca> [ext4 part]
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
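A condensed sketch of the kvmalloc_node() semantics described above; the upstream version in mm/util.c differs in details (e.g. honoring __GFP_REPEAT), so read this as an illustration:

```c
#include <linux/mm.h>
#include <linux/slab.h>
#include <linux/vmalloc.h>

void *kvmalloc_node(size_t size, gfp_t flags, int node)
{
	gfp_t kmalloc_flags = flags;
	void *ret;

	/* vmalloc only works with GFP_KERNEL-compatible contexts */
	WARN_ON_ONCE((flags & GFP_KERNEL) != GFP_KERNEL);

	if (size > PAGE_SIZE) {
		/* don't warn, don't retry hard, don't fire the OOM killer:
		 * the vmalloc fallback is a better answer than any of those */
		kmalloc_flags |= __GFP_NOWARN | __GFP_NORETRY;
	}

	ret = kmalloc_node(size, kmalloc_flags, node);

	/* small requests never fall back; they would fail in vmalloc too */
	if (ret || size <= PAGE_SIZE)
		return ret;

	return __vmalloc_node_flags(size, node, flags);
}
```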
-
- 04 May, 2017 (1 commit)
-
-
By Yunlei He

    - write_checkpoint
      - do_checkpoint
        - next_free_nid            <--- something wrong with the next free nid

    - f2fs_fill_super
      - build_node_manager
        - build_free_nids
          - get_current_nat_page
            - __get_meta_page      <--- attempt to access beyond end of device

Signed-off-by: Yunlei He <heyunlei@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
- 26 Apr, 2017 (1 commit)
-
-
By Yunlei He

This patch separates the nat page read IO from nat_tree_lock.

    - lock_page
    - get_node_info()
      - current_nat_addr
        ......                     -> write_checkpoint
      - get_meta_page

Because we lock the node page, we can be sure no other thread modifies this nid concurrently. So we only obtain current_nat_addr under nat_tree_lock; the node info is the same in both NAT packs.

Signed-off-by: Yunlei He <heyunlei@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
- 13 Apr, 2017 (1 commit)
-
-
By Jaegeuk Kim

Otherwise, we can see a stale fsync/dentry mark left by previous calls, which makes us give up roll-forward recovery due to a wrong dentry mark.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
- 11 Apr, 2017 (1 commit)
-
-
By Tomohiro Kusumi

Add braces around variables used within macros where it makes sense to do so. Many of the macros in f2fs already do this. What this commit doesn't do is anything that changes line numbers as a result of adding braces, which usually affects the binary via __LINE__. Confirmed no diff in fs/f2fs/f2fs.ko before/after this commit on x86_64, to make sure this has no functional change and no unexpected side effect due to callers' arithmetic within the existing code.

Signed-off-by: Tomohiro Kusumi <tkusumi@tuxera.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
- 25 Mar, 2017 (2 commits)
-
-
By Yunlei He

This patch allows writing data to normal files while a new checkpoint is being written. We relax three limitations for the write_begin path:
1. data allocation
2. node allocation
3. variables in checkpoint

Signed-off-by: Yunlei He <heyunlei@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
By Chao Yu

In the concurrent case below, an allocated nid can be loaded into the free nid cache and allocated again.

    Thread A                                  Thread B
    - f2fs_create
      - f2fs_new_inode
        - alloc_nid
          - __insert_nid_to_list(ALLOC_NID_LIST)
                                              - f2fs_balance_fs_bg
                                                - build_free_nids
                                                  - __build_free_nids
                                                    - scan_nat_page
                                                      - add_free_nid
                                                        - __lookup_nat_cache
      - f2fs_add_link
        - init_inode_metadata
          - new_inode_page
            - new_node_page
              - set_node_addr
        - alloc_nid_done
          - __remove_nid_from_list(ALLOC_NID_LIST)
                                                        - __insert_nid_to_list(FREE_NID_LIST)

This patch makes the nat cache lookup and the free nid list operation atomic to avoid this race condition.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
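A condensed sketch of the idea, with locking simplified and helper signatures assumed (the real patch touches several call sites):

```c
static bool add_free_nid(struct f2fs_sb_info *sbi, nid_t nid, bool build)
{
	struct f2fs_nm_info *nm_i = NM_I(sbi);
	struct nat_entry *ne;
	bool ret = false;

	spin_lock(&nm_i->nid_list_lock);
	/* the nat cache check and the list insertion are now one atomic
	 * step, so a nid on ALLOC_NID_LIST can't be re-added as free */
	ne = __lookup_nat_cache(nm_i, nid);
	if (ne && (get_nat_flag(ne, IS_CHECKPOINTED) ||
		   nat_get_blkaddr(ne) != NULL_ADDR))
		goto out;	/* already allocated: not free */
	ret = __insert_nid_to_list(sbi, nid, FREE_NID_LIST);
out:
	spin_unlock(&nm_i->nid_list_lock);
	return ret;
}
```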
-
- 22 Mar, 2017 (3 commits)
-
-
By Kinglong Mee

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
By Chao Yu

free_nid_bitmap and free_nid_count in update_free_nid_bitmap should be updated atomically; take nid_list_lock around them to avoid a race in the concurrent scenario.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Reviewed-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
By Kinglong Mee

The nat entry is taken from the set list for freeing, so doing a radix tree lookup for it again is redundant.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
[Jaegeuk Kim: remove unnecessary f2fs_bug_on]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
- 20 Mar, 2017 (4 commits)
-
-
By Chao Yu

Both the nat_bits cache and the free_nid_bitmap cache provide the same functionality as an intermediate cache between the free nid cache and disk, but with different granularity for indicating free nid ranges and different persistence policies: nat_bits offers better persistence, and free_nid_bitmap offers better granularity. This patch combines the advantages of both caches, so the final policy of the intermediate cache is:
- init: load free nid status from nat_bits into free_nid_bitmap
- lookup: scan free_nid_bitmap before loading NAT blocks
- update: update free_nid_bitmap in real time
- persistence: update and persist nat_bits at checkpoint

This patch also resolves the performance regression reported by lkp-robot:

    commit:
      4ac91242 ("f2fs: introduce free nid bitmap")
      d00030cf9cd0bb96fdccc41e33d3c91dcbb672ba ("f2fs: use __set{__clear}_bit_le")
      1382c0f3f9d3f936c8bc42ed1591cf7a593ef9f7 ("f2fs: combine nat_bits and free_nid_bitmap cache")

        4ac91242           d00030cf9cd0bb96fdccc41e33  1382c0f3f9d3f936c8bc42ed15
    ----------------       --------------------------  --------------------------
       77863 ± 0%           +2.1%   79485 ± 1%          +50.8%  117404 ± 0%  aim7.jobs-per-min
      231.63 ± 0%           -2.0%  227.01 ± 1%          -33.6%  153.80 ± 0%  aim7.time.elapsed_time
      231.63 ± 0%           -2.0%  227.01 ± 1%          -33.6%  153.80 ± 0%  aim7.time.elapsed_time.max
      896604 ± 0%           -0.8%  889221 ± 3%          -20.2%  715260 ± 1%  aim7.time.involuntary_context_switches
        2394 ± 1%           +4.6%    2503 ± 1%           +3.7%    2481 ± 2%  aim7.time.maximum_resident_set_size
        6240 ± 0%           -1.5%    6145 ± 1%          -14.1%    5360 ± 1%  aim7.time.system_time
     1111357 ± 3%           +1.9%  1132509 ± 2%          -6.2%  1041932 ± 2%  aim7.time.voluntary_context_switches
    ...

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Tested-by: Xiaolong Ye <xiaolong.ye@intel.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
By Chao Yu

This patch adds accounting of free nids for each NAT block; while scanning all free nid bitmaps, check the count and skip lookups in full NAT blocks.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
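A hedged fragment of the skip logic; free_nid_count is the assumed name of the per-NAT-block counter:

```c
/* while scanning free nid bitmaps: a NAT block with no free nid
 * recorded offers nothing, so don't search its bitmap at all */
for (nat_ofs = 0; nat_ofs < nm_i->nat_blocks; nat_ofs++) {
	if (!nm_i->free_nid_count[nat_ofs])
		continue;	/* full NAT block: skip the lookup */
	/* ... scan nm_i->free_nid_bitmap[nat_ofs] for set bits ... */
}
```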
-
By Jaegeuk Kim

This patch uses __set{__clear}_bit_le for higher speed.

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
By Jaegeuk Kim

This avoids a build warning reported by the kbuild test robot.

Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
- 01 Mar, 2017 (1 commit)
-
-
By Jaegeuk Kim

This patch adds a missing condition that otherwise flushes nat journal entries unnecessarily, introduced by:
    f2fs: add bitmaps for empty or full NAT blocks

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
- 28 Feb, 2017 (5 commits)
-
-
By Kinglong Mee

F2FS already defines MAX_FREE_NIDS as the target maximum for cached free nids; use it.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
By Chao Yu

In scenarios of intensive node allocation, free nids run out quickly, and we must stop to load more by traversing NAT blocks; in the worst case, if the NAT blocks are not cached in memory, this generates IO that slows down foreground operations. In order to speed up node allocation, this patch introduces a new free_nid_bitmap array, giving a bitmap for each NAT block: once a NAT block is loaded, its bitmap cache is switched on, and the bitmap is populated while traversing the nat entries in the NAT block. After that, we can query and update nid usage status entirely in memory. With this implementation, node allocation performance should improve over the long term after the filesystem image is mounted.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
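A hedged sketch of the per-NAT-block bitmap update (field names such as nat_block_bitmap and free_nid_bitmap follow f2fs conventions but are assumptions here):

```c
static void update_free_nid_bitmap(struct f2fs_sb_info *sbi,
				   nid_t nid, bool set)
{
	struct f2fs_nm_info *nm_i = NM_I(sbi);
	unsigned int nat_ofs = NAT_BLOCK_OFFSET(nid);	/* which NAT block */
	unsigned int nid_ofs = nid - START_NID(nid);	/* offset inside it */

	/* only NAT blocks that have been scanned have a valid bitmap */
	if (!test_bit_le(nat_ofs, nm_i->nat_block_bitmap))
		return;

	if (set)
		__set_bit_le(nid_ofs, nm_i->free_nid_bitmap[nat_ofs]);
	else
		__clear_bit_le(nid_ofs, nm_i->free_nid_bitmap[nat_ofs]);
}
```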
-
By Kinglong Mee

There are four places that read the crc value in f2fs_checkpoint; add a new helper, cur_cp_crc, for them.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
By Chao Yu

Previously the kernel message showed in which function we do the injection, but unfortunately most of the callers are the same. To track more information about the injection path, we need to show the upper caller's name as well. This patch adds that ability.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
By Jaegeuk Kim

This patch adds bitmaps that represent empty or full NAT blocks containing free nid entries. If we can find a valid crc|cp_ver in the last block of a checkpoint pack, we use these bitmaps when building free nids. To avoid burdening checkpoint, the up-to-date bitmaps are flushed only at umount time; so normally we get this gain, but when a power cut happens, we rely on fsck.f2fs to recover the bitmaps. After this patch, we build free nids from nid #0 at mount time to make more fully-used NAT blocks, but at runtime we check empty NAT blocks to load free nids without reading any NAT pages from disk.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
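A hedged sketch of how the bitmaps short-circuit free nid scanning; empty_nat_bits and full_nat_bits follow f2fs naming, but the helper around them is hypothetical:

```c
static void scan_free_nid_bits_block(struct f2fs_sb_info *sbi,
				     unsigned int nat_ofs, nid_t start_nid)
{
	struct f2fs_nm_info *nm_i = NM_I(sbi);
	unsigned int i;

	if (test_bit_le(nat_ofs, nm_i->empty_nat_bits)) {
		/* whole block is free: enumerate nids without any NAT IO */
		for (i = 0; i < NAT_ENTRY_PER_BLOCK; i++)
			add_free_nid(sbi, start_nid + i, true);
	} else if (test_bit_le(nat_ofs, nm_i->full_nat_bits)) {
		/* no free nid here: nothing to scan */
	} else {
		/* mixed block: fall back to reading the NAT page */
		scan_nat_page(sbi, get_current_nat_page(sbi, start_nid),
			      start_nid);
	}
}
```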
-
- 24 Feb, 2017 (6 commits)
-
-
By Jaegeuk Kim

We haven't seen this buggy case for a long time, so it's time to avoid the unnecessary get_node_info() call, which reads a NAT page to cache the nat entry.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
By Chao Yu

Currently, if we call fsync after updating the xattr data belonging to a file, f2fs must trigger a checkpoint to keep the xattr data consistent. But this policy causes poor performance, as a checkpoint blocks most foreground operations and causes unneeded and unrelated IO around the checkpoint. This patch reuses the regular-file recovery policy for xattr node blocks: we now write xattr node blocks tagged with the fsync flag to the warm area instead of the cold area, and during recovery we search the warm node chain for fsynced xattr blocks and recover them. So for the application IO pattern below, performance improves noticeably:
- touch file
- create/update/delete xattr entry in file
- fsync file

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
By Jaegeuk Kim

If the cached bio holds the last page's index, we need to submit it. Otherwise, we don't need to submit it yet and can wait for further IO merges.

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
By Jaegeuk Kim

This patch checks IO submission more precisely than the previous rough check.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
By Jaegeuk Kim

We need to flush data writes before flushing the last node block write, using FUA with PREFLUSH. We don't need to guarantee preceding node writes, because if those are not written we can't reach the last node block when scanning the node block chain during roll-forward recovery. Afterwards, f2fs_wait_on_page_writeback guarantees all the IO has been submitted to disk, which yields a valid node block chain.

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
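The corresponding fragment in the node writeback path looks roughly like this (fio being the f2fs IO descriptor; this mirrors the described semantics rather than quoting the patch verbatim):

```c
/* only the last node block of an fsync carries barrier semantics:
 * PREFLUSH forces all earlier data writes to media first, and FUA
 * makes this block itself durable on completion */
if (atomic && !test_opt(sbi, NOBARRIER))
	fio.op_flags |= REQ_PREFLUSH | REQ_FUA;
```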
-
By Jaegeuk Kim

This patch just moves write_node_page and introduces an inner function.

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
- 23 Feb, 2017 (1 commit)
-
-
By Chao Yu

This patch adds a mirror of the nat version bitmap, and uses it to detect in-memory bitmap corruption, which may be caused by bit flips in cache or by memory overflow.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
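A minimal sketch of the cross-check, assuming a mirror field named nat_bitmap_mir and a hypothetical accessor wrapping f2fs_test_bit:

```c
/* every bitmap update is applied to both copies; reads verify
 * that the two still agree */
static inline int f2fs_test_nat_bit(struct f2fs_nm_info *nm_i,
				    unsigned int ofs)
{
	int bit = f2fs_test_bit(ofs, nm_i->nat_bitmap);
	int mir = f2fs_test_bit(ofs, nm_i->nat_bitmap_mir);

	/* disagreement means a bit flipped in memory: complain loudly */
	WARN_ON(bit != mir);
	return bit;
}
```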
-
- 29 Jan, 2017 (1 commit)
-
-
By Chao Yu

If we run out of memory in cache_nat_entry, it's better not to loop allocating memory to cache the nat entry; in a low-memory scenario, this avoids unneeded latency on the read path for node blocks.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
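A hedged fragment of the idea (slab name follows f2fs conventions; this is an illustration of the failure-tolerant path, not the exact diff):

```c
/* read-path caching of a nat entry: a cache miss is harmless, so a
 * failed allocation simply means "skip the cache this time" */
new = kmem_cache_alloc(nat_entry_slab, GFP_NOFS);
if (!new)
	return;	/* no retry loop; the NAT block can be re-read later */
```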
-
- 26 Nov, 2016 (2 commits)
-
-
By Chao Yu

    Thread A                 Thread B                  Thread C
    - f2fs_create
      - f2fs_new_inode
        - f2fs_lock_op
          - alloc_nid
            (allocates the last nid)
        - f2fs_unlock_op
                             - f2fs_create
                               - f2fs_new_inode
                                 - f2fs_lock_op
                                   - alloc_nid
                                     (the node count has not been
                                      increased yet, so we loop
                                      in alloc_nid)
                                                       - f2fs_write_node_pages
                                                         - f2fs_balance_fs_bg
                                                           - f2fs_sync_fs
                                                             - write_checkpoint
                                                               - block_operations
                                                                 - f2fs_lock_all
                                                                   - f2fs_lock_op

While creating a new inode, we do not allocate and account the nid atomically, so when almost no free nids are left, we may hit an endless loop like the stack above. To avoid that, reuse nm_i::available_nids for accounting free nids, and make nid allocation and counting atomic during node creation.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
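A condensed fragment of the idea inside an alloc_nid()-like path; struct and helper names are assumptions, and error handling is trimmed:

```c
spin_lock(&nm_i->nid_list_lock);

if (unlikely(nm_i->available_nids == 0)) {
	spin_unlock(&nm_i->nid_list_lock);
	return false;	/* fail fast: no free nid, don't loop waiting */
}

/* take a nid off the free list and charge the budget in the same
 * critical section, so allocation and accounting can't diverge */
i = list_first_entry(&nm_i->nid_list[FREE_NID_LIST],
		     struct free_nid, list);
*nid = i->nid;
__remove_nid_from_list(sbi, i, FREE_NID_LIST, false);
nm_i->available_nids--;

spin_unlock(&nm_i->nid_list_lock);
return true;
```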
-
By Yunlei He

    Thread A                           Thread B
    - write_checkpoint
      - block_operations
        - blk_start_plug
          - sync_node_pages
                                       - f2fs_do_sync_file
                                         - fsync_node_pages
                                           - f2fs_wait_on_page_writeback

Thread A waits for the global F2FS_DIRTY_NODES count to drop to zero; it has started a plug list, and some requests have been added to that list. Thread B locks one dirty node page and waits for its writeback, but the page is sitting in thread A's plug list with the PG_writeback flag set. Thread A keeps running and its plug list never gets a chance to finish, so this looks like a deadlock between the checkpoint and fsync paths. This patch adds a wait on page writeback before setting a node page dirty, to avoid this problem.

Signed-off-by: Yunlei He <heyunlei@huawei.com>
Signed-off-by: Pengyang Hou <houpengyang@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
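A minimal fragment of the fix, placed in the path that (re)dirties a node page; the surrounding function is assumed:

```c
/* drain any writeback already in flight before redirtying, so fsync
 * never blocks on a request parked in another thread's plug list */
f2fs_wait_on_page_writeback(page, NODE, true);
set_page_dirty(page);
```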
-