Commits · 46008c6d42328710f9beaf5c2b47dc92b1cc1a75 · openeuler / Kernel

12 May, 2016 3 commits

f2fs: support in batch multi blocks preallocation · 46008c6d

Authored by Chao Yu on May 09, 2016

This patch introduces reserve_new_blocks to preallocate multiple blocks in
a single batch operation, which avoids many redundant operations and
results in better performance.

In a virtual machine, with a rotational device:

time fallocate -l 32G /mnt/f2fs/file

Before:
real	0m4.584s
user	0m0.000s
sys	0m4.580s

After:
real	0m0.292s
user	0m0.000s
sys	0m0.272s

On x86, with an SSD:

time fallocate -l 500G $MNT/testfile

Before : 24.758 s
After  :  1.604 s
Signed-off-by: Chao Yu <yuchao0@huawei.com>
[Jaegeuk Kim: fix bugs and add performance numbers measured in x86.]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

46008c6d

f2fs: make atomic/volatile operation exclusive · 0fac558b

Authored by Chao Yu on May 09, 2016

The atomic/volatile ioctl interfaces are exposed to users like other file
operation interfaces, so they need to be mutually exclusive to avoid
potential conflicts among these operations in concurrent scenarios.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

0fac558b

f2fs: use mnt_{want,drop}_write_file in ioctl · 7fb17fe4

Authored by Chao Yu on May 09, 2016

In the ioctl interfaces, mnt_{want,drop}_write_file should be used to:
- get exclusion against file system freezing, which may be used by an LVM
  snapshot.
- tell the filesystem that a write is about to be performed on it, and
  make sure that the writes are permitted.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

7fb17fe4

08 May, 2016 24 commits

f2fs: do not preallocate block unaligned to 4KB · 0080c507

Authored by Jaegeuk Kim on May 07, 2016

Previously, f2fs_preallocate_blocks() tried to allocate unaligned blocks.
In f2fs_write_begin(), however, prepare_write_begin() does not skip its
allocation due to (len != 4KB).
So, the node page was unexpectedly locked twice.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

0080c507

f2fs: read node blocks ahead when truncating blocks · 79344efb

Authored by Jaegeuk Kim on May 06, 2016

This patch enables reading node blocks in advance when truncating large
data blocks.

 > time rm $MNT/testfile (500GB) after drop_caches
Before : 9.422 s
After  : 4.821 s
Reported-by: Stephen Bates <stephen.bates@microsemi.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

79344efb

f2fs: fallocate data blocks in single locked node page · e12dd7bd

Authored by Jaegeuk Kim on May 06, 2016

This patch improves the expand_inode speed in fallocate by allocating as
many data blocks as possible in a single locked node page.

On SSD,
 # time fallocate -l 500G $MNT/testfile

Before : 1m 33.410 s
After  : 24.758 s
Reported-by: Stephen Bates <stephen.bates@microsemi.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

e12dd7bd

f2fs: fix inode cache leak · f61cce5b

Authored by Chao Yu on May 07, 2016

When testing f2fs with the inline_dentry option, generic/342 reports:
VFS: Busy inodes after unmount of dm-0. Self-destruct in 5 seconds.  Have a nice day...

After removing the f2fs module with rmmod, the kernel shows the following dmesg:
 =============================================================================
 BUG f2fs_inode_cache (Tainted: G           O   ): Objects remaining in f2fs_inode_cache on __kmem_cache_shutdown()
 -----------------------------------------------------------------------------

 Disabling lock debugging due to kernel taint
 INFO: Slab 0xf51ca0e0 objects=22 used=1 fp=0xd1e6fc60 flags=0x40004080
 CPU: 3 PID: 7455 Comm: rmmod Tainted: G    B      O    4.6.0-rc4+ #16
 Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
  00000086 00000086 d062fe18 c13a83a0 f51ca0e0 d062fe38 d062fea4 c11c7276
  c1981040 f51ca0e0 00000016 00000001 d1e6fc60 40004080 656a624f 20737463
  616d6572 6e696e69 6e692067 66326620 6e695f73 5f65646f 68636163 6e6f2065
 Call Trace:
  [<c13a83a0>] dump_stack+0x5f/0x8f
  [<c11c7276>] slab_err+0x76/0x80
  [<c11cbfc0>] ? __kmem_cache_shutdown+0x100/0x2f0
  [<c11cbfc0>] ? __kmem_cache_shutdown+0x100/0x2f0
  [<c11cbfe5>] __kmem_cache_shutdown+0x125/0x2f0
  [<c1198a38>] kmem_cache_destroy+0x158/0x1f0
  [<c176b43d>] ? mutex_unlock+0xd/0x10
  [<f8f15aa3>] exit_f2fs_fs+0x4b/0x5a8 [f2fs]
  [<c10f596c>] SyS_delete_module+0x16c/0x1d0
  [<c1001b10>] ? do_fast_syscall_32+0x30/0x1c0
  [<c13c59bf>] ? __this_cpu_preempt_check+0xf/0x20
  [<c10afa7d>] ? trace_hardirqs_on_caller+0xdd/0x210
  [<c10ad50b>] ? trace_hardirqs_off+0xb/0x10
  [<c1001b81>] do_fast_syscall_32+0xa1/0x1c0
  [<c176d888>] sysenter_past_esp+0x45/0x74
 INFO: Object 0xd1e6d9e0 @offset=6624
 kmem_cache_destroy f2fs_inode_cache: Slab cache still has objects
 CPU: 3 PID: 7455 Comm: rmmod Tainted: G    B      O    4.6.0-rc4+ #16
 Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
  00000286 00000286 d062fef4 c13a83a0 f174b000 d062ff14 d062ff28 c1198ac7
  c197fe18 f3c5b980 d062ff20 000d04f2 d062ff0c d062ff0c d062ff14 d062ff14
  f8f20dc0 fffffff5 d062e000 d062ff30 f8f15aa3 d062ff7c c10f596c 73663266
 Call Trace:
  [<c13a83a0>] dump_stack+0x5f/0x8f
  [<c1198ac7>] kmem_cache_destroy+0x1e7/0x1f0
  [<f8f15aa3>] exit_f2fs_fs+0x4b/0x5a8 [f2fs]
  [<c10f596c>] SyS_delete_module+0x16c/0x1d0
  [<c1001b10>] ? do_fast_syscall_32+0x30/0x1c0
  [<c13c59bf>] ? __this_cpu_preempt_check+0xf/0x20
  [<c10afa7d>] ? trace_hardirqs_on_caller+0xdd/0x210
  [<c10ad50b>] ? trace_hardirqs_off+0xb/0x10
  [<c1001b81>] do_fast_syscall_32+0xa1/0x1c0
  [<c176d888>] sysenter_past_esp+0x45/0x74

The reason is that, in the recovery flow, we use a delayed iput mechanism
for a directory which has a recovered dentry block. This means the inode
reference is held until the last dirty dentry page has been written back.

But when we mount f2fs with the inline_dentry option, during recovery a
dirent may only be recovered into the dir inode page rather than a dentry
page, so there is no chance for us to release the inode reference in
->writepage when writing back the last dentry page.

We could call paired iget/iput explicitly for the inline_dentry case, but
for the non-inline_dentry case, iput will call writeback_single_inode to
write all data pages synchronously; during recovery, however, ->writepages
of f2fs skips writing all pages, resulting in lost dirents.

This patch fixes this issue by retiring the old mechanism and introducing a
new dir_list to hold all directory inodes which have recovered data until
recovery finishes.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f61cce5b

fscrypto/f2fs: allow fs-specific key prefix for fs encryption · b5a7aef1

Authored by Jaegeuk Kim on May 04, 2016

This patch allows fscrypto to handle a second key prefix given by the
filesystem. The main reason is to provide backward compatibility, since
previously f2fs used "f2fs:" as a crypto prefix instead of "fscrypt:".
Later, ext4 should also provide key_prefix() to give "ext4:".

One concern described by Ted is the overhead of checking the prefix twice.
On x86, for example, validate_user_key consumes 8 ms after boot-up, and it
turns out that derive_key_aes() consumed most of that time loading a
specific crypto module. After such a cold miss, it shows almost zero
latency, which can be treated as negligible overhead.
Note that request_key() detects a wrong prefix even prior to derive_key_aes().

Cc: Ted Tso <tytso@mit.edu>
Cc: stable@vger.kernel.org # v4.6
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

b5a7aef1

f2fs: avoid panic when truncating to max filesize · 09210c97

Authored by Chao Yu on May 05, 2016

The following panic occurs when truncating an inode which has inline
xattrs to the max filesize.

[<ffffffffa013d3be>] get_dnode_of_data+0x4e/0x580 [f2fs]
[<ffffffffa013aca1>] ? read_node_page+0x51/0x90 [f2fs]
[<ffffffffa013ad99>] ? get_node_page.part.34+0xb9/0x170 [f2fs]
[<ffffffffa01235b1>] truncate_blocks+0x131/0x3f0 [f2fs]
[<ffffffffa01238e3>] f2fs_truncate+0x73/0x100 [f2fs]
[<ffffffffa01239d2>] f2fs_setattr+0x62/0x2a0 [f2fs]
[<ffffffff811a72c8>] notify_change+0x158/0x300
[<ffffffff8118a42b>] do_truncate+0x6b/0xa0
[<ffffffff8118e539>] ? __sb_start_write+0x49/0x100
[<ffffffff8118a798>] do_sys_ftruncate.constprop.12+0x118/0x170
[<ffffffff8118a82e>] SyS_ftruncate+0xe/0x10
[<ffffffff8169efcf>] tracesys+0xe1/0xe6
[<ffffffffa0139ae0>] get_node_path+0x210/0x220 [f2fs]
 <ffff880206a89ce8>
--[ end trace 5fea664dfbcc6625 ]---

The reason is that truncate_blocks tries to truncate all node and data
blocks starting from the specified block offset, whose value is (max
filesize / block size), but our valid max block offset is actually (max
filesize / block size) - 1, so f2fs detects such an invalid block offset
with BUG_ON in the truncation path.

This patch lets f2fs skip truncating data which exceeds the max
filesize.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

09210c97
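
The off-by-one described in this commit can be modelled in a few lines. The following is only a sketch of the bounds check, not the kernel code: the helper name `first_block_to_free`, the 4096-byte block size, and the example maximum file size are assumptions made for illustration.

```python
BLOCK_SIZE = 4096  # assumed block size for this illustration


def first_block_to_free(new_size, max_file_size, block_size=BLOCK_SIZE):
    """Return the first block index to truncate from, or None to skip.

    Hypothetical helper: valid block indices run from 0 to
    (max_file_size // block_size) - 1, so a start index at or beyond
    max_file_size // block_size leaves nothing to truncate.
    """
    free_from = (new_size + block_size - 1) // block_size  # round up to a block
    max_blocks = max_file_size // block_size
    if free_from >= max_blocks:
        return None  # at or past the maximum: skip the block walk entirely
    return free_from


if __name__ == "__main__":
    max_size = 8 * BLOCK_SIZE  # made-up maximum file size of 8 blocks
    print(first_block_to_free(max_size, max_size))
    print(first_block_to_free(BLOCK_SIZE + 1, max_size))
```

Returning early for an out-of-range start index mirrors the idea in the commit message: the invalid offset never reaches the code that would otherwise trip the BUG_ON.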

f2fs: fix incorrect mapping in ->bmap · 43473f96

Authored by Chao Yu on May 05, 2016

Currently, generic_block_bmap is used in f2fs_bmap; its semantics are that
when a mapping is found, it returns the position of the target physical
block, otherwise it returns zero.

But previously, when there was no mapping info for the specified logical
block, f2fs_bmap mapped the target physical block to an uninitialized
variable, which is wrong. Fix it.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

43473f96

f2fs: remove an obsolete variable · fb58ae22

Authored by Jaegeuk Kim on May 04, 2016

This patch removes an obsolete variable used in add_free_nid.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

fb58ae22

f2fs: don't worry about inode leak in evict_inode · 29234b1d

Authored by Jaegeuk Kim on May 04, 2016

Even if an inode failed to release its blocks, it should be kept in an orphan
inode list, so it will be released later.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

29234b1d

f2fs: shrink size of struct seg_entry · f51b4ce6

Authored by Chao Yu on May 04, 2016

Restructure struct seg_entry to eliminate holes in it. After that, on a
32-bit machine, its size is reduced from 32 bytes to 24 bytes; on a
64-bit machine, its size is reduced from 56 bytes to 40 bytes.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f51b4ce6
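
To illustrate how reordering members removes alignment holes, here is a minimal sketch using Python's ctypes, which lays out structures with the platform's native C alignment. The two structures and their field names are hypothetical and are not the real struct seg_entry layout; the exact sizes printed depend on the platform.

```python
import ctypes


class EntryWithHoles(ctypes.Structure):
    # Hypothetical layout: 1-byte members sit between 8-byte members,
    # so the compiler-style layout inserts padding after each of them.
    _fields_ = [
        ("flag", ctypes.c_uint8),
        ("mtime", ctypes.c_uint64),
        ("kind", ctypes.c_uint8),
        ("counter", ctypes.c_uint64),
        ("blocks", ctypes.c_uint16),
    ]


class EntryReordered(ctypes.Structure):
    # Same members, largest first, so the small ones pack together.
    _fields_ = [
        ("mtime", ctypes.c_uint64),
        ("counter", ctypes.c_uint64),
        ("blocks", ctypes.c_uint16),
        ("flag", ctypes.c_uint8),
        ("kind", ctypes.c_uint8),
    ]


def payload_size(struct_type):
    """Sum of the member sizes, ignoring any padding."""
    return sum(ctypes.sizeof(ftype) for _name, ftype in struct_type._fields_)


if __name__ == "__main__":
    for t in (EntryWithHoles, EntryReordered):
        print(t.__name__, "size:", ctypes.sizeof(t), "payload:", payload_size(t))
```

Both structures carry the same payload; only the padding differs, which is the effect the commit message reports for seg_entry.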

f2fs: reuse get_extent_info · bd933d4f

Authored by Chao Yu on May 04, 2016

Reuse get_extent_info for readability.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

bd933d4f

f2fs: remove unneeded memset when updating xattr · e3bc808c

Authored by Chao Yu on May 04, 2016

Each field in struct f2fs_xattr_entry will be assigned later,
so the earlier memset of the struct is not needed.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

e3bc808c

f2fs: remove unneeded readahead in find_fsync_dnodes · ae8d1db3

Authored by Chao Yu on May 04, 2016

In find_fsync_dnodes, get_tmp_page reads the dnode page synchronously;
the preceding ra_meta_page call did the same work, which is redundant,
so remove it.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

ae8d1db3

f2fs: retry to truncate blocks in -ENOMEM case · 4c0c2949

Authored by Jaegeuk Kim on May 03, 2016

This patch retries truncating node blocks in the -ENOMEM case.
Signed-off-by: Hou Pengyang <houpengyang@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

4c0c2949

f2fs: fix leak of orphan inode objects · 74ef9241

Authored by Jaegeuk Kim on May 02, 2016

When unmounting the filesystem, we should release all the ino entries.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

74ef9241

f2fs: revisit error handling flows · 221149c0

Authored by Jaegeuk Kim on May 02, 2016

This patch fixes a couple of bugs regarding orphan inodes when handling
errors.

This tries to
 - call alloc_nid_done with add_orphan_inode in handle_failed_inode
 - let truncate blocks in f2fs_evict_inode
 - not make a bad inode due to i_mode change
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

221149c0

f2fs: inject ENOSPC failures · cb78942b

Authored by Jaegeuk Kim on Apr 29, 2016

This patch injects ENOSPC failures.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

cb78942b

f2fs: inject page allocation failures · c41f3cc3

Authored by Jaegeuk Kim on Apr 29, 2016

This patch adds page allocation failures.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

c41f3cc3

f2fs: inject kmalloc failure · 2c63fead

Authored by Jaegeuk Kim on Apr 29, 2016

This patch injects kmalloc failure given a fault injection rate.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

2c63fead
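
As a rough illustration of rate-based fault injection, the sketch below fails one call out of every `rate` calls. The commit message only states that failures are injected "given a fault injection rate"; the counter-based scheme, the class name `FaultInjector`, and the `alloc` wrapper are assumptions for this example, not the kernel implementation.

```python
class FaultInjector:
    """Hypothetical counter-based injector: fail once every `rate` calls."""

    def __init__(self, rate):
        self.rate = rate  # 0 disables injection
        self.ops = 0      # calls seen since the last injected failure

    def should_fail(self):
        if self.rate == 0:
            return False
        self.ops += 1
        if self.ops >= self.rate:
            self.ops = 0
            return True
        return False


def alloc(injector, size):
    """Stand-in for an allocation wrapper: None models a failed allocation."""
    if injector.should_fail():
        return None
    return bytearray(size)


if __name__ == "__main__":
    injector = FaultInjector(rate=3)
    print([alloc(injector, 16) is None for _ in range(6)])
```

Routing allocations through one wrapper is what makes this kind of injection possible: callers already have to handle a failed allocation, so the wrapper can exercise those error paths on demand.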

f2fs: add mount option to select fault injection ratio · 73faec4d

Authored by Jaegeuk Kim on Apr 29, 2016

This patch adds a mount option to select fault ratio.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

73faec4d

f2fs: use f2fs_grab_cache_page instead of grab_cache_page · 300e129c

Authored by Jaegeuk Kim on Apr 29, 2016

This patch converts grab_cache_page to f2fs_grab_cache_page.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

300e129c

f2fs: introduce f2fs_kmalloc to wrap kmalloc · 0414b004

Authored by Jaegeuk Kim on Apr 29, 2016

This patch adds f2fs_kmalloc.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

0414b004

f2fs: add proc entry to show valid block bitmap · f00d6fa7

Authored by Jaegeuk Kim on Apr 27, 2016

This patch adds a new proc entry to show segment information in more detail.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f00d6fa7

f2fs: introduce macros for proc entries · b7a15f3d

Authored by Jaegeuk Kim on Apr 27, 2016

This adds macros to be used for multiple proc entries.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

b7a15f3d

04 May, 2016 3 commits

f2fs: factor out fsync inode entry operations · 3f8ab270

Authored by Chao Yu on Apr 29, 2016

Factor out fsync inode entry operations into {add,del}_fsync_inode.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

3f8ab270

f2fs: fix to clear page private flag · c81ced05

Authored by Chao Yu on Apr 29, 2016

Commit 28bc106b ("f2fs: support revoking atomic written pages")
forgot to clear the page private flag correctly. Fix it.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

c81ced05

f2fs: fix to clear private data in page · 23dc974e

Authored by Chao Yu on Apr 29, 2016

Private data in a page should be removed during ->releasepage or
->invalidatepage, otherwise garbage data would remain in that page.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

23dc974e

28 Apr, 2016 4 commits

f2fs: fix to return 0 if err == -ENOENT in f2fs_readdir · fe216c7a

Authored by Yunlong Song on Apr 27, 2016

Commit 57b62d29 ("f2fs: fix to report error in f2fs_readdir") causes
f2fs_readdir to return -ENOENT when get_lock_data_page returns -ENOENT.
However, the original logic is to continue when get_lock_data_page returns
-ENOENT, but it forgets to reset err to 0.

This causes getdents64 to incorrectly return -ENOENT when lastdirent is
NULL in getdents64, which gives the syscall caller a wrong return value.
Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

fe216c7a
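
The control flow described in this commit can be sketched as a simple loop. This is a simplified model, not the kernel function: `lookup_block` and `read_dir` are hypothetical stand-ins, a hole is represented by `None`, and any other failure by a negative errno value.

```python
import errno


def lookup_block(block):
    """Hypothetical stand-in for the page lookup: returns (err, names)."""
    if block is None:
        return -errno.ENOENT, []  # hole in the directory file
    if isinstance(block, int):
        return block, []          # some other error code
    return 0, block


def read_dir(blocks):
    """Walk directory blocks, skipping holes; returns (err, names)."""
    names = []
    err = 0
    for block in blocks:
        err, found = lookup_block(block)
        if err == -errno.ENOENT:
            err = 0   # a hole is skipped, so it must not leak out as an error
            continue
        if err:
            break     # real errors are still reported
        names.extend(found)
    return err, names


if __name__ == "__main__":
    print(read_dir([["a", "b"], None]))
```

Without the `err = 0` reset, a directory whose last block is a hole would end the loop with `err` still set to -ENOENT, which is the wrong return value the commit describes.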

f2fs: move node pages only in victim section during GC · da011cc0

Authored by Chao Yu on Apr 27, 2016

For foreground GC, we cache node blocks in the victim section and set them
dirty, then we call sync_node_pages to flush these node pages. Meanwhile,
node pages which are not located in the victim section are flushed
together with them, so more bandwidth and contiguous free space are
consumed.

So in this case, it's better to leave those unrelated node pages in cache
for further write hits, and let CP or VM flush them afterward.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

da011cc0

f2fs: be aware of invalid filename length · a4a13f58

Authored by Chao Yu on Apr 27, 2016

The filename length in a dirent may become zero after random junk data
injection; once such a dirent is encountered, find_target_dentry or
f2fs_add_inline_entries will run into an infinite loop. So make f2fs
aware of this to avoid the dead loop.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

a4a13f58
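
Why a zero-length name can stall a directory scan is easier to see in a small model. In the sketch below, the scan advances by the number of slots a name occupies, so a name length of zero would advance by zero slots. The 8-byte slot length, the dictionary representation, and both function names are assumptions for illustration only.

```python
SLOT_LEN = 8  # assumed bytes of name stored per directory slot


def slots_for(name_len):
    """Number of slots a name of name_len bytes occupies (rounded up)."""
    return (name_len + SLOT_LEN - 1) // SLOT_LEN


def scan_dentries(name_len_by_slot, nr_slots):
    """Return the start slots of valid entries.

    name_len_by_slot maps the first slot of each entry to its name length.
    """
    found = []
    pos = 0
    while pos < nr_slots:
        if pos not in name_len_by_slot:
            pos += 1  # unused slot
            continue
        name_len = name_len_by_slot[pos]
        if name_len == 0:
            pos += 1  # corrupted entry: step over it instead of advancing by 0
            continue
        found.append(pos)
        pos += slots_for(name_len)
    return found


if __name__ == "__main__":
    print(scan_dentries({0: 5, 1: 0, 2: 20}, nr_slots=8))
```

Without the explicit `name_len == 0` check, `pos += slots_for(0)` would leave `pos` unchanged and the loop would never terminate.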

MAINTAINERS: update my email address · ae9b9a9d

Authored by Chao Yu on Apr 24, 2016

I've changed employer, so update my email address to the new one.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

ae9b9a9d

27 Apr, 2016 6 commits

f2fs: issue cache flush on direct IO · 6bfc4919

Authored by Jaegeuk Kim on Apr 18, 2016

Under the direct IO path with O_(D)SYNC, it needs to set proper APPEND or
UPDATE flags, so that f2fs_sync_file can make its data safe.
Acked-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

6bfc4919

f2fs: set fsync mark only for the last dnode · 608514de

Authored by Jaegeuk Kim on Apr 15, 2016

In order to provide atomic writes, we should consider power failure during
sync_node_pages in fsync.
So, this patch sets the fsync mark only in the last dnode block.
Acked-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

608514de

f2fs: report unwritten status in fsync_node_pages · c267ec15

Authored by Jaegeuk Kim on Apr 15, 2016

fsync_node_pages should report success or failure so that the user can
know whether fsync completed.
Acked-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

c267ec15

f2fs: split sync_node_pages with fsync_node_pages · 52681375

Authored by Jaegeuk Kim on Apr 13, 2016

This patch splits the existing sync_node_pages into (f)sync_node_pages.
The fsync_node_pages is used for f2fs_sync_file only.
Acked-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

52681375

f2fs: avoid writing 0'th page in volatile writes · e6e5f561

Authored by Jaegeuk Kim on Apr 14, 2016

The first page of volatile writes usually contains a sort of header information
which will be used for recovery.
(e.g., journal header of sqlite)

If this is written without other journal data, the user needs to handle the
stale journal information.
Acked-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

e6e5f561

f2fs: avoid needless lock for node pages when fsyncing a file · eca76e78

Authored by Jaegeuk Kim on Apr 13, 2016

When fsync is called, sync_node_pages finds the proper direct node pages to flush.
But it also locks unrelated direct node pages unnecessarily.
Acked-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

eca76e78

openeuler / Kernel · synced successfully about 1 year ago
