Commits · 67883ade7a98a7589ca50e97b1c7b7893886d30e · openeuler / Kernel

28 Jan, 2021 1 commit

Committed by Christoph Hellwig on Jan 26, 2021

Sleeping bio allocations do not fail, which means that injecting an error
into sleeping bio allocations is a little silly.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Acked-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

67883ade

11 Dec, 2020 1 commit

f2fs: fix shift-out-of-bounds in sanity_check_raw_super() · e584bbe8

Committed by Chao Yu on Dec 09, 2020

syzbot reported a bug which could cause shift-out-of-bounds issue,
fix it.

Call Trace:
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0x107/0x163 lib/dump_stack.c:120
 ubsan_epilogue+0xb/0x5a lib/ubsan.c:148
 __ubsan_handle_shift_out_of_bounds.cold+0xb1/0x181 lib/ubsan.c:395
 sanity_check_raw_super fs/f2fs/super.c:2812 [inline]
 read_raw_super_block fs/f2fs/super.c:3267 [inline]
 f2fs_fill_super.cold+0x16c9/0x16f6 fs/f2fs/super.c:3519
 mount_bdev+0x34d/0x410 fs/super.c:1366
 legacy_get_tree+0x105/0x220 fs/fs_context.c:592
 vfs_get_tree+0x89/0x2f0 fs/super.c:1496
 do_new_mount fs/namespace.c:2896 [inline]
 path_mount+0x12ae/0x1e70 fs/namespace.c:3227
 do_mount fs/namespace.c:3240 [inline]
 __do_sys_mount fs/namespace.c:3448 [inline]
 __se_sys_mount fs/namespace.c:3425 [inline]
 __x64_sys_mount+0x27f/0x300 fs/namespace.c:3425
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
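
The class of bug in this trace can be illustrated outside the kernel. The following is a minimal userspace sketch, not the actual patch: it assumes a hypothetical on-disk field `log_blocksize` and validates the exponent before using it in a shift, because shifting a 32-bit value by 32 or more is undefined behaviour in C. The names `EXPECTED_BLKSIZE_BITS` and `blocksize_from_log()` are illustrative only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed for illustration: the only supported block size is 4 KiB (2^12). */
#define EXPECTED_BLKSIZE_BITS 12u

/*
 * Validate an untrusted log2 value read from disk before shifting by it.
 * Returns true and stores the block size on success, false otherwise.
 */
static bool blocksize_from_log(uint32_t log_blocksize, uint32_t *blocksize)
{
	/* Compare the exponent itself; never compute 1 << untrusted_value. */
	if (log_blocksize != EXPECTED_BLKSIZE_BITS)
		return false;

	*blocksize = UINT32_C(1) << log_blocksize;
	return true;
}
```

Checking the exponent first means a fuzzed superblock is rejected before any shift is evaluated.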

Reported-by: syzbot+ca9a785f8ac472085994@syzkaller.appspotmail.com
Signed-off-by: Anant Thazhemadam <anant.thazhemadam@gmail.com>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

e584bbe8

09 Dec, 2020 2 commits

f2fs: don't check PAGE_SIZE again in sanity_check_raw_super() · d540e35d

Committed by Yangtao Li on Dec 07, 2020

Many flash devices read and write a single IO based on a multiple
of 4KB, and we support only 4KB page cache size now.

Since we already check the page size in init_f2fs_fs(), remove the page
size check in sanity_check_raw_super().
Signed-off-by: Yangtao Li <tiny.windzz@gmail.com>
Signed-off-by: Shaohua Liu <liush@allwinnertech.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

d540e35d

f2fs: convert to F2FS_*_INO macro · b9ec1094

Committed by Yangtao Li on Dec 07, 2020

Use the F2FS_ROOT_INO, F2FS_NODE_INO and F2FS_META_INO macros
for better code readability.
Signed-off-by: Yangtao Li <tiny.windzz@gmail.com>
Signed-off-by: Shaohua Liu <liush@allwinnertech.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

b9ec1094

03 Dec, 2020 8 commits

f2fs: add compress_mode mount option · 602a16d5

Committed by Daeho Jeong on Dec 01, 2020

We will add a new "compress_mode" mount option to control file
compression mode. This supports "fs" and "user". In "fs" mode (default),
f2fs does automatic compression on the compression enabled files.
In "user" mode, f2fs disables the automatic compression and gives the
user discretion of choosing the target file and the timing. It means
the user can do manual compression/decompression on the compression
enabled files using ioctls.
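
As a rough illustration of the two modes described above, the sketch below maps the option value to a mode. It is a userspace approximation under stated assumptions: the enum and the function `parse_compress_mode()` are hypothetical names, not the kernel's option parser.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical enum mirroring the two modes named in the commit message. */
enum compress_mode { COMPR_MODE_FS, COMPR_MODE_USER, COMPR_MODE_INVALID };

/* Map the value of "compress_mode=<value>" to a mode; NULL means default. */
static enum compress_mode parse_compress_mode(const char *value)
{
	if (value == NULL || strcmp(value, "fs") == 0)
		return COMPR_MODE_FS;      /* automatic compression (default) */
	if (strcmp(value, "user") == 0)
		return COMPR_MODE_USER;    /* manual, ioctl-driven compression */
	return COMPR_MODE_INVALID;         /* reject anything else */
}
```

Returning an explicit invalid value lets the caller fail the mount with a clear error instead of silently falling back to the default.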
Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

602a16d5

f2fs: fix kbytes written stat for multi-device case · 3a0a9cbc

Committed by Chao Yu on Nov 27, 2020

In the multi-device case, one f2fs image spans multiple devices, so the
stat needs to account for bytes written to all block devices belonging
to the image rather than only the main block device. Fix it.
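
The accounting change amounts to summing over every device instead of reading one counter. A minimal sketch follows, assuming a hypothetical `struct dev_stat` that carries a per-device count of 512-byte sectors written; the kernel obtains the real figures from the block layer.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-device record, for illustration only. */
struct dev_stat {
	uint64_t sectors_written;   /* 512-byte sectors written to this device */
};

/* Sum written sectors across every device of the image and convert to KiB. */
static uint64_t image_kbytes_written(const struct dev_stat *devs, size_t ndevs)
{
	uint64_t sectors = 0;

	for (size_t i = 0; i < ndevs; i++)
		sectors += devs[i].sectors_written;

	return sectors / 2;   /* two 512-byte sectors per KiB */
}
```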
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

3a0a9cbc

f2fs: compress: support chksum · b28f047b

Committed by Chao Yu on Nov 26, 2020

This patch adds support for storing a checksum value with compressed
data and verifying the integrity of the compressed data when
reading it.

The feature can be enabled through specifying mount option
'compress_chksum'.
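
The store-then-verify idea can be sketched in userspace as follows. This is an illustration under assumptions, not the f2fs on-disk format: `struct compressed_cluster`, `cluster_store()` and `cluster_verify()` are hypothetical, and a plain bitwise CRC-32 stands in for whatever checksum the kernel uses.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CLUSTER_CAP 64u   /* illustrative payload capacity */

struct compressed_cluster {
	uint32_t chksum;              /* checksum stored next to the payload */
	uint32_t clen;                /* payload length in bytes */
	uint8_t  cdata[CLUSTER_CAP];  /* compressed payload */
};

/* Standard reflected CRC-32 (polynomial 0xEDB88320), computed bit by bit. */
static uint32_t crc32_bitwise(const uint8_t *data, size_t len)
{
	uint32_t crc = 0xFFFFFFFFu;

	for (size_t i = 0; i < len; i++) {
		crc ^= data[i];
		for (int b = 0; b < 8; b++)
			crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
	}
	return ~crc;
}

/* Write path: copy the payload and record its checksum alongside it. */
static bool cluster_store(struct compressed_cluster *c,
			  const void *data, uint32_t len)
{
	if (len > CLUSTER_CAP)
		return false;
	memcpy(c->cdata, data, len);
	c->clen = len;
	c->chksum = crc32_bitwise(c->cdata, len);
	return true;
}

/* Read path: recompute the checksum and compare with the stored value. */
static bool cluster_verify(const struct compressed_cluster *c)
{
	return c->clen <= CLUSTER_CAP &&
	       c->chksum == crc32_bitwise(c->cdata, c->clen);
}
```

Verifying before decompression means a corrupted payload is reported as an integrity error rather than being fed to the decompressor.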
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

b28f047b

f2fs: change to use rwsem for cp_mutex · 8769918b

Committed by Sahitya Tummala on Nov 23, 2020

Use rwsem to ensure serialization of the callers and to avoid
starvation of high priority tasks, when the system is under
heavy IO workload.
Signed-off-by: Sahitya Tummala <stummala@codeaurora.org>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

8769918b

f2fs: Handle casefolding with Encryption · 7ad08a58

Committed by Daniel Rosenberg on Nov 19, 2020

Expand f2fs's casefolding support to include encrypted directories.  To
index casefolded+encrypted directories, we use the SipHash of the
casefolded name, keyed by a key derived from the directory's fscrypt
master key.  This ensures that the dirhash doesn't leak information
about the plaintext filenames.

Encryption keys are unavailable during roll-forward recovery, so we
can't compute the dirhash when recovering a new dentry in an encrypted +
casefolded directory.  To avoid having to force a checkpoint when a new
file is fsync'ed, store the dirhash on-disk appended to i_name.

This patch incorporates work by Eric Biggers <ebiggers@google.com>
and Jaegeuk Kim <jaegeuk@kernel.org>.
Co-developed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

7ad08a58

fscrypt: Have filesystems handle their d_ops · bb9cd910

Committed by Daniel Rosenberg on Nov 19, 2020

This shifts the responsibility of setting up dentry operations from
fscrypt to the individual filesystems, allowing them to have their own
operations while still setting fscrypt's d_revalidate as appropriate.

Most filesystems can just use generic_set_encrypted_ci_d_ops, unless
they have their own specific dentry operations as well. That operation
will set the minimal d_ops required under the circumstances.

Since the fscrypt d_ops are set later on, we must set all d_ops there,
since we cannot adjust those later on. This should not result in any
change in behavior.
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Acked-by: Theodore Ts'o <tytso@mit.edu>
Acked-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

bb9cd910

f2fs: remove writeback_inodes_sb in f2fs_remount · 9f7e334a

Committed by Liu Song on Nov 18, 2020

Since sync_inodes_sb has been used, there is no need to
use writeback_inodes_sb, so remove it.
Signed-off-by: Liu Song <liu.song11@zte.com.cn>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

9f7e334a

f2fs: fix double free of unicode map · 89ff6005

Committed by Hyeongseok Kim on Nov 12, 2020

When fill_super is retried with skip_recovery, s_encoding for
casefolding is not loaded again even though it has already been freed,
because the pointer is not NULL.
Set it to NULL after freeing to prevent a double free on unmount.
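
The free-then-clear idiom behind this fix looks roughly like the sketch below. It is a userspace analogue: `struct sb_info` and `unload_encoding()` are hypothetical stand-ins, and `free()` replaces the kernel's unload routine.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the superblock's encoding pointer. */
struct sb_info {
	char *encoding;
};

/* Release the encoding and clear the pointer so teardown can run twice. */
static void unload_encoding(struct sb_info *sb)
{
	free(sb->encoding);
	sb->encoding = NULL;   /* a later call sees NULL; free(NULL) is a no-op */
}
```

With the pointer cleared, a retry path can also tell that the encoding must be loaded again, instead of treating a dangling pointer as valid.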

Fixes: eca4873e ("f2fs: Use generic casefolding support")
Signed-off-by: Hyeongseok Kim <hyeongseok@gmail.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

89ff6005

02 Dec, 2020 2 commits

block: switch partition lookup to use struct block_device · 8446fe92

Committed by Christoph Hellwig on Nov 24, 2020

Use struct block_device to look up partitions on a disk.  This removes
all usage of struct hd_struct from the I/O path.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Acked-by: Coly Li <colyli@suse.de>			[bcache]
Acked-by: Chao Yu <yuchao0@huawei.com>			[f2fs]
Signed-off-by: Jens Axboe <axboe@kernel.dk>

8446fe92

block: remove the nr_sects field in struct hd_struct · a782483c

Committed by Christoph Hellwig on Nov 26, 2020

Now that the hd_struct always has a block device attached to it, there is
no need for having two size fields that just get out of sync.

Additionally the field in hd_struct did not use proper serialization,
possibly allowing for torn writes.  By only using the block_device field
this problem also gets fixed.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Acked-by: Coly Li <colyli@suse.de>			[bcache]
Acked-by: Chao Yu <yuchao0@huawei.com>			[f2fs]
Signed-off-by: Jens Axboe <axboe@kernel.dk>

a782483c

30 Sep, 2020 2 commits

f2fs: compress: introduce cic/dic slab cache · c68d6c88

Committed by Chao Yu on Sep 14, 2020

Add two slab caches: "f2fs_cic_entry" and "f2fs_dic_entry" for memory
allocation of compress_io_ctx and decompress_io_ctx structures.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

c68d6c88

f2fs: compress: introduce page array slab cache · 31083031

Committed by Chao Yu on Sep 14, 2020

Add a per-sbi slab cache "f2fs_page_array_entry-%u:%u" for memory
allocation of the page pointer array in the compress context.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
[Jaegeuk Kim: Fix wrong memory allocation]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

31083031

29 Sep, 2020 5 commits

f2fs: fix to do sanity check on segment/section count · 3a22e9ac

Committed by Chao Yu on Sep 29, 2020

As syzbot reported:

BUG: KASAN: slab-out-of-bounds in init_min_max_mtime fs/f2fs/segment.c:4710 [inline]
BUG: KASAN: slab-out-of-bounds in f2fs_build_segment_manager+0x9302/0xa6d0 fs/f2fs/segment.c:4792
Read of size 8 at addr ffff8880a1b934a8 by task syz-executor682/6878

CPU: 1 PID: 6878 Comm: syz-executor682 Not tainted 5.9.0-rc6-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x198/0x1fd lib/dump_stack.c:118
 print_address_description.constprop.0.cold+0xae/0x497 mm/kasan/report.c:383
 __kasan_report mm/kasan/report.c:513 [inline]
 kasan_report.cold+0x1f/0x37 mm/kasan/report.c:530
 init_min_max_mtime fs/f2fs/segment.c:4710 [inline]
 f2fs_build_segment_manager+0x9302/0xa6d0 fs/f2fs/segment.c:4792
 f2fs_fill_super+0x381a/0x6e80 fs/f2fs/super.c:3633
 mount_bdev+0x32e/0x3f0 fs/super.c:1417
 legacy_get_tree+0x105/0x220 fs/fs_context.c:592
 vfs_get_tree+0x89/0x2f0 fs/super.c:1547
 do_new_mount fs/namespace.c:2875 [inline]
 path_mount+0x1387/0x20a0 fs/namespace.c:3192
 do_mount fs/namespace.c:3205 [inline]
 __do_sys_mount fs/namespace.c:3413 [inline]
 __se_sys_mount fs/namespace.c:3390 [inline]
 __x64_sys_mount+0x27f/0x300 fs/namespace.c:3390
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

The root cause is: if segs_per_sec is larger than one, and the segment
count in the last section is less than segs_per_sec, we hit an
out-of-bounds memory access on sit_i->sentries[] in init_min_max_mtime().

Fix this by adding a sanity check on the segment count, section count and
segs_per_sec value in sanity_check_raw_super().
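
One plausible form of such a consistency check is sketched below. It is an assumption about the shape of the check, not a copy of the patch: the function name `segment_layout_is_sane()` is hypothetical, and the rule it encodes is simply that every section must be fully populated.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Reject layouts where the last section would be short: the main area
 * must hold exactly total_sections * segs_per_sec segments.
 */
static bool segment_layout_is_sane(uint32_t segment_count_main,
				   uint32_t total_sections,
				   uint32_t segs_per_sec)
{
	if (segs_per_sec == 0 || total_sections == 0)
		return false;

	/* 64-bit product so a crafted image cannot overflow the check. */
	return (uint64_t)total_sections * segs_per_sec == segment_count_main;
}
```

For example, 4 sections of 3 segments require exactly 12 main segments; an image claiming 11 would leave the last section short and is rejected.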

Reported-by: syzbot+481a3ffab50fed41dcc0@syzkaller.appspotmail.com
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

3a22e9ac

f2fs: fix wrong total_sections check and fsmeta check · f99ba9ad

Committed by Wang Xiaojun on Sep 17, 2020

The meta area is not included in the section_count computation,
so the minimum number of total_sections is 1, and it cannot be
greater than segment_count_main.

The minimum number of meta segments is 8 (SB + 2 (CP + SIT + NAT) + SSA).
Signed-off-by: Wang Xiaojun <wangxiaojun11@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f99ba9ad

f2fs: remove duplicated code in sanity_check_area_boundary · d89f5891

Committed by Wang Xiaojun on Sep 18, 2020

Use seg_end_blkaddr instead of "segment0_blkaddr + (segment_count <<
log_blocks_per_seg)".
Signed-off-by: Wang Xiaojun <wangxiaojun11@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

d89f5891

f2fs: relocate blkzoned feature check · d0660122

Committed by Chao Yu on Sep 21, 2020

Relocate the blkzoned feature check into parse_options() like
the other feature checks.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

d0660122

f2fs: do sanity check on zoned block device path · 07eb1d69

Committed by Chao Yu on Sep 21, 2020

sbi->devs is initialized only if the image enables the multiple device
feature or the blkzoned feature. If the blkzoned feature flag is set by
fuzzing on a non-blkzoned device, we hit the panic below:

get_zone_idx fs/f2fs/segment.c:4892 [inline]
f2fs_usable_zone_blks_in_seg fs/f2fs/segment.c:4943 [inline]
f2fs_usable_blks_in_seg+0x39b/0xa00 fs/f2fs/segment.c:4999
Call Trace:
 check_block_count+0x69/0x4e0 fs/f2fs/segment.h:704
 build_sit_entries fs/f2fs/segment.c:4403 [inline]
 f2fs_build_segment_manager+0x51da/0xa370 fs/f2fs/segment.c:5100
 f2fs_fill_super+0x3880/0x6ff0 fs/f2fs/super.c:3684
 mount_bdev+0x32e/0x3f0 fs/super.c:1417
 legacy_get_tree+0x105/0x220 fs/fs_context.c:592
 vfs_get_tree+0x89/0x2f0 fs/super.c:1547
 do_new_mount fs/namespace.c:2896 [inline]
 path_mount+0x12ae/0x1e70 fs/namespace.c:3216
 do_mount fs/namespace.c:3229 [inline]
 __do_sys_mount fs/namespace.c:3437 [inline]
 __se_sys_mount fs/namespace.c:3414 [inline]
 __x64_sys_mount+0x27f/0x300 fs/namespace.c:3414
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46

Add a sanity check for inconsistency among the blkzoned flag, the device
path and the device characteristics to avoid the above panic.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

07eb1d69

22 Sep, 2020 2 commits

fscrypt: make fscrypt_set_test_dummy_encryption() take a 'const char *' · c8c868ab

Committed by Eric Biggers on Sep 16, 2020

fscrypt_set_test_dummy_encryption() requires that the optional argument
to the test_dummy_encryption mount option be specified as a substring_t.
That doesn't work well with filesystems that use the new mount API,
since the new way of parsing mount options doesn't use substring_t.

Make it take the argument as a 'const char *' instead.

Instead of moving the match_strdup() into the callers in ext4 and f2fs,
make them just use arg->from directly. Since the pattern is
"test_dummy_encryption=%s", the argument will be null-terminated.
Acked-by: Jeff Layton <jlayton@kernel.org>
Link: https://lore.kernel.org/r/20200917041136.178600-14-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>

c8c868ab

fscrypt: handle test_dummy_encryption in more logical way · ac4acb1f

Committed by Eric Biggers on Sep 16, 2020

The behavior of the test_dummy_encryption mount option is that when a
new file (or directory or symlink) is created in an unencrypted
directory, it's automatically encrypted using a dummy encryption policy.
That's it; in particular, the encryption (or lack thereof) of existing
files (or directories or symlinks) doesn't change.

Unfortunately the implementation of test_dummy_encryption is a bit weird
and confusing.  When test_dummy_encryption is enabled and a file is
being created in an unencrypted directory, we set up an encryption key
(->i_crypt_info) for the directory.  This isn't actually used to do any
encryption, however, since the directory is still unencrypted!  Instead,
->i_crypt_info is only used for inheriting the encryption policy.

One consequence of this is that the filesystem ends up providing a
"dummy context" (policy + nonce) instead of a "dummy policy".  In
commit ed318a6c ("fscrypt: support test_dummy_encryption=v2"), I
mistakenly thought this was required.  However, actually the nonce only
ends up being used to derive a key that is never used.

Another consequence of this implementation is that it allows for
'inode->i_crypt_info != NULL && !IS_ENCRYPTED(inode)', which is an edge
case that can be forgotten about.  For example, currently
FS_IOC_GET_ENCRYPTION_POLICY on an unencrypted directory may return the
dummy encryption policy when the filesystem is mounted with
test_dummy_encryption.  That seems like the wrong thing to do, since
again, the directory itself is not actually encrypted.

Therefore, switch to a more logical and maintainable implementation
where the dummy encryption policy inheritance is done without setting up
keys for unencrypted directories.  This involves:

- Adding a function fscrypt_policy_to_inherit() which returns the
  encryption policy to inherit from a directory.  This can be a real
  policy, a dummy policy, or no policy.

- Replacing struct fscrypt_dummy_context, ->get_dummy_context(), etc.
  with struct fscrypt_dummy_policy, ->get_dummy_policy(), etc.

- Making fscrypt_fname_encrypted_size() take an fscrypt_policy instead
  of an inode.
Acked-by: Jaegeuk Kim <jaegeuk@kernel.org>
Acked-by: Jeff Layton <jlayton@kernel.org>
Link: https://lore.kernel.org/r/20200917041136.178600-13-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>

ac4acb1f

19 Sep, 2020 1 commit

[PATCH] reduce boilerplate in fsid handling · 6d1349c7

Committed by Al Viro on Sep 18, 2020

Get rid of boilerplate in most of ->statfs()
instances...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

6d1349c7

12 Sep, 2020 3 commits

f2fs: change i_compr_blocks of inode to atomic value · c2759eba

Committed by Daeho Jeong on Sep 08, 2020

writepages() can be concurrently invoked for the same file by different
threads such as a thread fsyncing the file and a kworker kernel thread.
So, changing i_compr_blocks without protection is racy and we need to
protect it by making it an atomic value. Also, we don't need
a 64-bit value for i_compr_blocks, so we just use an atomic value,
not atomic64.
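
A userspace analogue of the change, using C11 atomics instead of the kernel's atomic_t, might look like the sketch below. The struct `inode_info` and the helper names are hypothetical; only the idea of replacing a plain counter with an atomic one comes from the commit message.

```c
#include <stdatomic.h>

/* Hypothetical per-inode state with an atomic compressed-block counter. */
struct inode_info {
	atomic_int compr_blocks;
};

/* Safe to call concurrently from several writeback threads. */
static void compr_blocks_add(struct inode_info *fi, int blocks)
{
	atomic_fetch_add(&fi->compr_blocks, blocks);
}

static void compr_blocks_sub(struct inode_info *fi, int blocks)
{
	atomic_fetch_sub(&fi->compr_blocks, blocks);
}

static int compr_blocks_read(struct inode_info *fi)
{
	return atomic_load(&fi->compr_blocks);
}
```

Each update is a single read-modify-write operation, so two threads adjusting the counter at the same time cannot lose an increment.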
Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

c2759eba

f2fs: ignore compress mount option on image w/o compression feature · 69c0dd29

Committed by Chao Yu on Sep 03, 2020

Ignore the compress mount option on an image without the compression
feature. This keeps the behavior consistent with passing the compress
mount option to a kernel without the compression feature, so that mount
does not fail in that case.
Reported-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

69c0dd29

f2fs: support age threshold based garbage collection · 093749e2

Committed by Chao Yu on Aug 04, 2020

There are several issues in the current background GC algorithm:
- the number of valid blocks is one of the key factors in the cost
calculation, so if a segment has few valid blocks, the CB algorithm
will still choose it as a victim even if it is young or belongs to a
hot segment, which is not appropriate.
- GCed data/nodes go to existing logs, no matter whether the update
frequency of the data already there is the same or not, so hot and cold
data may be mixed again.
- the GC allocator mainly uses LFS type segments, which consumes free
segments more quickly.

This patch introduces a new algorithm named age threshold based
garbage collection to solve the above issues. There are three main
steps:

1. select a source victim:
- set an age threshold, and select candidates based on the threshold:
e.g.
 0 means youngest, 100 means oldest; if we set the age threshold to 80
 then select dirty segments whose age is in the range [80, 100] as
 candidates;
- set a candidate_ratio threshold, and select candidates based on the
ratio, so that we can shrink the candidates to the oldest segments;
- select the target segment with the fewest valid blocks in order to
migrate blocks with minimum cost;

2. select a target victim:
- select candidates based on the age threshold;
- set a candidate_radius threshold, and search for candidates whose age
is around the source victim's; the search radius should be less than
the radius threshold.
- select the target segment with the most valid blocks in order to avoid
migrating the current target segment.

3. merge valid blocks from the source victim into the target victim with
the SSR allocator.
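
Step 1 above can be sketched as a simple scan. This is a simplified illustration, not the f2fs implementation: `struct seg` and `pick_source_victim()` are hypothetical, ages are assumed to be normalized to 0..100 as in the example, and the candidate_ratio filter is left out for brevity.

```c
#include <stddef.h>

/* Hypothetical summary of a dirty segment. */
struct seg {
	unsigned age;            /* 0 = youngest, 100 = oldest */
	unsigned valid_blocks;   /* blocks that would have to be migrated */
};

/*
 * Among segments at least as old as age_threshold, pick the one with the
 * fewest valid blocks. Returns its index, or -1 if no segment qualifies.
 */
static int pick_source_victim(const struct seg *segs, size_t n,
			      unsigned age_threshold)
{
	int best = -1;

	for (size_t i = 0; i < n; i++) {
		if (segs[i].age < age_threshold)
			continue;   /* too young to be a candidate */
		if (best < 0 || segs[i].valid_blocks < segs[best].valid_blocks)
			best = (int)i;
	}
	return best;
}
```

Note that a young segment with very few valid blocks is skipped here, which is exactly the case the first issue in the list describes.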

Test steps:
- create 160 dirty segments:
 * half of them have 128 valid blocks per segment
 * the rest have 384 valid blocks per segment
- run background GC

Benefit: GC count and block movement count both decrease noticeably:

- Before:
  - Valid: 86
  - Dirty: 1
  - Prefree: 11
  - Free: 6001 (6001)

GC calls: 162 (BG: 220)
  - data segments : 160 (160)
  - node segments : 2 (2)
Try to move 41454 blocks (BG: 41454)
  - data blocks : 40960 (40960)
  - node blocks : 494 (494)

IPU: 0 blocks
SSR: 0 blocks in 0 segments
LFS: 41364 blocks in 81 segments

- After:

  - Valid: 87
  - Dirty: 0
  - Prefree: 4
  - Free: 6008 (6008)

GC calls: 75 (BG: 76)
  - data segments : 74 (74)
  - node segments : 1 (1)
Try to move 12813 blocks (BG: 12813)
  - data blocks : 12544 (12544)
  - node blocks : 269 (269)

IPU: 0 blocks
SSR: 12032 blocks in 77 segments
LFS: 855 blocks in 2 segments
Signed-off-by: Chao Yu <yuchao0@huawei.com>
[Jaegeuk Kim: fix a bug along with pinfile in-mem segment & clean up]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

093749e2

11 Sep, 2020 3 commits

f2fs: Use generic casefolding support · eca4873e

Committed by Daniel Rosenberg on Jul 08, 2020

This switches f2fs over to the generic support provided in
the previous patch.

Since casefolded dentries behave the same in ext4 and f2fs, we decrease
the maintenance burden by unifying them, and any optimizations will
immediately apply to both.
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

eca4873e

f2fs: introduce inmem curseg · d0b9e42a

Committed by Chao Yu on Aug 04, 2020

The previous implementation of aligned pinfile allocation will:
- allocate a new segment on the cold data log no matter whether the last
used segment is partially used or not, which makes IOs more random;
- force concurrent cold data/GCed IO into the warm data area, which
can hurt hot/cold data separation;

In this patch, we introduce a new type of log named 'inmem curseg'.
It differs from a normal curseg as follows:
- it reuses an existing segment type (CURSEG_XXX_NODE/DATA);
- it only exists in memory; its segno, blkofs and summary will not be
 persisted into the checkpoint area;

With this new feature, we can enhance the scalability of logs; special
allocators can be created for specific purposes:
- a pure lfs allocator for aligned pinfile allocation or file
defragmentation
- a pure ssr allocator for a later feature

So let's update aligned pinfile allocation to use this new
inmem curseg framework.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

d0b9e42a

f2fs: support zone capacity less than zone size · de881df9

Committed by Aravind Ramesh on Jul 16, 2020

NVMe Zoned Namespace devices can have zone-capacity less than zone-size.
Zone-capacity indicates the maximum number of sectors that are usable in
a zone beginning from the first sector of the zone. This makes the
sectors after the zone-capacity up to the zone-size unusable.
This patch set tracks zone-size and zone-capacity in zoned devices and
calculates the usable blocks per segment and usable segments per section.

If zone-capacity is less than zone-size mark only those segments which
start before zone-capacity as free segments. All segments at and beyond
zone-capacity are treated as permanently used segments. In cases where
zone-capacity does not align with segment size the last segment will start
before zone-capacity and end beyond the zone-capacity of the zone. For
such spanning segments only sectors within the zone-capacity are used.

During writes and GC, manage the usable segments in a section and usable
blocks per segment. Segments which are beyond zone-capacity are never
allocated and do not need to be garbage collected; only the segments
which are before zone-capacity need to be garbage collected.
For spanning segments, based on the number of usable blocks in that
segment, write to blocks only up to zone-capacity.

Zone-capacity is device specific and cannot be configured by the user.
Since NVMe ZNS device zones are sequential-write-only, a block device
with conventional zones or any normal block device is needed along with
the ZNS device for the metadata operations of F2FS.

A typical nvme-cli output of a zoned device shows zone start and capacity
and write pointer as below:

SLBA: 0x0 WP: 0x0 Cap: 0x18800 State: EMPTY Type: SEQWRITE_REQ
SLBA: 0x20000 WP: 0x20000 Cap: 0x18800 State: EMPTY Type: SEQWRITE_REQ
SLBA: 0x40000 WP: 0x40000 Cap: 0x18800 State: EMPTY Type: SEQWRITE_REQ

Here zone size is 64MB, capacity is 49MB, WP is at zone start as the zones
are in EMPTY state. For each zone, only zone start + 49MB is usable area,
any lba/sector after 49MB cannot be read or written to, the drive will fail
any attempts to read/write. So, the second zone starts at 64MB and is
usable till 113MB (64 + 49) and the range between 113 and 128MB is
again unusable. The next zone starts at 128MB, and so on.
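
The per-segment arithmetic for the example above can be worked through with a small sketch. The assumptions are labeled in the code: 512-byte sectors, 4 KiB blocks and 2 MiB segments (512 blocks). The function `usable_blocks_in_seg()` is a hypothetical helper and not the kernel function of a similar name.

```c
#include <stdint.h>

#define SECTOR_SIZE      512u    /* bytes per sector (assumed) */
#define BLOCK_SIZE_BYTES 4096u   /* bytes per block (assumed) */
#define BLOCKS_PER_SEG   512u    /* 2 MiB segment (assumed) */

/*
 * Usable blocks in segment seg_idx (0-based within its zone), given the
 * zone capacity in sectors as reported by the device.
 */
static uint32_t usable_blocks_in_seg(uint64_t zone_cap_sectors, uint32_t seg_idx)
{
	uint64_t cap_blocks = zone_cap_sectors * SECTOR_SIZE / BLOCK_SIZE_BYTES;
	uint64_t seg_start = (uint64_t)seg_idx * BLOCKS_PER_SEG;

	if (seg_start >= cap_blocks)
		return 0;                       /* entirely beyond capacity */
	if (cap_blocks - seg_start >= BLOCKS_PER_SEG)
		return BLOCKS_PER_SEG;          /* entirely within capacity */
	return (uint32_t)(cap_blocks - seg_start);   /* spanning segment */
}
```

With Cap: 0x18800 (49MB), the capacity is 12544 blocks, or 24.5 segments. Under these assumptions, segments 0 to 23 are fully usable, segment 24 is a spanning segment with 256 usable blocks, and segments 25 to 31 of the 64MB zone are unusable.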
Signed-off-by: Aravind Ramesh <aravind.ramesh@wdc.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

de881df9

04 Aug, 2020 1 commit

f2fs: compress: disable compression mount option if compression is off · 1f0b067b

Committed by Chao Yu on Jul 29, 2020

If CONFIG_F2FS_FS_COMPRESSION is off, don't allow configuring or
showing compression-related mount options.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

1f0b067b

24 Jul, 2020 1 commit

f2fs: fix use-after-free issue · 99c787cf

Committed by Li Guifu on Jul 24, 2020

During umount, f2fs_put_super() unregisters procfs entries after
f2fs_destroy_segment_manager(), it may cause use-after-free
issue when umount races with procfs accessing, fix it by relocating
f2fs_unregister_sysfs().

[Chao Yu: change commit title/message a bit]
Signed-off-by: Li Guifu <bluce.liguifu@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

99c787cf

09 Jul, 2020 1 commit

f2fs: add inline encryption support · 27aacd28

Committed by Satya Tangirala on Jul 02, 2020

Wire up f2fs to support inline encryption via the helper functions which
fs/crypto/ now provides.  This includes:

- Adding a mount option 'inlinecrypt' which enables inline encryption
  on encrypted files where it can be used.

- Setting the bio_crypt_ctx on bios that will be submitted to an
  inline-encrypted file.

- Not adding logically discontiguous data to bios that will be submitted
  to an inline-encrypted file.

- Not doing filesystem-layer crypto on inline-encrypted files.

This patch includes a fix for a race during IPU by
Sahitya Tummala <stummala@codeaurora.org>
Signed-off-by: Satya Tangirala <satyat@google.com>
Acked-by: Jaegeuk Kim <jaegeuk@kernel.org>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Link: https://lore.kernel.org/r/20200702015607.1215430-4-satyat@google.com
Co-developed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>

27aacd28

08 Jul, 2020 1 commit

f2fs: avoid readahead race condition · 6b12367d

Committed by Jaegeuk Kim on Jun 22, 2020

If two readahead threads with the same offset enter readpages, all read
IOs are split and issued to the disk, which gives lower bandwidth.

This patch tries to avoid redundant readahead calls.

Fixes one build error reported by Randy.
Fix build error when F2FS_FS_COMPRESSION is not set/enabled.
This label is needed in either case.

../fs/f2fs/data.c: In function ‘f2fs_mpage_readpages’:
../fs/f2fs/data.c:2327:5: error: label ‘next_page’ used but not defined
     goto next_page;
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

6b12367d

19 Jun, 2020 2 commits

f2fs: use kfree() to free variables allocated by match_strdup() · ba87a45c

Committed by Wang Xiaojun on Jun 17, 2020

Use kfree() instead of kvfree() to free variables allocated
by match_strdup(), because the memory is allocated with kmalloc()
inside match_strdup().
Signed-off-by: Wang Xiaojun <wangxiaojun11@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

ba87a45c

f2fs: use kfree() instead of kvfree() to free superblock data · 742532d1

Committed by Denis Efremov on Jun 10, 2020

Use kfree() instead of kvfree() to free super in read_raw_super_block()
because the memory is allocated with kzalloc() in the function.
Use kfree() instead of kvfree() to free sbi, raw_super in
f2fs_fill_super() and f2fs_put_super() because the memory is allocated
with kzalloc().
Signed-off-by: Denis Efremov <efremov@linux.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

742532d1

09 Jun, 2020 1 commit

f2fs: don't return vmalloc() memory from f2fs_kmalloc() · 0b6d4ca0

Committed by Eric Biggers on Jun 04, 2020

kmalloc() returns kmalloc'ed memory, and kvmalloc() returns either
kmalloc'ed or vmalloc'ed memory. But the f2fs wrappers, f2fs_kmalloc()
and f2fs_kvmalloc(), both return both kinds of memory.

It's redundant to have two functions that do the same thing, and also
breaking the standard naming convention is causing bugs since people
assume it's safe to kfree() memory allocated by f2fs_kmalloc(). See
e.g. the various allocations in fs/f2fs/compress.c.

Fix this by making f2fs_kmalloc() just use kmalloc(). And to avoid
re-introducing the allocation failures that the vmalloc fallback was
intended to fix, convert the largest allocations to use f2fs_kvmalloc().
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

0b6d4ca0

19 May, 2020 2 commits

fscrypt: support test_dummy_encryption=v2 · ed318a6c

Committed by Eric Biggers on May 12, 2020

v1 encryption policies are deprecated in favor of v2, and some new
features (e.g. encryption+casefolding) are only being added for v2.

Therefore, the "test_dummy_encryption" mount option (which is used for
encryption I/O testing with xfstests) needs to support v2 policies.

To do this, extend its syntax to be "test_dummy_encryption=v1" or
"test_dummy_encryption=v2". The existing "test_dummy_encryption" (no
argument) also continues to be accepted, to specify the default setting
-- currently v1, but the next patch changes it to v2.

To cleanly support both v1 and v2 while also making it easy to support
specifying other encryption settings in the future (say, accepting
"$contents_mode:$filenames_mode:v2"), make ext4 and f2fs maintain a
pointer to the dummy fscrypt_context rather than using mount flags.

To avoid concurrency issues, don't allow test_dummy_encryption to be set
or changed during a remount. (The former restriction is new, but
xfstests doesn't run into it, so no one should notice.)

Tested with 'gce-xfstests -c {ext4,f2fs}/encrypt -g auto'. On ext4,
there are two regressions, both of which are test bugs: ext4/023 and
ext4/028 fail because they set an xattr and expect it to be stored
inline, but the increase in size of the fscrypt_context from
24 to 40 bytes causes this xattr to be spilled into an external block.

Link: https://lore.kernel.org/r/20200512233251.118314-4-ebiggers@kernel.org
Acked-by: Jaegeuk Kim <jaegeuk@kernel.org>
Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Eric Biggers <ebiggers@google.com>

ed318a6c

f2fs: fix checkpoint=disable:%u%% · 1ae18f71

Committed by Jaegeuk Kim on May 15, 2020

When parsing the mount option, we don't have sbi->user_block_count yet,
so the percentage should be converted after getting it.
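
The ordering issue can be illustrated with a two-phase sketch: parsing only records the percentage, and the absolute block count is derived later, once the user block count is known. All names here (`struct mount_opts`, `resolve_unusable_cap()`) are hypothetical, and the formula is an assumed reading of what a percentage form of the option implies.

```c
#include <stdint.h>

/* Hypothetical parsed-options record. */
struct mount_opts {
	uint32_t unusable_cap_perc;   /* recorded while parsing, 0..100 */
	uint64_t unusable_cap;        /* in blocks, filled in later */
};

/*
 * Second phase: call only after the superblock has been read and
 * user_block_count is valid.
 */
static void resolve_unusable_cap(struct mount_opts *opts,
				 uint64_t user_block_count)
{
	opts->unusable_cap = user_block_count / 100 * opts->unusable_cap_perc;
}
```

Had the conversion run during parsing, `user_block_count` would still be zero and the resulting cap would be zero regardless of the percentage given.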

Cc: <stable@vger.kernel.org>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

1ae18f71

12 May, 2020 1 commit

f2fs: refactor resize_fs to avoid meta updates in progress · b4b10061

Committed by Jaegeuk Kim on Mar 31, 2020

Sahitya raised an issue:
- prevent meta updates while checkpoint is in progress

allocate_segment_for_resize() can cause metapage updates if
it needs to change the current node/data segments for resizing.
Stop these meta updates when a checkpoint is already
in progress to prevent inconsistent CP data.
Signed-off-by: Sahitya Tummala <stummala@codeaurora.org>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

b4b10061
