Commit · cf30f6a5f0c60ec98a637b836bef6915f602c6ab · openeuler / Kernel

09 Nov, 2021 1 commit

lib: zstd: Add kernel-specific API · cf30f6a5

Committed by Nick Terrell on Sep 11, 2020

This patch:
- Moves `include/linux/zstd.h` -> `include/linux/zstd_lib.h`
- Updates modified zstd headers to yearless copyright
- Adds a new API in `include/linux/zstd.h` that is functionally
  equivalent to the in-use subset of the current API. Functions are
  renamed to avoid symbol collisions with zstd, to make it clear it is
  not the upstream zstd API, and to follow the kernel style guide.
- Updates all callers to use the new API.

There are no functional changes in this patch. Since there are no
functional changes, I felt it was okay to update all the callers in a
single patch. Once the API is approved, the callers are changed
mechanically.

This patch is preparing for the 3rd patch in this series, which updates
zstd to version 1.4.10. Since the upstream zstd API is no longer exposed
to callers, the update can happen transparently.

Signed-off-by: Nick Terrell <terrelln@fb.com>
Tested By: Paul Jones <paul@pauljones.id.au>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # LLVM/Clang v13.0.0 on x86-64
Tested-by: Jean-Denis Girard <jd.girard@sysnux.pf>

21 Sep, 2021 1 commit

fscrypt: remove fscrypt_operations::max_namelen · 4373b3dc

Committed by Eric Biggers on Sep 09, 2021

The max_namelen field is unnecessary, as it is set to 255 (NAME_MAX) on
all filesystems that support fscrypt (or plan to support fscrypt). For
simplicity, just use NAME_MAX directly instead.

Link: https://lore.kernel.org/r/20210909184513.139281-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>

31 Aug, 2021 3 commits

f2fs: enable realtime discard iff device supports discard · f7db8dd6

Committed by Chao Yu on Aug 30, 2021

Let's enable realtime discard if and only if the device supports
discard functionality.

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: guarantee to write dirty data when enabling checkpoint back · dddd3d65

Committed by Jaegeuk Kim on Aug 19, 2021

We must flush all the dirty data when re-enabling checkpoint. Let's guarantee
that first by adding retry logic around sync_inodes_sb(). In addition,
this patch flushes data in fsync while checkpoint is disabled, which can
mitigate sync_inodes_sb() failures in advance.

Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: Don't create discard thread when device doesn't support realtime discard · 4d674904

Committed by Fengnan Chang on Aug 19, 2021

Don't create the discard thread when the device doesn't support realtime
discard or the user specifies the nodiscard mount option.

Signed-off-by: Fengnan Chang <changfengnan@vivo.com>
Signed-off-by: Yangtao Li <frank.li@vivo.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

24 Aug, 2021 2 commits

f2fs: introduce periodic iostat io latency traces · a4b68176

Committed by Daeho Jeong on Aug 20, 2021

Whenever we notice sluggishness on our machines, we want to know how
well all types of I/O in the f2fs filesystem are being handled. But it
is hard to get this kind of real data. First, we need to reproduce the
issue while running a profiling tool such as blktrace, but the issue
does not reproduce easily. Second, the intervention of any tool
slightly changes the overall timing of the issue, which sometimes makes
it hard to figure out.

So, I added a feature that prints IO latency statistics as tracepoint
events, the minimum needed to understand the filesystem's I/O-related
behavior, under the F2FS_IOSTAT kernel config. With the "iostat_enable"
sysfs node on, we can get this statistics info periodically with
minimal overhead.

[samples]
 f2fs_ckpt-254:1-507     [003] ....  2842.439683: f2fs_iostat_latency:
dev = (254,11), iotype [peak lat.(ms)/avg lat.(ms)/count],
rd_data [136/1/801], rd_node [136/1/1704], rd_meta [4/2/4],
wr_sync_data [164/16/3331], wr_sync_node [152/3/648],
wr_sync_meta [160/2/4243], wr_async_data [24/13/15],
wr_async_node [0/0/0], wr_async_meta [0/0/0]

 f2fs_ckpt-254:1-507     [002] ....  2845.450514: f2fs_iostat_latency:
dev = (254,11), iotype [peak lat.(ms)/avg lat.(ms)/count],
rd_data [60/3/456], rd_node [60/3/1258], rd_meta [0/0/1],
wr_sync_data [120/12/2285], wr_sync_node [88/5/428],
wr_sync_meta [52/6/2990], wr_async_data [4/1/3],
wr_async_node [0/0/0], wr_async_meta [0/0/0]

Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
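The sample events in this commit pack each I/O type as `name [peak/avg/count]`. The following is a minimal Python sketch of pulling those triples out of one event for post-processing. The helper name `parse_iostat_latency` and the regular expression are illustrative assumptions derived only from the sample text, not a documented trace format.

```python
import re

# Matches tokens such as "rd_data [136/1/801]". The header token
# "iotype [peak lat.(ms)/avg lat.(ms)/count]" has no digit right after
# the bracket, so it is skipped.
FIELD_RE = re.compile(r"(\w+) \[(\d+)/(\d+)/(\d+)\]")


def parse_iostat_latency(event):
    """Return {iotype: (peak_ms, avg_ms, count)} for one trace event."""
    return {
        name: (int(peak), int(avg), int(count))
        for name, peak, avg, count in FIELD_RE.findall(event)
    }


# First sample event from the commit message, joined onto one line.
sample = (
    "dev = (254,11), iotype [peak lat.(ms)/avg lat.(ms)/count], "
    "rd_data [136/1/801], rd_node [136/1/1704], rd_meta [4/2/4], "
    "wr_sync_data [164/16/3331], wr_sync_node [152/3/648], "
    "wr_sync_meta [160/2/4243], wr_async_data [24/13/15], "
    "wr_async_node [0/0/0], wr_async_meta [0/0/0]"
)
stats = parse_iostat_latency(sample)
```

With the triples in a dictionary, it becomes straightforward to flag, for example, any I/O type whose peak latency is far above its average.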

f2fs: separate out iostat feature · 52118743

Committed by Daeho Jeong on Aug 19, 2021

Added the F2FS_IOSTAT config option to support getting IO statistics through
sysfs and printing periodic IO statistics tracepoint events, and
moved I/O statistics related code into separate files for better
maintenance.

Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
[Jaegeuk Kim: set default=y]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

18 Aug, 2021 1 commit

f2fs: support fault injection for f2fs_kmem_cache_alloc() · 32410577

Committed by Chao Yu on Aug 09, 2021

This patch adds support for injecting faults into f2fs_kmem_cache_alloc().

Usage:
a) echo 32768 > /sys/fs/f2fs/<dev>/inject_type or
b) mount -o fault_type=32768 <dev> <mountpoint>

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
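The value 32768 in the usage lines of this commit is a single-bit mask, `1 << 15`. The Python sketch below shows one way to build and decode such masks. The helper names are hypothetical, and which fault each of the other bits selects is not stated in this commit message.

```python
def fault_type_mask(*bit_indexes):
    """OR together one bit per requested fault index."""
    mask = 0
    for index in bit_indexes:
        mask |= 1 << index
    return mask


def enabled_bits(mask):
    """List the bit indexes set in a fault_type mask, lowest first."""
    return [i for i in range(mask.bit_length()) if mask & (1 << i)]
```

For instance, `fault_type_mask(15)` gives the 32768 used in the commit, and `enabled_bits` goes the other way when reading a mask back.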

06 Aug, 2021 1 commit

f2fs: fix to do sanity check for sb/cp fields correctly · 65ddf656

Committed by Chao Yu on Aug 06, 2021

This patch fixes the problems below in the sb/cp sanity check:
- in sanity_check_raw_super(), it failed to account for log header
blocks in the cp_payload check.
- in f2fs_sanity_check_ckpt(), it failed to check nat_bits_blocks.

Cc: <stable@kernel.org>
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

04 Aug, 2021 2 commits

f2fs: add sysfs node to control ra_pages for fadvise seq file · 0f6b56ec

Committed by Daeho Jeong on Aug 02, 2021

fadvise() currently lets the user double the readahead window with
POSIX_FADV_SEQUENTIAL. But in some use cases that is not sufficient,
and we need to meet the need in a restricted way. We can now control
the multiplier applied to the bdi device readahead, between 2 (default)
and 256, for the POSIX_FADV_SEQUENTIAL advice option.

Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
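As a rough illustration of the multiplier range described in this commit, the Python sketch below computes a sequential readahead window from a base page count. The function name, the choice to reject out-of-range values with an exception, and the base value of 32 pages in the usage note are assumptions for illustration only.

```python
MIN_RA_MUL = 2    # default multiplier named in the commit message
MAX_RA_MUL = 256  # upper bound named in the commit message


def seq_readahead_pages(base_ra_pages, multiplier=MIN_RA_MUL):
    """Readahead window, in pages, for a POSIX_FADV_SEQUENTIAL file."""
    if not MIN_RA_MUL <= multiplier <= MAX_RA_MUL:
        raise ValueError("multiplier must be between 2 and 256")
    return base_ra_pages * multiplier
```

With a hypothetical base of 32 pages, the default multiplier yields 64 pages and the maximum yields 8192 pages.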

f2fs: introduce discard_unit mount option · 4f993264

Committed by Chao Yu on Aug 03, 2021

As James Z reported in bugzilla:

https://bugzilla.kernel.org/show_bug.cgi?id=213877

[1.] One-line summary of the problem:
Mount multiple SMR block devices exceed certain number cause system non-response

[2.] Full description of the problem/report:
Created some F2FS on SMR devices (mkfs.f2fs -m), then mounted in sequence. Each device is the same Model: HGST HSH721414AL (Size 14TB).
Empirically, found that when the amount of SMR device * 1.5Gb > System RAM, the system ran out of memory and hung. No dmesg output. For example, 24 SMR Disk need 24*1.5GB = 36GB. A system with 32G RAM can only mount 21 devices, the 22nd device will be a reproducible cause of system hang.
The number of SMR devices with other FS mounted on this system does not interfere with the result above.

[3.] Keywords (i.e., modules, networking, kernel):
F2FS, SMR, Memory

[4.] Kernel information
[4.1.] Kernel version (uname -a):
Linux 5.13.4-200.fc34.x86_64 #1 SMP Tue Jul 20 20:27:29 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

[4.2.] Kernel .config file:
Default Fedora 34 with f2fs-tools-1.14.0-2.fc34.x86_64

[5.] Most recent kernel version which did not have the bug:
None

[6.] Output of Oops.. message (if applicable) with symbolic information
     resolved (see Documentation/admin-guide/oops-tracing.rst)
None

[7.] A small shell script or example program which triggers the
     problem (if possible)
mount /dev/sdX /mnt/0X

[8.] Memory consumption

With 24 * 14T SMR Block device with F2FS
free -g
              total        used        free      shared  buff/cache   available
Mem:             46          36           0           0          10          10
Swap:             0           0           0

With 3 * 14T SMR Block device with F2FS
free -g
               total        used        free      shared  buff/cache   available
Mem:               7           5           0           0           1           1
Swap:              7           0           7

The root cause is that there are three bitmaps:
- cur_valid_map
- ckpt_valid_map
- discard_map
and each of them costs ~500MB of memory. {cur, ckpt}_valid_map are
necessary, but discard_map is optional, since this bitmap is only
useful on a mountpoint where small discard is enabled.

For a blkzoned device such as an SMR or ZNS device, f2fs only issues
discard for a section (zone) when all blocks of that section are invalid,
so for such devices we don't need small discard functionality at all.

This patch introduces a new mount option,
"discard_unit=block|segment|section", to support issuing discard with a
different basic unit, aligned to block, segment or section, so that the
user can specify "discard_unit=segment" or "discard_unit=section" to
disable small discard functionality.

Note that this mount option cannot be changed by remount(), because the
related metadata needs to be initialized during mount().

In order to save memory, let's use "discard_unit=section" for blkzoned
device by default.

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
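The memory figures in this commit can be sanity-checked with a little arithmetic. The Python sketch below assumes 4 KiB blocks and one bitmap bit per block; both are assumptions, as are the helper names. Under them a 14 TB device needs roughly 0.43 GB per bitmap, the same order as the "~500MB" quoted above, and the reported 1.5 GB per device reproduces the 21-device limit on a 32 GB machine.

```python
BLOCK_SIZE = 4096  # assumption: 4 KiB f2fs blocks


def bitmap_bytes(device_bytes, block_size=BLOCK_SIZE):
    """Bytes needed for a bitmap with one bit per block, rounded up."""
    blocks = device_bytes // block_size
    return (blocks + 7) // 8


def max_mountable(ram_gb, per_device_gb):
    """How many devices fit if each one pins per_device_gb of RAM."""
    return int(ram_gb // per_device_gb)


per_bitmap = bitmap_bytes(14 * 10**12)   # one 14 TB SMR disk
with_discard_map = max_mountable(32, 1.5)
# Extrapolation, not a measurement: dropping one of three equally sized
# bitmaps would bring the per-device cost to about 1.0 GB.
without_discard_map = max_mountable(32, 1.0)
```

This ignores every other consumer of memory on the system, so it only bounds the effect of the bitmaps themselves.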

03 Aug, 2021 1 commit

f2fs: fix wrong checkpoint_changed value in f2fs_remount() · 277afbde

Committed by Chao Yu on Jul 29, 2021

In f2fs_remount(), the return value of test_opt() is an unsigned int,
so comparing it directly to a bool variable can produce a wrong
result. Fix it.

Fixes: 4354994f ("f2fs: checkpoint disabling")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

20 Jul, 2021 1 commit

f2fs: quota: fix potential deadlock · 9de71ede

Committed by Chao Yu on Jul 19, 2021

xfstest generic/587 reports a deadlock issue as below:

======================================================
WARNING: possible circular locking dependency detected
5.14.0-rc1 #69 Not tainted
------------------------------------------------------
repquota/8606 is trying to acquire lock:
ffff888022ac9320 (&sb->s_type->i_mutex_key#18){+.+.}-{3:3}, at: f2fs_quota_sync+0x207/0x300 [f2fs]

but task is already holding lock:
ffff8880084bcde8 (&sbi->quota_sem){.+.+}-{3:3}, at: f2fs_quota_sync+0x59/0x300 [f2fs]

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #2 (&sbi->quota_sem){.+.+}-{3:3}:
       __lock_acquire+0x648/0x10b0
       lock_acquire+0x128/0x470
       down_read+0x3b/0x2a0
       f2fs_quota_sync+0x59/0x300 [f2fs]
       f2fs_quota_on+0x48/0x100 [f2fs]
       do_quotactl+0x5e3/0xb30
       __x64_sys_quotactl+0x23a/0x4e0
       do_syscall_64+0x3b/0x90
       entry_SYSCALL_64_after_hwframe+0x44/0xae

-> #1 (&sbi->cp_rwsem){++++}-{3:3}:
       __lock_acquire+0x648/0x10b0
       lock_acquire+0x128/0x470
       down_read+0x3b/0x2a0
       f2fs_unlink+0x353/0x670 [f2fs]
       vfs_unlink+0x1c7/0x380
       do_unlinkat+0x413/0x4b0
       __x64_sys_unlinkat+0x50/0xb0
       do_syscall_64+0x3b/0x90
       entry_SYSCALL_64_after_hwframe+0x44/0xae

-> #0 (&sb->s_type->i_mutex_key#18){+.+.}-{3:3}:
       check_prev_add+0xdc/0xb30
       validate_chain+0xa67/0xb20
       __lock_acquire+0x648/0x10b0
       lock_acquire+0x128/0x470
       down_write+0x39/0xc0
       f2fs_quota_sync+0x207/0x300 [f2fs]
       do_quotactl+0xaff/0xb30
       __x64_sys_quotactl+0x23a/0x4e0
       do_syscall_64+0x3b/0x90
       entry_SYSCALL_64_after_hwframe+0x44/0xae

other info that might help us debug this:

Chain exists of:
  &sb->s_type->i_mutex_key#18 --> &sbi->cp_rwsem --> &sbi->quota_sem

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&sbi->quota_sem);
                               lock(&sbi->cp_rwsem);
                               lock(&sbi->quota_sem);
  lock(&sb->s_type->i_mutex_key#18);

 *** DEADLOCK ***

3 locks held by repquota/8606:
 #0: ffff88801efac0e0 (&type->s_umount_key#53){++++}-{3:3}, at: user_get_super+0xd9/0x190
 #1: ffff8880084bc380 (&sbi->cp_rwsem){++++}-{3:3}, at: f2fs_quota_sync+0x3e/0x300 [f2fs]
 #2: ffff8880084bcde8 (&sbi->quota_sem){.+.+}-{3:3}, at: f2fs_quota_sync+0x59/0x300 [f2fs]

stack backtrace:
CPU: 6 PID: 8606 Comm: repquota Not tainted 5.14.0-rc1 #69
Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
Call Trace:
 dump_stack_lvl+0xce/0x134
 dump_stack+0x17/0x20
 print_circular_bug.isra.0.cold+0x239/0x253
 check_noncircular+0x1be/0x1f0
 check_prev_add+0xdc/0xb30
 validate_chain+0xa67/0xb20
 __lock_acquire+0x648/0x10b0
 lock_acquire+0x128/0x470
 down_write+0x39/0xc0
 f2fs_quota_sync+0x207/0x300 [f2fs]
 do_quotactl+0xaff/0xb30
 __x64_sys_quotactl+0x23a/0x4e0
 do_syscall_64+0x3b/0x90
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f883b0b4efe

The root cause is an ABBA deadlock between the inode lock and cp_rwsem.
Reorder the locks in f2fs_quota_sync() as below to fix this issue:
- lock inode
- lock cp_rwsem
- lock quota_sem

Fixes: db6ec53b ("f2fs: add a rw_sem to cover quota flag changes")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
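The fix in this commit relies on every path taking the three locks in one agreed order. A minimal user-space Python sketch of that idea follows. The lock names mirror the list in the commit message, but the helpers and the use of `threading.Lock` are illustrative assumptions and do not model kernel rw-semaphores.

```python
import threading
from contextlib import ExitStack

# Agreed global order, taken from the list in the commit message.
LOCK_ORDER = ("inode_lock", "cp_rwsem", "quota_sem")
locks = {name: threading.Lock() for name in LOCK_ORDER}


def follows_order(sequence):
    """True if the locks in `sequence` are taken in the agreed order."""
    ranks = [LOCK_ORDER.index(name) for name in sequence]
    return all(a < b for a, b in zip(ranks, ranks[1:]))


def with_locks(needed, fn):
    """Acquire the needed locks in the agreed order, then call fn."""
    ordered = sorted(needed, key=LOCK_ORDER.index)
    with ExitStack() as stack:
        for name in ordered:
            stack.enter_context(locks[name])
        return fn(ordered)
```

The sequence from the lockdep report (cp_rwsem, quota_sem, then the inode lock) fails `follows_order`, while the reordered sequence passes.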

13 Jul, 2021 1 commit

f2fs: Convert to using invalidate_lock · edc6d01b

Committed by Jan Kara on Apr 13, 2021

Use invalidate_lock instead of f2fs' private i_mmap_sem. The intended
purpose is exactly the same. By this conversion we fix a long standing
race between hole punching and read(2) / readahead(2) paths that can
lead to stale page cache contents.

CC: Jaegeuk Kim <jaegeuk@kernel.org>
CC: Chao Yu <yuchao0@huawei.com>
CC: linux-f2fs-devel@lists.sourceforge.net
Acked-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jan Kara <jack@suse.cz>

02 Jul, 2021 1 commit

f2fs: compress: add nocompress extensions support · 151b1982

Committed by Fengnan Chang on Jun 08, 2021

When we create a directory with compression enabled, every file written
into that directory will be compressed if possible. But sometimes we know
in advance that a new file cannot meet the compression ratio requirements.
We need a nocompress extension to skip those files and avoid the
unnecessary page compression attempt.

After adding nocompress_extension, the priority should be:
dir_flag < comp_extension,nocompress_extension < comp_file_flag,
no_comp_file_flag.

Priority between FS_COMPR_FL, FS_NOCOMP_FL and extensions:
   * compress_extension=so; nocompress_extension=zip; chattr +c dir;
     touch dir/foo.so; touch dir/bar.zip; touch dir/baz.txt; then foo.so
     and baz.txt should be compressed, bar.zip should be non-compressed.
     chattr +c dir/bar.zip can enable compression on bar.zip.
   * compress_extension=so; nocompress_extension=zip; chattr -c dir;
     touch dir/foo.so; touch dir/bar.zip; touch dir/baz.txt; then foo.so
     should be compressed, bar.zip and baz.txt should be non-compressed.
     chattr +c dir/bar.zip; chattr +c dir/baz.txt; can enable compression
     on bar.zip and baz.txt.

Signed-off-by: Fengnan Chang <changfengnan@vivo.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
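The priority rule in this commit can be read as a small decision function. The Python sketch below is an illustration under stated assumptions: the function and parameter names are invented, `file_flag` stands for an explicit per-file chattr setting (None when unset), and the case where one extension appears in both lists is not covered by the commit message, so it is not handled here.

```python
def should_compress(filename, dir_compress, compress_exts, nocompress_exts,
                    file_flag=None):
    """Decide compression for a new file using the stated priority."""
    # Highest priority: an explicit per-file flag (chattr +c / -c).
    if file_flag is not None:
        return file_flag
    # Middle priority: the mount-time extension lists.
    ext = filename.rsplit(".", 1)[1] if "." in filename else ""
    if ext in nocompress_exts:
        return False
    if ext in compress_exts:
        return True
    # Lowest priority: the flag inherited from the directory.
    return dir_compress
```

Run against the two scenarios in the commit message, this gives foo.so and baz.txt compressed and bar.zip uncompressed under `chattr +c dir`, and only foo.so compressed under `chattr -c dir`.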

23 Jun, 2021 4 commits

f2fs: introduce f2fs_casefolded_name slab cache · 4d9a2bb1

Committed by Chao Yu on Jun 11, 2021

Add a slab cache: "f2fs_casefolded_name" for memory allocation
of casefold name.

Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: compress: add compress_inode to cache compressed blocks · 6ce19aff

Committed by Chao Yu on May 20, 2021

Support using the address space of an inner inode to cache compressed
blocks, in order to improve the cache hit ratio of random reads.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: support RO feature · a7d9fe3c

Committed by Jaegeuk Kim on May 21, 2021

Given RO feature in superblock, we don't need to check provisioning/reserve
spaces and SSA area.

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: logging neatening · 833dcd35

Committed by Joe Perches on May 26, 2021

Update the logging calls that have unnecessary newlines, since the f2fs_printk
function, and therefore its f2fs_<level> macro callers, already adds one.

This allows searching single line logging entries with an easier grep and
also avoids unnecessary blank lines in the logging.

Miscellanea:

o Coalesce formats
o Align to open parenthesis

Signed-off-by: Joe Perches <joe@perches.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

26 May, 2021 1 commit

f2fs: add MODULE_SOFTDEP to ensure crc32 is included in the initramfs · 0dd57178

Committed by Chao Yu on May 18, 2021

As marcosfrm reported in bugzilla:

https://bugzilla.kernel.org/show_bug.cgi?id=213089

Initramfs generators rely on "pre" softdeps (and "depends") to include
additional required modules.

F2FS does not declare "pre: crc32" softdep. Then every generator (dracut,
mkinitcpio...) has to maintain a hardcoded list for this purpose.

Hence let's use MODULE_SOFTDEP("pre: crc32") in f2fs code.

Fixes: 43b6573b ("f2fs: use cryptoapi crc32 functions")
Reported-by: marcosfrm <marcosfrm@gmail.com>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

15 May, 2021 1 commit

f2fs: fix to avoid racing on fsync_entry_slab by multi filesystem instances · cad83c96

Committed by Chao Yu on May 07, 2021

As syzbot reported, there is a use-after-free issue during f2fs recovery:

Use-after-free write at 0xffff88823bc16040 (in kfence-#10):
 kmem_cache_destroy+0x1f/0x120 mm/slab_common.c:486
 f2fs_recover_fsync_data+0x75b0/0x8380 fs/f2fs/recovery.c:869
 f2fs_fill_super+0x9393/0xa420 fs/f2fs/super.c:3945
 mount_bdev+0x26c/0x3a0 fs/super.c:1367
 legacy_get_tree+0xea/0x180 fs/fs_context.c:592
 vfs_get_tree+0x86/0x270 fs/super.c:1497
 do_new_mount fs/namespace.c:2905 [inline]
 path_mount+0x196f/0x2be0 fs/namespace.c:3235
 do_mount fs/namespace.c:3248 [inline]
 __do_sys_mount fs/namespace.c:3456 [inline]
 __se_sys_mount+0x2f9/0x3b0 fs/namespace.c:3433
 do_syscall_64+0x3f/0xb0 arch/x86/entry/common.c:47
 entry_SYSCALL_64_after_hwframe+0x44/0xae

The root cause is that multiple f2fs filesystem instances can race on
accessing the global fsync_entry_slab pointer, resulting in a
use-after-free of the slab cache. Fix this by initializing and destroying
the slab cache only once, during the module init/exit procedure.

Reported-by: syzbot+9d90dad32dd9727ed084@syzkaller.appspotmail.com
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

11 Apr, 2021 1 commit

f2fs: clean up build warnings · 5f029c04

Committed by Yi Zhuang on Apr 06, 2021

This patch combined the below three clean-up patches.

- modify open brace '{' following function definitions
- ERROR: spaces required around that ':'
- ERROR: spaces required before the open parenthesis '('
- ERROR: spaces prohibited before that ','
- Made suggested modifications from checkpatch in reference to WARNING:
 Missing a blank line after declarations

Signed-off-by: Yi Zhuang <zhuangyi1@huawei.com>
Signed-off-by: Jia Yang <jiayang5@huawei.com>
Signed-off-by: Jack Qiu <jack.qiu@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

02 Apr, 2021 1 commit

f2fs: set checkpoint_merge by default · b5d15199

Committed by Jaegeuk Kim on Apr 01, 2021

Once we introduced checkpoint_merge, we've seen some contention w/o the option.
In order to avoid it, let's set it by default.

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

31 Mar, 2021 2 commits

f2fs: fix to restrict mount condition on readonly block device · 23738e74

Committed by Chao Yu on Mar 31, 2021

When we mount an unclean f2fs image on a readonly block device, let's
make mount() succeed only when there is no recoverable data in that
image; otherwise, after mount(), fsynced files won't be recovered as the
user expects.

Fixes: 938a1842 ("f2fs: give a warning only for readonly partition")
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: introduce gc_merge mount option · 5911d2d1

Committed by Chao Yu on Mar 27, 2021

This patch adds two new mount options: "gc_merge" and
"nogc_merge". When background_gc is on, the "gc_merge" option can be
set to let the background GC thread handle foreground GC requests.
This can eliminate the sluggishness caused by slow foreground GC
operation when GC is triggered from a process with limited I/O
and CPU resources.

Original idea is from Xiang.

Signed-off-by: Gao Xiang <xiang@kernel.org>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

26 Mar, 2021 2 commits

f2fs: fix error path of f2fs_remount() · 3fd97359

Committed by Chao Yu on Mar 17, 2021

The error path of f2fs_remount() failed to restart/stop the kernel threads
or enable/disable checkpoint, so the mount option status could become
inconsistent with the real state of the filesystem. Reorder the remount
flow a bit as below and do recovery correctly in the error path:

1) handle gc thread
2) handle ckpt thread
3) handle flush thread
4) handle checkpoint disabling

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: don't start checkpoint thread in readonly mountpoint · 3f7070b0

Committed by Chao Yu on Mar 17, 2021

In a readonly mountpoint, there should be no write IOs, including checkpoint
IO, so there is no need to create the kernel checkpoint thread.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

13 Mar, 2021 2 commits

f2fs: avoid unused f2fs_show_compress_options() · cd6ee739

Committed by Chao Yu on Feb 20, 2021

LKP reports:

   fs/f2fs/super.c:1516:20: warning: unused function 'f2fs_show_compress_options' [-Wunused-function]
   static inline void f2fs_show_compress_options(struct seq_file *seq,

Fix this issue by covering f2fs_show_compress_options() with
CONFIG_F2FS_FS_COMPRESSION macro.

Fixes: 4c8ff709 ("f2fs: support data compression")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: fix to allow migrating fully valid segment · 7dede886

Committed by Chao Yu on Feb 20, 2021

F2FS_IOC_FLUSH_DEVICE/F2FS_IOC_RESIZE_FS needs to migrate all blocks of
the target segment to another place, no matter whether the segment has
partially or fully valid blocks.

However, after commit 803e74be ("f2fs: stop GC when the victim becomes
fully valid"), we may skip migration because the target segment is fully
valid, which makes the ioctl fail. Fix this.

Fixes: 803e74be ("f2fs: stop GC when the victim becomes fully valid")
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

11 Mar, 2021 1 commit

block: rename BIO_MAX_PAGES to BIO_MAX_VECS · a8affc03

Committed by Christoph Hellwig on Mar 11, 2021

Ever since the addition of multipage bio_vecs BIO_MAX_PAGES has been
horribly confusingly misnamed. Rename it to BIO_MAX_VECS to stop
confusing users of the bio API.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20210311110137.1132391-2-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

13 Feb, 2021 1 commit

f2fs: give a warning only for readonly partition · 938a1842

Committed by Jaegeuk Kim on Feb 12, 2021

Let's allow mounting a readonly partition. We can recover later once
it is read-write again.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

09 Feb, 2021 1 commit

f2fs: don't grab superblock freeze for flush/ckpt thread · d50dfc0c

Committed by Jaegeuk Kim on Feb 08, 2021

They are controlled by f2fs_freeze().

This fixes xfstests/generic/068 which is stuck at

 task:f2fs_ckpt-252:3 state:D stack:    0 pid: 5761 ppid:     2 flags:0x00004000
 Call Trace:
  __schedule+0x44c/0x8a0
  schedule+0x4f/0xc0
  percpu_rwsem_wait+0xd8/0x140
  ? percpu_down_write+0xf0/0xf0
  __percpu_down_read+0x56/0x70
  issue_checkpoint_thread+0x12c/0x160 [f2fs]
  ? wait_woken+0x80/0x80
  kthread+0x114/0x150
  ? __checkpoint_and_complete_reqs+0x110/0x110 [f2fs]
  ? kthread_park+0x90/0x90
  ret_from_fork+0x22/0x30

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

04 Feb, 2021 1 commit

f2fs: introduce checkpoint_merge mount option · 261eeb9c

Committed by Daeho Jeong on Jan 19, 2021

We've added new mount options, "checkpoint_merge" and "nocheckpoint_merge".
"checkpoint_merge" creates a kernel daemon and makes it merge concurrent
checkpoint requests as much as possible to eliminate redundant checkpoint
issues. Plus, we can eliminate the sluggishness caused by slow checkpoint
operation when the checkpoint is done in a process context in a cgroup
having a low i/o budget and cpu shares. To make this work better, we set
the default i/o priority of the kernel daemon to "3", giving it one higher
priority than other kernel threads. The verification result below
explains this.
The basic idea has come from https://opensource.samsung.com.

[Verification]
Android Pixel Device(ARM64, 7GB RAM, 256GB UFS)
Create two I/O cgroups (fg w/ weight 100, bg w/ weight 20)
Set "strict_guarantees" to "1" in BFQ tunables

In "fg" cgroup,
- thread A => trigger 1000 checkpoint operations
  "for i in `seq 1 1000`; do touch test_dir1/file; fsync test_dir1;
   done"
- thread B => generating async. I/O
  "fio --rw=write --numjobs=1 --bs=128k --runtime=3600 --time_based=1
       --filename=test_img --name=test"

In "bg" cgroup,
- thread C => trigger repeated checkpoint operations
  "echo $$ > /dev/blkio/bg/tasks; while true; do touch test_dir2/file;
   fsync test_dir2; done"

We've measured thread A's execution time.

[ w/o patch ]
Elapsed Time: Avg. 68 seconds
[ w/  patch ]
Elapsed Time: Avg. 48 seconds

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
[Jaegeuk Kim: fix the return value in f2fs_start_ckpt_thread, reported by Dan]
Signed-off-by: Daeho Jeong <daehojeong@google.com>
Signed-off-by: Sungjong Seo <sj1557.seo@samsung.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
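The merging idea in this commit, one checkpoint serving every request that queued up while the daemon was busy, can be modeled in a few lines of Python. This is a simplified single-threaded sketch under stated assumptions: the function name and the request labels are hypothetical, and it says nothing about the kernel thread, wait queues, or I/O priority handling in the real implementation.

```python
from collections import deque


def run_checkpoint_daemon(pending, do_checkpoint):
    """Drain queued requests in batches; one checkpoint serves a batch."""
    completed = []
    while pending:
        # Take everything queued so far as a single batch.
        batch = list(pending)
        pending.clear()
        # A single checkpoint satisfies every waiter in the batch.
        do_checkpoint()
        completed.extend(batch)
    return completed
```

If three requests are queued when the daemon wakes up, `do_checkpoint` runs once instead of three times, which is where the redundant work is saved.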

02 Feb, 2021 1 commit

f2fs: flush data when enabling checkpoint back · b0ff4fe7

Committed by Jaegeuk Kim on Jan 26, 2021

During the checkpoint=disable period, f2fs bypasses all the synchronous IOs such as
sync and fsync. So, when enabling it back, we must flush all of them in order
to keep the data persistent. Otherwise, a sudden power-cut right after enabling
checkpoint will cause data loss.

Fixes: 4354994f ("f2fs: checkpoint disabling")
Cc: stable@vger.kernel.org
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

28 Jan, 2021 6 commits

f2fs: deprecate f2fs_trace_io · d5f7bc00

Committed by Jaegeuk Kim on Jan 14, 2021

This patch deprecates f2fs_trace_io, since f2fs uses page->private more broadly,
resulting in more buggy cases.

Acked-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: Remove readahead collision detection · 12699fb7

Committed by Matthew Wilcox (Oracle) on Jan 14, 2021

With the new ->readahead operation, locked pages are added to the page
cache, preventing two threads from racing with each other to read the
same chunk of file, so this is dead code.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: fix to use per-inode maxbytes · 6d1451bf

Committed by Chengguang Xu on Jan 13, 2021

F2FS inodes may have different max sizes, e.g. a compressed file has
fewer blkaddr entries in all its direct-node blocks, resulting in a
smaller max file size. So change to use per-inode maxbytes.

Suggested-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Chengguang Xu <cgxu519@mykernel.net>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: compress: support compress level · 3fde13f8

Committed by Chao Yu on Jan 22, 2021

Expand the 'compress_algorithm' mount option to accept a parameter in the
format <algorithm>:<level>. This lets the user configure the lz4 and zstd
compression level more specifically, so that f2fs compression
can achieve a better compression ratio.

In order to set the compression level for the lz4 algorithm, both
CONFIG_LZ4HC_COMPRESS and CONFIG_F2FS_FS_LZ4HC must be set to enable the
lz4hc compression algorithm.

CR and performance number on lz4/lz4hc algorithm:

dd if=enwik9 of=compressed_file conv=fsync

Original blocks:	244382

			lz4			lz4hc-9
compressed blocks	170647			163270
compress ratio		69.8%			66.8%
speed			16.4207 s, 60.9 MB/s	26.7299 s, 37.4 MB/s

compress ratio = after / before

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
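Two pieces of this commit lend themselves to a short illustration: splitting a `<algorithm>:<level>` option value, and recomputing the ratios in the table from the block counts. The Python sketch below is illustrative only. The helper names are assumptions, the example value `lz4:9` is hypothetical, and no validation of which levels a given algorithm accepts is attempted.

```python
def parse_compress_algorithm(value):
    """Split "<algorithm>:<level>" into (algorithm, level or None)."""
    algorithm, sep, level = value.partition(":")
    if not algorithm:
        raise ValueError("missing algorithm name")
    if not sep:
        # No ":<level>" suffix was given.
        return algorithm, None
    return algorithm, int(level)


def compress_ratio(compressed_blocks, original_blocks):
    """Ratio as defined in the table: after / before (lower is better)."""
    return compressed_blocks / original_blocks


# Block counts taken from the table in the commit message.
lz4_ratio = compress_ratio(170647, 244382)
lz4hc_ratio = compress_ratio(163270, 244382)
```

Rounded to one decimal place, the two ratios come out as 69.8% and 66.8%, in agreement with the table.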

f2fs: compress: deny setting unsupported compress algorithm · 32be0e97

Committed by Chao Yu on Jan 22, 2021

If the kernel doesn't support a given compression algorithm, refuse to set
it as the f2fs compression algorithm via the 'compress_algorithm=%s' mount option.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: remove FAULT_ALLOC_BIO · 67883ade

Committed by Christoph Hellwig on Jan 26, 2021

Sleeping bio allocations do not fail, which means that injecting an error
into sleeping bio allocations is a little silly.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Acked-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

openeuler / Kernel: synced successfully more than 1 year ago