Commits · 26787236b36660baf4d136281d40b5bb33a570ec · openeuler / Kernel

30 Nov, 2016 1 commit

f2fs: do not activate auto_recovery for fallocated i_size · 26787236

Committed by Jaegeuk Kim on Nov 28, 2016

If a file needs to keep its i_size after fallocate, we need to turn off auto
recovery during roll-forward recovery.

This resolves the scenario below.

1. xfs_io -f /mnt/f2fs/file -c "pwrite 0 4096" -c "fsync"
2. xfs_io -f /mnt/f2fs/file -c "falloc -k 4096 4096" -c "fsync"
3. md5sum /mnt/f2fs/file;
4. godown /mnt/f2fs/
5. umount /mnt/f2fs/
6. mount -t f2fs /dev/sdx /mnt/f2fs
7. md5sum /mnt/f2fs/file

Reported-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

29 Nov, 2016 1 commit

f2fs: fix to determine start_cp_addr by sbi->cur_cp_pack · 8508e44a

Committed by Jaegeuk Kim on Nov 24, 2016

We don't guarantee cp_addr is fixed by cp_version.
This is to sync with f2fs-tools.

Cc: stable@vger.kernel.org
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

26 Nov, 2016 8 commits

f2fs: fix wrong AUTO_RECOVER condition · 97dd26ad

Committed by Jaegeuk Kim on Nov 16, 2016

If i_size is not aligned to the f2fs block size, we should not skip the inode
update during fsync.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
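
The two AUTO_RECOVER fixes above (keep-size fallocate and unaligned i_size) can be pictured as a single predicate. The following is a minimal userspace sketch, not the kernel code: the struct, its field names, `SKETCH_BLOCK_SIZE` and `can_skip_inode_update()` are hypothetical names chosen for illustration, and a 4096-byte block size is assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_BLOCK_SIZE 4096u /* assumed f2fs block size */

/* Hypothetical per-inode state; field names are illustrative only. */
struct inode_sketch {
    bool auto_recover;       /* roll-forward may rebuild i_size by itself */
    bool keep_isize;         /* i_size was preserved by a keep-size fallocate */
    uint64_t i_size;         /* current in-memory file size */
    uint64_t last_disk_size; /* size last written to the on-disk inode */
};

/* Returns true when fsync may skip writing the inode. */
bool can_skip_inode_update(const struct inode_sketch *inode)
{
    assert(inode != NULL);

    /* Auto recovery off, or size pinned by fallocate: always write. */
    if (!inode->auto_recover || inode->keep_isize)
        return false;

    /* An unaligned size cannot be inferred from recovered blocks alone. */
    if (inode->i_size % SKETCH_BLOCK_SIZE != 0)
        return false;

    /* Nothing changed on disk, so the update can be skipped. */
    return inode->last_disk_size == inode->i_size;
}
```

Under these assumptions, a file whose size is 4196 bytes always gets its inode written on fsync, even if auto recovery is enabled.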

f2fs: fix fdatasync · 281518c6

Committed by Chao Yu on Nov 17, 2016

In the two cases below, we can't guarantee data consistency:

a)
1. xfs_io "pwrite 0 4195328" "fsync"
2. xfs_io "pwrite 4195328 1024" "fdatasync"
3. godown
4. umount & mount
--> isize we updated before fdatasync won't be recovered

b)
1. xfs_io "pwrite -S 0xcc 0 4202496" "fsync"
2. xfs_io "fpunch 4194304 4096" "fdatasync"
3. godown
4. umount & mount
--> dnode we punched before fdatasync won't be recovered

The reason is that fdatasync normally isn't aware of metadata modifications
in the file, e.g. isize changes or dnode updates, so in ->fsync we skip
flushing node pages in the above cases, and the fdatasynced updates are lost
during recovery.

We have since introduced the global DIRTY_META list in sbi for tracking
dirty inodes selectively, so in fdatasync we can choose whether to flush
nodes depending on the dirty state of the current inode in that list.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: fix to account total free nid correctly · 04d47e67

Committed by Chao Yu on Nov 17, 2016

Thread A		Thread B		Thread C
- f2fs_create
 - f2fs_new_inode
  - f2fs_lock_op
   - alloc_nid
    alloc last nid
  - f2fs_unlock_op
			- f2fs_create
			 - f2fs_new_inode
			  - f2fs_lock_op
			   - alloc_nid
			    as node count still not
			    be increased, we will
			    loop in alloc_nid
						- f2fs_write_node_pages
						 - f2fs_balance_fs_bg
						  - f2fs_sync_fs
						   - write_checkpoint
						    - block_operations
						     - f2fs_lock_all
 - f2fs_lock_op

While creating a new inode, we do not allocate and account the nid atomically,
so when there are almost no free nids left, we may hit a dead loop like the
stack above.

To avoid that, reuse nm_i::available_nids for accounting free nids, and make
nid allocation and counting atomic during node creation.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
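
The idea of allocating and accounting a nid in one atomic step can be sketched in userspace as follows. This is only an illustration under stated assumptions: `nid_manager_sketch`, `nm_init()` and `alloc_nid_sketch()` are hypothetical names, a C11 `atomic_flag` spinlock stands in for whatever lock the kernel uses, and nids are handed out sequentially for simplicity.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the node manager; not the kernel struct. */
struct nid_manager_sketch {
    atomic_flag lock;            /* models the lock guarding the counters */
    unsigned int available_nids; /* free nids that may still be handed out */
    unsigned int next_nid;       /* next nid to hand out (simplification) */
};

void nm_init(struct nid_manager_sketch *nm, unsigned int first_nid,
             unsigned int count)
{
    assert(nm != NULL);
    atomic_flag_clear(&nm->lock);
    nm->available_nids = count;
    nm->next_nid = first_nid;
}

/*
 * Check the counter and consume a nid under the same lock, so no other
 * thread can observe "a nid is still available" after the last one is gone.
 * Returns false instead of retrying when nothing is left.
 */
bool alloc_nid_sketch(struct nid_manager_sketch *nm, unsigned int *nid)
{
    bool ok = false;

    assert(nm != NULL && nid != NULL);
    while (atomic_flag_test_and_set_explicit(&nm->lock, memory_order_acquire))
        ; /* spin until the lock is free */

    if (nm->available_nids > 0) {
        nm->available_nids--;
        *nid = nm->next_nid++;
        ok = true;
    }

    atomic_flag_clear_explicit(&nm->lock, memory_order_release);
    return ok;
}
```

With the check and the decrement in one critical section, a second creator fails immediately once the last nid is taken, rather than looping while it holds the operation lock.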

f2fs: don't wait writeback for datas during checkpoint · 36951b38

Committed by Chao Yu on Nov 16, 2016

Normally, while committing a checkpoint, we wait for all pages to be written
back, no matter whether a page holds data or metadata, so in a scenario where
lots of data IO is submitted along with metadata, we may suffer long latency
waiting for writeback during checkpoint.

In fact, we only care about persistence of pages with metadata, not pages
with data, as filesystem consistency depends only on metadata. So, to avoid
the long latency in the above scenario, let's recognize and account metadata
pages in submitted IOs, and wait for writeback only on metadata.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
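
One way to picture "wait for writeback only on metadata" is to keep separate in-flight counters per page class and let the checkpoint look at only one of them. The sketch below assumes exactly that; the enum, struct and function names are hypothetical and do not come from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical page classes for in-flight writeback accounting. */
enum wb_type_sketch { WB_META = 0, WB_DATA = 1, WB_NR_TYPES };

struct wb_counters_sketch {
    unsigned long inflight[WB_NR_TYPES]; /* pages under writeback per class */
};

/* Called when a page of the given class is submitted for writeback. */
void wb_submit(struct wb_counters_sketch *c, enum wb_type_sketch t)
{
    assert(c != NULL && t < WB_NR_TYPES);
    c->inflight[t]++;
}

/* Called from the IO completion path. */
void wb_complete(struct wb_counters_sketch *c, enum wb_type_sketch t)
{
    assert(c != NULL && t < WB_NR_TYPES);
    assert(c->inflight[t] > 0);
    c->inflight[t]--;
}

/* Checkpoint only blocks while metadata pages are still in flight. */
bool checkpoint_must_wait(const struct wb_counters_sketch *c)
{
    assert(c != NULL);
    return c->inflight[WB_META] != 0;
}
```

In this model, any amount of pending data writeback leaves `checkpoint_must_wait()` false once the metadata pages have completed.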

f2fs: avoid BG_GC in f2fs_balance_fs · 7702bdbe

Committed by Jaegeuk Kim on Nov 14, 2016

If many threads hit has_not_enough_free_secs() in f2fs_balance_fs() at the same
time, all the threads would do FG_GC or BG_GC.
In this critical path, we don't need to do BG_GC at all.
Let's avoid that.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: use err for f2fs_preallocate_blocks · a7de6086

Committed by Jaegeuk Kim on Nov 11, 2016

This patch has no functional change.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: support multiple devices · 3c62be17

Committed by Jaegeuk Kim on Oct 06, 2016

This patch implements multiple devices support for f2fs.
Given multiple devices by mkfs.f2fs, f2fs shows them entirely as one big
volume under one f2fs instance.

Internal block management is very simple, but we will modify block allocation
and background GC policy to boost IO speed by exploiting them according to
each device speed.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
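
Presenting several devices as one volume implies some mapping from a volume-wide block address to a device and an offset on it. The following sketch shows one simple way to do that with a table of contiguous block ranges; the struct, its fields and `map_block_to_device()` are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one device's slice of the volume. */
struct dev_range_sketch {
    uint64_t start_blk; /* first volume block backed by this device */
    uint64_t end_blk;   /* last volume block backed by it, inclusive */
};

/*
 * Find the device that backs volume block `blk`.
 * Returns the device index and stores the device-relative block in
 * *dev_blk, or returns -1 if the block lies outside every device.
 */
int map_block_to_device(const struct dev_range_sketch *devs, size_t ndevs,
                        uint64_t blk, uint64_t *dev_blk)
{
    assert(devs != NULL && dev_blk != NULL);

    for (size_t i = 0; i < ndevs; i++) {
        if (blk >= devs[i].start_blk && blk <= devs[i].end_blk) {
            *dev_blk = blk - devs[i].start_blk;
            return (int)i;
        }
    }
    return -1;
}
```

A linear scan is enough for a handful of devices; the point is only that the upper layers keep using one flat block address space.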

f2fs: revert segment allocation for direct IO · 6ae1be13

Committed by Jaegeuk Kim on Nov 11, 2016

Now we don't need to be too careful about storage alignment for dio, since
its speed has become quite fast and we'd better avoid any misalignment first.

Revert: 38aa0889 (f2fs: align direct_io'ed data to section)
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

24 Nov, 2016 8 commits

f2fs: Cache zoned block devices zone type · 178053e2

Committed by Damien Le Moal on Oct 28, 2016

With the zoned block device feature enabled, section discard
needs to do a zone reset for sections contained in sequential
zones, and a regular discard (if supported) for sections
stored in conventional zones. Avoid the need for a costly
report zones operation to obtain a section zone type when
discarding it by caching the types of the device zones in the
super block information. This cache is initialized at mount time
for mounts with the zoned block device feature enabled.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
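
The caching described above can be sketched as a per-zone type array filled once at mount time and consulted on every discard. This is a userspace illustration only: all names are hypothetical, the `report_zone` callback stands in for the costly report zones operation, and `example_report_zone()` describes a made-up layout with two conventional zones followed by sequential ones.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

enum zone_type_sketch { ZONE_CONVENTIONAL = 0, ZONE_SEQUENTIAL = 1 };

/* Hypothetical cache kept alongside the superblock information. */
struct zone_cache_sketch {
    uint8_t *types;           /* one cached type per zone */
    size_t nr_zones;
    uint64_t blocks_per_zone;
};

/* Made-up device layout used only for the example. */
enum zone_type_sketch example_report_zone(size_t zone)
{
    return zone < 2 ? ZONE_CONVENTIONAL : ZONE_SEQUENTIAL;
}

/* "Mount time": query every zone once and remember its type. */
int zone_cache_init(struct zone_cache_sketch *zc, size_t nr_zones,
                    uint64_t blocks_per_zone,
                    enum zone_type_sketch (*report_zone)(size_t))
{
    assert(zc != NULL && report_zone != NULL);
    assert(nr_zones > 0 && blocks_per_zone > 0);

    zc->types = malloc(nr_zones);
    if (zc->types == NULL)
        return -1;
    for (size_t i = 0; i < nr_zones; i++)
        zc->types[i] = (uint8_t)report_zone(i);
    zc->nr_zones = nr_zones;
    zc->blocks_per_zone = blocks_per_zone;
    return 0;
}

/* "Discard time": a plain array lookup, no device query needed. */
enum zone_type_sketch zone_type_of_block(const struct zone_cache_sketch *zc,
                                         uint64_t blk)
{
    size_t zone;

    assert(zc != NULL && zc->types != NULL);
    zone = (size_t)(blk / zc->blocks_per_zone);
    assert(zone < zc->nr_zones);
    return (enum zone_type_sketch)zc->types[zone];
}

void zone_cache_free(struct zone_cache_sketch *zc)
{
    assert(zc != NULL);
    free(zc->types);
    zc->types = NULL;
}
```

A discard path built on this would issue a zone reset when the lookup returns `ZONE_SEQUENTIAL` and a regular discard otherwise.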

f2fs: Always enable discard for zoned blocks devices · 96ba2dec

Committed by Damien Le Moal on Oct 28, 2016

Zone write pointer reset acts as discard for zoned block
devices. So if the zoned block device feature is enabled,
always declare that discard is enabled, even if the device
does not actually support the command.
For the same reason, prevent the use of the "nodiscard" mount
option.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: Use generic zoned block device terminology · 0bfd7a09

Committed by Damien Le Moal on Oct 28, 2016

SMR stands for "Shingled Magnetic Recording" which makes sense
only for hard disk drives (spinning rust). The ZBC/ZAC standards
enable management of SMR disks, but solid state drives may also
support those standards. So rename the HMSMR feature to BLKZONED
to avoid HDD-centric terminology. For the same reason, rename
f2fs_sb_mounted_hmsmr to f2fs_sb_mounted_blkzoned.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: report error of f2fs_fill_dentries · ed6bd4b1

Committed by Chao Yu on Oct 29, 2016

Report the error of f2fs_fill_dentries to ->iterate_shared; otherwise, when
an error occurs, the user may list only part of the dirents in the target
directory without any hint.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: remove percpu_count due to performance regression · 35782b23

Committed by Jaegeuk Kim on Oct 20, 2016

This patch removes percpu_counter usage due to a performance regression in iozone.

Fixes: 523be8a6 ("f2fs: use percpu_counter for page counters")
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: keep dirty inodes selectively for checkpoint · 7c45729a

Committed by Jaegeuk Kim on Oct 14, 2016

This is to avoid a no-free-segment bug during checkpoint caused by a large
number of dirty inodes.

The case was reported by Chao as follows.
1. mount with lazytime option
2. fill 4k file until disk is full
3. sync filesystem
4. read all files in the image
5. umount

In this case, we actually don't need to flush dirty inodes to inode pages during
checkpoint.

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: don't interrupt free nids building during nid allocation · 3a2ad567

Committed by Chao Yu on Oct 11, 2016

Let build_free_nids support sync and async modes. In the nid allocation flow
we use the synchronous mode, so that we avoid looping in alloc_nid when free
memory is low; in unblock_operations and f2fs_balance_fs_bg we use the
asynchronous mode, where a low-memory condition can interrupt us.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: split free nid list · b8559dc2

Committed by Chao Yu on Oct 12, 2016

During free nid allocation, in order to do preallocation, we tag a free
nid entry as allocated but still leave it in the free nid list, so other
allocators that want to grab free nids have to traverse the free nid list
to look one up. This becomes overhead when many threads allocate free nids
intensively.

This patch splits the free nid list into two lists, {free,alloc}_nid_list, to
keep free nids and preallocated free nids separately. After that, the
traversal latency is gone. It also splits nid_cnt for separate statistics.

Additionally, introduce __insert_nid_to_list and __remove_nid_from_list for
cleanup.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
[Jaegeuk Kim: modify f2fs_bug_on to avoid needless branches]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

01 Oct, 2016 6 commits

f2fs: support checkpoint error injection · 0f348028

Committed by Chao Yu on Sep 26, 2016

This patch adds support for checkpoint error injection in f2fs for testing
fatal error tolerance. It is useful because f2fs itself can simulate an
abnormal power-off, instead of applications calling the godown ioctl.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: support configuring fault injection per superblock · 1ecc0c5c

Committed by Chao Yu on Sep 23, 2016

Previously, we only supported global fault injection configuration, so
configuring the type/rate of fault injection through sysfs or a mount
option influenced all f2fs partitions in use.

This does not make sense: it is inconvenient if a developer wants to test
separate partitions with different fault injection rates/types
simultaneously, and it is not possible to enable fault injection in one
partition while disabling it in another.

From now on, we move the global fault injection configuration in the module
into per-superblock state, so injection testing can be more flexible.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
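
Moving the configuration from module-global state into each superblock can be sketched as below. This is a simplified model under stated assumptions: the struct layouts, the `_SK` fault type names and `time_to_inject_sketch()` are hypothetical, and "inject every Nth matching operation" is just one plausible reading of a rate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical fault types; the names are illustrative only. */
enum fault_type_sketch {
    FAULT_KMALLOC_SK = 0,
    FAULT_IO_SK = 1,
    FAULT_CHECKPOINT_SK = 2
};

/* Per-instance settings that used to be shared by every mount. */
struct fault_info_sketch {
    unsigned int inject_ops;  /* matching operations seen since last fault */
    unsigned int inject_rate; /* inject every Nth operation; 0 disables */
    unsigned int inject_type; /* bitmask of enabled fault types */
};

/* Each mounted filesystem carries its own copy. */
struct sb_sketch {
    struct fault_info_sketch fault_info;
};

/* Decide whether this operation on this filesystem should fail. */
bool time_to_inject_sketch(struct sb_sketch *sb, enum fault_type_sketch type)
{
    struct fault_info_sketch *ffi;

    assert(sb != NULL);
    ffi = &sb->fault_info;

    if (ffi->inject_rate == 0)
        return false; /* injection disabled for this mount */
    if (!(ffi->inject_type & (1u << type)))
        return false; /* this fault type is not enabled here */

    if (++ffi->inject_ops >= ffi->inject_rate) {
        ffi->inject_ops = 0;
        return true;
    }
    return false;
}
```

Because the counters live in `sb_sketch`, one mount can inject IO errors at a given rate while another mount on the same machine runs with injection disabled.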

f2fs: add customized migrate_page callback · 5b7a487c

Committed by Weichao Guo on Sep 20, 2016

This patch improves the migration of dirty pages and allows migrating atomic
written pages that F2FS uses in Page Cache. Instead of the fallback releasing
page path, it provides better performance for memory compaction, CMA and other
users of memory page migrating. For dirty pages, there is no need to write back
first when migrating. For an atomic written page before committing, we can
migrate the page and update the related 'inmem_pages' list at the same time.

Signed-off-by: Weichao Guo <guoweichao@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
[Jaegeuk Kim: fix some coding style]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: introduce cp_lock to protect updating of ckpt_flags · aaec2b1d

Committed by Chao Yu on Sep 20, 2016

This patch introduces a spinlock to protect updates of the ckpt_flags
field in struct f2fs_checkpoint, avoiding incorrect updates under race
conditions.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
[Jaegeuk Kim: add __is_set_ckpt_flags likewise __set_ckpt_flags]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: fix to avoid race condition when updating sbi flag · fadb2fb8

Committed by Chao Yu on Sep 20, 2016

Make updating of the sbi flag atomic by using {test,set,clear}_bit;
otherwise, in concurrent scenarios, the flag could be updated incorrectly.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
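
The difference between a plain read-modify-write and an atomic bit operation can be modelled in userspace with C11 atomics. The sketch below is an analogue, not the kernel API: `sbi_sketch` and the three helpers are hypothetical names that mirror what {test,set,clear}_bit provide.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the superblock info flag word. */
struct sbi_sketch {
    atomic_ulong s_flag;
};

/* Atomically set one bit; concurrent updates to other bits are not lost. */
void sbi_set_flag(struct sbi_sketch *sbi, unsigned int bit)
{
    assert(sbi != NULL && bit < sizeof(unsigned long) * CHAR_BIT);
    atomic_fetch_or(&sbi->s_flag, 1UL << bit);
}

/* Atomically clear one bit. */
void sbi_clear_flag(struct sbi_sketch *sbi, unsigned int bit)
{
    assert(sbi != NULL && bit < sizeof(unsigned long) * CHAR_BIT);
    atomic_fetch_and(&sbi->s_flag, ~(1UL << bit));
}

/* Read one bit from a consistent snapshot of the flag word. */
bool sbi_test_flag(struct sbi_sketch *sbi, unsigned int bit)
{
    assert(sbi != NULL && bit < sizeof(unsigned long) * CHAR_BIT);
    return ((atomic_load(&sbi->s_flag) >> bit) & 1UL) != 0;
}
```

With a non-atomic `flag |= mask`, two threads setting different bits at the same time can overwrite each other's update; the fetch-or and fetch-and forms avoid that.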

f2fs: use crc and cp version to determine roll-forward recovery · a468f0ef

Committed by Jaegeuk Kim on Sep 19, 2016

Previously, we used cp_version only to detect recoverable dnodes.
In order to avoid same garbage cp_version, we needed to truncate the next
dnode during checkpoint, resulting in additional discard or data write.
If we can distinguish this by using crc in addition to cp_version, we can
remove this overhead.

There is a backward compatibility concern, since this changes the node_footer
layout. So, this patch introduces a new checkpoint flag, CP_CRC_RECOVERY_FLAG,
to detect the new layout. The new layout is activated only when this flag is set.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
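
A rough model of "crc in addition to cp_version" is to fold the checkpoint CRC into the value that a dnode footer must match, but only when the new-layout flag is set. The sketch below assumes the CRC occupies the upper 32 bits of a 64-bit footer field; that packing, the struct and the function names are assumptions made for illustration, not details stated in the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of the checkpoint fields that matter here. */
struct ckpt_sketch {
    uint64_t cp_version;    /* checkpoint version (low 32 bits used here) */
    uint32_t crc;           /* checksum of the checkpoint */
    bool crc_recovery_flag; /* models CP_CRC_RECOVERY_FLAG being set */
};

/* Value a dnode footer must carry to be considered for roll-forward. */
uint64_t expected_footer_version(const struct ckpt_sketch *cp)
{
    uint64_t ver;

    assert(cp != NULL);
    ver = cp->cp_version;
    if (cp->crc_recovery_flag)
        ver |= (uint64_t)cp->crc << 32; /* assumed packing */
    return ver;
}

/* Stale garbage with a matching version but wrong CRC no longer passes. */
bool is_recoverable_dnode_sketch(const struct ckpt_sketch *cp,
                                 uint64_t footer_version)
{
    return footer_version == expected_footer_version(cp);
}
```

In this model an old-layout image (flag clear) keeps matching on the version alone, which is the compatibility behaviour the flag is meant to preserve.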

23 Sep, 2016 3 commits

f2fs: show dirty inode number · 5bc994a0

Committed by Chao Yu on Sep 18, 2016

This patch enables showing the number of dirty inodes in procfs.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: support IO error injection · 8b038c70

Committed by Chao Yu on Sep 18, 2016

This patch adds support for IO error injection for testing the IO error
tolerance of f2fs.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: make f2fs_filetype_table static · ebfa7322

Committed by Chao Yu on Sep 18, 2016

There are no more users of f2fs_filetype_table outside of dir.c, so make it
static.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

14 Sep, 2016 1 commit

f2fs: avoid ENOMEM during roll-forward recovery · e8ea9b3d

Committed by Jaegeuk Kim on Sep 09, 2016

This patch gives roll-forward recovery additional chances to proceed when it
hits -ENOMEM.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

08 Sep, 2016 4 commits

f2fs: add roll-forward recovery process for encrypted dentry · e7ba108a

Committed by Shuoran Liu on Aug 29, 2016

Add roll-forward recovery process for encrypted dentry, so the first fsync
issued to an encrypted file does not need to write a checkpoint.

This improves the performance of the following test on thousands of small
files: open -> write -> fsync -> close

Signed-off-by: Shuoran Liu <liushuoran@huawei.com>
Acked-by: Chao Yu <yuchao0@huawei.com>
[Jaegeuk Kim: modify kernel message to show encrypted names]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: fix lost xattrs of directories · bbf156f7

Committed by Jaegeuk Kim on Aug 29, 2016

This patch enhances the xattr consistency of dirs against sudden power-cuts.

Possible scenario would be:
1. dir->setxattr used by per-file encryption
2. file->setxattr goes into inline_xattr
3. file->fsync

In that case, we should do checkpoint for #1.
Otherwise we'd lose dir's key information for the file given #2.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: support async discard · 275b66b0

Committed by Chao Yu on Aug 29, 2016

Like most filesystems, f2fs issues discard commands synchronously, so
when a user triggers fstrim through ioctl, multiple discard commands are
issued serially in sync mode, which performs poorly.

This patch adds support for async discard, so that all discard commands
can be issued and then waited on for endio in a batch to improve
performance.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: fix to do security initialization of encrypted inode with original filename · 9421d570

Committed by Chao Yu on Aug 28, 2016

When creating a new inode, security_inode_init_security is called to
initialize security info related to the inode, and the filename is passed to
the security module; it helps a security module such as SELinux know which
rule or label could be applied to the inode with the specified name.

Previously, if the new inode was created as an encrypted one, f2fs passed the
encrypted filename to the security module, which may fail the security
policy check for the inode. To fix this issue, pass the original
unencrypted filename instead.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

30 Aug, 2016 3 commits

f2fs: set dirty state for filesystem only when updating meta data · 7c4abcbe

Committed by Chao Yu on Aug 18, 2016

We don't guarantee integrity of user data after checkpoint, since we only
guarantee meta data integrity for consistency of the filesystem.

For that reason, we only need to mark the fs as dirty when meta data is
updated, so that we can skip writing a checkpoint in some cases where only
non-meta data is updated.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: add discard info to sys entry of f2fs status · f83a2584

Committed by Yunlei He on Aug 18, 2016

This patch adds the discard block count to the sys entry of f2fs status.

Signed-off-by: Yunlei He <heyunlei@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: reduce batch size of fstrim · 2d9e9c32

Committed by Jaegeuk Kim on Aug 11, 2016

This is to reduce the batch size of fstrim to avoid long latency.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

25 Aug, 2016 1 commit

f2fs: do not use discard_map for hard disks · 3e025740

Committed by Jaegeuk Kim on Aug 02, 2016

We don't need to keep discard_map if the disk does not support the discard command.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

19 Aug, 2016 1 commit

Revert "f2fs: use percpu_rw_semaphore" · b873b798

Committed by Jaegeuk Kim on Aug 04, 2016

LKP reported a -36.3% regression of fsmark.files_per_sec due to this patch.
I've confirmed that fxmark [1] also shows a slight regression for DWAL.

[1] https://github.com/sslab-gatech/fxmark

This reverts commit ec795418.

31 Jul, 2016 1 commit

qstr: constify instances in f2fs · 185de68f

Committed by Al Viro on Jul 20, 2016

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

21 Jul, 2016 2 commits

f2fs: avoid data race when deciding checkpoin in f2fs_sync_file · dd11a5df

Committed by Jaegeuk Kim on Jul 19, 2016

When fs utilization is almost full, f2fs_sync_file should do checkpoint if
there is not enough space for roll-forward later. (i.e. space_for_roll_forward)
However, currently we have no lock for sbi->alloc_valid_block_count, resulting
in a race condition.

In rare cases, we can get -ENOSPC when doing roll-forward, which triggers

	if (is_valid_blkaddr(sbi, dest, META_POR)) {
		if (src == NULL_ADDR) {
			err = reserve_new_block(&dn);
			f2fs_bug_on(sbi, err);
			...
		}
		...
	}
in do_recover_data.

So, this patch avoids that situation in advance.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: support an ioctl to move a range of data blocks · 4dd6f977

Committed by Jaegeuk Kim on Jul 08, 2016

This patch implements moving a range of data blocks from source file to
destination file.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

openeuler / Kernel synced successfully about 1 year ago