Commits · 939afa943c5290a3b92f01612a792af17bc98115 · openanolis / cloud-kernel

23 Feb, 2017 1 commit

f2fs: return fs_trim if there is no candidate · 25290fa5

Committed by Jaegeuk Kim on Dec 29, 2016

If there is no candidate to submit a discard command for during f2fs_trim_fs,
let's return without a checkpoint.

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

29 Jan, 2017 3 commits

f2fs: relax async discard commands more · 4e6a8d9b

Committed by Jaegeuk Kim on Dec 29, 2016

This patch relaxes async discard commands to avoid waiting for their end_io
during checkpoint.
Instead of waiting for them during checkpoint, the wait is done when they are
actually reused.

Test on initial partition of nvme drive.

 # time fstrim /mnt/test

Before : 6.158s
After : 4.822s

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: show the max number of atomic operations · 26a28a0c

Committed by Jaegeuk Kim on Dec 28, 2016

This patch shows the max number of atomic operations that are being conducted
concurrently.

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: support IO alignment for DATA and NODE writes · 0a595eba

Committed by Jaegeuk Kim on Dec 14, 2016

This patch implements IO alignment by filling dummy blocks in DATA and NODE
write bios. If we can guarantee, for example, 32KB or 64KB for such IOs, we
can eliminate the underlying dummy page problem, in which the FTL writes dummy
pages in order to close partially written MLC or TLC pages.

Note that,
 - it requires "-o mode=lfs".
 - IO size should be a power of 2 and must not exceed BIO_MAX_PAGES (256).
 - read IO is still 4KB.
 - do checkpoint at fsync, if dummy NODE page was written.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
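
The alignment rule in the commit above (power-of-two IO size, at most BIO_MAX_PAGES, padded with dummy blocks) can be illustrated with a small userspace model. This is only a sketch, not the kernel code: the function names are hypothetical, and it assumes the IO size is counted in 4KB blocks.

```python
BLOCK_SIZE = 4096     # f2fs block size in bytes (the "4KB" mentioned above)
BIO_MAX_PAGES = 256   # upper bound quoted in the commit message


def is_valid_io_size(io_blocks: int) -> bool:
    """True if io_blocks is a power of two no larger than BIO_MAX_PAGES."""
    return 0 < io_blocks <= BIO_MAX_PAGES and (io_blocks & (io_blocks - 1)) == 0


def dummy_blocks_needed(nr_blocks: int, io_blocks: int) -> int:
    """Dummy blocks to append so a write becomes a multiple of io_blocks.

    Hypothetical helper: it models only the arithmetic, not bio handling.
    """
    if not is_valid_io_size(io_blocks):
        raise ValueError("IO size must be a power of two and <= BIO_MAX_PAGES")
    return (-nr_blocks) % io_blocks


if __name__ == "__main__":
    io_blocks = 32 * 1024 // BLOCK_SIZE  # 32KB alignment -> 8 blocks
    print(dummy_blocks_needed(5, io_blocks))
```

With 32KB alignment, a 5-block write would need 3 dummy blocks, while a write that is already a multiple of 8 blocks needs none.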

12 Dec, 2016 1 commit

fscrypto: move ioctl processing more fully into common code · db717d8e

Committed by Eric Biggers on Nov 26, 2016

Multiple bugs were recently fixed in the "set encryption policy" ioctl.
To make it clear that fscrypt_process_policy() and fscrypt_get_policy()
implement ioctls and therefore their implementations must take standard
security and correctness precautions, rename them to
fscrypt_ioctl_set_policy() and fscrypt_ioctl_get_policy(). Make the
latter take in a struct file * to make it consistent with the former.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>

08 Dec, 2016 1 commit

f2fs: fix to access nullified flush_cmd_control pointer · 5eba8c5d

Committed by Jaegeuk Kim on Dec 07, 2016

f2fs_sync_file()             remount_ro
 - f2fs_readonly
                               - destroy_flush_cmd_control
 - f2fs_issue_flush
   - no fcc pointer!

So, this patch doesn't free fcc in this case, but just stops its kernel thread,
which sends flush commands.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

06 Dec, 2016 1 commit

Revert "f2fs: use percpu_counter for # of dirty pages in inode" · 204706c7

Committed by Jaegeuk Kim on Dec 02, 2016

This reverts commit 1beba1b3.

The percpu_counter doesn't provide atomicity on a single core and consumes more
DRAM. That incurs fs_mark test failure due to ENOMEM.

Cc: stable@vger.kernel.org # 4.7+
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

30 Nov, 2016 1 commit

f2fs: do not activate auto_recovery for fallocated i_size · 26787236

Committed by Jaegeuk Kim on Nov 28, 2016

If a file needs to keep its i_size by fallocate, we need to turn off auto
recovery during roll-forward recovery.

This will resolve the below scenario.

1. xfs_io -f /mnt/f2fs/file -c "pwrite 0 4096" -c "fsync"
2. xfs_io -f /mnt/f2fs/file -c "falloc -k 4096 4096" -c "fsync"
3. md5sum /mnt/f2fs/file;
4. godown /mnt/f2fs/
5. umount /mnt/f2fs/
6. mount -t f2fs /dev/sdx /mnt/f2fs
7. md5sum /mnt/f2fs/file

Reported-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

29 Nov, 2016 1 commit

f2fs: fix to determine start_cp_addr by sbi->cur_cp_pack · 8508e44a

Committed by Jaegeuk Kim on Nov 24, 2016

We don't guarantee cp_addr is fixed by cp_version.
This is to sync with f2fs-tools.

Cc: stable@vger.kernel.org
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

26 Nov, 2016 8 commits

f2fs: fix wrong AUTO_RECOVER condition · 97dd26ad

Committed by Jaegeuk Kim on Nov 16, 2016

If i_size is not aligned to f2fs's block size, we should not skip the inode
update during fsync.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: fix fdatasync · 281518c6

Committed by Chao Yu on Nov 17, 2016

In the two cases below, we can't guarantee data consistency:

a)
1. xfs_io "pwrite 0 4195328" "fsync"
2. xfs_io "pwrite 4195328 1024" "fdatasync"
3. godown
4. umount & mount
--> isize we updated before fdatasync won't be recovered

b)
1. xfs_io "pwrite -S 0xcc 0 4202496" "fsync"
2. xfs_io "fpunch 4194304 4096" "fdatasync"
3. godown
4. umount & mount
--> dnode we punched before fdatasync won't be recovered

The reason is that fdatasync normally isn't aware of metadata modifications
in the file, e.g. isize changing or dnode updating, so in ->fsync we skip
flushing node pages in the above cases, which results in fdatasynced data
being lost during recovery.

We have now introduced a global DIRTY_META list in sbi for tracking dirty
inodes selectively, so in fdatasync we can choose to flush nodes depending on
the dirty state of the current inode in the list.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: fix to account total free nid correctly · 04d47e67

Committed by Chao Yu on Nov 17, 2016

Thread A		Thread B		Thread C
- f2fs_create
 - f2fs_new_inode
  - f2fs_lock_op
   - alloc_nid
    alloc last nid
  - f2fs_unlock_op
			- f2fs_create
			 - f2fs_new_inode
			  - f2fs_lock_op
			   - alloc_nid
			    as node count still not
			    be increased, we will
			    loop in alloc_nid
						- f2fs_write_node_pages
						 - f2fs_balance_fs_bg
						  - f2fs_sync_fs
						   - write_checkpoint
						    - block_operations
						     - f2fs_lock_all
 - f2fs_lock_op

While creating a new inode, we do not allocate and account the nid atomically,
so when there are almost no free nids left, we may encounter a dead loop
like the above stack.

In order to avoid that, reuse nm_i::available_nids for accounting free nids
and make nid allocation and counting atomic during node creation.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
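
The idea of allocating a nid and updating the free-nid count in one atomic step can be modelled in userspace as follows. This is a sketch under stated assumptions, not the f2fs implementation: the class `AtomicNidAllocator` and its methods are hypothetical, and a Python lock stands in for whatever locking the kernel uses.

```python
import threading


class AtomicNidAllocator:
    """Toy model: the free-nid count changes under the same lock as allocation."""

    def __init__(self, free_nids):
        self._lock = threading.Lock()
        self._free = list(free_nids)
        # Mirrors the role of nm_i::available_nids described in the commit.
        self.available_nids = len(self._free)

    def alloc_nid(self):
        """Return a nid, or None when none are available (no retry loop)."""
        with self._lock:
            if self.available_nids == 0:
                return None
            self.available_nids -= 1
            return self._free.pop()

    def alloc_nid_failed(self, nid):
        """Give a nid back and restore the count in the same critical section."""
        with self._lock:
            self._free.append(nid)
            self.available_nids += 1


if __name__ == "__main__":
    alloc = AtomicNidAllocator([3, 4])
    print(alloc.alloc_nid(), alloc.alloc_nid(), alloc.alloc_nid())
```

Because the count is decremented in the same critical section that hands out the nid, a second allocator sees an up-to-date count and can fail fast instead of spinning while it waits for a counter that has not been updated yet.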

f2fs: don't wait writeback for datas during checkpoint · 36951b38

Committed by Chao Yu on Nov 16, 2016

Normally, while committing a checkpoint, we wait for all pages to be written
back no matter whether a page is data or metadata, so in a scenario where
lots of data IO is submitted together with metadata, we may suffer long
latency waiting for writeback during checkpoint.

Indeed, we only care about persistence for pages with metadata, not pages
with data, as file system consistency is only related to metadata. So in
order to avoid long latency in the above scenario, let's recognize and
reference metadata in submitted IOs, and wait for writeback only for
metadata.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: avoid BG_GC in f2fs_balance_fs · 7702bdbe

Committed by Jaegeuk Kim on Nov 14, 2016

If many threads hit has_not_enough_free_secs() in f2fs_balance_fs() at the same
time, all the threads would do FG_GC or BG_GC.
In this critical path, we don't need to do BG_GC at all.
Let's avoid that.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: use err for f2fs_preallocate_blocks · a7de6086

Committed by Jaegeuk Kim on Nov 11, 2016

This patch has no functional change.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: support multiple devices · 3c62be17

Committed by Jaegeuk Kim on Oct 06, 2016

This patch implements multiple devices support for f2fs.
Given multiple devices by mkfs.f2fs, f2fs shows them entirely as one big
volume under one f2fs instance.

Internal block management is very simple, but we will modify block allocation
and background GC policy to boost IO speed by exploiting the devices according
to each device's speed.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: revert segment allocation for direct IO · 6ae1be13

Committed by Jaegeuk Kim on Nov 11, 2016

Now we don't need to be too careful about storage alignment for dio, since
its speed becomes quite fast and we'd better avoid any misalignment first.

Revert: 38aa0889 (f2fs: align direct_io'ed data to section)

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

24 Nov, 2016 8 commits

f2fs: Cache zoned block devices zone type · 178053e2

Committed by Damien Le Moal on Oct 28, 2016

With the zoned block device feature enabled, section discard
needs to do a zone reset for sections contained in sequential
zones, and a regular discard (if supported) for sections
stored in conventional zones. Avoid the need for a costly
report zones to obtain a section zone type when discarding it
by caching the types of the device zones in the super block
information. This cache is initialized at mount time for mounts
with the zoned block device feature enabled.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
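
The caching scheme described above (query zone types once at mount, then choose between zone reset and regular discard from the cache) can be modelled roughly as below. This is an illustrative sketch only: `ZoneType`, `ZoneTypeCache`, and the string return values are hypothetical names, and the callable passed in stands in for the costly report-zones query.

```python
from enum import Enum


class ZoneType(Enum):
    CONVENTIONAL = 1
    SEQUENTIAL = 2


class ZoneTypeCache:
    """Toy model of a per-superblock zone type cache filled at mount time."""

    def __init__(self, report_zones):
        # report_zones is called exactly once, at "mount"; later lookups
        # only touch the cached list.
        self._types = list(report_zones())

    def discard_action(self, zone_index, discard_supported=True):
        """Pick the operation used to discard a section in the given zone."""
        if self._types[zone_index] is ZoneType.SEQUENTIAL:
            return "reset_zone"
        return "discard" if discard_supported else "none"


if __name__ == "__main__":
    cache = ZoneTypeCache(lambda: [ZoneType.CONVENTIONAL, ZoneType.SEQUENTIAL])
    print(cache.discard_action(0), cache.discard_action(1))
```

The point of the design is that the expensive query cost is paid once per mount rather than once per discarded section.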

f2fs: Always enable discard for zoned blocks devices · 96ba2dec

Committed by Damien Le Moal on Oct 28, 2016

Zone write pointer reset acts as discard for zoned block
devices. So if the zoned block device feature is enabled,
always declare that discard is enabled, even if the device
does not actually support the command.
For the same reason, prevent the use of the "nodiscard" mount
option.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: Use generic zoned block device terminology · 0bfd7a09

Committed by Damien Le Moal on Oct 28, 2016

SMR stands for "Shingled Magnetic Recording" which makes sense
only for hard disk drives (spinning rust). The ZBC/ZAC standards
enable management of SMR disks, but solid state drives may also
support those standards. So rename the HMSMR feature to BLKZONED
to avoid HDD-centric terminology. For the same reason, rename
f2fs_sb_mounted_hmsmr to f2fs_sb_mounted_blkzoned.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: report error of f2fs_fill_dentries · ed6bd4b1

Committed by Chao Yu on Oct 29, 2016

Report the error of f2fs_fill_dentries to ->iterate_shared; otherwise, when
an error occurs, the user may list only part of the dirents in the target
directory without any hint.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: remove percpu_count due to performance regression · 35782b23

Committed by Jaegeuk Kim on Oct 20, 2016

This patch removes percpu_count usage due to performance regression in iozone.

Fixes: 523be8a6 ("f2fs: use percpu_counter for page counters")
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: keep dirty inodes selectively for checkpoint · 7c45729a

Committed by Jaegeuk Kim on Oct 14, 2016

This is to avoid a no-free-segment bug during checkpoint caused by a large
number of dirty inodes.

The case was reported by Chao like this.
1. mount with lazytime option
2. fill 4k file until disk is full
3. sync filesystem
4. read all files in the image
5. umount

In this case, we actually don't need to flush dirty inodes to inode pages
during checkpoint.

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: don't interrupt free nids building during nid allocation · 3a2ad567

Committed by Chao Yu on Oct 11, 2016

Let build_free_nids support sync/async methods. In the allocation flow of
nids, we use the synchronous method, so that we can avoid looping in alloc_nid
when free memory is low; in unblock_operations and f2fs_balance_fs_bg we use
the asynchronous method, where a low memory condition can interrupt us.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: split free nid list · b8559dc2

Committed by Chao Yu on Oct 12, 2016

During free nid allocation, in order to do preallocation, we tag a free nid
entry as allocated but still leave it in the free nid list, so other
allocators that want to grab free nids need to traverse the free nid list to
look one up. This becomes an overhead when multiple threads allocate free
nids intensively.

This patch splits the free nid list into two lists: {free,alloc}_nid_list, to
keep free nids and preallocated free nids separately. After that, the
traversal latency is gone. It also splits nid_cnt for separate statistics.

Additionally, introduce __insert_nid_to_list and __remove_nid_from_list for
cleanup.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
[Jaegeuk Kim: modify f2fs_bug_on to avoid needless branches]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
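
The two-list split described above can be sketched in userspace as follows. This is a simplified model, not the kernel data structure: the class `SplitNidLists` is hypothetical, ordered dictionaries stand in for the kernel's lists, and locking is omitted.

```python
from collections import OrderedDict


class SplitNidLists:
    """Toy model: free nids and preallocated nids live in separate lists."""

    def __init__(self, nids):
        self.free_nid_list = OrderedDict((nid, None) for nid in nids)
        self.alloc_nid_list = OrderedDict()

    def alloc_nid(self):
        """Take the first free nid; no scan over preallocated entries."""
        if not self.free_nid_list:
            return None
        nid, _ = self.free_nid_list.popitem(last=False)
        self.alloc_nid_list[nid] = None
        return nid

    def alloc_nid_done(self, nid):
        """The nid is now in use; drop it from the preallocated list."""
        del self.alloc_nid_list[nid]

    def alloc_nid_failed(self, nid):
        """Allocation was abandoned; move the nid back to the free list."""
        del self.alloc_nid_list[nid]
        self.free_nid_list[nid] = None

    def nid_cnt(self):
        """Separate counters, one per list."""
        return len(self.free_nid_list), len(self.alloc_nid_list)


if __name__ == "__main__":
    lists = SplitNidLists([10, 11, 12])
    nid = lists.alloc_nid()
    print(nid, lists.nid_cnt())
```

Since every entry in the free list is guaranteed to be unallocated, grabbing one is a constant-time operation rather than a scan that has to skip over entries tagged as preallocated.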

28 Oct, 2016 1 commit

block: better op and flags encoding · ef295ecf

Committed by Christoph Hellwig on Oct 28, 2016

Now that we don't need the common flags to overflow outside the range
of a 32-bit type we can encode them the same way for both the bio and
request fields.  This in addition allows us to place the operation
first (and make some room for more ops while we're at it) and to
stop having to shift around the operation values.

In addition this allows passing around only one value in the block layer
instead of two (and eventually also in the file systems, but we can do
that later) and thus clean up a lot of code.

Last but not least this allows decreasing the size of the cmd_flags
field in struct request to 32-bits.  Various functions passing this
value could also be updated, but I'd like to avoid the churn for now.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>

01 Oct, 2016 6 commits

f2fs: support checkpoint error injection · 0f348028

Committed by Chao Yu on Sep 26, 2016

This patch adds support for checkpoint error injection in f2fs for testing
fatal error tolerance. It is useful because f2fs itself can simulate an
abnormal power-off, instead of applications having to call the godown ioctl.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: support configuring fault injection per superblock · 1ecc0c5c

Committed by Chao Yu on Sep 23, 2016

Previously, we only supported global fault injection configuration, so
when we configured the type/rate of fault injection through sysfs or a mount
option, it influenced all f2fs partitions in use.

This does not make sense: it is inconvenient if a developer wants to test
separate partitions with different fault injection rates/types
simultaneously, and it is not possible to enable fault injection in one
partition while disabling it in another.

From now on, we move the global configuration of fault injection in the
module into per-superblock state, so injection testing can be more flexible.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
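
The difference between a module-wide and a per-superblock fault injection configuration can be illustrated with a small model. This is a sketch under assumptions, not the f2fs code: the classes `ToyFaultInfo` and `ToySuperBlock`, the fault type constants, and the "fire every Nth eligible call" policy are all hypothetical choices made for illustration.

```python
FAULT_KMALLOC = 0   # hypothetical fault type indices, for illustration only
FAULT_IO = 1


class ToyFaultInfo:
    """Per-instance injection state: a rate, a type bitmask, and a counter."""

    def __init__(self, rate=0, type_mask=0):
        self.inject_rate = rate      # 0 disables injection
        self.inject_type = type_mask
        self.inject_ops = 0

    def time_to_inject(self, fault_type):
        """Return True on every inject_rate-th eligible call."""
        if self.inject_rate == 0 or not (self.inject_type & (1 << fault_type)):
            return False
        self.inject_ops += 1
        if self.inject_ops >= self.inject_rate:
            self.inject_ops = 0
            return True
        return False


class ToySuperBlock:
    """Each mounted instance carries its own fault configuration."""

    def __init__(self, name, rate=0, type_mask=0):
        self.name = name
        self.fault_info = ToyFaultInfo(rate, type_mask)


if __name__ == "__main__":
    sb_a = ToySuperBlock("a", rate=2, type_mask=1 << FAULT_KMALLOC)
    sb_b = ToySuperBlock("b")  # injection disabled on this instance
    print([sb_a.fault_info.time_to_inject(FAULT_KMALLOC) for _ in range(4)])
    print([sb_b.fault_info.time_to_inject(FAULT_KMALLOC) for _ in range(4)])
```

Because the state hangs off each superblock object, one instance can inject faults at its own rate while another stays untouched, which is what a single global configuration cannot express.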

f2fs: add customized migrate_page callback · 5b7a487c

Committed by Weichao Guo on Sep 20, 2016

This patch improves the migration of dirty pages and allows migrating atomic
written pages that F2FS uses in Page Cache. Instead of the fallback releasing
page path, it provides better performance for memory compaction, CMA and other
users of memory page migrating. For dirty pages, there is no need to write back
first when migrating. For an atomic written page before committing, we can
migrate the page and update the related 'inmem_pages' list at the same time.

Signed-off-by: Weichao Guo <guoweichao@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
[Jaegeuk Kim: fix some coding style]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: introduce cp_lock to protect updating of ckpt_flags · aaec2b1d

Committed by Chao Yu on Sep 20, 2016

This patch introduces a spinlock to protect updates of the ckpt_flags
field in struct f2fs_checkpoint, which avoids incorrect updates under race
conditions.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
[Jaegeuk Kim: add __is_set_ckpt_flags likewise __set_ckpt_flags]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: fix to avoid race condition when updating sbi flag · fadb2fb8

Committed by Chao Yu on Sep 20, 2016

Make updating of the sbi flag atomic by using {test,set,clear}_bit;
otherwise, in a concurrent scenario, the flag could be updated incorrectly.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: use crc and cp version to determine roll-forward recovery · a468f0ef

Committed by Jaegeuk Kim on Sep 19, 2016

Previously, we used cp_version only to detect recoverable dnodes.
In order to avoid the same garbage cp_version, we needed to truncate the next
dnode during checkpoint, resulting in an additional discard or data write.
If we can distinguish this by using crc in addition to cp_version, we can
remove this overhead.

There is a backward compatibility concern, since this changes the node_footer
layout. So, this patch introduces a new checkpoint flag, CP_CRC_RECOVERY_FLAG,
to detect the new layout. The new layout is activated only when this flag is
set.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
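
One way to picture "crc in addition to cp_version" is to fold the checkpoint CRC into the version value that a node footer is compared against. The commit message does not spell out the layout, so the sketch below is an assumption: it places a 32-bit CRC in the upper half of a 64-bit version, and the function names are hypothetical.

```python
MASK64 = (1 << 64) - 1


def expected_footer_version(cp_version, crc, crc_recovery):
    """Version value a recoverable dnode footer is expected to carry.

    Assumed layout: with the CRC flag set, the 32-bit checkpoint CRC
    occupies the upper 32 bits; without it, cp_version is used alone.
    """
    version = cp_version & MASK64
    if crc_recovery:
        version |= (crc & 0xFFFFFFFF) << 32
    return version & MASK64


def is_recoverable_dnode(footer_version, cp_version, crc, crc_recovery):
    """True if the footer matches the current checkpoint's identity."""
    return footer_version == expected_footer_version(cp_version, crc, crc_recovery)


if __name__ == "__main__":
    print(hex(expected_footer_version(5, 0xDEADBEEF, True)))
```

Under this model, a stale dnode that happens to carry the same small cp_version no longer matches once the CRC is part of the comparison, which is why the extra truncation during checkpoint becomes unnecessary. When the flag is clear, the comparison falls back to cp_version alone, matching the old layout.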

23 Sep, 2016 3 commits

f2fs: show dirty inode number · 5bc994a0

Committed by Chao Yu on Sep 18, 2016

This patch enables showing the number of dirty inodes in procfs.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: support IO error injection · 8b038c70

Committed by Chao Yu on Sep 18, 2016

This patch adds support for IO error injection for testing the IO error
tolerance of f2fs.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: make f2fs_filetype_table static · ebfa7322

Committed by Chao Yu on Sep 18, 2016

There are no more users of f2fs_filetype_table outside of dir.c, so make it
static.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

14 Sep, 2016 1 commit

f2fs: avoid ENOMEM during roll-forward recovery · e8ea9b3d

Committed by Jaegeuk Kim on Sep 09, 2016

This patch gives roll-forward recovery another chance when it hits -ENOMEM.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

08 Sep, 2016 4 commits

f2fs: add roll-forward recovery process for encrypted dentry · e7ba108a

Committed by Shuoran Liu on Aug 29, 2016

Add a roll-forward recovery process for encrypted dentries, so the first fsync
issued to an encrypted file does not need to write a checkpoint.

This improves the performance of the following test with thousands of small
files: open -> write -> fsync -> close

Signed-off-by: Shuoran Liu <liushuoran@huawei.com>
Acked-by: Chao Yu <yuchao0@huawei.com>
[Jaegeuk Kim: modify kernel message to show encrypted names]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: fix lost xattrs of directories · bbf156f7

Committed by Jaegeuk Kim on Aug 29, 2016

This patch enhances the xattr consistency of dirs against sudden power-cuts.

Possible scenario would be:
1. dir->setxattr used by per-file encryption
2. file->setxattr goes into inline_xattr
3. file->fsync

In that case, we should do checkpoint for #1.
Otherwise we'd lose dir's key information for the file given #2.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: support async discard · 275b66b0

Committed by Chao Yu on Aug 29, 2016

Like most filesystems, f2fs issues discard commands synchronously, so
when a user triggers fstrim through ioctl, multiple discard commands are
issued serially in sync mode, which results in poor performance.

In this patch we try to support async discard, so that all discard
commands can be issued and then waited on for endio in a batch to improve
performance.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
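
The contrast between issuing discards one at a time and issuing them all before waiting can be sketched with a thread pool. This is a userspace analogy only: `issue_discard` is a hypothetical stand-in that returns immediately, and the functions below do not talk to any block device.

```python
from concurrent.futures import ThreadPoolExecutor, wait


def issue_discard(start, length):
    """Stand-in for one discard command; returns the range it 'discarded'."""
    return (start, length)


def trim_sync(ranges):
    """Serial model: each command completes before the next one is issued."""
    return [issue_discard(start, length) for start, length in ranges]


def trim_async(ranges, max_workers=4):
    """Batched model: submit every command first, then wait for all of them."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        pending = [pool.submit(issue_discard, start, length)
                   for start, length in ranges]
        wait(pending)  # one wait for the whole batch
        return [future.result() for future in pending]


if __name__ == "__main__":
    ranges = [(0, 512), (1024, 256), (4096, 2048)]
    print(trim_async(ranges) == trim_sync(ranges))
```

Both variants discard the same ranges; the batched one simply lets the commands overlap, so the total wait is bounded by the slowest command rather than by the sum of all of them.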

f2fs: fix to do security initialization of encrypted inode with original filename · 9421d570

Committed by Chao Yu on Aug 28, 2016

When creating a new inode, security_inode_init_security is called to
initialize security info related to the inode, and the filename is passed to
the security module. This helps a security module such as SELinux to know
which rule or label could be applied to the inode with the specified name.

Previously, if a new inode was created as an encrypted one, f2fs passed the
encrypted filename to the security module, which may fail the check of the
security policy belonging to the inode. To fix this issue, pass the original
unencrypted filename instead.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

openanolis / cloud-kernel: synced successfully more than 1 year ago
