Commits · fe8494bfc8c2914fca821d4ae994aef039be5cf1 · openeuler / raspberrypi-kernel

19 Aug, 2016 3 commits

f2fs: allow copying file range only in between regular files · fe8494bf

Committed by Chao Yu on Aug 04, 2016

Allow copying a data range only when both input files are regular
files; otherwise, deny it.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

Revert "f2fs: move i_size_write in f2fs_write_end" · 3024c9a1

Committed by Chao Yu on Aug 06, 2016

This reverts commit a2ee0a30.

When running generic/032 from the xfstests suite, the following failure
is reported:

generic/032 8s ... [failed, exit status 1] - output mismatch (see results/generic/032.out.bad)
    --- tests/generic/032.out	2015-01-11 16:52:27.643681072 +0800
    +++ results/generic/032.out.bad	2016-08-06 13:44:43.861330500 +0800
    @@ -1,5 +1,5 @@
     QA output created by 032
    -100 iterations
    -0000000 cdcd cdcd cdcd cdcd cdcd cdcd cdcd cdcd
    -*
    -0100000
    +1: [768..775]: unwritten
    +Unwritten extents found!
    ...
    (Run 'diff -u tests/generic/032.out results/generic/032.out.bad'  to see the entire diff)
Ran: generic/032
Failures: generic/032
Failed 1 of 1 tests

In write_end(), we should update the inode's i_size before unlocking the
page; otherwise, we will lose newly written data in the following race.

Thread A			Thread B
- write_end
 - unlock page
				- writepages
				 - lock_page
				  - writepage
				  if page is out-of-range of file size,
				  we will skip writing the page.
 - update i_size
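
The race above can be modelled deterministically in a few lines of C. This is only a sketch under stated assumptions: the 4096-byte page size and the helper name `sketch_page_beyond_eof` are hypothetical, and the check is a simplified stand-in for the "out-of-range" test that writepage performs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL /* assumed page size */

/* True when the page starts at or beyond i_size, i.e. writeback skips it. */
bool sketch_page_beyond_eof(uint64_t page_index, uint64_t i_size)
{
	return page_index * SKETCH_PAGE_SIZE >= i_size;
}

void sketch_race_demo(void)
{
	uint64_t old_size = 4096; /* file covers page 0 only */
	uint64_t new_size = 8192; /* size after appending page 1 */

	/* Thread B runs before Thread A updates i_size: page 1 is skipped. */
	assert(sketch_page_beyond_eof(1, old_size));
	/* With i_size updated before the unlock, page 1 is written back. */
	assert(!sketch_page_beyond_eof(1, new_size));
}
```
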
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

Revert "f2fs: use percpu_rw_semaphore" · b873b798

Committed by Jaegeuk Kim on Aug 04, 2016

LKP reported a 36.3% regression in fsmark.files_per_sec due to this patch.
I've confirmed that fxmark [1] also shows a slight regression for DWAL.

[1] https://github.com/sslab-gatech/fxmark

This reverts commit ec795418.


05 Aug, 2016 1 commit

f2fs: drop bio->bi_rw manual assignment · 1aee6b9a

Committed by Jens Axboe on Jul 27, 2016

Merge 4fc29c1a included this extra line, but it's not needed (or
useful) since we call bio_set_op_attrs() right after to properly set
the op and flags for the bio.
Signed-off-by: Jens Axboe <axboe@fb.com>

31 Jul, 2016 1 commit

qstr: constify instances in f2fs · 185de68f

Committed by Al Viro on Jul 20, 2016

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

27 Jul, 2016 1 commit

mm, memcg: use consistent gfp flags during readahead · 8a5c743e

Committed by Michal Hocko on Jul 26, 2016

Vladimir has noticed that we might declare memcg oom even during
readahead because read_pages only uses GFP_KERNEL (with mapping_gfp
restriction) while __do_page_cache_readahead uses
page_cache_alloc_readahead which adds __GFP_NORETRY to prevent from
OOMs.  This gfp mask discrepancy is really unfortunate and easily
fixable.  Drop page_cache_alloc_readahead() which only has one user and
outsource the gfp_mask logic into readahead_gfp_mask and propagate this
mask from __do_page_cache_readahead down to read_pages.

This alone would have only very limited impact as most filesystems are
implementing ->readpages and the common implementation mpage_readpages
does GFP_KERNEL (with mapping_gfp restriction) again.  We can tell it to
use readahead_gfp_mask instead as this function is called only during
readahead as well.  The same applies to read_cache_pages.

ext4 has its own ext4_mpage_readpages but the path which has pages !=
NULL can use the same gfp mask.  Btrfs, cifs, f2fs and orangefs are
doing a very similar pattern to mpage_readpages so the same can be
applied to them as well.

[akpm@linux-foundation.org: coding-style fixes]
[mhocko@suse.com: restrict gfp mask in mpage_alloc]
Link: http://lkml.kernel.org/r/20160610074223.GC32285@dhcp22.suse.cz
Link: http://lkml.kernel.org/r/1465301556-26431-1-git-send-email-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Cc: Vladimir Davydov <vdavydov@parallels.com>
Cc: Chris Mason <clm@fb.com>
Cc: Steve French <sfrench@samba.org>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Jan Kara <jack@suse.cz>
Cc: Mike Marshall <hubcap@omnibond.com>
Cc: Jaegeuk Kim <jaegeuk@kernel.org>
Cc: Changman Lee <cm224.lee@samsung.com>
Cc: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

26 Jul, 2016 1 commit

f2fs: clean up coding style and redundancy · 5302fb00

Committed by Jaegeuk Kim on Jul 22, 2016

This patch includes minor clean-ups.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

23 Jul, 2016 1 commit

f2fs: get victim segment again after new cp · fe94793e

Committed by Yunlei He on Jul 22, 2016

A previously selected segment may become free after write_checkpoint.
If we garbage collect this segment and new_curseg then happens to
reuse it, this may trigger f2fs_bug_on as below.

	panic+0x154/0x29c
	do_garbage_collect+0x15c/0xaf4
	f2fs_gc+0x2dc/0x444
	f2fs_balance_fs.part.22+0xcc/0x14c
	f2fs_balance_fs+0x28/0x34
	f2fs_map_blocks+0x5ec/0x790
	f2fs_preallocate_blocks+0xe0/0x100
	f2fs_file_write_iter+0x64/0x11c
	new_sync_write+0xac/0x11c
	vfs_write+0x144/0x1e4
	SyS_write+0x60/0xc0

Here, the sit and ssa types are presumably checked during reset_curseg. So, we
check whether the segment is stale and, if it is, select a new victim to avoid this.
Signed-off-by: Yunlei He <heyunlei@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

21 Jul, 2016 5 commits

block: get rid of bio_rw and READA · 70246286

Committed by Christoph Hellwig on Jul 19, 2016

These two are confusing leftovers of the old world order, combining
values of the REQ_OP_ and REQ_ namespaces.  For callers that don't
special case we mostly just replace bi_rw with bio_data_dir or
op_is_write, except for the few cases where a switch over the REQ_OP_
values makes more sense.  Any check for READA is replaced with an
explicit check for REQ_RAHEAD.  Also remove the READA alias for
REQ_RAHEAD.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

f2fs: handle error case with f2fs_bug_on · 6f3ec995

Committed by Jaegeuk Kim on Jul 19, 2016

For the error case, it's enough to show BUG or WARN via f2fs_bug_on.
Then, we don't need to leave the filesystem corrupted.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: avoid data race when deciding checkpoin in f2fs_sync_file · dd11a5df

Committed by Jaegeuk Kim on Jul 19, 2016

When fs utilization is almost full, f2fs_sync_file should do a checkpoint if
there is not enough space for roll-forward later (i.e. space_for_roll_forward).
However, we currently have no lock for sbi->alloc_valid_block_count, resulting
in a race condition.

In rare cases, we can get -ENOSPC when doing roll-forward, which triggers

	if (is_valid_blkaddr(sbi, dest, META_POR)) {
		if (src == NULL_ADDR) {
			err = reserve_new_block(&dn);
			f2fs_bug_on(sbi, err);
			...
		}
		...
	}
in do_recover_data.

So, this patch avoids that situation in advance.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: support an ioctl to move a range of data blocks · 4dd6f977

Committed by Jaegeuk Kim on Jul 08, 2016

This patch implements moving a range of data blocks from source file to
destination file.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: fix to report error number of f2fs_find_entry · 91246c21

Committed by Chao Yu on Jul 19, 2016

This patch makes f2fs_find_entry report the correct error number to
its caller.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

19 Jul, 2016 1 commit

f2fs: avoid memory allocation failure due to a long length · 363cad7f

Committed by Jaegeuk Kim on Jul 16, 2016

We need to avoid ENOMEM due to an unexpectedly long length.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

16 Jul, 2016 7 commits

f2fs: reset default idle interval value · dcf25fe8

Committed by Chao Yu on Jul 15, 2016

The default idle interval is 2 mins, but most of the time after the
screen shuts down there are still operations within that 2 mins interval,
and gc's sleep time is about 30 secs to 60 secs, so the GC thread has
almost no chance to do garbage collection.

To fix this, change the default idle interval from 2 mins to 5 secs.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: use blk_plug in all the possible paths · 9dfa1baf

Committed by Jaegeuk Kim on Jul 13, 2016

This patch reverts 19a5f5e2 (f2fs: drop any block plugging),
and adds blk_plug in write paths additionally.

The main reason is that blk_start_plug can be used to wake up from low-power
mode before submitting further bios.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: fix to avoid data update racing between GC and DIO · 82e0a5aa

Committed by Chao Yu on Jul 13, 2016

Data in a file can be operated on by GC and DIO simultaneously, so we
face the races below:

For write case:
Thread A				Thread B
- generic_file_direct_write
 - invalidate_inode_pages2_range
 - f2fs_direct_IO
  - do_blockdev_direct_IO
   - do_direct_IO
    - get_more_blocks
					- f2fs_gc
					 - do_garbage_collect
					  - gc_data_segment
					   - move_data_page
					    - do_write_data_page
					    migrate data block to new block address
   - dio_bio_submit
   update user data to old block address

For read case:
Thread A                                Thread B
- generic_file_direct_write
 - invalidate_inode_pages2_range
 - f2fs_direct_IO
  - do_blockdev_direct_IO
   - do_direct_IO
    - get_more_blocks
					- f2fs_balance_fs
					 - f2fs_gc
					  - do_garbage_collect
					   - gc_data_segment
					    - move_data_page
					     - do_write_data_page
					     migrate data block to new block address
					  - write_checkpoint
					   - do_checkpoint
					    - clear_prefree_segments
					     - f2fs_issue_discard
                                             discard old block address
   - dio_bio_submit
   update user buffer from obsolete block address

To fix this, DIO and GC on the same file should be mutually exclusive.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: add maximum prefree segments · 44a83499

Committed by Jaegeuk Kim on Jul 13, 2016

On 1TB storage, we need to admit 22841 prefree segments, which can consume
too many segments.
This patch caps prefree segments at 8GB in that case.
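
As a rough worked example of what an 8GB cap means in segments, the sketch below assumes a 2MB segment size (the usual f2fs default, but an assumption here) and uses hypothetical names throughout; it is not the patch's code.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_SEGMENT_BYTES     (2ULL << 20) /* assumed 2MB segment */
#define SKETCH_MAX_PREFREE_BYTES (8ULL << 30) /* 8GB cap from the message */

static_assert(SKETCH_MAX_PREFREE_BYTES % SKETCH_SEGMENT_BYTES == 0,
	      "cap must be a whole number of segments");

/* Number of segments that fit into the 8GB cap. */
uint64_t sketch_max_prefree_segments(void)
{
	return SKETCH_MAX_PREFREE_BYTES / SKETCH_SEGMENT_BYTES;
}

/* Clamp a computed prefree-segment threshold to the cap. */
uint64_t sketch_clamp_prefree_segments(uint64_t wanted)
{
	uint64_t cap = sketch_max_prefree_segments();

	return wanted > cap ? cap : wanted;
}
```

Under these assumptions the cap works out to 4096 segments, so the 22841 segments mentioned above would be clamped to 4096.
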
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: disable extent_cache for fcollapse/finsert inodes · 5f281fab

Committed by Jaegeuk Kim on Jul 12, 2016

This reduces the elapsed time to do xfstests/generic/017.

Before: 458 s
After:  390 s
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: refactor __exchange_data_block for speed up · 0a2aa8fb

Committed by Jaegeuk Kim on Jul 08, 2016

This reduces the elapsed time to do xfstests/generic/017.

Before: 715 s
After:  458 s
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: fix ERR_PTR returned by bio · 1d353eb7

Committed by Jaegeuk Kim on Jul 12, 2016

This fixes the incorrect error-pointer handling flow reported by Dan.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

09 Jul, 2016 15 commits

f2fs: avoid mark_inode_dirty · b56ab837

Committed by Jaegeuk Kim on Jun 30, 2016

Let's check inode's dirtiness before calling mark_inode_dirty.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: move i_size_write in f2fs_write_end · a2ee0a30

Committed by Jaegeuk Kim on Jul 07, 2016

We don't need to do i_size_write under page lock.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: fix to avoid redundant discard during fstrim · c24a0fd6

Committed by Chao Yu on Jul 07, 2016

With the test steps below, f2fs issues redundant discards during fstrim.
The reason is that we issue discards both for prefree segments and for the
consecutive freed regions the user wants to trim, and the regions they cover
partly overlap. Change this so that no discards are issued for prefree
segments in the trimmed range.

1. mount -t f2fs -o discard /dev/zram0 /mnt/f2fs
2. fstrim -o 0 -l 3221225472 -m 2097152 -v /mnt/f2fs/
3. dd if=/dev/zero  of=/mnt/f2fs/a bs=2M count=1
4. dd if=/dev/zero  of=/mnt/f2fs/b bs=1M count=1
5. sync
6. rm /mnt/f2fs/a /mnt/f2fs/b
7. fstrim -o 0 -l 3221225472 -m 2097152 -v /mnt/f2fs/

Before:
<...>-5428  [001] ...1  9511.052125: f2fs_issue_discard: dev = (251,0), blkstart = 0x2200, blklen = 0x200
<...>-5428  [001] ...1  9511.052787: f2fs_issue_discard: dev = (251,0), blkstart = 0x2200, blklen = 0x300

After:
<...>-6764  [000] ...1  9720.382504: f2fs_issue_discard: dev = (251,0), blkstart = 0x2200, blklen = 0x300
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: avoid mismatching block range for discard · c7b41e16

Committed by Yunlei He on Jul 07, 2016

This patch skips discarding a block range that is smaller than trim_minlen
and cannot be merged with a neighbour.
Signed-off-by: Yunlei He <heyunlei@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: fix incorrect f_bfree calculation in ->statfs · 3e6d0b4d

Committed by Chao Yu on Jul 06, 2016

As the manual describes, f_bfree indicates the total free blocks in the fs.
In f2fs, it includes two parts: visible free blocks and over-provision
blocks. This patch corrects the calculation.

fsblkcnt_t   f_bfree;   /* free blocks in fs */
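
The two-part sum described above can be written out as a small sketch. The function and parameter names are hypothetical and do not correspond to the f2fs source; only the "visible free blocks plus over-provision blocks" relation is taken from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: total free blocks reported as f_bfree, i.e. the
 * free blocks visible to users plus the over-provision blocks.
 */
uint64_t sketch_f_bfree(uint64_t visible_free_blocks, uint64_t ovp_blocks)
{
	return visible_free_blocks + ovp_blocks;
}

/* Usage with made-up block counts. */
void sketch_f_bfree_demo(void)
{
	assert(sketch_f_bfree(1000, 50) == 1050);
}
```
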
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: use percpu_rw_semaphore · ec795418

Committed by Jaegeuk Kim on Jun 30, 2016

This patch replaces rw_semaphore with percpu_rw_semaphore for:
sbi->cp_rwsem
nm_i->nat_tree_lock
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: skip to check the block address of node page · 3bdad3c7

Committed by Jaegeuk Kim on Jun 30, 2016

If the node page is up-to-date, it should be alive.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: shrink critical region in spin_lock · 2555a2d5

Committed by Jaegeuk Kim on Jun 30, 2016

This patch shrinks the critical region in spin_lock.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: call SetPageUptodate if needed · 237c0790

Committed by Jaegeuk Kim on Jun 30, 2016

SetPageUptodate() issues a memory barrier, resulting in performance degradation.
Let's avoid that.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: introduce f2fs_set_page_dirty_nobuffer · fe76b796

Committed by Jaegeuk Kim on Jun 30, 2016

This patch adds f2fs_set_page_dirty_nobuffer() copied from __set_page_dirty_buffer.
When appending 4KB blocks in f2fs on pmem with multiple cores, this improves the
overall performance.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: remove unnecessary goto statement · a0995af6

Committed by Tiezhu Yang on Jun 28, 2016

When base_addr is NULL, there is no need to call kzfree,
it should return -ENOMEM directly. Additionally, it is
better to initialize variable 'error' with 0.
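
The pattern described above looks roughly like the following userspace sketch. It is an illustration under assumptions, not the patched function: `malloc`/`free` stand in for the kernel allocator and kzfree, and `sketch_use_buffer` is a hypothetical name.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical function showing the early return instead of a goto. */
int sketch_use_buffer(size_t size)
{
	int error = 0; /* initialized, as the message recommends */
	unsigned char *base_addr = malloc(size);

	if (!base_addr)
		return -ENOMEM; /* nothing was allocated, so nothing to free */

	memset(base_addr, 0, size); /* placeholder for the real work */

	free(base_addr);
	return error;
}
```
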
Signed-off-by: Tiezhu Yang <kernelpatch@126.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: add nodiscard mount option · 64058be9

Committed by Chao Yu on Jul 03, 2016

This patch adds 'nodiscard' mount option.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: fix to redirty page if fail to gc data page · 72e1c797

Committed by Chao Yu on Jul 03, 2016

If we fail to move data page during foreground GC, we should give another
chance to writeback that page which was set dirty previously by writer.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: fix to detect truncation prior rather than EIO during read · 1563ac75

Committed by Chao Yu on Jul 03, 2016

In a synchronous read, after sending out the read request, the reader
tries to lock the page in order to wait for the device to finish the read
and unlock the page. Meanwhile, a truncater can race with the reader. So,
after the reader gets the page lock, it should first check the page's
mapping to detect whether someone has truncated the page in advance; if
so, the reader has the chance to retry. Otherwise, the read can fail
because of the earlier condition check.
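
The ordering of the two checks can be modelled with a small, self-contained sketch. Everything here is hypothetical (the `sketch_page` struct, the result enum, the function name); it only illustrates checking for truncation before reporting an I/O error once the page lock is held.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-in for a page: which mapping owns it, and its state. */
struct sketch_page {
	const void *mapping;
	bool uptodate;
};

enum sketch_read_result {
	SKETCH_READ_OK,
	SKETCH_READ_RETRY, /* page was truncated, caller should retry */
	SKETCH_READ_EIO    /* read really failed */
};

/* Called with the page locked: detect truncation before reporting EIO. */
enum sketch_read_result
sketch_check_locked_page(const struct sketch_page *page,
			 const void *expected_mapping)
{
	if (page->mapping != expected_mapping)
		return SKETCH_READ_RETRY;
	if (!page->uptodate)
		return SKETCH_READ_EIO;
	return SKETCH_READ_OK;
}

void sketch_check_locked_page_demo(void)
{
	int mapping; /* any object works as a stand-in for the mapping */
	struct sketch_page truncated = { NULL, false };
	struct sketch_page failed = { &mapping, false };
	struct sketch_page good = { &mapping, true };

	/* A truncated page is retried even though it is not uptodate. */
	assert(sketch_check_locked_page(&truncated, &mapping) == SKETCH_READ_RETRY);
	assert(sketch_check_locked_page(&failed, &mapping) == SKETCH_READ_EIO);
	assert(sketch_check_locked_page(&good, &mapping) == SKETCH_READ_OK);
}
```
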
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: fix to avoid reading out encrypted data in page cache · 78682f79

Committed by Chao Yu on Jul 03, 2016

For an encrypted inode, if the user overwrites data of the inode, f2fs reads
the encrypted data into the page cache and then decrypts it.

However, a reader can race with the overwriter and see encrypted data that
the overwriter has not decrypted yet. Fix this by moving the decryption
work to the background and keeping the page non-uptodate until the data is
decrypted.

Thread A				Thread B
- f2fs_file_write_iter
 - __generic_file_write_iter
  - generic_perform_write
   - f2fs_write_begin
    - f2fs_submit_page_bio
					- generic_file_read_iter
					 - do_generic_file_read
					  - lock_page_killable
					  - unlock_page
					  - copy_page_to_iter
					  hit the encrypted data in updated page
    - lock_page
    - fscrypt_decrypt_page
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

07 Jul, 2016 4 commits

f2fs: avoid latency-critical readahead of node pages · ac6f1999

Committed by Jaegeuk Kim on Jun 16, 2016

f2fs_map_blocks is performance-critical, so let's avoid any latency from
reading ahead node pages.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: avoid writing node/metapages during writes · 2c237eba

Committed by Jaegeuk Kim on Jun 16, 2016

Let's keep more node/meta pages in run time.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: produce more nids and reduce readahead nats · ad4edb83

Committed by Jaegeuk Kim on Jun 16, 2016

Readahead nat pages are likely to be reclaimed quickly, so it is better
to gather more free nids in advance.

Also, let's keep as many free nids as possible.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: detect host-managed SMR by feature flag · 52763a4b

Committed by Jaegeuk Kim on Jun 13, 2016

If mkfs.f2fs gives a feature flag for host-managed SMR, we can set mode=lfs
by default.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>