Commits · a0995af69554cecd55c8d2b8c4e4418b84737fd0 · openeuler / Kernel

09 Jul, 2016 2 commits

f2fs: fix to detect truncation prior rather than EIO during read · 1563ac75

Committed by Chao Yu on Jul 03, 2016

In the synchronized read path, after sending out the read request, the reader
tries to lock the page in order to wait for the device to finish the read and
unlock the page. Meanwhile, a truncater can race with the reader, so after the
reader gets the page lock, it should check the page's mapping to detect whether
someone has truncated the page in advance. If truncation was done, the reader
then has the chance to retry. Without this check, the read can fail with EIO
because of the previous condition check.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
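
The ordering described in this commit can be illustrated with a small user-space sketch. It assumes the decision is made right after the reader takes the page lock. The types `sketch_page` and `sketch_mapping` and the function `sketch_check_locked_page` are hypothetical stand-ins, not the kernel's `struct page` or any real f2fs function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel objects; not the real definitions. */
struct sketch_mapping { int id; };

struct sketch_page {
    const struct sketch_mapping *mapping; /* NULL once the page was truncated */
    bool uptodate;                        /* true once the read completed */
};

enum sketch_read_status { SKETCH_READ_OK, SKETCH_READ_RETRY, SKETCH_READ_EIO };

/*
 * Decide what the reader does after it has locked the page.
 * The truncation check comes first, so a truncated page leads to a
 * retry instead of being reported as an I/O error.
 */
static enum sketch_read_status
sketch_check_locked_page(const struct sketch_page *page,
                         const struct sketch_mapping *expected)
{
    assert(page != NULL);

    if (page->mapping != expected)
        return SKETCH_READ_RETRY;
    if (!page->uptodate)
        return SKETCH_READ_EIO;
    return SKETCH_READ_OK;
}
```

If the two checks were swapped, a truncated page that never became uptodate would be reported as an I/O error, which is the behavior the commit title says it avoids.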

f2fs: fix to avoid reading out encrypted data in page cache · 78682f79

Committed by Chao Yu on Jul 03, 2016

For encrypted inode, if user overwrites data of the inode, f2fs will read
encrypted data into page cache, and then do the decryption.

However reader can race with overwriter, and it will see encrypted data
which has not been decrypted by overwriter yet. Fix it by moving decrypting
work to background and keep page non-uptodated until data is decrypted.

Thread A				Thread B
- f2fs_file_write_iter
 - __generic_file_write_iter
  - generic_perform_write
   - f2fs_write_begin
    - f2fs_submit_page_bio
					- generic_file_read_iter
					 - do_generic_file_read
					  - lock_page_killable
					  - unlock_page
					  - copy_page_to_iter
					  hit the encrypted data in updated page
    - lock_page
    - fscrypt_decrypt_page
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

07 Jul, 2016 2 commits

f2fs: avoid latency-critical readahead of node pages · ac6f1999

Committed by Jaegeuk Kim on Jun 16, 2016

f2fs_map_blocks is closely tied to performance, so let's avoid any
latency from reading ahead node pages.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: detect host-managed SMR by feature flag · 52763a4b

Committed by Jaegeuk Kim on Jun 13, 2016

If mkfs.f2fs gives a feature flag for host-managed SMR, we can set mode=lfs
by default.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

14 Jun, 2016 1 commit

f2fs: introduce mode=lfs mount option · 36abef4e

Committed by Jaegeuk Kim on Jun 03, 2016

This mount option forcibly enables the original log-structured filesystem
behavior, so there should be no random writes in the main area.

In particular, this supports host-managed SMR devices.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

09 Jun, 2016 2 commits

f2fs: drop any block plugging · 19a5f5e2

Committed by Jaegeuk Kim on Jun 04, 2016

In f2fs, we don't need to keep block plugging for NODE and DATA writes, since
we already merged bios as much as possible.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: set mapping error for EIO · 7f319975

Committed by Jaegeuk Kim on Jun 03, 2016

If EIO occurred, we need to set the error on all the mappings to avoid any further IOs.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

03 Jun, 2016 9 commits

f2fs: handle writepage correctly · b230e6ca

Committed by Jaegeuk Kim on May 29, 2016

Previously, f2fs_write_data_pages() calls __f2fs_writepage() which calls
f2fs_write_data_page().
If f2fs_write_data_page() returns AOP_WRITEPAGE_ACTIVATE, __f2fs_writepage()
calls mapping_set_error(). But this should not happen every time, since
sometimes f2fs_write_data_page() tries to skip writing pages without error.
For example, volatile_write() gives EIO all the time, as Shuoran Liu pointed
out.
Reported-by: Shuoran Liu <liushuoran@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: remove two steps to flush dirty data pages · 46ae957f

Committed by Jaegeuk Kim on May 25, 2016

If there is no cold page, we don't need to do a loop to flush dirty
data pages.

On /dev/pmem0,

1. dd if=/dev/zero of=/mnt/test/testfile bs=1M count=2048 conv=fsync
 Before : 1.1 GB/s
 After  : 1.2 GB/s

2. dd if=/dev/zero of=/mnt/test/testfile bs=1M count=2048
 Before : 2.2 GB/s
 After  : 2.3 GB/s
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: do not skip writing data pages · 28ea6162

Committed by Jaegeuk Kim on May 25, 2016

For data pages, let's try to flush as much as possible in background.

On /dev/pmem0,

1. dd if=/dev/zero of=/mnt/test/testfile bs=1M count=2048 conv=fsync
 Before : 800 MB/s
 After  : 1.1 GB/s

2. dd if=/dev/zero of=/mnt/test/testfile bs=1M count=2048
 Before : 1.3 GB/s
 After  : 2.2 GB/s
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: remove writepages lock · b93f7712

Committed by Jaegeuk Kim on May 20, 2016

This patch removes writepages lock.
We can improve multi-threading performance.

tiobench, 32 threads, 4KB write per fsync on SSD
Before: 25.88 MB/s
After: 28.03 MB/s
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: avoid unnecessary updating inode during fsync · 26de9b11

Committed by Jaegeuk Kim on May 20, 2016

If roll-forward recovery can recover i_size, we don't need to update inode's
metadata during fsync.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: remove syncing inode page in all the cases · ee6d182f

Committed by Jaegeuk Kim on May 20, 2016

This patch reduces calls to the following functions across the whole tree:
- sync_inode_page()
- update_inode_page()
- update_inode()
- f2fs_write_inode()

Instead, checkpoint will flush all the dirty inode metadata before syncing
node pages.
Note that this is doable, since we call mark_inode_dirty_sync() for every
inode field change that needs to update the on-disk inode as well.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: introduce f2fs_i_blocks_write with mark_inode_dirty_sync · 8edd03c8

Committed by Jaegeuk Kim on May 20, 2016

This patch introduces f2fs_i_blocks_write() to call mark_inode_dirty_sync() when
changing inode->i_blocks.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: introduce f2fs_i_size_write with mark_inode_dirty_sync · fc9581c8

Committed by Jaegeuk Kim on May 20, 2016

This patch introduces f2fs_i_size_write() to call mark_inode_dirty_sync() with
i_size_write().
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: use inode pointer for {set, clear}_inode_flag · 91942321

Committed by Jaegeuk Kim on May 20, 2016

This patch refactors to use inode pointer for set_inode_flag and
clear_inode_flag.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

21 May, 2016 1 commit

f2fs: flush pending bios right away when error occurs · 38f91ca8

Committed by Jaegeuk Kim on May 18, 2016

Given errors, this patch flushes pending bios as soon as possible.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

19 May, 2016 1 commit

f2fs: use bio count instead of F2FS_WRITEBACK page count · f5730184

Committed by Jaegeuk Kim on May 17, 2016

This can reduce page counting overhead.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

12 May, 2016 2 commits

f2fs: fix deadlock when flush inline data · ab47036d

Committed by Chao Yu on May 11, 2016

Below backtrace info was reported by Yunlei He:

Call Trace:
 [<ffffffff817a9395>] schedule+0x35/0x80
 [<ffffffff817abb7d>] rwsem_down_read_failed+0xed/0x130
 [<ffffffff813c12a8>] call_rwsem_down_read_failed+0x18/0x
 [<ffffffff817ab1d0>] down_read+0x20/0x30
 [<ffffffffa02a1a12>] f2fs_evict_inode+0x242/0x3a0 [f2fs]
 [<ffffffff81217057>] evict+0xc7/0x1a0
 [<ffffffff81217cd6>] iput+0x196/0x200
 [<ffffffff812134f9>] __dentry_kill+0x179/0x1e0
 [<ffffffff812136f9>] dput+0x199/0x1f0
 [<ffffffff811fe77b>] __fput+0x18b/0x220
 [<ffffffff811fe84e>] ____fput+0xe/0x10
 [<ffffffff81097427>] task_work_run+0x77/0x90
 [<ffffffff81074d62>] exit_to_usermode_loop+0x73/0xa2
 [<ffffffff81003b7a>] do_syscall_64+0xfa/0x110
 [<ffffffff817acf65>] entry_SYSCALL64_slow_path+0x25/0x25

Call Trace:
 [<ffffffff817a9395>] schedule+0x35/0x80
 [<ffffffff81216dc3>] __wait_on_freeing_inode+0xa3/0xd0
 [<ffffffff810bc300>] ? autoremove_wake_function+0x40/0x4
 [<ffffffff8121771d>] find_inode_fast+0x7d/0xb0
 [<ffffffff8121794a>] ilookup+0x6a/0xd0
 [<ffffffffa02bc740>] sync_node_pages+0x210/0x650 [f2fs]
 [<ffffffff8122e690>] ? do_fsync+0x70/0x70
 [<ffffffffa02b085e>] block_operations+0x9e/0xf0 [f2fs]
 [<ffffffff8137b795>] ? bio_endio+0x55/0x60
 [<ffffffffa02b0942>] write_checkpoint+0x92/0xba0 [f2fs]
 [<ffffffff8117da57>] ? mempool_free_slab+0x17/0x20
 [<ffffffff8117de8b>] ? mempool_free+0x2b/0x80
 [<ffffffff8122e690>] ? do_fsync+0x70/0x70
 [<ffffffffa02a53e3>] f2fs_sync_fs+0x63/0xd0 [f2fs]
 [<ffffffff8129630f>] ? ext4_sync_fs+0xbf/0x190
 [<ffffffff8122e6b0>] sync_fs_one_sb+0x20/0x30
 [<ffffffff812002e9>] iterate_supers+0xb9/0x110
 [<ffffffff8122e7b5>] sys_sync+0x55/0x90
 [<ffffffff81003ae9>] do_syscall_64+0x69/0x110
 [<ffffffff817acf65>] entry_SYSCALL64_slow_path+0x25/0x25

With the following execution sequence, we will set inline_node in the inode page
after the inode was unlinked, resulting in the deadlock described below:
1. open file
2. write file
3. unlink file
4. write file
5. close file

Thread A				Thread B
 - dput
  - iput_final
   - inode->i_state |= I_FREEING
   - evict
    - f2fs_evict_inode
					 - f2fs_sync_fs
					  - write_checkpoint
					   - block_operations
					    - f2fs_lock_all (down_write(cp_rwsem))
     - f2fs_lock_op (down_read(cp_rwsem))
					    - sync_node_pages
					     - ilookup
					      - find_inode_fast
					       - __wait_on_freeing_inode
					         (wait on I_FREEING clear)

To fix this, set the inline_node flag only for linked inodes.
Reported-by: Yunlei He <heyunlei@huawei.com>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Tested-by: Jaegeuk Kim <jaegeuk@kernel.org>
Cc: stable@vger.kernel.org # v4.6
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: support in batch multi blocks preallocation · 46008c6d

Committed by Chao Yu on May 09, 2016

This patch introduces reserve_new_blocks to preallocate multiple blocks
as one batch operation, so it can avoid lots of redundant operations,
resulting in better performance.

In virtual machine, with rotational device:

time fallocate -l 32G /mnt/f2fs/file

Before:
real	0m4.584s
user	0m0.000s
sys	0m4.580s

After:
real	0m0.292s
user	0m0.000s
sys	0m0.272s

In x86, with SSD:

time fallocate -l 500G $MNT/testfile

Before : 24.758 s
After  :  1.604 s
Signed-off-by: Chao Yu <yuchao0@huawei.com>
[Jaegeuk Kim: fix bugs and add performance numbers measured in x86.]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

08 May, 2016 2 commits

f2fs: do not preallocate block unaligned to 4KB · 0080c507

Committed by Jaegeuk Kim on May 07, 2016

Previously f2fs_preallocate_blocks() tries to allocate unaligned blocks.
In f2fs_write_begin(), however, prepare_write_begin() does not skip its
allocation due to (len != 4KB).
So, it needs locking node page twice unexpectedly.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: fix incorrect mapping in ->bmap · 43473f96

Committed by Chao Yu on May 05, 2016

Currently, generic_block_bmap is used in f2fs_bmap. Its semantics are: when
the mapping is found, return the position of the target physical block;
otherwise return zero.

But previously, when there was no mapping info for the specified logical block,
f2fs_bmap would map the target physical block to an uninitialized variable, which
is wrong. Fix it.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
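
The "return zero when there is no mapping" semantics can be shown with a tiny user-space sketch. The lookup table, the `sketch_sector_t` type and the `sketch_bmap` function are hypothetical and only model the contract described above; they are not the f2fs or VFS implementation.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long long sketch_sector_t;

/*
 * table[i] holds the physical block for logical block i, with 0 meaning
 * "no mapping" (a hypothetical layout chosen for this sketch).
 */
static sketch_sector_t sketch_bmap(const sketch_sector_t *table,
                                   size_t nblocks, size_t logical)
{
    /* Initialised up front so the "no mapping" path returns a defined 0. */
    sketch_sector_t phys = 0;

    assert(table != NULL);

    if (logical < nblocks && table[logical] != 0)
        phys = table[logical];
    return phys;
}
```

Initialising the result before the lookup is the point of the sketch: the unmapped path then returns zero instead of whatever happened to be in the variable.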

04 May, 2016 1 commit

f2fs: fix to clear private data in page · 23dc974e

Committed by Chao Yu on Apr 29, 2016

Private data in page should be removed during ->releasepage or
->invalidatepage, otherwise garbage data would remain in that page.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

02 May, 2016 1 commit

direct-io: eliminate the offset argument to ->direct_IO · c8b8e32d

Committed by Christoph Hellwig on Apr 07, 2016

Including blkdev_direct_IO and dax_do_io.  It has to be ki_pos to actually
work, so eliminate the superfluous argument.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

27 Apr, 2016 2 commits

f2fs: issue cache flush on direct IO · 6bfc4919

Committed by Jaegeuk Kim on Apr 18, 2016

Under direct IO path with O_(D)SYNC, it needs to set proper APPEND or UPDATE
flags, so that f2fs_sync_file can make its data safe.
Acked-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: avoid writing 0'th page in volatile writes · e6e5f561

Committed by Jaegeuk Kim on Apr 14, 2016

The first page of volatile writes usually contains a sort of header information
which will be used for recovery.
(e.g., journal header of sqlite)

If this is written without other journal data, user needs to handle the stale
journal information.
Acked-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

15 Apr, 2016 1 commit

f2fs: remove redundant condition check · 4da7bf5a

Committed by Jaegeuk Kim on Apr 06, 2016

This patch resolves the redundant condition check reported by David.
Reported-by: David Binderman <dcb314@hotmail.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

13 Apr, 2016 1 commit

fscrypto: don't let data integrity writebacks fail with ENOMEM · b32e4482

Committed by Jaegeuk Kim on Apr 11, 2016

This patch fixes the issue introduced by the ext4 crypto fix in the same manner.
For F2FS, however, we flush the pending IOs and wait for a while to acquire free
memory.

Fixes: c9af28fd ("ext4 crypto: don't let data integrity writebacks fail with ENOMEM")
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

05 Apr, 2016 1 commit

mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros · 09cbfeaf

Committed by Kirill A. Shutemov on Apr 01, 2016

PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time
ago with promise that one day it will be possible to implement page
cache with bigger chunks than PAGE_SIZE.

This promise never materialized.  And unlikely will.

We have many places where PAGE_CACHE_SIZE assumed to be equal to
PAGE_SIZE.  And it's constant source of confusion on whether
PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
especially on the border between fs and mm.

Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause too much
breakage to be doable.

Let's stop pretending that pages in page cache are special.  They are
not.

The changes are pretty straight-forward:

 - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;

 - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;

 - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};

 - page_cache_get() -> get_page();

 - page_cache_release() -> put_page();

This patch contains automated changes generated with coccinelle using
script below.  For some reason, coccinelle doesn't patch header files.
I've called spatch for them manually.

The only adjustment after coccinelle is a revert of changes to the
PAGE_CACHE_ALIGN definition: we are going to drop it later.

There are few places in the code where coccinelle didn't reach.  I'll
fix them manually in a separate patch.  Comments and documentation also
will be addressed with the separate patch.

virtual patch

@@
expression E;
@@
- E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E

@@
expression E;
@@
- E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E

@@
@@
- PAGE_CACHE_SHIFT
+ PAGE_SHIFT

@@
@@
- PAGE_CACHE_SIZE
+ PAGE_SIZE

@@
@@
- PAGE_CACHE_MASK
+ PAGE_MASK

@@
expression E;
@@
- PAGE_CACHE_ALIGN(E)
+ PAGE_ALIGN(E)

@@
expression E;
@@
- page_cache_get(E)
+ get_page(E)

@@
expression E;
@@
- page_cache_release(E)
+ put_page(E)
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
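
The effect of these replacements on calling code can be sketched in plain C. The example assumes 4 KiB pages and, as the commit message implies, that the page-cache constants are equal to the page constants. All `SKETCH_`-prefixed macros and the `sketch_` helpers are hypothetical names for illustration, not kernel definitions.

```c
#include <assert.h>

/* Assumed values for illustration only: 4 KiB pages. */
#define SKETCH_PAGE_SHIFT        12
#define SKETCH_PAGE_SIZE         (1UL << SKETCH_PAGE_SHIFT)
/* Old names modelled as plain aliases, i.e. PAGE_CACHE_SIZE == PAGE_SIZE. */
#define SKETCH_PAGE_CACHE_SHIFT  SKETCH_PAGE_SHIFT
#define SKETCH_PAGE_CACHE_SIZE   SKETCH_PAGE_SIZE

/* Before: file position to page index via the page-cache constant. */
static unsigned long sketch_index_old(unsigned long long pos)
{
    return (unsigned long)(pos >> SKETCH_PAGE_CACHE_SHIFT);
}

/* After: the same computation with the plain page constant. */
static unsigned long sketch_index_new(unsigned long long pos)
{
    return (unsigned long)(pos >> SKETCH_PAGE_SHIFT);
}

/* Before: unit conversion that is a shift by zero when both shifts match. */
static unsigned long sketch_pages_old(unsigned long n)
{
    return n << (SKETCH_PAGE_CACHE_SHIFT - SKETCH_PAGE_SHIFT);
}

/* After: the conversion disappears entirely. */
static unsigned long sketch_pages_new(unsigned long n)
{
    return n;
}

/* Checks that old and new forms agree for one file position. */
static void sketch_check(unsigned long long pos)
{
    assert(sketch_index_old(pos) == sketch_index_new(pos));
}
```

Under that equality assumption the old and new forms always compute the same value, which is why the change could be applied mechanically.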

18 Mar, 2016 1 commit

fs crypto: move per-file encryption from f2fs tree to fs/crypto · 0b81d077

Committed by Jaegeuk Kim on May 15, 2015

This patch adds the renamed functions moved from the f2fs crypto files.

1. definitions for per-file encryption used by ext4 and f2fs.

2. crypto.c for encrypt/decrypt functions
 a. IO preparation:
  - fscrypt_get_ctx / fscrypt_release_ctx
 b. before IOs:
  - fscrypt_encrypt_page
  - fscrypt_decrypt_page
  - fscrypt_zeroout_range
 c. after IOs:
  - fscrypt_decrypt_bio_pages
  - fscrypt_pullback_bio_page
  - fscrypt_restore_control_page

3. policy.c supporting context management.
 a. For ioctls:
  - fscrypt_process_policy
  - fscrypt_get_policy
 b. For context permission
  - fscrypt_has_permitted_context
  - fscrypt_inherit_context

4. keyinfo.c to handle permissions
  - fscrypt_get_encryption_info
  - fscrypt_free_encryption_info

5. fname.c to support filename encryption
 a. general wrapper functions
  - fscrypt_fname_disk_to_usr
  - fscrypt_fname_usr_to_disk
  - fscrypt_setup_filename
  - fscrypt_free_filename

 b. specific filename handling functions
  - fscrypt_fname_alloc_buffer
  - fscrypt_fname_free_buffer

6. Makefile and Kconfig

Cc: Al Viro <viro@ftp.linux.org.uk>
Signed-off-by: Michael Halcrow <mhalcrow@google.com>
Signed-off-by: Ildar Muslukhov <ildarm@google.com>
Signed-off-by: Uday Savagaonkar <savagaon@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

27 Feb, 2016 2 commits

f2fs: introduce f2fs_flush_merged_bios for cleanup · 406657dd

Committed by Chao Yu on Feb 24, 2016

Add a new helper f2fs_flush_merged_bios to clean up redundant codes.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: introduce f2fs_update_data_blkaddr for cleanup · f28b3434

Committed by Chao Yu on Feb 24, 2016

Add a new helper f2fs_update_data_blkaddr to clean up redundant codes.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

23 Feb, 2016 8 commits

f2fs: trace old block address for CoWed page · 7a9d7548

Committed by Chao Yu on Feb 22, 2016

This patch enables tracing the old block address of a CoWed page for better
debugging.

f2fs_submit_page_mbio: dev = (1,0), ino = 1, page_index = 0x1d4f0, oldaddr = 0xfe8ab, newaddr = 0xfee90 rw = WRITE_SYNC, type = NODE
f2fs_submit_page_mbio: dev = (1,0), ino = 1, page_index = 0x1d4f8, oldaddr = 0xfe8b0, newaddr = 0xfee91 rw = WRITE_SYNC, type = NODE
f2fs_submit_page_mbio: dev = (1,0), ino = 1, page_index = 0x1d4fa, oldaddr = 0xfe8ae, newaddr = 0xfee92 rw = WRITE_SYNC, type = NODE

f2fs_submit_page_mbio: dev = (1,0), ino = 134824, page_index = 0x96, oldaddr = 0xf049b, newaddr = 0x2bbe rw = WRITE, type = DATA
f2fs_submit_page_mbio: dev = (1,0), ino = 134824, page_index = 0x97, oldaddr = 0xf049c, newaddr = 0x2bbf rw = WRITE, type = DATA
f2fs_submit_page_mbio: dev = (1,0), ino = 134824, page_index = 0x98, oldaddr = 0xf049d, newaddr = 0x2bc0 rw = WRITE, type = DATA

f2fs_submit_page_mbio: dev = (1,0), ino = 135260, page_index = 0x47, oldaddr = 0xffffffff, newaddr = 0xf2631 rw = WRITE, type = DATA
f2fs_submit_page_mbio: dev = (1,0), ino = 135260, page_index = 0x48, oldaddr = 0xffffffff, newaddr = 0xf2632 rw = WRITE, type = DATA
f2fs_submit_page_mbio: dev = (1,0), ino = 135260, page_index = 0x49, oldaddr = 0xffffffff, newaddr = 0xf2633 rw = WRITE, type = DATA
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: support revoking atomic written pages · 28bc106b

Committed by Chao Yu on Feb 06, 2016

f2fs supports atomic write with the following semantics:
1. open db file
2. ioctl start atomic write
3. (write db file) * n
4. ioctl commit atomic write
5. close db file

With this flow we can avoid the file becoming corrupted on an abnormal power
cut, because we hold the transaction's data in referenced pages linked in the
inode's inmem_pages list without setting them dirty, so the data won't be
persisted unless we commit it in step 4.

But we should still hold the journal db file in memory by using volatile
write, because our 'atomic write support' semantics are incomplete: in step 4
we could fail to submit all dirty data of the transaction. Once partial dirty
data has been committed to storage, a checkpoint followed by an abnormal
power cut leaves the db file corrupted forever.

So this patch tries to improve the atomic write flow by adding a revoking
flow. Once an internal error occurs during committing, this gives another
chance to revoke the partially submitted data of the current transaction,
which makes the commit operation more like an atomic one.

If we're not lucky and the revoking operation fails, EAGAIN is reported to
the user to suggest either doing recovery with the held journal file or
retrying the current transaction.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
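
From user space, the five-step flow in this commit could look roughly like the sketch below. It is Linux-only and rests on assumptions: the ioctl numbers are reproduced from the f2fs headers as best understood here and should be verified against the target kernel, and the `SKETCH_` macro names, the `sketch_atomic_update` function and its return convention are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumed values; check them against the f2fs header of your kernel. */
#define SKETCH_F2FS_IOCTL_MAGIC             0xf5
#define SKETCH_F2FS_IOC_START_ATOMIC_WRITE  _IO(SKETCH_F2FS_IOCTL_MAGIC, 1)
#define SKETCH_F2FS_IOC_COMMIT_ATOMIC_WRITE _IO(SKETCH_F2FS_IOCTL_MAGIC, 2)

/*
 * Steps 2 to 4 of the flow on an already opened db file descriptor.
 * Returns 0 on success, 1 if the commit reported EAGAIN (the caller may
 * recover from its journal or retry the transaction), -1 on other errors.
 */
static int sketch_atomic_update(int fd, const void *buf, size_t len)
{
    ssize_t written;

    assert(buf != NULL);

    /* Step 2: start the atomic write. */
    if (ioctl(fd, SKETCH_F2FS_IOC_START_ATOMIC_WRITE) < 0)
        return -1;

    /* Step 3: write the data; a short write is treated as a failure here. */
    written = write(fd, buf, len);
    if (written < 0 || (size_t)written != len)
        return -1;

    /* Step 4: commit the transaction. */
    if (ioctl(fd, SKETCH_F2FS_IOC_COMMIT_ATOMIC_WRITE) < 0)
        return errno == EAGAIN ? 1 : -1;

    return 0;
}
```

A caller would open the db file, call `sketch_atomic_update`, and close the file afterwards. On a return value of 1 it would choose between journal recovery and a retry, as the commit message suggests.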

f2fs crypto: f2fs_page_crypto() doesn't need a encryption context · ce855a3b

Committed by Jaegeuk Kim on Feb 05, 2016

This patch adopts:
	ext4 crypto: ext4_page_crypto() doesn't need a encryption context

Since ext4_page_crypto() doesn't need an encryption context (at least
not any more), this allows us to simplify a number function signature
and also allows us to avoid needing to allocate a context in
ext4_block_write_begin().  It also means we no longer need a separate
ext4_decrypt_one() function.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: preallocate blocks for buffered aio writes · 24b84912

Committed by Jaegeuk Kim on Feb 03, 2016

This patch preallocates data blocks for buffered aio writes.
With this patch, we can avoid redundant locking and unlocking of node pages
given consecutive aio request.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: move dio preallocation into f2fs_file_write_iter · b439b103

Committed by Jaegeuk Kim on Feb 03, 2016

This patch moves preallocation code for direct IOs into f2fs_file_write_iter.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: fix missing skip pages info · d31c7c3f

Committed by Yunlei He on Feb 04, 2016

fix missing skip pages info in f2fs_writepages trace event.
Signed-off-by: Yunlei He <heyunlei@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: introduce f2fs_submit_merged_bio_cond · 0c3a5797

Committed by Chao Yu on Jan 18, 2016

f2fs uses a single bio buffer per data type (META/NODE/DATA) to cache as many
writes located at contiguous block addresses as possible. After submitting,
these writes may still be cached in the bio buffer, so we have to flush the
cached writes by calling f2fs_submit_merged_bio.

Unfortunately, under high concurrency, the bio buffer could be flushed by
someone else before we submit it, for the reasons below:
a) there is no space in the bio buffer.
b) a request of a different type (SYNC, ASYNC) is added.
c) a discontinuous block address is added.

In this condition, f2fs_submit_merged_bio will be devastating, because it
could break the subsequent merging of writes in the bio buffer and split one
big bio into two smaller ones.

This patch introduces f2fs_submit_merged_bio_cond, which can do a conditional
submit of the bio buffer. Before submitting, it judges whether:
 - a page in the DATA type bio buffer matches the specified page;
 - a page in the DATA type bio buffer belongs to the specified inode;
 - a page in the NODE type bio buffer belongs to the specified inode;
If there is no eligible page in the bio buffer, we skip the submitting step,
which gives more chances to merge consecutive block IOs in the bio cache.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
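
The conditional submit described in this commit can be modelled in user space as below. The buffer layout, the fixed capacity of 16 pages and every `sketch_` name are hypothetical; the sketch only mirrors the "flush only if an eligible page is cached" rule and is not the f2fs implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a cached page: owning inode and page index. */
struct sketch_page {
    unsigned long ino;
    unsigned long index;
};

/* Hypothetical model of one per-type bio buffer. */
struct sketch_bio_buf {
    struct sketch_page pages[16];
    size_t count;          /* pages currently cached */
    unsigned int submits;  /* how many times the buffer was flushed */
};

/* True if the buffer holds 'target' (when non-NULL) or any page of 'ino'. */
static bool sketch_has_merged_page(const struct sketch_bio_buf *buf,
                                   const struct sketch_page *target,
                                   unsigned long ino)
{
    for (size_t i = 0; i < buf->count; i++) {
        const struct sketch_page *p = &buf->pages[i];

        if (target != NULL) {
            if (p->ino == target->ino && p->index == target->index)
                return true;
        } else if (p->ino == ino) {
            return true;
        }
    }
    return false;
}

/* Flush only when an eligible page is cached; report whether we flushed. */
static bool sketch_submit_merged_cond(struct sketch_bio_buf *buf,
                                      const struct sketch_page *target,
                                      unsigned long ino)
{
    assert(buf != NULL);

    if (!sketch_has_merged_page(buf, target, ino))
        return false;  /* keep the buffer so later writes can still merge */

    buf->count = 0;    /* model "submit" as emptying the buffer */
    buf->submits++;
    return true;
}
```

Skipping the flush when nothing relevant is cached leaves the buffer intact, so the next contiguous write can still be merged into the same bio.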

f2fs: speed up handling holes in fiemap · da85985c

Committed by Chao Yu on Jan 26, 2016

This patch makes f2fs_map_blocks support returning the next potential
page offset, which skips the hole region in the indirect tree of the inode, and
uses it to speed up fiemap when handling a big hole.

Test method:
xfs_io -f /mnt/f2fs/file  -c "pwrite 1099511627776 4096"
time xfs_io -f /mnt/f2fs/file -c "fiemap -v"

Before:
time xfs_io -f /mnt/f2fs/file -c "fiemap -v"
/mnt/f2fs/file:
 EXT: FILE-OFFSET              BLOCK-RANGE      TOTAL FLAGS
   0: [0..2147483647]:         hole             2147483648
   1: [2147483648..2147483655]: 81920..81927         8   0x1

real    3m3.518s
user    0m0.000s
sys     3m3.456s

After:
time xfs_io -f /mnt/f2fs/file -c "fiemap -v"
/mnt/f2fs/file:
 EXT: FILE-OFFSET              BLOCK-RANGE      TOTAL FLAGS
   0: [0..2147483647]:         hole             2147483648
   1: [2147483648..2147483655]: 81920..81927         8   0x1

real    0m0.008s
user    0m0.000s
sys     0m0.008s
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
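
The idea of jumping over a hole instead of probing it block by block can be sketched as follows. The two-level layout, the fan-out of 8 and all `sketch_` names are hypothetical simplifications for illustration, not the f2fs node tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_ADDRS_PER_NODE 8  /* hypothetical fan-out, not the f2fs value */

/* One hypothetical node; an absent node is a hole covering all its slots. */
struct sketch_node {
    bool present;
    unsigned int addr[SKETCH_ADDRS_PER_NODE];  /* 0 means unmapped */
};

/*
 * Find the first mapped block offset at or after 'start'.
 * Returns the total number of offsets if nothing is mapped.
 * '*probes' counts how many lookups were needed.
 */
static size_t sketch_first_mapped(const struct sketch_node *nodes,
                                  size_t nnodes, size_t start, size_t *probes)
{
    size_t total = nnodes * SKETCH_ADDRS_PER_NODE;
    size_t ofs = start;

    assert(nodes != NULL && probes != NULL);
    *probes = 0;

    while (ofs < total) {
        const struct sketch_node *n = &nodes[ofs / SKETCH_ADDRS_PER_NODE];

        (*probes)++;
        if (!n->present) {
            /* Whole node is a hole: jump to the next node's first offset. */
            ofs = (ofs / SKETCH_ADDRS_PER_NODE + 1) * SKETCH_ADDRS_PER_NODE;
            continue;
        }
        if (n->addr[ofs % SKETCH_ADDRS_PER_NODE] != 0)
            return ofs;
        ofs++;
    }
    return total;
}
```

With three absent nodes in front of the first mapped block, the sketch needs one probe per absent node rather than one per block, which is the kind of saving the timings above reflect.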

openeuler / Kernel: synced successfully more than 1 year ago
