1. 03 6月, 2016 4 次提交
  2. 21 5月, 2016 1 次提交
  3. 19 5月, 2016 1 次提交
  4. 12 5月, 2016 2 次提交
    • C
      f2fs: fix deadlock when flush inline data · ab47036d
      Chao Yu 提交于
      Below backtrace info was reported by Yunlei He:
      
      Call Trace:
       [<ffffffff817a9395>] schedule+0x35/0x80
       [<ffffffff817abb7d>] rwsem_down_read_failed+0xed/0x130
       [<ffffffff813c12a8>] call_rwsem_down_read_failed+0x18/0x
       [<ffffffff817ab1d0>] down_read+0x20/0x30
       [<ffffffffa02a1a12>] f2fs_evict_inode+0x242/0x3a0 [f2fs]
       [<ffffffff81217057>] evict+0xc7/0x1a0
       [<ffffffff81217cd6>] iput+0x196/0x200
       [<ffffffff812134f9>] __dentry_kill+0x179/0x1e0
       [<ffffffff812136f9>] dput+0x199/0x1f0
       [<ffffffff811fe77b>] __fput+0x18b/0x220
       [<ffffffff811fe84e>] ____fput+0xe/0x10
       [<ffffffff81097427>] task_work_run+0x77/0x90
       [<ffffffff81074d62>] exit_to_usermode_loop+0x73/0xa2
       [<ffffffff81003b7a>] do_syscall_64+0xfa/0x110
       [<ffffffff817acf65>] entry_SYSCALL64_slow_path+0x25/0x25
      
      Call Trace:
       [<ffffffff817a9395>] schedule+0x35/0x80
       [<ffffffff81216dc3>] __wait_on_freeing_inode+0xa3/0xd0
       [<ffffffff810bc300>] ? autoremove_wake_function+0x40/0x4
       [<ffffffff8121771d>] find_inode_fast+0x7d/0xb0
       [<ffffffff8121794a>] ilookup+0x6a/0xd0
       [<ffffffffa02bc740>] sync_node_pages+0x210/0x650 [f2fs]
       [<ffffffff8122e690>] ? do_fsync+0x70/0x70
       [<ffffffffa02b085e>] block_operations+0x9e/0xf0 [f2fs]
       [<ffffffff8137b795>] ? bio_endio+0x55/0x60
       [<ffffffffa02b0942>] write_checkpoint+0x92/0xba0 [f2fs]
       [<ffffffff8117da57>] ? mempool_free_slab+0x17/0x20
       [<ffffffff8117de8b>] ? mempool_free+0x2b/0x80
       [<ffffffff8122e690>] ? do_fsync+0x70/0x70
       [<ffffffffa02a53e3>] f2fs_sync_fs+0x63/0xd0 [f2fs]
       [<ffffffff8129630f>] ? ext4_sync_fs+0xbf/0x190
       [<ffffffff8122e6b0>] sync_fs_one_sb+0x20/0x30
       [<ffffffff812002e9>] iterate_supers+0xb9/0x110
       [<ffffffff8122e7b5>] sys_sync+0x55/0x90
       [<ffffffff81003ae9>] do_syscall_64+0x69/0x110
       [<ffffffff817acf65>] entry_SYSCALL64_slow_path+0x25/0x25
      
      With following excuting serials, we will set inline_node in inode page
      after inode was unlinked, result in a deadloop described as below:
      1. open file
      2. write file
      3. unlink file
      4. write file
      5. close file
      
      Thread A				Thread B
       - dput
        - iput_final
         - inode->i_state |= I_FREEING
         - evict
          - f2fs_evict_inode
      					 - f2fs_sync_fs
      					  - write_checkpoint
      					   - block_operations
      					    - f2fs_lock_all (down_write(cp_rwsem))
           - f2fs_lock_op (down_read(cp_rwsem))
      					    - sync_node_pages
      					     - ilookup
      					      - find_inode_fast
      					       - __wait_on_freeing_inode
      					         (wait on I_FREEING clear)
      
      Here, we change to set inline_node flag only for linked inode for fixing.
      Reported-by: NYunlei He <heyunlei@huawei.com>
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Tested-by: NJaegeuk Kim <jaegeuk@kernel.org>
      Cc: stable@vger.kernel.org # v4.6
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      ab47036d
    • C
      f2fs: support in batch multi blocks preallocation · 46008c6d
      Chao Yu 提交于
      This patch introduces reserve_new_blocks to make preallocation of multi
      blocks as in batch operation, so it can avoid lots of redundant
      operation, result in better performance.
      
      In virtual machine, with rotational device:
      
      time fallocate -l 32G /mnt/f2fs/file
      
      Before:
      real	0m4.584s
      user	0m0.000s
      sys	0m4.580s
      
      After:
      real	0m0.292s
      user	0m0.000s
      sys	0m0.272s
      
      In x86, with SSD:
      
      time fallocate -l 500G $MNT/testfile
      
      Before : 24.758 s
      After  :  1.604 s
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      [Jaegeuk Kim: fix bugs and add performance numbers measured in x86.]
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      46008c6d
  5. 08 5月, 2016 2 次提交
  6. 04 5月, 2016 1 次提交
  7. 02 5月, 2016 1 次提交
  8. 27 4月, 2016 2 次提交
  9. 15 4月, 2016 1 次提交
  10. 13 4月, 2016 1 次提交
  11. 05 4月, 2016 1 次提交
    • K
      mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros · 09cbfeaf
      Kirill A. Shutemov 提交于
      PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time
      ago with promise that one day it will be possible to implement page
      cache with bigger chunks than PAGE_SIZE.
      
      This promise never materialized.  And unlikely will.
      
      We have many places where PAGE_CACHE_SIZE assumed to be equal to
      PAGE_SIZE.  And it's constant source of confusion on whether
      PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
      especially on the border between fs and mm.
      
      Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much
      breakage to be doable.
      
      Let's stop pretending that pages in page cache are special.  They are
      not.
      
      The changes are pretty straight-forward:
      
       - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
      
       - page_cache_get() -> get_page();
      
       - page_cache_release() -> put_page();
      
      This patch contains automated changes generated with coccinelle using
      script below.  For some reason, coccinelle doesn't patch header files.
      I've called spatch for them manually.
      
      The only adjustment after coccinelle is revert of changes to
      PAGE_CAHCE_ALIGN definition: we are going to drop it later.
      
      There are few places in the code where coccinelle didn't reach.  I'll
      fix them manually in a separate patch.  Comments and documentation also
      will be addressed with the separate patch.
      
      virtual patch
      
      @@
      expression E;
      @@
      - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      expression E;
      @@
      - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      @@
      - PAGE_CACHE_SHIFT
      + PAGE_SHIFT
      
      @@
      @@
      - PAGE_CACHE_SIZE
      + PAGE_SIZE
      
      @@
      @@
      - PAGE_CACHE_MASK
      + PAGE_MASK
      
      @@
      expression E;
      @@
      - PAGE_CACHE_ALIGN(E)
      + PAGE_ALIGN(E)
      
      @@
      expression E;
      @@
      - page_cache_get(E)
      + get_page(E)
      
      @@
      expression E;
      @@
      - page_cache_release(E)
      + put_page(E)
      Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: NMichal Hocko <mhocko@suse.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      09cbfeaf
  12. 18 3月, 2016 1 次提交
    • J
      fs crypto: move per-file encryption from f2fs tree to fs/crypto · 0b81d077
      Jaegeuk Kim 提交于
      This patch adds the renamed functions moved from the f2fs crypto files.
      
      1. definitions for per-file encryption used by ext4 and f2fs.
      
      2. crypto.c for encrypt/decrypt functions
       a. IO preparation:
        - fscrypt_get_ctx / fscrypt_release_ctx
       b. before IOs:
        - fscrypt_encrypt_page
        - fscrypt_decrypt_page
        - fscrypt_zeroout_range
       c. after IOs:
        - fscrypt_decrypt_bio_pages
        - fscrypt_pullback_bio_page
        - fscrypt_restore_control_page
      
      3. policy.c supporting context management.
       a. For ioctls:
        - fscrypt_process_policy
        - fscrypt_get_policy
       b. For context permission
        - fscrypt_has_permitted_context
        - fscrypt_inherit_context
      
      4. keyinfo.c to handle permissions
        - fscrypt_get_encryption_info
        - fscrypt_free_encryption_info
      
      5. fname.c to support filename encryption
       a. general wrapper functions
        - fscrypt_fname_disk_to_usr
        - fscrypt_fname_usr_to_disk
        - fscrypt_setup_filename
        - fscrypt_free_filename
      
       b. specific filename handling functions
        - fscrypt_fname_alloc_buffer
        - fscrypt_fname_free_buffer
      
      6. Makefile and Kconfig
      
      Cc: Al Viro <viro@ftp.linux.org.uk>
      Signed-off-by: NMichael Halcrow <mhalcrow@google.com>
      Signed-off-by: NIldar Muslukhov <ildarm@google.com>
      Signed-off-by: NUday Savagaonkar <savagaon@google.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      0b81d077
  13. 27 2月, 2016 2 次提交
  14. 23 2月, 2016 20 次提交
    • C
      f2fs: trace old block address for CoWed page · 7a9d7548
      Chao Yu 提交于
      This patch enables to trace old block address of CoWed page for better
      debugging.
      
      f2fs_submit_page_mbio: dev = (1,0), ino = 1, page_index = 0x1d4f0, oldaddr = 0xfe8ab, newaddr = 0xfee90 rw = WRITE_SYNC, type = NODE
      f2fs_submit_page_mbio: dev = (1,0), ino = 1, page_index = 0x1d4f8, oldaddr = 0xfe8b0, newaddr = 0xfee91 rw = WRITE_SYNC, type = NODE
      f2fs_submit_page_mbio: dev = (1,0), ino = 1, page_index = 0x1d4fa, oldaddr = 0xfe8ae, newaddr = 0xfee92 rw = WRITE_SYNC, type = NODE
      
      f2fs_submit_page_mbio: dev = (1,0), ino = 134824, page_index = 0x96, oldaddr = 0xf049b, newaddr = 0x2bbe rw = WRITE, type = DATA
      f2fs_submit_page_mbio: dev = (1,0), ino = 134824, page_index = 0x97, oldaddr = 0xf049c, newaddr = 0x2bbf rw = WRITE, type = DATA
      f2fs_submit_page_mbio: dev = (1,0), ino = 134824, page_index = 0x98, oldaddr = 0xf049d, newaddr = 0x2bc0 rw = WRITE, type = DATA
      
      f2fs_submit_page_mbio: dev = (1,0), ino = 135260, page_index = 0x47, oldaddr = 0xffffffff, newaddr = 0xf2631 rw = WRITE, type = DATA
      f2fs_submit_page_mbio: dev = (1,0), ino = 135260, page_index = 0x48, oldaddr = 0xffffffff, newaddr = 0xf2632 rw = WRITE, type = DATA
      f2fs_submit_page_mbio: dev = (1,0), ino = 135260, page_index = 0x49, oldaddr = 0xffffffff, newaddr = 0xf2633 rw = WRITE, type = DATA
      Signed-off-by: NChao Yu <chao2.yu@samsung.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      7a9d7548
    • C
      f2fs: support revoking atomic written pages · 28bc106b
      Chao Yu 提交于
      f2fs support atomic write with following semantics:
      1. open db file
      2. ioctl start atomic write
      3. (write db file) * n
      4. ioctl commit atomic write
      5. close db file
      
      With this flow we can avoid file becoming corrupted when abnormal power
      cut, because we hold data of transaction in referenced pages linked in
      inmem_pages list of inode, but without setting them dirty, so these data
      won't be persisted unless we commit them in step 4.
      
      But we should still hold journal db file in memory by using volatile
      write, because our semantics of 'atomic write support' is incomplete, in
      step 4, we could fail to submit all dirty data of transaction, once
      partial dirty data was committed in storage, then after a checkpoint &
      abnormal power-cut, db file will be corrupted forever.
      
      So this patch tries to improve atomic write flow by adding a revoking flow,
      once inner error occurs in committing, this gives another chance to try to
      revoke these partial submitted data of current transaction, it makes
      committing operation more like aotmical one.
      
      If we're not lucky, once revoking operation was failed, EAGAIN will be
      reported to user for suggesting doing the recovery with held journal file,
      or retrying current transaction again.
      Signed-off-by: NChao Yu <chao2.yu@samsung.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      28bc106b
    • J
      f2fs crypto: f2fs_page_crypto() doesn't need a encryption context · ce855a3b
      Jaegeuk Kim 提交于
      This patch adopts:
      	ext4 crypto: ext4_page_crypto() doesn't need a encryption context
      
      Since ext4_page_crypto() doesn't need an encryption context (at least
      not any more), this allows us to simplify a number function signature
      and also allows us to avoid needing to allocate a context in
      ext4_block_write_begin().  It also means we no longer need a separate
      ext4_decrypt_one() function.
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      ce855a3b
    • J
      f2fs: preallocate blocks for buffered aio writes · 24b84912
      Jaegeuk Kim 提交于
      This patch preallocates data blocks for buffered aio writes.
      With this patch, we can avoid redundant locking and unlocking of node pages
      given consecutive aio request.
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      24b84912
    • J
      f2fs: move dio preallocation into f2fs_file_write_iter · b439b103
      Jaegeuk Kim 提交于
      This patch moves preallocation code for direct IOs into f2fs_file_write_iter.
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      b439b103
    • Y
      f2fs: fix missing skip pages info · d31c7c3f
      Yunlei He 提交于
      fix missing skip pages info in f2fs_writepages trace event.
      Signed-off-by: NYunlei He <heyunlei@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      d31c7c3f
    • C
      f2fs: introduce f2fs_submit_merged_bio_cond · 0c3a5797
      Chao Yu 提交于
      f2fs use single bio buffer per type data (META/NODE/DATA) for caching
      writes locating in continuous block address as many as possible, after
      submitting, these writes may be still cached in bio buffer, so we have
      to flush cached writes in bio buffer by calling f2fs_submit_merged_bio.
      
      Unfortunately, in the scenario of high concurrency, bio buffer could be
      flushed by someone else before we submit it as below reasons:
      a) there is no space in bio buffer.
      b) add a request of different type (SYNC, ASYNC).
      c) add a discontinuous block address.
      
      For this condition, f2fs_submit_merged_bio will be devastating, because
      it could break the following merging of writes in bio buffer, split one
      big bio into two smaller one.
      
      This patch introduces f2fs_submit_merged_bio_cond which can do a
      conditional submitting with bio buffer, before submitting it will judge
      whether:
       - page in DATA type bio buffer is matching with specified page;
       - page in DATA type bio buffer is belong to specified inode;
       - page in NODE type bio buffer is belong to specified inode;
      If there is no eligible page in bio buffer, we will skip submitting step,
      result in gaining more chance to merge consecutive block IOs in bio cache.
      Signed-off-by: NChao Yu <chao2.yu@samsung.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      0c3a5797
    • C
      f2fs: speed up handling holes in fiemap · da85985c
      Chao Yu 提交于
      This patch makes f2fs_map_blocks supporting returning next potential
      page offset which skips hole region in indirect tree of inode, and
      use it to speed up fiemap in handling big hole case.
      
      Test method:
      xfs_io -f /mnt/f2fs/file  -c "pwrite 1099511627776 4096"
      time xfs_io -f /mnt/f2fs/file -c "fiemap -v"
      
      Before:
      time xfs_io -f /mnt/f2fs/file -c "fiemap -v"
      /mnt/f2fs/file:
       EXT: FILE-OFFSET              BLOCK-RANGE      TOTAL FLAGS
         0: [0..2147483647]:         hole             2147483648
         1: [2147483648..2147483655]: 81920..81927         8   0x1
      
      real    3m3.518s
      user    0m0.000s
      sys     3m3.456s
      
      After:
      time xfs_io -f /mnt/f2fs/file -c "fiemap -v"
      /mnt/f2fs/file:
       EXT: FILE-OFFSET              BLOCK-RANGE      TOTAL FLAGS
         0: [0..2147483647]:         hole             2147483648
         1: [2147483648..2147483655]: 81920..81927         8   0x1
      
      real    0m0.008s
      user    0m0.000s
      sys     0m0.008s
      Signed-off-by: NChao Yu <chao2.yu@samsung.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      da85985c
    • C
      f2fs: remove unneeded pointer conversion · 81ca7350
      Chao Yu 提交于
      There are redundant pointer conversion in following call stack:
       - at position a, inode was been converted to f2fs_file_info.
       - at position b, f2fs_file_info was been converted to inode again.
      
       - truncate_blocks(inode,..)
        - fi = F2FS_I(inode)		---a
        - ADDRS_PER_PAGE(node_page, fi)
         - addrs_per_inode(fi)
          - inode = &fi->vfs_inode	---b
          - f2fs_has_inline_xattr(inode)
           - fi = F2FS_I(inode)
           - is_inode_flag_set(fi,..)
      
      In order to avoid unneeded conversion, alter ADDRS_PER_PAGE and
      addrs_per_inode to acept parameter with type of inode pointer.
      Signed-off-by: NChao Yu <chao2.yu@samsung.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      81ca7350
    • C
      f2fs: simplify __allocate_data_blocks · 5b8db7fa
      Chao Yu 提交于
      This patch uses existing function f2fs_map_block to simplify implementation
      of __allocate_data_blocks.
      Signed-off-by: NChao Yu <chao2.yu@samsung.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      5b8db7fa
    • C
      f2fs: simplify f2fs_map_blocks · 4fe71e88
      Chao Yu 提交于
      In f2fs_map_blocks, we use duplicated codes to handle first block mapping
      and the following blocks mapping, it's unnecessary. This patch simplifies
      f2fs_map_blocks to avoid using copied codes.
      Signed-off-by: NChao Yu <chao2.yu@samsung.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      4fe71e88
    • J
      f2fs: use wq_has_sleeper for cp_wait wait_queue · 7c506896
      Jaegeuk Kim 提交于
      We need to use wq_has_sleeper including smp_mb to consider cp_wait concurrency.
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      7c506896
    • J
      f2fs: use wait_for_stable_page to avoid contention · fec1d657
      Jaegeuk Kim 提交于
      In write_begin, if storage supports stable_page, we don't need to wait for
      writeback to update its contents.
      This patch introduces to use wait_for_stable_page instead of
      wait_on_page_writeback.
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      fec1d657
    • J
      f2fs: don't need to call set_page_dirty for io error · e3ef1876
      Jaegeuk Kim 提交于
      If end_io gets an error, we don't need to set the page as dirty, since we
      already set f2fs_stop_checkpoint which will not flush any data.
      
      This will resolve the following warning.
      
      ======================================================
      [ INFO: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected ]
      4.4.0+ #9 Tainted: G           O
      ------------------------------------------------------
      xfs_io/26773 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
       (&(&sbi->inode_lock[i])->rlock){+.+...}, at: [<ffffffffc025483f>] update_dirty_page+0x6f/0xd0 [f2fs]
      
      and this task is already holding:
       (&(&q->__queue_lock)->rlock){-.-.-.}, at: [<ffffffff81396ea2>] blk_queue_bio+0x422/0x490
      which would create a new lock dependency:
       (&(&q->__queue_lock)->rlock){-.-.-.} -> (&(&sbi->inode_lock[i])->rlock){+.+...}
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      e3ef1876
    • J
      f2fs: avoid needless sync_inode_page when reading inline_data · ae96e7bd
      Jaegeuk Kim 提交于
      In write_begin, if there is an inline_data, f2fs loads it into 0'th data page.
      Since it's the read path, we don't need to sync its inode page.
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      ae96e7bd
    • J
      f2fs: don't need to sync node page at every time · 52f80337
      Jaegeuk Kim 提交于
      In write_end, we don't need to sync inode page at every time.
      Instead, we can expect f2fs_write_inode will update later.
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      52f80337
    • J
      f2fs: avoid multiple node page writes due to inline_data · 2049d4fc
      Jaegeuk Kim 提交于
      The sceanrio is:
      1. create fully node blocks
      2. flush node blocks
      3. write inline_data for all the node blocks again
      4. flush node blocks redundantly
      
      So, this patch tries to flush inline_data when flushing node blocks.
      Reviewed-by: NChao Yu <chao2.yu@samsung.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      2049d4fc
    • J
      f2fs: do f2fs_balance_fs when block is allocated · 3c082b7b
      Jaegeuk Kim 提交于
      We should consider data block allocation to trigger f2fs_balance_fs.
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      3c082b7b
    • J
      f2fs: use writepages->lock for WB_SYNC_ALL · 25c13551
      Jaegeuk Kim 提交于
      If there are many writepages calls by multiple threads in background, we don't
      need to serialize to merge all the bios, since it's background.
      In such the case, it'd better to run writepages concurrently.
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      25c13551
    • J
      f2fs: remove needless condition check · b483fadf
      Jaegeuk Kim 提交于
      This patch removes needless condition variable.
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      b483fadf