1. 08 9月, 2016 1 次提交
  2. 30 8月, 2016 3 次提交
  3. 19 8月, 2016 1 次提交
    • C
      Revert "f2fs: move i_size_write in f2fs_write_end" · 3024c9a1
      Chao Yu 提交于
      This reverts commit a2ee0a30.
      
      When testing with generic/032 of xfstest suit, failure message will be
      reported as below:
      
      generic/032 8s ... [failed, exit status 1] - output mismatch (see results/generic/032.out.bad)
          --- tests/generic/032.out	2015-01-11 16:52:27.643681072 +0800
          +++ results/generic/032.out.bad	2016-08-06 13:44:43.861330500 +0800
          @@ -1,5 +1,5 @@
           QA output created by 032
          -100 iterations
          -0000000 cdcd cdcd cdcd cdcd cdcd cdcd cdcd cdcd
          -*
          -0100000
          +1: [768..775]: unwritten
          +Unwritten extents found!
          ...
          (Run 'diff -u tests/generic/032.out results/generic/032.out.bad'  to see the entire diff)
      Ran: generic/032
      Failures: generic/032
      Failed 1 of 1 tests
      
      In write_end(), we should update i_size of inode before unlock page,
      otherwise, we will lose newly updated data in following race condition.
      
      Thread A			Thread B
      - write_end
       - unlock page
      				- writepages
      				 - lock_page
      				  - writepage
      				  if page is out-of-range of file size,
      				  we will skip writting the page.
       - update i_size
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      3024c9a1
  4. 05 8月, 2016 1 次提交
  5. 27 7月, 2016 1 次提交
    • M
      mm, memcg: use consistent gfp flags during readahead · 8a5c743e
      Michal Hocko 提交于
      Vladimir has noticed that we might declare memcg oom even during
      readahead because read_pages only uses GFP_KERNEL (with mapping_gfp
      restriction) while __do_page_cache_readahead uses
      page_cache_alloc_readahead which adds __GFP_NORETRY to prevent from
      OOMs.  This gfp mask discrepancy is really unfortunate and easily
      fixable.  Drop page_cache_alloc_readahead() which only has one user and
      outsource the gfp_mask logic into readahead_gfp_mask and propagate this
      mask from __do_page_cache_readahead down to read_pages.
      
      This alone would have only very limited impact as most filesystems are
      implementing ->readpages and the common implementation mpage_readpages
      does GFP_KERNEL (with mapping_gfp restriction) again.  We can tell it to
      use readahead_gfp_mask instead as this function is called only during
      readahead as well.  The same applies to read_cache_pages.
      
      ext4 has its own ext4_mpage_readpages but the path which has pages !=
      NULL can use the same gfp mask.  Btrfs, cifs, f2fs and orangefs are
      doing a very similar pattern to mpage_readpages so the same can be
      applied to them as well.
      
      [akpm@linux-foundation.org: coding-style fixes]
      [mhocko@suse.com: restrict gfp mask in mpage_alloc]
        Link: http://lkml.kernel.org/r/20160610074223.GC32285@dhcp22.suse.cz
      Link: http://lkml.kernel.org/r/1465301556-26431-1-git-send-email-mhocko@kernel.orgSigned-off-by: NMichal Hocko <mhocko@suse.com>
      Cc: Vladimir Davydov <vdavydov@parallels.com>
      Cc: Chris Mason <clm@fb.com>
      Cc: Steve French <sfrench@samba.org>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Mike Marshall <hubcap@omnibond.com>
      Cc: Jaegeuk Kim <jaegeuk@kernel.org>
      Cc: Changman Lee <cm224.lee@samsung.com>
      Cc: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8a5c743e
  6. 26 7月, 2016 1 次提交
  7. 16 7月, 2016 3 次提交
    • J
      f2fs: use blk_plug in all the possible paths · 9dfa1baf
      Jaegeuk Kim 提交于
      This patch reverts 19a5f5e2 (f2fs: drop any block plugging),
      and adds blk_plug in write paths additionally.
      
      The main reason is that blk_start_plug can be used to wake up from low-power
      mode before submitting further bios.
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      9dfa1baf
    • C
      f2fs: fix to avoid data update racing between GC and DIO · 82e0a5aa
      Chao Yu 提交于
      Datas in file can be operated by GC and DIO simultaneously, so we will
      face race case as below:
      
      For write case:
      Thread A				Thread B
      - generic_file_direct_write
       - invalidate_inode_pages2_range
       - f2fs_direct_IO
        - do_blockdev_direct_IO
         - do_direct_IO
          - get_more_blocks
      					- f2fs_gc
      					 - do_garbage_collect
      					  - gc_data_segment
      					   - move_data_page
      					    - do_write_data_page
      					    migrate data block to new block address
         - dio_bio_submit
         update user data to old block address
      
      For read case:
      Thread A                                Thread B
      - generic_file_direct_write
       - invalidate_inode_pages2_range
       - f2fs_direct_IO
        - do_blockdev_direct_IO
         - do_direct_IO
          - get_more_blocks
      					- f2fs_balance_fs
      					 - f2fs_gc
      					  - do_garbage_collect
      					   - gc_data_segment
      					    - move_data_page
      					     - do_write_data_page
      					     migrate data block to new block address
      					  - write_checkpoint
      					   - do_checkpoint
      					    - clear_prefree_segments
      					     - f2fs_issue_discard
                                                   discard old block adress
         - dio_bio_submit
         update user buffer from obsolete block address
      
      In order to fix this, for one file, we should let DIO and GC getting exclusion
      against with each other.
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      82e0a5aa
    • J
      f2fs: fix ERR_PTR returned by bio · 1d353eb7
      Jaegeuk Kim 提交于
      This is to fix wrong error pointer handling flow reported by Dan.
      Reported-by: NDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: NChao Yu <chao@kernel.org>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      1d353eb7
  8. 09 7月, 2016 5 次提交
  9. 07 7月, 2016 2 次提交
  10. 14 6月, 2016 1 次提交
  11. 09 6月, 2016 2 次提交
  12. 08 6月, 2016 2 次提交
  13. 03 6月, 2016 9 次提交
  14. 21 5月, 2016 1 次提交
  15. 19 5月, 2016 1 次提交
  16. 12 5月, 2016 2 次提交
    • C
      f2fs: fix deadlock when flush inline data · ab47036d
      Chao Yu 提交于
      Below backtrace info was reported by Yunlei He:
      
      Call Trace:
       [<ffffffff817a9395>] schedule+0x35/0x80
       [<ffffffff817abb7d>] rwsem_down_read_failed+0xed/0x130
       [<ffffffff813c12a8>] call_rwsem_down_read_failed+0x18/0x
       [<ffffffff817ab1d0>] down_read+0x20/0x30
       [<ffffffffa02a1a12>] f2fs_evict_inode+0x242/0x3a0 [f2fs]
       [<ffffffff81217057>] evict+0xc7/0x1a0
       [<ffffffff81217cd6>] iput+0x196/0x200
       [<ffffffff812134f9>] __dentry_kill+0x179/0x1e0
       [<ffffffff812136f9>] dput+0x199/0x1f0
       [<ffffffff811fe77b>] __fput+0x18b/0x220
       [<ffffffff811fe84e>] ____fput+0xe/0x10
       [<ffffffff81097427>] task_work_run+0x77/0x90
       [<ffffffff81074d62>] exit_to_usermode_loop+0x73/0xa2
       [<ffffffff81003b7a>] do_syscall_64+0xfa/0x110
       [<ffffffff817acf65>] entry_SYSCALL64_slow_path+0x25/0x25
      
      Call Trace:
       [<ffffffff817a9395>] schedule+0x35/0x80
       [<ffffffff81216dc3>] __wait_on_freeing_inode+0xa3/0xd0
       [<ffffffff810bc300>] ? autoremove_wake_function+0x40/0x4
       [<ffffffff8121771d>] find_inode_fast+0x7d/0xb0
       [<ffffffff8121794a>] ilookup+0x6a/0xd0
       [<ffffffffa02bc740>] sync_node_pages+0x210/0x650 [f2fs]
       [<ffffffff8122e690>] ? do_fsync+0x70/0x70
       [<ffffffffa02b085e>] block_operations+0x9e/0xf0 [f2fs]
       [<ffffffff8137b795>] ? bio_endio+0x55/0x60
       [<ffffffffa02b0942>] write_checkpoint+0x92/0xba0 [f2fs]
       [<ffffffff8117da57>] ? mempool_free_slab+0x17/0x20
       [<ffffffff8117de8b>] ? mempool_free+0x2b/0x80
       [<ffffffff8122e690>] ? do_fsync+0x70/0x70
       [<ffffffffa02a53e3>] f2fs_sync_fs+0x63/0xd0 [f2fs]
       [<ffffffff8129630f>] ? ext4_sync_fs+0xbf/0x190
       [<ffffffff8122e6b0>] sync_fs_one_sb+0x20/0x30
       [<ffffffff812002e9>] iterate_supers+0xb9/0x110
       [<ffffffff8122e7b5>] sys_sync+0x55/0x90
       [<ffffffff81003ae9>] do_syscall_64+0x69/0x110
       [<ffffffff817acf65>] entry_SYSCALL64_slow_path+0x25/0x25
      
      With following excuting serials, we will set inline_node in inode page
      after inode was unlinked, result in a deadloop described as below:
      1. open file
      2. write file
      3. unlink file
      4. write file
      5. close file
      
      Thread A				Thread B
       - dput
        - iput_final
         - inode->i_state |= I_FREEING
         - evict
          - f2fs_evict_inode
      					 - f2fs_sync_fs
      					  - write_checkpoint
      					   - block_operations
      					    - f2fs_lock_all (down_write(cp_rwsem))
           - f2fs_lock_op (down_read(cp_rwsem))
      					    - sync_node_pages
      					     - ilookup
      					      - find_inode_fast
      					       - __wait_on_freeing_inode
      					         (wait on I_FREEING clear)
      
      Here, we change to set inline_node flag only for linked inode for fixing.
      Reported-by: NYunlei He <heyunlei@huawei.com>
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Tested-by: NJaegeuk Kim <jaegeuk@kernel.org>
      Cc: stable@vger.kernel.org # v4.6
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      ab47036d
    • C
      f2fs: support in batch multi blocks preallocation · 46008c6d
      Chao Yu 提交于
      This patch introduces reserve_new_blocks to make preallocation of multi
      blocks as in batch operation, so it can avoid lots of redundant
      operation, result in better performance.
      
      In virtual machine, with rotational device:
      
      time fallocate -l 32G /mnt/f2fs/file
      
      Before:
      real	0m4.584s
      user	0m0.000s
      sys	0m4.580s
      
      After:
      real	0m0.292s
      user	0m0.000s
      sys	0m0.272s
      
      In x86, with SSD:
      
      time fallocate -l 500G $MNT/testfile
      
      Before : 24.758 s
      After  :  1.604 s
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      [Jaegeuk Kim: fix bugs and add performance numbers measured in x86.]
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      46008c6d
  17. 08 5月, 2016 2 次提交
  18. 04 5月, 2016 1 次提交
  19. 02 5月, 2016 1 次提交