1. 13 9月, 2016 2 次提交
  2. 08 9月, 2016 13 次提交
    • J
      f2fs: no need to make zeros beyond i_size · 68f31393
      Jaegeuk Kim 提交于
      We don't need to make zeros beyond i_size, since we already wrote that through
      NEW_ADDR case.
      Reported-by: NAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      68f31393
    • C
      f2fs: fix to detect temporary name of multimedia file · 7732c26a
      Chao Yu 提交于
      Some applications may create multimeida file with temporary name like
      '*.jpg.tmp' or '*.mp4.tmp', then rename to '*.jpg' or '*.mp4'.
      
      Now, f2fs can only detect multimedia filename with specified format:
      "filename + '.' + extension", so it will make f2fs missing to detect
      multimedia file with special temporary name, result in failing to set
      cold flag on file.
      
      This patch enhances detection flow for enabling lookup extension in the
      middle of temporary filename.
      Reported-by: NXue Liu <liuxueliu.liu@huawei.com>
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      7732c26a
    • C
      f2fs: fix minor typo · 6ab2a308
      Chao Yu 提交于
      Correct typo from 'destory' to 'destroy'.
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      6ab2a308
    • J
      f2fs: set dentry bits on random location in memory · 6bf6b267
      Jaegeuk Kim 提交于
      This fixes pointer panic when using inline_dentry, which was triggered when
      backporting to 3.10.
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      6bf6b267
    • C
      f2fs: fix to set superblock dirty correctly · c2a080ae
      Chao Yu 提交于
      tests/generic/251 of fstest suit complains us with below message:
      
      ------------[ cut here ]------------
      invalid opcode: 0000 [#1] PREEMPT SMP
      CPU: 2 PID: 7698 Comm: fstrim Tainted: G           O    4.7.0+ #21
      task: e9f4e000 task.stack: e7262000
      EIP: 0060:[<f89fcefe>] EFLAGS: 00010202 CPU: 2
      EIP is at write_checkpoint+0xfde/0x1020 [f2fs]
      EAX: f33eb300 EBX: eecac310 ECX: 00000001 EDX: ffff0001
      ESI: eecac000 EDI: eecac5f0 EBP: e7263dec ESP: e7263d18
       DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
      CR0: 80050033 CR2: b76ab01c CR3: 2eb89de0 CR4: 000406f0
      Stack:
       00000001 a220fb7b e9f4e000 00000002 419ff2d3 b3a05151 00000002 e9f4e5d8
       e9f4e000 419ff2d3 b3a05151 eecac310 c10b8154 b3a05151 419ff2d3 c10b78bd
       e9f4e000 e9f4e000 e9f4e5d8 00000001 e9f4e000 ec409000 eecac2cc eecac288
      Call Trace:
       [<c10b8154>] ? __lock_acquire+0x3c4/0x760
       [<c10b78bd>] ? mark_held_locks+0x5d/0x80
       [<f8a10632>] f2fs_trim_fs+0x1c2/0x2e0 [f2fs]
       [<f89e9f56>] f2fs_ioctl+0x6b6/0x10b0 [f2fs]
       [<c13d51df>] ? __this_cpu_preempt_check+0xf/0x20
       [<c10b4281>] ? trace_hardirqs_off_caller+0x91/0x120
       [<f89e98a0>] ? __exchange_data_block+0xd30/0xd30 [f2fs]
       [<c120b2e1>] do_vfs_ioctl+0x81/0x7f0
       [<c11d57c5>] ? kmem_cache_free+0x245/0x2e0
       [<c1217840>] ? get_unused_fd_flags+0x40/0x40
       [<c1206eec>] ? putname+0x4c/0x50
       [<c11f631e>] ? do_sys_open+0x16e/0x1d0
       [<c1001990>] ? do_fast_syscall_32+0x30/0x1c0
       [<c13d51df>] ? __this_cpu_preempt_check+0xf/0x20
       [<c120baa8>] SyS_ioctl+0x58/0x80
       [<c1001a01>] do_fast_syscall_32+0xa1/0x1c0
       [<c178cc54>] sysenter_past_esp+0x45/0x74
      EIP: [<f89fcefe>] write_checkpoint+0xfde/0x1020 [f2fs] SS:ESP 0068:e7263d18
      ---[ end trace 4de95d7e6b3aa7c6 ]---
      
      The reason is: with below call stack, we will encounter BUG_ON during
      doing fstrim.
      
      Thread A				Thread B
      - write_checkpoint
       - do_checkpoint
      					- f2fs_write_inode
      					 - update_inode_page
      					  - update_inode
      					   - set_page_dirty
      					    - f2fs_set_node_page_dirty
      					     - inc_page_count
      					      - percpu_counter_inc
      					      - set_sbi_flag(SBI_IS_DIRTY)
        - clear_sbi_flag(SBI_IS_DIRTY)
      
      Thread C				Thread D
      - f2fs_write_node_page
       - set_node_addr
        - __set_nat_cache_dirty
         - nm_i->dirty_nat_cnt++
      					- do_vfs_ioctl
      					 - f2fs_ioctl
      					  - f2fs_trim_fs
      					   - write_checkpoint
      					    - f2fs_bug_on(nm_i->dirty_nat_cnt)
      
      Fix it by setting superblock dirty correctly in do_checkpoint and
      f2fs_write_node_page.
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      c2a080ae
    • S
      f2fs: add roll-forward recovery process for encrypted dentry · e7ba108a
      Shuoran Liu 提交于
      Add roll-forward recovery process for encrypted dentry, so the first fsync
      issued to an encrypted file does not need writing checkpoint.
      
      This improves the performance of the following test at thousands of small
      files: open -> write -> fsync -> close
      Signed-off-by: NShuoran Liu <liushuoran@huawei.com>
      Acked-by: NChao Yu <yuchao0@huawei.com>
      [Jaegeuk Kim: modify kernel message to show encrypted names]
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      e7ba108a
    • J
      f2fs: fix lost xattrs of directories · bbf156f7
      Jaegeuk Kim 提交于
      This patch enhances the xattr consistency of dirs from suddern power-cuts.
      
      Possible scenario would be:
      1. dir->setxattr used by per-file encryption
      2. file->setxattr goes into inline_xattr
      3. file->fsync
      
      In that case, we should do checkpoint for #1.
      Otherwise we'd lose dir's key information for the file given #2.
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      bbf156f7
    • C
      f2fs: support async discard · 275b66b0
      Chao Yu 提交于
      Like most filesystems, f2fs will issue discard command synchronously, so
      when user trigger fstrim through ioctl, multiple discard commands will be
      issued serially with sync mode, which makes poor performance.
      
      In this patch we try to support async discard, so that all discard
      commands can be issued and be waited for endio in batch to improve
      performance.
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      275b66b0
    • S
      f2fs: set encryption name flag in add inline entry path · 167451ef
      Shuoran Liu 提交于
      This patch sets encryption name flag in the add inline entry path
      if filename is encrypted.
      Signed-off-by: NShuoran Liu <liushuoran@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      167451ef
    • C
      f2fs crypto: avoid unneeded memory allocation in ->readdir · e06f86e6
      Chao Yu 提交于
      When decrypting dirents in ->readdir, fscrypt_fname_disk_to_usr won't
      change content of original encrypted dirent, we don't need to allocate
      additional buffer for storing mirror of it, so get rid of it.
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      e06f86e6
    • C
      f2fs: fix to do security initialization of encrypted inode with original filename · 9421d570
      Chao Yu 提交于
      When creating new inode, security_inode_init_security will be called for
      initializing security info related to the inode, and filename is passed to
      security module, it helps security module such as SElinux to know which
      rule or label could be applied for the inode with specified name.
      
      Previously, if new inode is created as an encrypted one, f2fs will transfer
      encrypted filename to security module which may fail the check of security
      policy belong to the inode. So in order to this issue, alter to transfer
      original unencrypted filename instead.
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      9421d570
    • C
      f2fs: do in batch synchronously readahead during GC · 7ea984b0
      Chao Yu 提交于
      In order to enhance performance, we try to readahead node page during
      GC, but before loading node page we should get block address of node page
      which is stored in NAT table, so synchronously read of single NAT page
      block our readahead flow.
      
      f2fs_submit_page_bio: dev = (251,0), ino = 2, page_index = 0xa1e, oldaddr = 0xa1e, newaddr = 0xa1e, rw = READ_SYNC(MP), type = META
      f2fs_submit_page_bio: dev = (251,0), ino = 1, page_index = 0x35e9, oldaddr = 0x72d7a, newaddr = 0x72d7a, rw = READAHEAD ^H, type = NODE
      f2fs_submit_page_bio: dev = (251,0), ino = 2, page_index = 0xc1f, oldaddr = 0xc1f, newaddr = 0xc1f, rw = READ_SYNC(MP), type = META
      f2fs_submit_page_bio: dev = (251,0), ino = 1, page_index = 0x389d, oldaddr = 0x72d7d, newaddr = 0x72d7d, rw = READAHEAD ^H, type = NODE
      f2fs_submit_page_bio: dev = (251,0), ino = 1, page_index = 0x3a82, oldaddr = 0x72d7f, newaddr = 0x72d7f, rw = READAHEAD ^H, type = NODE
      f2fs_submit_page_bio: dev = (251,0), ino = 1, page_index = 0x3bfa, oldaddr = 0x72d86, newaddr = 0x72d86, rw = READAHEAD ^H, type = NODE
      
      This patch adds one phase that do readahead NAT pages in batch before
      readahead node page for more effeciently.
      
      f2fs_submit_page_bio: dev = (251,0), ino = 2, page_index = 0x1952, oldaddr = 0x1952, newaddr = 0x1952, rw = READ_SYNC(MP), type = META
      f2fs_submit_page_mbio: dev = (251,0), ino = 2, page_index = 0xc34, oldaddr = 0xc34, newaddr = 0xc34, rw = READ_SYNC(MP), type = META
      f2fs_submit_page_mbio: dev = (251,0), ino = 2, page_index = 0xa33, oldaddr = 0xa33, newaddr = 0xa33, rw = READ_SYNC(MP), type = META
      f2fs_submit_page_mbio: dev = (251,0), ino = 2, page_index = 0xc30, oldaddr = 0xc30, newaddr = 0xc30, rw = READ_SYNC(MP), type = META
      f2fs_submit_page_mbio: dev = (251,0), ino = 2, page_index = 0xc32, oldaddr = 0xc32, newaddr = 0xc32, rw = READ_SYNC(MP), type = META
      f2fs_submit_page_mbio: dev = (251,0), ino = 2, page_index = 0xc26, oldaddr = 0xc26, newaddr = 0xc26, rw = READ_SYNC(MP), type = META
      f2fs_submit_page_mbio: dev = (251,0), ino = 2, page_index = 0xa2b, oldaddr = 0xa2b, newaddr = 0xa2b, rw = READ_SYNC(MP), type = META
      f2fs_submit_page_mbio: dev = (251,0), ino = 2, page_index = 0xc23, oldaddr = 0xc23, newaddr = 0xc23, rw = READ_SYNC(MP), type = META
      f2fs_submit_page_mbio: dev = (251,0), ino = 2, page_index = 0xc24, oldaddr = 0xc24, newaddr = 0xc24, rw = READ_SYNC(MP), type = META
      f2fs_submit_page_mbio: dev = (251,0), ino = 2, page_index = 0xa10, oldaddr = 0xa10, newaddr = 0xa10, rw = READ_SYNC(MP), type = META
      f2fs_submit_page_mbio: dev = (251,0), ino = 2, page_index = 0xc2c, oldaddr = 0xc2c, newaddr = 0xc2c, rw = READ_SYNC(MP), type = META
      f2fs_submit_page_bio: dev = (251,0), ino = 1, page_index = 0x5db7, oldaddr = 0x6be00, newaddr = 0x6be00, rw = READAHEAD ^H, type = NODE
      f2fs_submit_page_bio: dev = (251,0), ino = 1, page_index = 0x5db9, oldaddr = 0x6be17, newaddr = 0x6be17, rw = READAHEAD ^H, type = NODE
      f2fs_submit_page_bio: dev = (251,0), ino = 1, page_index = 0x5dbc, oldaddr = 0x6be1a, newaddr = 0x6be1a, rw = READAHEAD ^H, type = NODE
      f2fs_submit_page_bio: dev = (251,0), ino = 1, page_index = 0x5dc3, oldaddr = 0x6be20, newaddr = 0x6be20, rw = READAHEAD ^H, type = NODE
      f2fs_submit_page_bio: dev = (251,0), ino = 1, page_index = 0x5dc7, oldaddr = 0x6be24, newaddr = 0x6be24, rw = READAHEAD ^H, type = NODE
      f2fs_submit_page_bio: dev = (251,0), ino = 1, page_index = 0x5dc9, oldaddr = 0x6be25, newaddr = 0x6be25, rw = READAHEAD ^H, type = NODE
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      7ea984b0
    • C
      f2fs: schedule in between two continous batch discards · 74fa5f3d
      Chao Yu 提交于
      In batch discard approach of fstrim will grab/release gc_mutex lock
      repeatly, it makes contention of the lock becoming more intensive.
      
      So after one batch discards were issued in checkpoint and the lock
      was released, it's better to do schedule() to increase opportunity
      of grabbing gc_mutex lock for other competitors.
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      74fa5f3d
  3. 30 8月, 2016 14 次提交
  4. 25 8月, 2016 2 次提交
    • J
      f2fs: do not use discard_map for hard disks · 3e025740
      Jaegeuk Kim 提交于
      We don't need to keep discard_map, if disk does not support discard command.
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      3e025740
    • Y
      f2fs: not allow to write illegal blkaddr · bb413d6a
      Yunlei He 提交于
      we came across an error as below:
      
      [build_nat_area_bitmap:1710] nid[0x    1718] addr[0x         1c18ddc] ino[0x    1718]
      [build_nat_area_bitmap:1710] nid[0x    1719] addr[0x         1c193d5] ino[0x    1719]
      [build_nat_area_bitmap:1710] nid[0x    171a] addr[0x         1c1736e] ino[0x    171a]
      [build_nat_area_bitmap:1710] nid[0x    171b] addr[0x        58b3ee8f] ino[0x815f92ed]
      [build_nat_area_bitmap:1710] nid[0x    171c] addr[0x         fcdc94b] ino[0x49366377]
      [build_nat_area_bitmap:1710] nid[0x    171d] addr[0x        7cd2facf] ino[0xb3c55300]
      [build_nat_area_bitmap:1710] nid[0x    171e] addr[0x        bd4e25d0] ino[0x77c34c09]
      
      ... ...
      
      [build_nat_area_bitmap:1710] nid[0x    1718] addr[0x         1c18ddc] ino[0x    1718]
      [build_nat_area_bitmap:1710] nid[0x    1719] addr[0x         1c193d5] ino[0x    1719]
      [build_nat_area_bitmap:1710] nid[0x    171a] addr[0x         1c1736e] ino[0x    171a]
      [build_nat_area_bitmap:1710] nid[0x    171b] addr[0x        58b3ee8f] ino[0x815f92ed]
      [build_nat_area_bitmap:1710] nid[0x    171c] addr[0x         fcdc94b] ino[0x49366377]
      [build_nat_area_bitmap:1710] nid[0x    171d] addr[0x        7cd2facf] ino[0xb3c55300]
      [build_nat_area_bitmap:1710] nid[0x    171e] addr[0x        bd4e25d0] ino[0x77c34c09]
      
      One nat block may be stepped by a data block, so this patch forbid to
      write if the blkaddr is illegal
      Signed-off-by: NYunlei He <heyunlei@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      bb413d6a
  5. 19 8月, 2016 4 次提交
  6. 05 8月, 2016 1 次提交
  7. 31 7月, 2016 1 次提交
  8. 27 7月, 2016 1 次提交
    • M
      mm, memcg: use consistent gfp flags during readahead · 8a5c743e
      Michal Hocko 提交于
      Vladimir has noticed that we might declare memcg oom even during
      readahead because read_pages only uses GFP_KERNEL (with mapping_gfp
      restriction) while __do_page_cache_readahead uses
      page_cache_alloc_readahead which adds __GFP_NORETRY to prevent from
      OOMs.  This gfp mask discrepancy is really unfortunate and easily
      fixable.  Drop page_cache_alloc_readahead() which only has one user and
      outsource the gfp_mask logic into readahead_gfp_mask and propagate this
      mask from __do_page_cache_readahead down to read_pages.
      
      This alone would have only very limited impact as most filesystems are
      implementing ->readpages and the common implementation mpage_readpages
      does GFP_KERNEL (with mapping_gfp restriction) again.  We can tell it to
      use readahead_gfp_mask instead as this function is called only during
      readahead as well.  The same applies to read_cache_pages.
      
      ext4 has its own ext4_mpage_readpages but the path which has pages !=
      NULL can use the same gfp mask.  Btrfs, cifs, f2fs and orangefs are
      doing a very similar pattern to mpage_readpages so the same can be
      applied to them as well.
      
      [akpm@linux-foundation.org: coding-style fixes]
      [mhocko@suse.com: restrict gfp mask in mpage_alloc]
        Link: http://lkml.kernel.org/r/20160610074223.GC32285@dhcp22.suse.cz
      Link: http://lkml.kernel.org/r/1465301556-26431-1-git-send-email-mhocko@kernel.orgSigned-off-by: NMichal Hocko <mhocko@suse.com>
      Cc: Vladimir Davydov <vdavydov@parallels.com>
      Cc: Chris Mason <clm@fb.com>
      Cc: Steve French <sfrench@samba.org>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Mike Marshall <hubcap@omnibond.com>
      Cc: Jaegeuk Kim <jaegeuk@kernel.org>
      Cc: Changman Lee <cm224.lee@samsung.com>
      Cc: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8a5c743e
  9. 26 7月, 2016 1 次提交
  10. 23 7月, 2016 1 次提交
    • Y
      f2fs: get victim segment again after new cp · fe94793e
      Yunlei He 提交于
      Previous selected segment may become free after write_checkpoint,
      if we do garbage collect on this segment, and then new_curseg happen
      to reuse it, it may cause f2fs_bug_on as below.
      
      	panic+0x154/0x29c
      	do_garbage_collect+0x15c/0xaf4
      	f2fs_gc+0x2dc/0x444
      	f2fs_balance_fs.part.22+0xcc/0x14c
      	f2fs_balance_fs+0x28/0x34
      	f2fs_map_blocks+0x5ec/0x790
      	f2fs_preallocate_blocks+0xe0/0x100
      	f2fs_file_write_iter+0x64/0x11c
      	new_sync_write+0xac/0x11c
      	vfs_write+0x144/0x1e4
      	SyS_write+0x60/0xc0
      
      Here, maybe we check sit and ssa type during reset_curseg. So, we check
      segment is stale or not, and select a new victim to avoid this.
      Signed-off-by: NYunlei He <heyunlei@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      fe94793e