1. 13 2月, 2019 4 次提交
  2. 10 1月, 2019 3 次提交
  3. 14 11月, 2018 11 次提交
    • C
      f2fs: fix to account IO correctly · 5cfb4a68
      Chao Yu 提交于
      commit 4c58ed076875f36dae0f240da1e25e99e5d4afb8 upstream.
      
      Below race can cause reversed reference on dirty count, fix it by
      relocating __submit_bio() and inc_page_count().
      
      Thread A				Thread B
      - f2fs_inplace_write_data
       - f2fs_submit_page_bio
        - __submit_bio
      					- f2fs_write_end_io
      					 - dec_page_count
        - inc_page_count
      
      Cc: <stable@vger.kernel.org>
      Fixes: d1b3e72d ("f2fs: submit bio of in-place-update pages")
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5cfb4a68
    • C
      f2fs: fix to recover cold bit of inode block during POR · 56db43fd
      Chao Yu 提交于
      commit ef2a007134b4eaa39264c885999f296577bc87d2 upstream.
      
      Testcase to reproduce this bug:
      1. mkfs.f2fs /dev/sdd
      2. mount -t f2fs /dev/sdd /mnt/f2fs
      3. touch /mnt/f2fs/file
      4. sync
      5. chattr +A /mnt/f2fs/file
      6. xfs_io -f /mnt/f2fs/file -c "fsync"
      7. godown /mnt/f2fs
      8. umount /mnt/f2fs
      9. mount -t f2fs /dev/sdd /mnt/f2fs
      10. chattr -A /mnt/f2fs/file
      11. xfs_io -f /mnt/f2fs/file -c "fsync"
      12. umount /mnt/f2fs
      13. mount -t f2fs /dev/sdd /mnt/f2fs
      14. lsattr /mnt/f2fs/file
      
      -----------------N- /mnt/f2fs/file
      
      But actually, we expect the corrct result is:
      
      -------A---------N- /mnt/f2fs/file
      
      The reason is in step 9) we missed to recover cold bit flag in inode
      block, so later, in fsync, we will skip write inode block due to below
      condition check, result in lossing data in another SPOR.
      
      f2fs_fsync_node_pages()
      	if (!IS_DNODE(page) || !is_cold_node(page))
      		continue;
      
      Note that, I guess that some non-dir inode has already lost cold bit
      during POR, so in order to reenable recovery for those inode, let's
      try to recover cold bit in f2fs_iget() to save more fsynced data.
      
      Fixes: c5667575 ("f2fs: remove unneeded set_cold_node()")
      Cc: <stable@vger.kernel.org> 4.17+
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      56db43fd
    • J
      f2fs: fix missing up_read · d224115a
      Jaegeuk Kim 提交于
      commit 89d13c38501df730cbb2e02c4499da1b5187119d upstream.
      
      This patch fixes missing up_read call.
      
      Fixes: c9b60788 ("f2fs: fix to do sanity check with block address in main area")
      Cc: <stable@vger.kernel.org> # 4.19+
      Reviewed-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d224115a
    • J
      Revert "f2fs: fix to clear PG_checked flag in set_page_dirty()" · 4561194f
      Jaegeuk Kim 提交于
      commit 164a63fa6b384e30ceb96ed80bc7dc3379bc0960 upstream.
      
      This reverts commit 66110abc.
      
      If we clear the cold data flag out of the writeback flow, we can miscount
      -1 by end_io, which incurs a deadlock caused by all I/Os being blocked during
      heavy GC.
      
      Balancing F2FS Async:
       - IO (CP:    1, Data:   -1, Flush: (   0    0    1), Discard: (   ...
      
      GC thread:                              IRQ
      - move_data_page()
       - set_page_dirty()
        - clear_cold_data()
                                              - f2fs_write_end_io()
                                               - type = WB_DATA_TYPE(page);
                                                 here, we get wrong type
                                               - dec_page_count(sbi, type);
       - f2fs_wait_on_page_writeback()
      
      Cc: <stable@vger.kernel.org>
      Reported-and-Tested-by: NPark Ju Hyung <qkrwngud825@gmail.com>
      Reviewed-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4561194f
    • C
      f2fs: fix to flush all dirty inodes recovered in readonly fs · cd295fdd
      Chao Yu 提交于
      [ Upstream commit 1378752b9921e60749eaf18ec6c47b33f9001abb ]
      
      generic/417 reported as blow:
      
      ------------[ cut here ]------------
      kernel BUG at /home/yuchao/git/devf2fs/inode.c:695!
      invalid opcode: 0000 [#1] PREEMPT SMP
      CPU: 1 PID: 21697 Comm: umount Tainted: G        W  O      4.18.0-rc2+ #39
      Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
      EIP: f2fs_evict_inode+0x556/0x580 [f2fs]
      Call Trace:
       ? _raw_spin_unlock+0x2c/0x50
       evict+0xa8/0x170
       dispose_list+0x34/0x40
       evict_inodes+0x118/0x120
       generic_shutdown_super+0x41/0x100
       ? rcu_read_lock_sched_held+0x97/0xa0
       kill_block_super+0x22/0x50
       kill_f2fs_super+0x6f/0x80 [f2fs]
       deactivate_locked_super+0x3d/0x70
       deactivate_super+0x40/0x60
       cleanup_mnt+0x39/0x70
       __cleanup_mnt+0x10/0x20
       task_work_run+0x81/0xa0
       exit_to_usermode_loop+0x59/0xa7
       do_fast_syscall_32+0x1f5/0x22c
       entry_SYSENTER_32+0x53/0x86
      EIP: f2fs_evict_inode+0x556/0x580 [f2fs]
      
      It can simply reproduced with scripts:
      
      Enable quota feature during mkfs.
      
      Testcase1:
      1. mkfs.f2fs /dev/zram0
      2. mount -t f2fs /dev/zram0 /mnt/f2fs
      3. xfs_io -f /mnt/f2fs/file -c "pwrite 0 4k" -c "fsync"
      4. godown /mnt/f2fs
      5. umount /mnt/f2fs
      6. mount -t f2fs -o ro /dev/zram0 /mnt/f2fs
      7. umount /mnt/f2fs
      
      Testcase2:
      1. mkfs.f2fs /dev/zram0
      2. mount -t f2fs /dev/zram0 /mnt/f2fs
      3. touch /mnt/f2fs/file
      4. create process[pid = x] do:
      	a) open /mnt/f2fs/file;
      	b) unlink /mnt/f2fs/file
      5. godown -f /mnt/f2fs
      6. kill process[pid = x]
      7. umount /mnt/f2fs
      8. mount -t f2fs -o ro /dev/zram0 /mnt/f2fs
      9. umount /mnt/f2fs
      
      The reason is: during recovery, i_{c,m}time of inode will be updated, then
      the inode can be set dirty w/o being tracked in sbi->inode_list[DIRTY_META]
      global list, so later write_checkpoint will not flush such dirty inode into
      node page.
      
      Once umount is called, sync_filesystem() in generic_shutdown_super() will
      skip syncng dirty inodes due to sb_rdonly check, leaving dirty inodes
      there.
      
      To solve this issue, during umount, add remove SB_RDONLY flag in
      sb->s_flags, to make sure sync_filesystem() will not be skipped.
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      cd295fdd
    • Y
      f2fs: report error if quota off error during umount · cfc8a57a
      Yunlei He 提交于
      [ Upstream commit cda9cc595f0bb6ffa51a4efc4b6533dfa4039b4c ]
      
      Now, we depend on fsck to ensure quota file data is ok,
      so we scan whole partition if checkpoint without umount
      flag. It's same for quota off error case, which may make
      quota file data inconsistent.
      
      generic/019 reports below error:
      
       __quota_error: 1160 callbacks suppressed
       Quota error (device zram1): write_blk: dquota write failed
       Quota error (device zram1): qtree_write_dquot: Error -28 occurred while creating quota
       Quota error (device zram1): write_blk: dquota write failed
       Quota error (device zram1): qtree_write_dquot: Error -28 occurred while creating quota
       Quota error (device zram1): write_blk: dquota write failed
       Quota error (device zram1): qtree_write_dquot: Error -28 occurred while creating quota
       Quota error (device zram1): write_blk: dquota write failed
       Quota error (device zram1): qtree_write_dquot: Error -28 occurred while creating quota
       Quota error (device zram1): write_blk: dquota write failed
       Quota error (device zram1): qtree_write_dquot: Error -28 occurred while creating quota
       VFS: Busy inodes after unmount of zram1. Self-destruct in 5 seconds.  Have a nice day...
      
      If we failed in below path due to fail to write dquot block, we will miss
      to release quota inode, fix it.
      
      - f2fs_put_super
       - f2fs_quota_off_umount
        - f2fs_quota_off
         - f2fs_quota_sync   <-- failed
         - dquot_quota_off   <-- missed to call
      Signed-off-by: NYunlei He <heyunlei@huawei.com>
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      cfc8a57a
    • Z
      f2fs: avoid sleeping under spin_lock · 207093ca
      Zhikang Zhang 提交于
      [ Upstream commit b430f7263673eab1dc40e662ae3441a9619d16b8 ]
      
      In the call trace below, we might sleep in function dput().
      
      So in order to avoid sleeping under spin_lock, we remove f2fs_mark_inode_dirty_sync
      from __try_update_largest_extent && __drop_largest_extent.
      
      BUG: sleeping function called from invalid context at fs/dcache.c:796
      Call trace:
      	dump_backtrace+0x0/0x3f4
      	show_stack+0x24/0x30
      	dump_stack+0xe0/0x138
      	___might_sleep+0x2a8/0x2c8
      	__might_sleep+0x78/0x10c
      	dput+0x7c/0x750
      	block_dump___mark_inode_dirty+0x120/0x17c
      	__mark_inode_dirty+0x344/0x11f0
      	f2fs_mark_inode_dirty_sync+0x40/0x50
      	__insert_extent_tree+0x2e0/0x2f4
      	f2fs_update_extent_tree_range+0xcf4/0xde8
      	f2fs_update_extent_cache+0x114/0x12c
      	f2fs_update_data_blkaddr+0x40/0x50
      	write_data_page+0x150/0x314
      	do_write_data_page+0x648/0x2318
      	__write_data_page+0xdb4/0x1640
      	f2fs_write_cache_pages+0x768/0xafc
      	__f2fs_write_data_pages+0x590/0x1218
      	f2fs_write_data_pages+0x64/0x74
      	do_writepages+0x74/0xe4
      	__writeback_single_inode+0xdc/0x15f0
      	writeback_sb_inodes+0x574/0xc98
      	__writeback_inodes_wb+0x190/0x204
      	wb_writeback+0x730/0xf14
      	wb_check_old_data_flush+0x1bc/0x1c8
      	wb_workfn+0x554/0xf74
      	process_one_work+0x440/0x118c
      	worker_thread+0xac/0x974
      	kthread+0x1a0/0x1c8
      	ret_from_fork+0x10/0x1c
      Signed-off-by: NZhikang Zhang <zhangzhikang1@huawei.com>
      Reviewed-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      207093ca
    • C
      f2fs: fix to recover inode's i_flags during POR · b1097eb9
      Chao Yu 提交于
      [ Upstream commit 19c73a691ccf6fb2f12d4e9cf9830023966cec88 ]
      
      Testcase to reproduce this bug:
      1. mkfs.f2fs /dev/sdd
      2. mount -t f2fs /dev/sdd /mnt/f2fs
      3. touch /mnt/f2fs/file
      4. sync
      5. chattr +A /mnt/f2fs/file
      6. xfs_io -f /mnt/f2fs/file -c "fsync"
      7. godown /mnt/f2fs
      8. umount /mnt/f2fs
      9. mount -t f2fs /dev/sdd /mnt/f2fs
      10. lsattr /mnt/f2fs/file
      
      -----------------N- /mnt/f2fs/file
      
      But actually, we expect the corrct result is:
      
      -------A---------N- /mnt/f2fs/file
      
      The reason is we didn't recover inode.i_flags field during mount,
      fix it.
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b1097eb9
    • C
      f2fs: fix to recover inode's crtime during POR · 5ce5de03
      Chao Yu 提交于
      [ Upstream commit 5cd1f387a13b5188b4edb4c834310302a85a6ea2 ]
      
      Testcase to reproduce this bug:
      1. mkfs.f2fs -O extra_attr -O inode_crtime /dev/sdd
      2. mount -t f2fs /dev/sdd /mnt/f2fs
      3. touch /mnt/f2fs/file
      4. xfs_io -f /mnt/f2fs/file -c "fsync"
      5. godown /mnt/f2fs
      6. umount /mnt/f2fs
      7. mount -t f2fs /dev/sdd /mnt/f2fs
      8. xfs_io -f /mnt/f2fs/file -c "statx -r"
      
      stat.btime.tv_sec = 0
      stat.btime.tv_nsec = 0
      
      This patch fixes to recover inode creation time fields during
      mount.
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5ce5de03
    • J
      f2fs: clear PageError on the read path · a28549b8
      Jaegeuk Kim 提交于
      [ Upstream commit fb7d70db305a1446864227abf711b756568f8242 ]
      
      When running fault injection test, I hit somewhat wrong behavior in f2fs_gc ->
      gc_data_segment():
      
      0. fault injection generated some PageError'ed pages
      
      1. gc_data_segment
       -> f2fs_get_read_data_page(REQ_RAHEAD)
      
      2. move_data_page
       -> f2fs_get_lock_data_page()
        -> f2f_get_read_data_page()
         -> f2fs_submit_page_read()
          -> submit_bio(READ)
        -> return EIO due to PageError
        -> fail to move data
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a28549b8
    • C
      f2fs: fix to account IO correctly for cgroup writeback · 64c90d9c
      Chao Yu 提交于
      [ Upstream commit 78efac537de33faab9a4302cc05a70bb4a8b3b63 ]
      
      Now, we have supported cgroup writeback, it depends on correctly IO
      account of specified filesystem.
      
      But in commit d1b3e72d ("f2fs: submit bio of in-place-update pages"),
      we split write paths from f2fs_submit_page_mbio() to two:
      - f2fs_submit_page_bio() for IPU path
      - f2fs_submit_page_bio() for OPU path
      
      But still we account write IO only in f2fs_submit_page_mbio(), result in
      incorrect IO account, fix it by adding missing IO account in IPU path.
      
      Fixes: d1b3e72d ("f2fs: submit bio of in-place-update pages")
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      64c90d9c
  4. 21 8月, 2018 3 次提交
    • C
      f2fs: readahead encrypted block during GC · 6aa58d8a
      Chao Yu 提交于
      During GC, for each encrypted block, we will read block synchronously
      into meta page, and then submit it into current cold data log area.
      
      So this block read model with 4k granularity can make poor performance,
      like migrating non-encrypted block, let's readahead encrypted block
      as well to improve migration performance.
      
      To implement this, we choose meta page that its index is old block
      address of the encrypted block, and readahead ciphertext into this
      page, later, if readaheaded page is still updated, we will load its
      data into target meta page, and submit the write IO.
      
      Note that for OPU, truncation, deletion, we need to invalid meta
      page after we invalid old block address, to make sure we won't load
      invalid data from target meta page during encrypted block migration.
      
      for ((i = 0; i < 1000; i++))
      do {
              xfs_io -f /mnt/f2fs/dir/$i -c "pwrite 0 128k" -c "fsync";
      } done
      
      for ((i = 0; i < 1000; i+=2))
      do {
              rm /mnt/f2fs/dir/$i;
      } done
      
      ret = ioctl(fd, F2FS_IOC_GARBAGE_COLLECT, 0);
      
      Before:
                    gc-6549  [001] d..1 214682.212797: block_rq_insert: 8,32 RA 32768 () 786400 + 64 [gc]
                    gc-6549  [001] d..1 214682.212802: block_unplug: [gc] 1
                    gc-6549  [001] .... 214682.213892: block_bio_queue: 8,32 R 67494144 + 8 [gc]
                    gc-6549  [001] .... 214682.213899: block_getrq: 8,32 R 67494144 + 8 [gc]
                    gc-6549  [001] .... 214682.213902: block_plug: [gc]
                    gc-6549  [001] d..1 214682.213905: block_rq_insert: 8,32 R 4096 () 67494144 + 8 [gc]
                    gc-6549  [001] d..1 214682.213908: block_unplug: [gc] 1
                    gc-6549  [001] .... 214682.226405: block_bio_queue: 8,32 R 67494152 + 8 [gc]
                    gc-6549  [001] .... 214682.226412: block_getrq: 8,32 R 67494152 + 8 [gc]
                    gc-6549  [001] .... 214682.226414: block_plug: [gc]
                    gc-6549  [001] d..1 214682.226417: block_rq_insert: 8,32 R 4096 () 67494152 + 8 [gc]
                    gc-6549  [001] d..1 214682.226420: block_unplug: [gc] 1
                    gc-6549  [001] .... 214682.226904: block_bio_queue: 8,32 R 67494160 + 8 [gc]
                    gc-6549  [001] .... 214682.226910: block_getrq: 8,32 R 67494160 + 8 [gc]
                    gc-6549  [001] .... 214682.226911: block_plug: [gc]
                    gc-6549  [001] d..1 214682.226914: block_rq_insert: 8,32 R 4096 () 67494160 + 8 [gc]
                    gc-6549  [001] d..1 214682.226916: block_unplug: [gc] 1
      
      After:
                    gc-5678  [003] .... 214327.025906: block_bio_queue: 8,32 R 67493824 + 8 [gc]
                    gc-5678  [003] .... 214327.025908: block_bio_backmerge: 8,32 R 67493824 + 8 [gc]
                    gc-5678  [003] .... 214327.025915: block_bio_queue: 8,32 R 67493832 + 8 [gc]
                    gc-5678  [003] .... 214327.025917: block_bio_backmerge: 8,32 R 67493832 + 8 [gc]
                    gc-5678  [003] .... 214327.025923: block_bio_queue: 8,32 R 67493840 + 8 [gc]
                    gc-5678  [003] .... 214327.025925: block_bio_backmerge: 8,32 R 67493840 + 8 [gc]
                    gc-5678  [003] .... 214327.025932: block_bio_queue: 8,32 R 67493848 + 8 [gc]
                    gc-5678  [003] .... 214327.025934: block_bio_backmerge: 8,32 R 67493848 + 8 [gc]
                    gc-5678  [003] .... 214327.025941: block_bio_queue: 8,32 R 67493856 + 8 [gc]
                    gc-5678  [003] .... 214327.025943: block_bio_backmerge: 8,32 R 67493856 + 8 [gc]
                    gc-5678  [003] .... 214327.025953: block_bio_queue: 8,32 R 67493864 + 8 [gc]
                    gc-5678  [003] .... 214327.025955: block_bio_backmerge: 8,32 R 67493864 + 8 [gc]
                    gc-5678  [003] .... 214327.025962: block_bio_queue: 8,32 R 67493872 + 8 [gc]
                    gc-5678  [003] .... 214327.025964: block_bio_backmerge: 8,32 R 67493872 + 8 [gc]
                    gc-5678  [003] .... 214327.025970: block_bio_queue: 8,32 R 67493880 + 8 [gc]
                    gc-5678  [003] .... 214327.025972: block_bio_backmerge: 8,32 R 67493880 + 8 [gc]
                    gc-5678  [003] .... 214327.026000: block_bio_queue: 8,32 WS 34123776 + 2048 [gc]
                    gc-5678  [003] .... 214327.026019: block_getrq: 8,32 WS 34123776 + 2048 [gc]
                    gc-5678  [003] d..1 214327.026021: block_rq_insert: 8,32 R 131072 () 67493632 + 256 [gc]
                    gc-5678  [003] d..1 214327.026023: block_unplug: [gc] 1
                    gc-5678  [003] d..1 214327.026026: block_rq_issue: 8,32 R 131072 () 67493632 + 256 [gc]
                    gc-5678  [003] .... 214327.026046: block_plug: [gc]
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      6aa58d8a
    • J
      f2fs: avoid fi->i_gc_rwsem[WRITE] lock in f2fs_gc · 6f8d4455
      Jaegeuk Kim 提交于
      The f2fs_gc() called by f2fs_balance_fs() requires to be called outside of
      fi->i_gc_rwsem[WRITE], since f2fs_gc() can try to grab it in a loop.
      
      If it hits the miximum retrials in GC, let's give a chance to release
      gc_mutex for a short time in order not to go into live lock in the worst
      case.
      Reviewed-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      6f8d4455
    • J
      f2fs: fix performance issue observed with multi-thread sequential read · 853137ce
      Jaegeuk Kim 提交于
      This reverts the commit - "b93f7712 - f2fs: remove writepages lock"
      to fix the drop in sequential read throughput.
      
      Test: ./tiotest -t 32 -d /data/tio_tmp -f 32 -b 524288 -k 1 -k 3 -L
      device: UFS
      
      Before -
      read throughput: 185 MB/s
      total read requests: 85177 (of these ~80000 are 4KB size requests).
      total write requests: 2546 (of these ~2208 requests are written in 512KB).
      
      After -
      read throughput: 758 MB/s
      total read requests: 2417 (of these ~2042 are 512KB reads).
      total write requests: 2701 (of these ~2034 requests are written in 512KB).
      Signed-off-by: NSahitya Tummala <stummala@codeaurora.org>
      Reviewed-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      853137ce
  5. 18 8月, 2018 1 次提交
  6. 15 8月, 2018 2 次提交
    • C
      f2fs: fix to skip verifying block address for non-regular inode · dda9f4b9
      Chao Yu 提交于
      generic/184 1s ... [failed, exit status 1]- output mismatch
          --- tests/generic/184.out	2015-01-11 16:52:27.643681072 +0800
           QA output created by 184 - silence is golden
          +rm: cannot remove '/mnt/f2fs/null': Bad address
          +mknod: '/mnt/f2fs/null': Bad address
          +chmod: cannot access '/mnt/f2fs/null': Bad address
          +./tests/generic/184: line 36: /mnt/f2fs/null: Bad address
          ...
      
      F2FS-fs (zram0): access invalid blkaddr:259
      EIP: f2fs_is_valid_blkaddr+0x14b/0x1b0 [f2fs]
       f2fs_iget+0x927/0x1010 [f2fs]
       f2fs_lookup+0x26e/0x630 [f2fs]
       __lookup_slow+0xb3/0x140
       lookup_slow+0x31/0x50
       walk_component+0x185/0x1f0
       path_lookupat+0x51/0x190
       filename_lookup+0x7f/0x140
       user_path_at_empty+0x36/0x40
       vfs_statx+0x61/0xc0
       __do_sys_stat64+0x29/0x40
       sys_stat64+0x13/0x20
       do_fast_syscall_32+0xaa/0x22c
       entry_SYSENTER_32+0x53/0x86
      
      In f2fs_iget(), we will check inode's first block address, if it is valid,
      we will set FI_FIRST_BLOCK_WRITTEN flag in inode.
      
      But we should only do this for regular inode, otherwise, like special
      inode, i_addr[0] is used for storing device info instead of block address,
      it will fail checking flow obviously.
      
      So for non-regular inode, let's skip verifying address and setting flag.
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      dda9f4b9
    • A
      f2fs: rework fault injection handling to avoid a warning · 7fa750a1
      Arnd Bergmann 提交于
      When CONFIG_F2FS_FAULT_INJECTION is disabled, we get a warning about an
      unused label:
      
      fs/f2fs/segment.c: In function '__submit_discard_cmd':
      fs/f2fs/segment.c:1059:1: error: label 'submit' defined but not used [-Werror=unused-label]
      
      This could be fixed by adding another #ifdef around it, but the more
      reliable way of doing this seems to be to remove the other #ifdefs
      where that is easily possible.
      
      By defining time_to_inject() as a trivial stub, most of the checks for
      CONFIG_F2FS_FAULT_INJECTION can go away. This also leads to nicer
      formatting of the code.
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Reviewed-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      7fa750a1
  7. 14 8月, 2018 12 次提交
    • C
      f2fs: support fault_type mount option · d494500a
      Chao Yu 提交于
      Previously, once fault injection is on, by default, all kind of faults
      will be injected to f2fs, if we want to trigger single or specified
      combined type during the test, we need to configure sysfs entry, it will
      be a little inconvenient to integrate sysfs configuring into testsuit,
      such as xfstest.
      
      So this patch introduces a new mount option 'fault_type' to assist old
      option 'fault_injection', with these two mount options, we can specify
      any fault rate/type at mount-time.
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      d494500a
    • C
      f2fs: fix to return success when trimming meta area · 3f16ecd9
      Chao Yu 提交于
      generic/251
          --- tests/generic/251.out	2016-05-03 20:20:11.381899000 +0800
           QA output created by 251
           Running the test: done.
          +fstrim: /mnt/scratch_f2fs: FITRIM ioctl failed: Invalid argument
          +fstrim: /mnt/scratch_f2fs: FITRIM ioctl failed: Invalid argument
          +fstrim: /mnt/scratch_f2fs: FITRIM ioctl failed: Invalid argument
          +fstrim: /mnt/scratch_f2fs: FITRIM ioctl failed: Invalid argument
          +fstrim: /mnt/scratch_f2fs: FITRIM ioctl failed: Invalid argument
          ...
      Ran: generic/251
      Failures: generic/251
      
      The reason is coverage of fstrim locates in meta area, previously we
      just return -EINVAL for such case, making generic/251 failed, to fix
      this problem, let's relieve restriction to return success with no
      block discarded.
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      3f16ecd9
    • C
      f2fs: fix use-after-free of dicard command entry · 6b9cb124
      Chao Yu 提交于
      As Dan Carpenter reported:
      
      The patch 20ee4382: "f2fs: issue small discard by LBA order" from
      Jul 8, 2018, leads to the following Smatch warning:
      
      	fs/f2fs/segment.c:1277 __issue_discard_cmd_orderly()
      	warn: 'dc' was already freed.
      
      See also:
      fs/f2fs/segment.c:2550 __issue_discard_cmd_range() warn: 'dc' was already freed.
      
      In order to fix this issue, let's get error from __submit_discard_cmd(),
      and release current discard command after we referenced next one.
      Reported-by: NDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      6b9cb124
    • C
      f2fs: support discard submission error injection · b83dcfe6
      Chao Yu 提交于
      This patch adds to support discard submission error injection for testing
      error handling of __submit_discard_cmd().
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      b83dcfe6
    • C
      f2fs: split discard command in prior to block layer · 35ec7d57
      Chao Yu 提交于
      Some devices has small max_{hw,}discard_sectors, so that in
      __blkdev_issue_discard(), one big size discard bio can be split
      into multiple small size discard bios, result in heavy load in IO
      scheduler and device, which can hang other sync IO for long time.
      
      Now, f2fs is trying to control discard commands more elaboratively,
      in order to make less conflict in between discard IO and user IO
      to enhance application's performance, so in this patch, we will
      split discard bio in f2fs in prior to in block layer to reduce
      issuing multiple discard bios in a short time.
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      35ec7d57
    • S
      f2fs: wake up gc thread immediately when gc_urgent is set · a690efff
      Sheng Yong 提交于
      Fixes: 5b0e9539 ("f2fs: introduce sbi->gc_mode to determine the policy")
      Signed-off-by: NSheng Yong <shengyong1@huawei.com>
      Reviewed-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      a690efff
    • C
      f2fs: fix incorrect range->len in f2fs_trim_fs() · 6eae2694
      Chao Yu 提交于
      generic/260 reported below error:
      
           [+] Default length with start set (should succeed)
           [+] Length beyond the end of fs (should succeed)
           [+] Length beyond the end of fs with start set (should succeed)
          +./tests/generic/260: line 94: [: 18446744073709551615: integer expression expected
          +./tests/generic/260: line 104: [: 18446744073709551615: integer expression expected
           Test done
          ...
      
      In f2fs_trim_fs(), if there is no discard being trimmed, we need to correct
      range->len before return.
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      6eae2694
    • C
      f2fs: refresh recent accessed nat entry in lru list · 22969158
      Chao Yu 提交于
      Introduce nat_list_lock to protect nm_i->nat_entries list, and manage
      it as a LRU list, refresh location for therein recent accessed entries
      in the list.
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      22969158
    • C
      f2fs: fix avoid race between truncate and background GC · a33c1502
      Chao Yu 提交于
      Thread A				Background GC
      - f2fs_setattr isize to 0
       - truncate_setsize
      					- gc_data_segment
      					 - f2fs_get_read_data_page page #0
      					  - set_page_dirty
      					  - set_cold_data
       - f2fs_truncate
      
      - f2fs_setattr isize to 4k
      - read 4k <--- hit data in cached page #0
      
      Above race condition can cause read out invalid data in a truncated
      page, fix it by i_gc_rwsem[WRITE] lock.
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      a33c1502
    • C
      f2fs: avoid race between zero_range and background GC · c7079853
      Chao Yu 提交于
      Thread A				Background GC
      - f2fs_zero_range
       - truncate_pagecache_range
      					- gc_data_segment
      					 - get_read_data_page
      					  - move_data_page
      					   - set_page_dirty
      					   - set_cold_data
       - f2fs_do_zero_range
        - dn->data_blkaddr = NEW_ADDR;
        - f2fs_set_data_blkaddr
      
      Actually, we don't need to set dirty & checked flag on the page, since
      all valid data in the page should be zeroed by zero_range().
      
      Use i_gc_rwsem[WRITE] to avoid such race condition.
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      c7079853
    • C
      f2fs: fix to do sanity check with block address in main area v2 · 91291e99
      Chao Yu 提交于
      This patch adds f2fs_is_valid_blkaddr() in below functions to do sanity
      check with block address to avoid pentential panic:
      - f2fs_grab_read_bio()
      - __written_first_block()
      
      https://bugzilla.kernel.org/show_bug.cgi?id=200465
      
      - Reproduce
      
      - POC (poc.c)
          #define _GNU_SOURCE
          #include <sys/types.h>
          #include <sys/mount.h>
          #include <sys/mman.h>
          #include <sys/stat.h>
          #include <sys/xattr.h>
      
          #include <dirent.h>
          #include <errno.h>
          #include <error.h>
          #include <fcntl.h>
          #include <stdio.h>
          #include <stdlib.h>
          #include <string.h>
          #include <unistd.h>
      
          #include <linux/falloc.h>
          #include <linux/loop.h>
      
          static void activity(char *mpoint) {
      
            char *xattr;
            int err;
      
            err = asprintf(&xattr, "%s/foo/bar/xattr", mpoint);
      
            char buf2[113];
            memset(buf2, 0, sizeof(buf2));
            listxattr(xattr, buf2, sizeof(buf2));
      
          }
      
          int main(int argc, char *argv[]) {
            activity(argv[1]);
            return 0;
          }
      
      - kernel message
      [  844.718738] F2FS-fs (loop0): Mounted with checkpoint version = 2
      [  846.430929] F2FS-fs (loop0): access invalid blkaddr:1024
      [  846.431058] WARNING: CPU: 1 PID: 1249 at fs/f2fs/checkpoint.c:154 f2fs_is_valid_blkaddr+0x10f/0x160
      [  846.431059] Modules linked in: snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_timer snd input_leds joydev soundcore serio_raw i2c_piix4 mac_hid ib_iser rdma_cm iw_cm ib_cm ib_core configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 raid10 raid456 libcrc32c async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq raid1 raid0 multipath linear qxl ttm crct10dif_pclmul crc32_pclmul drm_kms_helper ghash_clmulni_intel syscopyarea sysfillrect sysimgblt fb_sys_fops pcbc drm 8139too aesni_intel 8139cp floppy psmouse mii aes_x86_64 crypto_simd pata_acpi cryptd glue_helper
      [  846.431310] CPU: 1 PID: 1249 Comm: a.out Not tainted 4.18.0-rc3+ #1
      [  846.431312] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
      [  846.431315] RIP: 0010:f2fs_is_valid_blkaddr+0x10f/0x160
      [  846.431316] Code: 00 eb ed 31 c0 83 fa 05 75 ae 48 83 ec 08 48 8b 3f 89 f1 48 c7 c2 fc 0b 0f 8b 48 c7 c6 8b d7 09 8b 88 44 24 07 e8 61 8b ff ff <0f> 0b 0f b6 44 24 07 48 83 c4 08 eb 81 4c 8b 47 10 8b 8f 38 04 00
      [  846.431347] RSP: 0018:ffff961c414a7bc0 EFLAGS: 00010282
      [  846.431349] RAX: 0000000000000000 RBX: ffffc5f787b8ea80 RCX: 0000000000000000
      [  846.431350] RDX: 0000000000000000 RSI: ffff89dfffd165d8 RDI: ffff89dfffd165d8
      [  846.431351] RBP: ffff961c414a7c20 R08: 0000000000000001 R09: 0000000000000248
      [  846.431353] R10: 0000000000000000 R11: 0000000000000248 R12: 0000000000000007
      [  846.431369] R13: ffff89dff5492800 R14: ffff89dfae3aa000 R15: ffff89dff4ff88d0
      [  846.431372] FS:  00007f882e2fb700(0000) GS:ffff89dfffd00000(0000) knlGS:0000000000000000
      [  846.431373] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  846.431374] CR2: 0000000001a88008 CR3: 00000001eb572000 CR4: 00000000000006e0
      [  846.431384] Call Trace:
      [  846.431426]  f2fs_iget+0x6f4/0xe70
      [  846.431430]  ? f2fs_find_entry+0x71/0x90
      [  846.431432]  f2fs_lookup+0x1aa/0x390
      [  846.431452]  __lookup_slow+0x97/0x150
      [  846.431459]  lookup_slow+0x35/0x50
      [  846.431462]  walk_component+0x1c6/0x470
      [  846.431479]  ? memcg_kmem_charge_memcg+0x70/0x90
      [  846.431488]  ? page_add_file_rmap+0x13/0x200
      [  846.431491]  path_lookupat+0x76/0x230
      [  846.431501]  ? __alloc_pages_nodemask+0xfc/0x280
      [  846.431504]  filename_lookup+0xb8/0x1a0
      [  846.431534]  ? _cond_resched+0x16/0x40
      [  846.431541]  ? kmem_cache_alloc+0x160/0x1d0
      [  846.431549]  ? path_listxattr+0x41/0xa0
      [  846.431551]  path_listxattr+0x41/0xa0
      [  846.431570]  do_syscall_64+0x55/0x100
      [  846.431583]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      [  846.431607] RIP: 0033:0x7f882de1c0d7
      [  846.431607] Code: f0 ff ff 73 01 c3 48 8b 0d be dd 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 b8 c2 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 91 dd 2b 00 f7 d8 64 89 01 48
      [  846.431639] RSP: 002b:00007ffe8e66c238 EFLAGS: 00000202 ORIG_RAX: 00000000000000c2
      [  846.431641] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f882de1c0d7
      [  846.431642] RDX: 0000000000000071 RSI: 00007ffe8e66c280 RDI: 0000000001a880c0
      [  846.431643] RBP: 00007ffe8e66c300 R08: 0000000001a88010 R09: 0000000000000000
      [  846.431645] R10: 00000000000001ab R11: 0000000000000202 R12: 0000000000400550
      [  846.431646] R13: 00007ffe8e66c400 R14: 0000000000000000 R15: 0000000000000000
      [  846.431648] ---[ end trace abca54df39d14f5c ]---
      [  846.431651] F2FS-fs (loop0): invalid blkaddr: 1024, type: 5, run fsck to fix.
      [  846.431762] WARNING: CPU: 1 PID: 1249 at fs/f2fs/f2fs.h:2697 f2fs_iget+0xd17/0xe70
      [  846.431763] Modules linked in: snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_timer snd input_leds joydev soundcore serio_raw i2c_piix4 mac_hid ib_iser rdma_cm iw_cm ib_cm ib_core configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 raid10 raid456 libcrc32c async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq raid1 raid0 multipath linear qxl ttm crct10dif_pclmul crc32_pclmul drm_kms_helper ghash_clmulni_intel syscopyarea sysfillrect sysimgblt fb_sys_fops pcbc drm 8139too aesni_intel 8139cp floppy psmouse mii aes_x86_64 crypto_simd pata_acpi cryptd glue_helper
      [  846.431797] CPU: 1 PID: 1249 Comm: a.out Tainted: G        W         4.18.0-rc3+ #1
      [  846.431798] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
      [  846.431800] RIP: 0010:f2fs_iget+0xd17/0xe70
      [  846.431801] Code: ff ff 48 63 d8 e9 e1 f6 ff ff 48 8b 45 c8 41 b8 05 00 00 00 48 c7 c2 d8 e8 0e 8b 48 c7 c6 1d b0 0a 8b 48 8b 38 e8 f9 b4 00 00 <0f> 0b 48 8b 45 c8 f0 80 48 48 04 e9 d8 f9 ff ff 0f 0b 48 8b 43 18
      [  846.431832] RSP: 0018:ffff961c414a7bd0 EFLAGS: 00010282
      [  846.431834] RAX: 0000000000000000 RBX: ffffc5f787b8ea80 RCX: 0000000000000006
      [  846.431835] RDX: 0000000000000000 RSI: 0000000000000096 RDI: ffff89dfffd165d0
      [  846.431836] RBP: ffff961c414a7c20 R08: 0000000000000000 R09: 0000000000000273
      [  846.431837] R10: 0000000000000000 R11: ffff89dfad50ca60 R12: 0000000000000007
      [  846.431838] R13: ffff89dff5492800 R14: ffff89dfae3aa000 R15: ffff89dff4ff88d0
      [  846.431840] FS:  00007f882e2fb700(0000) GS:ffff89dfffd00000(0000) knlGS:0000000000000000
      [  846.431841] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  846.431842] CR2: 0000000001a88008 CR3: 00000001eb572000 CR4: 00000000000006e0
      [  846.431846] Call Trace:
      [  846.431850]  ? f2fs_find_entry+0x71/0x90
      [  846.431853]  f2fs_lookup+0x1aa/0x390
      [  846.431856]  __lookup_slow+0x97/0x150
      [  846.431858]  lookup_slow+0x35/0x50
      [  846.431874]  walk_component+0x1c6/0x470
      [  846.431878]  ? memcg_kmem_charge_memcg+0x70/0x90
      [  846.431880]  ? page_add_file_rmap+0x13/0x200
      [  846.431882]  path_lookupat+0x76/0x230
      [  846.431884]  ? __alloc_pages_nodemask+0xfc/0x280
      [  846.431886]  filename_lookup+0xb8/0x1a0
      [  846.431890]  ? _cond_resched+0x16/0x40
      [  846.431891]  ? kmem_cache_alloc+0x160/0x1d0
      [  846.431894]  ? path_listxattr+0x41/0xa0
      [  846.431896]  path_listxattr+0x41/0xa0
      [  846.431898]  do_syscall_64+0x55/0x100
      [  846.431901]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      [  846.431902] RIP: 0033:0x7f882de1c0d7
      [  846.431903] Code: f0 ff ff 73 01 c3 48 8b 0d be dd 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 b8 c2 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 91 dd 2b 00 f7 d8 64 89 01 48
      [  846.431934] RSP: 002b:00007ffe8e66c238 EFLAGS: 00000202 ORIG_RAX: 00000000000000c2
      [  846.431936] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f882de1c0d7
      [  846.431937] RDX: 0000000000000071 RSI: 00007ffe8e66c280 RDI: 0000000001a880c0
      [  846.431939] RBP: 00007ffe8e66c300 R08: 0000000001a88010 R09: 0000000000000000
      [  846.431940] R10: 00000000000001ab R11: 0000000000000202 R12: 0000000000400550
      [  846.431941] R13: 00007ffe8e66c400 R14: 0000000000000000 R15: 0000000000000000
      [  846.431943] ---[ end trace abca54df39d14f5d ]---
      [  846.432033] F2FS-fs (loop0): access invalid blkaddr:1024
      [  846.432051] WARNING: CPU: 1 PID: 1249 at fs/f2fs/checkpoint.c:154 f2fs_is_valid_blkaddr+0x10f/0x160
      [  846.432051] Modules linked in: snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_timer snd input_leds joydev soundcore serio_raw i2c_piix4 mac_hid ib_iser rdma_cm iw_cm ib_cm ib_core configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 raid10 raid456 libcrc32c async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq raid1 raid0 multipath linear qxl ttm crct10dif_pclmul crc32_pclmul drm_kms_helper ghash_clmulni_intel syscopyarea sysfillrect sysimgblt fb_sys_fops pcbc drm 8139too aesni_intel 8139cp floppy psmouse mii aes_x86_64 crypto_simd pata_acpi cryptd glue_helper
      [  846.432085] CPU: 1 PID: 1249 Comm: a.out Tainted: G        W         4.18.0-rc3+ #1
      [  846.432086] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
      [  846.432089] RIP: 0010:f2fs_is_valid_blkaddr+0x10f/0x160
      [  846.432089] Code: 00 eb ed 31 c0 83 fa 05 75 ae 48 83 ec 08 48 8b 3f 89 f1 48 c7 c2 fc 0b 0f 8b 48 c7 c6 8b d7 09 8b 88 44 24 07 e8 61 8b ff ff <0f> 0b 0f b6 44 24 07 48 83 c4 08 eb 81 4c 8b 47 10 8b 8f 38 04 00
      [  846.432120] RSP: 0018:ffff961c414a7900 EFLAGS: 00010286
      [  846.432122] RAX: 0000000000000000 RBX: 0000000000000400 RCX: 0000000000000006
      [  846.432123] RDX: 0000000000000000 RSI: 0000000000000096 RDI: ffff89dfffd165d0
      [  846.432124] RBP: ffff89dff5492800 R08: 0000000000000001 R09: 000000000000029d
      [  846.432125] R10: ffff961c414a7820 R11: 000000000000029d R12: 0000000000000400
      [  846.432126] R13: 0000000000000000 R14: ffff89dff4ff88d0 R15: 0000000000000000
      [  846.432128] FS:  00007f882e2fb700(0000) GS:ffff89dfffd00000(0000) knlGS:0000000000000000
      [  846.432130] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  846.432131] CR2: 0000000001a88008 CR3: 00000001eb572000 CR4: 00000000000006e0
      [  846.432135] Call Trace:
      [  846.432151]  f2fs_wait_on_block_writeback+0x20/0x110
      [  846.432158]  f2fs_grab_read_bio+0xbc/0xe0
      [  846.432161]  f2fs_submit_page_read+0x21/0x280
      [  846.432163]  f2fs_get_read_data_page+0xb7/0x3c0
      [  846.432165]  f2fs_get_lock_data_page+0x29/0x1e0
      [  846.432167]  f2fs_get_new_data_page+0x148/0x550
      [  846.432170]  f2fs_add_regular_entry+0x1d2/0x550
      [  846.432178]  ? __switch_to+0x12f/0x460
      [  846.432181]  f2fs_add_dentry+0x6a/0xd0
      [  846.432184]  f2fs_do_add_link+0xe9/0x140
      [  846.432186]  __recover_dot_dentries+0x260/0x280
      [  846.432189]  f2fs_lookup+0x343/0x390
      [  846.432193]  __lookup_slow+0x97/0x150
      [  846.432195]  lookup_slow+0x35/0x50
      [  846.432208]  walk_component+0x1c6/0x470
      [  846.432212]  ? memcg_kmem_charge_memcg+0x70/0x90
      [  846.432215]  ? page_add_file_rmap+0x13/0x200
      [  846.432217]  path_lookupat+0x76/0x230
      [  846.432219]  ? __alloc_pages_nodemask+0xfc/0x280
      [  846.432221]  filename_lookup+0xb8/0x1a0
      [  846.432224]  ? _cond_resched+0x16/0x40
      [  846.432226]  ? kmem_cache_alloc+0x160/0x1d0
      [  846.432228]  ? path_listxattr+0x41/0xa0
      [  846.432230]  path_listxattr+0x41/0xa0
      [  846.432233]  do_syscall_64+0x55/0x100
      [  846.432235]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      [  846.432237] RIP: 0033:0x7f882de1c0d7
      [  846.432237] Code: f0 ff ff 73 01 c3 48 8b 0d be dd 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 b8 c2 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 91 dd 2b 00 f7 d8 64 89 01 48
      [  846.432269] RSP: 002b:00007ffe8e66c238 EFLAGS: 00000202 ORIG_RAX: 00000000000000c2
      [  846.432271] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f882de1c0d7
      [  846.432272] RDX: 0000000000000071 RSI: 00007ffe8e66c280 RDI: 0000000001a880c0
      [  846.432273] RBP: 00007ffe8e66c300 R08: 0000000001a88010 R09: 0000000000000000
      [  846.432274] R10: 00000000000001ab R11: 0000000000000202 R12: 0000000000400550
      [  846.432275] R13: 00007ffe8e66c400 R14: 0000000000000000 R15: 0000000000000000
      [  846.432277] ---[ end trace abca54df39d14f5e ]---
      [  846.432279] F2FS-fs (loop0): invalid blkaddr: 1024, type: 5, run fsck to fix.
      [  846.432376] WARNING: CPU: 1 PID: 1249 at fs/f2fs/f2fs.h:2697 f2fs_wait_on_block_writeback+0xb1/0x110
      [  846.432376] Modules linked in: snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_timer snd input_leds joydev soundcore serio_raw i2c_piix4 mac_hid ib_iser rdma_cm iw_cm ib_cm ib_core configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 raid10 raid456 libcrc32c async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq raid1 raid0 multipath linear qxl ttm crct10dif_pclmul crc32_pclmul drm_kms_helper ghash_clmulni_intel syscopyarea sysfillrect sysimgblt fb_sys_fops pcbc drm 8139too aesni_intel 8139cp floppy psmouse mii aes_x86_64 crypto_simd pata_acpi cryptd glue_helper
      [  846.432410] CPU: 1 PID: 1249 Comm: a.out Tainted: G        W         4.18.0-rc3+ #1
      [  846.432411] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
      [  846.432413] RIP: 0010:f2fs_wait_on_block_writeback+0xb1/0x110
      [  846.432414] Code: 66 90 f0 ff 4b 34 74 59 5b 5d c3 48 8b 7d 00 41 b8 05 00 00 00 89 d9 48 c7 c2 d8 e8 0e 8b 48 c7 c6 1d b0 0a 8b e8 df bc fd ff <0f> 0b f0 80 4d 48 04 e9 67 ff ff ff 48 8b 03 48 c1 e8 37 83 e0 07
      [  846.432445] RSP: 0018:ffff961c414a7910 EFLAGS: 00010286
      [  846.432447] RAX: 0000000000000000 RBX: 0000000000000400 RCX: 0000000000000006
      [  846.432448] RDX: 0000000000000000 RSI: 0000000000000092 RDI: ffff89dfffd165d0
      [  846.432449] RBP: ffff89dff5492800 R08: 0000000000000000 R09: 00000000000002d1
      [  846.432450] R10: ffff961c414a7820 R11: ffff89dfad50cf80 R12: 0000000000000400
      [  846.432451] R13: 0000000000000000 R14: ffff89dff4ff88d0 R15: 0000000000000000
      [  846.432453] FS:  00007f882e2fb700(0000) GS:ffff89dfffd00000(0000) knlGS:0000000000000000
      [  846.432454] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  846.432455] CR2: 0000000001a88008 CR3: 00000001eb572000 CR4: 00000000000006e0
      [  846.432459] Call Trace:
      [  846.432463]  f2fs_grab_read_bio+0xbc/0xe0
      [  846.432464]  f2fs_submit_page_read+0x21/0x280
      [  846.432466]  f2fs_get_read_data_page+0xb7/0x3c0
      [  846.432468]  f2fs_get_lock_data_page+0x29/0x1e0
      [  846.432470]  f2fs_get_new_data_page+0x148/0x550
      [  846.432473]  f2fs_add_regular_entry+0x1d2/0x550
      [  846.432475]  ? __switch_to+0x12f/0x460
      [  846.432477]  f2fs_add_dentry+0x6a/0xd0
      [  846.432480]  f2fs_do_add_link+0xe9/0x140
      [  846.432483]  __recover_dot_dentries+0x260/0x280
      [  846.432485]  f2fs_lookup+0x343/0x390
      [  846.432488]  __lookup_slow+0x97/0x150
      [  846.432490]  lookup_slow+0x35/0x50
      [  846.432505]  walk_component+0x1c6/0x470
      [  846.432509]  ? memcg_kmem_charge_memcg+0x70/0x90
      [  846.432511]  ? page_add_file_rmap+0x13/0x200
      [  846.432513]  path_lookupat+0x76/0x230
      [  846.432515]  ? __alloc_pages_nodemask+0xfc/0x280
      [  846.432517]  filename_lookup+0xb8/0x1a0
      [  846.432520]  ? _cond_resched+0x16/0x40
      [  846.432522]  ? kmem_cache_alloc+0x160/0x1d0
      [  846.432525]  ? path_listxattr+0x41/0xa0
      [  846.432526]  path_listxattr+0x41/0xa0
      [  846.432529]  do_syscall_64+0x55/0x100
      [  846.432531]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      [  846.432533] RIP: 0033:0x7f882de1c0d7
      [  846.432533] Code: f0 ff ff 73 01 c3 48 8b 0d be dd 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 b8 c2 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 91 dd 2b 00 f7 d8 64 89 01 48
      [  846.432565] RSP: 002b:00007ffe8e66c238 EFLAGS: 00000202 ORIG_RAX: 00000000000000c2
      [  846.432567] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f882de1c0d7
      [  846.432568] RDX: 0000000000000071 RSI: 00007ffe8e66c280 RDI: 0000000001a880c0
      [  846.432569] RBP: 00007ffe8e66c300 R08: 0000000001a88010 R09: 0000000000000000
      [  846.432570] R10: 00000000000001ab R11: 0000000000000202 R12: 0000000000400550
      [  846.432571] R13: 00007ffe8e66c400 R14: 0000000000000000 R15: 0000000000000000
      [  846.432573] ---[ end trace abca54df39d14f5f ]---
      [  846.434280] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
      [  846.434424] PGD 80000001ebd3a067 P4D 80000001ebd3a067 PUD 1eb1ae067 PMD 0
      [  846.434551] Oops: 0000 [#1] SMP PTI
      [  846.434697] CPU: 0 PID: 44 Comm: kworker/u5:0 Tainted: G        W         4.18.0-rc3+ #1
      [  846.434805] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
      [  846.435000] Workqueue: fscrypt_read_queue decrypt_work
      [  846.435174] RIP: 0010:fscrypt_do_page_crypto+0x6e/0x2d0
      [  846.435351] Code: 00 65 48 8b 04 25 28 00 00 00 48 89 84 24 88 00 00 00 31 c0 e8 43 c2 e0 ff 49 8b 86 48 02 00 00 85 ed c7 44 24 70 00 00 00 00 <48> 8b 58 08 0f 84 14 02 00 00 48 8b 78 10 48 8b 0c 24 48 c7 84 24
      [  846.435696] RSP: 0018:ffff961c40f9bd60 EFLAGS: 00010206
      [  846.435870] RAX: 0000000000000000 RBX: ffffc5f787719b80 RCX: ffffc5f787719b80
      [  846.436051] RDX: ffffffff8b9f4b88 RSI: ffffffff8b0ae622 RDI: ffff961c40f9bdb8
      [  846.436261] RBP: 0000000000001000 R08: ffffc5f787719b80 R09: 0000000000001000
      [  846.436433] R10: 0000000000000018 R11: fefefefefefefeff R12: ffffc5f787719b80
      [  846.436562] R13: ffffc5f787719b80 R14: ffff89dff4ff88d0 R15: 0ffff89dfaddee60
      [  846.436658] FS:  0000000000000000(0000) GS:ffff89dfffc00000(0000) knlGS:0000000000000000
      [  846.436758] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  846.436898] CR2: 0000000000000008 CR3: 00000001eddd0000 CR4: 00000000000006f0
      [  846.437001] Call Trace:
      [  846.437181]  ? check_preempt_wakeup+0xf2/0x230
      [  846.437276]  ? check_preempt_curr+0x7c/0x90
      [  846.437370]  fscrypt_decrypt_page+0x48/0x4d
      [  846.437466]  __fscrypt_decrypt_bio+0x5b/0x90
      [  846.437542]  decrypt_work+0x12/0x20
      [  846.437651]  process_one_work+0x15e/0x3d0
      [  846.437740]  worker_thread+0x4c/0x440
      [  846.437848]  kthread+0xf8/0x130
      [  846.437938]  ? rescuer_thread+0x350/0x350
      [  846.438022]  ? kthread_associate_blkcg+0x90/0x90
      [  846.438117]  ret_from_fork+0x35/0x40
      [  846.438201] Modules linked in: snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_timer snd input_leds joydev soundcore serio_raw i2c_piix4 mac_hid ib_iser rdma_cm iw_cm ib_cm ib_core configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 raid10 raid456 libcrc32c async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq raid1 raid0 multipath linear qxl ttm crct10dif_pclmul crc32_pclmul drm_kms_helper ghash_clmulni_intel syscopyarea sysfillrect sysimgblt fb_sys_fops pcbc drm 8139too aesni_intel 8139cp floppy psmouse mii aes_x86_64 crypto_simd pata_acpi cryptd glue_helper
      [  846.438653] CR2: 0000000000000008
      [  846.438713] ---[ end trace abca54df39d14f60 ]---
      [  846.438796] RIP: 0010:fscrypt_do_page_crypto+0x6e/0x2d0
      [  846.438844] Code: 00 65 48 8b 04 25 28 00 00 00 48 89 84 24 88 00 00 00 31 c0 e8 43 c2 e0 ff 49 8b 86 48 02 00 00 85 ed c7 44 24 70 00 00 00 00 <48> 8b 58 08 0f 84 14 02 00 00 48 8b 78 10 48 8b 0c 24 48 c7 84 24
      [  846.439084] RSP: 0018:ffff961c40f9bd60 EFLAGS: 00010206
      [  846.439176] RAX: 0000000000000000 RBX: ffffc5f787719b80 RCX: ffffc5f787719b80
      [  846.440927] RDX: ffffffff8b9f4b88 RSI: ffffffff8b0ae622 RDI: ffff961c40f9bdb8
      [  846.442083] RBP: 0000000000001000 R08: ffffc5f787719b80 R09: 0000000000001000
      [  846.443284] R10: 0000000000000018 R11: fefefefefefefeff R12: ffffc5f787719b80
      [  846.444448] R13: ffffc5f787719b80 R14: ffff89dff4ff88d0 R15: 0ffff89dfaddee60
      [  846.445558] FS:  0000000000000000(0000) GS:ffff89dfffc00000(0000) knlGS:0000000000000000
      [  846.446687] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  846.447796] CR2: 0000000000000008 CR3: 00000001eddd0000 CR4: 00000000000006f0
      
      - Location
      https://elixir.bootlin.com/linux/v4.18-rc4/source/fs/crypto/crypto.c#L149
      	struct crypto_skcipher *tfm = ci->ci_ctfm;
      Here ci can be NULL
      
      Note that this issue maybe require CONFIG_F2FS_FS_ENCRYPTION=y to reproduce.
      
      Reported-by Wen Xu <wen.xu@gatech.edu>
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      91291e99
    • C
      f2fs: fix to do sanity check with inline flags · bcbfbd60
      Chao Yu 提交于
      https://bugzilla.kernel.org/show_bug.cgi?id=200221
      
      - Overview
      BUG() in clear_inode() when mounting and un-mounting a corrupted f2fs image
      
      - Reproduce
      
      - Kernel message
      [  538.601448] F2FS-fs (loop0): Invalid segment/section count (31, 24 x 1376257)
      [  538.601458] F2FS-fs (loop0): Can't find valid F2FS filesystem in 2th superblock
      [  538.724091] F2FS-fs (loop0): Try to recover 2th superblock, ret: 0
      [  538.724102] F2FS-fs (loop0): Mounted with checkpoint version = 2
      [  540.970834] ------------[ cut here ]------------
      [  540.970838] kernel BUG at fs/inode.c:512!
      [  540.971750] invalid opcode: 0000 [#1] SMP KASAN PTI
      [  540.972755] CPU: 1 PID: 1305 Comm: umount Not tainted 4.18.0-rc1+ #4
      [  540.974034] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
      [  540.982913] RIP: 0010:clear_inode+0xc0/0xd0
      [  540.983774] Code: 8d a3 30 01 00 00 4c 89 e7 e8 1c ec f8 ff 48 8b 83 30 01 00 00 49 39 c4 75 1a 48 c7 83 a0 00 00 00 60 00 00 00 5b 41 5c 5d c3 <0f> 0b 0f 0b 0f 0b 0f 0b 0f 0b 0f 0b 0f 1f 40 00 66 66 66 66 90 55
      [  540.987570] RSP: 0018:ffff8801e34a7b70 EFLAGS: 00010002
      [  540.988636] RAX: 0000000000000000 RBX: ffff8801e9b744e8 RCX: ffffffffb840eb3a
      [  540.990063] RDX: dffffc0000000000 RSI: 0000000000000004 RDI: ffff8801e9b746b8
      [  540.991499] RBP: ffff8801e34a7b80 R08: ffffed003d36e8ce R09: ffffed003d36e8ce
      [  540.992923] R10: 0000000000000001 R11: ffffed003d36e8cd R12: ffff8801e9b74668
      [  540.994360] R13: ffff8801e9b74760 R14: ffff8801e9b74528 R15: ffff8801e9b74530
      [  540.995786] FS:  00007f4662bdf840(0000) GS:ffff8801f6f00000(0000) knlGS:0000000000000000
      [  540.997403] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  540.998571] CR2: 000000000175c568 CR3: 00000001dcfe6000 CR4: 00000000000006e0
      [  541.000015] Call Trace:
      [  541.000554]  f2fs_evict_inode+0x253/0x630
      [  541.001381]  evict+0x16f/0x290
      [  541.002015]  iput+0x280/0x300
      [  541.002654]  dentry_unlink_inode+0x165/0x1e0
      [  541.003528]  __dentry_kill+0x16a/0x260
      [  541.004300]  dentry_kill+0x70/0x250
      [  541.005018]  dput+0x154/0x1d0
      [  541.005635]  do_one_tree+0x34/0x40
      [  541.006354]  shrink_dcache_for_umount+0x3f/0xa0
      [  541.007285]  generic_shutdown_super+0x43/0x1c0
      [  541.008192]  kill_block_super+0x52/0x80
      [  541.008978]  kill_f2fs_super+0x62/0x70
      [  541.009750]  deactivate_locked_super+0x6f/0xa0
      [  541.010664]  deactivate_super+0x5e/0x80
      [  541.011450]  cleanup_mnt+0x61/0xa0
      [  541.012151]  __cleanup_mnt+0x12/0x20
      [  541.012893]  task_work_run+0xc8/0xf0
      [  541.013635]  exit_to_usermode_loop+0x125/0x130
      [  541.014555]  do_syscall_64+0x138/0x170
      [  541.015340]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      [  541.016375] RIP: 0033:0x7f46624bf487
      [  541.017104] Code: 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 31 f6 e9 09 00 00 00 66 0f 1f 84 00 00 00 00 00 b8 a6 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d e1 c9 2b 00 f7 d8 64 89 01 48
      [  541.020923] RSP: 002b:00007fff5e12e9a8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
      [  541.022452] RAX: 0000000000000000 RBX: 0000000001753030 RCX: 00007f46624bf487
      [  541.023885] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 000000000175a1e0
      [  541.025318] RBP: 000000000175a1e0 R08: 0000000000000000 R09: 0000000000000014
      [  541.026755] R10: 00000000000006b2 R11: 0000000000000246 R12: 00007f46629c883c
      [  541.028186] R13: 0000000000000000 R14: 0000000001753210 R15: 00007fff5e12ec30
      [  541.029626] Modules linked in: snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_timer snd mac_hid i2c_piix4 soundcore ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx raid1 raid0 multipath linear 8139too crct10dif_pclmul crc32_pclmul qxl drm_kms_helper syscopyarea aesni_intel sysfillrect sysimgblt fb_sys_fops ttm drm aes_x86_64 crypto_simd cryptd 8139cp glue_helper mii pata_acpi floppy
      [  541.039445] ---[ end trace 4ce02f25ff7d3df5 ]---
      [  541.040392] RIP: 0010:clear_inode+0xc0/0xd0
      [  541.041240] Code: 8d a3 30 01 00 00 4c 89 e7 e8 1c ec f8 ff 48 8b 83 30 01 00 00 49 39 c4 75 1a 48 c7 83 a0 00 00 00 60 00 00 00 5b 41 5c 5d c3 <0f> 0b 0f 0b 0f 0b 0f 0b 0f 0b 0f 0b 0f 1f 40 00 66 66 66 66 90 55
      [  541.045042] RSP: 0018:ffff8801e34a7b70 EFLAGS: 00010002
      [  541.046099] RAX: 0000000000000000 RBX: ffff8801e9b744e8 RCX: ffffffffb840eb3a
      [  541.047537] RDX: dffffc0000000000 RSI: 0000000000000004 RDI: ffff8801e9b746b8
      [  541.048965] RBP: ffff8801e34a7b80 R08: ffffed003d36e8ce R09: ffffed003d36e8ce
      [  541.050402] R10: 0000000000000001 R11: ffffed003d36e8cd R12: ffff8801e9b74668
      [  541.051832] R13: ffff8801e9b74760 R14: ffff8801e9b74528 R15: ffff8801e9b74530
      [  541.053263] FS:  00007f4662bdf840(0000) GS:ffff8801f6f00000(0000) knlGS:0000000000000000
      [  541.054891] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  541.056039] CR2: 000000000175c568 CR3: 00000001dcfe6000 CR4: 00000000000006e0
      [  541.058506] ==================================================================
      [  541.059991] BUG: KASAN: stack-out-of-bounds in update_stack_state+0x38c/0x3e0
      [  541.061513] Read of size 8 at addr ffff8801e34a7970 by task umount/1305
      
      [  541.063302] CPU: 1 PID: 1305 Comm: umount Tainted: G      D           4.18.0-rc1+ #4
      [  541.064838] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
      [  541.066778] Call Trace:
      [  541.067294]  dump_stack+0x7b/0xb5
      [  541.067986]  print_address_description+0x70/0x290
      [  541.068941]  kasan_report+0x291/0x390
      [  541.069692]  ? update_stack_state+0x38c/0x3e0
      [  541.070598]  __asan_load8+0x54/0x90
      [  541.071315]  update_stack_state+0x38c/0x3e0
      [  541.072172]  ? __read_once_size_nocheck.constprop.7+0x20/0x20
      [  541.073340]  ? vprintk_func+0x27/0x60
      [  541.074096]  ? printk+0xa3/0xd3
      [  541.074762]  ? __save_stack_trace+0x5e/0x100
      [  541.075634]  unwind_next_frame.part.5+0x18e/0x490
      [  541.076594]  ? unwind_dump+0x290/0x290
      [  541.077368]  ? __show_regs+0x2c4/0x330
      [  541.078142]  __unwind_start+0x106/0x190
      [  541.085422]  __save_stack_trace+0x5e/0x100
      [  541.086268]  ? __save_stack_trace+0x5e/0x100
      [  541.087161]  ? unlink_anon_vmas+0xba/0x2c0
      [  541.087997]  save_stack_trace+0x1f/0x30
      [  541.088782]  save_stack+0x46/0xd0
      [  541.089475]  ? __alloc_pages_slowpath+0x1420/0x1420
      [  541.090477]  ? flush_tlb_mm_range+0x15e/0x220
      [  541.091364]  ? __dec_node_state+0x24/0xb0
      [  541.092180]  ? lock_page_memcg+0x85/0xf0
      [  541.092979]  ? unlock_page_memcg+0x16/0x80
      [  541.093812]  ? page_remove_rmap+0x198/0x520
      [  541.094674]  ? mark_page_accessed+0x133/0x200
      [  541.095559]  ? _cond_resched+0x1a/0x50
      [  541.096326]  ? unmap_page_range+0xcd4/0xe50
      [  541.097179]  ? rb_next+0x58/0x80
      [  541.097845]  ? rb_next+0x58/0x80
      [  541.098518]  __kasan_slab_free+0x13c/0x1a0
      [  541.099352]  ? unlink_anon_vmas+0xba/0x2c0
      [  541.100184]  kasan_slab_free+0xe/0x10
      [  541.100934]  kmem_cache_free+0x89/0x1e0
      [  541.101724]  unlink_anon_vmas+0xba/0x2c0
      [  541.102534]  free_pgtables+0x101/0x1b0
      [  541.103299]  exit_mmap+0x146/0x2a0
      [  541.103996]  ? __ia32_sys_munmap+0x50/0x50
      [  541.104829]  ? kasan_check_read+0x11/0x20
      [  541.105649]  ? mm_update_next_owner+0x322/0x380
      [  541.106578]  mmput+0x8b/0x1d0
      [  541.107191]  do_exit+0x43a/0x1390
      [  541.107876]  ? mm_update_next_owner+0x380/0x380
      [  541.108791]  ? deactivate_super+0x5e/0x80
      [  541.109610]  ? cleanup_mnt+0x61/0xa0
      [  541.110351]  ? __cleanup_mnt+0x12/0x20
      [  541.111115]  ? task_work_run+0xc8/0xf0
      [  541.111879]  ? exit_to_usermode_loop+0x125/0x130
      [  541.112817]  rewind_stack_do_exit+0x17/0x20
      [  541.113666] RIP: 0033:0x7f46624bf487
      [  541.114404] Code: Bad RIP value.
      [  541.115094] RSP: 002b:00007fff5e12e9a8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
      [  541.116605] RAX: 0000000000000000 RBX: 0000000001753030 RCX: 00007f46624bf487
      [  541.118034] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 000000000175a1e0
      [  541.119472] RBP: 000000000175a1e0 R08: 0000000000000000 R09: 0000000000000014
      [  541.120890] R10: 00000000000006b2 R11: 0000000000000246 R12: 00007f46629c883c
      [  541.122321] R13: 0000000000000000 R14: 0000000001753210 R15: 00007fff5e12ec30
      
      [  541.124061] The buggy address belongs to the page:
      [  541.125042] page:ffffea00078d29c0 count:0 mapcount:0 mapping:0000000000000000 index:0x0
      [  541.126651] flags: 0x2ffff0000000000()
      [  541.127418] raw: 02ffff0000000000 dead000000000100 dead000000000200 0000000000000000
      [  541.128963] raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
      [  541.130516] page dumped because: kasan: bad access detected
      
      [  541.131954] Memory state around the buggy address:
      [  541.132924]  ffff8801e34a7800: 00 f1 f1 f1 f1 00 f4 f4 f4 f3 f3 f3 f3 00 00 00
      [  541.134378]  ffff8801e34a7880: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      [  541.135814] >ffff8801e34a7900: 00 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1
      [  541.137253]                                                              ^
      [  541.138637]  ffff8801e34a7980: f1 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      [  541.140075]  ffff8801e34a7a00: 00 00 00 00 00 00 00 00 f3 00 00 00 00 00 00 00
      [  541.141509] ==================================================================
      
      - Location
      https://elixir.bootlin.com/linux/v4.18-rc1/source/fs/inode.c#L512
      	BUG_ON(inode->i_data.nrpages);
      
      The root cause is root directory inode is corrupted, it has both
      inline_data and inline_dentry flag, and its nlink is zero, so in
      ->evict(), after dropping all page cache, it grabs page #0 for inline
      data truncation, result in panic in later clear_inode() where we will
      check inode->i_data.nrpages value.
      
      This patch adds inline flags check in sanity_check_inode, in addition,
      do sanity check with root inode's nlink.
      
      Reported-by Wen Xu <wen.xu@gatech.edu>
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      bcbfbd60
  8. 11 8月, 2018 4 次提交
    • C
      f2fs: fix to reset i_gc_failures correctly · 30933364
      Chao Yu 提交于
      Let's reset i_gc_failures to zero when we unset pinned state for file.
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      30933364
    • C
      f2fs: fix invalid memory access · d3f07c04
      Chao Yu 提交于
      syzbot found the following crash on:
      
      HEAD commit:    d9bd94c0bcaa Add linux-next specific files for 20180801
      git tree:       linux-next
      console output: https://syzkaller.appspot.com/x/log.txt?x=1001189c400000
      kernel config:  https://syzkaller.appspot.com/x/.config?x=cc8964ea4d04518c
      dashboard link: https://syzkaller.appspot.com/bug?extid=c966a82db0b14aa37e81
      compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
      
      Unfortunately, I don't have any reproducer for this crash yet.
      
      IMPORTANT: if you fix the bug, please add the following tag to the commit:
      Reported-by: syzbot+c966a82db0b14aa37e81@syzkaller.appspotmail.com
      
      loop7: rw=12288, want=8200, limit=20
      netlink: 65342 bytes leftover after parsing attributes in process `syz-executor4'.
      openvswitch: netlink: Message has 8 unknown bytes.
      kasan: CONFIG_KASAN_INLINE enabled
      kasan: GPF could be caused by NULL-ptr deref or user memory access
      general protection fault: 0000 [#1] SMP KASAN
      CPU: 1 PID: 7615 Comm: syz-executor7 Not tainted 4.18.0-rc7-next-20180801+ #29
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:__read_once_size include/linux/compiler.h:188 [inline]
      RIP: 0010:compound_head include/linux/page-flags.h:142 [inline]
      RIP: 0010:PageLocked include/linux/page-flags.h:272 [inline]
      RIP: 0010:f2fs_put_page fs/f2fs/f2fs.h:2011 [inline]
      RIP: 0010:validate_checkpoint+0x66d/0xec0 fs/f2fs/checkpoint.c:835
      Code: e8 58 05 7f fe 4c 8d 6b 80 4d 8d 74 24 08 48 b8 00 00 00 00 00 fc ff df 4c 89 ea 48 c1 ea 03 c6 04 02 00 4c 89 f2 48 c1 ea 03 <80> 3c 02 00 0f 85 f4 06 00 00 4c 89 ea 4d 8b 7c 24 08 48 b8 00 00
      RSP: 0018:ffff8801937cebe8 EFLAGS: 00010246
      RAX: dffffc0000000000 RBX: ffff8801937cef30 RCX: ffffc90006035000
      RDX: 0000000000000000 RSI: ffffffff82fd9658 RDI: 0000000000000005
      RBP: ffff8801937cef58 R08: ffff8801ab254700 R09: fffff94000d9e026
      R10: fffff94000d9e026 R11: ffffea0006cf0137 R12: fffffffffffffffb
      R13: ffff8801937ceeb0 R14: 0000000000000003 R15: ffff880193419b40
      FS:  00007f36a61d5700(0000) GS:ffff8801db100000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007fc04ff93000 CR3: 00000001d0562000 CR4: 00000000001426e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       f2fs_get_valid_checkpoint+0x436/0x1ec0 fs/f2fs/checkpoint.c:860
       f2fs_fill_super+0x2d42/0x8110 fs/f2fs/super.c:2883
       mount_bdev+0x314/0x3e0 fs/super.c:1344
       f2fs_mount+0x3c/0x50 fs/f2fs/super.c:3133
       legacy_get_tree+0x131/0x460 fs/fs_context.c:729
       vfs_get_tree+0x1cb/0x5c0 fs/super.c:1743
       do_new_mount fs/namespace.c:2603 [inline]
       do_mount+0x6f2/0x1e20 fs/namespace.c:2927
       ksys_mount+0x12d/0x140 fs/namespace.c:3143
       __do_sys_mount fs/namespace.c:3157 [inline]
       __se_sys_mount fs/namespace.c:3154 [inline]
       __x64_sys_mount+0xbe/0x150 fs/namespace.c:3154
       do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x45943a
      Code: b8 a6 00 00 00 0f 05 48 3d 01 f0 ff ff 0f 83 bd 8a fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 0f 83 9a 8a fb ff c3 66 0f 1f 84 00 00 00 00 00
      RSP: 002b:00007f36a61d4a88 EFLAGS: 00000206 ORIG_RAX: 00000000000000a5
      RAX: ffffffffffffffda RBX: 00007f36a61d4b30 RCX: 000000000045943a
      RDX: 00007f36a61d4ad0 RSI: 0000000020000100 RDI: 00007f36a61d4af0
      RBP: 0000000020000100 R08: 00007f36a61d4b30 R09: 00007f36a61d4ad0
      R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000013
      R13: 0000000000000000 R14: 00000000004c8ea0 R15: 0000000000000000
      Modules linked in:
      Dumping ftrace buffer:
         (ftrace buffer empty)
      ---[ end trace bd8550c129352286 ]---
      RIP: 0010:__read_once_size include/linux/compiler.h:188 [inline]
      RIP: 0010:compound_head include/linux/page-flags.h:142 [inline]
      RIP: 0010:PageLocked include/linux/page-flags.h:272 [inline]
      RIP: 0010:f2fs_put_page fs/f2fs/f2fs.h:2011 [inline]
      RIP: 0010:validate_checkpoint+0x66d/0xec0 fs/f2fs/checkpoint.c:835
      Code: e8 58 05 7f fe 4c 8d 6b 80 4d 8d 74 24 08 48 b8 00 00 00 00 00 fc ff df 4c 89 ea 48 c1 ea 03 c6 04 02 00 4c 89 f2 48 c1 ea 03 <80> 3c 02 00 0f 85 f4 06 00 00 4c 89 ea 4d 8b 7c 24 08 48 b8 00 00
      RSP: 0018:ffff8801937cebe8 EFLAGS: 00010246
      RAX: dffffc0000000000 RBX: ffff8801937cef30 RCX: ffffc90006035000
      RDX: 0000000000000000 RSI: ffffffff82fd9658 RDI: 0000000000000005
      netlink: 65342 bytes leftover after parsing attributes in process `syz-executor4'.
      RBP: ffff8801937cef58 R08: ffff8801ab254700 R09: fffff94000d9e026
      openvswitch: netlink: Message has 8 unknown bytes.
      R10: fffff94000d9e026 R11: ffffea0006cf0137 R12: fffffffffffffffb
      R13: ffff8801937ceeb0 R14: 0000000000000003 R15: ffff880193419b40
      FS:  00007f36a61d5700(0000) GS:ffff8801db100000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007fc04ff93000 CR3: 00000001d0562000 CR4: 00000000001426e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      
      In validate_checkpoint(), if we failed to call get_checkpoint_version(), we
      will pass returned invalid page pointer into f2fs_put_page, cause accessing
      invalid memory, this patch tries to handle error path correctly to fix this
      issue.
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      d3f07c04
    • C
      f2fs: fix to avoid broken of dnode block list · 50fa53ec
      Chao Yu 提交于
      f2fs recovery flow is relying on dnode block link list, it means fsynced
      file recovery depends on previous dnode's persistence in the list, so
      during fsync() we should wait on all regular inode's dnode writebacked
      before issuing flush.
      
      By this way, we can avoid dnode block list being broken by out-of-order
      IO submission due to IO scheduler or driver.
      
      Sheng Yong helps to do the test with this patch:
      
      Target:/data (f2fs, -)
      64MB / 32768KB / 4KB / 8
      
      1 / PERSIST / Index
      
      Base:
      	SEQ-RD(MB/s)	SEQ-WR(MB/s)	RND-RD(IOPS)	RND-WR(IOPS)	Insert(TPS)	Update(TPS)	Delete(TPS)
      1	867.82		204.15		41440.03	41370.54	680.8		1025.94		1031.08
      2	871.87		205.87		41370.3		40275.2		791.14		1065.84		1101.7
      3	866.52		205.69		41795.67	40596.16	694.69		1037.16		1031.48
      Avg	868.7366667	205.2366667	41535.33333	40747.3		722.21		1042.98		1054.753333
      
      After:
      	SEQ-RD(MB/s)	SEQ-WR(MB/s)	RND-RD(IOPS)	RND-WR(IOPS)	Insert(TPS)	Update(TPS)	Delete(TPS)
      1	798.81		202.5		41143		40613.87	602.71		838.08		913.83
      2	805.79		206.47		40297.2		41291.46	604.44		840.75		924.27
      3	814.83		206.17		41209.57	40453.62	602.85		834.66		927.91
      Avg	806.4766667	205.0466667	40883.25667	40786.31667	603.3333333	837.83		922.0033333
      
      Patched/Original:
      	0.928332713	0.999074239	0.984300676	1.000957528	0.835398753	0.803303994	0.874141189
      
      It looks like atomic write will suffer performance regression.
      
      I suspect that the criminal is that we forcing to wait all dnode being in
      storage cache before we issue PREFLUSH+FUA.
      
      BTW, will commit ("f2fs: don't need to wait for node writes for atomic write")
      cause the problem: we will lose data of last transaction after SPO, even if
      atomic write return no error:
      
      - atomic_open();
      - write() P1, P2, P3;
      - atomic_commit();
       - writeback data: P1, P2, P3;
       - writeback node: N1, N2, N3;  <--- If N1, N2 is not writebacked, N3 with fsync_mark is
      writebacked, In SPOR, we won't find N3 since node chain is broken, turns out that losing
      last transaction.
       - preflush + fua;
      - power-cut
      
      If we don't wait dnode writeback for atomic_write:
      
      	SEQ-RD(MB/s)	SEQ-WR(MB/s)	RND-RD(IOPS)	RND-WR(IOPS)	Insert(TPS)	Update(TPS)	Delete(TPS)
      1	779.91		206.03		41621.5		40333.16	716.9		1038.21		1034.85
      2	848.51		204.35		40082.44	39486.17	791.83		1119.96		1083.77
      3	772.12		206.27		41335.25	41599.65	723.29		1055.07		971.92
      Avg	800.18		205.55		41013.06333	40472.99333	744.0066667	1071.08		1030.18
      
      Patched/Original:
      	0.92108464	1.001526693	0.987425886	0.993268102	1.030180511	1.026942031	0.976702294
      
      SQLite's performance recovers.
      
      Jaegeuk:
      "Practically, I don't see db corruption becase of this. We can excuse to lose
      the last transaction."
      
      Finally, we decide to keep original implementation of atomic write interface
      sematics that we don't wait all dnode writeback before preflush+fua submission.
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      50fa53ec
    • G
      f2fs: use true and false for boolean values · 6e45f2a5
      Gustavo A. R. Silva 提交于
      Return statements in functions returning bool should use true or false
      instead of an integer value.
      
      This issue was detected with the help of Coccinelle.
      Signed-off-by: NGustavo A. R. Silva <gustavo@embeddedor.com>
      Reviewed-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      6e45f2a5