1. 08 Jul, 2020 5 commits
    • f2fs: fix an oops in f2fs_is_compressed_page · 29b993c7
      Committed by Yu Changchun
      This patch is to fix a crash:
      
       #3 [ffffb6580689f898] oops_end at ffffffffa2835bc2
       #4 [ffffb6580689f8b8] no_context at ffffffffa28766e7
       #5 [ffffb6580689f920] async_page_fault at ffffffffa320135e
          [exception RIP: f2fs_is_compressed_page+34]
          RIP: ffffffffa2ba83a2  RSP: ffffb6580689f9d8  RFLAGS: 00010213
          RAX: 0000000000000001  RBX: fffffc0f50b34bc0  RCX: 0000000000002122
          RDX: 0000000000002123  RSI: 0000000000000c00  RDI: fffffc0f50b34bc0
          RBP: ffff97e815a40178   R8: 0000000000000000   R9: ffff97e83ffc9000
          R10: 0000000000032300  R11: 0000000000032380  R12: ffffb6580689fa38
          R13: fffffc0f50b34bc0  R14: ffff97e825cbd000  R15: 0000000000000c00
          ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
       #6 [ffffb6580689f9d8] __is_cp_guaranteed at ffffffffa2b7ea98
       #7 [ffffb6580689f9f0] f2fs_submit_page_write at ffffffffa2b81a69
       #8 [ffffb6580689fa30] f2fs_do_write_meta_page at ffffffffa2b99777
       #9 [ffffb6580689fae0] __f2fs_write_meta_page at ffffffffa2b75f1a
       #10 [ffffb6580689fb18] f2fs_sync_meta_pages at ffffffffa2b77466
       #11 [ffffb6580689fc98] do_checkpoint at ffffffffa2b78e46
       #12 [ffffb6580689fd88] f2fs_write_checkpoint at ffffffffa2b79c29
       #13 [ffffb6580689fdd0] f2fs_sync_fs at ffffffffa2b69d95
       #14 [ffffb6580689fe20] sync_filesystem at ffffffffa2ad2574
       #15 [ffffb6580689fe30] generic_shutdown_super at ffffffffa2a9b582
       #16 [ffffb6580689fe48] kill_block_super at ffffffffa2a9b6d1
       #17 [ffffb6580689fe60] kill_f2fs_super at ffffffffa2b6abe1
       #18 [ffffb6580689fea0] deactivate_locked_super at ffffffffa2a9afb6
       #19 [ffffb6580689feb8] cleanup_mnt at ffffffffa2abcad4
       #20 [ffffb6580689fee0] task_work_run at ffffffffa28bca28
       #21 [ffffb6580689ff00] exit_to_usermode_loop at ffffffffa28050b7
       #22 [ffffb6580689ff38] do_syscall_64 at ffffffffa280560e
       #23 [ffffb6580689ff50] entry_SYSCALL_64_after_hwframe at ffffffffa320008c
      
      This occurs when unmounting f2fs with both F2FS_FS_COMPRESSION and
      F2FS_IO_TRACE enabled. Fix it by adding IS_IO_TRACED_PAGE to check
      the validity of the pid stored in page_private.
      Signed-off-by: Yu Changchun <yuchangchun1@huawei.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: fix to wait page writeback before update · a6d601f3
      Committed by Chao Yu
      Filesystems, including f2fs, should support stable pages for special
      devices such as software RAID. However, there is one missed path in
      which a page can be updated while it is under writeback, as shown
      below. Fix this.
      
      - gc_node_segment
       - f2fs_move_node_page
        - __write_node_page
         - set_page_writeback
      
      - do_read_inode
       - f2fs_init_extent_tree
        - __f2fs_init_extent_tree
          i_ext->len = 0;
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: show more debug info for per-temperature log · 0759e2c1
      Committed by Chao Yu
      - Account and show per-log dirty_seg, full_seg and valid_blocks
      in debugfs.
      - Reformat the printed info.
      
          TYPE            segno    secno   zoneno  dirty_seg   full_seg  valid_blk
        - COLD   data:     1523     1523     1523          1          0        399
        - WARM   data:      769      769      769         20        255     133098
        - HOT    data:      767      767      767          9          0        167
        - Dir   dnode:       22       22       22          3          0         70
        - File  dnode:      722      722      722         14         10       6505
        - Indir nodes:        2        2        2          1          0          3
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: clean up parameter of f2fs_allocate_data_block() · f608c38c
      Committed by Chao Yu
      Use the validity of @fio to indicate whether the caller wants to serialize
      IOs in io.io_list or not; @add_list then becomes redundant, so remove it.
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: add prefix for exported symbols · 0ef81833
      Committed by Chao Yu
      to avoid polluting global symbol namespace.
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  2. 09 Jun, 2020 2 commits
    • f2fs: add node_io_flag for bio flags likewise data_io_flag · 32b6aba8
      Committed by Jaegeuk Kim
      This patch adds another way to attach bio flags to node writes.
      
      Description:   Give a way to attach REQ_META|FUA to node writes
                     given temperature-based bits. Now the bits indicate:
                     *      REQ_META     |      REQ_FUA      |
                     *    5 |    4 |   3 |    2 |    1 |   0 |
                     * Cold | Warm | Hot | Cold | Warm | Hot |
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
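The bit table in this message can be read as two 3-bit groups indexed by temperature. The following is a minimal userspace sketch of how such a value might be decoded, assuming hot = 0, warm = 1, cold = 2 as the table's column order suggests. The names `decode_node_io_flag`, `struct io_flags` and `enum temp_type` are hypothetical and are not taken from the f2fs sources.

```c
#include <assert.h>
#include <stdbool.h>

/* Temperature indices; the ordering (hot = 0) follows the bit table in the
 * commit message. These names are illustrative only. */
enum temp_type { TEMP_HOT = 0, TEMP_WARM = 1, TEMP_COLD = 2, NR_TEMP = 3 };

/* Hypothetical result type: which bio flags a node write should carry. */
struct io_flags {
    bool meta; /* REQ_META requested for this temperature */
    bool fua;  /* REQ_FUA requested for this temperature */
};

/* Bits 0..2 select REQ_FUA and bits 3..5 select REQ_META, one bit per
 * temperature in each group. */
struct io_flags decode_node_io_flag(unsigned int node_io_flag, enum temp_type temp)
{
    unsigned int temp_mask = 1u << temp;
    struct io_flags out;

    out.fua = (node_io_flag & temp_mask) != 0;
    out.meta = ((node_io_flag >> NR_TEMP) & temp_mask) != 0;
    return out;
}
```

Under these assumptions, a value of 0x12 (bits 4 and 1) would request both REQ_META and REQ_FUA for warm node writes and nothing for hot or cold ones.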
    • f2fs: don't return vmalloc() memory from f2fs_kmalloc() · 0b6d4ca0
      Committed by Eric Biggers
      kmalloc() returns kmalloc'ed memory, and kvmalloc() returns either
      kmalloc'ed or vmalloc'ed memory.  But the f2fs wrappers, f2fs_kmalloc()
      and f2fs_kvmalloc(), both return both kinds of memory.
      
      It's redundant to have two functions that do the same thing, and also
      breaking the standard naming convention is causing bugs since people
      assume it's safe to kfree() memory allocated by f2fs_kmalloc().  See
      e.g. the various allocations in fs/f2fs/compress.c.
      
      Fix this by making f2fs_kmalloc() just use kmalloc().  And to avoid
      re-introducing the allocation failures that the vmalloc fallback was
      intended to fix, convert the largest allocations to use f2fs_kvmalloc().
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  3. 03 Jun, 2020 2 commits
  4. 19 May, 2020 2 commits
    • fscrypt: support test_dummy_encryption=v2 · ed318a6c
      Committed by Eric Biggers
      v1 encryption policies are deprecated in favor of v2, and some new
      features (e.g. encryption+casefolding) are only being added for v2.
      
      Therefore, the "test_dummy_encryption" mount option (which is used for
      encryption I/O testing with xfstests) needs to support v2 policies.
      
      To do this, extend its syntax to be "test_dummy_encryption=v1" or
      "test_dummy_encryption=v2".  The existing "test_dummy_encryption" (no
      argument) also continues to be accepted, to specify the default setting
      -- currently v1, but the next patch changes it to v2.
      
      To cleanly support both v1 and v2 while also making it easy to support
      specifying other encryption settings in the future (say, accepting
      "$contents_mode:$filenames_mode:v2"), make ext4 and f2fs maintain a
      pointer to the dummy fscrypt_context rather than using mount flags.
      
      To avoid concurrency issues, don't allow test_dummy_encryption to be set
      or changed during a remount.  (The former restriction is new, but
      xfstests doesn't run into it, so no one should notice.)
      
      Tested with 'gce-xfstests -c {ext4,f2fs}/encrypt -g auto'.  On ext4,
      there are two regressions, both of which are test bugs: ext4/023 and
      ext4/028 fail because they set an xattr and expect it to be stored
      inline, but the increase in size of the fscrypt_context from
      24 to 40 bytes causes this xattr to be spilled into an external block.
      
      Link: https://lore.kernel.org/r/20200512233251.118314-4-ebiggers@kernel.org
      Acked-by: Jaegeuk Kim <jaegeuk@kernel.org>
      Reviewed-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Eric Biggers <ebiggers@google.com>
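The option syntax described in this message (bare option, "=v1", or "=v2", with no setting or changing on remount) can be illustrated with a small standalone parser. This is only a sketch of the described behavior, not the ext4 or f2fs implementation; `parse_test_dummy_encryption`, `remount_ok` and `enum dummy_policy` are hypothetical names, and the default of v1 reflects the state at this commit only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result of parsing one mount option token. */
enum dummy_policy {
    DUMMY_INVALID = -1, /* option recognized, argument not accepted */
    DUMMY_NONE = 0,     /* token is some other option */
    DUMMY_V1 = 1,
    DUMMY_V2 = 2
};

enum dummy_policy parse_test_dummy_encryption(const char *opt)
{
    static const char name[] = "test_dummy_encryption";
    size_t n = sizeof(name) - 1;

    if (strncmp(opt, name, n) != 0)
        return DUMMY_NONE;
    if (opt[n] == '\0')
        return DUMMY_V1;   /* no argument: default (v1 at this commit) */
    if (opt[n] != '=')
        return DUMMY_NONE; /* a different option sharing the prefix */
    if (strcmp(opt + n + 1, "v1") == 0)
        return DUMMY_V1;
    if (strcmp(opt + n + 1, "v2") == 0)
        return DUMMY_V2;
    return DUMMY_INVALID;
}

/* Remount rule from the message: the option may not be newly set or changed.
 * Repeating the already active setting is assumed to be harmless here. */
int remount_ok(enum dummy_policy current, enum dummy_policy requested)
{
    return requested == DUMMY_NONE || requested == current;
}
```

Keeping the parsed policy as a value rather than a mount flag mirrors the message's point that a flag cannot distinguish v1 from v2, although the real code stores a pointer to a dummy context rather than an enum.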
    • f2fs: fix checkpoint=disable:%u%% · 1ae18f71
      Committed by Jaegeuk Kim
      When parsing the mount option, we don't have sbi->user_block_count yet.
      The percentage should be applied after that value has been obtained.
      
      Cc: <stable@vger.kernel.org>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
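The ordering problem described here can be sketched as a two-step scheme: remember the percentage when the option is parsed, and convert it to a block count only once the user block count is known. The struct and function names below are hypothetical stand-ins, not the actual f2fs fields or helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the mount options and superblock info. */
struct mount_opts_sketch {
    unsigned int unusable_cap_perc; /* percentage given on the command line */
    uint64_t unusable_cap;          /* resolved block count */
};

struct sb_info_sketch {
    uint64_t user_block_count;      /* unknown (0) while options are parsed */
    struct mount_opts_sketch opt;
};

/* Step 1, at option-parse time: only record the percentage, because
 * user_block_count is not available yet. Returns -1 on a bad value. */
int parse_checkpoint_disable_percent(struct sb_info_sketch *sbi, unsigned int perc)
{
    if (perc > 100)
        return -1;
    sbi->opt.unusable_cap_perc = perc;
    return 0;
}

/* Step 2, after user_block_count has been read: turn the percentage into
 * an absolute number of blocks. */
void resolve_unusable_cap(struct sb_info_sketch *sbi)
{
    sbi->opt.unusable_cap =
        sbi->user_block_count * sbi->opt.unusable_cap_perc / 100;
}
```

Computing the cap during parsing would multiply the percentage by a block count of zero, which is the kind of failure the message describes.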
  5. 12 May, 2020 10 commits
    • f2fs: add compressed/gc data read IO stat · 9c122384
      Committed by Chao Yu
      in order to account data read IOs more accurately.
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: refactor resize_fs to avoid meta updates in progress · b4b10061
      Committed by Jaegeuk Kim
      Sahitya raised an issue:
      - prevent meta updates while checkpoint is in progress
      
      allocate_segment_for_resize() can cause metapage updates if
      it requires to change the current node/data segments for resizing.
      Stop these meta updates when there is a checkpoint already
      in progress to prevent inconsistent CP data.
      Signed-off-by: Sahitya Tummala <stummala@codeaurora.org>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: introduce F2FS_IOC_RESERVE_COMPRESS_BLOCKS · c75488fb
      Committed by Chao Yu
      This patch introduces a new ioctl to roll back the state of a
      compressed inode:
      - add reserved blocks in dnode blocks
      - increase i_compr_blocks, i_blocks, total_valid_block_count
      - remove immutable flag
      
      The compressed inode can then support overwrite functionality
      again.
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: Avoid double lock for cp_rwsem during checkpoint · 34c061ad
      Committed by Sayali Lokhande
      There could be a scenario where f2fs_sync_node_pages gets
      called during checkpoint, which in turn tries to flush
      inline data and calls iput(). This results in deadlock as
      iput() tries to hold cp_rwsem, which is already held at the
      beginning by checkpoint->block_operations().
      
      Call stack :
      
      Thread A		Thread B
      f2fs_write_checkpoint()
      - block_operations(sbi)
       - f2fs_lock_all(sbi);
        - down_write(&sbi->cp_rwsem);
      
                              - open()
                               - igrab()
                              - write() write inline data
                              - unlink()
      - f2fs_sync_node_pages()
       - if (is_inline_node(page))
        - flush_inline_data()
         - ilookup()
           page = f2fs_pagecache_get_page()
           if (!page)
            goto iput_out;
           iput_out:
      			-close()
      			-iput()
             iput(inode);
             - f2fs_evict_inode()
              - f2fs_truncate_blocks()
               - f2fs_lock_op()
                 - down_read(&sbi->cp_rwsem);
      
      Fixes: 2049d4fc ("f2fs: avoid multiple node page writes due to inline_data")
      Signed-off-by: Sayali Lokhande <sayalil@codeaurora.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: Fix wrong stub helper update_sit_info · 48abe91a
      Committed by YueHaibing
      update_sit_info should be f2fs_update_sit_info,
      otherwise the build fails when CONFIG_F2FS_STAT_FS is not set.
      
      Fixes: fc7100ea ("f2fs: Add f2fs stats to sysfs")
      Signed-off-by: YueHaibing <yuehaibing@huawei.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: introduce F2FS_IOC_RELEASE_COMPRESS_BLOCKS · ef8d563f
      Committed by Chao Yu
      There are still reserved blocks on a compressed inode. This patch
      introduces a new ioctl to release those reserved blocks back to the
      filesystem, so that userspace can reuse the freed space.
      
      ----
      Daeho fixed a bug like below.
      
      Now, if writing pages and releasing compress blocks occur
      simultaneously, and releasing cblocks is executed more than one time
      to a file, then total block count of filesystem and block count of the
      file could be incorrect and damaged.
      
      We have to execute releasing compress blocks only one time for a file
      without being interfered by writepages path.
      ---
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Daeho Jeong <daehojeong@google.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: rework filename handling · 43c780ba
      Committed by Eric Biggers
      Rework f2fs's handling of filenames to use a new 'struct f2fs_filename'.
      Similar to 'struct ext4_filename', this stores the usr_fname, disk_name,
      dirhash, crypto_buf, and casefolded name.  Some of these names can be
      NULL in some cases.  'struct f2fs_filename' differs from
      'struct fscrypt_name' mainly in that the casefolded name is included.
      
      For user-initiated directory operations like lookup() and create(),
      initialize the f2fs_filename by translating the corresponding
      fscrypt_name, then computing the dirhash and casefolded name if needed.
      
      This makes the dirhash and casefolded name be cached for each syscall,
      so we don't have to recompute them repeatedly.  (Previously, f2fs
      computed the dirhash once per directory level, and the casefolded name
      once per directory block.)  This improves performance.
      
      This rework also makes it much easier to correctly handle all
      combinations of normal, encrypted, casefolded, and encrypted+casefolded
      directories.  (The fourth isn't supported yet but is being worked on.)
      
      The only other cases where an f2fs_filename gets initialized are for two
      filesystem-internal operations: (1) when converting an inline directory
      to a regular one, we grab the needed disk_name and hash from an existing
      f2fs_dir_entry; and (2) when roll-forward recovering a new dentry, we
      grab the needed disk_name from f2fs_inode::i_name and compute the hash.
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: split f2fs_d_compare() from f2fs_match_name() · f874fa1c
      Committed by Eric Biggers
      Sharing f2fs_ci_compare() between comparing cached dentries
      (f2fs_d_compare()) and comparing on-disk dentries (f2fs_match_name())
      doesn't work as well as intended, as these actions fundamentally differ
      in several ways (e.g. whether the task may sleep, whether the directory
      is stable, whether the casefolded name was precomputed, whether the
      dentry will need to be decrypted once we allow casefold+encrypt, etc.)
      
      Just make f2fs_d_compare() implement what it needs directly, and rework
      f2fs_ci_compare() to be specialized for f2fs_match_name().
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: compress: support lzo-rle compress algorithm · 6d92b201
      Committed by Chao Yu
      The LZO-RLE extension (run-length encoding) was introduced to improve
      the performance of the LZO algorithm on data that contains many zeros,
      and zram has switched to this extended algorithm by default. This
      patch adds support for the extension. To use it, enable the
      F2FS_FS_LZO and F2FS_FS_LZORLE configs and specify the
      "compress_algorithm=lzo-rle" mount option.
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: introduce mempool for {,de}compress intermediate page allocation · 5e6bbde9
      Committed by Chao Yu
      If the compression feature is on and free memory is scarce, the
      page refault ratio is higher than before. The root cause is:
      - the {,de}compression flow needs to allocate intermediate pages to store
      compressed data in a cluster, so during their allocation the VM may
      reclaim mmaped pages.
      - if the reclaimed pages belong to a compressed cluster, refaulting
      them may cause more intermediate page allocations, resulting in
      more mmaped pages being reclaimed.
      
      So this patch introduces a mempool for intermediate page allocation
      in order to avoid a high refault ratio. By default, the number of
      preallocated pages in the pool is 512; the user can change it by
      setting the 'num_compress_pages' parameter during module initialization.
      
      Ma Feng found warnings in the original patch and fixed like below.
      
      Fix the following sparse warning:
      fs/f2fs/compress.c:501:5: warning: symbol 'num_compress_pages' was not declared.
       Should it be static?
      fs/f2fs/compress.c:530:6: warning: symbol 'f2fs_compress_free_page' was not
      declared. Should it be static?
      Reported-by: Hulk Robot <hulkci@huawei.com>
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Ma Feng <mafeng.ma@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  6. 08 May, 2020 3 commits
  7. 18 Apr, 2020 2 commits
  8. 17 Apr, 2020 1 commit
  9. 04 Apr, 2020 3 commits
  10. 31 Mar, 2020 4 commits
    • f2fs: fix potential .flags overflow on 32bit architecture · 7653b9d8
      Committed by Chao Yu
      f2fs_inode_info.flags is an unsigned long variable, so it has only
      32 bits on 32-bit architectures. Since we introduced the FI_MMAP_FILE
      flag when adding data compression support, we may access memory beyond
      the end of the .flags field, corrupting the .i_sem field and resulting
      in the deadlock below.
      
      To fix this issue, let's expand .flags as an array to grab enough
      space to store new flags.
      
      Call Trace:
       __schedule+0x8d0/0x13fc
       ? mark_held_locks+0xac/0x100
       schedule+0xcc/0x260
       rwsem_down_write_slowpath+0x3ab/0x65d
       down_write+0xc7/0xe0
       f2fs_drop_nlink+0x3d/0x600 [f2fs]
       f2fs_delete_inline_entry+0x300/0x440 [f2fs]
       f2fs_delete_entry+0x3a1/0x7f0 [f2fs]
       f2fs_unlink+0x500/0x790 [f2fs]
       vfs_unlink+0x211/0x490
       do_unlinkat+0x483/0x520
       sys_unlink+0x4a/0x70
       do_fast_syscall_32+0x12b/0x683
       entry_SYSENTER_32+0xaa/0x102
      
      Fixes: 4c8ff709 ("f2fs: support data compression")
      Tested-by: Ondrej Jirman <megous@megous.com>
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
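The fix described in this message, turning a single-word flags field into an array, can be sketched in userspace as a small bitmap. The struct layout, the flag count of 40 and all names ending in `_sketch` are hypothetical; the kernel has its own bitmap helpers, which this example does not use.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG_SKETCH (sizeof(unsigned long) * CHAR_BIT)
#define BITS_TO_LONGS_SKETCH(n) \
    (((n) + BITS_PER_LONG_SKETCH - 1) / BITS_PER_LONG_SKETCH)

/* Hypothetical flag count, deliberately larger than 32. */
#define FI_MAX_SKETCH 40

struct inode_info_sketch {
    /* Enough words for every flag, whatever the width of long is. */
    unsigned long flags[BITS_TO_LONGS_SKETCH(FI_MAX_SKETCH)];
    /* Stand-in for the neighbouring field that must not be overwritten. */
    unsigned long i_sem_word;
};

void fi_set_flag(struct inode_info_sketch *fi, size_t flag)
{
    fi->flags[flag / BITS_PER_LONG_SKETCH] |= 1UL << (flag % BITS_PER_LONG_SKETCH);
}

void fi_clear_flag(struct inode_info_sketch *fi, size_t flag)
{
    fi->flags[flag / BITS_PER_LONG_SKETCH] &= ~(1UL << (flag % BITS_PER_LONG_SKETCH));
}

int fi_test_flag(const struct inode_info_sketch *fi, size_t flag)
{
    return (fi->flags[flag / BITS_PER_LONG_SKETCH] >>
            (flag % BITS_PER_LONG_SKETCH)) & 1UL;
}
```

With a 32-bit long the array has two words, so a flag numbered 33 lands in the second word instead of spilling into the field that follows; with a 64-bit long a single word already suffices.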
    • f2fs: don't trigger data flush in foreground operation · 7bcd0cfa
      Committed by Chao Yu
      Data flush can generate heavy IO and cause long latency during
      flush, so it's not appropriate to trigger it in foreground
      operation.
      
      And also, we may face below potential deadlock during data flush:
      - f2fs_write_multi_pages
       - f2fs_write_raw_pages
        - f2fs_write_single_data_page
         - f2fs_balance_fs
          - f2fs_balance_fs_bg
           - f2fs_sync_dirty_inodes
            - filemap_fdatawrite   -- stuck on flush same cluster
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: clean up f2fs_may_encrypt() · 8c7d4b57
      Committed by Chao Yu
      Merge below two conditions into f2fs_may_encrypt() for cleanup
      - IS_ENCRYPTED()
      - DUMMY_ENCRYPTION_ENABLED()
      
      Check IS_ENCRYPTED(inode) condition in f2fs_init_inode_metadata()
      is enough since we have already set encrypt flag in f2fs_new_inode().
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: don't mark compressed inode dirty during f2fs_iget() · 530e0704
      Committed by Chao Yu
      - f2fs_iget
       - do_read_inode
        - set_inode_flag(, FI_COMPRESSED_FILE)
         - __mark_inode_dirty_flag(, true)
      
      This is unnecessary, so let's mark the inode dirty only when it is
      converted to a compressed inode.
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  11. 25 Mar, 2020 1 commit
  12. 23 Mar, 2020 1 commit
  13. 20 Mar, 2020 4 commits