1. 03 December 2020, 2 commits
  2. 03 November 2020, 1 commit
  3. 14 October 2020, 1 commit
    • f2fs: handle errors of f2fs_get_meta_page_nofail · 86f33603
      Authored by Jaegeuk Kim
      The first problem is that we hit BUG_ON() in f2fs_get_sum_page() when
      f2fs_get_meta_page_nofail() returns EIO.
      
      The quick fix was to return no error and loop forever instead, but syzbot
      caught a case where a fuzzed image makes it spin in that loop. It turned out
      we were abusing f2fs_get_meta_page_nofail(), as in the call stack below.
      
      - f2fs_fill_super
       - f2fs_build_segment_manager
        - build_sit_entries
         - get_current_sit_page
      
      INFO: task syz-executor178:6870 can't die for more than 143 seconds.
      task:syz-executor178 state:R
       stack:26960 pid: 6870 ppid:  6869 flags:0x00004006
      Call Trace:
      
      Showing all locks held in the system:
      1 lock held by khungtaskd/1179:
       #0: ffffffff8a554da0 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x53/0x260 kernel/locking/lockdep.c:6242
      1 lock held by systemd-journal/3920:
      1 lock held by in:imklog/6769:
       #0: ffff88809eebc130 (&f->f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0xe9/0x100 fs/file.c:930
      1 lock held by syz-executor178/6870:
       #0: ffff8880925120e0 (&type->s_umount_key#47/1){+.+.}-{3:3}, at: alloc_super+0x201/0xaf0 fs/super.c:229
      
      Actually, we didn't need _nofail in this case, since the existing error
      handler already lets us return an error to mount(2).
      
      As a result, this patch 1) removes _nofail callers as much as possible, and
      2) handles the error case in the last remaining caller, f2fs_get_sum_page().
      
      Reported-by: syzbot+ee250ac8137be41d7b13@syzkaller.appspotmail.com
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      86f33603
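      The pattern this commit moves to can be illustrated with a small userspace C
      sketch: return the error to the mount path instead of looping or hitting
      BUG_ON(). All names below are stand-ins for the f2fs functions mentioned
      above, not the actual kernel code.
      
          #include <errno.h>
          #include <stdio.h>
          
          struct page { int data; };
          
          /* Stand-in for a metadata read that may fail with -EIO. */
          static int read_meta_page(int index, struct page *out)
          {
                  if (index < 0)
                          return -EIO;    /* simulated I/O error */
                  out->data = index;
                  return 0;
          }
          
          /* Instead of retrying forever (the old _nofail behaviour), propagate
           * the error so the caller can clean up and fail the mount gracefully. */
          static int get_current_sit_page(int segno, struct page *out)
          {
                  return read_meta_page(segno, out);
          }
          
          int main(void)
          {
                  struct page p;
                  int err = get_current_sit_page(-1, &p);
          
                  if (err)
                          fprintf(stderr, "build_sit_entries: error %d, failing mount\n", err);
                  return err ? 1 : 0;
          }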
  4. 09 October 2020, 1 commit
  5. 30 September 2020, 2 commits
  6. 22 September 2020, 2 commits
    • fscrypt: handle test_dummy_encryption in more logical way · ac4acb1f
      Authored by Eric Biggers
      The behavior of the test_dummy_encryption mount option is that when a
      new file (or directory or symlink) is created in an unencrypted
      directory, it's automatically encrypted using a dummy encryption policy.
      That's it; in particular, the encryption (or lack thereof) of existing
      files (or directories or symlinks) doesn't change.
      
      Unfortunately the implementation of test_dummy_encryption is a bit weird
      and confusing.  When test_dummy_encryption is enabled and a file is
      being created in an unencrypted directory, we set up an encryption key
      (->i_crypt_info) for the directory.  This isn't actually used to do any
      encryption, however, since the directory is still unencrypted!  Instead,
      ->i_crypt_info is only used for inheriting the encryption policy.
      
      One consequence of this is that the filesystem ends up providing a
      "dummy context" (policy + nonce) instead of a "dummy policy".  In
      commit ed318a6c ("fscrypt: support test_dummy_encryption=v2"), I
      mistakenly thought this was required.  However, actually the nonce only
      ends up being used to derive a key that is never used.
      
      Another consequence of this implementation is that it allows for
      'inode->i_crypt_info != NULL && !IS_ENCRYPTED(inode)', which is an edge
      case that can be forgotten about.  For example, currently
      FS_IOC_GET_ENCRYPTION_POLICY on an unencrypted directory may return the
      dummy encryption policy when the filesystem is mounted with
      test_dummy_encryption.  That seems like the wrong thing to do, since
      again, the directory itself is not actually encrypted.
      
      Therefore, switch to a more logical and maintainable implementation
      where the dummy encryption policy inheritance is done without setting up
      keys for unencrypted directories.  This involves:
      
      - Adding a function fscrypt_policy_to_inherit() which returns the
        encryption policy to inherit from a directory.  This can be a real
        policy, a dummy policy, or no policy.
      
      - Replacing struct fscrypt_dummy_context, ->get_dummy_context(), etc.
        with struct fscrypt_dummy_policy, ->get_dummy_policy(), etc.
      
      - Making fscrypt_fname_encrypted_size() take an fscrypt_policy instead
        of an inode.
      Acked-by: Jaegeuk Kim <jaegeuk@kernel.org>
      Acked-by: Jeff Layton <jlayton@kernel.org>
      Link: https://lore.kernel.org/r/20200917041136.178600-13-ebiggers@kernel.org
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      ac4acb1f
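      The three-way decision described above (real policy for an encrypted parent,
      the mount-wide dummy policy under test_dummy_encryption, otherwise none) can
      be sketched in userspace C; the types and names below are illustrative
      stand-ins, not the fscrypt API.
      
          #include <stdbool.h>
          #include <stdio.h>
          
          struct policy { const char *desc; };
          
          static const struct policy real_policy  = { "parent directory's policy" };
          static const struct policy dummy_policy = { "mount-wide dummy policy" };
          
          /* Mirrors the cases fscrypt_policy_to_inherit() is said to cover. */
          static const struct policy *policy_to_inherit(bool dir_encrypted,
                                                        bool test_dummy_encryption)
          {
                  if (dir_encrypted)
                          return &real_policy;
                  if (test_dummy_encryption)
                          return &dummy_policy;
                  return NULL;    /* new inode stays unencrypted */
          }
          
          int main(void)
          {
                  const struct policy *p = policy_to_inherit(false, true);
          
                  printf("inherit: %s\n", p ? p->desc : "none");
                  return 0;
          }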
    • f2fs: use fscrypt_prepare_new_inode() and fscrypt_set_context() · e075b690
      Authored by Eric Biggers
      Convert f2fs to use the new functions fscrypt_prepare_new_inode() and
      fscrypt_set_context().  This avoids calling
      fscrypt_get_encryption_info() from under f2fs_lock_op(), which can
      deadlock because fscrypt_get_encryption_info() isn't GFP_NOFS-safe.
      
      For more details about this problem, see the earlier patch
      "fscrypt: add fscrypt_prepare_new_inode() and fscrypt_set_context()".
      
      This also fixes an f2fs-specific deadlock when the filesystem is mounted
      with '-o test_dummy_encryption' and a file is created in an unencrypted
      directory other than the root directory:
      
          INFO: task touch:207 blocked for more than 30 seconds.
                Not tainted 5.9.0-rc4-00099-g729e3d09 #2
          "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
          task:touch           state:D stack:    0 pid:  207 ppid:   167 flags:0x00000000
          Call Trace:
           [...]
           lock_page include/linux/pagemap.h:548 [inline]
           pagecache_get_page+0x25e/0x310 mm/filemap.c:1682
           find_or_create_page include/linux/pagemap.h:348 [inline]
           grab_cache_page include/linux/pagemap.h:424 [inline]
           f2fs_grab_cache_page fs/f2fs/f2fs.h:2395 [inline]
           f2fs_grab_cache_page fs/f2fs/f2fs.h:2373 [inline]
           __get_node_page.part.0+0x39/0x2d0 fs/f2fs/node.c:1350
           __get_node_page fs/f2fs/node.c:35 [inline]
           f2fs_get_node_page+0x2e/0x60 fs/f2fs/node.c:1399
           read_inline_xattr+0x88/0x140 fs/f2fs/xattr.c:288
           lookup_all_xattrs+0x1f9/0x2c0 fs/f2fs/xattr.c:344
           f2fs_getxattr+0x9b/0x160 fs/f2fs/xattr.c:532
           f2fs_get_context+0x1e/0x20 fs/f2fs/super.c:2460
           fscrypt_get_encryption_info+0x9b/0x450 fs/crypto/keysetup.c:472
           fscrypt_inherit_context+0x2f/0xb0 fs/crypto/policy.c:640
           f2fs_init_inode_metadata+0xab/0x340 fs/f2fs/dir.c:540
           f2fs_add_inline_entry+0x145/0x390 fs/f2fs/inline.c:621
           f2fs_add_dentry+0x31/0x80 fs/f2fs/dir.c:757
           f2fs_do_add_link+0xcd/0x130 fs/f2fs/dir.c:798
           f2fs_add_link fs/f2fs/f2fs.h:3234 [inline]
           f2fs_create+0x104/0x290 fs/f2fs/namei.c:344
           lookup_open.isra.0+0x2de/0x500 fs/namei.c:3103
           open_last_lookups+0xa9/0x340 fs/namei.c:3177
           path_openat+0x8f/0x1b0 fs/namei.c:3365
           do_filp_open+0x87/0x130 fs/namei.c:3395
           do_sys_openat2+0x96/0x150 fs/open.c:1168
           [...]
      
      That happened because f2fs_add_inline_entry() locks the directory
      inode's page in order to add the dentry, then f2fs_get_context() tries
      to lock it recursively in order to read the encryption xattr.  This
      problem is specific to "test_dummy_encryption" because normally the
      directory's fscrypt_info would be set up prior to
      f2fs_add_inline_entry() in order to encrypt the new filename.
      
      Regardless, the new design fixes this test_dummy_encryption deadlock as
      well as potential deadlocks with fs reclaim, by setting up any needed
      fscrypt_info structs prior to taking so many locks.
      
      The test_dummy_encryption deadlock was reported by Daniel Rosenberg.
      Reported-by: Daniel Rosenberg <drosen@google.com>
      Acked-by: Jaegeuk Kim <jaegeuk@kernel.org>
      Link: https://lore.kernel.org/r/20200917041136.178600-5-ebiggers@kernel.org
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      e075b690
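      A small userspace analogue of the ordering change: do the work that may
      sleep or recurse before taking the lock that the directory update needs,
      then apply the prepared result under the lock. The names are illustrative;
      they only mirror the fscrypt_prepare_new_inode()/fscrypt_set_context()
      split described above.
      
          #include <pthread.h>
          #include <stdio.h>
          #include <stdlib.h>
          
          static pthread_mutex_t dir_page_lock = PTHREAD_MUTEX_INITIALIZER;
          
          struct prepared_ctx { int policy; };
          
          /* The old code did this step while holding dir_page_lock and
           * deadlocked when it needed the same lock to read the xattr. */
          static struct prepared_ctx *prepare_new_inode(void)
          {
                  struct prepared_ctx *ctx = malloc(sizeof(*ctx));
          
                  if (ctx)
                          ctx->policy = 1;
                  return ctx;
          }
          
          static void set_context(const struct prepared_ctx *ctx)
          {
                  printf("writing context (policy=%d) under the lock\n", ctx->policy);
          }
          
          int main(void)
          {
                  struct prepared_ctx *ctx = prepare_new_inode();   /* before the lock */
          
                  if (!ctx)
                          return 1;
                  pthread_mutex_lock(&dir_page_lock);
                  set_context(ctx);                                 /* cheap, no recursion */
                  pthread_mutex_unlock(&dir_page_lock);
                  free(ctx);
                  return 0;
          }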
  7. 12 September 2020, 5 commits
    • f2fs: change return value of f2fs_disable_compressed_file to bool · 78134d03
      Authored by Daeho Jeong
      The returned integer is not used anywhere, so change the return type
      to bool.
      Signed-off-by: Daeho Jeong <daehojeong@google.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      78134d03
    • f2fs: change i_compr_blocks of inode to atomic value · c2759eba
      Authored by Daeho Jeong
      writepages() can be invoked concurrently for the same file by different
      threads, such as a thread fsyncing the file and a kworker kernel thread.
      So changing i_compr_blocks without protection is racy, and we need to
      protect it by making it an atomic value. Also, i_compr_blocks does not
      need 64 bits, so use atomic_t rather than atomic64_t.
      Signed-off-by: Daeho Jeong <daehojeong@google.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      c2759eba
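      A userspace C11 analogue of the race being fixed: two writers (think fsync
      and a kworker) updating the same per-inode counter. With a plain int the
      increments can be lost; an atomic counter keeps the total exact, and 32
      bits are enough here. The variable name is borrowed from the commit for
      illustration only.
      
          #include <pthread.h>
          #include <stdatomic.h>
          #include <stdio.h>
          
          static atomic_int i_compr_blocks;
          
          static void *writer(void *arg)
          {
                  (void)arg;
                  for (int i = 0; i < 100000; i++)
                          atomic_fetch_add(&i_compr_blocks, 1);   /* never loses updates */
                  return NULL;
          }
          
          int main(void)
          {
                  pthread_t a, b;
          
                  pthread_create(&a, NULL, writer, NULL);
                  pthread_create(&b, NULL, writer, NULL);
                  pthread_join(a, NULL);
                  pthread_join(b, NULL);
                  printf("i_compr_blocks = %d\n", atomic_load(&i_compr_blocks));
                  return 0;
          }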
    • f2fs: allocate proper size memory for zstd decompress · 0e2b7385
      Authored by Chao Yu
      As 5kft <5kft@5kft.org> reported:
      
       kworker/u9:3: page allocation failure: order:9, mode:0x40c40(GFP_NOFS|__GFP_COMP), nodemask=(null),cpuset=/,mems_allowed=0
       CPU: 3 PID: 8168 Comm: kworker/u9:3 Tainted: G         C        5.8.3-sunxi #trunk
       Hardware name: Allwinner sun8i Family
       Workqueue: f2fs_post_read_wq f2fs_post_read_work
       [<c010d6d5>] (unwind_backtrace) from [<c0109a55>] (show_stack+0x11/0x14)
       [<c0109a55>] (show_stack) from [<c056d489>] (dump_stack+0x75/0x84)
       [<c056d489>] (dump_stack) from [<c0243b53>] (warn_alloc+0xa3/0x104)
       [<c0243b53>] (warn_alloc) from [<c024473b>] (__alloc_pages_nodemask+0xb87/0xc40)
       [<c024473b>] (__alloc_pages_nodemask) from [<c02267c5>] (kmalloc_order+0x19/0x38)
       [<c02267c5>] (kmalloc_order) from [<c02267fd>] (kmalloc_order_trace+0x19/0x90)
       [<c02267fd>] (kmalloc_order_trace) from [<c047c665>] (zstd_init_decompress_ctx+0x21/0x88)
       [<c047c665>] (zstd_init_decompress_ctx) from [<c047e9cf>] (f2fs_decompress_pages+0x97/0x228)
       [<c047e9cf>] (f2fs_decompress_pages) from [<c045d0ab>] (__read_end_io+0xfb/0x130)
       [<c045d0ab>] (__read_end_io) from [<c045d141>] (f2fs_post_read_work+0x61/0x84)
       [<c045d141>] (f2fs_post_read_work) from [<c0130b2f>] (process_one_work+0x15f/0x3b0)
       [<c0130b2f>] (process_one_work) from [<c0130e7b>] (worker_thread+0xfb/0x3e0)
       [<c0130e7b>] (worker_thread) from [<c0135c3b>] (kthread+0xeb/0x10c)
       [<c0135c3b>] (kthread) from [<c0100159>]
      
      zstd may allocate a large amount of memory for {,de}compression, which can
      make file copies fail on low-end devices that have very little memory.
      
      For decompression, allocate a properly sized buffer based on the current
      file's cluster size instead of the maximum cluster size.
      Reported-by: 5kft <5kft@5kft.org>
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      0e2b7385
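      A tiny userspace sketch of the sizing idea: derive the buffer size from the
      file's actual cluster size rather than a filesystem-wide maximum, so small
      clusters never trigger the worst-case high-order allocation. Names and
      constants are illustrative, not the kernel zstd API.
      
          #include <stdio.h>
          #include <stdlib.h>
          
          #define PAGE_SIZE 4096u
          
          /* Size the decompression buffer from log2(pages per cluster). */
          static void *alloc_dec_buffer(unsigned int log_cluster_size)
          {
                  size_t bytes = (size_t)PAGE_SIZE << log_cluster_size;
          
                  return malloc(bytes);
          }
          
          int main(void)
          {
                  /* A 4-page (16 KiB) cluster needs a 16 KiB buffer, not the maximum. */
                  void *buf = alloc_dec_buffer(2);
          
                  printf("allocation %s\n", buf ? "succeeded" : "failed");
                  free(buf);
                  return 0;
          }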
    • f2fs: change compr_blocks of superblock info to 64bit · ae999bb9
      Authored by Daeho Jeong
      The compr_blocks field of the superblock info is currently not a 64-bit
      value, yet we accumulate each inode's 64-bit i_compr_blocks count into it.
      So change it to a 64-bit value.
      Signed-off-by: Daeho Jeong <daehojeong@google.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      ae999bb9
    • f2fs: support age threshold based garbage collection · 093749e2
      Authored by Chao Yu
      There are several issues in the current background GC algorithm:
      - The valid block count is one of the key factors in the cost-benefit
      calculation, so even if a segment is young or holds hot data, the CB
      algorithm will still choose it as a victim when it has few valid blocks,
      which is not appropriate.
      - GCed data/node blocks go to the existing logs regardless of whether
      their update frequency matches, so hot and cold data may get mixed
      again.
      - The GC allocator mainly uses LFS-type segments, which consumes free
      segments more quickly.
      
      This patch introduces a new algorithm, age-threshold-based garbage
      collection, to solve the above issues. It has three main steps
      (a simplified sketch of step 1 follows the list):
      
      1. select a source victim:
      - set an age threshold and select candidates based on that threshold,
      e.g.
       0 means youngest and 100 means oldest; if we set the age threshold to 80,
       then select dirty segments whose age is in the range [80, 100] as
       candidates;
      - set a candidate_ratio threshold and select candidates based on that
      ratio, so that we can shrink the candidates down to the oldest segments;
      - select the segment with the fewest valid blocks in order to
      migrate blocks with minimum cost;
      
      2. select a target victim:
      - select candidates based on the age threshold;
      - set a candidate_radius threshold and search for candidates whose age is
      close to the source victim's; the search radius should be less than the
      radius threshold;
      - select the segment with the most valid blocks in order to avoid
      migrating the current target segment.
      
      3. merge valid blocks from the source victim into the target victim with
      the SSR allocator.
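      A simplified userspace C sketch of step 1 (source victim selection). It
      only filters by the age threshold, caps the candidate count via
      candidate_ratio, and picks the candidate with the fewest valid blocks; the
      in-kernel code additionally orders candidates by age, so treat this as an
      illustration only.
      
          #include <stdio.h>
          
          struct seg { int segno; int age; int valid_blocks; };   /* age: 0..100 */
          
          static int pick_source_victim(const struct seg *dirty, int n,
                                        int age_threshold, int candidate_ratio)
          {
                  int max_candidates = n * candidate_ratio / 100;
                  int candidates = 0, victim = -1, victim_valid = 1 << 30;
          
                  for (int i = 0; i < n && candidates < max_candidates; i++) {
                          if (dirty[i].age < age_threshold)
                                  continue;               /* too young, skip */
                          candidates++;
                          if (dirty[i].valid_blocks < victim_valid) {
                                  victim = dirty[i].segno;        /* cheapest to migrate so far */
                                  victim_valid = dirty[i].valid_blocks;
                          }
                  }
                  return victim;
          }
          
          int main(void)
          {
                  struct seg dirty[] = {
                          { 10, 90, 384 }, { 11, 85, 128 }, { 12, 40, 16 }, { 13, 95, 200 },
                  };
          
                  /* Picks segment 11: old enough (85 >= 80) and only 128 valid blocks. */
                  printf("victim segno = %d\n", pick_source_victim(dirty, 4, 80, 100));
                  return 0;
          }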
      
      Test steps:
      - create 160 dirty segments:
       * half of them have 128 valid blocks per segment
       * the rest have 384 valid blocks per segment
      - run background GC
      
      Benefit: the GC count and block movement count both decrease noticeably:
      
      - Before:
        - Valid: 86
        - Dirty: 1
        - Prefree: 11
        - Free: 6001 (6001)
      
      GC calls: 162 (BG: 220)
        - data segments : 160 (160)
        - node segments : 2 (2)
      Try to move 41454 blocks (BG: 41454)
        - data blocks : 40960 (40960)
        - node blocks : 494 (494)
      
      IPU: 0 blocks
      SSR: 0 blocks in 0 segments
      LFS: 41364 blocks in 81 segments
      
      - After:
      
        - Valid: 87
        - Dirty: 0
        - Prefree: 4
        - Free: 6008 (6008)
      
      GC calls: 75 (BG: 76)
        - data segments : 74 (74)
        - node segments : 1 (1)
      Try to move 12813 blocks (BG: 12813)
        - data blocks : 12544 (12544)
        - node blocks : 269 (269)
      
      IPU: 0 blocks
      SSR: 12032 blocks in 77 segments
      LFS: 855 blocks in 2 segments
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      [Jaegeuk Kim: fix a bug along with pinfile in-mem segment & clean up]
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      093749e2
  8. 11 September 2020, 7 commits
    • f2fs: Use generic casefolding support · eca4873e
      Authored by Daniel Rosenberg
      This switches f2fs over to the generic support provided in
      the previous patch.
      
      Since casefolded dentries behave the same in ext4 and f2fs, we decrease
      the maintenance burden by unifying them, and any optimizations will
      immediately apply to both.
      Signed-off-by: Daniel Rosenberg <drosen@google.com>
      Reviewed-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      eca4873e
    • f2fs: compress: use more readable atomic_t type for {cic,dic}.ref · e6c3948d
      Authored by Chao Yu
      A refcount_t variable should never be less than one, so it is a bit
      confusing to use one to indicate the pending compressed page count.
      Change it to atomic_t for better readability.
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      e6c3948d
    • f2fs: support 64-bits key in f2fs rb-tree node entry · 2e9b2bb2
      Authored by Chao Yu
      Then we can add a specified entry into the rb-tree with a 64-bit segment
      time as the key.
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      2e9b2bb2
    • f2fs: inherit mtime of original block during GC · c5d02785
      Authored by Chao Yu
      Don't let f2fs's internal GC ruin the original aging degree of a segment.
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      c5d02785
    • f2fs: introduce inmem curseg · d0b9e42a
      Authored by Chao Yu
      The previous implementation of aligned pinfile allocation would:
      - allocate a new segment on the cold data log regardless of whether the
      last used segment was only partially used, which makes IOs more random;
      - force concurrent cold data/GCed IO into the warm data area, which can
      hurt hot/cold data separation.
      
      This patch introduces a new type of log named 'inmem curseg'. The
      differences from a normal curseg are:
      - it reuses the existing segment types (CURSEG_XXX_NODE/DATA);
      - it exists only in memory; its segno, blkofs, and summary will not be
       persisted into the checkpoint area.
      
      With this new feature, we can enhance the scalability of the logs, and
      special allocators can be created for specific purposes:
      - a pure LFS allocator for aligned pinfile allocation or file
      defragmentation
      - a pure SSR allocator for a later feature
      
      So let's update aligned pinfile allocation to use this new inmem curseg
      framework.
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      d0b9e42a
    • f2fs: remove duplicated type casting · e90027d2
      Authored by Xiaojun Wang
      Since DUMMY_WRITTEN_PAGE and ATOMIC_WRITTEN_PAGE have already been
      converted to unsigned long type, we don't need to cast them again.
      Signed-off-by: Xiaojun Wang <wangxiaojun11@huawei.com>
      Reported-by: Jack Qiu <jack.qiu@huawei.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      e90027d2
    • f2fs: support zone capacity less than zone size · de881df9
      Authored by Aravind Ramesh
      NVMe Zoned Namespace devices can have a zone capacity smaller than the
      zone size. Zone capacity indicates the maximum number of sectors that are
      usable in a zone, beginning from the first sector of the zone. This makes
      the sectors after the zone capacity, up to the zone size, unusable.
      This patch set tracks zone size and zone capacity in zoned devices and
      calculates the usable blocks per segment and usable segments per section.
      
      If the zone capacity is less than the zone size, mark only those segments
      which start before the zone capacity as free segments. All segments at and
      beyond the zone capacity are treated as permanently used segments. In cases
      where the zone capacity does not align with the segment size, the last
      segment will start before the zone capacity and end beyond it. For such
      spanning segments, only sectors within the zone capacity are used.
      
      During writes and GC, manage the usable segments in a section and the
      usable blocks per segment. Segments which are beyond the zone capacity are
      never allocated and do not need to be garbage collected; only the segments
      before the zone capacity need to be garbage collected. For spanning
      segments, based on the number of usable blocks in that segment, write to
      blocks only up to the zone capacity.
      
      Zone capacity is device specific and cannot be configured by the user.
      Since NVMe ZNS device zones are sequential-write-only, a block device with
      conventional zones or any normal block device is needed along with the ZNS
      device for the metadata operations of f2fs.
      
      A typical nvme-cli output of a zoned device shows zone start and capacity
      and write pointer as below:
      
      SLBA: 0x0     WP: 0x0     Cap: 0x18800 State: EMPTY Type: SEQWRITE_REQ
      SLBA: 0x20000 WP: 0x20000 Cap: 0x18800 State: EMPTY Type: SEQWRITE_REQ
      SLBA: 0x40000 WP: 0x40000 Cap: 0x18800 State: EMPTY Type: SEQWRITE_REQ
      
      Here the zone size is 64MB and the capacity is 49MB; the WP is at the zone
      start since the zones are in the EMPTY state. For each zone, only the area
      from the zone start up to zone start + 49MB is usable; any LBA/sector after
      49MB cannot be read or written, and the drive will fail any such attempts.
      So the second zone starts at 64MB and is usable up to 113MB (64 + 49), and
      the range between 113MB and 128MB is again unusable. The next zone starts
      at 128MB, and so on.
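      The spanning-segment arithmetic above can be checked with a short C example
      (units are 4 KiB blocks, so a 2 MiB segment is 512 blocks; the function name
      is made up for illustration).
      
          #include <stdio.h>
          
          #define BLKS_PER_SEG 512u
          
          static unsigned int usable_blks_in_seg(unsigned long long seg_start_blk,
                                                 unsigned long long zone_start_blk,
                                                 unsigned long long zone_cap_blks)
          {
                  unsigned long long cap_end = zone_start_blk + zone_cap_blks;
          
                  if (seg_start_blk >= cap_end)
                          return 0;                            /* wholly beyond capacity */
                  if (seg_start_blk + BLKS_PER_SEG <= cap_end)
                          return BLKS_PER_SEG;                 /* wholly usable */
                  return (unsigned int)(cap_end - seg_start_blk);   /* spanning segment */
          }
          
          int main(void)
          {
                  /* 64 MiB zone at block 0 with 49 MiB (12544-block) capacity:
                   * segment 24 starts at 48 MiB (block 12288) and only its first
                   * 256 blocks are usable; segment 25 and later are unusable. */
                  printf("seg 24: %u usable blocks\n", usable_blks_in_seg(12288, 0, 12544));
                  printf("seg 25: %u usable blocks\n", usable_blks_in_seg(12800, 0, 12544));
                  return 0;
          }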
      Signed-off-by: Aravind Ramesh <aravind.ramesh@wdc.com>
      Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
      Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      de881df9
  9. 24 August 2020, 1 commit
  10. 04 August 2020, 1 commit
  11. 26 July 2020, 1 commit
  12. 24 July 2020, 1 commit
  13. 22 July 2020, 1 commit
    • f2fs: add F2FS_IOC_SEC_TRIM_FILE ioctl · 9af84648
      Authored by Daeho Jeong
      Add a new ioctl to send discard commands and/or zero out a selected data
      area of a regular file, for security reasons.
      
      range.len of F2FS_IOC_SEC_TRIM_FILE is handled as follows (see the sketch
      after this list):
      1. A range.len of -1 means secure-trim all blocks starting from
         range.start, regardless of i_size.
      2. If the end of the range passes the end of the file, the range is
         trimmed up to the end of the file (i_size).
      3. A range.len of zero is ignored, to prevent the function from computing
         an end_addr of zero and behaving differently.
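      A small C helper illustrating the three rules; the -1 sentinel and the
      clamping behaviour follow the description above, while the function name
      is invented for the example.
      
          #include <stdint.h>
          #include <stdio.h>
          
          static uint64_t sec_trim_end(uint64_t start, uint64_t len, uint64_t i_size)
          {
                  if (len == 0)
                          return 0;                 /* rule 3: ignore zero-length requests */
                  if (len == UINT64_MAX)
                          return UINT64_MAX;        /* rule 1: all blocks from start, ignoring i_size */
                  if (start + len > i_size)
                          return i_size;            /* rule 2: clamp to end of file */
                  return start + len;
          }
          
          int main(void)
          {
                  uint64_t i_size = 1048576;        /* 1 MiB file */
          
                  /* A 1 MiB request starting at 512 KiB is clamped to i_size. */
                  printf("end = %llu\n",
                         (unsigned long long)sec_trim_end(524288, 1048576, i_size));
                  return 0;
          }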
      Signed-off-by: Daeho Jeong <daehojeong@google.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      9af84648
  14. 21 July 2020, 1 commit
  15. 09 July 2020, 1 commit
  16. 08 July 2020, 8 commits
    • f2fs: add GC_URGENT_LOW mode in gc_urgent · 0e5e8111
      Authored by Daeho Jeong
      Add a new gc_urgent mode, GC_URGENT_LOW, in which F2FS lowers the bar for
      its idle check in order to process outstanding discard commands and do GC
      a little more aggressively.
      Signed-off-by: Daeho Jeong <daehojeong@google.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      0e5e8111
    • f2fs: avoid readahead race condition · 6b12367d
      Authored by Jaegeuk Kim
      If two readahead threads with the same offset enter readpages, every read
      IO is split and issued to the disk, which lowers bandwidth.
      
      This patch tries to avoid redundant readahead calls.
      
      It also fixes a build error reported by Randy when F2FS_FS_COMPRESSION is
      not set/enabled; the next_page label is needed in either case.
      
      ../fs/f2fs/data.c: In function ‘f2fs_mpage_readpages’:
      ../fs/f2fs/data.c:2327:5: error: label ‘next_page’ used but not defined
           goto next_page;
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      6b12367d
    • f2fs: split f2fs_allocate_new_segments() · 901d745f
      Authored by Chao Yu
      Split it into two independent functions:
      - f2fs_allocate_new_segment() for allocating a segment of a specified type
      - f2fs_allocate_new_segments() for allocating segments for all data types
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      901d745f
    • f2fs: fix an oops in f2fs_is_compressed_page · 29b993c7
      Authored by Yu Changchun
      This patch fixes a crash:
      
       #3 [ffffb6580689f898] oops_end at ffffffffa2835bc2
       #4 [ffffb6580689f8b8] no_context at ffffffffa28766e7
       #5 [ffffb6580689f920] async_page_fault at ffffffffa320135e
          [exception RIP: f2fs_is_compressed_page+34]
          RIP: ffffffffa2ba83a2  RSP: ffffb6580689f9d8  RFLAGS: 00010213
          RAX: 0000000000000001  RBX: fffffc0f50b34bc0  RCX: 0000000000002122
          RDX: 0000000000002123  RSI: 0000000000000c00  RDI: fffffc0f50b34bc0
          RBP: ffff97e815a40178   R8: 0000000000000000   R9: ffff97e83ffc9000
          R10: 0000000000032300  R11: 0000000000032380  R12: ffffb6580689fa38
          R13: fffffc0f50b34bc0  R14: ffff97e825cbd000  R15: 0000000000000c00
          ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
       #6 [ffffb6580689f9d8] __is_cp_guaranteed at ffffffffa2b7ea98
       #7 [ffffb6580689f9f0] f2fs_submit_page_write at ffffffffa2b81a69
       #8 [ffffb6580689fa30] f2fs_do_write_meta_page at ffffffffa2b99777
       #9 [ffffb6580689fae0] __f2fs_write_meta_page at ffffffffa2b75f1a
       #10 [ffffb6580689fb18] f2fs_sync_meta_pages at ffffffffa2b77466
       #11 [ffffb6580689fc98] do_checkpoint at ffffffffa2b78e46
       #12 [ffffb6580689fd88] f2fs_write_checkpoint at ffffffffa2b79c29
       #13 [ffffb6580689fdd0] f2fs_sync_fs at ffffffffa2b69d95
       #14 [ffffb6580689fe20] sync_filesystem at ffffffffa2ad2574
       #15 [ffffb6580689fe30] generic_shutdown_super at ffffffffa2a9b582
       #16 [ffffb6580689fe48] kill_block_super at ffffffffa2a9b6d1
       #17 [ffffb6580689fe60] kill_f2fs_super at ffffffffa2b6abe1
       #18 [ffffb6580689fea0] deactivate_locked_super at ffffffffa2a9afb6
       #19 [ffffb6580689feb8] cleanup_mnt at ffffffffa2abcad4
       #20 [ffffb6580689fee0] task_work_run at ffffffffa28bca28
       #21 [ffffb6580689ff00] exit_to_usermode_loop at ffffffffa28050b7
       #22 [ffffb6580689ff38] do_syscall_64 at ffffffffa280560e
       #23 [ffffb6580689ff50] entry_SYSCALL_64_after_hwframe at ffffffffa320008c
      
      This occurred when unmounting f2fs with F2FS_FS_COMPRESSION enabled
      together with F2FS_IO_TRACE. Fix it by adding IS_IO_TRACED_PAGE to check
      the validity of the pid stored in page_private.
      Signed-off-by: Yu Changchun <yuchangchun1@huawei.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      29b993c7
    • f2fs: fix to wait page writeback before update · a6d601f3
      Authored by Chao Yu
      Filesystems including f2fs should support stable pages for special devices
      such as software RAID; however, there is one missing path where a page can
      be updated while it is in writeback state, as shown below. Fix this.
      
      - gc_node_segment
       - f2fs_move_node_page
        - __write_node_page
         - set_page_writeback
      
      - do_read_inode
       - f2fs_init_extent_tree
        - __f2fs_init_extent_tree
          i_ext->len = 0;
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      a6d601f3
    • f2fs: show more debug info for per-temperature log · 0759e2c1
      Authored by Chao Yu
      - Account and show per-log dirty_seg, full_seg and valid_blocks
      in debugfs.
      - Reformat the printed info.
      
          TYPE            segno    secno   zoneno  dirty_seg   full_seg  valid_blk
        - COLD   data:     1523     1523     1523          1          0        399
        - WARM   data:      769      769      769         20        255     133098
        - HOT    data:      767      767      767          9          0        167
        - Dir   dnode:       22       22       22          3          0         70
        - File  dnode:      722      722      722         14         10       6505
        - Indir nodes:        2        2        2          1          0          3
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      0759e2c1
    • f2fs: clean up parameter of f2fs_allocate_data_block() · f608c38c
      Authored by Chao Yu
      Use the validity of @fio to indicate whether the caller wants to serialize
      IOs in io.io_list; @add_list then becomes redundant, so remove it.
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      f608c38c
    • f2fs: add prefix for exported symbols · 0ef81833
      Authored by Chao Yu
      to avoid polluting the global symbol namespace.
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      0ef81833
  17. 09 June 2020, 2 commits
    • f2fs: add node_io_flag for bio flags likewise data_io_flag · 32b6aba8
      Authored by Jaegeuk Kim
      This patch adds another way to attach bio flags to node writes.
      
      Description:   Give a way to attach REQ_META|FUA to node writes
                     given temperature-based bits. Now the bits indicate:
                     *      REQ_META     |      REQ_FUA      |
                     *    5 |    4 |   3 |    2 |    1 |   0 |
                     * Cold | Warm | Hot | Cold | Warm | Hot |
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      32b6aba8
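      A worked example of the bit layout quoted above (bits 0-2 select REQ_FUA for
      hot/warm/cold node writes, bits 3-5 select REQ_META); the computed value
      would be the number written to the node_io_flag knob named in the title.
      
          #include <stdio.h>
          
          int main(void)
          {
                  unsigned int fua_hot   = 1u << 0;
                  unsigned int fua_warm  = 1u << 1;
                  unsigned int fua_cold  = 1u << 2;
                  unsigned int meta_cold = 1u << 5;
          
                  /* REQ_FUA on all node temperatures plus REQ_META on cold nodes. */
                  unsigned int node_io_flag = fua_hot | fua_warm | fua_cold | meta_cold;
          
                  printf("node_io_flag = %u\n", node_io_flag);   /* prints 39 */
                  return 0;
          }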
    • f2fs: don't return vmalloc() memory from f2fs_kmalloc() · 0b6d4ca0
      Authored by Eric Biggers
      kmalloc() returns kmalloc'ed memory, and kvmalloc() returns either
      kmalloc'ed or vmalloc'ed memory.  But the f2fs wrappers, f2fs_kmalloc()
      and f2fs_kvmalloc(), both return both kinds of memory.
      
      It's redundant to have two functions that do the same thing, and also
      breaking the standard naming convention is causing bugs since people
      assume it's safe to kfree() memory allocated by f2fs_kmalloc().  See
      e.g. the various allocations in fs/f2fs/compress.c.
      
      Fix this by making f2fs_kmalloc() just use kmalloc().  And to avoid
      re-introducing the allocation failures that the vmalloc fallback was
      intended to fix, convert the largest allocations to use f2fs_kvmalloc().
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      0b6d4ca0
  18. 03 June 2020, 2 commits