1. 20 3月, 2020 2 次提交
  2. 11 3月, 2020 2 次提交
    • C
      f2fs: fix inconsistent comments · 7a88ddb5
      Chao Yu 提交于
      Lack of maintenance on comments may mislead developers, fix them.
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      7a88ddb5
    • C
      f2fs: cover last_disk_size update with spinlock · c10c9820
      Chao Yu 提交于
      This change solves below hangtask issue:
      
      INFO: task kworker/u16:1:58 blocked for more than 122 seconds.
            Not tainted 5.6.0-rc2-00590-g9983bdae4974e #11
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      kworker/u16:1   D    0    58      2 0x00000000
      Workqueue: writeback wb_workfn (flush-179:0)
      Backtrace:
       (__schedule) from [<c0913234>] (schedule+0x78/0xf4)
       (schedule) from [<c017ec74>] (rwsem_down_write_slowpath+0x24c/0x4c0)
       (rwsem_down_write_slowpath) from [<c0915f2c>] (down_write+0x6c/0x70)
       (down_write) from [<c0435b80>] (f2fs_write_single_data_page+0x608/0x7ac)
       (f2fs_write_single_data_page) from [<c0435fd8>] (f2fs_write_cache_pages+0x2b4/0x7c4)
       (f2fs_write_cache_pages) from [<c043682c>] (f2fs_write_data_pages+0x344/0x35c)
       (f2fs_write_data_pages) from [<c0267ee8>] (do_writepages+0x3c/0xd4)
       (do_writepages) from [<c0310cbc>] (__writeback_single_inode+0x44/0x454)
       (__writeback_single_inode) from [<c03112d0>] (writeback_sb_inodes+0x204/0x4b0)
       (writeback_sb_inodes) from [<c03115cc>] (__writeback_inodes_wb+0x50/0xe4)
       (__writeback_inodes_wb) from [<c03118f4>] (wb_writeback+0x294/0x338)
       (wb_writeback) from [<c0312dac>] (wb_workfn+0x35c/0x54c)
       (wb_workfn) from [<c014f2b8>] (process_one_work+0x214/0x544)
       (process_one_work) from [<c014f634>] (worker_thread+0x4c/0x574)
       (worker_thread) from [<c01564fc>] (kthread+0x144/0x170)
       (kthread) from [<c01010e8>] (ret_from_fork+0x14/0x2c)
      Reported-and-tested-by: NOndřej Jirman <megi@xff.cz>
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      c10c9820
  3. 28 2月, 2020 1 次提交
    • S
      f2fs: fix the panic in do_checkpoint() · bf22c3cc
      Sahitya Tummala 提交于
      There could be a scenario where f2fs_sync_meta_pages() will not
      ensure that all F2FS_DIRTY_META pages are submitted for IO. Thus,
      resulting in the below panic in do_checkpoint() -
      
      f2fs_bug_on(sbi, get_pages(sbi, F2FS_DIRTY_META) &&
      				!f2fs_cp_error(sbi));
      
      This can happen in a low-memory condition, where shrinker could
      also be doing the writepage operation (stack shown below)
      at the same time when checkpoint is running on another core.
      
      schedule
      down_write
      f2fs_submit_page_write -> by this time, this page in page cache is tagged
      			as PAGECACHE_TAG_WRITEBACK and PAGECACHE_TAG_DIRTY
      			is cleared, due to which f2fs_sync_meta_pages()
      			cannot sync this page in do_checkpoint() path.
      f2fs_do_write_meta_page
      __f2fs_write_meta_page
      f2fs_write_meta_page
      shrink_page_list
      shrink_inactive_list
      shrink_node_memcg
      shrink_node
      kswapd
      Signed-off-by: NSahitya Tummala <stummala@codeaurora.org>
      Reviewed-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      bf22c3cc
  4. 18 1月, 2020 4 次提交
    • C
      f2fs: change to use rwsem for gc_mutex · fb24fea7
      Chao Yu 提交于
      Mutex lock won't serialize callers, in order to avoid starving of unlucky
      caller, let's use rwsem lock instead.
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      fb24fea7
    • C
      f2fs: code cleanup for f2fs_statfs_project() · bf2cbd3c
      Chengguang Xu 提交于
      Calling min_not_zero() to simplify complicated prjquota
      limit comparison in f2fs_statfs_project().
      Signed-off-by: NChengguang Xu <cgxu519@mykernel.net>
      Reviewed-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      bf2cbd3c
    • C
      f2fs: fix miscounted block limit in f2fs_statfs_project() · acdf2172
      Chengguang Xu 提交于
      statfs calculates Total/Used/Avail disk space in block unit,
      so we should translate soft/hard prjquota limit to block unit
      as well.
      
      Below testing result shows the block/inode numbers of
      Total/Used/Avail from df command are all correct afer
      applying this patch.
      
      [root@localhost quota-tools]\# ./repquota -P /dev/sdb1
      *** Report for project quotas on device /dev/sdb1
      Block grace time: 7days; Inode grace time: 7days
                    Block limits                File limits
      Project   used soft    hard  grace  used  soft  hard  grace
      -----------------------------------------------------------
      \#0   --   4       0       0         1     0     0
      \#101 --   0       0       0         2     0     0
      \#102 --   0   10240       0         2    10     0
      \#103 --   0       0   20480         2     0    20
      \#104 --   0   10240   20480         2    10    20
      \#105 --   0   20480   10240         2    20    10
      
      [root@localhost sdb1]\# lsattr -p t{1,2,3,4,5}
        101 ----------------N-- t1/a1
        102 ----------------N-- t2/a2
        103 ----------------N-- t3/a3
        104 ----------------N-- t4/a4
        105 ----------------N-- t5/a5
      
      [root@localhost sdb1]\# df -hi t{1,2,3,4,5}
      Filesystem     Inodes IUsed IFree IUse% Mounted on
      /dev/sdb1        2.4M    21  2.4M    1% /mnt/sdb1
      /dev/sdb1          10     2     8   20% /mnt/sdb1
      /dev/sdb1          20     2    18   10% /mnt/sdb1
      /dev/sdb1          10     2     8   20% /mnt/sdb1
      /dev/sdb1          10     2     8   20% /mnt/sdb1
      
      [root@localhost sdb1]\# df -h t{1,2,3,4,5}
      Filesystem      Size  Used Avail Use% Mounted on
      /dev/sdb1        10G  489M  9.6G   5% /mnt/sdb1
      /dev/sdb1        10M     0   10M   0% /mnt/sdb1
      /dev/sdb1        20M     0   20M   0% /mnt/sdb1
      /dev/sdb1        10M     0   10M   0% /mnt/sdb1
      /dev/sdb1        10M     0   10M   0% /mnt/sdb1
      
      Fixes: 909110c0 ("f2fs: choose hardlimit when softlimit is larger than hardlimit in f2fs_statfs_project()")
      Signed-off-by: NChengguang Xu <cgxu519@mykernel.net>
      Reviewed-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      acdf2172
    • C
      f2fs: support data compression · 4c8ff709
      Chao Yu 提交于
      This patch tries to support compression in f2fs.
      
      - New term named cluster is defined as basic unit of compression, file can
      be divided into multiple clusters logically. One cluster includes 4 << n
      (n >= 0) logical pages, compression size is also cluster size, each of
      cluster can be compressed or not.
      
      - In cluster metadata layout, one special flag is used to indicate cluster
      is compressed one or normal one, for compressed cluster, following metadata
      maps cluster to [1, 4 << n - 1] physical blocks, in where f2fs stores
      data including compress header and compressed data.
      
      - In order to eliminate write amplification during overwrite, F2FS only
      support compression on write-once file, data can be compressed only when
      all logical blocks in file are valid and cluster compress ratio is lower
      than specified threshold.
      
      - To enable compression on regular inode, there are three ways:
      * chattr +c file
      * chattr +c dir; touch dir/file
      * mount w/ -o compress_extension=ext; touch file.ext
      
      Compress metadata layout:
                                   [Dnode Structure]
                   +-----------------------------------------------+
                   | cluster 1 | cluster 2 | ......... | cluster N |
                   +-----------------------------------------------+
                   .           .                       .           .
             .                       .                .                      .
        .         Compressed Cluster       .        .        Normal Cluster            .
      +----------+---------+---------+---------+  +---------+---------+---------+---------+
      |compr flag| block 1 | block 2 | block 3 |  | block 1 | block 2 | block 3 | block 4 |
      +----------+---------+---------+---------+  +---------+---------+---------+---------+
                 .                             .
               .                                           .
             .                                                           .
            +-------------+-------------+----------+----------------------------+
            | data length | data chksum | reserved |      compressed data       |
            +-------------+-------------+----------+----------------------------+
      
      Changelog:
      
      20190326:
      - fix error handling of read_end_io().
      - remove unneeded comments in f2fs_encrypt_one_page().
      
      20190327:
      - fix wrong use of f2fs_cluster_is_full() in f2fs_mpage_readpages().
      - don't jump into loop directly to avoid uninitialized variables.
      - add TODO tag in error path of f2fs_write_cache_pages().
      
      20190328:
      - fix wrong merge condition in f2fs_read_multi_pages().
      - check compressed file in f2fs_post_read_required().
      
      20190401
      - allow overwrite on non-compressed cluster.
      - check cluster meta before writing compressed data.
      
      20190402
      - don't preallocate blocks for compressed file.
      
      - add lz4 compress algorithm
      - process multiple post read works in one workqueue
        Now f2fs supports processing post read work in multiple workqueue,
        it shows low performance due to schedule overhead of multiple
        workqueue executing orderly.
      
      20190921
      - compress: support buffered overwrite
      C: compress cluster flag
      V: valid block address
      N: NEW_ADDR
      
      One cluster contain 4 blocks
      
       before overwrite   after overwrite
      
      - VVVV		->	CVNN
      - CVNN		->	VVVV
      
      - CVNN		->	CVNN
      - CVNN		->	CVVV
      
      - CVVV		->	CVNN
      - CVVV		->	CVVV
      
      20191029
      - add kconfig F2FS_FS_COMPRESSION to isolate compression related
      codes, add kconfig F2FS_FS_{LZO,LZ4} to cover backend algorithm.
      note that: will remove lzo backend if Jaegeuk agreed that too.
      - update codes according to Eric's comments.
      
      20191101
      - apply fixes from Jaegeuk
      
      20191113
      - apply fixes from Jaegeuk
      - split workqueue for fsverity
      
      20191216
      - apply fixes from Jaegeuk
      
      20200117
      - fix to avoid NULL pointer dereference
      
      [Jaegeuk Kim]
      - add tracepoint for f2fs_{,de}compress_pages()
      - fix many bugs and add some compression stats
      - fix overwrite/mmap bugs
      - address 32bit build error, reported by Geert.
      - bug fixes when handling errors and i_compressed_blocks
      
      Reported-by: <noreply@ellerman.id.au>
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      4c8ff709
  5. 16 1月, 2020 3 次提交
    • J
      f2fs: declare nested quota_sem and remove unnecessary sems · 2c4e0c52
      Jaegeuk Kim 提交于
      1.
      f2fs_quota_sync
       -> down_read(&sbi->quota_sem)
       -> dquot_writeback_dquots
        -> f2fs_dquot_commit
         -> down_read(&sbi->quota_sem)
      
      2.
      f2fs_quota_sync
       -> down_read(&sbi->quota_sem)
        -> f2fs_write_data_pages
         -> f2fs_write_single_data_page
          -> down_write(&F2FS_I(inode)->i_sem)
      
      f2fs_mkdir
       -> f2fs_do_add_link
         -> down_write(&F2FS_I(inode)->i_sem)
         -> f2fs_init_inode_metadata
          -> f2fs_new_node_page
           -> dquot_alloc_inode
            -> f2fs_dquot_mark_dquot_dirty
             -> down_read(&sbi->quota_sem)
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      2c4e0c52
    • C
      f2fs: introduce private bioset · f543805f
      Chao Yu 提交于
      In low memory scenario, we can allocate multiple bios without
      submitting any of them.
      
      - f2fs_write_checkpoint()
       - block_operations()
        - f2fs_sync_node_pages()
         step 1) flush cold nodes, allocate new bio from mempool
         - bio_alloc()
          - mempool_alloc()
         step 2) flush hot nodes, allocate a bio from mempool
         - bio_alloc()
          - mempool_alloc()
         step 3) flush warm nodes, be stuck in below call path
         - bio_alloc()
          - mempool_alloc()
           - loop to wait mempool element release, as we only
             reserved memory for two bio allocation, however above
             allocated two bios may never be submitted.
      
      So we need avoid using default bioset, in this patch we introduce a
      private bioset, in where we enlarg mempool element count to total
      number of log header, so that we can make sure we have enough
      backuped memory pool in scenario of allocating/holding multiple
      bios.
      Signed-off-by: NGao Xiang <gaoxiang25@huawei.com>
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      f543805f
    • S
      f2fs: Check write pointer consistency of non-open zones · d508c94e
      Shin'ichiro Kawasaki 提交于
      To catch f2fs bugs in write pointer handling code for zoned block
      devices, check write pointers of non-open zones that current segments do
      not point to. Do this check at mount time, after the fsync data recovery
      and current segments' write pointer consistency fix. Or when fsync data
      recovery is disabled by mount option, do the check when there is no fsync
      data.
      
      Check two items comparing write pointers with valid block maps in SIT.
      The first item is check for zones with no valid blocks. When there is no
      valid blocks in a zone, the write pointer should be at the start of the
      zone. If not, next write operation to the zone will cause unaligned write
      error. If write pointer is not at the zone start, reset the write pointer
      to place at the zone start.
      
      The second item is check between the write pointer position and the last
      valid block in the zone. It is unexpected that the last valid block
      position is beyond the write pointer. In such a case, report as a bug.
      Fix is not required for such zone, because the zone is not selected for
      next write operation until the zone get discarded.
      Signed-off-by: NShin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
      Reviewed-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      d508c94e
  6. 26 11月, 2019 1 次提交
    • C
      f2fs: choose hardlimit when softlimit is larger than hardlimit in f2fs_statfs_project() · 909110c0
      Chengguang Xu 提交于
      Setting softlimit larger than hardlimit seems meaningless
      for disk quota but currently it is allowed. In this case,
      there may be a bit of comfusion for users when they run
      df comamnd to directory which has project quota.
      
      For example, we set 20M softlimit and 10M hardlimit of
      block usage limit for project quota of test_dir(project id 123).
      
      [root@hades f2fs]# repquota -P -a
      *** Report for project quotas on device /dev/nvme0n1p8
      Block grace time: 7days; Inode grace time: 7days
      Block limits File limits
      Project used soft hard grace used soft hard grace
      ----------------------------------------------------------------------
      0 -- 4 0 0 1 0 0
      123 +- 10248 20480 10240 2 0 0
      
      The result of df command as below:
      
      [root@hades f2fs]# df -h /mnt/f2fs/test
      Filesystem Size Used Avail Use% Mounted on
      /dev/nvme0n1p8 20M 11M 10M 51% /mnt/f2fs
      
      Even though it looks like there is another 10M free space to use,
      if we write new data to diretory test(inherit project id),
      the write will fail with errno(-EDQUOT).
      
      After this patch, the df result looks like below.
      
      [root@hades f2fs]# df -h /mnt/f2fs/test
      Filesystem Size Used Avail Use% Mounted on
      /dev/nvme0n1p8 10M 10M 0 100% /mnt/f2fs
      Signed-off-by: NChengguang Xu <cgxu519@mykernel.net>
      Reviewed-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      909110c0
  7. 13 11月, 2019 1 次提交
  8. 08 11月, 2019 1 次提交
  9. 07 11月, 2019 1 次提交
  10. 04 11月, 2019 1 次提交
  11. 26 10月, 2019 1 次提交
    • C
      f2fs: cache global IPU bio · 0b20fcec
      Chao Yu 提交于
      In commit 8648de2c ("f2fs: add bio cache for IPU"), we added
      f2fs_submit_ipu_bio() in __write_data_page() as below:
      
      __write_data_page()
      
      	if (!S_ISDIR(inode->i_mode) && !IS_NOQUOTA(inode)) {
      		f2fs_submit_ipu_bio(sbi, bio, page);
      		....
      	}
      
      in order to avoid below deadlock:
      
      Thread A				Thread B
      - __write_data_page (inode x, page y)
       - f2fs_do_write_data_page
        - set_page_writeback        ---- set writeback flag in page y
        - f2fs_inplace_write_data
       - f2fs_balance_fs
      					 - lock gc_mutex
       - lock gc_mutex
      					  - f2fs_gc
      					   - do_garbage_collect
      					    - gc_data_segment
      					     - move_data_page
      					      - f2fs_wait_on_page_writeback
      					       - wait_on_page_writeback  --- wait writeback of page y
      
      However, the bio submission breaks the merge of IPU IOs.
      
      So in this patch let's add a global bio cache for merged IPU pages,
      then f2fs_wait_on_page_writeback() is able to submit bio if a
      writebacked page is cached in global bio cache.
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      0b20fcec
  12. 23 10月, 2019 2 次提交
  13. 16 9月, 2019 1 次提交
  14. 23 8月, 2019 7 次提交
    • C
      f2fs: fix wrong available node count calculation · 27cae0bc
      Chao Yu 提交于
      In mkfs, we have counted quota file's node number in cp.valid_node_count,
      so we have to avoid wrong substraction of quota node number in
      .available_nid/.avail_node_count calculation.
      
      f2fs_write_check_point_pack()
      {
      ..
      	set_cp(valid_node_count, 1 + c.quota_inum + c.lpf_inum);
      
      Fixes: 292c196a ("f2fs: reserve nid resource for quota sysfile")
      Fixes: 7b63f72f ("f2fs: fix to do sanity check on valid node/block count")
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      27cae0bc
    • D
      f2fs: Support case-insensitive file name lookups · 2c2eb7a3
      Daniel Rosenberg 提交于
      Modeled after commit b886ee3e ("ext4: Support case-insensitive file
      name lookups")
      
      """
      This patch implements the actual support for case-insensitive file name
      lookups in f2fs, based on the feature bit and the encoding stored in the
      superblock.
      
      A filesystem that has the casefold feature set is able to configure
      directories with the +F (F2FS_CASEFOLD_FL) attribute, enabling lookups
      to succeed in that directory in a case-insensitive fashion, i.e: match
      a directory entry even if the name used by userspace is not a byte per
      byte match with the disk name, but is an equivalent case-insensitive
      version of the Unicode string.  This operation is called a
      case-insensitive file name lookup.
      
      The feature is configured as an inode attribute applied to directories
      and inherited by its children.  This attribute can only be enabled on
      empty directories for filesystems that support the encoding feature,
      thus preventing collision of file names that only differ by case.
      
      * dcache handling:
      
      For a +F directory, F2Fs only stores the first equivalent name dentry
      used in the dcache. This is done to prevent unintentional duplication of
      dentries in the dcache, while also allowing the VFS code to quickly find
      the right entry in the cache despite which equivalent string was used in
      a previous lookup, without having to resort to ->lookup().
      
      d_hash() of casefolded directories is implemented as the hash of the
      casefolded string, such that we always have a well-known bucket for all
      the equivalencies of the same string. d_compare() uses the
      utf8_strncasecmp() infrastructure, which handles the comparison of
      equivalent, same case, names as well.
      
      For now, negative lookups are not inserted in the dcache, since they
      would need to be invalidated anyway, because we can't trust missing file
      dentries.  This is bad for performance but requires some leveraging of
      the vfs layer to fix.  We can live without that for now, and so does
      everyone else.
      
      * on-disk data:
      
      Despite using a specific version of the name as the internal
      representation within the dcache, the name stored and fetched from the
      disk is a byte-per-byte match with what the user requested, making this
      implementation 'name-preserving'. i.e. no actual information is lost
      when writing to storage.
      
      DX is supported by modifying the hashes used in +F directories to make
      them case/encoding-aware.  The new disk hashes are calculated as the
      hash of the full casefolded string, instead of the string directly.
      This allows us to efficiently search for file names in the htree without
      requiring the user to provide an exact name.
      
      * Dealing with invalid sequences:
      
      By default, when a invalid UTF-8 sequence is identified, ext4 will treat
      it as an opaque byte sequence, ignoring the encoding and reverting to
      the old behavior for that unique file.  This means that case-insensitive
      file name lookup will not work only for that file.  An optional bit can
      be set in the superblock telling the filesystem code and userspace tools
      to enforce the encoding.  When that optional bit is set, any attempt to
      create a file name using an invalid UTF-8 sequence will fail and return
      an error to userspace.
      
      * Normalization algorithm:
      
      The UTF-8 algorithms used to compare strings in f2fs is implemented
      in fs/unicode, and is based on a previous version developed by
      SGI.  It implements the Canonical decomposition (NFD) algorithm
      described by the Unicode specification 12.1, or higher, combined with
      the elimination of ignorable code points (NFDi) and full
      case-folding (CF) as documented in fs/unicode/utf8_norm.c.
      
      NFD seems to be the best normalization method for F2FS because:
      
        - It has a lower cost than NFC/NFKC (which requires
          decomposing to NFD as an intermediary step)
        - It doesn't eliminate important semantic meaning like
          compatibility decompositions.
      
      Although:
      
      - This implementation is not completely linguistic accurate, because
      different languages have conflicting rules, which would require the
      specialization of the filesystem to a given locale, which brings all
      sorts of problems for removable media and for users who use more than
      one language.
      """
      Signed-off-by: NDaniel Rosenberg <drosen@google.com>
      Reviewed-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      2c2eb7a3
    • D
      f2fs: include charset encoding information in the superblock · 5aba5430
      Daniel Rosenberg 提交于
      Add charset encoding to f2fs to support casefolding. It is modeled after
      the same feature introduced in commit c83ad55e ("ext4: include charset
      encoding information in the superblock")
      
      Currently this is not compatible with encryption, similar to the current
      ext4 imlpementation. This will change in the future.
      
      >From the ext4 patch:
      """
      The s_encoding field stores a magic number indicating the encoding
      format and version used globally by file and directory names in the
      filesystem.  The s_encoding_flags defines policies for using the charset
      encoding, like how to handle invalid sequences.  The magic number is
      mapped to the exact charset table, but the mapping is specific to ext4.
      Since we don't have any commitment to support old encodings, the only
      encoding I am supporting right now is utf8-12.1.0.
      
      The current implementation prevents the user from enabling encoding and
      per-directory encryption on the same filesystem at the same time.  The
      incompatibility between these features lies in how we do efficient
      directory searches when we cannot be sure the encryption of the user
      provided fname will match the actual hash stored in the disk without
      decrypting every directory entry, because of normalization cases.  My
      quickest solution is to simply block the concurrent use of these
      features for now, and enable it later, once we have a better solution.
      """
      Signed-off-by: NDaniel Rosenberg <drosen@google.com>
      Reviewed-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      5aba5430
    • C
      f2fs: fix to handle quota_{on,off} correctly · fe973b06
      Chao Yu 提交于
      With quota_ino feature on, generic/232 reports an inconsistence issue
      on the image.
      
      The root cause is that the testcase tries to:
      - use quotactl to shutdown journalled quota based on sysfile;
      - and then use quotactl to enable/turn on quota based on specific file
      (aquota.user or aquota.group).
      
      Eventually, quota sysfile will be out-of-update due to following specific
      file creation.
      
      Change as below to fix this issue:
      - deny enabling quota based on specific file if quota sysfile exists.
      - set SBI_QUOTA_NEED_REPAIR once sysfile based quota shutdowns via
      ioctl.
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      fe973b06
    • C
      f2fs: fix to drop meta/node pages during umount · a8933b6b
      Chao Yu 提交于
      As reported in bugzilla:
      
      https://bugzilla.kernel.org/show_bug.cgi?id=204193
      
      A null pointer dereference bug is triggered in f2fs under kernel-5.1.3.
      
       kasan_report.cold+0x5/0x32
       f2fs_write_end_io+0x215/0x650
       bio_endio+0x26e/0x320
       blk_update_request+0x209/0x5d0
       blk_mq_end_request+0x2e/0x230
       lo_complete_rq+0x12c/0x190
       blk_done_softirq+0x14a/0x1a0
       __do_softirq+0x119/0x3e5
       irq_exit+0x94/0xe0
       call_function_single_interrupt+0xf/0x20
      
      During umount, we will access NULL sbi->node_inode pointer in
      f2fs_write_end_io():
      
      	f2fs_bug_on(sbi, page->mapping == NODE_MAPPING(sbi) &&
      				page->index != nid_of_node(page));
      
      The reason is if disable_checkpoint mount option is on, meta dirty
      pages can remain during umount, and then be flushed by iput() of
      meta_inode, however node_inode has been iput()ed before
      meta_inode's iput().
      
      Since checkpoint is disabled, all meta/node datas are useless and
      should be dropped in next mount, so in umount, let's adjust
      drop_inode() to give a hint to iput_final() to drop all those dirty
      datas correctly.
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      a8933b6b
    • C
      f2fs: disallow switching io_bits option during remount · 1f78adfa
      Chao Yu 提交于
      If IO alignment feature is turned on after remount, we didn't
      initialize mempool of it, it turns out we will encounter panic
      during IO submission due to access NULL mempool pointer.
      
      This feature should be set only at mount time, so simply deny
      configuring during remount.
      
      This fixes bug reported in bugzilla:
      
      https://bugzilla.kernel.org/show_bug.cgi?id=204135Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      1f78adfa
    • C
      f2fs: fix panic of IO alignment feature · c72db71e
      Chao Yu 提交于
      Since 07173c3e ("block: enable multipage bvecs"), one bio vector
      can store multi pages, so that we can not calculate max IO size of
      bio as PAGE_SIZE * bio->bi_max_vecs. However IO alignment feature of
      f2fs always has that assumption, so finally, it may cause panic during
      IO submission as below stack.
      
       kernel BUG at fs/f2fs/data.c:317!
       RIP: 0010:__submit_merged_bio+0x8b0/0x8c0
       Call Trace:
        f2fs_submit_page_write+0x3cd/0xdd0
        do_write_page+0x15d/0x360
        f2fs_outplace_write_data+0xd7/0x210
        f2fs_do_write_data_page+0x43b/0xf30
        __write_data_page+0xcf6/0x1140
        f2fs_write_cache_pages+0x3ba/0xb40
        f2fs_write_data_pages+0x3dd/0x8b0
        do_writepages+0xbb/0x1e0
        __writeback_single_inode+0xb6/0x800
        writeback_sb_inodes+0x441/0x910
        wb_writeback+0x261/0x650
        wb_workfn+0x1f9/0x7a0
        process_one_work+0x503/0x970
        worker_thread+0x7d/0x820
        kthread+0x1ad/0x210
        ret_from_fork+0x35/0x40
      
      This patch adds one extra condition to check left space in bio while
      trying merging page to bio, to avoid panic.
      
      This bug was reported in bugzilla:
      
      https://bugzilla.kernel.org/show_bug.cgi?id=204043Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      c72db71e
  15. 13 8月, 2019 2 次提交
    • E
      f2fs: add fs-verity support · 95ae251f
      Eric Biggers 提交于
      Add fs-verity support to f2fs.  fs-verity is a filesystem feature that
      enables transparent integrity protection and authentication of read-only
      files.  It uses a dm-verity like mechanism at the file level: a Merkle
      tree is used to verify any block in the file in log(filesize) time.  It
      is implemented mainly by helper functions in fs/verity/.  See
      Documentation/filesystems/fsverity.rst for the full documentation.
      
      The f2fs support for fs-verity consists of:
      
      - Adding a filesystem feature flag and an inode flag for fs-verity.
      
      - Implementing the fsverity_operations to support enabling verity on an
        inode and reading/writing the verity metadata.
      
      - Updating ->readpages() to verify data as it's read from verity files
        and to support reading verity metadata pages.
      
      - Updating ->write_begin(), ->write_end(), and ->writepages() to support
        writing verity metadata pages.
      
      - Calling the fs-verity hooks for ->open(), ->setattr(), and ->ioctl().
      
      Like ext4, f2fs stores the verity metadata (Merkle tree and
      fsverity_descriptor) past the end of the file, starting at the first 64K
      boundary beyond i_size.  This approach works because (a) verity files
      are readonly, and (b) pages fully beyond i_size aren't visible to
      userspace but can be read/written internally by f2fs with only some
      relatively small changes to f2fs.  Extended attributes cannot be used
      because (a) f2fs limits the total size of an inode's xattr entries to
      4096 bytes, which wouldn't be enough for even a single Merkle tree
      block, and (b) f2fs encryption doesn't encrypt xattrs, yet the verity
      metadata *must* be encrypted when the file is because it contains hashes
      of the plaintext data.
      Acked-by: NJaegeuk Kim <jaegeuk@kernel.org>
      Acked-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NEric Biggers <ebiggers@google.com>
      95ae251f
    • E
      f2fs: wire up new fscrypt ioctls · 8ce589c7
      Eric Biggers 提交于
      Wire up the new ioctls for adding and removing fscrypt keys to/from the
      filesystem, and the new ioctl for retrieving v2 encryption policies.
      
      The key removal ioctls also required making f2fs_drop_inode() call
      fscrypt_drop_inode().
      
      For more details see Documentation/filesystems/fscrypt.rst and the
      fscrypt patches that added the implementation of these ioctls.
      Acked-by: NJaegeuk Kim <jaegeuk@kernel.org>
      Reviewed-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NEric Biggers <ebiggers@google.com>
      8ce589c7
  16. 29 7月, 2019 1 次提交
    • I
      f2fs: use EINVAL for superblock with invalid magic · 38fb6d0e
      Icenowy Zheng 提交于
      The kernel mount_block_root() function expects -EACESS or -EINVAL for a
      unmountable filesystem when trying to mount the root with different
      filesystem types.
      
      However, in 5.3-rc1 the behavior when F2FS code cannot find valid block
      changed to return -EFSCORRUPTED(-EUCLEAN), and this error code makes
      mount_block_root() fail when trying to probe F2FS.
      
      When the magic number of the superblock mismatches, it has a high
      probability that it's just not a F2FS. In this case return -EINVAL seems
      to be a better result, and this return value can make mount_block_root()
      probing work again.
      
      Return -EINVAL when the superblock has magic mismatch, -EFSCORRUPTED in
      other cases (the magic matches but the superblock cannot be recognized).
      
      Fixes: 10f966bb ("f2fs: use generic EFSBADCRC/EFSCORRUPTED")
      Signed-off-by: NIcenowy Zheng <icenowy@aosc.io>
      Reviewed-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      38fb6d0e
  17. 12 7月, 2019 1 次提交
  18. 11 7月, 2019 1 次提交
  19. 03 7月, 2019 4 次提交
    • J
      f2fs: add a rw_sem to cover quota flag changes · db6ec53b
      Jaegeuk Kim 提交于
      Two paths to update quota and f2fs_lock_op:
      
      1.
       - lock_op
       |  - quota_update
       `- unlock_op
      
      2.
       - quota_update
       - lock_op
       `- unlock_op
      
      But, we need to make a transaction on quota_update + lock_op in #2 case.
      So, this patch introduces:
      1. lock_op
      2. down_write
      3. check __need_flush
      4. up_write
      5. if there is dirty quota entries, flush them
      6. otherwise, good to go
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      db6ec53b
    • C
      f2fs: use generic EFSBADCRC/EFSCORRUPTED · 10f966bb
      Chao Yu 提交于
      f2fs uses EFAULT as error number to indicate filesystem is corrupted
      all the time, but generic filesystems use EUCLEAN for such condition,
      we need to change to follow others.
      
      This patch adds two new macros as below to wrap more generic error
      code macros, and spread them in code.
      
      EFSBADCRC	EBADMSG		/* Bad CRC detected */
      EFSCORRUPTED	EUCLEAN		/* Filesystem is corrupted */
      Reported-by: NPavel Machek <pavel@ucw.cz>
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Acked-by: NPavel Machek <pavel@ucw.cz>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      10f966bb
    • J
      f2fs: introduce f2fs_<level> macros to wrap f2fs_printk() · dcbb4c10
      Joe Perches 提交于
      - Add and use f2fs_<level> macros
      - Convert f2fs_msg to f2fs_printk
      - Remove level from f2fs_printk and embed the level in the format
      - Coalesce formats and align multi-line arguments
      - Remove unnecessary duplicate extern f2fs_msg f2fs.h
      Signed-off-by: NJoe Perches <joe@perches.com>
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Reviewed-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      dcbb4c10
    • Q
      f2fs: ioctl for removing a range from F2FS · 04f0b2ea
      Qiuyang Sun 提交于
      This ioctl shrinks a given length (aligned to sections) from end of the
      main area. Any cursegs and valid blocks will be moved out before
      invalidating the range.
      
      This feature can be used for adjusting partition sizes online.
      
      History of the patch:
      
      Sahitya Tummala:
       - Add this ioctl for f2fs_compat_ioctl() as well.
       - Fix debugfs status to reflect the online resize changes.
       - Fix potential race between online resize path and allocate new data
         block path or gc path.
      
      Others:
       - Rename some identifiers.
       - Add some error handling branches.
       - Clear sbi->next_victim_seg[BG_GC/FG_GC] in shrinking range.
       - Implement this interface as ext4's, and change the parameter from shrunk
      bytes to new block count of F2FS.
       - During resizing, force to empty sit_journal and forbid adding new
         entries to it, in order to avoid invalid segno in journal after resize.
       - Reduce sbi->user_block_count before resize starts.
       - Commit the updated superblock first, and then update in-memory metadata
         only when the former succeeds.
       - Target block count must align to sections.
       - Write checkpoint before and after committing the new superblock, w/o
      CP_FSCK_FLAG respectively, so that the FS can be fixed by fsck even if
      resize fails after the new superblock is committed.
       - In free_segment_range(), reduce granularity of gc_mutex.
       - Add protection on curseg migration.
       - Add freeze_bdev() and thaw_bdev() for resize fs.
       - Remove CUR_MAIN_SECS and use MAIN_SECS directly for allocation.
       - Recover super_block and FS metadata when resize fails.
       - No need to clear CP_FSCK_FLAG in update_ckpt_flags().
       - Clean up the sb and fs metadata update functions for resize_fs.
      
      Geert Uytterhoeven:
       - Use div_u64*() for 64-bit divisions
      
      Arnd Bergmann:
       - Not all architectures support get_user() with a 64-bit argument:
          ERROR: "__get_user_bad" [fs/f2fs/f2fs.ko] undefined!
          Use copy_from_user() here, this will always work.
      Signed-off-by: NQiuyang Sun <sunqiuyang@huawei.com>
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NSahitya Tummala <stummala@codeaurora.org>
      Signed-off-by: NGeert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Reviewed-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      04f0b2ea
  20. 04 6月, 2019 2 次提交
  21. 31 5月, 2019 1 次提交