1. 29 May, 2020 2 commits
  2. 02 Apr, 2020 1 commit
  3. 06 Mar, 2020 1 commit
    • ext4: remove EXT4_EOFBLOCKS_FL and associated code · 4337ecd1
      Eric Whitney authored
      The EXT4_EOFBLOCKS_FL inode flag is used to indicate whether a file
      contains unwritten blocks past i_size.  It's set when ext4_fallocate
      is called with the KEEP_SIZE flag to extend a file with an unwritten
      extent.  However, this flag hasn't been useful functionally since
      March, 2012, when a decision was made to remove it from ext4.
      
      All traces of EXT4_EOFBLOCKS_FL were removed from e2fsprogs version
      1.42.2 by commit 010dc7b90d97 ("e2fsck: remove EXT4_EOFBLOCKS_FL flag
      handling") at that time.  Now that enough time has passed to make
      e2fsprogs versions containing this modification common, this patch now
      removes the code associated with EXT4_EOFBLOCKS_FL from the kernel as
      well.
      
      This change has two implications.  First, because pre-1.42.2 e2fsck
      versions only look for a problem if EXT4_EOFBLOCKS_FL is set, and
      because that bit will never be set by newer kernels containing this
      patch, old versions of e2fsck won't have a compatibility problem with
      files created by newer kernels.
      
      Second, newer kernels will not clear EXT4_EOFBLOCKS_FL inode flag bits
      belonging to a file written by an older kernel.  If set, it will remain
      in that state until the file is deleted.  Because e2fsck versions since
      1.42.2 don't check the flag at all, no adverse effect is expected.
      However, pre-1.42.2 e2fsck versions that do check the flag may report
      that it is set when it ought not to be after a file has been truncated
      or had its unwritten blocks written.  In this case, the old version of
      e2fsck will offer to clear the flag.  No adverse effect would then
      occur whether the user chooses to clear the flag or not.
      Signed-off-by: Eric Whitney <enwlinux@gmail.com>
      Link: https://lore.kernel.org/r/20200211210216.24960-1-enwlinux@gmail.com
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
  4. 22 Feb, 2020 3 commits
  5. 21 Feb, 2020 2 commits
  6. 20 Feb, 2020 1 commit
    • ext4: fix a data race in EXT4_I(inode)->i_disksize · 35df4299
      Qian Cai authored
      EXT4_I(inode)->i_disksize could be accessed concurrently as noticed by
      KCSAN,
      
       BUG: KCSAN: data-race in ext4_write_end [ext4] / ext4_writepages [ext4]
      
       write to 0xffff91c6713b00f8 of 8 bytes by task 49268 on cpu 127:
        ext4_write_end+0x4e3/0x750 [ext4]
        ext4_update_i_disksize at fs/ext4/ext4.h:3032
        (inlined by) ext4_update_inode_size at fs/ext4/ext4.h:3046
        (inlined by) ext4_write_end at fs/ext4/inode.c:1287
        generic_perform_write+0x208/0x2a0
        ext4_buffered_write_iter+0x11f/0x210 [ext4]
        ext4_file_write_iter+0xce/0x9e0 [ext4]
        new_sync_write+0x29c/0x3b0
        __vfs_write+0x92/0xa0
        vfs_write+0x103/0x260
        ksys_write+0x9d/0x130
        __x64_sys_write+0x4c/0x60
        do_syscall_64+0x91/0xb47
        entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
       read to 0xffff91c6713b00f8 of 8 bytes by task 24872 on cpu 37:
        ext4_writepages+0x10ac/0x1d00 [ext4]
        mpage_map_and_submit_extent at fs/ext4/inode.c:2468
        (inlined by) ext4_writepages at fs/ext4/inode.c:2772
        do_writepages+0x5e/0x130
        __writeback_single_inode+0xeb/0xb20
        writeback_sb_inodes+0x429/0x900
        __writeback_inodes_wb+0xc4/0x150
        wb_writeback+0x4bd/0x870
        wb_workfn+0x6b4/0x960
        process_one_work+0x54c/0xbe0
        worker_thread+0x80/0x650
        kthread+0x1e0/0x200
        ret_from_fork+0x27/0x50
      
       Reported by Kernel Concurrency Sanitizer on:
       CPU: 37 PID: 24872 Comm: kworker/u261:2 Tainted: G        W  O L 5.5.0-next-20200204+ #5
       Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019
       Workqueue: writeback wb_workfn (flush-7:0)
      
      Since only the read is lockless (it happens outside of the
      "i_data_sem"), load tearing could introduce a logic bug. Fix it by
      adding READ_ONCE() for the read and WRITE_ONCE() for the write.
      Signed-off-by: Qian Cai <cai@lca.pw>
      Link: https://lore.kernel.org/r/1581085751-31793-1-git-send-email-cai@lca.pw
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Cc: stable@kernel.org
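To make the load-tearing concern in the i_disksize commit concrete, here is a minimal userspace model, not kernel code. It assumes a compiler were to split the 8-byte access into two 4-byte halves, which is the kind of transformation READ_ONCE() and WRITE_ONCE() are meant to rule out. The helper names (`split_halves`, `torn_read`) and the sample sizes are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def split_halves(value):
    """Split a 64-bit value into its (low, high) 32-bit halves."""
    return value & MASK32, (value >> 32) & MASK32


def torn_read(old, new, low_from_new):
    """Model a 64-bit load performed as two 32-bit loads while a writer
    is half-way through replacing `old` with `new`."""
    old_lo, old_hi = split_halves(old)
    new_lo, new_hi = split_halves(new)
    if low_from_new:
        lo, hi = new_lo, old_hi   # reader saw the new low half only
    else:
        lo, hi = old_lo, new_hi   # reader saw the new high half only
    return (hi << 32) | lo


# Hypothetical on-disk sizes straddling the 4 GiB boundary.
old_size = 0x0000_0000_FFFF_F000
new_size = 0x0000_0001_0000_0000

torn_a = torn_read(old_size, new_size, low_from_new=True)
torn_b = torn_read(old_size, new_size, low_from_new=False)
print(hex(torn_a), hex(torn_b))
```

Neither torn value equals the old or the new size, which is why a lockless reader needs the access to be performed as a single load.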
  7. 14 Feb, 2020 1 commit
    • ext4: fix checksum errors with indexed dirs · 48a34311
      Jan Kara authored
      DIR_INDEX has been introduced as a compat ext4 feature. That means that
      even kernels / tools that don't understand the feature may modify the
      filesystem. This works because for kernels not understanding indexed dir
      format, internal htree nodes appear just as empty directory entries.
      Index dir aware kernels then check the htree structure is still
      consistent before using the data. This all worked reasonably well until
      metadata checksums were introduced. The problem is that these
      effectively made DIR_INDEX only ro-compatible because internal htree
      nodes store checksums in a different place than normal directory blocks.
      Thus any modification ignorant of DIR_INDEX (or just clearing
      EXT4_INDEX_FL from the inode) will effectively cause a checksum mismatch
      and trigger kernel errors. So we have to be more careful when dealing
      with indexed directories on filesystems with checksumming enabled.
      
      1) We just disallow loading any directory inodes with EXT4_INDEX_FL when
      DIR_INDEX is not enabled. This is harsh but it should be very rare (it
      means someone disabled DIR_INDEX on an existing filesystem and didn't run
      e2fsck), e2fsck can fix the problem, and we don't want to answer the
      difficult question: "Should we rather corrupt the directory more or
      should we ignore that the DIR_INDEX feature is not set?"
      
      2) When we find out the htree structure is corrupted (but the filesystem
      and the directory should support htrees), we continue, ignoring the htree
      information for reading, but we refuse to add new entries to the
      directory to avoid corrupting it more.
      
      Link: https://lore.kernel.org/r/20200210144316.22081-1-jack@suse.cz
      Fixes: dbe89444 ("ext4: Calculate and verify checksums for htree nodes")
      Reviewed-by: Andreas Dilger <adilger@dilger.ca>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Cc: stable@kernel.org
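The two rules in the indexed-directory commit can be read as a small decision table. The sketch below is only an illustration of that reading in Python, not the kernel's logic: the `Access` enum, `classify_directory`, and its boolean parameters are hypothetical names, and the model ignores whether metadata checksums are enabled.

```python
from enum import Enum


class Access(Enum):
    REJECT_INODE = "refuse to load the inode"
    LINEAR_NO_ADD = "read linearly, refuse new entries"
    HTREE = "use the htree normally"
    LINEAR = "plain linear directory"


def classify_directory(fs_dir_index, inode_index_flag, htree_consistent):
    """Decide how a directory is handled.

    fs_dir_index      models the DIR_INDEX filesystem feature
    inode_index_flag  models EXT4_INDEX_FL on the directory inode
    htree_consistent  models the outcome of validating the htree
    """
    if inode_index_flag and not fs_dir_index:
        return Access.REJECT_INODE      # rule 1: flag set, feature off
    if not inode_index_flag:
        return Access.LINEAR            # never an indexed directory
    if not htree_consistent:
        return Access.LINEAR_NO_ADD     # rule 2: corrupted htree
    return Access.HTREE


print(classify_directory(False, True, True).value)
print(classify_directory(True, True, False).value)
```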
  8. 18 Jan, 2020 4 commits
  9. 27 Dec, 2019 3 commits
  10. 07 Nov, 2019 1 commit
    • ext4: add support for IV_INO_LBLK_64 encryption policies · b925acb8
      Eric Biggers authored
      IV_INO_LBLK_64 encryption policies have special requirements from the
      filesystem beyond those of the existing encryption policies:
      
      - Inode numbers must never change, even if the filesystem is resized.
      - Inode numbers must be <= 32 bits.
      - File logical block numbers must be <= 32 bits.
      
      ext4 has 32-bit inode and file logical block numbers.  However,
      resize2fs can re-number inodes when shrinking an ext4 filesystem.
      
      However, typically the people who would want to use this format don't
      care about filesystem shrinking.  They'd be fine with a solution that
      just prevents the filesystem from being shrunk.
      
      Therefore, add a new feature flag EXT4_FEATURE_COMPAT_STABLE_INODES that
      will do exactly that.  Then wire up the fscrypt_operations to expose
      this flag to fs/crypto/, so that it allows IV_INO_LBLK_64 policies when
      this flag is set.
      Acked-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Eric Biggers <ebiggers@google.com>
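The 32-bit limits in the IV_INO_LBLK_64 commit follow from how the IV is laid out. The kernel's fscrypt documentation describes the logical block number as occupying bits 0-31 and the inode number bits 32-63. The sketch below models only that packing in Python. The function name and the use of ValueError are illustrative assumptions, and it does not model the stable-inodes feature check.

```python
def iv_ino_lblk_64(inode_number, logical_block):
    """Pack an inode number and a file logical block number into the
    64-bit value used as the low part of the IV (illustrative model)."""
    if not 0 <= inode_number < 2**32:
        raise ValueError("inode number does not fit in 32 bits")
    if not 0 <= logical_block < 2**32:
        raise ValueError("logical block number does not fit in 32 bits")
    return (inode_number << 32) | logical_block


iv = iv_ino_lblk_64(12, 5)
print(hex(iv))
```

Because the inode number is baked into every IV, renumbering inodes (as resize2fs may do when shrinking) would make existing ciphertext undecryptable, which is the motivation the commit gives for a flag that forbids shrinking.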
  11. 06 Nov, 2019 4 commits
  12. 23 Oct, 2019 2 commits
  13. 05 Sep, 2019 1 commit
  14. 30 Aug, 2019 1 commit
    • ext4: Initialize timestamps limits · 4881c497
      Deepa Dinamani authored
      ext4 has different overflow limits for max filesystem
      timestamps based on the extra bytes available.
      
      The timestamp limits are calculated according to the
      encoding table in
      a4dad1ae ("ext4: Fix handling of extended tv_sec"):
      
      * extra  msb of                         adjust for signed
      * epoch  32-bit                         32-bit tv_sec to
      * bits   time    decoded 64-bit tv_sec  64-bit tv_sec      valid time range
      * 0 0    1    -0x80000000..-0x00000001  0x000000000   1901-12-13..1969-12-31
      * 0 0    0    0x000000000..0x07fffffff  0x000000000   1970-01-01..2038-01-19
      * 0 1    1    0x080000000..0x0ffffffff  0x100000000   2038-01-19..2106-02-07
      * 0 1    0    0x100000000..0x17fffffff  0x100000000   2106-02-07..2174-02-25
      * 1 0    1    0x180000000..0x1ffffffff  0x200000000   2174-02-25..2242-03-16
      * 1 0    0    0x200000000..0x27fffffff  0x200000000   2242-03-16..2310-04-04
      * 1 1    1    0x280000000..0x2ffffffff  0x300000000   2310-04-04..2378-04-22
      * 1 1    0    0x300000000..0x37fffffff  0x300000000   2378-04-22..2446-05-10
      
      Note that the time limits are not correct for deletion times.
      
      Add a warning when an inode cannot be extended to incorporate an
      extended timestamp.
      Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
      Reviewed-by: Andreas Dilger <adilger@dilger.ca>
      Acked-by: Jeff Layton <jlayton@kernel.org>
      Cc: tytso@mit.edu
      Cc: adilger.kernel@dilger.ca
      Cc: linux-ext4@vger.kernel.org
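The encoding table in the timestamp-limits commit can be checked with a short Python model. It assumes, as the table does, that the two extra epoch bits are added above a sign-extended 32-bit seconds field. The function names are hypothetical, and nanoseconds are left out.

```python
from datetime import datetime, timedelta, timezone

EPOCH_BITS = 2
EPOCH_MASK = (1 << EPOCH_BITS) - 1
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_signed32(value):
    """Interpret the low 32 bits of `value` as a signed integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def decode_tv_sec(seconds32, epoch_bits):
    """Combine the 32-bit seconds field with the extra epoch bits."""
    return to_signed32(seconds32) + ((epoch_bits & EPOCH_MASK) << 32)


def encode_tv_sec(tv_sec):
    """Return (seconds32, epoch_bits) for a 64-bit tv_sec in range."""
    seconds32 = tv_sec & 0xFFFFFFFF
    epoch_bits = ((tv_sec - to_signed32(seconds32)) >> 32) & EPOCH_MASK
    return seconds32, epoch_bits


def as_date(tv_sec):
    """Format tv_sec as a UTC calendar date."""
    return (UNIX_EPOCH + timedelta(seconds=tv_sec)).date().isoformat()


TS_MIN = decode_tv_sec(0x80000000, 0)   # first row of the table
TS_MAX = decode_tv_sec(0x7FFFFFFF, 3)   # last row of the table
print(as_date(TS_MIN), as_date(TS_MAX))
```

The two limits correspond to the first and last rows of the table: 1901-12-13 and 2446-05-10.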
  15. 28 Aug, 2019 1 commit
    • ext4: fix potential use after free after remounting with noblock_validity · 7727ae52
      zhangyi (F) authored
      The remount process will release the system zone that was allocated
      earlier if "noblock_validity" is specified. If we mount an ext4 file
      system at two mountpoints with default mount options, and then remount
      one of them with "noblock_validity", it may trigger a use after free
      when someone accesses the other one.
      
       # mount /dev/sda foo
       # mount /dev/sda bar
      
      User access mountpoint "foo"   |   Remount mountpoint "bar"
                                     |
      ext4_map_blocks()              |   ext4_remount()
      check_block_validity()         |   ext4_setup_system_zone()
      ext4_data_block_valid()        |   ext4_release_system_zone()
                                     |   free system_blks rb nodes
      access system_blks rb nodes    |
      trigger use after free         |
      
      This problem can also be reproduced with one mountpoint. At the same
      time, add_system_zone() can get called during remount as well, so there
      can be a racing ext4_data_block_valid() reading the rbtree at the same
      time.
      
      This patch adds RCU to protect the system zone from being released or
      built during a remount that inverts the current "noblock_validity" mount
      option. It assigns the rbtree only after the whole tree is complete and
      does the actual freeing after an RCU grace period, avoiding any
      intermediate state.
      
      Reported-by: syzbot+1e470567330b7ad711d5@syzkaller.appspotmail.com
      Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Reviewed-by: Jan Kara <jack@suse.cz>
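The "build the whole tree, then publish it" idea in the noblock_validity commit can be sketched in Python. This is only an analogy: Python has no RCU, and its garbage collector stands in for the grace period by keeping an old snapshot alive while any reader still holds a reference. All class and method names are hypothetical, and a tuple of ranges replaces the rbtree.

```python
class SystemZones:
    """Immutable snapshot of reserved block ranges, as (start, end) pairs
    with an exclusive end."""

    def __init__(self, ranges):
        self._ranges = tuple(sorted(ranges))

    def overlaps(self, start, count):
        end = start + count
        return any(s < end and start < e for s, e in self._ranges)


class Superblock:
    def __init__(self):
        self.system_zones = None   # None models "noblock_validity"

    def data_block_valid(self, start, count):
        zones = self.system_zones  # single read: take one snapshot
        if zones is None:
            return True            # validity checking is disabled
        return not zones.overlaps(start, count)

    def remount(self, block_validity, ranges=()):
        if block_validity:
            new_zones = SystemZones(ranges)  # build completely first
            self.system_zones = new_zones    # then publish in one step
        else:
            self.system_zones = None         # unpublish; readers keep theirs


sb = Superblock()
sb.remount(True, [(0, 10), (100, 110)])
reader_snapshot = sb.system_zones   # a reader mid-lookup
sb.remount(False)                   # concurrent remount
print(reader_snapshot.overlaps(5, 2), sb.data_block_valid(5, 2))
```

The reader that took its snapshot before the remount still sees a complete, usable structure, which is the property the patch obtains with RCU.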
  16. 23 Aug, 2019 2 commits
    • ext4: rework reserved cluster accounting when invalidating pages · 8fcc3a58
      Eric Whitney authored
      The goal of this patch is to remove two references to the buffer delay
      bit in ext4_da_page_release_reservation() as part of a larger effort
      to remove all such references from ext4.  These two references are
      principally used to reduce the reserved block/cluster count when pages
      are invalidated as a result of truncating, punching holes, or
      collapsing a block range in a file.  The entire function is removed
      and replaced with code in ext4_es_remove_extent() that reduces the
      reserved count as a side effect of removing a block range from delayed
      and not unwritten extents in the extent status tree as is done when
      truncating, punching holes, or collapsing ranges.
      
      The code is written to minimize the number of searches descending from
      rb tree roots for scalability.
      Signed-off-by: Eric Whitney <enwlinux@gmail.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • ext4: treat buffers with write errors as containing valid data · 7963e5ac
      ZhangXiaoxu authored
      I got some errors when repairing an ext4 volume stacked on an
      iSCSI target:
          Entry 'test60' in / (2) has deleted/unused inode 73750.  Clear?
      It can be reproduced when the network is unreliable.
      
      While debugging this, I found that ext4 reads the entry buffer from
      disk even though the buffer is marked with write_io_error.
      
      If the buffer is marked with write_io_error, it has already been
      written to the journal but not yet checkpointed to disk. IOW, the
      journal is newer than the data on disk.
      If this journal record is 'delete test60', then 'test60' is still
      present in the on-disk metadata.
      
      In this case, if we read the buffer from disk successfully and go on
      to create a file, the new journal record will overwrite the journal
      record for 'delete test60', and the entry becomes corrupted.
      
      So use the buffer rather than reading from disk if the buffer is
      marked with write_io_error.
      Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
  17. 13 Aug, 2019 3 commits
    • ext4: add fs-verity read support · 22cfe4b4
      Eric Biggers authored
      Make ext4_mpage_readpages() verify data as it is read from fs-verity
      files, using the helper functions from fs/verity/.
      
      To support both encryption and verity simultaneously, this required
      refactoring the decryption workflow into a generic "post-read
      processing" workflow which can do decryption, verification, or both.
      
      The case where the ext4 block size is not equal to the PAGE_SIZE is not
      supported yet, since in that case ext4_mpage_readpages() sometimes falls
      back to block_read_full_page(), which does not support fs-verity yet.
      Co-developed-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Eric Biggers <ebiggers@google.com>
    • ext4: add basic fs-verity support · c93d8f88
      Eric Biggers authored
      Add most of fs-verity support to ext4.  fs-verity is a filesystem
      feature that enables transparent integrity protection and authentication
      of read-only files.  It uses a dm-verity like mechanism at the file
      level: a Merkle tree is used to verify any block in the file in
      log(filesize) time.  It is implemented mainly by helper functions in
      fs/verity/.  See Documentation/filesystems/fsverity.rst for the full
      documentation.
      
      This commit adds all of ext4 fs-verity support except for the actual
      data verification, including:
      
      - Adding a filesystem feature flag and an inode flag for fs-verity.
      
      - Implementing the fsverity_operations to support enabling verity on an
        inode and reading/writing the verity metadata.
      
      - Updating ->write_begin(), ->write_end(), and ->writepages() to support
        writing verity metadata pages.
      
      - Calling the fs-verity hooks for ->open(), ->setattr(), and ->ioctl().
      
      ext4 stores the verity metadata (Merkle tree and fsverity_descriptor)
      past the end of the file, starting at the first 64K boundary beyond
      i_size.  This approach works because (a) verity files are readonly, and
      (b) pages fully beyond i_size aren't visible to userspace but can be
      read/written internally by ext4 with only some relatively small changes
      to ext4.  This approach avoids having to depend on the EA_INODE feature
      and on rearchitecturing ext4's xattr support to support paging
      multi-gigabyte xattrs into memory, and to support encrypting xattrs.
      Note that the verity metadata *must* be encrypted when the file is,
      since it contains hashes of the plaintext data.
      
      This patch incorporates work by Theodore Ts'o and Chandan Rajendra.
      Reviewed-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Eric Biggers <ebiggers@google.com>
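Two ideas from the fs-verity commit lend themselves to a small Python sketch: placing the metadata at a 64K boundary past i_size, and verifying one block against a Merkle root in a logarithmic number of hashes. This is a simplified model under stated assumptions. It uses a binary SHA-256 tree with no salt, whereas the real fs-verity tree packs many hashes per block. It also treats an i_size already on a 64K boundary as the starting point. All names are hypothetical.

```python
import hashlib


def verity_metadata_offset(i_size, alignment=65536):
    """Round i_size up to the next multiple of `alignment`."""
    return (i_size + alignment - 1) // alignment * alignment


def sha256(data):
    return hashlib.sha256(data).digest()


def build_levels(blocks):
    """Return the tree as a list of levels, leaves first, root last.
    Odd-sized levels are padded by repeating their last hash."""
    level = [sha256(b) for b in blocks]
    levels = []
    while True:
        if len(level) > 1 and len(level) % 2:
            level = level + [level[-1]]
        levels.append(level)
        if len(level) == 1:
            return levels
        level = [sha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]


def proof_for(levels, index):
    """Collect the sibling hash at each level below the root."""
    path = []
    for level in levels[:-1]:
        path.append(level[index ^ 1])
        index //= 2
    return path


def verify_block(block, index, path, root):
    """Recompute the root from one block and its sibling path."""
    digest = sha256(block)
    for sibling in path:
        if index % 2 == 0:
            digest = sha256(digest + sibling)
        else:
            digest = sha256(sibling + digest)
        index //= 2
    return digest == root


blocks = [bytes([i]) * 4096 for i in range(5)]
levels = build_levels(blocks)
root = levels[-1][0]
path = proof_for(levels, 4)
print(len(path), verify_block(blocks[4], 4, path, root))
```

Verifying a block touches one sibling hash per tree level, so the work grows with the logarithm of the number of blocks rather than with the file size.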
    • ext4: drop legacy pre-1970 encoding workaround · cd2d9922
      Theodore Ts'o authored
      Originally, support for expanded timestamps had a bug in that pre-1970
      times were erroneously encoded as being in the 24th century.  This
      was fixed in commit a4dad1ae ("ext4: Fix handling of extended
      tv_sec") which landed in 4.4.  Starting with 4.4, pre-1970 timestamps
      were correctly encoded, but for backwards compatibility those
      incorrectly encoded timestamps were mapped back to the pre-1970 dates.
      
      Given that backwards compatibility workaround has been around for 4
      years, and given that running e2fsck from e2fsprogs 1.43.2 and later
      will offer to fix these timestamps (which has been released for 3
      years), it's past time to drop the legacy workaround from the kernel.
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
  18. 12 Aug, 2019 3 commits
  19. 22 Jun, 2019 3 commits
  20. 20 Jun, 2019 1 commit