1. 18 1月, 2020 5 次提交
  2. 04 1月, 2020 1 次提交
  3. 27 12月, 2019 3 次提交
  4. 15 12月, 2019 1 次提交
  5. 20 11月, 2019 1 次提交
  6. 15 11月, 2019 1 次提交
    • Y
      ext4: fix a bug in ext4_wait_for_tail_page_commit · 565333a1
      yangerkun 提交于
      No need to wait for any commit once the page is fully truncated.
      Besides, it may confuse e.g. concurrent ext4_writepage() with the page
      still be dirty (will be cleared by truncate_pagecache() in
      ext4_setattr()) but buffers has been freed; and then trigger a bug
      show as below:
      
      [   26.057508] ------------[ cut here ]------------
      [   26.058531] kernel BUG at fs/ext4/inode.c:2134!
      ...
      [   26.088130] Call trace:
      [   26.088695]  ext4_writepage+0x914/0xb28
      [   26.089541]  writeout.isra.4+0x1b4/0x2b8
      [   26.090409]  move_to_new_page+0x3b0/0x568
      [   26.091338]  __unmap_and_move+0x648/0x988
      [   26.092241]  unmap_and_move+0x48c/0xbb8
      [   26.093096]  migrate_pages+0x220/0xb28
      [   26.093945]  kernel_mbind+0x828/0xa18
      [   26.094791]  __arm64_sys_mbind+0xc8/0x138
      [   26.095716]  el0_svc_common+0x190/0x490
      [   26.096571]  el0_svc_handler+0x60/0xd0
      [   26.097423]  el0_svc+0x8/0xc
      
      Run the procedure (generate by syzkaller) parallel with ext3.
      
      void main()
      {
      	int fd, fd1, ret;
      	void *addr;
      	size_t length = 4096;
      	int flags;
      	off_t offset = 0;
      	char *str = "12345";
      
      	fd = open("a", O_RDWR | O_CREAT);
      	assert(fd >= 0);
      
      	/* Truncate to 4k */
      	ret = ftruncate(fd, length);
      	assert(ret == 0);
      
      	/* Journal data mode */
      	flags = 0xc00f;
      	ret = ioctl(fd, _IOW('f', 2, long), &flags);
      	assert(ret == 0);
      
      	/* Truncate to 0 */
      	fd1 = open("a", O_TRUNC | O_NOATIME);
      	assert(fd1 >= 0);
      
      	addr = mmap(NULL, length, PROT_WRITE | PROT_READ,
      					MAP_SHARED, fd, offset);
      	assert(addr != (void *)-1);
      
      	memcpy(addr, str, 5);
      	mbind(addr, length, 0, 0, 0, MPOL_MF_MOVE);
      }
      
      And the bug will be triggered once we seen the below order.
      
      reproduce1                         reproduce2
      
      ...                            |   ...
      truncate to 4k                 |
      change to journal data mode    |
                                     |   memcpy(set page dirty)
      truncate to 0:                 |
      ext4_setattr:                  |
      ...                            |
      ext4_wait_for_tail_page_commit |
                                     |   mbind(trigger bug)
      truncate_pagecache(clean dirty)|   ...
      ...                            |
      
      mbind will call ext4_writepage() since the page still be dirty, and then
      report the bug since the buffers has been free. Fix it by return
      directly once offset equals to 0 which means the page has been fully
      truncated.
      Reported-by: NHulk Robot <hulkci@huawei.com>
      Signed-off-by: Nyangerkun <yangerkun@huawei.com>
      Link: https://lore.kernel.org/r/20190919063508.1045-1-yangerkun@huawei.comReviewed-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      565333a1
  7. 14 11月, 2019 1 次提交
  8. 11 11月, 2019 1 次提交
  9. 06 11月, 2019 14 次提交
  10. 23 10月, 2019 2 次提交
  11. 21 10月, 2019 1 次提交
  12. 30 9月, 2019 1 次提交
    • L
      Revert "Revert "ext4: make __ext4_get_inode_loc plug"" · 02f03c42
      Linus Torvalds 提交于
      This reverts commit 72dbcf72.
      
      Instead of waiting forever for entropy that may just not happen, we now
      try to actively generate entropy when required, and are thus hopefully
      avoiding the problem that caused the nice ext4 IO pattern fix to be
      reverted.
      
      So revert the revert.
      
      Cc: Ahmed S. Darwish <darwish.07@gmail.com>
      Cc: Ted Ts'o <tytso@mit.edu>
      Cc: Willy Tarreau <w@1wt.eu>
      Cc: Alexander E. Patrakov <patrakov@gmail.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      02f03c42
  13. 16 9月, 2019 1 次提交
    • L
      Revert "ext4: make __ext4_get_inode_loc plug" · 72dbcf72
      Linus Torvalds 提交于
      This reverts commit b03755ad.
      
      This is sad, and done for all the wrong reasons.  Because that commit is
      good, and does exactly what it says: avoids a lot of small disk requests
      for the inode table read-ahead.
      
      However, it turns out that it causes an entirely unrelated problem: the
      getrandom() system call was introduced back in 2014 by commit
      c6e9d6f3 ("random: introduce getrandom(2) system call"), and people
      use it as a convenient source of good random numbers.
      
      But part of the current semantics for getrandom() is that it waits for
      the entropy pool to fill at least partially (unlike /dev/urandom).  And
      at least ArchLinux apparently has a systemd that uses getrandom() at
      boot time, and the improvements in IO patterns means that existing
      installations suddenly start hanging, waiting for entropy that will
      never happen.
      
      It seems to be an unlucky combination of not _quite_ enough entropy,
      together with a particular systemd version and configuration.  Lennart
      says that the systemd-random-seed process (which is what does this early
      access) is supposed to not block any other boot activity, but sadly that
      doesn't actually seem to be the case (possibly due bogus dependencies on
      cryptsetup for encrypted swapspace).
      
      The correct fix is to fix getrandom() to not block when it's not
      appropriate, but that fix is going to take a lot more discussion.  Do we
      just make it act like /dev/urandom by default, and add a new flag for
      "wait for entropy"? Do we add a boot-time option? Or do we just limit
      the amount of time it will wait for entropy?
      
      So in the meantime, we do the revert to give us time to discuss the
      eventual fix for the fundamental problem, at which point we can re-apply
      the ext4 inode table access optimization.
      Reported-by: NAhmed S. Darwish <darwish.07@gmail.com>
      Cc: Ted Ts'o <tytso@mit.edu>
      Cc: Willy Tarreau <w@1wt.eu>
      Cc: Alexander E. Patrakov <patrakov@gmail.com>
      Cc: Lennart Poettering <mzxreary@0pointer.de>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      72dbcf72
  14. 03 9月, 2019 1 次提交
    • T
      ext4: fix kernel oops caused by spurious casefold flag · 6456ca65
      Theodore Ts'o 提交于
      If an directory has the a casefold flag set without the casefold
      feature set, s_encoding will not be initialized, and this will cause
      the kernel to dereference a NULL pointer.  In addition to adding
      checks to avoid these kernel oops, attempts to load inodes with the
      casefold flag when the casefold feature is not enable will cause the
      file system to be declared corrupted.
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      6456ca65
  15. 24 8月, 2019 1 次提交
    • T
      ext4: fix punch hole for inline_data file systems · c1e8220b
      Theodore Ts'o 提交于
      If a program attempts to punch a hole on an inline data file, we need
      to convert it to a normal file first.
      
      This was detected using ext4/032 using the adv configuration.  Simple
      reproducer:
      
      mke2fs -Fq -t ext4 -O inline_data /dev/vdc
      mount /vdc
      echo "" > /vdc/testfile
      xfs_io -c 'truncate 33554432' /vdc/testfile
      xfs_io -c 'fpunch 0 1048576' /vdc/testfile
      umount /vdc
      e2fsck -fy /dev/vdc
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      c1e8220b
  16. 23 8月, 2019 2 次提交
    • E
      ext4: rework reserved cluster accounting when invalidating pages · 8fcc3a58
      Eric Whitney 提交于
      The goal of this patch is to remove two references to the buffer delay
      bit in ext4_da_page_release_reservation() as part of a larger effort
      to remove all such references from ext4.  These two references are
      principally used to reduce the reserved block/cluster count when pages
      are invalidated as a result of truncating, punching holes, or
      collapsing a block range in a file.  The entire function is removed
      and replaced with code in ext4_es_remove_extent() that reduces the
      reserved count as a side effect of removing a block range from delayed
      and not unwritten extents in the extent status tree as is done when
      truncating, punching holes, or collapsing ranges.
      
      The code is written to minimize the number of searches descending from
      rb tree roots for scalability.
      Signed-off-by: NEric Whitney <enwlinux@gmail.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      8fcc3a58
    • Z
      ext4: treat buffers with write errors as containing valid data · 7963e5ac
      ZhangXiaoxu 提交于
      I got some errors when I repair an ext4 volume which stacked by an
      iscsi target:
          Entry 'test60' in / (2) has deleted/unused inode 73750.  Clear?
      It can be reproduced when the network not good enough.
      
      When I debug this I found ext4 will read entry buffer from disk and
      the buffer is marked with write_io_error.
      
      If the buffer is marked with write_io_error, it means it already
      wroten to journal, and not checked out to disk. IOW, the journal
      is newer than the data in disk.
      If this journal record 'delete test60', it means the 'test60' still
      on the disk metadata.
      
      In this case, if we read the buffer from disk successfully and create
      file continue, the new journal record will overwrite the journal
      which record 'delete test60', then the entry corruptioned.
      
      So, use the buffer rather than read from disk if the buffer is marked
      with write_io_error.
      Signed-off-by: NZhang Xiaoxu <zhangxiaoxu5@huawei.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      7963e5ac
  17. 13 8月, 2019 2 次提交
    • E
      ext4: add fs-verity read support · 22cfe4b4
      Eric Biggers 提交于
      Make ext4_mpage_readpages() verify data as it is read from fs-verity
      files, using the helper functions from fs/verity/.
      
      To support both encryption and verity simultaneously, this required
      refactoring the decryption workflow into a generic "post-read
      processing" workflow which can do decryption, verification, or both.
      
      The case where the ext4 block size is not equal to the PAGE_SIZE is not
      supported yet, since in that case ext4_mpage_readpages() sometimes falls
      back to block_read_full_page(), which does not support fs-verity yet.
      Co-developed-by: NTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: NEric Biggers <ebiggers@google.com>
      22cfe4b4
    • E
      ext4: add basic fs-verity support · c93d8f88
      Eric Biggers 提交于
      Add most of fs-verity support to ext4.  fs-verity is a filesystem
      feature that enables transparent integrity protection and authentication
      of read-only files.  It uses a dm-verity like mechanism at the file
      level: a Merkle tree is used to verify any block in the file in
      log(filesize) time.  It is implemented mainly by helper functions in
      fs/verity/.  See Documentation/filesystems/fsverity.rst for the full
      documentation.
      
      This commit adds all of ext4 fs-verity support except for the actual
      data verification, including:
      
      - Adding a filesystem feature flag and an inode flag for fs-verity.
      
      - Implementing the fsverity_operations to support enabling verity on an
        inode and reading/writing the verity metadata.
      
      - Updating ->write_begin(), ->write_end(), and ->writepages() to support
        writing verity metadata pages.
      
      - Calling the fs-verity hooks for ->open(), ->setattr(), and ->ioctl().
      
      ext4 stores the verity metadata (Merkle tree and fsverity_descriptor)
      past the end of the file, starting at the first 64K boundary beyond
      i_size.  This approach works because (a) verity files are readonly, and
      (b) pages fully beyond i_size aren't visible to userspace but can be
      read/written internally by ext4 with only some relatively small changes
      to ext4.  This approach avoids having to depend on the EA_INODE feature
      and on rearchitecturing ext4's xattr support to support paging
      multi-gigabyte xattrs into memory, and to support encrypting xattrs.
      Note that the verity metadata *must* be encrypted when the file is,
      since it contains hashes of the plaintext data.
      
      This patch incorporates work by Theodore Ts'o and Chandan Rajendra.
      Reviewed-by: NTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: NEric Biggers <ebiggers@google.com>
      c93d8f88
  18. 12 8月, 2019 1 次提交