  1. 14 Nov, 2018 1 commit
  2. 16 Sep, 2018 1 commit
  3. 02 Sep, 2018 1 commit
    • ext4: recalucate superblock checksum after updating free blocks/inodes · 4274f516
      Theodore Ts'o committed
      When mounting the superblock, ext4_fill_super() calculates the free
      blocks and free inodes and stores them in the superblock.  It's not
      strictly necessary, since we don't use them any more, but it's nice to
      keep them roughly aligned to reality.
      
      Since it's not critical for file system correctness, the code doesn't
      call ext4_commit_super().  The problem is that it's in
      ext4_commit_super() that we recalculate the superblock checksum.  So
      if we're not going to call ext4_commit_super(), we need to call
      ext4_superblock_csum_set() to make sure the superblock checksum is
      consistent.
      
      Most of the time, this doesn't matter, since we end up calling
      ext4_commit_super() very soon thereafter, and definitely by the time
      the file system is unmounted.  However, it doesn't work in this
      sequence:
      
      mke2fs -Fq -t ext4 /dev/vdc 128M
      mount /dev/vdc /vdc
      cp xfstests/git-versions /vdc
      godown /vdc
      umount /vdc
      mount /dev/vdc
      tune2fs -l /dev/vdc
      
      With this commit, the "tune2fs -l" no longer fails.
      Reported-by: Chengguang Xu <cgxu519@gmx.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Cc: stable@vger.kernel.org
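The stale-checksum problem described in this commit can be sketched in userspace C. This is a minimal model, not the kernel code: `struct toy_super` is a made-up three-field layout, and every `toy_*` name is hypothetical. It only assumes that the checksum is a CRC32C over the bytes that precede the checksum field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy superblock; NOT the real struct ext4_super_block layout. */
struct toy_super {
    uint32_t free_blocks;
    uint32_t free_inodes;
    uint32_t checksum;   /* covers every byte before this field */
};

/* Bitwise CRC32C (Castagnoli), reflected polynomial 0x82F63B78. */
uint32_t toy_crc32c(const void *buf, size_t len)
{
    const uint8_t *p = buf;
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++) {
        crc ^= p[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/* Checksum of everything in front of the checksum field. */
uint32_t toy_csum(const struct toy_super *sb)
{
    return toy_crc32c(sb, offsetof(struct toy_super, checksum));
}

void toy_superblock_csum_set(struct toy_super *sb)
{
    sb->checksum = toy_csum(sb);
}

int toy_superblock_csum_verify(const struct toy_super *sb)
{
    return sb->checksum == toy_csum(sb);
}

/*
 * Update the free counters. With recompute == 0 this models the old
 * behaviour (counters change, checksum does not); with recompute != 0
 * it models the fix.
 */
void toy_update_free_counts(struct toy_super *sb, uint32_t blocks,
                            uint32_t inodes, int recompute)
{
    assert(sb != NULL);
    sb->free_blocks = blocks;
    sb->free_inodes = inodes;
    if (recompute)
        toy_superblock_csum_set(sb);
}
```

Calling `toy_update_free_counts()` with `recompute == 0` leaves a superblock that no longer verifies, which is the state a tool such as `tune2fs -l` would reject if that superblock reached disk.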
  4. 30 Jul, 2018 5 commits
    • ext4: fix race when setting the bitmap corrupted flag · 9af0b3d1
      Wang Shilong committed
      Whenever we hit block or inode bitmap corruption, we set the
      corrupted bit and then reduce that block group's free inode/cluster
      counter so that the reported available space is accurate.
      
      However, some calls to ext4_mark_group_bitmap_corrupted() are made
      while holding the group spinlock and some are not, so the free
      counters of one block group could be reduced twice.
      
      Always holding the group spinlock would fix this, but that looks
      a little heavy; using test_and_set_bit() fixes the race
      instead.
      Signed-off-by: Wang Shilong <wshilong@ddn.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Cc: stable@vger.kernel.org
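The test-and-set idea from this commit can be illustrated with C11 atomics. This is a sketch under stated assumptions: `atomic_fetch_or()` stands in for the kernel's `test_and_set_bit()`, and the structures, the flag value, and all `toy_*` names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define TOY_BBITMAP_CORRUPT 0x1u   /* hypothetical flag value */

struct toy_group {
    atomic_uint state;          /* per-group flag word */
    atomic_long free_clusters;  /* free clusters in this group */
};

struct toy_fs {
    atomic_long free_clusters_total;  /* file-system-wide counter */
};

/*
 * Mark the group's block bitmap as corrupted. Only the caller that
 * actually flips the flag from 0 to 1 subtracts the group's free
 * clusters from the global counter, so concurrent or repeated calls
 * cannot subtract twice. Returns 1 if this call did the accounting.
 */
int toy_mark_block_bitmap_corrupted(struct toy_fs *fs, struct toy_group *grp)
{
    unsigned int old;

    assert(fs != NULL && grp != NULL);
    old = atomic_fetch_or(&grp->state, TOY_BBITMAP_CORRUPT);
    if (old & TOY_BBITMAP_CORRUPT)
        return 0;   /* somebody else already accounted for it */

    atomic_fetch_sub(&fs->free_clusters_total,
                     atomic_load(&grp->free_clusters));
    return 1;
}
```

The design choice is the one the commit message names: an atomic read-modify-write on the flag makes the "first caller wins" decision without taking the group spinlock on every path.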
    • ext4: check allocation failure when duplicating "data" in ext4_remount() · 21ac738e
      Chengguang Xu committed
      There is no check for allocation failure when duplicating
      "data" in ext4_remount(). Check for failure and return
      error -ENOMEM in this case.
      Signed-off-by: Chengguang Xu <cgxu519@gmx.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Reviewed-by: Andreas Dilger <adilger@dilger.ca>
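A userspace sketch of the pattern this commit adds: duplicate the option string, and bail out with -ENOMEM if the duplication fails. The duplication function is passed in as a hook purely so that the failure path can be exercised; that hook and all `toy_*` names are assumptions for illustration, not part of ext4.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical allocator hook, so a failing allocator can be injected. */
typedef char *(*toy_dup_fn)(const char *);

char *toy_dup_ok(const char *s)
{
    size_t n = strlen(s) + 1;
    char *p = malloc(n);

    if (p)
        memcpy(p, s, n);
    return p;
}

char *toy_dup_fail(const char *s)
{
    (void)s;
    return NULL;   /* simulate an out-of-memory condition */
}

/* Returns 0 on success or -ENOMEM if the option string cannot be copied. */
int toy_remount(const char *data, toy_dup_fn dup)
{
    char *orig_data = NULL;

    assert(dup != NULL);
    if (data) {
        orig_data = dup(data);
        if (!orig_data)
            return -ENOMEM;   /* the check this commit introduces */
    }

    /* ... option parsing would go here, using orig_data for logging ... */

    free(orig_data);
    return 0;
}
```

A NULL `data` is not an error in this sketch: with no options there is nothing to duplicate, so only a failed copy of a non-NULL string returns -ENOMEM.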
    • ext4: fix warning message in ext4_enable_quotas() · 7f144fd0
      Junichi Uekawa committed
      Output the warning message before we clobber type; otherwise the
      reported type is always -1. The error message is now:
      
      [    1.519791] EXT4-fs warning (device vdb): ext4_enable_quotas:5402:
      Failed to enable quota tracking (type=0, err=-3). Please run e2fsck to fix.
      Signed-off-by: Junichi Uekawa <uekawa@google.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Reviewed-by: Andreas Dilger <adilger@dilger.ca>
    • ext4: super: extend timestamps to 40 bits · 6a0678a7
      Arnd Bergmann committed
      The inode timestamps use 34 bits in ext4, but the various timestamps in
      the superblock are limited to 32 bits. If every user accesses these as
      'unsigned', then this is good until year 2106, but it seems better to
      extend this a bit further in the process of removing the deprecated
      get_seconds() function.
      
      This adds another byte for each timestamp in the superblock, making
      them long enough to store timestamps beyond what is in the inodes,
      which seems good enough here (in ocfs2, they are already 64-bit wide,
      which is appropriate for a new layout).
      
      I did not modify e2fsprogs, which obviously needs the same change to
      actually interpret future timestamps correctly.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
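The "one extra byte per timestamp" scheme can be sketched as a 32-bit low part plus an 8-bit high part. This is an illustrative model only: `struct toy_sb_time` and the `toy_*` helpers are hypothetical, and clamping out-of-range values to the 40-bit maximum is an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical pair of on-disk fields: existing 32 bits plus one new byte. */
struct toy_sb_time {
    uint32_t lo;   /* low 32 bits of the seconds value */
    uint8_t  hi;   /* bits 32..39 of the seconds value */
};

#define TOY_TIME_MAX ((UINT64_C(1) << 40) - 1)

/* Split a seconds value into the two fields, clamping to 40 bits. */
struct toy_sb_time toy_time_encode(uint64_t seconds)
{
    struct toy_sb_time t;

    if (seconds > TOY_TIME_MAX)
        seconds = TOY_TIME_MAX;
    t.lo = (uint32_t)(seconds & 0xFFFFFFFFu);
    t.hi = (uint8_t)(seconds >> 32);
    return t;
}

/* Recombine the two fields into a 64-bit seconds value. */
uint64_t toy_time_decode(struct toy_sb_time t)
{
    uint64_t seconds = ((uint64_t)t.hi << 32) | t.lo;

    assert(seconds <= TOY_TIME_MAX);
    return seconds;
}
```

A reader that ignores the high byte sees only `lo`, which is why, as the commit message notes, e2fsprogs needs the same change to interpret timestamps past 2106 correctly.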
    • ext4: fix check to prevent initializing reserved inodes · 50122847
      Theodore Ts'o committed
      Commit 8844618d: "ext4: only look at the bg_flags field if it is
      valid" will complain if block group zero does not have the
      EXT4_BG_INODE_ZEROED flag set.  Unfortunately, this is not correct,
      since a freshly created file system has this flag cleared.  It gets set
      almost immediately after the file system is mounted read-write --- but
      the following somewhat unlikely sequence will end up triggering a
      false positive report of a corrupted file system:
      
         mkfs.ext4 /dev/vdc
         mount -o ro /dev/vdc /vdc
         mount -o remount,rw /dev/vdc
      
      Instead, when initializing the inode table for block group zero, test
      to make sure that itable_unused count is not too large, since that is
      the case that will result in some or all of the reserved inodes
      getting cleared.
      
      This fixes the failures reported by Eric Whitney when running
      generic/230 and generic/231 in the nojournal test case.
      
      Fixes: 8844618d ("ext4: only look at the bg_flags field if it is valid")
      Reported-by: Eric Whitney <enwlinux@gmail.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
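The replacement check can be sketched as a pure function: for block group zero, the number of inodes that count as "in use" must still cover the reserved inodes. This is a guess at the shape of the test, not the kernel source; the function name, its parameters, and the exact comparison against the first non-reserved inode number are assumptions.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Returns 1 if it looks safe to zero the unused tail of the group's
 * inode table, 0 if the descriptor should be treated as corrupted.
 *
 *   group            block group number
 *   inodes_per_group inodes in each group
 *   itable_unused    descriptor's count of never-used inodes
 *   first_ino        first non-reserved inode number (hypothetically 11)
 */
int toy_itable_unused_ok(uint32_t group, uint32_t inodes_per_group,
                         uint32_t itable_unused, uint32_t first_ino)
{
    assert(inodes_per_group > 0);

    /* A count larger than the group itself is never valid. */
    if (itable_unused > inodes_per_group)
        return 0;

    /*
     * In group 0, too large an "unused" count would mean that some
     * reserved inodes get cleared when the table is initialized.
     */
    if (group == 0 && inodes_per_group - itable_unused < first_ino)
        return 0;

    return 1;
}
```

Unlike a test on the EXT4_BG_INODE_ZEROED flag, this check does not fire on a freshly created file system, because there the unused count already leaves the reserved inodes alone.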
  5. 18 Jul, 2018 1 commit
  6. 09 Jul, 2018 2 commits
    • ext4: clear mmp sequence number when remounting read-only · 2dca60d9
      Theodore Ts'o committed
      Previously, when an MMP-protected file system is remounted read-only,
      the kmmpd thread would exit the next time it woke up (a few seconds
      later), without resetting the MMP sequence number back to
      EXT4_MMP_SEQ_CLEAN.
      
      Fix this by explicitly killing the MMP thread when the file system is
      remounted read-only.
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Cc: Andreas Dilger <adilger@dilger.ca>
    • ext4: fix false negatives *and* false positives in ext4_check_descriptors() · 44de022c
      Theodore Ts'o committed
      ext4_check_descriptors() was getting called before s_gdb_count was
      initialized.  So for file systems w/o the meta_bg feature, allocation
      bitmaps could overlap the block group descriptors and ext4 wouldn't
      notice.
      
      For file systems with the meta_bg feature enabled, there was a
      fencepost error which would cause ext4_check_descriptors() to
      incorrectly believe that the block allocation bitmap overlaps with the
      block group descriptor blocks, and it would reject the mount.
      
      Fix both of these problems.
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Cc: stable@vger.kernel.org
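Both failure modes in this commit come down to an inclusive range test. The sketch below assumes the simple (non-meta_bg) layout in which the descriptor blocks directly follow the superblock; the function and its parameters are hypothetical and do not mirror the kernel's variables.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Returns 1 if `block` lands on the superblock or on one of the
 * `gdb_count` group descriptor blocks that follow it.
 *
 * The occupied range is [sb_block, sb_block + gdb_count], inclusive
 * at both ends. Getting either end wrong by one is the kind of
 * fencepost error the commit message describes.
 */
int toy_overlaps_descriptors(uint64_t block, uint64_t sb_block,
                             uint64_t gdb_count)
{
    uint64_t last;

    assert(gdb_count <= UINT64_MAX - sb_block);
    last = sb_block + gdb_count;
    return block >= sb_block && block <= last;
}
```

If `gdb_count` is still zero because it has not been initialized yet, a bitmap placed at `sb_block + 1` passes the test unnoticed, which matches the false negative described for file systems without meta_bg.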
  7. 03 Jul, 2018 1 commit
  8. 18 Jun, 2018 1 commit
    • ext4: add more mount time checks of the superblock · bfe0a5f4
      Theodore Ts'o committed
      The kernel's ext4 mount-time checks were more permissive than
      e2fsprogs's libext2fs checks when opening a file system.  If the
      superblock is considered too insane for debugfs or e2fsck to operate
      on, the kernel has no business trying to mount it.
      
      This will make file system fuzzing tools work harder, but the failure
      cases that they find will be more useful and be easier to evaluate.
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Cc: stable@kernel.org
  9. 17 Jun, 2018 1 commit
  10. 14 Jun, 2018 2 commits
  11. 13 Jun, 2018 2 commits
    • ext4: add warn_on_error mount option · 327eaf73
      Theodore Ts'o committed
      This is very handy when debugging bugs in the handling of maliciously
      corrupted file systems.
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • treewide: kvmalloc() -> kvmalloc_array() · 344476e1
      Kees Cook committed
      The kvmalloc() function has a 2-factor argument form, kvmalloc_array(). This
      patch replaces cases of:
      
              kvmalloc(a * b, gfp)
      
      with:
      
              kvmalloc_array(a, b, gfp)
      
      as well as handling cases of:
      
              kvmalloc(a * b * c, gfp)
      
      with:
      
              kvmalloc(array3_size(a, b, c), gfp)
      
      as it's slightly less ugly than:
      
              kvmalloc_array(array_size(a, b), c, gfp)
      
      This does, however, attempt to ignore constant size factors like:
      
              kvmalloc(4 * 1024, gfp)
      
      though any constants defined via macros get caught up in the conversion.
      
      Any factors with a sizeof() of "unsigned char", "char", and "u8" were
      dropped, since they're redundant.
      
      The Coccinelle script used for this was:
      
      // Fix redundant parens around sizeof().
      @@
      type TYPE;
      expression THING, E;
      @@
      
      (
        kvmalloc(
      -	(sizeof(TYPE)) * E
      +	sizeof(TYPE) * E
        , ...)
      |
        kvmalloc(
      -	(sizeof(THING)) * E
      +	sizeof(THING) * E
        , ...)
      )
      
      // Drop single-byte sizes and redundant parens.
      @@
      expression COUNT;
      typedef u8;
      typedef __u8;
      @@
      
      (
        kvmalloc(
      -	sizeof(u8) * (COUNT)
      +	COUNT
        , ...)
      |
        kvmalloc(
      -	sizeof(__u8) * (COUNT)
      +	COUNT
        , ...)
      |
        kvmalloc(
      -	sizeof(char) * (COUNT)
      +	COUNT
        , ...)
      |
        kvmalloc(
      -	sizeof(unsigned char) * (COUNT)
      +	COUNT
        , ...)
      |
        kvmalloc(
      -	sizeof(u8) * COUNT
      +	COUNT
        , ...)
      |
        kvmalloc(
      -	sizeof(__u8) * COUNT
      +	COUNT
        , ...)
      |
        kvmalloc(
      -	sizeof(char) * COUNT
      +	COUNT
        , ...)
      |
        kvmalloc(
      -	sizeof(unsigned char) * COUNT
      +	COUNT
        , ...)
      )
      
      // 2-factor product with sizeof(type/expression) and identifier or constant.
      @@
      type TYPE;
      expression THING;
      identifier COUNT_ID;
      constant COUNT_CONST;
      @@
      
      (
      - kvmalloc
      + kvmalloc_array
        (
      -	sizeof(TYPE) * (COUNT_ID)
      +	COUNT_ID, sizeof(TYPE)
        , ...)
      |
      - kvmalloc
      + kvmalloc_array
        (
      -	sizeof(TYPE) * COUNT_ID
      +	COUNT_ID, sizeof(TYPE)
        , ...)
      |
      - kvmalloc
      + kvmalloc_array
        (
      -	sizeof(TYPE) * (COUNT_CONST)
      +	COUNT_CONST, sizeof(TYPE)
        , ...)
      |
      - kvmalloc
      + kvmalloc_array
        (
      -	sizeof(TYPE) * COUNT_CONST
      +	COUNT_CONST, sizeof(TYPE)
        , ...)
      |
      - kvmalloc
      + kvmalloc_array
        (
      -	sizeof(THING) * (COUNT_ID)
      +	COUNT_ID, sizeof(THING)
        , ...)
      |
      - kvmalloc
      + kvmalloc_array
        (
      -	sizeof(THING) * COUNT_ID
      +	COUNT_ID, sizeof(THING)
        , ...)
      |
      - kvmalloc
      + kvmalloc_array
        (
      -	sizeof(THING) * (COUNT_CONST)
      +	COUNT_CONST, sizeof(THING)
        , ...)
      |
      - kvmalloc
      + kvmalloc_array
        (
      -	sizeof(THING) * COUNT_CONST
      +	COUNT_CONST, sizeof(THING)
        , ...)
      )
      
      // 2-factor product, only identifiers.
      @@
      identifier SIZE, COUNT;
      @@
      
      - kvmalloc
      + kvmalloc_array
        (
      -	SIZE * COUNT
      +	COUNT, SIZE
        , ...)
      
      // 3-factor product with 1 sizeof(type) or sizeof(expression), with
      // redundant parens removed.
      @@
      expression THING;
      identifier STRIDE, COUNT;
      type TYPE;
      @@
      
      (
        kvmalloc(
      -	sizeof(TYPE) * (COUNT) * (STRIDE)
      +	array3_size(COUNT, STRIDE, sizeof(TYPE))
        , ...)
      |
        kvmalloc(
      -	sizeof(TYPE) * (COUNT) * STRIDE
      +	array3_size(COUNT, STRIDE, sizeof(TYPE))
        , ...)
      |
        kvmalloc(
      -	sizeof(TYPE) * COUNT * (STRIDE)
      +	array3_size(COUNT, STRIDE, sizeof(TYPE))
        , ...)
      |
        kvmalloc(
      -	sizeof(TYPE) * COUNT * STRIDE
      +	array3_size(COUNT, STRIDE, sizeof(TYPE))
        , ...)
      |
        kvmalloc(
      -	sizeof(THING) * (COUNT) * (STRIDE)
      +	array3_size(COUNT, STRIDE, sizeof(THING))
        , ...)
      |
        kvmalloc(
      -	sizeof(THING) * (COUNT) * STRIDE
      +	array3_size(COUNT, STRIDE, sizeof(THING))
        , ...)
      |
        kvmalloc(
      -	sizeof(THING) * COUNT * (STRIDE)
      +	array3_size(COUNT, STRIDE, sizeof(THING))
        , ...)
      |
        kvmalloc(
      -	sizeof(THING) * COUNT * STRIDE
      +	array3_size(COUNT, STRIDE, sizeof(THING))
        , ...)
      )
      
      // 3-factor product with 2 sizeof(variable), with redundant parens removed.
      @@
      expression THING1, THING2;
      identifier COUNT;
      type TYPE1, TYPE2;
      @@
      
      (
        kvmalloc(
      -	sizeof(TYPE1) * sizeof(TYPE2) * COUNT
      +	array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
        , ...)
      |
        kvmalloc(
      -	sizeof(TYPE1) * sizeof(THING2) * (COUNT)
      +	array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
        , ...)
      |
        kvmalloc(
      -	sizeof(THING1) * sizeof(THING2) * COUNT
      +	array3_size(COUNT, sizeof(THING1), sizeof(THING2))
        , ...)
      |
        kvmalloc(
      -	sizeof(THING1) * sizeof(THING2) * (COUNT)
      +	array3_size(COUNT, sizeof(THING1), sizeof(THING2))
        , ...)
      |
        kvmalloc(
      -	sizeof(TYPE1) * sizeof(THING2) * COUNT
      +	array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
        , ...)
      |
        kvmalloc(
      -	sizeof(TYPE1) * sizeof(THING2) * (COUNT)
      +	array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
        , ...)
      )
      
      // 3-factor product, only identifiers, with redundant parens removed.
      @@
      identifier STRIDE, SIZE, COUNT;
      @@
      
      (
        kvmalloc(
      -	(COUNT) * STRIDE * SIZE
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kvmalloc(
      -	COUNT * (STRIDE) * SIZE
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kvmalloc(
      -	COUNT * STRIDE * (SIZE)
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kvmalloc(
      -	(COUNT) * (STRIDE) * SIZE
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kvmalloc(
      -	COUNT * (STRIDE) * (SIZE)
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kvmalloc(
      -	(COUNT) * STRIDE * (SIZE)
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kvmalloc(
      -	(COUNT) * (STRIDE) * (SIZE)
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kvmalloc(
      -	COUNT * STRIDE * SIZE
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      )
      
      // Any remaining multi-factor products, first at least 3-factor products,
      // when they're not all constants...
      @@
      expression E1, E2, E3;
      constant C1, C2, C3;
      @@
      
      (
        kvmalloc(C1 * C2 * C3, ...)
      |
        kvmalloc(
      -	(E1) * E2 * E3
      +	array3_size(E1, E2, E3)
        , ...)
      |
        kvmalloc(
      -	(E1) * (E2) * E3
      +	array3_size(E1, E2, E3)
        , ...)
      |
        kvmalloc(
      -	(E1) * (E2) * (E3)
      +	array3_size(E1, E2, E3)
        , ...)
      |
        kvmalloc(
      -	E1 * E2 * E3
      +	array3_size(E1, E2, E3)
        , ...)
      )
      
      // And then all remaining 2 factors products when they're not all constants,
      // keeping sizeof() as the second factor argument.
      @@
      expression THING, E1, E2;
      type TYPE;
      constant C1, C2, C3;
      @@
      
      (
        kvmalloc(sizeof(THING) * C2, ...)
      |
        kvmalloc(sizeof(TYPE) * C2, ...)
      |
        kvmalloc(C1 * C2 * C3, ...)
      |
        kvmalloc(C1 * C2, ...)
      |
      - kvmalloc
      + kvmalloc_array
        (
      -	sizeof(TYPE) * (E2)
      +	E2, sizeof(TYPE)
        , ...)
      |
      - kvmalloc
      + kvmalloc_array
        (
      -	sizeof(TYPE) * E2
      +	E2, sizeof(TYPE)
        , ...)
      |
      - kvmalloc
      + kvmalloc_array
        (
      -	sizeof(THING) * (E2)
      +	E2, sizeof(THING)
        , ...)
      |
      - kvmalloc
      + kvmalloc_array
        (
      -	sizeof(THING) * E2
      +	E2, sizeof(THING)
        , ...)
      |
      - kvmalloc
      + kvmalloc_array
        (
      -	(E1) * E2
      +	E1, E2
        , ...)
      |
      - kvmalloc
      + kvmalloc_array
        (
      -	(E1) * (E2)
      +	E1, E2
        , ...)
      |
      - kvmalloc
      + kvmalloc_array
        (
      -	E1 * E2
      +	E1, E2
        , ...)
      )
      Signed-off-by: Kees Cook <keescook@chromium.org>
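The point of the two-factor form is that the multiplication is checked for overflow before any memory is allocated. A userspace sketch of that idea follows, with `malloc()` standing in for the kernel allocator; the `toy_*` names are hypothetical, and saturating to SIZE_MAX in the three-factor helper is modelled on how I understand the kernel's `array3_size()` to behave.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Allocate room for n elements of `size` bytes each, refusing the
 * request instead of silently wrapping when n * size overflows.
 */
void *toy_alloc_array(size_t n, size_t size)
{
    if (size != 0 && n > SIZE_MAX / size)
        return NULL;              /* n * size would overflow */
    return malloc(n * size);
}

/*
 * Three-factor size helper: returns a * b * c, or SIZE_MAX when the
 * product does not fit, so that a later allocation is sure to fail.
 */
size_t toy_array3_size(size_t a, size_t b, size_t c)
{
    size_t ab;

    if (b != 0 && a > SIZE_MAX / b)
        return SIZE_MAX;
    ab = a * b;
    if (c != 0 && ab > SIZE_MAX / c)
        return SIZE_MAX;
    return ab * c;
}
```

With an open-coded `a * b`, an overflowing product wraps to a small number and the allocation succeeds with a buffer that is too short; the checked forms turn that case into an allocation failure.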
  12. 31 May, 2018 2 commits
  13. 21 May, 2018 2 commits
  14. 14 May, 2018 1 commit
    • ext4: handle errors on ext4_commit_super · c89128a0
      Jaegeuk Kim committed
      When remounting ext4 from ro to rw, the transition is currently allowed
      even if ext4_commit_super() returns EIO. Worse, fs/buffer then complains
      about buffer dirty bits, for example:
      
       Call trace:
       [<ffffff9750c259dc>] mark_buffer_dirty+0x184/0x1a4
       [<ffffff9750cb398c>] __ext4_handle_dirty_super+0x4c/0xfc
       [<ffffff9750c7a9fc>] ext4_file_open+0x154/0x1c0
       [<ffffff9750bea51c>] do_dentry_open+0x114/0x2d0
       [<ffffff9750bea75c>] vfs_open+0x5c/0x94
       [<ffffff9750bf879c>] path_openat+0x668/0xfe8
       [<ffffff9750bf8088>] do_filp_open+0x74/0x120
       [<ffffff9750beac98>] do_sys_open+0x148/0x254
       [<ffffff9750beade0>] SyS_openat+0x10/0x18
       [<ffffff9750a83ab0>] el0_svc_naked+0x24/0x28
       EXT4-fs (dm-1): previous I/O error to superblock detected
       Buffer I/O error on dev dm-1, logical block 0, lost sync page write
       EXT4-fs (dm-1): re-mounted. Opts: (null)
       Buffer I/O error on dev dm-1, logical block 80, lost async page write
      Signed-off-by: Jaegeuk Kim <jaegeuk@google.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
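The commit message shows the symptom rather than the fix, so the following is only one plausible shape for "handle errors on ext4_commit_super": propagate the error and leave the file system read-only. The structure, the function-pointer hook, and all `toy_*` names are assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct toy_fs {
    int read_only;                            /* 1 while mounted ro */
    int (*commit_super)(struct toy_fs *fs);   /* hypothetical write-out hook */
};

int toy_commit_ok(struct toy_fs *fs)
{
    (void)fs;
    return 0;
}

int toy_commit_eio(struct toy_fs *fs)
{
    (void)fs;
    return -EIO;   /* simulate a failed superblock write */
}

/*
 * Attempt the ro -> rw transition. If the superblock cannot be
 * written, report the error and stay read-only rather than
 * continuing with a superblock buffer in an error state.
 */
int toy_remount_rw(struct toy_fs *fs)
{
    int err;

    assert(fs != NULL && fs->commit_super != NULL);
    err = fs->commit_super(fs);
    if (err)
        return err;

    fs->read_only = 0;
    return 0;
}
```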
  15. 12 May, 2018 1 commit
  16. 26 Apr, 2018 1 commit
  17. 30 Mar, 2018 5 commits
  18. 22 Mar, 2018 2 commits
  19. 19 Feb, 2018 1 commit
  20. 29 Jan, 2018 1 commit
  21. 20 Jan, 2018 1 commit
    • ext4: auto disable dax instead of failing mount · 24f3478d
      Dan Williams committed
      Bring the ext4 filesystem in line with xfs, which only warns and continues
      when the "-o dax" option is specified to mount and the backing device
      does not support dax. This is in preparation for removing dax support
      from devices that do not enable get_user_pages() operations on dax
      mappings. In other words 'gup' support is required and configurations
      that were using so called 'page-less' dax will be converted back to
      using the page cache.
      
      Removing the broken 'page-less' dax support is a pre-requisite for
      removing the "EXPERIMENTAL" warning when mounting a filesystem in dax
      mode.
      Reviewed-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
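The behavioural change, warn and continue instead of failing the mount, can be sketched as follows. The flag value, the function name, and the wording of the warning are all hypothetical; only the control flow reflects what the commit message describes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define TOY_MOUNT_DAX 0x1u   /* hypothetical mount-option bit */

/*
 * Reconcile a requested "-o dax" with what the backing device can do.
 * Always returns 0: an unsupported dax request is downgraded with a
 * warning instead of making the mount fail.
 */
int toy_apply_dax_option(unsigned int *mount_opts, int device_supports_dax)
{
    assert(mount_opts != NULL);

    if ((*mount_opts & TOY_MOUNT_DAX) && !device_supports_dax) {
        fprintf(stderr,
                "toy-fs: DAX unsupported by block device, turning off DAX\n");
        *mount_opts &= ~TOY_MOUNT_DAX;
    }
    return 0;
}
```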
  22. 16 Jan, 2018 1 commit
    • ext4: Define usercopy region in ext4_inode_cache slab cache · f8dd7c70
      David Windsor committed
      The ext4 symlink pathnames, stored in struct ext4_inode_info.i_data
      and therefore contained in the ext4_inode_cache slab cache, need
      to be copied to/from userspace.
      
      cache object allocation:
          fs/ext4/super.c:
              ext4_alloc_inode(...):
                  struct ext4_inode_info *ei;
                  ...
                  ei = kmem_cache_alloc(ext4_inode_cachep, GFP_NOFS);
                  ...
                  return &ei->vfs_inode;
      
          include/trace/events/ext4.h:
                  #define EXT4_I(inode) \
                      (container_of(inode, struct ext4_inode_info, vfs_inode))
      
          fs/ext4/namei.c:
              ext4_symlink(...):
                  ...
                  inode->i_link = (char *)&EXT4_I(inode)->i_data;
      
      example usage trace:
          readlink_copy+0x43/0x70
          vfs_readlink+0x62/0x110
          SyS_readlinkat+0x100/0x130
      
          fs/namei.c:
              readlink_copy(..., link):
                  ...
                  copy_to_user(..., link, len)
      
              (inlined into vfs_readlink)
              generic_readlink(dentry, ...):
                  struct inode *inode = d_inode(dentry);
                  const char *link = inode->i_link;
                  ...
                  readlink_copy(..., link);
      
      In support of usercopy hardening, this patch defines a region in the
      ext4_inode_cache slab cache in which userspace copy operations are
      allowed.
      
      This region is known as the slab cache's usercopy region. Slab caches
      can now check that each dynamically sized copy operation involving
      cache-managed memory falls entirely within the slab's usercopy region.
      
      This patch is modified from Brad Spengler/PaX Team's PAX_USERCOPY
      whitelisting code in the last public patch of grsecurity/PaX based on my
      understanding of the code. Changes or omissions from the original code are
      mine and don't reflect the original grsecurity/PaX code.
      Signed-off-by: David Windsor <dave@nullcore.net>
      [kees: adjust commit log, provide usage trace]
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Andreas Dilger <adilger.kernel@dilger.ca>
      Cc: linux-ext4@vger.kernel.org
      Signed-off-by: Kees Cook <keescook@chromium.org>
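A usercopy region is an (offset, size) window inside each cache object, and a copy is allowed only if it falls entirely inside that window. The userspace sketch below models that bounds check for an `i_data`-like field. The struct layout and all `toy_*` names are made up; in the kernel the window is, as far as I know, handed to the slab allocator through `kmem_cache_create_usercopy()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for struct ext4_inode_info; only i_data matters here. */
struct toy_inode_info {
    unsigned long flags;
    uint32_t      i_data[15];   /* symlink target storage: 60 bytes */
    long          vfs_inode_placeholder;
};

/* The window of each object that may be copied to or from userspace. */
struct toy_cache {
    size_t useroffset;
    size_t usersize;
};

struct toy_cache toy_inode_cache(void)
{
    struct toy_cache c = {
        offsetof(struct toy_inode_info, i_data),
        sizeof(((struct toy_inode_info *)0)->i_data),
    };
    return c;
}

/*
 * Returns 1 if a copy of `len` bytes starting `offset` bytes into the
 * object stays entirely within the usercopy window, 0 otherwise. The
 * comparisons are ordered so that none of them can overflow.
 */
int toy_usercopy_allowed(const struct toy_cache *c, size_t offset, size_t len)
{
    assert(c != NULL);

    if (offset < c->useroffset)
        return 0;
    if (len > c->usersize)
        return 0;
    if (offset - c->useroffset > c->usersize - len)
        return 0;
    return 1;
}
```

Restricting the window to `i_data` means a buggy or malicious length can expose at most the symlink bytes and never the neighbouring inode fields.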
  23. 12 Jan, 2018 2 commits
  24. 10 Jan, 2018 2 commits