1. 16 Oct, 2012 1 commit
  2. 10 Oct, 2012 1 commit
    • ext4: fix metadata checksum calculation for the superblock · 06db49e6
      Theodore Ts'o committed
      The function ext4_handle_dirty_super() was calculating the superblock
      checksum on the wrong block data.  As a result, when the superblock is
      modified while the file system is mounted (most commonly, when inodes
      are added or removed from the orphan list), the superblock checksum
      would be wrong.  We didn't notice because the superblock checksum *was*
      being correctly calculated in ext4_commit_super(), and this would get
      called when the file system was unmounted.  So the problem only became
      obvious if the system crashed while the file system was mounted.
      
      Fix this by removing the poorly designed function signature for
      ext4_superblock_csum_set(); if it only took a single argument, the
      pointer to a struct superblock, the ambiguity which caused this
      mistake would have been impossible.
      Reported-by: George Spelvin <linux@horizon.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@vger.kernel.org
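      The design point in the message above (a setter that takes only the
      superblock pointer cannot be handed mismatched data) can be sketched as
      follows.  This is an illustration only: struct toy_super, toy_csum() and
      the toy_csum_* helpers are hypothetical names, and the checksum is a
      simple byte hash, not the crc32c that ext4 uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the on-disk superblock; not the real ext4 layout. */
struct toy_super {
    uint32_t inodes_count;
    uint32_t last_orphan;
    uint32_t checksum;          /* covers every byte that precedes it */
};

/* Toy checksum (NOT crc32c): hashes the bytes before the checksum field. */
static uint32_t toy_csum(const struct toy_super *es)
{
    const unsigned char *p = (const unsigned char *)es;
    uint32_t sum = 0;

    for (size_t i = 0; i < offsetof(struct toy_super, checksum); i++)
        sum = sum * 31 + p[i];
    return sum;
}

/*
 * Single-argument setter: the checksum is always computed over the same
 * structure it is stored into, so the caller cannot pass a second,
 * unrelated buffer by mistake.
 */
static void toy_csum_set(struct toy_super *es)
{
    assert(es != NULL);
    es->checksum = toy_csum(es);
}

/* Returns nonzero when the stored checksum matches the current contents. */
static int toy_csum_verify(const struct toy_super *es)
{
    assert(es != NULL);
    return es->checksum == toy_csum(es);
}
```

      In this sketch, changing last_orphan (as an orphan-list update would)
      makes toy_csum_verify() fail until toy_csum_set() is called again on
      the same structure.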
  3. 29 Sep, 2012 2 commits
  4. 27 Sep, 2012 2 commits
  5. 24 Sep, 2012 1 commit
  6. 18 Sep, 2012 1 commit
  7. 14 Sep, 2012 1 commit
  8. 05 Sep, 2012 1 commit
    • ext4: grow the s_flex_groups array as needed when resizing · 117fff10
      Theodore Ts'o committed
      Previously, we allocated the s_flex_groups array to the maximum size
      that the file system could be resized.  There were two problems with
      this approach.  First, it wasted memory in the common case where the
      file system was not resized.  Secondly, once we start allowing online
      resizing using the meta_bg scheme, there is no maximum size that the
      file system can be resized.  So instead, we need to grow the
      s_flex_groups array at online resize time.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
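      A minimal user-space sketch of the grow-on-demand idea described above.
      The struct flex_table type and flex_table_grow() are hypothetical, and
      the doubling policy is an assumption made for the example, not something
      stated in the commit message; the kernel code uses its own allocators.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-flex-group table; one counter per flex group. */
struct flex_table {
    uint32_t *groups;   /* NULL while the table is empty */
    size_t    size;     /* number of allocated entries */
};

/*
 * Make sure the table can hold at least `needed` entries.
 * Existing entries are preserved and new ones are zeroed.
 * Returns 0 on success, -1 if the allocation fails (table left unchanged).
 */
static int flex_table_grow(struct flex_table *t, size_t needed)
{
    assert(t != NULL);

    if (needed <= t->size)
        return 0;               /* common case: nothing to do */

    /* Assumed policy: double until large enough, to amortize copies. */
    size_t new_size = t->size ? t->size : 1;
    while (new_size < needed)
        new_size *= 2;

    uint32_t *bigger = calloc(new_size, sizeof(*bigger));
    if (bigger == NULL)
        return -1;

    if (t->groups != NULL)
        memcpy(bigger, t->groups, t->size * sizeof(*bigger));
    free(t->groups);

    t->groups = bigger;
    t->size = new_size;
    return 0;
}
```

      Nothing is allocated up front for a file system that is never resized,
      and there is no fixed upper bound on how far the table can grow.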
  9. 19 Aug, 2012 1 commit
  10. 18 Aug, 2012 1 commit
  11. 17 Aug, 2012 4 commits
    • ext4: return an error if kset_create_and_add fails in ext4_init_fs() · 0e376b1e
      Theodore Ts'o committed
      In the very unlikely case that kset_create_and_add() fails when the
      ext4.ko module is being loaded (or during kernel startup) set err so
      that it's clear that the module load failed.
      
      https://bugzilla.kernel.org/show_bug.cgi?id=27912
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    • ext4: make the zero-out chunk size tunable · 67a5da56
      Zheng Liu committed
      Currently in ext4 the length of the zero-out chunk is set to 7 file
      system blocks.  But if an inode has uninitialized extents from using
      fallocate to preallocate space, and the workload issues many random
      writes, this can fragment the extent tree and make it grow
      unnecessarily.
      
      So create a new sysfs tunable, extent_max_zeroout_kb, which controls
      the maximum size where blocks will be zeroed out instead of creating a
      new uninitialized extent.  The default has been set to 32kb.
      
      CC: Zach Brown <zab@zabbo.net>
      CC: Andreas Dilger <adilger@dilger.ca>
      Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
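      How such a threshold might be applied can be sketched as below.  The
      function and enum names are hypothetical, and the real extent-conversion
      logic in ext4 considers more cases than this; the sketch only shows
      converting a limit in kilobytes into blocks and comparing against it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical outcome of a write into an uninitialized extent. */
enum write_action {
    ZERO_OUT_EXTENT,    /* zero the blocks and mark the extent initialized */
    SPLIT_EXTENT        /* keep the rest as a new uninitialized extent */
};

/*
 * Decide based on a tunable limit expressed in kilobytes, in the spirit
 * of extent_max_zeroout_kb.  Extents at or below the limit are zeroed out.
 */
static enum write_action choose_action(uint64_t extent_blocks,
                                       uint32_t block_size,
                                       uint32_t max_zeroout_kb)
{
    assert(block_size != 0);

    /* Convert the limit from KiB to whole file system blocks. */
    uint64_t max_blocks = ((uint64_t)max_zeroout_kb * 1024) / block_size;

    return extent_blocks <= max_blocks ? ZERO_OUT_EXTENT : SPLIT_EXTENT;
}
```

      With 4096-byte blocks, a 32 KiB limit corresponds to 8 blocks in this
      sketch, so an 8-block extent would be zeroed out and a 9-block extent
      would be split.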
    • ext4: add max_dir_size_kb mount option · df981d03
      Theodore Ts'o committed
      Very large directories can cause significant performance problems, or
      perhaps even invoke the OOM killer, if the process is running in a
      highly constrained memory environment (whether it is VMs with a small
      amount of memory or in a small memory cgroup).
      
      So it is useful, in cloud server/data center environments, to be able
      to set a filesystem-wide cap on the maximum size of a directory, to
      ensure that directories never get larger than a sane size.  We do this
      via a new mount option, max_dir_size_kb.  If there is an attempt to
      grow the directory larger than max_dir_size_kb, the system call will
      return ENOSPC instead.
      
      Google-Bug-Id: 6863013
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
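      The check described above can be sketched in a few lines.  The function
      name is hypothetical, and treating a limit of 0 as "no limit" is an
      assumption made for this example rather than something the commit
      message states.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Returns 0 if a directory may grow to new_size_bytes, or -ENOSPC if
 * that would exceed the cap given in kilobytes.
 * Assumption: max_dir_size_kb == 0 disables the cap.
 */
static int check_dir_growth(uint64_t new_size_bytes, uint32_t max_dir_size_kb)
{
    if (max_dir_size_kb == 0)
        return 0;

    if (new_size_bytes > (uint64_t)max_dir_size_kb * 1024)
        return -ENOSPC;

    return 0;
}

/* Usage example: a 1 MiB cap allows exactly 1 MiB but not one byte more. */
static void check_dir_growth_example(void)
{
    assert(check_dir_growth(1024 * 1024, 1024) == 0);
    assert(check_dir_growth(1024 * 1024 + 1, 1024) == -ENOSPC);
}
```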
    • ext4: fix long mount times on very big file systems · 0548bbb8
      Theodore Ts'o committed
      Commit 8aeb00ff85a: "ext4: fix overhead calculation used by
      ext4_statfs()" introduced an O(n**2) calculation which makes very large
      file systems take forever to mount.  Fix this with an optimization for
      non-bigalloc file systems.  (For bigalloc file systems the overhead
      needs to be set in the superblock.)
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@vger.kernel.org
  12. 06 Aug, 2012 2 commits
    • ext4: avoid kmemcheck complaint from reading uninitialized memory · 7e731bc9
      Theodore Ts'o committed
      Commit 03179fe9 introduced a kmemcheck complaint in
      ext4_da_get_block_prep() because we save and restore
      ei->i_da_metadata_calc_last_lblock even though it is left
      uninitialized in the case where i_da_metadata_calc_len is zero.
      
      This doesn't hurt anything, but silencing the kmemcheck complaint
      makes it easier for people to find real bugs.
      
      Addresses https://bugzilla.kernel.org/show_bug.cgi?id=45631
      (which is marked as a regression).
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@vger.kernel.org
    • ext4: make sure the journal sb is written in ext4_clear_journal_err() · d796c52e
      Theodore Ts'o committed
      After we set the EXT4_ERROR_FS bit in the file system
      superblock, it's not enough to call jbd2_journal_clear_err() to clear
      the error indication from journal superblock --- we need to call
      jbd2_journal_update_sb_errno() as well.  Otherwise, when the root file
      system is mounted read-only, the journal is replayed, and the error
      indicator is transferred to the superblock --- but the s_errno field
      in the jbd2 superblock is left set (since although we cleared it in
      memory, we never flushed it out to disk).
      
      This can end up confusing e2fsck.  We should make e2fsck more robust
      in this case, but the kernel shouldn't be leaving things in this
      confused state, either.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@kernel.org
  13. 31 Jul, 2012 1 commit
  14. 23 Jul, 2012 4 commits
    • ext4: weed out ext4_write_super · 4d47603d
      Artem Bityutskiy committed
      We do not depend on VFS's '->write_super()' anymore and do not need
      the 's_dirt' flag anymore, so weed out 'ext4_write_super()' and
      's_dirt'.
      Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Reviewed-by: Jan Kara <jack@suse.cz>
    • ext4: remove unnecessary superblock dirtying · 58c5873a
      Artem Bityutskiy committed
      This patch changes the 'ext4_handle_dirty_super()' function which
      submits the superblock for I/O in the following cases:
      
      1. When creating the first large file on a file system without
         EXT4_FEATURE_RO_COMPAT_LARGE_FILE feature.
      2. When re-sizing the file-system.
      3. When creating an xattr on a file-system without the
         EXT4_FEATURE_COMPAT_EXT_ATTR feature.
      
      If the file-system has journal enabled, the superblock is written via
      the journal. We do not modify this path.
      
      If the file-system has no journal, this function falls back to just
      marking the superblock as dirty using the 's_dirt' superblock
      flag. This means that it delays the actual superblock I/O submission
      by 5 seconds (default setting).  Namely, the 'sync_supers()' kernel
      thread will call 'ext4_write_super()' later and will actually submit
      the superblock for I/O.
      
      And this is the behavior this patch modifies: we stop using 's_dirt'
      and just mark the superblock buffer as dirty right away. Indeed, all 3
      cases above are extremely rare and it does not add any value to delay
      the I/O submission for them.
      
      Note: 'ext4_handle_dirty_super()' executes
      '__ext4_handle_dirty_super()' with 'now = 0'. This patch basically
      makes the 'now' argument unneeded and it will be deleted in one of the
      next patches.
      
      This patch also removes 's_dirt' condition on the unmount path because
      we never set it anymore, so we should not test it.
      
      Tested using xfstests for both journalled and non-journalled ext4.
      Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Reviewed-by: Jan Kara <jack@suse.cz>
    • ext4: make quota as first class supported feature · 7c319d32
      Aditya Kali committed
      This patch adds support for quotas as a first class feature in ext4;
      which is to say, the quota files are stored in hidden inodes as file
      system metadata, instead of as separate files visible in the file system
      directory hierarchy.
      
      It is based on the proposal at:
      https://ext4.wiki.kernel.org/index.php/Design_For_1st_Class_Quota_in_Ext4
      
      This patch introduces a new feature - EXT4_FEATURE_RO_COMPAT_QUOTA
      which, when turned on, enables quota accounting at mount time
      itself. Also, the quota inodes are stored in two additional superblock
      fields.  Some changes introduced by this patch that should be pointed
      out are:
      
      1) Two new ext4-superblock fields - s_usr_quota_inum and
         s_grp_quota_inum for storing the quota inodes in use.
      2) Default quota inodes are: inode#3 for tracking userquota and inode#4
         for tracking group quota. The superblock fields can be set to use
         other inodes as well.
      3) If the QUOTA feature and corresponding quota inodes are set in
         superblock, the quota usage tracking is turned on at mount time. On
         'quotaon' ioctl, the quota limits enforcement is turned
         on. 'quotaoff' ioctl turns off only the limits enforcement in this
         case.
      4) When QUOTA feature is in use, the quota mount options 'quota',
         'usrquota', 'grpquota' are ignored by the kernel.
      5) mke2fs or tune2fs can be used to set the QUOTA feature and initialize
         quota inodes. The default reserved inodes will not be visible to user
         as regular files.
      6) The quota-tools will need to be modified to support hidden quota
         files on ext4. E2fsprogs will also include support for creating and
         fixing quota files.
      7) Support is only for the new V2 quota file format.
      Tested-by: Jan Kara <jack@suse.cz>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Reviewed-by: Johann Lombardi <johann@whamcloud.com>
      Signed-off-by: Aditya Kali <adityakali@google.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    • quota: Move quota syncing to ->sync_fs method · a1177825
      Jan Kara committed
      Since the moment writes to quota files are using block device page cache and
      space for quota structures is reserved at the moment they are first accessed we
      have no reason to sync quota before inode writeback. In fact this order is now
      only harmful since quota information can easily change during inode writeback
      (either because conversion of delayed-allocated extents or simply because of
      allocation of new blocks for simple filesystems not using page_mkwrite).
      
      So move syncing of quota information after writeback of inodes into ->sync_fs
      method. This way we do not have to use ->quota_sync callback which is primarily
      intended for use by quotactl syscall anyway and we get rid of calling
      ->sync_fs() twice unnecessarily. We skip quota syncing for OCFS2 since it does
      proper quota journalling in all cases (unlike ext3, ext4, and reiserfs which
      also support legacy non-journalled quotas) and thus there are no dirty quota
      structures.
      
      CC: "Theodore Ts'o" <tytso@mit.edu>
      CC: Joel Becker <jlbec@evilplan.org>
      CC: reiserfs-devel@vger.kernel.org
      Acked-by: Steven Whitehouse <swhiteho@redhat.com>
      Acked-by: Dave Kleikamp <shaggy@kernel.org>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  15. 10 Jul, 2012 1 commit
    • ext4: fix overhead calculation used by ext4_statfs() · 952fc18e
      Theodore Ts'o committed
      Commit f975d6bc introduced a bug which caused ext4_statfs() to
      miscalculate the number of file system overhead blocks.  This causes
      the f_blocks field in the statfs structure to be larger than it should
      be.  This would in turn cause the "df" output to show the number of
      data blocks in the file system and the number of data blocks used to
      be larger than they should be.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@kernel.org
  16. 31 May, 2012 2 commits
  17. 29 May, 2012 4 commits
  18. 27 May, 2012 1 commit
  19. 21 May, 2012 1 commit
    • ext4: enable the 64-bit jbd2 feature based on the 64-bit ext4 feature · f32aaf2d
      Theodore Ts'o committed
      Previously we were only enabling the 64-bit jbd2 feature if the number
      of blocks in the file system was greater than 2**32-1.  The problem with
      this is that it makes it harder to test the 64-bit journal code paths
      with small file systems, since a small test file system with the
      64-bit ext4 feature enabled would use 64-bit file system on-disk data
      structures, but use a 32-bit journal.
      
      This would also cause problems when trying to do an online resize to
      grow the filesystem above the 2**32-1 boundary.  Fortunately the patch
      to support online resize for 64-bit file systems hasn't been merged
      yet, so this problem hasn't arisen in practice.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
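      The change in policy can be illustrated with two small predicates.
      Both function names are hypothetical and the sketch leaves out how the
      journal feature is actually set; it only contrasts the old condition
      (block count) with the new one (the file system's 64-bit feature flag).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Old policy: enable the 64-bit journal only above 2**32-1 blocks. */
static bool journal_64bit_old(uint64_t blocks_count, bool fs_has_64bit)
{
    (void)fs_has_64bit;                         /* ignored by the old rule */
    return blocks_count > UINT64_C(0xFFFFFFFF); /* greater than 2**32-1 */
}

/* New policy: follow the file system's 64-bit feature flag. */
static bool journal_64bit_new(uint64_t blocks_count, bool fs_has_64bit)
{
    (void)blocks_count;                         /* size no longer matters */
    return fs_has_64bit;
}

/* Usage example: a small test file system created with the 64-bit feature. */
static void journal_64bit_example(void)
{
    uint64_t small_fs_blocks = 1024;

    /* Old rule: 64-bit on-disk structures, but a 32-bit journal. */
    assert(!journal_64bit_old(small_fs_blocks, true));
    /* New rule: the journal matches the file system. */
    assert(journal_64bit_new(small_fs_blocks, true));
}
```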
  20. 16 May, 2012 2 commits
  21. 06 May, 2012 1 commit
  22. 30 Apr, 2012 4 commits
  23. 24 Apr, 2012 1 commit