1. 10 10月, 2008 1 次提交
  2. 09 9月, 2008 2 次提交
  3. 20 8月, 2008 1 次提交
    • E
      ext4: Fix bug where we return ENOSPC even though we have plenty of inodes · c001077f
      Eric Sandeen 提交于
      The find_group_flex() function starts with best_flex as the
      parent_fbg_group, which happens to have 0 inodes free.  Some of the
      flex groups searched have free blocks and free inodes, but the
      flex_freeb_ratio is < 10, so they're skipped.  Then when a group is
      compared to the current "best" flex group, it does not have more free
      blocks than "best", so it is skipped as well.
      
      This continues until no flex group with free inodes is found which has
      a proper ratio or which has more free blocks than the "best" group,
      and we're left with a "best" group that has 0 inodes free, and we
      return -ENOSPC.
      
      We fix this by changing the logic so that if the current "best" flex
      group has no inodes free, and the current one does have room, it is
      promoted to the next "best."
      Signed-off-by: NEric Sandeen <sandeen@redhat.com>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      c001077f
  4. 03 8月, 2008 2 次提交
    • E
      ext4: lock block groups when initializing · b5f10eed
      Eric Sandeen 提交于
      I noticed when filling a 1T filesystem with 4 threads using the
      fs_mark benchmark:
      
      fs_mark -d /mnt/test -D 256 -n 100000 -t 4 -s 20480 -F -S 0
      
      that I occasionally got checksum mismatch errors:
      
      EXT4-fs error (device sdb): ext4_init_inode_bitmap: Checksum bad for group 6935
      
      etc.  I'd reliably get 4-5 of them during the run.
      
      It appears that the problem is likely a race to init the bg's
      when the uninit_bg feature is enabled.
      
      With the patch below, which adds sb_bgl_locking around initialization,
      I was able to complete several runs with no errors or warnings.
      Signed-off-by: NEric Sandeen <sandeen@redhat.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      b5f10eed
    • E
      ext4: sync up block and inode bitmap reading functions · e29d1cde
      Eric Sandeen 提交于
      ext4_read_block_bitmap and read_inode_bitmap do essentially
      the same thing, and yet they are structured quite differently.
      I came across this difference while looking at doing bg locking
      during bg initialization.
      
      This patch:
      
      * removes unnecessary casts in the error messages
      * renames read_inode_bitmap to ext4_read_inode_bitmap
      * and more substantially, restructures the inode bitmap
        reading function to be more like the block bitmap counterpart.
      
      The change to the inode bitmap reader simplifies the locking
      to be applied in the next patch.
      Signed-off-by: NEric Sandeen <sandeen@redhat.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      e29d1cde
  5. 12 7月, 2008 4 次提交
  6. 30 4月, 2008 2 次提交
  7. 22 4月, 2008 1 次提交
  8. 17 4月, 2008 2 次提交
  9. 29 4月, 2008 1 次提交
  10. 26 2月, 2008 1 次提交
  11. 08 2月, 2008 1 次提交
  12. 29 1月, 2008 5 次提交
  13. 18 10月, 2007 2 次提交
    • A
      Ext4: Uninitialized Block Groups · 717d50e4
      Andreas Dilger 提交于
      In pass1 of e2fsck, every inode table in the fileystem is scanned and checked,
      regardless of whether it is in use.  This is this the most time consuming part
      of the filesystem check.  The unintialized block group feature can greatly
      reduce e2fsck time by eliminating checking of uninitialized inodes.
      
      With this feature, there is a a high water mark of used inodes for each block
      group.  Block and inode bitmaps can be uninitialized on disk via a flag in the
      group descriptor to avoid reading or scanning them at e2fsck time.  A checksum
      of each group descriptor is used to ensure that corruption in the group
      descriptor's bit flags does not cause incorrect operation.
      
      The feature is enabled through a mkfs option
      
      	mke2fs /dev/ -O uninit_groups
      
      A patch adding support for uninitialized block groups to e2fsprogs tools has
      been posted to the linux-ext4 mailing list.
      
      The patches have been stress tested with fsstress and fsx.  In performance
      tests testing e2fsck time, we have seen that e2fsck time on ext3 grows
      linearly with the total number of inodes in the filesytem.  In ext4 with the
      uninitialized block groups feature, the e2fsck time is constant, based
      solely on the number of used inodes rather than the total inode count.
      Since typical ext4 filesystems only use 1-10% of their inodes, this feature can
      greatly reduce e2fsck time for users.  With performance improvement of 2-20
      times, depending on how full the filesystem is.
      
      The attached graph shows the major improvements in e2fsck times in filesystems
      with a large total inode count, but few inodes in use.
      
      In each group descriptor if we have
      
      EXT4_BG_INODE_UNINIT set in bg_flags:
              Inode table is not initialized/used in this group. So we can skip
              the consistency check during fsck.
      EXT4_BG_BLOCK_UNINIT set in bg_flags:
              No block in the group is used. So we can skip the block bitmap
              verification for this group.
      
      We also add two new fields to group descriptor as a part of
      uninitialized group patch.
      
              __le16  bg_itable_unused;       /* Unused inodes count */
              __le16  bg_checksum;            /* crc16(sb_uuid+group+desc) */
      
      bg_itable_unused:
      
      If we have EXT4_BG_INODE_UNINIT not set in bg_flags
      then bg_itable_unused will give the offset within
      the inode table till the inodes are used. This can be
      used by fsck to skip list of inodes that are marked unused.
      
      bg_checksum:
      Now that we depend on bg_flags and bg_itable_unused to determine
      the block and inode usage, we need to make sure group descriptor
      is not corrupt. We add checksum to group descriptor to
      detect corruption. If the descriptor is found to be corrupt, we
      mark all the blocks and inodes in the group used.
      Signed-off-by: NAvantika Mathur <mathur@us.ibm.com>
      Signed-off-by: NAndreas Dilger <adilger@clusterfs.com>
      Signed-off-by: NMingming Cao <cmm@us.ibm.com>
      Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      717d50e4
    • C
      ext4: Remove (partial, never completed) fragment support · f077d0d7
      Coly Li 提交于
      Fragment support in ext2/3/4 was never implemented, and it probably will
      never be implemented.   So remove it from ext4.
      Signed-off-by: NColy Li <coyli@suse.de>
      Acked-by: NAndreas Dilger <adilger@clusterfs.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      f077d0d7
  14. 17 10月, 2007 1 次提交
  15. 18 7月, 2007 1 次提交
    • K
      ext4: Add nanosecond timestamps · ef7f3835
      Kalpak Shah 提交于
      This patch adds nanosecond timestamps for ext4. This involves adding
      *time_extra fields to the ext4_inode to extend the timestamps to
      64-bits.  Creation time is also added by this patch.
      
      These extended fields will fit into an inode if the filesystem was
      formatted with large inodes (-I 256 or larger) and there are currently
      no EAs consuming all of the available space. For new inodes we always
      reserve enough space for the kernel's known extended fields, but for
      inodes created with an old kernel this might not have been the case. So
      this patch also adds the EXT4_FEATURE_RO_COMPAT_EXTRA_ISIZE feature
      flag(ro-compat so that older kernels can't create inodes with a smaller
      extra_isize). which indicates if the fields fitting inside
      s_min_extra_isize are available or not.  If the expansion of inodes if
      unsuccessful then this feature will be disabled.  This feature is only
      enabled if requested by the sysadmin.
      
      None of the extended inode fields is critical for correct filesystem
      operation.
      Signed-off-by: NAndreas Dilger <adilger@clusterfs.com>
      Signed-off-by: NKalpak Shah <kalpak@clusterfs.com>
      Signed-off-by: NEric Sandeen <sandeen@redhat.com>
      Signed-off-by: NDave Kleikamp <shaggy@linux.vnet.ibm.com>
      Signed-off-by: NMingming Cao <cmm@us.ibm.com>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      ef7f3835
  16. 12 10月, 2006 8 次提交
  17. 27 9月, 2006 3 次提交
  18. 26 6月, 2006 1 次提交
    • M
      [PATCH] ext3_fsblk_t: filesystem, group blocks and bug fixes · 1c2bf374
      Mingming Cao 提交于
      Some of the in-kernel ext3 block variable type are treated as signed 4 bytes
      int type, thus limited ext3 filesystem to 8TB (4kblock size based).  While
      trying to fix them, it seems quite confusing in the ext3 code where some
      blocks are filesystem-wide blocks, some are group relative offsets that need
      to be signed value (as -1 has special meaning).  So it seem saner to define
      two types of physical blocks: one is filesystem wide blocks, another is
      group-relative blocks.  The following patches clarify these two types of
      blocks in the ext3 code, and fix the type bugs which limit current 32 bit ext3
      filesystem limit to 8TB.
      
      With this series of patches and the percpu counter data type changes in the mm
      tree, we are able to extend exts filesystem limit to 16TB.
      
      This work is also a pre-request for the recent >32 bit ext3 work, and makes
      the kernel to able to address 48 bit ext3 block a lot easier: Simply redefine
      ext3_fsblk_t from unsigned long to sector_t and redefine the format string for
      ext3 filesystem block corresponding.
      
      Two RFC with a series patches have been posted to ext2-devel list and have
      been reviewed and discussed:
      http://marc.theaimsgroup.com/?l=ext2-devel&m=114722190816690&w=2
      
      http://marc.theaimsgroup.com/?l=ext2-devel&m=114784919525942&w=2
      
      Patches are tested on both 32 bit machine and 64 bit machine, <8TB ext3 and
      >8TB ext3 filesystem(with the latest to be released e2fsprogs-1.39).  Tests
      includes overnight fsx, tiobench, dbench and fsstress.
      
      This patch:
      
      Defines ext3_fsblk_t and ext3_grpblk_t, and the printk format string for
      filesystem wide blocks.
      
      This patch classifies all block group relative blocks, and ext3_fsblk_t blocks
      occurs in the same function where used to be confusing before.  Also include
      kernel bug fixes for filesystem wide in-kernel block variables.  There are
      some fileystem wide blocks are treated as int/unsigned int type in the kernel
      currently, especially in ext3 block allocation and reservation code.  This
      patch fixed those bugs by converting those variables to ext3_fsblk_t(unsigned
      long) type.
      Signed-off-by: NMingming Cao <cmm@us.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      1c2bf374
  19. 11 1月, 2006 1 次提交