1. 26 2月, 2009 1 次提交
    • T
      ext4: add EXT4_IOC_ALLOC_DA_BLKS ioctl · ccd2506b
      Theodore Ts'o 提交于
      Add an ioctl which forces all of the delay allocated blocks to be
      allocated.  This also provides a function ext4_alloc_da_blocks() which
      will be used by the following commits to force files to be fully
      allocated to preserve application-expected ext3 behaviour.
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      ccd2506b
  2. 13 3月, 2009 1 次提交
    • T
      ext4: New inode/block allocation algorithms for flex_bg filesystems · a4912123
      Theodore Ts'o 提交于
      The find_group_flex() inode allocator is now only used if the
      filesystem is mounted using the "oldalloc" mount option.  It is
      replaced with the original Orlov allocator that has been updated for
      flex_bg filesystems (it should behave the same way if flex_bg is
      disabled).  The inode allocator now functions by taking into account
      each flex_bg group, instead of each block group, when deciding whether
      or not it's time to allocate a new directory into a fresh flex_bg.
      
      The block allocator has also been changed so that the first block
      group in each flex_bg is preferred for use for storing directory
      blocks.  This keeps directory blocks close together, which is good for
      speeding up e2fsck since large directories are more likely to look
      like this:
      
      debugfs:  stat /home/tytso/Maildir/cur
      Inode: 1844562   Type: directory    Mode:  0700   Flags: 0x81000
      Generation: 1132745781    Version: 0x00000000:0000ad71
      User: 15806   Group: 15806   Size: 1060864
      File ACL: 0    Directory ACL: 0
      Links: 2   Blockcount: 2072
      Fragment:  Address: 0    Number: 0    Size: 0
       ctime: 0x499c0ff4:164961f4 -- Wed Feb 18 08:41:08 2009
       atime: 0x499c0ff4:00000000 -- Wed Feb 18 08:41:08 2009
       mtime: 0x49957f51:00000000 -- Fri Feb 13 09:10:25 2009
      crtime: 0x499c0f57:00d51440 -- Wed Feb 18 08:38:31 2009
      Size of extra inode fields: 28
      BLOCKS:
      (0):7348651, (1-258):7348654-7348911
      TOTAL: 259
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      a4912123
  3. 16 2月, 2009 2 次提交
  4. 15 2月, 2009 1 次提交
    • W
      ext4: New rec_len encoding for very large blocksizes · 3d0518f4
      Wei Yongjun 提交于
      The rec_len field in the directory entry is 16 bits, so to encode
      blocksizes larger than 64k becomes problematic.  This patch allows us
      to supprot block sizes up to 256k, by using the low 2 bits to extend
      the range of rec_len to 2**18-1 (since valid rec_len sizes must be a
      multiple of 4).  We use the convention that a rec_len of 0 or 65535
      means the filesystem block size, for compatibility with older kernels.
      
      It's unlikely we'll see VM pages of up to 256k, but at some point we
      might find that the Linux VM has been enhanced to support filesystem
      block sizes > than the VM page size, at which point it might be useful
      for some applications to allow very large filesystem block sizes.
      Signed-off-by: NWei Yongjun <yjwei@cn.fujitsu.com>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      3d0518f4
  5. 07 2月, 2009 1 次提交
  6. 26 3月, 2009 1 次提交
  7. 10 2月, 2009 1 次提交
    • W
      ext4: Fix to read empty directory blocks correctly in 64k · 7be2baaa
      Wei Yongjun 提交于
      The rec_len field in the directory entry is 16 bits, so there was a
      problem representing rec_len for filesystems with a 64k block size in
      the case where the directory entry takes the entire 64k block.
      Unfortunately, there were two schemes that were proposed; one where
      all zeros meant 65536 and one where all ones (65535) meant 65536.
      E2fsprogs used 0, whereas the kernel used 65535.  Oops.  Fortunately
      this case happens extremely rarely, with the most common case being
      the lost+found directory, created by mke2fs.
      
      So we will be liberal in what we accept, and accept both encodings,
      but we will continue to encode 65536 as 65535.  This will require a
      change in e2fsprogs, but with fortunately ext4 filesystems normally
      have the dir_index feature enabled, which precludes having a
      completely empty directory block.
      Signed-off-by: NWei Yongjun <yjwei@cn.fujitsu.com>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      7be2baaa
  8. 18 1月, 2009 1 次提交
    • T
      ext4: only use i_size_high for regular files · 06a279d6
      Theodore Ts'o 提交于
      Directories are not allowed to be bigger than 2GB, so don't use
      i_size_high for anything other than regular files.  E2fsck should
      complain about these inodes, but the simplest thing to do for the
      kernel is to only use i_size_high for regular files.
      
      This prevents an intentially corrupted filesystem from causing the
      kernel to burn a huge amount of CPU and issuing error messages such
      as:
      
      EXT4-fs warning (device loop0): ext4_block_to_path: block 135090028 > max
      
      Thanks to David Maciejak from Fortinet's FortiGuard Global Security
      Research Team for reporting this issue.
      
      http://bugzilla.kernel.org/show_bug.cgi?id=12375Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@kernel.org
      06a279d6
  9. 07 1月, 2009 2 次提交
  10. 06 1月, 2009 5 次提交
  11. 23 11月, 2008 1 次提交
  12. 05 11月, 2008 1 次提交
    • T
      ext4: Change unsigned long to unsigned int · 498e5f24
      Theodore Ts'o 提交于
      Convert the unsigned longs that are most responsible for bloating the
      stack usage on 64-bit systems.
      
      Nearly all places in the ext3/4 code which uses "unsigned long" is
      probably a bug, since on 32-bit systems a ulong a 32-bits, which means
      we are wasting stack space on 64-bit systems.
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      498e5f24
  13. 06 1月, 2009 1 次提交
  14. 04 1月, 2009 1 次提交
  15. 08 12月, 2008 1 次提交
    • T
      ext4: remove ext4_new_meta_block() · cfe82c85
      Theodore Ts'o 提交于
      There were only two one callers of the function ext4_new_meta_block(),
      which just a very simpler wrapper function around
      ext4_new_meta_blocks().  Change those two functions to call
      ext4_new_meta_blocks() directly, to save code and stack space usage.
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      cfe82c85
  16. 02 1月, 2009 1 次提交
  17. 29 10月, 2008 1 次提交
  18. 28 10月, 2008 1 次提交
  19. 17 10月, 2008 1 次提交
  20. 11 10月, 2008 1 次提交
    • H
      ext4: add an option to control error handling on file data · 5bf5683a
      Hidehiro Kawai 提交于
      If the journal doesn't abort when it gets an IO error in file data
      blocks, the file data corruption will spread silently.  Because
      most of applications and commands do buffered writes without fsync(),
      they don't notice the IO error.  It's scary for mission critical
      systems.  On the other hand, if the journal aborts whenever it gets
      an IO error in file data blocks, the system will easily become
      inoperable.  So this patch introduces a filesystem option to
      determine whether it aborts the journal or just call printk() when
      it gets an IO error in file data.
      
      If you mount an ext4 fs with data_err=abort option, it aborts on file
      data write error.  If you mount it with data_err=ignore, it doesn't
      abort, just call printk().  data_err=ignore is the default.
      
      Here is the corresponding patch of the ext3 version:
      http://kerneltrap.org/mailarchive/linux-kernel/2008/9/9/3239374Signed-off-by: NHidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      5bf5683a
  21. 07 10月, 2008 1 次提交
  22. 10 10月, 2008 2 次提交
  23. 24 9月, 2008 1 次提交
  24. 23 9月, 2008 1 次提交
  25. 14 9月, 2008 1 次提交
    • T
      ext4: Renumber EXT4_IOC_MIGRATE · 8eea80d5
      Theodore Ts'o 提交于
      Pick an ioctl number for EXT4_IOC_MIGRATE that won't conflict with
      other ext4 ioctl's.  Since there haven't been any major userspace
      users of this ioctl, we can afford to change this now, to avoid
      potential problems later.
      
      Also, reorder the ioctl numbers in ext4.h to avoid this sort of
      mistake in the future.
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      8eea80d5
  26. 09 10月, 2008 1 次提交
  27. 14 9月, 2008 2 次提交
  28. 09 10月, 2008 2 次提交
  29. 09 9月, 2008 1 次提交
  30. 20 8月, 2008 2 次提交
    • M
      ext4: journal credits reservation fixes for DIO, fallocate · f3bd1f3f
      Mingming Cao 提交于
      DIO and fallocate credit calculation is different than writepage, as
      they do start a new journal right for each call to ext4_get_blocks_wrap().
      This patch uses the helper function in DIO and fallocate case, passing
      a flag indicating that the modified data are contigous thus could account
      less indirect/index blocks.
      
      This patch also fixed the journal credit reservation for direct I/O
      (DIO).  Previously the estimated credits for DIO only was calculated for
      non-extent files, which was not enough if the file is extent-based.
      
      Also fixed was fallocate double-counting credits for modifying the the
      superblock.
      Signed-off-by: NMingming Cao <cmm@us.ibm.com>
      Reviewed-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      f3bd1f3f
    • M
      ext4: journal credits calulation cleanup and fix for non-extent writepage · a02908f1
      Mingming Cao 提交于
      When considering how many journal credits are needed for modifying a
      chunk of data, we need to account for the super block, inode block,
      quota blocks and xattr block, indirect/index blocks, also, group bitmap
      and group descriptor blocks for new allocation (including data and
      indirect/index blocks). There are many places in ext4 do the calculation
      on their own and often missed one or two meta blocks, and often they
      assume single block allocation, and did not considering the multile
      chunk of allocation case.
      
      This patch is trying to cleanup current journal credit code, provides
      some common helper funtion to calculate the journal credits, to be used
      for writepage, writepages, DIO, fallocate, migration, defrag, and for
      both nonextent and extent files.
      
      This patch modified the writepage/write_begin credit caculation for
      nonextent files, to use the new helper function. It also fixed the
      problem that writepage on nonextent files did not consider the case
      blocksize <pagesize, thus could possibelly need multiple block
      allocation in a single transaction.
      Signed-off-by: NMingming Cao <cmm@us.ibm.com>
      Reviewed-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      a02908f1