1. 13 Mar, 2009 1 commit
  2. 05 Mar, 2009 1 commit
    • ext4: fix ext4_free_inode() vs. ext4_claim_inode() race · 7ce9d5d1
      Eric Sandeen committed
      I was seeing fsck errors on inode bitmaps after a 4-thread
      dbench run on a 4-CPU machine:
      
      Inode bitmap differences: -50736 -(50752--50753) etc...
      
      I believe that this is because ext4_free_inode() uses atomic
      bitops, and although ext4_new_inode() *used* to also use atomic
      bitops for synchronization, commit 39341867 changed this to use
      the sb_bgl_lock, so that we could also synchronize against
      read_inode_bitmap and initialization of uninit inode tables.
      
      However, that change left ext4_free_inode using atomic bitops,
      which I think leaves no synchronization between setting and
      clearing bits in the inode bitmap.
      
      The below patch fixes it for me, although I wonder if we're 
      getting at all heavy-handed with this spinlock...
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
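The fix described above amounts to making the free path take the same per-group lock as the allocation path. What follows is a minimal user-space sketch of that idea, not the kernel patch itself. It assumes a C11 `atomic_flag` spinlock standing in for `sb_bgl_lock`, and the names `group_bitmap`, `claim_bit` and `free_bit` are hypothetical.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <string.h>

#define GROUP_BITS 64  /* illustrative size, not an ext4 constant */

/* Hypothetical stand-in for one block group's inode bitmap plus its lock. */
struct group_bitmap {
    atomic_flag lock;                          /* plays the role of sb_bgl_lock */
    unsigned char bits[GROUP_BITS / CHAR_BIT];
};

static void group_bitmap_init(struct group_bitmap *g)
{
    atomic_flag_clear(&g->lock);
    memset(g->bits, 0, sizeof g->bits);
}

static void group_lock(struct group_bitmap *g)
{
    while (atomic_flag_test_and_set_explicit(&g->lock, memory_order_acquire))
        ;  /* spin until the holder releases the lock */
}

static void group_unlock(struct group_bitmap *g)
{
    atomic_flag_clear_explicit(&g->lock, memory_order_release);
}

/* Allocation path: set bit nr under the lock; true if the bit was free. */
static bool claim_bit(struct group_bitmap *g, unsigned int nr)
{
    unsigned char mask = (unsigned char)(1u << (nr % CHAR_BIT));
    bool claimed = false;

    group_lock(g);
    if (!(g->bits[nr / CHAR_BIT] & mask)) {
        g->bits[nr / CHAR_BIT] |= mask;
        claimed = true;
    }
    group_unlock(g);
    return claimed;
}

/* Free path: clear bit nr under the SAME lock; true if the bit was set. */
static bool free_bit(struct group_bitmap *g, unsigned int nr)
{
    unsigned char mask = (unsigned char)(1u << (nr % CHAR_BIT));
    bool was_set;

    group_lock(g);
    was_set = (g->bits[nr / CHAR_BIT] & mask) != 0;
    g->bits[nr / CHAR_BIT] &= (unsigned char)~mask;
    group_unlock(g);
    return was_set;
}
```

A `false` return from `free_bit()` corresponds to the "bit already cleared" condition that the kernel reports as an error.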
  3. 22 Feb, 2009 1 commit
    • ext4: Add fallback for find_group_flex · 05bf9e83
      Theodore Ts'o committed
      This is a workaround for find_group_flex() which badly needs to be
      replaced.  One of its problems (besides ignoring the Orlov algorithm)
      is that it is a bit hyperactive about returning failure under
      suspicious circumstances.  This can lead to spurious ENOSPC failures
      even when there are inodes still available.
      
      Work around this for now by retrying the search using
      find_group_other() if find_group_flex() returns -1.  If
      find_group_other() succeeds when find_group_flex() has failed, log a
      warning message.
      
      A better block/inode allocator that will fix this problem for real has
      been queued up for the next merge window.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
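The retry described above can be sketched as a small wrapper: call the strict finder first, and only if it returns -1 try the simpler one, warning when the fallback succeeds. This is an illustration under assumptions, not the kernel code. The `groups` structure, the two stand-in finders and `find_group_with_fallback` are all hypothetical names.

```c
#include <stdio.h>

/* Hypothetical view of the filesystem: free-inode count per group. */
struct groups {
    const int *free_inodes;
    int ngroups;
};

/* A finder returns a group number, or -1 when it finds nothing. */
typedef int (*group_finder)(const struct groups *g);

/* Stand-in for an overly strict finder: it always gives up. */
static int strict_finder(const struct groups *g)
{
    (void)g;
    return -1;
}

/* Stand-in for a simple linear scan: first group with a free inode. */
static int linear_finder(const struct groups *g)
{
    for (int i = 0; i < g->ngroups; i++)
        if (g->free_inodes[i] > 0)
            return i;
    return -1;
}

/*
 * Try `primary`; if it fails, retry with `fallback`. *used_fallback is set
 * to 1 only when the fallback succeeded after the primary failed.
 */
static int find_group_with_fallback(group_finder primary, group_finder fallback,
                                    const struct groups *g, int *used_fallback)
{
    int group = primary(g);

    *used_fallback = 0;
    if (group == -1) {
        group = fallback(g);
        if (group != -1) {
            *used_fallback = 1;
            fprintf(stderr, "warning: primary search failed, "
                            "fallback found group %d\n", group);
        }
    }
    return group;
}
```

Logging only when the fallback succeeds keeps the warning tied to the spurious-failure case. If both finders fail, the caller still sees -1.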
  4. 07 Jan, 2009 1 commit
    • ext4: Remove "extents" mount option · 83982b6f
      Theodore Ts'o committed
      This mount option is largely superfluous, and in fact the way it was
      implemented was buggy; if a filesystem which did not have the extents
      feature flag was mounted -o extents, the filesystem would attempt to
      create and use extents-based files even though the extents feature flag
      was not enabled.  The simplest thing to do is to nuke the mount option
      entirely.  It's not all that useful to force the non-creation of new
      extent-based files if the filesystem can support it.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
  5. 04 Jan, 2009 1 commit
  6. 06 Jan, 2009 3 commits
    • ext4: mark the blocks/inode bitmap beyond end of group as used · 648f5879
      Aneesh Kumar K.V committed
      We need to mark the bits of the block/inode bitmap that lie beyond
      the end of the group as used (set to '1').
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@kernel.org
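A bitmap block usually has room for more bits than the group has blocks or inodes. Setting the surplus bits to 1 means an allocator scanning for a zero bit can never return a position past the end of the group. A minimal sketch of that padding step follows. The helper names are hypothetical, and the least-significant-bit-first order within each byte is an assumption of this example.

```c
#include <limits.h>
#include <stddef.h>

/*
 * Set every bit in [start_bit, end_bit) to 1. Here start_bit would be the
 * number of valid entries in the group and end_bit the bitmap's capacity.
 * Bit 0 is the least significant bit of byte 0 (assumption of this sketch).
 */
static void mark_bits_used(unsigned char *bitmap, size_t start_bit,
                           size_t end_bit)
{
    for (size_t i = start_bit; i < end_bit; i++)
        bitmap[i / CHAR_BIT] |= (unsigned char)(1u << (i % CHAR_BIT));
}

/* Return 1 if bit nr is set, 0 otherwise. */
static int bit_is_set(const unsigned char *bitmap, size_t nr)
{
    return (bitmap[nr / CHAR_BIT] >> (nr % CHAR_BIT)) & 1;
}
```

For example, with 10 valid entries in a 16-bit bitmap, `mark_bits_used(bitmap, 10, 16)` leaves bits 0 to 9 untouched and sets bits 10 to 15.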
    • ext4: Use new buffer_head flag to check uninit group bitmaps initialization · 2ccb5fb9
      Aneesh Kumar K.V committed
      For an uninit block group, the on-disk bitmap is not initialized.
      That implies we cannot depend on the uptodate flag of the bitmap
      buffer_head to determine bitmap validity.  Use a new buffer_head flag
      which is set after we properly initialize the bitmap.  This also
      prevents (re-)initializing the uninit group bitmap every time we call
      ext4_read_block_bitmap().
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@kernel.org
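The idea of a separate "bitmap is valid" flag can be sketched in user space as follows. The `bitmap_buf` structure and its field names are hypothetical stand-ins for a buffer_head and its flags, and the initial bitmap contents are purely illustrative.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for a bitmap buffer and its state flags. */
struct bitmap_buf {
    bool uptodate;      /* the block was read from disk by someone */
    bool bitmap_valid;  /* the in-memory bitmap has been initialized */
    unsigned char data[8];
    int init_calls;     /* counts initializations, for illustration only */
};

/*
 * Prepare the bitmap of an uninit group. The decision is based only on
 * bitmap_valid. `uptodate` is deliberately ignored, because another reader
 * may have pulled the uninitialized on-disk block into memory.
 */
static void get_uninit_bitmap(struct bitmap_buf *b)
{
    if (b->bitmap_valid)
        return;              /* already set up: keep later allocations */

    memset(b->data, 0, sizeof b->data);
    b->data[0] = 0x01;       /* illustrative: one reserved entry */
    b->bitmap_valid = true;
    b->init_calls++;
}
```

Calling `get_uninit_bitmap()` a second time is a no-op, so bits set by allocations between the two calls are preserved.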
    • ext4: Fix the race between read_inode_bitmap() and ext4_new_inode() · 39341867
      Aneesh Kumar K.V committed
      We need to make sure we update the inode bitmap and clear
      EXT4_BG_INODE_UNINIT flag with sb_bgl_lock held, since
      ext4_read_inode_bitmap() looks at EXT4_BG_INODE_UNINIT to decide
      whether to initialize the inode bitmap each time it is called.
      (introduced by commit c806e68f.)
      
      ext4_read_inode_bitmap does:
      
      spin_lock(sb_bgl_lock(EXT4_SB(sb), block_group));
      if (desc->bg_flags & cpu_to_le16(EXT4_BG_INODE_UNINIT)) {
      	ext4_init_inode_bitmap(sb, bh, block_group, desc);
      
      and ext4_new_inode does:
      
      if (!ext4_set_bit_atomic(sb_bgl_lock(sbi, group),
                         ino, inode_bitmap_bh->b_data))
      		   ......
      		   ...
      spin_lock(sb_bgl_lock(sbi, group));
      
      gdp->bg_flags &= cpu_to_le16(~EXT4_BG_INODE_UNINIT);
      
      i.e., on allocation we update the bitmap, then take the sb_bgl_lock
      and clear the EXT4_BG_INODE_UNINIT flag. As a result, a parallel
      ext4_read_inode_bitmap can zero out the bitmap between the
      ext4_set_bit_atomic call above and the spin_lock(sb_bgl_lock(...)).
      
      The race results in the following user-visible errors:
      EXT4-fs error (device sdb1): ext4_free_inode: bit already cleared for inode 168449
      EXT4-fs warning (device sdb1): ext4_unlink: Deleting nonexistent file ...
      EXT4-fs warning (device sdb1): ext4_rmdir: empty directory has too many links ...
      # ls -al /mnt/tmp/f/p369/d3/d6/d39/db2/dee/d10f/d3f/l71
      ls: /mnt/tmp/f/p369/d3/d6/d39/db2/dee/d10f/d3f/l71: Stale NFS file handle
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@kernel.org
  7. 04 Jan, 2009 1 commit
  8. 06 Jan, 2009 1 commit
  9. 01 Jan, 2009 1 commit
  10. 14 Nov, 2008 1 commit
  11. 07 Nov, 2008 1 commit
  12. 06 Jan, 2009 2 commits
  13. 07 Jan, 2009 1 commit
    • ext4: Allow ext4 to run without a journal · 0390131b
      Frank Mayhar committed
      A few weeks ago I posted a patch for discussion that allowed ext4 to run
      without a journal.  Since that time I've integrated the excellent
      comments from Andreas and fixed several serious bugs.  We're currently
      running with this patch and generating some performance numbers against
      both ext2 (with backported reservations code) and ext4 with and without
      a journal.  It just so happens that running without a journal is
      slightly faster for most everything.
      
      We ran
      	iozone -T -t 4 -s 2g -r 256k -T -I -i0 -i1 -i2
      
      which creates 4 threads, each of which creates a 2G file and reads
      and writes it with a buffer size of 256K, using O_DIRECT for all
      file opens to bypass the page cache.  Results:
      
                           ext2        ext4, default   ext4, no journal
        initial writes   13.0 MB/s        15.4 MB/s          15.7 MB/s
        rewrites         13.1 MB/s        15.6 MB/s          15.9 MB/s
        reads            15.2 MB/s        16.9 MB/s          17.2 MB/s
        re-reads         15.3 MB/s        16.9 MB/s          17.2 MB/s
        random readers    5.6 MB/s         5.6 MB/s           5.7 MB/s
        random writers    5.1 MB/s         5.3 MB/s           5.4 MB/s 
      
      So it seems that, so far, this was a useful exercise.
      Signed-off-by: Frank Mayhar <fmayhar@google.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
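To put "slightly faster" into numbers, the relative gain between two columns of the table can be computed with a small helper. The sketch below is illustrative only. The function name is hypothetical, and the sample array simply copies the "initial writes" row from the results above.

```c
/* Relative gain of `value` over `baseline`, as a percentage. */
static double relative_gain_percent(double baseline, double value)
{
    return (value - baseline) / baseline * 100.0;
}

/* "initial writes" row in MB/s: ext2, ext4 default, ext4 without journal. */
static const double initial_writes[3] = { 13.0, 15.4, 15.7 };
```

For the initial-writes row, `relative_gain_percent(15.4, 15.7)` comes out at roughly 2%, which is the gain from dropping the journal, while ext4 with its default journal is roughly 18% ahead of ext2.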
  14. 10 Oct, 2008 2 commits
    • ext4: fix initialization of UNINIT bitmap blocks · c806e68f
      Frederic Bohe committed
      This fixes a bug which caused on-line resizing of filesystems with a
      1k blocksize to fail.  The root cause of this bug was the fact that if
      an uninitialized bitmap block gets read in by userspace (which
      e2fsprogs does try to avoid, but can happen when the blocksize is less
      than the pagesize and an adjacent block is read into memory)
      ext4_read_block_bitmap() was erroneously depending on the buffer
      uptodate flag to decide whether it needed to initialize the bitmap
      block in memory --- i.e., to set the standard set of blocks in use by
      a block group (superblock, bitmaps, inode table, etc.).  Essentially,
      ext4_read_block_bitmap() assumed it was the only routine that might
      try to read a block containing a block bitmap, which is simply not
      true.  
      
      To fix this, ext4_read_block_bitmap() and ext4_read_inode_bitmap()
      must always initialize uninitialized bitmap blocks.  Once a block or
      inode is allocated out of that bitmap, it will be marked as
      initialized in the block group descriptor, so in general this won't
      result in any unnecessary extra work.
      Signed-off-by: Frederic Bohe <frederic.bohe@bull.net>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    • ext4: Remove old legacy block allocator · c2ea3fde
      Theodore Ts'o committed
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
  15. 09 Sep, 2008 2 commits
  16. 20 Aug, 2008 1 commit
    • ext4: Fix bug where we return ENOSPC even though we have plenty of inodes · c001077f
      Eric Sandeen committed
      The find_group_flex() function starts with best_flex as the
      parent_fbg_group, which happens to have 0 inodes free.  Some of the
      flex groups searched have free blocks and free inodes, but the
      flex_freeb_ratio is < 10, so they're skipped.  Then when a group is
      compared to the current "best" flex group, it does not have more free
      blocks than "best", so it is skipped as well.
      
      This continues until no flex group with free inodes is found which has
      a proper ratio or which has more free blocks than the "best" group,
      and we're left with a "best" group that has 0 inodes free, and we
      return -ENOSPC.
      
      We fix this by changing the logic so that if the current "best" flex
      group has no inodes free, and the current one does have room, it is
      promoted to the next "best."
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
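The corrected selection rule can be illustrated with a simplified scan. This is a sketch under assumptions, not find_group_flex() itself. It leaves out the flex_freeb_ratio check entirely, and the `flex_group` structure and `pick_flex_group` function are hypothetical names.

```c
/* Hypothetical per-flex-group counters. */
struct flex_group {
    int free_inodes;
    int free_blocks;
};

/*
 * Start with `start` as the best candidate. A group that has free inodes
 * replaces the current best if the best has no free inodes at all (the fix
 * described above), or if the group has more free blocks than the best.
 * Returns the chosen index, or -1 if no group has a free inode.
 */
static int pick_flex_group(const struct flex_group *fg, int n, int start)
{
    int best = start;

    for (int i = 0; i < n; i++) {
        if (i == best || fg[i].free_inodes == 0)
            continue;
        if (fg[best].free_inodes == 0 ||
            fg[i].free_blocks > fg[best].free_blocks)
            best = i;
    }
    return fg[best].free_inodes > 0 ? best : -1;
}
```

Without the `fg[best].free_inodes == 0` test, a starting group with no free inodes but many free blocks would never be displaced, which is the failure mode the commit message describes.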
  17. 03 Aug, 2008 2 commits
    • ext4: lock block groups when initializing · b5f10eed
      Eric Sandeen committed
      I noticed when filling a 1T filesystem with 4 threads using the
      fs_mark benchmark:
      
      fs_mark -d /mnt/test -D 256 -n 100000 -t 4 -s 20480 -F -S 0
      
      that I occasionally got checksum mismatch errors:
      
      EXT4-fs error (device sdb): ext4_init_inode_bitmap: Checksum bad for group 6935
      
      etc.  I'd reliably get 4-5 of them during the run.
      
      It appears that the problem is likely a race to init the bg's
      when the uninit_bg feature is enabled.
      
      With the patch below, which adds sb_bgl_locking around initialization,
      I was able to complete several runs with no errors or warnings.
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • ext4: sync up block and inode bitmap reading functions · e29d1cde
      Eric Sandeen committed
      ext4_read_block_bitmap and read_inode_bitmap do essentially
      the same thing, and yet they are structured quite differently.
      I came across this difference while looking at doing bg locking
      during bg initialization.
      
      This patch:
      
      * removes unnecessary casts in the error messages
      * renames read_inode_bitmap to ext4_read_inode_bitmap
      * and more substantially, restructures the inode bitmap
        reading function to be more like the block bitmap counterpart.
      
      The change to the inode bitmap reader simplifies the locking
      to be applied in the next patch.
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
  18. 12 Jul, 2008 4 commits
  19. 30 Apr, 2008 2 commits
  20. 22 Apr, 2008 1 commit
  21. 17 Apr, 2008 2 commits
  22. 29 Apr, 2008 1 commit
  23. 26 Feb, 2008 1 commit
  24. 08 Feb, 2008 1 commit
  25. 29 Jan, 2008 5 commits