1. 17 May 2010, 1 commit
  2. 16 May 2010, 1 commit
  3. 24 Mar 2010, 1 commit
  4. 05 Mar 2010, 3 commits
    • dquot: cleanup dquot initialize routine · 871a2931
      Committed by Christoph Hellwig
      Get rid of the initialize dquot operation - it is now always called from
      the filesystem and if a filesystem really needs its own (which none
      currently does) it can just call into its own routine directly.
      
      Rename the now static low-level dquot_initialize helper to __dquot_initialize
      and vfs_dq_init to dquot_initialize to have a consistent namespace.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jan Kara <jack@suse.cz>
    • dquot: cleanup dquot drop routine · 9f754758
      Committed by Christoph Hellwig
      Get rid of the drop dquot operation - it is now always called from
      the filesystem and if a filesystem really needs its own (which none
      currently does) it can just call into its own routine directly.
      
      Rename the now static low-level dquot_drop helper to __dquot_drop
      and vfs_dq_drop to dquot_drop to have a consistent namespace.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jan Kara <jack@suse.cz>
    • dquot: cleanup inode allocation / freeing routines · 63936dda
      Committed by Christoph Hellwig
      Get rid of the alloc_inode and free_inode dquot operations - they are
      always called from the filesystem and if a filesystem really needs
      its own (which none currently does) it can just call into its
      own routine directly.
      
      Also get rid of the vfs_dq_alloc/vfs_dq_free wrappers and always
      call the low-level dquot_alloc_inode / dquot_free_inode routines
      directly, which now lose the number argument, which was always 1.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jan Kara <jack@suse.cz>
  5. 17 Feb 2010, 1 commit
    • ext4: Fix BUG_ON at fs/buffer.c:652 in no journal mode · 73b50c1c
      Committed by Curt Wohlgemuth
      Calls to ext4_handle_dirty_metadata should only pass in an inode
      pointer for inode-specific metadata, and not for shared metadata
      blocks such as inode table blocks, block group descriptors, the
      superblock, etc.
      
      The BUG_ON can get tripped when updating a special device (such as a
      block device) that is opened (so that i_mapping is set in
      fs/block_dev.c) and the file system is mounted in no journal mode.
      
      Addresses-Google-Bug: #2404870
      Signed-off-by: Curt Wohlgemuth <curtw@google.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
  6. 16 Feb 2010, 1 commit
  7. 25 Jan 2010, 1 commit
    • ext4: Use bitops to read/modify EXT4_I(inode)->i_state · 19f5fb7a
      Committed by Theodore Ts'o
      At several places we modify EXT4_I(inode)->i_state without holding
      i_mutex (ext4_release_file, ext4_bmap, ext4_journalled_writepage,
      ext4_do_update_inode, ...). These modifications are racy and we can
      lose updates to i_state. So convert handling of i_state to use bitops
      which are atomic.
      
      Cc: Jan Kara <jack@suse.cz>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
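The idea of replacing plain read-modify-write updates with atomic bit operations can be sketched in user space with C11 atomics. This is only an analogue under stated assumptions: the kernel uses its own bitops, and `struct demo_inode`, the `STATE_*` bit numbers, and the `demo_*` helpers are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical state bits, named for illustration only. */
enum { STATE_NEW = 0, STATE_XATTR = 1 };

/* Hypothetical stand-in for the per-inode state word. */
struct demo_inode {
    atomic_ulong state_flags;
};

/* Atomically set one bit; concurrent callers cannot lose each other's update. */
static void demo_set_state(struct demo_inode *ei, unsigned bit)
{
    assert(bit < 8 * sizeof(unsigned long));
    atomic_fetch_or(&ei->state_flags, 1UL << bit);
}

/* Atomically clear one bit. */
static void demo_clear_state(struct demo_inode *ei, unsigned bit)
{
    assert(bit < 8 * sizeof(unsigned long));
    atomic_fetch_and(&ei->state_flags, ~(1UL << bit));
}

/* Read one bit from an atomic snapshot of the state word. */
static bool demo_test_state(struct demo_inode *ei, unsigned bit)
{
    assert(bit < 8 * sizeof(unsigned long));
    return (atomic_load(&ei->state_flags) >> bit) & 1UL;
}
```

A non-atomic `state |= flag` compiles to a separate load, OR, and store, so two CPUs doing it at once can overwrite each other; the fetch-or and fetch-and forms above make each update a single indivisible step.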
  8. 28 Jul 2009, 1 commit
  9. 13 Jun 2009, 2 commits
    • ext4: teach the inode allocator to use a goal inode number · 11013911
      Committed by Andreas Dilger
      Enhance the inode allocator to take a goal inode number as a
      parameter; if it is specified, it takes precedence over Orlov or
      parent directory inode allocation algorithms.
      
      The extents migration function uses the goal inode number so that the
      extent trees allocated by the migration function use the correct flex_bg.
      In the future, the goal inode functionality will also be used to
      allocate an adjacent inode for the extended attributes.
      
      Also, for testing purposes the goal inode number can be specified via
      /sys/fs/{dev}/inode_goal.  This can be useful for testing inode
      allocation beyond 2^32 blocks on very large filesystems.
      Signed-off-by: Andreas Dilger <adilger@sun.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
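How a goal inode number can take precedence over another group-selection policy might be sketched as follows. This is a simplified illustration, not the kernel's code: `demo_pick_group` and its parameters are hypothetical, a goal of 0 is assumed to mean "no goal", and inode numbers are treated as 1-based within a filesystem of `ngroups * inodes_per_group` inodes.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return the block group to start searching in.
 * A valid goal inode number wins; otherwise use the group chosen by
 * whatever policy the caller already ran (passed in as fallback_group).
 */
static uint32_t demo_pick_group(uint32_t goal_ino, uint32_t inodes_per_group,
                                uint32_t ngroups, uint32_t fallback_group)
{
    uint64_t total;

    assert(inodes_per_group > 0 && ngroups > 0);
    total = (uint64_t)inodes_per_group * ngroups;

    /* Inode numbers start at 1, so inode N lives in group (N - 1) / per_group. */
    if (goal_ino != 0 && goal_ino <= total)
        return (goal_ino - 1) / inodes_per_group;

    return fallback_group;
}
```

For example, with 8192 inodes per group, a goal of 8193 maps to group 1, while a goal of 0 or an out-of-range goal leaves the fallback choice untouched.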
    • ext4: Use a hash of the topdir directory name for the Orlov parent group · f157a4aa
      Committed by Theodore Ts'o
      Instead of using a random number to determine the goal parent group for
      the Orlov top directories, use a hash of the directory name.  This
      allows for repeatable results when trying to benchmark filesystem
      layout algorithms.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
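The benefit of hashing the directory name rather than drawing a random number is that the same name always maps to the same starting group. A minimal sketch, assuming FNV-1a as a stand-in hash (the commit message does not say which hash ext4 uses) and with hypothetical `demo_*` names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit FNV-1a; chosen only for illustration, not the hash ext4 uses. */
static uint32_t demo_name_hash(const char *name, size_t len)
{
    uint32_t h = 2166136261u;           /* FNV offset basis */

    for (size_t i = 0; i < len; i++) {
        h ^= (unsigned char)name[i];
        h *= 16777619u;                 /* FNV prime */
    }
    return h;
}

/* Map a top-level directory name to a deterministic goal group. */
static uint32_t demo_goal_group(const char *name, size_t len, uint32_t ngroups)
{
    assert(ngroups > 0);
    return demo_name_hash(name, len) % ngroups;
}
```

Because the result depends only on the name and the group count, repeated benchmark runs that create the same top-level directories start from the same groups.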
  10. 06 Jul 2009, 1 commit
  11. 25 May 2009, 1 commit
  12. 03 May 2009, 1 commit
  13. 02 May 2009, 1 commit
  14. 17 Jun 2009, 1 commit
  15. 01 May 2009, 1 commit
    • ext4: Avoid races caused by on-line resizing and SMP memory reordering · 8df9675f
      Committed by Theodore Ts'o
      Ext4's on-line resizing adds a new block group and then, only at the
      last step, adjusts s_groups_count.  However, it's possible on SMP
      systems that another CPU could see the updated s_groups_count and
      not see the newly initialized data structures for the just-added block
      group.  For this reason, it's important to insert an SMP read barrier
      after reading s_groups_count and before reading, for example, the
      new block group descriptors allowed by the increased value of
      s_groups_count.
      
      Unfortunately, we rather blatantly violate this locking protocol
      documented in fs/ext4/resize.c.  Fortunately, (1) on-line resizes
      happen relatively rarely, and (2) it seems rare that the filesystem
      code will immediately try to use a just-added block group before any
      memory ordering issues resolve themselves.  So apparently problems
      here are relatively hard to hit, since ext3 has been vulnerable to the
      same issue for years with no one apparently complaining.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
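The ordering requirement described above (initialize the new group first, publish the count last, and order the reader's accesses the same way) can be sketched with C11 release/acquire atomics. This is a user-space analogue, not the kernel's barrier primitives; `struct demo_sb` and the `demo_*` functions are hypothetical, and a single resizer thread is assumed.

```c
#include <assert.h>
#include <stdatomic.h>

#define DEMO_MAX_GROUPS 8

/* Hypothetical stand-in for the superblock info. */
struct demo_sb {
    int group_desc[DEMO_MAX_GROUPS];   /* per-group data */
    atomic_uint groups_count;          /* published last */
};

/* Resizer: fill in the new descriptor, then publish the larger count. */
static void demo_add_group(struct demo_sb *sb, int desc)
{
    unsigned n = atomic_load_explicit(&sb->groups_count, memory_order_relaxed);

    assert(n < DEMO_MAX_GROUPS);
    sb->group_desc[n] = desc;
    /* Release: the descriptor write above is visible before the new count. */
    atomic_store_explicit(&sb->groups_count, n + 1, memory_order_release);
}

/* Reader: load the count first, then touch only descriptors it covers. */
static int demo_last_group_desc(struct demo_sb *sb)
{
    /* Acquire: plays the role of the read barrier after reading the count. */
    unsigned n = atomic_load_explicit(&sb->groups_count, memory_order_acquire);

    return n ? sb->group_desc[n - 1] : -1;
}
```

Without the acquire on the reader side, a CPU could observe the incremented count yet still read stale contents for the descriptor slot it now believes is valid.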
  16. 23 Apr 2009, 1 commit
    • ext4: Fix potential inode allocation soft lockup in Orlov allocator · b5451f7b
      Committed by Theodore Ts'o
      If the Orlov allocator is having trouble finding an appropriate block
      group, the fallback code could loop forever, causing a soft lockup
      warning in find_group_orlov():
      
      BUG: soft lockup - CPU#0 stuck for 61s! [cp:11728]
           ...
      Pid: 11728, comm: cp Not tainted (2.6.30-rc1-dirty #77) Lenovo          
      EIP: 0060:[<c021650e>] EFLAGS: 00000246 CPU: 0
      EIP is at ext4_get_group_desc+0x54/0x9d
          ...
      Call Trace:
       [<c0218021>] find_group_orlov+0x2ee/0x334
       [<c0120a5f>] ? sched_clock+0x8/0xb
       [<c02188e3>] ext4_new_inode+0x2cf/0xb1a
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
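One general way to keep a fallback search from spinning forever is to bound it to a single pass over the groups and report failure afterwards. The sketch below illustrates that pattern only; it is not the actual fix, and `demo_scan_groups` and its arguments are hypothetical.

```c
#include <assert.h>

/*
 * Scan at most ngroups groups, starting at 'start' and wrapping around.
 * Returns the first group with a free inode, or -1 if none exists.
 */
static int demo_scan_groups(const unsigned *free_inodes, int ngroups, int start)
{
    assert(ngroups > 0 && start >= 0);

    for (int i = 0; i < ngroups; i++) {
        int g = (start + i) % ngroups;

        if (free_inodes[g] > 0)
            return g;
    }
    return -1;  /* every group was visited once; give up instead of retrying */
}
```

The loop counter, rather than the search condition, decides when to stop, so a state in which no group ever qualifies ends in an error return instead of a lockup.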
  17. 14 Apr 2009, 1 commit
  18. 26 Mar 2009, 1 commit
  19. 13 Mar 2009, 1 commit
  20. 05 Mar 2009, 3 commits
  21. 13 Mar 2009, 1 commit
    • ext4: New inode/block allocation algorithms for flex_bg filesystems · a4912123
      Committed by Theodore Ts'o
      The find_group_flex() inode allocator is now only used if the
      filesystem is mounted using the "oldalloc" mount option.  It is
      replaced with the original Orlov allocator that has been updated for
      flex_bg filesystems (it should behave the same way if flex_bg is
      disabled).  The inode allocator now functions by taking into account
      each flex_bg group, instead of each block group, when deciding whether
      or not it's time to allocate a new directory into a fresh flex_bg.
      
      The block allocator has also been changed so that the first block
      group in each flex_bg is preferred for use for storing directory
      blocks.  This keeps directory blocks close together, which is good for
      speeding up e2fsck since large directories are more likely to look
      like this:
      
      debugfs:  stat /home/tytso/Maildir/cur
      Inode: 1844562   Type: directory    Mode:  0700   Flags: 0x81000
      Generation: 1132745781    Version: 0x00000000:0000ad71
      User: 15806   Group: 15806   Size: 1060864
      File ACL: 0    Directory ACL: 0
      Links: 2   Blockcount: 2072
      Fragment:  Address: 0    Number: 0    Size: 0
       ctime: 0x499c0ff4:164961f4 -- Wed Feb 18 08:41:08 2009
       atime: 0x499c0ff4:00000000 -- Wed Feb 18 08:41:08 2009
       mtime: 0x49957f51:00000000 -- Fri Feb 13 09:10:25 2009
      crtime: 0x499c0f57:00d51440 -- Wed Feb 18 08:38:31 2009
      Size of extra inode fields: 28
      BLOCKS:
      (0):7348651, (1-258):7348654-7348911
      TOTAL: 259
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
  22. 22 Feb 2009, 1 commit
    • ext4: Add fallback for find_group_flex · 05bf9e83
      Committed by Theodore Ts'o
      This is a workaround for find_group_flex() which badly needs to be
      replaced.  One of its problems (besides ignoring the Orlov algorithm)
      is that it is a bit hyperactive about returning failure under
      suspicious circumstances.  This can lead to spurious ENOSPC failures
      even when there are inodes still available.
      
      Work around this for now by retrying the search using
      find_group_other() if find_group_flex() returns -1.  If
      find_group_other() succeeds when find_group_flex() has failed, log a
      warning message.
      
      A better block/inode allocator that will fix this problem for real has
      been queued up for the next merge window.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
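The retry-with-fallback workaround described above follows a simple pattern: run the strict search, and if it reports failure, try a more permissive one and log when that succeeds. A minimal sketch with hypothetical stand-ins (`demo_find_group_strict` and `demo_find_group_any` are invented here and do not reproduce the kernel's find_group_flex() or find_group_other()):

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical picky search: wants a group with at least 'want' free inodes. */
static int demo_find_group_strict(const unsigned *free_inodes, int ngroups,
                                  unsigned want)
{
    for (int g = 0; g < ngroups; g++)
        if (free_inodes[g] >= want)
            return g;
    return -1;
}

/* Hypothetical permissive search: any group with a free inode will do. */
static int demo_find_group_any(const unsigned *free_inodes, int ngroups)
{
    for (int g = 0; g < ngroups; g++)
        if (free_inodes[g] > 0)
            return g;
    return -1;
}

/* Try the strict search first; fall back and warn if only the fallback works. */
static int demo_find_group_with_fallback(const unsigned *free_inodes,
                                         int ngroups, unsigned want)
{
    int g;

    assert(ngroups > 0);
    g = demo_find_group_strict(free_inodes, ngroups, want);
    if (g >= 0)
        return g;

    g = demo_find_group_any(free_inodes, ngroups);
    if (g >= 0)
        fprintf(stderr, "demo: strict search failed, fallback found group %d\n", g);
    return g;
}
```

The caller only sees an out-of-space result when both searches fail, which is the property the workaround is after.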
  23. 16 Feb 2009, 2 commits
  24. 07 Jan 2009, 1 commit
    • ext4: Remove "extents" mount option · 83982b6f
      Committed by Theodore Ts'o
      This mount option is largely superfluous, and in fact the way it was
      implemented was buggy; if a filesystem which did not have the extents
      feature flag was mounted -o extents, the filesystem would attempt to
      create and use extent-based files even though the extents feature flag
      was not enabled.  The simplest thing to do is to nuke the mount option
      entirely.  It's not all that useful to force the non-creation of new
      extent-based files if the filesystem can support it.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
  25. 04 Jan 2009, 1 commit
  26. 06 Jan 2009, 3 commits
    • ext4: mark the blocks/inode bitmap beyond end of group as used · 648f5879
      Committed by Aneesh Kumar K.V
      We need to mark the block/inode bitmap beyond the end of the group
      with '1'.
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@kernel.org
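Marking the padding bits past the end of a group as in use keeps an allocator from ever handing them out. A minimal sketch, assuming little-endian bit numbering within each byte; `demo_mark_bitmap_end` is a hypothetical helper written for this illustration:

```c
#include <assert.h>

/*
 * Set every bit in [start_bit, end_bit) to 1.
 * start_bit is the number of valid entries in the group and end_bit is
 * the total number of bits the bitmap block can hold.
 */
static void demo_mark_bitmap_end(unsigned start_bit, unsigned end_bit,
                                 unsigned char *bitmap)
{
    assert(start_bit <= end_bit);

    for (unsigned i = start_bit; i < end_bit; i++)
        bitmap[i / 8] |= (unsigned char)(1u << (i % 8));
}
```

With 10 valid entries in a 16-bit bitmap, the call sets bits 10 through 15, so the second byte becomes 0xFC while the first byte is left alone.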
    • ext4: Use new buffer_head flag to check uninit group bitmaps initialization · 2ccb5fb9
      Committed by Aneesh Kumar K.V
      For an uninit block group, the on-disk bitmap is not initialized. That
      implies we cannot depend on the uptodate flag on the bitmap
      buffer_head to determine bitmap validity.  Use a new buffer_head flag which
      is set after we properly initialize the bitmap.  This also
      prevents (re-)initializing the uninit group bitmap every time we call
      ext4_read_block_bitmap().
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@kernel.org
    • ext4: Fix the race between read_inode_bitmap() and ext4_new_inode() · 39341867
      Committed by Aneesh Kumar K.V
      We need to make sure we update the inode bitmap and clear
      EXT4_BG_INODE_UNINIT flag with sb_bgl_lock held, since
      ext4_read_inode_bitmap() looks at EXT4_BG_INODE_UNINIT to decide
      whether to initialize the inode bitmap each time it is called.
      (introduced by commit c806e68f.)
      
      ext4_read_inode_bitmap does:
      
      spin_lock(sb_bgl_lock(EXT4_SB(sb), block_group));
      if (desc->bg_flags & cpu_to_le16(EXT4_BG_INODE_UNINIT)) {
      	ext4_init_inode_bitmap(sb, bh, block_group, desc);
      
      and ext4_new_inode does:
      
      if (!ext4_set_bit_atomic(sb_bgl_lock(sbi, group),
      			 ino, inode_bitmap_bh->b_data))
      	......
      	...
      spin_lock(sb_bgl_lock(sbi, group));
      
      gdp->bg_flags &= cpu_to_le16(~EXT4_BG_INODE_UNINIT);
      
      i.e., on allocation we update the bitmap, then we take the sb_bgl_lock
      and clear the EXT4_BG_INODE_UNINIT flag. What can happen is that a
      parallel ext4_read_inode_bitmap can zero out the bitmap in between
      the above ext4_set_bit_atomic and spin_lock(sb_bgl_lock..)
      
      The race results in the following user-visible errors:
      EXT4-fs error (device sdb1): ext4_free_inode: bit already cleared for inode 168449
      EXT4-fs warning (device sdb1): ext4_unlink: Deleting nonexistent file ...
      EXT4-fs warning (device sdb1): ext4_rmdir: empty directory has too many links ...
      # ls -al /mnt/tmp/f/p369/d3/d6/d39/db2/dee/d10f/d3f/l71
      ls: /mnt/tmp/f/p369/d3/d6/d39/db2/dee/d10f/d3f/l71: Stale NFS file handle
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@kernel.org
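The fix described above amounts to doing the bitmap update and the flag clear inside one critical section, so a concurrent reader can never see the "uninitialized" flag while a bit is already set. A user-space sketch of that shape, using a C11 `atomic_flag` spinlock as a stand-in for the per-group lock; `struct demo_group`, `DEMO_INODE_UNINIT`, and the `demo_*` functions are all hypothetical:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <string.h>

#define DEMO_BITMAP_BYTES 4
#define DEMO_INODE_UNINIT 0x1u

struct demo_group {
    atomic_flag lock;                          /* stand-in for the group lock */
    unsigned flags;
    unsigned char bitmap[DEMO_BITMAP_BYTES];
};

static void demo_lock(struct demo_group *g)
{
    while (atomic_flag_test_and_set_explicit(&g->lock, memory_order_acquire))
        ;                                      /* spin until the lock is free */
}

static void demo_unlock(struct demo_group *g)
{
    atomic_flag_clear_explicit(&g->lock, memory_order_release);
}

/* Reader: zero the in-memory bitmap only while the group is still UNINIT. */
static void demo_read_bitmap(struct demo_group *g)
{
    demo_lock(g);
    if (g->flags & DEMO_INODE_UNINIT)
        memset(g->bitmap, 0, sizeof g->bitmap);
    demo_unlock(g);
}

/* Allocator: clear UNINIT and set the bit under the same lock hold. */
static bool demo_alloc_inode(struct demo_group *g, unsigned ino)
{
    bool was_free;

    assert(ino < 8 * DEMO_BITMAP_BYTES);
    demo_lock(g);
    if (g->flags & DEMO_INODE_UNINIT) {
        memset(g->bitmap, 0, sizeof g->bitmap);
        g->flags &= ~DEMO_INODE_UNINIT;
    }
    was_free = !(g->bitmap[ino / 8] & (1u << (ino % 8)));
    g->bitmap[ino / 8] |= (unsigned char)(1u << (ino % 8));
    demo_unlock(g);
    return was_free;
}
```

Because the flag is cleared before the lock is dropped, a later `demo_read_bitmap()` finds the group initialized and leaves the freshly set bit alone.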
  27. 04 Jan 2009, 1 commit
  28. 06 Jan 2009, 1 commit
  29. 01 Jan 2009, 1 commit
  30. 14 Nov 2008, 1 commit
  31. 07 Nov 2008, 1 commit
  32. 06 Jan 2009, 1 commit