1. 31 7月, 2011 2 次提交
    • T
      ext4: change umode_t in tracepoint headers to be an explicit __u16 · 59be8e72
      Theodore Ts'o 提交于
      As requested by Al Viro, since umode_t may be changing to a u32 for
      some architectures.
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      Cc: Al Viro <viro@ZenIV.linux.org.uk>
      59be8e72
    • T
      ext4: fix races in ext4_sync_parent() · d59729f4
      Theodore Ts'o 提交于
      Fix problems if fsync() races against a rename of a parent directory
      as pointed out by Al Viro in his own inimitable way:
      
      >While we are at it, could somebody please explain what the hell is ext4
      >doing in
      >static int ext4_sync_parent(struct inode *inode)
      >{
      >        struct writeback_control wbc;
      >        struct dentry *dentry = NULL;
      >        int ret = 0;
      >
      >        while (inode && ext4_test_inode_state(inode, EXT4_STATE_NEWENTRY)) {
      >                ext4_clear_inode_state(inode, EXT4_STATE_NEWENTRY);
      >                dentry = list_entry(inode->i_dentry.next,
      >                                    struct dentry, d_alias);
      >                if (!dentry || !dentry->d_parent || !dentry->d_parent->d_inode)
      >                        break;
      >                inode = dentry->d_parent->d_inode;
      >                ret = sync_mapping_buffers(inode->i_mapping);
      >                ...
      >Note that dentry obviously can't be NULL there.  dentry->d_parent is never
      >NULL.  And dentry->d_parent would better not be negative, for crying out
      >loud!  What's worse, there's no guarantees that dentry->d_parent will
      >remain our parent over that sync_mapping_buffers() *and* that inode won't
      >just be freed under us (after rename() and memory pressure leading to
      >eviction of what used to be our dentry->d_parent)......
      Reported-by: NAl Viro <viro@ZenIV.linux.org.uk>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      d59729f4
  2. 28 7月, 2011 5 次提交
  3. 27 7月, 2011 8 次提交
  4. 26 7月, 2011 1 次提交
    • J
      ext4: fix data corruption in inodes with journalled data · 2d859db3
      Jan Kara 提交于
      When journalling data for an inode (either because it is a symlink or
      because the filesystem is mounted in data=journal mode), ext4_evict_inode()
      can discard unwritten data by calling truncate_inode_pages(). This is
      because we don't mark the buffer / page dirty when journalling data but only
      add the buffer to the running transaction and thus mm does not know there
      are still unwritten data.
      
      Fix the problem by carefully tracking transaction containing inode's data,
      committing this transaction, and writing uncheckpointed buffers when inode
      should be reaped.
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      2d859db3
  5. 24 7月, 2011 6 次提交
  6. 18 7月, 2011 6 次提交
  7. 17 7月, 2011 1 次提交
  8. 12 7月, 2011 4 次提交
  9. 11 7月, 2011 7 次提交
    • R
      ext4: remove redundant goto in ext4_ext_insert_extent() · ffb505ff
      Robin Dong 提交于
      If eh->eh_entries is smaller than eh->eh_max, the routine will
      go to the "repeat" and then go to "has_space" directlly ,
      since argument "depth" and "eh" are not even changed.
      
      Therefore, goto "has_space" directly and remove redundant "repeat" tag.
      Signed-off-by: NRobin Dong <sanbai@taobao.com>
      ffb505ff
    • T
      ext4: Change the wrong param comment for ext4_trim_all_free · 22612283
      Tao Ma 提交于
      at ext4_trim_all_free() comment, there is no longer an @e4b parameter,
      instead it is @group.
      Reported-by: NAndreas Dilger <adilger@dilger.ca>
      Signed-off-by: NTao Ma <boyu.mt@taobao.com>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      22612283
    • T
      ext4: Speed up FITRIM by recording flags in ext4_group_info · 3d56b8d2
      Tao Ma 提交于
      In ext4, when FITRIM is called every time, we iterate all the
      groups and do trim one by one. It is a bit time wasting if the
      group has been trimmed and there is no change since the last
      trim.
      
      So this patch adds a new flag in ext4_group_info->bb_state to
      indicate that the group has been trimmed, and it will be cleared
      if some blocks is freed(in release_blocks_on_commit). Another
      trim_minlen is added in ext4_sb_info to record the last minlen
      we use to trim the volume, so that if the caller provide a small
      one, we will go on the trim regardless of the bb_state.
      
      A simple test with my intel x25m ssd:
      df -h shows:
      /dev/sdb1              40G   21G   17G  56% /mnt/ext4
      Block size:               4096
      
      run the FITRIM with the following parameter:
      range.start = 0;
      range.len = UINT64_MAX;
      range.minlen = 1048576;
      
      without the patch:
      [root@boyu-tm linux-2.6]# time ./ftrim /mnt/ext4/a
      real	0m5.505s
      user	0m0.000s
      sys	0m1.224s
      [root@boyu-tm linux-2.6]# time ./ftrim /mnt/ext4/a
      real	0m5.359s
      user	0m0.000s
      sys	0m1.178s
      [root@boyu-tm linux-2.6]# time ./ftrim /mnt/ext4/a
      real	0m5.228s
      user	0m0.000s
      sys	0m1.151s
      
      with the patch:
      [root@boyu-tm linux-2.6]# time ./ftrim /mnt/ext4/a
      real	0m5.625s
      user	0m0.000s
      sys	0m1.269s
      [root@boyu-tm linux-2.6]# time ./ftrim /mnt/ext4/a
      real	0m0.002s
      user	0m0.000s
      sys	0m0.001s
      [root@boyu-tm linux-2.6]# time ./ftrim /mnt/ext4/a
      real	0m0.002s
      user	0m0.000s
      sys	0m0.001s
      
      A big improvement for the 2nd and 3rd run.
      
      Even after I delete some big image files, it is still much
      faster than iterating the whole disk.
      
      [root@boyu-tm test]# time ./ftrim /mnt/ext4/a
      real	0m1.217s
      user	0m0.000s
      sys	0m0.196s
      
      Cc: Lukas Czerner <lczerner@redhat.com>
      Reviewed-by: NAndreas Dilger <adilger.kernel@dilger.ca>
      Signed-off-by: NTao Ma <boyu.mt@taobao.com>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      3d56b8d2
    • T
      ext4: Add new ext4 trim tracepoints · b3d4c2b1
      Tao Ma 提交于
      Add ext4_trim_extent and ext4_trim_all_free.
      Reviewed-by: NLukas Czerner <lczerner@redhat.com>
      Signed-off-by: NTao Ma <boyu.mt@taobao.com>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      b3d4c2b1
    • T
      ext4: speed up group trim with the right free block count · 169ddc3e
      Tao Ma 提交于
      When we trim some free blocks in a group of ext4, we need to 
      calculate the free blocks properly and check whether there are
      enough freed blocks left for us to trim. Current solution will
      only calculate free spaces if they are large for a trim which
      isn't appropriate.
      
      Let us see a small example:
      a group has 1.5M free which are 300k, 300k, 300k, 300k, 300k.
      And minblocks is 1M.  With current solution, we have to iterate
      the whole group since these 300k will never be subtracted from
      1.5M.  But actually we should exit after we find the first 2
      free spaces since the left 3 chunks only sum up to 900K if we
      subtract the first 600K although they can't be trimed.
      Reviewed-by: NAndreas Dilger <adilger@dilger.ca>
      Signed-off-by: NTao Ma <boyu.mt@taobao.com>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      169ddc3e
    • T
      ext4: fix trim length underflow with small trim length · 22f10457
      Tao Ma 提交于
      In 0f0a25bf, we adjust 'len' with s_first_data_block - start, but
      it could underflow in case blocksize=1K, fstrim_range.len=512 and
      fstrim_range.start = 0. In this case, when we run the code:
      len -= first_data_blk - start; len will be underflow to -1ULL.
      In the end, although we are safe that last_group check later will limit
      the trim to the whole volume, but that isn't what the user really want.
      
      So this patch fix it. It also adds the check for 'start' like ext3 so that
      we can break immediately if the start is invalid.
      
      Cc: Lukas Czerner <lczerner@redhat.com>
      Signed-off-by: NTao Ma <boyu.mt@taobao.com>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      22f10457
    • T
      ext4: add tracepoint for ext4_journal_start · 12706394
      Theodore Ts'o 提交于
      This will help debug who is responsible for starting a jbd2 transaction.
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      12706394