1. 06 Jun, 2011 1 commit
    • ext4: Fix max file size and logical block counting of extent format file · f17722f9
      Lukas Czerner committed
      Kazuya Mio reported that he was able to hit BUG_ON(next == lblock)
      in ext4_ext_put_gap_in_cache() while creating a sparse file in extent
      format and filling the tail of the file up to its end. We will hit the
      BUG_ON when we write the last block (2^32-1) into the sparse file.
      
      The root cause of the problem lies in the fact that we specifically set
      s_maxbytes so that the block at s_maxbytes fits into the on-disk extent
      format, which is 32 bits long. However, we are not storing the start and
      end block numbers, but rather the start block number and the length in
      blocks. It means that in order to cover an extent from 0 to EXT_MAX_BLOCK
      we need EXT_MAX_BLOCK+1 to fit into len (because we are counting block 0
      as well) - and it does not.
      
      The only way to fix it without changing the meaning of the struct
      ext4_extent members is, as Kazuya Mio suggested, to lower s_maxbytes
      by one fs block so we can cover the whole extent we can get by the
      on-disk extent format.
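
The off-by-one described above can be checked with a few lines of arithmetic. The following is a user-space Python sketch, not kernel code; it assumes, as the message implies, that the extent length has to fit into an unsigned 32-bit quantity. The helper names `blocks_to_cover` and `fits_in_u32` are made up for this illustration.

```python
# User-space model of the arithmetic only; not kernel code.
U32_MAX = 2**32 - 1  # largest value an unsigned 32-bit field can hold


def blocks_to_cover(first_block: int, last_block: int) -> int:
    """Number of blocks in the inclusive range [first_block, last_block]."""
    return last_block - first_block + 1


def fits_in_u32(value: int) -> bool:
    """True if value can be stored in an unsigned 32-bit field."""
    return 0 <= value <= U32_MAX


# Before the fix: the last logical block may be 2**32 - 1, so an extent
# covering the whole file needs 2**32 blocks, one more than 32 bits can hold.
old_last_block = 2**32 - 1
old_len = blocks_to_cover(0, old_last_block)

# After lowering the limit by one block: the last logical block is 2**32 - 2,
# and the whole range needs 2**32 - 1 blocks, which fits.
new_last_block = 2**32 - 2
new_len = blocks_to_cover(0, new_last_block)

print(fits_in_u32(old_len), fits_in_u32(new_len))
```

Counting block 0 is what pushes the old length one past the 32-bit limit; giving up a single block at the top of the range is enough to bring it back in.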
      
      Also, in many places EXT_MAX_BLOCK is used as a length instead of the
      maximum logical block number that the name suggests, which is all a bit
      messy. So this commit renames it to EXT_MAX_BLOCKS and changes its usage
      in some places to actually be the maximum number of blocks in the extent.
      
      The bug which this commit fixes can be reproduced as follows:
      
       dd if=/dev/zero of=/mnt/mp1/file bs=<blocksize> count=1 seek=$((2**32-2))
       sync
       dd if=/dev/zero of=/mnt/mp1/file bs=<blocksize> count=1 seek=$((2**32-1))
      Reported-by: Kazuya Mio <k-mio@sx.jp.nec.com>
      Signed-off-by: Lukas Czerner <lczerner@redhat.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
  2. 27 May, 2011 1 commit
    • ext4: add cleancache support · 7abc52c2
      Dan Magenheimer committed
      This seventh patch of eight in this cleancache series "opts-in"
      cleancache for ext4.  Filesystems must explicitly enable cleancache
      by calling cleancache_init_fs anytime an instance of the filesystem
      is mounted. For ext4, all other cleancache hooks are in
      the VFS layer including the matching cleancache_flush_fs
      hook which must be called on unmount.
      
      Details and a FAQ can be found in Documentation/vm/cleancache.txt
      
      [v6-v8: no changes]
      [v5: jeremy@goop.org: simplify init hook and any future fs init changes]
      Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
      Reviewed-by: Jeremy Fitzhardinge <jeremy@goop.org>
      Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Acked-by: Andreas Dilger <adilger@sun.com>
      Cc: Ted Ts'o <tytso@mit.edu>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Al Viro <viro@ZenIV.linux.org.uk>
      Cc: Matthew Wilcox <matthew@wil.cx>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Rik Van Riel <riel@redhat.com>
      Cc: Jan Beulich <JBeulich@novell.com>
      Cc: Chris Mason <chris.mason@oracle.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <joel.becker@oracle.com>
      Cc: Nitin Gupta <ngupta@vflare.org>
  3. 25 May, 2011 2 commits
    • ext4: add support for multiple mount protection · c5e06d10
      Johann Lombardi committed
      Prevent an ext4 filesystem from being mounted multiple times.
      A sequence number is stored on disk and is periodically updated (every 5
      seconds by default) by a mounted filesystem.
      At mount time, we now wait for s_mmp_update_interval seconds to make sure
      that the MMP sequence does not change.
      In case of failure, the nodename, bdevname and the time at which the MMP
      block was last updated are displayed.
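
The mount-time check described above amounts to reading the sequence number twice with a wait in between. A simplified Python model follows; the function name `mmp_safe_to_mount` and its parameters are hypothetical, and the real implementation involves more than this single comparison.

```python
import time
from typing import Callable


def mmp_safe_to_mount(
    read_seq: Callable[[], int],
    wait_seconds: float,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True if the on-disk sequence number stays unchanged.

    read_seq stands in for reading the MMP block from disk; wait_seconds
    plays the role of s_mmp_update_interval. Both are assumptions of this
    sketch, as is the injectable sleep used to keep the demo instant.
    """
    before = read_seq()
    sleep(wait_seconds)  # give another mounter time to bump the sequence
    after = read_seq()
    return before == after


# Demo with fake readers and a no-op sleep.
idle_disk = iter([7, 7])    # nobody else updates the sequence
active_disk = iter([7, 8])  # another node bumped it while we waited

print(mmp_safe_to_mount(lambda: next(idle_disk), 5, sleep=lambda s: None))
print(mmp_safe_to_mount(lambda: next(active_disk), 5, sleep=lambda s: None))
```

A changed sequence number means some other host still has the filesystem mounted and is updating the block, so the mount should be refused.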
      Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
      Signed-off-by: Johann Lombardi <johann@whamcloud.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    • ext4: ensure f_bfree returned by ext4_statfs() is non-negative · d02a9391
      Kazuya Mio committed
      I found an issue where the number of free blocks went negative.
      # stat -f /mnt/mp1/
        File: "/mnt/mp1/"
          ID: e175ccb83a872efe Namelen: 255     Type: ext2/ext3
      Block size: 4096       Fundamental block size: 4096
      Blocks: Total: 258022     Free: -15        Available: -13122
      Inodes: Total: 65536      Free: 63029
      
      f_bfree in struct statfs will go negative when the filesystem has
      few free blocks, because the number of dirty blocks can be bigger
      than the number of free blocks in the following two cases.
      
      CASE 1:
      ext4_da_writepages
        mpage_da_map_and_submit
          ext4_map_blocks
            ext4_ext_map_blocks
              ext4_mb_new_blocks
                ext4_mb_diskspace_used
                  percpu_counter_sub(&sbi->s_freeblocks_counter, ac->ac_b_ex.fe_len);
              <--- interrupt statfs systemcall --->
              ext4_da_update_reserve_space
                  percpu_counter_sub(&sbi->s_dirtyblocks_counter,
                                  used + ei->i_allocated_meta_blocks);
      
      CASE 2:
      ext4_write_begin
        __block_write_begin
          ext4_map_blocks
            ext4_ext_map_blocks
              ext4_mb_new_blocks
                ext4_mb_diskspace_used
                  percpu_counter_sub(&sbi->s_freeblocks_counter, ac->ac_b_ex.fe_len);
                  <--- interrupt statfs systemcall --->
                  percpu_counter_sub(&sbi->s_dirtyblocks_counter, reserv_blks);
      
      To avoid the issue, this patch ensures that f_bfree is non-negative.
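
The window in both cases is the same: the free counter has already been decremented, but the dirty counter has not. A small Python model of that interleaving follows, assuming f_bfree is derived as free minus dirty (as the explanation above suggests). The class, the function names and the numbers are invented for illustration.

```python
class Counters:
    """Stand-in for the two per-superblock counters named in the traces."""

    def __init__(self, free: int, dirty: int) -> None:
        self.free = free
        self.dirty = dirty


def raw_bfree(c: Counters) -> int:
    """Unclamped value: can go negative inside the race window."""
    return c.free - c.dirty


def clamped_bfree(c: Counters) -> int:
    """Value after the fix: never reported below zero."""
    return max(raw_bfree(c), 0)


c = Counters(free=20, dirty=15)
allocated = 10

c.free -= allocated                  # allocation charged to the free counter
snapshot_raw = raw_bfree(c)          # statfs runs here: 10 - 15
snapshot_clamped = clamped_bfree(c)  # what the patched statfs would report
c.dirty -= allocated                 # reservation released afterwards
settled = clamped_bfree(c)           # counters consistent again: 10 - 5

print(snapshot_raw, snapshot_clamped, settled)
```

Clamping does not remove the transient inconsistency between the two counters; it only keeps the value reported to user space from going negative.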
      Signed-off-by: Kazuya Mio <k-mio@sx.jp.nec.com>
  4. 23 May, 2011 2 commits
  5. 21 May, 2011 4 commits
  6. 19 May, 2011 1 commit
  7. 16 May, 2011 1 commit
  8. 09 May, 2011 2 commits
  9. 19 Apr, 2011 1 commit
    • ext4: check for ext[23] file system features when mounting as ext[23] · 2035e776
      Theodore Ts'o committed
      Provide better emulation for ext[23] mode by enforcing that the file
      system does not have any features unsupported by ext[23] when ext4
      emulates the ext[23] file system driver (that is, when
      CONFIG_EXT4_USE_FOR_EXT23 is defined).
      
      This causes the file system type information in /proc/mounts to be
      correct for the automatically mounted root file system.  This also
      means that "mount -t ext2 /dev/sda /mnt" will fail if /dev/sda
      contains an ext3 or ext4 file system, just as one would expect if the
      original ext2 file system driver were in use.
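
The enforcement described above is a bitmask test: a mount under a given type is refused if the filesystem carries any feature bit that type does not know. Here is a deliberately simplified Python sketch. The flag values and the `SUPPORTED` table are placeholders for illustration, not the real on-disk constants, and the real feature classes are collapsed into a single mask.

```python
# Placeholder bit values for illustration; NOT the real on-disk flag values.
FEATURE_JOURNAL = 0x1
FEATURE_EXTENTS = 0x2

# Hypothetical table of which features each emulated driver accepts.
SUPPORTED = {
    "ext2": 0,
    "ext3": FEATURE_JOURNAL,
    "ext4": FEATURE_JOURNAL | FEATURE_EXTENTS,
}


def can_mount_as(fs_type: str, fs_features: int) -> bool:
    """True if fs_features contains no bit outside fs_type's supported set."""
    unsupported = fs_features & ~SUPPORTED[fs_type]
    return unsupported == 0


ext3_like = FEATURE_JOURNAL
ext4_like = FEATURE_JOURNAL | FEATURE_EXTENTS

print(can_mount_as("ext2", ext3_like))  # refused: journal unknown to ext2
print(can_mount_as("ext3", ext3_like))  # accepted
print(can_mount_as("ext3", ext4_like))  # refused: extents unknown to ext3
```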
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
  10. 11 Apr, 2011 1 commit
  11. 06 Apr, 2011 1 commit
  12. 05 Apr, 2011 2 commits
  13. 31 Mar, 2011 1 commit
  14. 22 Mar, 2011 1 commit
  15. 15 Mar, 2011 1 commit
  16. 06 Mar, 2011 1 commit
  17. 28 Feb, 2011 2 commits
  18. 27 Feb, 2011 1 commit
  19. 24 Feb, 2011 2 commits
  20. 22 Feb, 2011 2 commits
  21. 12 Feb, 2011 1 commit
    • ext4: serialize unaligned asynchronous DIO · e9e3bcec
      Eric Sandeen committed
      ext4 has a data corruption case when doing non-block-aligned
      asynchronous direct IO into a sparse file, as demonstrated
      by xfstest 240.
      
      The root cause is that while ext4 preallocates space in the
      hole, mappings of that space still look "new" and 
      dio_zero_block() will zero out the unwritten portions.  When
      more than one AIO thread is going, they both find this "new"
      block and race to zero out their portion; this is uncoordinated
      and causes data corruption.
      
      Dave Chinner fixed this for xfs by simply serializing all
      unaligned asynchronous direct IO.  I've done the same here.
      The difference is that we only wait on conversions, not all IO.
      This is a very big hammer, and I'm not very pleased with
      stuffing this into ext4_file_write().  But since ext4 is
      DIO_LOCKING, we need to serialize it at this high level.
      
      I tried to move this into ext4_ext_direct_IO, but by then
      we have the i_mutex already, and we will wait on the
      work queue to do conversions - which must also take the
      i_mutex.  So that won't work.
      
      This was originally exposed by qemu-kvm installing to
      a raw disk image with a normal sector-63 alignment.  I've
      tested a backport of this patch with qemu, and it does
      avoid the corruption.  It is also quite a lot slower
      (14 min for package installs, vs. 8 min for well-aligned)
      but I'll take slow correctness over fast corruption any day.
      
      Mingming suggested that we can track outstanding
      conversions, and wait on those so that non-sparse
      files won't be affected, and I've implemented that here;
      unaligned AIO to nonsparse files won't take a perf hit.
      
      [tytso@mit.edu: Keep the mutex as a hashed array instead
       of bloating the ext4 inode]
      
      [tytso@mit.edu: Fix up namespace issues so that global
       variables are protected with an "ext4_" prefix.]
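
Two ideas from the message above lend themselves to a short sketch: deciding whether a request is block-aligned, and picking a lock from a fixed-size hashed array instead of storing one per inode. The Python below is a user-space model only. The table size, the function names and the modulo hash are assumptions, and it serializes every unaligned write rather than waiting only on pending conversions as the patch does.

```python
import threading
from typing import Callable, TypeVar

T = TypeVar("T")

NUM_LOCKS = 16  # hypothetical table size
_locks = [threading.Lock() for _ in range(NUM_LOCKS)]


def is_unaligned(offset: int, length: int, block_size: int) -> bool:
    """True if the request starts or ends off a block boundary.

    block_size must be a power of two for the mask trick to be valid.
    """
    mask = block_size - 1
    return (offset & mask) != 0 or ((offset + length) & mask) != 0


def lock_for_inode(inode_number: int) -> threading.Lock:
    """Pick a lock from the shared array; distinct inodes may share one."""
    return _locks[inode_number % NUM_LOCKS]


def write_serialized(
    inode_number: int,
    offset: int,
    length: int,
    block_size: int,
    do_write: Callable[[], T],
) -> T:
    """Run do_write, holding the hashed lock only for unaligned requests."""
    if is_unaligned(offset, length, block_size):
        with lock_for_inode(inode_number):
            return do_write()
    return do_write()


print(is_unaligned(0, 4096, 4096), is_unaligned(512, 4096, 4096))
```

The hashed array trades a small chance of unrelated inodes contending on the same lock for not growing every in-memory inode.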
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
  22. 04 Feb, 2011 3 commits
  23. 01 Feb, 2011 1 commit
  24. 13 Jan, 2011 1 commit
    • quota: Fix deadlock during path resolution · f00c9e44
      Jan Kara committed
      As Al Viro pointed out, path resolution during Q_QUOTAON calls to quotactl
      is prone to deadlocks. We hold the s_umount semaphore for reading during
      the path resolution, and the resolution itself may need to acquire the
      semaphore for writing when, e.g., an autofs mountpoint is passed.
      
      Solve the problem by performing the resolution before we get hold of the
      superblock (and thus the s_umount semaphore). The whole thing is
      complicated by the fact that some filesystems (OCFS2) ignore the path
      argument. So to distinguish between filesystems which want the path and
      those which do not, we introduce a new .quota_on_meta callback which does
      not get the path. OCFS2 then uses this callback instead of the old
      .quota_on.
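
The reordering described above can be modelled in a few lines: resolve the path first, with the lock not held, and only then take the lock and dispatch to whichever callback the filesystem provides. This Python sketch uses a plain mutex in place of the s_umount read/write semaphore and a dictionary in place of the operations table. The function signature and the example path are assumptions of the sketch.

```python
import threading
from typing import Any, Callable, Dict


def quota_on(
    s_umount: threading.Lock,
    path: str,
    resolve_path: Callable[[str], Any],
    callbacks: Dict[str, Callable[..., Any]],
) -> Any:
    """Resolve the path before locking, then dispatch under the lock."""
    # Step 1: resolution happens while s_umount is NOT held, so a resolver
    # that itself needs the lock cannot deadlock against us.
    resolved = resolve_path(path)

    # Step 2: take the lock and call the callback the filesystem offers.
    with s_umount:
        if "quota_on_meta" in callbacks:
            return callbacks["quota_on_meta"]()  # path is not passed
        return callbacks["quota_on"](resolved)


lock = threading.Lock()


def resolver(p: str):
    """Fake resolver that records whether the lock was free when called."""
    lock_was_free = lock.acquire(blocking=False)
    if lock_was_free:
        lock.release()
    return (p, lock_was_free)


print(quota_on(lock, "/mnt/aquota.user", resolver, {"quota_on": lambda r: r}))
print(quota_on(lock, "ignored", resolver, {"quota_on_meta": lambda: "meta"}))
```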
      
      CC: Al Viro <viro@ZenIV.linux.org.uk>
      CC: Christoph Hellwig <hch@lst.de>
      CC: Ted Ts'o <tytso@mit.edu>
      CC: Joel Becker <joel.becker@oracle.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
  25. 11 Jan, 2011 4 commits