1. 01 Jan, 2009 1 commit
  2. 14 Nov, 2008 1 commit
  3. 13 Nov, 2008 1 commit
  4. 07 Nov, 2008 1 commit
    • ext3: wait on all pending commits in ext3_sync_fs · c87591b7
      Committed by Arthur Jones
      In ext3_sync_fs, we only wait for a commit to finish if we started it, but
      there may be one already in progress which will not be synced.
      
      In the case of a data=ordered umount with pending long symlinks which are
      delayed due to a long list of other I/O on the backing block device, this
      causes the buffer associated with the long symlinks to not be moved to the
      inode dirty list in the second phase of fsync_super.  Then, before they
      can be dirtied again, kjournald sees the UMOUNT flag and exits, so the
      dirty pages are never written to the backing block device, causing long
      symlink corruption and exposing new or previously freed block data to
      userspace.
      
      This can be reproduced with a script created by
      Eric Sandeen <sandeen@redhat.com>:
      
      	#!/bin/bash
      
      	umount /mnt/test2
      	mount /dev/sdb4 /mnt/test2
      	rm -f /mnt/test2/*
      	dd if=/dev/zero of=/mnt/test2/bigfile bs=1M count=512
      	touch /mnt/test2/thisisveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryverylongfilename
      	ln -s /mnt/test2/thisisveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryverylongfilename /mnt/test2/link
      	umount /mnt/test2
      	mount /dev/sdb4 /mnt/test2
      	ls /mnt/test2/
      	umount /mnt/test2
      
      To ensure all commits are synced, we flush all journal commits now when
      sync_fs'ing ext3.
      Signed-off-by: Arthur Jones <ajones@riverbed.com>
      Cc: Eric Sandeen <sandeen@redhat.com>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: <linux-ext4@vger.kernel.org>
      Cc: <stable@kernel.org>		[2.6.everything]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
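The difference between "wait only for the commit we started" and "wait for every pending commit" can be sketched with a toy user-space model. This is not the kernel code: `struct toy_journal`, `toy_sync_fs_old()`, `toy_sync_fs_new()` and `toy_journal_clean()` are hypothetical names invented for illustration, and a "commit" is reduced to advancing a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model (assumption, not the JBD data structures): every transaction
 * with an id <= committed_tid is on the backing device. */
struct toy_journal {
	unsigned committed_tid;   /* last transaction fully written out */
	unsigned running_tid;     /* newest transaction that still needs syncing */
	bool commit_in_progress;  /* a commit was already started by someone else */
};

/* Old behaviour as described in the commit message: we wait only if we
 * started the commit ourselves; a commit already in flight is not waited on. */
static void toy_sync_fs_old(struct toy_journal *j, bool wait)
{
	assert(j != NULL);
	bool started_by_us = !j->commit_in_progress;

	if (started_by_us)
		j->commit_in_progress = true;
	if (started_by_us && wait) {
		j->committed_tid = j->running_tid;  /* "wait" = commit completes */
		j->commit_in_progress = false;
	}
}

/* New behaviour: when asked to wait, flush everything that is pending,
 * no matter who started the commit. */
static void toy_sync_fs_new(struct toy_journal *j, bool wait)
{
	assert(j != NULL);
	if (wait) {
		j->committed_tid = j->running_tid;
		j->commit_in_progress = false;
	} else if (!j->commit_in_progress) {
		j->commit_in_progress = true;       /* just kick off a commit */
	}
}

/* True when nothing is left to write out. */
static bool toy_journal_clean(const struct toy_journal *j)
{
	return j->committed_tid == j->running_tid && !j->commit_in_progress;
}
```

In this model, starting from a journal with a commit already in flight, the old variant returns with the journal still dirty while the new variant returns only once it is clean, which mirrors the umount scenario described above.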
  5. 28 Oct, 2008 1 commit
  6. 26 Oct, 2008 1 commit
  7. 24 Oct, 2008 1 commit
  8. 23 Oct, 2008 4 commits
  9. 21 Oct, 2008 2 commits
  10. 20 Oct, 2008 6 commits
  11. 14 Oct, 2008 1 commit
  12. 04 Oct, 2008 1 commit
    • generic block based fiemap implementation · 68c9d702
      Committed by Josef Bacik
      Any block based fs (this patch includes ext3) just has to declare its own
      fiemap() function and then call this generic function with its own
      get_block_t. This works well for block based filesystems that map
      multiple contiguous blocks at one time, but it also works for filesystems
      that only map one block at a time; you will just end up with an "extent"
      for each block. One gotcha is that this will not play nicely where there
      is hole+data after the EOF. This function assumes it has hit the end of
      the data as soon as it hits a hole after the EOF, so any data past that
      point will not be picked up. AFAIK no block based fs does this, but it is
      noted in the comments of the function just in case.
      Signed-off-by: Josef Bacik <jbacik@redhat.com>
      Signed-off-by: Mark Fasheh <mfasheh@suse.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Cc: linux-fsdevel@vger.kernel.org
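The extent-building idea described in this commit can be illustrated with a small user-space sketch. It is an assumption-laden model, not the kernel's generic fiemap code: the block map is a plain array standing in for a get_block_t callback, a physical block of 0 stands for a hole, and `struct toy_extent` and `toy_block_fiemap()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical extent record: `length` blocks starting at logical block
 * `logical`, stored contiguously from physical block `physical`. */
struct toy_extent {
	unsigned long logical;
	unsigned long physical;
	unsigned long length;
};

/* Walk the file block by block and merge contiguous mappings into extents.
 * block_map[i] is the physical block of logical block i, 0 meaning a hole.
 * eof_block is the first logical block past EOF.
 * Returns the number of extents written to out (at most max_out). */
static size_t toy_block_fiemap(const unsigned long *block_map, size_t nblocks,
			       size_t eof_block, struct toy_extent *out,
			       size_t max_out)
{
	size_t n = 0;

	assert(block_map != NULL && out != NULL);
	for (size_t i = 0; i < nblocks; i++) {
		unsigned long phys = block_map[i];

		if (phys == 0) {
			if (i >= eof_block)
				break;     /* hole after EOF: assume end of data */
			continue;          /* hole inside the file: skip it */
		}
		/* Extend the previous extent if this block continues it both
		 * logically and physically. */
		if (n > 0 && out[n - 1].logical + out[n - 1].length == i &&
		    out[n - 1].physical + out[n - 1].length == phys) {
			out[n - 1].length++;
			continue;
		}
		if (n == max_out)
			break;             /* caller's buffer is full */
		out[n].logical = i;
		out[n].physical = phys;
		out[n].length = 1;
		n++;
	}
	return n;
}
```

A filesystem that maps one block at a time still works with this loop; it simply produces one-block extents whenever the physical blocks are not contiguous. The early `break` is the gotcha mentioned above: data that follows a hole past EOF is never reported.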
  13. 01 Aug, 2008 1 commit
    • [PATCH] fix races and leaks in vfs_quota_on() users · 77e69dac
      Committed by Al Viro
      * new helper: vfs_quota_on_path(); equivalent of vfs_quota_on() sans the
        pathname resolution.
      * callers of vfs_quota_on() that do their own pathname resolution and
        checks based on it are switched to vfs_quota_on_path(); that way we
        avoid the races.
      * reiserfs leaked dentry/vfsmount references on several failure exits.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  14. 29 Jul, 2008 1 commit
    • vfs: pagecache usage optimization for pagesize!=blocksize · 8ab22b9a
      Committed by Hisashi Hifumi
      When we read part of a file through the pagecache and a page exists at
      the corresponding index but is not uptodate, read IO is issued to bring
      the page uptodate.
      
      I think this is fine for a pagesize == blocksize environment, but there
      is room for improvement in a pagesize != blocksize environment, because
      in that case a page can have multiple buffers, and even if the page is
      not uptodate, some of its buffers can be.
      
      So I suggest that when all buffers corresponding to the part of the file
      we want to read are uptodate, we use this pagecache page and copy data
      from it to the user buffer even if the page itself is not uptodate.  This
      can reduce read IO and improve system throughput.
      
      I wrote a benchmark program and measured the results with it.
      
      The benchmark does the following:
      
        1: mount and open a test file.
      
        2: create a 512MB file.
      
        3: close the file and umount.
      
        4: mount and open the test file again.
      
        5: pwrite randomly 300000 times on the test file.  Offsets are
           aligned to the IO size (1024 bytes).
      
        6: measure the time taken to pread randomly 100000 times on the
           test file.
      
      The result was:
      
      	2.6.26:          330 sec
      	2.6.26-patched:  226 sec
      
      Arch:i386
      Filesystem:ext3
      Blocksize:1024 bytes
      Memory: 1GB
      
      On ext3/4, a file is written through buffers/blocks.  So mixed random
      read/write workloads, and random reads after random writes, are
      optimized by this patch in a pagesize != blocksize environment, as the
      test result above shows.
      
      The benchmark program is as follows:
      
      #include <stdio.h>
      #include <sys/types.h>
      #include <sys/stat.h>
      #include <fcntl.h>
      #include <unistd.h>
      #include <time.h>
      #include <stdlib.h>
      #include <string.h>
      #include <sys/mount.h>
      
      #define LEN 1024
      #define LOOP (1024*512) /* LOOP writes of LEN bytes = 512MB */
      
      int main(void)
      {
      	unsigned long i, offset, filesize;
      	int fd;
      	char buf[LEN];
      	time_t t1, t2;
      
      	if (mount("/dev/sda1", "/root/test1/", "ext3", 0, 0) < 0) {
      		perror("cannot mount");
      		exit(1);
      	}
      	memset(buf, 0, LEN);
      	/* O_CREAT requires an explicit mode argument */
      	fd = open("/root/test1/testfile", O_CREAT|O_RDWR|O_TRUNC, 0644);
      	if (fd < 0) {
      		perror("cannot open file");
      		exit(1);
      	}
      	/* create the 512MB test file */
      	for (i = 0; i < LOOP; i++)
      		write(fd, buf, LEN);
      	close(fd);
      	/* umount and mount again so the pagecache starts out cold */
      	if (umount("/root/test1/") < 0) {
      		perror("cannot umount");
      		exit(1);
      	}
      	if (mount("/dev/sda1", "/root/test1/", "ext3", 0, 0) < 0) {
      		perror("cannot mount");
      		exit(1);
      	}
      	fd = open("/root/test1/testfile", O_RDWR);
      	if (fd < 0) {
      		perror("cannot open file");
      		exit(1);
      	}
      
      	filesize = (unsigned long)LEN * LOOP;
      	/* random writes, aligned to LEN, leave pages partially uptodate */
      	for (i = 0; i < 300000; i++){
      		offset = (random() % filesize) & (~(LEN - 1));
      		pwrite(fd, buf, LEN, offset);
      	}
      	printf("start test\n");
      	time(&t1);
      	/* timed section: random reads, aligned to LEN */
      	for (i = 0; i < 100000; i++){
      		offset = (random() % filesize) & (~(LEN - 1));
      		pread(fd, buf, LEN, offset);
      	}
      	time(&t2);
      	printf("%ld sec\n", (long)(t2 - t1));
      	close(fd);
      	if (umount("/root/test1/") < 0) {
      		perror("cannot umount");
      		exit(1);
      	}
      	return 0;
      }
      Signed-off-by: Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp>
      Cc: Nick Piggin <nickpiggin@yahoo.com.au>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Jan Kara <jack@ucw.cz>
      Cc: <linux-ext4@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
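The check at the heart of this optimization, "are all buffers covering the requested byte range uptodate, even though the page as a whole is not?", can be sketched in user space as follows. This is a simplified model under stated assumptions: a fixed 4096-byte page, a minimum block size of 512 bytes, and hypothetical names (`struct toy_page`, `toy_range_is_uptodate()`) that do not come from the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_PAGE_SIZE 4096u   /* assumed page size for this sketch */
#define TOY_MAX_BUFFERS 8u    /* TOY_PAGE_SIZE / 512-byte minimum block size */

/* Hypothetical model of a page backed by several block-sized buffers. */
struct toy_page {
	bool page_uptodate;                      /* whole page is valid */
	unsigned block_size;                     /* bytes per buffer */
	bool buffer_uptodate[TOY_MAX_BUFFERS];   /* per-buffer validity */
};

/* Return true if bytes [from, from + len) of the page can be copied to the
 * user buffer without issuing read IO. */
static bool toy_range_is_uptodate(const struct toy_page *p,
				  unsigned from, unsigned len)
{
	assert(p != NULL);
	assert(p->block_size >= 512 && TOY_PAGE_SIZE % p->block_size == 0);
	assert(len > 0 && from < TOY_PAGE_SIZE && len <= TOY_PAGE_SIZE - from);

	if (p->page_uptodate)
		return true;   /* the usual fast path */

	/* Otherwise every buffer overlapping the range must be uptodate. */
	unsigned first = from / p->block_size;
	unsigned last = (from + len - 1) / p->block_size;

	for (unsigned i = first; i <= last; i++)
		if (!p->buffer_uptodate[i])
			return false;
	return true;
}
```

With a 1024-byte block size, as in the benchmark, a page holds four buffers, so a 1024-byte aligned read needs only one of them to be uptodate. That is why random reads after random writes can skip read IO in this configuration.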
  15. 27 Jul, 2008 2 commits
  16. 26 Jul, 2008 10 commits
  17. 05 Jul, 2008 1 commit
  18. 07 Jun, 2008 1 commit
    • ext3: fix online resize bug · 9bb91784
      Committed by Josef Bacik
      There is a bug in the check that verifies that the reserve inode's
      double indirect blocks point back to the primary gdt blocks.  The fix is
      straightforward: we need to take the gdb count modulo the number of
      addresses per block.  You can verify this with the following test case:
      
      dd if=/dev/zero of=disk1 seek=1024 count=1 bs=100M
      losetup /dev/loop1 disk1
      pvcreate /dev/loop1
      vgcreate loopvg1 /dev/loop1
      lvcreate -l 100%VG loopvg1 -n looplv1
      mkfs.ext3 -J size=64 -b 1024 /dev/loopvg1/looplv1
      mount /dev/loopvg1/looplv1 /mnt/loop
      dd if=/dev/zero of=disk2 seek=1024 count=1 bs=50M
      losetup /dev/loop2 disk2
      pvcreate /dev/loop2
      vgextend loopvg1 /dev/loop2
      lvextend -l 100%VG /dev/loopvg1/looplv1
      resize2fs /dev/loopvg1/looplv1
      
      Without this patch the resize2fs fails; with it, the resize2fs succeeds.
      Signed-off-by: Josef Bacik <jbacik@redhat.com>
      Acked-by: Andreas Dilger <adilger@sun.com>
      Cc: <linux-ext4@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
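The arithmetic behind the fix can be shown with a minimal sketch. It assumes, as in ext3, that a block address stored in an indirect block is 32 bits wide; the helper names `toy_addr_per_block()` and `toy_dind_slot()` are hypothetical and only illustrate the modulo step, not the surrounding kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* Number of 32-bit block addresses that fit in one block
 * (assumption: 4-byte on-disk block numbers). */
static unsigned toy_addr_per_block(unsigned block_size)
{
	assert(block_size >= sizeof(uint32_t));
	return (unsigned)(block_size / sizeof(uint32_t));
}

/* Index into a double indirect block for a given group descriptor block
 * count. Without the modulo, a count larger than the number of addresses
 * per block would index past the end of the block. */
static unsigned toy_dind_slot(unsigned gdb_count, unsigned block_size)
{
	return gdb_count % toy_addr_per_block(block_size);
}
```

With the 1024-byte blocks used in the test case, one block holds 256 addresses, so a filesystem with, say, 300 group descriptor blocks wraps around to slot 44. A filesystem of roughly 100 GB with 1 KiB blocks needs several hundred descriptor blocks by a rough estimate, which would explain why the test case above triggers the bug.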
  19. 15 May, 2008 1 commit
  20. 30 Apr, 2008 1 commit
  21. 28 Apr, 2008 1 commit