1. 03 Apr, 2009 1 commit
  2. 27 Mar, 2009 1 commit
  3. 26 Mar, 2009 2 commits
  4. 12 Feb, 2009 1 commit
  5. 17 Jan, 2009 1 commit
  6. 10 Jan, 2009 1 commit
    • filesystem freeze: add error handling of write_super_lockfs/unlockfs · c4be0c1d
      Takashi Sato authored
      Currently, ext3 in mainline Linux doesn't have the freeze feature which
      suspends write requests.  So, we cannot take a backup which keeps the
      filesystem's consistency with the storage device's features (snapshot and
      replication) while it is mounted.
      
      In many cases, a commercial filesystem (e.g. VxFS) has the freeze feature,
      and it is used to take a consistent backup.
      
      If Linux's standard filesystem ext3 has the freeze feature, we can do it
      without a commercial filesystem.
      
      So I have implemented the ioctls for the freeze feature.
      I think we can take a consistent backup with the following steps:
      1. Freeze the filesystem with the freeze ioctl.
      2. Separate the replication volume or create the snapshot
         with the storage device's feature.
      3. Unfreeze the filesystem with the unfreeze ioctl.
      4. Take the backup from the separated replication volume
         or the snapshot.
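
The four steps above could be driven from userspace roughly as follows. This is a minimal sketch, assuming the freeze and unfreeze ioctls are the ones exposed as FIFREEZE and FITHAW in <linux/fs.h>; the function name snapshot_while_frozen and the snapshot_fn callback are hypothetical and stand in for whatever storage-specific snapshot or split command is used. The caller needs sufficient privileges on a real mount point.

```c
#include <fcntl.h>
#include <linux/fs.h>   /* FIFREEZE, FITHAW */
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical callback: triggers the storage-side snapshot or volume split. */
typedef int (*snapshot_fn)(void *ctx);

/*
 * Freeze the filesystem mounted at mountpoint, run take_snapshot while
 * writes are suspended, then thaw it again.
 * Returns 0 on success and -1 if any step fails.
 */
int snapshot_while_frozen(const char *mountpoint, snapshot_fn take_snapshot,
                          void *ctx)
{
    int rc = -1;
    int fd = open(mountpoint, O_RDONLY);

    if (fd < 0) {
        perror("open");
        return -1;
    }

    /* Step 1: freeze. With error handling in place this can now fail. */
    if (ioctl(fd, FIFREEZE, 0) < 0) {
        perror("FIFREEZE");
        close(fd);
        return -1;
    }

    /* Step 2: snapshot or split the replication volume (storage specific). */
    if (take_snapshot == NULL || take_snapshot(ctx) == 0)
        rc = 0;

    /* Step 3: always try to thaw, even if the snapshot step failed. */
    if (ioctl(fd, FITHAW, 0) < 0) {
        perror("FITHAW");
        rc = -1;
    }

    close(fd);
    return rc;
}
```

Step 4, taking the backup from the snapshot, happens after this function returns and does not need the filesystem to stay frozen.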
      
      This patch:
      
      VFS:
      Changed the type of write_super_lockfs and unlockfs from "void"
      to "int" so that they can return an error.
      Renamed write_super_lockfs and unlockfs of the super block operations to
      freeze_fs and unfreeze_fs to avoid confusion.
      
      ext3, ext4, xfs, gfs2, jfs:
      Changed the type of write_super_lockfs and unlockfs from "void"
      to "int" so that write_super_lockfs returns an error if needed,
      and unlockfs always returns 0.
      
      reiserfs:
      Changed the type of write_super_lockfs and unlockfs from "void"
      to "int" so that they always return 0 (success) to keep the current behavior.
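
The effect of the "void" to "int" change can be illustrated with a small userspace mock. This is only a sketch: struct sb_sketch, struct sb_ops_sketch, and all function names below are hypothetical stand-ins, not the kernel's definitions. It shows a freeze hook that can report an error, an unfreeze hook that always returns 0, and a caller that propagates the result.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a super block. */
struct sb_sketch {
    int journal_broken;  /* simulate a journal error during freeze */
    int frozen;
};

/* Hypothetical stand-in for the operations table: both hooks return int. */
struct sb_ops_sketch {
    int (*freeze_fs)(struct sb_sketch *sb);
    int (*unfreeze_fs)(struct sb_sketch *sb);
};

/* Journaled filesystem style: freeze reports an error if needed. */
int journaled_freeze(struct sb_sketch *sb)
{
    if (sb->journal_broken)
        return -EIO;
    sb->frozen = 1;
    return 0;
}

/* Unfreeze always returns 0. */
int journaled_unfreeze(struct sb_sketch *sb)
{
    sb->frozen = 0;
    return 0;
}

/* Caller side: the error is no longer silently dropped. */
int freeze_super_sketch(const struct sb_ops_sketch *ops, struct sb_sketch *sb)
{
    if (ops->freeze_fs == NULL)
        return 0;
    return ops->freeze_fs(sb);
}
```

With a "void" hook the caller in the last function would have had nothing to return, so a failed freeze could not be reported back to the ioctl caller.
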
      Signed-off-by: Takashi Sato <t-sato@yk.jp.nec.com>
      Signed-off-by: Masayuki Hamaguchi <m-hamaguchi@ys.jp.nec.com>
      Cc: <xfs-masters@oss.sgi.com>
      Cc: <linux-ext4@vger.kernel.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Dave Kleikamp <shaggy@austin.ibm.com>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Alasdair G Kergon <agk@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  7. 09 Jan, 2009 4 commits
  8. 06 Jan, 2009 2 commits
  9. 05 Jan, 2009 1 commit
    • fs: symlink write_begin allocation context fix · 54566b2c
      Nick Piggin authored
      With the write_begin/write_end aops, page_symlink was broken because it
      could no longer pass a GFP_NOFS type mask into the point where the
      allocations happened.  They are done in write_begin, which would always
      assume that the filesystem can be entered from reclaim.  This bug could
      cause filesystem deadlocks.
      
      The funny thing with having a gfp_t mask there is that it doesn't really
      allow the caller to arbitrarily tinker with the context in which it can be
      called.  It couldn't ever be GFP_ATOMIC, for example, because it needs to
      take the page lock.  The only thing any callers care about is __GFP_FS
      anyway, so turn that into a single flag.
      
      Add a new flag for write_begin, AOP_FLAG_NOFS.  Filesystems can now act on
      this flag in their write_begin function.  Change __grab_cache_page to
      accept a nofs argument as well, to honour that flag (while we're there,
      change the name to grab_cache_page_write_begin which is more instructive
      and does away with random leading underscores).
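
The idea of turning the one interesting bit, __GFP_FS, into a single write_begin flag can be sketched in userspace like this. The constants below are placeholder values chosen for illustration and are not the kernel's real bit assignments; page_alloc_mask_sketch is a hypothetical name for the mask computation a helper such as grab_cache_page_write_begin would need to perform.

```c
#include <stdint.h>

/* Placeholder bit values for illustration only; NOT the kernel's constants. */
#define SKETCH_GFP_IO        0x1u
#define SKETCH_GFP_FS        0x2u
#define SKETCH_GFP_KERNEL    (SKETCH_GFP_IO | SKETCH_GFP_FS)
#define SKETCH_AOP_FLAG_NOFS 0x4u

/*
 * Compute the allocation mask for a pagecache page: start from the
 * mapping's usual mask and clear the "may enter the filesystem" bit
 * when the caller passed the NOFS flag.
 */
uint32_t page_alloc_mask_sketch(uint32_t mapping_gfp_mask,
                                unsigned int aop_flags)
{
    uint32_t gfp_notmask = 0;

    if (aop_flags & SKETCH_AOP_FLAG_NOFS)
        gfp_notmask = SKETCH_GFP_FS;

    return mapping_gfp_mask & ~gfp_notmask;
}
```

Passing the flags through untouched keeps the caller's interface to one bit, which is all that callers such as page_symlink need.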
      
      This is really a more flexible way to go in the end anyway -- if a
      filesystem happens to want any extra allocations aside from the pagecache
      ones in its write_begin function, it may now use GFP_KERNEL (rather than
      GFP_NOFS) for common case allocations (e.g. ocfs2_alloc_write_ctxt, for a
      random example).
      
      [kosaki.motohiro@jp.fujitsu.com: fix ubifs]
      [kosaki.motohiro@jp.fujitsu.com: fix fuse]
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: <stable@kernel.org>		[2.6.28.x]
      Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      [ Cleaned up the calling convention: just pass in the AOP flags
        untouched to the grab_cache_page_write_begin() function.  That
        just simplifies everybody, and may even allow future expansion of the
        logic.   - Linus ]
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  10. 06 Jan, 2009 1 commit
    • ext3: provide function to release metadata pages under memory pressure · 6b082b53
      Toshiyuki Okajima authored
      Pages in the page cache belonging to ext3 data files are released via
      the ext3_releasepage() function specified in the ext3 inode's
      address_space_ops.  However, metadata blocks (such as indirect blocks,
      directory blocks, etc) are managed via the block device
      address_space_ops, and they can not be released by
      try_to_free_buffers() if they have a journal head attached to them.
      
      To address this, we supply a try_to_free_pages() function, which calls
      journal_try_to_free_buffers() to free the metadata and which is called
      by the block device's blkdev_releasepage() function.
      Signed-off-by: Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Cc: linux-fsdevel@vger.kernel.org
  11. 01 Jan, 2009 2 commits
  12. 14 Nov, 2008 1 commit
  13. 13 Nov, 2008 1 commit
  14. 07 Nov, 2008 1 commit
    • ext3: wait on all pending commits in ext3_sync_fs · c87591b7
      Arthur Jones authored
      In ext3_sync_fs, we only wait for a commit to finish if we started it, but
      there may be one already in progress which will not be synced.
      
      In the case of a data=ordered umount with pending long symlinks which are
      delayed due to a long list of other I/O on the backing block device, this
      causes the buffer associated with the long symlinks to not be moved to the
      inode dirty list in the second phase of fsync_super.  Then, before they
      can be dirtied again, kjournald exits, seeing the UMOUNT flag and the
      dirty pages are never written to the backing block device, causing long
      symlink corruption and exposing new or previously freed block data to
      userspace.
      
      This can be reproduced with a script created
      by Eric Sandeen <sandeen@redhat.com>:
      
      	#!/bin/bash
      
      	umount /mnt/test2
      	mount /dev/sdb4 /mnt/test2
      	rm -f /mnt/test2/*
      	dd if=/dev/zero of=/mnt/test2/bigfile bs=1M count=512
      	touch /mnt/test2/thisisveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryverylongfilename
      	ln -s /mnt/test2/thisisveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryverylongfilename /mnt/test2/link
      	umount /mnt/test2
      	mount /dev/sdb4 /mnt/test2
      	ls /mnt/test2/
      	umount /mnt/test2
      
      To ensure all commits are synced, we flush all journal commits now when
      sync_fs'ing ext3.
      Signed-off-by: Arthur Jones <ajones@riverbed.com>
      Cc: Eric Sandeen <sandeen@redhat.com>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: <linux-ext4@vger.kernel.org>
      Cc: <stable@kernel.org>		[2.6.everything]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  15. 07 Dec, 2008 1 commit
  16. 29 Oct, 2008 1 commit
    • ext3: Add support for non-native signed/unsigned htree hash algorithms · 5e1f8c9e
      Theodore Ts'o authored
      The original ext3 hash algorithms assumed that variables of type char
      were signed, as God and K&R intended.  Unfortunately, this assumption
      is not true on some architectures.  Userspace support for marking
      filesystems with non-native signed/unsigned chars was added two years
      ago, but the kernel-side support was never added (until now).
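
Why the signedness of char matters for a name hash can be shown with a toy example. The hash below is deliberately simplistic and is not one of the ext3 htree hash algorithms; toy_hash_signed, toy_hash_unsigned, and toy_hash are hypothetical names. The two variants agree on 7-bit ASCII names and diverge as soon as a byte has its high bit set, which is why a filesystem must record which variant was used when its directories were hashed.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy hash treating each name byte as a signed char. */
uint32_t toy_hash_signed(const char *name, size_t len)
{
    const signed char *p = (const signed char *)name;
    uint32_t hash = 0;

    while (len--)
        hash = hash * 31u + (uint32_t)(int32_t)*p++;  /* sign-extends */
    return hash;
}

/* Same toy hash treating each name byte as an unsigned char. */
uint32_t toy_hash_unsigned(const char *name, size_t len)
{
    const unsigned char *p = (const unsigned char *)name;
    uint32_t hash = 0;

    while (len--)
        hash = hash * 31u + (uint32_t)*p++;           /* zero-extends */
    return hash;
}

/*
 * Pick the variant from a per-filesystem marker instead of relying on
 * the native signedness of char on the running architecture.
 */
uint32_t toy_hash(const char *name, size_t len, int fs_uses_unsigned)
{
    return fs_uses_unsigned ? toy_hash_unsigned(name, len)
                            : toy_hash_signed(name, len);
}
```

Selecting the variant explicitly, as in toy_hash, is the general shape of kernel-side support for a filesystem created on an architecture with the other char signedness.
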
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Cc: akpm@linux-foundation.org
      Cc: linux-kernel@vger.kernel.org
  17. 28 Oct, 2008 1 commit
  18. 26 Oct, 2008 1 commit
  19. 24 Oct, 2008 1 commit
  20. 23 Oct, 2008 4 commits
  21. 21 Oct, 2008 2 commits
  22. 20 Oct, 2008 6 commits
  23. 14 Oct, 2008 1 commit
  24. 04 Oct, 2008 1 commit
    • generic block based fiemap implementation · 68c9d702
      Josef Bacik authored
      Any block based fs (this patch includes ext3) just has to declare its own
      fiemap() function and then call this generic function with its own
      get_block_t. This works well for block based filesystems that map
      multiple contiguous blocks at one time, but it also works for filesystems
      that only map one block at a time; you will just end up with an "extent"
      for each block. One gotcha is that this will not play nicely where there is
      hole+data after the EOF. This function will assume it has hit the end of the
      data as soon as it hits a hole after the EOF, so if there is any data past
      that it will not pick that up. AFAIK no block based fs does this anyway, but
      it's in the comments of the function just in case.
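
The behaviour described above can be mimicked with a small userspace model. This is a sketch under stated assumptions, not the kernel's generic function: the file layout table, the callback type get_block_sketch_t, both callbacks, and block_fiemap_sketch are all hypothetical. One callback maps a whole contiguous run per call, the other maps one block per call, and the walker stops at the first hole at or past EOF.

```c
#include <stddef.h>

#define LAYOUT_BLOCKS 10
#define EOF_BLOCK 6   /* logical blocks 6 and up lie past EOF */

/* Toy file layout: physical block for each logical block, -1 for a hole. */
static const long layout[LAYOUT_BLOCKS] =
    { 100, 101, 102, 103, -1, 200, -1, 300, -1, -1 };

struct mapping_sketch { unsigned long physical; unsigned long len; };
struct extent_sketch  { unsigned long logical, physical, len; };

/* Hypothetical get_block-style callback; len == 0 reports a hole. */
typedef void (*get_block_sketch_t)(unsigned long logical,
                                   struct mapping_sketch *out);

/* Maps exactly one block per call. */
void get_block_single(unsigned long logical, struct mapping_sketch *out)
{
    out->len = 0;
    if (logical < LAYOUT_BLOCKS && layout[logical] >= 0) {
        out->physical = (unsigned long)layout[logical];
        out->len = 1;
    }
}

/* Maps a whole physically contiguous run per call. */
void get_block_multi(unsigned long logical, struct mapping_sketch *out)
{
    get_block_single(logical, out);
    if (out->len == 0)
        return;
    while (logical + out->len < LAYOUT_BLOCKS &&
           layout[logical + out->len] == layout[logical] + (long)out->len)
        out->len++;
}

/* Emit one extent per successful mapping; returns the extent count. */
size_t block_fiemap_sketch(get_block_sketch_t get_block,
                           struct extent_sketch *out, size_t max_extents)
{
    size_t n = 0;
    unsigned long blk = 0;

    while (blk < LAYOUT_BLOCKS && n < max_extents) {
        struct mapping_sketch m = { 0, 0 };

        get_block(blk, &m);
        if (m.len == 0) {
            if (blk >= EOF_BLOCK)
                break;          /* hole past EOF: assume end of data */
            blk++;              /* hole inside the file: skip it */
            continue;
        }
        out[n].logical = blk;
        out[n].physical = m.physical;
        out[n].len = m.len;
        n++;
        blk += m.len;
    }
    return n;
}
```

In this toy layout the data at logical block 7 sits behind a hole past EOF, so neither callback ever reports it, which is the gotcha the commit message warns about.
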
      Signed-off-by: Josef Bacik <jbacik@redhat.com>
      Signed-off-by: Mark Fasheh <mfasheh@suse.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Cc: linux-fsdevel@vger.kernel.org
  25. 01 Aug, 2008 1 commit
    • [PATCH] fix races and leaks in vfs_quota_on() users · 77e69dac
      Al Viro authored
      * new helper: vfs_quota_on_path(); equivalent of vfs_quota_on() sans the
        pathname resolution.
      * callers of vfs_quota_on() that do their own pathname resolution and
        checks based on it are switched to vfs_quota_on_path(); that way we
        avoid the races.
      * reiserfs leaked dentry/vfsmount references on several failure exits.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>