1. 07 Dec, 2008 1 commit
  2. 29 Oct, 2008 1 commit
  3. 05 Jan, 2009 1 commit
    • fs: symlink write_begin allocation context fix · 54566b2c
      Committed by Nick Piggin
      With the write_begin/write_end aops, page_symlink was broken because it
      could no longer pass a GFP_NOFS type mask into the point where the
      allocations happened.  They are done in write_begin, which would always
      assume that the filesystem can be entered from reclaim.  This bug could
      cause filesystem deadlocks.
      
      The funny thing with having a gfp_t mask there is that it doesn't really
      allow the caller to arbitrarily tinker with the context in which it can be
      called.  It couldn't ever be GFP_ATOMIC, for example, because it needs to
      take the page lock.  The only thing any callers care about is __GFP_FS
      anyway, so turn that into a single flag.
      
      Add a new flag for write_begin, AOP_FLAG_NOFS.  Filesystems can now act on
      this flag in their write_begin function.  Change __grab_cache_page to
      accept a nofs argument as well, to honour that flag (while we're there,
      change the name to grab_cache_page_write_begin which is more instructive
      and does away with random leading underscores).
      
      This is really a more flexible way to go in the end anyway -- if a
      filesystem happens to want any extra allocations aside from the pagecache
      ones in its write_begin function, it may now use GFP_KERNEL (rather than
      GFP_NOFS) for common case allocations (e.g. ocfs2_alloc_write_ctxt, for a
      random example).
      
      [kosaki.motohiro@jp.fujitsu.com: fix ubifs]
      [kosaki.motohiro@jp.fujitsu.com: fix fuse]
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: <stable@kernel.org>		[2.6.28.x]
      Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      [ Cleaned up the calling convention: just pass in the AOP flags
        untouched to the grab_cache_page_write_begin() function.  That
        just simplifies everybody, and may even allow future expansion of the
        logic.   - Linus ]
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
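The flag handling described in the commit above can be sketched in user space as follows. This is a minimal model, not the kernel implementation: every `sketch_`/`SKETCH_` name and every bit value is a hypothetical stand-in for the real `gfp_t` and AOP definitions. It only shows the idea that a write_begin-time pagecache allocation drops the "may enter the filesystem" bit when the caller passes the NOFS flag, and that the mask must always allow sleeping because the page lock has to be taken.

```c
#include <assert.h>

/* Hypothetical stand-ins for kernel definitions; the bit values are illustrative only. */
typedef unsigned int sketch_gfp_t;
#define SKETCH_GFP_WAIT      0x1u   /* allocation may sleep */
#define SKETCH_GFP_IO        0x2u   /* allocation may start I/O */
#define SKETCH_GFP_FS        0x4u   /* allocation may re-enter the filesystem */
#define SKETCH_GFP_KERNEL    (SKETCH_GFP_WAIT | SKETCH_GFP_IO | SKETCH_GFP_FS)
#define SKETCH_AOP_FLAG_NOFS 0x1u   /* models AOP_FLAG_NOFS */

/* Derive the mask for a pagecache allocation made on behalf of write_begin. */
static sketch_gfp_t sketch_write_begin_gfp(sketch_gfp_t mapping_mask,
                                           unsigned int aop_flags)
{
    sketch_gfp_t mask = mapping_mask;

    /* The page lock is taken, so an atomic (non-sleeping) context is never valid. */
    assert(mapping_mask & SKETCH_GFP_WAIT);

    if (aop_flags & SKETCH_AOP_FLAG_NOFS)
        mask &= ~SKETCH_GFP_FS;     /* reclaim must not call back into the fs */
    return mask;
}
```

With `SKETCH_AOP_FLAG_NOFS` set, a `SKETCH_GFP_KERNEL` mapping mask comes back without the FS bit; with no flags it is returned unchanged.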
  4. 01 Jan, 2009 1 commit
  5. 24 Oct, 2008 1 commit
  6. 23 Oct, 2008 1 commit
  7. 11 Oct, 2008 1 commit
  8. 23 Sep, 2008 1 commit
  9. 09 Sep, 2008 2 commits
  10. 12 Jul, 2008 3 commits
  11. 30 Apr, 2008 2 commits
  12. 17 Apr, 2008 2 commits
  13. 29 Apr, 2008 1 commit
  14. 26 Feb, 2008 1 commit
  15. 16 Feb, 2008 1 commit
  16. 22 Feb, 2008 1 commit
  17. 08 Feb, 2008 1 commit
  18. 05 Feb, 2008 1 commit
  19. 29 Jan, 2008 3 commits
  20. 18 Oct, 2007 1 commit
  21. 20 Sep, 2007 2 commits
  22. 18 Jul, 2007 2 commits
    • ext4: Remove 65000 subdirectory limit · f8628a14
      Committed by Andreas Dilger
      This patch adds support to ext4 for allowing more than 65000
      subdirectories. Currently the maximum number of subdirectories is capped
      at 32000.
      
      If we exceed 65000 subdirectories in an htree directory it sets the
      inode link count to 1 and no longer counts subdirectories.  The
      directory link count is not actually used when determining if a
      directory is empty, as that only counts subdirectories and not regular
      files that might be in there. 
      
      An EXT4_FEATURE_RO_COMPAT_DIR_NLINK flag has been added and it is set if
      the subdir count for any directory crosses 65000. A later fsck will clear
      EXT4_FEATURE_RO_COMPAT_DIR_NLINK if there are no longer any directories
      with >65000 subdirs.
      Signed-off-by: Andreas Dilger <adilger@clusterfs.com>
      Signed-off-by: Kalpak Shah <kalpak@clusterfs.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
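The link-count behaviour described in the commit above can be modelled roughly as below. This is a simplified user-space sketch, not the ext4 code: the `sketch_` types are hypothetical, the htree check is omitted, and the exact boundary at which counting stops is an assumption taken from the wording "crosses 65000".

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_LINK_MAX 65000u   /* limit named in the commit message */

/* Models the superblock bit EXT4_FEATURE_RO_COMPAT_DIR_NLINK. */
struct sketch_sb  { bool dir_nlink_feature; };

/* A directory starts with nlink == 2 ("." and its entry in the parent). */
struct sketch_dir { unsigned int nlink; struct sketch_sb *sb; };

/* Account for one new subdirectory being created inside dir. */
static void sketch_add_subdir(struct sketch_dir *dir)
{
    assert(dir->nlink >= 1);

    if (dir->nlink == 1)
        return;                      /* subdirectories are no longer counted */

    dir->nlink++;
    if (dir->nlink > SKETCH_LINK_MAX) {
        dir->nlink = 1;              /* switch to "not counting" mode */
        dir->sb->dir_nlink_feature = true;
    }
}
```

Once the count has collapsed to 1 it stays there, which is why the commit notes that emptiness checks cannot rely on the directory link count.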
    • ext4: Add nanosecond timestamps · ef7f3835
      Committed by Kalpak Shah
      This patch adds nanosecond timestamps for ext4. This involves adding
      *time_extra fields to the ext4_inode to extend the timestamps to
      64-bits.  Creation time is also added by this patch.
      
      These extended fields will fit into an inode if the filesystem was
      formatted with large inodes (-I 256 or larger) and there are currently
      no EAs consuming all of the available space. For new inodes we always
      reserve enough space for the kernel's known extended fields, but for
      inodes created with an old kernel this might not have been the case. So
      this patch also adds the EXT4_FEATURE_RO_COMPAT_EXTRA_ISIZE feature
      flag (ro-compat so that older kernels can't create inodes with a smaller
      extra_isize), which indicates whether the fields fitting inside
      s_min_extra_isize are available.  If the expansion of inodes is
      unsuccessful then this feature will be disabled.  This feature is only
      enabled if requested by the sysadmin.
      
      None of the extended inode fields is critical for correct filesystem
      operation.
      Signed-off-by: Andreas Dilger <adilger@clusterfs.com>
      Signed-off-by: Kalpak Shah <kalpak@clusterfs.com>
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
      Signed-off-by: Mingming Cao <cmm@us.ibm.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
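One way a single 32-bit `*time_extra` word can carry both nanoseconds and extra seconds bits is sketched below. The commit message does not spell out the bit layout; the split used here (low 2 bits extend the seconds, upper 30 bits hold nanoseconds) is an assumption based on the commonly documented ext4 format. The sketch also handles only non-negative timestamps and treats the legacy 32-bit seconds field as unsigned, which is a simplification. All `sketch_` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_EPOCH_BITS 2                                /* assumed layout */
#define SKETCH_EPOCH_MASK ((1u << SKETCH_EPOCH_BITS) - 1)

struct sketch_time { int64_t sec; uint32_t nsec; };

/* Pack nanoseconds plus the seconds bits above bit 31 into one 32-bit word. */
static uint32_t sketch_encode_extra(struct sketch_time t)
{
    /* Simplification: non-negative seconds that fit in 34 bits. */
    assert(t.sec >= 0 && t.sec < ((int64_t)1 << 34));
    assert(t.nsec < 1000000000u);

    uint32_t epoch = (uint32_t)(t.sec >> 32) & SKETCH_EPOCH_MASK;
    return (t.nsec << SKETCH_EPOCH_BITS) | epoch;
}

/* Rebuild the timestamp from the legacy 32-bit seconds and the extra word. */
static struct sketch_time sketch_decode(uint32_t sec_lo, uint32_t extra)
{
    struct sketch_time t;

    t.sec  = (int64_t)sec_lo | ((int64_t)(extra & SKETCH_EPOCH_MASK) << 32);
    t.nsec = extra >> SKETCH_EPOCH_BITS;
    return t;
}
```

An old kernel that ignores the extra word still sees the low 32 bits of the seconds, which fits the commit's remark that none of the extended fields is critical for correct operation.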
  23. 17 Jul, 2007 1 commit
    • ext3/ext4: orphan list corruption due bad inode · a6c15c2b
      Committed by Vasily Averin
      After ext3 orphan list check has been added into ext3_destroy_inode()
      (please see my previous patch) the following situation has been detected:
      
       EXT3-fs warning (device sda6): ext3_unlink: Deleting nonexistent file (37901290), 0
       Inode 00000101a15b7840: orphan list check failed!
       00000773 6f665f00 74616d72 00000573 65725f00 06737270 66000000 616d726f
      ...
       Call Trace: [<ffffffff80211ea9>] ext3_destroy_inode+0x79/0x90
        [<ffffffff801a2b16>] sys_unlink+0x126/0x1a0
        [<ffffffff80111479>] error_exit+0x0/0x81
        [<ffffffff80110aba>] system_call+0x7e/0x83
      
      The first message says that the unlinked inode has i_nlink=0; ext3_unlink()
      then adds this inode to the orphan list.

      The second message means that this inode has not been removed from the
      orphan list.  The inode dump shows that i_fop = &bad_file_ops, which can be
      set only in make_bad_inode().  I then found that ext3_read_inode() can call
      make_bad_inode() without any error or warning message, for example in the
      following case:
      
      ...
              if (inode->i_nlink == 0) {
                      if (inode->i_mode == 0 ||
                          !(EXT3_SB(inode->i_sb)->s_mount_state & EXT3_ORPHAN_FS)) {
                              /* this inode is deleted */
                              brelse (bh);
                              goto bad_inode;
      ...
      
      A bad inode can live for some time, and ext3_unlink() can add it to the
      orphan list, but ext3_delete_inode() does not delete this inode from the
      orphan list.  As a result, we can get orphan list corruption, detected in
      ext3_destroy_inode().

      However, it is not clear to me how to fix this issue correctly.

      As far as I can see, is_bad_inode() is called after iget() everywhere
      except in ext3_lookup() and ext3_get_parent().  I believe it makes sense to
      add a bad inode check to these functions too, and to call iput() if a bad
      inode is detected.
      Signed-off-by: Vasily Averin <vvs@sw.ru>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
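The check proposed in the commit above (test for a bad inode right after getting it, and drop the reference if it is bad) might look like this in a user-space model. The `sketch_` types and functions are hypothetical stand-ins for `iget()`, `iput()` and `is_bad_inode()`; returning NULL stands in for whatever error the real lookup path would return.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_inode { bool bad; int refcount; };

/* Take a reference, as iget() would. */
static struct sketch_inode *sketch_iget(struct sketch_inode *inode)
{
    inode->refcount++;
    return inode;
}

/* Drop a reference, as iput() would. */
static void sketch_iput(struct sketch_inode *inode)
{
    assert(inode->refcount > 0);
    inode->refcount--;
}

/* Lookup that refuses to hand a bad inode to its caller. */
static struct sketch_inode *sketch_lookup(struct sketch_inode *candidate)
{
    struct sketch_inode *inode = sketch_iget(candidate);

    if (inode->bad) {
        sketch_iput(inode);   /* release it so it cannot reach unlink */
        return NULL;          /* stand-in for an error return */
    }
    return inode;
}
```

Because the bad inode never reaches the caller, a later unlink cannot place it on the orphan list.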
  24. 01 Jun, 2007 1 commit
  25. 09 May, 2007 2 commits
  26. 13 Feb, 2007 1 commit
  27. 12 Feb, 2007 2 commits
  28. 09 Dec, 2006 1 commit
  29. 08 Dec, 2006 1 commit
    • [PATCH] handle ext4 directory corruption better · e6c40211
      Committed by Eric Sandeen
      I've been using Steve Grubb's purely evil "fsfuzzer" tool, at
      http://people.redhat.com/sgrubb/files/fsfuzzer-0.4.tar.gz
      
      Basically it makes a filesystem, splats some random bits over it, then
      tries to mount it and do some simple filesystem actions.
      
      At best, the filesystem catches the corruption gracefully.  At worst,
      things spin out of control.
      
      As you might guess, we found a couple places in ext4 where things spin out
      of control :)
      
      First, we had a corrupted directory that was never checked for
      consistency...  it was corrupt, and pointed to another bad "entry" of
      length 0.  The for() loop looped forever, since the length of
      ext4_next_entry(de) was 0, and we kept looking at the same pointer over and
      over and over and over...  I modeled this check and subsequent action on
      what is done for other directory types in ext4_readdir...
      
      (adding this check adds some computational expense; I am testing a followup
      patch to reduce the number of times we check and re-check these directory
      entries, in all cases.  Thanks for the idea, Andreas).
      
      Next we had a root directory inode which had a corrupted size, claimed to
      be > 200M on a 4M filesystem.  There was only really 1 block in the
      directory, but because the size was so large, readdir kept coming back for
      more, spewing thousands of printk's along the way.
      
      Per Andreas' suggestion, if we're in this read error condition and we're
      trying to read an offset which is greater than i_blocks worth of bytes,
      stop trying, and break out of the loop.
      
      With these two changes fsfuzz test survives quite well on ext4.
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Cc: <linux-ext4@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
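The first failure described in the commit above, a loop that never advances because an entry claims a length of 0, can be illustrated with a small user-space walker. This is a sketch under simplifying assumptions: the entry layout (a host-endian 16-bit record length at the start of each entry), the minimum length, and all `sketch_`/`SKETCH_` names are hypothetical and do not match the real ext4 on-disk directory format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MIN_REC_LEN 8u   /* hypothetical minimum entry size */

/* Count the entries in a directory block; return -1 if a rec_len is corrupt. */
static int sketch_count_entries(const unsigned char *block, size_t size)
{
    size_t off = 0;
    int count = 0;

    assert(block != NULL);

    while (off < size) {
        uint16_t rec_len;

        if (size - off < sizeof(rec_len))
            return -1;                      /* truncated entry header */
        memcpy(&rec_len, block + off, sizeof(rec_len));

        /* A zero or undersized rec_len would stall the walk forever,
         * and an oversized one would run past the end of the block. */
        if (rec_len < SKETCH_MIN_REC_LEN || rec_len > size - off)
            return -1;

        off += rec_len;
        count++;
    }
    return count;
}
```

Validating each record length before advancing guarantees that `off` strictly increases, so the walk always terminates even on a fuzzed block.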