1. 13 Feb, 2007 1 commit
  2. 12 Feb, 2007 3 commits
  3. 14 Dec, 2006 2 commits
  4. 09 Dec, 2006 1 commit
  5. 08 Dec, 2006 3 commits
  6. 20 Oct, 2006 1 commit
    • [PATCH] Take i_mutex in splice_from_pipe() · 62752ee1
      Mark Fasheh committed
      The splice_actor may be calling ->prepare_write() and ->commit_write(). We
      want i_mutex on the inode being written to before calling those so that we
      don't race i_size changes.
      
      The double locking behavior is done elsewhere in splice.c, and if we
      eventually want _nolock variants of generic_file_splice_write(), fs modules
      might have to replicate the nasty locking code. We introduce
      inode_double_lock() and inode_double_unlock() to consolidate the locking
      rules into one set of functions.
      Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
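The consolidated locking rule described in the commit above can be illustrated with a small user-space sketch. It is not the kernel's inode_double_lock(); the `demo_*` names and the toy lock are hypothetical stand-ins for struct inode and i_mutex. The assumption shown is the usual deadlock-avoidance rule: when two distinct objects must be locked, always take the lower address first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for an inode and its i_mutex; not a real kernel type. */
struct demo_inode {
    int held;   /* 1 while the toy mutex is held */
    int order;  /* position in which this lock was taken (1 = first) */
};

static int demo_next_order = 1;

static void demo_mutex_lock(struct demo_inode *inode)
{
    assert(!inode->held);   /* a real mutex would block here instead */
    inode->held = 1;
    inode->order = demo_next_order++;
}

static void demo_mutex_unlock(struct demo_inode *inode)
{
    assert(inode->held);
    inode->held = 0;
}

/*
 * Lock up to two inodes. If only one is given, or both are the same,
 * a single lock is taken. Otherwise the lower address is locked first,
 * so two callers passing (a, b) and (b, a) cannot deadlock each other.
 */
void demo_double_lock(struct demo_inode *a, struct demo_inode *b)
{
    demo_next_order = 1;
    if (a == NULL || b == NULL || a == b) {
        if (a != NULL)
            demo_mutex_lock(a);
        else if (b != NULL)
            demo_mutex_lock(b);
        return;
    }
    if ((uintptr_t)a < (uintptr_t)b) {
        demo_mutex_lock(a);
        demo_mutex_lock(b);
    } else {
        demo_mutex_lock(b);
        demo_mutex_lock(a);
    }
}

/* Release whatever demo_double_lock() took for the same pair. */
void demo_double_unlock(struct demo_inode *a, struct demo_inode *b)
{
    if (a != NULL)
        demo_mutex_unlock(a);
    if (b != NULL && b != a)
        demo_mutex_unlock(b);
}
```

Keeping the ordering rule in one pair of helpers means a caller never has to reason about which of its two inodes to lock first.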
  7. 11 Oct, 2006 1 commit
  8. 02 Oct, 2006 1 commit
  9. 01 Oct, 2006 1 commit
  10. 30 Sep, 2006 1 commit
    • [PATCH] fs.h: ifdef security fields · 50462062
      Alexey Dobriyan committed
      [assuming BSD security levels are deleted]
      The only user of i_security, f_security, s_security fields is SELinux,
      however, quite a few security modules are trying to get into kernel.
      So, wrap them under CONFIG_SECURITY. Adding config option for each
      security field is likely overkill.
      
      Following Stephen Smalley's suggestion, i_security initialization is
      moved to security_inode_alloc() so that core code is not cluttered with
      ifdefs and the alloc_inode() codepath becomes a tiny bit smaller and faster.
      
      The user of (highly greppable) struct fown_struct::security field is
      still to be found. I've checked every "fown_struct" and every "f_owner"
      occurrence. Additionally, its removal doesn't break i386 allmodconfig
      build.
      
      struct inode, struct file, struct super_block, struct fown_struct
      become smaller.
      
      P.S. Combined with two reiserfs inode shrinking patches sent to
      linux-fsdevel, I can finally suck 12 reiserfs inodes into one page.
      
      		/proc/slabinfo
      
      	-ext2_inode_cache	388	10
      	+ext2_inode_cache	384	10
      	-inode_cache		280	14
      	+inode_cache		276	14
      	-proc_inode_cache	296	13
      	+proc_inode_cache	292	13
      	-reiser_inode_cache	336	11
      	+reiser_inode_cache	332	12 <=
      	-shmem_inode_cache	372	10
      	+shmem_inode_cache	368	10
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Stephen Smalley <sds@tycho.nsa.gov>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
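The pattern in the commit above, a field that exists only under a config option plus an init helper that hides the ifdef from core code, might look roughly like this in isolation. This is a user-space sketch, not the actual fs.h change: `DEMO_CONFIG_SECURITY`, `struct demo_inode` and the helper names are hypothetical.

```c
#include <stddef.h>

/* Uncomment to model a build with the security option enabled. */
/* #define DEMO_CONFIG_SECURITY 1 */

/* Hypothetical, heavily reduced stand-in for struct inode. */
struct demo_inode {
    unsigned long i_ino;
    unsigned int i_count;
#ifdef DEMO_CONFIG_SECURITY
    void *i_security;   /* present only when the option is enabled */
#endif
};

/*
 * All knowledge of the optional field lives in this helper, so the
 * allocation path can call it unconditionally without its own ifdef.
 */
static inline int demo_security_inode_alloc(struct demo_inode *inode)
{
#ifdef DEMO_CONFIG_SECURITY
    inode->i_security = NULL;
#else
    (void)inode;        /* nothing to initialize in this configuration */
#endif
    return 0;
}

/* Reports the object size, which shrinks when the field is compiled out. */
size_t demo_inode_size(void)
{
    return sizeof(struct demo_inode);
}
```

With the option disabled, the structure loses one pointer, which is consistent with the 4-byte drops in the /proc/slabinfo figures quoted for an i386 build.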
  11. 27 Sep, 2006 3 commits
  12. 01 Jul, 2006 2 commits
  13. 29 Jun, 2006 1 commit
  14. 02 Apr, 2006 1 commit
  15. 29 Mar, 2006 1 commit
  16. 27 Mar, 2006 1 commit
  17. 26 Mar, 2006 1 commit
  18. 24 Mar, 2006 1 commit
    • [PATCH] cpuset memory spread slab cache hooks · b0196009
      Paul Jackson committed
      Change the kmem_cache_create calls for certain slab caches to support cpuset
      memory spreading.
      
      See the previous patches, cpuset_mem_spread, for an explanation of cpuset
      memory spreading, and cpuset_mem_spread_slab_cache for the slab cache support
      for memory spreading.
      
      The slab caches marked for now are: dentry_cache, inode_cache, some xfs slab
      caches, and buffer_head.  This list may change over time.  In particular,
      other file system types that are used extensively on large NUMA systems may
      want to allow for spreading their directory and inode slab cache entries.
      Signed-off-by: Paul Jackson <pj@sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  19. 23 Mar, 2006 2 commits
  20. 02 Feb, 2006 1 commit
  21. 11 Jan, 2006 3 commits
  22. 10 Jan, 2006 1 commit
  23. 09 Jan, 2006 1 commit
  24. 31 Oct, 2005 1 commit
    • [PATCH] fix nr_unused accounting, and avoid recursing in iput with I_WILL_FREE set · 7f04c26d
      Andrea Arcangeli committed
       			list_move(&inode->i_list, &inode_in_use);
       		} else {
       			list_move(&inode->i_list, &inode_unused);
      +			inodes_stat.nr_unused++;
       		}
       	}
       	wake_up_inode(inode);
      
      Are you sure the above diff is correct? It was added somewhere between
      2.6.5 and 2.6.8. I think it's wrong.
      
      The only way I can imagine the i_count to be zero in the above path, is
      that I_WILL_FREE is set.  And if I_WILL_FREE is set, then we must not
      increase nr_unused.  So I believe the above change is buggy and it will
      definitely overstate the number of unused inodes and it should be backed
      out.
      
      Note that __writeback_single_inode before calling __sync_single_inode, can
      drop the spinlock and we can have both the dirty and locked bitflags clear
      here:
      
      		spin_unlock(&inode_lock);
      		__wait_on_inode(inode);
      		iput(inode);
      XXXXXXX
      		spin_lock(&inode_lock);
      	}
      	use inode again here
      
      a construct like the above makes zero sense from a reference counting
      standpoint.
      
      Either we don't ever use the inode again after the iput, or the
      inode_lock should be taken _before_ executing the iput (i.e. a __iput
      would be required). Taking the inode_lock after iput means the iget was
      useless if we keep using the inode after the iput.
      
      So the only way 2.6 was safe calling __writeback_single_inode with
      i_count == 0 is if I_WILL_FREE is set (I_WILL_FREE will prevent the VM
      from freeing the inode at XXXXX).
      
      Potentially calling the above iput with I_WILL_FREE was also wrong
      because it would recurse in iput_final (the second mainline bug).
      
      The (untested) patch below fixes the nr_unused accounting, avoids recursing
      in iput when I_WILL_FREE is set, and makes sure (with the BUG_ON) that we
      don't corrupt memory and that all holders that don't set I_WILL_FREE keep
      a reference on the inode!
      Signed-off-by: Andrea Arcangeli <andrea@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  25. 28 Oct, 2005 1 commit
    • [PATCH] gfp_t: fs/* · 27496a8c
      Al Viro committed
       - ->releasepage() annotated (s/int/gfp_t), instances updated
       - missing gfp_t in fs/* added
       - fixed misannotation from the original sweep caught by bitwise checks:
         XFS used __nocast both for gfp_t and for flags used by XFS allocator.
         The latter left with unsigned int __nocast; we might want to add a
         different type for those but for now let's leave them alone.  That,
         BTW, is a case when __nocast use had been actively confusing - it had
         been used in the same code for two different and similar types, with
         no way to catch misuses.  Switch of gfp_t to bitwise had caught that
         immediately...
      
      One tricky bit is left alone to be dealt with later - mapping->flags is
      a mix of gfp_t and error indications.  Left alone for now.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
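The point made in the commit above about bitwise checks catching what __nocast could not can be sketched as follows. This is an illustration only: the `DEMO_*` macros, both typedefs and the flag values are hypothetical, and the attributes are the ones the sparse checker understands when it defines `__CHECKER__`. An ordinary compiler sees plain unsigned integers.

```c
/* Sparse defines __CHECKER__; other compilers get empty annotations. */
#ifdef __CHECKER__
#define DEMO_BITWISE __attribute__((bitwise))
#define DEMO_FORCE   __attribute__((force))
#else
#define DEMO_BITWISE
#define DEMO_FORCE
#endif

/* Two distinct flag types that would both be "unsigned int" otherwise. */
typedef unsigned int DEMO_BITWISE demo_gfp_t;
typedef unsigned int DEMO_BITWISE demo_alloc_flags_t;

/* Illustrative values only; they are not the kernel's real flag bits. */
#define DEMO_GFP_WAIT    ((DEMO_FORCE demo_gfp_t)0x10u)
#define DEMO_GFP_IO      ((DEMO_FORCE demo_gfp_t)0x40u)
#define DEMO_ALLOC_SLEEP ((DEMO_FORCE demo_alloc_flags_t)0x01u)

/*
 * A hook annotated with the flag type instead of int. Passing a
 * demo_alloc_flags_t here still compiles with a normal compiler, but a
 * checker that honours the bitwise attribute is expected to flag it.
 */
int demo_may_block(demo_gfp_t gfp_mask)
{
    return (gfp_mask & DEMO_GFP_WAIT) != 0;
}
```

Because each bitwise typedef is its own type to the checker, mixing the two flag families becomes a reported mismatch rather than a silent integer conversion.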
  26. 10 Sep, 2005 1 commit
    • [PATCH] move truncate_inode_pages() into ->delete_inode() · e85b5652
      Mark Fasheh committed
      Allow file systems supporting ->delete_inode() to call
      truncate_inode_pages() on their own.  OCFS2 wants this so it can query the
      cluster before making a final decision on whether to wipe an inode from
      disk or not.  In some corner cases an inode marked on the local node via
      voting may not actually get orphaned.  A good example is node death before
      the transaction moving the inode to the orphan dir commits to the journal.
      Without this patch, the truncate_inode_pages() call in
      generic_delete_inode() would discard valid data for such inodes.
      
      During earlier discussion in the 2.6.13 merge plan thread, Christoph
      Hellwig indicated that other file systems might also find this useful.
      
      IMHO, the best solution would be to just allow ->drop_inode() to do the
      cluster query but it seems that would require a substantial reworking of
      that section of the code.  Assuming it is safe to call write_inode_now() in
      ocfs2_delete_inode() for those inodes which won't actually get wiped, this
      solution should get us by for now.
      
      Trivial testing of this patch (and a related OCFS2 update) has shown this
      to avoid the corruption I'm seeing.
      Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
      Acked-by: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
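The division of labour described in the commit above can be modelled in user space as follows. This is a sketch under stated assumptions, not the kernel code: every `demo_*` name is hypothetical, the page cache is reduced to a counter, and the "cluster query" result is a plain field. It assumes that a filesystem hook which decides an inode must survive writes its data back before dropping the cached pages.

```c
#include <stddef.h>

struct demo_inode;

/* Hypothetical stand-in for the super_operations hook table. */
struct demo_super_ops {
    void (*delete_inode)(struct demo_inode *inode);
};

struct demo_inode {
    const struct demo_super_ops *ops;
    int cached_pages;    /* pages currently in the (modelled) page cache */
    int wipe_from_disk;  /* modelled result of the cluster query */
    int pages_written;   /* pages flushed to disk before truncation */
    int cleared;         /* set once the in-memory inode is torn down */
};

static void demo_truncate_inode_pages(struct demo_inode *inode)
{
    inode->cached_pages = 0;
}

static void demo_clear_inode(struct demo_inode *inode)
{
    inode->cleared = 1;
}

/* Filesystem hook: it, not the generic code, decides when to truncate. */
void demo_cluster_delete_inode(struct demo_inode *inode)
{
    if (!inode->wipe_from_disk)
        inode->pages_written += inode->cached_pages; /* keep valid data */
    demo_truncate_inode_pages(inode);
    demo_clear_inode(inode);
}

/* Generic path: truncates on its own only when no hook is provided. */
void demo_generic_delete_inode(struct demo_inode *inode)
{
    if (inode->ops != NULL && inode->ops->delete_inode != NULL) {
        inode->ops->delete_inode(inode);
    } else {
        demo_truncate_inode_pages(inode);
        demo_clear_inode(inode);
    }
}
```

If the generic path truncated before calling the hook, the surviving inode in this model would lose its three cached pages without ever writing them, which is the data loss the commit message describes.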
  27. 08 Sep, 2005 1 commit
  28. 14 Jul, 2005 1 commit
    • [PATCH] Fix soft lockup due to NTFS: VFS part and explanation · 88bd5121
      Anton Altaparmakov committed
      Something has changed in the core kernel such that we now get concurrent
      inode write outs, e.g. one via pdflush and one via sys_sync or whatever.
      This causes a nasty deadlock in ntfs.  The only clean solution
      unfortunately requires a minor vfs api extension.
      
      First the deadlock analysis:
      
      Prerequisite knowledge: NTFS has a file $MFT (inode 0) loaded at mount
      time.  The NTFS driver uses the page cache for storing the file contents as
      usual.  More interestingly this file contains the table of on-disk inodes
      as a sequence of MFT_RECORDs.  Thus NTFS driver accesses the on-disk inodes
      by accessing the MFT_RECORDs in the page cache pages of the loaded inode
      $MFT.
      
      The situation: VFS inode X on a mounted ntfs volume is dirty.  For the same
      inode X, the ntfs_inode is dirty, and thus so is the corresponding on-disk
      inode, which, as explained above, lives in a dirty page cache page belonging
      to the table of inodes ($MFT, inode 0).
      
      What happens:
      
      Process 1: sys_sync()/umount()/whatever...  calls __sync_single_inode() for
      $MFT -> do_writepages() -> write_page for the dirty page containing the
      on-disk inode X, the page is now locked -> ntfs_write_mst_block() which
      clears PageUptodate() on the page to prevent anyone else getting hold of it
      whilst it does the write out (this is necessary as the on-disk inode needs
      "fixups" applied before the write to disk which are removed again after the
      write and PageUptodate is then set again).  It then analyses the page
      looking for dirty on-disk inodes and when it finds one it calls
      ntfs_may_write_mft_record() to see if it is safe to write this on-disk
      inode.  This then calls ilookup5() to check if the corresponding VFS inode
      is in the icache.  This in turn calls ifind() which waits on the inode lock
      via wait_on_inode whilst holding the global inode_lock.
      
      Process 2: pdflush results in a call to __sync_single_inode for the same
      VFS inode X on the ntfs volume.  This locks the inode (I_LOCK) then calls
      write_inode -> ntfs_write_inode -> map_mft_record() -> read_cache_page() of
      the page (in page cache of table of inodes $MFT, inode 0) containing the
      on-disk inode.  This page has PageUptodate() clear because of Process 1
      (see above) so read_cache_page() blocks when it tries to take the page lock
      for the page so it can call ntfs_readpage().
      
      Thus Process 1 is holding the page lock on the page containing the on-disk
      inode X and it is waiting on the inode X to be unlocked in ifind() so it
      can write the page out and then unlock the page.
      
      And Process 2 is holding the inode lock on inode X and is waiting for the
      page to be unlocked so it can call ntfs_readpage() or discover that
      Process 1 set PageUptodate() again and use the page.
      
      Thus we have a deadlock due to ifind() waiting on the inode lock.
      
      The only sensible solution: NTFS does not care whether the VFS inode is
      locked or not when it calls ilookup5() (it doesn't use the VFS inode at
      all, it just uses it to find the corresponding ntfs_inode which is of
      course attached to the VFS inode (both are one single struct); and it uses
      the ntfs_inode which is subject to its own locking so I_LOCK is irrelevant)
      hence we want a modified ilookup5_nowait() which is the same as ilookup5()
      but it does not wait on the inode lock.
      
      Without such functionality I would have to keep my own ntfs_inode cache in
      the NTFS driver just so I can find ntfs_inodes independent of their VFS
      inodes which would be slow, memory and cpu cycle wasting, and incredibly
      stupid given the icache already exists in the VFS.
      
      Below is a patch that does the ilookup5_nowait() implementation in
      fs/inode.c and exports it.
      
      ilookup5_nowait.diff:
      
      Introduce ilookup5_nowait() which is basically the same as ilookup5() but
      it does not wait on the inode's lock (i.e. it omits the wait_on_inode()
      done in ifind()).
      
      This is needed to avoid a nasty deadlock in NTFS.
      Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
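The difference between the waiting and the non-waiting lookup in the commit above can be sketched in a single-threaded model. This is not the fs/inode.c implementation: the `demo_*` names are hypothetical, the inode cache is a plain array, and "would have to sleep on the inode lock" is modelled as a return status rather than a real wait.

```c
#include <stddef.h>

#define DEMO_I_LOCK 0x1u   /* stand-in for the I_LOCK state bit */

struct demo_inode {
    unsigned long i_ino;
    unsigned int i_state;
};

enum demo_lookup_status {
    DEMO_FOUND,        /* inode returned to the caller */
    DEMO_NOT_CACHED,   /* no such inode in the cache */
    DEMO_WOULD_WAIT    /* a real waiting lookup would sleep here */
};

/* Linear search standing in for the hashed inode cache lookup. */
static struct demo_inode *demo_find(struct demo_inode *cache, size_t n,
                                    unsigned long ino)
{
    for (size_t i = 0; i < n; i++)
        if (cache[i].i_ino == ino)
            return &cache[i];
    return NULL;
}

/* Waiting flavour: a locked inode is not handed out until it is unlocked. */
enum demo_lookup_status demo_lookup(struct demo_inode *cache, size_t n,
                                    unsigned long ino,
                                    struct demo_inode **out)
{
    struct demo_inode *inode = demo_find(cache, n, ino);

    *out = NULL;
    if (inode == NULL)
        return DEMO_NOT_CACHED;
    if (inode->i_state & DEMO_I_LOCK)
        return DEMO_WOULD_WAIT;
    *out = inode;
    return DEMO_FOUND;
}

/* Non-waiting flavour: the lock state is ignored, so it never blocks. */
enum demo_lookup_status demo_lookup_nowait(struct demo_inode *cache, size_t n,
                                           unsigned long ino,
                                           struct demo_inode **out)
{
    struct demo_inode *inode = demo_find(cache, n, ino);

    *out = inode;
    return inode != NULL ? DEMO_FOUND : DEMO_NOT_CACHED;
}
```

In the deadlock described above, Process 1 holds the page lock while performing the lookup, so a lookup that can never sleep on the inode lock removes one edge of the lock cycle.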
  29. 13 Jul, 2005 1 commit
    • [PATCH] inotify · 0eeca283
      Robert Love committed
      inotify is intended to correct the deficiencies of dnotify, particularly
      its inability to scale and its terrible user interface:
      
              * dnotify requires the opening of one fd per each directory
                that you intend to watch. This quickly results in too many
                open files and pins removable media, preventing unmount.
              * dnotify is directory-based. You only learn about changes to
                directories. Sure, a change to a file in a directory affects
                the directory, but you are then forced to keep a cache of
                stat structures.
              * dnotify's interface to user-space is awful.  Signals?
      
      inotify provides a more usable, simple, powerful solution to file change
      notification:
      
              * inotify's interface is a system call that returns a fd, not SIGIO.
                You get a single fd, which is select()-able.
              * inotify has an event that says "the filesystem that the item
                you were watching is on was unmounted."
              * inotify can watch directories or files.
      
      Inotify is currently used by Beagle (a desktop search infrastructure),
      Gamin (a FAM replacement), and other projects.
      
      See Documentation/filesystems/inotify.txt.
      Signed-off-by: Robert Love <rml@novell.com>
      Cc: John McCutchan <ttb@tentacle.dhs.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>