1. 07 1月, 2009 1 次提交
    • T
      poll: allow f_op->poll to sleep · 5f820f64
      Tejun Heo 提交于
      f_op->poll is the only vfs operation which is not allowed to sleep.  It's
      because poll and select implementation used task state to synchronize
      against wake ups, which doesn't have to be the case anymore as wait/wake
      interface can now use custom wake up functions.  The non-sleep restriction
      can be a bit tricky because ->poll is not called from an atomic context
      and the result of accidentally sleeping in ->poll only shows up as
      temporary busy looping when the timing is right or rather wrong.
      
      This patch converts poll/select to use custom wake up function and use
      separate triggered variable to synchronize against wake up events.  The
      only added overhead is an extra function call during wake up and
      negligible.
      
      This patch removes the one non-sleep exception from vfs locking rules and
      is beneficial to userland filesystem implementations like FUSE, 9p or
      peculiar fs like spufs as it's very difficult for those to implement
      non-sleeping poll method.
      
      While at it, make the following cosmetic changes to make poll.h and
      select.c checkpatch friendly.
      
      * s/type * symbol/type *symbol/		   : three places in poll.h
      * remove blank line before EXPORT_SYMBOL() : two places in select.c
      
      Oleg: spotted missing barrier in poll_schedule_timeout()
      Davide: spotted missing write barrier in pollwake()
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Eric Van Hensbergen <ericvh@gmail.com>
      Cc: Ron Minnich <rminnich@sandia.gov>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>
      Cc: Davide Libenzi <davidel@xmailserver.org>
      Cc: Brad Boyer <flar@allandria.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Cc: Davide Libenzi <davidel@xmailserver.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      5f820f64
  2. 01 1月, 2009 1 次提交
    • A
      kill ->dir_notify() · 6badd79b
      Al Viro 提交于
      Remove the hopelessly misguided ->dir_notify().  The only instance (cifs)
      has been broken by design from the very beginning; the objects it creates
      are never destroyed, keep references to struct file they can outlive, nothing
      that could possibly evict them exists on close(2) path *and* no locking
      whatsoever is done to prevent races with close(), should the previous, er,
      deficiencies someday be dealt with.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      6badd79b
  3. 31 10月, 2008 1 次提交
  4. 10 9月, 2008 1 次提交
  5. 25 7月, 2008 1 次提交
  6. 07 5月, 2008 1 次提交
    • C
      [PATCH] kill ->put_inode · 33dcdac2
      Christoph Hellwig 提交于
      And with that last patch to affs killing the last put_inode instance we
      can finally, after many years of transition kill this racy and awkward
      interface.
      
      (It's kinda funny that even the description in
      Documentation/filesystems/vfs.txt was entirely wrong..)
      
      Also remove a very misleading comment above the defintion of
      struct super_operations.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      33dcdac2
  7. 28 4月, 2008 1 次提交
  8. 08 2月, 2008 1 次提交
    • D
      iget: remove iget() and the read_inode() super op as being obsolete · 12debc42
      David Howells 提交于
      Remove the old iget() call and the read_inode() superblock operation it uses
      as these are really obsolete, and the use of read_inode() does not produce
      proper error handling (no distinction between ENOMEM and EIO when marking an
      inode bad).
      
      Furthermore, this removes the temptation to use iget() to find an inode by
      number in a filesystem from code outside that filesystem.
      
      iget_locked() should be used instead.  A new function is added in an earlier
      patch (iget_failed) that is to be called to mark an inode as bad, unlock it
      and release it should the get routine fail.  Mark iget() and read_inode() as
      being obsolete and remove references to them from the documentation.
      
      Typically a filesystem will be modified such that the read_inode function
      becomes an internal iget function, for example the following:
      
      	void thingyfs_read_inode(struct inode *inode)
      	{
      		...
      	}
      
      would be changed into something like:
      
      	struct inode *thingyfs_iget(struct super_block *sp, unsigned long ino)
      	{
      		struct inode *inode;
      		int ret;
      
      		inode = iget_locked(sb, ino);
      		if (!inode)
      			return ERR_PTR(-ENOMEM);
      		if (!(inode->i_state & I_NEW))
      			return inode;
      
      		...
      		unlock_new_inode(inode);
      		return inode;
      	error:
      		iget_failed(inode);
      		return ERR_PTR(ret);
      	}
      
      and then thingyfs_iget() would be called rather than iget(), for example:
      
      	ret = -EINVAL;
      	inode = iget(sb, ino);
      	if (!inode || is_bad_inode(inode))
      		goto error;
      
      becomes:
      
      	inode = thingyfs_iget(sb, ino);
      	if (IS_ERR(inode)) {
      		ret = PTR_ERR(inode);
      		goto error;
      	}
      
      Note that is_bad_inode() does not need to be called.  The error returned by
      thingyfs_iget() should render it unnecessary.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Acked-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      12debc42
  9. 20 10月, 2007 1 次提交
  10. 17 10月, 2007 1 次提交
  11. 20 7月, 2007 3 次提交
    • N
      mm: fault feedback #1 · d0217ac0
      Nick Piggin 提交于
      Change ->fault prototype.  We now return an int, which contains
      VM_FAULT_xxx code in the low byte, and FAULT_RET_xxx code in the next byte.
       FAULT_RET_ code tells the VM whether a page was found, whether it has been
      locked, and potentially other things.  This is not quite the way he wanted
      it yet, but that's changed in the next patch (which requires changes to
      arch code).
      
      This means we no longer set VM_CAN_INVALIDATE in the vma in order to say
      that a page is locked which requires filemap_nopage to go away (because we
      can no longer remain backward compatible without that flag), but we were
      going to do that anyway.
      
      struct fault_data is renamed to struct vm_fault as Linus asked. address
      is now a void __user * that we should firmly encourage drivers not to use
      without really good reason.
      
      The page is now returned via a page pointer in the vm_fault struct.
      Signed-off-by: NNick Piggin <npiggin@suse.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      d0217ac0
    • M
      Document ->page_mkwrite() locking · ed2f2f9b
      Mark Fasheh 提交于
      There seems to be very little documentation about this callback in general.
      The locking in particular is a bit tricky, so it's worth having this in
      writing.
      Signed-off-by: NMark Fasheh <mark.fasheh@oracle.com>
      Cc: Nick Piggin <nickpiggin@yahoo.com.au>
      Cc: David Howells <dhowells@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ed2f2f9b
    • N
      mm: merge populate and nopage into fault (fixes nonlinear) · 54cb8821
      Nick Piggin 提交于
      Nonlinear mappings are (AFAIKS) simply a virtual memory concept that encodes
      the virtual address -> file offset differently from linear mappings.
      
      ->populate is a layering violation because the filesystem/pagecache code
      should need to know anything about the virtual memory mapping.  The hitch here
      is that the ->nopage handler didn't pass down enough information (ie.  pgoff).
       But it is more logical to pass pgoff rather than have the ->nopage function
      calculate it itself anyway (because that's a similar layering violation).
      
      Having the populate handler install the pte itself is likewise a nasty thing
      to be doing.
      
      This patch introduces a new fault handler that replaces ->nopage and
      ->populate and (later) ->nopfn.  Most of the old mechanism is still in place
      so there is a lot of duplication and nice cleanups that can be removed if
      everyone switches over.
      
      The rationale for doing this in the first place is that nonlinear mappings are
      subject to the pagefault vs invalidate/truncate race too, and it seemed stupid
      to duplicate the synchronisation logic rather than just consolidate the two.
      
      After this patch, MAP_NONBLOCK no longer sets up ptes for pages present in
      pagecache.  Seems like a fringe functionality anyway.
      
      NOPAGE_REFAULT is removed.  This should be implemented with ->fault, and no
      users have hit mainline yet.
      
      [akpm@linux-foundation.org: cleanup]
      [randy.dunlap@oracle.com: doc. fixes for readahead]
      [akpm@linux-foundation.org: build fix]
      Signed-off-by: NNick Piggin <npiggin@suse.de>
      Signed-off-by: NRandy Dunlap <randy.dunlap@oracle.com>
      Cc: Mark Fasheh <mark.fasheh@oracle.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      54cb8821
  12. 09 5月, 2007 2 次提交
  13. 12 1月, 2007 1 次提交
    • T
      [PATCH] NFS: Fix race in nfs_release_page() · e3db7691
      Trond Myklebust 提交于
          NFS: Fix race in nfs_release_page()
      
          invalidate_inode_pages2() may find the dirty bit has been set on a page
          owing to the fact that the page may still be mapped after it was locked.
          Only after the call to unmap_mapping_range() are we sure that the page
          can no longer be dirtied.
          In order to fix this, NFS has hooked the releasepage() method and tries
          to write the page out between the call to unmap_mapping_range() and the
          call to remove_mapping(). This, however leads to deadlocks in the page
          reclaim code, where the page may be locked without holding a reference
          to the inode or dentry.
      
          Fix is to add a new address_space_operation, launder_page(), which will
          attempt to write out a dirty page without releasing the page lock.
      Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
      
          Also, the bare SetPageDirty() can skew all sort of accounting leading to
          other nasties.
      
      [akpm@osdl.org: cleanup]
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      e3db7691
  14. 08 12月, 2006 1 次提交
  15. 01 10月, 2006 1 次提交
  16. 11 7月, 2006 1 次提交
  17. 23 6月, 2006 2 次提交
    • D
      [PATCH] VFS: Permit filesystem to perform statfs with a known root dentry · 726c3342
      David Howells 提交于
      Give the statfs superblock operation a dentry pointer rather than a superblock
      pointer.
      
      This complements the get_sb() patch.  That reduced the significance of
      sb->s_root, allowing NFS to place a fake root there.  However, NFS does
      require a dentry to use as a target for the statfs operation.  This permits
      the root in the vfsmount to be used instead.
      
      linux/mount.h has been added where necessary to make allyesconfig build
      successfully.
      
      Interest has also been expressed for use with the FUSE and XFS filesystems.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Acked-by: NAl Viro <viro@zeniv.linux.org.uk>
      Cc: Nathan Scott <nathans@sgi.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      726c3342
    • D
      [PATCH] VFS: Permit filesystem to override root dentry on mount · 454e2398
      David Howells 提交于
      Extend the get_sb() filesystem operation to take an extra argument that
      permits the VFS to pass in the target vfsmount that defines the mountpoint.
      
      The filesystem is then required to manually set the superblock and root dentry
      pointers.  For most filesystems, this should be done with simple_set_mnt()
      which will set the superblock pointer and then set the root dentry to the
      superblock's s_root (as per the old default behaviour).
      
      The get_sb() op now returns an integer as there's now no need to return the
      superblock pointer.
      
      This patch permits a superblock to be implicitly shared amongst several mount
      points, such as can be done with NFS to avoid potential inode aliasing.  In
      such a case, simple_set_mnt() would not be called, and instead the mnt_root
      and mnt_sb would be set directly.
      
      The patch also makes the following changes:
      
       (*) the get_sb_*() convenience functions in the core kernel now take a vfsmount
           pointer argument and return an integer, so most filesystems have to change
           very little.
      
       (*) If one of the convenience function is not used, then get_sb() should
           normally call simple_set_mnt() to instantiate the vfsmount. This will
           always return 0, and so can be tail-called from get_sb().
      
       (*) generic_shutdown_super() now calls shrink_dcache_sb() to clean up the
           dcache upon superblock destruction rather than shrink_dcache_anon().
      
           This is required because the superblock may now have multiple trees that
           aren't actually bound to s_root, but that still need to be cleaned up. The
           currently called functions assume that the whole tree is rooted at s_root,
           and that anonymous dentries are not the roots of trees which results in
           dentries being left unculled.
      
           However, with the way NFS superblock sharing are currently set to be
           implemented, these assumptions are violated: the root of the filesystem is
           simply a dummy dentry and inode (the real inode for '/' may well be
           inaccessible), and all the vfsmounts are rooted on anonymous[*] dentries
           with child trees.
      
           [*] Anonymous until discovered from another tree.
      
       (*) The documentation has been adjusted, including the additional bit of
           changing ext2_* into foo_* in the documentation.
      
      [akpm@osdl.org: convert ipath_fs, do other stuff]
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Acked-by: NAl Viro <viro@zeniv.linux.org.uk>
      Cc: Nathan Scott <nathans@sgi.com>
      Cc: Roland Dreier <rolandd@cisco.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      454e2398
  18. 01 5月, 2005 1 次提交
  19. 17 4月, 2005 1 次提交
    • L
      Linux-2.6.12-rc2 · 1da177e4
      Linus Torvalds 提交于
      Initial git repository build. I'm not bothering with the full history,
      even though we have it. We can create a separate "historical" git
      archive of that later if we want to, and in the meantime it's about
      3.2GB when imported into git - space that would just make the early
      git days unnecessarily complicated, when we don't have a lot of good
      infrastructure for it.
      
      Let it rip!
      1da177e4