1. 07 Jan, 2011 2 commits
    • fs: rcu-walk for path lookup · 31e6b01f
      Committed by Nick Piggin
      Perform common cases of path lookups without any stores or locking in the
      ancestor dentry elements. This is called rcu-walk, as opposed to the current
      algorithm which is a refcount based walk, or ref-walk.
      
      This results in far fewer atomic operations on every path element,
      significantly improving path lookup performance. It also avoids cacheline
      bouncing on common dentries, significantly improving scalability.
      
      The overall design is like this:
      * LOOKUP_RCU is set in nd->flags, which distinguishes rcu-walk from ref-walk.
      * Take the RCU lock for the entire path walk, starting with the acquiring
        of the starting path (eg. root/cwd/fd-path). So now dentry refcounts are
        not required for dentry persistence.
      * synchronize_rcu is called when unregistering a filesystem, so we can
        access d_ops and i_ops during rcu-walk.
      * Similarly take the vfsmount lock for the entire path walk. So now mnt
        refcounts are not required for persistence. Also we are free to perform mount
        lookups, and to assume dentry mount points and mount roots are stable up and
        down the path.
      * Have a per-dentry seqlock to protect the dentry name, parent, and inode,
        so we can load this tuple atomically, and also check whether any of its
        members have changed.
      * Dentry lookups (based on parent, candidate string tuple) recheck the parent
        sequence after the child is found in case anything changed in the parent
        during the path walk.
      * inode is also RCU protected so we can load d_inode and use the inode for
        limited things.
      * i_mode, i_uid, i_gid can be tested for exec permissions during path walk.
      * i_op can be loaded.
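
The per-dentry sequence check described in the list above can be illustrated outside the kernel. The following is a minimal userspace sketch using C11 atomics, not the kernel's seqcount implementation: `demo_dentry`, `demo_snapshot`, and every `demo_*` function are hypothetical names, the integer fields stand in for the (name, parent, inode) tuple, and writers are assumed to be serialized by some external lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a dentry's (parent, inode) fields. */
struct demo_dentry {
    atomic_uint seq;        /* even: stable, odd: a writer is mid-update */
    atomic_int parent_id;
    atomic_int inode_id;
};

struct demo_snapshot {
    int parent_id;
    int inode_id;
};

/* Wait until no writer is active and return the sequence number seen. */
static unsigned demo_read_begin(struct demo_dentry *d)
{
    unsigned s;

    while ((s = atomic_load_explicit(&d->seq, memory_order_acquire)) & 1u)
        ; /* writer in progress: spin */
    return s;
}

/* Return true if a writer ran since demo_read_begin() returned 'start'. */
static bool demo_read_retry(struct demo_dentry *d, unsigned start)
{
    atomic_thread_fence(memory_order_acquire);
    return atomic_load_explicit(&d->seq, memory_order_relaxed) != start;
}

/* Update both fields; callers are assumed to hold an external writer lock. */
static void demo_write(struct demo_dentry *d, int parent_id, int inode_id)
{
    unsigned s = atomic_load_explicit(&d->seq, memory_order_relaxed);

    assert((s & 1u) == 0);  /* no other writer may be active */
    atomic_store_explicit(&d->seq, s + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&d->parent_id, parent_id, memory_order_relaxed);
    atomic_store_explicit(&d->inode_id, inode_id, memory_order_relaxed);
    atomic_store_explicit(&d->seq, s + 2, memory_order_release);
}

/* Load both fields as one consistent tuple, retrying if a writer raced. */
static struct demo_snapshot demo_read(struct demo_dentry *d)
{
    struct demo_snapshot snap;
    unsigned s;

    do {
        s = demo_read_begin(d);
        snap.parent_id = atomic_load_explicit(&d->parent_id, memory_order_relaxed);
        snap.inode_id = atomic_load_explicit(&d->inode_id, memory_order_relaxed);
    } while (demo_read_retry(d, s));
    return snap;
}
```

Readers never store to the shared structure, which is the property the commit message relies on to avoid cacheline bouncing; the cost is that a reader may have to repeat its loads when a writer intervenes.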
      
      When we reach the destination dentry, we lock it, recheck lookup sequence,
      and increment its refcount and mountpoint refcount. RCU and vfsmount locks
      are dropped. This is termed "dropping rcu-walk". If the dentry refcount does
      not match, we cannot drop rcu-walk gracefully at the current point in the
      lookup, so instead return -ECHILD (for want of a better errno). This signals the
      path walking code to re-do the entire lookup with a ref-walk.
      
      Aside from the final dentry, there are other situations that may be encountered
      where we cannot continue rcu-walk. In that case, we drop rcu-walk (ie. take
      a reference on the last good dentry) and continue with a ref-walk. Again, if
      we cannot drop rcu-walk gracefully, we return -ECHILD and do the whole lookup
      using ref-walk. But it is very important that we can continue with ref-walk
      for most cases, particularly to avoid the overhead of double lookups, and to
      gain the scalability advantages on common path elements (like cwd and root).
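
The -ECHILD fallback described above amounts to "try the optimistic walk first, redo everything in the slow mode only on that one error". A minimal sketch of the control flow follows, assuming a hypothetical walker callback; `walk_mode`, `walk_fn`, `demo_path_lookup`, `demo_stats`, and `demo_walk` are illustrative names and not kernel interfaces.

```c
#include <errno.h>
#include <stdbool.h>

enum walk_mode { WALK_RCU, WALK_REF };

/* Hypothetical walker: returns 0 on success or a negative errno. */
typedef int (*walk_fn)(const char *path, enum walk_mode mode, void *ctx);

/* Try the lockless walk first; on -ECHILD redo the whole lookup with refs. */
static int demo_path_lookup(const char *path, walk_fn walk, void *ctx)
{
    int err = walk(path, WALK_RCU, ctx);

    if (err == -ECHILD)
        err = walk(path, WALK_REF, ctx);
    return err;
}

/* Toy walker used only to exercise the fallback; it counts attempts. */
struct demo_stats {
    bool rcu_can_finish;    /* whether the lockless attempt succeeds */
    int rcu_attempts;
    int ref_attempts;
};

static int demo_walk(const char *path, enum walk_mode mode, void *ctx)
{
    struct demo_stats *st = ctx;

    (void)path;             /* the toy walker ignores the actual path */
    if (mode == WALK_RCU) {
        st->rcu_attempts++;
        return st->rcu_can_finish ? 0 : -ECHILD;
    }
    st->ref_attempts++;
    return 0;
}
```

Any error other than -ECHILD is returned to the caller unchanged, so only the "could not drop rcu-walk gracefully" case pays for a second lookup.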
      
      The cases where rcu-walk cannot continue are:
      * NULL dentry (ie. any uncached path element)
      * parent with d_inode->i_op->permission or ACLs
      * dentries with d_revalidate
      * Following links
      
      In future patches, permission checks and d_revalidate become rcu-walk aware. It
      may be possible eventually to make following links rcu-walk aware.
      
      Uncached path elements will always require dropping to ref-walk mode, at the
      very least because i_mutex needs to be grabbed, and objects allocated.
      Signed-off-by: Nick Piggin <npiggin@kernel.dk>
    • fs: dcache remove dcache_lock · b5c84bf6
      Committed by Nick Piggin
      dcache_lock no longer protects anything. Remove it.
      Signed-off-by: Nick Piggin <npiggin@kernel.dk>
  2. 23 Dec, 2009 1 commit
  3. 12 Dec, 2009 1 commit
  4. 21 Sep, 2009 1 commit
  5. 12 Jun, 2009 3 commits
  6. 09 May, 2009 1 commit
  7. 01 Jan, 2009 1 commit
  8. 23 Oct, 2008 3 commits
  9. 27 Jul, 2008 5 commits
  10. 15 Feb, 2008 4 commits
  11. 14 Feb, 2008 1 commit
  12. 17 Oct, 2007 1 commit
    • partially fix up the lookup_one_noperm mess · eead1911
      Committed by Christoph Hellwig
      Try to fix the mess created by sysfs braindamage.
      
       - refactor code internal to fs/namei.c a little to avoid too much
         duplication:
      	o __lookup_hash_kern is renamed back to __lookup_hash
      	o the old __lookup_hash goes away, permission checks move to
      	  the two callers
      	o useless inline qualifiers on above functions go away
       - lookup_one_len_kern loses its last argument and is renamed to
         lookup_one_noperm to make its usage a little more clear
       - added kerneldoc comments to describe lookup_one_len as well as
         lookup_one_noperm and make it very clear that no one should use
         the latter ever.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Cc: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
      Cc: Miklos Szeredi <miklos@szeredi.hu>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  13. 20 Jul, 2007 3 commits
  14. 28 Apr, 2007 1 commit
  15. 09 Dec, 2006 1 commit
  16. 01 Oct, 2006 1 commit
  17. 30 Sep, 2006 1 commit
  18. 15 Jul, 2006 1 commit
  19. 01 Apr, 2006 1 commit
  20. 19 Jan, 2006 1 commit
    • [PATCH] vfs: *at functions: core · 5590ff0d
      Committed by Ulrich Drepper
      Here is a series of patches which introduce in total 13 new system calls
      which take a file descriptor/filename pair instead of a single file
      name.  These functions, openat etc, have been discussed on numerous
      occasions.  They are needed to implement race-free filesystem traversal,
      they are necessary to implement a virtual per-thread current working
      directory (think multi-threaded backup software), etc.
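
As an illustration of the file descriptor/filename pairing described above, here is a minimal userspace sketch that creates, inspects, and removes a file relative to an open directory descriptor. It uses the POSIX calls `openat`, `fstatat`, and `unlinkat`; the helper name `demo_create_stat_remove` and its return convention are assumptions made for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Create 'name' inside 'dir_path', check that it is a regular file, and
 * remove it again. Every step after the initial open() is resolved
 * relative to the directory descriptor, not to a rebuilt path string.
 * Returns 0 on success and -1 on any failure.
 */
static int demo_create_stat_remove(const char *dir_path, const char *name)
{
    struct stat st;
    int ret = -1;
    int fd;
    int dirfd = open(dir_path, O_RDONLY | O_DIRECTORY);

    if (dirfd < 0)
        return -1;

    fd = openat(dirfd, name, O_WRONLY | O_CREAT, 0600);
    if (fd < 0)
        goto out;
    close(fd);

    /* Look the new entry up relative to dirfd without following symlinks. */
    if (fstatat(dirfd, name, &st, AT_SYMLINK_NOFOLLOW) == 0 &&
        S_ISREG(st.st_mode))
        ret = 0;

    /* Remove the entry, again relative to dirfd. */
    if (unlinkat(dirfd, name, 0) != 0)
        ret = -1;
out:
    close(dirfd);
    return ret;
}
```

Because the directory is pinned by its descriptor, renaming or replacing a parent directory between the calls does not redirect the later operations, which is the race-free traversal property the commit message refers to.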
      
      We have in glibc today implementations of the interfaces which use the
      /proc/self/fd magic.  But this code is rather expensive.  Here are some
      results (similar to what Jim Meyering posted before).
      
      The test creates a deep directory hierarchy on a tmpfs filesystem.  Then
      rm -fr is used to remove all directories.  Without syscall support I get
      this:
      
      real    0m31.921s
      user    0m0.688s
      sys     0m31.234s
      
      With syscall support the results are much better:
      
      real    0m20.699s
      user    0m0.536s
      sys     0m20.149s
      
      The interfaces are for obvious reasons currently not much used.  But they'll
      be used.  coreutils (and Jeff's posixutils) are already using them.
      Furthermore, code like ftw/fts in libc (maybe even glob) will also start using
      them.  I expect a patch for make to follow soon.  Every program which is walking
      the filesystem tree will benefit.
      Signed-off-by: Ulrich Drepper <drepper@redhat.com>
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Al Viro <viro@ftp.linux.org.uk>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      Cc: Michael Kerrisk <mtk-manpages@gmx.net>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  21. 11 Jan, 2006 1 commit
  22. 09 Nov, 2005 1 commit
    • [PATCH] sanitize lookup_hash prototype · 49705b77
      Committed by Christoph Hellwig
      ->permission and ->lookup have a struct nameidata * argument these days to
      pass down lookup intents.  Unfortunately some callers of lookup_hash don't
      actually pass this one down.  For lookup_one_len() we don't have a struct
      nameidata to pass down, but as this function is a library function only
      used by filesystem code this is an acceptable limitation.  All other
      callers should pass down the nameidata, so this patch changes the
      lookup_hash interface to only take a struct nameidata argument and derives
      the other two arguments to __lookup_hash from it.  All callers already have
      the nameidata argument available so this is not a problem.
      
      At the same time I'd like to deprecate the lookup_hash interface as there
      are better exported interfaces for filesystem usage.  Before it can
      actually be removed I need to fix up rpc_pipefs.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Cc: Ram Pai <linuxram@us.ibm.com>
      Cc: Jeff Mahoney <jeffm@suse.com>
      Cc: Al Viro <viro@ftp.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  23. 19 Oct, 2005 1 commit
  24. 17 Apr, 2005 1 commit
    • Linux-2.6.12-rc2 · 1da177e4
      Committed by Linus Torvalds
      Initial git repository build. I'm not bothering with the full history,
      even though we have it. We can create a separate "historical" git
      archive of that later if we want to, and in the meantime it's about
      3.2GB when imported into git - space that would just make the early
      git days unnecessarily complicated, when we don't have a lot of good
      infrastructure for it.
      
      Let it rip!