1. 29 Oct, 2006 2 commits
  2. 22 Oct, 2006 1 commit
  3. 12 Oct, 2006 1 commit
    • [PATCH] VFS: Destroy the dentries contributed by a superblock on unmounting · c636ebdb
      David Howells committed
      The attached patch destroys all the dentries attached to a superblock in one go
      by:
      
       (1) Destroying the tree rooted at s_root.
      
       (2) Destroying every entry in the anon list, one at a time.
      
       (3) Each entry in the anon list has its subtree consumed from the leaves
           inwards.
      
      This reduces the amount of work generic_shutdown_super() does, and avoids
      iterating through the dentry_unused list.
      
      Note that locking is almost entirely absent in the shrink_dcache_for_umount*()
      functions added by this patch.  This is because:
      
       (1) at the point the filesystem calls generic_shutdown_super(), it is not
           permitted to further touch the superblock's set of dentries, and nor may
           it remove aliases from inodes;
      
       (2) the dcache memory shrinker now skips dentries that are being unmounted;
           and
      
       (3) the superblock no longer has any external references through which the VFS
           can reach it.
      
      Given these points, the only locking we need to do is when we remove dentries
      from the unused list and the name hashes, which we do a directory's worth at a
      time.
      
      We also don't need to guard against reference counts going to zero unexpectedly
      and removing bits of the tree we're working on as nothing else can call dput().
      
      A cut-down version of dentry_iput() has been folded into the
      shrink_dcache_for_umount_subtree() function.  Apart from not needing to unlock
      things, it also doesn't need to check for inotify watches.
      
      In this version of the patch, the complaint about a dentry still being in use
      has been expanded from a single BUG_ON() and now gives much more information.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: NeilBrown <neilb@suse.de>
      Acked-by: Ian Kent <raven@themaw.net>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
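
The "consumed from the leaves inwards" teardown described in this commit can be illustrated with a small user-space sketch. It is not the kernel's shrink_dcache_for_umount_subtree(); `struct node`, `node_new()` and `destroy_subtree()` are hypothetical names, and the sketch assumes, as the commit message does for the unmount case, that nothing else touches the tree while it is being destroyed, so no locking or reference counting is modelled.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical toy node; a stand-in for a dentry, not the kernel's struct. */
struct node {
    struct node *parent;        /* NULL for the root of the tree */
    struct node *first_child;   /* head of a singly linked child list */
    struct node *next_sibling;
};

/* Allocate a node and link it as the first child of parent (may be NULL). */
static struct node *node_new(struct node *parent)
{
    struct node *n = calloc(1, sizeof(*n));

    if (!n)
        abort();
    n->parent = parent;
    if (parent) {
        n->next_sibling = parent->first_child;
        parent->first_child = n;
    }
    return n;
}

/*
 * Destroy the subtree rooted at root without recursion: descend to a leaf,
 * free it, then step back to its parent, which eventually becomes a leaf
 * itself.  Returns the number of nodes freed.
 */
static unsigned long destroy_subtree(struct node *root)
{
    struct node *cur = root;
    unsigned long freed = 0;

    while (cur) {
        struct node *parent;

        /* Walk down the first-child chain until we reach a leaf. */
        while (cur->first_child)
            cur = cur->first_child;

        parent = (cur == root) ? NULL : cur->parent;
        if (parent)
            parent->first_child = cur->next_sibling; /* unlink the leaf */
        free(cur);
        freed++;
        cur = parent;
    }
    return freed;
}
```

Because a parent is only freed once its child list is empty, the walk never needs a stack and never revisits freed memory. The root passed in is expected to be detached from any larger tree.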
  4. 04 Oct, 2006 1 commit
    • [PATCH] knfsd: close a race-opportunity in d_splice_alias · 21c0d8fd
      NeilBrown committed
      There is a possible race in d_splice_alias.  Though __d_find_alias(inode, 1)
      will only return a dentry with DCACHE_DISCONNECTED set, it is possible for it
      to get cleared before the BUG_ON, and it is not possible to lock against
      that.
      
      There are a couple of problems here.  Firstly, the code doesn't match the
      comment.  The comment describes a 'disconnected' dentry as being IS_ROOT as
      well as DCACHE_DISCONNECTED, however there is no testing of IS_ROOT anywhere.
      
      A dentry is marked DCACHE_DISCONNECTED when allocated with d_alloc_anon, and
      remains DCACHE_DISCONNECTED while a path is built up towards the root.  So a
      dentry can have a valid name and a valid parent and even grandparent, but will
      still be DCACHE_DISCONNECTED until a path to the root is created.  Once the
      path to the root is complete, everything in the path gets DCACHE_DISCONNECTED
      cleared.  So the fact that DCACHE_DISCONNECTED is set isn't enough to say that a
      dentry is free to be spliced in with a given name.  This can only be allowed
      if the dentry does not yet have a name, so the IS_ROOT test is needed too.
      
      However even adding that test to __d_find_alias isn't enough.  As
      d_splice_alias drops dcache_lock before calling d_move to perform the splice,
      it could race with another thread calling d_splice_alias to splice the inode
      in with a different name in a different part of the tree (in the case where a
      file has hard links).  So that splicing code is only really safe for
      directories (as we know that directories only have one link).  For
      directories, the caller of d_splice_alias will be holding i_mutex on the
      (unique) parent so there is no room for a race.
      
      A consequence of this is that a non-directory will never benefit from being
      spliced into a pre-existing dentry, but that isn't a problem.  It is
      perfectly OK for a non-directory to have multiple dentries, some anonymous,
      some not.  And the comment for d_splice_alias says that it only happens for
      directories anyway.
      Signed-off-by: Neil Brown <neilb@suse.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Dipankar Sarma <dipankar@in.ibm.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
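
The condition argued for in this commit (a directory alias that is both IS_ROOT and DCACHE_DISCONNECTED) can be written down as a small predicate. This is a user-space sketch only: `struct toy_dentry`, `TOY_DISCONNECTED`, `toy_is_root()` and `toy_may_splice()` are hypothetical names, and the real check sits under dcache_lock in the kernel, which is not modelled here.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_DISCONNECTED 0x1u   /* stand-in for DCACHE_DISCONNECTED */

/* Hypothetical, heavily reduced dentry. */
struct toy_dentry {
    struct toy_dentry *d_parent;  /* points to itself while the dentry is a root */
    unsigned int d_flags;
    bool is_dir;
};

/* Mirrors the idea behind IS_ROOT(): a dentry that is its own parent has no name yet. */
static bool toy_is_root(const struct toy_dentry *d)
{
    return d->d_parent == d;
}

/*
 * An existing alias may be spliced in under a new name only if it is a
 * directory (exactly one link, so no competing splice), still nameless,
 * and still marked disconnected.
 */
static bool toy_may_splice(const struct toy_dentry *alias)
{
    return alias->is_dir &&
           toy_is_root(alias) &&
           (alias->d_flags & TOY_DISCONNECTED) != 0;
}
```

A dentry that already has a parent but is still flagged disconnected (a partially built path) fails the test, which is the case the commit message singles out.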
  5. 01 Oct, 2006 1 commit
  6. 23 Sep, 2006 1 commit
    • NFS: Add dentry materialisation op · 770bfad8
      David Howells committed
      The attached patch adds a new directory cache management function that prepares
      a disconnected anonymous dentry to be connected into the dentry tree. The
      anonymous dentry is given the name and parentage of another dentry.
      
      The following changes were made in [try #2]:
      
       (*) d_materialise_dentry() now switches the parentage of the two nodes around
           correctly when one or other of them is self-referential.
      
      The following changes were made in [try #7]:
      
       (*) d_instantiate_unique() has had the interior part split out as function
           __d_instantiate_unique(). Callers of this latter function must be holding
           the appropriate locks.
      
       (*) _d_rehash() has been added as a wrapper around __d_rehash() to call it
           with the most obvious hash list (the one from the name). d_rehash() now
           calls _d_rehash().
      
       (*) d_materialise_dentry() is now __d_materialise_dentry() and is static.
      
       (*) d_materialise_unique() added to perform the combination of d_find_alias(),
           d_materialise_dentry() and d_add_unique() that the NFS client was doing
           twice, all within a single dcache_lock critical section. This reduces the
           number of times two different spinlocks were being accessed.
      
      The following further changes were made:
      
       (*) Add the dentries onto their parents' d_subdirs lists.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
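
The note about switching parentage "correctly when one or other of them is self-referential" is easier to see in code. The following is a user-space sketch, not the kernel function: `struct toy_dentry` and `toy_materialise()` are hypothetical, names are plain string pointers, and hashing, d_subdirs handling and locking are all left out.

```c
#include <assert.h>

/* Hypothetical, heavily reduced dentry. */
struct toy_dentry {
    struct toy_dentry *d_parent;  /* a self-pointer marks a root dentry */
    const char *d_name;
};

/*
 * Exchange name and parentage between dentry and anon.  A naive pointer
 * swap would leave a root's self-reference pointing at the *other* node,
 * so a self-referential parent is re-pointed at its new owner instead.
 */
static void toy_materialise(struct toy_dentry *dentry, struct toy_dentry *anon)
{
    struct toy_dentry *dparent = dentry->d_parent;
    struct toy_dentry *aparent = anon->d_parent;
    const char *name = dentry->d_name;

    dentry->d_name = anon->d_name;
    anon->d_name = name;

    dentry->d_parent = (aparent == anon) ? dentry : aparent;
    anon->d_parent = (dparent == dentry) ? anon : dparent;
}
```

After the call, the previously anonymous dentry sits under the other dentry's old parent with its name, and the other dentry has become a self-parented root.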
  7. 04 Jul, 2006 2 commits
  8. 01 Jul, 2006 1 commit
  9. 27 Jun, 2006 2 commits
  10. 26 Jun, 2006 1 commit
  11. 23 Jun, 2006 3 commits
    • [PATCH] VFS: Permit filesystem to override root dentry on mount · 454e2398
      David Howells committed
      Extend the get_sb() filesystem operation to take an extra argument that
      permits the VFS to pass in the target vfsmount that defines the mountpoint.
      
      The filesystem is then required to manually set the superblock and root dentry
      pointers.  For most filesystems, this should be done with simple_set_mnt()
      which will set the superblock pointer and then set the root dentry to the
      superblock's s_root (as per the old default behaviour).
      
      The get_sb() op now returns an integer as there's now no need to return the
      superblock pointer.
      
      This patch permits a superblock to be implicitly shared amongst several mount
      points, such as can be done with NFS to avoid potential inode aliasing.  In
      such a case, simple_set_mnt() would not be called, and instead the mnt_root
      and mnt_sb would be set directly.
      
      The patch also makes the following changes:
      
       (*) the get_sb_*() convenience functions in the core kernel now take a vfsmount
           pointer argument and return an integer, so most filesystems have to change
           very little.
      
       (*) If one of the convenience functions is not used, then get_sb() should
           normally call simple_set_mnt() to instantiate the vfsmount. This will
           always return 0, and so can be tail-called from get_sb().
      
       (*) generic_shutdown_super() now calls shrink_dcache_sb() to clean up the
           dcache upon superblock destruction rather than shrink_dcache_anon().
      
           This is required because the superblock may now have multiple trees that
           aren't actually bound to s_root, but that still need to be cleaned up. The
           currently called functions assume that the whole tree is rooted at s_root,
           and that anonymous dentries are not the roots of trees, which results in
           dentries being left unculled.
      
           However, with the way NFS superblock sharing is currently set to be
           implemented, these assumptions are violated: the root of the filesystem is
           simply a dummy dentry and inode (the real inode for '/' may well be
           inaccessible), and all the vfsmounts are rooted on anonymous[*] dentries
           with child trees.
      
           [*] Anonymous until discovered from another tree.
      
       (*) The documentation has been adjusted, including the additional bit of
           changing ext2_* into foo_* in the documentation.
      
      [akpm@osdl.org: convert ipath_fs, do other stuff]
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Nathan Scott <nathans@sgi.com>
      Cc: Roland Dreier <rolandd@cisco.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
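
The new calling convention described in this commit (get_sb() returns an integer and fills in the vfsmount, usually by tail-calling simple_set_mnt()) might look roughly like this in a user-space model. All `toy_` types and functions are hypothetical stand-ins, the argument list of the real get_sb() is longer, and reference counting on the root dentry is deliberately not modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced VFS objects. */
struct toy_dentry { int unused; };
struct toy_super_block { struct toy_dentry *s_root; };
struct toy_vfsmount {
    struct toy_super_block *mnt_sb;
    struct toy_dentry *mnt_root;
};

/*
 * Model of the default behaviour: point the mount at the superblock and
 * at the superblock's s_root.  Always succeeds, so it returns 0.
 */
static int toy_simple_set_mnt(struct toy_vfsmount *mnt, struct toy_super_block *sb)
{
    mnt->mnt_sb = sb;
    mnt->mnt_root = sb->s_root;
    return 0;
}

/* An ordinary filesystem: locate the superblock, then tail-call the helper. */
static int toy_foo_get_sb(struct toy_super_block *sb, struct toy_vfsmount *mnt)
{
    /* ... a real filesystem would find or create its superblock here ... */
    return toy_simple_set_mnt(mnt, sb);
}

/*
 * A filesystem sharing one superblock between mounts: it sets mnt_sb and
 * mnt_root directly so that the mount is rooted somewhere other than s_root.
 */
static int toy_shared_get_sb(struct toy_super_block *sb, struct toy_dentry *root,
                             struct toy_vfsmount *mnt)
{
    mnt->mnt_sb = sb;
    mnt->mnt_root = root;
    return 0;
}
```

The second function corresponds to the NFS-style case in the message, where simple_set_mnt() is not called and the mount root differs from s_root.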
    • [PATCH] prune_one_dentry() tweaks · d702ccb3
      Andrew Morton committed
      - Add description of d_lock handling to comments over prune_one_dentry().
      
      - It has three callsites - uninline it, saving 200 bytes of text.
      
      Cc: Jan Blunck <jblunck@suse.de>
      Cc: Kirill Korotaev <dev@openvz.org>
      Cc: Olaf Hering <olh@suse.de>
      Cc: Balbir Singh <balbir@in.ibm.com>
      Cc: Neil Brown <neilb@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] Fix dcache race during umount · 0feae5c4
      NeilBrown committed
      The race is that the shrink_dcache_memory shrinker could get called while a
      filesystem is being unmounted, and could try to prune a dentry belonging to
      that filesystem.
      
      If it does, then it will call in to iput on the inode while the dentry is
      no longer able to be found by the umounting process.  If iput takes a
      while, generic_shutdown_super could get all the way through
      shrink_dcache_parent and shrink_dcache_anon and invalidate_inodes without
      ever waiting on this particular inode.
      
      Eventually the superblock gets freed anyway and if the iput tried to touch
      it (which some filesystems certainly do), it will lose.  The promised
      "Self-destruct in 5 seconds" doesn't lead to a nice day.
      
      The race is closed by holding s_umount while calling prune_one_dentry on
      someone else's dentry.  As a down_read_trylock is used,
      shrink_dcache_memory will no longer try to prune the dentry of a filesystem
      that is being unmounted, and unmount will not be able to start until any
      such active prune_one_dentry completes.
      
      This requires that prune_dcache *knows* which filesystem (if any) it is
      doing the prune on behalf of so that it can be careful of other
      filesystems.  shrink_dcache_memory isn't called on behalf of any
      filesystem, and so is careful of everything.
      
      shrink_dcache_anon is now passed a super_block rather than the s_anon list
      out of the superblock, so it can get the s_anon list itself, and can pass
      the superblock down to prune_dcache.
      
      If prune_dcache finds a dentry that it cannot free, it leaves it where it
      is (at the tail of the list) and exits, on the assumption that some other
      thread will be removing that dentry soon.  To try to make sure that some
      work gets done, a limited number of dentries which are untouchable are
      skipped over while choosing the dentry to work on.
      
      I believe this race was first found by Kirill Korotaev.
      
      Cc: Jan Blunck <jblunck@suse.de>
      Acked-by: Kirill Korotaev <dev@openvz.org>
      Cc: Olaf Hering <olh@suse.de>
      Acked-by: Balbir Singh <balbir@in.ibm.com>
      Signed-off-by: Neil Brown <neilb@suse.de>
      Signed-off-by: Balbir Singh <balbir@in.ibm.com>
      Acked-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
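
The trylock rule in this commit (take s_umount for reading with a trylock before pruning someone else's dentry, and skip the dentry if that fails) can be modelled in a few lines. This is a single-threaded user-space sketch: `struct toy_rwsem` is a hypothetical counter standing in for the kernel's rw_semaphore, and `toy_begin_umount()` and `toy_prune_one()` are illustrative names only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical rwsem model: count > 0 means readers, -1 means a writer. */
struct toy_rwsem { int count; };

struct toy_super_block {
    struct toy_rwsem s_umount;
    int nr_pruned;              /* how many dentries were pruned from this sb */
};

static bool toy_down_read_trylock(struct toy_rwsem *sem)
{
    if (sem->count < 0)
        return false;           /* a writer (the unmounter) holds it */
    sem->count++;
    return true;
}

static void toy_up_read(struct toy_rwsem *sem)
{
    sem->count--;
}

/* Unmount takes s_umount for writing; it cannot start while readers exist. */
static bool toy_begin_umount(struct toy_super_block *sb)
{
    if (sb->s_umount.count != 0)
        return false;
    sb->s_umount.count = -1;
    return true;
}

/*
 * Prune one dentry belonging to dentry_sb.  own_sb is the superblock the
 * caller is working on behalf of, or NULL for the memory shrinker, which
 * therefore has to be careful with every filesystem.  Returns false if
 * the dentry was skipped because its filesystem is being unmounted.
 */
static bool toy_prune_one(struct toy_super_block *dentry_sb,
                          struct toy_super_block *own_sb)
{
    bool locked = false;

    if (dentry_sb != own_sb) {
        if (!toy_down_read_trylock(&dentry_sb->s_umount))
            return false;
        locked = true;
    }
    dentry_sb->nr_pruned++;     /* stand-in for prune_one_dentry() */
    if (locked)
        toy_up_read(&dentry_sb->s_umount);
    return true;
}
```

The two directions of exclusion in the message both show up: the shrinker backs off once unmount holds the semaphore, and unmount cannot begin while a prune still holds it for reading.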
  12. 01 Apr, 2006 2 commits
  13. 27 Mar, 2006 3 commits
  14. 26 Mar, 2006 3 commits
  15. 24 Mar, 2006 1 commit
    • [PATCH] cpuset memory spread slab cache hooks · b0196009
      Paul Jackson committed
      Change the kmem_cache_create calls for certain slab caches to support cpuset
      memory spreading.
      
      See the previous patches, cpuset_mem_spread, for an explanation of cpuset
      memory spreading, and cpuset_mem_spread_slab_cache for the slab cache support
      for memory spreading.
      
      The slab caches marked for now are: dentry_cache, inode_cache, some xfs slab
      caches, and buffer_head.  This list may change over time.  In particular,
      other file system types that are used extensively on large NUMA systems may
      want to allow for spreading their directory and inode slab cache entries.
      Signed-off-by: Paul Jackson <pj@sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  16. 09 Mar, 2006 1 commit
    • [PATCH] fix file counting · 529bf6be
      Dipankar Sarma committed
      I have benchmarked this on an x86_64 NUMA system and see no significant
      performance difference on kernbench.  Tested on both x86_64 and powerpc.
      
      The way we do file struct accounting is not very suitable for batched
      freeing.  For scalability reasons, file accounting was
      constructor/destructor based.  This meant that nr_files was decremented
      only when the object was removed from the slab cache.  This is susceptible
      to slab fragmentation.  With RCU based file structure, consequent batched
      freeing and a test program like Serge's, we just speed this up and end up
      with a very fragmented slab -
      
      llm22:~ # cat /proc/sys/fs/file-nr
      587730  0       758844
      
      At the same time, I see only 2000+ objects in the filp cache.  The following
      patch fixes this problem.
      
      This patch changes the file counting by removing the filp_count_lock.
      Instead we use a separate percpu counter, nr_files, for now and all
      accesses to it are through the get_nr_files() API.  In the sysctl handler for
      nr_files, we populate files_stat.nr_files before returning to user.
      
      Counting files as and when they are created and destroyed (as opposed to
      inside slab) allows us to correctly count open files with RCU.
      Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com>
      Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
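
The counting scheme in this commit (a per-CPU counter updated at create and destroy time, read through a single accessor) can be sketched in user space as follows. The array, `TOY_NR_CPUS` and the `toy_` functions are hypothetical; the kernel's percpu counter implementation differs in detail, and real per-CPU access and concurrency are not modelled.

```c
#include <assert.h>

#define TOY_NR_CPUS 4           /* assumed CPU count for the sketch */

/* One slot per CPU; each CPU only ever updates its own slot. */
static long toy_nr_files[TOY_NR_CPUS];

/* Called when a file struct is created on the given CPU. */
static void toy_file_created(int cpu)
{
    toy_nr_files[cpu]++;
}

/* Called when a file struct is destroyed, possibly on a different CPU. */
static void toy_file_destroyed(int cpu)
{
    toy_nr_files[cpu]--;
}

/*
 * Accessor in the spirit of get_nr_files(): individual slots may be
 * negative, only the sum over all CPUs is meaningful.
 */
static long toy_get_nr_files(void)
{
    long sum = 0;
    int cpu;

    for (cpu = 0; cpu < TOY_NR_CPUS; cpu++)
        sum += toy_nr_files[cpu];
    return sum;
}
```

Since the count changes at creation and destruction rather than when the slab releases the object, batched RCU freeing no longer inflates the reported number of open files.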
  17. 04 Feb, 2006 1 commit
  18. 15 Jan, 2006 1 commit
  19. 11 Jan, 2006 1 commit
  20. 09 Jan, 2006 1 commit
    • [PATCH] shrink dentry struct · 5160ee6f
      Eric Dumazet committed
      Some time ago, the dentry struct was carefully tuned so that on 32-bit
      UP, sizeof(struct dentry) was exactly 128, i.e. a power of 2 and a multiple
      of the memory cache line size.
      
      Then RCU was added and the dentry struct was enlarged by two pointers, with
      nice results for SMP, but not so good on UP, because it broke the above
      tuning (128 + 8 = 136 bytes).
      
      This patch reverts this unwanted side effect by using a union (d_u),
      where d_rcu and d_child are placed so that these two fields can share their
      memory needs.
      
      At the time d_free() is called (and d_rcu is really used), d_child is known
      to be empty and not touched by the dentry freeing.
      
      Lockless lookups only access d_name, d_parent, d_lock, d_op, d_flags (so
      the previous content of d_child is not needed if said dentry was unhashed
      but still accessed by a CPU because of RCU constraints)
      
      As dentry cache easily contains millions of entries, a size reduction is
      worth the extra complexity of the ugly C union.
      Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
      Cc: Dipankar Sarma <dipankar@in.ibm.com>
      Cc: Maneesh Soni <maneesh@in.ibm.com>
      Cc: Miklos Szeredi <miklos@szeredi.hu>
      Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
      Cc: Ian Kent <raven@themaw.net>
      Cc: Paul Jackson <pj@sgi.com>
      Cc: Al Viro <viro@ftp.linux.org.uk>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Cc: Neil Brown <neilb@cse.unsw.edu.au>
      Cc: James Morris <jmorris@namei.org>
      Cc: Stephen Smalley <sds@epoch.ncsc.mil>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
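
The saving from the d_u union can be checked with sizeof on a reduced model. The structs below are hypothetical stand-ins that keep only one flags word plus the two fields in question; the real struct dentry has many more members, so only the difference between the two layouts is meaningful here.

```c
#include <assert.h>
#include <stddef.h>

/* Two-pointer list node, shaped like the kernel's list_head. */
struct toy_list_head { struct toy_list_head *next, *prev; };

/* Two-pointer RCU callback head, shaped like the kernel's rcu_head. */
struct toy_rcu_head {
    struct toy_rcu_head *next;
    void (*func)(struct toy_rcu_head *head);
};

/* Layout before the patch: both fields occupy their own space. */
struct toy_dentry_separate {
    unsigned int d_flags;
    struct toy_list_head d_child;
    struct toy_rcu_head d_rcu;
};

/* Layout after the patch: d_child and d_rcu overlap, since d_rcu is only
 * used once the dentry is being freed and d_child is no longer needed. */
struct toy_dentry_shared {
    unsigned int d_flags;
    union {
        struct toy_list_head d_child;
        struct toy_rcu_head d_rcu;
    } d_u;
};

/* Bytes saved per dentry by sharing the storage. */
static size_t toy_bytes_saved(void)
{
    return sizeof(struct toy_dentry_separate) - sizeof(struct toy_dentry_shared);
}
```

On common ABIs, where function and data pointers have the same size, the saving is two pointers, which matches the 8 bytes quoted for 32-bit in the message.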
  21. 07 Nov, 2005 1 commit
  22. 28 Oct, 2005 1 commit
    • [PATCH] gfp_t: fs/* · 27496a8c
      Al Viro committed
       - ->releasepage() annotated (s/int/gfp_t), instances updated
       - missing gfp_t in fs/* added
       - fixed misannotation from the original sweep caught by bitwise checks:
         XFS used __nocast both for gfp_t and for flags used by XFS allocator.
         The latter left with unsigned int __nocast; we might want to add a
         different type for those but for now let's leave them alone.  That,
         BTW, is a case when __nocast use had been actively confusing - it had
         been used in the same code for two different and similar types, with
         no way to catch misuses.  Switch of gfp_t to bitwise had caught that
         immediately...
      
      One tricky bit is left alone to be dealt with later - mapping->flags is
      a mix of gfp_t and error indications.  Left alone for now.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  23. 20 Sep, 2005 1 commit
  24. 11 Sep, 2005 1 commit
  25. 09 Aug, 2005 1 commit
    • [PATCH] fsnotify_name/inoderemove · 7a91bf7f
      John McCutchan committed
      The patch below unhooks fsnotify from vfs_unlink & vfs_rmdir.  It
      introduces two new fsnotify calls, that are hooked in at the dcache
      level.  This not only more closely matches how the VFS layer works, it
      also avoids the problem with locking and inode lifetimes.
      
      The two functions are
      
       - fsnotify_nameremove -- called when a directory entry is going away.
         It notifies the PARENT of the deletion.  This is called from
         d_delete().
      
       - fsnotify_inoderemove -- called when the file's inode itself is going away.  It
         notifies the inode that is being deleted.  This is called from
         dentry_iput().
      Signed-off-by: John McCutchan <ttb@tentacle.dhs.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
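
The split between the two hooks described in this commit (one notification aimed at the parent when a name goes away, one aimed at the inode when the inode itself goes away) can be modelled with two counters. Everything below is a hypothetical user-space sketch: the `toy_` types, the counter fields and the function bodies are illustrative and do not reflect how the kernel delivers events.

```c
#include <assert.h>

/* Hypothetical inode that just counts the notifications it receives. */
struct toy_inode {
    int child_removed;   /* "an entry in this directory went away" */
    int self_removed;    /* "this inode itself is going away" */
};

/* Hypothetical dentry: a name in a parent directory, bound to an inode. */
struct toy_dentry {
    struct toy_dentry *d_parent;
    struct toy_inode *d_inode;
};

/* Name removal: the PARENT directory's inode is told about the deletion. */
static void toy_nameremove(struct toy_dentry *dentry)
{
    dentry->d_parent->d_inode->child_removed++;
}

/* Inode removal: the inode being deleted is the one notified. */
static void toy_inoderemove(struct toy_inode *inode)
{
    inode->self_removed++;
}
```

With hard links, the first hook can fire once per removed name while the second fires only when the last reference to the inode is dropped, which is why the two events are raised from different places.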
  26. 06 May, 2005 1 commit
  27. 17 Apr, 2005 1 commit
    • Linux-2.6.12-rc2 · 1da177e4
      Linus Torvalds committed
      Initial git repository build. I'm not bothering with the full history,
      even though we have it. We can create a separate "historical" git
      archive of that later if we want to, and in the meantime it's about
      3.2GB when imported into git - space that would just make the early
      git days unnecessarily complicated, when we don't have a lot of good
      infrastructure for it.
      
      Let it rip!