1. 25 Jul, 2008 1 commit
    • K
      fix soft lock up at NFS mount via per-SB LRU-list of unused dentries · da3bbdd4
      Kentaro Makita committed on
      [Summary]
      
       Split the LRU list of unused dentries into one list per superblock to
       avoid a soft lockup during NFS mounts and remounts of any filesystem.
      
       Previously I posted here:
       http://lkml.org/lkml/2008/3/5/590
      
      [Descriptions]
      
      - background
      
        dentry_unused is a list of dentries which are not referenced.
        dentry_unused grows when references on directories or files are
        released.  This list can be very long if there is a lot of free memory.
      
      - the problem
      
        When shrink_dcache_sb() is called, it scans all of dentry_unused
        linearly under spin_lock(), and if dentry->d_sb is different from the
        given superblock, it moves on to the next dentry.  This scan is very
        costly if there are many entries, and very inefficient if there are
        many superblocks.
      
        IOW, we need to shrink unused dentries on one superblock, but we scan
        unused dentries on all superblocks in the system.  For example, we scan
        500 dentries to unmount a filesystem, but also scan 1,000,000 or more
        unused dentries on other superblocks.
      
        In our case, when mounting NFS*, shrink_dcache_sb() is called to shrink
        unused dentries on NFS, but it scans 100,000,000 unused dentries on
        superblocks in the system such as local ext3 filesystems.  I hear NFS
        mounting took 1 min on some systems in use.
      
      * : NFS uses a virtual filesystem in the rpc layer, so NFS is affected by
        this problem.
      
        100,000,000 is a possible number on large systems.
      
        A per-superblock LRU of unused dentries can reduce the cost in a
        reasonable manner.
      
      - How to fix
      
        I found this problem is solved by David Chinner's "Per-superblock
        unused dentry LRU lists V3"(1), so I rebased it and added some fixes to
        reclaim with fairness, as suggested in Andrew Morton's comments(2).
      
        1) http://lkml.org/lkml/2006/5/25/318
        2) http://lkml.org/lkml/2006/5/25/320
      
        Split the LRU list of unused dentries into one per superblock.  Then,
        NFS mounting will check dentries under one superblock instead of all.
        But this splitting will break the LRU of dentry_unused.  So, I've
        attempted to reclaim unused dentries with fairness by calculating the
        number of dentries to scan on this sb in the following way:
      
        number of dentries to scan on this sb =
        count * (number of dentries on this sb / number of dentries in the machine)
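
The proportional formula above can be illustrated with a small user-space sketch in C. This is not the code from the patch: the function name `sb_scan_share` and its parameters are hypothetical, and the rounding behaviour (plain integer division, rounding down) is an assumption made for the example.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not taken from the kernel patch): compute how many
 * unused dentries to scan on one superblock so that a reclaim request of
 * `count` dentries is spread in proportion to each superblock's share:
 *
 *   count * (sb_unused / total_unused)
 *
 * The multiplication is done first, in a 64-bit intermediate, so that the
 * integer division does not truncate the ratio to zero.
 */
unsigned long sb_scan_share(unsigned long count,
                            unsigned long sb_unused,
                            unsigned long total_unused)
{
    if (total_unused == 0)
        return 0; /* nothing to reclaim anywhere */

    return (unsigned long)(((unsigned long long)count * sb_unused)
                           / total_unused);
}
```

With plain integer division, a superblock holding a very small fraction of the unused dentries can receive a share of zero, so a real implementation would need some policy for small superblocks; the sketch leaves that open.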
      
      - ToDo
       - I have to measure performance numbers and do stress tests.
      
       - When an unmount occurs during prune_dcache() while it is scanning
        the same superblock, it is unable to reach the next superblock because
        that one has gone away.  We restart scanning superblocks from the first
        one, which causes unfairness in reclaiming unused dentries on the first
        superblock.  But I think this happens very rarely.
      
      - Test Results
      
        Result on 6GB boxes with excessive unused dentries.
      
      Without patch:
      
      $ cat /proc/sys/fs/dentry-state
      10181835        10180203        45      0       0       0
      # mount -t nfs 10.124.60.70:/work/kernel-src nfs
      real    0m1.830s
      user    0m0.001s
      sys     0m1.653s
      
      With this patch:
      
      $ cat /proc/sys/fs/dentry-state
      10236610        10234751        45      0       0       0
      # mount -t nfs 10.124.60.70:/work/kernel-src nfs
      real    0m0.106s
      user    0m0.002s
      sys     0m0.032s
      
      [akpm@linux-foundation.org: fix comments]
      Signed-off-by: Kentaro Makita <k-makita@np.css.fujitsu.com>
      Cc: Neil Brown <neilb@suse.de>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Cc: David Chinner <dgc@sgi.com>
      Cc: "J. Bruce Fields" <bfields@fieldses.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 24 Jun, 2008 3 commits
  3. 23 Jun, 2008 1 commit
  4. 23 Apr, 2008 2 commits
  5. 15 Feb, 2008 5 commits
  6. 07 Feb, 2008 2 commits
  7. 22 Oct, 2007 1 commit
  8. 21 Oct, 2007 1 commit
    • A
      [PATCH] audit: watching subtrees · 74c3cbe3
      Al Viro committed on
      New kind of audit rule predicates: "object is visible in given subtree".
      The part that can be sanely implemented, that is.  Limitations:
      	* if you have hardlink from outside of tree, you'd better watch
      it too (or just watch the object itself, obviously)
      	* if you mount something under a watched tree, tell audit
      that new chunk should be added to watched subtrees
      	* if you umount something in a watched tree and it's still mounted
      elsewhere, you will get matches on events happening there.  New command
      tells audit to recalculate the trees, trimming such sources of false
      positives.
      
      Note that it's _not_ about path - if something mounted in several places
      (multiple mount, bindings, different namespaces, etc.), the match does
      _not_ depend on which one we are using for access.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  9. 17 Oct, 2007 6 commits
  10. 20 Jul, 2007 1 commit
    • P
      mm: Remove slab destructors from kmem_cache_create(). · 20c2df83
      Paul Mundt committed on
      Slab destructors were no longer supported after Christoph's c59def9f
      change. They've been BUGs for both slab and slub, and slob never
      supported them either.
      
      This rips out support for the dtor pointer from kmem_cache_create()
      completely and fixes up every single callsite in the kernel (there were
      about 224, not including the slab allocator definitions themselves,
      or the documentation references).
      Signed-off-by: Paul Mundt <lethal@linux-sh.org>
  11. 18 Jul, 2007 1 commit
  12. 09 May, 2007 4 commits
  13. 08 May, 2007 1 commit
  14. 14 Feb, 2007 1 commit
  15. 13 Feb, 2007 1 commit
    • A
      [PATCH] Fix d_path for lazy unmounts · eb3dfb0c
      Andreas Gruenbacher committed on
      Here is a bugfix to d_path.
      
      First, when d_path() hits a lazily unmounted mount point, it tries to
      prepend the name of the lazily unmounted dentry to the path name.  It gets
      this wrong, and also overwrites the slash that separates the name from the
      following pathname component.  This is demonstrated by the attached test
      case, which prints "getcwd returned d_path-bugsubdir" with the bug.  The
      correct result would be "getcwd returned d_path-bug/subdir".
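
As an illustration of the expected result, here is a minimal user-space sketch in C of building a path right to left in a buffer so that the separating slash is kept intact. It is not the kernel's __d_path(): the helpers `sk_prepend` and `sk_build_path` are hypothetical names invented for this example, and a two-component path is assumed.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy `len` bytes of `s` immediately before *end and
 * move *end back by `len`. Returns 0 on success, -1 if there is not enough
 * room left in the buffer.
 */
static int sk_prepend(char **end, size_t *room, const char *s, size_t len)
{
    if (len > *room)
        return -1;
    *room -= len;
    *end -= len;
    memcpy(*end, s, len);
    return 0;
}

/*
 * Build "<root_name>/<leaf>" from the end of `buf` towards its start.
 * Each component gets its own space, so the slash between the two names is
 * never overwritten. Returns a pointer to the start of the string inside
 * `buf`, or NULL if the buffer is too small.
 */
const char *sk_build_path(char *buf, size_t buflen,
                          const char *root_name, const char *leaf)
{
    char *end = buf + buflen;
    size_t room = buflen;

    if (sk_prepend(&end, &room, "", 1) ||            /* terminating NUL */
        sk_prepend(&end, &room, leaf, strlen(leaf)) ||
        sk_prepend(&end, &room, "/", 1) ||           /* separator */
        sk_prepend(&end, &room, root_name, strlen(root_name)))
        return NULL;

    return end;
}
```

Called with the names from the test case, `sk_build_path(buf, sizeof(buf), "d_path-bug", "subdir")` yields the string "d_path-bug/subdir".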
      
      It could be argued that the name of the root dentry should not be part of
      the result of d_path in the first place.  On the other hand, what the
      unconnected namespace was once reachable as may provide some useful hints
      to users, and so that seems okay.
      
      Second, it isn't always possible to tell from the __d_path result whether
      the specified root and rootmnt (i.e., the chroot) was reached: lazy
      unmounts of bind mounts will produce a path that does start with a
      non-slash so we can tell from that, but other lazy unmounts will produce a
      path that starts with a slash, just like "ordinary" paths.
      
      The attached patch cleans up __d_path() to fix the bug with overlapping
      pathname components.  It also adds a @fail_deleted argument, which allows
      us to get rid of some of the mess in sys_getcwd().  Grabbing the dcache_lock
      can then also be moved into __d_path().  The patch also makes sure that
      paths will only start with a slash for paths which are connected to the
      root and rootmnt.
      
      The @fail_deleted argument could be added to d_path() as well: this would
      allow callers to recognize deleted files, without having to resort to the
      ambiguous check for the " (deleted)" string at the end of the pathnames.
      This is not currently done, but it might be worthwhile.
      Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
      Cc: Neil Brown <neilb@suse.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  16. 08 Dec, 2006 2 commits
  17. 29 Oct, 2006 2 commits
  18. 22 Oct, 2006 1 commit
  19. 12 Oct, 2006 1 commit
    • D
      [PATCH] VFS: Destroy the dentries contributed by a superblock on unmounting · c636ebdb
      David Howells committed on
      The attached patch destroys all the dentries attached to a superblock in one go
      by:
      
       (1) Destroying the tree rooted at s_root.
      
       (2) Destroying every entry in the anon list, one at a time.
      
       (3) Each entry in the anon list has its subtree consumed from the leaves
           inwards.
      
      This reduces the amount of work generic_shutdown_super() does, and avoids
      iterating through the dentry_unused list.
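
The "leaves inwards" order described above can be sketched in user-space C as an iterative post-order teardown of a generic tree. This is only an illustration under assumptions: `struct node`, `node_add` and `destroy_subtree` are hypothetical names, and none of the dcache details (hashing, inode release, the unused list) are modelled.

```c
#include <stdlib.h>

/* Hypothetical, simplified tree node standing in for a dentry. */
struct node {
    struct node *parent;
    struct node *first_child;
    struct node *next_sibling;
};

/* Allocate a node and link it as the first child of `parent` (may be NULL). */
struct node *node_add(struct node *parent)
{
    struct node *n = calloc(1, sizeof(*n));

    if (!n)
        return NULL;
    n->parent = parent;
    if (parent) {
        n->next_sibling = parent->first_child;
        parent->first_child = n;
    }
    return n;
}

/*
 * Free the subtree rooted at `root`, consuming it from the leaves inwards
 * without recursion. Returns the number of nodes freed. The caller is
 * assumed to have exclusive access to the tree, and `root` itself is not
 * unlinked from any parent it may have.
 */
unsigned long destroy_subtree(struct node *root)
{
    struct node *n = root;
    unsigned long freed = 0;

    for (;;) {
        /* Descend to the leftmost leaf below n. */
        while (n->first_child)
            n = n->first_child;

        struct node *parent = n->parent;
        struct node *next = n->next_sibling;
        int was_root = (n == root);

        /* n is always its parent's first child here; unlink it. */
        if (!was_root)
            parent->first_child = next;
        free(n);
        freed++;

        if (was_root)
            return freed;

        /* Move to the next sibling, or back up to the now childless parent. */
        n = next ? next : parent;
    }
}
```

Because every node is freed only after all of its children are gone, no node is ever reached through a pointer to freed memory, which is the property the leaves-inwards order provides.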
      
      Note that locking is almost entirely absent in the shrink_dcache_for_umount*()
      functions added by this patch.  This is because:
      
       (1) at the point the filesystem calls generic_shutdown_super(), it is not
           permitted to further touch the superblock's set of dentries, and nor may
           it remove aliases from inodes;
      
       (2) the dcache memory shrinker now skips dentries that are being unmounted;
           and
      
       (3) the superblock no longer has any external references through which the VFS
           can reach it.
      
      Given these points, the only locking we need to do is when we remove dentries
      from the unused list and the name hashes, which we do a directory's worth at a
      time.
      
      We also don't need to guard against reference counts going to zero unexpectedly
      and removing bits of the tree we're working on as nothing else can call dput().
      
      A cut down version of dentry_iput() has been folded into
      shrink_dcache_for_umount_subtree() function.  Apart from not needing to unlock
      things, it also doesn't need to check for inotify watches.
      
      In this version of the patch, the complaint about a dentry still being in use
      has been expanded from a single BUG_ON() and now gives much more information.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: NeilBrown <neilb@suse.de>
      Acked-by: Ian Kent <raven@themaw.net>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  20. 04 Oct, 2006 1 commit
    • N
      [PATCH] knfsd: close a race-opportunity in d_splice_alias · 21c0d8fd
      NeilBrown committed on
      There is a possible race in d_splice_alias.  Though __d_find_alias(inode, 1)
      will only return a dentry with DCACHE_DISCONNECTED set, it is possible for it
      to get cleared before the BUG_ON, and it is not possible to lock against
      that.
      
      There are a couple of problems here.  Firstly, the code doesn't match the
      comment.  The comment describes a 'disconnected' dentry as being IS_ROOT as
      well as DCACHE_DISCONNECTED, however there is no testing of IS_ROOT anywhere.
      
      A dentry is marked DCACHE_DISCONNECTED when allocated with d_alloc_anon, and
      remains DCACHE_DISCONNECTED while a path is built up towards the root.  So a
      dentry can have a valid name and a valid parent and even grandparent, but will
      still be DCACHE_DISCONNECTED until a path to the root is created.  Once the
      path to the root is complete, everything in the path gets DCACHE_DISCONNECTED
      cleared.  So the fact that DCACHE_DISCONNECTED is set isn't enough to say that a
      dentry is free to be spliced in with a given name.  This can only be allowed
      if the dentry does not yet have a name, so the IS_ROOT test is needed too.
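
The combined condition argued for above can be written as a small user-space sketch in C. The types and names here (`struct sk_dentry`, `SK_DISCONNECTED`, `sk_can_splice`) are hypothetical stand-ins, not kernel definitions; the only kernel fact relied on is that IS_ROOT means a dentry is its own parent.

```c
#include <stdbool.h>

/* Hypothetical flag standing in for DCACHE_DISCONNECTED. */
#define SK_DISCONNECTED 0x1u

/* Hypothetical, simplified dentry: only the fields the test needs. */
struct sk_dentry {
    struct sk_dentry *parent; /* points to itself for a root */
    unsigned int flags;
};

/* Mirrors the IS_ROOT idea: a dentry with no name is its own parent. */
static bool sk_is_root(const struct sk_dentry *d)
{
    return d->parent == d;
}

/*
 * A dentry is free to be spliced in under a given name only if it is still
 * marked disconnected AND has not yet been given a name and parent.
 * Checking the flag alone is not enough, because a partially connected
 * dentry keeps the flag until the path to the root is complete.
 */
bool sk_can_splice(const struct sk_dentry *d)
{
    return (d->flags & SK_DISCONNECTED) && sk_is_root(d);
}
```

A dentry that already has a parent but still carries the flag is rejected by this test, which is the case that checking the flag alone would let through.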
      
      However even adding that test to __d_find_alias isn't enough.  As
      d_splice_alias drops dcache_lock before calling d_move to perform the splice,
      it could race with another thread calling d_splice_alias to splice the inode
      in with a different name in a different part of the tree (in the case where a
      file has hard links).  So that splicing code is only really safe for
      directories (as we know that directories only have one link).  For
      directories, the caller of d_splice_alias will be holding i_mutex on the
      (unique) parent so there is no room for a race.
      
      A consequence of this is that a non-directory will never benefit from being
      spliced into a pre-existing dentry, but that isn't a problem.  It is
      perfectly OK for a non-directory to have multiple dentries, some anonymous,
      some not.  And the comment for d_splice_alias says that it only happens for
      directories anyway.
      Signed-off-by: Neil Brown <neilb@suse.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Dipankar Sarma <dipankar@in.ibm.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  21. 01 Oct, 2006 1 commit
  22. 23 Sep, 2006 1 commit
    • D
      NFS: Add dentry materialisation op · 770bfad8
      David Howells committed on
      The attached patch adds a new directory cache management function that prepares
      a disconnected anonymous dentry to be connected into the dentry tree. The
      anonymous dentry is given the name and parentage of another dentry.
      
      The following changes were made in [try #2]:
      
       (*) d_materialise_dentry() now switches the parentage of the two nodes around
           correctly when one or other of them is self-referential.
      
      The following changes were made in [try #7]:
      
       (*) d_instantiate_unique() has had the interior part split out as function
           __d_instantiate_unique(). Callers of this latter function must be holding
           the appropriate locks.
      
       (*) _d_rehash() has been added as a wrapper around __d_rehash() to call it
           with the most obvious hash list (the one from the name). d_rehash() now
           calls _d_rehash().
      
       (*) d_materialise_dentry() is now __d_materialise_dentry() and is static.
      
       (*) d_materialise_unique() added to perform the combination of d_find_alias(),
           d_materialise_dentry() and d_add_unique() that the NFS client was doing
           twice, all within a single dcache_lock critical section. This reduces the
           number of times two different spinlocks were being accessed.
      
      The following further changes were made:
      
       (*) Add the dentries onto their parents' d_subdirs lists.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>