1. 01 1月, 2009 2 次提交
    • A
      fix switch_names() breakage in short-to-short case · dc711ca3
      Al Viro 提交于
      We want ->name.len to match the resulting name on *both*
      source and target
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      dc711ca3
    • N
      shrink struct dentry · c2452f32
      Nick Piggin 提交于
      struct dentry is one of the most critical structures in the kernel. So it's
      sad to see it going neglected.
      
      With CONFIG_PROFILING turned on (which is probably the common case at least
      for distros and kernel developers), sizeof(struct dcache) == 208 here
      (64-bit). This gives 19 objects per slab.
      
      I packed d_mounted into a hole, and took another 4 bytes off the inline
      name length to take the padding out from the end of the structure. This
      shinks it to 200 bytes. I could have gone the other way and increased the
      length to 40, but I'm aiming for a magic number, read on...
      
      I then got rid of the d_cookie pointer. This shrinks it to 192 bytes. Rant:
      why was this ever a good idea? The cookie system should increase its hash
      size or use a tree or something if lookups are a problem. Also the "fast
      dcookie lookups" in oprofile should be moved into the dcookie code -- how
      can oprofile possibly care about the dcookie_mutex? It gets dropped after
      get_dcookie() returns so it can't be providing any sort of protection.
      
      At 192 bytes, 21 objects fit into a 4K page, saving about 3MB on my system
      with ~140 000 entries allocated. 192 is also a multiple of 64, so we get
      nice cacheline alignment on 64 and 32 byte line systems -- any given dentry
      will now require 3 cachelines to touch all fields wheras previously it
      would require 4.
      
      I know the inline name size was chosen quite carefully, however with the
      reduction in cacheline footprint, it should actually be just about as fast
      to do a name lookup for a 36 character name as it was before the patch (and
      faster for other sizes). The memory footprint savings for names which are
      <= 32 or > 36 bytes long should more than make up for the memory cost for
      33-36 byte names.
      
      Performance is a feature...
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      c2452f32
  2. 23 10月, 2008 9 次提交
  3. 29 9月, 2008 1 次提交
    • L
      Fix NULL pointer dereference in proc_sys_compare · d0185c08
      Linus Torvalds 提交于
      The VFS interface for the 'd_compare()' is a bit special (read: 'odd'),
      because it really just essentially replaces a memcmp().  The filesystem
      is supposed to just compare the two names with whatever case-independent
      or other function.
      
      And when I say 'is supposed to', I obviously mean that 'procfs does odd
      things, and actually looks at the dentry that we don't even pass down,
      rather than just the name'.  Which results in problems, because we
      actually call d_compare before we have even verified that the dentry is
      still hashed at all.
      
      And that causes a problm since the inode that procfs looks at may have
      been free'd and the d_inode pointer is NULL.  procfs just assumes that
      all dentries are positive, since procfs itself never generates a
      negative one.  But memory pressure will still result in the dentry
      getting torn down, and as it is removed by RCU, it still remains visible
      on some lists - and to d_compare.
      
      If the filesystem just did a name comparison, we wouldn't care.  And we
      could just fix procfs to know about negative dentries too.  But rather
      than have the low-level filesystems know about internal VFS details,
      just move the check for a unhashed dentry up a bit, so that we will only
      call d_compare on dentries that are still active.
      
      The actual oops this caused didn't look like a NULL pointer dereference
      because procfs did a 'container_of(inode, struct proc_inode, vfs_inode)'
      to get at its internal proc_inode information from the inode pointer,
      and accessed a field below the inode. So the oops would look something
      like
      
      	BUG: unable to handle kernel paging request at fffffffffffffff0
      	IP: [<ffffffff802bc6c6>] proc_sys_compare+0x36/0x50
      
      and was seen on both x86-64 (Alexey Dobriyan and Hugh Dickins) and
      ppc64 (Hugh Dickins).
      Reported-by: NAlexey Dobriyan <adobriyan@gmail.com>
      Acked-by: NHugh Dickins <hugh@veritas.com>
      Cc: Al Viro <viro@ZenIV.linux.org.uk>
      Reviewed-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-of-by: NLinus Torvalds <torvalds@linux-foundation.org>
      d0185c08
  4. 25 8月, 2008 1 次提交
  5. 28 7月, 2008 1 次提交
  6. 27 7月, 2008 1 次提交
  7. 25 7月, 2008 1 次提交
    • K
      fix soft lock up at NFS mount via per-SB LRU-list of unused dentries · da3bbdd4
      Kentaro Makita 提交于
      [Summary]
      
       Split LRU-list of unused dentries to one per superblock to avoid soft
       lock up during NFS mounts and remounting of any filesystem.
      
       Previously I posted here:
       http://lkml.org/lkml/2008/3/5/590
      
      [Descriptions]
      
      - background
      
        dentry_unused is a list of dentries which are not referenced.
        dentry_unused grows up when references on directories or files are
        released.  This list can be very long if there is huge free memory.
      
      - the problem
      
        When shrink_dcache_sb() is called, it scans all dentry_unused linearly
        under spin_lock(), and if dentry->d_sb is differnt from given
        superblock, scan next dentry.  This scan costs very much if there are
        many entries, and very ineffective if there are many superblocks.
      
        IOW, When we need to shrink unused dentries on one dentry, but scans
        unused dentries on all superblocks in the system.  For example, we scan
        500 dentries to unmount a filesystem, but scans 1,000,000 or more unused
        dentries on other superblocks.
      
        In our case , At mounting NFS*, shrink_dcache_sb() is called to shrink
        unused dentries on NFS, but scans 100,000,000 unused dentries on
        superblocks in the system such as local ext3 filesystems.  I hear NFS
        mounting took 1 min on some system in use.
      
      * : NFS uses virtual filesystem in rpc layer, so NFS is affected by
        this problem.
      
        100,000,000 is possible number on large systems.
      
        Per-superblock LRU of unused dentried can reduce the cost in
        reasonable manner.
      
      - How to fix
      
        I found this problem is solved by David Chinner's "Per-superblock
        unused dentry LRU lists V3"(1), so I rebase it and add some fix to
        reclaim with fairness, which is in Andrew Morton's comments(2).
      
        1) http://lkml.org/lkml/2006/5/25/318
        2) http://lkml.org/lkml/2006/5/25/320
      
        Split LRU-list of unused dentries to each superblocks.  Then, NFS
        mounting will check dentries under a superblock instead of all.  But
        this spliting will break LRU of dentry-unused.  So, I've attempted to
        make reclaim unused dentrins with fairness by calculate number of
        dentries to scan on this sb based on following way
      
        number of dentries to scan on this sb =
        count * (number of dentries on this sb / number of dentries in the machine)
      
      - ToDo
       - I have to measuring performance number and do stress tests.
      
       - When unmount occurs during prune_dcache(), scanning on same
        superblock, It is unable to reach next superblock because it is gone
        away.  We restart scannig superblock from first one, it causes
        unfairness of reclaim unused dentries on first superblock.  But I think
        this happens very rarely.
      
      - Test Results
      
        Result on 6GB boxes with excessive unused dentries.
      
      Without patch:
      
      $ cat /proc/sys/fs/dentry-state
      10181835        10180203        45      0       0       0
      # mount -t nfs 10.124.60.70:/work/kernel-src nfs
      real    0m1.830s
      user    0m0.001s
      sys     0m1.653s
      
       With this patch:
      $ cat /proc/sys/fs/dentry-state
      10236610        10234751        45      0       0       0
      # mount -t nfs 10.124.60.70:/work/kernel-src nfs
      real    0m0.106s
      user    0m0.002s
      sys     0m0.032s
      
      [akpm@linux-foundation.org: fix comments]
      Signed-off-by: NKentaro Makita <k-makita@np.css.fujitsu.com>
      Cc: Neil Brown <neilb@suse.de>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Cc: David Chinner <dgc@sgi.com>
      Cc: "J. Bruce Fields" <bfields@fieldses.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      da3bbdd4
  8. 24 6月, 2008 3 次提交
  9. 23 6月, 2008 1 次提交
  10. 23 4月, 2008 2 次提交
  11. 15 2月, 2008 5 次提交
  12. 07 2月, 2008 2 次提交
  13. 22 10月, 2007 1 次提交
  14. 21 10月, 2007 1 次提交
    • A
      [PATCH] audit: watching subtrees · 74c3cbe3
      Al Viro 提交于
      New kind of audit rule predicates: "object is visible in given subtree".
      The part that can be sanely implemented, that is.  Limitations:
      	* if you have hardlink from outside of tree, you'd better watch
      it too (or just watch the object itself, obviously)
      	* if you mount something under a watched tree, tell audit
      that new chunk should be added to watched subtrees
      	* if you umount something in a watched tree and it's still mounted
      elsewhere, you will get matches on events happening there.  New command
      tells audit to recalculate the trees, trimming such sources of false
      positives.
      
      Note that it's _not_ about path - if something mounted in several places
      (multiple mount, bindings, different namespaces, etc.), the match does
      _not_ depend on which one we are using for access.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      74c3cbe3
  15. 17 10月, 2007 6 次提交
  16. 20 7月, 2007 1 次提交
    • P
      mm: Remove slab destructors from kmem_cache_create(). · 20c2df83
      Paul Mundt 提交于
      Slab destructors were no longer supported after Christoph's
      c59def9f change. They've been
      BUGs for both slab and slub, and slob never supported them
      either.
      
      This rips out support for the dtor pointer from kmem_cache_create()
      completely and fixes up every single callsite in the kernel (there were
      about 224, not including the slab allocator definitions themselves,
      or the documentation references).
      Signed-off-by: NPaul Mundt <lethal@linux-sh.org>
      20c2df83
  17. 18 7月, 2007 1 次提交
  18. 09 5月, 2007 1 次提交