1. 13 Jan, 2011 1 commit
  2. 07 Jan, 2011 1 commit
    • fs: icache RCU free inodes · fa0d7e3d
      Nick Piggin committed
      RCU free the struct inode. This will allow:
      
      - Subsequent store-free path walking patch. The inode must be consulted for
        permissions when walking, so an RCU inode reference is a must.
      - sb_inode_list_lock to be moved inside i_lock because sb list walkers who want
        to take i_lock no longer need to take sb_inode_list_lock to walk the list in
        the first place. This will simplify and optimize locking.
      - Could remove some nested trylock loops in dcache code
      - Could potentially simplify things a bit in VM land. Do not need to take the
        page lock to follow page->mapping.
      
      The downside of this is the performance cost of using RCU. In a simple
      creat/unlink microbenchmark, performance drops by about 10% due to inability to
      reuse cache-hot slab objects. As iterations increase and RCU freeing starts
      kicking over, this increases to about 20%.
      
      In cases where inode lifetimes are longer (i.e. many inodes may be allocated
      during the average life span of a single inode), a lot of this cache reuse is
      not applicable, so the regression caused by this patch is smaller.
      
      The cache-hot regression could largely be avoided by using SLAB_DESTROY_BY_RCU,
      however this adds some complexity to list walking and store-free path walking,
      so I prefer to implement this at a later date, if it is shown to be a win in
      real situations. I haven't found a regression in any non-micro benchmark so I
      doubt it will be a problem.
      Signed-off-by: Nick Piggin <npiggin@kernel.dk>
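The cache-hot reuse cost described in this commit message can be illustrated with a minimal userspace sketch. It is not kernel code: `struct pool`, `pool_free_now()`, `pool_free_deferred()` and `pool_grace_period_end()` are hypothetical names standing in for a slab cache and an RCU grace period. The point is only that an object whose free is deferred cannot be handed out again until the grace period ends, so the allocator has to reach for a different (colder) object in the meantime.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE 4

/* Hypothetical fixed-size object pool; stands in for a slab cache. */
struct obj {
    struct obj *next;
    int id;
};

struct pool {
    struct obj slots[POOL_SIZE];
    struct obj *free_list; /* LIFO: the most recently freed object is reused first */
    struct obj *deferred;  /* freed, but still waiting for the grace period to end */
};

void pool_init(struct pool *p)
{
    p->free_list = NULL;
    p->deferred = NULL;
    for (int i = POOL_SIZE - 1; i >= 0; i--) {
        p->slots[i].id = i;
        p->slots[i].next = p->free_list;
        p->free_list = &p->slots[i];
    }
}

struct obj *pool_alloc(struct pool *p)
{
    struct obj *o = p->free_list;
    if (o)
        p->free_list = o->next;
    return o;
}

/* Immediate free: the object is available to the very next allocation. */
void pool_free_now(struct pool *p, struct obj *o)
{
    assert(o != NULL);
    o->next = p->free_list;
    p->free_list = o;
}

/* Deferred free: the object is parked until the (mock) grace period ends. */
void pool_free_deferred(struct pool *p, struct obj *o)
{
    assert(o != NULL);
    o->next = p->deferred;
    p->deferred = o;
}

/* Mock end of a grace period: parked objects finally become reusable. */
void pool_grace_period_end(struct pool *p)
{
    while (p->deferred) {
        struct obj *o = p->deferred;
        p->deferred = o->next;
        pool_free_now(p, o);
    }
}
```

With an immediate free, an alloc/free/alloc sequence returns the same object twice; with a deferred free, the second allocation returns a different object until `pool_grace_period_end()` has run.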
  3. 29 Oct, 2010 1 commit
  4. 05 Oct, 2010 2 commits
    • BKL: Remove BKL from afs · 77f2fe03
      Arnd Bergmann committed
      The BKL is only used in put_super and fill_super, which are both protected
      by the superblock's s_umount rw_semaphore. Therefore it is safe to remove
      the BKL entirely.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: linux-afs@lists.infradead.org
      Cc: David Howells <dhowells@redhat.com>
    • BKL: Explicitly add BKL around get_sb/fill_super · db719222
      Jan Blunck committed
      This patch is a preparation necessary to remove the BKL from do_new_mount().
      It explicitly adds calls to lock_kernel()/unlock_kernel() around
      get_sb/fill_super operations for filesystems that still use the BKL.
      
      I've read through all the code formerly covered by the BKL inside
      do_kern_mount() and have satisfied myself that it doesn't need the BKL
      any more.
      
      do_kern_mount() is already called without the BKL when mounting the rootfs
      and in nfsctl. do_kern_mount() calls vfs_kern_mount(), which is called
      from various places without BKL: simple_pin_fs(), nfs_do_clone_mount()
      through nfs_follow_mountpoint(), afs_mntpt_do_automount() through
      afs_mntpt_follow_link(). Both of the latter functions are actually the
      filesystem's follow_link inode operation. vfs_kern_mount() calls the specified
      get_sb function and lets the filesystem do its job by calling the given
      fill_super function.
      
      Therefore I think it is safe to push down the BKL from the VFS to the
      low-level filesystems' get_sb/fill_super operations.
      
      [arnd: do not add the BKL to those file systems that already
             don't use it elsewhere]
      Signed-off-by: Jan Blunck <jblunck@infradead.org>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Matthew Wilcox <matthew@wil.cx>
      Cc: Christoph Hellwig <hch@infradead.org>
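The pushdown described in this commit can be sketched in userspace as follows. This is a mock, not kernel code: the BKL is modelled as a plain depth counter rather than a real lock, and `struct super_block`, `foo_fill_super()` and `mock_vfs_mount()` are simplified, hypothetical stand-ins. It shows only the shape of the change: the VFS-side caller no longer takes the lock, and a filesystem that still needs it takes it inside its own fill_super.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the BKL: a recursion depth counter, not a real lock. */
int bkl_depth;

void lock_kernel(void)
{
    bkl_depth++;
}

void unlock_kernel(void)
{
    assert(bkl_depth > 0);
    bkl_depth--;
}

/* Heavily simplified mock of a superblock. */
struct super_block {
    int s_magic;
    int filled_under_bkl; /* records whether the lock was held during setup */
};

/* Hypothetical filesystem that still relies on the BKL: after the pushdown
 * it takes the lock itself around the body of fill_super. */
int foo_fill_super(struct super_block *sb)
{
    lock_kernel();
    sb->s_magic = 0x464f4f; /* arbitrary illustrative value */
    sb->filled_under_bkl = (bkl_depth > 0);
    unlock_kernel();
    return 0;
}

/* Mock of the VFS side: after the pushdown it calls the filesystem
 * without holding the lock. */
int mock_vfs_mount(struct super_block *sb,
                   int (*fill_super)(struct super_block *))
{
    assert(sb != NULL && fill_super != NULL);
    return fill_super(sb);
}
```

A filesystem that does not need the BKL would simply omit the `lock_kernel()`/`unlock_kernel()` pair in its own callback, which is what the bracketed note from arnd describes.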
  5. 12 Aug, 2010 1 commit
    • AFS: Implement an autocell mount capability [ver #2] · bec5eb61
      wanglei committed
      Implement the ability for the root directory of a mounted AFS filesystem to
      accept lookups of arbitrary directory names, to interpret the names as the names
      of cells, to look the cell names up in the DNS for AFSDB records and to mount
      the root.cell volume of the nominated cell on the pseudo-directory created by
      lookup.
      
      This facility is requested by passing:
      
      	-o autocell
      
      to the mountpoint for which this is desired, usually the /afs mount.
      
      To use this facility, a DNS upcall program is required for AFSDB records.  This
      can be obtained from:
      
      	http://people.redhat.com/~dhowells/afs/dns.afsdb.c
      
      It should be compiled with -lresolv and -lkeyutils and installed as, say:
      
      	/usr/sbin/dns.afsdb
      
      Then the following line needs to be added to /sbin/request-key.conf:
      
      	create	dns_resolver afsdb:*	*	/usr/sbin/dns.afsdb %k
      
      This can be tested by mounting AFS, say:
      
      	insmod dns_resolver.ko
      	insmod af-rxrpc.ko
      	insmod kafs.ko rootcell=grand.central.org
      	mount -t afs "#grand.central.org:root.cell." /afs -o autocell
      
      and doing:
      
      	ls /afs/grand.central.org/
      
      which should show:
      
      	archive/  cvs/  doc/  local/  project/  service/  software/  user/  www/
      
      if it works.
      Signed-off-by: Wang Lei <wang840925@gmail.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Steve French <sfrench@us.ibm.com>
  6. 10 Aug, 2010 1 commit
  7. 22 Apr, 2010 1 commit
  8. 06 Mar, 2010 1 commit
    • make sure data is on disk before calling ->write_inode · 26821ed4
      Christoph Hellwig committed
      Similar to the fsync issue fixed a while ago in commit 2daea67e, we need
      to wait for data to actually hit the disk before writing out the metadata
      to guarantee data integrity for filesystems that modify the inode in the
      data I/O completion path.  Currently XFS and NFS handle this manually, and
      AFS has a write_inode method that does nothing but wait for data, while
      others are possibly missing out on this.
      
      Fortunately this change has a lot less impact than the fsync change
      as none of the write_inode methods starts data writeout of any form
      by itself.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
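The ordering requirement in this commit can be shown with a small userspace sketch. Everything here is a mock with hypothetical names (`struct mock_inode`, `mock_wait_on_data()`, `mock_write_inode()`, `mock_sync_inode()`); it assumes a filesystem whose data I/O completion updates the in-memory inode size, which is the kind of filesystem the commit message is concerned with.

```c
#include <assert.h>
#include <stddef.h>

/* Mock inode for a filesystem that updates the inode on data I/O completion. */
struct mock_inode {
    int data_in_flight; /* data I/O submitted but not yet completed */
    long i_size;        /* in-memory size, updated by I/O completion */
    long disk_size;     /* size recorded by the last metadata write */
};

/* Waiting for data lets the completion path run, which modifies the inode. */
void mock_wait_on_data(struct mock_inode *inode, long completed_size)
{
    if (inode->data_in_flight) {
        inode->i_size = completed_size;
        inode->data_in_flight = 0;
    }
}

/* Stand-in for ->write_inode: persists whatever the inode says right now. */
void mock_write_inode(struct mock_inode *inode)
{
    inode->disk_size = inode->i_size;
}

/* The ordering the commit asks for: data first, then metadata. */
void mock_sync_inode(struct mock_inode *inode, long completed_size)
{
    assert(inode != NULL);
    mock_wait_on_data(inode, completed_size);
    mock_write_inode(inode);
}
```

If `mock_write_inode()` ran before `mock_wait_on_data()`, the persisted `disk_size` would miss the update made by the completion path, which is the integrity problem the commit guards against.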
  9. 13 Jul, 2009 1 commit
  10. 12 Jun, 2009 1 commit
    • push BKL down into ->put_super · 6cfd0148
      Christoph Hellwig committed
      Move BKL into ->put_super from the only caller.  A couple of
      filesystems had trivial enough ->put_super (only kfree and NULLing of
      s_fs_info + stuff in there) to not get any locking: coda, cramfs, efs,
      hugetlbfs, omfs, qnx4, shmem, all others got the full treatment.  Most
      of them probably don't need it, but I'd rather sort that out individually.
      Preferably after all the other BKL pushdowns in that area.
      
      [AV: original used to move lock_super() down as well; these changes are
      removed since we don't do lock_super() at all in generic_shutdown_super()
      now]
      [AV: fuse, btrfs and xfs are known to need no damn BKL, exempt]
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  11. 09 May, 2009 2 commits
  12. 14 Oct, 2008 1 commit
  13. 27 Jul, 2008 1 commit
  14. 07 Jun, 2008 1 commit
  15. 28 Mar, 2008 1 commit
  16. 09 Feb, 2008 1 commit
  17. 17 Oct, 2007 1 commit
  18. 20 Jul, 2007 1 commit
    • mm: Remove slab destructors from kmem_cache_create(). · 20c2df83
      Paul Mundt committed
      Slab destructors were no longer supported after Christoph's c59def9f
      change. They've been BUGs for both slab and slub, and slob never
      supported them either.
      
      This rips out support for the dtor pointer from kmem_cache_create()
      completely and fixes up every single callsite in the kernel (there were
      about 224, not including the slab allocator definitions themselves,
      or the documentation references).
      Signed-off-by: Paul Mundt <lethal@linux-sh.org>
  19. 17 Jul, 2007 1 commit
  20. 22 May, 2007 1 commit
    • Detach sched.h from mm.h · e8edc6e0
      Alexey Dobriyan committed
      The first thing mm.h does is include sched.h, solely for the can_do_mlock()
      inline function, which has a "current" dereference inside. By dealing with
      can_do_mlock(), mm.h can be detached from sched.h, which is good. See below
      for why.
      
      This patch
      a) removes unconditional inclusion of sched.h from mm.h
      b) makes can_do_mlock() a normal function in mm/mlock.c
      c) exports can_do_mlock() to not break compilation
      d) adds sched.h inclusions back to files that were getting it indirectly.
      e) adds less bloated headers to some files (asm/signal.h, jiffies.h) that were
         getting them indirectly
      
      Net result is:
      a) mm.h users would get less code to open, read, preprocess, parse, ... if
         they don't need sched.h
      b) sched.h stops being a dependency for a significant number of files:
         on x86_64 allmodconfig touching sched.h results in recompile of 4083 files,
         after patch it's only 3744 (-8.3%).
      
      Cross-compile tested on
      
      	all arm defconfigs, all mips defconfigs, all powerpc defconfigs,
      	alpha alpha-up
      	arm
      	i386 i386-up i386-defconfig i386-allnoconfig
      	ia64 ia64-up
      	m68k
      	mips
      	parisc parisc-up
      	powerpc powerpc-up
      	s390 s390-up
      	sparc sparc-up
      	sparc64 sparc64-up
      	um-x86_64
      	x86_64 x86_64-up x86_64-defconfig x86_64-allnoconfig
      
      as well as my two usual configs.
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  21. 17 May, 2007 2 commits
  22. 11 May, 2007 2 commits
  23. 10 May, 2007 1 commit
    • AFS: implement basic file write support · 31143d5d
      David Howells committed
      Implement support for writing to regular AFS files, including:
      
       (1) write
      
       (2) truncate
      
       (3) fsync, fdatasync
      
       (4) chmod, chown, chgrp, utime.
      
      AFS writeback attempts to batch writes into chunks as large as it can
      manage, up to the point that it writes back 65535 pages in one chunk or it
      meets a locked page.
      
      Furthermore, if a page has been written to using a particular key, then should
      another write to that page use some other key, the first write will be flushed
      before the second is allowed to take place.  If the first write fails due to a
      security error, then the page will be scrapped and reread before the second
      write takes place.
      
      If a page is dirty and the callback on it is broken by the server, then the
      dirty data is not discarded (same behaviour as NFS).
      
      Shared-writable mappings are not supported by this patch.
      
      [akpm@linux-foundation.org: fix a bunch of warnings]
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
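The batching rule in this commit message (grow a chunk until 65535 pages are gathered or a locked page is met) can be sketched in userspace as below. `struct mock_page` and `batch_length()` are hypothetical names, not the AFS implementation. Stopping at a page that is not dirty is an assumption of this sketch; the commit message itself names only the page limit and the locked-page condition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Page limit per chunk, as stated in the commit message. */
#define MAX_BATCH_PAGES 65535

/* Hypothetical, heavily simplified page state. */
struct mock_page {
    bool dirty;
    bool locked;
};

/*
 * Return how many consecutive pages, starting at index `start`, could go
 * into one writeback chunk. The run ends at the first locked page, at the
 * first clean page (an assumption of this sketch), at the end of the
 * array, or when the page limit is reached.
 */
size_t batch_length(const struct mock_page *pages, size_t npages, size_t start)
{
    assert(pages != NULL || npages == 0);

    size_t n = 0;
    while (start + n < npages && n < MAX_BATCH_PAGES) {
        const struct mock_page *pg = &pages[start + n];
        if (pg->locked || !pg->dirty)
            break;
        n++;
    }
    return n;
}
```

For example, with five pages where the third is locked and the fifth is clean, a chunk starting at the first page covers two pages, and a chunk starting at the fourth page covers one.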
  24. 08 May, 2007 1 commit
    • slab allocators: Remove SLAB_DEBUG_INITIAL flag · 50953fe9
      Christoph Lameter committed
      I have never seen a use of SLAB_DEBUG_INITIAL.  It is only supported by
      SLAB.
      
      I think its purpose was to have a callback after an object has been freed
      to verify that the state is the constructor state again?  The callback is
      performed before each freeing of an object.
      
      I would think that it is much easier to check the object state manually
      before the free.  That also places the check near the code that
      manipulates the object.
      
      Also the SLAB_DEBUG_INITIAL callback is only performed if the kernel was
      compiled with SLAB debugging on.  If there were code in a constructor
      handling SLAB_DEBUG_INITIAL then it would have to be conditional on
      SLAB_DEBUG, otherwise it would just be dead code.  But there is no such code
      in the kernel.  I think SLAB_DEBUG_INITIAL is too problematic to make real
      use of, difficult to understand and there are easier ways to accomplish the
      same effect (i.e.  add debug code before kfree).
      
      There is a related flag SLAB_CTOR_VERIFY that is frequently checked to be
      clear in fs inode caches.  Remove the pointless checks (they would even be
      pointless without removal of SLAB_DEBUG_INITIAL) from the fs constructors.
      
      This is the last slab flag that SLUB did not support.  Remove the check for
      unimplemented flags from SLUB.
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  25. 03 May, 2007 1 commit
    • [AFS/AF_RXRPC]: Miscellaneous fixes. · 80c72fe4
      David Howells committed
      Make miscellaneous fixes to AFS and AF_RXRPC:
      
       (*) Make AF_RXRPC select KEYS rather than RXKAD or AFS_FS in Kconfig.
      
       (*) Don't use FS_BINARY_MOUNTDATA.
      
       (*) Remove a done 'TODO' item in a comment on afs_get_sb().
      
       (*) Don't pass a void * as the page pointer argument of kmap_atomic() as this
           breaks on m68k.  Patch from Geert Uytterhoeven <geert@linux-m68k.org>.
      
       (*) Use match_*() functions rather than doing my own parsing.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
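The last item above replaces hand-written option parsing with the kernel's match_*() helpers. As a rough userspace illustration of the table-driven style those helpers encourage, the sketch below looks each comma-separated option up in a token table. It does not use the kernel API: `match_opt()`, `parse_opts()`, `struct mount_opts` and the token table are hypothetical, and the option names are assumptions ("autocell" is borrowed from the autocell commit elsewhere in this log; "rwpath" and "cell=" are illustrative).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum { OPT_AUTOCELL, OPT_RWPATH, OPT_CELL, OPT_ERR };

/* Hypothetical token table: one row per recognised option. */
struct opt_token {
    int id;
    const char *name;
    bool has_arg; /* true for "name=value" style options */
};

static const struct opt_token tokens[] = {
    { OPT_AUTOCELL, "autocell", false },
    { OPT_RWPATH,   "rwpath",   false },
    { OPT_CELL,     "cell=",    true  },
};

struct mount_opts {
    bool autocell;
    bool rwpath;
    char cell[64];
};

/* Look one option up in the table; for "name=value" options, report where
 * the value starts and how long it is. An empty value is not accepted. */
int match_opt(const char *s, size_t len, const char **arg, size_t *arg_len)
{
    for (size_t i = 0; i < sizeof(tokens) / sizeof(tokens[0]); i++) {
        size_t nlen = strlen(tokens[i].name);
        if (tokens[i].has_arg) {
            if (len > nlen && strncmp(s, tokens[i].name, nlen) == 0) {
                *arg = s + nlen;
                *arg_len = len - nlen;
                return tokens[i].id;
            }
        } else if (len == nlen && strncmp(s, tokens[i].name, nlen) == 0) {
            return tokens[i].id;
        }
    }
    return OPT_ERR;
}

/* Parse a comma-separated option string. Returns 0 on success, -1 on an
 * unknown option or an over-long cell name. Empty items are skipped. */
int parse_opts(const char *options, struct mount_opts *out)
{
    assert(options != NULL && out != NULL);
    memset(out, 0, sizeof(*out));

    const char *p = options;
    while (*p) {
        const char *end = strchr(p, ',');
        size_t len = end ? (size_t)(end - p) : strlen(p);
        const char *arg = NULL;
        size_t arg_len = 0;

        if (len != 0) {
            switch (match_opt(p, len, &arg, &arg_len)) {
            case OPT_AUTOCELL:
                out->autocell = true;
                break;
            case OPT_RWPATH:
                out->rwpath = true;
                break;
            case OPT_CELL:
                if (arg_len >= sizeof(out->cell))
                    return -1;
                memcpy(out->cell, arg, arg_len);
                out->cell[arg_len] = '\0';
                break;
            default:
                return -1;
            }
        }
        if (!end)
            break;
        p = end + 1;
    }
    return 0;
}
```

Adding an option then means adding a table row and a `case`, rather than another branch of string comparisons.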
  26. 27 Apr, 2007 5 commits
  27. 13 Feb, 2007 1 commit
  28. 08 Dec, 2006 3 commits
  29. 23 Jun, 2006 1 commit
    • [PATCH] VFS: Permit filesystem to override root dentry on mount · 454e2398
      David Howells committed
      Extend the get_sb() filesystem operation to take an extra argument that
      permits the VFS to pass in the target vfsmount that defines the mountpoint.
      
      The filesystem is then required to manually set the superblock and root dentry
      pointers.  For most filesystems, this should be done with simple_set_mnt()
      which will set the superblock pointer and then set the root dentry to the
      superblock's s_root (as per the old default behaviour).
      
      The get_sb() op now returns an integer as there's now no need to return the
      superblock pointer.
      
      This patch permits a superblock to be implicitly shared amongst several mount
      points, such as can be done with NFS to avoid potential inode aliasing.  In
      such a case, simple_set_mnt() would not be called, and instead the mnt_root
      and mnt_sb would be set directly.
      
      The patch also makes the following changes:
      
       (*) the get_sb_*() convenience functions in the core kernel now take a vfsmount
           pointer argument and return an integer, so most filesystems have to change
           very little.
      
       (*) If one of the convenience functions is not used, then get_sb() should
           normally call simple_set_mnt() to instantiate the vfsmount. This will
           always return 0, and so can be tail-called from get_sb().
      
       (*) generic_shutdown_super() now calls shrink_dcache_sb() to clean up the
           dcache upon superblock destruction rather than shrink_dcache_anon().
      
           This is required because the superblock may now have multiple trees that
           aren't actually bound to s_root, but that still need to be cleaned up. The
           currently called functions assume that the whole tree is rooted at s_root,
           and that anonymous dentries are not the roots of trees which results in
           dentries being left unculled.
      
           However, with the way NFS superblock sharing is currently set to be
           implemented, these assumptions are violated: the root of the filesystem is
           simply a dummy dentry and inode (the real inode for '/' may well be
           inaccessible), and all the vfsmounts are rooted on anonymous[*] dentries
           with child trees.
      
           [*] Anonymous until discovered from another tree.
      
       (*) The documentation has been adjusted, including the additional bit of
           changing ext2_* into foo_* in the documentation.
      
      [akpm@osdl.org: convert ipath_fs, do other stuff]
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Nathan Scott <nathans@sgi.com>
      Cc: Roland Dreier <rolandd@cisco.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
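The new get_sb() contract described in this commit (return an integer, and set the vfsmount's superblock and root dentry explicitly) can be sketched with userspace mocks. The structures below are stripped-down stand-ins for the kernel types, `mock_simple_set_mnt()` and `mock_dget()` are hypothetical imitations, and `foo_get_sb()` takes a simplified argument list rather than the real get_sb() signature. Taking a reference on the root dentry is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Stripped-down stand-ins for the kernel structures. */
struct dentry {
    int d_count; /* reference count */
};

struct super_block {
    struct dentry *s_root;
};

struct vfsmount {
    struct super_block *mnt_sb;
    struct dentry *mnt_root;
};

/* Mock of taking a dentry reference. */
struct dentry *mock_dget(struct dentry *d)
{
    if (d)
        d->d_count++;
    return d;
}

/* Default behaviour as described in the commit message: point the mount at
 * the superblock and at the superblock's s_root. Always returns 0. */
int mock_simple_set_mnt(struct vfsmount *mnt, struct super_block *sb)
{
    assert(mnt != NULL && sb != NULL);
    mnt->mnt_sb = sb;
    mnt->mnt_root = mock_dget(sb->s_root);
    return 0;
}

/* Hypothetical filesystem get_sb(): it now returns an int and finishes by
 * tail-calling the helper instead of returning the superblock pointer. */
int foo_get_sb(struct super_block *sb, struct vfsmount *mnt)
{
    /* ... filesystem-specific superblock setup would go here ... */
    return mock_simple_set_mnt(mnt, sb);
}
```

A filesystem that shares one superblock across several mounts, as the commit message describes for NFS, would skip the helper and assign `mnt_sb` and `mnt_root` directly, pointing `mnt_root` at a dentry other than `s_root`.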
  30. 09 Jun, 2006 1 commit