1. 09 5月, 2009 1 次提交
  2. 03 4月, 2009 1 次提交
    • I
      kmemtrace, fs: uninline simple_transaction_set() · 76791ab2
      Ingo Molnar 提交于
      Impact: cleanup
      
      We want to remove percpu.h from rcupdate.h (for upcoming kmemtrace
      changes), but this is not possible currently without breaking the
      build because fs.h has an implicit include file depedency: it
      uses PAGE_SIZE but does not include asm/page.h which defines it.
      
      This problem gets masked in practice because most fs.h using sites
      use rcupreempt.h (and other headers) which includes percpu.h which
      brings in asm/page.h indirectly.
      
      We cannot add asm/page.h to asm/fs.h because page.h is not an
      exported header.
      
      Move simple_transaction_set() to the other simple-transaction
      file helpers in fs/libfs.c.
      
      This removes the include file hell and also reduces
      kernel size a bit.
      Acked-by: NAl Viro <viro@zeniv.linux.org.uk>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
      Cc: paulmck@linux.vnet.ibm.com
      LKML-Reference: <1237898630.25315.83.camel@penberg-laptop>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      76791ab2
  3. 28 3月, 2009 2 次提交
  4. 06 1月, 2009 1 次提交
  5. 05 1月, 2009 1 次提交
    • N
      fs: symlink write_begin allocation context fix · 54566b2c
      Nick Piggin 提交于
      With the write_begin/write_end aops, page_symlink was broken because it
      could no longer pass a GFP_NOFS type mask into the point where the
      allocations happened.  They are done in write_begin, which would always
      assume that the filesystem can be entered from reclaim.  This bug could
      cause filesystem deadlocks.
      
      The funny thing with having a gfp_t mask there is that it doesn't really
      allow the caller to arbitrarily tinker with the context in which it can be
      called.  It couldn't ever be GFP_ATOMIC, for example, because it needs to
      take the page lock.  The only thing any callers care about is __GFP_FS
      anyway, so turn that into a single flag.
      
      Add a new flag for write_begin, AOP_FLAG_NOFS.  Filesystems can now act on
      this flag in their write_begin function.  Change __grab_cache_page to
      accept a nofs argument as well, to honour that flag (while we're there,
      change the name to grab_cache_page_write_begin which is more instructive
      and does away with random leading underscores).
      
      This is really a more flexible way to go in the end anyway -- if a
      filesystem happens to want any extra allocations aside from the pagecache
      ones in ints write_begin function, it may now use GFP_KERNEL (rather than
      GFP_NOFS) for common case allocations (eg.  ocfs2_alloc_write_ctxt, for a
      random example).
      
      [kosaki.motohiro@jp.fujitsu.com: fix ubifs]
      [kosaki.motohiro@jp.fujitsu.com: fix fuse]
      Signed-off-by: NNick Piggin <npiggin@suse.de>
      Reviewed-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: <stable@kernel.org>		[2.6.28.x]
      Signed-off-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      [ Cleaned up the calling convention: just pass in the AOP flags
        untouched to the grab_cache_page_write_begin() function.  That
        just simplifies everybody, and may even allow future expansion of the
        logic.   - Linus ]
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      54566b2c
  6. 31 10月, 2008 1 次提交
  7. 23 10月, 2008 1 次提交
    • C
      [PATCH] new helper: d_obtain_alias · 4ea3ada2
      Christoph Hellwig 提交于
      The calling conventions of d_alloc_anon are rather unfortunate for all
      users, and it's name is not very descriptive either.
      
      Add d_obtain_alias as a new exported helper that drops the inode
      reference in the failure case, too and allows to pass-through NULL
      pointers and inodes to allow for tail-calls in the export operations.
      
      Incidentally this helper already existed as a private function in
      libfs.c as exportfs_d_alloc so kill that one and switch the callers
      to d_obtain_alias.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      4ea3ada2
  8. 31 7月, 2008 1 次提交
    • A
      VFS: increase pseudo-filesystem block size to PAGE_SIZE · 3971e1a9
      Alex Nixon 提交于
      This commit:
      
          commit ba52de12
          Author: Theodore Ts'o <tytso@mit.edu>
          Date:   Wed Sep 27 01:50:49 2006 -0700
      
              [PATCH] inode-diet: Eliminate i_blksize from the inode structure
      
      caused the block size used by pseudo-filesystems to decrease from
      PAGE_SIZE to 1024 leading to a doubling of the number of context switches
      during a kernbench run.
      Signed-off-by: NAlex Nixon <Alex.Nixon@citrix.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Jeremy Fitzhardinge <jeremy@goop.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Ian Campbell <Ian.Campbell@eu.citrix.com>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Hugh Dickins <hugh@veritas.com>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Cc: <stable@kernel.org>		[2.6.25.x, 2.6.26.x]
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      3971e1a9
  9. 05 7月, 2008 1 次提交
  10. 07 6月, 2008 1 次提交
    • A
      introduce memory_read_from_buffer() · 93b07113
      Akinobu Mita 提交于
      This patch introduces memory_read_from_buffer().
      
      The only difference between memory_read_from_buffer() and
      simple_read_from_buffer() is which address space the function copies to.
      
      simple_read_from_buffer copies to user space memory.
      memory_read_from_buffer copies to normal memory.
      Signed-off-by: NAkinobu Mita <akinobu.mita@gmail.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Doug Warzecha <Douglas_Warzecha@dell.com>
      Cc: Zhang Rui <rui.zhang@intel.com>
      Cc: Matt Domsch <Matt_Domsch@dell.com>
      Cc: Abhay Salunke <Abhay_Salunke@dell.com>
      Cc: Greg Kroah-Hartman <gregkh@suse.de>
      Cc: Markus Rechberger <markus.rechberger@amd.com>
      Cc: Kay Sievers <kay.sievers@vrfy.org>
      Cc: Bob Moore <robert.moore@intel.com>
      Cc: Thomas Renninger <trenn@suse.de>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: "Antonino A. Daplas" <adaplas@pol.net>
      Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
      Cc: Michael Holzheu <holzheu@de.ibm.com>
      Cc: Brian King <brking@us.ibm.com>
      Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
      Cc: Andrew Vasquez <linux-driver@qlogic.com>
      Cc: Seokmann Ju <seokmann.ju@qlogic.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      93b07113
  11. 09 2月, 2008 3 次提交
  12. 06 2月, 2008 1 次提交
    • C
      Pagecache zeroing: zero_user_segment, zero_user_segments and zero_user · eebd2aa3
      Christoph Lameter 提交于
      Simplify page cache zeroing of segments of pages through 3 functions
      
      zero_user_segments(page, start1, end1, start2, end2)
      
              Zeros two segments of the page. It takes the position where to
              start and end the zeroing which avoids length calculations and
      	makes code clearer.
      
      zero_user_segment(page, start, end)
      
              Same for a single segment.
      
      zero_user(page, start, length)
      
              Length variant for the case where we know the length.
      
      We remove the zero_user_page macro. Issues:
      
      1. Its a macro. Inline functions are preferable.
      
      2. The KM_USER0 macro is only defined for HIGHMEM.
      
         Having to treat this special case everywhere makes the
         code needlessly complex. The parameter for zeroing is always
         KM_USER0 except in one single case that we open code.
      
      Avoiding KM_USER0 makes a lot of code not having to be dealing
      with the special casing for HIGHMEM anymore. Dealing with
      kmap is only necessary for HIGHMEM configurations. In those
      configurations we use KM_USER0 like we do for a series of other
      functions defined in highmem.h.
      
      Since KM_USER0 is depends on HIGHMEM the existing zero_user_page
      function could not be a macro. zero_user_* functions introduced
      here can be be inline because that constant is not used when these
      functions are called.
      
      Also extract the flushing of the caches to be outside of the kmap.
      
      [akpm@linux-foundation.org: fix nfs and ntfs build]
      [akpm@linux-foundation.org: fix ntfs build some more]
      Signed-off-by: NChristoph Lameter <clameter@sgi.com>
      Cc: Steven French <sfrench@us.ibm.com>
      Cc: Michael Halcrow <mhalcrow@us.ibm.com>
      Cc: <linux-ext4@vger.kernel.org>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Cc: "J. Bruce Fields" <bfields@fieldses.org>
      Cc: Anton Altaparmakov <aia21@cantab.net>
      Cc: Mark Fasheh <mark.fasheh@oracle.com>
      Cc: David Chinner <dgc@sgi.com>
      Cc: Michael Halcrow <mhalcrow@us.ibm.com>
      Cc: Steven French <sfrench@us.ibm.com>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      eebd2aa3
  13. 22 10月, 2007 1 次提交
    • C
      exportfs: add new methods · 2596110a
      Christoph Hellwig 提交于
      Add the guts for the new filesystem API to exportfs.
      
      There's now a fh_to_dentry method that returns a dentry for the object looked
      for given a filehandle fragment, and a fh_to_parent operation that returns the
      dentry for the encoded parent directory in case the file handle contains it.
      
      There are default implementations for these methods that only take a callback
      for an nfs-enhanced iget variant and implement the rest of the semantics.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Cc: Neil Brown <neilb@suse.de>
      Cc: "J. Bruce Fields" <bfields@fieldses.org>
      Cc: <linux-ext4@vger.kernel.org>
      Cc: Dave Kleikamp <shaggy@austin.ibm.com>
      Cc: Anton Altaparmakov <aia21@cantab.net>
      Cc: David Chinner <dgc@sgi.com>
      Cc: Timothy Shimmin <tes@sgi.com>
      Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
      Cc: Hugh Dickins <hugh@veritas.com>
      Cc: Chris Mason <mason@suse.com>
      Cc: Jeff Mahoney <jeffm@suse.com>
      Cc: "Vladimir V. Saveliev" <vs@namesys.com>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Cc: Mark Fasheh <mark.fasheh@oracle.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2596110a
  14. 17 10月, 2007 2 次提交
  15. 09 5月, 2007 2 次提交
  16. 05 3月, 2007 1 次提交
  17. 21 2月, 2007 1 次提交
    • N
      [PATCH] fs: fix libfs data leak · 955eff5a
      Nick Piggin 提交于
      simple_prepare_write leaks uninitialised kernel data.  This happens because
      the it leaves an uninitialised "hole" over the part of the page that the
      write is expected to go to.  This is fine, but it then marks the page
      uptodate, which means a concurrent read can come in and copy the
      uninitialised memory into userspace before it written to.
      
      Fix it by simply marking it uptodate in simple_commit_write instead, after
      the hole has been filled in.  This could theoretically break an fs that
      uses simple_prepare_write and not simple_commit_write, and that relies on
      the incorrect simple_prepare_write behaviour.  Luckily, none of those
      exists in the tree.
      Signed-off-by: NNick Piggin <npiggin@suse.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      955eff5a
  18. 13 2月, 2007 2 次提交
  19. 09 12月, 2006 1 次提交
  20. 01 10月, 2006 2 次提交
  21. 30 9月, 2006 1 次提交
  22. 27 9月, 2006 2 次提交
    • T
      [PATCH] inode-diet: Eliminate i_blksize from the inode structure · ba52de12
      Theodore Ts'o 提交于
      This eliminates the i_blksize field from struct inode.  Filesystems that want
      to provide a per-inode st_blksize can do so by providing their own getattr
      routine instead of using the generic_fillattr() function.
      
      Note that some filesystems were providing pretty much random (and incorrect)
      values for i_blksize.
      
      [bunk@stusta.de: cleanup]
      [akpm@osdl.org: generic_fillattr() fix]
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: NAdrian Bunk <bunk@stusta.de>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      ba52de12
    • T
      [PATCH] inode_diet: Replace inode.u.generic_ip with inode.i_private · 8e18e294
      Theodore Ts'o 提交于
      The following patches reduce the size of the VFS inode structure by 28 bytes
      on a UP x86.  (It would be more on an x86_64 system).  This is a 10% reduction
      in the inode size on a UP kernel that is configured in a production mode
      (i.e., with no spinlock or other debugging functions enabled; if you want to
      save memory taken up by in-core inodes, the first thing you should do is
      disable the debugging options; they are responsible for a huge amount of bloat
      in the VFS inode structure).
      
      This patch:
      
      The filesystem or device-specific pointer in the inode is inside a union,
      which is pretty pointless given that all 30+ users of this field have been
      using the void pointer.  Get rid of the union and rename it to i_private, with
      a comment to explain who is allowed to use the void pointer.  This is just a
      cleanup, but it allows us to reuse the union 'u' for something something where
      the union will actually be used.
      
      [judith@osdl.org: powerpc build fix]
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: NJudith Lebzelter <judith@osdl.org>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      8e18e294
  23. 27 6月, 2006 1 次提交
  24. 23 6月, 2006 2 次提交
    • D
      [PATCH] VFS: Permit filesystem to perform statfs with a known root dentry · 726c3342
      David Howells 提交于
      Give the statfs superblock operation a dentry pointer rather than a superblock
      pointer.
      
      This complements the get_sb() patch.  That reduced the significance of
      sb->s_root, allowing NFS to place a fake root there.  However, NFS does
      require a dentry to use as a target for the statfs operation.  This permits
      the root in the vfsmount to be used instead.
      
      linux/mount.h has been added where necessary to make allyesconfig build
      successfully.
      
      Interest has also been expressed for use with the FUSE and XFS filesystems.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Acked-by: NAl Viro <viro@zeniv.linux.org.uk>
      Cc: Nathan Scott <nathans@sgi.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      726c3342
    • D
      [PATCH] VFS: Permit filesystem to override root dentry on mount · 454e2398
      David Howells 提交于
      Extend the get_sb() filesystem operation to take an extra argument that
      permits the VFS to pass in the target vfsmount that defines the mountpoint.
      
      The filesystem is then required to manually set the superblock and root dentry
      pointers.  For most filesystems, this should be done with simple_set_mnt()
      which will set the superblock pointer and then set the root dentry to the
      superblock's s_root (as per the old default behaviour).
      
      The get_sb() op now returns an integer as there's now no need to return the
      superblock pointer.
      
      This patch permits a superblock to be implicitly shared amongst several mount
      points, such as can be done with NFS to avoid potential inode aliasing.  In
      such a case, simple_set_mnt() would not be called, and instead the mnt_root
      and mnt_sb would be set directly.
      
      The patch also makes the following changes:
      
       (*) the get_sb_*() convenience functions in the core kernel now take a vfsmount
           pointer argument and return an integer, so most filesystems have to change
           very little.
      
       (*) If one of the convenience function is not used, then get_sb() should
           normally call simple_set_mnt() to instantiate the vfsmount. This will
           always return 0, and so can be tail-called from get_sb().
      
       (*) generic_shutdown_super() now calls shrink_dcache_sb() to clean up the
           dcache upon superblock destruction rather than shrink_dcache_anon().
      
           This is required because the superblock may now have multiple trees that
           aren't actually bound to s_root, but that still need to be cleaned up. The
           currently called functions assume that the whole tree is rooted at s_root,
           and that anonymous dentries are not the roots of trees which results in
           dentries being left unculled.
      
           However, with the way NFS superblock sharing are currently set to be
           implemented, these assumptions are violated: the root of the filesystem is
           simply a dummy dentry and inode (the real inode for '/' may well be
           inaccessible), and all the vfsmounts are rooted on anonymous[*] dentries
           with child trees.
      
           [*] Anonymous until discovered from another tree.
      
       (*) The documentation has been adjusted, including the additional bit of
           changing ext2_* into foo_* in the documentation.
      
      [akpm@osdl.org: convert ipath_fs, do other stuff]
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Acked-by: NAl Viro <viro@zeniv.linux.org.uk>
      Cc: Nathan Scott <nathans@sgi.com>
      Cc: Roland Dreier <rolandd@cisco.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      454e2398
  25. 09 6月, 2006 1 次提交
  26. 29 3月, 2006 1 次提交
  27. 23 3月, 2006 1 次提交
  28. 04 2月, 2006 1 次提交
  29. 10 1月, 2006 1 次提交
  30. 09 1月, 2006 1 次提交
    • E
      [PATCH] shrink dentry struct · 5160ee6f
      Eric Dumazet 提交于
      Some long time ago, dentry struct was carefully tuned so that on 32 bits
      UP, sizeof(struct dentry) was exactly 128, ie a power of 2, and a multiple
      of memory cache lines.
      
      Then RCU was added and dentry struct enlarged by two pointers, with nice
      results for SMP, but not so good on UP, because breaking the above tuning
      (128 + 8 = 136 bytes)
      
      This patch reverts this unwanted side effect, by using an union (d_u),
      where d_rcu and d_child are placed so that these two fields can share their
      memory needs.
      
      At the time d_free() is called (and d_rcu is really used), d_child is known
      to be empty and not touched by the dentry freeing.
      
      Lockless lookups only access d_name, d_parent, d_lock, d_op, d_flags (so
      the previous content of d_child is not needed if said dentry was unhashed
      but still accessed by a CPU because of RCU constraints)
      
      As dentry cache easily contains millions of entries, a size reduction is
      worth the extra complexity of the ugly C union.
      Signed-off-by: NEric Dumazet <dada1@cosmosbay.com>
      Cc: Dipankar Sarma <dipankar@in.ibm.com>
      Cc: Maneesh Soni <maneesh@in.ibm.com>
      Cc: Miklos Szeredi <miklos@szeredi.hu>
      Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
      Cc: Ian Kent <raven@themaw.net>
      Cc: Paul Jackson <pj@sgi.com>
      Cc: Al Viro <viro@ftp.linux.org.uk>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Cc: Neil Brown <neilb@cse.unsw.edu.au>
      Cc: James Morris <jmorris@namei.org>
      Cc: Stephen Smalley <sds@epoch.ncsc.mil>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      5160ee6f
  31. 26 6月, 2005 1 次提交