1. 08 1月, 2009 1 次提交
    • J
      nfsd: fix double-locks of directory mutex · 9a8d248e
      J. Bruce Fields 提交于
      A number of nfsd operations depend on the i_mutex to cover more code
      than just the fsync, so the approach of 4c728ef5 "add a vfs_fsync
      helper" doesn't work for nfsd.  Revert the parts of those patches that
      touch nfsd.
      
      Note: we can't, however, remove the logic from vfs_fsync that was needed
      only for the special case of nfsd, because a vfs_fsync(NULL,...) call
      can still result indirectly from a stackable filesystem that was called
      by nfsd.  (Thanks to Christoph Hellwig for pointing this out.)
      Reported-by: NEric Sesterhenn <snakebyte@gmx.de>
      Signed-off-by: NJ. Bruce Fields <bfields@citi.umich.edu>
      9a8d248e
  2. 06 1月, 2009 2 次提交
    • C
      add a vfs_fsync helper · 4c728ef5
      Christoph Hellwig 提交于
      Fsync currently has a fdatawrite/fdatawait pair around the method call,
      and a mutex_lock/unlock of the inode mutex.  All callers of fsync have
      to duplicate this, but we have a few and most of them don't quite get
      it right.  This patch adds a new vfs_fsync that takes care of this.
      It's a little more complicated as usual as ->fsync might get a NULL file
      pointer and just a dentry from nfsd, but otherwise gets afile and we
      want to take the mapping and file operations from it when it is there.
      
      Notes on the fsync callers:
      
       - ecryptfs wasn't calling filemap_fdatawrite / filemap_fdatawait on the
         	lower file
       - coda wasn't calling filemap_fdatawrite / filemap_fdatawait on the host
      	file, and returning 0 when ->fsync was missing
       - shm wasn't calling either filemap_fdatawrite / filemap_fdatawait nor
         taking i_mutex.  Now given that shared memory doesn't have disk
         backing not doing anything in fsync seems fine and I left it out of
         the vfs_fsync conversion for now, but in that case we might just
         not pass it through to the lower file at all but just call the no-op
         simple_sync_file directly.
      
      [and now actually export vfs_fsync]
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      4c728ef5
    • A
      inode->i_op is never NULL · acfa4380
      Al Viro 提交于
      We used to have rather schizophrenic set of checks for NULL ->i_op even
      though it had been eliminated years ago.  You'd need to go out of your
      way to set it to NULL explicitly _and_ a bunch of code would die on
      such inodes anyway.  After killing two remaining places that still
      did that bogosity, all that crap can go away.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      acfa4380
  3. 14 11月, 2008 2 次提交
  4. 10 11月, 2008 1 次提交
    • D
      Fix nfsd truncation of readdir results · b726e923
      Doug Nazar 提交于
      Commit 8d7c4203 "nfsd: fix failure to set eof in readdir in some
      situations" introduced a bug: on a directory in an exported ext3
      filesystem with dir_index unset, a READDIR will only return about 250
      entries, even if the directory was larger.
      
      Bisected it back to this commit; reverting it fixes the problem.
      
      It turns out that in this case ext3 reads a block at a time, then
      returns from readdir, which means we can end up with buf.full==0 but
      with more entries in the directory still to be read.  Before 8d7c4203
      (but after c002a6c7 "Optimise NFS readdir hack slightly"), this would
      cause us to return the READDIR result immediately, but with the eof bit
      unset.  That could cause a performance regression (because the client
      would need more roundtrips to the server to read the whole directory),
      but no loss in correctness, since the cleared eof bit caused the client
      to send another readdir.  After 8d7c4203, the setting of the eof bit
      made this a correctness problem.
      
      So, move nfserr_eof into the loop and remove the buf.full check so that
      we loop until buf.used==0.  The following seems to do the right thing
      and reduces the network traffic since we don't return a READDIR result
      until the buffer is full.
      
      Tested on an empty directory & large directory; eof is properly sent and
      there are no more short buffers.
      Signed-off-by: NDoug Nazar <nazard@dragoninc.ca>
      Cc: David Woodhouse <David.Woodhouse@intel.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NJ. Bruce Fields <bfields@citi.umich.edu>
      b726e923
  5. 31 10月, 2008 1 次提交
    • J
      nfsd: fix failure to set eof in readdir in some situations · 8d7c4203
      J. Bruce Fields 提交于
      Before 14f7dd63 "[PATCH] Copy XFS
      readdir hack into nfsd code", readdir_cd->err was reset to eof before
      each call to vfs_readdir; afterwards, it is set only once.  Similarly,
      c002a6c7 "[PATCH] Optimise NFS readdir
      hack slightly", can cause us to exit without nfserr_eof set.  Fix this.
      
      This ensures the "eof" bit is set when needed in readdir replies.  (The
      particular case I saw was an nfsv4 readdir of an empty directory, which
      returned with no entries (the protocol requires "." and ".." to be
      filtered out), but with eof unset.)
      
      Cc: David Woodhouse <David.Woodhouse@intel.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NJ. Bruce Fields <bfields@citi.umich.edu>
      8d7c4203
  6. 23 10月, 2008 5 次提交
  7. 30 9月, 2008 2 次提交
    • J
      knfsd: allocate readahead cache in individual chunks · 54a66e54
      Jeff Layton 提交于
      I had a report from someone building a large NFS server that they were
      unable to start more than 585 nfsd threads. It was reported against an
      older kernel using the slab allocator, and I tracked it down to the
      large allocation in nfsd_racache_init failing.
      
      It appears that the slub allocator handles large allocations better,
      but large contiguous allocations can often be problematic. There
      doesn't seem to be any reason that the racache has to be allocated as a
      single large chunk. This patch breaks this up so that the racache is
      built up from separate allocations.
      
      (Thanks also to Takashi Iwai for a bugfix.)
      Signed-off-by: NJeff Layton <jlayton@redhat.com>
      Signed-off-by: NJ. Bruce Fields <bfields@citi.umich.edu>
      Cc: Takashi Iwai <tiwai@suse.de>
      54a66e54
    • J
      nfsd: permit unauthenticated stat of export root · 04716e66
      J. Bruce Fields 提交于
      RFC 2623 section 2.3.2 permits the server to bypass gss authentication
      checks for certain operations that a client may perform when mounting.
      In the case of a client that doesn't have some form of credentials
      available to it on boot, this allows it to perform the mount unattended.
      (Presumably real file access won't be needed until a user with
      credentials logs in.)
      
      Being slightly more lenient allows lots of old clients to access
      krb5-only exports, with the only loss being a small amount of
      information leaked about the root directory of the export.
      
      This affects only v2 and v3; v4 still requires authentication for all
      access.
      
      Thanks to Peter Staubach testing against a Solaris client, which
      suggesting addition of v3 getattr, to the list, and to Trond for noting
      that doing so exposes no additional information.
      Signed-off-by: NJ. Bruce Fields <bfields@citi.umich.edu>
      Cc: Peter Staubach <staubach@redhat.com>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      04716e66
  8. 27 7月, 2008 2 次提交
  9. 02 7月, 2008 1 次提交
  10. 24 6月, 2008 1 次提交
  11. 24 4月, 2008 4 次提交
  12. 19 4月, 2008 6 次提交
  13. 15 2月, 2008 1 次提交
  14. 02 2月, 2008 2 次提交
    • J
      nfsd: allow root to set uid and gid on create · 5c002b3b
      J. Bruce Fields 提交于
      The server silently ignores attempts to set the uid and gid on create.
      Based on the comment, this appears to have been done to prevent some
      overly-clever IRIX client from causing itself problems.
      
      Perhaps we should remove that hack completely.  For now, at least, it
      makes sense to allow root (when no_root_squash is set) to set uid and
      gid.
      
      While we're there, since nfsd_create and nfsd_create_v3 share the same
      logic, pull that out into a separate function.  And spell out the
      individual modifications of ia_valid instead of doing them both at once
      inside a conditional.
      
      Thanks to Roger Willcocks <roger@filmlight.ltd.uk> for the bug report
      and original patch on which this is based.
      Signed-off-by: NJ. Bruce Fields <bfields@citi.umich.edu>
      5c002b3b
    • C
      NFSD: Adjust filename length argument of nfsd_lookup · 5a022fc8
      Chuck Lever 提交于
      Clean up: adjust the sign of the length argument of nfsd_lookup and
      nfsd_lookup_dentry, for consistency with recent changes.  NFSD version
      4 callers already pass an unsigned file name length.
      Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
      Acked-By: NNeilBrown <neilb@suse.de>
      Signed-off-by: NJ. Bruce Fields <bfields@citi.umich.edu>
      5a022fc8
  15. 20 10月, 2007 1 次提交
  16. 19 10月, 2007 1 次提交
  17. 17 10月, 2007 2 次提交
    • S
      Implement file posix capabilities · b5376771
      Serge E. Hallyn 提交于
      Implement file posix capabilities.  This allows programs to be given a
      subset of root's powers regardless of who runs them, without having to use
      setuid and giving the binary all of root's powers.
      
      This version works with Kaigai Kohei's userspace tools, found at
      http://www.kaigai.gr.jp/index.php.  For more information on how to use this
      patch, Chris Friedhoff has posted a nice page at
      http://www.friedhoff.org/fscaps.html.
      
      Changelog:
      	Nov 27:
      	Incorporate fixes from Andrew Morton
      	(security-introduce-file-caps-tweaks and
      	security-introduce-file-caps-warning-fix)
      	Fix Kconfig dependency.
      	Fix change signaling behavior when file caps are not compiled in.
      
      	Nov 13:
      	Integrate comments from Alexey: Remove CONFIG_ ifdef from
      	capability.h, and use %zd for printing a size_t.
      
      	Nov 13:
      	Fix endianness warnings by sparse as suggested by Alexey
      	Dobriyan.
      
      	Nov 09:
      	Address warnings of unused variables at cap_bprm_set_security
      	when file capabilities are disabled, and simultaneously clean
      	up the code a little, by pulling the new code into a helper
      	function.
      
      	Nov 08:
      	For pointers to required userspace tools and how to use
      	them, see http://www.friedhoff.org/fscaps.html.
      
      	Nov 07:
      	Fix the calculation of the highest bit checked in
      	check_cap_sanity().
      
      	Nov 07:
      	Allow file caps to be enabled without CONFIG_SECURITY, since
      	capabilities are the default.
      	Hook cap_task_setscheduler when !CONFIG_SECURITY.
      	Move capable(TASK_KILL) to end of cap_task_kill to reduce
      	audit messages.
      
      	Nov 05:
      	Add secondary calls in selinux/hooks.c to task_setioprio and
      	task_setscheduler so that selinux and capabilities with file
      	cap support can be stacked.
      
      	Sep 05:
      	As Seth Arnold points out, uid checks are out of place
      	for capability code.
      
      	Sep 01:
      	Define task_setscheduler, task_setioprio, cap_task_kill, and
      	task_setnice to make sure a user cannot affect a process in which
      	they called a program with some fscaps.
      
      	One remaining question is the note under task_setscheduler: are we
      	ok with CAP_SYS_NICE being sufficient to confine a process to a
      	cpuset?
      
      	It is a semantic change, as without fsccaps, attach_task doesn't
      	allow CAP_SYS_NICE to override the uid equivalence check.  But since
      	it uses security_task_setscheduler, which elsewhere is used where
      	CAP_SYS_NICE can be used to override the uid equivalence check,
      	fixing it might be tough.
      
      	     task_setscheduler
      		 note: this also controls cpuset:attach_task.  Are we ok with
      		     CAP_SYS_NICE being used to confine to a cpuset?
      	     task_setioprio
      	     task_setnice
      		 sys_setpriority uses this (through set_one_prio) for another
      		 process.  Need same checks as setrlimit
      
      	Aug 21:
      	Updated secureexec implementation to reflect the fact that
      	euid and uid might be the same and nonzero, but the process
      	might still have elevated caps.
      
      	Aug 15:
      	Handle endianness of xattrs.
      	Enforce capability version match between kernel and disk.
      	Enforce that no bits beyond the known max capability are
      	set, else return -EPERM.
      	With this extra processing, it may be worth reconsidering
      	doing all the work at bprm_set_security rather than
      	d_instantiate.
      
      	Aug 10:
      	Always call getxattr at bprm_set_security, rather than
      	caching it at d_instantiate.
      
      [morgan@kernel.org: file-caps clean up for linux/capability.h]
      [bunk@kernel.org: unexport cap_inode_killpriv]
      Signed-off-by: NSerge E. Hallyn <serue@us.ibm.com>
      Cc: Stephen Smalley <sds@tycho.nsa.gov>
      Cc: James Morris <jmorris@namei.org>
      Cc: Chris Wright <chrisw@sous-sol.org>
      Cc: Andrew Morgan <morgan@kernel.org>
      Signed-off-by: NAndrew Morgan <morgan@kernel.org>
      Signed-off-by: NAdrian Bunk <bunk@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b5376771
    • D
      r/o bind mounts: create cleanup helper svc_msnfs() · a8754bee
      Dave Hansen 提交于
      I'm going to be modifying nfsd_rename() shortly to support read-only bind
      mounts.  This #ifdef is around the area I'm patching, and it starts to get
      really ugly if I just try to add my new code by itself.  Using this little
      helper makes things a lot cleaner to use.
      Signed-off-by: NDave Hansen <haveblue@us.ibm.com>
      Acked-by: NChristoph Hellwig <hch@infradead.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a8754bee
  18. 10 10月, 2007 3 次提交
  19. 11 9月, 2007 1 次提交
  20. 01 8月, 2007 1 次提交