1. 18 Jun, 2009 3 commits
  2. 03 Apr, 2009 2 commits
    • NFS: Add mount options to enable local caching on NFS · b797cac7
      Committed by David Howells
      Add NFS mount options to allow the local caching support to be enabled.
      
      The attached patch makes it possible for the NFS filesystem to be told to make
      use of the network filesystem local caching service (FS-Cache).
      
      Using this requires a recent nfs-utils package.
      
      There are three variant NFS mount options that can be added to a mount command
      to control caching for a mount.  Only the last one specified takes effect:
      
       (*) Adding "fsc" will request caching.
      
       (*) Adding "fsc=<string>" will request caching and also specify a uniquifier.
      
       (*) Adding "nofsc" will disable caching.
      
      For example:
      
      	mount warthog:/ /a -o fsc
      
      The cache of a particular superblock (NFS FSID) will be shared between all
      mounts of that volume, provided they have the same connection parameters and
      are not marked 'nosharecache'.
      
      Where it is otherwise impossible to distinguish superblocks because all the
      parameters are identical, but the 'nosharecache' option is supplied, a
      uniquifying string must be supplied, else only the first mount will be
      permitted to use the cache.
      
      If there's a key collision, the second mount will disable caching and write a
      warning to the kernel log.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Steve Dickson <steved@redhat.com>
      Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      Acked-by: Al Viro <viro@zeniv.linux.org.uk>
      Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
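The "only the last one specified takes effect" rule from commit b797cac7 can be sketched as below. This is a minimal illustration, not the kernel's parser: the names `FscSetting` and `parse_fsc_options` are hypothetical, and the sketch assumes that caching is off when none of the three options is given and that a bare "fsc" after "fsc=<string>" clears the uniquifier.

```python
from typing import NamedTuple, Optional


class FscSetting(NamedTuple):
    """Hypothetical result type: whether caching is requested, plus uniquifier."""
    enabled: bool
    uniquifier: Optional[str]


def parse_fsc_options(option_string: str) -> FscSetting:
    """Scan a comma-separated -o string; the last fsc-related option wins."""
    # Assumption: caching is disabled unless explicitly requested.
    setting = FscSetting(enabled=False, uniquifier=None)
    for option in option_string.split(","):
        option = option.strip()
        if option == "fsc":
            setting = FscSetting(True, None)
        elif option.startswith("fsc="):
            setting = FscSetting(True, option[len("fsc="):])
        elif option == "nofsc":
            setting = FscSetting(False, None)
        # Any other option (rw, vers=3, ...) is ignored by this sketch.
    return setting
```

For instance, `parse_fsc_options("fsc=abc,nofsc")` reports caching as disabled, because "nofsc" comes last.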
    • NFS: Define and create superblock-level objects · 08734048
      Committed by David Howells
      Define and create superblock-level cache index objects (as managed by
      nfs_server structs).
      
      Each superblock object is created in a server level index object and is itself
      an index into which inode-level objects are inserted.
      
      Ideally there would be one superblock-level object per server, and the former
      would be folded into the latter; however, since the "nosharecache" option
      exists this isn't possible.
      
      The superblock object key is a sequence consisting of:
      
       (1) Certain superblock s_flags.
      
       (2) Various connection parameters that serve to distinguish superblocks for
           sget().
      
       (3) The volume FSID.
      
       (4) The security flavour.
      
       (5) The uniquifier length.
      
       (6) The uniquifier text.  This is normally an empty string, unless the fsc=xyz
           mount option was used to explicitly specify a uniquifier.
      
      The key blob is of variable length, depending on the length of (6).
      
      The superblock object is given no coherency data to carry in the auxiliary data
      permitted by the cache.  It is assumed that the superblock is always coherent.
      
      This patch also adds uniquification handling such that two otherwise identical
      superblocks, at least one of which is marked "nosharecache", won't end up
      trying to share the on-disk cache.  It will be possible to manually provide a
      uniquifier through a mount option with a later patch to avoid the error
      otherwise produced.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Steve Dickson <steved@redhat.com>
      Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      Acked-by: Al Viro <viro@zeniv.linux.org.uk>
      Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
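The six-part superblock key described in commit 08734048 can be illustrated with a small packing routine. This is only a sketch: the function name `build_superblock_key`, the field widths, the byte order, and the treatment of the connection parameters as an opaque byte string are all assumptions made for illustration, not the kernel's actual layout.

```python
import struct


def build_superblock_key(s_flags: int,
                         connection_params: bytes,
                         fsid: tuple,
                         sec_flavour: int,
                         uniquifier: str = "") -> bytes:
    """Concatenate the key components in the order listed in the commit message."""
    uniq = uniquifier.encode("utf-8")
    if len(uniq) > 0xFFFF:
        raise ValueError("uniquifier too long for the 16-bit length field assumed here")
    key = struct.pack("<I", s_flags)               # (1) selected s_flags (assumed 32-bit)
    key += connection_params                       # (2) connection parameters (opaque here)
    key += struct.pack("<QQ", fsid[0], fsid[1])    # (3) volume FSID (assumed two 64-bit words)
    key += struct.pack("<I", sec_flavour)          # (4) security flavour
    key += struct.pack("<H", len(uniq))            # (5) uniquifier length
    key += uniq                                    # (6) uniquifier text, usually empty
    return key
```

Because the length in (5) precedes the text in (6), the key grows by exactly the uniquifier's length, which matches the commit's note that the blob is of variable length.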
  3. 20 Mar, 2009 1 commit
    • NFS: Optimise NFS close() · 7fe5c398
      Committed by Trond Myklebust
      Close-to-open cache consistency rules really only require us to flush out
      writes on calls to close(), and require us to revalidate attributes on the
      very last close of the file.
      
      Currently we appear to be doing a lot of extra attribute revalidation
      and cache flushes.
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
  4. 12 Mar, 2009 1 commit
    • NFS: Throttle page dirtying while we're flushing to disk · 72cb77f4
      Committed by Trond Myklebust
      The following patch is a combination of a patch by myself and Peter
      Staubach.
      
      Trond: If we allow other processes to dirty pages while a process is doing
      a consistency sync to disk, we can end up never making progress.
      
      Peter: Attached is a patch which addresses a continuing problem with
      the NFS client generating out of order WRITE requests.  While
      this is compliant with all of the current protocol
      specifications, there are servers in the market which cannot
      handle out of order WRITE requests very well.  Also, this may
      lead to sub-optimal block allocations in the underlying file
      system on the server.  This may reduce read throughput when
      the file is later read from the server.
      
      Peter: There has been a lot of work recently done to address out of
      order issues on a systemic level.  However, the NFS client is
      still susceptible to the problem.  Out of order WRITE
      requests can occur when pdflush is in the middle of writing
      out pages while the process dirtying the pages calls
      generic_file_buffered_write which calls
      generic_perform_write which calls
      balance_dirty_pages_rate_limited which ends up calling
      writeback_inodes which ends up calling back into the NFS
      client to write out dirty pages for the same file that
      pdflush happens to be working with.
      Signed-off-by: Peter Staubach <staubach@redhat.com>
      [modification by Trond to merge the two similar patches]
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
  5. 24 Dec, 2008 3 commits
  6. 11 Oct, 2008 1 commit
  7. 08 Oct, 2008 2 commits
  8. 07 Oct, 2008 1 commit
  9. 10 Jul, 2008 1 commit
  10. 20 Mar, 2008 3 commits
  11. 06 Mar, 2008 1 commit
    • NFS: use new LSM interfaces to explicitly set mount options · f9c3a380
      Committed by Eric Paris
      NFS and SELinux worked together previously because SELinux had NFS
      specific knowledge built in.  This design was approved by both groups
      back in 2004 but the recent NFS changes to use nfs_parsed_mount_data and
      the usage of nfs_clone_mount_data showed this to be a poor, fragile
      solution.  This patch fixes the NFS functionality regression by making
      use of the new LSM interfaces to allow an FS to explicitly set its own
      mount options.
      
      The explicit setting of mount options is done in the nfs get_sb
      functions which are called before the generic vfs hooks try to set mount
      options for filesystems which use text mount data.
      
      This does not currently support NFSv4 as that functionality did not
      exist in previous kernels and thus there is no regression.  I will be
      adding the needed code, which I believe to be the exact same as the v3
      code, in nfs4_get_sb for 2.6.26.
      Signed-off-by: Eric Paris <eparis@redhat.com>
      Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      Signed-off-by: James Morris <jmorris@namei.org>
  12. 26 Feb, 2008 1 commit
    • NFS: Add an nfsiod workqueue · 5746006f
      Committed by Trond Myklebust
      NFS post-rpciod cleanups often involve tasks that cannot be safely
      performed within the rpciod context (due to deadlock concerns). We
      therefore add a dedicated NFS workqueue that can perform tasks like
      cleaning up state after an interrupted NFSv4 open() call, or calling
      put_nfs_open_context() after an asynchronous read or write call.
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
  13. 30 Jan, 2008 7 commits
  14. 10 Oct, 2007 3 commits
  15. 11 Jul, 2007 1 commit
  16. 01 May, 2007 1 commit
  17. 04 Feb, 2007 1 commit
  18. 06 Dec, 2006 1 commit
  19. 21 Oct, 2006 1 commit
  20. 23 Sep, 2006 5 commits
    • NFS: Add server and volume lists to /proc · 6aaca566
      Committed by David Howells
      Make two new proc files available:
      
      	/proc/fs/nfsfs/servers
      	/proc/fs/nfsfs/volumes
      
      The first lists the servers with which we are currently dealing (struct
      nfs_client), and the second lists the volumes we have on those servers (struct
      nfs_server).
      Signed-Off-By: David Howells <dhowells@redhat.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
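The two proc files added by commit 6aaca566 can be read like any other text file. The commit message does not describe their column layout, so the sketch below makes no attempt to parse fields; it only returns the non-empty lines, and returns an empty list when the file is absent (for example on a kernel without the NFS client loaded). The helper name `read_proc_lines` is hypothetical.

```python
from pathlib import Path


def read_proc_lines(path) -> list:
    """Return the non-empty lines of a proc-style text file, or [] if unreadable."""
    try:
        text = Path(path).read_text()
    except OSError:
        # Missing file or insufficient permission: report nothing rather than fail.
        return []
    return [line for line in text.splitlines() if line.strip()]


if __name__ == "__main__":
    for name in ("servers", "volumes"):
        print(name, read_proc_lines(f"/proc/fs/nfsfs/{name}"))
```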
    • NFS: Share NFS superblocks per-protocol per-server per-FSID · 54ceac45
      Committed by David Howells
      The attached patch makes NFS share superblocks between mounts from the same
      server and FSID over the same protocol.
      
      It does this by creating each superblock with a false root and returning the
      real root dentry in the vfsmount presented by get_sb(). The root dentry set
      there starts off as an anonymous dentry if we don't already have the dentry
      for its inode; otherwise the dentry we already have is simply reused.
      
      We may thus end up with several trees of dentries in the superblock, and if at
      some later point one of the anonymous tree roots is discovered by normal filesystem
      activity to be located in another tree within the superblock, the anonymous
      root is named and materialises attached to the second tree at the appropriate
      point.
      
      Why do it this way? Why not pass an extra argument to the mount() syscall to
      indicate the subpath and then pathwalk from the server root to the desired
      directory? You can't guarantee this will work for two reasons:
      
       (1) The root and intervening nodes may not be accessible to the client.
      
           With NFS2 and NFS3, for instance, mountd is called on the server to get
           the filehandle for the tip of a path. mountd won't give us handles for
           anything we don't have permission to access, and so we can't set up NFS
           inodes for such nodes, and so can't easily set up dentries (we'd have to
           have ghost inodes or something).
      
           With this patch we don't actually create dentries until we get handles
           from the server that we can use to set up their inodes, and we don't
           actually bind them into the tree until we know for sure where they go.
      
       (2) Inaccessible symbolic links.
      
           If we're asked to mount two exports from the server, eg:
      
      	mount warthog:/warthog/aaa/xxx /mmm
      	mount warthog:/warthog/bbb/yyy /nnn
      
           We may not be able to access anything nearer the root than xxx and yyy,
           but we may find out later that /mmm/www/yyy, say, is actually the same
           directory as the one mounted on /nnn. What we might then find out, for
           example, is that /warthog/bbb was actually a symbolic link to
           /warthog/aaa/xxx/www, but we can't actually determine that by talking to
           the server until /warthog is made available by NFS.
      
           This would lead to having constructed an erroneous dentry tree which we
           can't easily fix. We can end up with a dentry marked as a directory when
           it should actually be a symlink, or we could end up with an apparently
           hardlinked directory.
      
           With this patch we need not make assumptions about the type of a dentry
           for which we can't retrieve information, nor need we assume we know its
           place in the grand scheme of things until we actually see that place.
      
      This patch reduces the possibility of aliasing in the inode and page caches for
      inodes that may be accessed by more than one NFS export. It also reduces the
      number of superblocks required for NFS where there are many NFS exports being
      used from a server (home directory server + autofs for example).
      
      This in turn makes it simpler to do local caching of network filesystems, as it
      can then be guaranteed that there won't be links from multiple inodes in
      separate superblocks to the same cache file.
      
      Obviously, cache aliasing between different levels of NFS protocol could still
      be a problem, but at least that gives us another key to use when indexing the
      cache.
      
      This patch makes the following changes:
      
       (1) The server record construction/destruction has been abstracted out into
           its own set of functions to make things easier to get right.  These have
           been moved into fs/nfs/client.c.
      
           All the code in fs/nfs/client.c has to do with the management of
           connections to servers, and doesn't touch superblocks in any way; the
           remaining code in fs/nfs/super.c has to do with VFS superblock management.
      
       (2) The sequence of events undertaken by NFS mount is now reordered:
      
           (a) A volume representation (struct nfs_server) is allocated.
      
           (b) A server representation (struct nfs_client) is acquired.  This may be
           	 allocated or shared, and is keyed on server address, port and NFS
           	 version.
      
           (c) If allocated, the client representation is initialised.  The state
           	 member variable of nfs_client is used to prevent a race during
           	 initialisation from two mounts.
      
           (d) For NFS4 a simple pathwalk is performed, walking from FH to FH to find
           	 the root filehandle for the mount (fs/nfs/getroot.c).  For NFS2/3 we
           	 are given the root FH in advance.
      
           (e) The volume FSID is probed for on the root FH.
      
           (f) The volume representation is initialised from the FSINFO record
           	 retrieved on the root FH.
      
           (g) sget() is called to acquire a superblock.  This may be allocated or
           	 shared, keyed on client pointer and FSID.
      
           (h) If allocated, the superblock is initialised.
      
           (i) If the superblock is shared, then the new nfs_server record is
           	 discarded.
      
           (j) The root dentry for this mount is looked up from the root FH.
      
           (k) The root dentry for this mount is assigned to the vfsmount.
      
       (3) nfs_readdir_lookup() creates dentries for each of the entries readdir()
           returns; this function now attaches disconnected trees from alternate
           roots that happen to be discovered attached to a directory being read (in
           the same way nfs_lookup() is made to do for lookup ops).
      
           The new d_materialise_unique() function is now used to do this, thus
           permitting the whole thing to be done under one set of locks, and thus
           avoiding any race between mount and lookup operations on the same
           directory.
      
       (4) The client management code uses a new debug facility: NFSDBG_CLIENT which
           is set by echoing 1024 to /proc/net/sunrpc/nfs_debug.
      
       (5) Clone mounts are now called xdev mounts.
      
       (6) Use the dentry passed to the statfs() op as the handle for retrieving fs
           statistics rather than the root dentry of the superblock (which is now a
           dummy).
      Signed-Off-By: David Howells <dhowells@redhat.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
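The sharing rules in the reordered mount sequence of commit 54ceac45 (a client record keyed on server address, port and NFS version; a superblock keyed on client pointer and FSID) can be modelled with two dictionaries. This is a toy model under stated assumptions: `MountTable`, `Client` and `Superblock` are hypothetical names, `id(client)` stands in for the client pointer, and the steps involving filehandles, FSINFO and dentries are left out.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Client:
    """Stand-in for struct nfs_client: one record per (address, port, version)."""
    address: str
    port: int
    version: int


@dataclass
class Superblock:
    """Stand-in for a shared superblock; mounts counts how many mounts use it."""
    client: Client
    fsid: Tuple[int, int]
    mounts: int = 0


class MountTable:
    def __init__(self) -> None:
        self.clients: Dict[tuple, Client] = {}
        self.superblocks: Dict[tuple, Superblock] = {}

    def get_client(self, address: str, port: int, version: int) -> Client:
        # Step (b): acquire a client record, allocating one only if none matches.
        key = (address, port, version)
        if key not in self.clients:
            self.clients[key] = Client(address, port, version)
        return self.clients[key]

    def mount(self, address: str, port: int, version: int,
              fsid: Tuple[int, int]) -> Superblock:
        client = self.get_client(address, port, version)
        # Step (g): look up the superblock by client identity and FSID.
        key = (id(client), fsid)
        sb = self.superblocks.get(key)
        if sb is None:
            sb = Superblock(client, fsid)   # step (h): initialise a new superblock
            self.superblocks[key] = sb
        sb.mounts += 1
        return sb
```

Two mounts of the same FSID from the same server over the same protocol version return the same `Superblock` object, while a different version or FSID yields a separate one.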
    • NFS: Eliminate client_sys in favour of cl_rpcclient · 5006a76c
      Committed by David Howells
      Eliminate nfs_server::client_sys in favour of nfs_client::cl_rpcclient as we
      only really need one per server that we're talking to since it doesn't have any
      security on it.
      
      The retransmission management variables are also moved to the common struct as
      they're required to set up the cl_rpcclient connection.
      
      The NFS2/3 client and client_acl connections are thenceforth derived by cloning
      the cl_rpcclient connection and post-applying the authorisation flavour.
      
      The code for setting up the initial common connection has been moved to
      client.c as nfs_create_rpc_client().  All the NFS program definition tables are
      also moved there as that's where they're now required rather than super.c.
      Signed-Off-By: David Howells <dhowells@redhat.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
    • NFS: Generalise the nfs_client structure · 24c8dbbb
      Committed by David Howells
      Generalise the nfs_client structure by:
      
       (1) Moving nfs_client to a more general place (nfs_fs_sb.h).
      
       (2) Renaming its maintenance routines to be non-NFS4 specific.
      
       (3) Moving those maintenance routines to a new non-NFS4 specific file (client.c)
           and moving the declarations to internal.h.
      
       (4) Making nfs_find/get_client() take a full sockaddr_in to include the port
           number (will be required for NFS2/3).
      
       (5) Making nfs_find/get_client() take the NFS protocol version (again will be
           required to differentiate NFS2, 3 & 4 client records).
      
      Also:
      
       (6) Make nfs_client construction proceed akin to inodes, marking them as under
           construction and providing a function to indicate completion.
      
       (7) Make nfs_get_client() wait interruptibly if it finds a client that it can
           share, but that client is currently being constructed.
      
       (8) Make nfs4_create_client() use (6) and (7) instead of locking cl_sem.
      Signed-Off-By: David Howells <dhowells@redhat.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
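Points (6) and (7) of commit 24c8dbbb describe a construct-then-publish pattern: a new client record is marked as under construction, and anyone else who finds it waits until construction completes. A user-space analogue is sketched below. The names `ClientRegistry`, `Client` and `ready` are hypothetical, `threading.Event` only approximates the kernel's interruptible wait, and failure handling is reduced to waking the waiters.

```python
import threading


class Client:
    def __init__(self, key) -> None:
        self.key = key
        # Cleared while the record is under construction, set on completion.
        self.ready = threading.Event()


class ClientRegistry:
    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._clients = {}

    def get_client(self, key, initialise) -> Client:
        """Return the shared client for key, constructing it at most once."""
        with self._lock:
            client = self._clients.get(key)
            creator = client is None
            if creator:
                # Publish the record immediately so later callers can find it.
                client = Client(key)
                self._clients[key] = client
        if creator:
            try:
                initialise(client)       # slow setup runs outside the lock
            finally:
                client.ready.set()       # indicate completion and wake waiters
        else:
            client.ready.wait()          # found a shareable client still being built
        return client
```

Running the slow initialisation outside the registry lock is what makes the separate "ready" state necessary: the record is visible to other callers before it is usable.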
    • NFS: Fix up split of fs/nfs/inode.c · 7d4e2747
      Committed by David Howells
      Fix-ups for the splitting of the superblock stuff out of fs/nfs/inode.c,
      including:
      
       (*) Move the callback tcpport module param into callback.c.
      
       (*) Move the idmap cache timeout module param into idmap.c.
      
       (*) Changes to internal.h:
      
           (*) namespace-nfs4.c was renamed to nfs4namespace.c.
      
           (*) nfs_stat_to_errno() is in nfs2xdr.c, not nfs4xdr.c.
      
           (*) nfs4xdr.c is contingent on CONFIG_NFS_V4.
      
           (*) nfs4_path() is only used if CONFIG_NFS_V4 is set.
      
      Plus also:
      
       (*) The sec_flavours[] table should really be const.
      Signed-Off-By: David Howells <dhowells@redhat.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>