1. 22 1月, 2015 1 次提交
    • A
      NFS: Fix use of nfs_attr_use_mounted_on_fileid() · 2ef47eb1
      Anna Schumaker 提交于
      This function call was being optimized out during nfs_fhget(), leading
      to situations where we have a valid fileid but still want to use the
      mounted_on_fileid.  For example, imagine we have our server configured
      like this:
      
      server % df
      Filesystem      Size  Used Avail Use% Mounted on
      /dev/vda1       9.1G  6.5G  1.9G  78% /
      /dev/vdb1       487M  2.3M  456M   1% /exports
      /dev/vdc1       487M  2.3M  456M   1% /exports/vol1
      /dev/vdd1       487M  2.3M  456M   1% /exports/vol2
      
      If our client mounts /exports and tries to do a "chown -R" across the
      entire mountpoint, we will get a nasty message warning us about a circular
      directory structure.  Running chown with strace tells me that each directory
      has the same device and inode number:
      
      newfstatat(AT_FDCWD, "/nfs/", {st_dev=makedev(0, 38), st_ino=2, ...}) = 0
      newfstatat(4, "vol1", {st_dev=makedev(0, 38), st_ino=2, ...}) = 0
      newfstatat(4, "vol2", {st_dev=makedev(0, 38), st_ino=2, ...}) = 0
      
      With this patch the mounted_on_fileid values are used for st_ino, so the
      directory loop warning isn't reported.
      Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
      Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
      2ef47eb1
  2. 13 9月, 2014 1 次提交
  3. 10 9月, 2014 1 次提交
  4. 05 8月, 2014 1 次提交
    • E
      NFS: Fix /proc/fs/nfsfs/servers and /proc/fs/nfsfs/volumes · 65b38851
      Eric W. Biederman 提交于
      The usage of pid_ns->child_reaper->nsproxy->net_ns in
      nfs_server_list_open and nfs_client_list_open is not safe.
      
      /proc for a pid namespace can remain mounted after the all of the
      process in that pid namespace have exited.  There are also times
      before the initial process in a pid namespace has started or after the
      initial process in a pid namespace has exited where
      pid_ns->child_reaper can be NULL or stale.  Making the idiom
      pid_ns->child_reaper->nsproxy a double whammy of problems.
      
      Luckily all that needs to happen is to move /proc/fs/nfsfs/servers and
      /proc/fs/nfsfs/volumes under /proc/net to /proc/net/nfsfs/servers and
      /proc/net/nfsfs/volumes and add a symlink from the original location,
      and to use seq_open_net as it has been designed.
      
      Cc: stable@vger.kernel.org
      Cc: Trond Myklebust <trond.myklebust@primarydata.com>
      Cc: Stanislav Kinsbursky <skinsbursky@parallels.com>
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      65b38851
  5. 16 7月, 2014 1 次提交
    • N
      sched: Allow wait_on_bit_action() functions to support a timeout · c1221321
      NeilBrown 提交于
      It is currently not possible for various wait_on_bit functions
      to implement a timeout.
      
      While the "action" function that is called to do the waiting
      could certainly use schedule_timeout(), there is no way to carry
      forward the remaining timeout after a false wake-up.
      As false-wakeups a clearly possible at least due to possible
      hash collisions in bit_waitqueue(), this is a real problem.
      
      The 'action' function is currently passed a pointer to the word
      containing the bit being waited on.  No current action functions
      use this pointer.  So changing it to something else will be a
      little noisy but will have no immediate effect.
      
      This patch changes the 'action' function to take a pointer to
      the "struct wait_bit_key", which contains a pointer to the word
      containing the bit so nothing is really lost.
      
      It also adds a 'private' field to "struct wait_bit_key", which
      is initialized to zero.
      
      An action function can now implement a timeout with something
      like
      
      static int timed_out_waiter(struct wait_bit_key *key)
      {
      	unsigned long waited;
      	if (key->private == 0) {
      		key->private = jiffies;
      		if (key->private == 0)
      			key->private -= 1;
      	}
      	waited = jiffies - key->private;
      	if (waited > 10 * HZ)
      		return -EAGAIN;
      	schedule_timeout(waited - 10 * HZ);
      	return 0;
      }
      
      If any other need for context in a waiter were found it would be
      easy to use ->private for some other purpose, or even extend
      "struct wait_bit_key".
      
      My particular need is to support timeouts in nfs_release_page()
      to avoid deadlocks with loopback mounted NFS.
      
      While wait_on_bit_timeout() would be a cleaner interface, it
      will not meet my need.  I need the timeout to be sensitive to
      the state of the connection with the server, which could change.
       So I need to use an 'action' interface.
      Signed-off-by: NNeilBrown <neilb@suse.de>
      Acked-by: NPeter Zijlstra <peterz@infradead.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Steve French <sfrench@samba.org>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/20140707051604.28027.41257.stgit@notabene.brownSigned-off-by: NIngo Molnar <mingo@kernel.org>
      c1221321
  6. 13 7月, 2014 1 次提交
    • W
      nfs: handle multiple reqs in nfs_page_async_flush · d4581383
      Weston Andros Adamson 提交于
      Change nfs_find_and_lock_request so nfs_page_async_flush can handle multiple
      requests in a page. There is only one request for a page the first time
      nfs_page_async_flush is called, but if a write or commit fails, async_flush
      is called again and there may be multiple requests associated with the page.
      The solution is to merge all the requests in a page group into a single
      request before calling nfs_pageio_add_request.
      
      Rename nfs_find_and_lock_request to nfs_lock_and_join_requests and
      change it to first lock all requests for the page, then cancel and merge
      all subrequests into the head request.
      Signed-off-by: NWeston Andros Adamson <dros@primarydata.com>
      Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
      d4581383
  7. 25 6月, 2014 3 次提交
  8. 12 6月, 2014 1 次提交
  9. 29 5月, 2014 16 次提交
  10. 07 5月, 2014 2 次提交
  11. 18 3月, 2014 1 次提交
  12. 18 2月, 2014 1 次提交
  13. 12 2月, 2014 1 次提交
  14. 10 2月, 2014 1 次提交
    • T
      NFS: Do not set NFS_INO_INVALID_LABEL unless server supports labeled NFS · fd1defc2
      Trond Myklebust 提交于
      Commit aa9c2669 (NFS: Client implementation of Labeled-NFS) introduces
      a performance regression. When nfs_zap_caches_locked is called, it sets
      the NFS_INO_INVALID_LABEL flag irrespectively of whether or not the
      NFS server supports security labels. Since that flag is never cleared,
      it means that all calls to nfs_revalidate_inode() will now trigger
      an on-the-wire GETATTR call.
      
      This patch ensures that we never set the NFS_INO_INVALID_LABEL unless the
      server advertises support for labeled NFS.
      It also causes nfs_setsecurity() to clear NFS_INO_INVALID_LABEL when it
      has successfully set the security label for the inode.
      Finally it gets rid of the NFS_INO_INVALID_LABEL cruft from nfs_update_inode,
      which has nothing to do with labeled NFS.
      Reported-by: NNeil Brown <neilb@suse.de>
      Cc: stable@vger.kernel.org # 3.11+
      Tested-by: NNeil Brown <neilb@suse.de>
      Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
      fd1defc2
  15. 20 11月, 2013 1 次提交
  16. 29 10月, 2013 3 次提交
    • W
      NFS: add support for multiple sec= mount options · 4d4b69dd
      Weston Andros Adamson 提交于
      This patch adds support for multiple security options which can be
      specified using a colon-delimited list of security flavors (the same
      syntax as nfsd's exports file).
      
      This is useful, for instance, when NFSv4.x mounts cross SECINFO
      boundaries. With this patch a user can use "sec=krb5i,krb5p"
      to mount a remote filesystem using krb5i, but can still cross
      into krb5p-only exports.
      
      New mounts will try all security options before failing.  NFSv4.x
      SECINFO results will be compared against the sec= flavors to
      find the first flavor in both lists or if no match is found will
      return -EPERM.
      Signed-off-by: NWeston Andros Adamson <dros@netapp.com>
      Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
      4d4b69dd
    • W
      NFS: separate passed security flavs from selected · a3f73c27
      Weston Andros Adamson 提交于
      When filling parsed_mount_data, store the parsed sec= mount option in
      the new struct nfs_auth_info and the chosen flavor in selected_flavor.
      
      This patch lays the groundwork for supporting multiple sec= options.
      Signed-off-by: NWeston Andros Adamson <dros@netapp.com>
      Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
      a3f73c27
    • C
      NFS: Add nfs4_update_server · 32e62b7c
      Chuck Lever 提交于
      New function nfs4_update_server() moves an nfs_server to a different
      nfs_client.  This is done as part of migration recovery.
      
      Though it may be appealing to think of them as the same thing,
      migration recovery is not the same as following a referral.
      
      For a referral, the client has not descended into the file system
      yet: it has no nfs_server, no super block, no inodes or open state.
      It is enough to simply instantiate the nfs_server and super block,
      and perform a referral mount.
      
      For a migration, however, we have all of those things already, and
      they have to be moved to a different nfs_client.  No local namespace
      changes are needed here.
      Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
      32e62b7c
  17. 11 9月, 2013 1 次提交
    • D
      fs: convert fs shrinkers to new scan/count API · 1ab6c499
      Dave Chinner 提交于
      Convert the filesystem shrinkers to use the new API, and standardise some
      of the behaviours of the shrinkers at the same time.  For example,
      nr_to_scan means the number of objects to scan, not the number of objects
      to free.
      
      I refactored the CIFS idmap shrinker a little - it really needs to be
      broken up into a shrinker per tree and keep an item count with the tree
      root so that we don't need to walk the tree every time the shrinker needs
      to count the number of objects in the tree (i.e.  all the time under
      memory pressure).
      
      [glommer@openvz.org: fixes for ext4, ubifs, nfs, cifs and glock. Fixes are needed mainly due to new code merged in the tree]
      [assorted fixes folded in]
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Signed-off-by: NGlauber Costa <glommer@openvz.org>
      Acked-by: NMel Gorman <mgorman@suse.de>
      Acked-by: NArtem Bityutskiy <artem.bityutskiy@linux.intel.com>
      Acked-by: NJan Kara <jack@suse.cz>
      Acked-by: NSteven Whitehouse <swhiteho@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
      Cc: Arve Hjønnevåg <arve@android.com>
      Cc: Carlos Maiolino <cmaiolino@redhat.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Chuck Lever <chuck.lever@oracle.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Gleb Natapov <gleb@redhat.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: J. Bruce Fields <bfields@redhat.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jerome Glisse <jglisse@redhat.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Kent Overstreet <koverstreet@google.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Cc: Thomas Hellstrom <thellstrom@vmware.com>
      Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      1ab6c499
  18. 08 9月, 2013 1 次提交
  19. 07 9月, 2013 1 次提交
    • A
      NFSv4.1 Use MDS auth flavor for data server connection · 0e20162e
      Andy Adamson 提交于
      Commit 4edaa308 "NFS: Use "krb5i" to establish NFSv4 state whenever possible"
      uses the nfs_client cl_rpcclient for all state management operations, and
      will use krb5i or auth_sys with no regard to the mount command authflavor
      choice.
      
      The MDS, as any NFSv4.1 mount point, uses the nfs_server rpc client for all
      non-state management operations with a different nfs_server for each fsid
      encountered traversing the mount point, each with a potentially different
      auth flavor.
      
      pNFS data servers are not mounted in the normal sense as there is no associated
      nfs_server structure. Data servers can also export multiple fsids, each with
      a potentially different auth flavor.
      
      Data servers need to use the same authflavor as the MDS server rpc client for
      non-state management operations. Populate a list of rpc clients with the MDS
      server rpc client auth flavor for the DS to use.
      Signed-off-by: NAndy Adamson <andros@netapp.com>
      Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
      0e20162e
  20. 04 9月, 2013 1 次提交
    • A
      NFS avoid expired credential keys for buffered writes · dc24826b
      Andy Adamson 提交于
      We must avoid buffering a WRITE that is using a credential key (e.g. a GSS
      context key) that is about to expire or has expired.  We currently will
      paint ourselves into a corner by returning success to the applciation
      for such a buffered WRITE, only to discover that we do not have permission when
      we attempt to flush the WRITE (and potentially associated COMMIT) to disk.
      
      Use the RPC layer credential key timeout and expire routines which use a
      a watermark, gss_key_expire_timeo. We test the key in nfs_file_write.
      
      If a WRITE is using a credential with a key that will expire within
      watermark seconds, flush the inode in nfs_write_end and send only
      NFS_FILE_SYNC WRITEs by adding nfs_ctx_key_to_expire to nfs_need_sync_write.
      Note that this results in single page NFS_FILE_SYNC WRITEs.
      Signed-off-by: NAndy Adamson <andros@netapp.com>
      [Trond: removed a pr_warn_ratelimited() for now]
      Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
      dc24826b