- 05 12月, 2009 10 次提交
-
-
由 Andy Adamson 提交于
If the session is reset during state recovery, the state manager thread can sleep on the slot_tbl_waitq causing a deadlock. Add a completion framework to the session. Have the state manager thread set a new session state (NFS4CLNT_SESSION_DRAINING) and wait for the session slot table to drain. Signal the state manager thread in nfs41_sequence_free_slot when the NFS4CLNT_SESSION_DRAINING bit is set and the session is drained. Reported-by: NTrond Myklebust <trond@netapp.com> Signed-off-by: NAndy Adamson <andros@netapp.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Andy Adamson 提交于
nfs4_recover_session can put rpciod to sleep. Just use nfs4_schedule_recovery. Reported-by: NTrond Myklebust <trond.myklebust@netapp.com> Signed-off-by: NAndy Adamson <andros@netapp.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Andy Adamson 提交于
Signed-off-by: NAndy Adamson <andros@netapp.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Andy Adamson 提交于
Do not fall through and set NFS4CLNT_SESSION_RESET bit on NFS4ERR_EXPIRED Signed-off-by: NAndy Adamson <andros@netapp.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Andy Adamson 提交于
Do not fall through and call nfs4_delay on session error handling. Signed-off-by: NAndy Adamson <andros@netapp.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Andy Adamson 提交于
nfs4_read_done returns zero on unhandled errors. nfs_readpage_result will return on a negative tk_status without freeing the slot. Call nfs4_sequence_free_slot on unhandled errors in nfs4_read_done. Signed-off-by: NAndy Adamson <andros@netapp.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Andy Adamson 提交于
nfs41_sequence_free_slot can be called multiple times on SEQUENCE operation errors. No reason to inline nfs4_restart_rpc Reported-by: NTrond Myklebust <trond.myklebust@netapp.com> nfs_writeback_done and nfs_readpage_retry call nfs4_restart_rpc outside the error handler, and the slot is not freed prior to restarting in the rpc_prepare state during session reset. Fix this by moving the call to nfs41_sequence_free_slot from the error path of nfs41_sequence_done into nfs4_restart_rpc, and by removing the test for NFS4CLNT_SESSION_SETUP. Always free slot and goto the rpc prepare state on async errors. Signed-off-by: NAndy Adamson <andros@netapp.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Andy Adamson 提交于
Make this clear by calling rpc_restart-call. Prepare for nfs4_restart_rpc() to free slots. Signed-off-by: NAndy Adamson <andros@netapp.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Andy Adamson 提交于
The bit is no longer used for session setup, only for session reset. Signed-off-by: NAndy Adamson <andros@netapp.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Andy Adamson 提交于
Reported-by: NTrond Myklebust <trond.myklebust@netapp.com> Resetting the clientid from the state manager could result in not confirming the clientid due to create session not being called. Move the create session call from the NFS4CLNT_SESSION_SETUP state manager initialize session case into the NFS4CLNT_LEASE_EXPIRED case establish_clid call. Signed-off-by: NAndy Adamson <andros@netapp.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
- 04 12月, 2009 29 次提交
-
-
由 Trond Myklebust 提交于
-
由 NeilBrown 提交于
NFS4ERR_FILE_OPEN is return by the server when an operation cannot be performed because the file is currently open and local (to the server) semantics prohibit the operation while the file is open. A typical case is a RENAME operation on an MS-Windows platform, which prevents rename while the file is open. While it is possible that such a condition is transitory, it is also very possible that the file will be held open for an extended period of time thus preventing the operation. The current behaviour of Linux/NFS is to retry the operation indefinitely. This is not appropriate - we do not expect a rename to take an arbitrary amount of time to complete. Rather, and error should be returned. The most obvious error code would be EBUSY, which is a legal at least for 'rename' and 'unlink', and accurately captures the reason for the error. This patch allows a few retries until about 2 seconds have elapsed, then returns EBUSY. Signed-off-by: NNeilBrown <neilb@suse.de> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Trond Myklebust 提交于
-
由 Miklos Szeredi 提交于
The d_instantiate(new_dentry, NULL) is superfluous, the dentry is already negative. Rehashing this dummy dentry isn't needed either, d_move() works fine on an unhashed target. The re-checking for busy after a failed nfs_sillyrename() is bogus too: new_dentry->d_count < 2 would be a bug here. Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Miklos Szeredi 提交于
Move unhashing the target to after the check for existence and being a non-directory. If renaming a directory then the VFS already unhashes the target if it is not busy. If it's busy then acquiring more references during the rename makes no difference. Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Miklos Szeredi 提交于
Comments are wrong or out of date. In particular d_drop() doesn't free the inode it just unhashes the dentry. And if target is a directory then it is not checked for being busy. Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Miklos Szeredi 提交于
VFS already checks if both source and target are directories. Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Trond Myklebust 提交于
Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Chuck Lever 提交于
Introduce soft connect behavior for UDP transports. In this case, a major timeout returns ETIMEDOUT instead of EIO. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Chuck Lever 提交于
Currently, if a remote RPC service is unreachable, an RPC ping will hang until the underlying transport connect attempt times out. A more desirable behavior might be to have the ping fail immediately so upper layers can recover appropriately. In the case of an NFS mount, for instance, this would mean the mount(2) system call could fail immediately if the server isn't listening, rather than hanging uninterruptibly for more than 3 minutes. Change rpc_ping() so that it fails immediately for connection-oriented transports. rpc_create() will then fail immediately for such transports if an RPC ping was requested. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Chuck Lever 提交于
Autobinding is handled by the rpciod process, not in user processes that are generating regular RPC requests. Thus autobinding is usually not affected by signals targetting user processes, such as KILL or timer expiration events. In addition, an RPC request generated by a user process that has RPC_TASK_SOFTCONN set and needs to perform an autobind will hang if the remote rpcbind service is not available. For rpcbind queries on connection-oriented transports, let's use the new soft connect semantic to return control to the user's process quickly, if the kernel's rpcbind client can't connect to the remote rpcbind service. Logic is introduced in call_bind_status() to handle connection errors that occurred during an asynchronous rpcbind query. The logic abandons the rpcbind query if the RPC request has SOFTCONN set, and retries after a few seconds in the normal case. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Chuck Lever 提交于
Use TCP with the soft connect semantic for local rpcbind upcalls so the kernel can detect immediately if the local rpcbind daemon is not running. Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
-
由 Chuck Lever 提交于
The kernel's rpcbind client creates and deletes an rpc_clnt and its underlying transport socket for every upcall to the local rpcbind daemon. When starting a typical NFS server on IPv4 and IPv6, the NFS service itself does three upcalls (one per version) times two upcalls (one per transport) times two upcalls (one per address family), making 12, plus another one for the initial call to unregister previous NFS services. Starting the NLM service adds an additional 13 upcalls, for similar reasons. (Currently the NFS service doesn't start IPv6 listeners, but it will soon enough). Instead, let's create an rpc_clnt for rpcbind upcalls during the first local rpcbind query, and cache it. This saves the overhead of creating and destroying an rpc_clnt and a socket for every upcall. The new logic also prevents the kernel from attempting an RPCB_SET or RPCB_UNSET if it knows from the start that the local portmapper does not support rpcbind protocol version 4. This will cut down on the number of rpcbind upcalls in legacy environments. Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
-
由 Chuck Lever 提交于
Clean up: At one point, rpcb_local_clnt() handled IPv6 loopback addresses too, but it doesn't any more; only IPv4 loopback is used now. Get rid of the @addr and @addrlen arguments to rpcb_local_clnt(). Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Chuck Lever 提交于
The kernel sometimes makes RPC calls to services that aren't running. Because the kernel's RPC client always assumes the hard retry semantic when reconnecting a connection-oriented RPC transport, the underlying reconnect logic takes a long while to time out, even though the remote may have responded immediately with ECONNREFUSED. In certain cases, like upcalls to our local rpcbind daemon, or for NFS mount requests, we'd like the kernel to fail immediately if the remote service isn't reachable. This allows another transport to be tried immediately, or the pending request can be abandoned quickly. Introduce a per-request flag which controls how call_transmit_status() behaves when request transmission fails because the server cannot be reached. We don't want soft connection semantics to apply to other errors. The default case of the switch statement in call_transmit_status() no longer falls through; the fall through code is copied to the default case, and a "break;" is added. The transport's connection re-establishment timeout is also ignored for such requests. We want the request to fail immediately, so the reconnect delay is skipped. Additionally, we don't want a connect failure here to further increase the reconnect timeout value, since this request will not be retried. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Chuck Lever 提交于
The success case, where task->tk_status == 0, is by far the most frequent case in call_transmit_status(). The default: arm of the switch statement in call_transmit_status() handles the 0 case. default: was moved close to the top of the switch statement in call_transmit_status() under the theory that the compiler places object code for the earliest arms of a switch statement first, making the CPU do less work. The default: arm of a switch statement, however, is executed only after all the other cases have been checked. Even if the compiler rearranges the object code, the default: arm is the "last resort", meaning all of the other cases have been explicitly exhausted. That makes the current arrangement about as inefficient as it gets for the common case. To fix this, add an explicit check for zero before the switch statement. That forces the compiler to do the zero check first, no matter what optimizations it might try to do to the switch statement. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Chuck Lever 提交于
When the "rsize=" or "wsize=" mount options are not specified, text-based mounts have slightly different behavior than legacy binary mounts. Text-based mounts use the smaller of the server's maximum and the client's maximum, but binary mounts use the smaller of the server's _preferred_ size and the client's maximum. This difference is actually pretty subtle. Most servers advertise the same value as their maximum and their preferred transfer size, so the end result is the same in most cases. The reason for this difference is that for text-based mounts, if r/wsize are not specified, they are set to the largest value supported by the client. For legacy mounts, the values are set to zero if these options are not specified. nfs_server_set_fsinfo() can negotiate the transfer size defaults correctly in any case. There's no need to specify any particular value as default in the text-based option parsing logic. Note that nfs4 doesn't use nfs_server_set_fsinfo(), but the mount.nfs4 command does set rsize and wsize to 0 if the user didn't specify these options. So, make the same change for text-based NFSv4 mounts. Thanks to James Pearson <james-p@moving-picture.com> for reporting and diagnosing the problem. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Chuck Lever 提交于
Recent changes to snprintf() introduced the %pI6c formatter, which can display an IPv6 address with standard shorthanding. Use this new formatter when displaying IPv6 server addresses in /proc/mounts. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Chuck Lever 提交于
Recent changes to snprintf() introduced the %pI6c formatter, which can display an IPv6 address with standard shorthanding. Using a shorthanded address can save us a few bytes of memory for each stored presentation address, or a few bytes on the wire when sending these in a universal address. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Richard Kennedy 提交于
reorder nfs4_sequence_args to remove 8 bytes of padding on 64 bit builds. The size of this structure drops to 24 bytes from 32 and reduces the text size of nfs.ko. On my x86_64 size reports text data bss 2.6.32-rc5 200996 8512 432 209940 33414 nfs.ko +patch 200884 8512 432 209828 333a4 nfs.ko Signed-off-by: NRichard Kennedy <richard@rsk.demon.co.uk> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Jeff Layton 提交于
Solaris uses netids as values for the proto= option, so that when someone specifies "tcp6" they get traffic over TCP + IPv6. Until recently, this has never really been an issue for Linux since it didn't support NFS over IPv6. The netid and the protocol name were generally always the same (modulo any strange configuration in /etc/netconfig). The solaris manpage documents their proto= option as: proto= _netid_ | rdma This patch is intended to bring Linux closer to how the Solaris proto= option works, by declaring a static netid mapping in the kernel and converting the proto= and mountproto= options to follow it and display the proper values in /proc/mounts. Much of this functionality will need to be provided by a userspace mount.nfs patch. Chuck Lever has a patch to change mount.nfs in the same way. In principle, we could do *all* of this in userspace but that would mean that the options in /proc/mounts may not match the options used by userspace. The alternative to the static mapping here is to add a mechanism to upcall to userspace for netid's. I'm not opposed to that option, but it'll probably mean more overhead (and quite a bit more code). Rather than shoot for that at first, I figured it was probably better to start simply. Comments welcome. Signed-off-by: NJeff Layton <jlayton@redhat.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 J. Bruce Fields 提交于
Signed-off-by: NJ. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Trond Myklebust 提交于
Fix another 'sparse' warning in fs/nfs/nfs4proc.c Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Trond Myklebust 提交于
Fix two 'sparse' warnings in fs/nfs/dns_resolve.c Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Trond Myklebust 提交于
The nfs4_state_manager should not be looking at the error values when deciding whether or not to loop round in order to handle a higher priority state recovery task. It should rather be looking at the clp->cl_state. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Trond Myklebust 提交于
If our lease expires, and the server reboots while we're recovering, we need to be able to wait until the grace period is over. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Trond Myklebust 提交于
Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Trond Myklebust 提交于
nfs4_recovery_handle_error() will correctly handle errors such as NFS4ERR_CB_PATH_DOWN, however because they are still passed back to the main loop in nfs4_state_manager(), they can cause the latter to exit prematurely. Fix this by letting nfs4_recovery_handle_error() change the error value in cases where there is no action required by the caller. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Trond Myklebust 提交于
In practice, we need to ensure that we call nfs4_state_end_reclaim_reboot in 2 cases: - If we lose the lease while we were reclaiming state OR - After we're done with reboot recovery Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
- 03 12月, 2009 1 次提交
-
-
由 Trond Myklebust 提交于
The nfsv4 state manager could potentially deadlock inside __nfs_inode_return_delegation() if the server reboots, so that the calls to nfs_msync_inode() end up waiting on state recovery to complete. Also ensure that if a server reboot or network partition causes us to have to stop returning delegations, that NFS4CLNT_DELEGRETURN is set so that the state manager can resume any outstanding delegation returns after it has dealt with the state recovery situation. Finally, ensure that the state manager doesn't wait for the DELEGRETURN call to complete. It doesn't need to, and that too can cause a deadlock. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-