- 01 2月, 2017 1 次提交
-
-
由 Kinglong Mee 提交于
nfsd assigns the nfs4_free_lock_stateid to .sc_free in init_lock_stateid(). If nfsd doesn't go through init_lock_stateid() and put stateid at end, there is a NULL reference to .sc_free when calling nfs4_put_stid(ns). This patch let the nfs4_stid.sc_free assignment to nfs4_alloc_stid(). Cc: stable@vger.kernel.org Fixes: 356a95ec "nfsd: clean up races in lock stateid searching..." Signed-off-by: NKinglong Mee <kinglongmee@gmail.com> Reviewed-by: NJeff Layton <jlayton@redhat.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 27 9月, 2016 3 次提交
-
-
由 Jeff Layton 提交于
It's possible for a client to call in on a lock that is blocked for a long time, but discontinue polling for it. A malicious client could even set a lock on a file, and then spam the server with failing lock requests from different lockowners that pile up in a DoS attack. Add the blocked lock structures to a per-net namespace LRU when hashing them, and timestamp them. If the lock request is not revisited after a lease period, we'll drop it under the assumption that the client is no longer interested. This also gives us a mechanism to clean up these objects at server shutdown time as well. Signed-off-by: NJeff Layton <jlayton@redhat.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Jeff Layton 提交于
Create a new per-lockowner+per-inode structure that contains a file_lock. Have nfsd4_lock add this structure to the lockowner's list prior to setting the lock. Then call the vfs and request a blocking lock (by setting FL_SLEEP). If we get anything besides FILE_LOCK_DEFERRED back, then we dequeue the block structure and free it. When the next lock request comes in, we'll look for an existing block for the same filehandle and dequeue and reuse it if there is one. When the lock comes free (a'la an lm_notify call), we dequeue it from the lockowner's list and kick off a CB_NOTIFY_LOCK callback to inform the client that it should retry the lock request. Signed-off-by: NJeff Layton <jlayton@redhat.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Jeff Layton 提交于
Add the encoding/decoding for CB_NOTIFY_LOCK operations. Signed-off-by: NJeff Layton <jlayton@redhat.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 17 9月, 2016 1 次提交
-
-
由 Jeff Layton 提交于
We already have that info in the client pointer. No need to pass around a copy. Signed-off-by: NJeff Layton <jlayton@redhat.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 14 7月, 2016 1 次提交
-
-
由 Andrew Elble 提交于
This addresses the conundrum referenced in RFC5661 18.35.3, and will allow clients to return state to the server using the machine credentials. The biggest part of the problem is that we need to allow the client to send a compound op with integrity/privacy on mounts that don't have it enabled. Add server support for properly decoding and using spo_must_enforce and spo_must_allow bits. Add support for machine credentials to be used for CLOSE, OPEN_DOWNGRADE, LOCKU, DELEGRETURN, and TEST/FREE STATEID. Implement a check so as to not throw WRONGSEC errors when these operations are used if integrity/privacy isn't turned on. Without this, Linux clients with credentials that expired while holding delegations were getting stuck in an endless loop. Signed-off-by: NAndrew Elble <aweits@rit.edu> Reviewed-by: NJeff Layton <jlayton@redhat.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 16 6月, 2016 1 次提交
-
-
由 Oleg Drokin 提交于
It used to be the case that state had an rwlock that was locked for write by downgrades, but for read for upgrades (opens). Well, the problem is if there are two competing opens for the same state, they step on each other toes potentially leading to leaking file descriptors from the state structure, since access mode is a bitmap only set once. Signed-off-by: NOleg Drokin <green@linuxhacker.ru> Cc: stable@vger.kernel.org Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 14 5月, 2016 1 次提交
-
-
由 Jeff Layton 提交于
Move the existing static function to an inline helper, and call it. Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJeff Layton <jeff.layton@primarydata.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 08 12月, 2015 1 次提交
-
-
由 Anna Schumaker 提交于
This will be needed so COPY can look up the saved_fh in addition to the current_fh. Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: NChristoph Hellwig <hch@lst.de> Acked-by: NJ. Bruce Fields <bfields@fieldses.org> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 24 11月, 2015 1 次提交
-
-
由 Julia Lawall 提交于
The nfsd4_callback_ops structure is never modified, so declare it as const. Done with the help of Coccinelle. Signed-off-by: NJulia Lawall <Julia.Lawall@lip6.fr> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 24 10月, 2015 2 次提交
-
-
由 Jeff Layton 提交于
Bruce points out that the increment of the seqid in stateids is not serialized in any way, so it's possible for racing calls to bump it twice and end up sending the same stateid. While we don't have any reports of this problem it _is_ theoretically possible, and could lead to spurious state recovery by the client. In the current code, update_stateid is always followed by a memcpy of that stateid, so we can combine the two operations. For better atomicity, we add a spinlock to the nfs4_stid and hold that when bumping the seqid and copying the stateid. Signed-off-by: NJeff Layton <jeff.layton@primarydata.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Jeff Layton 提交于
In order to allow the client to make a sane determination of what happened with racing LAYOUTGET/LAYOUTRETURN/CB_LAYOUTRECALL calls, we must ensure that the seqids return accurately represent the order of operations. The simplest way to do that is to ensure that operations on a single stateid are serialized. This patch adds a mutex to the layout stateid, and locks it when checking the layout stateid's seqid. The mutex is held over the entire operation and released after the seqid is bumped. Note that in the case of CB_LAYOUTRECALL we must move the increment of the seqid and setting into a new cb "prepare" operation. The lease infrastructure will call the lm_break callback with a spinlock held, so and we can't take the mutex in that codepath. Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: NJeff Layton <jeff.layton@primarydata.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 13 10月, 2015 1 次提交
-
-
由 Jeff Layton 提交于
Andrew was seeing a race occur when an OPEN and OPEN_DOWNGRADE were running in parallel. The server would receive the OPEN_DOWNGRADE first and check its seqid, but then an OPEN would race in and bump it. The OPEN_DOWNGRADE would then complete and bump the seqid again. The result was that the OPEN_DOWNGRADE would be applied after the OPEN, even though it should have been rejected since the seqid changed. The only recourse we have here I think is to serialize operations that bump the seqid in a stateid, particularly when we're given a seqid in the call. To address this, we add a new rw_semaphore to the nfs4_ol_stateid struct. We do a down_write prior to checking the seqid after looking up the stateid to ensure that nothing else is going to bump it while we're operating on it. In the case of OPEN, we do a down_read, as the call doesn't contain a seqid. Those can run in parallel -- we just need to serialize them when there is a concurrent OPEN_DOWNGRADE or CLOSE. LOCK and LOCKU however always take the write lock as there is no opportunity for parallelizing those. Reported-and-Tested-by: NAndrew W Elble <aweits@rit.edu> Signed-off-by: NJeff Layton <jeff.layton@primarydata.com> Cc: stable@vger.kernel.org Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 13 8月, 2015 1 次提交
-
-
由 Kinglong Mee 提交于
According to Christoph's advice, this patch introduce a new helper nfsd4_cb_sequence_done() for processing more callback errors, following the example of the client's nfs41_sequence_done(). Signed-off-by: NKinglong Mee <kinglongmee@gmail.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 23 6月, 2015 1 次提交
-
-
由 Christoph Hellwig 提交于
This patch changes nfs4_preprocess_stateid_op so it always returns a valid struct file if it has been asked for that. For that we now allocate a temporary struct file for special stateids, and check permissions if we got the file structure from the stateid. This ensures that all callers will get their handling of special stateids right, and avoids code duplication. There is a little wart in here because the read code needs to know if we allocated a file structure so that it can copy around the read-ahead parameters. In the long run we should probably aim to cache full file structures used with special stateids instead. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 05 6月, 2015 1 次提交
-
-
由 Kinglong Mee 提交于
When testing pnfs layout, nfsd got error NFS4ERR_SEQ_MISORDERED. It is caused by nfs return NFS4ERR_DELAY before validate_seqid(), don't update the sequnce id, but nfsd updates the sequnce id !!! According to RFC5661 20.9.3, " If CB_SEQUENCE returns an error, then the state of the slot (sequence ID, cached reply) MUST NOT change. " Signed-off-by: NKinglong Mee <kinglongmee@gmail.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 05 5月, 2015 3 次提交
-
-
由 Christoph Hellwig 提交于
Checking the rpc_client pointer is not a reliable way to detect backchannel changes: cl_cb_client is changed only after shutting down the rpc client, so the condition cl_cb_client = tk_client will always be true. Check the RPC_TASK_KILLED flag instead, and rewrite the code to avoid the buggy cl_callbacks list and fix the lifetime rules due to double calls of the ->prepare callback operations method for this retry case. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Christoph Hellwig 提交于
We must only increment the sequence id if the client has seen and responded to a request. If we failed to deliver it to the client we must resend with the same sequence id. So just like the client track errors at the transport level differently from those returned in the XDR. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Sachin Bhamare 提交于
For the sake of forgetful clients, the server should return the layouts to the file system on 'last close' of a file (assuming that there are no delegations outstanding to that particular client) or on delegreturn (assuming that there are no opens on a file from that particular client). In theory the information is all there in current data structures, but it's not efficiently available; nfs4_file->fi_ref includes references on the file across all clients, but we need a per-(client, file) count. Walking through lots of stateid's to calculate this on each close or delegreturn would be painful. This patch introduces infrastructure to maintain per-client opens and delegation counters on a per-file basis. [hch: ported to the mainline pNFS support, merged various fixes from Jeff] Signed-off-by: NSachin Bhamare <sachin.bhamare@primarydata.com> Signed-off-by: NJeff Layton <jlayton@primarydata.com> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 03 2月, 2015 5 次提交
-
-
由 Christoph Hellwig 提交于
Add support to issue layout recalls to clients. For now we only support full-file recalls to get a simple and stable implementation. This allows to embedd a nfsd4_callback structure in the layout_state and thus avoid any memory allocations under spinlocks during a recall. For normal use cases that do not intent to share a single file between multiple clients this implementation is fully sufficient. To ensure layouts are recalled on local filesystem access each layout state registers a new FL_LAYOUT lease with the kernel file locking code, which filesystems that support pNFS exports that require recalls need to break on conflicting access patterns. The XDR code is based on the old pNFS server implementation by Andy Adamson, Benny Halevy, Boaz Harrosh, Dean Hildebrand, Fred Isaman, Marc Eshel, Mike Sager and Ricardo Labiaga. Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Christoph Hellwig 提交于
Add support for the GETDEVICEINFO, LAYOUTGET, LAYOUTCOMMIT and LAYOUTRETURN NFSv4.1 operations, as well as backing code to manage outstanding layouts and devices. Layout management is very straight forward, with a nfs4_layout_stateid structure that extends nfs4_stid to manage layout stateids as the top-level structure. It is linked into the nfs4_file and nfs4_client structures like the other stateids, and contains a linked list of layouts that hang of the stateid. The actual layout operations are implemented in layout drivers that are not part of this commit, but will be added later. The worst part of this commit is the management of the pNFS device IDs, which suffers from a specification that is not sanely implementable due to the fact that the device-IDs are global and not bound to an export, and have a small enough size so that we can't store the fsid portion of a file handle, and must never be reused. As we still do need perform all export authentication and validation checks on a device ID passed to GETDEVICEINFO we are caught between a rock and a hard place. To work around this issue we add a new hash that maps from a 64-bit integer to a fsid so that we can look up the export to authenticate against it, a 32-bit integer as a generation that we can bump when changing the device, and a currently unused 32-bit integer that could be used in the future to handle more than a single device per export. Entries in this hash table are never deleted as we can't reuse the ids anyway, and would have a severe lifetime problem anyway as Linux export structures are temporary structures that can go away under load. Parts of the XDR data, structures and marshaling/unmarshaling code, as well as many concepts are derived from the old pNFS server implementation from Andy Adamson, Benny Halevy, Dean Hildebrand, Marc Eshel, Fred Isaman, Mike Sager, Ricardo Labiaga and many others. Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Christoph Hellwig 提交于
Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Christoph Hellwig 提交于
Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Christoph Hellwig 提交于
Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 08 1月, 2015 1 次提交
-
-
由 Jeff Layton 提交于
fi_delegees is always handled under the fi_lock, so there's no need to use an atomic_t for this field. Signed-off-by: NJeff Layton <jlayton@primarydata.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 08 11月, 2014 1 次提交
-
-
由 Jeff Layton 提交于
The global state_lock protects the file_hashtbl, and that has the potential to be a scalability bottleneck. Address this by making the file_hashtbl use RCU. Add a rcu_head to the nfs4_file and use that when freeing ones that have been hashed. In order to conserve space, we union the fi_rcu field with the fi_delegations list_head which must be clear by the time the last reference to the file is dropped. Convert find_file_locked to use RCU lookup primitives and not to require that the state_lock be held, and convert find_file to do a lockless lookup. Convert find_or_add_file to attempt a lockless lookup first, and then fall back to doing a locked search and insert if that fails to find anything. Also, minimize the number of times we need to calculate the hash value by passing it in as an argument to the search and insert functions, and optimize the order of arguments in nfsd4_init_file. Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJeff Layton <jlayton@primarydata.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 24 10月, 2014 1 次提交
-
-
由 Jeff Layton 提交于
They're a bit outdated wrt to some recent changes. Signed-off-by: NJeff Layton <jlayton@primarydata.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 08 10月, 2014 1 次提交
-
-
由 Jeff Layton 提交于
Now that we don't need to pass in an actual lease pointer to vfs_setlease on unlock, we can stop tracking a pointer to the lease in the nfs4_file. Switch all of the places that check the fi_lease to check fi_deleg_file instead. We always set that at the same time so it will have the same semantics. Cc: J. Bruce Fields <bfields@fieldses.org> Signed-off-by: NJeff Layton <jlayton@primarydata.com> Reviewed-by: NChristoph Hellwig <hch@lst.de>
-
- 02 10月, 2014 1 次提交
-
-
由 Jeff Layton 提交于
We now have cb_to_delegation and to_delegation, which do the same thing and are defined separately in different .c files. Move the cb_to_delegation definition into a header file and eliminate the redundant to_delegation definition. Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJeff Layton <jlayton@primarydata.com>
-
- 27 9月, 2014 4 次提交
-
-
由 Christoph Hellwig 提交于
Add a higher level abstraction than the rpc_ops for callback operations. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJeff Layton <jlayton@primarydata.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Christoph Hellwig 提交于
Split out initializing the nfs4_callback structure from using it. For the NULL callback this gets rid of tons of pointless re-initializations. Note that I don't quite understand what protects us from running multiple NULL callbacks at the same time, but at least this chance doesn't make it worse.. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJeff Layton <jlayton@primarydata.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Christoph Hellwig 提交于
Add a helper to queue up a callback. CB_NULL has a bit of special casing because it is special in the specification, but all other new callback operations will be able to share code with this and a few more changes to refactor the callback code. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJeff Layton <jlayton@primarydata.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Christoph Hellwig 提交于
We can always get at the private data by using container_of, no need for a void pointer. Also introduce a little to_delegation helper to avoid opencoding the container_of everywhere. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJeff Layton <jlayton@primarydata.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 18 9月, 2014 3 次提交
-
-
由 Jeff Layton 提交于
In a later patch, we want to add a flag that will allow us to reduce the need for upcalls. In order to handle that correctly, we'll need to ensure that racing upcalls for the same client can't occur. In practice it should be rare for this to occur with a well-behaved client, but it is possible. Convert one of the bits in the cl_flags field to be an upcall bitlock, and use it to ensure that upcalls for the same client are serialized. Signed-off-by: NJeff Layton <jlayton@primarydata.com>
-
由 Jeff Layton 提交于
Allow a privileged userland process to end the v4 grace period early. Writing "Y", "y", or "1" to the file will cause the v4 grace period to be lifted. The basic idea with this will be to allow the userland client tracking program to lift the grace period once it knows that no more clients will be reclaiming state. Signed-off-by: NJeff Layton <jlayton@primarydata.com>
-
由 Jeff Layton 提交于
Since it's stored in nfsd_net, we don't need to pass it in separately. Signed-off-by: NJeff Layton <jlayton@primarydata.com>
-
- 06 8月, 2014 2 次提交
-
-
由 Jeff Layton 提交于
Add some comments that describe what each of these objects is, and how they related to one another. Signed-off-by: NJeff Layton <jlayton@primarydata.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Jeff Layton 提交于
Signed-off-by: NJeff Layton <jlayton@primarydata.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 05 8月, 2014 2 次提交
-
-
由 Jeff Layton 提交于
Remove the old nfsd_for_n_state function and move nfsd_find_client higher up into the file to get rid of forward declaration. Remove the struct nfsd_fault_inject_op arguments from the operations as they are no longer needed by any of them. Finally, remove the old "standard" get and set routines, which also eliminates the client_mutex from this code. Signed-off-by: NJeff Layton <jlayton@primarydata.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Jeff Layton 提交于
...instead of relying on the client_mutex. Signed-off-by: NJeff Layton <jlayton@primarydata.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-