- 22 8月, 2018 1 次提交
-
-
由 Trond Myklebust 提交于
If we knew that the file was empty, we wouldn't be asking for a layout. Any optimisation here is already done before calling pnfs_update_layout(). As it stands, we sometimes end up doing an unnecessary inband read to the MDS even when holding a layout. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
- 17 8月, 2018 2 次提交
-
-
由 Trond Myklebust 提交于
Yes, it is possible to get trapped in a loop, but the server should be administratively revoking the recalled layout if it never gets returned. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Trond Myklebust 提交于
When we update the layout stateid in nfs4_layoutreturn_refresh_stateid, we should also update the range in order to let the server know we're actually returning everything. Fixes: 16c278dbfa63 ("pnfs: Fix handling of NFS4ERR_OLD_STATEID replies...") Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
- 09 8月, 2018 3 次提交
-
-
由 Gustavo A. R. Silva 提交于
Return statements in functions returning bool should use true or false instead of an integer value. This issue was detected with the help of Coccinelle. Signed-off-by: NGustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Trond Myklebust 提交于
Layout segment validity is determined only by the NFS_LSEG_VALID flag. If it is set, the layout segment is finable. As it is, when the flexfiles driver sets NFS_LSEG_LAYOUTRETURN to indicate that we cannot discard the layout segment, but that it must be returned, then this can result in an unnecessary layoutget storm. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Trond Myklebust 提交于
If the server tells us that out layoutreturn raced with another layout update, then we must ensure that the new layout segments are not in use before we resend with an updated layout stateid. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
- 27 7月, 2018 4 次提交
-
-
由 Trond Myklebust 提交于
Even if the results of the permissions checks failed, we should parse the results of the layout on open call so that we can return the layout if required. Note that we also want to ignore the sequence counter for whether or not a layout recall occurred. If the recall pertained to our OPEN, then the callback will know, and will attempt to wait for us to finih processing anyway. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Trond Myklebust 提交于
If the old layout was recalled, and we returned NFS4ERR_NOMATCHINGLAYOUT then we need to wait for all outstanding layoutget calls to complete before we can send a new one. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Trond Myklebust 提交于
If a layout has been recalled, then we should fire off a layoutreturn as soon as all the layout segments that match the recall have been retired. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Trond Myklebust 提交于
If there are layout segments that are marked for return, then we need to ensure that pnfs_mark_matching_lsegs_return() does not just silently discard them, but it should tell the caller that there is a layoutreturn scheduled. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
- 12 6月, 2018 1 次提交
-
-
由 Olga Kornievskaia 提交于
Currently, when IO to DS fails, client returns the layout and retries against the MDS. However, then on umounting (inode eviction) it returns the layout again. This is because pnfs_return_layout() was changed in commit d78471d3 ("pnfs/blocklayout: set PNFS_LAYOUTRETURN_ON_ERROR") to always set NFS_LAYOUT_RETURN_REQUESTED so even if we returned the layout, it will be returned again. Instead, let's also check if we have already marked the layout invalid. Signed-off-by: NOlga Kornievskaia <kolga@netapp.com> Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
- 01 6月, 2018 14 次提交
-
-
由 Trond Myklebust 提交于
If the layoutget on open call failed, we can't really commit the inode, so don't bother calling it. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Trond Myklebust 提交于
If we're only opening the file for reading, and the file is empty and/or we already have cached data, then heuristically optimise away the LAYOUTGET. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Trond Myklebust 提交于
Ensure that we only switch off the LAYOUTGET operation in the OPEN compound when the server is truly broken, and/or it is complaining that the compound is too large. Currently, we end up turning off the functionality permanently, even for transient errors such as EACCES or ENOSPC. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Trond Myklebust 提交于
We need to ensure that pnfs_parse_lgopen() doesn't try to parse a struct nfs4_layoutget_res that was not filled by a successful call to decode_layoutget(). This can happen if we performed a cached open, or if either the OP_ACCESS or OP_GETATTR operations preceding the OP_LAYOUTGET in the compound returned an error. By initialising the 'status' field to NFS4ERR_DELAY, we ensure that pnfs_parse_lgopen() won't try to interpret the structure. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Fred Isaman 提交于
The flag was not always being cleared after LAYOUTGET on OPEN. Signed-off-by: NFred Isaman <fred.isaman@gmail.com> Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Fred Isaman 提交于
Since the LAYOUTGET on OPEN can be sent without prior inode information, existing methods to prevent LAYOUTGET from being sent while processing CB_LAYOUTRECALL don't work. Track if a recall occurred while LAYOUTGET was being sent, and if so ignore the results. Signed-off-by: NFred Isaman <fred.isaman@gmail.com> Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Fred Isaman 提交于
Signed-off-by: NFred Isaman <fred.isaman@gmail.com> Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Fred Isaman 提交于
Signed-off-by: NFred Isaman <fred.isaman@gmail.com> Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Trond Myklebust 提交于
Move the actual freeing of the struct nfs4_layoutget into fs/nfs/pnfs.c where it can be reused by the layoutget on open code. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Fred Isaman 提交于
This triggers when have no pre-existing inode to attach to. The preexisting case is saved for later. Signed-off-by: NFred Isaman <fred.isaman@gmail.com> Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Fred Isaman 提交于
Don't send in a layout, instead use the (possibly NULL) inode. This is needed for LAYOUTGET attached to an OPEN where the inode is not yet set. Signed-off-by: NFred Isaman <fred.isaman@gmail.com> Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Fred Isaman 提交于
It will be needed now by the pnfs code. Signed-off-by: NFred Isaman <fred.isaman@gmail.com> Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Fred Isaman 提交于
They work better in the new alloc_init function. Signed-off-by: NFred Isaman <fred.isaman@gmail.com> Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Fred Isaman 提交于
Pull out the alloc/init part for eventual reuse by OPEN. Signed-off-by: NFred Isaman <fred.isaman@gmail.com> Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
- 09 3月, 2018 1 次提交
-
-
由 Trond Myklebust 提交于
Ensure that we hold a reference to the layout header when processing the pNFS return-on-close so that the refcount value does not inadvertently go to zero. Reported-by: NTigran Mkrtchyan <tigran.mkrtchyan@desy.de> Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com> Cc: stable@vger.kernel.org # v4.10+ Tested-by: NTigran Mkrtchyan <tigran.mkrtchyan@desy.de>
-
- 15 1月, 2018 2 次提交
-
-
由 Scott Mayhew 提交于
Currently when falling back to doing I/O through the MDS (via pnfs_{read|write}_through_mds), the client frees the nfs_pgio_header without releasing the reference taken on the dreq via pnfs_generic_pg_{read|write}pages -> nfs_pgheader_init -> nfs_direct_pgio_init. It then takes another reference on the dreq via nfs_generic_pg_pgios -> nfs_pgheader_init -> nfs_direct_pgio_init and as a result the requester will become stuck in inode_dio_wait. Once that happens, other processes accessing the inode will become stuck as well. Ensure that pnfs_read_through_mds() and pnfs_write_through_mds() clean up correctly by calling hdr->completion_ops->completion() instead of calling hdr->release() directly. This can be reproduced (sometimes) by performing "storage failover takeover" commands on NetApp filer while doing direct I/O from a client. This can also be reproduced using SystemTap to simulate a failure while doing direct I/O from a client (from Dave Wysochanski <dwysocha@redhat.com>): stap -v -g -e 'probe module("nfs_layout_nfsv41_files").function("nfs4_fl_prepare_ds").return { $return=NULL; exit(); }' Suggested-by: NTrond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: NScott Mayhew <smayhew@redhat.com> Fixes: 1ca018d2 ("pNFS: Fix a memory leak when attempted pnfs fails") Cc: stable@vger.kernel.org Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
由 Benjamin Coddington 提交于
PNFS block/SCSI layouts should gracefully handle cases where block devices are not available when a layout is retrieved, or the block devices are removed while the client holds a layout. While setting up a layout segment, keep a record of an unavailable or un-parsable block device in cache with a flag so that subsequent layouts do not spam the server with GETDEVINFO. We can reuse the current NFS_DEVICEID_UNAVAILABLE handling with one variation: instead of reusing the device, we will discard it and send a fresh GETDEVINFO after the timeout, since the lookup and validation of the device occurs within the GETDEVINFO response handling. A lookup of a layout segment that references an unavailable device will return a segment with the NFS_LSEG_UNAVAILABLE flag set. This will allow the pgio layer to mark the layout with the appropriate fail bit, which forces subsequent IO to the MDS, and prevents spamming the server with LAYOUTGET, LAYOUTRETURN. Finally, when IO to a block device fails, look up the block device(s) referenced by the pgio header, and mark them as unavailable. Signed-off-by: NBenjamin Coddington <bcodding@redhat.com> Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
- 18 11月, 2017 4 次提交
-
-
由 Trond Myklebust 提交于
If our layoutreturn on close operation returns an NFS4ERR_OLD_STATEID, then try to update the stateid and retry. We know that there should be no further LAYOUTGET requests being launched. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Thomas Meyer 提交于
Bool initializations should use true and false. Bool tests don't need comparisons. Signed-off-by: NThomas Meyer <thomas@m3y3r.de> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Elena Reshetova 提交于
atomic_t variables are currently used to implement reference counters with the following properties: - counter is initialized to 1 using atomic_set() - a resource is freed upon counter reaching zero - once counter reaches zero, its further increments aren't allowed - counter schema uses basic atomic operations (set, inc, inc_not_zero, dec_and_test, etc.) Such atomic variables should be converted to a newly provided refcount_t type and API that prevents accidental counter overflows and underflows. This is important since overflows and underflows can lead to use-after-free situation and be exploitable. The variable pnfs_layout_hdr.plh_refcount is used as pure reference counter. Convert it to refcount_t and fix up the operations. Suggested-by: NKees Cook <keescook@chromium.org> Reviewed-by: NDavid Windsor <dwindsor@gmail.com> Reviewed-by: NHans Liljestrand <ishkamiel@gmail.com> Signed-off-by: NElena Reshetova <elena.reshetova@intel.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Elena Reshetova 提交于
refcount_t type and corresponding API should be used instead of atomic_t when the variable is used as a reference counter. This allows to avoid accidental refcounter overflows that might lead to use-after-free situations. Signed-off-by: NElena Reshetova <elena.reshetova@intel.com> Signed-off-by: NHans Liljestrand <ishkamiel@gmail.com> Signed-off-by: NKees Cook <keescook@chromium.org> Signed-off-by: NDavid Windsor <dwindsor@gmail.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
- 12 9月, 2017 1 次提交
-
-
由 Trond Myklebust 提交于
Instead of having a private method for copying the open/delegation stateid, use the same call that is used for standard I/O through the MDS. Note that this means we transmit the stateid with a zero seqid, avoiding issues with NFS4ERR_OLD_STATEID. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
- 09 9月, 2017 1 次提交
-
-
由 Trond Myklebust 提交于
The writeback code wants to send a commit after processing the pages, which is why we want to delay releasing the struct path until after that's done. Also, the layout code expects that we do not free the inode before we've put the layout segments in pnfs_writehdr_free() and pnfs_readhdr_free() Fixes: 919e3bd9 ("NFS: Ensure we commit after writeback is complete") Fixes: 4714fb51 ("nfs: remove pgio_header refcount, related cleanup") Cc: stable@vger.kernel.org Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
- 15 8月, 2017 1 次提交
-
-
由 Trond Myklebust 提交于
Now that we no longer hold the inode->i_lock when manipulating the commit lists, it is safe to call pnfs_put_lseg() again. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
- 24 5月, 2017 1 次提交
-
-
由 Benjamin Coddington 提交于
It's possible and acceptable for NFS to attempt to add requests beyond the range of the current pgio->pg_lseg, a case which should be caught and limited by the pg_test operation. However, the current handling of this case replaces pgio->pg_lseg with a new layout segment (after a WARN) within that pg_test operation. That will cause all the previously added requests to be submitted with this new layout segment, which may not be valid for those requests. Fix this problem by only returning zero for the number of bytes to coalesce from pg_test for this case which allows any previously added requests to complete on the current layout segment. The check for requests starting out of range of the layout segment moves to pg_init, so that the replacement of pgio->pg_lseg will be done when the next request is added. Signed-off-by: NBenjamin Coddington <bcodding@redhat.com> Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
- 03 5月, 2017 2 次提交
-
-
由 Trond Myklebust 提交于
Consider the following deadlock: Process P1 Process P2 Process P3 ========== ========== ========== lock_page(page) lseg = pnfs_update_layout(inode) lo = NFS_I(inode)->layout pnfs_error_mark_layout_for_return(lo) lock_page(page) lseg = pnfs_update_layout(inode) In this scenario, - P1 has declared the layout to be in error, but P2 holds a reference to a layout segment on that inode, so the layoutreturn is deferred. - P2 is waiting for a page lock held by P3. - P3 is asking for a new layout segment, but is blocked waiting for the layoutreturn. The fix is to ensure that pnfs_error_mark_layout_for_return() does not set the NFS_LAYOUT_RETURN flag, which blocks P3. Instead, we allow the latter to call LAYOUTGET so that it can make progress and unblock P2. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
由 Trond Myklebust 提交于
In pnfs_clear_layoutreturn_info, ensure that we don't clear the layout return info if there are new segments queued for return due to, for instance, a race between a LAYOUTRETURN and a failed I/O attempt. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
- 29 4月, 2017 2 次提交
-
-
由 Trond Myklebust 提交于
If the layout is being invalidated on the server, then we must invoke nfs_commit_inode() to ensure any commits to the DS get cleared out. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
由 Trond Myklebust 提交于
If the attempt to write through pNFS fails, we need to use the same failure semantics as for the read path: If the FF_FLAGS_NO_IO_THRU_MDS flag is set or we have sufficient valid DSes, then we must retry through pNFS Fixes: d67ae825 ("pnfs/flexfiles: Add the FlexFile Layout Driver") Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-