- 29 5月, 2014 8 次提交
-
-
由 Weston Andros Adamson 提交于
Add "page groups" - a circular list of nfs requests (struct nfs_page) that all reference the same page. This gives nfs read and write paths the ability to account for sub-page regions independently. This somewhat follows the design of struct buffer_head's sub-page accounting. Only "head" requests are ever added/removed from the inode list in the buffered write path. "head" and "sub" requests are treated the same through the read path and the rest of the write/commit path. Requests are given an extra reference across the life of the list. Page groups are never rejoined after being split. If the read/write request fails and the client falls back to another path (ie revert to MDS in PNFS case), the already split requests are pushed through the recoalescing code again, which may split them further and then coalesce them into properly sized requests on the wire. Fragmentation shouldn't be a problem with the current design, because we flush all requests in page group when a non-contiguous request is added, so the only time resplitting should occur is on a resend of a read or write. This patch lays the groundwork for sub-page splitting, but does not actually do any splitting. For now all page groups have one request as pg_test functions don't yet split pages. There are several related patches that are needed support multiple requests per page group. Signed-off-by: NWeston Andros Adamson <dros@primarydata.com> Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
由 Weston Andros Adamson 提交于
This is a step toward allowing pg_test to inform the the coalescing code to reduce the size of requests so they may fit in whatever scheme the pg_test callback wants to define. For now, just return the size of the request if there is space, or 0 if there is not. This shouldn't change any behavior as it acts the same as when the pg_test functions returned bool. Signed-off-by: NWeston Andros Adamson <dros@primarydata.com> Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
由 Weston Andros Adamson 提交于
@inode is passed but not used. Signed-off-by: NWeston Andros Adamson <dros@primarydata.com> Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
由 Weston Andros Adamson 提交于
Remove unused flags PG_NEED_COMMIT and PG_NEED_RESCHED. Add comments describing how each flag is used. Signed-off-by: NWeston Andros Adamson <dros@primarydata.com> Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
由 Anna Schumaker 提交于
Most of this code is the same for both the read and write paths, so combine everything and use the rw_ops when necessary. Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
由 Anna Schumaker 提交于
Combining these functions will let me make a single nfs_rw_common_ops struct (see the next patch). Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
由 Anna Schumaker 提交于
The read and write paths do exactly the same thing for the rpc_prepare rpc_op. This patch combines them together into a single function. Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
由 Anna Schumaker 提交于
I create a new struct nfs_rw_ops to decide the differences between reads and writes. This struct will be set when initializing a new nfs_pgio_descriptor, and then passed on to the nfs_rw_header when a new header is allocated. Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
- 03 8月, 2012 1 次提交
-
-
由 Peng Tao 提交于
To allow layout driver to pass private information around pg_init/pg_doio. Signed-off-by: NPeng Tao <tao.peng@emc.com> Signed-off-by: NBoaz Harrosh <bharrosh@panasas.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
- 29 6月, 2012 1 次提交
-
-
由 Trond Myklebust 提交于
The 'committed' field is not needed once we have put the struct nfs_page on the right list. Also correct the type of the verifier: it is not an array of __be32, but simply an 8 byte long opaque array. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
- 10 5月, 2012 3 次提交
-
-
由 Trond Myklebust 提交于
Function rename to ensure that the functionality of nfs_unlock_request() mirrors that of nfs_lock_request(). Then let nfs_unlock_and_release_request() do the work of what used to be called nfs_unlock_request()... Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com> Cc: Fred Isaman <iisaman@netapp.com>
-
由 Trond Myklebust 提交于
We only have two places where we need to grab a reference when trying to lock the nfs_page. We're better off making that explicit. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com> Cc: Fred Isaman <iisaman@netapp.com>
-
由 Trond Myklebust 提交于
We have to unlock the nfs_page before we call nfs_end_page_writeback to avoid races with functions that expect the page to be unlocked when PG_locked and PG_writeback are not set. The problem is that nfs_unlock_request also releases the nfs_page, causing a deadlock if the release of the nfs_open_context triggers an iput() while the PG_writeback flag is still set... The solution is to separate the unlocking and release of the nfs_page, so that we can do the former before nfs_end_page_writeback and the latter after. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com> Cc: Fred Isaman <iisaman@netapp.com>
-
- 28 4月, 2012 4 次提交
-
-
由 Fred Isaman 提交于
This also has the advantage that it allows directio to use pnfs. Signed-off-by: NFred Isaman <iisaman@netapp.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Fred Isaman 提交于
Signed-off-by: NFred Isaman <iisaman@netapp.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Fred Isaman 提交于
Factors out the code that will need to change when directio starts using these code paths. This will allow directio to use the generic pagein and flush routines Signed-off-by: NFred Isaman <iisaman@netapp.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Fred Isaman 提交于
Decouple nfs_pgio_header and nfs_read_data, and have (possibly multiple) nfs_read_datas each take a refcount on nfs_pgio_header. For the moment keeps nfs_read_header as a way to preallocate a single nfs_read_data with the nfs_pgio_header. The code doesn't need this, and would be prettier without, but given the amount of churn I am already introducing I didn't want to play with tuning new mempools. This also fixes bug in pnfs_ld_handle_read_error. In the case of desc->pg_bsize < PAGE_CACHE_SIZE, the pages list was empty, causing replay attempt to do nothing. Signed-off-by: NFred Isaman <iisaman@netapp.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
- 17 3月, 2012 1 次提交
-
-
由 Trond Myklebust 提交于
Move more pnfs-isms out of the generic commit code. Bugfixes: - filelayout_scan_commit_lists doesn't need to get/put the lseg. In fact since it is run under the inode->i_lock, the lseg_put() can deadlock. - Ensure that we distinguish between what needs to be done for commit-to-data server and what needs to be done for commit-to-MDS using the new flag PG_COMMIT_TO_DS. Otherwise we may end up calling put_lseg() on a bucket for a struct nfs_page that got written through the MDS. - Fix a case where we were using list_del() on an nfs_page->wb_list instead of list_del_init(). - filelayout_initiate_commit needs to call filelayout_commit_release on error instead of the mds_ops->rpc_release(). Otherwise it won't clear the commit lock. Cleanups: - Let the files layout manage the commit lists for the pNFS case. Don't expose stuff like pnfs_choose_commit_list, and the fact that the commit buckets hold references to the layout segment in common code. - Cast out the put_lseg() calls for the struct nfs_read/write_data->lseg into the pNFS layer from whence they came. - Let the pNFS layer manage the NFS_INO_PNFS_COMMIT bit. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com> Cc: Fred Isaman <iisaman@netapp.com>
-
- 11 3月, 2012 2 次提交
-
-
由 Fred Isaman 提交于
The radix tree is only being used to compile lists of reqs needing commit. It is simpler to just put the reqs directly into a list. Signed-off-by: NFred Isaman <iisaman@netapp.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Fred Isaman 提交于
The last real use of this tag was removed by commit 7f2f12d9 NFS: Simplify nfs_wb_page() Signed-off-by: NFred Isaman <iisaman@netapp.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
- 20 10月, 2011 1 次提交
-
-
由 Trond Myklebust 提交于
Don't rely on the PageError flag to tell us if one of the partial reads of the page failed. Instead, replace that with a dedicated flag in the struct nfs_page. Then clean out redundant uses of the PageError flag: the VM no longer checks it for reads. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
- 15 7月, 2011 4 次提交
-
-
由 Trond Myklebust 提交于
...and ensure that we recoalese to take into account differences in differences in block sizes when falling back to write through the MDS. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Trond Myklebust 提交于
...and ensure that we recoalese to take into account differences in block sizes when falling back to read through the MDS. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Trond Myklebust 提交于
If an attempt to do pNFS fails, and we have to fall back to writing through the MDS, then we may want to re-coalesce the requests that we already have since the block size for the MDS read/writes may be different to that of the DS read/writes. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Trond Myklebust 提交于
Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
- 13 7月, 2011 2 次提交
-
-
由 Trond Myklebust 提交于
Ensure that we always get a layout before setting up the i/o request. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Trond Myklebust 提交于
We need to ensure that the layouts are set up before we can decide to coalesce requests. To do so, we want to further split up the struct nfs_pageio_descriptor operations into an initialisation callback, a coalescing test callback, and a 'do i/o' callback. This patch cleans up the existing callback methods before adding the 'initialisation' callback. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
- 21 6月, 2011 1 次提交
-
-
由 Benny Halevy 提交于
Otherwise we end up overflowing the rpc buffer size on the receive end. Signed-off-by: NBenny Halevy <benny@tonian.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
- 30 5月, 2011 1 次提交
-
-
由 Benny Halevy 提交于
Signed-off-by: NBenny Halevy <bhalevy@panasas.com>
-
- 27 3月, 2011 1 次提交
-
-
由 Trond Myklebust 提交于
Now that the inode scalability patches have been merged, it is no longer safe to call igrab() under the inode->i_lock. Now that we no longer call nfs_clear_request() until the nfs_page is being freed, we know that we are always holding a reference to the nfs_open_context, which again holds a reference to the path, and so the inode cannot be freed until the last nfs_page has been removed from the radix tree and freed. We can therefore skip the igrab()/iput() altogether. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
- 24 3月, 2011 1 次提交
-
-
由 Fred Isaman 提交于
We create three major hooks for the pnfs code. pnfs_mark_request_commit() is called during writeback_done from nfs_mark_request_commit, which gives the driver an opportunity to claim it wants control over commiting a particular req. pnfs_choose_commit_list() is called from nfs_scan_list to choose which list a given req should be added to, based on where we intend to send it for COMMIT. It is up to the driver to have preallocated list headers for each destination it may need. pnfs_commit_list() is how the driver actually takes control, it is used instead of nfs_commit_list(). In order to pass information between the above functions, we create a union in nfs_page to hold a lseg (which is possible because the req is not on any list while in transition), and add some flags to indicate if we need to use the pnfs code. Signed-off-by: NFred Isaman <iisaman@netapp.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
- 22 3月, 2011 1 次提交
-
-
由 Trond Myklebust 提交于
If we're only doing a single write, and there are no other unstable writes being queued up, we might want to just flip to using a stable write RPC call. Reviewed-by: NNeilBrown <neilb@suse.de> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
- 12 3月, 2011 3 次提交
-
-
由 Fred Isaman 提交于
This will make it possible to clear the lseg pointer in the same function as it is put, instead of in the caller nfs_pageio_doio(). Signed-off-by: NFred Isaman <iisaman@netapp.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Fred Isaman 提交于
Move the pnfs_update_layout call location to nfs_pageio_do_add_request(). Grab the lseg sent in the doio function to nfs_read_rpcsetup and attach it to each nfs_read_data so it can be sent to the layout driver. Signed-off-by: NAndy Adamson <andros@netapp.com> Signed-off-by: NAndy Adamson <andros@citi.umich.edu> Signed-off-by: NDean Hildebrand <dhildeb@us.ibm.com> Signed-off-by: NFred Isaman <iisaman@citi.umich.edu> Signed-off-by: NFred Isaman <iisaman@netapp.com> Signed-off-by: NBenny Halevy <bhalevy@panasas.com> Signed-off-by: NBoaz Harrosh <bharrosh@panasas.com> Signed-off-by: NOleg Drokin <green@linuxhacker.ru> Signed-off-by: NTao Guo <guotao@nrchpc.ac.cn> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Fred Isaman 提交于
Add a pg_test layout driver hook which is used to avoid coelescing I/O across layout stripes. Signed-off-by: NAndy Adamson <andros@netapp.com> Signed-off-by: NAndy Adamson <andros@citi.umich.edu> Signed-off-by: NDean Hildebrand <dhildeb@us.ibm.com> Signed-off-by: NFred Isaman <iisaman@citi.umich.edu> Signed-off-by: NFred Isaman <iisaman@netapp.com> Signed-off-by: NBenny Halevy <bhalevy@panasas.com> Signed-off-by: NBoaz Harrosh <bharrosh@panasas.com> Signed-off-by: NOleg Drokin <green@linuxhacker.ru> Signed-off-by: NTao Guo <guotao@nrchpc.ac.cn> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
- 08 12月, 2010 1 次提交
-
-
由 Trond Myklebust 提交于
When a nfs_page is freed, nfs_free_request is called which also calls nfs_clear_request to clean out the lock and open contexts and free the pagecache page. However, a couple of places in the nfs code call nfs_clear_request themselves. What happens here if the refcount on the request is still high? We'll be releasing contexts and freeing pointers while the request is possibly still in use. Remove those bare calls to nfs_clear_context. That should only be done when the request is being freed. Note that when doing this, we need to watch out for tests of req->wb_page. Previously, nfs_set_page_tag_locked() and nfs_clear_page_tag_locked() would check the value of req->wb_page to figure out if the page is mapped into the nfsi->nfs_page_tree. We now indicate the page is mapped using the new bit PG_MAPPED in req->wb_flags . Reported-by: NJeff Layton <jlayton@redhat.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
- 31 7月, 2010 1 次提交
-
-
由 Trond Myklebust 提交于
This patch fixes bugzilla entry 14501: https://bugzilla.kernel.org/show_bug.cgi?id=14501Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
- 10 7月, 2008 1 次提交
-
-
由 Trond Myklebust 提交于
Currently, if an unstable write completes, we cannot redirty the page in order to reflect a new change in the page data until after we've sent a COMMIT request. This patch allows a page rewrite to proceed without the unnecessary COMMIT step, putting it immediately back onto the dirty page list, undoing the VM unstable write accounting, and removing the NFS_PAGE_TAG_COMMIT tag from the NFS radix tree. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
- 30 1月, 2008 1 次提交
-
-
由 Trond Myklebust 提交于
Ensure that we set/clear NFS_PAGE_TAG_LOCKED when the nfs_page is hashed. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
- 10 10月, 2007 1 次提交
-
-
由 Trond Myklebust 提交于
The addition of nfs_page_mkwrite means that We should no longer need to create requests inside nfs_writepage() Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-