- 03 12月, 2020 40 次提交
-
-
由 Sargun Dhillon 提交于
In several patches work has been done to enable NFSv4 to use user namespaces: 58002399: NFSv4: Convert the NFS client idmapper to use the container user namespace 3b7eb5e3: NFS: When mounting, don't share filesystems between different user namespaces Unfortunately, the userspace APIs were only such that the userspace facing side of the filesystem (superblock s_user_ns) could be set to a non init user namespace. This furthers the fs_context related refactoring, and piggybacks on top of that logic, so the superblock user namespace, and the NFS user namespace are the same. Users can still use rpc.idmapd if they choose to, but there are complexities with user namespaces and request-key that have yet to be addresssed. Eventually, we will need to at least: * Come up with an upcall mechanism that can be triggered inside of the container, or safely triggered outside, with the requisite context to do the right mapping. * Handle whatever refactoring needs to be done in net/sunrpc. Signed-off-by: NSargun Dhillon <sargun@sargun.me> Tested-by: NAlban Crequy <alban.crequy@gmail.com> Fixes: 62a55d08 ("NFS: Additional refactoring for fs_context conversion") Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Sargun Dhillon 提交于
There was refactoring done to use the fs_context for mounting done in: 62a55d08: NFS: Additional refactoring for fs_context conversion This made it so that the net_ns is fetched from the fs_context (the netns that fsopen is called in). This change also makes it so that the credential fetched during fsopen is used as well as the net_ns. NFS has already had a number of changes to prepare it for user namespaces: 1a58e8a0: NFS: Store the credential of the mount process in the nfs_server 264d948c: NFS: Convert NFSv3 to use the container user namespace c207db2f: NFS: Convert NFSv2 to use the container user namespace Previously, different credentials could be used for creation of the fs_context versus creation of the nfs_server, as FSCONFIG_CMD_CREATE did the actual credential check, and that's where current_creds() were fetched. This meant that the user namespace which fsopen was called in could be a non-init user namespace. This still requires that the user that calls FSCONFIG_CMD_CREATE has CAP_SYS_ADMIN in the init user ns. This roughly allows a privileged user to mount on behalf of an unprivileged usernamespace, by forking off and calling fsopen in the unprivileged user namespace. It can then pass back that fsfd to the privileged process which can configure the NFS mount, and then it can call FSCONFIG_CMD_CREATE before switching back into the mount namespace of the container, and finish up the mounting process and call fsmount and move_mount. Signed-off-by: NSargun Dhillon <sargun@sargun.me> Tested-by: NAlban Crequy <alban.crequy@gmail.com> Fixes: 62a55d08 ("NFS: Additional refactoring for fs_context conversion") Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Trond Myklebust 提交于
When returning the layout in nfs4_evict_inode(), we need to ensure that the layout is actually done being freed before we can proceed to free the inode itself. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Trond Myklebust 提交于
Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Trond Myklebust 提交于
Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Trond Myklebust 提交于
While we always want to align to the next page and/or the beginning of the tail buffer when we call xdr_set_next_page(), the functions xdr_align_data() and xdr_expand_hole() really want to align to the next object in that next page or tail. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Trond Myklebust 提交于
rpc_prepare_reply_pages() currently expects the 'hdrsize' argument to contain the length of the data that we expect to want placed in the head kvec plus a count of 1 word of padding that is placed after the page data. This is very confusing when trying to read the code, and sometimes leads to callers adding an arbitrary value of '1' just in order to satisfy the requirement (whether or not the page data actually needs such padding). This patch aims to clarify the code by changing the 'hdrsize' argument to remove that 1 word of padding. This means we need to subtract the padding from all the existing callers. Fixes: 02ef04e4 ("NFS: Account for XDR pad of buf->pages") Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Trond Myklebust 提交于
Fix up xdr_read_pages() so that it can handle object lengths that are larger than the page length, by simply aligning to the next object in the buffer tail. The function will continue to return the length of the truncate object data that actually fit into the pages. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Trond Myklebust 提交于
Allow xdr_set_iov() to set a base so that we can use it to set the cursor to a specific position in the kvec buffer. If the new base overflows the kvec/pages buffer in either xdr_set_iov() or xdr_set_page_base(), then truncate it so that we point to the end of the buffer. Finally, change both function to return the number of bytes remaining to read in their buffers. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Trond Myklebust 提交于
We already know that the head buffer and page are empty, so if there is any data, it is in the tail. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Trond Myklebust 提交于
We can fit the device_addr4 opaque data padding in the pages. Fixes: cf500bac ("SUNRPC: Introduce rpc_prepare_reply_pages()") Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Trond Myklebust 提交于
Use the existing xdr_stream_decode_string_dup() to safely decode into kmalloced strings. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Trond Myklebust 提交于
Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Trond Myklebust 提交于
Ensure that we report the correct netid when using UDP or RDMA transports to the DSes. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Trond Myklebust 提交于
We want to enable RDMA and UDP as valid transport methods if a GETDEVICEINFO call specifies it. Do so by adding a parser for the netid that translates it to an appropriate argument for the RPC transport layer. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Trond Myklebust 提交于
Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Trond Myklebust 提交于
If the pNFS metadata server advertises multiple addresses for the same data server, we should try to connect to just one protocol family and transport type on the assumption that homogeneity will improve performance. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Trond Myklebust 提交于
Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Trond Myklebust 提交于
Switch the mount code to use xprt_find_transport_ident() and to check the results before allowing the mount to proceed. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Trond Myklebust 提交于
Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Trond Myklebust 提交于
After we've looked up the transport module, we need to ensure it can't go away until we've finished running the transport setup code. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Trond Myklebust 提交于
According to RFC5666, the correct netid for an IPv6 addressed RDMA transport is "rdma6", which we've supported as a mount option since Linux-4.7. The problem is when we try to load the module "xprtrdma6", that will fail, since there is no modulealias of that name. Fixes: 181342c5 ("xprtrdma: Add rdma6 option to support NFS/RDMA IPv6") Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Trond Myklebust 提交于
If the directory is changing, causing the page cache to get invalidated while we are listing the contents, then the NFS client is currently forced to read in the entire directory contents from scratch, because it needs to perform a linear search for the readdir cookie. While this is not an issue for small directories, it does not scale to directories with millions of entries. In order to be able to deal with large directories that are changing, add a heuristic to ensure that if the page cache is empty, and we are searching for a cookie that is not the zero cookie, we just default to performing uncached readdir. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: NBenjamin Coddington <bcodding@redhat.com> Tested-by: NBenjamin Coddington <bcodding@redhat.com> Tested-by: NDave Wysochanski <dwysocha@redhat.com>
-
由 Trond Myklebust 提交于
If we're doing uncached readdir, allocate multiple pages in order to try to avoid duplicate RPC calls for the same getdents() call. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: NBenjamin Coddington <bcodding@redhat.com> Tested-by: NBenjamin Coddington <bcodding@redhat.com> Tested-by: NDave Wysochanski <dwysocha@redhat.com>
-
由 Trond Myklebust 提交于
If the server is handing out monotonically increasing readdir cookie values, then we can optimise away searches through pages that contain cookies that lie outside our search range. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: NBenjamin Coddington <bcodding@redhat.com> Tested-by: NBenjamin Coddington <bcodding@redhat.com> Tested-by: NDave Wysochanski <dwysocha@redhat.com>
-
由 Trond Myklebust 提交于
If the server insists on using the readdir verifiers in order to allow cookies to expire, then we should ensure that we cache the verifier with the cookie, so that we can return an error if the application tries to use the expired cookie. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: NBenjamin Coddington <bcodding@redhat.com> Tested-by: NBenjamin Coddington <bcodding@redhat.com> Tested-by: NDave Wysochanski <dwysocha@redhat.com>
-
由 Trond Myklebust 提交于
If the server returns NFS4ERR_NOT_SAME or tells us that the cookie is bad in response to a READDIR call, then we should empty the page cache so that we can fill it from scratch again. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: NBenjamin Coddington <bcodding@redhat.com> Tested-by: NBenjamin Coddington <bcodding@redhat.com> Tested-by: NDave Wysochanski <dwysocha@redhat.com>
-
由 Trond Myklebust 提交于
If we're ever going to allow support for servers that use the readdir verifier, then that use needs to be managed by the middle layers as those need to be able to reject cookies from other verifiers. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: NBenjamin Coddington <bcodding@redhat.com> Tested-by: NBenjamin Coddington <bcodding@redhat.com> Tested-by: NDave Wysochanski <dwysocha@redhat.com>
-
由 Trond Myklebust 提交于
Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: NBenjamin Coddington <bcodding@redhat.com> Tested-by: NBenjamin Coddington <bcodding@redhat.com> Tested-by: NDave Wysochanski <dwysocha@redhat.com>
-
由 Trond Myklebust 提交于
The descriptor and the struct nfs_entry are both large structures, so don't allocate them from the stack. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: NBenjamin Coddington <bcodding@redhat.com> Tested-by: NBenjamin Coddington <bcodding@redhat.com> Tested-by: NDave Wysochanski <dwysocha@redhat.com>
-
由 Trond Myklebust 提交于
Clean up nfs_do_filldir(). Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: NBenjamin Coddington <bcodding@redhat.com> Tested-by: NBenjamin Coddington <bcodding@redhat.com> Tested-by: NDave Wysochanski <dwysocha@redhat.com>
-
由 Trond Myklebust 提交于
Remove the redundant caching of the credential in struct nfs_open_dir_context. Pass the buffer size as an argument to nfs_readdir_xdr_filler(). Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: NBenjamin Coddington <bcodding@redhat.com> Tested-by: NBenjamin Coddington <bcodding@redhat.com> Tested-by: NDave Wysochanski <dwysocha@redhat.com>
-
由 Trond Myklebust 提交于
Support readdir buffers of up to 1MB in size so that we can read large directories using few RPC calls. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: NBenjamin Coddington <bcodding@redhat.com> Tested-by: NBenjamin Coddington <bcodding@redhat.com> Tested-by: NDave Wysochanski <dwysocha@redhat.com>
-
由 Trond Myklebust 提交于
We don't need to store a hash, so replace struct qstr with a simple const char pointer and length. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: NBenjamin Coddington <bcodding@redhat.com> Tested-by: NBenjamin Coddington <bcodding@redhat.com> Tested-by: NDave Wysochanski <dwysocha@redhat.com>
-
由 Trond Myklebust 提交于
Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: NBenjamin Coddington <bcodding@redhat.com> Tested-by: NBenjamin Coddington <bcodding@redhat.com> Tested-by: NDave Wysochanski <dwysocha@redhat.com>
-
由 Trond Myklebust 提交于
The kmapped pointer is only used once per loop to check if we need to exit. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: NBenjamin Coddington <bcodding@redhat.com> Tested-by: NBenjamin Coddington <bcodding@redhat.com> Tested-by: NDave Wysochanski <dwysocha@redhat.com>
-
由 Trond Myklebust 提交于
If a readdir call returns more data than we can fit into one page cache page, then allocate a new one for that data rather than discarding the data. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: NBenjamin Coddington <bcodding@redhat.com> Tested-by: NBenjamin Coddington <bcodding@redhat.com> Tested-by: NDave Wysochanski <dwysocha@redhat.com>
-
由 Trond Myklebust 提交于
Refactor to use pagecache_get_page() so that we can fill the page in multiple stages. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: NBenjamin Coddington <bcodding@redhat.com> Tested-by: NBenjamin Coddington <bcodding@redhat.com> Tested-by: NDave Wysochanski <dwysocha@redhat.com>
-
由 Trond Myklebust 提交于
Clean up handling of the case where there are no entries in the readdir reply. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: NBenjamin Coddington <bcodding@redhat.com> Tested-by: NBenjamin Coddington <bcodding@redhat.com> Tested-by: NDave Wysochanski <dwysocha@redhat.com>
-
由 Trond Myklebust 提交于
Since the 'eof_index' is only ever used as a flag, make it so. Also add a flag to detect if the page has been completely filled. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: NBenjamin Coddington <bcodding@redhat.com> Tested-by: NBenjamin Coddington <bcodding@redhat.com> Tested-by: NDave Wysochanski <dwysocha@redhat.com>
-