- 20 4月, 2008 1 次提交
-
-
由 Trond Myklebust 提交于
In the case of readpage() we need to ensure that the pages get unlocked, and that the error is flagged. In the case of O_DIRECT, we need to ensure that the pages are all released. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
- 20 3月, 2008 2 次提交
-
-
由 Fred Isaman 提交于
Signed-off-by: NFred Isaman <iisaman@citi.umich.edu> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Fred Isaman 提交于
Ignoring the return value from nfs_pageio_add_request can cause deadlocks. In read path: call nfs_pageio_add_request from readpage_async_filler assume at this point that there are requests already in desc, that can't be merged with the current request. so nfs_pageio_doio is fired up to clear out desc. assume something goes wrong in setting up the io, so desc->pg_error is set. This causes nfs_pageio_add_request to return 0, *WITHOUT* adding the original request. BUT, since return code is ignored, readpage_async_filler assumes it has been added, and does nothing further, leaving page locked. do_generic_mapping_read will eventually call lock_page, resulting in deadlock In write path: page is marked dirty by generic_perform_write nfs_writepages is called call nfs_pageio_add_request from nfs_page_async_flush assume at this point that there are requests already in desc, that can't be merged with the current request. so nfs_pageio_doio is fired up to clear out desc. assume something goes wrong in setting up the io, so desc->pg_error is set. This causes nfs_page_async_flush to return 0, *WITHOUT* adding the original request, yet marking the request as locked (PG_BUSY) and in writeback, clearing dirty marks. The next time a write is done to the page, deadlock will result as nfs_write_end calls nfs_update_request Signed-off-by: NFred Isaman <iisaman@citi.umich.edu> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
- 29 2月, 2008 1 次提交
-
-
由 Trond Myklebust 提交于
Now that we've tightened up the locking rules for RPC queue wakeups, we can remove the RCU-safe kfree calls... Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
- 26 2月, 2008 2 次提交
-
-
由 Trond Myklebust 提交于
We want to ensure that rpc_call_ops that involve mntput() are run on nfsiod rather than on rpciod, so that they don't deadlock when the resulting umount calls rpc_shutdown_client(). Hence we specify that read, write and commit calls must complete on nfsiod. Ditto for NFSv4 open, lock, locku and close asynchronous calls. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Trond Myklebust 提交于
We can't allow rpc callback functions like task->tk_ops->rpc_call_prepare() and task->tk_ops->rpc_call_done() to call mntput() in any way, since that will cause a deadlock when the call to rpc_shutdown_client() attempts to wait on 'task' to complete. We can avoid the above deadlock by moving calls to mntput to task->tk_ops->rpc_release() callback, since at that time the task will be marked as completed, and so rpc_shutdown_client won't attempt to wait on it. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
- 06 2月, 2008 1 次提交
-
-
由 Christoph Lameter 提交于
Simplify page cache zeroing of segments of pages through 3 functions zero_user_segments(page, start1, end1, start2, end2) Zeros two segments of the page. It takes the position where to start and end the zeroing which avoids length calculations and makes code clearer. zero_user_segment(page, start, end) Same for a single segment. zero_user(page, start, length) Length variant for the case where we know the length. We remove the zero_user_page macro. Issues: 1. Its a macro. Inline functions are preferable. 2. The KM_USER0 macro is only defined for HIGHMEM. Having to treat this special case everywhere makes the code needlessly complex. The parameter for zeroing is always KM_USER0 except in one single case that we open code. Avoiding KM_USER0 makes a lot of code not having to be dealing with the special casing for HIGHMEM anymore. Dealing with kmap is only necessary for HIGHMEM configurations. In those configurations we use KM_USER0 like we do for a series of other functions defined in highmem.h. Since KM_USER0 is depends on HIGHMEM the existing zero_user_page function could not be a macro. zero_user_* functions introduced here can be be inline because that constant is not used when these functions are called. Also extract the flushing of the caches to be outside of the kmap. [akpm@linux-foundation.org: fix nfs and ntfs build] [akpm@linux-foundation.org: fix ntfs build some more] Signed-off-by: NChristoph Lameter <clameter@sgi.com> Cc: Steven French <sfrench@us.ibm.com> Cc: Michael Halcrow <mhalcrow@us.ibm.com> Cc: <linux-ext4@vger.kernel.org> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Anton Altaparmakov <aia21@cantab.net> Cc: Mark Fasheh <mark.fasheh@oracle.com> Cc: David Chinner <dgc@sgi.com> Cc: Michael Halcrow <mhalcrow@us.ibm.com> Cc: Steven French <sfrench@us.ibm.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 30 1月, 2008 5 次提交
-
-
由 Benny Halevy 提交于
use NFS_I(inode)->flags instead Signed-off-by: NBenny Halevy <bhalevy@panasas.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Trond Myklebust 提交于
Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Trond Myklebust 提交于
Move the common code for setting up the nfs_write_data and nfs_read_data structures into fs/nfs/read.c, fs/nfs/write.c and fs/nfs/direct.c. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Trond Myklebust 提交于
We want the default scheduling priority (priority == 0) to remain RPC_PRIORITY_NORMAL. Also ensure that the priority wait queue scheduling is per process id instead of sometimes being per thread, and sometimes being per inode. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Trond Myklebust 提交于
Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
- 07 12月, 2007 1 次提交
-
-
由 Matthew Wilcox 提交于
By using the TASK_KILLABLE infrastructure, we can get rid of the 'intr' mount option. We have to use _killable everywhere instead of _interruptible as we get rid of rpc_clnt_sigmask/sigunmask. Signed-off-by: NLiam R. Howlett <howlett@gmail.com> Signed-off-by: NMatthew Wilcox <willy@linux.intel.com>
-
- 10 10月, 2007 2 次提交
-
-
由 Trond Myklebust 提交于
NFSv3 will correctly update atime on a read() call, so there is no need to set the NFS_INO_INVALID_ATIME flag unless the call to nfs_refresh_inode() fails. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Trond Myklebust 提交于
Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
- 20 7月, 2007 1 次提交
-
-
由 Paul Mundt 提交于
Slab destructors were no longer supported after Christoph's c59def9f change. They've been BUGs for both slab and slub, and slob never supported them either. This rips out support for the dtor pointer from kmem_cache_create() completely and fixes up every single callsite in the kernel (there were about 224, not including the slab allocator definitions themselves, or the documentation references). Signed-off-by: NPaul Mundt <lethal@linux-sh.org>
-
- 11 7月, 2007 2 次提交
-
-
由 Trond Myklebust 提交于
Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Trond Myklebust 提交于
Since PG_uptodate may now end up getting set during the call to nfs_wb_page(), we can avoid putting a read request on the wire in those situations. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
- 15 5月, 2007 1 次提交
-
-
由 Nate Diller 提交于
Use zero_user_page() instead of the newly deprecated memclear_highpage_flush(). Signed-off-by: NNate Diller <nate.diller@gmail.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: "J. Bruce Fields" <bfields@fieldses.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
- 01 5月, 2007 4 次提交
-
-
由 Trond Myklebust 提交于
Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Trond Myklebust 提交于
Do the coalescing of read requests into block sized requests at start of I/O as we scan through the pages instead of going through a second pass. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Trond Myklebust 提交于
Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Trond Myklebust 提交于
Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
- 04 2月, 2007 2 次提交
-
-
由 Chuck Lever 提交于
The tk_pid field is an unsigned short. The proper print format specifier for that type is %5u, not %4d. Also clean up some miscellaneous print formatting nits. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Trond Myklebust 提交于
It makes no sense to maintain 2 parallel systems for reading in pages. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
- 08 12月, 2006 2 次提交
-
-
由 Christoph Lameter 提交于
Replace all uses of kmem_cache_t with struct kmem_cache. The patch was generated using the following script: #!/bin/sh # # Replace one string by another in all the kernel sources. # set -e for file in `find * -name "*.c" -o -name "*.h"|xargs grep -l $1`; do quilt add $file sed -e "1,\$s/$1/$2/g" $file >/tmp/$$ mv /tmp/$$ $file quilt refresh done The script was run like this sh replace kmem_cache_t "struct kmem_cache" Signed-off-by: NChristoph Lameter <clameter@sgi.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Christoph Lameter 提交于
SLAB_NOFS is an alias of GFP_NOFS. Signed-off-by: NChristoph Lameter <clameter@sgi.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 06 12月, 2006 5 次提交
-
-
由 Trond Myklebust 提交于
Clean up a lot of ad-hoc page length calculations in fs/nfs/write.c Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Frank Filz 提交于
Remove use of the Big Kernel Lock around calls to rpc_execute. Signed-off-by: NFrank Filz <ffilz@us.ibm.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Trond Myklebust 提交于
Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Trond Myklebust 提交于
We must always call ->read_done() before we truncate the page data, or decide to flag an error. The reasons are that in NFSv2, ->read_done() is where the eof flag gets set. in NFSv3/v4 ->read_done() handles EJUKEBOX-type errors, and v4 state recovery. However, we need to mark the pages as uptodate before we deal with short read errors, since we may need to modify the nfs_read_data arguments. We therefore split the current nfs_readpage_result() into two parts: nfs_readpage_result(), which calls ->read_done() etc, and nfs_readpage_retry(), which subsequently handles short reads. Note: Removing the code that retries in case of a short read also fixes a bug in nfs_direct_read_result(), which used to return a corrupted number of bytes. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Trond Myklebust 提交于
Use RCU to ensure that we can safely call rpc_finish_wakeup after we've called __rpc_do_wake_up_task. If not, there is a theoretical race, in which the rpc_task finishes executing, and gets freed first. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
- 27 9月, 2006 1 次提交
-
-
由 Alexey Dobriyan 提交于
* Rougly half of callers already do it by not checking return value * Code in drivers/acpi/osl.c does the following to be sure: (void)kmem_cache_destroy(cache); * Those who check it printk something, however, slab_error already printed the name of failed cache. * XFS BUGs on failed kmem_cache_destroy which is not the decision low-level filesystem driver should make. Converted to ignore. Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 23 9月, 2006 2 次提交
-
-
由 Trond Myklebust 提交于
Currently, a read() request will return EIO even if the file has been deleted on the server, simply because that is what the VM will return if the call to readpage() fails to update the page. Ensure that readpage() marks the inode as stale if it receives an ESTALE. Then return that error to userland. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 David Howells 提交于
The attached patch makes NFS share superblocks between mounts from the same server and FSID over the same protocol. It does this by creating each superblock with a false root and returning the real root dentry in the vfsmount presented by get_sb(). The root dentry set starts off as an anonymous dentry if we don't already have the dentry for its inode, otherwise it simply returns the dentry we already have. We may thus end up with several trees of dentries in the superblock, and if at some later point one of anonymous tree roots is discovered by normal filesystem activity to be located in another tree within the superblock, the anonymous root is named and materialises attached to the second tree at the appropriate point. Why do it this way? Why not pass an extra argument to the mount() syscall to indicate the subpath and then pathwalk from the server root to the desired directory? You can't guarantee this will work for two reasons: (1) The root and intervening nodes may not be accessible to the client. With NFS2 and NFS3, for instance, mountd is called on the server to get the filehandle for the tip of a path. mountd won't give us handles for anything we don't have permission to access, and so we can't set up NFS inodes for such nodes, and so can't easily set up dentries (we'd have to have ghost inodes or something). With this patch we don't actually create dentries until we get handles from the server that we can use to set up their inodes, and we don't actually bind them into the tree until we know for sure where they go. (2) Inaccessible symbolic links. If we're asked to mount two exports from the server, eg: mount warthog:/warthog/aaa/xxx /mmm mount warthog:/warthog/bbb/yyy /nnn We may not be able to access anything nearer the root than xxx and yyy, but we may find out later that /mmm/www/yyy, say, is actually the same directory as the one mounted on /nnn. What we might then find out, for example, is that /warthog/bbb was actually a symbolic link to /warthog/aaa/xxx/www, but we can't actually determine that by talking to the server until /warthog is made available by NFS. This would lead to having constructed an errneous dentry tree which we can't easily fix. We can end up with a dentry marked as a directory when it should actually be a symlink, or we could end up with an apparently hardlinked directory. With this patch we need not make assumptions about the type of a dentry for which we can't retrieve information, nor need we assume we know its place in the grand scheme of things until we actually see that place. This patch reduces the possibility of aliasing in the inode and page caches for inodes that may be accessed by more than one NFS export. It also reduces the number of superblocks required for NFS where there are many NFS exports being used from a server (home directory server + autofs for example). This in turn makes it simpler to do local caching of network filesystems, as it can then be guaranteed that there won't be links from multiple inodes in separate superblocks to the same cache file. Obviously, cache aliasing between different levels of NFS protocol could still be a problem, but at least that gives us another key to use when indexing the cache. This patch makes the following changes: (1) The server record construction/destruction has been abstracted out into its own set of functions to make things easier to get right. These have been moved into fs/nfs/client.c. All the code in fs/nfs/client.c has to do with the management of connections to servers, and doesn't touch superblocks in any way; the remaining code in fs/nfs/super.c has to do with VFS superblock management. (2) The sequence of events undertaken by NFS mount is now reordered: (a) A volume representation (struct nfs_server) is allocated. (b) A server representation (struct nfs_client) is acquired. This may be allocated or shared, and is keyed on server address, port and NFS version. (c) If allocated, the client representation is initialised. The state member variable of nfs_client is used to prevent a race during initialisation from two mounts. (d) For NFS4 a simple pathwalk is performed, walking from FH to FH to find the root filehandle for the mount (fs/nfs/getroot.c). For NFS2/3 we are given the root FH in advance. (e) The volume FSID is probed for on the root FH. (f) The volume representation is initialised from the FSINFO record retrieved on the root FH. (g) sget() is called to acquire a superblock. This may be allocated or shared, keyed on client pointer and FSID. (h) If allocated, the superblock is initialised. (i) If the superblock is shared, then the new nfs_server record is discarded. (j) The root dentry for this mount is looked up from the root FH. (k) The root dentry for this mount is assigned to the vfsmount. (3) nfs_readdir_lookup() creates dentries for each of the entries readdir() returns; this function now attaches disconnected trees from alternate roots that happen to be discovered attached to a directory being read (in the same way nfs_lookup() is made to do for lookup ops). The new d_materialise_unique() function is now used to do this, thus permitting the whole thing to be done under one set of locks, and thus avoiding any race between mount and lookup operations on the same directory. (4) The client management code uses a new debug facility: NFSDBG_CLIENT which is set by echoing 1024 to /proc/net/sunrpc/nfs_debug. (5) Clone mounts are now called xdev mounts. (6) Use the dentry passed to the statfs() op as the handle for retrieving fs statistics rather than the root dentry of the superblock (which is now a dummy). Signed-Off-By: NDavid Howells <dhowells@redhat.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
- 19 9月, 2006 1 次提交
-
-
由 Trond Myklebust 提交于
Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
- 09 9月, 2006 1 次提交
-
-
由 Trond Myklebust 提交于
The logic in nfs_direct_read_schedule and nfs_direct_write_schedule can allow data->npages to be one larger than rpages. This causes a page pointer to be written beyond the end of the pagevec in nfs_read_data (or nfs_write_data). Fix this by making nfs_(read|write)_alloc() calculate the size of the pagevec array, and initialise data->npages. Also get rid of the redundant argument to nfs_commit_alloc(). Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com> Cc: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 25 8月, 2006 1 次提交
-
-
由 Trond Myklebust 提交于
The problem is that we may be caching writes that would extend the file and create a hole in the region that we are reading. In this case, we need to detect the eof from the server, ensure that we zero out the pages that are part of the hole and mark them as up to date. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com> (cherry picked from 856b603b01b99146918c093969b6cb1b1b0f1c01 commit)
-
- 04 8月, 2006 1 次提交
-
-
由 Adrian Bunk 提交于
nfs_writedata_free() and nfs_readdata_free() can now become static. Signed-off-by: NAdrian Bunk <bunk@stusta.de> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com> (cherry picked from 5e1ce40f0c3c8f67591aff17756930d7a18ceb1a commit)
-
- 01 7月, 2006 1 次提交
-
-
由 Jörn Engel 提交于
Signed-off-by: NJörn Engel <joern@wohnheim.fh-wedel.de> Signed-off-by: NAdrian Bunk <bunk@stusta.de>
-