- 06 12月, 2006 6 次提交
-
-
由 Trond Myklebust 提交于
Allow callers to directly pass it a struct writeback_control. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Trond Myklebust 提交于
Pass down struct writeback control. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Trond Myklebust 提交于
Make it take a struct writepages argument, and rename to nfs_flush_mapping(). Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Frank Filz 提交于
Remove use of the Big Kernel Lock around calls to rpc_execute. Signed-off-by: NFrank Filz <ffilz@us.ibm.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Trond Myklebust 提交于
Currently nfs_sync_inode_wait() will fail to loop correctly when we call nfs_sync_inode_wait with the FLUSH_INVALIDATE argument. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Trond Myklebust 提交于
Use RCU to ensure that we can safely call rpc_finish_wakeup after we've called __rpc_do_wake_up_task. If not, there is a theoretical race, in which the rpc_task finishes executing, and gets freed first. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
- 21 10月, 2006 2 次提交
-
-
由 Trond Myklebust 提交于
Fix two bugs: - nfs_inode_remove_request will call nfs_clear_request, so we cannot reference req->wb_page after it. Move the call to dec_zone_page_state so that it occurs while req->wb_page is still valid. - Calling nfs_clear_page_writeback is unnecessary since the radix tree tags will have been cleared by the call to nfs_inode_remove_request. Replace with a simple call to nfs_unlock_request. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Andrew Morton 提交于
Separate out the concept of "queue congestion" from "backing-dev congestion". Congestion is a backing-dev concept, not a queue concept. The blk_* congestion functions are retained, as wrappers around the core backing-dev congestion functions. This proper layering is needed so that NFS can cleanly use the congestion functions, and so that CONFIG_BLOCK=n actually links. Cc: "Thomas Maier" <balagi@justmail.de> Cc: "Jens Axboe" <jens.axboe@oracle.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: David Howells <dhowells@redhat.com> Cc: Peter Osterlund <petero2@telia.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 01 10月, 2006 1 次提交
-
-
由 David Howells 提交于
Remove inclusions of linux/mpage.h that are no longer necessary due to the transfer of generic_writepages(). Signed-Off-By: NDavid Howells <dhowells@redhat.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 27 9月, 2006 1 次提交
-
-
由 Alexey Dobriyan 提交于
* Rougly half of callers already do it by not checking return value * Code in drivers/acpi/osl.c does the following to be sure: (void)kmem_cache_destroy(cache); * Those who check it printk something, however, slab_error already printed the name of failed cache. * XFS BUGs on failed kmem_cache_destroy which is not the decision low-level filesystem driver should make. Converted to ignore. Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 23 9月, 2006 3 次提交
-
-
由 Chuck Lever 提交于
Comments-only change to clarify a detail of the NFS protocol and how it is implemented in Linux. Test plan: None. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Trond Myklebust 提交于
Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 David Howells 提交于
The attached patch makes NFS share superblocks between mounts from the same server and FSID over the same protocol. It does this by creating each superblock with a false root and returning the real root dentry in the vfsmount presented by get_sb(). The root dentry set starts off as an anonymous dentry if we don't already have the dentry for its inode, otherwise it simply returns the dentry we already have. We may thus end up with several trees of dentries in the superblock, and if at some later point one of anonymous tree roots is discovered by normal filesystem activity to be located in another tree within the superblock, the anonymous root is named and materialises attached to the second tree at the appropriate point. Why do it this way? Why not pass an extra argument to the mount() syscall to indicate the subpath and then pathwalk from the server root to the desired directory? You can't guarantee this will work for two reasons: (1) The root and intervening nodes may not be accessible to the client. With NFS2 and NFS3, for instance, mountd is called on the server to get the filehandle for the tip of a path. mountd won't give us handles for anything we don't have permission to access, and so we can't set up NFS inodes for such nodes, and so can't easily set up dentries (we'd have to have ghost inodes or something). With this patch we don't actually create dentries until we get handles from the server that we can use to set up their inodes, and we don't actually bind them into the tree until we know for sure where they go. (2) Inaccessible symbolic links. If we're asked to mount two exports from the server, eg: mount warthog:/warthog/aaa/xxx /mmm mount warthog:/warthog/bbb/yyy /nnn We may not be able to access anything nearer the root than xxx and yyy, but we may find out later that /mmm/www/yyy, say, is actually the same directory as the one mounted on /nnn. What we might then find out, for example, is that /warthog/bbb was actually a symbolic link to /warthog/aaa/xxx/www, but we can't actually determine that by talking to the server until /warthog is made available by NFS. This would lead to having constructed an errneous dentry tree which we can't easily fix. We can end up with a dentry marked as a directory when it should actually be a symlink, or we could end up with an apparently hardlinked directory. With this patch we need not make assumptions about the type of a dentry for which we can't retrieve information, nor need we assume we know its place in the grand scheme of things until we actually see that place. This patch reduces the possibility of aliasing in the inode and page caches for inodes that may be accessed by more than one NFS export. It also reduces the number of superblocks required for NFS where there are many NFS exports being used from a server (home directory server + autofs for example). This in turn makes it simpler to do local caching of network filesystems, as it can then be guaranteed that there won't be links from multiple inodes in separate superblocks to the same cache file. Obviously, cache aliasing between different levels of NFS protocol could still be a problem, but at least that gives us another key to use when indexing the cache. This patch makes the following changes: (1) The server record construction/destruction has been abstracted out into its own set of functions to make things easier to get right. These have been moved into fs/nfs/client.c. All the code in fs/nfs/client.c has to do with the management of connections to servers, and doesn't touch superblocks in any way; the remaining code in fs/nfs/super.c has to do with VFS superblock management. (2) The sequence of events undertaken by NFS mount is now reordered: (a) A volume representation (struct nfs_server) is allocated. (b) A server representation (struct nfs_client) is acquired. This may be allocated or shared, and is keyed on server address, port and NFS version. (c) If allocated, the client representation is initialised. The state member variable of nfs_client is used to prevent a race during initialisation from two mounts. (d) For NFS4 a simple pathwalk is performed, walking from FH to FH to find the root filehandle for the mount (fs/nfs/getroot.c). For NFS2/3 we are given the root FH in advance. (e) The volume FSID is probed for on the root FH. (f) The volume representation is initialised from the FSINFO record retrieved on the root FH. (g) sget() is called to acquire a superblock. This may be allocated or shared, keyed on client pointer and FSID. (h) If allocated, the superblock is initialised. (i) If the superblock is shared, then the new nfs_server record is discarded. (j) The root dentry for this mount is looked up from the root FH. (k) The root dentry for this mount is assigned to the vfsmount. (3) nfs_readdir_lookup() creates dentries for each of the entries readdir() returns; this function now attaches disconnected trees from alternate roots that happen to be discovered attached to a directory being read (in the same way nfs_lookup() is made to do for lookup ops). The new d_materialise_unique() function is now used to do this, thus permitting the whole thing to be done under one set of locks, and thus avoiding any race between mount and lookup operations on the same directory. (4) The client management code uses a new debug facility: NFSDBG_CLIENT which is set by echoing 1024 to /proc/net/sunrpc/nfs_debug. (5) Clone mounts are now called xdev mounts. (6) Use the dentry passed to the statfs() op as the handle for retrieving fs statistics rather than the root dentry of the superblock (which is now a dummy). Signed-Off-By: NDavid Howells <dhowells@redhat.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
- 19 9月, 2006 1 次提交
-
-
由 Trond Myklebust 提交于
Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
- 09 9月, 2006 1 次提交
-
-
由 Trond Myklebust 提交于
The logic in nfs_direct_read_schedule and nfs_direct_write_schedule can allow data->npages to be one larger than rpages. This causes a page pointer to be written beyond the end of the pagevec in nfs_read_data (or nfs_write_data). Fix this by making nfs_(read|write)_alloc() calculate the size of the pagevec array, and initialise data->npages. Also get rid of the redundant argument to nfs_commit_alloc(). Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com> Cc: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 04 8月, 2006 1 次提交
-
-
由 Adrian Bunk 提交于
nfs_writedata_free() and nfs_readdata_free() can now become static. Signed-off-by: NAdrian Bunk <bunk@stusta.de> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com> (cherry picked from 5e1ce40f0c3c8f67591aff17756930d7a18ceb1a commit)
-
- 06 7月, 2006 1 次提交
-
-
由 Trond Myklebust 提交于
The introduction of the FLUSH_INVALIDATE argument to nfs_sync_inode_wait() does not clear the nr_unstable page state counter for pages that are being released. Also fix a longstanding similar bug when nfs_commit_list() fails. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
- 01 7月, 2006 3 次提交
-
-
由 Christoph Lameter 提交于
Conversion of nr_unstable to a per zone counter We need to do some special modifications to the nfs code since there are multiple cases of disposition and we need to have a page ref for proper accounting. This converts the last critical page state of the VM and therefore we need to remove several functions that were depending on GET_PAGE_STATE_LAST in order to make the kernel compile again. We are only left with event type counters in page state. [akpm@osdl.org: bugfixes] Signed-off-by: NChristoph Lameter <clameter@sgi.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Christoph Lameter 提交于
This makes nr_dirty a per zone counter. Looping over all processors is avoided during writeback state determination. The counter aggregation for nr_dirty had to be undone in the NFS layer since we summed up the page counts from multiple zones. Someone more familiar with NFS should probably review what I have done. [akpm@osdl.org: bugfix] Signed-off-by: NChristoph Lameter <clameter@sgi.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Jörn Engel 提交于
Signed-off-by: NJörn Engel <joern@wohnheim.fh-wedel.de> Signed-off-by: NAdrian Bunk <bunk@stusta.de>
-
- 28 6月, 2006 1 次提交
-
-
由 David Brownell 提交于
Builds on ARM report link problems with common configurations like statically linked NFS (for nfsroot). The symptom is that __init section code references __exit section code; that won't work since the exit sections are discarded (since they can never be called). The best fix for these particular cases would be an "__init_or_exit" section annotation. Signed-off-by: NDavid Brownell <dbrownell@users.sourceforge.net> Acked-by: NTrond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 09 6月, 2006 3 次提交
-
-
由 David Howells 提交于
As fs/nfs/inode.c is rather large, heterogenous and unwieldy, the attached patch splits it up into a number of files: (*) fs/nfs/inode.c Strictly inode specific functions. (*) fs/nfs/super.c Superblock management functions for NFS and NFS4, normal access, clones and referrals. The NFS4 superblock functions _could_ move out into a separate conditionally compiled file, but it's probably not worth it as there're so many common bits. (*) fs/nfs/namespace.c Some namespace-specific functions have been moved here. (*) fs/nfs/nfs4namespace.c NFS4-specific namespace functions (this could be merged into the previous file). This file is conditionally compiled. (*) fs/nfs/internal.h Inter-file declarations, plus a few simple utility functions moved from fs/nfs/inode.c. Additionally, all the in-.c-file externs have been moved here, and those files they were moved from now includes this file. For the most part, the functions have not been changed, only some multiplexor functions have changed significantly. I've also: (*) Added some extra banner comments above some functions. (*) Rearranged the function order within the files to be more logical and better grouped (IMO), though someone may prefer a different order. (*) Reduced the number of #ifdefs in .c files. (*) Added missing __init and __exit directives. Signed-Off-By: NDavid Howells <dhowells@redhat.com>
-
由 Trond Myklebust 提交于
In the case of a call to truncate_inode_pages(), we should really try to cancel any pending writes on the page. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Chuck Lever 提交于
Clean up use of page_array, and fix an off-by-one error noticed by Tom Talpey which causes kmalloc calls in cases where using the page_array is sufficient. Test plan: Normal client functional testing with r/wsize=32768. Signed-off-by: NChuck Lever <cel@netapp.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
- 27 3月, 2006 1 次提交
-
-
由 Matthew Dobson 提交于
Modify well over a dozen mempool users to call mempool_create_slab_pool() rather than calling mempool_create() with extra arguments, saving about 30 lines of code and increasing readability. Signed-off-by: NMatthew Dobson <colpatch@us.ibm.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 21 3月, 2006 10 次提交
-
-
由 Trond Myklebust 提交于
Kudos to Neil Brown for spotting the problem: "in nfs_sync_inode, there is effectively the sequence: nfs_wait_on_requests nfs_flush_inode nfs_commit_inode This seems a bit racy to me as if the only requests are on the ->commit list, and nfs_commit_inode is called separately after nfs_wait_on_requests completes, and before nfs_commit_inode start (say: by nfs_write_inode) then none of these function will return >0, yet there will be some pending request that aren't waited for." The solution is to search for requests to wait upon, search for dirty requests, and search for uncommitted requests while holding the nfsi->req_lock The patch also cleans up nfs_sync_inode(), getting rid of the redundant FLUSH_WAIT flag. It turns out that we were always setting it. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Trond Myklebust 提交于
Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Trond Myklebust 提交于
We don't need to set PG_private for readahead pages, since they never get unlocked while I/O is in progress. However there is a small race in nfs_readpage_release() whereby the page may be unlocked, and have PG_private set. Fix is to have PG_private set only for the case of writes... Also fix a bug in nfs_clear_page_writeback(): Don't attempt to clear the radix_tree tag if we've already deleted the radix tree entry. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Trond Myklebust 提交于
Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Trond Myklebust 提交于
We need to use nfs_commit_alloc() in fs/nfs/direct.c. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Chuck Lever 提交于
Duplicate infrastructure from direct read path that will allow write path to generate multiple write requests concurrently. This will enable us to add support for aio in this path. Temporarily we will lose the ability to do UNSTABLE writes followed by a COMMIT in the direct write path. However, all applications I am aware of that use NFS O_DIRECT currently write in relatively small chunks, so this should not be inconvenient in any way. Test plan: Millions of fsx-odirect ops. OraSim. Signed-off-by: NChuck Lever <cel@netapp.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Trond Myklebust 提交于
This patch inverts the callback hierarchy for NFS write calls. Instead of having the NFSv2/v3/v4-specific code set up the RPC callback ops, we allow the original caller to do so. This allows for more flexibility w.r.t. how to set up and tear down the nfs_write_data structure while still allowing the NFSv3/v4 code to perform error handling. The greater flexibility is needed by the asynchronous O_DIRECT code, which wants to be able to hold on to the original nfs_write_data structures after the WRITE RPC call has completed in order to be able to replay them if the COMMIT call determines that the server has rebooted. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Chuck Lever 提交于
Invoke the byte and event counter macros where we want to count bytes and events. Clean-up: fix a possible NULL dereference in nfs_lock, and simplify nfs_file_open. Test-plan: fsx and iozone on UP and SMP systems, with and without pre-emption. Watch for memory overwrite bugs, and performance loss (significantly more CPU required per op). Signed-off-by: NChuck Lever <cel@netapp.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Eric Sesterhenn 提交于
this converts fs/nfs to kzalloc() usage. compile tested with make allyesconfig Signed-off-by: NEric Sesterhenn <snakebyte@gmx.de> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Neil Brown 提交于
I've been reading through fs/nfs/write.c trying to track down a bug that seems to be related to pages loosing a refcount and getting freed too early (you interested in detail??) and I spotted a little bug which the following patch should fix. Signed-off-by: NNeil Brown <neilb@suse.de> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
- 07 1月, 2006 4 次提交
-
-
由 Trond Myklebust 提交于
The SuS states that a call to write() will cause mtime to be updated on the file. In order to satisfy that requirement, we need to flush out any cached writes in nfs_getattr(). Speed things up slightly by not committing the writes. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Chuck Lever 提交于
Most NFS server implementations allow up to 64KB reads and writes on the wire. The Solaris NFS server allows up to a megabyte, for instance. Now the Linux NFS client supports transfer sizes up to 1MB, too. This will help reduce protocol and context switch overhead on read/write intensive NFS workloads, and support larger atomic read and write operations on servers that support them. Test-plan: Connectathon and iozone on mount point with wsize=rsize>32768 over TCP. Tests with NFS over UDP to verify the maximum RPC payload size cap. Signed-off-by: NChuck Lever <cel@netapp.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Trond Myklebust 提交于
Shrink the RPC task structure. Instead of storing separate pointers for task->tk_exit and task->tk_release, put them in a structure. Also pass the user data pointer as a parameter instead of passing it via task->tk_calldata. This enables us to nest callbacks. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Trond Myklebust 提交于
Ensure that we always initiate flushing of data before we exit a single-page ->writepage() call. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
- 04 12月, 2005 1 次提交
-
-
由 Trond Myklebust 提交于
Ensure that we use set_page_writeback() in the appropriate places to help the VM in keeping its page radix_tree in sync. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-