- 14 7月, 2005 4 次提交
-
-
由 Steve Dickson 提交于
Allow the setting of NLM timeouts and grace periods through the proc and sysclt interfaces on x86_64 architectures Signed-off-by: NSteve Dickson <steved@redhat.com> Acked-by: NTrond Myklebust <trond.myklebust@fys.uio.no> Cc: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Anton Altaparmakov 提交于
Something has changed in the core kernel such that we now get concurrent inode write outs, one e.g via pdflush and one via sys_sync or whatever. This causes a nasty deadlock in ntfs. The only clean solution unfortunately requires a minor vfs api extension. First the deadlock analysis: Prerequisive knowledge: NTFS has a file $MFT (inode 0) loaded at mount time. The NTFS driver uses the page cache for storing the file contents as usual. More interestingly this file contains the table of on-disk inodes as a sequence of MFT_RECORDs. Thus NTFS driver accesses the on-disk inodes by accessing the MFT_RECORDs in the page cache pages of the loaded inode $MFT. The situation: VFS inode X on a mounted ntfs volume is dirty. For same inode X, the ntfs_inode is dirty and thus corresponding on-disk inode, which is as explained above in a dirty PAGE_CACHE_PAGE belonging to the table of inodes ($MFT, inode 0). What happens: Process 1: sys_sync()/umount()/whatever... calls __sync_single_inode() for $MFT -> do_writepages() -> write_page for the dirty page containing the on-disk inode X, the page is now locked -> ntfs_write_mst_block() which clears PageUptodate() on the page to prevent anyone else getting hold of it whilst it does the write out (this is necessary as the on-disk inode needs "fixups" applied before the write to disk which are removed again after the write and PageUptodate is then set again). It then analyses the page looking for dirty on-disk inodes and when it finds one it calls ntfs_may_write_mft_record() to see if it is safe to write this on-disk inode. This then calls ilookup5() to check if the corresponding VFS inode is in icache(). This in turn calls ifind() which waits on the inode lock via wait_on_inode whilst holding the global inode_lock. Process 2: pdflush results in a call to __sync_single_inode for the same VFS inode X on the ntfs volume. This locks the inode (I_LOCK) then calls write-inode -> ntfs_write_inode -> map_mft_record() -> read_cache_page() of the page (in page cache of table of inodes $MFT, inode 0) containing the on-disk inode. This page has PageUptodate() clear because of Process 1 (see above) so read_cache_page() blocks when tries to take the page lock for the page so it can call ntfs_read_page(). Thus Process 1 is holding the page lock on the page containing the on-disk inode X and it is waiting on the inode X to be unlocked in ifind() so it can write the page out and then unlock the page. And Process 2 is holding the inode lock on inode X and is waiting for the page to be unlocked so it can call ntfs_readpage() or discover that Process 1 set PageUptodate() again and use the page. Thus we have a deadlock due to ifind() waiting on the inode lock. The only sensible solution: NTFS does not care whether the VFS inode is locked or not when it calls ilookup5() (it doesn't use the VFS inode at all, it just uses it to find the corresponding ntfs_inode which is of course attached to the VFS inode (both are one single struct); and it uses the ntfs_inode which is subject to its own locking so I_LOCK is irrelevant) hence we want a modified ilookup5_nowait() which is the same as ilookup5() but it does not wait on the inode lock. Without such functionality I would have to keep my own ntfs_inode cache in the NTFS driver just so I can find ntfs_inodes independent of their VFS inodes which would be slow, memory and cpu cycle wasting, and incredibly stupid given the icache already exists in the VFS. Below is a patch that does the ilookup5_nowait() implementation in fs/inode.c and exports it. ilookup5_nowait.diff: Introduce ilookup5_nowait() which is basically the same as ilookup5() but it does not wait on the inode's lock (i.e. it omits the wait_on_inode() done in ifind()). This is needed to avoid a nasty deadlock in NTFS. Signed-off-by: NAnton Altaparmakov <aia21@cantab.net> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Robert Love 提交于
Really simple, basic cleanup. Signed-off-by: NRobert Love <rml@novell.com> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Robert Love 提交于
This moves the inotify sysctl knobs to "/proc/sys/fs/inotify" from "/proc/sys/fs". Also some related cleanup. Signed-off-by: NRobert Love <rml@novell.com> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 13 7月, 2005 16 次提交
-
-
由 Ian Dall 提交于
It turns out this is due to some inverted logic in xattr.c Signed-off-by: NDave Kleikamp <shaggy@austin.ibm.com>
-
由 Dave Kleikamp 提交于
All of the different xattr namespaces have different rules. user.* and ACL's are not allowed on symlinks, and since these were the first xattrs implemented, I assumed there was no need to support xattrs on symlinks. This one-line patch should fix it. Signed-off-by: NDave Kleikamp <shaggy@austin.ibm.com>
-
由 Robert Love 提交于
inotify is intended to correct the deficiencies of dnotify, particularly its inability to scale and its terrible user interface: * dnotify requires the opening of one fd per each directory that you intend to watch. This quickly results in too many open files and pins removable media, preventing unmount. * dnotify is directory-based. You only learn about changes to directories. Sure, a change to a file in a directory affects the directory, but you are then forced to keep a cache of stat structures. * dnotify's interface to user-space is awful. Signals? inotify provides a more usable, simple, powerful solution to file change notification: * inotify's interface is a system call that returns a fd, not SIGIO. You get a single fd, which is select()-able. * inotify has an event that says "the filesystem that the item you were watching is on was unmounted." * inotify can watch directories or files. Inotify is currently used by Beagle (a desktop search infrastructure), Gamin (a FAM replacement), and other projects. See Documentation/filesystems/inotify.txt. Signed-off-by: NRobert Love <rml@novell.com> Cc: John McCutchan <ttb@tentacle.dhs.org> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Linus Torvalds 提交于
This was a pure indentation change, using: scripts/Lindent fs/reiserfs/*.c include/linux/reiserfs_*.h to make reiserfs match the regular Linux indentation style. As Jeff Mahoney <jeffm@suse.com> writes: The ReiserFS code is a mix of a number of different coding styles, sometimes different even from line-to-line. Since the code has been relatively stable for quite some time and there are few outstanding patches to be applied, it is time to reformat the code to conform to the Linux style standard outlined in Documentation/CodingStyle. This patch contains the result of running scripts/Lindent against fs/reiserfs/*.c and include/linux/reiserfs_*.h. There are places where the code can be made to look better, but I'd rather keep those patches separate so that there isn't a subtle by-hand hand accident in the middle of a huge patch. To be clear: This patch is reformatting *only*. A number of patches may follow that continue to make the code more consistent with the Linux coding style. Hans wasn't particularly enthusiastic about these patches, but said he wouldn't really oppose them either. Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Jeff Mahoney 提交于
indent(1) doesn't know how to handle the "do not compile" error. It results in the item_ops array declaration being indented a tab stop in when it should not be. This patch replaces it with a #error that describes why it's failing. Signed-off-by: NJeff Mahoney <jeffm@suse.com> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Brian King 提交于
While fixing an oops in the st driver in a dirty release path, I encountered an oops in cdev_put for cdevs allocated using cdev_alloc. If cdev_del is called when the cdev kobject still has an open user, when the last cdev_put is called, the cdev_put will call kobject_put, which will end up ultimately releasing the cdev in cdev_dynamic_release. Patch fixes the oops by preventing cdev_put from accessing freed memory. Signed-off-by: NBrian King <brking@us.ibm.com> Cc: <viro@parcelfarce.linux.theplanet.co.uk> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Jan Kara 提交于
Restore old set of ext2 mount options when remounting of a filesystem fails. Signed-off-by: NJan Kara <jack@suse.cz> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Jan Kara 提交于
Fix a problem with ext3 mount option parsing. When remount of a filesystem fails, old options are now restored. Signed-off-by: NJan Kara <jack@suse.cz> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Roland McGrath 提交于
When a noninitial thread does exec, it becomes the new group leader. If there is a ITIMER_REAL timer running, it points at the old group leader and when it fires it can follow a stale pointer. The timer data needs to be reset to point at the exec'ing thread that is becoming the group leader. This has to synchronize with any concurrent firing of the timer to make sure that it_real_fn can never run when the data points to a thread that might have been reaped already. Signed-off-by: NRoland McGrath <roland@redhat.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Artem B. Bityuckiy 提交于
Bug symptoms ~~~~~~~~~~~~ For the same inode VFS calls read_inode() twice and doesn't call clear_inode() between the two read_inode() invocations. Bug description ~~~~~~~~~~~~~~~ Suppose we have an inode which has zero reference count but is still in the inode cache. Suppose kswapd invokes shrink_icache_memory() to free some RAM. In prune_icache() inodes are removed from i_hash. prune_icache () is then going to call clear_inode(), but drops the inode_lock spinlock before this. If in this moment another task calls iget() for an inode which was just removed from i_hash by prune_icache(), then iget() invokes read_inode() for this inode, because it is *already removed* from i_hash. The end result is: we call iget(#N) then iput(#N); inode #N has zero i_count now and is in the inode cache; kswapd starts. kswapd removes the inode #N from i_hash ans is preempted; we call iget(#N) again; read_inode() is invoked as the result; but we expect clear_inode() before. Fix ~~~~~~~ To fix the bug I remove inodes from i_hash later, when clear_inode() is actually called. I remove them from i_hash under spinlock protection. Since the i_state is set to I_FREEING, it is safe to do this. The others will sleep waiting for the inode state change. I also postpone removing inodes from i_sb_list. It is not compulsory to do so but I do it for readability reasons. Inodes are added/removed to the lists together everywhere in the code and there is no point to change this rule. This is harmless because the only user of i_sb_list which somehow may interfere with me (invalidate_list()) is excluded by the iprune_sem mutex. The same race is possible in invalidate_list() so I do the same for it. Acked-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Miklos Szeredi 提交于
This patch fixes queer behavior in __wait_on_freeing_inode(). If I_LOCK was not set it called yield(), effectively busy waiting for the removal of the inode from the hash. This change was introduced within "[PATCH] eliminate inode waitqueue hashtable" Changeset 1.1938.166.16 last october by wli. The solution is to restore the old behavior, of unconditionally waiting on the waitqueue. It doesn't matter if I_LOCK is not set initally, the task will go to sleep, and wake up when wake_up_inode() is called from generic_delete_inode() after removing the inode from the hash chain. Comment is also updated to better reflect current behavior. This condition is very hard to trigger normally (simultaneous clear_inode() with iget()) so probably only heavy stress testing can reveal any change of behavior. Signed-off-by: NMiklos Szeredi <miklos@szeredi.hu> Acked-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Todd Poynor 提交于
Signed-off-by: NTodd Poynor <tpoynor@mvista.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Artem B. Bityuckiy 提交于
In case of a mount error locks might be uninitialized but accessed by the resulting call to jffs2_kill_sb(). Signed-off-by: NArtem B. Bityuckiy <dedekind@infradead.org> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Artem B. Bityuckiy 提交于
We recently changed the method of collecting and sorting of tmp_dnode objects to use a temporary RB-tree instead of a temporary list. Rename function and update comments. Signed-off-by: NArtem B. Bityuckiy <dedekind@infradead.org> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Artem B. Bityuckiy 提交于
Signed-off-by: NArtem B. Bityuckiy <dedekind@infradead.org> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Artem B. Bityuckiy 提交于
Signed-off-by: NArtem B. Bityuckiy <dedekind@infradead.org> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 08 7月, 2005 20 次提交
-
-
由 NeilBrown 提交于
After discussion at the recent NFSv4 bake-a-thon, I realized that my assumption that NFS4_FH_PERSISTENT required filehandles to persist was a misreading of the spec. This also fixes an interoperability problem with the Solaris client. Signed-off-by: NJ. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: NNeil Brown <neilb@cse.unsw.edu.au> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 NeilBrown 提交于
We shouldn't be allowing, e.g., write locks on files not open for read. To enforce this, we add a pointer from the lock stateid back to the open stateid it came from, so that the check will continue to be correct even after the open is upgraded or downgraded. Signed-off-by: NAndy Adamson <andros@citi.umich.edu> Signed-off-by: NJ. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: NNeil Brown <neilb@cse.unsw.edu.au> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 NeilBrown 提交于
As long as we're here, do some miscellaneous cleanup. Signed-off-by: NJ. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: NNeil Brown <neilb@cse.unsw.edu.au> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 NeilBrown 提交于
The handling of close_lru in preprocess_stateid_op was a source of some confusion here recently. Try to make the logic a little clearer, by renaming find_openstateowner_id to make its purpose clearer and untangling some unnecessarily complicated goto's. Signed-off-by: NJ. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: NNeil Brown <neilb@cse.unsw.edu.au> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 NeilBrown 提交于
nfs4_preprocess_seqid_op is called by NFSv4 operations that imply an implicit renewal of the client lease. Signed-off-by: NAndy Adamson <andros@citi.umich.edu> Signed-off-by: NJ. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: NNeil Brown <neilb@cse.unsw.edu.au> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 NeilBrown 提交于
from RFC 3530: "Share reservations are established by OPEN operations and by their nature are mandatory in that when the OPEN denies READ or WRITE operations, that denial results in such operations being rejected with error NFS4ERR_LOCKED." (Note that share_denied is really only a legal error for OPEN.) Signed-off-by: NAndy Adamson <andros@citi.umich.edu> Signed-off-by: NJ. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: NNeil Brown <neilb@cse.unsw.edu.au> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 NeilBrown 提交于
An OPEN from the same client/open stateowner requires a stateid update because of the share/deny access update. Signed-off-by: NAndy Adamson <andros@citi.umich.edu> Signed-off-by: NJ. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: NNeil Brown <neilb@cse.unsw.edu.au> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 NeilBrown 提交于
We're insisting that the lock sequence id field passed in the open_to_lockowner struct always be zero. This is probably thanks to the sentence in rfc3530: "The first request issued for any given lock_owner is issued with a sequence number of zero." But there doesn't seem to be any problem with allowing initial sequence numbers other than zero. And currently this is causing lock reclaims from the Linux client to fail. In the spirit of "be liberal in what you accept, conservative in what you send", we'll relax the check (and patch the Linux client as well). Signed-off-by: NJ. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: NNeil Brown <neilb@cse.unsw.edu.au> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 NeilBrown 提交于
Add some comments on the use of so_seqid, in an attempt to avoid some of the confusion outlined in the previous patch.... Signed-off-by: NJ. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: NNeil Brown <neilb@cse.unsw.edu.au> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 NeilBrown 提交于
The sequence number we store in the sequence id is the last one we received from the client. So on the next operation we'll check that the client gives us the next higher number. We increment sequence id's at the last moment, in encode, so that we're sure of knowing the right error return. (The decision to increment the sequence id depends on the exact error returned.) However on the *first* use of a sequence number, if we set the sequence number to the one received from the client and then let the increment happen on encode, we'll be left with a sequence number one to high. For that reason, ENCODE_SEQID_OP_TAIL only increments the sequence id on *confirmed* stateowners. This creates a problem for open reclaims, which are confirmed on first use. Therefore the open reclaim code, as a special exception, *decrements* the sequence id, cancelling out the undesired increment on encode. But this prevents the sequence id from ever being incremented in the case where multiple reclaims are sent with the same openowner. Yuch! We could add another exception to the open reclaim code, decrementing the sequence id only if this is the first use of the open owner. But it's simpler by far to modify the meaning of the op_seqid field: instead of representing the previous value sent by the client, we take op_seqid, after encoding, to represent the *next* sequence id that we expect from the client. This eliminates the need for special-case handling of the first use of a stateowner. Signed-off-by: NJ. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: NNeil Brown <neilb@cse.unsw.edu.au> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 NeilBrown 提交于
Yeah, it's trivial, but this drives me up the wall.... Signed-off-by: NJ. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: NNeil Brown <neilb@cse.unsw.edu.au> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 NeilBrown 提交于
A misreading of the spec lead us to convert all errors on open and lock reclaims to RECLAIM_BAD. This causes problems--for example, a reboot within the grace period could lead to reclaims with stale stateid's, and we'd like to return STALE errors in those cases. What rfc3530 actually says about RECLAIM_BAD: "The reclaim provided by the client does not match any of the server's state consistency checks and is bad." I'm assuming that "state consistency checks" refers to checks for consistency with the state recorded to stable storage, and that the error should be reserved for that case. Signed-off-by: NJ. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: NNeil Brown <neilb@cse.unsw.edu.au> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 NeilBrown 提交于
A GRACE or NOGRACE response to a lock request should also bump the sequence id. So we delay the handling of grace period errors till after we've found the relevant owner. Signed-off-by: NJ. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: NNeil Brown <neilb@cse.unsw.edu.au> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 NeilBrown 提交于
The GRACE and NOGRACE errors should bump the sequence id on open. So we delay the handling of these errors until nfsd4_process_open2, at which point we've set the open owner, so the encode routine will be able to bump the sequence id. Signed-off-by: NJ. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: NNeil Brown <neilb@cse.unsw.edu.au> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 NeilBrown 提交于
We oops in list_for_each_entry(), because release_stateowner frees something on the list we're traversing. Signed-off-by: NAndy Adamson <andros@citi.umich.edu> Signed-off-by: NJ. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: NNeil Brown <neilb@cse.unsw.edu.au> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 NeilBrown 提交于
Make sure we don't try to delete client recovery directories multiple times; fixes some spurious error messages. Signed-off-by: NJ. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: NNeil Brown <neilb@cse.unsw.edu.au> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 NeilBrown 提交于
Oops, this lookup_one_len needs the i_sem. Signed-off-by: NJ. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: NNeil Brown <neilb@cse.unsw.edu.au> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 NeilBrown 提交于
We need to fsync the recovery directory after writing to it, but we weren't doing this correctly. (For example, we weren't taking the i_sem when calling ->fsync().) Just reuse the existing nfsd fsync code instead. Signed-off-by: NJ. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: NNeil Brown <neilb@cse.unsw.edu.au> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 NeilBrown 提交于
We need to remove the recovery directory here too. (This chunk just got lost somehow in the process of commuting the reboot recovery patches past the other patches.) Signed-off-by: NJ. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: NNeil Brown <neilb@cse.unsw.edu.au> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Miklos Szeredi 提交于
This patch renames _mntput() to something a little more descriptive: mntput_no_expire(). Signed-off-by: NMiklos Szeredi <miklos@szeredi.hu> Acked-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-