- 31 3月, 2014 13 次提交
-
-
由 Jeff Layton 提交于
Due to some unfortunate history, POSIX locks have very strange and unhelpful semantics. The thing that usually catches people by surprise is that they are dropped whenever the process closes any file descriptor associated with the inode. This is extremely problematic for people developing file servers that need to implement byte-range locks. Developers often need a "lock management" facility to ensure that file descriptors are not closed until all of the locks associated with the inode are finished. Additionally, "classic" POSIX locks are owned by the process. Locks taken between threads within the same process won't conflict with one another, which renders them useless for synchronization between threads. This patchset adds a new type of lock that attempts to address these issues. These locks conflict with classic POSIX read/write locks, but have semantics that are more like BSD locks with respect to inheritance and behavior on close. This is implemented primarily by changing how fl_owner field is set for these locks. Instead of having them owned by the files_struct of the process, they are instead owned by the filp on which they were acquired. Thus, they are inherited across fork() and are only released when the last reference to a filp is put. These new semantics prevent them from being merged with classic POSIX locks, even if they are acquired by the same process. These locks will also conflict with classic POSIX locks even if they are acquired by the same process or on the same file descriptor. The new locks are managed using a new set of cmd values to the fcntl() syscall. The initial implementation of this converts these values to "classic" cmd values at a fairly high level, and the details are not exposed to the underlying filesystem. We may eventually want to push this handing out to the lower filesystem code but for now I don't see any need for it. Also, note that with this implementation the new cmd values are only available via fcntl64() on 32-bit arches. There's little need to add support for legacy apps on a new interface like this. Signed-off-by: NJeff Layton <jlayton@redhat.com>
-
由 Jeff Layton 提交于
It's not really feasible to do deadlock detection with FL_FILE_PVT locks since they aren't owned by a single task, per-se. Deadlock detection also tends to be rather expensive so just skip it for these sorts of locks. Also, add a FIXME comment about adding more limited deadlock detection that just applies to ro -> rw upgrades, per Andy's request. Cc: Andy Lutomirski <luto@amacapital.net> Signed-off-by: NJeff Layton <jlayton@redhat.com>
-
由 Jeff Layton 提交于
Once we introduce file private locks, we'll need to know what cmd value was used, as that affects the ownership and whether a conflict would arise. Signed-off-by: NJeff Layton <jlayton@redhat.com>
-
由 Jeff Layton 提交于
FL_FILE_PVT locks are no longer tied to a particular pid, and are instead inheritable by child processes. Report a l_pid of '-1' for these sorts of locks since the pid is somewhat meaningless for them. This precedent comes from FreeBSD. There, POSIX and flock() locks can conflict with one another. If fcntl(F_GETLK, ...) returns a lock set with flock() then the l_pid member cannot be a process ID because the lock is not held by a process as such. Acked-by: NJ. Bruce Fields <bfields@fieldses.org> Signed-off-by: NJeff Layton <jlayton@redhat.com>
-
由 Jeff Layton 提交于
In a later patch, we'll be adding a new type of lock that's owned by the struct file instead of the files_struct. Those sorts of locks will be flagged with a new FL_FILE_PVT flag. Report these types of locks as "FLPVT" in /proc/locks to distinguish them from "classic" POSIX locks. Acked-by: NJ. Bruce Fields <bfields@fieldses.org> Signed-off-by: NJeff Layton <jlayton@redhat.com>
-
由 Jeff Layton 提交于
This function currently removes leases in addition to flock locks and in a later patch we'll have it deal with file-private locks too. Rename it to locks_remove_file to indicate that it removes locks that are associated with a particular struct file, and not just flock locks. Acked-by: NJ. Bruce Fields <bfields@fieldses.org> Signed-off-by: NJeff Layton <jlayton@redhat.com>
-
由 Jeff Layton 提交于
Move this check into flock64_to_posix_lock instead of duplicating it in two places. This also fixes a minor wart in the code where we continue referring to the struct flock after converting it to struct file_lock. Acked-by: NJ. Bruce Fields <bfields@fieldses.org> Signed-off-by: NJeff Layton <jlayton@redhat.com>
-
由 J. Bruce Fields 提交于
In the 32-bit case fcntl assigns the 64-bit f_pos and i_size to a 32-bit off_t. The existing range checks also seem to depend on signed arithmetic wrapping when it overflows. In practice maybe that works, but we can be more careful. That also allows us to make a more reliable distinction between -EINVAL and -EOVERFLOW. Note that in the 32-bit case SEEK_CUR or SEEK_END might allow the caller to set a lock with starting point no longer representable as a 32-bit value. We could return -EOVERFLOW in such cases, but the locks code is capable of handling such ranges, so we choose to be lenient here. The only problem is that subsequent GETLK calls on such a lock will fail with EOVERFLOW. While we're here, do some cleanup including consolidating code for the flock and flock64 cases. Signed-off-by: NJ. Bruce Fields <bfields@fieldses.org> Signed-off-by: NJeff Layton <jlayton@redhat.com>
-
由 Jeff Layton 提交于
A leftover lock on the list is surely a sign of a problem of some sort, but it's not necessarily a reason to panic the box. Instead, just log a warning with some info about the lock, and then delete it like we would any other lock. In the event that the filesystem declares a ->lock f_op, we may end up leaking something, but that's generally preferable to an immediate panic. Acked-by: NJ. Bruce Fields <bfields@fieldses.org> Signed-off-by: NJeff Layton <jlayton@redhat.com>
-
由 Jeff Layton 提交于
...to make sparse happy. Acked-by: NJ. Bruce Fields <bfields@fieldses.org> Signed-off-by: NJeff Layton <jlayton@redhat.com>
-
由 Jeff Layton 提交于
It's best to let the compiler decide that. Acked-by: NJ. Bruce Fields <bfields@fieldses.org> Reported-by: NStephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: NJeff Layton <jlayton@redhat.com>
-
由 Jeff Layton 提交于
Acked-by: NJ. Bruce Fields <bfields@fieldses.org> Signed-off-by: NJeff Layton <jlayton@redhat.com>
-
由 Jeff Layton 提交于
As Al Viro points out, there is an unlikely, but possible race between opening a file and setting a lease on it. generic_add_lease is done with the i_lock held, but the inode->i_flock check in break_lease is lockless. It's possible for another task doing an open to do the entire pathwalk and call break_lease between the point where generic_add_lease checks for a conflicting open and adds the lease to the list. If this occurs, we can end up with a lease set on the file with a conflicting open. To guard against that, check again for a conflicting open after adding the lease to the i_flock list. If the above race occurs, then we can simply unwind the lease setting and return -EAGAIN. Because we take dentry references and acquire write access on the file before calling break_lease, we know that if the i_flock list is empty when the open caller goes to check it then the necessary refcounts have already been incremented. Thus the additional check for a conflicting open will see that there is one and the setlease call will fail. Cc: Bruce Fields <bfields@fieldses.org> Cc: David Howells <dhowells@redhat.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Reported-by: NAl Viro <viro@ZenIV.linux.org.uk> Signed-off-by: NJeff Layton <jlayton@redhat.com> Signed-off-by: NJ. Bruce Fields <bfields@fieldses.org>
-
- 13 11月, 2013 1 次提交
-
-
由 Dan Carpenter 提交于
We should unlock here before returning. Fixes: df4e8d2c ('locks: implement delegations') Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 09 11月, 2013 2 次提交
-
-
由 J. Bruce Fields 提交于
Implement NFSv4 delegations at the vfs level using the new FL_DELEG lock type. Note nfsd is the only delegation user and is only using read delegations. Warn on any attempt to set a write delegation for now. We'll come back to that case later. Acked-by: NJeff Layton <jlayton@redhat.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 J. Bruce Fields 提交于
For now FL_DELEG is just a synonym for FL_LEASE. So this patch doesn't change behavior. Next we'll modify break_lease to treat FL_DELEG leases differently, to account for the fact that NFSv4 delegations should be broken in more situations than Windows oplocks. Acked-by: NJeff Layton <jlayton@redhat.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 25 10月, 2013 1 次提交
-
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 08 7月, 2013 1 次提交
-
-
由 Jeff Layton 提交于
The file_lock_list is only used for /proc/locks. The vastly common case is for locks to be put onto the list and come off again, without ever being traversed. Help optimize for this use-case by moving to percpu hlist_head-s. At the same time, we can make the locking less contentious by moving to an lglock. When iterating over the lists for /proc/locks, we must take the global lock and then iterate over each CPU's list in turn. This change necessitates a new fl_link_cpu field to keep track of which CPU the entry is on. On x86_64 at least, this field is placed within an existing hole in the struct to avoid growing the size. Signed-off-by: NJeff Layton <jlayton@redhat.com> Acked-by: NJ. Bruce Fields <bfields@fieldses.org> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 05 7月, 2013 1 次提交
-
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 29 6月, 2013 12 次提交
-
-
由 Jeff Layton 提交于
There's no reason we have to protect the blocked_hash and file_lock_list with the same spinlock. With the tests I have, breaking it in two gives a barely measurable performance benefit, but it seems reasonable to make this locking as granular as possible. Signed-off-by: NJeff Layton <jlayton@redhat.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Jeff Layton 提交于
Currently, the hashing that the locking code uses to add these values to the blocked_hash is simply calculated using fl_owner field. That's valid in most cases except for server-side lockd, which validates the owner of a lock based on fl_owner and fl_pid. In the case where you have a small number of NFS clients doing a lot of locking between different processes, you could end up with all the blocked requests sitting in a very small number of hash buckets. Add a new lm_owner_key operation to the lock_manager_operations that will generate an unsigned long to use as the key in the hashtable. That function is only implemented for server-side lockd, and simply XORs the fl_owner and fl_pid. Signed-off-by: NJeff Layton <jlayton@redhat.com> Acked-by: NJ. Bruce Fields <bfields@fieldses.org> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Jeff Layton 提交于
Break up the blocked_list into a hashtable, using the fl_owner as a key. This speeds up searching the hash chains, which is especially significant for deadlock detection. Note that the initial implementation assumes that hashing on fl_owner is sufficient. In most cases it should be, with the notable exception being server-side lockd, which compares ownership using a tuple of the nlm_host and the pid sent in the lock request. So, this may degrade to a single hash bucket when you only have a single NFS client. That will be addressed in a later patch. The careful observer may note that this patch leaves the file_lock_list alone. There's much less of a case for turning the file_lock_list into a hashtable. The only user of that list is the code that generates /proc/locks, and it always walks the entire list. Signed-off-by: NJeff Layton <jlayton@redhat.com> Acked-by: NJ. Bruce Fields <bfields@fieldses.org> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Jeff Layton 提交于
Testing has shown that iterating over the blocked_list for deadlock detection turns out to be a bottleneck. In order to alleviate that, begin the process of turning it into a hashtable. We start by turning the fl_link into a hlist_node and the global lists into hlists. A later patch will do the conversion of the blocked_list to a hashtable. Signed-off-by: NJeff Layton <jlayton@redhat.com> Acked-by: NJ. Bruce Fields <bfields@fieldses.org> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Jeff Layton 提交于
Since we always hold the i_lock when inserting a new waiter onto the fl_block list, we can avoid taking the global lock at all if we find that it's empty when we go to wake up blocked waiters. Signed-off-by: NJeff Layton <jlayton@redhat.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Jeff Layton 提交于
Having a global lock that protects all of this code is a clear scalability problem. Instead of doing that, move most of the code to be protected by the i_lock instead. The exceptions are the global lists that the ->fl_link sits on, and the ->fl_block list. ->fl_link is what connects these structures to the global lists, so we must ensure that we hold those locks when iterating over or updating these lists. Furthermore, sound deadlock detection requires that we hold the blocked_list state steady while checking for loops. We also must ensure that the search and update to the list are atomic. For the checking and insertion side of the blocked_list, push the acquisition of the global lock into __posix_lock_file and ensure that checking and update of the blocked_list is done without dropping the lock in between. On the removal side, when waking up blocked lock waiters, take the global lock before walking the blocked list and dequeue the waiters from the global list prior to removal from the fl_block list. With this, deadlock detection should be race free while we minimize excessive file_lock_lock thrashing. Finally, in order to avoid a lock inversion problem when handling /proc/locks output we must ensure that manipulations of the fl_block list are also protected by the file_lock_lock. Signed-off-by: NJeff Layton <jlayton@redhat.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Jeff Layton 提交于
Move the fl_link list handling routines into a separate set of helpers. Also ensure that locks and requests are always put on global lists last (after fully initializing them) and are taken off before unintializing them. Signed-off-by: NJeff Layton <jlayton@redhat.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Jeff Layton 提交于
Signed-off-by: NJeff Layton <jlayton@redhat.com> Acked-by: NJ. Bruce Fields <bfields@fieldses.org> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Jeff Layton 提交于
Signed-off-by: NJeff Layton <jlayton@redhat.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Jeff Layton 提交于
Signed-off-by: NJeff Layton <jlayton@redhat.com> Acked-by: NJ. Bruce Fields <bfields@fieldses.org> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Jeff Layton 提交于
commit 66189be7 (CIFS: Fix VFS lock usage for oplocked files) exported the locks_delete_block symbol. There's already an exported helper function that provides this capability however, so make cifs use that instead and turn locks_delete_block back into a static function. Note that if fl->fl_next == NULL then this lock has already been through locks_delete_block(), so we should be OK to ignore an ENOENT error here and simply not retry the lock. Cc: Pavel Shilovsky <piastryyy@gmail.com> Signed-off-by: NJeff Layton <jlayton@redhat.com> Acked-by: NJ. Bruce Fields <bfields@fieldses.org> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Jeff Layton 提交于
Signed-off-by: NJeff Layton <jlayton@redhat.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 23 2月, 2013 1 次提交
-
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 27 9月, 2012 1 次提交
-
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 21 8月, 2012 1 次提交
-
-
由 Jeff Layton 提交于
The rules for fl_type are rather convoluted. Typically it's treated as holding specific values, except in the case of LOCK_MAND, in which case it can be or'ed with LOCK_READ|LOCK_WRITE. On some arches F_WRLCK == 2 and F_UNLCK == 3, so and'ing with F_WRLCK will also catch the F_UNLCK case. It's unlikely in either case here that we'd ever see F_UNLCK since those shouldn't end up on any lists, but it's still best to be consistent. Signed-off-by: NJeff Layton <jlayton@redhat.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 02 8月, 2012 1 次提交
-
-
由 J. Bruce Fields 提交于
In commit 3b6e2723 ("locks: prevent side-effects of locks_release_private before file_lock is initialized") we removed the last user of lm_release_private without removing the field itself. Signed-off-by: NJ. Bruce Fields <bfields@redhat.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 28 7月, 2012 1 次提交
-
-
由 J. Bruce Fields 提交于
No point putting something only used by one caller into common code. Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 27 7月, 2012 1 次提交
-
-
由 Filipe Brandenburger 提交于
When calling fcntl(fd, F_SETLEASE, lck) [with lck=F_WRLCK or F_RDLCK], the custom signal or owner (if any were previously set using F_SETSIG or F_SETOWN fcntls) would be reset when F_SETLEASE was called for the second time on the same file descriptor. This bug is a regression of 2.6.37 and is described here: https://bugzilla.kernel.org/show_bug.cgi?id=43336 This patch reverts a commit from Oct 2004 (with subject "nfs4 lease: move the f_delown processing") which originally introduced the lm_release_private callback. Signed-off-by: NFilipe Brandenburger <filbranden@gmail.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 24 7月, 2012 1 次提交
-
-
由 J. Bruce Fields 提交于
The only checks of the long argument passed to fcntl(fd,F_SETLEASE,.) are done after converting the long to an int. Thus some illegal values may be let through and cause problems in later code. [ They actually *don't* cause problems in mainline, as of Dave Jones's commit 8d657eb3 "Remove easily user-triggerable BUG from generic_setlease", but we should fix this anyway. And this patch will be necessary to fix real bugs on earlier kernels. ] Cc: stable@vger.kernel.org Signed-off-by: NJ. Bruce Fields <bfields@redhat.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 14 7月, 2012 1 次提交
-
-
由 Dave Jones 提交于
This can be trivially triggered from userspace by passing in something unexpected. kernel BUG at fs/locks.c:1468! invalid opcode: 0000 [#1] SMP RIP: 0010:generic_setlease+0xc2/0x100 Call Trace: __vfs_setlease+0x35/0x40 fcntl_setlease+0x76/0x150 sys_fcntl+0x1c6/0x810 system_call_fastpath+0x1a/0x1f Signed-off-by: NDave Jones <davej@redhat.com> Cc: stable@kernel.org # 3.2+ Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 30 5月, 2012 1 次提交
-
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-