Commits · 9d5b86ac13c573795525ecac6ed2db39ab23e2a8 · openanolis / cloud-kernel

16 Jul, 2017 2 commits

fs/locks: Remove fl_nspid and use fs-specific l_pid for remote locks · 9d5b86ac

Authored by Benjamin Coddington on Jul 16, 2017

Since commit c69899a1 "NFSv4: Update of VFS byte range lock must be
atomic with the stateid update", NFSv4 has been inserting locks in rpciod
worker context. The result is that the file_lock's fl_nspid is the
kworker's pid instead of the original userspace pid.

The fl_nspid is only used to represent the namespaced virtual pid number
when displaying locks or returning from F_GETLK. There's no reason to set
it for every inserted lock, since we can usually just look it up from
fl_pid. So, instead of looking up and holding struct pid for every lock,
let's just look up the virtual pid number from fl_pid when it is needed.
That means we can remove fl_nspid entirely.

The translation and presentation of fl_pid should handle the following four
cases:

1 - F_GETLK on a remote file with a remote lock:
In this case, the filesystem should determine the l_pid to return here.
Filesystems should indicate that the fl_pid represents a non-local pid
value that should not be translated by returning an fl_pid <= 0.

2 - F_GETLK on a local file with a remote lock:
This should be the l_pid of the lock manager process, and translated.

3 - F_GETLK on a remote file with a local lock, and
4 - F_GETLK on a local file with a local lock:
These should be the translated l_pid of the local locking process.

Fuse was already doing the correct thing by translating the pid into the
caller's namespace. With this change we must update fuse to translate
to init's pid namespace, so that the locks API can then translate from
init's pid namespace into the pid namespace of the caller.

With this change, the locks API will expect that if a filesystem returns
a remote pid as opposed to a local pid for F_GETLK, that remote pid will
be <= 0. This signifies that the pid is remote, and the locks API will
forgo translating that pid into the pid namespace of the local calling
process.

Finally, we convert remote filesystems to present remote pids using
negative numbers. Have lustre, 9p, ceph, cifs, and dlm negate the remote
pid returned for F_GETLK lock requests.

Since local pids will never be larger than PID_MAX_LIMIT (which is
currently defined as <= 4 million), but pid_t is a signed int, we
should have plenty of room to represent remote pids with negative
numbers if we assume that remote pid numbers are similarly limited.

If this is not the case, then we run the risk of having a remote pid
returned for which there is also a corresponding local pid. This is a
problem we have now, but this patch should reduce the chances of that
occurring, while also returning those remote pid numbers, for whatever
that may be worth.
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>

9d5b86ac
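The F_GETLK pid rules described in commit 9d5b86ac can be modelled in a few lines of userspace C. This is only an illustrative sketch: `report_lock_pid()` and `translate_to_caller_ns()` are hypothetical names, and the namespace translation is stubbed out as a fixed offset, which is an assumption made for the example rather than how the kernel resolves pids.

```c
#include <assert.h>
#include <sys/types.h>

/*
 * Hypothetical stand-in for a pid namespace lookup: maps a pid as seen
 * from the init namespace to the number the caller would see. Modelled
 * here as a fixed offset purely for illustration.
 */
pid_t translate_to_caller_ns(pid_t init_ns_pid, pid_t ns_offset)
{
	assert(ns_offset >= 0);
	pid_t vnr = init_ns_pid - ns_offset;
	return vnr > 0 ? vnr : 0;	/* 0: not visible to the caller */
}

/*
 * Model of the rule in the commit message: an fl_pid <= 0 marks a
 * remote pid chosen by the filesystem and is reported untranslated;
 * a positive fl_pid is local and is translated for the caller.
 */
pid_t report_lock_pid(pid_t fl_pid, pid_t ns_offset)
{
	if (fl_pid <= 0)
		return fl_pid;
	return translate_to_caller_ns(fl_pid, ns_offset);
}
```

With this convention a remote filesystem that negates the remote pid gets that value passed through unchanged, while local pids are always rewritten for the calling namespace.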

fs/locks: Use allocation rather than the stack in fcntl_getlk() · 52306e88

Authored by Benjamin Coddington on Jul 16, 2017

Struct file_lock is fairly large, so let's save some space on the stack by
using an allocation for struct file_lock in fcntl_getlk(), just as we do
for fcntl_setlk().
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>

52306e88

27 May, 2017 2 commits

fs/locks: pass kernel struct flock to fcntl_getlk/setlk · a75d30c7

Authored by Christoph Hellwig on May 27, 2017

This will make it easier to implement a sane compat fcntl syscall.

[ jlayton: fix undeclared identifiers in 32-bit fcntl64 syscall handler ]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>

a75d30c7

fs: locks: Fix some troubles at kernel-doc comments · 80b79dd0

Authored by Mauro Carvalho Chehab on May 27, 2017

There are a few syntax violations that cause outputs of
a few comments to not be properly parsed in ReST format.

No functional changes.
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>

80b79dd0

21 Apr, 2017 1 commit

locks: Set FL_CLOSE when removing flock locks on close() · 50f2112c

Authored by Benjamin Coddington on Apr 11, 2017

Set FL_CLOSE in fl_flags as in locks_remove_posix() when clearing locks.
NFS will check for this flag to ensure an unlock is sent in a following
patch.

Fuse handles flock and posix locks differently for FL_CLOSE, and so
requires a fixup to retain the existing behavior for flock.
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

50f2112c

25 Dec, 2016 1 commit

Replace <asm/uaccess.h> with <linux/uaccess.h> globally · 7c0f6ba6

Authored by Linus Torvalds on Dec 24, 2016

This was entirely automated, using the script by Al:

  PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>'
  sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \
        $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)

to do the replacement at the end of the merge window.
Requested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

7c0f6ba6

18 Oct, 2016 1 commit

locking, fs/locks: Add missing file_sem locks · 5f43086b

Authored by Peter Zijlstra on Oct 08, 2016

I overlooked a few code-paths that can lead to
locks_delete_global_locks().
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Jeff Layton <jlayton@poochiereds.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Bruce Fields <bfields@fieldses.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-fsdevel@vger.kernel.org
Cc: syzkaller <syzkaller@googlegroups.com>
Link: http://lkml.kernel.org/r/20161008081228.GF3142@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>

5f43086b

28 Sep, 2016 1 commit

fs: Replace current_fs_time() with current_time() · c2050a45

Authored by Deepa Dinamani on Sep 14, 2016

current_fs_time() uses struct super_block* as an argument.
As per Linus's suggestion, this is changed to take struct
inode* as a parameter instead. This is because the function
is primarily meant for vfs inode timestamps.
Also the function was renamed as per Arnd's suggestion.

Change all calls to current_fs_time() to use the new
current_time() function instead. current_fs_time() will be
deleted.
Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

c2050a45

22 Sep, 2016 3 commits

fs/locks: Use percpu_down_read_preempt_disable() · 87709e28

Authored by Peter Zijlstra on May 30, 2016

Avoid spurious preemption.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dave@stgolabs.net
Cc: der.herr@hofr.at
Cc: paulmck@linux.vnet.ibm.com
Cc: riel@redhat.com
Cc: tj@kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>

87709e28

fs/locks: Replace lg_local with a per-cpu spinlock · 7c3f654d

Authored by Peter Zijlstra on Jun 22, 2015

As Oleg suggested, replace file_lock_list with a structure containing
the hlist head and a spinlock.

This completely removes the lglock from fs/locks.
Suggested-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dave@stgolabs.net
Cc: der.herr@hofr.at
Cc: paulmck@linux.vnet.ibm.com
Cc: riel@redhat.com
Cc: tj@kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>

7c3f654d

fs/locks: Replace lg_global with a percpu-rwsem · aba37660

Authored by Peter Zijlstra on Jun 22, 2015

Replace the global part of the lglock with a percpu-rwsem.

Since fcl_lock is a spinlock and itself nests under i_lock, which too
is a spinlock we cannot acquire sleeping locks at
locks_{insert,remove}_global_locks().

We can however wrap all fcl_lock acquisitions with percpu_down_read
such that all invocations of locks_{insert,remove}_global_locks() have
that read lock held.

This allows us to replace the lg_global part of the lglock with the
write side of the rwsem.

In the absence of writers, percpu_{down,up}_read() are free of atomic
instructions. This further avoids the very long preempt-disable
regions caused by lglock on larger machines.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dave@stgolabs.net
Cc: der.herr@hofr.at
Cc: paulmck@linux.vnet.ibm.com
Cc: riel@redhat.com
Cc: tj@kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>

aba37660

16 Sep, 2016 2 commits

vfs: do get_write_access() on upper layer of overlayfs · 4d0c5ba2

Authored by Miklos Szeredi on Sep 16, 2016

The problem with writecount is: we want consistent handling of it for
underlying filesystems as well as overlayfs. Making sure i_writecount is
correct on all layers is difficult. Instead this patch makes sure that
when write access is acquired, it's always done on the underlying writable
layer (called the upper layer). We must also make sure to look at the
writecount on this layer when checking for conflicting leases.

Open for write already updates the upper layer's writecount. Leaving only
truncate.

For truncate copy up must happen before get_write_access() so that the
writecount is updated on the upper layer. Problem with this is if
something fails after that, then copy-up was done needlessly. E.g. if
break_lease() was interrupted. Probably not a big deal in practice.

Another interesting case is if there's a denywrite on a lower file that is
then opened for write or truncated. With this patch these will succeed,
which is somewhat counterintuitive. But I think it's still acceptable,
considering that the copy-up does actually create a different file, so the
old, denywrite mapping won't be touched.

On non-overlayfs d_real() is an identity function and d_real_inode() is
equivalent to d_inode() so this patch doesn't change behavior in that case.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Acked-by: Jeff Layton <jlayton@poochiereds.net>
Cc: "J. Bruce Fields" <bfields@fieldses.org>

4d0c5ba2

locks: fix file locking on overlayfs · c568d683

Authored by Miklos Szeredi on Sep 16, 2016

This patch allows flock, posix locks, ofd locks and leases to work
correctly on overlayfs.

Instead of using the underlying inode for storing lock context use the
overlay inode.  This allows locks to be persistent across copy-up.

This is done by introducing locks_inode() helper and using it instead of
file_inode() to get the inode in locking code.  For non-overlayfs the two
are equivalent, except for an extra pointer dereference in locks_inode().

Since lock operations are in "struct file_operations" we must also make
sure not to call underlying filesystem's lock operations.  Introduce a
super block flag MS_NOREMOTELOCK to this effect.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Acked-by: Jeff Layton <jlayton@poochiereds.net>
Cc: "J. Bruce Fields" <bfields@fieldses.org>

c568d683

19 Aug, 2016 1 commit

locks: Filter /proc/locks output on proc pid ns · d67fd44f

Authored by Nikolay Borisov on Aug 17, 2016

On busy container servers reading /proc/locks shows all the locks
created by all clients. This can cause large latency spikes. In my
case I observed lsof taking up to 5-10 seconds while processing around
50k locks. Fix this by limiting the locks shown only to those created
in the same pidns as the one the proc fs was mounted in. When reading
/proc/locks from the init_pid_ns proc instance then perform no
filtering.

[ jlayton: reformat comments for 80 columns ]
Signed-off-by: Nikolay Borisov <kernel@kyup.com>
Suggested-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>

d67fd44f

01 Jul, 2016 1 commit

locks: use file_inode() · 6343a212

Authored by Miklos Szeredi on Jul 01, 2016

(Another one for the f_path debacle.)

ltp fcntl33 testcase caused an Oops in selinux_file_send_sigiotask.

The reason is that generic_add_lease() used filp->f_path.dentry->inode
while all the others use file_inode().  This makes a difference for files
opened on overlayfs since the former will point to the overlay inode and the
latter to the underlying inode.

So generic_add_lease() added the lease to the overlay inode and
generic_delete_lease() removed it from the underlying inode.  When the file
was released the lease remained on the overlay inode's lock list, resulting
in use after free.
Reported-by: Eryu Guan <eguan@redhat.com>
Fixes: 4bacc9c9 ("overlayfs: Make f_path always point to the overlay and f_inode to the underlay")
Cc: <stable@vger.kernel.org>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

6343a212

23 Jan, 2016 1 commit

wrappers for ->i_mutex access · 5955102c

Authored by Al Viro on Jan 22, 2016

parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested},
inode_foo(inode) being mutex_foo(&inode->i_mutex).

Please, use those for access to ->i_mutex; over the coming cycle
->i_mutex will become rwsem, with ->lookup() done with it held
only shared.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

5955102c

09 Jan, 2016 5 commits

locks: rename __posix_lock_file to posix_lock_inode · b4d629a3

Authored by Jeff Layton on Jan 07, 2016

...a more descriptive name and we can drop the double underscore prefix.
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Acked-by: "J. Bruce Fields" <bfields@fieldses.org>

b4d629a3

locks: print more detail when there are leaked locks · e24dadab

Authored by Jeff Layton on Jan 06, 2016

Right now, we just get WARN_ON_ONCE, which is not particularly helpful.
Have it dump some info about the locks and the inode to make it easier
to track down leaked locks in the future.
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Acked-by: "J. Bruce Fields" <bfields@fieldses.org>

e24dadab

locks: pass inode pointer to locks_free_lock_context · f27a0fe0

Authored by Jeff Layton on Jan 07, 2016

...so we can print information about it if there are leaked locks.
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Acked-by: "J. Bruce Fields" <bfields@fieldses.org>

f27a0fe0

locks: sprinkle some tracepoints around the file locking code · 1890910f

Authored by Jeff Layton on Jan 06, 2016

Add some tracepoints around the POSIX locking code. These were useful
when tracking down problems when handling the race between setlk and
close.
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Acked-by: "J. Bruce Fields" <bfields@fieldses.org>

1890910f

locks: don't check for race with close when setting OFD lock · 0752ba80

Authored by Jeff Layton on Jan 08, 2016

We don't clean out OFD locks on close(), so there's no need to check
for a race with them here. They'll get cleaned out at the same time
that flock locks are.
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Acked-by: "J. Bruce Fields" <bfields@fieldses.org>

0752ba80

08 Jan, 2016 1 commit

locks: fix unlock when fcntl_setlk races with a close · 7f3697e2

Authored by Jeff Layton on Jan 07, 2016

Dmitry reported that he was able to reproduce the WARN_ON_ONCE that
fires in locks_free_lock_context when the flc_posix list isn't empty.

The problem turns out to be that we're basically rebuilding the
file_lock from scratch in fcntl_setlk when we discover that the setlk
has raced with a close. If the l_whence field is SEEK_CUR or SEEK_END,
then we may end up with fl_start and fl_end values that differ from
when the lock was initially set, if the file position or length of the
file has changed in the interim.

Fix this by just reusing the same lock request structure, and simply
override fl_type value with F_UNLCK as appropriate. That ensures that
we really are unlocking the lock that was initially set.

While we're there, make sure that we do pop a WARN_ON_ONCE if the
removal ever fails. Also return -EBADF in this event, since that's
what we would have returned if the close had happened earlier.

Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>
Fixes: c293621b (stale POSIX lock handling)
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Acked-by: "J. Bruce Fields" <bfields@fieldses.org>

7f3697e2
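The range drift described in commit 7f3697e2 can be reproduced with a small userspace model. The helper below, `resolve_lock_start()`, is a hypothetical name and not a kernel function; it only shows how a relative `l_start` becomes an absolute offset, assuming the usual SEEK_SET, SEEK_CUR and SEEK_END meanings.

```c
#include <assert.h>
#include <stdio.h>	/* SEEK_SET, SEEK_CUR, SEEK_END */

/*
 * Hypothetical helper: turn a (l_whence, l_start) pair into an absolute
 * file offset, given the current file position and the file size.
 * Returns -1 for an unknown l_whence.
 */
long long resolve_lock_start(int l_whence, long long l_start,
			     long long file_pos, long long file_size)
{
	switch (l_whence) {
	case SEEK_SET:
		return l_start;
	case SEEK_CUR:
		return file_pos + l_start;
	case SEEK_END:
		return file_size + l_start;
	default:
		return -1;
	}
}
```

If the request is resolved a second time after the file position or size has changed, the SEEK_CUR and SEEK_END cases produce a different offset, so an unlock built that way would miss the lock that was set. Reusing the already resolved request and only switching its type to F_UNLCK avoids the second resolution.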

18 Dec, 2015 1 commit

fs: make locks.c explicitly non-modular · 91899226

Authored by Paul Gortmaker on Dec 17, 2015

The Kconfig currently controlling compilation of this code is:

config FILE_LOCKING
     bool "Enable POSIX file locking API" if EXPERT

...meaning that it currently is not being built as a module by anyone.

Let's remove the couple traces of modularity so that when reading the
driver there is no doubt it is builtin-only.

Since module_init translates to device_initcall in the non-modular
case, the init ordering gets bumped to one level earlier when we
use the more appropriate fs_initcall here.  However we've made similar
changes before without any fallout and none is expected here either.

Cc: Jeff Layton <jlayton@poochiereds.net>
Acked-by: Jeff Layton <jlayton@poochiereds.net>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>

91899226

08 Dec, 2015 1 commit

locks: new locks_mandatory_area calling convention · acc15575

Authored by Christoph Hellwig on Dec 03, 2015

Pass a loff_t end for the last byte instead of the 32-bit count
parameter to allow full file clones even on 32-bit architectures.
While we're at it also simplify the read/write selection.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: J. Bruce Fields <bfields@fieldses.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

acc15575
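The calling convention change in commit acc15575 replaces a byte count with the offset of the last byte. A minimal sketch of that conversion, using a hypothetical helper `range_last_byte()` and plain 64-bit integers in place of loff_t, might look like this:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: convert a (start, length) pair into the offset
 * of the last byte covered, using 64-bit arithmetic throughout.
 * The caller must pass len > 0 and a range that fits in int64_t.
 */
int64_t range_last_byte(int64_t start, int64_t len)
{
	assert(start >= 0 && len > 0);
	assert(len - 1 <= INT64_MAX - start);	/* guard against overflow */
	return start + len - 1;
}
```

A range longer than 4 GiB, such as a clone of a whole large file, has a last-byte offset that no longer fits in 32 bits, which is why a 64-bit end offset is the more general interface.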

18 Nov, 2015 1 commit

locks: use list_first_entry_or_null() · 8ace5dfb

Authored by Geliang Tang on Nov 18, 2015

Simplify the code with list_first_entry_or_null().
Signed-off-by: Geliang Tang <geliangtang@163.com>
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>

8ace5dfb
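For readers unfamiliar with the helper named in commit 8ace5dfb, the following userspace sketch re-creates the idea behind list_first_entry_or_null(). All names here (`list_init`, `list_add_tail_node`, `container_of_`, `first_entry_or_null`, `struct lock_item`) are local to the example and are not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal userspace copy of a circular doubly linked list head. */
struct list_head {
	struct list_head *next, *prev;
};

void list_init(struct list_head *h)
{
	h->next = h->prev = h;
}

void list_add_tail_node(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

/* Recover the enclosing structure from a pointer to its list member. */
#define container_of_(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Same idea as list_first_entry_or_null(): NULL when the list is empty. */
#define first_entry_or_null(head, type, member) \
	((head)->next != (head) ? \
	 container_of_((head)->next, type, member) : (type *)NULL)

/* Hypothetical payload type for the demonstration. */
struct lock_item {
	int id;
	struct list_head link;
};
```

The helper folds the "is the list empty?" check and the "take the first entry" step into one expression, which is the simplification the commit message refers to.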

16 Nov, 2015 1 commit

locks: Allow disabling mandatory locking at compile time · 9e8925b6

Authored by Jeff Layton on Nov 16, 2015

Mandatory locking appears to be almost unused and buggy and there
appears no real interest in doing anything with it.  Since effectively
no one uses the code and since the code is buggy let's allow it to be
disabled at compile time.  I would just suggest removing the code but
undoubtedly that will break some piece of userspace code somewhere.

For the distributions that don't care about this piece of code
this gives a nice starting point to make mandatory locking go away.

Cc: Benjamin Coddington <bcodding@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Jeff Layton <jeff.layton@primarydata.com>
Cc: J. Bruce Fields <bfields@fieldses.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>

9e8925b6

23 Oct, 2015 3 commits

locks: cleanup posix_lock_inode_wait and flock_lock_inode_wait · 616fb38f

Authored by Benjamin Coddington on Oct 22, 2015

All callers use locks_lock_inode_wait() instead.
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>

616fb38f

Move locks API users to locks_lock_inode_wait() · 4f656367

Authored by Benjamin Coddington on Oct 22, 2015

Instead of having users check for FL_POSIX or FL_FLOCK to call the correct
locks API function, use the check within locks_lock_inode_wait().  This
allows for some later cleanup.
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>

4f656367

locks: introduce locks_lock_inode_wait() · e55c34a6

Authored by Benjamin Coddington on Oct 22, 2015

Users of the locks API commonly call either posix_lock_file_wait() or
flock_lock_file_wait() depending upon the lock type.  Add a new function
locks_lock_inode_wait() which will check and call the correct function for
the type of lock passed in.
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>

e55c34a6
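The dispatch that commit e55c34a6 describes can be sketched as a single switch on the lock's type flags. This is a userspace model under stated assumptions: the flag values are chosen for the example, `model_posix_lock()` and `model_flock_lock()` are hypothetical stand-ins that just return marker values, and returning -1 for an unexpected flag combination is a choice made here, not necessarily what the kernel does.

```c
#include <assert.h>

/* Type bits for the sketch, named after the kernel's FL_POSIX/FL_FLOCK. */
#define FL_POSIX 1
#define FL_FLOCK 2

/* Hypothetical stand-ins for the two type-specific lock routines. */
int model_posix_lock(void) { return 100; }
int model_flock_lock(void) { return 200; }

/*
 * Single entry point: inspect the lock's flags and call the matching
 * routine. Returns -1 if neither, or both, of the type bits are set.
 */
int model_lock_inode_wait(unsigned int fl_flags)
{
	switch (fl_flags & (FL_POSIX | FL_FLOCK)) {
	case FL_POSIX:
		return model_posix_lock();
	case FL_FLOCK:
		return model_flock_lock();
	default:
		return -1;
	}
}
```

Callers then pass the lock to one function and no longer need their own FL_POSIX or FL_FLOCK checks.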

15 Oct, 2015 1 commit

locks: Use more file_inode and fix a comment · 6ca7d910

Authored by Benjamin Coddington on Oct 15, 2015

Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>

6ca7d910

21 Sep, 2015 1 commit

fs: fix data races on inode->i_flctx · 128a3785

Authored by Dmitry Vyukov on Sep 21, 2015

locks_get_lock_context() uses cmpxchg() to install i_flctx.
cmpxchg() is a release operation which is correct. But it uses
a plain load to load i_flctx. This is incorrect. Subsequent loads
from i_flctx can hoist above the load of i_flctx pointer itself
and observe uninitialized garbage there. This in turn can lead
to corruption of ctx->flc_lock and other members.

Documentation/memory-barriers.txt explicitly requires the use of
a barrier in such a context:
"A load-load control dependency requires a full read memory barrier".

Use smp_load_acquire() in locks_get_lock_context() and in a bunch
of other functions that can proceed concurrently with
locks_get_lock_context().

The data race was found with KernelThreadSanitizer (KTSAN).
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>

128a3785
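The publish-and-read pattern discussed in commit 128a3785 has a close analogue in C11 atomics. The sketch below is a userspace approximation, not the kernel code: `struct lock_ctx` and `get_lock_ctx()` are hypothetical, and C11 acquire/release orderings are used as a rough counterpart to smp_load_acquire() and the release semantics of cmpxchg().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for a per-inode lock context. */
struct lock_ctx {
	int initialized;
};

/*
 * Return the context stored in *slot, installing a new one if the slot
 * is still empty. Returns NULL only if allocation fails.
 */
struct lock_ctx *get_lock_ctx(_Atomic(struct lock_ctx *) *slot)
{
	/* Acquire load: a non-NULL pointer implies its contents are visible. */
	struct lock_ctx *ctx = atomic_load_explicit(slot, memory_order_acquire);
	if (ctx)
		return ctx;

	struct lock_ctx *fresh = malloc(sizeof(*fresh));
	if (!fresh)
		return NULL;
	fresh->initialized = 1;

	/* Publish with release ordering; only one thread's CAS can succeed. */
	struct lock_ctx *expected = NULL;
	if (atomic_compare_exchange_strong_explicit(slot, &expected, fresh,
						    memory_order_acq_rel,
						    memory_order_acquire))
		return fresh;

	free(fresh);		/* lost the race: use the winner's context */
	return expected;
}
```

A plain, non-acquire load of the slot would allow later reads of the context's fields to be satisfied before the pointer read, which is the hazard the commit message describes.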

01 Sep, 2015 1 commit

fs: fix fs/locks.c kernel-doc warning · 7fadc59c

Authored by Randy Dunlap on Aug 09, 2015

Fix kernel-doc warnings in fs/locks.c:

Warning(..//fs/locks.c:1577): No description found for parameter 'flags'
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>

7fadc59c

13 Jul, 2015 3 commits

locks: inline posix_lock_file_wait and flock_lock_file_wait · ee296d7c

Authored by Jeff Layton on Jul 11, 2015

They just call file_inode and then the corresponding *_lock_inode_wait
function. Just make them static inlines instead.
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>

ee296d7c

locks: new helpers - flock_lock_inode_wait and posix_lock_inode_wait · 29d01b22

Authored by Jeff Layton on Jul 11, 2015

Allow callers to pass in an inode instead of a filp.
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Reviewed-by: "J. Bruce Fields" <bfields@fieldses.org>
Tested-by: "J. Bruce Fields" <bfields@fieldses.org>

29d01b22

locks: have flock_lock_file take an inode pointer instead of a filp · bcd7f78d

Authored by Jeff Layton on Jul 11, 2015

...and rename it to better describe how it works.

In order to fix a use-after-free in NFS, we need to be able to remove
locks from an inode after the filp associated with them may have already
been freed. flock_lock_file already only dereferences the filp to get to
the inode, so just change it so the callers do that.

All of the callers already pass in a lock request that has the fl_file
set properly, so we don't need to pass it in individually. With that
change it now only dereferences the filp to get to the inode, so just
push that out to the callers.
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Reviewed-by: "J. Bruce Fields" <bfields@fieldses.org>
Tested-by: "J. Bruce Fields" <bfields@fieldses.org>

bcd7f78d

17 Apr, 2015 1 commit

proc: show locks in /proc/pid/fdinfo/X · 6c8c9031

Authored by Andrey Vagin on Apr 16, 2015

Let's show locks which are associated with a file descriptor in
its fdinfo file.

Currently we don't have a reliable way to determine who holds a lock.  We
can find some information in /proc/locks, but PID which is reported there
can be wrong.  For example, a process takes a lock, then forks a child and
dies.  In this case /proc/locks contains the parent pid, which can be
reused by another process.

$ cat /proc/locks
...
6: FLOCK  ADVISORY  WRITE 324 00:13:13431 0 EOF
...

$ ps -C rpcbind
  PID TTY          TIME CMD
  332 ?        00:00:00 rpcbind

$ cat /proc/332/fdinfo/4
pos:	0
flags:	0100000
mnt_id:	22
lock:	1: FLOCK  ADVISORY  WRITE 324 00:13:13431 0 EOF

$ ls -l /proc/332/fd/4
lr-x------ 1 root root 64 Mar  5 14:43 /proc/332/fd/4 -> /run/rpcbind.lock

$ ls -l /proc/324/fd/
total 0
lrwx------ 1 root root 64 Feb 27 14:50 0 -> /dev/pts/0
lrwx------ 1 root root 64 Feb 27 14:50 1 -> /dev/pts/0
lrwx------ 1 root root 64 Feb 27 14:49 2 -> /dev/pts/0

You can see that the process with the 324 pid doesn't hold the lock.

This information is required for proper dumping and restoring file
locks.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Acked-by: Jeff Layton <jlayton@poochiereds.net>
Acked-by: "J. Bruce Fields" <bfields@fieldses.org>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

6c8c9031
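A tool consuming the fdinfo output shown in commit 6c8c9031 would need to pick the pid out of a "lock:" line. The parser below is a hedged sketch: `parse_fdinfo_lock()` is a hypothetical helper, and the field layout is assumed from the sample output in the commit message rather than from a format specification.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical parser for one "lock:" line of an fdinfo file.
 * On success stores the pid column in *pid, copies the lock class
 * (e.g. "FLOCK") into cls, and returns 0. Returns -1 if the line does
 * not match the assumed layout:
 *   lock: <index>: <class> <mode> <access> <pid> ...
 */
int parse_fdinfo_lock(const char *line, char cls[16], int *pid)
{
	int idx;
	char mode[16], access[16];

	if (sscanf(line, "lock: %d: %15s %15s %15s %d",
		   &idx, cls, mode, access, pid) != 5)
		return -1;
	return 0;
}
```

Applied to the sample line from the commit message, this yields class "FLOCK" and pid 324; lines such as "pos:" or "flags:" are rejected.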

03 Apr, 2015 4 commits

locks: use cmpxchg to assign i_flctx pointer · 0429c2b5

Authored by Jeff Layton on Apr 03, 2015

During the v3.20/v4.0 cycle, I had originally had the code manage the
inode->i_flctx pointer using a compare-and-swap operation instead of the
i_lock.

Sasha Levin though hit a problem while testing with trinity that made me
believe that that wasn't safe. At the time, changing the code to protect
the i_flctx pointer seemed to fix the issue, but I now think that was
just coincidence.

The issue was likely the same race that Kirill Shutemov hit while
testing the pre-rc1 v4.0 kernel and that Linus spotted. Due to the way
that the spinlock was dropped in the middle of flock_lock_file, you
could end up with multiple flock locks for the same struct file on the
inode.

Reinstate the use of a CAS operation to assign this pointer since it's
likely to be more efficient and gets the i_lock completely out of the
file locking business.
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>

0429c2b5

locks: get rid of WE_CAN_BREAK_LSLK_NOW dead code · 3648888e

Authored by Jeff Layton on Apr 03, 2015

As Bruce points out, there's no compelling reason to change /proc/locks
output at this point. If we did want to do this, then we'd almost
certainly want to introduce a new file to display this info (maybe via
debugfs?).

Let's remove the dead WE_CAN_BREAK_LSLK_NOW ifdef here and just plan to
stay with the legacy format.
Reported-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>

3648888e

locks: change lm_get_owner and lm_put_owner prototypes · cae80b30

Authored by Jeff Layton on Apr 03, 2015

The current prototypes for these operations are somewhat awkward as they
deal with fl_owners but take struct file_lock arguments. In the future,
we'll want to be able to take references without necessarily dealing
with a struct file_lock.

Change them to take fl_owner_t arguments instead and have the callers
deal with assigning the values to the file_lock structs.
Signed-off-by: Jeff Layton <jlayton@primarydata.com>

cae80b30

locks: don't allocate a lock context for an F_UNLCK request · 5c1c669a

Authored by Jeff Layton on Apr 03, 2015

In the event that we get an F_UNLCK request on an inode that has no lock
context, there is no reason to allocate one. Change
locks_get_lock_context to take a "type" pointer and avoid allocating a
new context if it's F_UNLCK.

Then, fix the callers to return appropriately if that function returns
NULL.
Signed-off-by: Jeff Layton <jlayton@primarydata.com>

5c1c669a
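The behaviour introduced by commit 5c1c669a can be illustrated with a simplified userspace model. `struct model_inode`, `struct model_ctx` and `model_get_lock_context()` are hypothetical names for this sketch; only the decision "do not allocate for F_UNLCK" is taken from the commit message.

```c
#include <assert.h>
#include <fcntl.h>	/* F_RDLCK, F_WRLCK, F_UNLCK */
#include <stdlib.h>

/* Hypothetical, simplified models of an inode and its lock context. */
struct model_ctx {
	int nr_locks;
};

struct model_inode {
	struct model_ctx *flctx;
};

/*
 * Return the inode's lock context, creating it on demand, except for
 * an unlock request: with no context there is nothing to unlock, so
 * NULL is returned and nothing is allocated.
 */
struct model_ctx *model_get_lock_context(struct model_inode *inode, int type)
{
	if (inode->flctx || type == F_UNLCK)
		return inode->flctx;

	inode->flctx = calloc(1, sizeof(*inode->flctx));
	return inode->flctx;	/* NULL here means allocation failure */
}
```

Callers therefore have to treat a NULL return as a normal outcome for unlock requests, which is the caller fix-up the commit message mentions.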
