Commits · 64bed6cbe38bc95689fb9399872d9ce250192f90 · openanolis / cloud-kernel

10 Aug, 2018 · 1 commit

nfsd: fix leaked file lock with nfs exported overlayfs · 64bed6cb

Authored by Amir Goldstein on Jul 13, 2018

nfsd and lockd call vfs_lock_file() to lock/unlock the inode
returned by locks_inode(file).

Many places in nfsd/lockd code use the inode returned by
file_inode(file) for lock manipulation. With Overlayfs, file_inode()
(the underlying inode) is not the same object as locks_inode() (the
overlay inode). This can result in "Leaked POSIX lock" messages
and eventually in a kernel crash, as reported by Eddie Horng:
https://marc.info/?l=linux-unionfs&m=153086643202072&w=2

Fix all the call sites in nfsd/lockd that should use locks_inode().
This is a correctness bug that manifested when overlayfs gained
NFS export support in v4.16.
Reported-by: Eddie Horng <eddiehorng.tw@gmail.com>
Tested-by: Eddie Horng <eddiehorng.tw@gmail.com>
Cc: Jeff Layton <jlayton@kernel.org>
Fixes: 8383f174 ("ovl: wire up NFS export operations")
Cc: stable@vger.kernel.org
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

64bed6cb
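
The mismatch described in 64bed6cb can be pictured with a small userspace model. This is only a sketch: the `toy_*` types and helpers are hypothetical stand-ins, not kernel structures, and the lock count is a plain integer. It shows how taking a lock through one inode and releasing it through another leaves the lock behind.

```c
#include <stddef.h>

/* Toy stand-ins for kernel objects; not the real layouts. */
struct toy_inode {
	int posix_locks;               /* locks currently recorded on this inode */
};

struct toy_file {
	struct toy_inode *overlay_inode; /* models what locks_inode(file) returns */
	struct toy_inode *real_inode;    /* models what file_inode(file) returns */
};

static struct toy_inode *toy_locks_inode(const struct toy_file *f)
{
	return f->overlay_inode;
}

static struct toy_inode *toy_file_inode(const struct toy_file *f)
{
	return f->real_inode;
}

/* Locks are always recorded on the inode returned by toy_locks_inode(). */
static void toy_lock(struct toy_file *f)
{
	toy_locks_inode(f)->posix_locks++;
}

/* Buggy release: looks for the lock on the wrong inode and finds nothing. */
static void toy_unlock_buggy(struct toy_file *f)
{
	struct toy_inode *inode = toy_file_inode(f);

	if (inode->posix_locks > 0)
		inode->posix_locks--;
}

/* Fixed release: uses the same inode that toy_lock() used. */
static void toy_unlock_fixed(struct toy_file *f)
{
	struct toy_inode *inode = toy_locks_inode(f);

	if (inode->posix_locks > 0)
		inode->posix_locks--;
}

/* Number of locks still recorded for the file. */
static int toy_leaked_locks(const struct toy_file *f)
{
	return toy_locks_inode(f)->posix_locks;
}
```

On a file whose two inodes are the same object both release paths behave identically, which would be consistent with the bug only showing up once overlayfs could be exported over NFS.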

15 Jan, 2018 · 2 commits

lockd: convert nlm_rqst.a_count from atomic_t to refcount_t · fbca30c5

Authored by Elena Reshetova on Nov 29, 2017

atomic_t variables are currently used to implement reference
counters with the following properties:
 - counter is initialized to 1 using atomic_set()
 - a resource is freed upon counter reaching zero
 - once counter reaches zero, its further
   increments aren't allowed
 - counter schema uses basic atomic operations
   (set, inc, inc_not_zero, dec_and_test, etc.)

Such atomic variables should be converted to a newly provided
refcount_t type and API that prevents accidental counter overflows
and underflows. This is important since overflows and underflows
can lead to use-after-free situation and be exploitable.

The variable nlm_rqst.a_count is used as pure reference counter.
Convert it to refcount_t and fix up the operations.

**Important note for maintainers:

Some functions from refcount_t API defined in lib/refcount.c
have different memory ordering guarantees than their atomic
counterparts.
The full comparison can be seen in
https://lkml.org/lkml/2017/11/15/57 and it is hopefully soon
in state to be merged to the documentation tree.
Normally the differences should not matter since refcount_t provides
enough guarantees to satisfy the refcounting use cases, but in
some rare cases it might matter.
Please double check that you don't have some undocumented
memory guarantees for this variable usage.

For the nlm_rqst.a_count it might make a difference
in following places:
 - nlmclnt_release_call() and nlmsvc_release_call(): decrement
   in refcount_dec_and_test() only
   provides RELEASE ordering and control dependency on success
   vs. fully ordered atomic counterpart
Suggested-by: Kees Cook <keescook@chromium.org>
Reviewed-by: David Windsor <dwindsor@gmail.com>
Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

fbca30c5
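
The motivation for these conversions, a counter that refuses to overflow or to come back from zero, can be sketched in userspace C11. This is not the kernel's refcount_t implementation: the `toy_refcount` names are hypothetical, the saturation value is an assumption of this sketch, and the default sequentially consistent atomics used here do not reproduce the weaker RELEASE ordering discussed in the commit message.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Once the counter reaches this value it is pinned there forever. */
#define TOY_REFCOUNT_SATURATED UINT_MAX

struct toy_refcount {
	atomic_uint refs;
};

static void toy_refcount_set(struct toy_refcount *r, unsigned int n)
{
	atomic_store(&r->refs, n);
}

/*
 * Take a reference unless the count already dropped to zero.
 * A saturated counter stays saturated instead of wrapping around.
 */
static bool toy_refcount_inc_not_zero(struct toy_refcount *r)
{
	unsigned int old = atomic_load(&r->refs);

	do {
		if (old == 0)
			return false; /* object is already being freed */
		if (old == TOY_REFCOUNT_SATURATED)
			return true;  /* pinned: leak rather than overflow */
	} while (!atomic_compare_exchange_weak(&r->refs, &old, old + 1));

	return true;
}

/*
 * Drop a reference. Returns true only for the caller that moved the
 * count from 1 to 0, i.e. the one that must free the object.
 */
static bool toy_refcount_dec_and_test(struct toy_refcount *r)
{
	unsigned int old = atomic_load(&r->refs);

	do {
		if (old == 0 || old == TOY_REFCOUNT_SATURATED)
			return false; /* underflow or pinned: never free */
	} while (!atomic_compare_exchange_weak(&r->refs, &old, old - 1));

	return old == 1;
}
```

A plain atomic counter would wrap on overflow and could be incremented again after hitting zero; both cases can lead to a use-after-free, which is the failure mode the commit message warns about.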

lockd: convert nlm_lockowner.count from atomic_t to refcount_t · 431f125b

Authored by Elena Reshetova on Nov 29, 2017

atomic_t variables are currently used to implement reference
counters with the following properties:
 - counter is initialized to 1 using atomic_set()
 - a resource is freed upon counter reaching zero
 - once counter reaches zero, its further
   increments aren't allowed
 - counter schema uses basic atomic operations
   (set, inc, inc_not_zero, dec_and_test, etc.)

Such atomic variables should be converted to a newly provided
refcount_t type and API that prevents accidental counter overflows
and underflows. This is important since overflows and underflows
can lead to use-after-free situation and be exploitable.

The variable nlm_lockowner.count is used as pure reference counter.
Convert it to refcount_t and fix up the operations.

**Important note for maintainers:

Some functions from refcount_t API defined in lib/refcount.c
have different memory ordering guarantees than their atomic
counterparts.
The full comparison can be seen in
https://lkml.org/lkml/2017/11/15/57 and it is hopefully soon
in state to be merged to the documentation tree.
Normally the differences should not matter since refcount_t provides
enough guarantees to satisfy the refcounting use cases, but in
some rare cases it might matter.
Please double check that you don't have some undocumented
memory guarantees for this variable usage.

For the nlm_lockowner.count it might make a difference
in following places:
 - nlm_put_lockowner(): decrement in refcount_dec_and_lock() only
   provides RELEASE ordering, control dependency on success and
   holds a spin lock on success vs. fully ordered atomic counterpart.
   No changes in spin lock guarantees.
Suggested-by: Kees Cook <keescook@chromium.org>
Reviewed-by: David Windsor <dwindsor@gmail.com>
Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

431f125b

22 Dec, 2017 · 2 commits

lockd: convert nlm_rqst.a_count from atomic_t to refcount_t · d9226ec9

Authored by Elena Reshetova on Nov 29, 2017

atomic_t variables are currently used to implement reference
counters with the following properties:
 - counter is initialized to 1 using atomic_set()
 - a resource is freed upon counter reaching zero
 - once counter reaches zero, its further
   increments aren't allowed
 - counter schema uses basic atomic operations
   (set, inc, inc_not_zero, dec_and_test, etc.)

Such atomic variables should be converted to a newly provided
refcount_t type and API that prevents accidental counter overflows
and underflows. This is important since overflows and underflows
can lead to use-after-free situation and be exploitable.

The variable nlm_rqst.a_count is used as pure reference counter.
Convert it to refcount_t and fix up the operations.

**Important note for maintainers:

Some functions from refcount_t API defined in lib/refcount.c
have different memory ordering guarantees than their atomic
counterparts.
The full comparison can be seen in
https://lkml.org/lkml/2017/11/15/57 and it is hopefully soon
in state to be merged to the documentation tree.
Normally the differences should not matter since refcount_t provides
enough guarantees to satisfy the refcounting use cases, but in
some rare cases it might matter.
Please double check that you don't have some undocumented
memory guarantees for this variable usage.

For the nlm_rqst.a_count it might make a difference
in following places:
 - nlmclnt_release_call() and nlmsvc_release_call(): decrement
   in refcount_dec_and_test() only
   provides RELEASE ordering and control dependency on success
   vs. fully ordered atomic counterpart
Suggested-by: Kees Cook <keescook@chromium.org>
Reviewed-by: David Windsor <dwindsor@gmail.com>
Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

d9226ec9

lockd: convert nlm_lockowner.count from atomic_t to refcount_t · 8bb3ea77

Authored by Elena Reshetova on Nov 29, 2017

atomic_t variables are currently used to implement reference
counters with the following properties:
 - counter is initialized to 1 using atomic_set()
 - a resource is freed upon counter reaching zero
 - once counter reaches zero, its further
   increments aren't allowed
 - counter schema uses basic atomic operations
   (set, inc, inc_not_zero, dec_and_test, etc.)

Such atomic variables should be converted to a newly provided
refcount_t type and API that prevents accidental counter overflows
and underflows. This is important since overflows and underflows
can lead to use-after-free situation and be exploitable.

The variable nlm_lockowner.count is used as pure reference counter.
Convert it to refcount_t and fix up the operations.

**Important note for maintainers:

Some functions from refcount_t API defined in lib/refcount.c
have different memory ordering guarantees than their atomic
counterparts.
The full comparison can be seen in
https://lkml.org/lkml/2017/11/15/57 and it is hopefully soon
in state to be merged to the documentation tree.
Normally the differences should not matter since refcount_t provides
enough guarantees to satisfy the refcounting use cases, but in
some rare cases it might matter.
Please double check that you don't have some undocumented
memory guarantees for this variable usage.

For the nlm_lockowner.count it might make a difference
in following places:
 - nlm_put_lockowner(): decrement in refcount_dec_and_lock() only
   provides RELEASE ordering, control dependency on success and
   holds a spin lock on success vs. fully ordered atomic counterpart.
   No changes in spin lock guarantees.
Suggested-by: Kees Cook <keescook@chromium.org>
Reviewed-by: David Windsor <dwindsor@gmail.com>
Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

8bb3ea77

21 Apr, 2017 · 1 commit

lockd: Introduce nlmclnt_operations · b1ece737

Authored by Benjamin Coddington on Apr 11, 2017

NFS would enjoy the ability to modify the behavior of the NLM client's
unlock RPC task in order to delay the transmission of the unlock until IO
that was submitted under that lock has completed.  This ability can ensure
that the NLM client will always complete the transmission of an unlock even
if the waiting caller has been interrupted with fatal signal.

For this purpose, a pointer to a struct nlmclnt_operations can be assigned
in a nfs_module's nfs_rpc_ops that will install those nlmclnt_operations on
the nlm_host.  The struct nlmclnt_operations defines three callback
operations that will be used in a following patch:

nlmclnt_alloc_call - used to call back after a successful allocation of
	a struct nlm_rqst in nlmclnt_proc().

nlmclnt_unlock_prepare - used to call back during NLM unlock's
	rpc_call_prepare.  The NLM client defers calling rpc_call_start()
	until this callback returns false.

nlmclnt_release_call - used to call back when the NLM client's struct
	nlm_rqst is freed.
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

b1ece737
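
The three callbacks described in b1ece737 can be pictured as a table of function pointers. The sketch below is a userspace approximation: the `toy_*` names, the `void *` argument types, and the `toy_io_state` bookkeeping are assumptions made for illustration, not the kernel's definitions. The one behavior taken from the commit message is that the unlock is deferred for as long as the prepare callback returns true.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified callback table; the kernel version uses kernel types. */
struct toy_nlmclnt_operations {
	void (*alloc_call)(void *data);                 /* after a request is allocated */
	bool (*unlock_prepare)(void *task, void *data); /* true means "not yet" */
	void (*release_call)(void *data);               /* when the request is freed */
};

/* Example state an NFS-like caller might track (illustrative only). */
struct toy_io_state {
	int pending_io; /* I/O submitted under the lock and not yet complete */
	int refs;       /* references held on behalf of in-flight requests */
};

static void toy_alloc_call(void *data)
{
	((struct toy_io_state *)data)->refs++;
}

static bool toy_unlock_prepare(void *task, void *data)
{
	(void)task; /* unused in this sketch */
	return ((struct toy_io_state *)data)->pending_io > 0;
}

static void toy_release_call(void *data)
{
	((struct toy_io_state *)data)->refs--;
}

static const struct toy_nlmclnt_operations toy_ops = {
	.alloc_call = toy_alloc_call,
	.unlock_prepare = toy_unlock_prepare,
	.release_call = toy_release_call,
};

/*
 * Returns true when the unlock RPC may be started: either no callback
 * is installed, or the callback returned false.
 */
static bool toy_unlock_may_start(const struct toy_nlmclnt_operations *ops,
				 void *task, void *data)
{
	if (ops && ops->unlock_prepare && ops->unlock_prepare(task, data))
		return false; /* keep deferring */
	return true;
}
```

Keeping the hooks optional (a NULL table or NULL member means "no special handling") lets callers that do not need the delay run unchanged.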

23 Oct, 2015 · 1 commit

Move locks API users to locks_lock_inode_wait() · 4f656367

Authored by Benjamin Coddington on Oct 22, 2015

Instead of having users check for FL_POSIX or FL_FLOCK to call the correct
locks API function, use the check within locks_lock_inode_wait().  This
allows for some later cleanup.
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>

4f656367
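
The idea in 4f656367, moving the FL_POSIX/FL_FLOCK test out of every caller and into one helper, looks roughly like the following. All names and flag values here are toy definitions for a userspace sketch; the `handled_by` field exists only so the chosen path can be observed.

```c
#include <errno.h>

/* Toy flag values for illustration; not taken from kernel headers. */
#define TOY_FL_POSIX 1u
#define TOY_FL_FLOCK 2u

enum toy_path { TOY_PATH_NONE, TOY_PATH_POSIX, TOY_PATH_FLOCK };

struct toy_file_lock {
	unsigned int fl_flags;
	enum toy_path handled_by; /* records which helper ran */
};

static int toy_posix_lock_wait(struct toy_file_lock *fl)
{
	fl->handled_by = TOY_PATH_POSIX;
	return 0;
}

static int toy_flock_lock_wait(struct toy_file_lock *fl)
{
	fl->handled_by = TOY_PATH_FLOCK;
	return 0;
}

/* Single entry point: callers no longer check the lock type themselves. */
static int toy_locks_lock_wait(struct toy_file_lock *fl)
{
	switch (fl->fl_flags & (TOY_FL_POSIX | TOY_FL_FLOCK)) {
	case TOY_FL_POSIX:
		return toy_posix_lock_wait(fl);
	case TOY_FL_FLOCK:
		return toy_flock_lock_wait(fl);
	default:
		return -EINVAL; /* neither or both flags set */
	}
}
```

With the dispatch in one place, a later change to how either lock type is handled touches a single function instead of every caller.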

06 Aug, 2013 · 1 commit

LOCKD: Don't call utsname()->nodename from nlmclnt_setlockargs · 9a1b6bf8

Authored by Trond Myklebust on Aug 05, 2013

Firstly, nlmclnt_setlockargs can be called from a reclaimer thread, in
which case we're in entirely the wrong namespace.

Secondly, commit 8aac6270 (move
exit_task_namespaces() outside of exit_notify()) now means that
exit_task_work() is called after exit_task_namespaces(), which
triggers an Oops when we're freeing up the locks.

Fix this by ensuring that we initialise the nlm_host's rpc_client at mount
time, so that the cl_nodename field is initialised to the value of
utsname()->nodename that the net namespace uses. Then replace the
lockd callers of utsname()->nodename.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Toralf Förster <toralf.foerster@gmx.de>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Nix <nix@esperi.org.uk>
Cc: Jeff Layton <jlayton@redhat.com>
Cc: stable@vger.kernel.org # 3.10.x

9a1b6bf8

22 Apr, 2013 · 1 commit

LOCKD: Ensure that nlmclnt_block resets block->b_status after a server reboot · 1dfd89af

Authored by Trond Myklebust on Apr 21, 2013

After a server reboot, the reclaimer thread will recover all the existing
locks. For locks that are blocked, however, it will change the value
of block->b_status to nlm_lck_denied_grace_period in order to signal that
they need to wake up and resend the original blocking lock request.

Due to a bug, however, the block->b_status never gets reset after the
blocked locks have been woken up, and so the process goes into an
infinite loop of resends until the blocked lock is satisfied.
Reported-by: Marc Eshel <eshel@us.ibm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org

1dfd89af

23 Feb, 2013 · 1 commit

new helper: file_inode(file) · 496ad9aa

Authored by Al Viro on Jan 23, 2013

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

496ad9aa

20 Feb, 2013 · 1 commit

NLM: Ensure that we resend all pending blocking locks after a reclaim · 666b3d80

Authored by Trond Myklebust on Feb 19, 2013

Currently, nlmclnt_lock will break out of the for(;;) loop when
the reclaimer wakes up the blocking lock thread by setting
nlm_lck_denied_grace_period. This causes the lock request to fail
with an ENOLCK error.
The intention was always to ensure that we resend the lock request
after the grace period has expired.
Reported-by: Wangyuan Zhang <Wangyuan.Zhang@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org

666b3d80

16 Feb, 2013 · 1 commit

lockd: nlmclnt_reclaim(): avoid stack overflow · f25cc71e

Authored by Tim Gardner on Feb 13, 2013

Even though nlmclnt_reclaim() is only one call into the stack frame,
928 bytes on the stack seems like a lot. Recode to dynamically
allocate the request structure once from within the reclaimer task,
then pass this pointer into nlmclnt_reclaim() for reuse on
subsequent calls.

smatch analysis:

fs/lockd/clntproc.c:620 nlmclnt_reclaim() warn: 'reqst' puts
 928 bytes on stack

Also remove redundant assignment of 0 after memset.

Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

f25cc71e
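
The restructuring in f25cc71e, allocating the large request once in the caller and reusing it, can be sketched in userspace as follows. The `toy_*` names are hypothetical, `malloc()` stands in for a kernel allocator, and only the 928-byte size is taken from the smatch warning quoted in the commit message.

```c
#include <stdlib.h>
#include <string.h>

/* Stand-in for the large request structure; 928 bytes as reported by smatch. */
struct toy_rqst {
	char payload[928];
};

/* Reclaim one lock using a caller-provided request buffer. */
static int toy_reclaim_one(struct toy_rqst *req, int lock_id)
{
	memset(req, 0, sizeof(*req)); /* reset state left by the previous call */
	req->payload[0] = (char)lock_id;
	return 0;
}

/*
 * Allocate the request once on the heap and reuse it for every lock,
 * instead of placing a 928-byte object on the stack of each call.
 * Returns the number of locks reclaimed, or -1 if allocation failed.
 */
static int toy_reclaimer(const int *lock_ids, int n)
{
	struct toy_rqst *req = malloc(sizeof(*req));
	int done = 0;
	int i;

	if (!req)
		return -1;

	for (i = 0; i < n; i++) {
		if (toy_reclaim_one(req, lock_ids[i]) == 0)
			done++;
	}

	free(req);
	return done;
}
```

The trade-off is one extra failure path (the allocation can fail) in exchange for a much smaller stack frame in the per-lock function.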

05 Nov, 2012 · 1 commit

lockd: Remove BUG_ON()s from fs/lockd/clntproc.c · 26269348

Authored by Trond Myklebust on Oct 15, 2012

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

26269348

30 Jul, 2012 · 2 commits

lockd: handle lockowner allocation failure in nlmclnt_proc() · bf884891

Authored by Al Viro on Jul 29, 2012

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

bf884891

lockd: shift grabbing a reference to nlm_host into nlm_alloc_call() · 446945ab

Authored by Al Viro on Jul 26, 2012

It's used both for client and server hosts; we can't do nlmclnt_release_host()
on failure exits, since the host might need nlmsvc_release_host(), with BUG_ON()
for calling the wrong one. Makes life simpler for callers, actually...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

446945ab

13 Jul, 2011 · 1 commit

lockd: properly convert be32 values in debug messages · 82c2c8b8

Authored by Vasily Averin on Jun 01, 2011

lockd: server returns status 50331648
It is quite hard to tell that the number in this message is 3 in big endian.
Signed-off-by: Vasily Averin <vvs@sw.ru>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

82c2c8b8
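
The number in that debug message can be reproduced by hand. The sketch below uses hypothetical helper names and plain byte arithmetic rather than the kernel's byte-order helpers: it decodes the four wire bytes of status 3 correctly, and then the way a little-endian host would see them if it printed the raw word without converting.

```c
#include <stdint.h>

/* Decode four big-endian (network order) bytes into a host integer. */
static uint32_t toy_be32_to_cpu(const unsigned char wire[4])
{
	return ((uint32_t)wire[0] << 24) |
	       ((uint32_t)wire[1] << 16) |
	       ((uint32_t)wire[2] << 8) |
	       (uint32_t)wire[3];
}

/* What a little-endian host reads if it skips the conversion. */
static uint32_t toy_le_misread(const unsigned char wire[4])
{
	return (uint32_t)wire[0] |
	       ((uint32_t)wire[1] << 8) |
	       ((uint32_t)wire[2] << 16) |
	       ((uint32_t)wire[3] << 24);
}
```

For the wire bytes `00 00 00 03`, the first helper yields 3 and the second yields 3 << 24 = 50331648 (0x03000000), which is the value shown in the message.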

15 Jun, 2011 · 1 commit

NLM: Don't hang forever on NLM unlock requests · 0b760113

Authored by Trond Myklebust on May 31, 2011

If the NLM daemon is killed on the NFS server, we can currently end up
hanging forever on an 'unlock' request, instead of aborting. Basically,
if the rpcbind request fails, or the server keeps returning garbage, we
really want to quit instead of retrying.
Tested-by: Vasily Averin <vvs@sw.ru>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org

0b760113

17 Dec, 2010 · 2 commits

lockd: Create client-side nlm_host cache · 8ea6ecc8

Authored by Chuck Lever on Dec 14, 2010

NFS clients don't need the garbage collection processing that is
performed on nlm_host structures. The client picks up an nlm_host at
mount time and holds a reference to it until the file system is
unmounted.

Servers, on the other hand, don't have a precise way to tell when an
nlm_host is no longer being used, so zero refcount nlm_host entries
are left to expire in the cache after a time.

Basically there's nothing holding a reference to an nlm_host between
individual server-side NLM requests, but we can't afford the expense
of recreating them for every new NLM request from a client. The
nlm_host cache adds some lifetime hysteresis to entries in the cache
so the next time a particular nlm_host is needed, it's likely to be
discovered by a lookup rather than created from whole cloth.

With the new implementation, client nlm_host cache items are no longer
garbage collected, and are destroyed directly by a new release
function specialized for client entries, nlmclnt_release_host(). They
are cached in their own data structure, and have their own lookup
logic, simplified and specialized for client nlm_host entries.

However, the client nlm_host cache still shares reboot recovery logic
with the server nlm_host cache. The NSM "peer rebooted" downcall for
clients and servers still come through the same RPC call. This is a
legacy formal API that would be difficult to alter, and besides, the
user space NSM implementation can't tell the difference between peers
that are clients or servers.

For this reason, the client cache continues to share the
nlm_host_mutex (and reboot recovery logic) with the server cache.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

8ea6ecc8

lockd: Split nlm_release_call() · 7db836d4

Authored by Chuck Lever on Dec 14, 2010

The nlm_release_call() function is invoked from both the server and
the client side.  We're about to introduce a distinct server- and
client-side nlm_release_host(), so nlm_release_call() must first be
split into a client-side and a server-side version.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

7db836d4

18 Nov, 2010 · 1 commit

BKL: remove extraneous #include <smp_lock.h> · 451a3c24

Authored by Arnd Bergmann on Nov 17, 2010

The big kernel lock has been removed from all these files at some point,
leaving only the #include.

Remove this too as a cleanup.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

451a3c24

22 Sep, 2010 · 1 commit

lockd: Remove BKL from the client · 63185942

Authored by Bryan Schumaker on Sep 22, 2010

This patch removes all calls to lock_kernel() from the client.  This patch
should be applied after the "fs/lock.c prepare for BKL removal" patch submitted
by Arnd Bergmann on September 18.
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

63185942

30 Mar, 2010 · 1 commit

include cleanup: Update gfp.h and slab.h includes to prepare for breaking... · 5a0e3ad6

Authored by Tejun Heo on Mar 24, 2010

include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h

percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files.  percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed.  Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability.  As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

  http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the following.

* Scan files for gfp and slab usages and update includes such that
  only the necessary includes are there.  ie. if only gfp is used,
  gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
  blocks and tries to put the new include such that its order conforms
  to its surrounding.  It's put in the include block which contains
  core kernel includes, in the same order that the rest are ordered -
  alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
  doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
  because the file doesn't have fitting include block), it prints out
  an error message indicating which .h file needs to be added to the
  file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
   over 4000 files, deleting around 700 includes and adding ~480 gfp.h
   and ~3000 slab.h inclusions.  The script emitted errors for ~400
   files.

2. Each error was manually checked.  Some didn't need the inclusion,
   some needed manual addition while adding it to implementation .h or
   embedding .c file was more appropriate for others.  This step added
   inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
   from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
   e.g. lib/decompress_*.c used malloc/free() wrappers around slab
   APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
   editing them as sprinkling gfp.h and slab.h inclusions around .h
   files could easily lead to inclusion dependency hell.  Most gfp.h
   inclusion directives were ignored as stuff from gfp.h was usually
   widely available and often used in preprocessor macros.  Each
   slab.h inclusion directive was examined and added manually as
   necessary.

6. percpu.h was updated not to include slab.h.

7. Build tests were done on the following configurations and failures
   were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
   distributed build env didn't work with gcov compiles) and a few
   more options had to be turned off depending on archs to make things
   build (like ipr on powerpc/64 which failed due to missing writeq).

   * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
   * powerpc and powerpc64 SMP allmodconfig
   * sparc and sparc64 SMP allmodconfig
   * ia64 SMP allmodconfig
   * s390 SMP allmodconfig
   * alpha SMP allmodconfig
   * um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
   a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

5a0e3ad6

22 Sep, 2009 · 1 commit

const: make file_lock_operations const · 6aed6285

Authored by Alexey Dobriyan on Sep 21, 2009

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

6aed6285

13 Jul, 2009 · 1 commit

headers: smp_lock.h redux · 405f5571

Authored by Alexey Dobriyan on Jul 11, 2009

* Remove smp_lock.h from files which don't need it (including some headers!)
* Add smp_lock.h to files which do need it
* Make smp_lock.h include conditional in hardirq.h
  It's needed only for one kernel_locked() usage which is under CONFIG_PREEMPT

  This will make hardirq.h inclusion cheaper for every PREEMPT=n config
  (which includes allmodconfig/allyesconfig, BTW)
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

405f5571

18 Jun, 2009 · 2 commits

lockd: Update NSM state from SM_MON replies · 6c9dc425

Authored by Chuck Lever on Jun 17, 2009

When rpc.statd starts up in user space at boot time, it attempts to
write the latest NSM local state number into
/proc/sys/fs/nfs/nsm_local_state.

If lockd.ko isn't loaded yet (as is the case in most configurations),
that file doesn't exist, thus the kernel's NSM state remains set to
its initial value of zero during lockd operation.

This is a problem because rpc.statd and lockd use the NSM state number
to prevent repeated lock recovery on rebooted hosts.  If lockd sends
a zero NSM state, but then a delayed SM_NOTIFY with a real NSM state
number is received, there is no way for lockd or rpc.statd to
distinguish that stale SM_NOTIFY from an actual reboot.  Thus lock
recovery could be performed after the rebooted host has already
started reclaiming locks, and those locks will be lost.

We could change /etc/init.d/nfslock so it always modprobes lockd.ko
before starting rpc.statd.  However, if lockd.ko is ever unloaded
and reloaded, we are back at square one, since the NSM state is not
preserved across an unload/reload cycle.  This may happen frequently
on clients that use automounter.  A period of NFS inactivity causes
lockd.ko to be unloaded, and the kernel loses its NSM state setting.

Instead, let's use the fact that rpc.statd plants the local system's
NSM state in every SM_MON (and SM_UNMON) reply.  lockd performs a
synchronous SM_MON upcall to the local rpc.statd _before_ sending its
first NLM request to a new remote.  This would permit rpc.statd to
provide the current NSM state to lockd, even after lockd.ko had been
unloaded and reloaded.

Note that NLMPROC_LOCK arguments are constructed before the
nsm_monitor() call, so we have to rearrange argument construction very
slightly to make this all work out.

And, the kernel appears to treat NSM state as a u32 (see struct
nlm_args and nsm_res).  Make nsm_local_state a u32 as well, to ensure
we don't get bogus comparison results.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

6c9dc425

NFSv4/NLM: Push file locking BKL dependencies down into the NLM layer · 5cd973c4

Authored by Trond Myklebust on Jun 17, 2009

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

5cd973c4

07 Jan, 2009 · 2 commits

NSM: Remove include/linux/lockd/sm_inter.h · e6765b83

Authored by Chuck Lever on Dec 11, 2008

Clean up: The include/linux/lockd/sm_inter.h header is nearly empty
now.  Remove it.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

e6765b83

NLM: Remove redundant printk() in nlmclnt_lock() · 501c1ed3

Authored by Chuck Lever on Dec 04, 2008

The nsm_monitor() function already generates a printk(KERN_NOTICE) if
the SM_MON upcall fails, so the similar printk() in the nlmclnt_lock()
function is redundant.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

501c1ed3

26 Jul, 2008 · 1 commit

lockd: dont return EAGAIN for a permanent error · cc77b152

Authored by Miklos Szeredi on Jul 25, 2008

Fix nlm_fopen() to return NLM_FAILED (or NLM_LCK_DENIED_NOLOCKS) instead
of NLM_LCK_DENIED.  The latter means the lock request failed because of a
conflicting lock (i.e.  a temporary error), which is wrong in this case.

Also fix the client to return ENOLCK instead of EAGAIN if a blocking lock
request returns with NLM_LCK_DENIED.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: David Teigland <teigland@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

cc77b152
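
The client-side half of cc77b152 amounts to choosing an errno from the NLM status and from whether the request was a blocking one. A hedged sketch of that decision follows: the enum, the function name, and the handling of statuses other than "granted" and "denied" are assumptions of this example, and the status values are treated as plain host-order integers.

```c
#include <errno.h>
#include <stdbool.h>

/* A few NLM status codes, as plain integers for this sketch. */
enum toy_nlm_status {
	TOY_NLM_LCK_GRANTED = 0,
	TOY_NLM_LCK_DENIED = 1,
	TOY_NLM_LCK_DENIED_NOLOCKS = 2,
};

/*
 * Map a server reply to an errno-style result.
 * A denial of a non-blocking request is temporary (retry later), while a
 * denial of a blocking request is reported as ENOLCK, as the commit
 * message describes.
 */
static int toy_lock_errno(enum toy_nlm_status status, bool blocking)
{
	switch (status) {
	case TOY_NLM_LCK_GRANTED:
		return 0;
	case TOY_NLM_LCK_DENIED:
		return blocking ? -ENOLCK : -EAGAIN;
	default:
		return -ENOLCK; /* assumption: treat everything else as permanent */
	}
}
```

Callers that loop on EAGAIN would spin forever if a permanent failure were reported that way, which is why the distinction matters.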

16 Jul, 2008 · 2 commits

SUNRPC: Remove the BKL from the callback functions · a86dc496

Authored by Trond Myklebust on Jun 11, 2008

Push it into those callback functions that actually need it.

Note that all the NFS operations use their own locking, so don't need the
BKL. Ditto for the rpcbind client.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

a86dc496

nfs: set correct fl_len in nlmclnt_test() · d67d1c7b

Authored by Felix Blyakher on Jul 15, 2008

fcntl(F_GETLK) on an nfs client incorrectly returns
the values for the conflicting lock. fl_len value is
always 1.
If the conflicting lock is (0, 4095) the F_GETLK
request for (1024, 10) returns (0, 1), which doesn't
even cover the requested range, and is quite confusing.
The fix is trivial, set fl_end from the fl_end value
received from the nfs server.
Signed-off-by: Felix Blyakher <felixb@sgi.com>
Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

d67d1c7b
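
The example in d67d1c7b can be checked with a little range arithmetic. The sketch assumes the pairs in the commit message are (start, length), so the conflicting lock (0, 4095) ends at byte 4094 and the request (1024, 10) ends at byte 1033. The `toy_*` names are hypothetical and locks that extend to end of file are not modeled.

```c
#include <assert.h>
#include <stdint.h>

/* A byte-range lock described by start and length (length > 0). */
struct toy_lock {
	int64_t start;
	int64_t len;
};

/* Last byte covered by the lock (inclusive). */
static int64_t toy_end(struct toy_lock l)
{
	assert(l.len > 0); /* open-ended locks are outside this sketch */
	return l.start + l.len - 1;
}

/* Two ranges overlap when each one starts before the other one ends. */
static int toy_overlaps(struct toy_lock a, struct toy_lock b)
{
	return a.start <= toy_end(b) && b.start <= toy_end(a);
}

/* Rebuild the reported lock from the inclusive start/end sent by the server. */
static struct toy_lock toy_from_server(int64_t start, int64_t end)
{
	struct toy_lock l = { start, end - start + 1 };
	return l;
}
```

With a length stuck at 1, the reported conflict (0, 1) covers only byte 0 and cannot overlap the request at 1024..1033. Rebuilding the length from the server's end offset gives (0, 4095), which does.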

30 Apr, 2008 · 1 commit

fs: replace remaining __FUNCTION__ occurrences · 8e24eea7

Authored by Harvey Harrison on Apr 30, 2008

__FUNCTION__ is gcc-specific, use __func__
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

8e24eea7
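
For reference, `__func__` is the standard C99 predefined identifier that holds the name of the enclosing function. A minimal sketch of the kind of debug macro that benefits from it follows; the macro and function names are made up for this example.

```c
#include <stdio.h>

/* Formats "<function>: <msg>" into buf, the way a debug macro might. */
#define TOY_DPRINTK(buf, size, msg) \
	snprintf((buf), (size), "%s: %s", __func__, (msg))

/* __func__ expands inside this function, so the prefix is its name. */
static int toy_describe(char *buf, size_t size)
{
	return TOY_DPRINTK(buf, size, "called");
}
```

Because the macro is expanded at the call site, each caller gets its own function name in the prefix without passing it explicitly.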

20 Apr, 2008 · 7 commits

NLM/lockd: Ensure client locking calls use correct credentials · d11d10cc

Authored by Trond Myklebust on Apr 02, 2008

Now that we've added the 'generic' credentials (that are independent of the
rpc_client) to the nfs_open_context, we can use those in the NLM client to
ensure that the lock/unlock requests are authenticated to whoever
originally opened the file.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

d11d10cc

NLM/lockd: Fix a race when cancelling a blocking lock · 5f50c0c6

Authored by Trond Myklebust on Apr 01, 2008

We shouldn't remove the lock from the list of blocked locks until the
CANCEL call has completed since we may be racing with a GRANTED callback.

Also ensure that we send an UNLOCK if the CANCEL request failed. Normally
that should only happen if the process gets hit with a fatal signal.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

5f50c0c6

NLM/lockd: Ensure that nlmclnt_cancel() returns results of the CANCEL call · 6b4b3a75

Authored by Trond Myklebust on Apr 02, 2008

Currently, it returns success as long as the RPC call was sent. We'd like
to know if the CANCEL operation succeeded on the server.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

6b4b3a75

NLM: Remove the signal masking in nlmclnt_proc/nlmclnt_cancel · 8ec7ff74

Authored by Trond Myklebust on Mar 28, 2008

The signal masks have been rendered obsolete by the preceding patch.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

8ec7ff74

NLM/lockd: convert __nlm_async_call to use rpc_run_task() · dc9d8d04

Authored by Trond Myklebust on Mar 28, 2008

Peter Staubach comments:

> In the course of investigating testing failures in the locking phase of
> the Connectathon testsuite, I discovered a couple of things.  One was
> that one of the tests in the locking tests was racy when it didn't seem
> to need to be and two, that the NFS client asynchronously releases locks
> when a process is exiting.
...
> The Single UNIX Specification Version 3 specifies that:  "All locks
> associated with a file for a given process shall be removed when a file
> descriptor for that file is closed by that process or the process holding
> that file descriptor terminates.".
>
> This does not specify whether those locks must be released prior to the
> completion of the exit processing for the process or not.  However,
> general assumptions seem to be that those locks will be released.  This
> leads to more deterministic behavior under normal circumstances.

The following patch converts the NFSv2/v3 locking code to use the same
mechanism as NFSv4 for sending asynchronous RPC calls and then waiting for
them to complete. This ensures that the UNLOCK and CANCEL RPC calls will
complete even if the user interrupts the call, yet satisfies the
above request for synchronous behaviour on process exit.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

dc9d8d04

NLM/lockd: Add a reference counter to struct nlm_rqst · 5e7f37a7

Authored by Trond Myklebust on Apr 01, 2008

When we replace the existing synchronous RPC calls with asynchronous calls,
the reference count will be needed in order to allow us to examine the
result of the RPC call.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

5e7f37a7

NLM/lockd: Ensure we don't corrupt fl->fl_flags in nlmclnt_unlock() · 4a9af59f

Authored by Trond Myklebust on Apr 01, 2008

Also fix up nlmclnt_lock() so that it doesn't pass modified versions of
fl->fl_flags to nlmclnt_cancel() and other helpers.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

4a9af59f

30 Jan, 2008 · 1 commit

NLM: Fix a bogus 'return' in nlmclnt_rpc_release · 65fdf7d2

Authored by Trond Myklebust on Jan 11, 2008

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

65fdf7d2
