1. 10 8月, 2018 1 次提交
  2. 15 1月, 2018 2 次提交
    • E
      lockd: convert nlm_rqst.a_count from atomic_t to refcount_t · fbca30c5
      Elena Reshetova 提交于
      atomic_t variables are currently used to implement reference
      counters with the following properties:
       - counter is initialized to 1 using atomic_set()
       - a resource is freed upon counter reaching zero
       - once counter reaches zero, its further
         increments aren't allowed
       - counter schema uses basic atomic operations
         (set, inc, inc_not_zero, dec_and_test, etc.)
      
      Such atomic variables should be converted to a newly provided
      refcount_t type and API that prevents accidental counter overflows
      and underflows. This is important since overflows and underflows
      can lead to use-after-free situation and be exploitable.
      
      The variable nlm_rqst.a_count is used as pure reference counter.
      Convert it to refcount_t and fix up the operations.
      
      **Important note for maintainers:
      
      Some functions from refcount_t API defined in lib/refcount.c
      have different memory ordering guarantees than their atomic
      counterparts.
      The full comparison can be seen in
      https://lkml.org/lkml/2017/11/15/57 and it is hopefully soon
      in state to be merged to the documentation tree.
      Normally the differences should not matter since refcount_t provides
      enough guarantees to satisfy the refcounting use cases, but in
      some rare cases it might matter.
      Please double check that you don't have some undocumented
      memory guarantees for this variable usage.
      
      For the nlm_rqst.a_count it might make a difference
      in following places:
       - nlmclnt_release_call() and nlmsvc_release_call(): decrement
         in refcount_dec_and_test() only
         provides RELEASE ordering and control dependency on success
         vs. fully ordered atomic counterpart
      Suggested-by: NKees Cook <keescook@chromium.org>
      Reviewed-by: NDavid Windsor <dwindsor@gmail.com>
      Reviewed-by: NHans Liljestrand <ishkamiel@gmail.com>
      Signed-off-by: NElena Reshetova <elena.reshetova@intel.com>
      Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
      fbca30c5
    • E
      lockd: convert nlm_lockowner.count from atomic_t to refcount_t · 431f125b
      Elena Reshetova 提交于
      atomic_t variables are currently used to implement reference
      counters with the following properties:
       - counter is initialized to 1 using atomic_set()
       - a resource is freed upon counter reaching zero
       - once counter reaches zero, its further
         increments aren't allowed
       - counter schema uses basic atomic operations
         (set, inc, inc_not_zero, dec_and_test, etc.)
      
      Such atomic variables should be converted to a newly provided
      refcount_t type and API that prevents accidental counter overflows
      and underflows. This is important since overflows and underflows
      can lead to use-after-free situation and be exploitable.
      
      The variable nlm_lockowner.count is used as pure reference counter.
      Convert it to refcount_t and fix up the operations.
      
      **Important note for maintainers:
      
      Some functions from refcount_t API defined in lib/refcount.c
      have different memory ordering guarantees than their atomic
      counterparts.
      The full comparison can be seen in
      https://lkml.org/lkml/2017/11/15/57 and it is hopefully soon
      in state to be merged to the documentation tree.
      Normally the differences should not matter since refcount_t provides
      enough guarantees to satisfy the refcounting use cases, but in
      some rare cases it might matter.
      Please double check that you don't have some undocumented
      memory guarantees for this variable usage.
      
      For the nlm_lockowner.count it might make a difference
      in following places:
       - nlm_put_lockowner(): decrement in refcount_dec_and_lock() only
         provides RELEASE ordering, control dependency on success and
         holds a spin lock on success vs. fully ordered atomic counterpart.
         No changes in spin lock guarantees.
      Suggested-by: NKees Cook <keescook@chromium.org>
      Reviewed-by: NDavid Windsor <dwindsor@gmail.com>
      Reviewed-by: NHans Liljestrand <ishkamiel@gmail.com>
      Signed-off-by: NElena Reshetova <elena.reshetova@intel.com>
      Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
      431f125b
  3. 22 12月, 2017 2 次提交
    • E
      lockd: convert nlm_rqst.a_count from atomic_t to refcount_t · d9226ec9
      Elena Reshetova 提交于
      atomic_t variables are currently used to implement reference
      counters with the following properties:
       - counter is initialized to 1 using atomic_set()
       - a resource is freed upon counter reaching zero
       - once counter reaches zero, its further
         increments aren't allowed
       - counter schema uses basic atomic operations
         (set, inc, inc_not_zero, dec_and_test, etc.)
      
      Such atomic variables should be converted to a newly provided
      refcount_t type and API that prevents accidental counter overflows
      and underflows. This is important since overflows and underflows
      can lead to use-after-free situation and be exploitable.
      
      The variable nlm_rqst.a_count is used as pure reference counter.
      Convert it to refcount_t and fix up the operations.
      
      **Important note for maintainers:
      
      Some functions from refcount_t API defined in lib/refcount.c
      have different memory ordering guarantees than their atomic
      counterparts.
      The full comparison can be seen in
      https://lkml.org/lkml/2017/11/15/57 and it is hopefully soon
      in state to be merged to the documentation tree.
      Normally the differences should not matter since refcount_t provides
      enough guarantees to satisfy the refcounting use cases, but in
      some rare cases it might matter.
      Please double check that you don't have some undocumented
      memory guarantees for this variable usage.
      
      For the nlm_rqst.a_count it might make a difference
      in following places:
       - nlmclnt_release_call() and nlmsvc_release_call(): decrement
         in refcount_dec_and_test() only
         provides RELEASE ordering and control dependency on success
         vs. fully ordered atomic counterpart
      Suggested-by: NKees Cook <keescook@chromium.org>
      Reviewed-by: NDavid Windsor <dwindsor@gmail.com>
      Reviewed-by: NHans Liljestrand <ishkamiel@gmail.com>
      Signed-off-by: NElena Reshetova <elena.reshetova@intel.com>
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      d9226ec9
    • E
      lockd: convert nlm_lockowner.count from atomic_t to refcount_t · 8bb3ea77
      Elena Reshetova 提交于
      atomic_t variables are currently used to implement reference
      counters with the following properties:
       - counter is initialized to 1 using atomic_set()
       - a resource is freed upon counter reaching zero
       - once counter reaches zero, its further
         increments aren't allowed
       - counter schema uses basic atomic operations
         (set, inc, inc_not_zero, dec_and_test, etc.)
      
      Such atomic variables should be converted to a newly provided
      refcount_t type and API that prevents accidental counter overflows
      and underflows. This is important since overflows and underflows
      can lead to use-after-free situation and be exploitable.
      
      The variable nlm_lockowner.count is used as pure reference counter.
      Convert it to refcount_t and fix up the operations.
      
      **Important note for maintainers:
      
      Some functions from refcount_t API defined in lib/refcount.c
      have different memory ordering guarantees than their atomic
      counterparts.
      The full comparison can be seen in
      https://lkml.org/lkml/2017/11/15/57 and it is hopefully soon
      in state to be merged to the documentation tree.
      Normally the differences should not matter since refcount_t provides
      enough guarantees to satisfy the refcounting use cases, but in
      some rare cases it might matter.
      Please double check that you don't have some undocumented
      memory guarantees for this variable usage.
      
      For the nlm_lockowner.count it might make a difference
      in following places:
       - nlm_put_lockowner(): decrement in refcount_dec_and_lock() only
         provides RELEASE ordering, control dependency on success and
         holds a spin lock on success vs. fully ordered atomic counterpart.
         No changes in spin lock guarantees.
      Suggested-by: NKees Cook <keescook@chromium.org>
      Reviewed-by: NDavid Windsor <dwindsor@gmail.com>
      Reviewed-by: NHans Liljestrand <ishkamiel@gmail.com>
      Signed-off-by: NElena Reshetova <elena.reshetova@intel.com>
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      8bb3ea77
  4. 21 4月, 2017 1 次提交
    • B
      lockd: Introduce nlmclnt_operations · b1ece737
      Benjamin Coddington 提交于
      NFS would enjoy the ability to modify the behavior of the NLM client's
      unlock RPC task in order to delay the transmission of the unlock until IO
      that was submitted under that lock has completed.  This ability can ensure
      that the NLM client will always complete the transmission of an unlock even
      if the waiting caller has been interrupted with fatal signal.
      
      For this purpose, a pointer to a struct nlmclnt_operations can be assigned
      in a nfs_module's nfs_rpc_ops that will install those nlmclnt_operations on
      the nlm_host.  The struct nlmclnt_operations defines three callback
      operations that will be used in a following patch:
      
      nlmclnt_alloc_call - used to call back after a successful allocation of
      	a struct nlm_rqst in nlmclnt_proc().
      
      nlmclnt_unlock_prepare - used to call back during NLM unlock's
      	rpc_call_prepare.  The NLM client defers calling rpc_call_start()
      	until this callback returns false.
      
      nlmclnt_release_call - used to call back when the NLM client's struct
      	nlm_rqst is freed.
      Signed-off-by: NBenjamin Coddington <bcodding@redhat.com>
      Reviewed-by: NJeff Layton <jlayton@redhat.com>
      Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
      b1ece737
  5. 23 10月, 2015 1 次提交
  6. 06 8月, 2013 1 次提交
    • T
      LOCKD: Don't call utsname()->nodename from nlmclnt_setlockargs · 9a1b6bf8
      Trond Myklebust 提交于
      Firstly, nlmclnt_setlockargs can be called from a reclaimer thread, in
      which case we're in entirely the wrong namespace.
      
      Secondly, commit 8aac6270 (move
      exit_task_namespaces() outside of exit_notify()) now means that
      exit_task_work() is called after exit_task_namespaces(), which
      triggers an Oops when we're freeing up the locks.
      
      Fix this by ensuring that we initialise the nlm_host's rpc_client at mount
      time, so that the cl_nodename field is initialised to the value of
      utsname()->nodename that the net namespace uses. Then replace the
      lockd callers of utsname()->nodename.
      Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
      Cc: Toralf Förster <toralf.foerster@gmx.de>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Nix <nix@esperi.org.uk>
      Cc: Jeff Layton <jlayton@redhat.com>
      Cc: stable@vger.kernel.org # 3.10.x
      9a1b6bf8
  7. 22 4月, 2013 1 次提交
  8. 23 2月, 2013 1 次提交
  9. 20 2月, 2013 1 次提交
  10. 16 2月, 2013 1 次提交
  11. 05 11月, 2012 1 次提交
  12. 30 7月, 2012 2 次提交
  13. 13 7月, 2011 1 次提交
  14. 15 6月, 2011 1 次提交
  15. 17 12月, 2010 2 次提交
    • C
      lockd: Create client-side nlm_host cache · 8ea6ecc8
      Chuck Lever 提交于
      NFS clients don't need the garbage collection processing that is
      performed on nlm_host structures.  The client picks up an nlm_host at
      mount time and holds a reference to it until the file system is
      unmounted.
      
      Servers, on the other hand, don't have a precise way to tell when an
      nlm_host is no longer being used, so zero refcount nlm_host entries
      are left to expire in the cache after a time.
      
      Basically there's nothing holding a reference to an nlm_host between
      individual server-side NLM requests, but we can't afford the expense
      of recreating them for every new NLM request from a client.  The
      nlm_host cache adds some lifetime hysteresis to entries in the cache
      so the next time a particular nlm_host is needed, it's likely to be
      discovered by a lookup rather than created from whole cloth.
      
      With the new implementation, client nlm_host cache items are no longer
      garbage collected, and are destroyed directly by a new release
      function specialized for client entries, nlmclnt_release_host().  They
      are cached in their own data structure, and have their own lookup
      logic, simplified and specialized for client nlm_host entries.
      
      However, the client nlm_host cache still shares reboot recovery logic
      with the server nlm_host cache.  The NSM "peer rebooted" downcall for
      clients and servers still come through the same RPC call.  This is a
      legacy formal API that would be difficult to alter, and besides, the
      user space NSM implementation can't tell the difference between peers
      that are clients or servers.
      
      For this reason, the client cache continues to share the
      nlm_host_mutex (and reboot recovery logic) with the server cache.
      Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
      8ea6ecc8
    • C
      lockd: Split nlm_release_call() · 7db836d4
      Chuck Lever 提交于
      The nlm_release_call() function is invoked from both the server and
      the client side.  We're about to introduce a distinct server- and
      client-side nlm_release_host(), so nlm_release_call() must first be
      split into a client-side and a server-side version.
      Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
      7db836d4
  16. 18 11月, 2010 1 次提交
  17. 22 9月, 2010 1 次提交
  18. 30 3月, 2010 1 次提交
    • T
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking... · 5a0e3ad6
      Tejun Heo 提交于
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
      
      percpu.h is included by sched.h and module.h and thus ends up being
      included when building most .c files.  percpu.h includes slab.h which
      in turn includes gfp.h making everything defined by the two files
      universally available and complicating inclusion dependencies.
      
      percpu.h -> slab.h dependency is about to be removed.  Prepare for
      this change by updating users of gfp and slab facilities include those
      headers directly instead of assuming availability.  As this conversion
      needs to touch large number of source files, the following script is
      used as the basis of conversion.
      
        http://userweb.kernel.org/~tj/misc/slabh-sweep.py
      
      The script does the followings.
      
      * Scan files for gfp and slab usages and update includes such that
        only the necessary includes are there.  ie. if only gfp is used,
        gfp.h, if slab is used, slab.h.
      
      * When the script inserts a new include, it looks at the include
        blocks and try to put the new include such that its order conforms
        to its surrounding.  It's put in the include block which contains
        core kernel includes, in the same order that the rest are ordered -
        alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
        doesn't seem to be any matching order.
      
      * If the script can't find a place to put a new include (mostly
        because the file doesn't have fitting include block), it prints out
        an error message indicating which .h file needs to be added to the
        file.
      
      The conversion was done in the following steps.
      
      1. The initial automatic conversion of all .c files updated slightly
         over 4000 files, deleting around 700 includes and adding ~480 gfp.h
         and ~3000 slab.h inclusions.  The script emitted errors for ~400
         files.
      
      2. Each error was manually checked.  Some didn't need the inclusion,
         some needed manual addition while adding it to implementation .h or
         embedding .c file was more appropriate for others.  This step added
         inclusions to around 150 files.
      
      3. The script was run again and the output was compared to the edits
         from #2 to make sure no file was left behind.
      
      4. Several build tests were done and a couple of problems were fixed.
         e.g. lib/decompress_*.c used malloc/free() wrappers around slab
         APIs requiring slab.h to be added manually.
      
      5. The script was run on all .h files but without automatically
         editing them as sprinkling gfp.h and slab.h inclusions around .h
         files could easily lead to inclusion dependency hell.  Most gfp.h
         inclusion directives were ignored as stuff from gfp.h was usually
         wildly available and often used in preprocessor macros.  Each
         slab.h inclusion directive was examined and added manually as
         necessary.
      
      6. percpu.h was updated not to include slab.h.
      
      7. Build test were done on the following configurations and failures
         were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
         distributed build env didn't work with gcov compiles) and a few
         more options had to be turned off depending on archs to make things
         build (like ipr on powerpc/64 which failed due to missing writeq).
      
         * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
         * powerpc and powerpc64 SMP allmodconfig
         * sparc and sparc64 SMP allmodconfig
         * ia64 SMP allmodconfig
         * s390 SMP allmodconfig
         * alpha SMP allmodconfig
         * um on x86_64 SMP allmodconfig
      
      8. percpu.h modifications were reverted so that it could be applied as
         a separate patch and serve as bisection point.
      
      Given the fact that I had only a couple of failures from tests on step
      6, I'm fairly confident about the coverage of this conversion patch.
      If there is a breakage, it's likely to be something in one of the arch
      headers which should be easily discoverable easily on most builds of
      the specific arch.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Guess-its-ok-by: NChristoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
      5a0e3ad6
  19. 22 9月, 2009 1 次提交
  20. 13 7月, 2009 1 次提交
  21. 18 6月, 2009 2 次提交
    • C
      lockd: Update NSM state from SM_MON replies · 6c9dc425
      Chuck Lever 提交于
      When rpc.statd starts up in user space at boot time, it attempts to
      write the latest NSM local state number into
      /proc/sys/fs/nfs/nsm_local_state.
      
      If lockd.ko isn't loaded yet (as is the case in most configurations),
      that file doesn't exist, thus the kernel's NSM state remains set to
      its initial value of zero during lockd operation.
      
      This is a problem because rpc.statd and lockd use the NSM state number
      to prevent repeated lock recovery on rebooted hosts.  If lockd sends
      a zero NSM state, but then a delayed SM_NOTIFY with a real NSM state
      number is received, there is no way for lockd or rpc.statd to
      distinguish that stale SM_NOTIFY from an actual reboot.  Thus lock
      recovery could be performed after the rebooted host has already
      started reclaiming locks, and those locks will be lost.
      
      We could change /etc/init.d/nfslock so it always modprobes lockd.ko
      before starting rpc.statd.  However, if lockd.ko is ever unloaded
      and reloaded, we are back at square one, since the NSM state is not
      preserved across an unload/reload cycle.  This may happen frequently
      on clients that use automounter.  A period of NFS inactivity causes
      lockd.ko to be unloaded, and the kernel loses its NSM state setting.
      
      Instead, let's use the fact that rpc.statd plants the local system's
      NSM state in every SM_MON (and SM_UNMON) reply.  lockd performs a
      synchronous SM_MON upcall to the local rpc.statd _before_ sending its
      first NLM request to a new remote.  This would permit rpc.statd to
      provide the current NSM state to lockd, even after lockd.ko had been
      unloaded and reloaded.
      
      Note that NLMPROC_LOCK arguments are constructed before the
      nsm_monitor() call, so we have to rearrange argument construction very
      slightly to make this all work out.
      
      And, the kernel appears to treat NSM state as a u32 (see struct
      nlm_args and nsm_res).  Make nsm_local_state a u32 as well, to ensure
      we don't get bogus comparison results.
      Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
      6c9dc425
    • T
  22. 07 1月, 2009 2 次提交
  23. 26 7月, 2008 1 次提交
  24. 16 7月, 2008 2 次提交
  25. 30 4月, 2008 1 次提交
  26. 20 4月, 2008 7 次提交
  27. 30 1月, 2008 1 次提交