1. 06 Jan, 2012 2 commits
    • svcrpc: don't revert to SVC_POOL_DEFAULT on nfsd shutdown · 9689dcce
      Committed by J. Bruce Fields
      This was unexpected behavior (at least for me)--why would you want
      configuration settings automatically lost on nfsd restart?
      
      In practice this won't affect distributions, which likely set everything
      on every startup.  But I'd expect the behavior to be less confusing to
      someone manually restarting nfsd for testing.
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    • svcrpc: fix double-free on shutdown of nfsd after changing pool mode · 61c8504c
      Committed by J. Bruce Fields
      The pool_to and to_pool fields of the global svc_pool_map are freed on
      shutdown, but are initialized in nfsd startup only in the
      SVC_POOL_PERCPU and SVC_POOL_PERNODE cases.
      
      They *are* initialized to zero on kernel startup.  So as long as you use
      only SVC_POOL_GLOBAL (the default), this will never be a problem.
      
      You're also OK if you only ever use SVC_POOL_PERCPU or SVC_POOL_PERNODE.
      
      However, the following sequence of events leads to a double-free:
      
      	1. set SVC_POOL_PERCPU or SVC_POOL_PERNODE
      	2. start nfsd: both fields are initialized.
      	3. shutdown nfsd: both fields are freed.
      	4. set SVC_POOL_GLOBAL
      	5. start nfsd: the fields are left untouched.
      	6. shutdown nfsd: now we try to free them again.
      
      Step 4 is actually unnecessary, since (for some bizarre reason), nfsd
      automatically resets the pool mode to SVC_POOL_GLOBAL on shutdown.
      
      Cc: stable@kernel.org
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
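The six-step sequence in this commit message can be reproduced in user space. What follows is a minimal sketch, not the kernel code: `struct pool_map`, `map_start()` and `map_stop()` are hypothetical stand-ins for svc_pool_map and the nfsd startup and shutdown paths, and clearing the pointers after freeing them is assumed here as one way to make the second shutdown harmless.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for the pool modes and the global pool map. */
enum pool_mode { POOL_GLOBAL, POOL_PERCPU, POOL_PERNODE };

struct pool_map {
    enum pool_mode mode;
    unsigned int *to_pool;
    unsigned int *pool_to;
};

/* Startup: the arrays are allocated only in the per-CPU and per-node modes. */
static int map_start(struct pool_map *m, size_t n)
{
    if (m->mode == POOL_GLOBAL)
        return 0;                       /* both fields are left untouched */

    m->to_pool = calloc(n, sizeof(*m->to_pool));
    m->pool_to = calloc(n, sizeof(*m->pool_to));
    if (!m->to_pool || !m->pool_to) {
        free(m->to_pool);
        free(m->pool_to);
        m->to_pool = NULL;
        m->pool_to = NULL;
        return -1;
    }
    return 0;
}

/* Shutdown: free both arrays, then clear the pointers so that a later
 * shutdown after a POOL_GLOBAL start calls free(NULL), which is a no-op. */
static void map_stop(struct pool_map *m)
{
    free(m->to_pool);
    free(m->pool_to);
    m->to_pool = NULL;
    m->pool_to = NULL;
}
```

Without the two NULL assignments in `map_stop()`, a start in POOL_GLOBAL mode followed by a shutdown would free the stale pointers a second time.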
  2. 07 Dec, 2011 3 commits
    • svcrpc: update outdated BKL comment · 94cf3179
      Committed by J. Bruce Fields
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    • svcrpc: avoid memory-corruption on pool shutdown · b4f36f88
      Committed by J. Bruce Fields
      Socket callbacks use svc_xprt_enqueue() to add an xprt to a
      pool->sp_sockets list.  In normal operation a server thread will later
      come along and take the xprt off that list.  On shutdown, after all the
      threads have exited, we instead manually walk the sv_tempsocks and
      sv_permsocks lists to find all the xprt's and delete them.
      
      So the sp_sockets lists don't really matter any more.  As a result,
      we've mostly just ignored them and hoped they would go away.
      
      Which has gotten us into trouble; witness for example ebc63e53
      "svcrpc: fix list-corrupting race on nfsd shutdown", the result of Ben
      Greear noticing that a still-running svc_xprt_enqueue() could re-add an
      xprt to an sp_sockets list just before it was deleted.  The fix was to
      remove it from the list at the end of svc_delete_xprt().  But that only
      made corruption less likely--I can see nothing that prevents a
      svc_xprt_enqueue() from adding another xprt to the list at the same
      moment that we're removing this xprt from the list.  In fact, despite
      the earlier xpo_detach(), I don't even see what guarantees that
      svc_xprt_enqueue() couldn't still be running on this xprt.
      
      So, instead, note that svc_xprt_enqueue() essentially does:
      	lock sp_lock
      		if XPT_BUSY unset
      			add to sp_sockets
      	unlock sp_lock
      
      So, if we do:
      
      	set XPT_BUSY on every xprt.
      	Empty every sp_sockets list, under the sp_socks locks.
      
      Then we're left knowing that the sp_sockets lists are all empty and will
      stay that way, since any svc_xprt_enqueue() will check XPT_BUSY under
      the sp_lock and see it set.
      
      And *then* we can continue deleting the xprt's.
      
      (Thanks to Jeff Layton for being correctly suspicious of this code....)
      
      Cc: Ben Greear <greearb@candelatech.com>
      Cc: Jeff Layton <jlayton@redhat.com>
      Cc: stable@kernel.org
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
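The shutdown ordering described in this commit message (mark every xprt busy, empty the ready lists under the lock, then delete) can be sketched in user space. This is an illustration under stated assumptions, not the kernel implementation: `struct xprt`, `struct pool`, `pool_enqueue()` and `pool_shutdown()` are hypothetical names, a pthread mutex stands in for sp_lock, and a plain `bool` stands in for the XPT_BUSY bit. Because the flag is a plain `bool` here, the sketch sets it while holding the lock, and it also marks an xprt busy when queueing it so that it cannot be linked twice.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct xprt {
    bool busy;              /* stands in for the XPT_BUSY flag */
    struct xprt *next;      /* link in the pool's ready list */
};

struct pool {
    pthread_mutex_t lock;   /* stands in for sp_lock */
    struct xprt *ready;     /* stands in for sp_sockets */
};

/* Enqueue: the busy check and the list insertion happen under the lock. */
static bool pool_enqueue(struct pool *p, struct xprt *x)
{
    bool queued = false;

    pthread_mutex_lock(&p->lock);
    if (!x->busy) {
        x->busy = true;         /* sketch-only: prevents double insertion */
        x->next = p->ready;
        p->ready = x;
        queued = true;
    }
    pthread_mutex_unlock(&p->lock);
    return queued;
}

/* Shutdown: mark every xprt busy and empty the ready list under the lock.
 * Any enqueue that runs afterwards sees busy set and leaves the list empty. */
static void pool_shutdown(struct pool *p, struct xprt **all, size_t n)
{
    pthread_mutex_lock(&p->lock);
    for (size_t i = 0; i < n; i++)
        all[i]->busy = true;
    p->ready = NULL;
    pthread_mutex_unlock(&p->lock);
    /* Only now is it safe to walk `all` and delete each xprt. */
}
```

The point of the ordering is that the ready list is known to be empty, and to stay empty, before any xprt is freed.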
    • svcrpc: destroy server sockets all at once · 2fefb8a0
      Committed by J. Bruce Fields
      There's no reason I can see that we need to call sv_shutdown between
      closing the two lists of sockets.
      
      Cc: stable@kernel.org
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
  3. 01 Nov, 2011 1 commit
  4. 25 Oct, 2011 4 commits
  5. 20 Aug, 2011 1 commit
  6. 15 Jul, 2011 2 commits
  7. 28 May, 2011 1 commit
    • SUNRPC: Use AF_LOCAL for rpcbind upcalls · 7402ab19
      Committed by Chuck Lever
      As libtirpc does in user space, have our registration API try using an
      AF_LOCAL transport first when registering and unregistering.
      
      This means we don't chew up privileged ports, and our registration is
      bound to an "owner" (the effective uid of the process on the sending
      end of the transport).  Only that "owner" may unregister the service.
      
      The kernel could probe rpcbind via an rpcbind query to determine
      whether rpcbind has an AF_LOCAL service. For simplicity, we use the
      same technique that libtirpc uses: simply fail over to network
      loopback if creating an AF_LOCAL transport to the well-known rpcbind
      service socket fails.
      
      This means we open-code the pathname of the rpcbind socket in the
      kernel.  For now we have to do that anyway because the kernel's
      RPC over AF_LOCAL implementation does not support autobind.  That may
      be undesirable in the long term.
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
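The fail-over rule in this commit message (try AF_LOCAL first, fall back to network loopback if the local transport cannot be created) reduces to a short decision function. The sketch below is an assumption-laden illustration: `enum transport`, `create_fn`, `pick_rpcbind_transport()` and the two stub constructors are hypothetical and do not correspond to kernel symbols.

```c
/* Hypothetical transport kinds for this sketch. */
enum transport { XPRT_NONE, XPRT_LOCAL, XPRT_LOOPBACK };

/* A constructor returns 0 on success and a negative value on failure. */
typedef int (*create_fn)(void);

/* Try the AF_LOCAL transport first; only if it cannot be created,
 * fall back to a loopback network transport. */
static enum transport pick_rpcbind_transport(create_fn create_local,
                                             create_fn create_loopback)
{
    if (create_local() == 0)
        return XPRT_LOCAL;
    if (create_loopback() == 0)
        return XPRT_LOOPBACK;
    return XPRT_NONE;
}

/* Stub constructors used to exercise the decision logic. */
static int create_ok(void)   { return 0; }
static int create_fail(void) { return -1; }
```

No probing of rpcbind is needed with this approach: the failure to create the local transport is itself the signal to fall back.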
  8. 07 Jan, 2011 3 commits
  9. 05 Jan, 2011 1 commit
    • svcrpc: simpler request dropping · 9e701c61
      Committed by J. Bruce Fields
      Currently we use -EAGAIN returns to determine when to drop a deferred
      request.  On its own, that is error-prone, as it makes us treat -EAGAIN
      returns from other functions specially to prevent inadvertent dropping.
      
      So, use a flag on the request instead.
      
      Returning an error on request deferral is still required, to prevent
      further processing, but we no longer need worry that an error return on
      its own could result in a drop.
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
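The difference between signalling a drop through the return value and signalling it through a flag on the request can be shown with a small sketch. Everything named here is hypothetical: `struct request`, its `drop` field, `process()` and both handlers are illustrations, and the use of -EAGAIN as the error returned on deferral is an assumption made for the example.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical request with an explicit "drop this request" flag. */
struct request {
    bool drop;
};

/* A handler that defers the request: it sets the flag and still returns
 * an error so that the caller stops further processing. */
static int handler_defer(struct request *rq)
{
    rq->drop = true;
    return -EAGAIN;
}

/* A handler that fails with -EAGAIN for an unrelated reason. */
static int handler_busy(struct request *rq)
{
    (void)rq;
    return -EAGAIN;
}

/* Returns true if a reply should be sent. The decision depends only on
 * the flag, so an -EAGAIN from elsewhere cannot cause a drop by accident. */
static bool process(struct request *rq, int (*handler)(struct request *))
{
    rq->drop = false;
    (void)handler(rq);
    return !rq->drop;
}
```

Both handlers return the same error code, yet only the one that set the flag causes the request to be dropped.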
  10. 22 Sep, 2010 1 commit
  11. 30 Mar, 2010 1 commit
    • include cleanup: Update gfp.h and slab.h includes to prepare for breaking... · 5a0e3ad6
      Committed by Tejun Heo
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
      
      percpu.h is included by sched.h and module.h and thus ends up being
      included when building most .c files.  percpu.h includes slab.h which
      in turn includes gfp.h making everything defined by the two files
      universally available and complicating inclusion dependencies.
      
      The percpu.h -> slab.h dependency is about to be removed.  Prepare for
      this change by updating users of gfp and slab facilities to include
      those headers directly instead of assuming availability.  As this
      conversion needs to touch a large number of source files, the
      following script is used as the basis of conversion.
      
        http://userweb.kernel.org/~tj/misc/slabh-sweep.py
      
      The script does the following.
      
      * Scan files for gfp and slab usages and update includes such that
        only the necessary includes are there, i.e. gfp.h if only gfp is
        used, slab.h if slab is used.
      
      * When the script inserts a new include, it looks at the include
        blocks and tries to put the new include such that its order conforms
        to its surroundings.  It's put in the include block which contains
        core kernel includes, in the same order that the rest are ordered -
        alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
        doesn't seem to be any matching order.
      
      * If the script can't find a place to put a new include (mostly
        because the file doesn't have fitting include block), it prints out
        an error message indicating which .h file needs to be added to the
        file.
      
      The conversion was done in the following steps.
      
      1. The initial automatic conversion of all .c files updated slightly
         over 4000 files, deleting around 700 includes and adding ~480 gfp.h
         and ~3000 slab.h inclusions.  The script emitted errors for ~400
         files.
      
      2. Each error was manually checked.  Some didn't need the inclusion,
         some needed manual addition while adding it to implementation .h or
         embedding .c file was more appropriate for others.  This step added
         inclusions to around 150 files.
      
      3. The script was run again and the output was compared to the edits
         from #2 to make sure no file was left behind.
      
      4. Several build tests were done and a couple of problems were fixed.
         e.g. lib/decompress_*.c used malloc/free() wrappers around slab
         APIs requiring slab.h to be added manually.
      
      5. The script was run on all .h files but without automatically
         editing them as sprinkling gfp.h and slab.h inclusions around .h
         files could easily lead to inclusion dependency hell.  Most gfp.h
         inclusion directives were ignored as stuff from gfp.h was usually
         widely available and often used in preprocessor macros.  Each
         slab.h inclusion directive was examined and added manually as
         necessary.
      
      6. percpu.h was updated not to include slab.h.
      
      7. Build tests were done on the following configurations and failures
         were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
         distributed build env didn't work with gcov compiles) and a few
         more options had to be turned off depending on archs to make things
         build (like ipr on powerpc/64 which failed due to missing writeq).
      
         * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
         * powerpc and powerpc64 SMP allmodconfig
         * sparc and sparc64 SMP allmodconfig
         * ia64 SMP allmodconfig
         * s390 SMP allmodconfig
         * alpha SMP allmodconfig
         * um on x86_64 SMP allmodconfig
      
      8. percpu.h modifications were reverted so that it could be applied as
         a separate patch and serve as bisection point.
      
      Given the fact that I had only a couple of failures from tests on step
      6, I'm fairly confident about the coverage of this conversion patch.
      If there is a breakage, it's likely to be something in one of the arch
      headers which should be easily discoverable on most builds of
      the specific arch.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
  12. 07 Mar, 2010 1 commit
  13. 10 Feb, 2010 1 commit
  14. 30 Nov, 2009 1 commit
  15. 18 Jun, 2009 4 commits
  16. 17 Jun, 2009 1 commit
  17. 04 Apr, 2009 1 commit
  18. 30 Mar, 2009 1 commit
  19. 29 Mar, 2009 6 commits
    • SUNRPC: rpcb_register() should handle errors silently · 363f724c
      Committed by Chuck Lever
      Move error reporting for RPC registration to rpcb_register's caller.
      
      This way the caller can choose to recover silently from certain
      errors, but report errors it does not recognize.  Error reporting
      for kernel RPC service registration is now handled in one place.
      
      This patch is part of a series that addresses
         http://bugzilla.kernel.org/show_bug.cgi?id=12256
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
    • SUNRPC: Simplify kernel RPC service registration · cadc0fa5
      Committed by Chuck Lever
      The kernel registers RPC services with the local portmapper with an
      rpcbind SET upcall to the local portmapper.  Traditionally, this used
      rpcbind v2 (PMAP), but registering RPC services that support IPv6
      requires rpcbind v3 or v4.
      
      Since we now want separate PF_INET and PF_INET6 listeners for each
      kernel RPC service, svc_register() will do only one of those
      registrations at a time.
      
      For PF_INET, it tries an rpcb v4 SET upcall first; if that fails, it
      does a legacy portmap SET.  This makes it entirely backwards
      compatible with legacy user space, but allows a proper v4 SET to be
      used if rpcbind is available.
      
      For PF_INET6, it does an rpcb v4 SET upcall.  If that fails, it fails
      the registration, and thus the transport creation.  This lets the
      kernel detect if user space is able to support IPv6 RPC services, and
      thus whether it should maintain a PF_INET6 listener for each service
      at all.
      
      This provides complete backwards compatibility with legacy user space
      that only supports rpcbind v2.  The only down-side is that registering
      a new kernel RPC service may take an extra exchange with the local
      portmapper on legacy systems, but this is an infrequent operation and
      is done over UDP (no lingering sockets in TIMEWAIT), so it shouldn't
      be consequential.
      
      This patch is part of a series that addresses
         http://bugzilla.kernel.org/show_bug.cgi?id=12256
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
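The per-family registration policy in this commit message can be written as a small dispatch function. This is a sketch under assumptions, not svc_register() itself: `register_service()`, `upcall_fn` and the stub upcalls are hypothetical, the stub failure code is arbitrary, and the sketch falls back on any negative return from the v4 upcall because the message only says "if that fails".

```c
#include <errno.h>
#include <sys/socket.h>   /* PF_INET, PF_INET6 */

/* An upcall returns 0 on success and a negative errno on failure. */
typedef int (*upcall_fn)(void);

/* PF_INET:  try an rpcbind v4 SET, then fall back to a legacy portmap SET.
 * PF_INET6: try an rpcbind v4 SET only; a failure fails the registration. */
static int register_service(int family, upcall_fn rpcb_v4_set,
                            upcall_fn pmap_v2_set)
{
    int err;

    switch (family) {
    case PF_INET:
        err = rpcb_v4_set();
        if (err < 0)
            err = pmap_v2_set();
        return err;
    case PF_INET6:
        return rpcb_v4_set();
    default:
        return -EAFNOSUPPORT;
    }
}

/* Stub upcalls used to exercise the policy. */
static int upcall_ok(void)   { return 0; }
static int upcall_fail(void) { return -EPROTONOSUPPORT; }
```

With a user space that only speaks rpcbind v2, the PF_INET registration still succeeds through the fallback, while the PF_INET6 registration fails and so no IPv6 listener is kept.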
    • SUNRPC: Simplify svc_unregister() · d5a8620f
      Committed by Chuck Lever
      Our initial implementation of svc_unregister() assumed that PMAP_UNSET
      cleared all rpcbind registrations for a [program, version] tuple.
      However, we now have evidence that PMAP_UNSET clears only "inet"
      entries, and not "inet6" entries, in the rpcbind database.
      
      For backwards compatibility with the legacy portmapper, the
      svc_unregister() function also must work if user space doesn't support
      rpcbind version 4 at all.
      
      Thus we'll send an rpcbind v4 UNSET, and if that fails, we'll send a
      PMAP_UNSET.
      
      This simplifies the code in svc_unregister() and provides better
      backwards compatibility with legacy user space that does not support
      rpcbind version 4.  We can get rid of the conditional compilation in
      here as well.
      
      This patch is part of a series that addresses
         http://bugzilla.kernel.org/show_bug.cgi?id=12256
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
    • SUNRPC: Don't return EPROTONOSUPPORT in svc_register()'s helpers · ba5c35e0
      Committed by Chuck Lever
      The RPC client returns -EPROTONOSUPPORT if there is a protocol version
      mismatch (i.e. the remote RPC server doesn't support the RPC protocol
      version sent by the client).
      
      Helpers for the svc_register() function return -EPROTONOSUPPORT if they
      don't recognize the passed-in IPPROTO_ value.
      
      These are two entirely different failure modes.
      
      Have the helpers return -ENOPROTOOPT instead of -EPROTONOSUPPORT.  This
      will allow callers to determine more precisely what the underlying
      problem is, and decide to report or recover appropriately.
      
      This patch is part of a series that addresses
         http://bugzilla.kernel.org/show_bug.cgi?id=12256
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
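A helper that distinguishes "unrecognized IPPROTO_ value" from "RPC version mismatch" might look like the sketch below. The function `proto_to_netid()` is hypothetical and only illustrates the error-code choice described in this commit message; the netid strings "udp" and "tcp" are used as example outputs.

```c
#include <errno.h>
#include <netinet/in.h>   /* IPPROTO_UDP, IPPROTO_TCP */

/* Hypothetical helper: map a transport protocol number to a netid string.
 * An unrecognized protocol yields -ENOPROTOOPT, leaving -EPROTONOSUPPORT
 * to mean only "the remote side does not support this RPC version". */
static int proto_to_netid(int protocol, const char **netid)
{
    switch (protocol) {
    case IPPROTO_UDP:
        *netid = "udp";
        return 0;
    case IPPROTO_TCP:
        *netid = "tcp";
        return 0;
    default:
        return -ENOPROTOOPT;
    }
}
```

A caller can then treat -ENOPROTOOPT as a local programming or configuration error and -EPROTONOSUPPORT as a condition it may recover from.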
    • SUNRPC: Remove @family argument from svc_create() and svc_create_pooled() · 49a9072f
      Committed by Chuck Lever
      Since an RPC service listener's protocol family is specified now via
      svc_create_xprt(), it no longer needs to be passed to svc_create() or
      svc_create_pooled().  Remove that argument from the synopsis of those
      functions, and remove the sv_family field from the svc_serv struct.
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
    • SUNRPC: Pass a family argument to svc_register() · 4b62e58c
      Committed by Chuck Lever
      The sv_family field is going away.  Instead of using sv_family, have
      the svc_register() function take a protocol family argument.
      
      Since this argument represents a protocol family, and not an address
      family, this argument takes an int, as this is what is passed to
      sock_create_kern().  Also make sure svc_register's helpers are
      checking for PF_FOO instead of AF_FOO.  The values of [AP]F_FOO are
      equivalent; this is simply a symbolic change to reflect the semantics
      of the value stored in that variable.
      
      sock_create_kern() should return EPFNOSUPPORT if the passed-in
      protocol family isn't supported, but it uses EAFNOSUPPORT for this
      case.  We will stick with that tradition here, as svc_register()
      is called by the RPC server in the same path as sock_create_kern().
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
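The claim that PF_FOO and AF_FOO are interchangeable in value, and the choice of EAFNOSUPPORT for an unsupported family, can be checked from user space on Linux. The function `check_family()` below is a hypothetical illustration that takes an `int` protocol family, as this commit message describes; it is not one of svc_register()'s helpers.

```c
#include <errno.h>
#include <sys/socket.h>   /* PF_INET, PF_INET6, AF_INET, AF_INET6 */

/* Hypothetical check written against PF_ constants, since the argument
 * is a protocol family. Unsupported families yield -EAFNOSUPPORT,
 * following the convention this commit message attributes to
 * sock_create_kern(). */
static int check_family(int family)
{
    switch (family) {
    case PF_INET:
    case PF_INET6:
        return 0;
    default:
        return -EAFNOSUPPORT;
    }
}
```

Because the PF_ and AF_ constants compare equal, switching the helpers from AF_FOO to PF_FOO changes only what the code says about the value, not its behavior.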
  20. 28 Mar, 2009 1 commit
  21. 13 Mar, 2009 1 commit
  22. 08 Jan, 2009 1 commit
  23. 30 Sep, 2008 1 commit
    • SUNRPC: Fix up svc_unregister() · f6fb3f6f
      Committed by Chuck Lever
      With the new rpcbind code, a PMAP_UNSET will not have any effect on
      services registered via rpcbind v3 or v4.
      
      Implement a version of svc_unregister() that uses an RPCB_UNSET with
      an empty netid string to make sure we have cleared *all* entries for
      a kernel RPC service when shutting down, or before starting a fresh
      instance of the service.
      
      Use the new version only when CONFIG_SUNRPC_REGISTER_V4 is enabled;
      otherwise, the legacy PMAP version is used to ensure complete
      backwards-compatibility with the Linux portmapper daemon.
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>