1. 20 11月, 2010 4 次提交
    • J
      svcrpc: svc_close_xprt comment · b1763316
      J. Bruce Fields 提交于
      Neil Brown had to explain to me why we do this here; record the answer
      for posterity.
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      b1763316
    • J
      svcrpc: simplify svc_close_all · f8c0d226
      J. Bruce Fields 提交于
      There's no need to be fooling with XPT_BUSY now that all the threads
      are gone.
      
      The list_del_init() here could execute at the same time as the
      svc_xprt_enqueue()'s list_add_tail(), with undefined results.  We don't
      really care at this point, but it might result in a spurious
      list-corruption warning or something.
      
      And svc_close() isn't adding any value; just call svc_delete_xprt()
      directly.
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      f8c0d226
    • J
      nfsd4: centralize more calls to svc_xprt_received · ca7896cd
      J. Bruce Fields 提交于
      Follow up on b48fa6b9 by moving all the
      svc_xprt_received() calls for the main xprt to one place.  The clearing
      of XPT_BUSY here is critical to the correctness of the server, so I'd
      prefer it to be obvious where we do it.
      
      The only substantive result is moving svc_xprt_received() after
      svc_receive_deferred().  Other than a (likely insignificant) delay
      waking up the next thread, that should be harmless.
      
      Also reshuffle the exit code a little to skip a few other steps that we
      don't care about the in the svc_delete_xprt() case.
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      ca7896cd
    • J
      svcrpc: don't set then immediately clear XPT_DEFERRED · 62bac4af
      J. Bruce Fields 提交于
      There's no harm to doing this, since the only caller will immediately
      call svc_enqueue() afterwards, ensuring we don't miss the remaining
      deferred requests just because XPT_DEFERRED was briefly cleared.
      
      But why not just do this the simple way?
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      62bac4af
  2. 26 10月, 2010 3 次提交
  3. 19 10月, 2010 1 次提交
  4. 02 10月, 2010 3 次提交
  5. 27 9月, 2010 2 次提交
  6. 08 9月, 2010 2 次提交
    • J
      svcrpc: minor cache cleanup · 6610f720
      J. Bruce Fields 提交于
      Pull out some code into helper functions, fix a typo.
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      6610f720
    • N
      sunrpc/cache: allow threads to block while waiting for cache update. · f16b6e8d
      NeilBrown 提交于
      The current practice of waiting for cache updates by queueing the
      whole request to be retried has (at least) two problems.
      
      1/ With NFSv4, requests can be quite complex and re-trying a whole
        request when a later part fails should only be a last-resort, not a
        normal practice.
      
      2/ Large requests, and in particular any 'write' request, will not be
        queued by the current code and doing so would be undesirable.
      
      In many cases only a very sort wait is needed before the cache gets
      valid data.
      
      So, providing the underlying transport permits it by setting
       ->thread_wait,
      arrange to wait briefly for an upcall to be completed (as reflected in
      the clearing of CACHE_PENDING).
      If the short wait was not long enough and CACHE_PENDING is still set,
      fall back on the old approach.
      
      The 'thread_wait' value is set to 5 seconds when there are spare
      threads, and 1 second when there are no spare threads.
      
      These values are probably much higher than needed, but will ensure
      some forward progress.
      
      Note that as we only request an update for a non-valid item, and as
      non-valid items are updated in place it is extremely unlikely that
      cache_check will return -ETIMEDOUT.  Normally cache_defer_req will
      sleep for a short while and then find that the item is_valid.
      Signed-off-by: NNeilBrown <neilb@suse.de>
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      f16b6e8d
  7. 03 5月, 2010 1 次提交
    • N
      sunrpc: centralise most calls to svc_xprt_received · b48fa6b9
      Neil Brown 提交于
      svc_xprt_received must be called when ->xpo_recvfrom has finished
      receiving a message, so that the XPT_BUSY flag will be cleared and
      if necessary, requeued for further work.
      
      This call is currently made in each ->xpo_recvfrom function, often
      from multiple different points.  In each case it is the earliest point
      on a particular path where it is known that the protection provided by
      XPT_BUSY is no longer needed.
      
      However there are (still) some error paths which do not call
      svc_xprt_received, and requiring each ->xpo_recvfrom to make the call
      does not encourage robustness.
      
      So: move the svc_xprt_received call to be made just after the
      call to ->xpo_recvfrom(), and move it of the various ->xpo_recvfrom
      methods.
      
      This means that it may not be called at the earliest possible instant,
      but this is unlikely to be a measurable performance issue.
      
      Note that there are still other calls to svc_xprt_received as it is
      also needed when an xprt is newly created.
      Signed-off-by: NNeilBrown <neilb@suse.de>
      Signed-off-by: NJ. Bruce Fields <bfields@citi.umich.edu>
      b48fa6b9
  8. 30 3月, 2010 2 次提交
    • T
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking... · 5a0e3ad6
      Tejun Heo 提交于
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
      
      percpu.h is included by sched.h and module.h and thus ends up being
      included when building most .c files.  percpu.h includes slab.h which
      in turn includes gfp.h making everything defined by the two files
      universally available and complicating inclusion dependencies.
      
      percpu.h -> slab.h dependency is about to be removed.  Prepare for
      this change by updating users of gfp and slab facilities include those
      headers directly instead of assuming availability.  As this conversion
      needs to touch large number of source files, the following script is
      used as the basis of conversion.
      
        http://userweb.kernel.org/~tj/misc/slabh-sweep.py
      
      The script does the followings.
      
      * Scan files for gfp and slab usages and update includes such that
        only the necessary includes are there.  ie. if only gfp is used,
        gfp.h, if slab is used, slab.h.
      
      * When the script inserts a new include, it looks at the include
        blocks and try to put the new include such that its order conforms
        to its surrounding.  It's put in the include block which contains
        core kernel includes, in the same order that the rest are ordered -
        alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
        doesn't seem to be any matching order.
      
      * If the script can't find a place to put a new include (mostly
        because the file doesn't have fitting include block), it prints out
        an error message indicating which .h file needs to be added to the
        file.
      
      The conversion was done in the following steps.
      
      1. The initial automatic conversion of all .c files updated slightly
         over 4000 files, deleting around 700 includes and adding ~480 gfp.h
         and ~3000 slab.h inclusions.  The script emitted errors for ~400
         files.
      
      2. Each error was manually checked.  Some didn't need the inclusion,
         some needed manual addition while adding it to implementation .h or
         embedding .c file was more appropriate for others.  This step added
         inclusions to around 150 files.
      
      3. The script was run again and the output was compared to the edits
         from #2 to make sure no file was left behind.
      
      4. Several build tests were done and a couple of problems were fixed.
         e.g. lib/decompress_*.c used malloc/free() wrappers around slab
         APIs requiring slab.h to be added manually.
      
      5. The script was run on all .h files but without automatically
         editing them as sprinkling gfp.h and slab.h inclusions around .h
         files could easily lead to inclusion dependency hell.  Most gfp.h
         inclusion directives were ignored as stuff from gfp.h was usually
         wildly available and often used in preprocessor macros.  Each
         slab.h inclusion directive was examined and added manually as
         necessary.
      
      6. percpu.h was updated not to include slab.h.
      
      7. Build test were done on the following configurations and failures
         were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
         distributed build env didn't work with gcov compiles) and a few
         more options had to be turned off depending on archs to make things
         build (like ipr on powerpc/64 which failed due to missing writeq).
      
         * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
         * powerpc and powerpc64 SMP allmodconfig
         * sparc and sparc64 SMP allmodconfig
         * ia64 SMP allmodconfig
         * s390 SMP allmodconfig
         * alpha SMP allmodconfig
         * um on x86_64 SMP allmodconfig
      
      8. percpu.h modifications were reverted so that it could be applied as
         a separate patch and serve as bisection point.
      
      Given the fact that I had only a couple of failures from tests on step
      6, I'm fairly confident about the coverage of this conversion patch.
      If there is a breakage, it's likely to be something in one of the arch
      headers which should be easily discoverable easily on most builds of
      the specific arch.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Guess-its-ok-by: NChristoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
      5a0e3ad6
    • J
      svcrpc: don't hold sv_lock over svc_xprt_put() · 788e69e5
      J. Bruce Fields 提交于
      svc_xprt_put() can call tcp_close(), which can sleep, so we shouldn't be
      holding this lock.
      
      In fact, only the xpt_list removal and the sv_tmpcnt decrement should
      need the sv_lock here.
      Reported-by: NMi Jinlong <mijinlong@cn.fujitsu.com>
      Signed-off-by: NJ. Bruce Fields <bfields@citi.umich.edu>
      788e69e5
  9. 01 3月, 2010 2 次提交
  10. 27 2月, 2010 1 次提交
    • N
      sunrpc: remove unnecessary svc_xprt_put · ab1b18f7
      Neil Brown 提交于
      The 'struct svc_deferred_req's on the xpt_deferred queue do not
      own a reference to the owning xprt.  This is seen in svc_revisit
      which is where things are added to this queue.  dr->xprt is set to
      NULL and the reference to the xprt it put.
      
      So when this list is cleaned up in svc_delete_xprt, we mustn't
      put the reference.
      
      Also, replace the 'for' with a 'while' which is arguably
      simpler and more likely to compile efficiently.
      
      Cc: Tom Tucker <tom@opengridcomputing.com>
      Signed-off-by: NNeilBrown <neilb@suse.de>
      Cc: stable@kernel.org
      Signed-off-by: NJ. Bruce Fields <bfields@citi.umich.edu>
      ab1b18f7
  11. 27 1月, 2010 2 次提交
    • C
      SUNRPC: NFS kernel APIs shouldn't return ENOENT for "transport not found" · 68717908
      Chuck Lever 提交于
      write_ports() converts svc_create_xprt()'s ENOENT error return to
      EPROTONOSUPPORT so that rpc.nfsd (in user space) can report an error
      message that makes sense.
      
      It turns out that several of the other kernel APIs rpc.nfsd use can
      also return ENOENT from svc_create_xprt(), by way of lockd_up().
      
      On the client side, an NFSv2 or NFSv3 mount request can also return
      the result of lockd_up().  This error may also be returned during an
      NFSv4 mount request, since the NFSv4 callback service uses
      svc_create_xprt() to create the callback listener.  An ENOENT error
      return results in a confusing error message from the mount command.
      
      Let's have svc_create_xprt() return EPROTONOSUPPORT instead of ENOENT.
      Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: NJ. Bruce Fields <bfields@citi.umich.edu>
      68717908
    • C
      SUNRPC: Bury "#ifdef IPV6" in svc_create_xprt() · d6783b2b
      Chuck Lever 提交于
      Clean up:  Bruce observed we have more or less common logic in each of
      svc_create_xprt()'s callers:  the check to create an IPv6 RPC listener
      socket only if CONFIG_IPV6 is set.  I'm about to add another case
      that does just the same.
      
      If we move the ifdefs into __svc_xpo_create(), then svc_create_xprt()
      call sites can get rid of the "#ifdef" ugliness, and can use the same
      logic with or without IPv6 support available in the kernel.
      Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: NJ. Bruce Fields <bfields@citi.umich.edu>
      d6783b2b
  12. 07 1月, 2010 1 次提交
    • X
      sunrpc: fix peername failed on closed listener · b292cf9c
      Xiaotian Feng 提交于
      There're some warnings of "nfsd: peername failed (err 107)!"
      socket error -107 means Transport endpoint is not connected.
      This warning message was outputed by svc_tcp_accept() [net/sunrpc/svcsock.c],
      when kernel_getpeername returns -107. This means socket might be CLOSED.
      
      And svc_tcp_accept was called by svc_recv() [net/sunrpc/svc_xprt.c]
      
              if (test_bit(XPT_LISTENER, &xprt->xpt_flags)) {
              <snip>
                      newxpt = xprt->xpt_ops->xpo_accept(xprt);
              <snip>
      
      So this might happen when xprt->xpt_flags has both XPT_LISTENER and XPT_CLOSE.
      
      Let's take a look at commit b0401d72, this commit has moved the close
      processing after do recvfrom method, but this commit also introduces this
      warnings, if the xpt_flags has both XPT_LISTENER and XPT_CLOSED, we should
      close it, not accpet then close.
      Signed-off-by: NXiaotian Feng <dfeng@redhat.com>
      Cc: J. Bruce Fields <bfields@fieldses.org>
      Cc: Neil Brown <neilb@suse.de>
      Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: stable@kernel.org
      Signed-off-by: NJ. Bruce Fields <bfields@citi.umich.edu>
      b292cf9c
  13. 30 11月, 2009 1 次提交
  14. 24 11月, 2009 1 次提交
  15. 12 9月, 2009 1 次提交
  16. 28 8月, 2009 1 次提交
  17. 26 8月, 2009 1 次提交
  18. 13 7月, 2009 1 次提交
  19. 29 4月, 2009 2 次提交
    • C
      NFSD: Prevent a buffer overflow in svc_xprt_names() · 335c54bd
      Chuck Lever 提交于
      The svc_xprt_names() function can overflow its buffer if it's so near
      the end of the passed in buffer that the "name too long" string still
      doesn't fit.  Of course, it could never tell if it was near the end
      of the passed in buffer, since its only caller passes in zero as the
      buffer length.
      
      Let's make this API a little safer.
      
      Change svc_xprt_names() so it *always* checks for a buffer overflow,
      and change its only caller to pass in the correct buffer length.
      
      If svc_xprt_names() does overflow its buffer, it now fails with an
      ENAMETOOLONG errno, instead of trying to write a message at the end
      of the buffer.  I don't like this much, but I can't figure out a clean
      way that's always safe to return some of the names, *and* an
      indication that the buffer was not long enough.
      
      The displayed error when doing a 'cat /proc/fs/nfsd/portlist' is
      "File name too long".
      Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: NJ. Bruce Fields <bfields@citi.umich.edu>
      335c54bd
    • H
      net/sunrpc/svc_xprt.c: fix sparse warnings · dcf1a357
      H Hartley Sweeten 提交于
      Fix the following sparse warnings in net/sunrpc/svc_xprt.c.
      
        warning: symbol 'svc_recv' was not declared. Should it be static?
        warning: symbol 'svc_drop' was not declared. Should it be static?
        warning: symbol 'svc_send' was not declared. Should it be static?
        warning: symbol 'svc_close_all' was not declared. Should it be static?
      Signed-off-by: NH Hartley Sweeten <hsweeten@visionengravers.com>
      Signed-off-by: NJ. Bruce Fields <bfields@citi.umich.edu>
      dcf1a357
  20. 04 4月, 2009 1 次提交
  21. 29 3月, 2009 2 次提交
  22. 19 3月, 2009 2 次提交
    • G
      knfsd: add file to export stats about nfsd pools · 03cf6c9f
      Greg Banks 提交于
      Add /proc/fs/nfsd/pool_stats to export to userspace various
      statistics about the operation of rpc server thread pools.
      
      This patch is based on a forward-ported version of
      knfsd-add-pool-thread-stats which has been shipping in the SGI
      "Enhanced NFS" product since 2006 and which was previously
      posted:
      
      http://article.gmane.org/gmane.linux.nfs/10375
      
      It has also been updated thus:
      
       * moved EXPORT_SYMBOL() to near the function it exports
       * made the new struct struct seq_operations const
       * used SEQ_START_TOKEN instead of ((void *)1)
       * merged fix from SGI PV 990526 "sunrpc: use dprintk instead of
         printk in svc_pool_stats_*()" by Harshula Jayasuriya.
       * merged fix from SGI PV 964001 "Crash reading pool_stats before
         nfsds are started".
      Signed-off-by: NGreg Banks <gnb@sgi.com>
      Signed-off-by: NHarshula Jayasuriya <harshula@sgi.com>
      Signed-off-by: NJ. Bruce Fields <bfields@citi.umich.edu>
      03cf6c9f
    • G
      knfsd: avoid overloading the CPU scheduler with enormous load averages · 59a252ff
      Greg Banks 提交于
      Avoid overloading the CPU scheduler with enormous load averages
      when handling high call-rate NFS loads.  When the knfsd bottom half
      is made aware of an incoming call by the socket layer, it tries to
      choose an nfsd thread and wake it up.  As long as there are idle
      threads, one will be woken up.
      
      If there are lot of nfsd threads (a sensible configuration when
      the server is disk-bound or is running an HSM), there will be many
      more nfsd threads than CPUs to run them.  Under a high call-rate
      low service-time workload, the result is that almost every nfsd is
      runnable, but only a handful are actually able to run.  This situation
      causes two significant problems:
      
      1. The CPU scheduler takes over 10% of each CPU, which is robbing
         the nfsd threads of valuable CPU time.
      
      2. At a high enough load, the nfsd threads starve userspace threads
         of CPU time, to the point where daemons like portmap and rpc.mountd
         do not schedule for tens of seconds at a time.  Clients attempting
         to mount an NFS filesystem timeout at the very first step (opening
         a TCP connection to portmap) because portmap cannot wake up from
         select() and call accept() in time.
      
      Disclaimer: these effects were observed on a SLES9 kernel, modern
      kernels' schedulers may behave more gracefully.
      
      The solution is simple: keep in each svc_pool a counter of the number
      of threads which have been woken but have not yet run, and do not wake
      any more if that count reaches an arbitrary small threshold.
      
      Testing was on a 4 CPU 4 NIC Altix using 4 IRIX clients, each with 16
      synthetic client threads simulating an rsync (i.e. recursive directory
      listing) workload reading from an i386 RH9 install image (161480
      regular files in 10841 directories) on the server.  That tree is small
      enough to fill in the server's RAM so no disk traffic was involved.
      This setup gives a sustained call rate in excess of 60000 calls/sec
      before being CPU-bound on the server.  The server was running 128 nfsds.
      
      Profiling showed schedule() taking 6.7% of every CPU, and __wake_up()
      taking 5.2%.  This patch drops those contributions to 3.0% and 2.2%.
      Load average was over 120 before the patch, and 20.9 after.
      
      This patch is a forward-ported version of knfsd-avoid-nfsd-overload
      which has been shipping in the SGI "Enhanced NFS" product since 2006.
      It has been posted before:
      
      http://article.gmane.org/gmane.linux.nfs/10374Signed-off-by: NGreg Banks <gnb@sgi.com>
      Signed-off-by: NJ. Bruce Fields <bfields@citi.umich.edu>
      59a252ff
  23. 08 1月, 2009 3 次提交