1. 02 Oct 2010 (2 commits)
  2. 27 Sep 2010 (2 commits)
  3. 08 Sep 2010 (2 commits)
    • svcrpc: minor cache cleanup · 6610f720
      J. Bruce Fields committed
      Pull out some code into helper functions, fix a typo.
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
      6610f720
    • sunrpc/cache: allow threads to block while waiting for cache update. · f16b6e8d
      NeilBrown committed
      The current practice of waiting for cache updates by queueing the
      whole request to be retried has (at least) two problems.
      
      1/ With NFSv4, requests can be quite complex and re-trying a whole
        request when a later part fails should only be a last resort, not a
        normal practice.
      
      2/ Large requests, and in particular any 'write' request, will not be
        queued by the current code and doing so would be undesirable.
      
      In many cases only a very short wait is needed before the cache gets
      valid data.
      
      So, provided the underlying transport permits it by setting
      ->thread_wait, arrange to wait briefly for an upcall to be completed
      (as reflected in the clearing of CACHE_PENDING).
      If the short wait was not long enough and CACHE_PENDING is still set,
      fall back on the old approach.
      
      The 'thread_wait' value is set to 5 seconds when there are spare
      threads, and 1 second when there are no spare threads.
      
      These values are probably much higher than needed, but will ensure
      some forward progress.
      
      Note that as we only request an update for a non-valid item, and as
      non-valid items are updated in place, it is extremely unlikely that
      cache_check will return -ETIMEDOUT.  Normally cache_defer_req will
      sleep for a short while and then find that the item is valid.
      Signed-off-by: NeilBrown <neilb@suse.de>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
      f16b6e8d
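      As a rough illustration of the short-wait idea in the commit above (not
      the actual cache_defer_req() code; the wait queue and helper name here
      are invented for the example):

        #include <linux/wait.h>
        #include <linux/jiffies.h>
        #include <linux/sunrpc/cache.h>

        static DECLARE_WAIT_QUEUE_HEAD(example_cache_wq);   /* illustrative */

        /* Wait up to 'timeout' jiffies for the pending upcall to complete.
         * Returns 0 if CACHE_PENDING cleared in time, or -EAGAIN if the
         * caller should fall back to deferring the whole request. */
        static int example_wait_for_upcall(struct cache_head *item, long timeout)
        {
                if (wait_event_timeout(example_cache_wq,
                                       !test_bit(CACHE_PENDING, &item->flags),
                                       timeout))
                        return 0;
                return -EAGAIN;
        }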
  4. 03 May 2010 (1 commit)
    • sunrpc: centralise most calls to svc_xprt_received · b48fa6b9
      Neil Brown committed
      svc_xprt_received must be called when ->xpo_recvfrom has finished
      receiving a message, so that the XPT_BUSY flag will be cleared and,
      if necessary, the transport requeued for further work.
      
      This call is currently made in each ->xpo_recvfrom function, often
      from multiple different points.  In each case it is the earliest point
      on a particular path where it is known that the protection provided by
      XPT_BUSY is no longer needed.
      
      However there are (still) some error paths which do not call
      svc_xprt_received, and requiring each ->xpo_recvfrom to make the call
      does not encourage robustness.
      
      So: move the svc_xprt_received call to be made just after the
      call to ->xpo_recvfrom(), and remove it from the various
      ->xpo_recvfrom methods.
      
      This means that it may not be called at the earliest possible instant,
      but this is unlikely to be a measurable performance issue.
      
      Note that there are still other calls to svc_xprt_received as it is
      also needed when an xprt is newly created.
      Signed-off-by: NeilBrown <neilb@suse.de>
      Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
      b48fa6b9
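      The resulting call shape in svc_recv() looks roughly like this (a
      sketch only; the wrapper function is invented and error handling is
      elided):

        #include <linux/sunrpc/svc.h>
        #include <linux/sunrpc/svc_xprt.h>

        /* Sketch: call svc_xprt_received() once, right after ->xpo_recvfrom(),
         * rather than from inside every transport's recvfrom method. */
        static int example_recv_and_release(struct svc_xprt *xprt,
                                            struct svc_rqst *rqstp)
        {
                int len = xprt->xpt_ops->xpo_recvfrom(rqstp);

                /* XPT_BUSY protection is no longer needed; allow requeueing. */
                svc_xprt_received(xprt);
                return len;
        }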
  5. 30 Mar 2010 (2 commits)
    • include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h · 5a0e3ad6
      Tejun Heo committed
      
      percpu.h is included by sched.h and module.h and thus ends up being
      included when building most .c files.  percpu.h includes slab.h which
      in turn includes gfp.h making everything defined by the two files
      universally available and complicating inclusion dependencies.
      
      The percpu.h -> slab.h dependency is about to be removed.  Prepare for
      this change by updating users of gfp and slab facilities to include
      those headers directly instead of assuming availability.  As this
      conversion needs to touch a large number of source files, the
      following script was used as the basis of the conversion.
      
        http://userweb.kernel.org/~tj/misc/slabh-sweep.py
      
      The script does the following:
      
      * Scan files for gfp and slab usages and update includes such that
        only the necessary includes are there, i.e. gfp.h if only gfp is
        used, and slab.h if slab is used.
      
      * When the script inserts a new include, it looks at the include
        blocks and tries to place the new include so that its order conforms
        to its surroundings.  It is put in the include block which contains
        core kernel includes, in the same order that the rest are ordered:
        alphabetical, Christmas tree, reverse Christmas tree, or at the end
        if there doesn't seem to be any matching order.
      
      * If the script can't find a place to put a new include (mostly
        because the file doesn't have a fitting include block), it prints
        out an error message indicating which .h file needs to be added to
        the file.
      
      The conversion was done in the following steps.
      
      1. The initial automatic conversion of all .c files updated slightly
         over 4000 files, deleting around 700 includes and adding ~480 gfp.h
         and ~3000 slab.h inclusions.  The script emitted errors for ~400
         files.
      
      2. Each error was manually checked.  Some didn't need the inclusion,
         some needed manual addition, and for others adding it to an
         implementation .h or embedding .c file was more appropriate.  This
         step added inclusions to around 150 files.
      
      3. The script was run again and the output was compared to the edits
         from #2 to make sure no file was left behind.
      
      4. Several build tests were done and a couple of problems were fixed.
         e.g. lib/decompress_*.c used malloc/free() wrappers around slab
         APIs requiring slab.h to be added manually.
      
      5. The script was run on all .h files but without automatically
         editing them, as sprinkling gfp.h and slab.h inclusions around .h
         files could easily lead to inclusion dependency hell.  Most gfp.h
         inclusion directives were ignored, as stuff from gfp.h was usually
         widely available and often used in preprocessor macros.  Each
         slab.h inclusion directive was examined and added manually as
         necessary.
      
      6. percpu.h was updated not to include slab.h.
      
      7. Build tests were done on the following configurations and failures
         were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
         distributed build env didn't work with gcov compiles) and a few
         more options had to be turned off depending on the arch to make
         things build (like ipr on powerpc/64, which failed due to missing
         writeq).
      
         * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
         * powerpc and powerpc64 SMP allmodconfig
         * sparc and sparc64 SMP allmodconfig
         * ia64 SMP allmodconfig
         * s390 SMP allmodconfig
         * alpha SMP allmodconfig
         * um on x86_64 SMP allmodconfig
      
      8. percpu.h modifications were reverted so that they could be applied
         as a separate patch and serve as a bisection point.
      
      Given that I had only a couple of failures from the tests on step 6,
      I'm fairly confident about the coverage of this conversion patch.  If
      there is a breakage, it's likely to be something in one of the arch
      headers, which should be easily discoverable on most builds of the
      specific arch.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
      5a0e3ad6
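      For an individual source file the effect is simply to spell out the
      headers it really uses instead of inheriting them through percpu.h; an
      illustrative (not real) example:

        /* Illustrative only: this file previously got kmalloc() and GFP_KERNEL
         * indirectly via percpu.h; now it names its real dependencies. */
        #include <linux/gfp.h>          /* GFP_KERNEL */
        #include <linux/slab.h>         /* kmalloc(), kfree() */

        static void *example_alloc(size_t len)
        {
                return kmalloc(len, GFP_KERNEL);
        }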
    • svcrpc: don't hold sv_lock over svc_xprt_put() · 788e69e5
      J. Bruce Fields committed
      svc_xprt_put() can call tcp_close(), which can sleep, so we shouldn't be
      holding this lock.
      
      In fact, only the xpt_list removal and the sv_tmpcnt decrement should
      need the sv_lock here.
      Reported-by: Mi Jinlong <mijinlong@cn.fujitsu.com>
      Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
      788e69e5
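      A sketch of the resulting shape in svc_delete_xprt() (simplified; the
      real function does more than this):

        #include <linux/sunrpc/svc.h>
        #include <linux/sunrpc/svc_xprt.h>

        static void example_unlink_and_put(struct svc_serv *serv,
                                           struct svc_xprt *xprt)
        {
                spin_lock_bh(&serv->sv_lock);
                list_del_init(&xprt->xpt_list);
                if (test_bit(XPT_TEMP, &xprt->xpt_flags))
                        serv->sv_tmpcnt--;
                spin_unlock_bh(&serv->sv_lock);

                /* svc_xprt_put() may end up in tcp_close(), which can sleep,
                 * so it must be called without sv_lock held. */
                svc_xprt_put(xprt);
        }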
  6. 01 Mar 2010 (2 commits)
  7. 27 Feb 2010 (1 commit)
    • sunrpc: remove unnecessary svc_xprt_put · ab1b18f7
      Neil Brown committed
      The 'struct svc_deferred_req's on the xpt_deferred queue do not
      own a reference to the owning xprt.  This can be seen in svc_revisit(),
      which is where things are added to this queue: dr->xprt is set to
      NULL and the reference to the xprt is put.
      
      So when this list is cleaned up in svc_delete_xprt, we mustn't
      put the reference.
      
      Also, replace the 'for' with a 'while', which is arguably
      simpler and more likely to compile efficiently.
      
      Cc: Tom Tucker <tom@opengridcomputing.com>
      Signed-off-by: NeilBrown <neilb@suse.de>
      Cc: stable@kernel.org
      Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
      ab1b18f7
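      The cleanup loop then looks roughly like this (svc_deferred_dequeue()
      is file-local to net/sunrpc/svc_xprt.c; the fragment is shown only to
      illustrate the shape of the fix):

        /* Sketch: drain xpt_deferred without dropping an xprt reference,
         * because the deferred requests never held one (svc_revisit()
         * already put it when queueing them). */
        struct svc_deferred_req *dr;

        while ((dr = svc_deferred_dequeue(xprt)) != NULL)
                kfree(dr);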
  8. 27 Jan 2010 (2 commits)
    • SUNRPC: NFS kernel APIs shouldn't return ENOENT for "transport not found" · 68717908
      Chuck Lever committed
      write_ports() converts svc_create_xprt()'s ENOENT error return to
      EPROTONOSUPPORT so that rpc.nfsd (in user space) can report an error
      message that makes sense.
      
      It turns out that several of the other kernel APIs that rpc.nfsd uses
      can also return ENOENT from svc_create_xprt(), by way of lockd_up().
      
      On the client side, an NFSv2 or NFSv3 mount request can also return
      the result of lockd_up().  This error may also be returned during an
      NFSv4 mount request, since the NFSv4 callback service uses
      svc_create_xprt() to create the callback listener.  An ENOENT error
      return results in a confusing error message from the mount command.
      
      Let's have svc_create_xprt() return EPROTONOSUPPORT instead of ENOENT.
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
      68717908
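      In outline the change is just the error value returned when no
      transport class matches (a sketch with invented names; the real
      svc_create_xprt() takes more arguments and does the class lookup
      differently):

        #include <linux/errno.h>
        #include <linux/sunrpc/svc_xprt.h>

        /* Hypothetical stand-in for the real transport-class lookup. */
        static struct svc_xprt_class *example_find_class(const char *name)
        {
                return NULL;    /* pretend nothing matched */
        }

        static int example_create_xprt(const char *xprt_name)
        {
                struct svc_xprt_class *xcl = example_find_class(xprt_name);

                if (!xcl)
                        return -EPROTONOSUPPORT;        /* previously -ENOENT */

                /* ... create the listener as before ... */
                return 0;
        }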
    • SUNRPC: Bury "#ifdef IPV6" in svc_create_xprt() · d6783b2b
      Chuck Lever committed
      Clean up:  Bruce observed we have more or less common logic in each of
      svc_create_xprt()'s callers:  the check to create an IPv6 RPC listener
      socket only if CONFIG_IPV6 is set.  I'm about to add another case
      that does just the same.
      
      If we move the ifdefs into __svc_xpo_create(), then svc_create_xprt()
      call sites can get rid of the "#ifdef" ugliness, and can use the same
      logic with or without IPv6 support available in the kernel.
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
      d6783b2b
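      The intended shape, as a sketch (function and parameter names are
      illustrative, not the real __svc_xpo_create() signature):

        #include <linux/socket.h>
        #include <linux/errno.h>

        /* Decide whether the requested address family is supported, keeping
         * the CONFIG_IPV6 check in one place so callers need no #ifdef. */
        static int example_family_supported(int family)
        {
                switch (family) {
                case AF_INET:
                        return 0;
        #ifdef CONFIG_IPV6
                case AF_INET6:
                        return 0;
        #endif
                default:
                        return -EAFNOSUPPORT;
                }
        }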
  9. 07 Jan 2010 (1 commit)
    • sunrpc: fix peername failed on closed listener · b292cf9c
      Xiaotian Feng committed
      There are some warnings of "nfsd: peername failed (err 107)!".
      Socket error -107 means "Transport endpoint is not connected".
      This warning is emitted by svc_tcp_accept() [net/sunrpc/svcsock.c]
      when kernel_getpeername() returns -107, which means the socket might
      be CLOSED.
      
      svc_tcp_accept() is called from svc_recv() [net/sunrpc/svc_xprt.c]:
      
              if (test_bit(XPT_LISTENER, &xprt->xpt_flags)) {
              <snip>
                      newxpt = xprt->xpt_ops->xpo_accept(xprt);
              <snip>
      
      So this might happen when xprt->xpt_flags has both XPT_LISTENER and XPT_CLOSE.
      
      Commit b0401d72 moved the close processing to after the recvfrom
      method, but it also introduced these warnings: if xpt_flags has both
      XPT_LISTENER and XPT_CLOSE set, we should close the transport, not
      accept it and then close it.
      Signed-off-by: Xiaotian Feng <dfeng@redhat.com>
      Cc: J. Bruce Fields <bfields@fieldses.org>
      Cc: Neil Brown <neilb@suse.de>
      Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: stable@kernel.org
      Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
      b292cf9c
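      The essence of the fix, continuing the snippet quoted above (a
      simplified sketch; the real ordering of checks in svc_recv() is more
      involved):

              if (test_bit(XPT_CLOSE, &xprt->xpt_flags)) {
                      /* Closed (even if also a listener): delete it rather
                       * than calling ->xpo_accept() on a dead socket. */
                      svc_delete_xprt(xprt);
              } else if (test_bit(XPT_LISTENER, &xprt->xpt_flags)) {
                      newxpt = xprt->xpt_ops->xpo_accept(xprt);
                      /* <snip> */
              }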
  10. 30 Nov 2009 (1 commit)
  11. 24 Nov 2009 (1 commit)
  12. 12 Sep 2009 (1 commit)
  13. 28 Aug 2009 (1 commit)
  14. 26 Aug 2009 (1 commit)
  15. 13 Jul 2009 (1 commit)
  16. 29 Apr 2009 (2 commits)
    • NFSD: Prevent a buffer overflow in svc_xprt_names() · 335c54bd
      Chuck Lever committed
      The svc_xprt_names() function can overflow its buffer if it's so near
      the end of the passed in buffer that the "name too long" string still
      doesn't fit.  Of course, it could never tell if it was near the end
      of the passed in buffer, since its only caller passes in zero as the
      buffer length.
      
      Let's make this API a little safer.
      
      Change svc_xprt_names() so it *always* checks for a buffer overflow,
      and change its only caller to pass in the correct buffer length.
      
      If svc_xprt_names() does overflow its buffer, it now fails with an
      ENAMETOOLONG errno, instead of trying to write a message at the end
      of the buffer.  I don't like this much, but I can't figure out a clean
      way that's always safe to return some of the names, *and* an
      indication that the buffer was not long enough.
      
      The displayed error when doing a 'cat /proc/fs/nfsd/portlist' is
      "File name too long".
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
      335c54bd
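      A sketch of the always-checked formatting loop (locking elided, and the
      function name is illustrative rather than the real svc_xprt_names()):

        #include <linux/errno.h>
        #include <linux/list.h>
        #include <linux/sunrpc/svc.h>
        #include <linux/sunrpc/svc_xprt.h>

        static int example_xprt_names(struct svc_serv *serv, char *buf,
                                      const int buflen)
        {
                struct svc_xprt *xprt;
                int len, totlen = 0;

                list_for_each_entry(xprt, &serv->sv_permsocks, xpt_list) {
                        len = snprintf(buf + totlen, buflen - totlen, "%s %u\n",
                                       xprt->xpt_class->xcl_name,
                                       svc_xprt_local_port(xprt));
                        if (len >= buflen - totlen)
                                return -ENAMETOOLONG;   /* refuse to overflow */
                        totlen += len;
                }
                return totlen;
        }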
    • net/sunrpc/svc_xprt.c: fix sparse warnings · dcf1a357
      H Hartley Sweeten committed
      Fix the following sparse warnings in net/sunrpc/svc_xprt.c.
      
        warning: symbol 'svc_recv' was not declared. Should it be static?
        warning: symbol 'svc_drop' was not declared. Should it be static?
        warning: symbol 'svc_send' was not declared. Should it be static?
        warning: symbol 'svc_close_all' was not declared. Should it be static?
      Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
      Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
      dcf1a357
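      The usual cure for this class of sparse warning is either to make the
      .c file see its own prototypes (by including the subsystem header that
      declares them) or to mark genuinely file-local symbols static.  A
      generic illustration, not the actual patch:

        /* A symbol used only within this file: make it static. */
        static int example_helper(void)
        {
                return 0;
        }

        /* A symbol with external users: its prototype should come from the
         * shared header so sparse can match declaration and definition. */
        int example_recv(void);         /* normally lives in a shared header */

        int example_recv(void)
        {
                return example_helper();
        }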
  17. 04 Apr 2009 (1 commit)
  18. 29 Mar 2009 (2 commits)
  19. 19 Mar 2009 (2 commits)
    • knfsd: add file to export stats about nfsd pools · 03cf6c9f
      Greg Banks committed
      Add /proc/fs/nfsd/pool_stats to export to userspace various
      statistics about the operation of rpc server thread pools.
      
      This patch is based on a forward-ported version of
      knfsd-add-pool-thread-stats which has been shipping in the SGI
      "Enhanced NFS" product since 2006 and which was previously
      posted:
      
      http://article.gmane.org/gmane.linux.nfs/10375
      
      It has also been updated thus:
      
       * moved EXPORT_SYMBOL() to near the function it exports
       * made the new struct seq_operations const
       * used SEQ_START_TOKEN instead of ((void *)1)
       * merged fix from SGI PV 990526 "sunrpc: use dprintk instead of
         printk in svc_pool_stats_*()" by Harshula Jayasuriya.
       * merged fix from SGI PV 964001 "Crash reading pool_stats before
         nfsds are started".
      Signed-off-by: Greg Banks <gnb@sgi.com>
      Signed-off-by: Harshula Jayasuriya <harshula@sgi.com>
      Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
      03cf6c9f
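      A minimal, illustrative seq_file skeleton of the kind of provider that
      sits behind a file like /proc/fs/nfsd/pool_stats (not the actual nfsd
      code; the column names and iteration are indicative only):

        #include <linux/seq_file.h>

        static void *pool_stats_start(struct seq_file *m, loff_t *pos)
        {
                /* SEQ_START_TOKEN marks the header row; per-pool iteration
                 * over the real statistics is elided from this sketch. */
                return *pos == 0 ? SEQ_START_TOKEN : NULL;
        }

        static void *pool_stats_next(struct seq_file *m, void *p, loff_t *pos)
        {
                (*pos)++;
                return NULL;
        }

        static void pool_stats_stop(struct seq_file *m, void *p)
        {
        }

        static int pool_stats_show(struct seq_file *m, void *p)
        {
                if (p == SEQ_START_TOKEN)
                        seq_puts(m, "# pool packets-arrived sockets-enqueued "
                                    "threads-woken threads-timedout\n");
                return 0;
        }

        static const struct seq_operations pool_stats_seq_ops = {
                .start  = pool_stats_start,
                .next   = pool_stats_next,
                .stop   = pool_stats_stop,
                .show   = pool_stats_show,
        };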
    • knfsd: avoid overloading the CPU scheduler with enormous load averages · 59a252ff
      Greg Banks committed
      Avoid overloading the CPU scheduler with enormous load averages
      when handling high call-rate NFS loads.  When the knfsd bottom half
      is made aware of an incoming call by the socket layer, it tries to
      choose an nfsd thread and wake it up.  As long as there are idle
      threads, one will be woken up.
      
      If there are a lot of nfsd threads (a sensible configuration when
      the server is disk-bound or is running an HSM), there will be many
      more nfsd threads than CPUs to run them.  Under a high call-rate
      low service-time workload, the result is that almost every nfsd is
      runnable, but only a handful are actually able to run.  This situation
      causes two significant problems:
      
      1. The CPU scheduler takes over 10% of each CPU, which is robbing
         the nfsd threads of valuable CPU time.
      
      2. At a high enough load, the nfsd threads starve userspace threads
         of CPU time, to the point where daemons like portmap and rpc.mountd
         do not schedule for tens of seconds at a time.  Clients attempting
         to mount an NFS filesystem time out at the very first step (opening
         a TCP connection to portmap) because portmap cannot wake up from
         select() and call accept() in time.
      
      Disclaimer: these effects were observed on a SLES9 kernel; modern
      kernels' schedulers may behave more gracefully.
      
      The solution is simple: keep in each svc_pool a counter of the number
      of threads which have been woken but have not yet run, and do not wake
      any more if that count reaches an arbitrary small threshold.
      
      Testing was on a 4 CPU 4 NIC Altix using 4 IRIX clients, each with 16
      synthetic client threads simulating an rsync (i.e. recursive directory
      listing) workload reading from an i386 RH9 install image (161480
      regular files in 10841 directories) on the server.  That tree is small
      enough to fit in the server's RAM, so no disk traffic was involved.
      This setup gives a sustained call rate in excess of 60000 calls/sec
      before being CPU-bound on the server.  The server was running 128 nfsds.
      
      Profiling showed schedule() taking 6.7% of every CPU, and __wake_up()
      taking 5.2%.  This patch drops those contributions to 3.0% and 2.2%.
      Load average was over 120 before the patch, and 20.9 after.
      
      This patch is a forward-ported version of knfsd-avoid-nfsd-overload
      which has been shipping in the SGI "Enhanced NFS" product since 2006.
      It has been posted before:
      
      http://article.gmane.org/gmane.linux.nfs/10374
      Signed-off-by: Greg Banks <gnb@sgi.com>
      Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
      59a252ff
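      A sketch of the throttle (the structure, counter name, and threshold
      are illustrative; the real patch keeps a per-pool count that the woken
      thread decrements once it actually runs):

        #include <linux/types.h>

        #define EXAMPLE_MAX_WAKING 5    /* arbitrary small threshold */

        struct example_pool {
                unsigned int nwaking;   /* woken but not yet running */
        };

        /* Decide whether to wake another nfsd for newly arrived work. */
        static bool example_may_wake(struct example_pool *pool)
        {
                if (pool->nwaking >= EXAMPLE_MAX_WAKING)
                        return false;   /* enough woken threads already */
                pool->nwaking++;        /* decremented when the thread runs */
                return true;
        }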
  20. 08 Jan 2009 (3 commits)
  21. 07 Jan 2009 (1 commit)
    • sunrpc: add sv_maxconn field to svc_serv (try #3) · c9233eb7
      Jeff Layton committed
      svc_check_conn_limits() attempts to prevent denial of service attacks
      by having the service close old connections once it reaches a
      threshold. This threshold is based on the number of threads in the
      service:
      
      	(serv->sv_nrthreads + 3) * 20
      
      Once we reach this threshold, we drop the oldest connections and a
      printk pops up to warn the admin that they should increase the number
      of threads.
      
      Increasing the number of threads isn't an option, however, for
      services like lockd.  We don't want to eliminate this check entirely
      for such services, but we need some way to increase this limit.
      
      This patch adds a sv_maxconn field to the svc_serv struct. When it's
      set to 0, we use the current method to calculate the max number of
      connections. RPC services can then set this on an as-needed basis.
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Acked-by: Neil Brown <neilb@suse.de>
      Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
      c9233eb7
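      The resulting limit computation, roughly (the helper name is
      illustrative; sv_maxconn and sv_nrthreads are the svc_serv fields the
      commit describes):

        #include <linux/sunrpc/svc.h>

        /* Use the service-supplied cap when set, otherwise fall back to the
         * thread-count heuristic described above. */
        static unsigned int example_max_connections(struct svc_serv *serv)
        {
                return serv->sv_maxconn ? serv->sv_maxconn
                                        : (serv->sv_nrthreads + 3) * 20;
        }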
  22. 30 Sep 2008 (1 commit)
  23. 19 May 2008 (2 commits)
  24. 24 Apr 2008 (2 commits)
    • SUNRPC: allow svc_recv to break out of 500ms sleep when alloc_page fails · 7b54fe61
      Jeff Layton committed
      svc_recv() calls alloc_page(), and if it fails it does a 500ms
      uninterruptible sleep and then reattempts. There doesn't seem to be any
      real reason for this to be uninterruptible, so change it to an
      interruptible sleep. Also check for kthread_stop() and signalled() after
      setting the task state to avoid races that might lead to sleeping after
      kthread_stop() wakes up the task.
      
      I've done some very basic smoke testing with this, but obviously it's
      hard to test the actual changes since this all depends on an
      alloc_page() call failing.
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
      7b54fe61
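      A sketch of the interruptible retry described above (surrounding
      svc_recv() logic is elided and the function name is invented):

        #include <linux/errno.h>
        #include <linux/gfp.h>
        #include <linux/jiffies.h>
        #include <linux/kthread.h>
        #include <linux/sched.h>
        #include <linux/sunrpc/types.h>         /* signalled() */

        static int example_alloc_one_page(struct page **pagep)
        {
                struct page *p = alloc_page(GFP_KERNEL);

                while (!p) {
                        /* Set the task state before checking, so a wake-up
                         * between the check and schedule() is not lost. */
                        set_current_state(TASK_INTERRUPTIBLE);
                        if (signalled() || kthread_should_stop()) {
                                set_current_state(TASK_RUNNING);
                                return -EINTR;
                        }
                        schedule_timeout(msecs_to_jiffies(500));
                        p = alloc_page(GFP_KERNEL);
                }
                *pagep = p;
                return 0;
        }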
    • SUNRPC: have svc_recv() check kthread_should_stop() · 7086721f
      Jeff Layton committed
      When using kthreads that call into svc_recv, we want to make sure that
      they do not block there for a long time when we're trying to take down
      the kthread.
      
      This patch changes svc_recv() to check kthread_should_stop() at the same
      places that it checks to see if it's signalled(). Also check just before
      svc_recv() tries to schedule(). By making sure that we check it just
      after setting the task state we can avoid having to use any locking or
      signalling to ensure it doesn't block for a long time.
      
      There's still a chance of a 500ms sleep if alloc_page() fails, but
      that should be a rare occurrence and isn't a terribly long time in
      the context of a kthread being taken down.
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
      7086721f
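      The check ordering relied on here, shown in isolation (a sketch; in
      svc_recv() this wraps the wait for a socket to service):

        #include <linux/errno.h>
        #include <linux/kthread.h>
        #include <linux/sched.h>

        /* Sketch of the race-free pattern: set the task state first, then
         * check for a stop request, then sleep.  If kthread_stop() runs
         * after the check, it wakes this already-sleeping task instead. */
        static long example_wait_for_work(long timeout)
        {
                set_current_state(TASK_INTERRUPTIBLE);
                if (kthread_should_stop()) {
                        set_current_state(TASK_RUNNING);
                        return -EINTR;
                }
                return schedule_timeout(timeout);
        }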
  25. 18 Mar 2008 (1 commit)
  26. 02 Feb 2008 (2 commits)