1. 10 May 2007 (2 commits)
  2. 26 April 2007 (1 commit)
    • [NET]: convert network timestamps to ktime_t · b7aa0bf7
      Eric Dumazet authored
      We currently use a special structure (struct skb_timeval) and plain
      'struct timeval' to store packet timestamps in sk_buffs and struct
      sock.
      
This has some drawbacks:
      - Fixed resolution of one microsecond.
      - Wasted space on 64-bit platforms, where sizeof(struct timeval) = 16.
      
I suggest using ktime_t, which is a nice abstraction of
      high-resolution time services, currently capable of nanosecond
      resolution.
      
As sizeof(ktime_t) is 8 bytes, using ktime_t in 'struct sock'
      permits an 8-byte shrink of this structure on 64-bit architectures.
      Some other structures also benefit from this size reduction (struct
      ipq in ipv4/ip_fragment.c, struct frag_queue in ipv6/reassembly.c,
      ...)
      
Once this ktime infrastructure is adopted, we can more easily
      provide nanosecond resolution on top of it (ioctl SIOCGSTAMPNS
      and/or SO_TIMESTAMPNS/SCM_TIMESTAMPNS); a small sketch follows this
      entry.
      
Note: this patch includes a bug fix in compat_sock_get_timestamp(),
      where an "err = 0;" was missing (so the syscall returned -ENOENT
      instead of 0).
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
      CC: Stephen Hemminger <shemminger@linux-foundation.org>
      CC: John find <linux.kernel@free.fr>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b7aa0bf7
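      A minimal sketch of the pattern described above, assuming the ktime
      API of that era (the holder struct and function names here are
      illustrative, not the actual patch):

        #include <linux/kernel.h>
        #include <linux/ktime.h>
        #include <linux/time.h>

        /* Hypothetical holder for a packet timestamp. */
        struct stamp_holder {
                ktime_t stamp;  /* 8 bytes, vs 16 for struct timeval on 64-bit */
        };

        static void stamp_now(struct stamp_holder *h)
        {
                h->stamp = ktime_get_real();    /* wall-clock, ns resolution */
        }

        static void stamp_report(const struct stamp_holder *h)
        {
                /* Legacy microsecond view (SIOCGSTAMP-style)... */
                struct timeval tv = ktime_to_timeval(h->stamp);
                /* ...and the full-resolution view (SIOCGSTAMPNS-style). */
                struct timespec ts = ktime_to_timespec(h->stamp);

                printk(KERN_DEBUG "us: %ld.%06ld ns: %ld.%09ld\n",
                       (long)tv.tv_sec, (long)tv.tv_usec,
                       (long)ts.tv_sec, ts.tv_nsec);
        }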
3. 13 April 2007 (1 commit)
  4. 05 April 2007 (1 commit)
  5. 07 March 2007 (3 commits)
  6. 13 February 2007 (14 commits)
  7. 11 February 2007 (1 commit)
  8. 10 February 2007 (1 commit)
    • [PATCH] knfsd: fix a race in closing NFSd connections · aaf68cfb
      NeilBrown authored
If you lose this race, it can iput a socket inode twice, and you get
      a BUG in fs/inode.c.
      
      When I added the option for user-space to close a socket, I added some
      cruft to svc_delete_socket so that I could call that function when closing
      a socket per user-space request.
      
This was the wrong thing to do.  I should have just set SK_CLOSE and
      let the normal mechanisms do the work.

      Not only wrong, but buggy.  The locking was all wrong, and it opened
      up a race whereby a socket could be closed twice.
      
So this patch (sketched after this entry):
        Introduces svc_close_socket, which sets SK_CLOSE and then either
        leaves the close up to a thread or calls svc_delete_socket if it
        can get SK_BUSY.

        Adds a bias to sk_busy which is removed when SK_DEAD is set.
        This avoids races around shutting down the socket.

        Changes several 'spin_lock' calls to 'spin_lock_bh' where the
        _bh was missing.
      
Bugzilla-url: http://bugzilla.kernel.org/show_bug.cgi?id=7916
      Signed-off-by: Neil Brown <neilb@suse.de>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      aaf68cfb
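      The shape of svc_close_socket described above, sketched loosely (the
      flag and function names follow the commit text; the body is an
      approximation, not the actual net/sunrpc/svcsock.c code):

        static void svc_close_socket(struct svc_sock *svsk)
        {
                set_bit(SK_CLOSE, &svsk->sk_flags);  /* request the close */
                if (test_and_set_bit(SK_BUSY, &svsk->sk_flags))
                        /* Lost the race for SK_BUSY: the thread holding it
                         * will notice SK_CLOSE and do the delete itself. */
                        return;
                svc_delete_socket(svsk);        /* won SK_BUSY: delete once */
        }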
9. 31 January 2007 (1 commit)
  10. 27 January 2007 (1 commit)
    • [PATCH] knfsd: fix an NFSD bug with full sized, non-page-aligned reads · 250f3915
      NeilBrown authored
NFSd assumes that the largest number of pages that will be needed
      for a request-plus-response is 2+N, where N pages is the size of the
      largest permitted read/write request.  The '2' is one page for the
      non-data part of the request and one for the non-data part of the
      reply.
      
However, when a read request is not page-aligned and we choose to
      use ->sendfile to send it directly from the page cache, we may need
      N+1 pages to hold the whole reply.  This can overflow an array and
      cause an Oops (the arithmetic is illustrated after this entry).

      This patch increases the size of the page array by one and makes
      sure the extra entry is NULL when it is not in use.
Signed-off-by: Neil Brown <neilb@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      250f3915
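      The arithmetic behind the overflow, as a standalone illustration
      (page size and N = 8 pages assumed for the example):

        #include <stdio.h>

        #define PAGE_SIZE 4096UL

        /* Pages touched by 'len' bytes starting 'off' bytes into a page. */
        static unsigned long pages_spanned(unsigned long off, unsigned long len)
        {
                return (off + len + PAGE_SIZE - 1) / PAGE_SIZE;
        }

        int main(void)
        {
                unsigned long n = 8 * PAGE_SIZE;    /* a full-sized read */

                printf("aligned:   %lu pages\n", pages_spanned(0, n));   /* 8 */
                printf("unaligned: %lu pages\n", pages_spanned(512, n)); /* 9 = N+1 */
                return 0;
        }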
11. 08 December 2006 (2 commits)
  12. 31 October 2006 (2 commits)
    • [PATCH] fix "sunrpc: fix refcounting problems in rpc servers" · 202dd450
      Andrew Morton authored
- printk should remain dprintk.

      - fix coding style.
      
      Cc: Neil Brown <neilb@suse.de>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      202dd450
    • [PATCH] sunrpc: fix refcounting problems in rpc servers · d6740df9
      Neil Brown authored
A recent patch fixed a problem which would occur when the refcount
      on an auth_domain reached zero.  This problem has not been reported
      in practice, despite existing in two major kernel releases, because
      the refcount can never reach zero.

      This patch fixes the problems that stop the refcount from reaching
      zero.
      
1/ We were adding to the refcount when inserting into the hash table,
         but only removing from the hash table when the refcount reached
         zero.  Obviously it never would.  So don't count the implied
         reference of being in the hash table (sketched after this entry).

      2/ There are two paths on which a socket can be destroyed.  One
         called svcauth_unix_info_release(); the other didn't.  So when
         the other path was taken, we could lose a reference to an ip_map,
         which in turn holds a reference to an auth_domain.

         So unify the exit paths into svc_sock_put.  This highlights the
         fact that svc_delete_socket has slightly odd semantics: it does
         not drop a reference but probably should.  Fixing this needs a
         bit more thought and testing.
Signed-off-by: Neil Brown <neilb@suse.de>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      d6740df9
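      Point 1/ is a classic refcounting trap.  A sketch of the two shapes,
      using hypothetical kref-based names (the actual auth_domain code
      differs in detail):

        /* Broken shape: insertion takes a reference that is only dropped
         * when the count reaches zero, so it never does:
         *
         *      hash_insert(dom);
         *      kref_get(&dom->ref);    // the table "owns" a count forever
         *
         * Fixed shape: membership in the hash table is an uncounted,
         * implied reference; the release function unhashes before freeing.
         */
        static void auth_domain_release(struct kref *ref)
        {
                struct auth_domain *dom =
                        container_of(ref, struct auth_domain, ref);

                hash_remove(dom);  /* hypothetical: drop the table entry */
                kfree(dom);
        }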
13. 21 October 2006 (1 commit)
    • [PATCH] knfsd: fix race that can disable NFS server · 1a047060
      NeilBrown authored
This patch is suitable for just about any 2.6 kernel.  It should go
      in 2.6.19 and 2.6.18.2, and possibly even the .17 and .16 stable
      series.

      This is a long-standing bug that seems to have only recently become
      apparent, presumably due to increasing use of NFS over TCP; many
      distros seem to be making it the default.
      
      The SK_CONN bit gets set when a listening socket may be ready
      for an accept, just as SK_DATA is set when data may be available.
      
      It is entirely possible for svc_tcp_accept to be called with neither
      of these set.  It doesn't happen often but there is a small race in
      svc_sock_enqueue as SK_CONN and SK_DATA are tested outside the
      spin_lock.  They could be cleared immediately after the test and
      before the lock is gained.
      
This normally shouldn't be a problem.  The sockets are non-blocking,
      so trying to read() or accept() when there is nothing to do is
      harmless.
      
However: svc_tcp_recvfrom makes the decision "should I accept() or
      should I read()?" based on whether SK_CONN is set or not.  This
      usually works, but is not safe.  The decision should be based on
      whether the socket is in the TCP_LISTEN state or is an established
      connection (see the fragment after this entry).
Signed-off-by: Neil Brown <neilb@suse.de>
      Cc: Adrian Bunk <bunk@stusta.de>
      Cc: <stable@kernel.org>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      1a047060
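      The safe decision described above, as a fragment (field names as in
      sunrpc of that era; the surrounding code is omitted):

        /* In svc_tcp_recvfrom: classify by socket state, not by the racy
         * SK_CONN hint, which may be cleared between test and action. */
        if (svsk->sk_sk->sk_state == TCP_LISTEN) {
                svc_tcp_accept(svsk);  /* a listening socket can only accept */
                return 0;
        }
        /* Otherwise the socket is connected: fall through and read data. */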
14. 06 October 2006 (1 commit)
    • [PATCH] knfsd: tidy up the meaning of 'buffer size' in nfsd/sunrpc · c6b0a9f8
      NeilBrown authored
There is some confusion about the meaning of 'bufsz' for a sunrpc
      server.  In some cases it is the largest message that can be sent or
      received.  In other cases it is the largest 'payload' that can be
      included in an NFS message.
      
In either case, it is not possible for both the request and the
      reply to be this large.  One of the request or reply may be at most
      one page long, which fits nicely with NFS.
      
So we remove 'bufsz' and replace it with two numbers: 'max_payload'
      and 'max_mesg'.  max_payload is the size that the server requests.
      It is used by the server to check the maximum size allowed on a
      particular connection; depending on the protocol, a lower limit
      might be used.

      max_mesg is the largest single message that can be sent or received.
      It is calculated as max_payload, rounded up to a multiple of
      PAGE_SIZE, with PAGE_SIZE added for overhead (see the arithmetic
      after this entry).  Only one of the request and reply may be this
      size; the other must be at most one page.
      
      Cc: Greg Banks <gnb@sgi.com>
      Cc: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Neil Brown <neilb@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      c6b0a9f8
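      The relationship between the two numbers, restated as standalone
      arithmetic (a 32 KB payload is chosen purely as an example):

        #include <stdio.h>

        #define PAGE_SIZE 4096UL
        #define ROUND_UP(x) (((x) + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1))

        int main(void)
        {
                unsigned long max_payload = 32 * 1024;  /* what the server asks for */
                unsigned long max_mesg = ROUND_UP(max_payload) + PAGE_SIZE;

                /* 32768-byte payload -> 36864-byte message: one page of overhead. */
                printf("max_payload=%lu max_mesg=%lu\n", max_payload, max_mesg);
                return 0;
        }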
15. 04 October 2006 (5 commits)
    • [PATCH] knfsd: cache ipmap per TCP socket · 7b2b1fee
      Greg Banks authored
Speed up high call-rate workloads by caching the struct ip_map for
      the peer on the connected struct svc_sock, instead of looking it up
      in the ip_map cache hashtable on every call (sketched after this
      entry).  This helps workloads using AUTH_SYS authentication over
      TCP.
      
      Testing was on a 4 CPU 4 NIC Altix using 4 IRIX clients, each with 16
      synthetic client threads simulating an rsync (i.e.  recursive directory
      listing) workload reading from an i386 RH9 install image (161480 regular files
in 10841 directories) on the server.  That tree is small enough to
      fit in the server's RAM, so no disk traffic was involved.  This
      setup gives a sustained call rate in excess of 60000 calls/sec
      before becoming CPU-bound on the server.
      
Profiling showed that strcmp(), called from ip_map_match(), was
      taking 4.8% of each CPU, and ip_map_lookup() was taking 2.9%.  This
      patch drops both contributions into the profile noise.
      
Note that the above result overstates the value of this patch for
      most workloads.  The synthetic clients all use separate IP
      addresses, so there are 64 entries in the ip_map cache hash.
      Because the kernel measured contained the bug fixed in commit
      1f1e030b and was running on a 64-bit little-endian machine, probably
      all of those 64 entries were on a single chain, thus increasing the
      cost of ip_map_lookup().
      
      With a modern kernel you would need more clients to see the same amount of
      performance improvement.  This patch has helped to scale knfsd to handle a
      deployment with 2000 NFS clients.
Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
      Signed-off-by: Neil Brown <neilb@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      7b2b1fee
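      The fast path this buys, sketched as a fragment (sk_info_authunix is
      the cached-pointer field the patch adds; the lookup helper here is a
      simplified, hypothetical stand-in):

        /* Per-call authentication: try the pointer cached on the connected
         * socket before falling back to the ip_map hashtable. */
        struct ip_map *ipm = svsk->sk_info_authunix;

        if (ipm == NULL) {
                ipm = ip_map_lookup_for_peer(svsk);  /* hypothetical slow path */
                svsk->sk_info_authunix = ipm;        /* cache it (holds a ref) */
        }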
    • [PATCH] knfsd: Avoid excess stack usage in svc_tcp_recvfrom · 3cc03b16
      NeilBrown authored
... by allocating the array of 'kvec' in 'struct svc_rqst'.
      
As we plan to increase RPCSVC_MAXPAGES from 8 up to 256, we can no
      longer allocate an array of this size on the stack (see the sketch
      after this entry).  So we allocate it in 'struct svc_rqst'.

      However, svc_rqst contains (indirectly) an array of the same type
      and size (actually several, but they are in a union).  So rather
      than waste space, we move those arrays out of the separately
      allocated union and into svc_rqst, to share with the kvec moved out
      of svc_tcp_recvfrom (the various arrays are used at different times,
      so there is no conflict).
Signed-off-by: Neil Brown <neilb@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      3cc03b16
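      The space problem in numbers, with a sketch of the move (sizes
      assume a 64-bit build; the struct is illustrative):

        /* A kvec is a pointer plus a length: 16 bytes on 64-bit.  With
         * RPCSVC_MAXPAGES = 256 that is a 4 KB array, far too large for
         * the kernel stack, but fine inside the per-thread svc_rqst,
         * which is allocated once and lives as long as the thread. */
        #define RPCSVC_MAXPAGES 256

        struct svc_rqst_sketch {
                struct kvec rq_vec[RPCSVC_MAXPAGES];  /* was a local array
                                                       * in svc_tcp_recvfrom */
        };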
    • [PATCH] knfsd: Replace two page lists in struct svc_rqst with one · 44524359
      NeilBrown authored
      We are planning to increase RPCSVC_MAXPAGES from about 8 to about 256.  This
      means we need to be a bit careful about arrays of size RPCSVC_MAXPAGES.
      
struct svc_rqst contains two such arrays.  However, there are never
      more than RPCSVC_MAXPAGES pages in the two arrays together, so only
      one array is needed.
      
The two arrays are for the pages holding the request and the pages
      holding the reply.  Instead of two arrays, we can simply keep an
      index recording where the first reply page starts (layout sketched
      after this entry).
      
This patch also removes a number of small inline functions that
      probably serve to obscure what is going on rather than clarify it,
      and open-codes the needed functionality.

      Also remove the 'rq_restailpage' variable, as it is *always* 0;
      i.e., if the response 'xdr' structure has a non-empty tail, it is
      always in the same page as the head.
      
(Leftover review notes from the original changelog:
       check counters are initialised and incremented properly;
       check for consistent usage of ++ etc.;
       maybe extract some inlines for a common approach;
       general review.)
Signed-off-by: Neil Brown <neilb@suse.de>
      Cc: Magnus Maatta <novell@kiruna.se>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      44524359
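      The replacement layout, sketched (rq_pages and rq_respages are the
      names the patch uses; the struct is trimmed to the relevant fields):

        struct svc_rqst_pages_sketch {
                struct page *rq_pages[RPCSVC_MAXPAGES]; /* request, then reply */
                struct page **rq_respages;              /* first reply page:
                                                         * rq_pages + i */
        };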
    • [PATCH] knfsd: Fixed handling of lockd fail when adding nfsd socket · 5680c446
      NeilBrown authored
Arrgh...  We cannot 'lockd_up' before 'svc_addsock', as we don't
      know the protocol yet.  So switch it around again and save the name
      of the created socket so that it can be closed if lockd_up fails
      (ordering sketched after this entry).
Signed-off-by: Neil Brown <neilb@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      5680c446
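      The ordering the commit settles on, as a simplified sketch (the
      signatures are trimmed and the close-by-name helper is hypothetical):

        /* Create the socket first (only then is the protocol known),
         * and undo the creation if taking the lockd reference fails. */
        err = svc_addsock(serv, fd, name_buf);        /* records the name */
        if (err >= 0 && lockd_up() != 0)
                svc_close_socket_by_name(serv, name_buf); /* hypothetical undo */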
    • [PATCH] knfsd: call lockd_down when closing a socket via a write to nfsd/portlist · 37a03472
      NeilBrown authored
      The refcount that nfsd holds on lockd is based on the number of open sockets.
      So when we close a socket, we should decrement the ref (with lockd_down).
      
Currently, when a socket is closed via a write to the portlist file,
      that doesn't happen.
      
So: make sure we get an error return if the requested socket is not
      found, and call lockd_down if it was found (sketched after this
      entry).
      
      Cc: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Neil Brown <neilb@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      37a03472
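      The pairing rule, sketched (the remove-by-name helper is
      hypothetical):

        /* Every open nfsd socket holds one lockd reference, so a
         * successful close via nfsd/portlist must release exactly one. */
        static int close_one_socket(struct svc_serv *serv, const char *name)
        {
                int err = svc_remove_sock_by_name(serv, name); /* hypothetical */

                if (err < 0)
                        return err;  /* not found: report error, no unref */
                lockd_down();        /* drop that socket's lockd reference */
                return 0;
        }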
16. 02 October 2006 (3 commits)