1. 13 Feb, 2007 3 commits
  2. 11 Feb, 2007 1 commit
  3. 10 Feb, 2007 1 commit
    • [PATCH] knfsd: fix a race in closing NFSd connections · aaf68cfb
      NeilBrown committed
      If you lose this race, it can iput a socket inode twice and you get a BUG
      in fs/inode.c.
      
      When I added the option for user-space to close a socket, I added some
      cruft to svc_delete_socket so that I could call that function when closing
      a socket per user-space request.
      
      This was the wrong thing to do.  I should have just set SK_CLOSE and let
      normal mechanisms do the work.
      
      Not only wrong, but buggy.  The locking is all wrong and it opened up a
      race whereby a socket could be closed twice.
      
      So this patch:
        Introduces svc_close_socket, which sets SK_CLOSE and then either leaves
        the close up to a thread, or calls svc_delete_socket if it can
        get SK_BUSY.

        Adds a bias to sk_busy which is removed when SK_DEAD is set.
        This avoids races around shutting down the socket.
      
        Changes several 'spin_lock' to 'spin_lock_bh' where the _bh
        was missing.
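
The close path described above can be sketched roughly as follows. This is a user-space illustration using C11 atomics, not the kernel code: `struct sketch_sock`, its fields, and `sketch_close_socket` are hypothetical names, and the `deleted` counter merely stands in for the real teardown done by svc_delete_socket.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a server socket; not the kernel's struct svc_sock. */
struct sketch_sock {
	atomic_bool close_pending; /* plays the role of SK_CLOSE */
	atomic_bool busy;          /* plays the role of SK_BUSY */
	int deleted;               /* counts how often the socket was torn down */
};

/*
 * Request a close.  Returns true if this caller won the busy bit and tore the
 * socket down itself, false if another owner holds it and is expected to
 * notice close_pending later.
 */
bool sketch_close_socket(struct sketch_sock *s)
{
	atomic_store(&s->close_pending, true);
	if (atomic_exchange(&s->busy, true))
		return false;   /* already owned: leave the close to the owner */
	s->deleted++;           /* stands in for svc_delete_socket() */
	return true;
}
```

Because the close request is always recorded before the busy bit is tested, only one party ever performs the teardown: either the caller that wins the bit or the current owner once it sees the request.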
      
      Bugzilla-url: http://bugzilla.kernel.org/show_bug.cgi?id=7916
      Signed-off-by: Neil Brown <neilb@suse.de>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  4. 31 Jan, 2007 1 commit
  5. 27 Jan, 2007 1 commit
    • [PATCH] knfsd: fix an NFSD bug with full sized, non-page-aligned reads · 250f3915
      NeilBrown committed
      NFSd assumes that the largest number of pages that will be needed for a
      request+response is 2+N, where N pages is the size of the largest permitted
      read/write request.  The '2' are 1 for the non-data part of the request, and 1
      for the non-data part of the reply.

      However, when a read request is not page-aligned, and we choose to use
      ->sendfile to send it directly from the page cache, we may need N+1 pages to
      hold the whole reply.  This can overflow an array and cause an Oops.

      This patch increases the size of the array for holding pages by one and makes sure
      that entry is NULL when it is not in use.
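
The N+1 case can be checked with a little arithmetic. The sketch below is illustrative only: `SKETCH_PAGE_SIZE` (assumed to be 4096) and `sketch_pages_spanned` are hypothetical names, not kernel symbols.

```c
/* Assumed page size for illustration; the kernel's PAGE_SIZE depends on the architecture. */
#define SKETCH_PAGE_SIZE 4096ul

/* Number of page-cache pages touched by a read of len bytes starting at offset. */
unsigned long sketch_pages_spanned(unsigned long offset, unsigned long len)
{
	unsigned long first, last;

	if (len == 0)
		return 0;
	first = offset / SKETCH_PAGE_SIZE;             /* page holding the first byte */
	last = (offset + len - 1) / SKETCH_PAGE_SIZE;  /* page holding the last byte */
	return last - first + 1;
}
```

With N = 8, an aligned read of 8 pages' worth of data touches 8 pages, while the same read starting 100 bytes into a page touches 9.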
      Signed-off-by: Neil Brown <neilb@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  6. 08 Dec, 2006 2 commits
  7. 31 Oct, 2006 2 commits
    • [PATCH] fix "sunrpc: fix refcounting problems in rpc servers" · 202dd450
      Andrew Morton committed
      - printk should remain dprintk
      
      - fix coding-style.
      
      Cc: Neil Brown <neilb@suse.de>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] sunrpc: fix refcounting problems in rpc servers · d6740df9
      Neil Brown committed
      A recent patch fixed a problem which would occur when the refcount on an
      auth_domain reached zero.  This problem has not been reported in practice
      despite existing in two major kernel releases because the refcount can
      never reach zero.
      
      This patch fixes the problems that stop the refcount reaching zero.
      
      1/ We were adding to the refcount when inserting in the hash table,
         but only removing from the hashtable when the refcount reached zero.
         Obviously it never would.  So don't count the implied reference of
         being in the hash table.
      
      2/ There are two paths on which a socket can be destroyed.  One called
         svcauth_unix_info_release().  The other didn't.  So when the other was
         taken, we could lose a reference to an ip_map, which in turn holds a
         reference to an auth_domain.

         So unify the exit paths into svc_sock_put.  This highlights the fact
         that svc_delete_socket has slightly odd semantics - it does not drop
         a reference but probably should.  Fixing this needs a bit more
         thought and testing.
      Signed-off-by: Neil Brown <neilb@suse.de>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  8. 21 Oct, 2006 1 commit
    • [PATCH] knfsd: fix race that can disable NFS server · 1a047060
      NeilBrown committed
      This patch is suitable for just about any 2.6 kernel.  It should go in
      2.6.19 and 2.6.18.2 and possibly even the .17 and .16 stable series.

      This is a long-standing bug that seems to have only recently become
      apparent, presumably due to increasing use of NFS over TCP - many
      distros seem to be making it the default.
      
      The SK_CONN bit gets set when a listening socket may be ready
      for an accept, just as SK_DATA is set when data may be available.
      
      It is entirely possible for svc_tcp_accept to be called with neither
      of these set.  It doesn't happen often but there is a small race in
      svc_sock_enqueue as SK_CONN and SK_DATA are tested outside the
      spin_lock.  They could be cleared immediately after the test and
      before the lock is gained.
      
      This normally shouldn't be a problem.  The sockets are non-blocking so
      trying to read() or accept() when there is nothing to do is not a problem.
      
      However: svc_tcp_recvfrom makes the decision "Should I accept() or
      should I read()" based on whether SK_CONN is set or not.  This usually
      works but is not safe.  The decision should be based on whether it is
      a TCP_LISTEN socket or a TCP_CONNECTED socket.
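
The difference between the two decision rules can be sketched as below. All names here (`struct sketch_sock`, `SKETCH_SK_CONN`, the enums and both functions) are hypothetical and only model the idea; they are not the kernel's definitions.

```c
/* Hypothetical socket model: a stable state plus racy "something may be ready" hint bits. */
enum sketch_tcp_state { SKETCH_TCP_CONNECTED, SKETCH_TCP_LISTEN };
enum sketch_action { SKETCH_DO_READ, SKETCH_DO_ACCEPT };

#define SKETCH_SK_CONN (1ul << 0) /* a connection may be pending */
#define SKETCH_SK_DATA (1ul << 1) /* data may be available */

struct sketch_sock {
	enum sketch_tcp_state state;
	unsigned long flags;
};

/* Unsafe rule: trusts a hint bit that may have been cleared since it was tested. */
enum sketch_action sketch_choose_by_flag(const struct sketch_sock *s)
{
	return (s->flags & SKETCH_SK_CONN) ? SKETCH_DO_ACCEPT : SKETCH_DO_READ;
}

/* Safer rule: a listening socket is only ever accepted on, whatever the hints say. */
enum sketch_action sketch_choose_by_state(const struct sketch_sock *s)
{
	return s->state == SKETCH_TCP_LISTEN ? SKETCH_DO_ACCEPT : SKETCH_DO_READ;
}
```

For a listening socket whose hint bits were cleared after the enqueue test, the flag-based rule picks read() while the state-based rule still picks accept().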
      Signed-off-by: Neil Brown <neilb@suse.de>
      Cc: Adrian Bunk <bunk@stusta.de>
      Cc: <stable@kernel.org>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  9. 06 Oct, 2006 1 commit
    • [PATCH] knfsd: tidy up up meaning of 'buffer size' in nfsd/sunrpc · c6b0a9f8
      NeilBrown committed
      There is some confusion about the meaning of 'bufsz' for a sunrpc server.
      In some cases it is the largest message that can be sent or received.  In
      other cases it is the largest 'payload' that can be included in a NFS
      message.
      
      In either case, it is not possible for both the request and the reply to be
      this large.  One of the request or reply may only be one page long, which
      fits nicely with NFS.
      
      So we remove 'bufsz' and replace it with two numbers: 'max_payload' and
      'max_mesg'.  Max_payload is the size that the server requests.  It is used
      by the server to check the max size allowed on a particular connection:
      depending on the protocol a lower limit might be used.
      
      max_mesg is the largest single message that can be sent or received.  It is
      calculated as the max_payload, rounded up to a multiple of PAGE_SIZE, and
      with PAGE_SIZE added for overhead.  Only one of the request and reply may be
      this size.  The other must be at most one page.
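
As a worked example of that calculation, here is a minimal sketch. `SKETCH_PAGE_SIZE` (assumed to be 4096) and `sketch_max_mesg` are hypothetical names used for illustration, not the identifiers used in the patch.

```c
/* Assumed page size for illustration; the kernel's PAGE_SIZE depends on the architecture. */
#define SKETCH_PAGE_SIZE 4096ul

/* max_mesg = max_payload rounded up to a whole number of pages, plus one page of overhead. */
unsigned long sketch_max_mesg(unsigned long max_payload)
{
	unsigned long pages = (max_payload + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;

	return pages * SKETCH_PAGE_SIZE + SKETCH_PAGE_SIZE;
}
```

Under these assumptions a 32768-byte payload gives a 36864-byte max_mesg (eight data pages plus one), and a 1000-byte payload gives 8192 bytes.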
      
      Cc: Greg Banks <gnb@sgi.com>
      Cc: "J. Bruce Fields" <bfields@fieldses.org>
      Signed-off-by: Neil Brown <neilb@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  10. 04 Oct, 2006 5 commits
    • [PATCH] knfsd: knfsd: cache ipmap per TCP socket · 7b2b1fee
      Greg Banks committed
      Speed up high call-rate workloads by caching the struct ip_map for the peer on
      the connected struct svc_sock instead of looking it up in the ip_map cache
      hashtable on every call.  This helps workloads using AUTH_SYS authentication
      over TCP.
      
      Testing was on a 4 CPU 4 NIC Altix using 4 IRIX clients, each with 16
      synthetic client threads simulating an rsync (i.e.  recursive directory
      listing) workload reading from an i386 RH9 install image (161480 regular files
      in 10841 directories) on the server.  That tree is small enough to fit in the
      server's RAM so no disk traffic was involved.  This setup gives a sustained
      call rate in excess of 60000 calls/sec before being CPU-bound on the server.
      
      Profiling showed strcmp(), called from ip_map_match(), was taking 4.8% of each
      CPU, and ip_map_lookup() was taking 2.9%.  This patch drops both contributions
      into the profile noise.
      
      Note that the above result overstates the value of this patch for most
      workloads.  The synthetic clients are all using separate IP addresses, so
      there are 64 entries in the ip_map cache hash.  Because the kernel measured
      contained the bug fixed in commit 1f1e030b and was running on a 64-bit
      little-endian machine, probably all of those 64 entries were on a single
      chain, thus increasing the cost of ip_map_lookup().
      
      With a modern kernel you would need more clients to see the same amount of
      performance improvement.  This patch has helped to scale knfsd to handle a
      deployment with 2000 NFS clients.
      Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
      Signed-off-by: Neil Brown <neilb@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] knfsd: Avoid excess stack usage in svc_tcp_recvfrom · 3cc03b16
      NeilBrown committed
      ..  by allocating the array of 'kvec' in 'struct svc_rqst'.
      
      As we plan to increase RPCSVC_MAXPAGES from 8 up to 256, we can no longer
      allocate an array of this size on the stack.  So we allocate it in 'struct
      svc_rqst'.
      
      However svc_rqst contains (indirectly) an array of the same type and size
      (actually several, but they are in a union).  So rather than waste space, we
      move those arrays out of the separately allocated union and into svc_rqst to
      share with the kvec moved out of svc_tcp_recvfrom (various arrays are used at
      different times, so there is no conflict).
      Signed-off-by: Neil Brown <neilb@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] knfsd: Replace two page lists in struct svc_rqst with one · 44524359
      NeilBrown committed
      We are planning to increase RPCSVC_MAXPAGES from about 8 to about 256.  This
      means we need to be a bit careful about arrays of size RPCSVC_MAXPAGES.
      
      struct svc_rqst contains two such arrays.  However, there are never more
      than RPCSVC_MAXPAGES pages in the two arrays together, so only one array is
      needed.
      
      The two arrays are for the pages holding the request, and the pages holding
      the reply.  Instead of two arrays, we can simply keep an index into where the
      first reply page is.
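
The single-array layout can be pictured with the sketch below. It is not the kernel's struct svc_rqst: `struct sketch_rqst`, `first_reply`, `SKETCH_MAXPAGES` and `sketch_reply_slot` are hypothetical names chosen for illustration.

```c
#include <stddef.h>

#define SKETCH_MAXPAGES 8 /* illustrative; stands in for RPCSVC_MAXPAGES */

/* One shared page array: request pages are [0, first_reply), reply pages follow. */
struct sketch_rqst {
	void *pages[SKETCH_MAXPAGES];
	size_t first_reply; /* index of the first reply page */
};

/* Slot of the i-th reply page, or NULL if it would fall outside the shared array. */
void **sketch_reply_slot(struct sketch_rqst *rq, size_t i)
{
	if (rq->first_reply >= SKETCH_MAXPAGES ||
	    i >= SKETCH_MAXPAGES - rq->first_reply)
		return NULL;
	return &rq->pages[rq->first_reply + i];
}
```

Keeping only an index means the request and reply can split the array in any proportion, as long as their combined size stays within the limit.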
      
      This patch also removes a number of small inline functions that probably
      serve to obscure what is going on rather than clarify it, and open-codes the
      needed functionality.
      
      Also remove the 'rq_restailpage' variable as it is *always* 0.  i.e.  if the
      response 'xdr' structure has a non-empty tail it is always in the same pages
      as the head.
      
       check counters are initialised and incr properly
       check for consistent usage of ++ etc
       maybe extra some inlines for common approach
       general review
      Signed-off-by: Neil Brown <neilb@suse.de>
      Cc: Magnus Maatta <novell@kiruna.se>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] knfsd: Fixed handling of lockd fail when adding nfsd socket · 5680c446
      NeilBrown committed
      Arrgg..  We cannot 'lockd_up' before 'svc_addsock' as we don't know the
      protocol yet....  So switch it around again and save the name of the created
      socket so that it can be closed if lockd_up fails.
      Signed-off-by: Neil Brown <neilb@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] knfsd: call lockd_down when closing a socket via a write to nfsd/portlist · 37a03472
      NeilBrown committed
      The refcount that nfsd holds on lockd is based on the number of open sockets.
      So when we close a socket, we should decrement the ref (with lockd_down).
      
      Currently when a socket is closed via writing to the portlist file, that
      doesn't happen.
      
      So: make sure we get an error return if the socket that was requested is
      not found, and call lockd_down if it was.
      
      Cc: "J. Bruce Fields" <bfields@fieldses.org>
      Signed-off-by: Neil Brown <neilb@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  11. 02 Oct, 2006 10 commits
  12. 29 Sep, 2006 2 commits
  13. 23 Sep, 2006 1 commit
  14. 22 Jul, 2006 1 commit
  15. 21 Mar, 2006 1 commit
  16. 19 Jan, 2006 1 commit
  17. 07 Jan, 2006 1 commit
    • [PATCH] Keep nfsd from exiting when seeing recv() errors · 93fbf1a5
      Olaf Kirch committed
      I submitted this one previously - svc_tcp_recvfrom currently returns
      any errors to the caller, including ECONNRESET and the like.
      
      This is something svc_recv isn't able to deal with:
      
      	len = svsk->sk_recvfrom(rqstp);
      	[...]
      	if (len == 0 || len == -EAGAIN) {
      		[...]
      		return -EAGAIN;
      	}
      
      	[...]
      	return len;
      
      The nfsd main loop will exit when it sees an error code other than
      EAGAIN.
      
      The following patch fixes this problem.
      
      svc_recv is not equipped to deal with error codes other than EAGAIN,
      and will propagate anything else (such as ECONNRESET) up to nfsd,
      causing it to exit.
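
One way to keep such errors away from the main loop is to fold them into EAGAIN at the receive layer and mark the connection for closing. The sketch below assumes that approach; `sketch_normalise_recv` and its `close_conn` out-parameter are hypothetical, and the commit message itself does not spell out how the patch does it.

```c
#include <errno.h>

/*
 * Map a raw receive result to what the caller's loop can handle.
 * Successful lengths and -EAGAIN pass through unchanged; any other
 * error is turned into -EAGAIN and *close_conn is set so the
 * connection can be dropped instead of the service thread exiting.
 */
int sketch_normalise_recv(int len, int *close_conn)
{
	*close_conn = 0;
	if (len >= 0 || len == -EAGAIN)
		return len;
	*close_conn = 1; /* e.g. -ECONNRESET: give up on this connection only */
	return -EAGAIN;
}
```

With this shape the caller only ever sees a length or -EAGAIN, so a reset on one client connection cannot terminate the server thread.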
      Signed-off-by: Olaf Kirch <okir@suse.de>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Cc: Neil Brown <neilb@cse.unsw.edu.au>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  18. 04 Jan, 2006 1 commit
    • [NET]: move struct proto_ops to const · 90ddc4f0
      Eric Dumazet committed
      I noticed that some of 'struct proto_ops' used in the kernel may share
      a cache line used by locks or other heavily modified data. (default
      linker alignment is 32 bytes, and L1_CACHE_LINE is 64 or 128 at
      least)
      
      This patch makes sure a 'struct proto_ops' can be declared as const,
      so that all cpus can share all parts of it without false sharing.
      
      This is not mandatory: a driver can still use a read/write structure
      if it needs to (and eventually a __read_mostly).

      I made a global substitution to change all existing occurrences to make
      them const.

      This should reduce the possibility of false sharing on SMP, and
      speed up some socket system calls.
      Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  19. 16 Nov, 2005 1 commit
  20. 11 Nov, 2005 1 commit
    • [NET]: Detect hardware rx checksum faults correctly · fb286bb2
      Herbert Xu committed
      Here is the patch that introduces the generic skb_checksum_complete
      which also checks for hardware RX checksum faults.  If that happens,
      it'll call netdev_rx_csum_fault which currently prints out a stack
      trace with the device name.  In future it can turn off RX checksum.
      
      I've converted every spot under net/ that does RX checksum checks to
      use skb_checksum_complete or __skb_checksum_complete with the
      exceptions of:
      
      * Those places where checksums are done bit by bit.  These will call
      netdev_rx_csum_fault directly.
      
      * The following have not been completely checked/converted:
      
      ipmr
      ip_vs
      netfilter
      dccp
      
      This patch is based on patches and suggestions from Stephen Hemminger
      and David S. Miller.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  21. 27 Oct, 2005 1 commit
  22. 24 Sep, 2005 1 commit