1. 04 Apr, 2009 2 commits
  2. 29 Mar, 2009 5 commits
  3. 20 Mar, 2009 1 commit
    • SUNRPC: Add the equivalent of the linger and linger2 timeouts to RPC sockets · 7d1e8255
      Trond Myklebust committed
      This fixes a regression against FreeBSD servers as reported by Tomas
      Kasparek. Apparently when using RPC over a TCP socket, the FreeBSD servers
      don't ever react to the client closing the socket, and so commit
      e06799f9 (SUNRPC: Use shutdown() instead of
      close() when disconnecting a TCP socket) causes the setup to hang forever
      whenever the client attempts to close and then reconnect.
      
      We break the deadlock by adding a 'linger2' style timeout to the socket,
      after which the client will abort the connection using a TCP 'RST'.
      
      The default timeout is set to 15 seconds. A subsequent patch will put it
      under user control by means of a sysctl.
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
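The timeout-then-abort idea in this commit can be sketched in user space. The following is only an illustrative Python analogue, not the kernel change itself: the function name `close_with_linger2` and its return strings are hypothetical, and the abort is done with the standard `SO_LINGER` option set to a zero linger time, which on TCP sockets causes `close()` to send an RST. The 15-second default mirrors the value quoted in the commit message.

```python
import select
import socket
import struct
import time


def close_with_linger2(sock, timeout=15.0):
    """Half-close `sock`, wait up to `timeout` seconds for the peer to
    close its side, and abort the connection if it never does.

    Hypothetical helper for illustration; returns "closed" when the peer
    reacted in time and "reset" when the connection was aborted.
    """
    try:
        sock.shutdown(socket.SHUT_WR)  # send FIN but keep reading
    except OSError:
        pass  # the connection is already gone

    deadline = time.monotonic() + timeout
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        readable, _, _ = select.select([sock], [], [], remaining)
        if not readable:
            break  # peer stayed silent for the whole timeout
        try:
            data = sock.recv(4096)
        except OSError:
            data = b""
        if not data:  # EOF: the peer closed its side
            sock.close()
            return "closed"

    # The peer never reacted: l_onoff=1, l_linger=0 makes close() abort
    # the connection (an RST on TCP) instead of lingering.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                    struct.pack("ii", 1, 0))
    sock.close()
    return "reset"
```

A caller would pass the connected socket it wants to tear down; after a "reset" result it is free to reconnect immediately rather than waiting on an unresponsive server.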
  4. 19 Mar, 2009 4 commits
    • SUNRPC: Clean up static inline functions in svc_xprt.h · 2795e53b
      Chuck Lever committed
      Clean up:  Enable the use of const arguments in higher level svc_ APIs
      by adding const to the arguments of the helper functions in svc_xprt.h
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
    • knfsd: add file to export stats about nfsd pools · 03cf6c9f
      Greg Banks committed
      Add /proc/fs/nfsd/pool_stats to export to userspace various
      statistics about the operation of rpc server thread pools.
      
      This patch is based on a forward-ported version of
      knfsd-add-pool-thread-stats which has been shipping in the SGI
      "Enhanced NFS" product since 2006 and which was previously
      posted:
      
      http://article.gmane.org/gmane.linux.nfs/10375
      
      It has also been updated thus:
      
       * moved EXPORT_SYMBOL() to near the function it exports
       * made the new struct seq_operations const
       * used SEQ_START_TOKEN instead of ((void *)1)
       * merged fix from SGI PV 990526 "sunrpc: use dprintk instead of
         printk in svc_pool_stats_*()" by Harshula Jayasuriya.
       * merged fix from SGI PV 964001 "Crash reading pool_stats before
         nfsds are started".
      Signed-off-by: Greg Banks <gnb@sgi.com>
      Signed-off-by: Harshula Jayasuriya <harshula@sgi.com>
      Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
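A userspace consumer of /proc/fs/nfsd/pool_stats might read it along these lines. This is a minimal sketch that assumes the file starts with a `#`-prefixed header naming the columns, followed by one whitespace-separated row of integers per pool; the commit message does not spell out the layout, so the column names in `SAMPLE` are placeholders and `parse_pool_stats` is a hypothetical helper.

```python
def parse_pool_stats(text):
    """Parse pool_stats-style text into a list of dicts, one per pool.

    Column names are taken from the '#' header line rather than being
    hard-coded, since the exact layout is an assumption here.
    """
    columns = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            columns = line.lstrip("#").split()
            continue
        values = [int(v) for v in line.split()]
        if columns is None or len(columns) != len(values):
            raise ValueError("row does not match header: %r" % line)
        rows.append(dict(zip(columns, values)))
    return rows


# Placeholder data for illustration only; not real server output.
SAMPLE = """\
# pool packets-arrived sockets-enqueued threads-woken
0 1200 1100 900
1 800 750 600
"""

stats = parse_pool_stats(SAMPLE)
```

On a live server the text would come from reading the proc file instead of `SAMPLE`.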
    • knfsd: avoid overloading the CPU scheduler with enormous load averages · 59a252ff
      Greg Banks committed
      Avoid overloading the CPU scheduler with enormous load averages
      when handling high call-rate NFS loads.  When the knfsd bottom half
      is made aware of an incoming call by the socket layer, it tries to
      choose an nfsd thread and wake it up.  As long as there are idle
      threads, one will be woken up.
      
      If there are a lot of nfsd threads (a sensible configuration when
      the server is disk-bound or is running an HSM), there will be many
      more nfsd threads than CPUs to run them.  Under a high call-rate
      low service-time workload, the result is that almost every nfsd is
      runnable, but only a handful are actually able to run.  This situation
      causes two significant problems:
      
      1. The CPU scheduler takes over 10% of each CPU, which is robbing
         the nfsd threads of valuable CPU time.
      
      2. At a high enough load, the nfsd threads starve userspace threads
         of CPU time, to the point where daemons like portmap and rpc.mountd
         do not schedule for tens of seconds at a time.  Clients attempting
         to mount an NFS filesystem time out at the very first step (opening
         a TCP connection to portmap) because portmap cannot wake up from
         select() and call accept() in time.
      
      Disclaimer: these effects were observed on a SLES9 kernel; modern
      kernels' schedulers may behave more gracefully.
      
      The solution is simple: keep in each svc_pool a counter of the number
      of threads which have been woken but have not yet run, and do not wake
      any more if that count reaches an arbitrary small threshold.
      
      Testing was on a 4 CPU 4 NIC Altix using 4 IRIX clients, each with 16
      synthetic client threads simulating an rsync (i.e. recursive directory
      listing) workload reading from an i386 RH9 install image (161480
      regular files in 10841 directories) on the server.  That tree is small
      enough to fit in the server's RAM so no disk traffic was involved.
      This setup gives a sustained call rate in excess of 60000 calls/sec
      before being CPU-bound on the server.  The server was running 128 nfsds.
      
      Profiling showed schedule() taking 6.7% of every CPU, and __wake_up()
      taking 5.2%.  This patch drops those contributions to 3.0% and 2.2%.
      Load average was over 120 before the patch, and 20.9 after.
      
      This patch is a forward-ported version of knfsd-avoid-nfsd-overload
      which has been shipping in the SGI "Enhanced NFS" product since 2006.
      It has been posted before:
      
      http://article.gmane.org/gmane.linux.nfs/10374
      Signed-off-by: Greg Banks <gnb@sgi.com>
      Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
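The throttling rule described in this commit (count the threads that have been woken but have not yet run, and stop waking more once that count hits a small threshold) can be modelled in a few lines. The class below is a toy simulation, not the kernel code: the names `Pool`, `nwaking` and `wake_threshold` are illustrative, and the threshold of 5 is an assumed value standing in for the "arbitrary small threshold" in the message.

```python
class Pool:
    """Toy model of one thread pool with wake-up throttling."""

    def __init__(self, nthreads, wake_threshold=5):
        self.idle = nthreads            # threads asleep, waiting for work
        self.nwaking = 0                # woken but have not yet run
        self.wake_threshold = wake_threshold  # assumed value
        self.pending = 0                # calls queued for service
        self.overloads_avoided = 0      # wake-ups skipped by the throttle

    def enqueue(self):
        """An incoming call arrives; return True if a thread was woken."""
        self.pending += 1
        if self.idle == 0:
            return False
        if self.nwaking >= self.wake_threshold:
            # Enough threads are already runnable; waking more would
            # only lengthen the run queue.
            self.overloads_avoided += 1
            return False
        self.idle -= 1
        self.nwaking += 1
        return True

    def thread_runs(self):
        """A woken thread gets the CPU, serves one call, and goes idle."""
        if self.nwaking == 0:
            raise RuntimeError("no woken thread is waiting to run")
        self.nwaking -= 1
        if self.pending:
            self.pending -= 1
        self.idle += 1
```

With 128 threads and a burst of 100 calls, this model wakes only 5 threads up front; further wake-ups happen as woken threads actually get to run, which is what keeps the number of runnable threads near the number of CPUs.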
    • nfs: replace uses of __constant_{endian} · 77f18f5e
      Harvey Harrison committed
      The base versions handle constant folding now, and none of these headers
      are exported to userspace, so the __ prefixed versions are not
      necessary.
      Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
      Reviewed-by: NeilBrown <neilb@suse.de>
      Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
  5. 12 Mar, 2009 1 commit
  6. 07 Jan, 2009 1 commit
    • sunrpc: add sv_maxconn field to svc_serv (try #3) · c9233eb7
      Jeff Layton committed
      svc_check_conn_limits() attempts to prevent denial of service attacks
      by having the service close old connections once it reaches a
      threshold. This threshold is based on the number of threads in the
      service:
      
      	(serv->sv_nrthreads + 3) * 20
      
      Once we reach this, we drop the oldest connections and a printk pops
      to warn the admin that they should increase the number of threads.
      
      Increasing the number of threads isn't an option however for services
      like lockd. We don't want to eliminate this check entirely for such
      services but we need some way to increase this limit.
      
      This patch adds a sv_maxconn field to the svc_serv struct. When it's
      set to 0, we use the current method to calculate the max number of
      connections. RPC services can then set this on an as-needed basis.
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Acked-by: Neil Brown <neilb@suse.de>
      Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
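The limit calculation this commit describes is small enough to write out. The function below is a sketch of the rule as stated in the message (an explicit sv_maxconn wins; 0 falls back to the thread-based formula); `connection_limit` is a hypothetical name, not a kernel symbol.

```python
def connection_limit(sv_nrthreads, sv_maxconn=0):
    """Return the connection count at which old connections get dropped.

    sv_maxconn == 0 means "not set", so the thread-based formula from
    the commit message applies: (sv_nrthreads + 3) * 20.
    """
    if sv_maxconn:
        return sv_maxconn
    return (sv_nrthreads + 3) * 20


single_thread_default = connection_limit(1)        # (1 + 3) * 20
eight_thread_default = connection_limit(8)         # (8 + 3) * 20
single_thread_raised = connection_limit(1, 1024)   # explicit override
```

For a single-threaded service such as lockd the default works out to 80 connections, which is why a way to raise the limit without adding threads is useful.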
  7. 24 Dec, 2008 5 commits
  8. 31 Oct, 2008 1 commit
  9. 30 Oct, 2008 1 commit
  10. 29 Oct, 2008 1 commit
  11. 11 Oct, 2008 2 commits
  12. 07 Oct, 2008 3 commits
  13. 05 Oct, 2008 1 commit
  14. 04 Oct, 2008 1 commit
    • svcrdma: Add Fast Reg MR Data Types · 0d3ebb9a
      Tom Tucker committed
      Add data types to track Fast Reg Memory Regions. The core data type is
      svc_rdma_fastreg_mr that associates a device MR with a host kva and page
      list. A field is added to the WR context to keep track of the FRMR
      used to map the local memory for an RPC.
      
      An FRMR list and spin lock are added to the transport instance to keep
      track of all FRMR allocated for the transport. Also added are device
      capability flags to indicate what the memory registration
      capabilities are for the underlying device and whether or not fast
      memory registration is supported.
      Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
  15. 30 Sep, 2008 5 commits
  16. 14 Aug, 2008 1 commit
    • svcrdma: Fix race between svc_rdma_recvfrom thread and the dto_tasklet · 24b8b447
      Tom Tucker committed
      RDMA_READ completions are kept on a separate queue from the general
      I/O request queue. Since a separate lock is used to protect the RDMA_READ
      completion queue, a race exists between the dto_tasklet and the
      svc_rdma_recvfrom thread where the dto_tasklet sets the XPT_DATA
      bit and adds I/O to the read-completion queue. Concurrently, the
      recvfrom thread checks the generic queue, finds it empty and resets
      the XPT_DATA bit. A subsequent svc_xprt_enqueue will fail to enqueue
      the transport for I/O and cause the transport to "stall".
      
      The fix is to protect both lists with the same lock and set the XPT_DATA
      bit with this lock held.
      Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
      Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
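The shape of the fix (one lock covering both queues, with the data-pending flag only ever changed while that lock is held) can be illustrated with a small threading model. This is a simplified Python sketch rather than the svcrdma code: `Transport`, `has_data` and the method names are stand-ins for the transport instance, the XPT_DATA bit, the dto_tasklet and svc_rdma_recvfrom.

```python
import threading
from collections import deque


class Transport:
    """Toy transport with two work queues guarded by a single lock."""

    def __init__(self):
        self.lock = threading.Lock()
        self.dto_q = deque()            # general I/O request queue
        self.read_complete_q = deque()  # RDMA_READ completions
        self.has_data = False           # stands in for XPT_DATA

    def tasklet_io(self, ctxt):
        """Producer side: queue general I/O and flag pending data."""
        with self.lock:
            self.dto_q.append(ctxt)
            self.has_data = True

    def tasklet_read_complete(self, ctxt):
        """Producer side: queue a read completion and flag pending data."""
        with self.lock:
            self.read_complete_q.append(ctxt)
            self.has_data = True

    def recvfrom(self):
        """Consumer side: take one item, or clear the flag if none is left.

        Both queues are checked under the same lock the producers use, so
        the flag cannot be cleared between a producer's append and its
        flag update.
        """
        with self.lock:
            if self.read_complete_q:
                return self.read_complete_q.popleft()
            if self.dto_q:
                return self.dto_q.popleft()
            self.has_data = False
            return None
```

With separate locks, the consumer could find the general queue empty and clear the flag just after a producer had queued a read completion, which is the stall the commit message describes.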
  17. 16 Jul, 2008 1 commit
    • SUNRPC: Support registering IPv6 interfaces with local rpcbind daemon · c2e1b09f
      Chuck Lever committed
      Introduce a new API to register RPC services on IPv6 interfaces to allow
      the NFS server and lockd to advertise on IPv6 networks.
      
      Unlike rpcb_register(), the new rpcb_v4_register() function uses rpcbind
      protocol version 4 to contact the local rpcbind daemon.  The version 4
      SET/UNSET procedures allow services to register address families besides
      AF_INET, register at specific network interfaces, and register transport
      protocols besides UDP and TCP.  All of this functionality is exposed via
      the new rpcb_v4_register() kernel API.
      
      A user-space rpcbind daemon implementation that supports version 4 of the
      rpcbind protocol is required in order to make use of this new API.
      
      Note that rpcbind version 3 is sufficient to support the new rpcbind
      facilities listed above, but most extant implementations use version 4.
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
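Registrations made through the newer rpcbind protocol versions identify a service by a netid string and a "universal address" rather than by a bare port number, which is what lets them carry IPv6 endpoints. The sketch below shows how those two strings are commonly formed for TCP and UDP (the address text followed by the high and low bytes of the port in decimal). The helper names are hypothetical and this is not the kernel's rpcb_v4_register() implementation.

```python
import ipaddress


def universal_address(host, port):
    """Format host and port as an rpcbind universal address string."""
    if not 0 <= port <= 0xFFFF:
        raise ValueError("port out of range: %r" % port)
    addr = ipaddress.ip_address(host)
    # The port is appended as two decimal octets: high byte, low byte.
    return "%s.%d.%d" % (addr.compressed, port >> 8, port & 0xFF)


def netid(host, transport="tcp"):
    """Pick the netid for a transport: 'tcp'/'udp' or 'tcp6'/'udp6'."""
    addr = ipaddress.ip_address(host)
    return transport + ("6" if addr.version == 6 else "")


nfs_over_ipv6 = (netid("::1"), universal_address("::1", 2049))
rpcbind_over_ipv4 = (netid("127.0.0.1", "udp"),
                     universal_address("127.0.0.1", 111))
```

Here port 2049 splits into the octets 8 and 1, so an IPv6 loopback NFS endpoint is written as `::1.8.1` with netid `tcp6`.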
  18. 10 Jul, 2008 2 commits
  19. 03 Jul, 2008 2 commits