1. 16 Dec 2009, 1 commit
  2. 24 Nov 2009, 1 commit
  3. 15 Jul 2009, 1 commit
    • nfsd41: use globals for DRC limits · 4bd9b0f4
      Authored by Andy Adamson
      The version 4.1 DRC memory limit and tracking variables are server-wide and
      session-specific.  Replace the struct svc_serv fields with globals.
      Stop using the svc_serv sv_lock.
      
      Add a spinlock to serialize access to the DRC limit management variables which
      change on session creation and deletion (usage counter) or (future)
      administrative action to adjust the total DRC memory limit.
      Signed-off-by: Andy Adamson <andros@netapp.com>
      Signed-off-by: Benny Halevy <bhalevy@panasas.com>
      4bd9b0f4
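      The change above boils down to something like the following minimal sketch:
      server-wide DRC accounting as globals guarded by their own spinlock rather
      than svc_serv->sv_lock.  The variable and helper names here are assumptions
      based on the commit description, not the exact upstream code.

      /* Illustrative sketch of server-wide DRC accounting (names assumed). */
      static DEFINE_SPINLOCK(nfsd_drc_lock);
      static unsigned int nfsd_drc_max_mem;   /* total DRC memory allowed */
      static unsigned int nfsd_drc_mem_used;  /* reserved by live sessions */

      /* Called on session creation; returns how much DRC memory was granted. */
      static unsigned int nfsd_drc_reserve(unsigned int want)
      {
              unsigned int granted;

              spin_lock(&nfsd_drc_lock);
              granted = min(want, nfsd_drc_max_mem - nfsd_drc_mem_used);
              nfsd_drc_mem_used += granted;
              spin_unlock(&nfsd_drc_lock);
              return granted;
      }

      /* Called on session destruction to return the reservation. */
      static void nfsd_drc_release(unsigned int granted)
      {
              spin_lock(&nfsd_drc_lock);
              nfsd_drc_mem_used -= granted;
              spin_unlock(&nfsd_drc_lock);
      }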
  4. 18 Jun 2009, 3 commits
  5. 04 Apr 2009, 2 commits
  6. 29 Mar 2009, 2 commits
    • SUNRPC: Remove @family argument from svc_create() and svc_create_pooled() · 49a9072f
      Authored by Chuck Lever
      Since an RPC service listener's protocol family is now specified via
      svc_create_xprt(), it no longer needs to be passed to svc_create() or
      svc_create_pooled().  Remove that argument from the synopsis of those
      functions, and remove the sv_family field from the svc_serv struct.
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      49a9072f
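      Roughly, the interface change looks like this (a sketch with abbreviated
      parameter lists; the exact prototypes may differ):

      /* Before: the protocol family was fixed at service-creation time. */
      struct svc_serv *svc_create(struct svc_program *prog, unsigned int bufsize,
                                  sa_family_t family,
                                  void (*shutdown)(struct svc_serv *serv));

      /* After: no family argument; each listener supplies its own family
       * when it is created via svc_create_xprt(). */
      struct svc_serv *svc_create(struct svc_program *prog, unsigned int bufsize,
                                  void (*shutdown)(struct svc_serv *serv));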
    • SUNRPC: Pass a family argument to svc_register() · 4b62e58c
      Authored by Chuck Lever
      The sv_family field is going away.  Instead of using sv_family, have
      the svc_register() function take a protocol family argument.
      
      Since this argument represents a protocol family, and not an address
      family, this argument takes an int, as this is what is passed to
      sock_create_kern().  Also make sure svc_register's helpers are
      checking for PF_FOO instead of AF_FOO.  The values of [AP]F_FOO are
      equivalent; this is simply a symbolic change to reflect the semantics
      of the value stored in that variable.
      
      sock_create_kern() should return EPFNOSUPPORT if the passed-in
      protocol family isn't supported, but it uses EAFNOSUPPORT for this
      case.  We will stick with that tradition here, as svc_register()
      is called by the RPC server in the same path as sock_create_kern().
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      4b62e58c
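      A sketch of the dispatch this implies is below; the __svc_register_inet4/6
      helper names are placeholders, and the CONFIG_IPV6 guard is illustrative.

      static int svc_register(const struct svc_serv *serv, const int family,
                              const unsigned short proto, const unsigned short port)
      {
              switch (family) {
              case PF_INET:
                      return __svc_register_inet4(serv, proto, port);
      #if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)
              case PF_INET6:
                      return __svc_register_inet6(serv, proto, port);
      #endif
              default:
                      /* Follow sock_create_kern()'s convention for an
                       * unsupported family, as the commit explains. */
                      return -EAFNOSUPPORT;
              }
      }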
  7. 19 Mar 2009, 2 commits
    • knfsd: add file to export stats about nfsd pools · 03cf6c9f
      Authored by Greg Banks
      Add /proc/fs/nfsd/pool_stats to export to userspace various
      statistics about the operation of rpc server thread pools.
      
      This patch is based on a forward-ported version of
      knfsd-add-pool-thread-stats which has been shipping in the SGI
      "Enhanced NFS" product since 2006 and which was previously
      posted:
      
      http://article.gmane.org/gmane.linux.nfs/10375
      
      It has also been updated thus:
      
       * moved EXPORT_SYMBOL() to near the function it exports
       * made the new struct seq_operations const
       * used SEQ_START_TOKEN instead of ((void *)1)
       * merged fix from SGI PV 990526 "sunrpc: use dprintk instead of
         printk in svc_pool_stats_*()" by Harshula Jayasuriya.
       * merged fix from SGI PV 964001 "Crash reading pool_stats before
         nfsds are started".
      Signed-off-by: Greg Banks <gnb@sgi.com>
      Signed-off-by: Harshula Jayasuriya <harshula@sgi.com>
      Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
      03cf6c9f
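      A minimal sketch of a const struct seq_operations that uses SEQ_START_TOKEN
      for the header line, in the style the update list above describes.  The
      column names and the iteration over pools are simplified placeholders.

      #include <linux/seq_file.h>

      static void *svc_pool_stats_start(struct seq_file *m, loff_t *pos)
      {
              /* The real code maps *pos onto a struct svc_pool; this sketch
               * only emits the header row. */
              return *pos == 0 ? SEQ_START_TOKEN : NULL;
      }

      static void *svc_pool_stats_next(struct seq_file *m, void *p, loff_t *pos)
      {
              (*pos)++;
              return NULL;            /* sketch: stop after the header */
      }

      static void svc_pool_stats_stop(struct seq_file *m, void *p)
      {
      }

      static int svc_pool_stats_show(struct seq_file *m, void *p)
      {
              if (p == SEQ_START_TOKEN) {
                      seq_puts(m, "# pool arrived enqueued woken timedout\n");
                      return 0;
              }
              /* ... print one pool's counters here ... */
              return 0;
      }

      static const struct seq_operations svc_pool_stats_seq_ops = {
              .start  = svc_pool_stats_start,
              .next   = svc_pool_stats_next,
              .stop   = svc_pool_stats_stop,
              .show   = svc_pool_stats_show,
      };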
    • knfsd: avoid overloading the CPU scheduler with enormous load averages · 59a252ff
      Authored by Greg Banks
      Avoid overloading the CPU scheduler with enormous load averages
      when handling high call-rate NFS loads.  When the knfsd bottom half
      is made aware of an incoming call by the socket layer, it tries to
      choose an nfsd thread and wake it up.  As long as there are idle
      threads, one will be woken up.
      
      If there are a lot of nfsd threads (a sensible configuration when
      the server is disk-bound or is running an HSM), there will be many
      more nfsd threads than CPUs to run them.  Under a high call-rate,
      low service-time workload, the result is that almost every nfsd is
      runnable, but only a handful are actually able to run.  This situation
      causes two significant problems:
      
      1. The CPU scheduler takes over 10% of each CPU, which is robbing
         the nfsd threads of valuable CPU time.
      
      2. At a high enough load, the nfsd threads starve userspace threads
         of CPU time, to the point where daemons like portmap and rpc.mountd
         do not schedule for tens of seconds at a time.  Clients attempting
         to mount an NFS filesystem time out at the very first step (opening
         a TCP connection to portmap) because portmap cannot wake up from
         select() and call accept() in time.
      
      Disclaimer: these effects were observed on a SLES9 kernel, modern
      kernels' schedulers may behave more gracefully.
      
      The solution is simple: keep in each svc_pool a counter of the number
      of threads which have been woken but have not yet run, and do not wake
      any more if that count reaches an arbitrary small threshold.
      
      Testing was on a 4 CPU 4 NIC Altix using 4 IRIX clients, each with 16
      synthetic client threads simulating an rsync (i.e. recursive directory
      listing) workload reading from an i386 RH9 install image (161480
      regular files in 10841 directories) on the server.  That tree is small
      enough to fill in the server's RAM so no disk traffic was involved.
      This setup gives a sustained call rate in excess of 60000 calls/sec
      before being CPU-bound on the server.  The server was running 128 nfsds.
      
      Profiling showed schedule() taking 6.7% of every CPU, and __wake_up()
      taking 5.2%.  This patch drops those contributions to 3.0% and 2.2%.
      Load average was over 120 before the patch, and 20.9 after.
      
      This patch is a forward-ported version of knfsd-avoid-nfsd-overload
      which has been shipping in the SGI "Enhanced NFS" product since 2006.
      It has been posted before:
      
      http://article.gmane.org/gmane.linux.nfs/10374
      Signed-off-by: Greg Banks <gnb@sgi.com>
      Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
      59a252ff
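      In outline, the mechanism is a per-pool counter checked before waking a
      thread.  The sketch below uses the names from the patch description
      (sp_nwaking, SVC_MAX_WAKING) and a trimmed struct; treat the details as
      assumptions rather than the exact merged code.

      #define SVC_MAX_WAKING 5        /* small threshold; value assumed */

      struct svc_pool {               /* trimmed for illustration */
              spinlock_t              sp_lock;
              struct list_head        sp_threads;     /* idle nfsd threads */
              unsigned int            sp_nwaking;     /* woken, not yet running */
      };

      /* Bottom-half path: wake an idle thread only if few wakeups are pending. */
      static void svc_pool_wake_idle(struct svc_pool *pool)
      {
              struct svc_rqst *rqstp;

              spin_lock_bh(&pool->sp_lock);
              if (!list_empty(&pool->sp_threads) &&
                  pool->sp_nwaking < SVC_MAX_WAKING) {
                      rqstp = list_entry(pool->sp_threads.next,
                                         struct svc_rqst, rq_list);
                      list_del_init(&rqstp->rq_list);
                      pool->sp_nwaking++;     /* decremented once the thread runs */
                      wake_up(&rqstp->rq_wait);
              }
              spin_unlock_bh(&pool->sp_lock);
      }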
  8. 07 Jan 2009, 1 commit
    • sunrpc: add sv_maxconn field to svc_serv (try #3) · c9233eb7
      Authored by Jeff Layton
      svc_check_conn_limits() attempts to prevent denial of service attacks
      by having the service close old connections once it reaches a
      threshold. This threshold is based on the number of threads in the
      service:
      
      	(serv->sv_nrthreads + 3) * 20
      
      Once we reach this, we drop the oldest connections and a printk pops
      to warn the admin that they should increase the number of threads.
      
      Increasing the number of threads isn't an option, however, for services
      like lockd.  We don't want to eliminate this check entirely for such
      services, but we need some way to increase this limit.
      
      This patch adds a sv_maxconn field to the svc_serv struct. When it's
      set to 0, we use the current method to calculate the max number of
      connections. RPC services can then set this on an as-needed basis.
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Acked-by: Neil Brown <neilb@suse.de>
      Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
      c9233eb7
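      The resulting limit calculation is simple enough to sketch (the helper name
      is illustrative; the formula is the one quoted above):

      /* Limit used by svc_check_conn_limits(): an explicit sv_maxconn wins,
       * otherwise fall back to the thread-count heuristic. */
      static unsigned int svc_conn_limit(const struct svc_serv *serv)
      {
              return serv->sv_maxconn ? serv->sv_maxconn
                                      : (serv->sv_nrthreads + 3) * 20;
      }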
  9. 30 Sep 2008, 3 commits
  10. 24 Jun 2008, 2 commits
  11. 24 Apr 2008, 1 commit
  12. 11 Feb 2008, 1 commit
    • nfsd: clean up svc_reserve_auth() · fbb7878c
      Authored by J. Bruce Fields
      This is a void function attempting to return the return value from
      another void function, which seems harmless but extremely weird, and
      apparently makes some compilers complain.
      
      While we're there, clean up a little (e.g. the switch statement had a
      minor style problem and seemed overkill as long as there's only one
      case).
      
      Thanks to Trond for noticing this.
      Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
      Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
      fbb7878c
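      The construct in question, in miniature (a hypothetical example, not the
      actual svc_reserve_auth() body):

      static void bar(int x)
      {
              (void)x;
      }

      static void foo_before(int x)
      {
              return bar(x);  /* void "returning" a void call: some compilers warn */
      }

      static void foo_after(int x)
      {
              bar(x);         /* the cleanup: just call it */
      }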
  13. 02 Feb 2008, 6 commits
  14. 18 Jul 2007, 2 commits
    • knfsd: nfsd: set rq_client to ip-address-determined-domain · 3ab4d8b1
      Authored by J. Bruce Fields
      We want it to be possible for users to restrict exports both by IP address and
      by pseudoflavor.  The pseudoflavor information has previously been passed
      using special auth_domains stored in the rq_client field.  After the preceding
      patch that stored the pseudoflavor in rq_pflavor, that's now superfluous; so
      now we use rq_client for the ip information, as auth_null and auth_unix do.
      
      However, we keep around the special auth_domain in the rq_gssclient field for
      backwards compatibility purposes, so we can still do upcalls using the old
      "gss/pseudoflavor" auth_domain if upcalls using the unix domain fail to give
      us an appropriate export.  This allows us to continue supporting old mountd.
      
      In fact, for this first patch, we always use the "gss/pseudoflavor"
      auth_domain (and only it) if it is available; thus rq_client is ignored in the
      auth_gss case, and this patch on its own makes no change in behavior; that
      will be left to later patches.
      
      Note on idmap: I'm almost tempted to just replace the auth_domain in the idmap
      upcall by a dummy value--no version of idmapd has ever used it, and it's
      unlikely anyone really wants to perform idmapping differently depending on
      where the client is (they may want to perform *credential* mapping
      differently, but that's a different matter--the idmapper just handles id's
      used in getattr and setattr).  But I'm updating the idmapd code anyway, just
      out of general backwards-compatibility paranoia.
      Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
      Signed-off-by: Neil Brown <neilb@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      3ab4d8b1
    • knfsd: nfsd4: store pseudoflavor in request · c4170583
      Authored by Andy Adamson
      Add a new field to the svc_rqst structure to record the pseudoflavor that the
      request was made with.  For now we record the pseudoflavor but don't use it
      for anything.
      Signed-off-by: Andy Adamson <andros@citi.umich.edu>
      Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
      Signed-off-by: Neil Brown <neilb@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      c4170583
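      An illustrative excerpt of the struct svc_rqst fields the two commits above
      add or repurpose; the field names follow the commit text and the surrounding
      members are elided.

      struct svc_rqst {
              /* ... */
              struct auth_domain *    rq_client;      /* ip-address-determined
                                                       * auth domain */
              struct auth_domain *    rq_gssclient;   /* "gss/pseudoflavor" domain,
                                                       * kept for old mountd upcalls */
              u32                     rq_pflavor;     /* pseudoflavor of the request */
              /* ... */
      };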
  15. 10 Jul 2007, 1 commit
  16. 10 May 2007, 1 commit
    • RPC: add wrapper for svc_reserve to account for checksum · cd123012
      Authored by Jeff Layton
      When the kernel calls svc_reserve to downsize the expected size of an RPC
      reply, it fails to account for the possibility of a checksum at the end of
      the packet.  If a client mounts an NFSv2/3 filesystem with sec=krb5i/p and does
      I/O, then you'll generally see messages similar to this in the server's ring
      buffer:
      
      RPC request reserved 164 but used 208
      
      While I was never able to verify it, I suspect that this problem is also
      the root cause of some oopses I've seen under these conditions:
      
      https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=227726
      
      This is probably also a problem for other sec= types and for NFSv4.  The
      large reserved size for NFSv4 compound packets seems to generally paper
      over the problem, however.
      
      This patch adds a wrapper for svc_reserve that accounts for the possibility
      of a checksum.  It also fixes up the appropriate callers of svc_reserve to
      call the wrapper.  For now, it just uses a hardcoded value that I
      determined via testing.  That value may need to be revised upward as things
      change, or we may want to eventually add a new auth_op that attempts to
      calculate this somehow.
      
      Unfortunately, there doesn't seem to be a good way to reliably determine
      the expected checksum length prior to actually calculating it, particularly
      with schemes like spkm3.
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Acked-by: Neil Brown <neilb@suse.de>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Acked-by: J. Bruce Fields <bfields@citi.umich.edu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      cd123012
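      A sketch of the wrapper idea: reserve the caller's estimate plus a fixed
      slack for the auth checksum, and have nfsd's reply-size call sites use the
      wrapper instead of svc_reserve() directly.  RPC_MAX_AUTH_SIZE stands in for
      the hardcoded, testing-derived slack; treat the exact constant and field
      names as assumptions.

      static inline void svc_reserve_auth(struct svc_rqst *rqstp, int space)
      {
              int added_space = 0;

              /* Integrity/privacy flavors append a checksum to the reply. */
              if (rqstp->rq_authop->flavour == RPC_AUTH_GSS)
                      added_space = RPC_MAX_AUTH_SIZE;
              svc_reserve(rqstp, space + added_space);
      }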
  17. 07 Mar 2007, 1 commit
  18. 13 Feb 2007, 5 commits
  19. 27 Jan 2007, 1 commit
    • [PATCH] knfsd: fix an NFSD bug with full sized, non-page-aligned reads · 250f3915
      Authored by NeilBrown
      NFSd assumes that the largest number of pages that will be needed for a
      request+response is 2+N, where N pages is the size of the largest permitted
      read/write request.  The '2' is 1 page for the non-data part of the request
      and 1 for the non-data part of the reply.
      
      However, when a read request is not page-aligned, and we choose to use
      ->sendfile to send it directly from the page cache, we may need N+1 pages to
      hold the whole reply.  This can overflow an array and cause an Oops.
      
      This patch increases the size of the array for holding pages by one and makes
      sure that the extra entry is NULL when it is not in use.
      Signed-off-by: Neil Brown <neilb@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      250f3915
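      The page-budget arithmetic, sketched; the macro name matches the sunrpc
      header, but treat the exact expression as illustrative:

      /* Pages needed per request, following the reasoning above:
       *   N pages for the largest permitted read/write payload,
       *   + 1 for the non-data part of the request,
       *   + 1 for the non-data part of the reply,
       *   + 1 spare page for a full-sized, non-page-aligned read sent
       *     directly from the page cache (the overflow case fixed here). */
      #define RPCSVC_MAXPAGES ((RPCSVC_MAXPAYLOAD + PAGE_SIZE - 1) / PAGE_SIZE \
                               + 2 + 1)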
  20. 21 Oct 2006, 1 commit
  21. 06 Oct 2006, 1 commit
    • [PATCH] knfsd: tidy up the meaning of 'buffer size' in nfsd/sunrpc · c6b0a9f8
      Authored by NeilBrown
      There is some confusion about the meaning of 'bufsz' for a sunrpc server.
      In some cases it is the largest message that can be sent or received.  In
      other cases it is the largest 'payload' that can be included in an NFS
      message.
      
      In either case, it is not possible for both the request and the reply to be
      this large.  One of the request or reply may only be one page long, which
      fits nicely with NFS.
      
      So we remove 'bufsz' and replace it with two numbers: 'max_payload' and
      'max_mesg'.  Max_payload is the size that the server requests.  It is used
      by the server to check the max size allowed on a particular connection:
      depending on the protocol a lower limit might be used.
      
      max_mesg is the largest single message that can be sent or received.  It is
      calculated as max_payload rounded up to a multiple of PAGE_SIZE, with
      PAGE_SIZE added for overhead.  Only one of the request and reply may be
      this size.  The other must be at most one page.
      
      Cc: Greg Banks <gnb@sgi.com>
      Cc: "J. Bruce Fields" <bfields@fieldses.org>
      Signed-off-by: Neil Brown <neilb@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      c6b0a9f8
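      Sketched as it might appear in svc_create(); the sv_max_payload/sv_max_mesg
      field names and the 4096-byte default are assumptions:

      static void svc_set_message_sizes(struct svc_serv *serv, unsigned int bufsize)
      {
              /* Payload size the server asks for; a protocol may clamp it lower. */
              serv->sv_max_payload = bufsize ? bufsize : 4096;
              /* Largest single message: the payload rounded up to whole pages,
               * plus one page of non-payload overhead.  Only one of the
               * request/reply pair may be this big; the other fits in a page. */
              serv->sv_max_mesg = roundup(serv->sv_max_payload, PAGE_SIZE) + PAGE_SIZE;
      }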
  22. 04 Oct 2006, 1 commit