1. 10 May 2007 (2 commits)
  2. 26 April 2007 (1 commit)
    • [NET]: convert network timestamps to ktime_t · b7aa0bf7
      Eric Dumazet authored
      We currently use a special structure (struct skb_timeval) and plain
      'struct timeval' to store packet timestamps in sk_buffs and struct
      sock.
      
This has some drawbacks:
      - Fixed resolution of one microsecond.
      - Wasted space on 64-bit platforms, where sizeof(struct timeval) = 16.
      
I suggest using ktime_t, which is a nice abstraction of
      high-resolution time services, currently capable of nanosecond
      resolution.
      
As sizeof(ktime_t) is 8 bytes, using ktime_t in 'struct sock'
      permits an 8-byte shrink of this structure on 64-bit architectures.
      Some other structures also benefit from this size reduction (struct
      ipq in ipv4/ip_fragment.c, struct frag_queue in ipv6/reassembly.c,
      ...)
      
Once this ktime infrastructure is adopted, we can more easily
      provide nanosecond resolution on top of it (ioctl SIOCGSTAMPNS
      and/or SO_TIMESTAMPNS/SCM_TIMESTAMPNS); a small sketch follows this
      entry.
      
Note: this patch includes a bug fix in compat_sock_get_timestamp(),
      where an "err = 0;" was missing (so the syscall returned -ENOENT
      instead of 0).
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
      CC: Stephen Hemminger <shemminger@linux-foundation.org>
      CC: John find <linux.kernel@free.fr>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b7aa0bf7
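      A minimal sketch of the pattern described above, assuming the ktime
      API of that era (the holder struct and function names here are
      illustrative, not the actual patch):

        #include <linux/kernel.h>
        #include <linux/ktime.h>
        #include <linux/time.h>

        /* Hypothetical holder for a packet timestamp. */
        struct stamp_holder {
                ktime_t stamp;  /* 8 bytes, vs 16 for struct timeval on 64-bit */
        };

        static void stamp_now(struct stamp_holder *h)
        {
                h->stamp = ktime_get_real();    /* wall-clock, ns resolution */
        }

        static void stamp_report(const struct stamp_holder *h)
        {
                /* Legacy microsecond view (SIOCGSTAMP-style)... */
                struct timeval tv = ktime_to_timeval(h->stamp);
                /* ...and the full-resolution view (SIOCGSTAMPNS-style). */
                struct timespec ts = ktime_to_timespec(h->stamp);

                printk(KERN_DEBUG "us: %ld.%06ld ns: %ld.%09ld\n",
                       (long)tv.tv_sec, (long)tv.tv_usec,
                       (long)ts.tv_sec, ts.tv_nsec);
        }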
3. 13 April 2007 (1 commit)
  4. 05 April 2007 (1 commit)
  5. 07 March 2007 (3 commits)
  6. 13 February 2007 (14 commits)
  7. 11 February 2007 (1 commit)
  8. 10 February 2007 (1 commit)
    • [PATCH] knfsd: fix a race in closing NFSd connections · aaf68cfb
      NeilBrown authored
If you lose this race, it can iput a socket inode twice, and you get
      a BUG in fs/inode.c.
      
      When I added the option for user-space to close a socket, I added some
      cruft to svc_delete_socket so that I could call that function when closing
      a socket per user-space request.
      
This was the wrong thing to do.  I should have just set SK_CLOSE and
      let the normal mechanisms do the work.

      Not only wrong, but buggy.  The locking was all wrong, and it opened
      up a race whereby a socket could be closed twice.
      
So this patch (sketched after this entry):
        Introduces svc_close_socket, which sets SK_CLOSE and then either
        leaves the close up to a thread or calls svc_delete_socket if it
        can get SK_BUSY.

        Adds a bias to sk_busy which is removed when SK_DEAD is set.
        This avoids races around shutting down the socket.

        Changes several 'spin_lock' calls to 'spin_lock_bh' where the
        _bh was missing.
      
Bugzilla-url: http://bugzilla.kernel.org/show_bug.cgi?id=7916
      Signed-off-by: Neil Brown <neilb@suse.de>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      aaf68cfb
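      The shape of svc_close_socket described above, sketched loosely (the
      flag and function names follow the commit text; the body is an
      approximation, not the actual net/sunrpc/svcsock.c code):

        static void svc_close_socket(struct svc_sock *svsk)
        {
                set_bit(SK_CLOSE, &svsk->sk_flags);  /* request the close */
                if (test_and_set_bit(SK_BUSY, &svsk->sk_flags))
                        /* Lost the race for SK_BUSY: the thread holding it
                         * will notice SK_CLOSE and do the delete itself. */
                        return;
                svc_delete_socket(svsk);        /* won SK_BUSY: delete once */
        }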
9. 31 January 2007 (1 commit)
  10. 27 January 2007 (1 commit)
    • [PATCH] knfsd: fix an NFSD bug with full sized, non-page-aligned reads · 250f3915
      NeilBrown authored
NFSd assumes that the largest number of pages that will be needed
      for a request-plus-response is 2+N, where N pages is the size of the
      largest permitted read/write request.  The '2' is one page for the
      non-data part of the request and one for the non-data part of the
      reply.
      
However, when a read request is not page-aligned and we choose to
      use ->sendfile to send it directly from the page cache, we may need
      N+1 pages to hold the whole reply.  This can overflow an array and
      cause an Oops (the arithmetic is illustrated after this entry).

      This patch increases the size of the page array by one and makes
      sure the extra entry is NULL when it is not in use.
Signed-off-by: Neil Brown <neilb@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      250f3915
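      The arithmetic behind the overflow, as a standalone illustration
      (page size and N = 8 pages assumed for the example):

        #include <stdio.h>

        #define PAGE_SIZE 4096UL

        /* Pages touched by 'len' bytes starting 'off' bytes into a page. */
        static unsigned long pages_spanned(unsigned long off, unsigned long len)
        {
                return (off + len + PAGE_SIZE - 1) / PAGE_SIZE;
        }

        int main(void)
        {
                unsigned long n = 8 * PAGE_SIZE;    /* a full-sized read */

                printf("aligned:   %lu pages\n", pages_spanned(0, n));   /* 8 */
                printf("unaligned: %lu pages\n", pages_spanned(512, n)); /* 9 = N+1 */
                return 0;
        }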
11. 08 December 2006 (2 commits)
  12. 31 October 2006 (2 commits)
    • [PATCH] fix "sunrpc: fix refcounting problems in rpc servers" · 202dd450
      Andrew Morton authored
- printk should remain dprintk.

      - fix coding style.
      
      Cc: Neil Brown <neilb@suse.de>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      202dd450
    • [PATCH] sunrpc: fix refcounting problems in rpc servers · d6740df9
      Neil Brown authored
A recent patch fixed a problem which would occur when the refcount
      on an auth_domain reached zero.  This problem has not been reported
      in practice, despite existing in two major kernel releases, because
      the refcount can never reach zero.

      This patch fixes the problems that stop the refcount from reaching
      zero.
      
1/ We were adding to the refcount when inserting into the hash table,
         but only removing from the hash table when the refcount reached
         zero.  Obviously it never would.  So don't count the implied
         reference of being in the hash table (sketched after this entry).

      2/ There are two paths on which a socket can be destroyed.  One
         called svcauth_unix_info_release(); the other didn't.  So when
         the other path was taken, we could lose a reference to an ip_map,
         which in turn holds a reference to an auth_domain.

         So unify the exit paths into svc_sock_put.  This highlights the
         fact that svc_delete_socket has slightly odd semantics: it does
         not drop a reference but probably should.  Fixing this needs a
         bit more thought and testing.
Signed-off-by: Neil Brown <neilb@suse.de>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      d6740df9
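      Point 1/ is a classic refcounting trap.  A sketch of the two shapes,
      using hypothetical kref-based names (the actual auth_domain code
      differs in detail):

        /* Broken shape: insertion takes a reference that is only dropped
         * when the count reaches zero, so it never does:
         *
         *      hash_insert(dom);
         *      kref_get(&dom->ref);    // the table "owns" a count forever
         *
         * Fixed shape: membership in the hash table is an uncounted,
         * implied reference; the release function unhashes before freeing.
         */
        static void auth_domain_release(struct kref *ref)
        {
                struct auth_domain *dom =
                        container_of(ref, struct auth_domain, ref);

                hash_remove(dom);  /* hypothetical: drop the table entry */
                kfree(dom);
        }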
13. 21 October 2006 (1 commit)
    • [PATCH] knfsd: fix race that can disable NFS server · 1a047060
      NeilBrown authored
This patch is suitable for just about any 2.6 kernel.  It should go
      in 2.6.19 and 2.6.18.2, and possibly even the .17 and .16 stable
      series.

      This is a long-standing bug that seems to have only recently become
      apparent, presumably due to increasing use of NFS over TCP; many
      distros seem to be making it the default.
      
      The SK_CONN bit gets set when a listening socket may be ready
      for an accept, just as SK_DATA is set when data may be available.
      
      It is entirely possible for svc_tcp_accept to be called with neither
      of these set.  It doesn't happen often but there is a small race in
      svc_sock_enqueue as SK_CONN and SK_DATA are tested outside the
      spin_lock.  They could be cleared immediately after the test and
      before the lock is gained.
      
This normally shouldn't be a problem.  The sockets are non-blocking,
      so trying to read() or accept() when there is nothing to do is
      harmless.
      
However: svc_tcp_recvfrom makes the decision "should I accept() or
      should I read()?" based on whether SK_CONN is set or not.  This
      usually works, but is not safe.  The decision should be based on
      whether the socket is in the TCP_LISTEN state or is an established
      connection (see the fragment after this entry).
Signed-off-by: Neil Brown <neilb@suse.de>
      Cc: Adrian Bunk <bunk@stusta.de>
      Cc: <stable@kernel.org>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      1a047060
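      The safe decision described above, as a fragment (field names as in
      sunrpc of that era; the surrounding code is omitted):

        /* In svc_tcp_recvfrom: classify by socket state, not by the racy
         * SK_CONN hint, which may be cleared between test and action. */
        if (svsk->sk_sk->sk_state == TCP_LISTEN) {
                svc_tcp_accept(svsk);  /* a listening socket can only accept */
                return 0;
        }
        /* Otherwise the socket is connected: fall through and read data. */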
14. 06 October 2006 (1 commit)
    • [PATCH] knfsd: tidy up the meaning of 'buffer size' in nfsd/sunrpc · c6b0a9f8
      NeilBrown authored
There is some confusion about the meaning of 'bufsz' for a sunrpc
      server.  In some cases it is the largest message that can be sent or
      received.  In other cases it is the largest 'payload' that can be
      included in an NFS message.
      
In either case, it is not possible for both the request and the
      reply to be this large.  One of the request or reply may be at most
      one page long, which fits nicely with NFS.
      
So we remove 'bufsz' and replace it with two numbers: 'max_payload'
      and 'max_mesg'.  max_payload is the size that the server requests.
      It is used by the server to check the maximum size allowed on a
      particular connection; depending on the protocol, a lower limit
      might be used.

      max_mesg is the largest single message that can be sent or received.
      It is calculated as max_payload, rounded up to a multiple of
      PAGE_SIZE, with PAGE_SIZE added for overhead (see the arithmetic
      after this entry).  Only one of the request and reply may be this
      size; the other must be at most one page.
      
      Cc: Greg Banks <gnb@sgi.com>
      Cc: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Neil Brown <neilb@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      c6b0a9f8
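      The relationship between the two numbers, restated as standalone
      arithmetic (a 32 KB payload is chosen purely as an example):

        #include <stdio.h>

        #define PAGE_SIZE 4096UL
        #define ROUND_UP(x) (((x) + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1))

        int main(void)
        {
                unsigned long max_payload = 32 * 1024;  /* what the server asks for */
                unsigned long max_mesg = ROUND_UP(max_payload) + PAGE_SIZE;

                /* 32768-byte payload -> 36864-byte message: one page of overhead. */
                printf("max_payload=%lu max_mesg=%lu\n", max_payload, max_mesg);
                return 0;
        }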
15. 04 October 2006 (5 commits)
    • [PATCH] knfsd: cache ipmap per TCP socket · 7b2b1fee
      Greg Banks authored
Speed up high call-rate workloads by caching the struct ip_map for
      the peer on the connected struct svc_sock, instead of looking it up
      in the ip_map cache hashtable on every call (sketched after this
      entry).  This helps workloads using AUTH_SYS authentication over
      TCP.
      
      Testing was on a 4 CPU 4 NIC Altix using 4 IRIX clients, each with 16
      synthetic client threads simulating an rsync (i.e.  recursive directory
      listing) workload reading from an i386 RH9 install image (161480 regular files
in 10841 directories) on the server.  That tree is small enough to
      fit in the server's RAM, so no disk traffic was involved.  This
      setup gives a sustained call rate in excess of 60000 calls/sec
      before becoming CPU-bound on the server.
      
Profiling showed that strcmp(), called from ip_map_match(), was
      taking 4.8% of each CPU, and ip_map_lookup() was taking 2.9%.  This
      patch drops both contributions into the profile noise.
      
Note that the above result overstates the value of this patch for
      most workloads.  The synthetic clients all use separate IP
      addresses, so there are 64 entries in the ip_map cache hash.
      Because the kernel measured contained the bug fixed in commit
      1f1e030b and was running on a 64-bit little-endian machine, probably
      all of those 64 entries were on a single chain, thus increasing the
      cost of ip_map_lookup().
      
      With a modern kernel you would need more clients to see the same amount of
      performance improvement.  This patch has helped to scale knfsd to handle a
      deployment with 2000 NFS clients.
Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
      Signed-off-by: Neil Brown <neilb@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      7b2b1fee
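      The fast path this buys, sketched as a fragment (sk_info_authunix is
      the cached-pointer field the patch adds; the lookup helper here is a
      simplified, hypothetical stand-in):

        /* Per-call authentication: try the pointer cached on the connected
         * socket before falling back to the ip_map hashtable. */
        struct ip_map *ipm = svsk->sk_info_authunix;

        if (ipm == NULL) {
                ipm = ip_map_lookup_for_peer(svsk);  /* hypothetical slow path */
                svsk->sk_info_authunix = ipm;        /* cache it (holds a ref) */
        }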
    • [PATCH] knfsd: Avoid excess stack usage in svc_tcp_recvfrom · 3cc03b16
      NeilBrown authored
... by allocating the array of 'kvec' in 'struct svc_rqst'.
      
As we plan to increase RPCSVC_MAXPAGES from 8 up to 256, we can no
      longer allocate an array of this size on the stack (see the sketch
      after this entry).  So we allocate it in 'struct svc_rqst'.

      However, svc_rqst contains (indirectly) an array of the same type
      and size (actually several, but they are in a union).  So rather
      than waste space, we move those arrays out of the separately
      allocated union and into svc_rqst, to share with the kvec moved out
      of svc_tcp_recvfrom (the various arrays are used at different times,
      so there is no conflict).
Signed-off-by: Neil Brown <neilb@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      3cc03b16
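      The space problem in numbers, with a sketch of the move (sizes
      assume a 64-bit build; the struct is illustrative):

        /* A kvec is a pointer plus a length: 16 bytes on 64-bit.  With
         * RPCSVC_MAXPAGES = 256 that is a 4 KB array, far too large for
         * the kernel stack, but fine inside the per-thread svc_rqst,
         * which is allocated once and lives as long as the thread. */
        #define RPCSVC_MAXPAGES 256

        struct svc_rqst_sketch {
                struct kvec rq_vec[RPCSVC_MAXPAGES];  /* was a local array
                                                       * in svc_tcp_recvfrom */
        };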
    • [PATCH] knfsd: Replace two page lists in struct svc_rqst with one · 44524359
      NeilBrown authored
      We are planning to increase RPCSVC_MAXPAGES from about 8 to about 256.  This
      means we need to be a bit careful about arrays of size RPCSVC_MAXPAGES.
      
struct svc_rqst contains two such arrays.  However, there are never
      more than RPCSVC_MAXPAGES pages in the two arrays together, so only
      one array is needed.
      
The two arrays are for the pages holding the request and the pages
      holding the reply.  Instead of two arrays, we can simply keep an
      index recording where the first reply page starts (layout sketched
      after this entry).
      
This patch also removes a number of small inline functions that
      probably serve to obscure what is going on rather than clarify it,
      and open-codes the needed functionality.

      Also remove the 'rq_restailpage' variable, as it is *always* 0;
      i.e., if the response 'xdr' structure has a non-empty tail, it is
      always in the same page as the head.
      
(Leftover review notes from the original changelog:
       check counters are initialised and incremented properly;
       check for consistent usage of ++ etc.;
       maybe extract some inlines for a common approach;
       general review.)
Signed-off-by: Neil Brown <neilb@suse.de>
      Cc: Magnus Maatta <novell@kiruna.se>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      44524359
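      The replacement layout, sketched (rq_pages and rq_respages are the
      names the patch uses; the struct is trimmed to the relevant fields):

        struct svc_rqst_pages_sketch {
                struct page *rq_pages[RPCSVC_MAXPAGES]; /* request, then reply */
                struct page **rq_respages;              /* first reply page:
                                                         * rq_pages + i */
        };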
    • [PATCH] knfsd: Fixed handling of lockd fail when adding nfsd socket · 5680c446
      NeilBrown authored
Arrgh...  We cannot 'lockd_up' before 'svc_addsock', as we don't
      know the protocol yet.  So switch it around again and save the name
      of the created socket so that it can be closed if lockd_up fails
      (ordering sketched after this entry).
Signed-off-by: Neil Brown <neilb@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      5680c446
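      The ordering the commit settles on, as a simplified sketch (the
      signatures are trimmed and the close-by-name helper is hypothetical):

        /* Create the socket first (only then is the protocol known),
         * and undo the creation if taking the lockd reference fails. */
        err = svc_addsock(serv, fd, name_buf);        /* records the name */
        if (err >= 0 && lockd_up() != 0)
                svc_close_socket_by_name(serv, name_buf); /* hypothetical undo */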
    • [PATCH] knfsd: call lockd_down when closing a socket via a write to nfsd/portlist · 37a03472
      NeilBrown authored
      The refcount that nfsd holds on lockd is based on the number of open sockets.
      So when we close a socket, we should decrement the ref (with lockd_down).
      
Currently, when a socket is closed via a write to the portlist file,
      that doesn't happen.
      
So: make sure we get an error return if the requested socket is not
      found, and call lockd_down if it was found (sketched after this
      entry).
      
      Cc: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Neil Brown <neilb@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      37a03472
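      The pairing rule, sketched (the remove-by-name helper is
      hypothetical):

        /* Every open nfsd socket holds one lockd reference, so a
         * successful close via nfsd/portlist must release exactly one. */
        static int close_one_socket(struct svc_serv *serv, const char *name)
        {
                int err = svc_remove_sock_by_name(serv, name); /* hypothetical */

                if (err < 0)
                        return err;  /* not found: report error, no unref */
                lockd_down();        /* drop that socket's lockd reference */
                return 0;
        }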
16. 02 October 2006 (3 commits)