1. 28 Jan 2009: 1 commit
    • net: wrong test in inet_ehash_locks_alloc() · 94cd3e6c
      By Eric Dumazet
      In commit 9db66bdc (net: convert
      TCP/DCCP ehash rwlocks to spinlocks), I forgot to change one
      occurrence of rwlock_t to spinlock_t.
      
      I believe sizeof(raw_spinlock_t) might be > 0 on !CONFIG_SMP if
      CONFIG_DEBUG_SPINLOCK is set, while sizeof(raw_rwlock_t) should be
      0 in that case.
      
      Fortunately, CONFIG_DEBUG_SPINLOCK adds fields to both spinlock_t
      and rwlock_t, but since this might change in the future (being able
      to debug spinlocks but not rwlocks, for example), it is better to
      be safe.
      Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
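      The failure mode here is sizing an allocation with sizeof applied
      to the old lock type. A minimal user-space sketch of the pattern,
      with made-up stand-in types rather than the kernel's (the real fix
      lives in inet_ehash_locks_alloc()):

          #include <stdio.h>
          #include <stdlib.h>

          /* Stand-ins for the kernel lock types: with CONFIG_DEBUG_SPINLOCK
           * the debug fields can make the two sizes diverge. */
          typedef struct { int raw; } rwlock_like_t;
          typedef struct { int raw; int magic; int owner; } spin_like_t;

          int main(void)
          {
              size_t nr = 16;

              /* Buggy: the size is computed from the old lock type... */
              spin_like_t *locks = malloc(nr * sizeof(rwlock_like_t));

              /* ...but the buffer is indexed as the new type, so any use
               * past the first few elements would run off the allocation. */
              printf("allocated %zu bytes, need %zu\n",
                     nr * sizeof(rwlock_like_t), nr * sizeof(spin_like_t));
              free(locks);
              return 0;
          }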
  2. 27 Jan 2009: 3 commits
  3. 23 Jan 2009: 7 commits
  4. 22 Jan 2009: 3 commits
    • inet: Allowing more than 64k connections and heavily optimize bind(0) time. · a9d8f911
      By Evgeniy Polyakov
      With a simple extension to the binding mechanism, which allows
      binding more than 64k sockets (or fewer, depending on sysctl
      parameters), we have to traverse the whole bind hash table to find
      an empty bucket. While this is not a problem for, say, 32k
      connections, bind() completion time grows exponentially (since
      after each successful binding we have to traverse one more bucket
      to find an empty one), even if we start each time from a random
      offset inside the hash table.
      
      So, when the hash table is full and we want to add another socket,
      we have to traverse the whole table no matter what, so effectively
      this is the worst-case performance, and it is constant.
      
      The attached picture shows bind() time as a function of the number
      of already-bound sockets.
      
      The green area corresponds to the usual bind-to-port-zero process,
      which turns on the kernel port selection described above. The red
      area is the bind process when the number of reuse-bound sockets is
      not limited to 64k (or by the sysctl parameters). The same
      exponential growth occurs (hidden by the green area) before the
      number of ports reaches the sysctl limit.
      
      At that point the bind hash table has exactly one reuse-enabled
      socket per bucket, though those sockets may have different
      addresses. The kernel actually selects the first port to try at
      random, so at the beginning bind takes roughly constant time, but
      over time the number of ports to check after the random start
      grows. That produces the exponential growth, but because of the
      random selection, not every port selection necessarily takes
      longer than the previous one. So we have to consider the area
      below the curve in the graph (zooming in would reveal many
      different times plotted there), and one area can hide another.
      
      The blue area corresponds to the port-selection optimization,
      sketched after this entry.
      
      The design approach is rather simple: the hash table now maintains
      an (imprecise and racily updated) count of currently bound
      sockets, and when the number of such sockets becomes greater than
      a predefined value (I use the maximum port range defined by the
      sysctls), we stop traversing the whole bind hash table and just
      stop at the first matching bucket after a random start. That limit
      roughly corresponds to the case when the bind hash table is full
      and we have turned on the mechanism for binding more reuse-enabled
      sockets, so it does not change the behaviour of other sockets.
      Signed-off-by: Evgeniy Polyakov <zbr@ioremap.net>
      Tested-by: Denys Fedoryschenko <denys@visp.net.lb>
      Signed-off-by: David S. Miller <davem@davemloft.net>
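      A minimal user-space sketch of that shortcut; the names here
      (bind_table, pick_bucket, PORT_RANGE) are invented for
      illustration and do not match the kernel code:

          #include <stdlib.h>

          #define NBUCKETS   1024
          #define PORT_RANGE (61000 - 32768)  /* e.g. ip_local_port_range */

          struct bind_table {
              int bucket_len[NBUCKETS];  /* sockets hashed to each bucket */
              int bound_count;           /* imprecise, racily updated count */
          };

          static int pick_bucket(struct bind_table *t)
          {
              int start = rand() % NBUCKETS;

              /* Past the limit the table is effectively full: searching
               * for an empty bucket is pointless, so stop at the first
               * bucket after the random start. */
              if (t->bound_count > PORT_RANGE)
                  return start;

              /* Otherwise look for an empty bucket, wrapping around once. */
              for (int i = 0; i < NBUCKETS; i++) {
                  int b = (start + i) % NBUCKETS;
                  if (t->bucket_len[b] == 0)
                      return b;
              }
              return start;  /* no empty bucket; fall back to the random one */
          }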
  5. 17 Jan 2009: 1 commit
  6. 11 Jan 2009: 1 commit
  7. 09 Jan 2009: 2 commits
  8. 08 Jan 2009: 1 commit
  9. 07 Jan 2009: 2 commits
  10. 05 Jan 2009: 1 commit
    • ipv6: Fix sporadic sendmsg -EINVAL when sending to multicast groups. · 14deae41
      By David S. Miller
      Thanks to an excellent diagnosis by Eduard Guzovsky.
      
      The core problem is that on a network with lots of active
      multicast traffic, the neighbour cache can fill up. If we then try
      to allocate a new route, and thus a new neighbour cache entry, the
      bog-standard GC attempt the neighbour layer makes is ineffective,
      because route entries hold references to the existing neighbour
      entries and GC can only liberate entries with no references.
      
      IPv4 already has a way to handle this, by doing a route cache
      GC in such situations (when neigh attach returns -ENOBUFS).

      So simply mimic this on the IPv6 side.
      Tested-by: Eduard Guzovsky <eguzovsky@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
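      A minimal user-space model of the recovery pattern, not the kernel
      diff; all names (neigh_refs, bind_neighbour, route_cache_gc) are
      invented for illustration. A plain neighbour GC cannot help while
      routes pin every entry, so a route-cache GC runs first and the
      allocation is retried once:

          #include <errno.h>
          #include <stdio.h>

          #define CACHE_SIZE 4

          /* Each neighbour entry's refcount; all pinned by routes. */
          static int neigh_refs[CACHE_SIZE] = { 1, 1, 1, 1 };

          static int bind_neighbour(void)
          {
              for (int i = 0; i < CACHE_SIZE; i++)
                  if (neigh_refs[i] == 0) {   /* GC-reclaimable slot */
                      neigh_refs[i] = 1;
                      return 0;
                  }
              return -ENOBUFS;                /* every entry still referenced */
          }

          static void route_cache_gc(void)
          {
              neigh_refs[0] = 0;  /* dropping a route unpins its neighbour */
          }

          int main(void)
          {
              int err = bind_neighbour();
              if (err == -ENOBUFS) {          /* the IPv4 fallback, mimicked */
                  route_cache_gc();
                  err = bind_neighbour();
              }
              printf("bind_neighbour: %s\n", err ? "failed" : "ok");
              return 0;
          }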
  11. 01 Jan 2009: 1 commit
  12. 26 Dec 2008: 2 commits
  13. 25 Dec 2008: 1 commit
  14. 22 Dec 2008: 1 commit
  15. 20 Dec 2008: 4 commits
  16. 18 Dec 2008: 2 commits
  17. 16 Dec 2008: 2 commits
    • tcp: Add GRO support · bf296b12
      By Herbert Xu
      This patch adds the TCP-specific portion of GRO.  The criterion for
      merging is extremely strict (the TCP header must match exactly apart
      from the checksum) so as to allow refragmentation.  Otherwise this
      is pretty much identical to LRO, except that we support the merging
      of ECN packets.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
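      A minimal sketch of that merge test, using the user-space struct
      tcphdr rather than the kernel's; the real tcp_gro_receive also has
      special handling for flag bits (e.g. CWR for ECN merging), which
      is omitted here:

          #include <stdint.h>
          #include <string.h>
          #include <arpa/inet.h>
          #include <netinet/tcp.h>

          static int tcp_headers_mergeable(const struct tcphdr *a,
                                           const struct tcphdr *b,
                                           uint32_t a_payload_len)
          {
              struct tcphdr x = *a, y = *b;

              /* The checksum differs per packet by construction. */
              x.check = y.check = 0;

              /* The sequence number must advance by exactly the earlier
               * segment's payload length... */
              if (ntohl(b->seq) != ntohl(a->seq) + a_payload_len)
                  return 0;
              /* ...and is then excluded from the byte-wise comparison. */
              x.seq = y.seq = 0;

              /* Everything else must match exactly, so the aggregate can
               * later be refragmented into the original segments. */
              return memcmp(&x, &y, sizeof(struct tcphdr)) == 0;
          }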
    • ipv4: Add GRO infrastructure · 73cc19f1
      By Herbert Xu
      This patch adds GRO support for IPv4.
      
      The criteria for merging are more stringent than LRO's; in
      particular, we require all fields in the IP header to be identical
      except for the length, ID and checksum. In addition, the IDs must
      form an arithmetic sequence with a common difference of one.
      
      The ID requirement might seem overly strict; however, most
      hardware TSO solutions already obey this rule. Linux itself also
      obeys it whether or not GSO is in use.
      
      In the future we could relax this rule by storing the IDs (or
      rather by making sure that we don't drop them when pulling the
      aggregate skb's tail).
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
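      A minimal sketch of those IPv4 merge criteria, using the
      user-space struct iphdr rather than the kernel's inet_gro_receive:

          #include <stdint.h>
          #include <string.h>
          #include <arpa/inet.h>
          #include <netinet/ip.h>

          static int ipv4_headers_mergeable(const struct iphdr *a,
                                            const struct iphdr *b)
          {
              struct iphdr x = *a, y = *b;

              /* The IDs must form an arithmetic sequence with a common
               * difference of one (modulo 16-bit wraparound). */
              if ((uint16_t)(ntohs(b->id) - ntohs(a->id)) != 1)
                  return 0;

              /* Mask out the fields that are allowed to differ. */
              x.tot_len = y.tot_len = 0;
              x.id      = y.id      = 0;
              x.check   = y.check   = 0;

              /* Every remaining header field must be identical. */
              return memcmp(&x, &y, sizeof(struct iphdr)) == 0;
          }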
  18. 13 Dec 2008: 4 commits
  19. 11 Dec 2008: 1 commit