1. 24 1月, 2014 1 次提交
  2. 23 1月, 2014 1 次提交
    • C
      tcp: metrics: Fix rcu-race when deleting multiple entries · 00ca9c5b
      Christoph Paasch 提交于
      In bbf852b9 I introduced the tmlist, which allows to delete
      multiple entries from the cache that match a specified destination if no
      source-IP is specified.
      
      However, as the cache is an RCU-list, we should not create this tmlist, as
      it will change the tcpm_next pointer of the element that will be deleted
      and so a thread iterating over the cache's entries while holding the
      RCU-lock might get "redirected" to this tmlist.
      
      This patch fixes this, by reverting back to the old behavior prior to
      bbf852b9, which means that we simply change the tcpm_next
      pointer of the previous element (pp) to jump over the one we are
      deleting.
      The difference is that we call kfree_rcu() directly on the cache entry,
      which allows us to delete multiple entries from the list.
      
      Fixes: bbf852b9 (tcp: metrics: Delete all entries matching a certain destination)
      Signed-off-by: NChristoph Paasch <christoph.paasch@uclouvain.be>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      00ca9c5b
  3. 18 1月, 2014 1 次提交
  4. 11 1月, 2014 5 次提交
  5. 20 11月, 2013 1 次提交
  6. 15 11月, 2013 2 次提交
  7. 30 10月, 2013 1 次提交
    • Y
      tcp: temporarily disable Fast Open on SYN timeout · c968601d
      Yuchung Cheng 提交于
      Fast Open currently has a fall back feature to address SYN-data being
      dropped but it requires the middle-box to pass on regular SYN retry
      after SYN-data. This is implemented in commit aab48743 ("net-tcp:
      Fast Open client - detecting SYN-data drops")
      
      However some NAT boxes will drop all subsequent packets after first
      SYN-data and blackholes the entire connections.  An example is in
      commit 356d7d88 "netfilter: nf_conntrack: fix tcp_in_window for Fast
      Open".
      
      The sender should note such incidents and fall back to use the regular
      TCP handshake on subsequent attempts temporarily as well: after the
      second SYN timeouts the original Fast Open SYN is most likely lost.
      When such an event recurs Fast Open is disabled based on the number of
      recurrences exponentially.
      Signed-off-by: NYuchung Cheng <ycheng@google.com>
      Signed-off-by: NNeal Cardwell <ncardwell@google.com>
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c968601d
  8. 10 10月, 2013 2 次提交
  9. 09 10月, 2013 1 次提交
    • E
      ipv6: make lookups simpler and faster · efe4208f
      Eric Dumazet 提交于
      TCP listener refactoring, part 4 :
      
      To speed up inet lookups, we moved IPv4 addresses from inet to struct
      sock_common
      
      Now is time to do the same for IPv6, because it permits us to have fast
      lookups for all kind of sockets, including upcoming SYN_RECV.
      
      Getting IPv6 addresses in TCP lookups currently requires two extra cache
      lines, plus a dereference (and memory stall).
      
      inet6_sk(sk) does the dereference of inet_sk(__sk)->pinet6
      
      This patch is way bigger than its IPv4 counter part, because for IPv4,
      we could add aliases (inet_daddr, inet_rcv_saddr), while on IPv6,
      it's not doable easily.
      
      inet6_sk(sk)->daddr becomes sk->sk_v6_daddr
      inet6_sk(sk)->rcv_saddr becomes sk->sk_v6_rcv_saddr
      
      And timewait socket also have tw->tw_v6_daddr & tw->tw_v6_rcv_saddr
      at the same offset.
      
      We get rid of INET6_TW_MATCH() as INET6_MATCH() is now the generic
      macro.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      efe4208f
  10. 18 9月, 2013 1 次提交
  11. 05 9月, 2013 1 次提交
  12. 31 8月, 2013 1 次提交
    • Y
      tcp: do not use cached RTT for RTT estimation · 1b7fdd2a
      Yuchung Cheng 提交于
      RTT cached in the TCP metrics are valuable for the initial timeout
      because SYN RTT usually does not account for serialization delays
      on low BW path.
      
      However using it to seed the RTT estimator maybe disruptive because
      other components (e.g., pacing) require the smooth RTT to be obtained
      from actual connection.
      
      The solution is to use the higher cached RTT to set the first RTO
      conservatively like tcp_rtt_estimator(), but avoid seeding the other
      RTT estimator variables such as srtt.  It is also a good idea to
      keep RTO conservative to obtain the first RTT sample, and the
      performance is insured by TCP loss probe if SYN RTT is available.
      
      To keep the seeding formula consistent across SYN RTT and cached RTT,
      the rttvar is twice the cached RTT instead of cached RTTVAR value. The
      reason is because cached variation may be too small (near min RTO)
      which defeats the purpose of being conservative on first RTO. However
      the metrics still keep the RTT variations as they might be useful for
      user applications (through ip).
      Signed-off-by: NYuchung Cheng <ycheng@google.com>
      Signed-off-by: NNeal Cardwell <ncardwell@google.com>
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Tested-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1b7fdd2a
  13. 06 5月, 2013 1 次提交
  14. 17 11月, 2012 1 次提交
  15. 01 11月, 2012 1 次提交
  16. 11 9月, 2012 1 次提交
  17. 06 9月, 2012 1 次提交
    • J
      tcp: add generic netlink support for tcp_metrics · d23ff701
      Julian Anastasov 提交于
      Add support for genl "tcp_metrics". No locking
      is changed, only that now we can unlink and delete
      entries after grace period. We implement get/del for
      single entry and dump to support show/flush filtering
      in user space. Del without address attribute causes
      flush for all addresses, sadly under genl_mutex.
      
      v2:
      - remove rcu_assign_pointer as suggested by Eric Dumazet,
      it is not needed because there are no other writes under lock
      - move the flushing code in tcp_metrics_flush_all
      
      v3:
      - remove synchronize_rcu on flush as suggested by Eric Dumazet
      Signed-off-by: NJulian Anastasov <ja@ssi.bg>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d23ff701
  18. 09 8月, 2012 1 次提交
  19. 23 7月, 2012 1 次提交
  20. 21 7月, 2012 1 次提交
  21. 20 7月, 2012 2 次提交
  22. 19 7月, 2012 1 次提交
    • E
      ipv6: add ipv6_addr_hash() helper · ddbe5032
      Eric Dumazet 提交于
      Introduce ipv6_addr_hash() helper doing a XOR on all bits
      of an IPv6 address, with an optimized x86_64 version.
      
      Use it in flow dissector, as suggested by Andrew McGregor,
      to reduce hash collision probabilities in fq_codel (and other
      users of flow dissector)
      
      Use it in ip6_tunnel.c and use more bit shuffling, as suggested
      by David Laight, as existing hash was ignoring most of them.
      
      Use it in sunrpc and use more bit shuffling, using hash_32().
      
      Use it in net/ipv6/addrconf.c, using hash_32() as well.
      
      As a cleanup, use it in net/ipv4/tcp_metrics.c
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: NAndrew McGregor <andrewmcgr@gmail.com>
      Cc: Dave Taht <dave.taht@gmail.com>
      Cc: Tom Herbert <therbert@google.com>
      Cc: David Laight <David.Laight@ACULAB.COM>
      Cc: Joe Perches <joe@perches.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ddbe5032
  23. 12 7月, 2012 1 次提交
  24. 11 7月, 2012 4 次提交