1. 18 4月, 2015 1 次提交
    • E
      inet_diag: fix access to tcp cc information · 521f1cf1
      Eric Dumazet 提交于
      Two different problems are fixed here :
      
      1) inet_sk_diag_fill() might be called without socket lock held.
         icsk->icsk_ca_ops can change under us and module be unloaded.
         -> Access to freed memory.
         Fix this using rcu_read_lock() to prevent module unload.
      
      2) Some TCP Congestion Control modules provide information
         but again this is not safe against icsk->icsk_ca_ops
         change and nla_put() errors were ignored. Some sockets
         could not get the additional info if skb was almost full.
      
      Fix this by returning a status from get_info() handlers and
      using rcu protection as well.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Acked-by: NDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      521f1cf1
  2. 02 9月, 2014 1 次提交
  3. 31 7月, 2014 1 次提交
    • C
      tcp: Fix integer-overflow in TCP vegas · 1f74e613
      Christoph Paasch 提交于
      In vegas we do a multiplication of the cwnd and the rtt. This
      may overflow and thus their result is stored in a u64. However, we first
      need to cast the cwnd so that actually 64-bit arithmetic is done.
      
      Then, we need to do do_div to allow this to be used on 32-bit arches.
      
      Cc: Stephen Hemminger <stephen@networkplumber.org>
      Cc: Neal Cardwell <ncardwell@google.com>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: David Laight <David.Laight@ACULAB.COM>
      Cc: Doug Leith <doug.leith@nuim.ie>
      Fixes: 8d3a564d (tcp: tcp_vegas cong avoid fix)
      Signed-off-by: NChristoph Paasch <christoph.paasch@uclouvain.be>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1f74e613
  4. 04 5月, 2014 1 次提交
  5. 27 2月, 2014 1 次提交
    • E
      tcp: switch rtt estimations to usec resolution · 740b0f18
      Eric Dumazet 提交于
      Upcoming congestion controls for TCP require usec resolution for RTT
      estimations. Millisecond resolution is simply not enough these days.
      
      FQ/pacing in DC environments also require this change for finer control
      and removal of bimodal behavior due to the current hack in
      tcp_update_pacing_rate() for 'small rtt'
      
      TCP_CONG_RTT_STAMP is no longer needed.
      
      As Julian Anastasov pointed out, we need to keep user compatibility :
      tcp_metrics used to export RTT and RTTVAR in msec resolution,
      so we added RTT_US and RTTVAR_US. An iproute2 patch is needed
      to use the new attributes if provided by the kernel.
      
      In this example ss command displays a srtt of 32 usecs (10Gbit link)
      
      lpk51:~# ./ss -i dst lpk52
      Netid  State      Recv-Q Send-Q   Local Address:Port       Peer
      Address:Port
      tcp    ESTAB      0      1         10.246.11.51:42959
      10.246.11.52:64614
               cubic wscale:6,6 rto:201 rtt:0.032/0.001 ato:40 mss:1448
      cwnd:10 send
      3620.0Mbps pacing_rate 7240.0Mbps unacked:1 rcv_rtt:993 rcv_space:29559
      
      Updated iproute2 ip command displays :
      
      lpk51:~# ./ip tcp_metrics | grep 10.246.11.52
      10.246.11.52 age 561.914sec cwnd 10 rtt 274us rttvar 213us source
      10.246.11.51
      
      Old binary displays :
      
      lpk51:~# ip tcp_metrics | grep 10.246.11.52
      10.246.11.52 age 561.914sec cwnd 10 rtt 250us rttvar 125us source
      10.246.11.51
      
      With help from Julian Anastasov, Stephen Hemminger and Yuchung Cheng
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Acked-by: NNeal Cardwell <ncardwell@google.com>
      Cc: Stephen Hemminger <stephen@networkplumber.org>
      Cc: Yuchung Cheng <ycheng@google.com>
      Cc: Larry Brakmo <brakmo@google.com>
      Cc: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      740b0f18
  6. 14 2月, 2014 1 次提交
  7. 05 11月, 2013 1 次提交
    • Y
      tcp: properly handle stretch acks in slow start · 9f9843a7
      Yuchung Cheng 提交于
      Slow start now increases cwnd by 1 if an ACK acknowledges some packets,
      regardless the number of packets. Consequently slow start performance
      is highly dependent on the degree of the stretch ACKs caused by
      receiver or network ACK compression mechanisms (e.g., delayed-ACK,
      GRO, etc).  But slow start algorithm is to send twice the amount of
      packets of packets left so it should process a stretch ACK of degree
      N as if N ACKs of degree 1, then exits when cwnd exceeds ssthresh. A
      follow up patch will use the remainder of the N (if greater than 1)
      to adjust cwnd in the congestion avoidance phase.
      
      In addition this patch retires the experimental limited slow start
      (LSS) feature. LSS has multiple drawbacks but questionable benefit. The
      fractional cwnd increase in LSS requires a loop in slow start even
      though it's rarely used. Configuring such an increase step via a global
      sysctl on different BDPS seems hard. Finally and most importantly the
      slow start overshoot concern is now better covered by the Hybrid slow
      start (hystart) enabled by default.
      Signed-off-by: NYuchung Cheng <ycheng@google.com>
      Signed-off-by: NNeal Cardwell <ncardwell@google.com>
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9f9843a7
  8. 10 3月, 2011 1 次提交
  9. 26 5月, 2009 1 次提交
  10. 09 12月, 2008 1 次提交
    • D
      tcp: tcp_vegas cong avoid fix · 8d3a564d
      Doug Leith 提交于
      This patch addresses a book-keeping issue in tcp_vegas.c.  At present
      tcp_vegas does separate book-keeping of cwnd based on packet sequence
      numbers.  A mismatch can develop between this book-keeping and
      tp->snd_cwnd due, for example, to delayed acks acking multiple
      packets.  When vegas transitions to reno operation (e.g. following
      loss), then this mismatch leads to incorrect behaviour (akin to a cwnd
      backoff).  This seems mostly to affect operation at low cwnds where
      delayed acking can lead to a significant fraction of cwnd being
      covered by a single ack, leading to the book-keeping mismatch.  This
      patch modifies the congestion avoidance update to avoid the need for
      separate book-keeping while leaving vegas congestion avoidance
      functionally unchanged.  A secondary advantage of this modification is
      that the use of fixed-point (via V_PARAM_SHIFT) and 64 bit arithmetic
      is no longer necessary, simplifying the code.
      
      Some example test measurements with the patched code (confirming no functional
      change in the congestion avoidance algorithm) can be seen at:
      
      http://www.hamilton.ie/doug/vegaspatch/Signed-off-by: NDoug Leith <doug.leith@nuim.ie>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8d3a564d
  11. 05 12月, 2008 1 次提交
    • D
      tcp: tcp_vegas ssthresh bug fix · a6af2d6b
      Doug Leith 提交于
      This patch fixes a bug in tcp_vegas.c.  At the moment this code leaves
      ssthresh untouched.  However, this means that the vegas congestion
      control algorithm is effectively unable to reduce cwnd below the
      ssthresh value (if the vegas update lowers the cwnd below ssthresh,
      then slow start is activated to raise it back up).  One example where
      this matters is when during slow start cwnd overshoots the link
      capacity and a flow then exits slow start with ssthresh set to a value
      above where congestion avoidance would like to adjust it.
      Signed-off-by: NDoug Leith <doug.leith@nuim.ie>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a6af2d6b
  12. 01 5月, 2008 1 次提交
  13. 30 4月, 2008 1 次提交
    • L
      tcp: Overflow bug in Vegas · 15913114
      Lachlan Andrew 提交于
      From: Lachlan Andrew <lachlan.andrew@gmail.com>
      
      There is an overflow bug in net/ipv4/tcp_vegas.c for large BDPs
      (e.g. 400Mbit/s, 400ms).  The multiplication (old_wnd *
      vegas->baseRTT) << V_PARAM_SHIFT overflows a u32.
      
      [ Fix tcp_veno.c too, it has similar calculations. -DaveM ]
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      15913114
  14. 29 1月, 2008 1 次提交
  15. 30 10月, 2007 1 次提交
  16. 31 7月, 2007 1 次提交
  17. 18 7月, 2007 1 次提交
  18. 16 6月, 2007 1 次提交
  19. 26 4月, 2007 4 次提交
  20. 11 2月, 2007 1 次提交
  21. 03 12月, 2006 1 次提交
  22. 23 9月, 2006 1 次提交
  23. 01 7月, 2006 1 次提交
  24. 05 1月, 2006 1 次提交
  25. 07 12月, 2005 2 次提交
  26. 12 11月, 2005 1 次提交
  27. 11 11月, 2005 1 次提交
  28. 30 8月, 2005 3 次提交
  29. 24 6月, 2005 1 次提交