    tcp: TSQ can use a dynamic limit · c9eeec26
    Authored by Eric Dumazet
    When TCP Small Queues was added, we used a sysctl to limit the number of
    packets queued on Qdisc/device queues for a given TCP flow.
    
    Problem is this limit is either too big for low rates, or too small
    for high rates.
    
    Now that the TCP stack has rate estimation in sk->sk_pacing_rate, and TSO
    auto sizing, it can better control the number of packets in Qdisc/device
    queues.
    
    New limit is two packets or at least 1 to 2 ms worth of packets.
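
The rule above can be sketched in user-space C. This is a minimal illustration, not the kernel code: the helper names `tsq_dynamic_limit` and `tsq_should_throttle` are hypothetical, and the sketch assumes the pacing rate is expressed in bytes per second and that "1 ms" is approximated by dividing by 1024 (a right shift by 10). Since sk_pacing_rate is described as twice the actual rate, about 1 ms at the pacing rate would correspond to about 2 ms at the actual rate, which may be what "1 to 2 ms" refers to.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): per-flow byte budget for
 * data sitting in Qdisc/device queues.
 *
 * Assumptions of this sketch:
 *  - pkt_truesize is the memory footprint in bytes of one queued packet
 *  - pacing_rate is in bytes per second
 *  - "about 1 ms worth of bytes" is approximated as pacing_rate / 1024
 */
uint64_t tsq_dynamic_limit(uint64_t pkt_truesize, uint64_t pacing_rate)
{
    uint64_t two_packets = 2 * pkt_truesize;
    uint64_t about_one_ms = pacing_rate >> 10;

    /* Low rates: the two-packet floor wins. High rates: the time budget wins. */
    return two_packets > about_one_ms ? two_packets : about_one_ms;
}

/* Hypothetical helper: stop sending once the queued bytes exceed the limit. */
int tsq_should_throttle(uint64_t queued_bytes, uint64_t limit)
{
    return queued_bytes > limit;
}
```

With these assumptions, a slow flow is held to two packets in the queues, while a fast flow is allowed a byte budget that grows with its pacing rate.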
    
    Low rate flows benefit from this patch by having an even smaller
    number of packets in queues, allowing for faster recovery and
    better RTT estimation.
    
    High rate flows benefit from this patch by being allowed more than 2
    packets in flight, as we had reports that this was a limiting factor in
    reaching line rate. [ In particular if TX completion is delayed because
    of coalescing parameters ]
    
    Example for a single flow on a 10Gbps link controlled by FQ/pacing:
    
    14 packets in flight instead of 2
    
    $ tc -s -d qd
    qdisc fq 8001: dev eth0 root refcnt 32 limit 10000p flow_limit 100p
    buckets 1024 quantum 3028 initial_quantum 15140
     Sent 1168459366606 bytes 771822841 pkt (dropped 0, overlimits 0
    requeues 6822476)
     rate 9346Mbit 771713pps backlog 953820b 14p requeues 6822476
      2047 flow, 2046 inactive, 1 throttled, delay 15673 ns
      2372 gc, 0 highprio, 0 retrans, 9739249 throttled, 0 flows_plimit
    
    Note that sk_pacing_rate is currently set to twice the actual rate, but
    this might be refined in the future when a flow is in congestion
    avoidance.
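
As a rough illustration of "twice the actual rate", the estimate can be sketched as doubling the bytes sent per round trip. This is a simplified user-space sketch under stated assumptions, not the kernel's implementation: the function name `sketch_pacing_rate`, the microsecond RTT unit, and the handling of a missing RTT sample are all choices made for this example.

```c
#include <stdint.h>

/*
 * Hypothetical sketch (not a kernel function): pacing rate in bytes per
 * second, taken as twice the rate implied by one congestion window per RTT.
 *
 * Assumptions of this sketch:
 *  - mss_bytes is the payload size of one segment
 *  - cwnd_packets is the congestion window in segments
 *  - srtt_us is the smoothed RTT in microseconds
 */
uint64_t sketch_pacing_rate(uint32_t mss_bytes, uint32_t cwnd_packets,
                            uint32_t srtt_us)
{
    /* Bytes sent per RTT, doubled. */
    uint64_t rate = (uint64_t)mss_bytes * cwnd_packets * 2;

    /* With no RTT sample, this sketch chooses to report "no limit". */
    if (srtt_us == 0)
        return UINT64_MAX;

    /* Convert from bytes per RTT to bytes per second. */
    return rate * 1000000ULL / srtt_us;
}
```

For example, a window of 10 segments of 1000 bytes over a 1 ms RTT moves 10 MB/s, so this sketch would report a pacing rate of 20 MB/s.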
    
    Additional change : skb->destructor should be set to tcp_wfree().
    
    A future patch (for linux 3.13+) might remove tcp_limit_output_bytes.

    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Cc: Wei Liu <wei.liu2@citrix.com>
    Cc: Cong Wang <xiyou.wangcong@gmail.com>
    Cc: Yuchung Cheng <ycheng@google.com>
    Cc: Neal Cardwell <ncardwell@google.com>
    Acked-by: Neal Cardwell <ncardwell@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
tcp_output.c 91.9 KB