• N
    tcp: fix cwnd-limited bug for TSO deferral where we send nothing · 299bcb55
    Neal Cardwell 提交于
    When cwnd is not a multiple of the TSO skb size of N*MSS, we can get
    into persistent scenarios where we have the following sequence:
    
    (1) ACK for full-sized skb of N*MSS arrives
      -> tcp_write_xmit() transmit full-sized skb with N*MSS
      -> move pacing release time forward
      -> exit tcp_write_xmit() because pacing time is in the future
    
    (2) TSQ callback or TCP internal pacing timer fires
      -> try to transmit next skb, but TSO deferral finds remainder of
         available cwnd is not big enough to trigger an immediate send
         now, so we defer sending until the next ACK.
    
    (3) repeat...
    
    So we can get into a case where we never mark ourselves as
    cwnd-limited for many seconds at a time, even with
    bulk/infinite-backlog senders, because:
    
    o In case (1) above, every time in tcp_write_xmit() we have enough
    cwnd to send a full-sized skb, we are not fully using the cwnd
    (because cwnd is not a multiple of the TSO skb size). So every time we
    send data, we are not cwnd limited, and so in the cwnd-limited
    tracking code in tcp_cwnd_validate() we mark ourselves as not
    cwnd-limited.
    
    o In case (2) above, every time in tcp_write_xmit() that we try to
    transmit the "remainder" of the cwnd but defer, we set the local
    variable is_cwnd_limited to true, but we do not send any packets, so
    sent_pkts is zero, so we don't call the cwnd-limited logic to update
    tp->is_cwnd_limited.
    
    Fixes: ca8a2263 ("tcp: make cwnd-limited checks measurement-based, and gentler")
    Reported-by: NIngemar Johansson <ingemar.s.johansson@ericsson.com>
    Signed-off-by: NNeal Cardwell <ncardwell@google.com>
    Signed-off-by: NYuchung Cheng <ycheng@google.com>
    Acked-by: NSoheil Hassas Yeganeh <soheil@google.com>
    Signed-off-by: NEric Dumazet <edumazet@google.com>
    Link: https://lore.kernel.org/r/20201209035759.1225145-1-ncardwell.kernel@gmail.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
    299bcb55
tcp_output.c 118.6 KB