• N
    tcp: helpers to mitigate ACK loops by rate-limiting out-of-window dupacks · 032ee423
    Neal Cardwell 提交于
    Helpers for mitigating ACK loops by rate-limiting dupacks sent in
    response to incoming out-of-window packets.
    
    This patch includes:
    
    - rate-limiting logic
    - sysctl to control how often we allow dupacks to out-of-window packets
    - SNMP counter for cases where we rate-limited our dupack sending
    
    The rate-limiting logic in this patch decides to not send dupacks in
    response to out-of-window segments if (a) they are SYNs or pure ACKs
    and (b) the remote endpoint is sending them faster than the configured
    rate limit.
    
    We rate-limit our responses rather than blocking them entirely or
    resetting the connection, because legitimate connections can rely on
    dupacks in response to some out-of-window segments. For example, zero
    window probes are typically sent with a sequence number that is below
    the current window, and ZWPs thus expect to thus elicit a dupack in
    response.
    
    We allow dupacks in response to TCP segments with data, because these
    may be spurious retransmissions for which the remote endpoint wants to
    receive DSACKs. This is safe because segments with data can't
    realistically be part of ACK loops, which by their nature consist of
    each side sending pure/data-less ACKs to each other.
    
    The dupack interval is controlled by a new sysctl knob,
    tcp_invalid_ratelimit, given in milliseconds, in case an administrator
    needs to dial this upward in the face of a high-rate DoS attack. The
    name and units are chosen to be analogous to the existing analogous
    knob for ICMP, icmp_ratelimit.
    
    The default value for tcp_invalid_ratelimit is 500ms, which allows at
    most one such dupack per 500ms. This is chosen to be 2x faster than
    the 1-second minimum RTO interval allowed by RFC 6298 (section 2, rule
    2.4). We allow the extra 2x factor because network delay variations
    can cause packets sent at 1 second intervals to be compressed and
    arrive much closer.
    Reported-by: NAvery Fay <avery@mixpanel.com>
    Signed-off-by: NNeal Cardwell <ncardwell@google.com>
    Signed-off-by: NYuchung Cheng <ycheng@google.com>
    Signed-off-by: NEric Dumazet <edumazet@google.com>
    Signed-off-by: NDavid S. Miller <davem@davemloft.net>
    032ee423
ip-sysctl.txt 64.6 KB