    tcp: bind(0) remove the SO_REUSEADDR restriction when ephemeral ports are exhausted. · 4b01a967
    Committed by Kuniyuki Iwashima
    Commit aacd9289 ("tcp: bind() use stronger condition for bind_conflict")
    introduced a restriction that forbids binding SO_REUSEADDR enabled sockets
    to the same (addr, port) tuple, so that ports are assigned dispersedly and
    we can connect to the same remote host.
    
    That change accelerates port depletion: we fail to bind sockets to the
    same local port even when we want to connect to different remote hosts.
    
    You can reproduce this issue by following the instructions below.
    
      1. # sysctl -w net.ipv4.ip_local_port_range="32768 32768"
      2. set SO_REUSEADDR on two sockets.
      3. bind both sockets to (localhost, 0); the second bind fails.
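
The reproduction steps above can be sketched in Python roughly as follows. This is a minimal illustration, not part of the patch: the helper name `bind_reuseaddr_pair` is hypothetical, and the sysctl from step 1 has to be applied separately as root. With the default port range both binds normally succeed; with the range narrowed to a single port, the second entry is expected to be an errno name such as EADDRINUSE.

```python
import errno
import socket


def bind_reuseaddr_pair(host="127.0.0.1"):
    """Bind two SO_REUSEADDR sockets to (host, 0).

    Returns a list with one entry per socket: the assigned port (int)
    on success, or the symbolic errno name (str) if bind() failed.
    """
    socks, results = [], []
    try:
        for _ in range(2):
            s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            socks.append(s)
            # Step 2: enable SO_REUSEADDR before binding.
            s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
            try:
                # Step 3: port 0 asks the kernel for an ephemeral port.
                s.bind((host, 0))
                results.append(s.getsockname()[1])
            except OSError as exc:
                results.append(errno.errorcode.get(exc.errno, str(exc.errno)))
    finally:
        for s in socks:
            s.close()
    return results


if __name__ == "__main__":
    print(bind_reuseaddr_pair())
```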
    
    Therefore, when ephemeral ports are exhausted, bind(0) should fall back to
    the legacy behaviour to enable the SO_REUSEADDR option and make it possible
    to connect to different remote (addr, port) tuples.
    
    This patch allows us to bind SO_REUSEADDR enabled sockets to the same
    (addr, port) only when net.ipv4.ip_autobind_reuse is set to 1 and all
    ephemeral ports are exhausted. This also allows connect() and listen() to
    share ports in the following way, which may break some applications. For
    that reason, ip_autobind_reuse defaults to 0, which disables the feature.
    
      1. setsockopt(sk1, SO_REUSEADDR)
      2. setsockopt(sk2, SO_REUSEADDR)
      3. bind(sk1, saddr, 0)
      4. bind(sk2, saddr, 0)
      5. connect(sk1, daddr)
      6. listen(sk2)
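
The six-step sequence above could be exercised from user space along these lines. This is only a sketch under assumptions: `bind_connect_listen` is a hypothetical helper, and when no `daddr` is given it starts its own loopback listener, which itself consumes a local port. For an experiment with an exhausted port range, pass a real remote `daddr` instead. On a system where ports are not exhausted, the two returned ports normally differ.

```python
import socket


def bind_connect_listen(host="127.0.0.1", daddr=None):
    """Run the setsockopt/bind/connect/listen sequence on two sockets.

    Returns (port_of_sk1, port_of_sk2). Raises OSError if any step fails.
    """
    helper = None
    sk1 = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sk2 = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        if daddr is None:
            # Assumption for the demo: a throwaway local peer to connect to.
            helper = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            helper.bind((host, 0))
            helper.listen(1)
            daddr = helper.getsockname()
        sk1.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)  # step 1
        sk2.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)  # step 2
        sk1.bind((host, 0))                                        # step 3
        sk2.bind((host, 0))                                        # step 4
        sk1.connect(daddr)                                         # step 5
        sk2.listen(1)                                              # step 6
        return sk1.getsockname()[1], sk2.getsockname()[1]
    finally:
        for s in (sk1, sk2, helper):
            if s is not None:
                s.close()


if __name__ == "__main__":
    print(bind_connect_listen())
```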
    
    If it is set to 1, we can fully utilize the 4-tuples, but we should use
    IP_BIND_ADDRESS_NO_PORT for bind()+connect() wherever possible.
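
As an illustration of the bind()+connect() pattern with IP_BIND_ADDRESS_NO_PORT, here is a minimal Python sketch. The function name `connect_without_early_port` is hypothetical. It assumes Linux, where the option value is 24 in linux/in.h; the `getattr` fallback covers Python builds that do not export the constant. If the kernel rejects the option, the sketch falls back to an ordinary bind.

```python
import socket

# Assumption: Linux value from linux/in.h, used when Python lacks the constant.
IP_BIND_ADDRESS_NO_PORT = getattr(socket, "IP_BIND_ADDRESS_NO_PORT", 24)


def connect_without_early_port(source_ip, daddr):
    """Bind to source_ip with port 0, deferring port choice to connect().

    Returns (socket, port_after_bind, port_after_connect). The caller
    owns the returned socket and must close it.
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        try:
            # Ask the kernel not to reserve a port at bind() time.
            s.setsockopt(socket.IPPROTO_IP, IP_BIND_ADDRESS_NO_PORT, 1)
        except OSError:
            pass  # option unsupported here: ordinary bind(0) behaviour applies
        s.bind((source_ip, 0))
        port_after_bind = s.getsockname()[1]
        # The port is picked here, when the full 4-tuple is known.
        s.connect(daddr)
        port_after_connect = s.getsockname()[1]
    except OSError:
        s.close()
        raise
    return s, port_after_bind, port_after_connect
```

When the option is honoured, the port reported right after bind() is 0 and a real port appears only after connect(), so the kernel can choose it against the complete (saddr, sport, daddr, dport) tuple.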
    
    The notable thing is that if all sockets bound to the same port have
    both SO_REUSEADDR and SO_REUSEPORT enabled, we can bind sockets to an
    ephemeral port and also do listen().
    
    Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
    Signed-off-by: David S. Miller <davem@davemloft.net>