1. 08 7月, 2014 1 次提交
    • N
      tcp: switch snt_synack back to measuring transmit time of first SYNACK · 86c6a2c7
      Neal Cardwell 提交于
      Always store in snt_synack the time at which the server received the
      first client SYN and attempted to send the first SYNACK.
      
      Recent commit aa27fc50 ("tcp: tcp_v[46]_conn_request: fix snt_synack
      initialization") resolved an inconsistency between IPv4 and IPv6 in
      the initialization of snt_synack. This commit brings back the idea
      from 843f4a55 (tcp: use tcp_v4_send_synack on first SYN-ACK), which
      was going for the original behavior of snt_synack from the commit
      where it was added in 9ad7c049 ("tcp: RFC2988bis + taking RTT
      sample from 3WHS for the passive open side") in v3.1.
      
      In addition to being simpler (and probably a tiny bit faster),
      unconditionally storing the time of the first SYNACK attempt has been
      useful because it allows calculating a performance metric quantifying
      how long it took to establish a passive TCP connection.
      Signed-off-by: NNeal Cardwell <ncardwell@google.com>
      Signed-off-by: NYuchung Cheng <ycheng@google.com>
      Cc: Octavian Purdila <octavian.purdila@intel.com>
      Cc: Jerry Chu <hkchu@google.com>
      Acked-by: NOctavian Purdila <octavian.purdila@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      86c6a2c7
  2. 28 6月, 2014 12 次提交
  3. 18 6月, 2014 1 次提交
  4. 24 5月, 2014 1 次提交
  5. 14 5月, 2014 4 次提交
    • L
      net: support marking accepting TCP sockets · 84f39b08
      Lorenzo Colitti 提交于
      When using mark-based routing, sockets returned from accept()
      may need to be marked differently depending on the incoming
      connection request.
      
      This is the case, for example, if different socket marks identify
      different networks: a listening socket may want to accept
      connections from all networks, but each connection should be
      marked with the network that the request came in on, so that
      subsequent packets are sent on the correct network.
      
      This patch adds a sysctl to mark TCP sockets based on the fwmark
      of the incoming SYN packet. If enabled, and an unmarked socket
      receives a SYN, then the SYN packet's fwmark is written to the
      connection's inet_request_sock, and later written back to the
      accepted socket when the connection is established.  If the
      socket already has a nonzero mark, then the behaviour is the same
      as it is today, i.e., the listening socket's fwmark is used.
      
      Black-box tested using user-mode linux:
      
      - IPv4/IPv6 SYN+ACK, FIN, etc. packets are routed based on the
        mark of the incoming SYN packet.
      - The socket returned by accept() is marked with the mark of the
        incoming SYN packet.
      - Tested with syncookies=1 and syncookies=2.
      Signed-off-by: NLorenzo Colitti <lorenzo@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      84f39b08
    • L
      net: add a sysctl to reflect the fwmark on replies · e110861f
      Lorenzo Colitti 提交于
      Kernel-originated IP packets that have no user socket associated
      with them (e.g., ICMP errors and echo replies, TCP RSTs, etc.)
      are emitted with a mark of zero. Add a sysctl to make them have
      the same mark as the packet they are replying to.
      
      This allows an administrator that wishes to do so to use
      mark-based routing, firewalling, etc. for these replies by
      marking the original packets inbound.
      
      Tested using user-mode linux:
       - ICMP/ICMPv6 echo replies and errors.
       - TCP RST packets (IPv4 and IPv6).
      Signed-off-by: NLorenzo Colitti <lorenzo@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e110861f
    • D
      tcp: IPv6 support for fastopen server · 3a19ce0e
      Daniel Lee 提交于
      After all the preparatory works, supporting IPv6 in Fast Open is now easy.
      We pretty much just mirror v4 code. The only difference is how we
      generate the Fast Open cookie for IPv6 sockets. Since Fast Open cookie
      is 128 bits and we use AES 128, we use CBC-MAC to encrypt both the
      source and destination IPv6 addresses since the cookie is a MAC tag.
      Signed-off-by: NDaniel Lee <longinus00@gmail.com>
      Signed-off-by: NYuchung Cheng <ycheng@google.com>
      Signed-off-by: NJerry Chu <hkchu@google.com>
      Acked-by: NNeal Cardwell <ncardwell@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3a19ce0e
    • Y
      tcp: improve fastopen icmp handling · 0a672f74
      Yuchung Cheng 提交于
      If a fast open socket is already accepted by the user, it should
      be treated like a connected socket to record the ICMP error in
      sk_softerr, so the user can fetch it. Do that in both tcp_v4_err
      and tcp_v6_err.
      
      Also refactor the sequence window check to improve readability
      (e.g., there were two local variables named 'req').
      Signed-off-by: NYuchung Cheng <ycheng@google.com>
      Signed-off-by: NDaniel Lee <longinus00@gmail.com>
      Signed-off-by: NJerry Chu <hkchu@google.com>
      Acked-by: NNeal Cardwell <ncardwell@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0a672f74
  6. 06 5月, 2014 1 次提交
  7. 12 4月, 2014 1 次提交
    • L
      net: ipv6: Fix oif in TCP SYN+ACK route lookup. · a36dbdb2
      Lorenzo Colitti 提交于
      net-next commit 9c76a114, ipv6: tcp_ipv6 policy route issue, had
      a boolean logic error that caused incorrect behaviour for TCP
      SYN+ACK when oif-based rules are in use. Specifically:
      
      1. If a SYN comes in from a global address, and sk_bound_dev_if
         is not set, the routing lookup has oif set to the interface
         the SYN came in on. Instead, it should have oif unset,
         because for global addresses, the incoming interface doesn't
         necessarily have any bearing on the interface the SYN+ACK is
         sent out on.
      2. If a SYN comes in from a link-local address, and
         sk_bound_dev_if is set, the routing lookup has oif set to the
         interface the SYN came in on. Instead, it should have oif set
         to sk_bound_dev_if, because that's what the application
         requested.
      Signed-off-by: NLorenzo Colitti <lorenzo@google.com>
      Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a36dbdb2
  8. 01 4月, 2014 2 次提交
  9. 04 3月, 2014 1 次提交
  10. 22 1月, 2014 1 次提交
  11. 20 1月, 2014 1 次提交
  12. 18 1月, 2014 1 次提交
  13. 27 12月, 2013 1 次提交
  14. 19 12月, 2013 1 次提交
  15. 11 12月, 2013 1 次提交
  16. 10 12月, 2013 2 次提交
  17. 06 12月, 2013 1 次提交
  18. 22 10月, 2013 1 次提交
  19. 11 10月, 2013 1 次提交
  20. 10 10月, 2013 1 次提交
    • E
      inet: includes a sock_common in request_sock · 634fb979
      Eric Dumazet 提交于
      TCP listener refactoring, part 5 :
      
      We want to be able to insert request sockets (SYN_RECV) into main
      ehash table instead of the per listener hash table to allow RCU
      lookups and remove listener lock contention.
      
      This patch includes the needed struct sock_common in front
      of struct request_sock
      
      This means there is no more inet6_request_sock IPv6 specific
      structure.
      
      Following inet_request_sock fields were renamed as they became
      macros to reference fields from struct sock_common.
      Prefix ir_ was chosen to avoid name collisions.
      
      loc_port   -> ir_loc_port
      loc_addr   -> ir_loc_addr
      rmt_addr   -> ir_rmt_addr
      rmt_port   -> ir_rmt_port
      iif        -> ir_iif
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      634fb979
  21. 09 10月, 2013 2 次提交
    • E
      ipv6: make lookups simpler and faster · efe4208f
      Eric Dumazet 提交于
      TCP listener refactoring, part 4 :
      
      To speed up inet lookups, we moved IPv4 addresses from inet to struct
      sock_common
      
      Now is time to do the same for IPv6, because it permits us to have fast
      lookups for all kind of sockets, including upcoming SYN_RECV.
      
      Getting IPv6 addresses in TCP lookups currently requires two extra cache
      lines, plus a dereference (and memory stall).
      
      inet6_sk(sk) does the dereference of inet_sk(__sk)->pinet6
      
      This patch is way bigger than its IPv4 counter part, because for IPv4,
      we could add aliases (inet_daddr, inet_rcv_saddr), while on IPv6,
      it's not doable easily.
      
      inet6_sk(sk)->daddr becomes sk->sk_v6_daddr
      inet6_sk(sk)->rcv_saddr becomes sk->sk_v6_rcv_saddr
      
      And timewait socket also have tw->tw_v6_daddr & tw->tw_v6_rcv_saddr
      at the same offset.
      
      We get rid of INET6_TW_MATCH() as INET6_MATCH() is now the generic
      macro.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      efe4208f
    • E
      tcp/dccp: remove twchain · 05dbc7b5
      Eric Dumazet 提交于
      TCP listener refactoring, part 3 :
      
      Our goal is to hash SYN_RECV sockets into main ehash for fast lookup,
      and parallel SYN processing.
      
      Current inet_ehash_bucket contains two chains, one for ESTABLISH (and
      friend states) sockets, another for TIME_WAIT sockets only.
      
      As the hash table is sized to get at most one socket per bucket, it
      makes little sense to have separate twchain, as it makes the lookup
      slightly more complicated, and doubles hash table memory usage.
      
      If we make sure all socket types have the lookup keys at the same
      offsets, we can use a generic and faster lookup. It turns out TIME_WAIT
      and ESTABLISHED sockets already have common lookup fields for IPv4.
      
      [ INET_TW_MATCH() is no longer needed ]
      
      I'll provide a follow-up to factorize IPv6 lookup as well, to remove
      INET6_TW_MATCH()
      
      This way, SYN_RECV pseudo sockets will be supported the same.
      
      A new sock_gen_put() helper is added, doing either a sock_put() or
      inet_twsk_put() [ and will support SYN_RECV later ].
      
      Note this helper should only be called in real slow path, when rcu
      lookup found a socket that was moved to another identity (freed/reused
      immediately), but could eventually be used in other contexts, like
      sock_edemux()
      
      Before patch :
      
      dmesg | grep "TCP established"
      
      TCP established hash table entries: 524288 (order: 11, 8388608 bytes)
      
      After patch :
      
      TCP established hash table entries: 524288 (order: 10, 4194304 bytes)
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      05dbc7b5
  22. 04 10月, 2013 1 次提交
    • E
      tcp: shrink tcp6_timewait_sock by one cache line · 96f817fe
      Eric Dumazet 提交于
      While working on tcp listener refactoring, I found that it
      would really make things easier if sock_common could include
      the IPv6 addresses needed in the lookups, instead of doing
      very complex games to get their values (depending on sock
      being SYN_RECV, ESTABLISHED, TIME_WAIT)
      
      For this to happen, I need to be sure that tcp6_timewait_sock
      and tcp_timewait_sock consume same number of cache lines.
      
      This is possible if we only use 32bits for tw_ttd, as we remove
      one 32bit hole in inet_timewait_sock
      
      inet_tw_time_stamp() is defined and used, even if its current
      implementation looks like tcp_time_stamp : We might need finer
      resolution for tcp_time_stamp in the future.
      
      Before patch : sizeof(struct tcp6_timewait_sock) = 0xc8
      
      After patch : sizeof(struct tcp6_timewait_sock) = 0xc0
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      96f817fe
  23. 05 9月, 2013 1 次提交