1. 30 Sep, 2015 22 commits
  2. 29 Sep, 2015 2 commits
    • tcp: Fix CWV being too strict on thin streams · d2e1339f
      Bendik Rønning Opstad committed
      Application-limited streams, such as thin streams that transmit small
      amounts of payload in relatively few packets per RTT, can be prevented
      from growing the CWND when in congestion avoidance. This leads to
      increased sojourn times for data segments in streams that often transmit
      time-dependent data.
      
      Currently, a connection is considered CWND limited only after having
      successfully transmitted at least one packet with new data, while at the
      same time failing to transmit some unsent data from the output queue
      because the CWND is full. Applications that produce small amounts of
      data may leave a connection in a state where it is never considered to
      be CWND limited, because all unsent data is successfully transmitted
      each time an incoming ACK opens up room for more data in the send
      window.
      
      Fix by always testing whether the CWND is fully used after successful
      packet transmissions, such that a connection is considered CWND limited
      whenever the CWND has been filled. This is the correct behavior as
      specified in RFC2861 (section 3.1).
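
      The difference between the two rules can be illustrated with a small model. This is a sketch under simplifying assumptions, not the kernel patch itself: `struct conn`, `xmit()`, `cwnd_limited_old()` and `cwnd_limited_new()` are hypothetical names, and the window is counted in whole packets.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified connection state (not the kernel's struct tcp_sock). */
struct conn {
    unsigned int cwnd;      /* congestion window, in packets */
    unsigned int in_flight; /* packets sent but not yet ACKed */
    unsigned int unsent;    /* packets waiting in the output queue */
};

/*
 * Send as many queued packets as the window allows.
 * Returns the number of packets sent; *blocked is set when unsent
 * data had to be left behind because the window was full.
 */
unsigned int xmit(struct conn *c, bool *blocked)
{
    unsigned int sent = 0;

    assert(c && blocked);
    *blocked = false;
    while (c->unsent > 0) {
        if (c->in_flight >= c->cwnd) {
            *blocked = true;
            break;
        }
        c->in_flight++;
        c->unsent--;
        sent++;
    }
    return sent;
}

/* Old rule: limited only if new data was sent AND some data was left behind. */
bool cwnd_limited_old(struct conn *c)
{
    bool blocked;
    unsigned int sent = xmit(c, &blocked);

    return sent > 0 && blocked;
}

/* New rule: after a successful send, also check whether the window is now full. */
bool cwnd_limited_new(struct conn *c)
{
    bool blocked;
    unsigned int sent = xmit(c, &blocked);

    return sent > 0 && (blocked || c->in_flight >= c->cwnd);
}
```

      With `cwnd = 2`, one packet in flight and one packet queued, the old rule reports the connection as not CWND limited, while the new rule reports it as limited, because the send fills the window without leaving any data behind.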
      
      Cc: Andreas Petlund <apetlund@simula.no>
      Cc: Carsten Griwodz <griff@simula.no>
      Cc: Jonas Markussen <jonassm@ifi.uio.no>
      Cc: Kenneth Klette Jonassen <kennetkl@ifi.uio.no>
      Cc: Mads Johannessen <madsjoh@ifi.uio.no>
      Signed-off-by: Bendik Rønning Opstad <bro.devel+kernel@gmail.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Tested-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Tested-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: avoid reorders for TFO passive connections · 7c85af88
      Eric Dumazet committed
      We found that a TCP Fast Open passive connection was vulnerable
      to reordering, as the exchange might look like this:
      
      [1] C -> S S <FO ...> <request>
      [2] S -> C S. ack request <options>
      [3] S -> C . <answer>
      
      Packets [2] and [3] can be generated at almost the same time.
      
      If C receives packet [3] before packet [2], it drops packet [3],
      because the socket is in SYN_SENT state and expects a SYNACK.
      
      S will have to retransmit the answer.
      
      The current OOO avoidance in Linux is defeated because SYNACK
      packets are attached to the LISTEN socket, while DATA packets
      are attached to the child sockets. They might be sent by different
      CPUs, and different TX queues might be selected.
      
      It turns out that for TFO, we create a child, which is a
      full-blown socket in TCP_SYN_RECV state, and we can simply attach
      the SYNACK packet to this socket.
      
      This means that at the time tcp_sendmsg() pushes a DATA packet,
      skb->ooo_okay will be set iff the SYNACK packet has been sent
      and its TX completed.
      
      This removes the reorder source at the host level.
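
      The effect of attaching the SYNACK to the child socket can be sketched with a toy model of per-socket TX accounting. It assumes that a packet may move to a different TX queue only when its owning socket has nothing pending in the device. `struct sock_model`, `send_pkt()` and `tx_complete()` are hypothetical names, not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a socket's TX accounting (not struct sock). */
struct sock_model {
    unsigned int tx_pending; /* packets handed to the device, TX not yet completed */
};

/*
 * Charge one packet to its owning socket.
 * Returns the modelled ooo_okay value: true when the owner had nothing
 * pending, so picking a different TX queue cannot reorder its traffic.
 */
bool send_pkt(struct sock_model *owner)
{
    bool ooo_okay;

    assert(owner);
    ooo_okay = (owner->tx_pending == 0);
    owner->tx_pending++;
    return ooo_okay;
}

/* Signal that the device finished transmitting one packet of this socket. */
void tx_complete(struct sock_model *owner)
{
    assert(owner && owner->tx_pending > 0);
    owner->tx_pending--;
}
```

      In this model, if the SYNACK is charged to the listener, the first DATA packet on the child sees an idle owner and may be steered to another queue. If the SYNACK is charged to the child, the DATA packet keeps the same queue until the SYNACK's TX completes.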
      
      We also removed the export of tcp_try_fastopen(), as it is no
      longer called from IPv6.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 26 Sep, 2015 13 commits
  4. 25 Sep, 2015 3 commits
    • tcp: factorize sk_txhash init · d8ed6250
      Eric Dumazet committed
      Neal suggested moving the sk_txhash init into tcp_create_openreq_child(),
      which is called from both IPv4 and IPv6.
      
      This opportunity was missed in commit 58d607d3 ("tcp: provide
      skb->hash to synack packets").
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • lwtunnel: remove source and destination UDP port config option · b194f30c
      Jiri Benc committed
      The UDP tunnel config is asymmetric with respect to the ports used. The
      source and destination ports from one direction of the tunnel are not
      related to the ports of the other direction. We need to be able to respond
      to ARP requests using the correct ports without involving routing.
      
      As a consequence, UDP ports need to be a fixed property of the tunnel
      interface and cannot be set per route. Remove the ability to set ports per
      route. This is still okay to do, as no kernel has been released with these
      attributes yet.
      
      Note that the ability to specify source and destination ports is preserved
      for other users of the lwtunnel API which don't use routes for tunnel key
      specification (like openvswitch).
      
      If in the future we rework ARP handling to allow port specification, the
      attributes can be added back.
      Signed-off-by: Jiri Benc <jbenc@redhat.com>
      Acked-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: send arp replies to the correct tunnel · 63d008a4
      Jiri Benc committed
      When using IP lwtunnels, the additional data for xmit (basically, the actual
      tunnel to use) is carried in ip_tunnel_info, either in dst->lwtstate or in
      a metadata dst. When replying to ARP requests, we need to send the reply to
      the same tunnel the request came from. This means we need to construct a
      proper metadata dst for ARP replies.
      
      We could perform another route lookup to get a dst entry with the correct
      lwtstate. However, this won't always ensure that the outgoing tunnel is the
      same as the incoming one, and it won't work anyway for IPv4 duplicate
      address detection.
      
      The only reliable option is to "reverse" the ip_tunnel_info of the incoming
      request, so that the reply goes back through the tunnel it arrived on.
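
      A minimal sketch of what "reversing" tunnel metadata could look like, assuming a simplified IPv4-only structure. `struct tun_info`, `TUN_INFO_TX` and `tun_info_reply()` are hypothetical names for illustration and do not reproduce the kernel's ip_tunnel_info layout. Leaving the reply's outer source address unset is also an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>

#define TUN_INFO_TX 0x01 /* hypothetical flag: metadata describes a transmit path */

/* Hypothetical, simplified tunnel metadata (IPv4 outer addresses only). */
struct tun_info {
    uint64_t tun_id; /* tunnel key, e.g. a virtual network identifier */
    uint32_t src;    /* outer source address */
    uint32_t dst;    /* outer destination address */
    uint8_t  mode;   /* flags */
};

/*
 * Build transmit metadata for a reply from the metadata of a received packet:
 * keep the tunnel id, send to whoever the request came from, mark as TX.
 * The outer source is left as 0 (assumed to be chosen later by the stack).
 */
struct tun_info tun_info_reply(const struct tun_info *rx)
{
    struct tun_info tx = {0};

    assert(rx);
    tx.tun_id = rx->tun_id;
    tx.dst = rx->src;
    tx.mode = rx->mode | TUN_INFO_TX;
    return tx;
}
```

      No UDP ports appear in the sketch, in line with the previous commit, which makes ports a fixed property of the tunnel interface.
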
      Signed-off-by: Jiri Benc <jbenc@redhat.com>
      Acked-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>