1. 14 Jan, 2015 5 commits
    • S
      tcp: avoid reducing cwnd when ACK+DSACK is received · 08abdffa
      Sébastien Barré committed
      With TLP, the peer may reply to a probe with an ACK+DSACK whose ack
      value equals tlp_high_seq. The current code misses such an ACK+DSACK,
      and the TLP episode is considered done only at the next, higher ack.
      Since that later ack no longer carries the DSACK, it costs a cwnd
      reduction.
      
      This patch ensures that this scenario does not cause a cwnd reduction, since
      receiving an ACK+DSACK indicates that both the initial segment and the probe
      have been received by the peer.
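
The decision described above can be sketched as a small userspace model. This is not the kernel code: `struct tlp_state`, `tlp_process_ack()` and the halving of `snd_cwnd` are hypothetical stand-ins chosen for illustration, and the kernel's separate handling of pure duplicate ACKs is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified sender state; not the kernel's struct tcp_sock. */
struct tlp_state {
	uint32_t tlp_high_seq; /* snd_nxt when the probe was sent; 0 = no episode */
	uint32_t snd_cwnd;     /* congestion window, in packets */
};

/* Wraparound-safe sequence number comparisons. */
static bool seq_before(uint32_t a, uint32_t b) { return (int32_t)(a - b) < 0; }
static bool seq_after(uint32_t a, uint32_t b)  { return seq_before(b, a); }

/*
 * Process one incoming ACK while a TLP episode may be open.
 * Returns true if this ACK caused a cwnd reduction.
 */
static bool tlp_process_ack(struct tlp_state *st, uint32_t ack, bool has_dsack)
{
	assert(st);

	/* No episode in progress, or the ACK does not reach the probe yet. */
	if (st->tlp_high_seq == 0 || seq_before(ack, st->tlp_high_seq))
		return false;

	if (has_dsack) {
		/* Original segment and probe both arrived: no loss, end episode. */
		st->tlp_high_seq = 0;
		return false;
	}

	if (seq_after(ack, st->tlp_high_seq)) {
		/* Episode ends without a DSACK: assume the probe repaired a loss. */
		st->tlp_high_seq = 0;
		/* Placeholder for the real cwnd reduction (assumption: halve). */
		st->snd_cwnd = st->snd_cwnd > 1 ? st->snd_cwnd / 2 : 1;
		return true;
	}

	/* ack == tlp_high_seq without DSACK: keep the episode open here. */
	return false;
}
```

In this model, an ACK of 1001 carrying a DSACK closes the episode with cwnd untouched, so a later ACK of 2001 finds no open episode and cannot trigger a reduction.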
      
      The following packetdrill test, from Neal Cardwell, validates this patch:
      
      // Establish a connection.
      0     socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
      +0    setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
      +0    bind(3, ..., ...) = 0
      +0    listen(3, 1) = 0
      
      +0    < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7>
      +0    > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 6>
      +.020 < . 1:1(0) ack 1 win 257
      +0    accept(3, ..., ...) = 4
      
      // Send 1 packet.
      +0    write(4, ..., 1000) = 1000
      +0    > P. 1:1001(1000) ack 1
      
      // Loss probe retransmission.
      // packets_out == 1 => schedule PTO in max(2*RTT, 1.5*RTT + 200ms)
      // In this case, this means: 1.5*RTT + 200ms = 230ms
      +.230 > P. 1:1001(1000) ack 1
      +0    %{ assert tcpi_snd_cwnd == 10 }%
      
      // Receiver ACKs at tlp_high_seq with a DSACK,
      // indicating they received the original packet and probe.
      +.020 < . 1:1(0) ack 1001 win 257 <sack 1:1001,nop,nop>
      +0    %{ assert tcpi_snd_cwnd == 10 }%
      
      // Send another packet.
      +0    write(4, ..., 1000) = 1000
      +0    > P. 1001:2001(1000) ack 1
      
      // Receiver ACKs above tlp_high_seq, which should end the TLP episode
      // if we haven't already. We should not reduce cwnd.
      +.020 < . 1:1(0) ack 2001 win 257
      +0    %{ assert tcpi_snd_cwnd == 10, tcpi_snd_cwnd }%
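
The 230 ms probe timeout used in the test follows from the formula quoted in its comments. A minimal sketch of that arithmetic, assuming the 20 ms RTT implied by the handshake timing; the function name `tlp_pto_single_packet_ms()` is hypothetical and covers only the `packets_out == 1` case stated in the test:

```c
/*
 * Probe timeout (PTO) in milliseconds when exactly one packet is in
 * flight, per the comment in the test:
 *   PTO = max(2 * RTT, 1.5 * RTT + 200ms)
 */
static unsigned int tlp_pto_single_packet_ms(unsigned int rtt_ms)
{
	unsigned int twice_rtt = 2 * rtt_ms;
	unsigned int delayed   = (3 * rtt_ms) / 2 + 200; /* 1.5 * RTT + 200ms */

	return delayed > twice_rtt ? delayed : twice_rtt;
}
```

With `rtt_ms = 20` the second term dominates (30 + 200 = 230 against 40), which matches the `+.230` delay before the probe in the script.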
      
      Credits:
      - Gregory helped find that tcp_process_tlp_ack() was where the cwnd
        got reduced in our MPTCP tests.
      - Neal wrote the packetdrill test above.
      - Yuchung reworked the patch to make it more readable.
      
      Cc: Gregory Detal <gregory.detal@uclouvain.be>
      Cc: Nandita Dukkipati <nanditad@google.com>
      Tested-by: Neal Cardwell <ncardwell@google.com>
      Reviewed-by: Yuchung Cheng <ycheng@google.com>
      Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: Sébastien Barré <sebastien.barre@uclouvain.be>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • D
      Merge branch 'rhashtable-next' · 52e3ad9f
      David S. Miller committed
      Ying Xue says:
      
      ====================
      remove nl_sk_hash_lock from netlink socket
      
      Now that the tipc socket avoids an extra lock by using
      rhashtable_lookup_insert(), the netlink socket can remove its hash
      lock as well. Because the netlink socket needs a compare function to
      look up an object, commit #1 first introduces a new function,
      rhashtable_lookup_compare_insert(), which is based on the original
      rhashtable_lookup_insert(). Commit #2 then uses the new function to
      remove nl_sk_hash_lock from the netlink socket. Lastly, as Thomas
      requested, commit #3 adds a note stating that implementations of the
      grow and shrink decision functions must enforce the min/max shift.
      
      v2:
       As Thomas pointed out, there was a race between checking portid and
       then setting it in commit #2. The socket lock is now held across both
       the check and the set, which makes them atomic and eliminates the
       race.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
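
The v2 fix in the series description, holding a lock across both the portid check and the portid assignment, can be illustrated in userspace. This is a sketch under stated assumptions: `struct demo_sock` and `demo_bind_portid()` are hypothetical, a pthread mutex stands in for the kernel socket lock, and a portid of 0 is assumed to mean "not yet bound".

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical socket model; the mutex stands in for the socket lock. */
struct demo_sock {
	pthread_mutex_t lock;
	uint32_t portid; /* assumption: 0 means "not yet bound" */
};

/*
 * Bind the socket to portid only if it is still unbound.
 * The check and the assignment happen under one lock hold, so two
 * concurrent callers cannot both observe "unbound" and both succeed.
 */
static bool demo_bind_portid(struct demo_sock *sk, uint32_t portid)
{
	bool bound = false;

	pthread_mutex_lock(&sk->lock);
	if (sk->portid == 0) {
		sk->portid = portid;
		bound = true;
	}
	pthread_mutex_unlock(&sk->lock);

	return bound;
}
```

Without the lock, a second caller could pass the `portid == 0` check between the first caller's check and its assignment, which is the kind of check-then-set race the v2 note describes.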
    • Y
      rhashtable: add a note for grow and shrink decision functions · 6f73d3b1
      Ying Xue committed
      Commit c0c09bfd ("rhashtable: avoid unnecessary wakeup for
      worker queue") moved the checks of whether the hash table size
      exceeds its maximum threshold or reaches its minimum threshold
      from the resizing functions to the resizing decision functions.
      Add a note in rhashtable.h stating that implementations of the
      grow and shrink decision functions must enforce the min/max
      shift; otherwise the min/max shift watermarks have no effect.
      Signed-off-by: Ying Xue <ying.xue@windriver.com>
      Cc: Thomas Graf <tgraf@suug.ch>
      Acked-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
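
A sketch of what "the decision functions must enforce min/max shift" can look like. The names `struct demo_table`, `demo_grow_decision()` and `demo_shrink_decision()` are hypothetical, and the 75% and 30% load thresholds are illustrative assumptions rather than values taken from this commit.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical table descriptor; the table has (1 << shift) buckets. */
struct demo_table {
	size_t nelems;          /* number of stored objects */
	unsigned int shift;     /* current size as a power of two */
	unsigned int min_shift; /* lower watermark for shrinking */
	unsigned int max_shift; /* upper watermark for growing */
};

/* Grow above 75% load, but never beyond max_shift. */
static bool demo_grow_decision(const struct demo_table *t)
{
	size_t size = (size_t)1 << t->shift;

	return t->nelems > size / 4 * 3 && t->shift < t->max_shift;
}

/* Shrink below 30% load, but never beneath min_shift. */
static bool demo_shrink_decision(const struct demo_table *t)
{
	size_t size = (size_t)1 << t->shift;

	return t->nelems < size * 3 / 10 && t->shift > t->min_shift;
}
```

If the resizing functions no longer check the limits themselves, dropping the `shift` comparisons from these two functions would let the table grow or shrink past its configured bounds.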
    • Y
      netlink: eliminate nl_sk_hash_lock · c5adde94
      Ying Xue committed
      As rhashtable_lookup_compare_insert() guarantees that the search
      and the insertion happen atomically, it's safe to eliminate
      nl_sk_hash_lock. After this, object insertion and removal are
      protected by the per-bucket lock on the write side, while object
      lookup is guarded by the RCU read lock on the read side.
      Signed-off-by: Ying Xue <ying.xue@windriver.com>
      Cc: Thomas Graf <tgraf@suug.ch>
      Acked-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Y
      rhashtable: involve rhashtable_lookup_compare_insert routine · 7a868d1e
      Ying Xue committed
      Introduce a new function called rhashtable_lookup_compare_insert(),
      which is very similar to rhashtable_lookup_insert(). The new function
      uses a compare function supplied by the user to look for an object,
      and inserts the object into the hash table only if no match is found.
      As the entire search and insertion runs under the per-bucket lock,
      users can avoid an extra lock.
      Signed-off-by: Ying Xue <ying.xue@windriver.com>
      Cc: Thomas Graf <tgraf@suug.ch>
      Acked-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
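
The idea behind a compare-and-insert routine can be shown with a toy chained hash table. This is not the rhashtable implementation or its signature: every `demo_` name below is hypothetical, the table size is fixed, and a pthread mutex per bucket stands in for the kernel's per-bucket lock.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_BUCKETS 8

struct demo_node {
	struct demo_node *next;
	unsigned int key;
};

struct demo_bucket {
	pthread_mutex_t lock;   /* stand-in for the per-bucket lock */
	struct demo_node *head;
};

/* Caller-supplied predicate: does this node match the lookup argument? */
typedef bool (*demo_cmp_fn)(const struct demo_node *node, const void *arg);

static void demo_buckets_init(struct demo_bucket *tbl)
{
	for (size_t i = 0; i < DEMO_BUCKETS; i++) {
		pthread_mutex_init(&tbl[i].lock, NULL);
		tbl[i].head = NULL;
	}
}

/*
 * Search obj's bucket with cmp(); insert obj only if nothing matches.
 * Search and insertion share one lock hold, so no outer lock is needed
 * to keep two racing inserters from both adding a matching object.
 * Returns true if obj was inserted, false if a match already existed.
 */
static bool demo_lookup_compare_insert(struct demo_bucket *tbl,
				       struct demo_node *obj,
				       demo_cmp_fn cmp, const void *arg)
{
	struct demo_bucket *b = &tbl[obj->key % DEMO_BUCKETS];
	bool inserted = false;

	pthread_mutex_lock(&b->lock);
	for (struct demo_node *n = b->head; n; n = n->next)
		if (cmp(n, arg))
			goto out;

	obj->next = b->head;
	b->head = obj;
	inserted = true;
out:
	pthread_mutex_unlock(&b->lock);
	return inserted;
}

/* Example predicate: match on key equality. */
static bool demo_key_equals(const struct demo_node *node, const void *arg)
{
	return node->key == *(const unsigned int *)arg;
}
```

Passing the predicate as a parameter lets a caller such as a socket table match on more than the hash key, which is why the plain lookup-insert variant is not enough for that use.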
  2. 13 Jan, 2015 35 commits