1. 21 9月, 2016 5 次提交
  2. 20 9月, 2016 1 次提交
  3. 17 9月, 2016 2 次提交
  4. 16 9月, 2016 1 次提交
    • E
      tcp: fix a stale ooo_last_skb after a replace · 76f0dcbb
      Eric Dumazet 提交于
      When skb replaces another one in ooo queue, I forgot to also
      update tp->ooo_last_skb as well, if the replaced skb was the last one
      in the queue.
      
      To fix this, we simply can re-use the code that runs after an insertion,
      trying to merge skbs at the right of current skb.
      
      This not only fixes the bug, but also remove all small skbs that might
      be a subset of the new one.
      
      Example:
      
      We receive segments 2001:3001,  4001:5001
      
      Then we receive 2001:8001 : We should replace 2001:3001 with the big
      skb, but also remove 4001:50001 from the queue to save space.
      
      packetdrill test demonstrating the bug
      
      0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
      +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
      +0 bind(3, ..., ...) = 0
      +0 listen(3, 1) = 0
      
      +0 < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7>
      +0 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 7>
      +0.100 < . 1:1(0) ack 1 win 1024
      +0 accept(3, ..., ...) = 4
      
      +0.01 < . 1001:2001(1000) ack 1 win 1024
      +0    > . 1:1(0) ack 1 <nop,nop, sack 1001:2001>
      
      +0.01 < . 1001:3001(2000) ack 1 win 1024
      +0    > . 1:1(0) ack 1 <nop,nop, sack 1001:2001 1001:3001>
      
      Fixes: 9f5afeae ("tcp: use an RB tree for ooo receive queue")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: NYuchung Cheng <ycheng@google.com>
      Cc: Yaogong Wang <wygivan@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      76f0dcbb
  5. 11 9月, 2016 9 次提交
  6. 10 9月, 2016 2 次提交
    • E
      ip_tunnel: do not clear l4 hashes · bf8d85d4
      Eric Dumazet 提交于
      If skb has a valid l4 hash, there is no point clearing hash and force
      a further flow dissection when a tunnel encapsulation is added.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      bf8d85d4
    • G
      ipv4: fix value of ->nlmsg_flags reported in RTM_NEWROUTE events · b93e1fa7
      Guillaume Nault 提交于
      fib_table_insert() inconsistently fills the nlmsg_flags field in its
      notification messages.
      
      Since commit b8f55831 ("[RTNETLINK]: Fix sending netlink message
      when replace route."), the netlink message has its nlmsg_flags set to
      NLM_F_REPLACE if the route replaced a preexisting one.
      
      Then commit a2bb6d7d ("ipv4: include NLM_F_APPEND flag in append
      route notifications") started setting nlmsg_flags to NLM_F_APPEND if
      the route matched a preexisting one but was appended.
      
      In other cases (exclusive creation or prepend), nlmsg_flags is 0.
      
      This patch sets ->nlmsg_flags in all situations, preserving the
      semantic of the NLM_F_* bits:
      
        * NLM_F_CREATE: a new fib entry has been created for this route.
        * NLM_F_EXCL: no other fib entry existed for this route.
        * NLM_F_REPLACE: this route has overwritten a preexisting fib entry.
        * NLM_F_APPEND: the new fib entry was added after other entries for
          the same route.
      
      As a result, the possible flag combination can now be reported
      (iproute2's terminology into parentheses):
      
        * NLM_F_CREATE | NLM_F_EXCL: route didn't exist, exclusive creation
          ("add").
        * NLM_F_CREATE | NLM_F_APPEND: route did already exist, new route
          added after preexisting ones ("append").
        * NLM_F_CREATE: route did already exist, new route added before
          preexisting ones ("prepend").
        * NLM_F_REPLACE: route did already exist, new route replaced the
          first preexisting one ("change").
      Signed-off-by: NGuillaume Nault <g.nault@alphalink.fr>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b93e1fa7
  7. 09 9月, 2016 5 次提交
  8. 08 9月, 2016 1 次提交
  9. 07 9月, 2016 1 次提交
  10. 02 9月, 2016 4 次提交
  11. 31 8月, 2016 1 次提交
    • R
      net: lwtunnel: Handle fragmentation · 14972cbd
      Roopa Prabhu 提交于
      Today mpls iptunnel lwtunnel_output redirect expects the tunnel
      output function to handle fragmentation. This is ok but can be
      avoided if we did not do the mpls output redirect too early.
      ie we could wait until ip fragmentation is done and then call
      mpls output for each ip fragment.
      
      To make this work we will need,
      1) the lwtunnel state to carry encap headroom
      2) and do the redirect to the encap output handler on the ip fragment
      (essentially do the output redirect after fragmentation)
      
      This patch adds tunnel headroom in lwtstate to make sure we
      account for tunnel data in mtu calculations during fragmentation
      and adds new xmit redirect handler to redirect to lwtunnel xmit func
      after ip fragmentation.
      
      This includes IPV6 and some mtu fixes and testing from David Ahern.
      Signed-off-by: NRoopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      14972cbd
  12. 30 8月, 2016 2 次提交
  13. 29 8月, 2016 2 次提交
    • E
      tcp: add tcp_add_backlog() · c9c33212
      Eric Dumazet 提交于
      When TCP operates in lossy environments (between 1 and 10 % packet
      losses), many SACK blocks can be exchanged, and I noticed we could
      drop them on busy senders, if these SACK blocks have to be queued
      into the socket backlog.
      
      While the main cause is the poor performance of RACK/SACK processing,
      we can try to avoid these drops of valuable information that can lead to
      spurious timeouts and retransmits.
      
      Cause of the drops is the skb->truesize overestimation caused by :
      
      - drivers allocating ~2048 (or more) bytes as a fragment to hold an
        Ethernet frame.
      
      - various pskb_may_pull() calls bringing the headers into skb->head
        might have pulled all the frame content, but skb->truesize could
        not be lowered, as the stack has no idea of each fragment truesize.
      
      The backlog drops are also more visible on bidirectional flows, since
      their sk_rmem_alloc can be quite big.
      
      Let's add some room for the backlog, as only the socket owner
      can selectively take action to lower memory needs, like collapsing
      receive queues or partial ofo pruning.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Acked-by: NNeal Cardwell <ncardwell@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c9c33212
    • T
      tcp: Set read_sock and peek_len proto_ops · 32035585
      Tom Herbert 提交于
      In inet_stream_ops we set read_sock to tcp_read_sock and peek_len to
      tcp_peek_len (which is just a stub function that calls tcp_inq).
      Signed-off-by: NTom Herbert <tom@herbertland.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      32035585
  14. 26 8月, 2016 2 次提交
  15. 25 8月, 2016 2 次提交