1. 20 Apr, 2006 1 commit
  2. 19 Apr, 2006 1 commit
    • [TCP]: Fix truesize underflow · ef5cb973
      Herbert Xu committed
      There is a problem with the TSO packet trimming code.  The cause of
      this lies in the tcp_fragment() function.
      
      When we allocate a fragment for a completely non-linear packet the
      truesize is calculated for a payload length of zero.  This means that
      truesize could in fact be less than the real payload length.
      
      When that happens the TSO packet trimming can cause truesize to become
      negative.  This in turn can cause sk_forward_alloc to be -n * PAGE_SIZE
      which would trigger the warning.
      
      I've copied the code DaveM used in tso_fragment which should work here.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
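The underflow described in this commit can be illustrated with a toy model. This is a sketch under stated assumptions, not kernel code: `struct toy_skb`, `toy_fragment`, `toy_trim_head`, and `TOY_OVERHEAD` are hypothetical names, and a signed `int` is used so that the underflow shows up as a negative number. Only the idea of moving the payload's share of truesize over to the new fragment follows the message.

```c
#include <assert.h>

/* Hypothetical, heavily simplified stand-in for a packet buffer. */
struct toy_skb {
    int len;      /* payload bytes carried (all non-linear in this model) */
    int truesize; /* bytes charged to the socket for this packet */
};

#define TOY_OVERHEAD 256 /* assumed fixed per-packet overhead */

/*
 * Keep the first `len` bytes in `skb` and move the rest to a new fragment.
 * With fix == 0 the fragment is charged as if its payload were empty.
 * With fix != 0 the payload's share of truesize moves along with the data.
 */
static struct toy_skb toy_fragment(struct toy_skb *skb, int len, int fix)
{
    struct toy_skb buff;
    int nlen;

    assert(len >= 0 && len <= skb->len);
    nlen = skb->len - len;

    buff.len = nlen;
    buff.truesize = TOY_OVERHEAD;
    if (fix) {
        buff.truesize += nlen;
        skb->truesize -= nlen;
    }
    skb->len = len;
    return buff;
}

/* Trimming acknowledged bytes from the head uncharges the same amount. */
static void toy_trim_head(struct toy_skb *skb, int len)
{
    assert(len >= 0 && len <= skb->len);
    skb->len -= len;
    skb->truesize -= len;
}
```

Splitting an 8000-byte packet at 2000 bytes and then trimming 4000 bytes from the new fragment drives its `truesize` below zero when `fix` is 0, and leaves it positive when `fix` is 1.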
  3. 15 Apr, 2006 1 commit
    • [IPV4]: Possible cleanups. · 6c97e72a
      Adrian Bunk committed
      This patch contains the following possible cleanups:
      - make the following needlessly global function static:
        - arp.c: arp_rcv()
      - remove the following unused EXPORT_SYMBOL's:
        - devinet.c: devinet_ioctl
        - fib_frontend.c: ip_rt_ioctl
        - inet_hashtables.c: inet_bind_bucket_create
        - inet_hashtables.c: inet_bind_hash
        - tcp_input.c: sysctl_tcp_abc
        - tcp_ipv4.c: sysctl_tcp_tw_reuse
        - tcp_output.c: sysctl_tcp_mtu_probing
        - tcp_output.c: sysctl_tcp_base_mss
      Signed-off-by: Adrian Bunk <bunk@stusta.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 21 Mar, 2006 3 commits
  5. 12 Mar, 2006 1 commit
  6. 04 Jan, 2006 3 commits
  7. 07 Dec, 2005 1 commit
    • [TCP] Vegas: timestamp before clone · dfb4b9dc
      David S. Miller committed
      We have to store the congestion control timestamp on the SKB before we
      clone it, not after.  Else we get no timestamping information at all.
      
      tcp_transmit_skb() has been reworked so that we can do the timestamp
      still in one spot, instead of at all the call sites.
      
      Problem discovered, and initial fix, from Tom Young
      <tyo@ee.unimelb.edu.au>.
      Signed-off-by: David S. Miller <davem@davemloft.net>
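Why the order of stamping and cloning matters can be shown with a toy model. This is a sketch of one plausible reading of the message, not the kernel's code: `struct toy_skb` and all function names are hypothetical, and the model only assumes that a clone copies the metadata as it stands at clone time.

```c
#include <assert.h>

/* Hypothetical miniature of a packet buffer. */
struct toy_skb {
    unsigned long tstamp; /* 0 means "never stamped" */
};

/* A clone copies the metadata as it is at the moment of cloning. */
static struct toy_skb toy_clone(const struct toy_skb *skb)
{
    return *skb;
}

/* Problem order: clone first, then stamp the clone that goes on the wire.
 * The queued original, which is what later ACK processing looks at,
 * never receives a timestamp. */
static struct toy_skb toy_send_stamp_after_clone(struct toy_skb *queued,
                                                 unsigned long now)
{
    struct toy_skb wire = toy_clone(queued);

    assert(now != 0);
    wire.tstamp = now;
    return wire;
}

/* Fixed order: stamp the queued original, then clone it, so both copies
 * carry the timestamp. */
static struct toy_skb toy_send_stamp_before_clone(struct toy_skb *queued,
                                                  unsigned long now)
{
    assert(now != 0);
    queued->tstamp = now;
    return toy_clone(queued);
}
```

Doing the stamping inside a single send routine, as in the second function, mirrors the message's point that the timestamp can still be taken in one spot rather than at every call site.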
  8. 11 Nov, 2005 3 commits
  9. 21 Oct, 2005 1 commit
  10. 14 Oct, 2005 1 commit
  11. 13 Oct, 2005 1 commit
  12. 09 Oct, 2005 1 commit
  13. 30 Sep, 2005 1 commit
    • [TCP]: Revert 6b251858 · 01ff367e
      David S. Miller committed
      But retain the comment fix.
      
      Alexey Kuznetsov has explained the situation as follows:
      
      --------------------
      
      I think the fix is incorrect. Look, the RFC function init_cwnd(mss) is
      not continuous: f.e. for mss=1095 it needs initial window 1095*4, but
      for mss=1096 it is 1096*3. We do not know exactly what mss sender used
      for calculations. If we advertised 1096 (and calculate initial window
      3*1096), the sender could limit it to some value < 1096 and then it
      will need window his_mss*4 > 3*1096 to send initial burst.
      
      See?
      
      So, the honest function for initial rcv_wnd derived from
      tcp_init_cwnd() is:
      
      	init_rcv_wnd(mss)=
      	  min { init_cwnd(mss1)*mss1 for mss1 <= mss }
      
      It is something sort of:
      
      	if (mss < 1096)
      		return mss*4;
      	if (mss < 1096*2)
      		return 1096*4;
      	return mss*2;
      
      (I just scribbled a graph on a piece of paper, it is difficult to see or
      to explain without this)
      
      I selected it differently giving more window than it is strictly
      required.  Initial receive window must be large enough to allow sender
      following to the rfc (or just setting initial cwnd to 2) to send
      initial burst.  But besides that it is arbitrary, so I decided to give
      slack space of one segment.
      
      Actually, the logic was:
      
      If mss is low/normal (<=ethernet), set window to receive more than
      initial burst allowed by rfc under the worst conditions
      i.e. mss*4. This gives slack space of 1 segment for ethernet frames.
      
      For msses slightly more than ethernet frame, take 3. Try to give slack
      space of 1 frame again.
      
      If mss is huge, force 2*mss. No slack space.
      
      Value 1460*3 is really confusing. Minimal one is 1096*2, but besides
      that it is an arbitrary value. It was meant to be ~4096. 1460*3 is
      just the magic number from RFC, 1460*3 = 1095*4 is the magic :-), so
      that I guess hands typed this themselves.
      
      --------------------
      Signed-off-by: David S. Miller <davem@davemloft.net>
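The discontinuity at mss=1095/1096 and the piecewise function quoted in this message can be checked numerically. The sketch below makes two assumptions that are not stated in the commit: that "the RFC function" is the RFC 3390 rule IW = min(4*MSS, max(2*MSS, 4380 bytes)) rounded down to whole segments, and that the `min { ... }` formula is meant as the smallest window that still admits the full initial burst for every sender MSS up to the advertised one, which is computed as a maximum over `mss1`. All function names are hypothetical.

```c
#include <assert.h>

/* Initial congestion window in segments, assuming the RFC 3390 rule
 * IW = min(4*MSS, max(2*MSS, 4380 bytes)), rounded down to segments. */
static int init_cwnd(int mss)
{
    int bytes;

    assert(mss > 0);
    bytes = (2 * mss > 4380) ? 2 * mss : 4380;
    if (4 * mss < bytes)
        bytes = 4 * mss;
    return bytes / mss;
}

/* Smallest receive window that lets a sender using any mss1 <= mss
 * send its whole initial burst (brute-force scan over mss1). */
static int init_rcv_wnd_exact(int mss)
{
    int best = 0;
    int mss1;

    for (mss1 = 1; mss1 <= mss; mss1++) {
        int burst = init_cwnd(mss1) * mss1;

        if (burst > best)
            best = burst;
    }
    return best;
}

/* The piecewise approximation quoted in the commit message. */
static int init_rcv_wnd_approx(int mss)
{
    if (mss < 1096)
        return mss * 4;
    if (mss < 1096 * 2)
        return 1096 * 4;
    return mss * 2;
}
```

Under these assumptions `init_cwnd(1095)` is 4 and `init_cwnd(1096)` is 3, and the brute-force window stays flat at 4380 bytes for every mss from 1095 through 2190, which is the plateau the piecewise version rounds up to 1096*4.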
  14. 29 Sep, 2005 1 commit
  15. 23 Sep, 2005 1 commit
    • [TCP]: Adjust Reno SACK estimate in tcp_fragment · 83ca28be
      Herbert Xu committed
      Since the introduction of TSO pcount a year ago, it has been possible
      for tcp_fragment() to cause packets_out to decrease.  Prior to that,
      tcp_retrans_try_collapse() was the only way for that to happen on the
      retransmission path.
      
      When this happens with Reno, it is possible for sacked_out to become
      invalid because it is only an estimate and not tied to any particular
      packet on the retransmission queue.
      
      Therefore we need to adjust sacked_out as well as left_out in the Reno
      case.  The following patch does exactly that.
      
      This bug is pretty difficult to trigger in practice though since you
      need a SACKless peer with a retransmission that occurs just as the
      cached MTU value expires.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
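The adjustment described in this commit can be sketched on a toy set of counters. This is not the actual patch: `struct toy_tp` and `toy_adjust_after_fragment` are hypothetical, and the model assumes that `left_out` is simply kept equal to `sacked_out + lost_out`.

```c
#include <assert.h>

/* Hypothetical subset of the per-connection counters named above. */
struct toy_tp {
    int sack_ok;     /* non-zero if the peer supports SACK */
    int packets_out; /* packets currently in flight */
    int sacked_out;  /* with Reno: an estimate, not tied to real packets */
    int lost_out;
    int left_out;    /* assumed to equal sacked_out + lost_out */
};

/* Refragmenting shrank the number of packets in flight by `diff`.
 * For a SACKless (Reno) peer, shrink the sacked_out estimate by the
 * same amount, clamp it at zero, and resynchronise left_out. */
static void toy_adjust_after_fragment(struct toy_tp *tp, int diff)
{
    assert(diff >= 0 && diff <= tp->packets_out);

    tp->packets_out -= diff;
    if (!tp->sack_ok) {
        tp->sacked_out -= diff;
        if (tp->sacked_out < 0)
            tp->sacked_out = 0;
        tp->left_out = tp->sacked_out + tp->lost_out;
    }
}
```

Without the Reno branch, a connection with `packets_out = 10` and `sacked_out = 9` would end up with `sacked_out` larger than `packets_out` after a decrease of 3, which is the invalid state the message describes.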
  16. 20 Sep, 2005 1 commit
  17. 15 Sep, 2005 1 commit
  18. 11 Sep, 2005 1 commit
  19. 09 Sep, 2005 1 commit
  20. 02 Sep, 2005 1 commit
  21. 30 Aug, 2005 6 commits
  22. 24 Aug, 2005 1 commit
  23. 18 Aug, 2005 1 commit
    • [TCP]: Fix bug #5070: kernel BUG at net/ipv4/tcp_output.c:864 · 35d59efd
      Herbert Xu committed
      1) We send out a normal sized packet with TSO on to start off.
      2) ICMP is received indicating a smaller MTU.
      3) We send the current sk_send_head which needs to be fragmented
      since it was created before the ICMP event.  The first fragment
      is then sent out.
      
      At this point the remaining fragment is allocated by tcp_fragment.
      However, its size is padded to fit the L1 cache-line size therefore
      creating tail-room up to 124 bytes long.
      
      This fragment will also be sitting at sk_send_head.
      
      4) tcp_sendmsg is called again and it stores data in the tail-room
      of the fragment.
      5) tcp_push_one is called by tcp_sendmsg which then calls tso_fragment
      since the packet as a whole exceeds the MTU.
      
      At this point we have a packet that has data in the head area being
      fed to tso_fragment which bombs out.
      
      My take on this is that we shouldn't ever call tcp_fragment on a TSO
      socket for a packet that is yet to be transmitted since this creates
      a packet on sk_send_head that cannot be extended.
      
      So here is a patch to change it so that tso_fragment is always used
      in this case.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
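How cache-line padding leaves usable tail-room behind the requested size can be shown with a short calculation. This is a sketch under assumptions: the 128-byte line size and the helper names are hypothetical, chosen only because a 128-byte line combined with a request that is a multiple of 4 bytes yields at most 124 spare bytes, which matches the figure in the message.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_CACHE_LINE 128 /* assumed L1 cache-line size; varies by CPU */

/* Round `size` up to the next multiple of the cache-line size. */
static size_t toy_align_up(size_t size)
{
    /* The mask trick below only works for power-of-two line sizes. */
    assert((TOY_CACHE_LINE & (TOY_CACHE_LINE - 1)) == 0);
    return (size + TOY_CACHE_LINE - 1) & ~(size_t)(TOY_CACHE_LINE - 1);
}

/* Spare bytes left after the requested area once padding is applied. */
static size_t toy_tailroom(size_t requested)
{
    return toy_align_up(requested) - requested;
}
```

For example, a 132-byte request is padded to 256 bytes, leaving 124 bytes into which a later write could place data, while a 128-byte request leaves none.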
  24. 17 Aug, 2005 1 commit
    • [TCP]: Fix bug #5070: kernel BUG at net/ipv4/tcp_output.c:864 · c8ac3774
      Herbert Xu committed
      1) We send out a normal sized packet with TSO on to start off.
      2) ICMP is received indicating a smaller MTU.
      3) We send the current sk_send_head which needs to be fragmented
      since it was created before the ICMP event.  The first fragment
      is then sent out.
      
      At this point the remaining fragment is allocated by tcp_fragment.
      However, its size is padded to fit the L1 cache-line size therefore
      creating tail-room up to 124 bytes long.
      
      This fragment will also be sitting at sk_send_head.
      
      4) tcp_sendmsg is called again and it stores data in the tail-room
      of the fragment.
      5) tcp_push_one is called by tcp_sendmsg which then calls tso_fragment
      since the packet as a whole exceeds the MTU.
      
      At this point we have a packet that has data in the head area being
      fed to tso_fragment which bombs out.
      
      My take on this is that we shouldn't ever call tcp_fragment on a TSO
      socket for a packet that is yet to be transmitted since this creates
      a packet on sk_send_head that cannot be extended.
      
      So here is a patch to change it so that tso_fragment is always used
      in this case.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  25. 11 Aug, 2005 1 commit
    • [TCP]: Adjust {p,f}ackets_out correctly in tcp_retransmit_skb() · b5da623a
      Herbert Xu committed
      Well I've only found one potential cause for the assertion
      failure in tcp_mark_head_lost.  First of all, this can only
      occur if cnt > 1 since tp->packets_out is never zero here.
      If it did hit zero we'd have much bigger problems.
      
      So cnt is equal to fackets_out - reordering.  Normally
      fackets_out is less than packets_out.  The only reason
      I've found that might cause fackets_out to exceed packets_out
      is if tcp_fragment is called from tcp_retransmit_skb with a
      TSO skb and the current MSS is greater than the MSS stored
      in the TSO skb.  This might occur as the result of an expiring
      dst entry.
      
      In that case, packets_out may decrease (line 1380-1381 in
      tcp_output.c).  However, fackets_out is unchanged which means
      that it may in fact exceed packets_out.
      
      Previously tcp_retrans_try_collapse was the only place where
      packets_out can go down and it takes care of this by decrementing
      fackets_out.
      
      So we should make sure that fackets_out is reduced by an appropriate
      amount here as well.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
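The invariant discussed in this commit, that fackets_out should not exceed packets_out, can be sketched with toy counters. This is an illustration rather than the actual patch: `struct toy_tp` and both helpers are hypothetical, and only the relation cnt = fackets_out - reordering is taken from the message.

```c
#include <assert.h>

/* Hypothetical subset of the counters named in the message. */
struct toy_tp {
    int packets_out; /* packets currently in flight */
    int fackets_out; /* normally no larger than packets_out */
    int reordering;
};

/* The count the message says is derived from the counters:
 * cnt = fackets_out - reordering. */
static int toy_head_lost_cnt(const struct toy_tp *tp)
{
    return tp->fackets_out - tp->reordering;
}

/* Refragmenting a TSO packet shrank packets_out by `diff`.
 * With fix == 0 only packets_out is updated (the problem case).
 * With fix != 0 fackets_out is reduced too and clamped at zero. */
static void toy_shrink_in_flight(struct toy_tp *tp, int diff, int fix)
{
    assert(diff >= 0 && diff <= tp->packets_out);

    tp->packets_out -= diff;
    if (fix) {
        tp->fackets_out -= diff;
        if (tp->fackets_out < 0)
            tp->fackets_out = 0;
    }
}
```

Starting from `packets_out = 10`, `fackets_out = 9`, `reordering = 3`, a decrease of 4 without the fix leaves `fackets_out` at 9 against a `packets_out` of 6. With the fix both shrink together and the derived count drops from 6 to 2.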
  26. 05 Aug, 2005 2 commits
  27. 09 Jul, 2005 1 commit
  28. 06 Jul, 2005 1 commit