1. 27 Sep, 2011 1 commit
    • tcp: ECN blackhole should not force quickack mode · 7a269ffa
      Eric Dumazet committed
      While playing with a new ADSL box at home, I discovered that ECN
      blackhole can trigger suboptimal quickack mode on Linux: we send one
      ACK for each incoming data frame, without any delay or possible
      piggybacking.
      
      This is because TCP_ECN_check_ce() considers that if no ECT is seen on a
      segment, this is because this segment was a retransmit.
      
      Refine this heuristic and apply it only if we have seen ECT in a
      previous segment, to detect ECN blackhole at IP level.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      CC: Jamal Hadi Salim <jhs@mojatatu.com>
      CC: Jerry Chu <hkchu@google.com>
      CC: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
      CC: Jim Gettys <jg@freedesktop.org>
      CC: Dave Taht <dave.taht@gmail.com>
      Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
      Signed-off-by: David S. Miller <davem@davemloft.net>
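The refined heuristic can be sketched in user-space C as follows. This is an illustration only, not the kernel code: `struct conn`, the `FLAG_*` names and `check_ce()` are hypothetical stand-ins, and the ECN codepoint values are the ones defined in RFC 3168.

```c
#include <stdbool.h>
#include <stdint.h>

/* ECN codepoints: the two low-order bits of the IP TOS byte (RFC 3168). */
enum { ECN_NOT_ECT = 0, ECN_ECT_1 = 1, ECN_ECT_0 = 2, ECN_CE = 3, ECN_MASK = 3 };

/* Hypothetical per-connection flags, invented for this sketch. */
enum { FLAG_ECN_OK = 1, FLAG_DEMAND_CWR = 2, FLAG_ECT_SEEN = 4 };

struct conn {
    unsigned flags;   /* combination of FLAG_* values */
    bool quickack;    /* true once quickack mode has been entered */
};

/* Inspect the ECN bits of one incoming segment. */
void check_ce(struct conn *c, uint8_t tos)
{
    if (!(c->flags & FLAG_ECN_OK))
        return;                     /* ECN was not negotiated */

    switch (tos & ECN_MASK) {
    case ECN_NOT_ECT:
        /* Treat a missing ECT as a probable retransmit only if ECT
         * was seen earlier; otherwise the path may simply be
         * clearing the bits (an ECN blackhole). */
        if (c->flags & FLAG_ECT_SEEN)
            c->quickack = true;
        break;
    case ECN_CE:
        c->flags |= FLAG_DEMAND_CWR;
        /* fall through */
    default:
        c->flags |= FLAG_ECT_SEEN;
    }
}
```

With this shape, a connection whose segments never carry ECT is never pushed into quickack mode by this check, which is the behaviour the commit message describes.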
  2. 19 Sep, 2011 1 commit
  3. 17 Sep, 2011 2 commits
  4. 16 Sep, 2011 1 commit
    • tcp: Change possible SYN flooding messages · 946cedcc
      Eric Dumazet committed
      "Possible SYN flooding on port xxxx " messages can fill logs on servers.
      
      Change logic to log the message only once per listener, and add two new
      SNMP counters:
      
      TCPReqQFullDoCookies : number of times a SYNCOOKIE was sent in reply to
      a client
      
      TCPReqQFullDrop : number of times a SYN request was dropped because
      syncookies were not enabled.
      
      Based on a prior patch from Tom Herbert, and suggestions from David.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      CC: Tom Herbert <therbert@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
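A minimal user-space sketch of the "log once per listener, count every event" logic described above. The struct layouts, the `messages_logged` field and the function name `syn_flood_action()` are assumptions made for illustration; only the two counter names come from the commit message.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical listener state: one warning flag per listening socket. */
struct listener {
    unsigned short port;
    bool synflood_warned;
};

/* Counters mirroring TCPReqQFullDoCookies and TCPReqQFullDrop. */
struct syn_stats {
    unsigned long req_q_full_do_cookies;
    unsigned long req_q_full_drop;
    unsigned long messages_logged;   /* only here to make the sketch testable */
};

/* Called when the request queue is full. Returns true if a cookie is sent. */
bool syn_flood_action(struct listener *l, struct syn_stats *st,
                      bool syncookies_enabled)
{
    /* Every event is counted, whether or not it is logged. */
    if (syncookies_enabled)
        st->req_q_full_do_cookies++;
    else
        st->req_q_full_drop++;

    /* The message itself is emitted at most once per listener. */
    if (!l->synflood_warned) {
        l->synflood_warned = true;
        st->messages_logged++;
        fprintf(stderr, "Possible SYN flooding on port %u. %s.\n",
                (unsigned)l->port,
                syncookies_enabled ? "Sending cookies" : "Dropping request");
    }
    return syncookies_enabled;
}
```

Keeping the counters separate from the log line means the event rate stays observable after the single warning has been printed.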
  5. 31 Aug, 2011 1 commit
  6. 30 Aug, 2011 1 commit
  7. 25 Aug, 2011 3 commits
    • Proportional Rate Reduction for TCP. · a262f0cd
      Nandita Dukkipati committed
      This patch implements Proportional Rate Reduction (PRR) for TCP.
      PRR is an algorithm that determines TCP's sending rate in fast
      recovery. PRR avoids excessive window reductions and aims for
      the actual congestion window size at the end of recovery to be as
      close as possible to the window determined by the congestion control
      algorithm. PRR also improves accuracy of the amount of data sent
      during loss recovery.
      
      The patch implements the recommended flavor of PRR called PRR-SSRB
      (Proportional rate reduction with slow start reduction bound) and
      replaces the existing rate halving algorithm. PRR improves upon the
      existing Linux fast recovery under a number of conditions including:
        1) burst losses where the losses implicitly reduce the amount of
           outstanding data (pipe) below the ssthresh value selected by the
           congestion control algorithm, and
        2) losses near the end of short flows where the application runs out
           of data to send.
      
      As an example, with the existing rate halving implementation a single
      loss event can cause a connection carrying short Web transactions to
      go into the slow start mode after the recovery. This is because during
      recovery Linux pulls the congestion window down to packets_in_flight+1
      on every ACK. A short Web response often runs out of new data to send
      and its pipe reduces to zero by the end of recovery when all its packets
      are drained from the network. Subsequent HTTP responses using the same
      connection will have to slow start to raise cwnd to ssthresh. PRR on
      the other hand aims for the cwnd to be as close as possible to ssthresh
      by the end of recovery.
      
      A description of PRR and a discussion of its performance can be found at
      the following links:
      - IETF Draft:
          http://tools.ietf.org/html/draft-mathis-tcpm-proportional-rate-reduction-01
      - IETF Slides:
          http://www.ietf.org/proceedings/80/slides/tcpm-6.pdf
          http://tools.ietf.org/agenda/81/slides/tcpm-2.pdf
      - Paper to appear in Internet Measurements Conference (IMC) 2011:
          Improving TCP Loss Recovery
          Nandita Dukkipati, Matt Mathis, Yuchung Cheng
      Signed-off-by: Nandita Dukkipati <nanditad@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
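The per-ACK computation of PRR with the slow start reduction bound can be sketched as below. This is a simplified illustration in packet units that follows the algorithm as specified in RFC 6937, not the patch itself; `struct prr_state`, its field names and `prr_on_ack()` are assumptions for this sketch, and `recover_fs` (the amount in flight when recovery started) is assumed to be non-zero.

```c
#include <stdint.h>

/* Hypothetical recovery state, in packets. */
struct prr_state {
    uint32_t ssthresh;       /* target window chosen by congestion control */
    uint32_t recover_fs;     /* packets in flight when recovery began (> 0) */
    uint32_t prr_delivered;  /* packets delivered to the receiver so far */
    uint32_t prr_out;        /* packets sent during recovery so far */
};

/*
 * Process one ACK received during recovery.
 * pipe      - current estimate of packets in flight
 * delivered - packets newly acked or SACKed by this ACK
 * Returns the new congestion window. The caller adds every packet it
 * actually transmits to s->prr_out.
 */
uint32_t prr_on_ack(struct prr_state *s, uint32_t pipe, uint32_t delivered)
{
    int64_t sndcnt;

    s->prr_delivered += delivered;

    if (pipe > s->ssthresh) {
        /* Proportional part: ceil(prr_delivered * ssthresh / recover_fs)
         * minus what has already been sent. */
        uint64_t num = (uint64_t)s->prr_delivered * s->ssthresh
                       + s->recover_fs - 1;
        sndcnt = (int64_t)(num / s->recover_fs) - s->prr_out;
    } else {
        /* Slow start reduction bound: grow back toward ssthresh, but by
         * at most one packet more than what was just delivered. */
        int64_t limit = (int64_t)s->prr_delivered - s->prr_out;
        if (limit < delivered)
            limit = delivered;
        limit += 1;
        sndcnt = (int64_t)s->ssthresh - pipe;
        if (sndcnt > limit)
            sndcnt = limit;
    }
    if (sndcnt < 0)
        sndcnt = 0;

    return pipe + (uint32_t)sndcnt;
}
```

For example, with `ssthresh = 10` and `recover_fs = 20`, the proportional branch allows roughly one transmission for every two delivered packets, while a burst loss that leaves `pipe = 4` lets the window grow by up to two packets on the first ACK instead of collapsing to `pipe + 1`.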
    • net: ipv4: convert to SKB frag APIs · aff65da0
      Ian Campbell committed
      Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
      Cc: "Pekka Savola (ipv6)" <pekkas@netcore.fi>
      Cc: James Morris <jmorris@namei.org>
      Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
      Cc: Patrick McHardy <kaber@trash.net>
      Cc: netdev@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mcast: Fix source address selection for multicast listener report · e05c4ad3
      Yan, Zheng committed
      Should check use count of include mode filter instead of total number
      of include mode filters.
      Signed-off-by: Zheng Yan <zheng.z.yan@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  8. 18 Aug, 2011 2 commits
  9. 12 Aug, 2011 1 commit
  10. 11 Aug, 2011 2 commits
    • ipv4: some rt_iif -> rt_route_iif conversions · 97a80410
      Julian Anastasov committed
      As rt_iif represents the input device even for packets
      coming from loopback with an output route, it is not a unique
      key specific to input routes. rt_route_iif now has that role
      (it was fl.iif in 2.6.38), so it is better to change the checks in
      some places to save CPU cycles and to restore 2.6.38 semantics.
      
      compare_keys:
      	- input routes: only rt_route_iif matters, rt_iif is same
      	- output routes: only rt_oif matters, rt_iif is not
      		used for matching in __ip_route_output_key
      	- now we are back to 2.6.38 state
      
      ip_route_input_common:
      	- matching rt_route_iif implies input route
      	- compared to 2.6.38 we eliminated one rth->fl.oif check
      	because it was not needed even for 2.6.38
      
      compare_hash_inputs:
      	The change here is the only one that is not an optimization; it
      	has an effect only for output routes. I assume I'm restoring
      	the original intention to ignore oif, it was using fl.iif
      	- now we are back to 2.6.38 state
      Signed-off-by: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: initialize variable ecn_ok in syncookies path · f0e3d068
      Mike Waychison committed
      Using gcc 4.4.3, warnings are emitted for a possibly uninitialized use
      of ecn_ok.
      
      This can happen if cookie_check_timestamp() returns due to not having
      seen a timestamp.  Defaulting to ecn off seems like a reasonable thing
      to do in this case, so initialize ecn_ok to false.
      Signed-off-by: Mike Waychison <mikew@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
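The pattern behind this warning can be reproduced in a few lines. The sketch below is hypothetical: `check_timestamp()`, `ecn_for_cookie()` and the `TS_ECN_BIT` value are invented stand-ins, chosen only to show an out-parameter that an early return leaves unwritten.

```c
#include <stdbool.h>

/* Hypothetical bit in the echoed timestamp that encodes "ECN was offered". */
#define TS_ECN_BIT 0x20u

/* Decodes options from the echoed timestamp. When no timestamp was seen,
 * it returns early and never writes *ecn_ok. */
bool check_timestamp(bool saw_tstamp, unsigned tsecr, bool *ecn_ok)
{
    if (!saw_tstamp)
        return true;            /* early return: *ecn_ok left untouched */

    *ecn_ok = (tsecr & TS_ECN_BIT) != 0;
    return true;
}

/* Caller in the style of the fix: give the variable a safe default so the
 * early-return path cannot expose an indeterminate value. */
bool ecn_for_cookie(bool saw_tstamp, unsigned tsecr)
{
    bool ecn_ok = false;        /* default to ECN off */

    if (!check_timestamp(saw_tstamp, tsecr, &ecn_ok))
        return false;
    return ecn_ok;
}
```

Without the `= false` initializer, the call `ecn_for_cookie(false, ...)` would read a variable that was never assigned, which is what the compiler warning points at.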
  11. 08 Aug, 2011 5 commits
  12. 07 Aug, 2011 1 commit
    • net: Compute protocol sequence numbers and fragment IDs using MD5. · 6e5714ea
      David S. Miller committed
      Computers have become a lot faster since we compromised on the
      partial MD4 hash which we use currently for performance reasons.
      
      MD5 is a much safer choice, and is in line with both RFC1948 and
      other ISS generators (OpenBSD, Solaris, etc.)
      
      Furthermore, only having 24-bits of the sequence number be truly
      unpredictable is a very serious limitation.  So the periodic
      regeneration and 8-bit counter have been removed.  We compute and
      use a full 32-bit sequence number.
      
      For ipv6, DCCP was found to use a 32-bit truncated initial sequence
      number (it needs 48-bits) and that is fixed here as well.
      Reported-by: Dan Kaminsky <dan@doxpara.com>
      Tested-by: Willy Tarreau <w@1wt.eu>
      Signed-off-by: David S. Miller <davem@davemloft.net>
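The overall shape of an RFC 1948 style generator (a clock plus a keyed hash of the connection 4-tuple, with all 32 bits of the hash used) can be sketched as follows. Important assumption: to keep the example short, the hash below is FNV-1a, which is NOT cryptographic and is only a placeholder for MD5; `struct four_tuple`, `keyed_offset()` and `initial_seq()` are hypothetical names, and the clock is passed in as a plain tick count.

```c
#include <stdint.h>

/* Hypothetical connection identifier. */
struct four_tuple {
    uint32_t saddr, daddr;
    uint16_t sport, dport;
};

/* Fold one 32-bit word into an FNV-1a hash, byte by byte.
 * Placeholder only: a real generator would use MD5 here. */
static uint32_t mix32(uint32_t h, uint32_t v)
{
    for (int i = 0; i < 4; i++) {
        h ^= (v >> (8 * i)) & 0xffu;
        h *= 16777619u;
    }
    return h;
}

/* Per-connection offset: hash(secret, 4-tuple). The full 32-bit result
 * is used, with no reserved counter bits. */
uint32_t keyed_offset(const struct four_tuple *t, const uint32_t secret[4])
{
    uint32_t h = 2166136261u;

    for (int i = 0; i < 4; i++)
        h = mix32(h, secret[i]);
    h = mix32(h, t->saddr);
    h = mix32(h, t->daddr);
    h = mix32(h, ((uint32_t)t->sport << 16) | t->dport);
    return h;
}

/* Initial sequence number: keyed offset plus a monotonically increasing
 * clock, wrapping modulo 2^32. */
uint32_t initial_seq(const struct four_tuple *t, const uint32_t secret[4],
                     uint32_t clock_ticks)
{
    return keyed_offset(t, secret) + clock_ticks;
}
```

Because the offset depends on a secret and on the 4-tuple, sequence numbers keep advancing with the clock for any single connection while remaining hard to predict across connections, provided the placeholder is replaced by a real cryptographic hash.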
  13. 03 Aug, 2011 1 commit
  14. 02 Aug, 2011 1 commit
  15. 01 Aug, 2011 1 commit
  16. 29 Jul, 2011 1 commit
    • netfilter: ip_queue: Fix small leak in ipq_build_packet_message() · 91c66c68
      Jesper Juhl committed
      ipq_build_packet_message() in net/ipv4/netfilter/ip_queue.c and
      net/ipv6/netfilter/ip6_queue.c contain a small potential mem leak as
      far as I can tell.
      
      We allocate memory for 'skb' with alloc_skb() and then call
       nlh = NLMSG_PUT(skb, 0, 0, IPQM_PACKET, size - sizeof(*nlh));
      
      NLMSG_PUT is a macro
       NLMSG_PUT(skb, pid, seq, type, len) \
        		NLMSG_NEW(skb, pid, seq, type, len, 0)
      
      that expands to NLMSG_NEW, which is also a macro which expands to:
       NLMSG_NEW(skb, pid, seq, type, len, flags) \
        	({	if (unlikely(skb_tailroom(skb) < (int)NLMSG_SPACE(len))) \
        			goto nlmsg_failure; \
        		__nlmsg_put(skb, pid, seq, type, len, flags); })
      
      If we take the true branch of the 'if' statement and 'goto
      nlmsg_failure', then we'll, at that point, return from
      ipq_build_packet_message() without having assigned 'skb' to anything
      and we'll leak the memory we allocated for it when it goes out of
      scope.
      
      Fix this by placing a 'kfree(skb)' at 'nlmsg_failure'.
      
      I admit that I do not know how likely this is to actually happen or even
      if there's something that guarantees that it will never happen - I'm
      not that familiar with this code, but if that is so, I've not been
      able to spot it.
      Signed-off-by: Jesper Juhl <jj@chaosbits.net>
      Signed-off-by: Patrick McHardy <kaber@trash.net>
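The leak and its fix follow a common pattern: a buffer is allocated, a later check jumps to a failure label, and the label must release the buffer. The user-space sketch below mirrors that control flow with `malloc`/`free`; `struct buf`, `buf_alloc()`, `buf_free()`, `build_message()` and the `live_bufs` counter are all hypothetical and exist only to make the leak observable.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a packet buffer. */
struct buf {
    size_t cap;             /* total capacity in bytes */
    size_t len;             /* bytes already used */
    unsigned char *data;
};

int live_bufs;              /* number of buffers currently allocated */

struct buf *buf_alloc(size_t cap)
{
    struct buf *b = malloc(sizeof(*b));

    if (!b)
        return NULL;
    b->data = malloc(cap ? cap : 1);
    if (!b->data) {
        free(b);
        return NULL;
    }
    b->cap = cap;
    b->len = 0;
    live_bufs++;
    return b;
}

void buf_free(struct buf *b)
{
    if (!b)
        return;
    free(b->data);
    free(b);
    live_bufs--;
}

/* Allocate a buffer and reserve 'need' bytes for a message header. */
struct buf *build_message(size_t cap, size_t need)
{
    struct buf *b = buf_alloc(cap);

    if (!b)
        return NULL;
    if (b->cap - b->len < need)     /* same role as the tailroom check */
        goto failure;
    b->len += need;
    return b;

failure:
    buf_free(b);                    /* without this line the buffer leaks */
    return NULL;
}
```

Calling `build_message(16, 64)` takes the failure path; with the `buf_free()` call in place, `live_bufs` returns to zero afterwards.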
  17. 27 Jul, 2011 1 commit
  18. 26 Jul, 2011 1 commit
  19. 24 Jul, 2011 2 commits
  20. 22 Jul, 2011 6 commits
  21. 19 Jul, 2011 1 commit
  22. 18 Jul, 2011 3 commits
  23. 17 Jul, 2011 1 commit