1. 21 10月, 2011 4 次提交
  2. 20 10月, 2011 2 次提交
  3. 19 10月, 2011 2 次提交
  4. 14 10月, 2011 1 次提交
    • E
      net: more accurate skb truesize · 87fb4b7b
      Eric Dumazet 提交于
      skb truesize currently accounts for sk_buff struct and part of skb head.
      kmalloc() roundings are also ignored.
      
      Considering that skb_shared_info is larger than sk_buff, its time to
      take it into account for better memory accounting.
      
      This patch introduces SKB_TRUESIZE(X) macro to centralize various
      assumptions into a single place.
      
      At skb alloc phase, we put skb_shared_info struct at the exact end of
      skb head, to allow a better use of memory (lowering number of
      reallocations), since kmalloc() gives us power-of-two memory blocks.
      
      Unless SLUB/SLUB debug is active, both skb->head and skb_shared_info are
      aligned to cache lines, as before.
      
      Note: This patch might trigger performance regressions because of
      misconfigured protocol stacks, hitting per socket or global memory
      limits that were previously not reached. But its a necessary step for a
      more accurate memory accounting.
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      CC: Andi Kleen <ak@linux.intel.com>
      CC: Ben Hutchings <bhutchings@solarflare.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      87fb4b7b
  5. 13 10月, 2011 1 次提交
  6. 12 10月, 2011 1 次提交
  7. 05 10月, 2011 2 次提交
    • Y
      tcp: properly update lost_cnt_hint during shifting · 1e5289e1
      Yan, Zheng 提交于
      lost_skb_hint is used by tcp_mark_head_lost() to mark the first unhandled skb.
      lost_cnt_hint is the number of packets or sacked packets before the lost_skb_hint;
      When shifting a skb that is before the lost_skb_hint, if tcp_is_fack() is ture,
      the skb has already been counted in the lost_cnt_hint; if tcp_is_fack() is false,
      tcp_sacktag_one() will increase the lost_cnt_hint. So tcp_shifted_skb() does not
      need to adjust the lost_cnt_hint by itself. When shifting a skb that is equal to
      lost_skb_hint, the shifted packets will not be counted by tcp_mark_head_lost().
      So tcp_shifted_skb() should adjust the lost_cnt_hint even tcp_is_fack(tp) is true.
      Signed-off-by: NZheng Yan <zheng.z.yan@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1e5289e1
    • Y
      tcp: properly handle md5sig_pool references · 260fcbeb
      Yan, Zheng 提交于
      tcp_v4_clear_md5_list() assumes that multiple tcp md5sig peers
      only hold one reference to md5sig_pool. but tcp_v4_md5_do_add()
      increases use count of md5sig_pool for each peer. This patch
      makes tcp_v4_md5_do_add() only increases use count for the first
      tcp md5sig peer.
      Signed-off-by: NZheng Yan <zheng.z.yan@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      260fcbeb
  8. 04 10月, 2011 2 次提交
  9. 28 9月, 2011 1 次提交
  10. 27 9月, 2011 2 次提交
  11. 19 9月, 2011 1 次提交
  12. 17 9月, 2011 2 次提交
  13. 16 9月, 2011 1 次提交
    • E
      tcp: Change possible SYN flooding messages · 946cedcc
      Eric Dumazet 提交于
      "Possible SYN flooding on port xxxx " messages can fill logs on servers.
      
      Change logic to log the message only once per listener, and add two new
      SNMP counters to track :
      
      TCPReqQFullDoCookies : number of times a SYNCOOKIE was replied to client
      
      TCPReqQFullDrop : number of times a SYN request was dropped because
      syncookies were not enabled.
      
      Based on a prior patch from Tom Herbert, and suggestions from David.
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      CC: Tom Herbert <therbert@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      946cedcc
  14. 31 8月, 2011 1 次提交
  15. 30 8月, 2011 1 次提交
  16. 25 8月, 2011 3 次提交
    • N
      Proportional Rate Reduction for TCP. · a262f0cd
      Nandita Dukkipati 提交于
      This patch implements Proportional Rate Reduction (PRR) for TCP.
      PRR is an algorithm that determines TCP's sending rate in fast
      recovery. PRR avoids excessive window reductions and aims for
      the actual congestion window size at the end of recovery to be as
      close as possible to the window determined by the congestion control
      algorithm. PRR also improves accuracy of the amount of data sent
      during loss recovery.
      
      The patch implements the recommended flavor of PRR called PRR-SSRB
      (Proportional rate reduction with slow start reduction bound) and
      replaces the existing rate halving algorithm. PRR improves upon the
      existing Linux fast recovery under a number of conditions including:
        1) burst losses where the losses implicitly reduce the amount of
      outstanding data (pipe) below the ssthresh value selected by the
      congestion control algorithm and,
        2) losses near the end of short flows where application runs out of
      data to send.
      
      As an example, with the existing rate halving implementation a single
      loss event can cause a connection carrying short Web transactions to
      go into the slow start mode after the recovery. This is because during
      recovery Linux pulls the congestion window down to packets_in_flight+1
      on every ACK. A short Web response often runs out of new data to send
      and its pipe reduces to zero by the end of recovery when all its packets
      are drained from the network. Subsequent HTTP responses using the same
      connection will have to slow start to raise cwnd to ssthresh. PRR on
      the other hand aims for the cwnd to be as close as possible to ssthresh
      by the end of recovery.
      
      A description of PRR and a discussion of its performance can be found at
      the following links:
      - IETF Draft:
          http://tools.ietf.org/html/draft-mathis-tcpm-proportional-rate-reduction-01
      - IETF Slides:
          http://www.ietf.org/proceedings/80/slides/tcpm-6.pdf
          http://tools.ietf.org/agenda/81/slides/tcpm-2.pdf
      - Paper to appear in Internet Measurements Conference (IMC) 2011:
          Improving TCP Loss Recovery
          Nandita Dukkipati, Matt Mathis, Yuchung Cheng
      Signed-off-by: NNandita Dukkipati <nanditad@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a262f0cd
    • I
      net: ipv4: convert to SKB frag APIs · aff65da0
      Ian Campbell 提交于
      Signed-off-by: NIan Campbell <ian.campbell@citrix.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
      Cc: "Pekka Savola (ipv6)" <pekkas@netcore.fi>
      Cc: James Morris <jmorris@namei.org>
      Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
      Cc: Patrick McHardy <kaber@trash.net>
      Cc: netdev@vger.kernel.org
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      aff65da0
    • Y
      mcast: Fix source address selection for multicast listener report · e05c4ad3
      Yan, Zheng 提交于
      Should check use count of include mode filter instead of total number
      of include mode filters.
      Signed-off-by: NZheng Yan <zheng.z.yan@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e05c4ad3
  17. 18 8月, 2011 2 次提交
  18. 12 8月, 2011 1 次提交
  19. 11 8月, 2011 2 次提交
    • J
      ipv4: some rt_iif -> rt_route_iif conversions · 97a80410
      Julian Anastasov 提交于
      As rt_iif represents input device even for packets
      coming from loopback with output route, it is not an unique
      key specific to input routes. Now rt_route_iif has such role,
      it was fl.iif in 2.6.38, so better to change the checks at
      some places to save CPU cycles and to restore 2.6.38 semantics.
      
      compare_keys:
      	- input routes: only rt_route_iif matters, rt_iif is same
      	- output routes: only rt_oif matters, rt_iif is not
      		used for matching in __ip_route_output_key
      	- now we are back to 2.6.38 state
      
      ip_route_input_common:
      	- matching rt_route_iif implies input route
      	- compared to 2.6.38 we eliminated one rth->fl.oif check
      	because it was not needed even for 2.6.38
      
      compare_hash_inputs:
      	Only the change here is not an optimization, it has
      	effect only for output routes. I assume I'm restoring
      	the original intention to ignore oif, it was using fl.iif
      	- now we are back to 2.6.38 state
      Signed-off-by: NJulian Anastasov <ja@ssi.bg>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      97a80410
    • M
      tcp: initialize variable ecn_ok in syncookies path · f0e3d068
      Mike Waychison 提交于
      Using a gcc 4.4.3, warnings are emitted for a possibly uninitialized use
      of ecn_ok.
      
      This can happen if cookie_check_timestamp() returns due to not having
      seen a timestamp.  Defaulting to ecn off seems like a reasonable thing
      to do in this case, so initialized ecn_ok to false.
      Signed-off-by: NMike Waychison <mikew@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f0e3d068
  20. 08 8月, 2011 5 次提交
  21. 07 8月, 2011 1 次提交
    • D
      net: Compute protocol sequence numbers and fragment IDs using MD5. · 6e5714ea
      David S. Miller 提交于
      Computers have become a lot faster since we compromised on the
      partial MD4 hash which we use currently for performance reasons.
      
      MD5 is a much safer choice, and is inline with both RFC1948 and
      other ISS generators (OpenBSD, Solaris, etc.)
      
      Furthermore, only having 24-bits of the sequence number be truly
      unpredictable is a very serious limitation.  So the periodic
      regeneration and 8-bit counter have been removed.  We compute and
      use a full 32-bit sequence number.
      
      For ipv6, DCCP was found to use a 32-bit truncated initial sequence
      number (it needs 43-bits) and that is fixed here as well.
      Reported-by: NDan Kaminsky <dan@doxpara.com>
      Tested-by: NWilly Tarreau <w@1wt.eu>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6e5714ea
  22. 03 8月, 2011 1 次提交
  23. 02 8月, 2011 1 次提交