1. 03 Jun 2014, 1 commit
    • inetpeer: get rid of ip_id_count · 73f156a6
      Committed by Eric Dumazet
      Ideally, we would generate the IP ID using a per-destination-IP
      generator.
      
      Linux kernels used the inet_peer cache for this purpose, but this had
      a huge cost on servers with MTU discovery disabled.
      
      1) each inet_peer struct consumes 192 bytes
      
      2) inetpeer cache uses a binary tree of inet_peer structs,
         with a nominal size of ~66000 elements under load.
      
      3) lookups in this tree hit a lot of cache lines, as the tree depth
         is about 20.
      
      4) If the server handles many TCP flows, there is a high probability
         of not finding the inet_peer, allocating a fresh one, and
         inserting it in the tree with the same initial ip_id_count
         (cf. secure_ip_id()).
      
      5) We garbage collect inet_peer aggressively.
      
      IP ID generation does not have to be 'perfect'.
      
      The goal is to avoid duplicates over a short period of time, so that
      reassembly units have a chance to complete reassembly of fragments
      belonging to one message before receiving other fragments with a
      recycled ID.
      
      We simply use an array of generators and a Jenkins hash keyed on the
      dst IP.
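
      A minimal user-space sketch of that scheme (the table size, the
      mixing function, and all names here are illustrative stand-ins; the
      kernel's actual code uses jhash with a random secret and an atomic
      add):

        #include <stdint.h>

        #define IDENT_BUCKETS 2048             /* assumed table size */

        static uint32_t ip_idents[IDENT_BUCKETS];

        /* Integer mix standing in for the Jenkins hash of the dst IP. */
        static uint32_t hash_daddr(uint32_t daddr, uint32_t secret)
        {
                uint32_t h = daddr ^ secret;

                h ^= h >> 16; h *= 0x85ebca6b;
                h ^= h >> 13; h *= 0xc2b2ae35;
                h ^= h >> 16;
                return h;
        }

        /* Reserve 'segs' consecutive IDs for this destination and return
         * the first one.  Distinct destinations usually hash to distinct
         * generators, so IDs recycle slowly per peer. */
        static uint32_t ip_ident_reserve(uint32_t daddr, uint32_t secret,
                                         int segs)
        {
                uint32_t idx = hash_daddr(daddr, secret) % IDENT_BUCKETS;
                uint32_t id = ip_idents[idx];

                ip_idents[idx] = id + (uint32_t)segs;
                return id;
        }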
      
      ipv6_select_ident() is put back into net/ipv6/ip6_output.c, where it
      belongs (it is only used from this file).
      
      secure_ip_id() and secure_ipv6_id() are no longer needed.
      
      Rename ip_select_ident_more() to ip_select_ident_segs() to avoid
      unnecessary decrement/increment of the number of segments.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 14 May 2014, 1 commit
    • net: add a sysctl to reflect the fwmark on replies · e110861f
      Committed by Lorenzo Colitti
      Kernel-originated IP packets that have no user socket associated
      with them (e.g., ICMP errors and echo replies, TCP RSTs, etc.)
      are emitted with a mark of zero. Add a sysctl to make them have
      the same mark as the packet they are replying to.
      
      This allows an administrator who wishes to do so to use mark-based
      routing, firewalling, etc. for these replies by marking the original
      packets inbound.
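
      A compilable sketch of the reply-mark decision this describes (the
      struct layouts and field names are stand-ins, not the kernel's
      actual identifiers):

        #include <stdint.h>

        struct net     { int sysctl_fwmark_reflect; };  /* stand-in */
        struct sk_buff { uint32_t mark; };              /* stand-in */

        /* A kernel-originated reply (ICMP error, echo reply, TCP RST)
         * inherits the mark of the packet it answers when the sysctl is
         * enabled; otherwise it keeps the historical mark of zero. */
        static uint32_t reply_mark(const struct net *net,
                                   const struct sk_buff *in)
        {
                return net->sysctl_fwmark_reflect ? in->mark : 0;
        }

      The sysctl is exposed as fwmark_reflect, for both IPv4 and IPv6.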
      
      Tested using user-mode Linux:
       - ICMP/ICMPv6 echo replies and errors.
       - TCP RST packets (IPv4 and IPv6).
      Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 01 May 2014, 1 commit
  4. 16 Apr 2014, 1 commit
  5. 22 Jan 2014, 1 commit
  6. 20 Jan 2014, 1 commit
  7. 16 Jan 2014, 1 commit
  8. 02 Jan 2014, 1 commit
  9. 10 Dec 2013, 2 commits
  10. 06 Dec 2013, 2 commits
  11. 24 Nov 2013, 1 commit
  12. 09 Nov 2013, 1 commit
  13. 24 Oct 2013, 1 commit
  14. 20 Oct 2013, 1 commit
  15. 22 Sep 2013, 1 commit
  16. 01 Sep 2013, 1 commit
  17. 24 Aug 2013, 1 commit
  18. 20 Jun 2013, 1 commit
  19. 26 May 2013, 1 commit
    • net: ipv6: Add IPv6 support to the ping socket. · 6d0bfe22
      Committed by Lorenzo Colitti
      This adds the ability to send ICMPv6 echo requests without a
      raw socket. The equivalent ability for ICMPv4 was added in
      2011.
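
      A minimal example of what this enables: an unprivileged ICMPv6 echo
      request through a datagram socket (error handling kept trivial; the
      caller's group must fall within the ping_group_range sysctl noted
      in the caveats below):

        #include <stdio.h>
        #include <unistd.h>
        #include <sys/socket.h>
        #include <netinet/in.h>
        #include <netinet/icmp6.h>

        int main(void)
        {
                /* SOCK_DGRAM instead of SOCK_RAW: no CAP_NET_RAW needed. */
                int fd = socket(AF_INET6, SOCK_DGRAM, IPPROTO_ICMPV6);

                if (fd < 0) {
                        perror("socket");
                        return 1;
                }

                struct icmp6_hdr req = { 0 };
                req.icmp6_type = ICMP6_ECHO_REQUEST;  /* kernel fills id/csum */

                struct sockaddr_in6 dst = { 0 };
                dst.sin6_family = AF_INET6;
                dst.sin6_addr = in6addr_loopback;     /* ping ::1 */

                if (sendto(fd, &req, sizeof(req), 0,
                           (struct sockaddr *)&dst, sizeof(dst)) < 0)
                        perror("sendto");
                close(fd);
                return 0;
        }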
      
      Instead of having separate code paths for IPv4 and IPv6, make
      most of the code in net/ipv4/ping.c dual-stack and only add a
      few IPv6-specific bits (like the protocol definition) to a new
      net/ipv6/ping.c. Hopefully this will reduce divergence and/or
      duplication of bugs in the future.
      
      Caveats:
      
      - Setting options via ancillary data (e.g., using IPV6_PKTINFO
        to specify the outgoing interface) is not yet supported.
      - There are no separate security settings for IPv4 and IPv6;
        everything is controlled by /proc/sys/net/ipv4/ping_group_range.
      - The proc interface does not yet display IPv6 ping sockets
        properly.
      
      Tested with a patched copy of ping6 and using raw socket calls.
      Compiles and works with all of CONFIG_IPV6={n,m,y}.
      Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  20. 25 Mar 2013, 1 commit
  21. 09 Mar 2013, 1 commit
  22. 08 Mar 2013, 1 commit
  23. 22 Feb 2013, 1 commit
  24. 31 Jan 2013, 2 commits
  25. 30 Jan 2013, 1 commit
  26. 22 Jan 2013, 1 commit
  27. 18 Jan 2013, 2 commits
    • ipv6: fix ipv6_prefix_equal64_half mask conversion · 512613d7
      Committed by Fabio Baltieri
      Fix the 64-bit optimized version of ipv6_prefix_equal() to convert
      the bitmask to network byte order only after the bit-shift.
      
      The bug was introduced in:
      
      38675170 ipv6: 64bit version of ipv6_prefix_equal().
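
      An illustrative reconstruction of the corrected helper, in
      user-space types (htobe64() stands in for the kernel's
      cpu_to_be64(); the real function operates on __be64 pointers):

        #include <endian.h>
        #include <stdbool.h>
        #include <stdint.h>

        /* a1/a2 hold the first 64 bits of each address in network byte
         * order.  The prefix mask must be shifted in host byte order and
         * converted afterwards; the buggy version computed
         *   htobe64(~0ULL) << (64 - len)
         * which shifts the byte-swapped mask and clears the wrong bits
         * on little-endian machines. */
        static bool prefix_equal64_half(uint64_t a1, uint64_t a2,
                                        unsigned int len)  /* 0..64 */
        {
                if (len && ((a1 ^ a2) & htobe64(~0ULL << (64 - len))))
                        return false;
                return true;
        }
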
      Signed-off-by: Fabio Baltieri <fabio.baltieri@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: increase fragment memory usage limits · c2a93660
      Committed by Jesper Dangaard Brouer
      Increase the memory usage limits for incomplete IP fragments.
      
      Rationale for the new high/low threshold values:
      
       High threshold = 4 MBytes
       Low  threshold = 3 MBytes
      
      The fragmentation memory accounting code tries to account for the
      real memory usage by measuring both the size of the frag queue
      struct (inet_frag_queue (ipv4:ipq/ipv6:frag_queue)) and the SKBs'
      truesize.
      
      We want to be able to handle/hold on to enough fragments to ensure
      good performance, without letting incomplete fragments hurt
      scalability by causing the number of inet_frag_queues to grow too
      much (resulting in longer searches for frag queues).
      
      For IPv4, how much memory does the largest fragment queue consume?
      
      A maximum-size datagram is 64K, which is approximately 44 fragments
      of MTU(1500)-sized packets.  sizeof(struct ipq) is 200.  A 1500-byte
      packet results in a truesize of 2944 (not 2048, as I first assumed):
      
        (44*2944)+200 = 129736 bytes
      
      The current default high thresh of 262144 bytes is obviously
      problematic, as only two 64K datagrams can fit in the queues at the
      same time.
      
      How many 64K datagrams can we fit into 4 MBytes:
      
        4*2^20/((44*2944)+200) = 32.33 fragmented datagrams in queues
      
      An attacker could send a distinct fake fragment packet per queue,
      causing us to allocate one inet_frag_queue per packet, and thus
      attack the hash table and its lists.
      
      How many frag queues do we need to store, and given a current hash
      size of 64, what is the average list length?
      
      Using one MTU-sized fragment per inet_frag_queue, each consumes
      (2944+200) = 3144 bytes:
      
        4*2^20/(2944+200) = 1334 frag queues -> 21 avg list length
      
      An attacker could send small fragments; the smallest packet I could
      send resulted in a truesize of 896 bytes (I'm a little surprised by
      this):
      
        4*2^20/(896+200)  = 3827 frag queues -> 59 avg list length
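
      The capacity arithmetic above, reproduced as a small program (the
      truesize and sizeof(struct ipq) figures are the ones quoted in this
      text; they vary by kernel version and architecture):

        #include <stdio.h>

        int main(void)
        {
                const double thresh   = 4.0 * (1 << 20); /* 4 MB high thresh   */
                const double truesize = 2944;            /* MTU(1500) fragment  */
                const double ipq      = 200;             /* sizeof(struct ipq)  */
                const double cost64k  = 44 * truesize + ipq;

                printf("64K datagram cost:       %.0f bytes\n", cost64k);
                printf("64K datagrams per 4MB:   %.2f\n", thresh / cost64k);
                printf("MTU-frag queues per 4MB: %.0f (avg list %.1f)\n",
                       thresh / (truesize + ipq),
                       thresh / (truesize + ipq) / 64);
                printf("min-frag queues per 4MB: %.0f (avg list %.1f)\n",
                       thresh / (896 + ipq), thresh / (896 + ipq) / 64);
                return 0;
        }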
      
      When increasing these numbers, we also need to follow up with
      improvements that help scalability.  Simply increasing the hash size
      is not enough, as the current implementation does not have
      per-hash-bucket locking.
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  28. 17 Jan 2013, 1 commit
  29. 15 Jan 2013, 6 commits
  30. 14 Jan 2013, 2 commits