1. 28 Sep, 2014 1 commit
  2. 17 Feb, 2014 1 commit
  3. 14 Feb, 2014 1 commit
    • F
      net: ip, ipv6: handle gso skbs in forwarding path · fe6cc55f
      Florian Westphal authored
      Marcelo Ricardo Leitner reported problems when the forwarding link path
      has a lower mtu than the incoming one if the inbound interface supports GRO.
      
      Given:
      Host <mtu1500> R1 <mtu1200> R2
      
      Host sends tcp stream which is routed via R1 and R2.  R1 performs GRO.
      
      In this case, the kernel will fail to send ICMP fragmentation needed
      messages (or pkt too big for ipv6), as GSO packets currently bypass dstmtu
      checks in forward path. Instead, Linux tries to send out packets exceeding
      the mtu.
      
      When locking route MTU on Host (i.e., no ipv4 DF bit set), R1 does
      not fragment the packets when forwarding, and again tries to send out
      packets exceeding R1-R2 link mtu.
      
      This alters the forwarding dstmtu checks to take the individual gso
      segment lengths into account.
      
      For ipv6, we send out pkt too big error for gso if the individual
      segments are too big.
      
      For ipv4, we either send icmp fragmentation needed, or, if the DF bit
      is not set, perform software segmentation and let the output path
      create fragments when the packet is leaving the machine.
      It is not 100% correct as the error message will contain the headers of
      the GRO skb instead of the original/segmented one, but it seems to
      work fine in my (limited) tests.
      
      Eric Dumazet suggested simply shrinking mss via ->gso_size to avoid
      software segmentation.
      
      However it turns out that skb_segment() assumes skb nr_frags is related
      to mss size so we would BUG there.  I don't want to mess with it considering
      Herbert and Eric disagree on what the correct behavior should be.
      
      Hannes Frederic Sowa notes that when we would shrink gso_size
      skb_segment would then also need to deal with the case where
      SKB_MAX_FRAGS would be exceeded.
      
      This uses software segmentation in the forward path when we hit ipv4
      non-DF packets and the outgoing link mtu is too small.  It's not perfect,
      but given the lack of bug reports wrt. GRO fwd being broken this is a
      rare case anyway.  Also, it's not like this could not be improved later
      once the dust settles.
      Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
      Reported-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
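The forwarding decision described in this commit message can be modelled in a few lines. The Python sketch below only illustrates the logic as the message states it; it is not the kernel code. The names `Action`, `gso_segment_len` and `forward_decision` are hypothetical, and the 40-byte header length assumes IPv4 and TCP headers without options.

```python
from enum import Enum


class Action(Enum):
    FORWARD = "forward unchanged"
    ICMP_FRAG_NEEDED = "send ICMP fragmentation needed"
    ICMPV6_PKT_TOO_BIG = "send ICMPv6 packet too big"
    SEGMENT_THEN_FRAGMENT = "segment in software, fragment on output"


def gso_segment_len(header_len: int, gso_size: int) -> int:
    """Length of one segment on the wire: headers plus gso_size payload bytes."""
    return header_len + gso_size


def forward_decision(seg_len: int, mtu: int, ipv6: bool, df: bool) -> Action:
    """Check the individual segment length, not the aggregated GRO packet."""
    if seg_len <= mtu:
        return Action.FORWARD
    if ipv6:
        return Action.ICMPV6_PKT_TOO_BIG
    if df:
        return Action.ICMP_FRAG_NEEDED
    # IPv4 without DF: split into segments, let the output path fragment them.
    return Action.SEGMENT_THEN_FRAGMENT


# Host <mtu1500> R1 <mtu1200> R2: segments were sized for a 1500-byte link.
seg_len = gso_segment_len(header_len=40, gso_size=1460)
```

With `seg_len` at 1500 bytes and a 1200-byte outgoing MTU, the model returns the ICMP error for DF packets and the software-segmentation path for non-DF IPv4 packets, which matches the three cases listed in the message.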
  4. 27 Jan, 2014 1 commit
  5. 17 Jan, 2014 1 commit
  6. 15 Jan, 2014 1 commit
    • P
      net: add skb_checksum_setup · ed1f50c3
      Paul Durrant authored
      This patch adds a function to set up the partial checksum offset for IP
      packets (and optionally re-calculate the pseudo-header checksum) into the
      core network code.
      The implementation was previously private and duplicated between xen-netback
      and xen-netfront, however it is not xen-specific and is potentially useful
      to any network driver.
      Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Veaceslav Falico <vfalico@redhat.com>
      Cc: Alexander Duyck <alexander.h.duyck@intel.com>
      Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
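To make "partial checksum offset" and "pseudo-header checksum" concrete, here is a small Python model for IPv4. It is a sketch under stated assumptions, not the kernel function: `pseudo_header_sum` and `partial_checksum_setup` are hypothetical names, offsets are counted from the start of the IP header, and only TCP and UDP are covered. The checksum field offsets (16 for TCP, 6 for UDP) come from the standard header layouts.

```python
import ipaddress
import struct

# Offset of the checksum field inside the transport header.
CSUM_OFFSET = {6: 16, 17: 6}  # protocol number -> offset (TCP, UDP)


def fold(total: int) -> int:
    """Fold carries back in to get a 16-bit ones'-complement sum."""
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def pseudo_header_sum(src: str, dst: str, proto: int, l4_len: int) -> int:
    """Ones'-complement sum over the IPv4 pseudo-header (not inverted)."""
    pseudo = (
        ipaddress.IPv4Address(src).packed
        + ipaddress.IPv4Address(dst).packed
        + struct.pack("!BBH", 0, proto, l4_len)
    )
    return fold(sum(struct.unpack("!6H", pseudo)))


def partial_checksum_setup(ip_header_len: int, proto: int) -> tuple:
    """Return (checksum start, checksum field offset) for a partial checksum."""
    return ip_header_len, CSUM_OFFSET[proto]
```

In this model, the checksum start tells whoever completes the checksum where to begin summing, and the field offset tells it where to store the result; the pseudo-header sum is the seed value placed in that field beforehand.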
  7. 08 Jan, 2014 1 commit
  8. 07 Jan, 2014 1 commit
  9. 28 Dec, 2013 1 commit
  10. 20 Dec, 2013 1 commit
  11. 19 Dec, 2013 1 commit
  12. 18 Dec, 2013 4 commits
  13. 10 Dec, 2013 1 commit
  14. 03 Dec, 2013 1 commit
  15. 16 Nov, 2013 1 commit
  16. 11 Nov, 2013 1 commit
    • J
      netfilter: push reasm skb through instead of original frag skbs · 6aafeef0
      Jiri Pirko authored
      Pushing original fragments through causes several problems. For example,
      when matching, frags may not be matched correctly. Take the following
      example:
      
      <example>
      On HOSTA do:
      ip6tables -I INPUT -p icmpv6 -j DROP
      ip6tables -I INPUT -p icmpv6 -m icmp6 --icmpv6-type 128 -j ACCEPT
      
      and on HOSTB you do:
      ping6 HOSTA -s2000    (MTU is 1500)
      
      Incoming echo requests will be filtered out on HOSTA. This issue does
      not occur with smaller packets than MTU (where fragmentation does not happen)
      </example>
      
      As was discussed previously, the only correct solution seems to be to use
      the reassembled skb instead of separate frags. Doing this has the positive side
      effect of shrinking sk_buff by one pointer (nfct_reasm), and the reasm
      dances in ipvs and conntrack can also be removed.
      
      Future plan is to remove net/ipv6/netfilter/nf_conntrack_reasm.c
      entirely and use code in net/ipv6/reassembly.c instead.
      Signed-off-by: Jiri Pirko <jiri@resnulli.us>
      Acked-by: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
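The mismatch in the example above can be reproduced with a toy model of the two ip6tables rules. This is a hedged Python sketch, not netfilter code: `Packet`, `fragment` and `verdict` are hypothetical names, every packet is assumed to be ICMPv6, and the model assumes the ICMPv6 type is readable only in the fragment at offset 0.

```python
from dataclasses import dataclass
from typing import List, Optional

IPV6_HDR = 40  # fixed IPv6 header
FRAG_HDR = 8   # IPv6 fragment extension header


@dataclass
class Packet:
    frag_offset: int
    icmp_type: Optional[int]  # None when the ICMPv6 header is not in this piece


def fragment(payload_len: int, mtu: int, icmp_type: int) -> List[Packet]:
    """Split a payload into fragments whose data length is a multiple of 8."""
    per_frag = (mtu - IPV6_HDR - FRAG_HDR) // 8 * 8
    return [
        Packet(off, icmp_type if off == 0 else None)
        for off in range(0, payload_len, per_frag)
    ]


def verdict(pkt: Packet) -> str:
    # Rule 1: -p icmpv6 -m icmp6 --icmpv6-type 128 -j ACCEPT
    if pkt.frag_offset == 0 and pkt.icmp_type == 128:
        return "ACCEPT"
    # Rule 2: -p icmpv6 -j DROP
    return "DROP"


# ping6 -s2000 over MTU 1500: 8-byte ICMPv6 header plus 2000 data bytes.
frags = fragment(payload_len=8 + 2000, mtu=1500, icmp_type=128)
per_fragment = [verdict(p) for p in frags]
reassembled = verdict(Packet(frag_offset=0, icmp_type=128))
```

Evaluated per fragment, the second piece falls through to the DROP rule, so the echo request never completes; evaluated on the reassembled packet, the ACCEPT rule matches once.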
  17. 08 Nov, 2013 2 commits
  18. 05 Nov, 2013 1 commit
  19. 04 Nov, 2013 1 commit
  20. 22 Oct, 2013 1 commit
    • E
      ipv6: sit: add GSO/TSO support · 61c1db7f
      Eric Dumazet authored
      Now that ipv6_gso_segment() is stackable, it's relatively easy to
      implement GSO/TSO support for SIT tunnels.
      
      Performance results, when segmentation is done after tunnel
      device (as no NIC is yet enabled for TSO SIT support) :
      
      Before patch :
      
      lpq84:~# ./netperf -H 2002:af6:1153:: -Cc
      MIGRATED TCP STREAM TEST from ::0 (::) port 0 AF_INET6 to 2002:af6:1153:: () port 0 AF_INET6
      Recv   Send    Send                          Utilization       Service Demand
      Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
      Size   Size    Size     Time     Throughput  local    remote   local   remote
      bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB
      
       87380  16384  16384    10.00      3168.31   4.81     4.64     2.988   2.877
      
      After patch :
      
      lpq84:~# ./netperf -H 2002:af6:1153:: -Cc
      MIGRATED TCP STREAM TEST from ::0 (::) port 0 AF_INET6 to 2002:af6:1153:: () port 0 AF_INET6
      Recv   Send    Send                          Utilization       Service Demand
      Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
      Size   Size    Size     Time     Throughput  local    remote   local   remote
      bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB
      
       87380  16384  16384    10.00      5525.00   7.76     5.17     2.763   1.840
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
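The before/after figures can be compared directly. The short Python sketch below only does arithmetic on the throughput values quoted in the netperf output above; `relative_gain` is a hypothetical helper name.

```python
def relative_gain(before: float, after: float) -> float:
    """Ratio of the 'after' throughput to the 'before' throughput."""
    return after / before


# Throughput in 10^6 bits/s, copied from the two netperf runs above.
sit_gain = relative_gain(before=3168.31, after=5525.00)
```

The ratio comes out at roughly 1.74, so the SIT path moves about 74% more data per second in this single run.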
  21. 20 Oct, 2013 2 commits
    • E
      ipip: add GSO/TSO support · cb32f511
      Eric Dumazet authored
      Now that inet_gso_segment() is stackable, it's relatively easy to
      implement GSO/TSO support for IPIP.
      
      Performance results, when segmentation is done after tunnel
      device (as no NIC is yet enabled for TSO IPIP support) :
      
      Before patch :
      
      lpq83:~# ./netperf -H 7.7.9.84 -Cc
      MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 7.7.9.84 () port 0 AF_INET
      Recv   Send    Send                          Utilization       Service Demand
      Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
      Size   Size    Size     Time     Throughput  local    remote   local   remote
      bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB
      
       87380  16384  16384    10.00      3357.88   5.09     3.70     2.983   2.167
      
      After patch :
      
      lpq83:~# ./netperf -H 7.7.9.84 -Cc
      MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 7.7.9.84 () port 0 AF_INET
      Recv   Send    Send                          Utilization       Service Demand
      Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
      Size   Size    Size     Time     Throughput  local    remote   local   remote
      bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB
      
       87380  16384  16384    10.00      7710.19   4.52     6.62     1.152   1.687
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • E
      ipv4: gso: make inet_gso_segment() stackable · 3347c960
      Eric Dumazet authored
      In order to support GSO on IPIP, we need to make
      inet_gso_segment() stackable.
      
      It should not assume network header starts right after mac
      header.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
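"Stackable" here means a segmentation routine may sit at any depth in a stack of headers rather than directly after the MAC header. The Python sketch below models only the length bookkeeping for such a stack; it is an illustration, not the kernel algorithm. `layer_lengths` is a hypothetical name, and the 20-byte headers assume IPv4 and TCP without options.

```python
from typing import List


def layer_lengths(header_lens: List[int], payload_len: int, mss: int) -> List[List[int]]:
    """For each segment, return the length seen by every layer, outermost first.

    header_lens lists header sizes from the outermost layer to the innermost.
    """
    segments = []
    for off in range(0, payload_len, mss):
        remaining = min(mss, payload_len - off)
        lens = []
        # Walk from the innermost header outwards: each layer covers its own
        # header plus everything it encapsulates.
        for hlen in reversed(header_lens):
            remaining += hlen
            lens.append(remaining)
        segments.append(list(reversed(lens)))
    return segments


# IPIP carrying TCP: outer IPv4, inner IPv4, TCP, 3000 payload bytes.
ipip_segments = layer_lengths([20, 20, 20], payload_len=3000, mss=1440)
```

Each inner list holds the outer IP length, the inner IP length and the TCP segment length for one resulting packet, which shows why every layer has to fix up its own length field after the split.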
  22. 18 Oct, 2013 1 commit
    • E
      net: refactor sk_page_frag_refill() · 400dfd3a
      Eric Dumazet authored
      While working on virtio_net new allocation strategy to increase
      payload/truesize ratio, we found that refactoring sk_page_frag_refill()
      was needed.
      
      This patch splits sk_page_frag_refill() into two parts, adding
      skb_page_frag_refill() which can be used without a socket.
      
      While we are at it, add a minimum frag size of 32 for
      sk_page_frag_refill()
      
      Michael will either use netdev_alloc_frag() from softirq context,
      or skb_page_frag_refill() from process context in refill_work()
       (GFP_KERNEL allocations)
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Michael Dalton <mwdalton@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
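The effect of a minimum frag size can be seen in a toy page-fragment allocator. This Python sketch is a simplified model and not the kernel implementation: `PageFrag`, `page_frag_refill` and `take` are hypothetical names, a single 4096-byte page is assumed, and page reference counting is ignored.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096  # assumed page size for this model
MIN_FRAG = 32     # minimum frag size mentioned in the commit message


@dataclass
class PageFrag:
    offset: int = 0           # next free byte in the current page
    size: int = 0             # 0 means no page attached yet
    pages_allocated: int = 0


def page_frag_refill(pfrag: PageFrag, min_size: int) -> None:
    """Keep the current page if min_size bytes are free, else attach a new one."""
    if pfrag.size - pfrag.offset >= min_size:
        return
    pfrag.offset = 0
    pfrag.size = PAGE_SIZE
    pfrag.pages_allocated += 1


def take(pfrag: PageFrag, want: int) -> int:
    """Copy up to `want` bytes into the frag; return how many bytes fit."""
    page_frag_refill(pfrag, MIN_FRAG)
    copied = min(want, pfrag.size - pfrag.offset)
    pfrag.offset += copied
    return copied
```

In this model a page with only 16 free bytes is abandoned rather than used for a tiny fragment, while a page with 96 free bytes still serves a partial copy.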
  23. 03 Oct, 2013 1 commit
  24. 01 Oct, 2013 2 commits
  25. 27 Sep, 2013 1 commit
    • J
      net.h/skbuff.h: Remove extern from function prototypes · 7965bd4d
      Joe Perches authored
      There is a mix of function prototypes with and without extern
      in the kernel sources.  Standardize on not using extern for
      function prototypes.
      
      Function prototypes don't need to be written with extern.
      extern is assumed by the compiler.  Its use is as unnecessary as
      using auto to declare automatic/local variables in a block.
      Signed-off-by: Joe Perches <joe@perches.com>
  26. 04 Sep, 2013 2 commits
  27. 08 Aug, 2013 1 commit
  28. 02 Aug, 2013 2 commits
  29. 01 Aug, 2013 1 commit
  30. 28 Jun, 2013 1 commit
  31. 26 Jun, 2013 1 commit
  32. 11 Jun, 2013 1 commit