1. 03 12月, 2006 1 次提交
  2. 23 9月, 2006 8 次提交
  3. 14 8月, 2006 1 次提交
    • H
      [INET]: Use pskb_trim_unique when trimming paged unique skbs · e9fa4f7b
      Herbert Xu 提交于
      The IPv4/IPv6 datagram output path was using skb_trim to trim paged
      packets because they know that the packet has not been cloned yet
      (since the packet hasn't been given to anything else in the system).
      
      This broke because skb_trim no longer allows paged packets to be
      trimmed.  Paged packets must be given to one of the pskb_trim functions
      instead.
      
      This patch adds a new pskb_trim_unique function to cover the IPv4/IPv6
      datagram output path scenario and replaces the corresponding skb_trim
      calls with it.
      Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e9fa4f7b
  4. 03 8月, 2006 3 次提交
    • W
      [IPV6]: SNMPv2 "ipv6IfStatsOutFragCreates" counter error · dafee490
      Wei Dong 提交于
        When I tested linux kernel 2.6.71.7 about statistics
      "ipv6IfStatsOutFragCreates", and found that it couldn't increase
      correctly. The criteria is RFC 2465:
      
        ipv6IfStatsOutFragCreates OBJECT-TYPE
            SYNTAX      Counter32
            MAX-ACCESS  read-only
            STATUS      current
            DESCRIPTION
               "The number of output datagram fragments that have
               been generated as a result of fragmentation at
               this output interface."
            ::= { ipv6IfStatsEntry 15 }
      
      I think there are two issues in Linux kernel. 
      1st:
      RFC2465 specifies the counter is "The number of output datagram
      fragments...". I think increasing this counter after output a fragment
      successfully is better. And it should not be increased even though a
      fragment is created but failed to output.
      
      2nd:
      If we send a big ICMP/ICMPv6 echo request to a host, and receive
      ICMP/ICMPv6 echo reply consisted of some fragments. As we know that in
      Linux kernel first fragmentation occurs in ICMP layer(maybe saying
      transport layer is better), but this is not the "real"
      fragmentation,just do some "pre-fragment" -- allocate space for date,
      and form a frag_list, etc. The "real" fragmentation happens in IP layer
      -- set offset and MF flag and so on. So I think in "fast path" for
      ip_fragment/ip6_fragment, if we send a fragment which "pre-fragment" by
      upper layer we should also increase "ipv6IfStatsOutFragCreates".
      Signed-off-by: NWei Dong <weid@nanjing-fnst.com>
      Acked-by: NYOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      dafee490
    • W
      [IPV6]: SNMPv2 "ipv6IfStatsInHdrErrors" counter error · 32c524d1
      Wei Dong 提交于
        When I tested Linux kernel 2.6.17.7 about statistics
      "ipv6IfStatsInHdrErrors", found that this counter couldn't increase
      correctly. The criteria is RFC2465:
        ipv6IfStatsInHdrErrors OBJECT-TYPE
            SYNTAX     Counter3
            MAX-ACCESS read-only
            STATUS     current
            DESCRIPTION
               "The number of input datagrams discarded due to
               errors in their IPv6 headers, including version
               number mismatch, other format errors, hop count
               exceeded, errors discovered in processing their
               IPv6 options, etc."
            ::= { ipv6IfStatsEntry 2 }
      
      When I send TTL=0 and TTL=1 a packet to a router which need to be
      forwarded, router just sends an ICMPv6 message to tell the sender that
      TIME_EXCEED and HOPLIMITS, but no increments for this counter(in the
      function ip6_forward).
      Signed-off-by: NWei Dong <weid@nanjing-fnst.com>
      Acked-by: NYOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      32c524d1
    • H
      [IPV6]: Audit all ip6_dst_lookup/ip6_dst_store calls · 497c615a
      Herbert Xu 提交于
      The current users of ip6_dst_lookup can be divided into two classes:
      
      1) The caller holds no locks and is in user-context (UDP).
      2) The caller does not want to lookup the dst cache at all.
      
      The second class covers everyone except UDP because most people do
      the cache lookup directly before calling ip6_dst_lookup.  This patch
      adds ip6_sk_dst_lookup for the first class.
      
      Similarly ip6_dst_store users can be divded into those that need to
      take the socket dst lock and those that don't.  This patch adds
      __ip6_dst_store for those (everyone except UDP/datagram) that don't
      need an extra lock.
      Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      497c615a
  5. 09 7月, 2006 1 次提交
  6. 01 7月, 2006 2 次提交
    • H
      [IPV6]: Added GSO support for TCPv6 · f83ef8c0
      Herbert Xu 提交于
      This patch adds GSO support for IPv6 and TCPv6.  This is based on a patch
      by Ananda Raju <Ananda.Raju@neterion.com>.  His original description is:
      
      	This patch enables TSO over IPv6. Currently Linux network stacks
      	restricts TSO over IPv6 by clearing of the NETIF_F_TSO bit from
      	"dev->features". This patch will remove this restriction.
      
      	This patch will introduce a new flag NETIF_F_TSO6 which will be used
      	to check whether device supports TSO over IPv6. If device support TSO
      	over IPv6 then we don't clear of NETIF_F_TSO and which will make the
      	TCP layer to create TSO packets. Any device supporting TSO over IPv6
      	will set NETIF_F_TSO6 flag in "dev->features" along with NETIF_F_TSO.
      
      	In case when user disables TSO using ethtool, NETIF_F_TSO will get
      	cleared from "dev->features". So even if we have NETIF_F_TSO6 we don't
      	get TSO packets created by TCP layer.
      
      	SKB_GSO_TCPV4 renamed to SKB_GSO_TCP to make it generic GSO packet.
      	SKB_GSO_UDPV4 renamed to SKB_GSO_UDP as UFO is not a IPv4 feature.
      	UFO is supported over IPv6 also
      
      	The following table shows there is significant improvement in
      	throughput with normal frames and CPU usage for both normal and jumbo.
      
      	--------------------------------------------------
      	|          |     1500        |      9600         |
      	|          ------------------|-------------------|
      	|          | thru     CPU    |  thru     CPU     |
      	--------------------------------------------------
      	| TSO OFF  | 2.00   5.5% id  |  5.66   20.0% id  |
      	--------------------------------------------------
      	| TSO ON   | 2.63   78.0 id  |  5.67   39.0% id  |
      	--------------------------------------------------
      Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f83ef8c0
    • J
      Remove obsolete #include <linux/config.h> · 6ab3d562
      Jörn Engel 提交于
      Signed-off-by: NJörn Engel <joern@wohnheim.fh-wedel.de>
      Signed-off-by: NAdrian Bunk <bunk@stusta.de>
      6ab3d562
  7. 23 6月, 2006 1 次提交
    • H
      [NET]: Merge TSO/UFO fields in sk_buff · 7967168c
      Herbert Xu 提交于
      Having separate fields in sk_buff for TSO/UFO (tso_size/ufo_size) is not
      going to scale if we add any more segmentation methods (e.g., DCCP).  So
      let's merge them.
      
      They were used to tell the protocol of a packet.  This function has been
      subsumed by the new gso_type field.  This is essentially a set of netdev
      feature bits (shifted by 16 bits) that are required to process a specific
      skb.  As such it's easy to tell whether a given device can process a GSO
      skb: you just have to and the gso_type field and the netdev's features
      field.
      
      I've made gso_type a conjunction.  The idea is that you have a base type
      (e.g., SKB_GSO_TCPV4) that can be modified further to support new features.
      For example, if we add a hardware TSO type that supports ECN, they would
      declare NETIF_F_TSO | NETIF_F_TSO_ECN.  All TSO packets with CWR set would
      have a gso_type of SKB_GSO_TCPV4 | SKB_GSO_TCPV4_ECN while all other TSO
      packets would be SKB_GSO_TCPV4.  This means that only the CWR packets need
      to be emulated in software.
      Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7967168c
  8. 18 6月, 2006 2 次提交
  9. 23 3月, 2006 1 次提交
  10. 21 3月, 2006 2 次提交
  11. 13 3月, 2006 1 次提交
  12. 25 2月, 2006 1 次提交
  13. 10 1月, 2006 1 次提交
  14. 04 1月, 2006 1 次提交
  15. 30 11月, 2005 1 次提交
  16. 10 11月, 2005 1 次提交
    • Y
      [NETFILTER]: Add nf_conntrack subsystem. · 9fb9cbb1
      Yasuyuki Kozakai 提交于
      The existing connection tracking subsystem in netfilter can only
      handle ipv4.  There were basically two choices present to add
      connection tracking support for ipv6.  We could either duplicate all
      of the ipv4 connection tracking code into an ipv6 counterpart, or (the
      choice taken by these patches) we could design a generic layer that
      could handle both ipv4 and ipv6 and thus requiring only one sub-protocol
      (TCP, UDP, etc.) connection tracking helper module to be written.
      
      In fact nf_conntrack is capable of working with any layer 3
      protocol.
      
      The existing ipv4 specific conntrack code could also not deal
      with the pecularities of doing connection tracking on ipv6,
      which is also cured here.  For example, these issues include:
      
      1) ICMPv6 handling, which is used for neighbour discovery in
         ipv6 thus some messages such as these should not participate
         in connection tracking since effectively they are like ARP
         messages
      
      2) fragmentation must be handled differently in ipv6, because
         the simplistic "defrag, connection track and NAT, refrag"
         (which the existing ipv4 connection tracking does) approach simply
         isn't feasible in ipv6
      
      3) ipv6 extension header parsing must occur at the correct spots
         before and after connection tracking decisions, and there were
         no provisions for this in the existing connection tracking
         design
      
      4) ipv6 has no need for stateful NAT
      
      The ipv4 specific conntrack layer is kept around, until all of
      the ipv4 specific conntrack helpers are ported over to nf_conntrack
      and it is feature complete.  Once that occurs, the old conntrack
      stuff will get placed into the feature-removal-schedule and we will
      fully kill it off 6 months later.
      Signed-off-by: NYasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
      Signed-off-by: NHarald Welte <laforge@netfilter.org>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@mandriva.com>
      9fb9cbb1
  17. 09 11月, 2005 1 次提交
  18. 29 10月, 2005 1 次提交
    • A
      [IPv4/IPv6]: UFO Scatter-gather approach · e89e9cf5
      Ananda Raju 提交于
      Attached is kernel patch for UDP Fragmentation Offload (UFO) feature.
      
      1. This patch incorporate the review comments by Jeff Garzik.
      2. Renamed USO as UFO (UDP Fragmentation Offload)
      3. udp sendfile support with UFO
      
      This patches uses scatter-gather feature of skb to generate large UDP
      datagram. Below is a "how-to" on changes required in network device
      driver to use the UFO interface.
      
      UDP Fragmentation Offload (UFO) Interface:
      -------------------------------------------
      UFO is a feature wherein the Linux kernel network stack will offload the
      IP fragmentation functionality of large UDP datagram to hardware. This
      will reduce the overhead of stack in fragmenting the large UDP datagram to
      MTU sized packets
      
      1) Drivers indicate their capability of UFO using
      dev->features |= NETIF_F_UFO | NETIF_F_HW_CSUM | NETIF_F_SG
      
      NETIF_F_HW_CSUM is required for UFO over ipv6.
      
      2) UFO packet will be submitted for transmission using driver xmit routine.
      UFO packet will have a non-zero value for
      
      "skb_shinfo(skb)->ufo_size"
      
      skb_shinfo(skb)->ufo_size will indicate the length of data part in each IP
      fragment going out of the adapter after IP fragmentation by hardware.
      
      skb->data will contain MAC/IP/UDP header and skb_shinfo(skb)->frags[]
      contains the data payload. The skb->ip_summed will be set to CHECKSUM_HW
      indicating that hardware has to do checksum calculation. Hardware should
      compute the UDP checksum of complete datagram and also ip header checksum of
      each fragmented IP packet.
      
      For IPV6 the UFO provides the fragment identification-id in
      skb_shinfo(skb)->ip6_frag_id. The adapter should use this ID for generating
      IPv6 fragments.
      Signed-off-by: NAnanda Raju <ananda.raju@neterion.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (forwarded)
      Signed-off-by: NArnaldo Carvalho de Melo <acme@mandriva.com>
      e89e9cf5
  19. 04 10月, 2005 1 次提交
  20. 08 9月, 2005 1 次提交
  21. 30 8月, 2005 4 次提交
  22. 28 7月, 2005 1 次提交
  23. 06 7月, 2005 1 次提交
  24. 22 6月, 2005 1 次提交
  25. 19 5月, 2005 1 次提交
    • H
      [IPV4/IPV6] Ensure all frag_list members have NULL sk · 2fdba6b0
      Herbert Xu 提交于
      Having frag_list members which holds wmem of an sk leads to nightmares
      with partially cloned frag skb's.  The reason is that once you unleash
      a skb with a frag_list that has individual sk ownerships into the stack
      you can never undo those ownerships safely as they may have been cloned
      by things like netfilter.  Since we have to undo them in order to make
      skb_linearize happy this approach leads to a dead-end.
      
      So let's go the other way and make this an invariant:
      
      	For any skb on a frag_list, skb->sk must be NULL.
      
      That is, the socket ownership always belongs to the head skb.
      It turns out that the implementation is actually pretty simple.
      
      The above invariant is actually violated in the following patch
      for a short duration inside ip_fragment.  This is OK because the
      offending frag_list member is either destroyed at the end of the
      slow path without being sent anywhere, or it is detached from
      the frag_list before being sent.
      Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2fdba6b0