1. 03 Dec, 2006 3 commits
  2. 23 Sep, 2006 1 commit
  3. 14 Aug, 2006 1 commit
    • [INET]: Use pskb_trim_unique when trimming paged unique skbs · e9fa4f7b
      Committed by Herbert Xu
      The IPv4/IPv6 datagram output path was using skb_trim to trim paged
      packets because it knows that the packet has not been cloned yet
      (since the packet hasn't been given to anything else in the system).
      
      This broke because skb_trim no longer allows paged packets to be
      trimmed.  Paged packets must be given to one of the pskb_trim functions
      instead.
      
      This patch adds a new pskb_trim_unique function to cover the IPv4/IPv6
      datagram output path scenario and replaces the corresponding skb_trim
      calls with it.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
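Note on e9fa4f7b: the distinction between the two trim paths can be sketched with a toy model. This is not the kernel code. `struct toy_skb`, `toy_skb_trim` and `toy_pskb_trim_unique` are hypothetical names, and the model tracks only byte counts, assuming paged data always sits at the tail of the packet.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for an sk_buff: only the byte counts matter here. */
struct toy_skb {
    size_t len;       /* total bytes: linear head + paged data */
    size_t data_len;  /* bytes held in page fragments */
    int    cloned;    /* non-zero if another reference shares the data */
};

/* Linear-only trim: refuses paged buffers, as the commit describes. */
int toy_skb_trim(struct toy_skb *skb, size_t len)
{
    if (skb->data_len != 0)
        return -1;              /* paged: caller must use a pskb_trim variant */
    if (len < skb->len)
        skb->len = len;
    return 0;
}

/* Trim for buffers the caller knows are unique (not cloned). */
void toy_pskb_trim_unique(struct toy_skb *skb, size_t len)
{
    size_t headlen = skb->len - skb->data_len;

    assert(!skb->cloned);       /* the "unique" precondition */
    if (len >= skb->len)
        return;                 /* nothing to trim */
    /* Tail bytes go first, so paged data shrinks before the linear head. */
    skb->data_len = len > headlen ? len - headlen : 0;
    skb->len = len;
}
```

A caller holding a 3000-byte packet with 2000 paged bytes would see `toy_skb_trim` refuse the request, while `toy_pskb_trim_unique(&skb, 1500)` leaves 1000 linear and 500 paged bytes.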
  4. 08 Aug, 2006 1 commit
  5. 03 Aug, 2006 3 commits
  6. 25 Jul, 2006 2 commits
  7. 09 Jul, 2006 1 commit
  8. 04 Jul, 2006 1 commit
  9. 01 Jul, 2006 1 commit
    • [IPV6]: Added GSO support for TCPv6 · f83ef8c0
      Committed by Herbert Xu
      This patch adds GSO support for IPv6 and TCPv6.  This is based on a patch
      by Ananda Raju <Ananda.Raju@neterion.com>.  His original description is:
      
      	This patch enables TSO over IPv6. Currently the Linux network stack
      	restricts TSO over IPv6 by clearing the NETIF_F_TSO bit from
      	"dev->features". This patch removes this restriction.
      
      	This patch introduces a new flag, NETIF_F_TSO6, which is used to
      	check whether a device supports TSO over IPv6. If the device supports
      	TSO over IPv6, we don't clear NETIF_F_TSO, which makes the TCP layer
      	create TSO packets. Any device supporting TSO over IPv6 sets the
      	NETIF_F_TSO6 flag in "dev->features" along with NETIF_F_TSO.
      
      	When the user disables TSO using ethtool, NETIF_F_TSO is cleared
      	from "dev->features". So even if NETIF_F_TSO6 is set, the TCP layer
      	does not create TSO packets.
      
      	SKB_GSO_TCPV4 is renamed to SKB_GSO_TCP to make it a generic GSO
      	packet type. SKB_GSO_UDPV4 is renamed to SKB_GSO_UDP, as UFO is not
      	an IPv4-only feature; UFO is supported over IPv6 as well.
      
      	The following table shows there is significant improvement in
      	throughput with normal frames and CPU usage for both normal and jumbo.
      
      	---------------------------------------------------
      	|          |  1500 (normal)   |   9600 (jumbo)    |
      	|          |------------------|-------------------|
      	|          | thru     CPU     |  thru     CPU     |
      	|----------|------------------|-------------------|
      	| TSO OFF  | 2.00   5.5% id   |  5.66   20.0% id  |
      	| TSO ON   | 2.63   78.0% id  |  5.67   39.0% id  |
      	---------------------------------------------------
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
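Note on f83ef8c0: the flag interaction described in the quoted text can be sketched as below. This is an illustration under stated assumptions, not the patch itself. The bit positions and the `toy_` function names are hypothetical; only the rule (NETIF_F_TSO stays set for IPv6 only when NETIF_F_TSO6 is also present) comes from the description.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit positions only; the real values live in the kernel headers. */
#define TOY_NETIF_F_TSO   (1u << 0)
#define TOY_NETIF_F_TSO6  (1u << 1)

/* Features an IPv6 TCP socket would see for a given device. */
uint32_t toy_ipv6_sk_features(uint32_t dev_features)
{
    if (!(dev_features & TOY_NETIF_F_TSO6))
        dev_features &= ~TOY_NETIF_F_TSO;   /* no IPv6 TSO: keep the old restriction */
    return dev_features;
}

/* TCP builds TSO packets only while the NETIF_F_TSO bit survives. */
bool toy_tcp6_uses_tso(uint32_t dev_features)
{
    return (toy_ipv6_sk_features(dev_features) & TOY_NETIF_F_TSO) != 0;
}
```

With this rule, a device advertising only TOY_NETIF_F_TSO6 (for example after TSO was disabled through ethtool) gets no TSO packets, which matches the third paragraph of the description.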
  10. 30 Jun, 2006 3 commits
    • [NET]: make skb_release_data() static · 5bba1712
      Committed by Adrian Bunk
      skb_release_data() no longer has any users in other files.
      Signed-off-by: Adrian Bunk <bunk@stusta.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [NET]: Add ECN support for TSO · b0da8537
      Committed by Michael Chan
      In the current TSO implementation, NETIF_F_TSO and ECN cannot be
      turned on together in a TCP connection.  The problem is that most
      hardware that supports TSO does not handle CWR correctly if it is set
      in the TSO packet.  Correct handling requires CWR to be set in the
      first packet only if it is set in the TSO header.
      
      This patch adds the ability to turn on NETIF_F_TSO and ECN using
      GSO if necessary to handle TSO packets with CWR set.  Hardware
      that handles CWR correctly can turn on NETIF_F_TSO_ECN in the
      dev->features flag.
      
      All TSO packets with CWR set will have the SKB_GSO_TCPV4_ECN set.  If
      the output device does not have the NETIF_F_TSO_ECN feature set, GSO
      will split the packet up correctly with CWR only set in the first
      segment.
      
      With help from Herbert Xu <herbert@gondor.apana.org.au>.
      
      Since ECN can always be enabled with TSO, the SOCK_NO_LARGESEND sock
      flag is completely removed.
      Signed-off-by: Michael Chan <mchan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
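Note on b0da8537: the CWR rule stated above (set in the first segment only, and only if set in the TSO header) can be illustrated with a small segmentation sketch. The names `toy_seg` and `toy_tso_segment` are hypothetical, and the sketch models only payload lengths and the CWR flag, not real TCP headers.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_seg {
    size_t len;   /* payload bytes in this segment */
    bool   cwr;   /* CWR flag carried by this segment's TCP header */
};

/*
 * Split payload_len bytes into MSS-sized segments, writing at most max_segs
 * entries. Returns the number of segments written.
 */
size_t toy_tso_segment(size_t payload_len, size_t mss, bool cwr,
                       struct toy_seg *out, size_t max_segs)
{
    size_t n = 0;

    if (mss == 0)
        return 0;
    while (payload_len > 0 && n < max_segs) {
        size_t chunk = payload_len < mss ? payload_len : mss;

        out[n].len = chunk;
        out[n].cwr = cwr && n == 0;   /* CWR only on the first segment */
        payload_len -= chunk;
        n++;
    }
    return n;
}
```

For a 4000-byte payload with an MSS of 1460 and CWR set, this yields three segments of 1460, 1460 and 1080 bytes, with CWR on the first one only.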
    • [NET]: Added GSO header verification · 576a30eb
      Committed by Herbert Xu
      When GSO packets come from an untrusted source (e.g., a Xen guest domain),
      we need to verify the header integrity before passing it to the hardware.
      
      Since the first step in GSO is to verify the header, we can reuse that
      code by adding a new bit to gso_type: SKB_GSO_DODGY.  Packets with this
      bit set can only be fed directly to devices with the corresponding bit
      NETIF_F_GSO_ROBUST.  If the device doesn't have that bit, then the skb
      is fed to the GSO engine which will allow the packet to be sent to the
      hardware if it passes the header check.
      
      This patch changes the sg flag to a full features flag.  The same method
      can be used to implement TSO ECN support.  We simply have to mark packets
      with CWR set with SKB_GSO_ECN so that only hardware with a corresponding
      NETIF_F_TSO_ECN can accept them.  The GSO engine can either fully segment
      the packet, or segment the first MTU and pass the rest to the hardware for
      further segmentation.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
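Note on 576a30eb: the routing decision for untrusted GSO packets can be sketched as a small function. This is a simplified model, not the kernel logic. The bit values, the `toy_tx_path` enum and `toy_route_gso` are hypothetical, and `header_ok` stands in for the result of the GSO engine's header check.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_SKB_GSO_DODGY       (1u << 2)   /* illustrative value */
#define TOY_NETIF_F_GSO_ROBUST  (1u << 18)  /* illustrative value */

enum toy_tx_path {
    TOY_TX_DIRECT,     /* hand the GSO packet straight to the device */
    TOY_TX_VERIFIED,   /* software header check passed, then to the device */
    TOY_TX_DROP        /* software header check failed */
};

enum toy_tx_path toy_route_gso(uint32_t gso_type, uint32_t dev_features,
                               bool header_ok)
{
    bool dodgy = (gso_type & TOY_SKB_GSO_DODGY) != 0;
    bool robust = (dev_features & TOY_NETIF_F_GSO_ROBUST) != 0;

    /* Trusted packets, or devices that verify headers themselves, skip the check. */
    if (!dodgy || robust)
        return TOY_TX_DIRECT;
    return header_ok ? TOY_TX_VERIFIED : TOY_TX_DROP;
}
```

The sketch ignores every other feature bit; it only shows that a DODGY packet reaches a non-ROBUST device after passing the software check.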
  11. 23 Jun, 2006 4 commits
    • [NET]: fix net-core kernel-doc · f4b8ea78
      Committed by Randy Dunlap
      Warning(/var/linsrc/linux-2617-g4//include/linux/skbuff.h:304): No description found for parameter 'dma_cookie'
      Warning(/var/linsrc/linux-2617-g4//include/net/sock.h:1274): No description found for parameter 'copied_early'
      Warning(/var/linsrc/linux-2617-g4//net/core/dev.c:3309): No description found for parameter 'chan'
      Warning(/var/linsrc/linux-2617-g4//net/core/dev.c:3309): No description found for parameter 'event'
      Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [NET]: Add software TSOv4 · f4c50d99
      Committed by Herbert Xu
      This patch adds the GSO implementation for IPv4 TCP.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [NET]: Merge TSO/UFO fields in sk_buff · 7967168c
      Committed by Herbert Xu
      Having separate fields in sk_buff for TSO/UFO (tso_size/ufo_size) is not
      going to scale if we add any more segmentation methods (e.g., DCCP).  So
      let's merge them.
      
      They were used to tell the protocol of a packet.  This function has been
      subsumed by the new gso_type field.  This is essentially a set of netdev
      feature bits (shifted by 16 bits) that are required to process a specific
      skb.  As such it's easy to tell whether a given device can process a GSO
      skb: you just have to AND the gso_type field and the netdev's features
      field.
      
      I've made gso_type a conjunction.  The idea is that you have a base type
      (e.g., SKB_GSO_TCPV4) that can be modified further to support new features.
      For example, if we add a hardware TSO type that supports ECN, they would
      declare NETIF_F_TSO | NETIF_F_TSO_ECN.  All TSO packets with CWR set would
      have a gso_type of SKB_GSO_TCPV4 | SKB_GSO_TCPV4_ECN while all other TSO
      packets would be SKB_GSO_TCPV4.  This means that only the CWR packets need
      to be emulated in software.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
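Note on 7967168c: the capability test described above (shift gso_type by 16 bits and AND it with the device features) can be sketched as follows. The individual bit values and the `toy_` names are assumptions for illustration; only the 16-bit shift and the AND comparison come from the commit text.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_GSO_SHIFT 16   /* from the commit text: feature bits shifted by 16 */

/* Illustrative gso_type bits. */
#define TOY_SKB_GSO_TCPV4      (1u << 0)
#define TOY_SKB_GSO_TCPV4_ECN  (1u << 1)

/* Matching device feature bits are the same bits moved up by the shift. */
#define TOY_NETIF_F_TSO      (TOY_SKB_GSO_TCPV4 << TOY_GSO_SHIFT)
#define TOY_NETIF_F_TSO_ECN  (TOY_SKB_GSO_TCPV4_ECN << TOY_GSO_SHIFT)

/* True if the device advertises every feature bit the skb requires. */
bool toy_gso_ok(uint32_t dev_features, uint32_t gso_type)
{
    uint32_t needed = gso_type << TOY_GSO_SHIFT;

    return (dev_features & needed) == needed;
}
```

A device advertising only TOY_NETIF_F_TSO accepts a plain TCPV4 packet but not one that also carries the ECN modifier, so only the CWR packets would fall back to software.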
    • [NET]: Avoid allocating skb in skb_pad · 5b057c6b
      Committed by Herbert Xu
      First of all it is unnecessary to allocate a new skb in skb_pad since
      the existing one is not shared.  More importantly, our hard_start_xmit
      interface does not allow a new skb to be allocated since that breaks
      requeueing.
      
      This patch uses pskb_expand_head to expand the existing skb and linearize
      it if needed.  Actually, someone should sift through every instance of
      skb_pad on a non-linear skb as they do not fit the reasons why this was
      originally created.
      
      Incidentally, this fixes a minor bug when the skb is cloned (tcpdump,
      TCP, etc.).  As it is skb_pad will simply write over a cloned skb.  Because
      of the position of the write it is unlikely to cause problems but still
      it's best if we don't do it.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  12. 18 Jun, 2006 4 commits
  13. 26 Apr, 2006 1 commit
  14. 20 Apr, 2006 1 commit
    • [NET]: Add skb->truesize assertion checking. · dc6de336
      Committed by David S. Miller
      Add some sanity checking.  truesize should be at least sizeof(struct
      sk_buff) plus the current packet length.  If not, then truesize is
      seriously mangled and deserves a kernel log message.
      
      Currently we'll do the check for release of stream socket buffers.
      
      But we can add checks to more spots over time.
      
      Incorporating ideas from Herbert Xu.
      Signed-off-by: David S. Miller <davem@davemloft.net>
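Note on dc6de336: the sanity condition stated above is simple enough to write out. The following is a sketch with a hypothetical `struct toy_sk_buff` and `toy_truesize_check`; the real structure is far larger and the kernel logs through its own facilities rather than stderr.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

struct toy_sk_buff {
    size_t len;        /* current packet length in bytes */
    size_t truesize;   /* memory charged for this buffer */
    void  *head;       /* placeholder field so the struct has some size */
};

/* truesize must cover at least the struct itself plus the packet data. */
bool toy_truesize_check(const struct toy_sk_buff *skb)
{
    bool ok = skb->truesize >= sizeof(struct toy_sk_buff) + skb->len;

    if (!ok)
        fprintf(stderr, "toy skb: bad truesize (%zu) len=%zu\n",
                skb->truesize, skb->len);
    return ok;
}
```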
  15. 31 Mar, 2006 1 commit
  16. 21 Mar, 2006 4 commits
    • [NET]: Replace skb_pull/skb_postpull_rcsum with skb_pull_rcsum · cbb042f9
      Committed by Herbert Xu
      We're now starting to have quite a number of places that do skb_pull
      followed immediately by an skb_postpull_rcsum.  We can merge these two
      operations into one function with skb_pull_rcsum.  This makes sense
      since most pull operations on receive skb's need to update the
      checksum.
      
      I've decided to make this out-of-line since it is fairly big and the
      fast path where hardware checksums are enabled needs to call
      csum_partial anyway.
      
      Since this is a brand new function we get to add an extra check on the
      len argument.  As it is most callers of skb_pull ignore its return
      value which essentially means that there is no check on the len
      argument.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
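Note on cbb042f9: the idea of pulling bytes and adjusting the receive checksum in one call can be sketched with a plain one's-complement sum. This is a user-space illustration, not the kernel function. All `toy_` names are hypothetical, the sketch assumes an even pull length, it always updates the checksum regardless of checksum mode, and it returns NULL on a bad length where the kernel may react differently.

```c
#include <stddef.h>
#include <stdint.h>

struct toy_skb {
    const uint8_t *data;  /* start of the remaining packet bytes */
    size_t         len;   /* remaining length */
    uint32_t       csum;  /* unfolded one's-complement sum over data[0..len) */
};

static uint32_t toy_csum_add(uint32_t a, uint32_t b)
{
    uint32_t r = a + b;

    return r + (r < b);   /* end-around carry */
}

/* Sum 16-bit big-endian words; a trailing odd byte is padded with zero. */
uint32_t toy_csum_partial(const uint8_t *p, size_t len, uint32_t sum)
{
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum = toy_csum_add(sum, ((uint32_t)p[i] << 8) | p[i + 1]);
    if (i < len)
        sum = toy_csum_add(sum, (uint32_t)p[i] << 8);
    return sum;
}

uint16_t toy_csum_fold(uint32_t sum)
{
    sum = (sum & 0xffff) + (sum >> 16);
    sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/*
 * Pull len bytes and subtract them from the checksum in one step.
 * Returns the new data pointer, or NULL if len exceeds the packet.
 */
const uint8_t *toy_skb_pull_rcsum(struct toy_skb *skb, size_t len)
{
    if (len > skb->len)
        return NULL;      /* the extra length check the commit mentions */
    /* Subtracting in one's complement is adding the bitwise complement. */
    skb->csum = toy_csum_add(skb->csum, ~toy_csum_partial(skb->data, len, 0));
    skb->data += len;
    skb->len -= len;
    return skb->data;
}
```

After a pull, folding `skb->csum` should give the same 16-bit value as summing the remaining bytes from scratch, which is the invariant the combined helper is meant to preserve.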
    • [NET]: Uninline kfree_skb and allow NULL argument · 231d06ae
      Committed by Jörn Engel
      o Uninline kfree_skb, which saves some 15k of object code on my notebook.
      
      o Allow kfree_skb to be called with a NULL argument.
      
        Subsequent patches can remove conditionals from drivers and further
        reduce source and object size.
      Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
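Note on 231d06ae: a NULL-tolerant, reference-counted free can be sketched as below. The `toy_` names and the free counter are hypothetical and exist only to make the behavior observable; the real function also runs destructors and uses atomic reference counts.

```c
#include <stdlib.h>

struct toy_skb {
    int users;   /* reference count */
};

int toy_freed_count;   /* counts real frees, for demonstration only */

struct toy_skb *toy_alloc_skb(void)
{
    struct toy_skb *skb = malloc(sizeof(*skb));

    if (skb)
        skb->users = 1;
    return skb;
}

void toy_kfree_skb(struct toy_skb *skb)
{
    if (!skb)
        return;                 /* NULL is accepted, like free(NULL) */
    if (--skb->users > 0)
        return;                 /* other holders remain */
    free(skb);
    toy_freed_count++;
}
```

With the NULL check inside the function, a caller can write `toy_kfree_skb(ptr);` without wrapping it in `if (ptr)`, which is the driver-side simplification the commit anticipates.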
    • [NETFILTER]: Fix skb->nf_bridge lifetime issues · a193a4ab
      Committed by Patrick McHardy
      The bridge netfilter code simulates the NF_IP_PRE_ROUTING hook and skips
      the real hook by registering with high priority and returning NF_STOP if
      skb->nf_bridge is present and the BRNF_NF_BRIDGE_PREROUTING flag is not
      set. The flag is only set during the simulated hook.
      
      Because skb->nf_bridge is only freed when the packet is destroyed, the
      packet will not only skip the first invocation of NF_IP_PRE_ROUTING, but
      in the case of tunnel devices on top of the bridge also all further ones.
      Forwarded packets from a bridge encapsulated by a tunnel device and sent
      as locally outgoing packet will also still have the incorrect bridge
      information from the input path attached.
      
      We already have nf_reset calls on all RX/TX paths of tunnel devices,
      so simply reset the nf_bridge field there too. As an added bonus,
      the bridge information for locally delivered packets is now also freed
      when the packet is queued to a socket.
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [NET]: Reduce size of struct sk_buff on 64 bit architectures · 77d2ca35
      Committed by Patrick McHardy
      Move skb->nfmark next to skb->tc_index to remove a 4 byte hole between
      skb->nfmark and skb->nfct and another one between skb->users and skb->head
      when CONFIG_NETFILTER, CONFIG_NET_SCHED and CONFIG_NET_CLS_ACT are enabled.
      For all other combinations the size stays the same.
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
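Note on 77d2ca35: the kind of padding hole this commit removes can be reproduced with two toy structs. The field names loosely echo the commit but the layouts are hypothetical, and the exact sizes depend on the platform ABI.

```c
#include <stddef.h>
#include <stdint.h>

/* 32-bit fields separated by pointers: each may be followed by padding. */
struct toy_with_holes {
    uint32_t mark;
    void    *ct;
    uint32_t users;
    void    *head;
};

/* Same fields with the two 32-bit members adjacent. */
struct toy_packed {
    uint32_t mark;
    uint32_t users;
    void    *ct;
    void    *head;
};

/* Bytes saved by the reordering on the current platform. */
size_t toy_bytes_saved(void)
{
    return sizeof(struct toy_with_holes) - sizeof(struct toy_packed);
}
```

On a typical 64-bit ABI with 8-byte pointer alignment, the first layout needs 4 bytes of padding after each 32-bit member, while the second needs none; on typical 32-bit ABIs both layouts have the same size, consistent with the commit's remark about 64 bit architectures.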
  17. 17 Jan, 2006 1 commit
  18. 08 Jan, 2006 1 commit
    • [NETFILTER]: Fix xfrm lookup in ip_route_me_harder/ip6_route_me_harder · 3e3850e9
      Committed by Patrick McHardy
      ip_route_me_harder doesn't use the port numbers of the xfrm lookup and
      uses ip_route_input for non-local addresses which doesn't do a xfrm
      lookup, ip6_route_me_harder doesn't do a xfrm lookup at all.
      
      Use xfrm_decode_session and do the lookup manually, make sure both
      only do the lookup if the packet hasn't been transformed already.
      
      Making sure the lookup only happens once needs a new field in the
      IP6CB, which exceeds the size of skb->cb. The size of skb->cb is
      increased to 48 bytes. Apparently the IPv6 mobile extensions need some
      more room anyway.
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  19. 04 Jan, 2006 3 commits
    • [NET]: Speed up __alloc_skb() · 4947d3ef
      Committed by Benjamin LaHaise
      From: Benjamin LaHaise <bcrl@kvack.org>
      
      In __alloc_skb(), the use of skb_shinfo() which casts a u8 * to the 
      shared info structure results in gcc being forced to do a reload of the 
      pointer since it has no information on possible aliasing.  Fix this by 
      using a pointer to refer to skb_shared_info.
      
      By initializing skb_shared_info sequentially, the write combining buffers 
      can reduce the number of memory transactions to a single write.  Reorder 
      the initialization in __alloc_skb() to match the structure definition.  
      There is also an alignment issue on 64 bit systems with skb_shared_info;
      by converting nr_frags to a short, everything packs up nicely.
      
      Also, pass the slab cache pointer according to the fclone flag instead 
      of using two almost identical function calls.
      
      This raises bw_unix performance up to a peak of 707KB/s when combined 
      with the spinlock patch.  It should help other networking protocols, too.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [NET]: Small cleanup to socket initialization · 77d76ea3
      Committed by Andi Kleen
      sock_init can be done as a core_initcall instead of calling
      it directly in init/main.c
      
      Also I removed an out of date #ifdef.
      Signed-off-by: Andi Kleen <ak@suse.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [IP]: Simplify and consolidate MSG_PEEK error handling · 3305b80c
      Committed by Herbert Xu
      When a packet is obtained from skb_recv_datagram with MSG_PEEK enabled
      it is left on the socket receive queue.  This means that when we detect
      a checksum error we have to be careful when trying to free the packet
      as someone could have dequeued it in the meantime.
      
      Currently this delicate logic is duplicated three times between UDPv4,
      UDPv6 and RAWv6.  This patch moves them into one place and simplifies
      the code somewhat.
      
      This is based on a suggestion by Eric Dumazet.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  20. 21 Nov, 2005 2 commits
  21. 11 Nov, 2005 1 commit
    • [NET]: Detect hardware rx checksum faults correctly · fb286bb2
      Committed by Herbert Xu
      Here is the patch that introduces the generic skb_checksum_complete
      which also checks for hardware RX checksum faults.  If that happens,
      it'll call netdev_rx_csum_fault which currently prints out a stack
      trace with the device name.  In future it can turn off RX checksum.
      
      I've converted every spot under net/ that does RX checksum checks to
      use skb_checksum_complete or __skb_checksum_complete with the
      exceptions of:
      
      * Those places where checksums are done bit by bit.  These will call
      netdev_rx_csum_fault directly.
      
      * The following have not been completely checked/converted:
      
      ipmr
      ip_vs
      netfilter
      dccp
      
      This patch is based on patches and suggestions from Stephen Hemminger
      and David S. Miller.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>