1. 09 Jul, 2006 1 commit
  2. 01 Jul, 2006 2 commits
    • [IPV6]: Added GSO support for TCPv6 · f83ef8c0
      Committed by Herbert Xu
      This patch adds GSO support for IPv6 and TCPv6.  This is based on a patch
      by Ananda Raju <Ananda.Raju@neterion.com>.  His original description is:
      
      	This patch enables TSO over IPv6. Currently the Linux network stack
      	restricts TSO over IPv6 by clearing the NETIF_F_TSO bit from
      	"dev->features". This patch removes this restriction.
      
      	This patch introduces a new flag, NETIF_F_TSO6, which is used to
      	check whether a device supports TSO over IPv6. If the device supports
      	TSO over IPv6, NETIF_F_TSO is not cleared, which lets the TCP layer
      	create TSO packets. Any device supporting TSO over IPv6 sets the
      	NETIF_F_TSO6 flag in "dev->features" along with NETIF_F_TSO.
      
      	When the user disables TSO using ethtool, NETIF_F_TSO is cleared
      	from "dev->features". So even if NETIF_F_TSO6 is set, the TCP layer
      	does not create TSO packets.
      
      	SKB_GSO_TCPV4 is renamed to SKB_GSO_TCP to make it a generic GSO
      	packet type. SKB_GSO_UDPV4 is renamed to SKB_GSO_UDP, as UFO is not
      	an IPv4-specific feature; UFO is also supported over IPv6.
      
      	The following table shows a significant improvement in throughput
      	with normal frames, and in CPU usage for both normal and jumbo frames.
      
      	---------------------------------------------------
      	|          |    MTU 1500      |     MTU 9600      |
      	|          |------------------|-------------------|
      	|          | thru    CPU      | thru    CPU       |
      	---------------------------------------------------
      	| TSO OFF  | 2.00    5.5% id  | 5.66    20.0% id  |
      	---------------------------------------------------
      	| TSO ON   | 2.63    78.0% id | 5.67    39.0% id  |
      	---------------------------------------------------
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [NET]: Generalise TSO-specific bits from skb_setup_caps · bcd76111
      Committed by Herbert Xu
      This patch generalises the TSO-specific bits from sk_setup_caps by adding
      the sk_gso_type member to struct sock.  This makes sk_setup_caps generic
      so that it can be used by TCPv6 or UFO.
      
      The only catch is that whoever uses this must provide a GSO implementation
      for their protocol which I think is a fair deal :) For now UFO continues to
      live without a GSO implementation which is OK since it doesn't use the sock
      caps field at the moment.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 30 Jun, 2006 3 commits
    • [NET]: Add ECN support for TSO · b0da8537
      Committed by Michael Chan
      In the current TSO implementation, NETIF_F_TSO and ECN cannot be
      turned on together in a TCP connection.  The problem is that most
      hardware that supports TSO does not handle CWR correctly if it is set
      in the TSO packet.  Correct handling requires CWR to be set in the
      first packet only if it is set in the TSO header.
      
      This patch adds the ability to turn on NETIF_F_TSO and ECN using
      GSO if necessary to handle TSO packets with CWR set.  Hardware
      that handles CWR correctly can turn on NETIF_F_TSO_ECN in the
      dev->features flag.
      
      All TSO packets with CWR set will have the SKB_GSO_TCPV4_ECN set.  If
      the output device does not have the NETIF_F_TSO_ECN feature set, GSO
      will split the packet up correctly with CWR only set in the first
      segment.
      
      With help from Herbert Xu <herbert@gondor.apana.org.au>.
      
      Since ECN can always be enabled with TSO, the SOCK_NO_LARGESEND sock
      flag is completely removed.
      Signed-off-by: Michael Chan <mchan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
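The CWR rule described in commit b0da8537 (CWR stays only on the first segment when a large packet is split) can be sketched in a few lines of C. This is a minimal illustration, not the kernel's GSO code: `split_tcp_flags` is a hypothetical helper, and it deliberately ignores how other flags such as FIN or PSH are treated during segmentation.

```c
#include <stddef.h>
#include <stdint.h>

#define TCP_FLAG_CWR 0x80u  /* CWR bit within the TCP flags byte */

/* Hypothetical helper: fills seg_flags[0..nsegs) with the flags each
 * segment of one oversized TCP packet would carry. CWR is preserved
 * on the first segment only and cleared on all later ones. */
static void split_tcp_flags(uint8_t flags, uint8_t *seg_flags, size_t nsegs)
{
    for (size_t i = 0; i < nsegs; i++)
        seg_flags[i] = (i == 0) ? flags : (uint8_t)(flags & ~TCP_FLAG_CWR);
}
```

For a packet with ACK and CWR set (0x90) that is split into three segments, only the first segment keeps CWR; the others carry ACK alone.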
    • [NET]: Fix logical error in skb_gso_ok · d6b4991a
      Committed by Herbert Xu
      The test in skb_gso_ok is backwards.  Noticed by Michael Chan
      <mchan@broadcom.com>.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Acked-by: Michael Chan <mchan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [NET]: Added GSO header verification · 576a30eb
      Committed by Herbert Xu
      When GSO packets come from an untrusted source (e.g., a Xen guest domain),
      we need to verify the header integrity before passing it to the hardware.
      
      Since the first step in GSO is to verify the header, we can reuse that
      code by adding a new bit to gso_type: SKB_GSO_DODGY.  Packets with this
      bit set can only be fed directly to devices with the corresponding bit
      NETIF_F_GSO_ROBUST.  If the device doesn't have that bit, then the skb
      is fed to the GSO engine which will allow the packet to be sent to the
      hardware if it passes the header check.
      
      This patch changes the sg flag to a full features flag.  The same method
      can be used to implement TSO ECN support.  We simply have to mark packets
      with CWR set with SKB_GSO_ECN so that only hardware with a corresponding
      NETIF_F_TSO_ECN can accept them.  The GSO engine can either fully segment
      the packet, or segment the first MTU and pass the rest to the hardware for
      further segmentation.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 26 Jun, 2006 1 commit
  5. 23 Jun, 2006 3 commits
    • [NET]: Added GSO toggle · 37c3185a
      Committed by Herbert Xu
      This patch adds a generic segmentation offload toggle that can be turned
      on/off for each net device.  For now it only supports TCPv4.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [NET]: Add generic segmentation offload · f6a78bfc
      Committed by Herbert Xu
      This patch adds the infrastructure for generic segmentation offload.
      The idea is to tap into the potential savings of TSO without hardware
      support by postponing the allocation of segmented skb's until just
      before the entry point into the NIC driver.
      
      The same structure can be used to support software IPv6 TSO, as well as
      UFO and segmentation offload for other relevant protocols, e.g., DCCP.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [NET]: Merge TSO/UFO fields in sk_buff · 7967168c
      Committed by Herbert Xu
      Having separate fields in sk_buff for TSO/UFO (tso_size/ufo_size) is not
      going to scale if we add any more segmentation methods (e.g., DCCP).  So
      let's merge them.
      
      They were used to tell the protocol of a packet.  This function has been
      subsumed by the new gso_type field.  This is essentially a set of netdev
      feature bits (shifted by 16 bits) that are required to process a specific
      skb.  As such it's easy to tell whether a given device can process a GSO
      skb: you just have to AND the gso_type field with the netdev's features
      field.
      
      I've made gso_type a conjunction.  The idea is that you have a base type
      (e.g., SKB_GSO_TCPV4) that can be modified further to support new features.
      For example, if we add a hardware TSO type that supports ECN, they would
      declare NETIF_F_TSO | NETIF_F_TSO_ECN.  All TSO packets with CWR set would
      have a gso_type of SKB_GSO_TCPV4 | SKB_GSO_TCPV4_ECN while all other TSO
      packets would be SKB_GSO_TCPV4.  This means that only the CWR packets need
      to be emulated in software.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
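The gso_type check described in commit 7967168c (feature bits shifted by 16, compared against the device's features) might look roughly like the sketch below. The constant values and the name `gso_type_ok` are assumptions chosen for illustration; the commit message only states the shift of 16 bits and the flag names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values only; the bit positions are assumptions. */
#define NETIF_F_GSO_SHIFT 16
#define SKB_GSO_TCPV4     (1u << 0)
#define SKB_GSO_TCPV4_ECN (1u << 1)
#define NETIF_F_TSO       (SKB_GSO_TCPV4 << NETIF_F_GSO_SHIFT)
#define NETIF_F_TSO_ECN   (SKB_GSO_TCPV4_ECN << NETIF_F_GSO_SHIFT)

/* Hypothetical helper: true when the device advertises every feature
 * bit that the packet's gso_type requires. */
static bool gso_type_ok(uint32_t dev_features, uint32_t gso_type)
{
    uint32_t required = gso_type << NETIF_F_GSO_SHIFT;
    return (dev_features & required) == required;
}
```

With these values, a device that only sets NETIF_F_TSO accepts plain SKB_GSO_TCPV4 packets but not packets that also carry SKB_GSO_TCPV4_ECN, so only the CWR packets would fall back to software.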
  6. 20 Jun, 2006 1 commit
    • [NET]: Prevent multiple qdisc runs · 48d83325
      Committed by Herbert Xu
      Having two or more qdisc_run's contend against each other is bad because
      it can induce packet reordering if the packets have to be requeued.  It
      appears that this is an unintended consequence of relinquishing the queue
      lock while transmitting.  That in turn is needed for devices that spend a
      lot of time in their transmit routine.
      
      There are no advantages to be had as devices with queues are inherently
      single-threaded (the loopback device is not but then it doesn't have a
      queue).
      
      Even if you were to add a queue to a parallel virtual device (e.g., bolt
      a tbf filter in front of an ipip tunnel device), you would still want to
      process the queue in sequence to ensure that the packets are ordered
      correctly.
      
      The solution here is to steal a bit from net_device to prevent this.
      
      BTW, as qdisc_restart is no longer used by anyone as a module inside the
      kernel (IIRC it used to with netif_wake_queue), I have not exported the
      new __qdisc_run function.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
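The idea in commit 48d83325 of using one state bit so that only a single context runs the queue at a time can be sketched in userspace C with a C11 atomic flag. The struct `qdisc_dev` and the function `qdisc_run_once` are hypothetical stand-ins, not the kernel's net_device or __qdisc_run.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-device state. */
struct qdisc_dev {
    atomic_flag qdisc_running;  /* set while one context drains the queue */
    int         runs;           /* how many times the queue was drained */
};

/* Drains the queue only if no other context is already doing so.
 * Returns true if this call did the work, false if it backed off. */
static bool qdisc_run_once(struct qdisc_dev *dev)
{
    if (atomic_flag_test_and_set(&dev->qdisc_running))
        return false;           /* someone else owns the queue run */
    dev->runs++;                /* placeholder for the dequeue loop */
    atomic_flag_clear(&dev->qdisc_running);
    return true;
}
```

A second caller that arrives while the flag is set simply returns, which is what avoids two runs contending and reordering packets.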
  7. 18 Jun, 2006 3 commits
    • [NET]: Add NETIF_F_GEN_CSUM and NETIF_F_ALL_CSUM · 8648b305
      Committed by Herbert Xu
      The current stack treats NETIF_F_HW_CSUM and NETIF_F_NO_CSUM
      identically so we test for them in quite a few places.  For the sake
      of brevity, I'm adding the macro NETIF_F_GEN_CSUM for these two.  We
      also test the disjunct of NETIF_F_IP_CSUM and the other two in various
      places, for that purpose I've added NETIF_F_ALL_CSUM.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
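The two macros introduced in commit 8648b305 are plain unions of existing feature bits. A minimal sketch, assuming illustrative bit values and two hypothetical helper functions that show how a test might use them:

```c
/* Illustrative bit values; treat them as assumptions. */
#define NETIF_F_IP_CSUM  0x2u
#define NETIF_F_NO_CSUM  0x4u
#define NETIF_F_HW_CSUM  0x8u

/* The two features the stack treats identically. */
#define NETIF_F_GEN_CSUM (NETIF_F_NO_CSUM | NETIF_F_HW_CSUM)
/* NETIF_F_IP_CSUM or either of the two above. */
#define NETIF_F_ALL_CSUM (NETIF_F_IP_CSUM | NETIF_F_GEN_CSUM)

/* Hypothetical helper: device handles checksums for any protocol. */
static int dev_has_gen_csum(unsigned int features)
{
    return (features & NETIF_F_GEN_CSUM) != 0;
}

/* Hypothetical helper: device has some form of checksum offload. */
static int dev_has_any_csum(unsigned int features)
{
    return (features & NETIF_F_ALL_CSUM) != 0;
}
```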
    • [NET]: Add netif_tx_lock · 932ff279
      Committed by Herbert Xu
      Various drivers use xmit_lock internally to synchronise with their
      transmission routines.  They do so without setting xmit_lock_owner.
      This is fine as long as netpoll is not in use.
      
      With netpoll it is possible for deadlocks to occur if xmit_lock_owner
      isn't set.  This is because a printk that occurs while xmit_lock is
      held and xmit_lock_owner is not set can cause netpoll to attempt to
      take xmit_lock recursively.
      
      While it is possible to resolve this by getting netpoll to use
      trylock, it is suboptimal because netpoll's sole objective is to
      maximise the chance of getting the printk out on the wire.  So
      delaying or dropping the message is to be avoided as much as possible.
      
      So the only alternative is to always set xmit_lock_owner.  The
      following patch does this by introducing the netif_tx_lock family of
      functions that take care of setting/unsetting xmit_lock_owner.
      
      I renamed xmit_lock to _xmit_lock to indicate that it should not be
      used directly.  I didn't provide irq versions of the netif_tx_lock
      functions since xmit_lock is meant to be a BH-disabling lock.
      
      This is pretty much a straight text substitution except for a small
      bug fix in winbond.  It currently uses
      netif_stop_queue/spin_unlock_wait to stop transmission.  This is
      unsafe as an IRQ can potentially wake up the queue.  So it is safer to
      use netif_tx_disable.
      
      The hamradio bits used spin_lock_irq but it is unnecessary as
      xmit_lock must never be taken in an IRQ handler.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
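The pattern behind commit 932ff279, a lock wrapper that always records its owner so a re-entrant caller can detect that it already holds the lock, can be sketched in userspace C. Everything here is a hypothetical analogue: `tx_dev`, `tx_lock`, `tx_unlock` and `tx_locked_by` are illustrative names, and a C11 atomic flag stands in for the kernel spinlock.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for the device's transmit lock state. */
struct tx_dev {
    atomic_flag xmit_lock;        /* simple spinlock */
    int         xmit_lock_owner;  /* id of the holder, -1 when unlocked */
};

/* Takes the lock and records who holds it, in one place. */
static void tx_lock(struct tx_dev *dev, int cpu)
{
    while (atomic_flag_test_and_set_explicit(&dev->xmit_lock,
                                             memory_order_acquire))
        ;                         /* spin until the lock is free */
    dev->xmit_lock_owner = cpu;
}

/* Clears the owner before releasing the lock. */
static void tx_unlock(struct tx_dev *dev)
{
    dev->xmit_lock_owner = -1;
    atomic_flag_clear_explicit(&dev->xmit_lock, memory_order_release);
}

/* What a netpoll-like caller could check to avoid recursive locking. */
static int tx_locked_by(const struct tx_dev *dev, int cpu)
{
    return dev->xmit_lock_owner == cpu;
}
```

Because the owner is set inside the wrapper, no caller can take the lock and forget to record ownership, which is the failure mode the commit message describes.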
    • [I/OAT]: Setup the networking subsystem as a DMA client · db217334
      Committed by Chris Leech
      Attempts to allocate per-CPU DMA channels
      Signed-off-by: Chris Leech <christopher.leech@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  8. 11 May, 2006 1 commit
  9. 09 May, 2006 1 commit
  10. 07 May, 2006 1 commit
  11. 26 Apr, 2006 2 commits
  12. 30 Mar, 2006 1 commit
    • [NET]: Deinline some larger functions from netdevice.h · 56079431
      Committed by Denis Vlasenko
      On an allyesconfig'ured kernel:
      
      Size  Uses Wasted Name and definition
      ===== ==== ====== ================================================
         95  162  12075 netif_wake_queue      include/linux/netdevice.h
        129   86   9265 dev_kfree_skb_any     include/linux/netdevice.h
        127   56   5885 netif_device_attach   include/linux/netdevice.h
         73   86   4505 dev_kfree_skb_irq     include/linux/netdevice.h
         46   60   1534 netif_device_detach   include/linux/netdevice.h
        119   16   1485 __netif_rx_schedule   include/linux/netdevice.h
        143    5    492 netif_rx_schedule     include/linux/netdevice.h
         81    7    366 netif_schedule        include/linux/netdevice.h
      
      netif_wake_queue is big because __netif_schedule is a big inline:
      
      static inline void __netif_schedule(struct net_device *dev)
      {
              if (!test_and_set_bit(__LINK_STATE_SCHED, &dev->state)) {
                      unsigned long flags;
                      struct softnet_data *sd;
      
                      local_irq_save(flags);
                      sd = &__get_cpu_var(softnet_data);
                      dev->next_sched = sd->output_queue;
                      sd->output_queue = dev;
                      raise_softirq_irqoff(NET_TX_SOFTIRQ);
                      local_irq_restore(flags);
              }
      }
      
      static inline void netif_wake_queue(struct net_device *dev)
      {
      #ifdef CONFIG_NETPOLL_TRAP
              if (netpoll_trap())
                      return;
      #endif
              if (test_and_clear_bit(__LINK_STATE_XOFF, &dev->state))
                      __netif_schedule(dev);
      }
      
      By de-inlining __netif_schedule we are saving a lot of text
      at each callsite of netif_wake_queue and netif_schedule.
      __netif_rx_schedule is also big, and it makes more sense to keep
      both of them out of line.
      
      Patch also deinlines dev_kfree_skb_any. We can deinline dev_kfree_skb_irq
      instead... oh well.
      
      netif_device_attach/detach are not hot paths, we can deinline them too.
      Signed-off-by: Denis Vlasenko <vda@ilport.com.ua>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
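The commit message for 56079431 does not say how the "Wasted" column is computed. All eight rows are consistent with (Size - 20) * (Uses - 1), that is, every call site beyond the first wastes the inline body minus an assumed 20-byte cost of a call. The constant 20 is inferred from the numbers and is not stated in the source; `wasted_bytes` is a hypothetical helper.

```c
/* Assumed per-call overhead in bytes, inferred from the table. */
#define CALL_OVERHEAD 20

/* Bytes wasted by inlining a function of the given size at `uses`
 * call sites, compared with one out-of-line copy plus calls. */
static int wasted_bytes(int size, int uses)
{
    return (size - CALL_OVERHEAD) * (uses - 1);
}
```

For netif_wake_queue this gives (95 - 20) * (162 - 1) = 12075, matching the first row of the table.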
  13. 21 Mar, 2006 2 commits
  14. 01 Dec, 2005 1 commit
  15. 14 Nov, 2005 1 commit
  16. 11 Nov, 2005 1 commit
    • [NET]: Detect hardware rx checksum faults correctly · fb286bb2
      Committed by Herbert Xu
      Here is the patch that introduces the generic skb_checksum_complete
      which also checks for hardware RX checksum faults.  If that happens,
      it'll call netdev_rx_csum_fault which currently prints out a stack
      trace with the device name.  In future it can turn off RX checksum.
      
      I've converted every spot under net/ that does RX checksum checks to
      use skb_checksum_complete or __skb_checksum_complete with the
      exceptions of:
      
      * Those places where checksums are done bit by bit.  These will call
      netdev_rx_csum_fault directly.
      
      * The following have not been completely checked/converted:
      
      ipmr
      ip_vs
      netfilter
      dccp
      
      This patch is based on patches and suggestions from Stephen Hemminger
      and David S. Miller.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
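The check described in commit fb286bb2 can be illustrated with a software one's-complement checksum in the style of RFC 1071. This sketch assumes one plausible reading of "hardware RX checksum fault": the software recomputation accepts a packet that the hardware-supplied sum had failed to validate. The names `csum_buf`, `check_rx` and `rx_verdict` are hypothetical, and the buffer is assumed to already include the checksum field.

```c
#include <stddef.h>
#include <stdint.h>

/* 16-bit one's-complement sum over a buffer, folded and inverted.
 * A buffer that includes a correct checksum field yields 0. */
static uint16_t csum_buf(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    for (size_t i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];
    if (len & 1)
        sum += (uint32_t)data[len - 1] << 8;   /* pad the odd byte */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);    /* fold the carries */
    return (uint16_t)~sum;
}

enum rx_verdict { RX_OK, RX_BAD_PACKET, RX_HW_FAULT };

/* hw_rejected: nonzero if the hardware-supplied sum did not validate.
 * A packet that software accepts but hardware rejected is reported
 * as a hardware fault rather than silently accepted. */
static enum rx_verdict check_rx(const uint8_t *pkt, size_t len, int hw_rejected)
{
    if (csum_buf(pkt, len) != 0)
        return RX_BAD_PACKET;
    return hw_rejected ? RX_HW_FAULT : RX_OK;
}
```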
  17. 29 Oct, 2005 1 commit
    • [IPv4/IPv6]: UFO Scatter-gather approach · e89e9cf5
      Committed by Ananda Raju
      Attached is kernel patch for UDP Fragmentation Offload (UFO) feature.
      
      1. This patch incorporates the review comments by Jeff Garzik.
      2. Renamed USO as UFO (UDP Fragmentation Offload)
      3. udp sendfile support with UFO
      
      This patch uses the scatter-gather feature of skb to generate large UDP
      datagrams. Below is a "how-to" on the changes required in a network device
      driver to use the UFO interface.
      driver to use the UFO interface.
      
      UDP Fragmentation Offload (UFO) Interface:
      -------------------------------------------
      UFO is a feature wherein the Linux kernel network stack offloads the
      IP fragmentation of large UDP datagrams to hardware. This reduces the
      stack's overhead of fragmenting large UDP datagrams into MTU-sized
      packets.
      
      1) Drivers indicate their capability of UFO using
      dev->features |= NETIF_F_UFO | NETIF_F_HW_CSUM | NETIF_F_SG
      
      NETIF_F_HW_CSUM is required for UFO over ipv6.
      
      2) UFO packet will be submitted for transmission using driver xmit routine.
      UFO packet will have a non-zero value for
      
      "skb_shinfo(skb)->ufo_size"
      
      skb_shinfo(skb)->ufo_size will indicate the length of data part in each IP
      fragment going out of the adapter after IP fragmentation by hardware.
      
      skb->data will contain MAC/IP/UDP header and skb_shinfo(skb)->frags[]
      contains the data payload. The skb->ip_summed will be set to CHECKSUM_HW
      indicating that hardware has to do checksum calculation. Hardware should
      compute the UDP checksum of complete datagram and also ip header checksum of
      each fragmented IP packet.
      
      For IPV6 the UFO provides the fragment identification-id in
      skb_shinfo(skb)->ip6_frag_id. The adapter should use this ID for generating
      IPv6 fragments.
      Signed-off-by: Ananda Raju <ananda.raju@neterion.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (forwarded)
      Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
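In commit e89e9cf5, ufo_size gives the length of the data part of each IP fragment. A rough sketch of the arithmetic a device would face when splitting a datagram follows; `ufo_fragment_count` and `ufo_fragment_len` are hypothetical helpers, and details such as header sizes and the 8-byte alignment of fragment offsets are left out.

```c
#include <stddef.h>

/* Number of fragments needed when each carries at most ufo_size
 * bytes of data. Returns 0 when ufo_size is 0 (not a UFO packet). */
static size_t ufo_fragment_count(size_t payload_len, size_t ufo_size)
{
    if (ufo_size == 0)
        return 0;
    return (payload_len + ufo_size - 1) / ufo_size;
}

/* Data length carried by fragment number `index` (0-based), or 0
 * if that fragment does not exist. */
static size_t ufo_fragment_len(size_t payload_len, size_t ufo_size, size_t index)
{
    size_t start = index * ufo_size;
    if (ufo_size == 0 || start >= payload_len)
        return 0;
    size_t rest = payload_len - start;
    return rest < ufo_size ? rest : ufo_size;
}
```

For example, 4000 bytes of data with a ufo_size of 1480 yields three fragments carrying 1480, 1480 and 1040 bytes.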
  18. 28 Sep, 2005 1 commit
    • [NET]: Reorder some hot fields of struct net_device · 9356b8fc
      Committed by Eric Dumazet
      Place them on separate cache lines in SMP to lower memory bouncing
      between multiple CPUs accessing the device.
      
           - One part is mostly used on receive path (including
             eth_type_trans()) (poll_list, poll, quota, weight, last_rx,
             dev_addr, broadcast)
      
           - One part is mostly used on queue transmit path (qdisc)
            (queue_lock, qdisc, qdisc_sleeping, qdisc_list, tx_queue_len)
      
           - One part is mostly used on xmit path (device)
            (xmit_lock, xmit_lock_owner, priv, hard_start_xmit, trans_start)
      
      'features' is placed outside of these hot points, in a location that
      may be shared by all cpus (because mostly read)
      
      name_hlist is moved close to name[IFNAMSIZ] to speedup __dev_get_by_name()
      Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
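The layout idea in commit 9356b8fc, grouping fields by the path that touches them and starting each group on its own cache line, can be sketched with C11 alignment specifiers. The struct `hot_dev` is a hypothetical, heavily reduced stand-in for net_device, and the 64-byte line size is an assumption that varies by CPU.

```c
#include <stdalign.h>
#include <stddef.h>

#define CACHE_LINE 64  /* assumed cache line size in bytes */

/* Hypothetical reduced device struct with three hot groups. */
struct hot_dev {
    /* receive path */
    alignas(CACHE_LINE) int quota;
    int           weight;
    unsigned long last_rx;

    /* queue transmit path (qdisc) */
    alignas(CACHE_LINE) unsigned int tx_queue_len;

    /* xmit path (device) */
    alignas(CACHE_LINE) int xmit_lock_owner;
    unsigned long trans_start;
};

/* Each group must begin on a cache line boundary. */
_Static_assert(offsetof(struct hot_dev, tx_queue_len) % CACHE_LINE == 0,
               "queue group not cache-line aligned");
_Static_assert(offsetof(struct hot_dev, xmit_lock_owner) % CACHE_LINE == 0,
               "xmit group not cache-line aligned");
```

With this layout, a CPU updating last_rx on the receive path does not invalidate the line that another CPU uses for trans_start on the transmit path.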
  19. 16 Sep, 2005 1 commit
    • [PATCH] sky2: driver update. · 793b883e
      Committed by Stephen Hemminger
      Here is revised patch against netdev sky2 branch.
      It includes whitespace fixes, all the changes from the previous
      review as well as some optimizations and timing fixes to
      solve some of the hangs.
      
      The stall problem is better but not perfect. It appears that
      under stress the chip can't keep up with the bus
      and sends a pause frame, then hangs. This version is for
      testing, and hopefully other eyes might see the root
      cause of the problem.
      
      I don't want to reinvent the ugly watchdog code in the syskonnect
      version of sk98lin.  If you read it you will see, the original
      driver writer and the hardware developer obviously didn't
      understand each other.
      
      Dual port support is included, but not tested yet. It did
      require small change to NAPI since both ports share same
      IRQ.
      Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
  20. 14 Sep, 2005 1 commit
  21. 30 Aug, 2005 4 commits
  22. 24 Jun, 2005 4 commits
  23. 23 Jun, 2005 1 commit
    • [NETPOLL]: Introduce a netpoll_info struct · 115c1d6e
      Committed by Jeff Moyer
      This patch introduces a netpoll_info structure, which the struct net_device
      will now point to instead of pointing to a struct netpoll.  The reason for
      this is two-fold: 1) fields such as the rx_flags, poll_owner, and poll_lock
      should be maintained per net_device, not per netpoll;  and 2) this is a first
      step in providing support for multiple netpoll clients to register against the
      same net_device.
      
      The struct netpoll is now pointed to by the netpoll_info structure.  As
      such, the previous behaviour of the code is preserved.
      Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  24. 03 Jun, 2005 1 commit
  25. 30 May, 2005 1 commit