1. 09 Dec, 2006 1 commit
  2. 04 Dec, 2006 1 commit
  3. 03 Dec, 2006 3 commits
    • [NET]: Pack struct hh_cache · d5c42c0e
      Arnaldo Carvalho de Melo committed
      [acme@newtoy net-2.6.20]$ pahole net/ipv4/tcp.o hh_cache
      /* /pub/scm/linux/kernel/git/acme/net-2.6.20/include/linux/netdevice.h:190 */
      struct hh_cache {
              struct hh_cache *          hh_next;              /*     0     4 */
              atomic_t                   hh_refcnt;            /*     4     4 */
              __be16                     hh_type;              /*     8     2 */
      
              /* XXX 2 bytes hole, try to pack */
      
              int                        hh_len;               /*    12     4 */
              int                        (*hh_output)();       /*    16     4 */
              rwlock_t                   hh_lock;              /*    20    36 */
              long unsigned int          hh_data[24];          /*    56    96 */
      }; /* size: 152, sum members: 150, holes: 1, sum holes: 2 */
      
      [acme@newtoy net-2.6.20]$ find net -name "*.[ch]" | xargs grep 'hh_len.\+=' | sort -u
      net/atm/br2684.c:               hh->hh_len = PADLEN + ETH_HLEN;
      net/ethernet/eth.c:     hh->hh_len = ETH_HLEN;
      net/ipv4/ipconfig.c:    int hh_len = LL_RESERVED_SPACE(dev);
      net/ipv4/ip_output.c:   hh_len = LL_RESERVED_SPACE(rt->u.dst.dev);
      net/ipv4/ip_output.c:   int hh_len = LL_RESERVED_SPACE(dev);
      net/ipv4/netfilter.c:   hh_len = (*pskb)->dst->dev->hard_header_len;
      net/ipv4/raw.c: hh_len = LL_RESERVED_SPACE(rt->u.dst.dev);
      net/ipv6/ip6_output.c:  hh_len = LL_RESERVED_SPACE(rt->u.dst.dev);
      net/ipv6/netfilter/ip6t_REJECT.c:       hh_len = (dst->dev->hard_header_len + 15)&~15;
      net/ipv6/raw.c: hh_len = LL_RESERVED_SPACE(rt->u.dst.dev);
      [acme@newtoy net-2.6.20]$
      
      [acme@newtoy net-2.6.20]$ find include -name "*.h" | xargs grep 'define ETH_HLEN'
      include/linux/if_ether.h:#define ETH_HLEN       14              /* Total octets in header.       */
      
      #define LL_RESERVED_SPACE(dev) \
              (((dev)->hard_header_len&~(HH_DATA_MOD - 1)) + HH_DATA_MOD)
      
      [acme@newtoy net-2.6.20]$ pahole net/ipv4/tcp.o net_device | grep hard_header_len
              short unsigned int         hard_header_len;      /*   106     2 */
      [acme@newtoy net-2.6.20]$
      
      So I think we're safe in turning hh_len into a u16. End result:
      
      [acme@newtoy net-2.6.20]$ codiff -sV /tmp/tcp.o.before net/ipv4/tcp.o
      /pub/scm/linux/kernel/git/acme/net-2.6.20/net/ipv4/tcp.c:
        struct hh_cache |   -4
          hh_len;
           from: int                   /*    12(0)     4(0) */
           to:   u16                   /*    10(0)     2(0) */
       1 struct changed
      [acme@newtoy net-2.6.20]$
      Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
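The hole and the 4-byte saving reported by pahole and codiff above can be reproduced in user space. The following is a minimal sketch, assuming fixed-width stand-ins for the kernel types (the struct names, `hh_cache_saving`, and the opaque 36-byte lock are hypothetical, chosen only to mirror the 32-bit layout shown above) and a mainstream ABI where `uint32_t` is 4-byte aligned:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-ins for the kernel types. Fixed-width members are
 * used so the layout mirrors the 32-bit pahole output quoted above.
 */
struct hh_cache_before {
    uint32_t hh_next;       /* offset  0: pointer on a 32-bit build */
    int32_t  hh_refcnt;     /* offset  4: atomic_t */
    uint16_t hh_type;       /* offset  8: __be16 */
                            /* 2-byte hole: the next member needs 4-byte alignment */
    int32_t  hh_len;        /* offset 12 */
    uint32_t hh_output;     /* offset 16: function pointer */
    uint8_t  hh_lock[36];   /* offset 20: opaque stand-in for rwlock_t */
    uint32_t hh_data[24];   /* offset 56 */
};

struct hh_cache_after {
    uint32_t hh_next;
    int32_t  hh_refcnt;
    uint16_t hh_type;
    uint16_t hh_len;        /* offset 10: now fills the former hole */
    uint32_t hh_output;
    uint8_t  hh_lock[36];
    uint32_t hh_data[24];
};

/* Bytes saved by narrowing hh_len from a 32-bit int to a 16-bit value. */
static size_t hh_cache_saving(void)
{
    return sizeof(struct hh_cache_before) - sizeof(struct hh_cache_after);
}
```

On such an ABI `hh_cache_saving()` evaluates to 4, matching the `-4` in the codiff output.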
    • [TCP/DCCP]: Introduce net_xmit_eval · b9df3cb8
      Gerrit Renker committed
      Throughout the TCP/DCCP (and tunnelling) code, it often happens that the
      return code of a transmit function needs to be tested against NET_XMIT_CN
      which is a value that does not indicate a strict error condition.
      
      This patch uses a macro for these recurring situations which is consistent
      with the already existing macro net_xmit_errno, saving on duplicated code.
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
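The commit text does not quote the macro itself. A plausible shape for it, next to the existing net_xmit_errno, is sketched below; the numeric return codes and both definitions are written from memory of the kernel headers of that era and should be treated as assumptions:

```c
#include <assert.h>
#include <errno.h>

/* Transmit return codes; the numeric values are assumptions for illustration. */
enum {
    NET_XMIT_SUCCESS = 0,
    NET_XMIT_DROP    = 1,   /* packet was dropped */
    NET_XMIT_CN      = 2,   /* congestion notification, not a strict error */
};

/* Existing helper: map a transmit result to an errno-style value. */
#define net_xmit_errno(e) ((e) != NET_XMIT_CN ? -ENOBUFS : 0)

/* New helper: treat NET_XMIT_CN as success, pass every other code through. */
#define net_xmit_eval(e)  ((e) == NET_XMIT_CN ? 0 : (e))
```

A caller can then write `err = net_xmit_eval(err);` instead of repeating the comparison against NET_XMIT_CN at each call site.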
    • [NET]: The scheduled removal of the frame diverter. · 90833aa4
      Adrian Bunk committed
      This patch contains the scheduled removal of the frame diverter.
      Signed-off-by: Adrian Bunk <bunk@stusta.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 29 Nov, 2006 1 commit
    • [NET]: Fix MAX_HEADER setting. · e81c7359
      David S. Miller committed
      MAX_HEADER is either set to LL_MAX_HEADER or LL_MAX_HEADER + 48, and
      this is controlled by a set of CONFIG_* ifdef tests.
      
      It is trying to use LL_MAX_HEADER + 48 when any of the tunnels are
      enabled which set hard_header_len like this:
      
      dev->hard_header_len = LL_MAX_HEADER + sizeof(struct xxx);
      
      The correct set of tunnel drivers which do this are:
      
      ipip
      ip_gre
      ip6_tunnel
      sit
      
      so make the ifdef test match.
      
      Noticed by Patrick McHardy and with help from Herbert Xu.
      Signed-off-by: David S. Miller <davem@davemloft.net>
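The ifdef test described above might look roughly like the sketch below. The CONFIG_* symbol names are my recollection of the options for the four listed tunnel drivers, the `_MODULE` variants are omitted for brevity, and `LL_MAX_HEADER`'s value, `struct tunnel_hdr`, and `tunnel_hard_header_len` are hypothetical:

```c
#include <assert.h>

/* Hypothetical value; the real LL_MAX_HEADER depends on the kernel config. */
#define LL_MAX_HEADER 32

/* Stand-in for the "struct xxx" a tunnel driver prepends (20 bytes here). */
struct tunnel_hdr { unsigned char bytes[20]; };

/*
 * Reserve the extra 48 bytes only when one of the tunnel drivers that
 * extend hard_header_len is enabled: ipip, ip_gre, ip6_tunnel, sit.
 */
#if defined(CONFIG_NET_IPIP) || defined(CONFIG_NET_IPGRE) || \
    defined(CONFIG_IPV6_TUNNEL) || defined(CONFIG_IPV6_SIT)
#define MAX_HEADER (LL_MAX_HEADER + 48)
#else
#define MAX_HEADER LL_MAX_HEADER
#endif

/* What such a tunnel driver would assign to dev->hard_header_len. */
static unsigned int tunnel_hard_header_len(void)
{
    return LL_MAX_HEADER + (unsigned int)sizeof(struct tunnel_hdr);
}
```

With none of the options defined, `MAX_HEADER` collapses to `LL_MAX_HEADER`; with any of them defined, the 48 extra bytes cover the tunnel header in this sketch.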
  5. 29 Sep, 2006 1 commit
  6. 26 Sep, 2006 2 commits
    • [PATCH] bonding: Validate probe replies in ARP monitor · f5b2b966
      Jay Vosburgh committed
      	Add logic to check ARP request / reply packets used for ARP
      monitor link integrity checking.
      
      	The current method simply examines the slave device to see if it
      has sent and received traffic; this can be fooled by extraneous traffic.
      For example, if multiple hosts running bonding are behind a common
      switch, the probe traffic from the multiple instances of bonding will
      update the tx/rx times on each other's slave devices.
      Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
      Signed-off-by: Jeff Garzik <jeff@garzik.org>
    • [PATCH] WE-21 support (core API) · baef1865
      John W. Linville committed
      This is version 21 of the Wireless Extensions. Changelog :
      	o finishes migrating the ESSID API (remove the +1)
      	o netdev->get_wireless_stats is no more
      	o long/short retry
      
      This is a redacted version of a patch originally submitted by Jean
      Tourrilhes.  I removed most of the additions, in order to minimize
      future support requirements for nl80211 (or other WE successor).
      
      CC: Jean Tourrilhes <jt@hpl.hp.com>
      Signed-off-by: John W. Linville <linville@tuxdriver.com>
  7. 23 Sep, 2006 1 commit
  8. 14 Sep, 2006 1 commit
  9. 18 Aug, 2006 2 commits
  10. 22 Jul, 2006 1 commit
  11. 09 Jul, 2006 2 commits
  12. 01 Jul, 2006 2 commits
    • [IPV6]: Added GSO support for TCPv6 · f83ef8c0
      Herbert Xu committed
      This patch adds GSO support for IPv6 and TCPv6.  This is based on a patch
      by Ananda Raju <Ananda.Raju@neterion.com>.  His original description is:
      
      	This patch enables TSO over IPv6. Currently the Linux network stack
      	restricts TSO over IPv6 by clearing the NETIF_F_TSO bit from
      	"dev->features". This patch removes that restriction.
      
      	This patch introduces a new flag, NETIF_F_TSO6, which is used to
      	check whether a device supports TSO over IPv6. If the device supports
      	TSO over IPv6, NETIF_F_TSO is not cleared, which lets the TCP layer
      	create TSO packets. Any device supporting TSO over IPv6 sets the
      	NETIF_F_TSO6 flag in "dev->features" along with NETIF_F_TSO.
      
      	When the user disables TSO using ethtool, NETIF_F_TSO is cleared
      	from "dev->features", so even with NETIF_F_TSO6 set the TCP layer
      	creates no TSO packets.
      
      	SKB_GSO_TCPV4 is renamed to SKB_GSO_TCP to make it a generic GSO
      	packet type. SKB_GSO_UDPV4 is renamed to SKB_GSO_UDP, as UFO is not
      	an IPv4-only feature; UFO is supported over IPv6 as well.
      
      	The following table shows there is significant improvement in
      	throughput with normal frames and CPU usage for both normal and jumbo.
      
      	--------------------------------------------------
      	|          |      1500       |       9600        |
      	|          |-----------------|-------------------|
      	|          | thru    CPU     | thru    CPU       |
      	--------------------------------------------------
      	| TSO OFF  | 2.00    5.5% id | 5.66   20.0% id   |
      	--------------------------------------------------
      	| TSO ON   | 2.63   78.0% id | 5.67   39.0% id   |
      	--------------------------------------------------
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [NET]: Generalise TSO-specific bits from skb_setup_caps · bcd76111
      Herbert Xu committed
      This patch generalises the TSO-specific bits from sk_setup_caps by adding
      the sk_gso_type member to struct sock.  This makes sk_setup_caps generic
      so that it can be used by TCPv6 or UFO.
      
      The only catch is that whoever uses this must provide a GSO implementation
      for their protocol which I think is a fair deal :) For now UFO continues to
      live without a GSO implementation which is OK since it doesn't use the sock
      caps field at the moment.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  13. 30 Jun, 2006 3 commits
    • [NET]: Add ECN support for TSO · b0da8537
      Michael Chan committed
      In the current TSO implementation, NETIF_F_TSO and ECN cannot be
      turned on together in a TCP connection.  The problem is that most
      hardware that supports TSO does not handle CWR correctly if it is set
      in the TSO packet.  Correct handling requires CWR to be set in the
      first packet only if it is set in the TSO header.
      
      This patch adds the ability to turn on NETIF_F_TSO and ECN using
      GSO if necessary to handle TSO packets with CWR set.  Hardware
      that handles CWR correctly can turn on NETIF_F_TSO_ECN in the dev->
      features flag.
      
      All TSO packets with CWR set will have the SKB_GSO_TCPV4_ECN set.  If
      the output device does not have the NETIF_F_TSO_ECN feature set, GSO
      will split the packet up correctly with CWR only set in the first
      segment.
      
      With help from Herbert Xu <herbert@gondor.apana.org.au>.
      
      Since ECN can always be enabled with TSO, the SOCK_NO_LARGESEND sock
      flag is completely removed.
      Signed-off-by: Michael Chan <mchan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
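The rule that CWR must appear only in the first segment can be illustrated with a small user-space sketch. The function name `split_tcp_flags` and the macro names are hypothetical; the bit values are the standard positions of CWR and ACK in the TCP flags byte. Real segmentation code also adjusts other flags (FIN and PSH, for example), which this sketch leaves out:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TCP_FLAG_CWR_BIT 0x80u  /* CWR in the TCP flags byte */
#define TCP_FLAG_ACK_BIT 0x10u  /* ACK in the TCP flags byte */

/*
 * Fill out[i] with the flags for segment i of a split TSO packet.
 * Every segment inherits the flags of the large packet, but CWR is
 * kept only on the first segment.
 */
static void split_tcp_flags(uint8_t tso_flags, uint8_t *out, size_t nsegs)
{
    for (size_t i = 0; i < nsegs; i++) {
        out[i] = tso_flags;
        if (i > 0)
            out[i] &= (uint8_t)~TCP_FLAG_CWR_BIT;
    }
}
```

Hardware that cannot apply this rule itself would leave NETIF_F_TSO_ECN unset, so that software performs the split for packets carrying CWR.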
    • [NET]: Fix logical error in skb_gso_ok · d6b4991a
      Herbert Xu committed
      The test in skb_gso_ok is backwards.  Noticed by Michael Chan
      <mchan@broadcom.com>.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Acked-by: Michael Chan <mchan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [NET]: Added GSO header verification · 576a30eb
      Herbert Xu committed
      When GSO packets come from an untrusted source (e.g., a Xen guest domain),
      we need to verify the header integrity before passing it to the hardware.
      
      Since the first step in GSO is to verify the header, we can reuse that
      code by adding a new bit to gso_type: SKB_GSO_DODGY.  Packets with this
      bit set can only be fed directly to devices with the corresponding bit
      NETIF_F_GSO_ROBUST.  If the device doesn't have that bit, then the skb
      is fed to the GSO engine which will allow the packet to be sent to the
      hardware if it passes the header check.
      
      This patch changes the sg flag to a full features flag.  The same method
      can be used to implement TSO ECN support.  We simply have to mark packets
      with CWR set with SKB_GSO_ECN so that only hardware with a corresponding
      NETIF_F_TSO_ECN can accept them.  The GSO engine can either fully segment
      the packet, or segment the first MTU and pass the rest to the hardware for
      further segmentation.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  14. 26 Jun, 2006 1 commit
  15. 23 Jun, 2006 3 commits
    • [NET]: Added GSO toggle · 37c3185a
      Herbert Xu committed
      This patch adds a generic segmentation offload toggle that can be turned
      on/off for each net device.  For now it only supports TCPv4.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [NET]: Add generic segmentation offload · f6a78bfc
      Herbert Xu committed
      This patch adds the infrastructure for generic segmentation offload.
      The idea is to tap into the potential savings of TSO without hardware
      support by postponing the allocation of segmented skb's until just
      before the entry point into the NIC driver.
      
      The same structure can be used to support software IPv6 TSO, as well as
      UFO and segmentation offload for other relevant protocols, e.g., DCCP.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
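The arithmetic behind late segmentation is simple: one large payload travels down the stack and is cut into MSS-sized pieces only at the last moment. The sketch below shows just that arithmetic; the function names are hypothetical and no skb handling is modelled:

```c
#include <assert.h>
#include <stddef.h>

/* Number of segments needed to carry payload_len bytes at mss bytes each. */
static size_t gso_segment_count(size_t payload_len, size_t mss)
{
    if (mss == 0)
        return 0;
    return (payload_len + mss - 1) / mss;
}

/* Length of segment idx (0-based); only the last segment may be short. */
static size_t gso_segment_len(size_t payload_len, size_t mss, size_t idx)
{
    size_t off = idx * mss;

    if (mss == 0 || off >= payload_len)
        return 0;
    return (payload_len - off < mss) ? payload_len - off : mss;
}
```

For a 4000-byte payload and an MSS of 1448, this yields three segments of 1448, 1448, and 1104 bytes.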
    • [NET]: Merge TSO/UFO fields in sk_buff · 7967168c
      Herbert Xu committed
      Having separate fields in sk_buff for TSO/UFO (tso_size/ufo_size) is not
      going to scale if we add any more segmentation methods (e.g., DCCP).  So
      let's merge them.
      
      They were used to tell the protocol of a packet.  This function has been
      subsumed by the new gso_type field.  This is essentially a set of netdev
      feature bits (shifted by 16 bits) that are required to process a specific
      skb.  As such it's easy to tell whether a given device can process a GSO
      skb: you just have to AND the gso_type field with the netdev's features
      field.
      
      I've made gso_type a conjunction.  The idea is that you have a base type
      (e.g., SKB_GSO_TCPV4) that can be modified further to support new features.
      For example, if we add a hardware TSO type that supports ECN, they would
      declare NETIF_F_TSO | NETIF_F_TSO_ECN.  All TSO packets with CWR set would
      have a gso_type of SKB_GSO_TCPV4 | SKB_GSO_TCPV4_ECN while all other TSO
      packets would be SKB_GSO_TCPV4.  This means that only the CWR packets need
      to be emulated in software.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
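The capability test described above amounts to one shift and one mask. The sketch below assumes hypothetical bit positions for the two gso_type values and a hypothetical helper name, `dev_can_offload`; only the 16-bit shift is taken from the commit text:

```c
#include <assert.h>
#include <stdint.h>

#define NETIF_F_GSO_SHIFT 16

/* Hypothetical bit positions for the gso_type values named in the text. */
enum {
    SKB_GSO_TCPV4     = 1 << 0,
    SKB_GSO_TCPV4_ECN = 1 << 1,
};

/* Device feature bits are the gso_type bits shifted up by 16. */
#define NETIF_F_TSO     ((uint32_t)SKB_GSO_TCPV4 << NETIF_F_GSO_SHIFT)
#define NETIF_F_TSO_ECN ((uint32_t)SKB_GSO_TCPV4_ECN << NETIF_F_GSO_SHIFT)

/* A device can take the skb directly only if it has every required bit. */
static int dev_can_offload(uint32_t dev_features, uint32_t gso_type)
{
    uint32_t required = gso_type << NETIF_F_GSO_SHIFT;

    return (dev_features & required) == required;
}
```

A device advertising only NETIF_F_TSO accepts plain SKB_GSO_TCPV4 packets but not those that also carry SKB_GSO_TCPV4_ECN, so only the CWR packets fall back to software.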
  16. 20 Jun, 2006 1 commit
    • [NET]: Prevent multiple qdisc runs · 48d83325
      Herbert Xu committed
      Having two or more qdisc_run's contend against each other is bad because
      it can induce packet reordering if the packets have to be requeued.  It
      appears that this is an unintended consequence of relinquishing the queue
      lock while transmitting.  That in turn is needed for devices that spend a
      lot of time in their transmit routine.
      
      There are no advantages to be had as devices with queues are inherently
      single-threaded (the loopback device is not but then it doesn't have a
      queue).
      
      Even if you were to add a queue to a parallel virtual device (e.g., bolt
      a tbf filter in front of an ipip tunnel device), you would still want to
      process the queue in sequence to ensure that the packets are ordered
      correctly.
      
      The solution here is to steal a bit from net_device to prevent this.
      
      BTW, as qdisc_restart is no longer used by anyone as a module inside the
      kernel (IIRC it used to with netif_wake_queue), I have not exported the
      new __qdisc_run function.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
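The "steal a bit" idea can be sketched in user space with a C11 atomic flag standing in for the state bit in net_device. Everything here (`struct qdisc_dev`, `QDISC_DEV_INIT`, `qdisc_run_once`) is hypothetical and only shows the test-and-set guard, not the actual dequeue loop:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical device: one flag plays the role of the "running" state bit. */
struct qdisc_dev {
    atomic_flag qdisc_running;
    int         runs;   /* how many times the queue was actually processed */
};

#define QDISC_DEV_INIT { ATOMIC_FLAG_INIT, 0 }

/*
 * Process the queue unless another run is already in progress.
 * Returns true if this caller did the work, false if it backed off.
 */
static bool qdisc_run_once(struct qdisc_dev *dev)
{
    if (atomic_flag_test_and_set(&dev->qdisc_running))
        return false;           /* someone else is running the queue */

    dev->runs++;                /* placeholder for dequeue and transmit */

    atomic_flag_clear(&dev->qdisc_running);
    return true;
}
```

Because only one caller at a time gets past the guard, packets are dequeued in order even though the queue lock is dropped during transmission.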
  17. 18 Jun, 2006 3 commits
    • [NET]: Add NETIF_F_GEN_CSUM and NETIF_F_ALL_CSUM · 8648b305
      Herbert Xu committed
      The current stack treats NETIF_F_HW_CSUM and NETIF_F_NO_CSUM
      identically so we test for them in quite a few places.  For the sake
      of brevity, I'm adding the macro NETIF_F_GEN_CSUM for these two.  We
      also test the disjunct of NETIF_F_IP_CSUM and the other two in various
      places, for that purpose I've added NETIF_F_ALL_CSUM.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
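The two macros described above are plain bit unions. In the sketch below the numeric flag values are written from memory of netdevice.h of that era and should be treated as assumptions; the two helper functions are hypothetical:

```c
#include <assert.h>

/* Feature bits; the numeric values are assumptions for illustration. */
#define NETIF_F_IP_CSUM 2   /* can checksum only TCP/UDP over IPv4 */
#define NETIF_F_NO_CSUM 4   /* does not need checksums at all */
#define NETIF_F_HW_CSUM 8   /* can checksum any packet */

/* The two flags the stack treats identically. */
#define NETIF_F_GEN_CSUM (NETIF_F_NO_CSUM | NETIF_F_HW_CSUM)
/* Any form of checksum offload. */
#define NETIF_F_ALL_CSUM (NETIF_F_IP_CSUM | NETIF_F_GEN_CSUM)

static int dev_has_generic_csum(unsigned long features)
{
    return (features & NETIF_F_GEN_CSUM) != 0;
}

static int dev_has_any_csum(unsigned long features)
{
    return (features & NETIF_F_ALL_CSUM) != 0;
}
```

A device with only NETIF_F_IP_CSUM passes the second test but not the first.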
    • [NET]: Add netif_tx_lock · 932ff279
      Herbert Xu committed
      Various drivers use xmit_lock internally to synchronise with their
      transmission routines.  They do so without setting xmit_lock_owner.
      This is fine as long as netpoll is not in use.
      
      With netpoll it is possible for deadlocks to occur if xmit_lock_owner
      isn't set.  This is because a printk that occurs while xmit_lock is held
      and xmit_lock_owner is not set can cause netpoll to attempt to take
      xmit_lock recursively.
      
      While it is possible to resolve this by getting netpoll to use
      trylock, it is suboptimal because netpoll's sole objective is to
      maximise the chance of getting the printk out on the wire.  So
      delaying or dropping the message is to be avoided as much as possible.
      
      So the only alternative is to always set xmit_lock_owner.  The
      following patch does this by introducing the netif_tx_lock family of
      functions that take care of setting/unsetting xmit_lock_owner.
      
      I renamed xmit_lock to _xmit_lock to indicate that it should not be
      used directly.  I didn't provide irq versions of the netif_tx_lock
      functions since xmit_lock is meant to be a BH-disabling lock.
      
      This is pretty much a straight text substitution except for a small
      bug fix in winbond.  It currently uses
      netif_stop_queue/spin_unlock_wait to stop transmission.  This is
      unsafe as an IRQ can potentially wake up the queue.  So it is safer to
      use netif_tx_disable.
      
      The hamradio bits used spin_lock_irq but it is unnecessary as
      xmit_lock must never be taken in an IRQ handler.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
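The point of the netif_tx_lock family is that taking the lock and recording its owner always happen together. A user-space sketch follows, with a toy spinlock built on a C11 atomic flag; `struct tx_dev`, the `_sketch` function names, and the explicit `cpu` argument are hypothetical (the kernel would use the current processor id), and the owner field is read without synchronisation purely for illustration:

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical device with a toy spinlock and an owner record. */
struct tx_dev {
    atomic_flag _xmit_lock;       /* not meant to be taken directly */
    int         xmit_lock_owner;  /* CPU id of the holder, -1 when free */
};

#define TX_DEV_INIT { ATOMIC_FLAG_INIT, -1 }

/* Take the lock and record who holds it, in one helper. */
static void netif_tx_lock_sketch(struct tx_dev *dev, int cpu)
{
    while (atomic_flag_test_and_set_explicit(&dev->_xmit_lock,
                                             memory_order_acquire))
        ;   /* spin until the lock is free */
    dev->xmit_lock_owner = cpu;
}

/* Clear the owner before releasing the lock. */
static void netif_tx_unlock_sketch(struct tx_dev *dev)
{
    dev->xmit_lock_owner = -1;
    atomic_flag_clear_explicit(&dev->_xmit_lock, memory_order_release);
}

/* A netpoll-like caller can check this to avoid taking the lock recursively. */
static int tx_lock_held_by(const struct tx_dev *dev, int cpu)
{
    return dev->xmit_lock_owner == cpu;
}
```

Since drivers go through the helpers rather than touching `_xmit_lock` directly, the owner record is always accurate when a printk fires on the same CPU.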
    • [I/OAT]: Setup the networking subsystem as a DMA client · db217334
      Chris Leech committed
      Attempts to allocate per-CPU DMA channels
      Signed-off-by: Chris Leech <christopher.leech@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  18. 11 May, 2006 1 commit
  19. 09 May, 2006 1 commit
  20. 07 May, 2006 1 commit
  21. 26 Apr, 2006 2 commits
  22. 30 Mar, 2006 1 commit
    • [NET]: Deinline some larger functions from netdevice.h · 56079431
      Denis Vlasenko committed
      On an allyesconfig'ured kernel:
      
      Size  Uses Wasted Name and definition
      ===== ==== ====== ================================================
         95  162  12075 netif_wake_queue      include/linux/netdevice.h
        129   86   9265 dev_kfree_skb_any     include/linux/netdevice.h
        127   56   5885 netif_device_attach   include/linux/netdevice.h
         73   86   4505 dev_kfree_skb_irq     include/linux/netdevice.h
         46   60   1534 netif_device_detach   include/linux/netdevice.h
        119   16   1485 __netif_rx_schedule   include/linux/netdevice.h
        143    5    492 netif_rx_schedule     include/linux/netdevice.h
         81    7    366 netif_schedule        include/linux/netdevice.h
      
      netif_wake_queue is big because __netif_schedule is a big inline:
      
      static inline void __netif_schedule(struct net_device *dev)
      {
              if (!test_and_set_bit(__LINK_STATE_SCHED, &dev->state)) {
                      unsigned long flags;
                      struct softnet_data *sd;
      
                      local_irq_save(flags);
                      sd = &__get_cpu_var(softnet_data);
                      dev->next_sched = sd->output_queue;
                      sd->output_queue = dev;
                      raise_softirq_irqoff(NET_TX_SOFTIRQ);
                      local_irq_restore(flags);
              }
      }
      
      static inline void netif_wake_queue(struct net_device *dev)
      {
      #ifdef CONFIG_NETPOLL_TRAP
              if (netpoll_trap())
                      return;
      #endif
              if (test_and_clear_bit(__LINK_STATE_XOFF, &dev->state))
                      __netif_schedule(dev);
      }
      
      By de-inlining __netif_schedule we are saving a lot of text
      at each callsite of netif_wake_queue and netif_schedule.
      __netif_rx_schedule is also big, and it makes more sense to keep
      both of them out of line.
      
      Patch also deinlines dev_kfree_skb_any. We can deinline dev_kfree_skb_irq
      instead... oh well.
      
      netif_device_attach/detach are not hot paths, we can deinline them too.
      Signed-off-by: Denis Vlasenko <vda@ilport.com.ua>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  23. 21 Mar, 2006 2 commits
  24. 01 Dec, 2005 1 commit
  25. 14 Nov, 2005 1 commit
  26. 11 Nov, 2005 1 commit
    • [NET]: Detect hardware rx checksum faults correctly · fb286bb2
      Herbert Xu committed
      Here is the patch that introduces the generic skb_checksum_complete
      which also checks for hardware RX checksum faults.  If that happens,
      it'll call netdev_rx_csum_fault which currently prints out a stack
      trace with the device name.  In future it can turn off RX checksum.
      
      I've converted every spot under net/ that does RX checksum checks to
      use skb_checksum_complete or __skb_checksum_complete with the
      exceptions of:
      
      * Those places where checksums are done bit by bit.  These will call
      netdev_rx_csum_fault directly.
      
      * The following have not been completely checked/converted:
      
      ipmr
      ip_vs
      netfilter
      dccp
      
      This patch is based on patches and suggestions from Stephen Hemminger
      and David S. Miller.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
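The fault check described above can be sketched as follows: software recomputes the Internet checksum, and if it verifies although the hardware reported a mismatch, the hardware is at fault. This is a user-space illustration on a byte buffer, not the kernel's skb-based code; all function names and the `fault` out-parameter are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Ones-complement sum of 16-bit big-endian words, folded to 16 bits.
 * Intended for short buffers; the 32-bit accumulator is not guarded
 * against overflow on very large inputs.
 */
static uint16_t csum_fold_sum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];
    if (i < len)
        sum += (uint32_t)data[i] << 8;      /* odd trailing byte, zero padded */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16); /* fold the carries back in */
    return (uint16_t)sum;
}

/*
 * Returns 0 if the buffer (checksum field included) verifies in software,
 * non-zero otherwise. If software verification passes although the
 * hardware flagged a mismatch, *fault is set to 1.
 */
static int checksum_complete_sketch(const uint8_t *data, size_t len,
                                    int hw_flagged_bad, int *fault)
{
    int bad = csum_fold_sum(data, len) != 0xffff;

    *fault = (!bad && hw_flagged_bad) ? 1 : 0;
    return bad;
}
```

In the kernel the fault path would call netdev_rx_csum_fault; here it is reduced to a flag the caller can inspect.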