1. 20 Dec, 2008 1 commit
  2. 18 Dec, 2008 3 commits
  3. 17 Dec, 2008 1 commit
  4. 16 Dec, 2008 11 commits
    • Y
      ipv6: Add IPV6_PKTINFO sticky option support to setsockopt() · b24a2516
      Committed by Yang Hongyang
      There are three reasons for me to add this support:
      1. When no interface is specified in an IPV6_PKTINFO ancillary data
         item, the interface specified in an IPV6_PKTINFO sticky option
         is used.
      
      RFC 3542:
      6.7.  Summary of Outgoing Interface Selection
      
         This document and [RFC-3493] specify various methods that affect the
         selection of the packet's outgoing interface.  This subsection
         summarizes the ordering among those in order to ensure deterministic
         behavior.
      
         For a given outgoing packet on a given socket, the outgoing interface
         is determined in the following order:
      
         1. if an interface is specified in an IPV6_PKTINFO ancillary data
            item, the interface is used.
      
         2. otherwise, if an interface is specified in an IPV6_PKTINFO sticky
            option, the interface is used.
      
      2. When no IPV6_PKTINFO ancillary data is received, getsockopt() should
         return the sticky option value that was set with setsockopt().
      
      RFC 3542:
         Issuing getsockopt() for the above options will return the sticky
         option value i.e., the value set with setsockopt().  If no sticky
         option value has been set getsockopt() will return the following
         values:
      
      3. Make the setsockopt implementation POSIX compliant.
      Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
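As a rough illustration of the sticky option described in this commit, the sketch below sets IPV6_PKTINFO on a socket with setsockopt() in the way RFC 3542 describes. It assumes Linux with glibc (where `struct in6_pktinfo` is exposed under `_GNU_SOURCE`); the helper name `set_sticky_pktinfo` is hypothetical and not part of the patch.

```c
#define _GNU_SOURCE             /* glibc exposes struct in6_pktinfo under this */
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: set the sticky IPV6_PKTINFO option on an IPv6 socket.
 * ifindex names the outgoing interface (0 leaves the choice to the kernel);
 * the source address is left unspecified (::).
 * Returns 0 on success, -1 on failure with errno set by setsockopt().
 */
static int set_sticky_pktinfo(int fd, unsigned int ifindex)
{
    struct in6_pktinfo pi;

    assert(fd >= 0);
    memset(&pi, 0, sizeof(pi));          /* ipi6_addr = unspecified address */
    pi.ipi6_ifindex = ifindex;
    return setsockopt(fd, IPPROTO_IPV6, IPV6_PKTINFO, &pi, sizeof(pi));
}
```

Per the ordering quoted from RFC 3542 section 6.7, an IPV6_PKTINFO ancillary data item passed to sendmsg() would still take precedence over a value set this way.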
    • S
      net: Refactor full duplex flow control resolution · bc02ff95
      Committed by Steve Glendinning
      These 4 drivers have identical full duplex flow control resolution
      functions.  This patch changes them all to use one common function.
      
      The function in question decides whether a device should enable TX and
      RX flow control in a standard way (IEEE 802.3-2005 table 28B-3), so this
      should also be useful for other drivers.
      Signed-off-by: Steve Glendinning <steve.glendinning@smsc.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
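The resolution rule referred to here (IEEE 802.3-2005 table 28B-3) can be sketched as a small pure function over the local and link-partner pause advertisements. This is a userspace illustration, not the kernel helper itself; the function name is hypothetical, and the bit values are assumed to mirror the pause bits and flow control flags in the kernel's mii.h.

```c
#include <assert.h>
#include <stdint.h>

/* Pause bits of the autonegotiation advertisement register
 * (values assumed to match ADVERTISE_PAUSE_* in the kernel's mii.h). */
#define ADV_PAUSE_CAP   0x0400u   /* symmetric pause */
#define ADV_PAUSE_ASYM  0x0800u   /* asymmetric pause */

/* Result flags; the names follow the commit, the values are an assumption. */
#define FLOW_CTRL_TX    0x01u
#define FLOW_CTRL_RX    0x02u

/* Resolve full duplex flow control from both ends' advertisements. */
static uint8_t resolve_flowctrl_fdx(uint16_t lcladv, uint16_t rmtadv)
{
    uint8_t cap = 0;

    if (lcladv & rmtadv & ADV_PAUSE_CAP) {
        /* Both sides support symmetric pause. */
        cap = FLOW_CTRL_TX | FLOW_CTRL_RX;
    } else if (lcladv & rmtadv & ADV_PAUSE_ASYM) {
        /* Both sides advertise asymmetric pause; direction depends
         * on which side also advertises symmetric pause. */
        if (lcladv & ADV_PAUSE_CAP)
            cap = FLOW_CTRL_RX;
        else if (rmtadv & ADV_PAUSE_CAP)
            cap = FLOW_CTRL_TX;
    }
    return cap;
}
```

Keeping the rule in one shared function means a driver only has to read the two advertisement registers and apply the returned flags.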
    • S
      net: Move flow control definitions to mii.h · e18ce346
      Committed by Steve Glendinning
      Flags used within drivers to indicate TX and RX flow control are
      defined in 4 drivers (and probably more); move these constants to mii.h.
      
      The 3 SMSC drivers use the same constants (FLOW_CTRL_TX), but TG3 uses
      TG3_FLOW_CTRL_TX, so this patch also renames the constants within TG3.
      Signed-off-by: Steve Glendinning <steve.glendinning@smsc.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • P
      netfilter: ctnetlink: fix missing CTA_NAT_SEQ_UNSPEC · 092cab7e
      Committed by Pablo Neira Ayuso
      This patch fixes an inconsistency in nfnetlink_conntrack.h that
      I introduced myself. The problem is that CTA_NAT_SEQ_UNSPEC is
      missing from enum ctattr_natseq. This inconsistency may lead to
      problems in the message parsing in userspace (if the message
      contains the CTA_NAT_SEQ_* attributes, of course).
      
      This patch breaks backward compatibility; however, the only known
      client of this code is libnetfilter_conntrack which indeed crashes
      because it assumes the existence of CTA_NAT_SEQ_UNSPEC to do
      the parsing.
      
      The CTA_NAT_SEQ_* attributes were introduced in 2.6.25.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
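To see why a missing UNSPEC entry matters, the sketch below compares two enum layouts. All identifiers are hypothetical stand-ins rather than the real header's names, and the member list is only illustrative. It shows that adding the placeholder moves every real attribute number up by one, so a parser and a sender built against different layouts would disagree about which attribute is which.

```c
#include <assert.h>

/* Layout without the UNSPEC placeholder: the first real attribute is 0. */
enum natseq_without_unspec {
    OLD_NAT_SEQ_CORRECTION_POS,   /* 0 */
    OLD_NAT_SEQ_OFFSET_BEFORE,    /* 1 */
    OLD_NAT_SEQ_OFFSET_AFTER,     /* 2 */
    OLD_NAT_SEQ_END               /* one past the last attribute */
};

/* Layout with the placeholder: every real attribute moves up by one. */
enum natseq_with_unspec {
    NEW_NAT_SEQ_UNSPEC,           /* 0, reserved */
    NEW_NAT_SEQ_CORRECTION_POS,   /* 1 */
    NEW_NAT_SEQ_OFFSET_BEFORE,    /* 2 */
    NEW_NAT_SEQ_OFFSET_AFTER,     /* 3 */
    NEW_NAT_SEQ_END               /* one past the last attribute */
};

/* Highest valid attribute number in each layout. */
#define OLD_NAT_SEQ_MAX (OLD_NAT_SEQ_END - 1)
#define NEW_NAT_SEQ_MAX (NEW_NAT_SEQ_END - 1)

/* How far the same attribute moved between the two layouts. */
static int natseq_shift(void)
{
    return (int)NEW_NAT_SEQ_CORRECTION_POS - (int)OLD_NAT_SEQ_CORRECTION_POS;
}
```

This is also why the fix cannot be made without breaking backward compatibility: the numeric values seen by userspace change.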
    • H
      ethtool: Add GGRO and SGRO ops · b240a0e5
      Committed by Herbert Xu
      This patch adds the ethtool ops to enable and disable GRO.  It also
      makes GRO depend on RX checksum offload much the same as how TSO
      depends on SG support.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
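The dependency described here can be modelled in a few lines. The sketch below is a toy, not the ethtool code: `struct toy_dev`, `FEAT_GRO` and `toy_set_gro` are hypothetical names, and returning -EINVAL when RX checksum offload is off is an assumption about how such a refusal might be reported.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define FEAT_GRO 0x1u           /* hypothetical stand-in for the GRO feature bit */

struct toy_dev {
    unsigned int features;      /* feature bitmask */
    bool rx_csum;               /* is RX checksum offload enabled? */
};

/* Enable or disable GRO; enabling is refused without RX checksum offload. */
static int toy_set_gro(struct toy_dev *dev, bool enable)
{
    assert(dev != NULL);

    if (enable) {
        if (!dev->rx_csum)
            return -EINVAL;     /* assumed error code for the refusal */
        dev->features |= FEAT_GRO;
    } else {
        dev->features &= ~FEAT_GRO;
    }
    return 0;
}
```

Disabling is always allowed in this model; only the enable path checks the prerequisite.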
    • H
      tcp: Add GRO support · bf296b12
      Committed by Herbert Xu
      This patch adds the TCP-specific portion of GRO.  The criterion for
      merging is extremely strict (the TCP header must match exactly apart
      from the checksum) so as to allow refragmentation.  Otherwise this
      is pretty much identical to LRO, except that we support the merging
      of ECN packets.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • H
      net: Add skb_gro_receive · 71d93b39
      Committed by Herbert Xu
      This patch adds the helper skb_gro_receive to merge packets for
      GRO.  The current method is to allocate a new header skb and then
      chain the original packets to its frag_list.  This is done to
      make it easier to integrate into the existing GSO framework.
      
      In future as GSO is moved into the drivers, we can undo this and
      simply chain the original packets together.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • H
      ipv4: Add GRO infrastructure · 73cc19f1
      Committed by Herbert Xu
      This patch adds GRO support for IPv4.
      
      The criteria for merging are more stringent than LRO; in particular,
      we require all fields in the IP header to be identical except for
      the length, ID and checksum.  In addition, the ID must form an
      arithmetic sequence with a difference of one.
      
      The ID requirement might seem overly strict; however, most hardware
      TSO solutions already obey this rule.  Linux itself also obeys this
      whether GSO is in use or not.
      
      In future we could relax this rule by storing the IDs (or rather
      making sure that we don't drop them when pulling the aggregate
      skb's tail).
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
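The merge criteria stated in this commit can be sketched as a header comparison. The code below is a simplified userspace model rather than the kernel implementation: `struct toy_iphdr` is a hypothetical header kept in host byte order with no IP options, and `toy_ip_can_merge` is an illustrative name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified IPv4 header: host byte order, no options (both assumptions). */
struct toy_iphdr {
    uint8_t  version_ihl;
    uint8_t  tos;
    uint16_t tot_len;    /* allowed to differ */
    uint16_t id;         /* must advance by one per packet */
    uint16_t frag_off;
    uint8_t  ttl;
    uint8_t  protocol;
    uint16_t check;      /* allowed to differ */
    uint32_t saddr;
    uint32_t daddr;
};

/*
 * held:   header of the first packet in the held aggregate
 * merged: number of packets already merged into it (at least 1)
 * next:   header of the newly arrived packet
 */
static bool toy_ip_can_merge(const struct toy_iphdr *held, unsigned int merged,
                             const struct toy_iphdr *next)
{
    assert(held != NULL && next != NULL && merged >= 1);

    /* Everything except length, ID and checksum must be identical. */
    if (held->version_ihl != next->version_ihl ||
        held->tos != next->tos ||
        held->frag_off != next->frag_off ||
        held->ttl != next->ttl ||
        held->protocol != next->protocol ||
        held->saddr != next->saddr ||
        held->daddr != next->daddr)
        return false;

    /* IDs must form an arithmetic sequence with difference one
     * (computed modulo 2^16 so that wrap-around still matches). */
    return (uint16_t)(held->id + merged) == next->id;
}
```

Because only the first ID and a packet count are needed, the aggregate does not have to store the individual IDs, which is the trade-off the last paragraph of the commit message refers to.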
    • H
      net: Add Generic Receive Offload infrastructure · d565b0a1
      Committed by Herbert Xu
      This patch adds the top-level GRO (Generic Receive Offload) infrastructure.
      This is pretty similar to LRO except that this is protocol-independent.
      Instead of holding packets in an lro_mgr structure, they're now held in
      napi_struct.
      
      Drivers that intend to use this can set the NETIF_F_GRO bit and
      call napi_gro_receive instead of netif_receive_skb or just call netif_rx.
      The latter will call napi_receive_skb automatically.  When napi_gro_receive
      is used, the driver must either call napi_complete/napi_rx_complete, or
      call napi_gro_flush in softirq context if the driver uses the primitives
      __napi_complete/__napi_rx_complete.
      
      Protocols will set the gro_receive and gro_complete function pointers in
      order to participate in this scheme.
      
      In addition to the packet, gro_receive will get a list of currently held
      packets.  Each packet in the list has a same_flow field which is non-zero
      if it is a potential match for the new packet.  Each potentially matching
      packet also has a flush field which is non-zero if the held packet
      must not be merged with the new packet.
      
      Once gro_receive has determined that the new skb matches a held packet,
      the held packet may be processed immediately if the new skb cannot be
      merged with it.  In this case gro_receive should return the pointer to
      the existing skb in gro_list.  Otherwise the new skb should be merged into
      the existing packet and NULL should be returned, unless the new skb makes
      it impossible for any further merges to be made (e.g., FIN packet) where
      the merged skb should be returned.
      
      Whenever the skb is merged into an existing entry, the gro_receive
      function should set NAPI_GRO_CB(skb)->same_flow.  Note that if an skb
      merely matches an existing entry but can't be merged with it, then
      this shouldn't be set.
      
      If gro_receive finds it pointless to hold the new skb for future merging,
      it should set NAPI_GRO_CB(skb)->flush.
      
      Held packets will be flushed by napi_gro_flush which is called by
      napi_complete and napi_rx_complete.
      
      Currently held packets are stored in a singly linked list just like LRO.
      The list is limited to a maximum of 8 entries.  In future, this may be
      expanded to use a hash table to allow more flows to be held for merging.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
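The hold, merge and flush cycle described in this commit can be approximated with a toy model. Everything in the sketch below is hypothetical: the `toy_` types and functions are not kernel APIs, a flow is reduced to an integer, and the handling of a full list (pass the packet up unheld) is an assumption. Only the limit of 8 held entries comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_HELD 8          /* list limit stated in the commit message */

struct toy_pkt {
    int  flow;                  /* stand-in for whatever identifies a flow */
    int  len;
    bool flush;                 /* set when holding or merging is pointless */
};

struct toy_held {
    int flow;
    int len;                    /* total merged length */
    int count;                  /* number of packets merged */
};

struct toy_gro {
    struct toy_held held[TOY_MAX_HELD];
    int n;
};

enum toy_verdict { TOY_MERGED, TOY_HELD, TOY_PASSED_UP };

static enum toy_verdict toy_gro_receive(struct toy_gro *g, const struct toy_pkt *p)
{
    assert(g != NULL && p != NULL);

    for (int i = 0; i < g->n; i++) {
        struct toy_held *h = &g->held[i];

        if (h->flow != p->flow)
            continue;                   /* not the same flow */
        if (p->flush) {
            /* Matches but cannot be merged: release the held entry. */
            *h = g->held[--g->n];
            return TOY_PASSED_UP;
        }
        h->len += p->len;               /* merge into the held entry */
        h->count++;
        return TOY_MERGED;
    }

    if (p->flush || g->n == TOY_MAX_HELD)
        return TOY_PASSED_UP;           /* nothing to gain from holding it */

    g->held[g->n++] = (struct toy_held){ p->flow, p->len, 1 };
    return TOY_HELD;
}

/* Release everything still held; returns how many entries were released. */
static int toy_gro_flush(struct toy_gro *g)
{
    int released = g->n;

    g->n = 0;
    return released;
}
```

In this model `toy_gro_flush` plays the role the commit assigns to napi_gro_flush at the end of a poll cycle. A hash table, as suggested for the future, would replace the linear scan in `toy_gro_receive`.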
    • H
      net: Add frag_list support to GSO · 1a881f27
      Committed by Herbert Xu
      This patch allows GSO to handle frag_list in a limited way for the
      purposes of allowing packets merged by GRO to be refragmented on
      output.
      
      Most hardware won't (and isn't expected to) support handling GRO
      frag_list packets directly.  Therefore we will perform GSO in
      software for those cases.
      
      However, for drivers that can support it (such as virtual NICs) we
      may not have to segment the packets at all.
      
      Whether the added overhead of GRO/GSO is worthwhile for bridges
      and routers when weighed against the benefit of potentially
      increasing the MTU within the host is still an open question.
      However, for the case of host nodes this is undoubtedly a win.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • R
      Define smp_call_function_many for UP · d2ff9118
      Committed by Rusty Russell
      Otherwise those using it in transition patches (eg. kvm) can't compile
      with CONFIG_SMP=n:
      
      arch/x86/kvm/../../../virt/kvm/kvm_main.c: In function 'make_all_cpus_request':
      arch/x86/kvm/../../../virt/kvm/kvm_main.c:380: error: implicit declaration of function 'smp_call_function_many'
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
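The general pattern behind this kind of fix is a configuration-dependent stub, sketched below in userspace C. The names (`TOY_CONFIG_SMP`, `toy_call_function_many`, `toy_bump`) are hypothetical and the kernel's actual definition may differ. The sketch assumes the call targets only CPUs other than the caller, so on a uniprocessor build there is nothing to run.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*toy_smp_fn)(void *info);

#ifdef TOY_CONFIG_SMP
/* A multiprocessor build would supply a real cross-CPU implementation. */
void toy_call_function_many(const unsigned long *mask, toy_smp_fn fn,
                            void *info, int wait);
#else
/* Uniprocessor build: there are no other CPUs, so this is a no-op.
 * Providing the symbol lets shared callers compile in both configurations. */
static inline void toy_call_function_many(const unsigned long *mask,
                                          toy_smp_fn fn, void *info, int wait)
{
    (void)mask;
    (void)fn;
    (void)info;
    (void)wait;
}
#endif

/* Example callback: counts how many times it was invoked. */
static void toy_bump(void *info)
{
    ++*(int *)info;
}
```

With the stub in place, a caller needs no `#ifdef` of its own, which is what the compile error quoted in the commit message was missing.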
  5. 13 Dec, 2008 6 commits
  6. 11 Dec, 2008 15 commits
  7. 10 Dec, 2008 1 commit
    • N
      netpoll: fix race on poll_list resulting in garbage entry · 7b363e44
      Committed by Neil Horman
      A few months back a race was discussed between the netpoll napi service
      path, and the fast path through net_rx_action:
      http://kerneltrap.org/mailarchive/linux-netdev/2007/10/16/345470
      
      A patch was submitted for that bug, but I think we missed a case.
      
      Consider the following scenario:
      
      INITIAL STATE
      CPU0 has one napi_struct A on its poll_list
      CPU1 is calling netpoll_send_skb and needs to call poll_napi on the same
      napi_struct A that CPU0 has on its list
      
      
      
      CPU0						CPU1
      net_rx_action					poll_napi
      !list_empty (returns true)			locks poll_lock for A
      						 poll_one_napi
      						  napi->poll
      						   netif_rx_complete
      						    __napi_complete
      						    (removes A from poll_list)
      list_entry(list->next)
      
      
      In the above scenario, net_rx_action assumes that the per-cpu poll_list is
      exclusive to that cpu.  netpoll of course violates that, and because the netpoll
      path can dequeue from the poll list, it's possible for CPU0 to detect a non-empty
      list at the top of the while loop in net_rx_action, but have it become empty by
      the time it calls list_entry.  Since the poll_list isn't surrounded by any other
      structure, the returned data from that list_entry call in this situation is
      garbage, and any number of crashes can result based on what exactly that garbage
      is.
      
      Given that it's not feasible for performance reasons to place exclusive locks
      around each cpu's poll list to provide that mutual exclusion, I think the best
      solution is to modify the netpoll path in such a way that we continue to guarantee
      that the poll_list for a cpu is in fact exclusive to that cpu.  To do this I've
      implemented the patch below.  It adds an additional bit to the state field in
      the napi_struct.  When executing napi->poll from the netpoll path, this bit will
      be set. When a driver calls netif_rx_complete, if that bit is set, it will not
      remove the napi_struct from the poll_list.  That work will be saved for the next
      iteration of net_rx_action.
      
      I've tested this and it seems to work well.  About the biggest drawback I can
      see to it is the fact that it might result in an extra loop through
      net_rx_action in the event that the device is actually contended for (i.e. the
      netpoll path actually performs all the needed work on the device, and the call
      to net_rx_action winds up doing nothing, except removing the napi_struct from
      the poll_list).  However I think this is probably a small price to pay, given
      that the alternative is a crash.
      Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
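The fix described in this commit comes down to one extra state bit that the completion path checks. The single-threaded sketch below models only that logic. The `toy_` names and bit values are hypothetical, and real code would need atomic bit operations and the existing poll_lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum {
    TOY_STATE_SCHED   = 1 << 0,   /* scheduled for polling */
    TOY_STATE_NETPOLL = 1 << 1    /* currently serviced by the netpoll path */
};

struct toy_napi {
    unsigned int state;
    bool on_poll_list;            /* stands in for poll_list membership */
};

/* What a driver calls when it has no more work to do. */
static void toy_rx_complete(struct toy_napi *n)
{
    assert(n != NULL);

    if (n->state & TOY_STATE_NETPOLL)
        return;                   /* leave the list to its owning CPU */

    n->on_poll_list = false;
    n->state &= ~(unsigned int)TOY_STATE_SCHED;
}

/* netpoll path: mark the napi context while its poll routine runs. */
static void toy_netpoll_poll(struct toy_napi *n, void (*poll)(struct toy_napi *))
{
    assert(n != NULL && poll != NULL);

    n->state |= TOY_STATE_NETPOLL;
    poll(n);
    n->state &= ~(unsigned int)TOY_STATE_NETPOLL;
}
```

The entry skipped here is removed on the next pass through the owning CPU's receive loop, which is the extra iteration the commit message accepts as the cost of avoiding the crash.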
  8. 09 Dec, 2008 2 commits