1. 21 Oct, 2010 3 commits
    • napi: unexport napi_reuse_skb · d0c2b0d2
      stephen hemminger committed
      The function napi_reuse_skb is only used inside core.
      Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • vlan: Centralize handling of hardware acceleration. · 3701e513
      Jesse Gross committed
      Currently each driver that is capable of vlan hardware acceleration
      must be aware of the vlan groups that are configured and then pass
      the stripped tag to a specialized receive function.  This is
      different from other types of hardware offload in that it places a
      significant amount of knowledge in the driver itself rather than keeping
      it in the networking core.
      
      This makes vlan offloading function more similarly to other forms
      of offloading (such as checksum offloading or TSO) by doing the
      following:
      * On receive, stripped vlans are passed directly to the network
      core, without attempting to check for vlan groups or reconstructing
      the header if no group
      * vlans are made less special by folding the logic into the main
      receive routines
      * On transmit, the device layer will add the vlan header in software
      if the hardware doesn't support it, instead of spreading that logic
      out in upper layers, such as bonding.
      
      There are a number of advantages to this:
      * Fixes all bugs with drivers incorrectly dropping vlan headers at once.
      * Avoids having to disable VLAN acceleration when in promiscuous mode
      (good for bridging since it always puts devices in promiscuous mode).
      * Keeps VLAN tag separate until given to ultimate consumer, which
      avoids needing to do header reconstruction as in tg3 unless absolutely
      necessary.
      * Consolidates common code in core networking.
      Signed-off-by: Jesse Gross <jesse@nicira.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • vlan: Enable software emulation for vlan acceleration. · 7b9c6090
      Jesse Gross committed
      Currently users of hardware vlan acceleration need to know whether
      the device supports it before generating packets.  However, vlan
      acceleration will soon be available in a more flexible manner so
      knowing ahead of time becomes much more difficult.  This adds
      a software fallback path for vlan packets on devices without the
      necessary offloading support, similar to other types of hardware
      acceleration.
      Signed-off-by: Jesse Gross <jesse@nicira.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
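The software fallback described in this commit can be illustrated with a small user-space sketch. It is not the kernel code: `toy_dev`, `toy_pkt`, `toy_insert_tag` and `toy_xmit_prepare` are hypothetical names, and the packet is a fixed buffer rather than an skb. It only shows the idea that the tag travels out of band and is written into the frame in software when the device cannot insert it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN    6       /* bytes in a MAC address */
#define VLAN_HLEN   4       /* bytes added by an 802.1Q tag */
#define ETH_P_8021Q 0x8100  /* 802.1Q tag protocol identifier */

/* Hypothetical device: only records whether hardware can insert tags. */
struct toy_dev {
    bool hw_vlan_tx;
};

/* Hypothetical packet carrying its VLAN tag out of band. */
struct toy_pkt {
    uint8_t  data[64];
    size_t   len;
    bool     tag_present;
    uint16_t vlan_tci;
};

/* Write the 802.1Q header after the destination and source MAC addresses.
 * Returns 0 on success, -1 if the frame is too short or the buffer is full. */
static int toy_insert_tag(struct toy_pkt *p)
{
    const size_t off = 2 * ETH_ALEN;

    if (p->len < off || p->len + VLAN_HLEN > sizeof(p->data))
        return -1;

    /* Shift the EtherType and payload right to make room for the tag. */
    memmove(p->data + off + VLAN_HLEN, p->data + off, p->len - off);
    p->data[off]     = (uint8_t)(ETH_P_8021Q >> 8);
    p->data[off + 1] = (uint8_t)(ETH_P_8021Q & 0xff);
    p->data[off + 2] = (uint8_t)(p->vlan_tci >> 8);
    p->data[off + 3] = (uint8_t)(p->vlan_tci & 0xff);
    p->len += VLAN_HLEN;
    p->tag_present = false;  /* the tag now lives in the frame itself */
    return 0;
}

/* Fall back to software tagging only when the device lacks the offload. */
static int toy_xmit_prepare(const struct toy_dev *dev, struct toy_pkt *p)
{
    if (p->tag_present && !dev->hw_vlan_tx)
        return toy_insert_tag(p);
    return 0;
}
```

With this shape, callers would not need to know in advance whether the device supports the offload: they attach the tag and let the transmit path decide.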
  2. 20 Oct, 2010 3 commits
  3. 13 Oct, 2010 1 commit
    • net: percpu net_device refcount · 29b4433d
      Eric Dumazet committed
      We tried very hard to remove all possible dev_hold()/dev_put() pairs in
      network stack, using RCU conversions.
      
      There is still an unavoidable device refcount change for every dst we
      create/destroy, and this can slow down some workloads (routers or some
      app servers, mmap af_packet)
      
      We can switch to a percpu refcount implementation, now that the dynamic
      per_cpu infrastructure is mature. On a 64-CPU machine, this consumes
      256 bytes per device.
      
      On x86, dev_hold(dev) code:

      before:
              lock    incl 0x280(%ebx)
      after:
              movl    0x260(%ebx),%eax
              incl    fs:(%eax)
      
      Stress bench:

      (Sending 160,000,000 UDP frames,
      IP route cache disabled, dual E5540 @2.53GHz,
      32bit kernel, FIB_TRIE)
      
      Before:
      
      real    1m1.662s
      user    0m14.373s
      sys     12m55.960s
      
      After:
      
      real    0m51.179s
      user    0m15.329s
      sys     10m15.942s
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
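The per-CPU refcount idea can be modelled in a few lines of plain C. This is a single-threaded sketch under stated assumptions, not the kernel implementation: `toy_refdev`, `toy_dev_hold`, `toy_dev_put` and `toy_refcnt_read` are hypothetical names, the CPU index is passed in explicitly, and a fixed array stands in for dynamically allocated per-CPU memory. With 64 slots of a 4-byte `int`, the array matches the 256 bytes per device quoted above.

```c
#include <stddef.h>

#define TOY_NR_CPUS 64

/* Hypothetical device with one counter slot per CPU. */
struct toy_refdev {
    int pcpu_refcnt[TOY_NR_CPUS];
};

/* Take a reference: touch only the slot of the calling CPU,
 * so no shared cache line or locked instruction is needed. */
static void toy_dev_hold(struct toy_refdev *dev, unsigned int cpu)
{
    dev->pcpu_refcnt[cpu]++;
}

/* Drop a reference, possibly on a different CPU than the hold. */
static void toy_dev_put(struct toy_refdev *dev, unsigned int cpu)
{
    dev->pcpu_refcnt[cpu]--;
}

/* The true refcount is the sum over all slots; a single slot
 * may legitimately be negative. */
static int toy_refcnt_read(const struct toy_refdev *dev)
{
    int sum = 0;

    for (size_t i = 0; i < TOY_NR_CPUS; i++)
        sum += dev->pcpu_refcnt[i];
    return sum;
}
```

The trade-off in this design is that hold and put become cheap, while reading the total requires a walk over every slot, which is acceptable when the total is only needed rarely (for example, while waiting for a device to be released).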
  4. 09 Oct, 2010 2 commits
  5. 07 Oct, 2010 1 commit
  6. 06 Oct, 2010 1 commit
    • net: add a core netdev->rx_dropped counter · caf586e5
      Eric Dumazet committed
      In various situations, a device provides a packet to our stack and we
      drop it before it enters the protocol stack:
      - softnet backlog full (accounted in /proc/net/softnet_stat)
      - bad vlan tag (not accounted)
      - unknown/unregistered protocol (not accounted)
      
      We can handle a per-device counter of such dropped frames at core level,
      and automatically add it to the device-provided stats (rx_dropped), so
      that standard tools can be used (ifconfig, ip link, cat /proc/net/dev).
      
      This is a generalization of commit 8990f468 (net: rx_dropped
      accounting), thus reverting it.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
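A minimal sketch of the accounting scheme described above, assuming a simplified device structure: the core keeps its own drop counter and folds it into the driver-provided statistics when they are read. `toy_rxdev`, `toy_stats`, `toy_core_drop` and `toy_get_stats` are hypothetical names used only for illustration.

```c
#include <stdint.h>

/* Hypothetical subset of the statistics a driver reports. */
struct toy_stats {
    uint64_t rx_packets;
    uint64_t rx_dropped;
};

/* Hypothetical device: driver-owned stats plus a core-owned drop counter. */
struct toy_rxdev {
    struct toy_stats drv_stats;
    uint64_t core_rx_dropped;
};

/* Called by the core when it discards a frame before the protocol stack
 * sees it (backlog full, bad vlan tag, unknown protocol). */
static void toy_core_drop(struct toy_rxdev *dev)
{
    dev->core_rx_dropped++;
}

/* Report a snapshot in which core drops are added to the driver's own
 * rx_dropped, so existing tools see a single combined value. */
static struct toy_stats toy_get_stats(const struct toy_rxdev *dev)
{
    struct toy_stats s = dev->drv_stats;

    s.rx_dropped += dev->core_rx_dropped;
    return s;
}
```

Keeping the core counter separate from the driver's field means drivers need no changes, and the sum is computed only when statistics are requested.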
  7. 05 Oct, 2010 1 commit
  8. 30 Sep, 2010 2 commits
  9. 28 Sep, 2010 1 commit
  10. 27 Sep, 2010 2 commits
  11. 18 Sep, 2010 1 commit
  12. 17 Sep, 2010 1 commit
  13. 16 Sep, 2010 2 commits
  14. 15 Sep, 2010 1 commit
  15. 09 Sep, 2010 1 commit
  16. 08 Sep, 2010 1 commit
    • net: fix tx queue selection for bridged devices implementing select_queue · deabc772
      Helmut Schaa committed
      When a net device is implementing the select_queue callback and is part of
      a bridge, frames coming from the bridge already have a tx queue associated
      to the socket (introduced in commit a4ee3ce3,
      "net: Use sk_tx_queue_mapping for connected sockets"). The call to
      sk_tx_queue_get will then return the tx queue used by the bridge instead
      of calling the select_queue callback.
      
      In case of mac80211 this broke QoS which is implemented by using the
      select_queue callback. Furthermore it introduced problems with rt2x00
      because frames with the same TID and RA sometimes appeared on different
      tx queues which the hw cannot handle correctly.
      
      Fix this by always calling select_queue first if it is available and only
      afterwards use the socket tx queue mapping.
      Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
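The ordering fix described above can be sketched as follows. This is an illustrative model, not the kernel's transmit path: `toy_txdev`, `toy_sock`, `toy_qos_select` and `toy_pick_tx` are hypothetical names, and the final hash-based fallback is an assumption added so the function always returns a valid queue.

```c
#include <stddef.h>
#include <stdint.h>

struct toy_txdev;

/* Hypothetical driver callback that chooses a tx queue for a frame. */
typedef unsigned int (*toy_select_queue_t)(const struct toy_txdev *dev,
                                           uint32_t hash);

struct toy_txdev {
    unsigned int       num_tx_queues;
    toy_select_queue_t select_queue;  /* NULL if the driver has none */
};

/* Hypothetical socket with a cached tx queue, -1 when unset. */
struct toy_sock {
    int tx_queue_mapping;
};

/* Example callback: pretend the low two bits of the hash encode a
 * QoS class that maps directly onto one of four queues. */
static unsigned int toy_qos_select(const struct toy_txdev *dev, uint32_t hash)
{
    (void)dev;
    return hash & 3;
}

/* Ask the driver first; use the socket's cached queue only when the
 * driver has no select_queue callback. */
static unsigned int toy_pick_tx(const struct toy_txdev *dev,
                                const struct toy_sock *sk, uint32_t hash)
{
    unsigned int q;

    if (dev->select_queue)
        q = dev->select_queue(dev, hash);
    else if (sk && sk->tx_queue_mapping >= 0)
        q = (unsigned int)sk->tx_queue_mapping;
    else
        q = dev->num_tx_queues ? hash % dev->num_tx_queues : 0;

    /* Never return a queue index the device does not have. */
    if (q >= dev->num_tx_queues)
        q = 0;
    return q;
}
```

With the checks in the opposite order, a queue cached on the socket by another device (such as the bridge) would override the driver's own policy, which is the behaviour the commit message reports as breaking QoS.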
  17. 03 Sep, 2010 1 commit
  18. 02 Sep, 2010 1 commit
  19. 27 Aug, 2010 1 commit
    • gro: __napi_gro_receive() optimizations · 40d0802b
      Eric Dumazet committed
      compare_ether_header() can have a special implementation on 64 bit
      arches if CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS is defined.
      
      __napi_gro_receive() and vlan_gro_common() can avoid a conditional
      branch to perform device match.
      
      On x86_64, __napi_gro_receive() now has 38 instructions instead of 53.

      As gcc-4.4.3 still chooses not to inline it, add the inline keyword to
      this performance-critical function.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      CC: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
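The 64-bit comparison trick mentioned above can be sketched in portable C. A 14-byte Ethernet header is covered by two overlapping 8-byte loads (bytes 0 to 7 and bytes 6 to 13), and the XOR results are OR-ed together, so no per-field branch is needed. `toy_compare_ether_header` is a hypothetical name, and `memcpy` is used here in place of direct unaligned loads so the sketch stays well-defined on any platform.

```c
#include <stdint.h>
#include <string.h>

/* Returns 0 when the two 14-byte Ethernet headers are identical and a
 * nonzero value otherwise. Both buffers must hold at least 14 bytes. */
static uint64_t toy_compare_ether_header(const void *a, const void *b)
{
    const uint8_t *pa = a;
    const uint8_t *pb = b;
    uint64_t a0, a1, b0, b1;

    /* First window: bytes 0..7. */
    memcpy(&a0, pa, sizeof(a0));
    memcpy(&b0, pb, sizeof(b0));
    /* Second window: bytes 6..13, overlapping the first by two bytes. */
    memcpy(&a1, pa + 6, sizeof(a1));
    memcpy(&b1, pb + 6, sizeof(b1));

    /* Any differing bit in either window makes the result nonzero. */
    return (a0 ^ b0) | (a1 ^ b1);
}
```

The overlap is harmless: bytes 6 and 7 are simply compared twice, which costs less than a third load or a branch.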
  20. 23 Aug, 2010 2 commits
  21. 22 Aug, 2010 1 commit
  22. 20 Aug, 2010 3 commits
  23. 19 Aug, 2010 1 commit
  24. 18 Aug, 2010 1 commit
  25. 17 Aug, 2010 1 commit
    • core: Factor out flow calculation from get_rps_cpu · bfb564e7
      Krishna Kumar committed
      Factor out flow calculation code from get_rps_cpu, since other
      functions can use the same code.
      
      Revisions:
      
      v2 (Ben): Separate flow calculation out and use in select queue.
      v3 (Arnd): Don't re-implement MIN.
      v4 (Changli): skb->data points to ethernet header in macvtap, and
      	make a fast path. Tested macvtap with this patch.
      v5 (Changli):
      	- Cache skb->rxhash in skb_get_rxhash
      	- macvtap may not have pow(2) queues, so change code for
      	  queue selection.
          (Arnd):
      	- Use first available queue if all fails.
      Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
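Two points from the revision notes, caching the flow hash on the packet and choosing a queue when the queue count is not a power of two, can be sketched as below. This is an illustration under stated assumptions: `toy_flow_pkt`, `toy_mix`, `toy_get_rxhash` and `toy_hash_to_queue` are hypothetical names, the mixing function is a placeholder rather than the hash the kernel uses, and the multiply-and-shift scaling is one common way to map a 32-bit hash onto an arbitrary range, not necessarily the exact code in this commit.

```c
#include <stdint.h>

/* Hypothetical packet: flow fields plus a cached hash (0 = not computed). */
struct toy_flow_pkt {
    uint32_t saddr, daddr;
    uint16_t sport, dport;
    uint32_t rxhash;
};

/* Placeholder mixing function; any reasonable 32-bit hash would do. */
static uint32_t toy_mix(uint32_t a, uint32_t b, uint32_t c)
{
    uint32_t h = a * 2654435761u;

    h ^= (b + 0x9e3779b9u) + (h << 6) + (h >> 2);
    h ^= (c + 0x9e3779b9u) + (h << 6) + (h >> 2);
    return h;
}

/* Compute the flow hash once and reuse it on later calls. */
static uint32_t toy_get_rxhash(struct toy_flow_pkt *pkt)
{
    if (!pkt->rxhash) {
        uint32_t ports = ((uint32_t)pkt->sport << 16) | pkt->dport;
        uint32_t h = toy_mix(pkt->saddr, pkt->daddr, ports);

        pkt->rxhash = h ? h : 1;  /* keep 0 reserved for "not computed" */
    }
    return pkt->rxhash;
}

/* Scale a 32-bit hash into [0, nqueues) without requiring nqueues to be
 * a power of two: (hash * nqueues) / 2^32. */
static unsigned int toy_hash_to_queue(uint32_t hash, unsigned int nqueues)
{
    return (unsigned int)(((uint64_t)hash * nqueues) >> 32);
}
```

A mask such as `hash & (n - 1)` only works when `n` is a power of two; the scaled form covers any queue count at the cost of one 64-bit multiply.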
  26. 08 Aug, 2010 1 commit
  27. 06 Aug, 2010 1 commit
  28. 03 Aug, 2010 2 commits