1. 21 9月, 2012 1 次提交
    • E
      net: do not disable sg for packets requiring no checksum · c0d680e5
      Ed Cashin 提交于
      A change in a series of VLAN-related changes appears to have
      inadvertently disabled the use of the scatter gather feature of
      network cards for transmission of non-IP ethernet protocols like ATA
      over Ethernet (AoE).  Below is a reference to the commit that
      introduces a "harmonize_features" function that turns off scatter
      gather when the NIC does not support hardware checksumming for the
      ethernet protocol of an sk buff.
      
        commit f01a5236
        Author: Jesse Gross <jesse@nicira.com>
        Date:   Sun Jan 9 06:23:31 2011 +0000
      
            net offloading: Generalize netif_get_vlan_features().
      
      The can_checksum_protocol function is not equipped to consider a
      protocol that does not require checksumming.  Calling it for a
      protocol that requires no checksum is inappropriate.
      
      The patch below has harmonize_features call can_checksum_protocol when
      the protocol needs a checksum, so that the network layer is not forced
      to perform unnecessary skb linearization on the transmission of AoE
      packets.  Unnecessary linearization results in decreased performance
      and increased memory pressure, as reported here:
      
        http://www.spinics.net/lists/linux-mm/msg15184.html
      
      The problem has probably not been widely experienced yet, because
      only recently has the kernel.org-distributed aoe driver acquired the
      ability to use payloads of over a page in size, with the patchset
      recently included in the mm tree:
      
        https://lkml.org/lkml/2012/8/28/140
      
      The coraid.com-distributed aoe driver already could use payloads of
      greater than a page in size, but its users generally do not use the
      newest kernels.
      Signed-off-by: NEd Cashin <ecashin@coraid.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c0d680e5
  2. 19 9月, 2012 1 次提交
  3. 17 9月, 2012 3 次提交
  4. 09 9月, 2012 1 次提交
    • C
      net: small bug on rxhash calculation · 68622342
      Chema Gonzalez 提交于
      In the current rxhash calculation function, while the
      sorting of the ports/addrs is coherent (you get the
      same rxhash for packets sharing the same 4-tuple, in
      both directions), ports and addrs are sorted
      independently. This implies packets from a connection
      between the same addresses but crossed ports hash to
      the same rxhash.
      
      For example, traffic between A=S:l and B=L:s is hashed
      (in both directions) from {L, S, {s, l}}. The same
      rxhash is obtained for packets between C=S:s and D=L:l.
      
      This patch ensures that you either swap both addrs and ports,
      or you swap none. Traffic between A and B, and traffic
      between C and D, get their rxhash from different sources
      ({L, S, {l, s}} for A<->B, and {L, S, {s, l}} for C<->D)
      
      The patch is co-written with Eric Dumazet <edumazet@google.com>
      Signed-off-by: NChema Gonzalez <chema@google.com>
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      68622342
  5. 20 8月, 2012 2 次提交
  6. 09 8月, 2012 1 次提交
  7. 02 8月, 2012 1 次提交
    • B
      net: Allow driver to limit number of GSO segments per skb · 30b678d8
      Ben Hutchings 提交于
      A peer (or local user) may cause TCP to use a nominal MSS of as little
      as 88 (actual MSS of 76 with timestamps).  Given that we have a
      sufficiently prodigious local sender and the peer ACKs quickly enough,
      it is nevertheless possible to grow the window for such a connection
      to the point that we will try to send just under 64K at once.  This
      results in a single skb that expands to 861 segments.
      
      In some drivers with TSO support, such an skb will require hundreds of
      DMA descriptors; a substantial fraction of a TX ring or even more than
      a full ring.  The TX queue selected for the skb may stall and trigger
      the TX watchdog repeatedly (since the problem skb will be retried
      after the TX reset).  This particularly affects sfc, for which the
      issue is designated as CVE-2012-3412.
      
      Therefore:
      1. Add the field net_device::gso_max_segs holding the device-specific
         limit.
      2. In netif_skb_features(), if the number of segments is too high then
         mask out GSO features to force fall back to software GSO.
      Signed-off-by: NBen Hutchings <bhutchings@solarflare.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      30b678d8
  8. 01 8月, 2012 1 次提交
    • M
      netvm: set PF_MEMALLOC as appropriate during SKB processing · b4b9e355
      Mel Gorman 提交于
      In order to make sure pfmemalloc packets receive all memory needed to
      proceed, ensure processing of pfmemalloc SKBs happens under PF_MEMALLOC.
      This is limited to a subset of protocols that are expected to be used for
      writing to swap.  Taps are not allowed to use PF_MEMALLOC as these are
      expected to communicate with userspace processes which could be paged out.
      
      [a.p.zijlstra@chello.nl: Ideas taken from various patches]
      [jslaby@suse.cz: Lock imbalance fix]
      Signed-off-by: NMel Gorman <mgorman@suse.de>
      Acked-by: NDavid S. Miller <davem@davemloft.net>
      Cc: Neil Brown <neilb@suse.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Christie <michaelc@cs.wisc.edu>
      Cc: Eric B Munson <emunson@mgebm.net>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Christoph Lameter <cl@linux.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b4b9e355
  9. 24 7月, 2012 1 次提交
  10. 23 7月, 2012 1 次提交
  11. 19 7月, 2012 1 次提交
  12. 15 7月, 2012 1 次提交
  13. 11 7月, 2012 2 次提交
  14. 10 7月, 2012 1 次提交
  15. 05 7月, 2012 1 次提交
  16. 29 6月, 2012 1 次提交
  17. 16 6月, 2012 1 次提交
  18. 13 6月, 2012 1 次提交
    • M
      net-next: add dev_loopback_xmit() to avoid duplicate code · 95603e22
      Michel Machado 提交于
      Add dev_loopback_xmit() in order to deduplicate functions
      ip_dev_loopback_xmit() (in net/ipv4/ip_output.c) and
      ip6_dev_loopback_xmit() (in net/ipv6/ip6_output.c).
      
      I was about to reinvent the wheel when I noticed that
      ip_dev_loopback_xmit() and ip6_dev_loopback_xmit() do exactly what I
      need and are not IP-only functions, but they were not available to reuse
      elsewhere.
      
      ip6_dev_loopback_xmit() does not have line "skb_dst_force(skb);", but I
      understand that this is harmless, and should be in dev_loopback_xmit().
      Signed-off-by: NMichel Machado <michel@digirati.com.br>
      CC: "David S. Miller" <davem@davemloft.net>
      CC: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
      CC: James Morris <jmorris@namei.org>
      CC: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
      CC: Patrick McHardy <kaber@trash.net>
      CC: Eric Dumazet <edumazet@google.com>
      CC: Jiri Pirko <jpirko@redhat.com>
      CC: "Michał Mirosław" <mirq-linux@rere.qmqm.pl>
      CC: Ben Hutchings <bhutchings@solarflare.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      95603e22
  19. 19 5月, 2012 1 次提交
  20. 16 5月, 2012 2 次提交
  21. 11 5月, 2012 1 次提交
  22. 01 5月, 2012 1 次提交
    • E
      net: make GRO aware of skb->head_frag · d7e8883c
      Eric Dumazet 提交于
      GRO can check if skb to be merged has its skb->head mapped to a page
      fragment, instead of a kmalloc() area.
      
      We 'upgrade' skb->head as a fragment in itself
      
      This avoids the frag_list fallback, and permits to build true GRO skb
      (one sk_buff and up to 16 fragments), using less memory.
      
      This reduces number of cache misses when user makes its copy, since a
      single sk_buff is fetched.
      
      This is a followup of patch "net: allow skb->head to be a page fragment"
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Maciej Żenczykowski <maze@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Cc: Tom Herbert <therbert@google.com>
      Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      Cc: Ben Hutchings <bhutchings@solarflare.com>
      Cc: Matt Carlson <mcarlson@broadcom.com>
      Cc: Michael Chan <mchan@broadcom.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d7e8883c
  23. 20 4月, 2012 1 次提交
  24. 16 4月, 2012 1 次提交
  25. 13 4月, 2012 1 次提交
  26. 04 4月, 2012 1 次提交
  27. 29 3月, 2012 1 次提交
  28. 28 3月, 2012 1 次提交
    • B
      net/core: dev_forward_skb() should clear skb_iif · 3b9785c6
      Benjamin LaHaise 提交于
      While investigating another bug, I found that the code on the incoming path
      in __netif_receive_skb will only set skb->skb_iif if it is already 0.  When
      dev_forward_skb() is used in the case of interfaces like veth, skb_iif may
      already have been set.  Making dev_forward_skb() cause the packet to look
      like a newly received packet would seem to the the correct behaviour here,
      as otherwise the wrong incoming interface can be reported for such a packet.
      Signed-off-by: NBenjamin LaHaise <bcrl@kvack.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3b9785c6
  29. 22 3月, 2012 1 次提交
  30. 07 3月, 2012 1 次提交
  31. 06 3月, 2012 1 次提交
  32. 24 2月, 2012 1 次提交
    • I
      static keys: Introduce 'struct static_key', static_key_true()/false() and... · c5905afb
      Ingo Molnar 提交于
      static keys: Introduce 'struct static_key', static_key_true()/false() and static_key_slow_[inc|dec]()
      
      So here's a boot tested patch on top of Jason's series that does
      all the cleanups I talked about and turns jump labels into a
      more intuitive to use facility. It should also address the
      various misconceptions and confusions that surround jump labels.
      
      Typical usage scenarios:
      
              #include <linux/static_key.h>
      
              struct static_key key = STATIC_KEY_INIT_TRUE;
      
              if (static_key_false(&key))
                      do unlikely code
              else
                      do likely code
      
      Or:
      
              if (static_key_true(&key))
                      do likely code
              else
                      do unlikely code
      
      The static key is modified via:
      
              static_key_slow_inc(&key);
              ...
              static_key_slow_dec(&key);
      
      The 'slow' prefix makes it abundantly clear that this is an
      expensive operation.
      
      I've updated all in-kernel code to use this everywhere. Note
      that I (intentionally) have not pushed through the rename
      blindly through to the lowest levels: the actual jump-label
      patching arch facility should be named like that, so we want to
      decouple jump labels from the static-key facility a bit.
      
      On non-jump-label enabled architectures static keys default to
      likely()/unlikely() branches.
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Acked-by: NJason Baron <jbaron@redhat.com>
      Acked-by: NSteven Rostedt <rostedt@goodmis.org>
      Cc: a.p.zijlstra@chello.nl
      Cc: mathieu.desnoyers@efficios.com
      Cc: davem@davemloft.net
      Cc: ddaney.cavm@gmail.com
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/20120222085809.GA26397@elte.huSigned-off-by: NIngo Molnar <mingo@elte.hu>
      c5905afb
  33. 09 2月, 2012 2 次提交
    • E
      gro: more generic L2 header check · 5ca3b72c
      Eric Dumazet 提交于
      Shlomo Pongratz reported GRO L2 header check was suited for Ethernet
      only, and failed on IB/ipoib traffic.
      
      He provided a patch faking a zeroed header to let GRO aggregates frames.
      
      Roland Dreier, Herbert Xu, and others suggested we change GRO L2 header
      check to be more generic, ie not assuming L2 header is 14 bytes, but
      taking into account hard_header_len.
      
      __napi_gro_receive() has special handling for the common case (Ethernet)
      to avoid a memcmp() call and use an inline optimized function instead.
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Reported-by: NShlomo Pongratz <shlomop@mellanox.com>
      Cc: Roland Dreier <roland@kernel.org>
      Cc: Or Gerlitz <ogerlitz@mellanox.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Tested-by: NSean Hefty <sean.hefty@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5ca3b72c
    • E
      gro: more generic L2 header check · 43480aec
      Eric Dumazet 提交于
      Shlomo Pongratz reported GRO L2 header check was suited for Ethernet
      only, and failed on IB/ipoib traffic.
      
      He provided a patch faking a zeroed header to let GRO aggregates frames.
      
      Roland Dreier, Herbert Xu, and others suggested we change GRO L2 header
      check to be more generic, ie not assuming L2 header is 14 bytes, but
      taking into account hard_header_len.
      
      __napi_gro_receive() has special handling for the common case (Ethernet)
      to avoid a memcmp() call and use an inline optimized function instead.
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Reported-by: NShlomo Pongratz <shlomop@mellanox.com>
      Cc: Roland Dreier <roland@kernel.org>
      Cc: Or Gerlitz <ogerlitz@mellanox.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Tested-by: NSean Hefty <sean.hefty@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      43480aec
  34. 02 2月, 2012 1 次提交