1. 23 8月, 2012 1 次提交
    • E
      net: remove delay at device dismantle · 0115e8e3
      Eric Dumazet 提交于
      I noticed extra one second delay in device dismantle, tracked down to
      a call to dst_dev_event() while some call_rcu() are still in RCU queues.
      
      These call_rcu() were posted by rt_free(struct rtable *rt) calls.
      
      We then wait a little (but one second) in netdev_wait_allrefs() before
      kicking again NETDEV_UNREGISTER.
      
      As the call_rcu() are now completed, dst_dev_event() can do the needed
      device swap on busy dst.
      
      To solve this problem, add a new NETDEV_UNREGISTER_FINAL, called
      after a rcu_barrier(), but outside of RTNL lock.
      
      Use NETDEV_UNREGISTER_FINAL with care !
      
      Change dst_dev_event() handler to react to NETDEV_UNREGISTER_FINAL
      
      Also remove NETDEV_UNREGISTER_BATCH, as its not used anymore after
      IP cache removal.
      
      With help from Gao feng
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Tom Herbert <therbert@google.com>
      Cc: Mahesh Bandewar <maheshb@google.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Gao feng <gaofeng@cn.fujitsu.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0115e8e3
  2. 20 8月, 2012 2 次提交
  3. 15 8月, 2012 2 次提交
  4. 10 8月, 2012 2 次提交
  5. 09 8月, 2012 1 次提交
  6. 02 8月, 2012 1 次提交
    • B
      net: Allow driver to limit number of GSO segments per skb · 30b678d8
      Ben Hutchings 提交于
      A peer (or local user) may cause TCP to use a nominal MSS of as little
      as 88 (actual MSS of 76 with timestamps).  Given that we have a
      sufficiently prodigious local sender and the peer ACKs quickly enough,
      it is nevertheless possible to grow the window for such a connection
      to the point that we will try to send just under 64K at once.  This
      results in a single skb that expands to 861 segments.
      
      In some drivers with TSO support, such an skb will require hundreds of
      DMA descriptors; a substantial fraction of a TX ring or even more than
      a full ring.  The TX queue selected for the skb may stall and trigger
      the TX watchdog repeatedly (since the problem skb will be retried
      after the TX reset).  This particularly affects sfc, for which the
      issue is designated as CVE-2012-3412.
      
      Therefore:
      1. Add the field net_device::gso_max_segs holding the device-specific
         limit.
      2. In netif_skb_features(), if the number of segments is too high then
         mask out GSO features to force fall back to software GSO.
      Signed-off-by: NBen Hutchings <bhutchings@solarflare.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      30b678d8
  7. 01 8月, 2012 1 次提交
    • M
      netvm: set PF_MEMALLOC as appropriate during SKB processing · b4b9e355
      Mel Gorman 提交于
      In order to make sure pfmemalloc packets receive all memory needed to
      proceed, ensure processing of pfmemalloc SKBs happens under PF_MEMALLOC.
      This is limited to a subset of protocols that are expected to be used for
      writing to swap.  Taps are not allowed to use PF_MEMALLOC as these are
      expected to communicate with userspace processes which could be paged out.
      
      [a.p.zijlstra@chello.nl: Ideas taken from various patches]
      [jslaby@suse.cz: Lock imbalance fix]
      Signed-off-by: NMel Gorman <mgorman@suse.de>
      Acked-by: NDavid S. Miller <davem@davemloft.net>
      Cc: Neil Brown <neilb@suse.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Christie <michaelc@cs.wisc.edu>
      Cc: Eric B Munson <emunson@mgebm.net>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Christoph Lameter <cl@linux.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b4b9e355
  8. 24 7月, 2012 1 次提交
  9. 23 7月, 2012 1 次提交
  10. 19 7月, 2012 1 次提交
  11. 15 7月, 2012 1 次提交
  12. 11 7月, 2012 2 次提交
  13. 10 7月, 2012 1 次提交
  14. 05 7月, 2012 1 次提交
  15. 29 6月, 2012 1 次提交
  16. 16 6月, 2012 1 次提交
  17. 13 6月, 2012 1 次提交
    • M
      net-next: add dev_loopback_xmit() to avoid duplicate code · 95603e22
      Michel Machado 提交于
      Add dev_loopback_xmit() in order to deduplicate functions
      ip_dev_loopback_xmit() (in net/ipv4/ip_output.c) and
      ip6_dev_loopback_xmit() (in net/ipv6/ip6_output.c).
      
      I was about to reinvent the wheel when I noticed that
      ip_dev_loopback_xmit() and ip6_dev_loopback_xmit() do exactly what I
      need and are not IP-only functions, but they were not available to reuse
      elsewhere.
      
      ip6_dev_loopback_xmit() does not have line "skb_dst_force(skb);", but I
      understand that this is harmless, and should be in dev_loopback_xmit().
      Signed-off-by: NMichel Machado <michel@digirati.com.br>
      CC: "David S. Miller" <davem@davemloft.net>
      CC: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
      CC: James Morris <jmorris@namei.org>
      CC: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
      CC: Patrick McHardy <kaber@trash.net>
      CC: Eric Dumazet <edumazet@google.com>
      CC: Jiri Pirko <jpirko@redhat.com>
      CC: "Michał Mirosław" <mirq-linux@rere.qmqm.pl>
      CC: Ben Hutchings <bhutchings@solarflare.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      95603e22
  18. 19 5月, 2012 1 次提交
  19. 16 5月, 2012 2 次提交
  20. 11 5月, 2012 1 次提交
  21. 01 5月, 2012 1 次提交
    • E
      net: make GRO aware of skb->head_frag · d7e8883c
      Eric Dumazet 提交于
      GRO can check if skb to be merged has its skb->head mapped to a page
      fragment, instead of a kmalloc() area.
      
      We 'upgrade' skb->head as a fragment in itself
      
      This avoids the frag_list fallback, and permits to build true GRO skb
      (one sk_buff and up to 16 fragments), using less memory.
      
      This reduces number of cache misses when user makes its copy, since a
      single sk_buff is fetched.
      
      This is a followup of patch "net: allow skb->head to be a page fragment"
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Maciej Żenczykowski <maze@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Cc: Tom Herbert <therbert@google.com>
      Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      Cc: Ben Hutchings <bhutchings@solarflare.com>
      Cc: Matt Carlson <mcarlson@broadcom.com>
      Cc: Michael Chan <mchan@broadcom.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d7e8883c
  22. 20 4月, 2012 1 次提交
  23. 16 4月, 2012 1 次提交
  24. 13 4月, 2012 1 次提交
  25. 04 4月, 2012 1 次提交
  26. 29 3月, 2012 1 次提交
  27. 28 3月, 2012 1 次提交
    • B
      net/core: dev_forward_skb() should clear skb_iif · 3b9785c6
      Benjamin LaHaise 提交于
      While investigating another bug, I found that the code on the incoming path
      in __netif_receive_skb will only set skb->skb_iif if it is already 0.  When
      dev_forward_skb() is used in the case of interfaces like veth, skb_iif may
      already have been set.  Making dev_forward_skb() cause the packet to look
      like a newly received packet would seem to the the correct behaviour here,
      as otherwise the wrong incoming interface can be reported for such a packet.
      Signed-off-by: NBenjamin LaHaise <bcrl@kvack.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3b9785c6
  28. 22 3月, 2012 1 次提交
  29. 07 3月, 2012 1 次提交
  30. 06 3月, 2012 1 次提交
  31. 24 2月, 2012 1 次提交
    • I
      static keys: Introduce 'struct static_key', static_key_true()/false() and... · c5905afb
      Ingo Molnar 提交于
      static keys: Introduce 'struct static_key', static_key_true()/false() and static_key_slow_[inc|dec]()
      
      So here's a boot tested patch on top of Jason's series that does
      all the cleanups I talked about and turns jump labels into a
      more intuitive to use facility. It should also address the
      various misconceptions and confusions that surround jump labels.
      
      Typical usage scenarios:
      
              #include <linux/static_key.h>
      
              struct static_key key = STATIC_KEY_INIT_TRUE;
      
              if (static_key_false(&key))
                      do unlikely code
              else
                      do likely code
      
      Or:
      
              if (static_key_true(&key))
                      do likely code
              else
                      do unlikely code
      
      The static key is modified via:
      
              static_key_slow_inc(&key);
              ...
              static_key_slow_dec(&key);
      
      The 'slow' prefix makes it abundantly clear that this is an
      expensive operation.
      
      I've updated all in-kernel code to use this everywhere. Note
      that I (intentionally) have not pushed through the rename
      blindly through to the lowest levels: the actual jump-label
      patching arch facility should be named like that, so we want to
      decouple jump labels from the static-key facility a bit.
      
      On non-jump-label enabled architectures static keys default to
      likely()/unlikely() branches.
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Acked-by: NJason Baron <jbaron@redhat.com>
      Acked-by: NSteven Rostedt <rostedt@goodmis.org>
      Cc: a.p.zijlstra@chello.nl
      Cc: mathieu.desnoyers@efficios.com
      Cc: davem@davemloft.net
      Cc: ddaney.cavm@gmail.com
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/20120222085809.GA26397@elte.huSigned-off-by: NIngo Molnar <mingo@elte.hu>
      c5905afb
  32. 09 2月, 2012 2 次提交
    • E
      gro: more generic L2 header check · 5ca3b72c
      Eric Dumazet 提交于
      Shlomo Pongratz reported GRO L2 header check was suited for Ethernet
      only, and failed on IB/ipoib traffic.
      
      He provided a patch faking a zeroed header to let GRO aggregates frames.
      
      Roland Dreier, Herbert Xu, and others suggested we change GRO L2 header
      check to be more generic, ie not assuming L2 header is 14 bytes, but
      taking into account hard_header_len.
      
      __napi_gro_receive() has special handling for the common case (Ethernet)
      to avoid a memcmp() call and use an inline optimized function instead.
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Reported-by: NShlomo Pongratz <shlomop@mellanox.com>
      Cc: Roland Dreier <roland@kernel.org>
      Cc: Or Gerlitz <ogerlitz@mellanox.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Tested-by: NSean Hefty <sean.hefty@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5ca3b72c
    • E
      gro: more generic L2 header check · 43480aec
      Eric Dumazet 提交于
      Shlomo Pongratz reported GRO L2 header check was suited for Ethernet
      only, and failed on IB/ipoib traffic.
      
      He provided a patch faking a zeroed header to let GRO aggregates frames.
      
      Roland Dreier, Herbert Xu, and others suggested we change GRO L2 header
      check to be more generic, ie not assuming L2 header is 14 bytes, but
      taking into account hard_header_len.
      
      __napi_gro_receive() has special handling for the common case (Ethernet)
      to avoid a memcmp() call and use an inline optimized function instead.
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Reported-by: NShlomo Pongratz <shlomop@mellanox.com>
      Cc: Roland Dreier <roland@kernel.org>
      Cc: Or Gerlitz <ogerlitz@mellanox.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Tested-by: NSean Hefty <sean.hefty@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      43480aec
  33. 02 2月, 2012 1 次提交
  34. 18 1月, 2012 1 次提交