1. 01 8月, 2012 1 次提交
    • M
      netvm: allow skb allocation to use PFMEMALLOC reserves · c93bdd0e
      Mel Gorman 提交于
      Change the skb allocation API to indicate RX usage and use this to fall
      back to the PFMEMALLOC reserve when needed.  SKBs allocated from the
      reserve are tagged in skb->pfmemalloc.  If an SKB is allocated from the
      reserve and the socket is later found to be unrelated to page reclaim, the
      packet is dropped so that the memory remains available for page reclaim.
      Network protocols are expected to recover from this packet loss.
      
      [a.p.zijlstra@chello.nl: Ideas taken from various patches]
      [davem@davemloft.net: Use static branches, coding style corrections]
      [sebastian@breakpoint.cc: Avoid unnecessary cast, fix !CONFIG_NET build]
      Signed-off-by: NMel Gorman <mgorman@suse.de>
      Acked-by: NDavid S. Miller <davem@davemloft.net>
      Cc: Neil Brown <neilb@suse.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Christie <michaelc@cs.wisc.edu>
      Cc: Eric B Munson <emunson@mgebm.net>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Christoph Lameter <cl@linux.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c93bdd0e
  2. 23 7月, 2012 2 次提交
  3. 19 7月, 2012 1 次提交
  4. 16 7月, 2012 1 次提交
  5. 13 7月, 2012 1 次提交
    • A
      net: Update alloc frag to reduce get/put page usage and recycle pages · 540eb7bf
      Alexander Duyck 提交于
      This patch is meant to help improve performance by reducing the number of
      locked operations required to allocate a frag on x86 and other platforms.
      This is accomplished by using atomic_set operations on the page count
      instead of calling get_page and put_page.  It is based on work originally
      provided by Eric Dumazet.
      
      In addition it also helps to reduce memory overhead when using TCP.  This
      is done by recycling the page if the only holder of the frame is the
      netdev_alloc_frag call itself.  This can occur when skb heads are stolen by
      either GRO or TCP and the driver providing the packets is using paged frags
      to store all of the data for the packets.
      
      Cc: Eric Dumazet <edumazet@google.com>
      Signed-off-by: NAlexander Duyck <alexander.h.duyck@intel.com>
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      540eb7bf
  6. 11 7月, 2012 1 次提交
  7. 14 6月, 2012 1 次提交
    • E
      splice: fix racy pipe->buffers uses · 047fe360
      Eric Dumazet 提交于
      Dave Jones reported a kernel BUG at mm/slub.c:3474! triggered
      by splice_shrink_spd() called from vmsplice_to_pipe()
      
      commit 35f3d14d (pipe: add support for shrinking and growing pipes)
      added capability to adjust pipe->buffers.
      
      Problem is some paths don't hold pipe mutex and assume pipe->buffers
      doesn't change for their duration.
      
      Fix this by adding nr_pages_max field in struct splice_pipe_desc, and
      use it in place of pipe->buffers where appropriate.
      
      splice_shrink_spd() loses its struct pipe_inode_info argument.
      Reported-by: NDave Jones <davej@redhat.com>
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Tom Herbert <therbert@google.com>
      Cc: stable <stable@vger.kernel.org> # 2.6.35
      Tested-by: NDave Jones <davej@redhat.com>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      047fe360
  8. 09 6月, 2012 1 次提交
  9. 08 6月, 2012 1 次提交
  10. 20 5月, 2012 1 次提交
    • E
      net: introduce skb_try_coalesce() · bad43ca8
      Eric Dumazet 提交于
      Move tcp_try_coalesce() protocol independent part to
      skb_try_coalesce().
      
      skb_try_coalesce() can be used in IPv4 defrag and IPv6 reassembly,
      to build optimized skbs (less sk_buff, and possibly less 'headers')
      
      skb_try_coalesce() is zero copy, unless the copy can fit in destination
      header (its a rare case)
      
      kfree_skb_partial() is also moved to net/core/skbuff.c and exported,
      because IPv6 will need it in patch (ipv6: use skb coalescing in
      reassembly).
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Alexander Duyck <alexander.h.duyck@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      bad43ca8
  11. 19 5月, 2012 1 次提交
    • E
      net: introduce netdev_alloc_frag() · 6f532612
      Eric Dumazet 提交于
      Fix two issues introduced in commit a1c7fff7
      ( net: netdev_alloc_skb() use build_skb() )
      
      - Must be IRQ safe (non NAPI drivers can use it)
      - Must not leak the frag if build_skb() fails to allocate sk_buff
      
      This patch introduces netdev_alloc_frag() for drivers willing to
      use build_skb() instead of __netdev_alloc_skb() variants.
      
      Factorize code so that :
      __dev_alloc_skb() is a wrapper around __netdev_alloc_skb(), and
      dev_alloc_skb() a wrapper around netdev_alloc_skb()
      
      Use __GFP_COLD flag.
      
      Almost all network drivers now benefit from skb->head_frag
      infrastructure.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6f532612
  12. 18 5月, 2012 1 次提交
  13. 17 5月, 2012 1 次提交
  14. 16 5月, 2012 1 次提交
  15. 07 5月, 2012 3 次提交
  16. 04 5月, 2012 2 次提交
  17. 03 5月, 2012 1 次提交
  18. 01 5月, 2012 3 次提交
    • E
      net: makes skb_splice_bits() aware of skb->head_frag · 1d0c0b32
      Eric Dumazet 提交于
      __skb_splice_bits() can check if skb to be spliced has its skb->head
      mapped to a page fragment, instead of a kmalloc() area.
      
      If so we can avoid a copy of the skb head and get a reference on
      underlying page.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Maciej Żenczykowski <maze@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Cc: Tom Herbert <therbert@google.com>
      Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      Cc: Ben Hutchings <bhutchings@solarflare.com>
      Cc: Matt Carlson <mcarlson@broadcom.com>
      Cc: Michael Chan <mchan@broadcom.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1d0c0b32
    • E
      net: make GRO aware of skb->head_frag · d7e8883c
      Eric Dumazet 提交于
      GRO can check if skb to be merged has its skb->head mapped to a page
      fragment, instead of a kmalloc() area.
      
      We 'upgrade' skb->head as a fragment in itself
      
      This avoids the frag_list fallback, and permits to build true GRO skb
      (one sk_buff and up to 16 fragments), using less memory.
      
      This reduces number of cache misses when user makes its copy, since a
      single sk_buff is fetched.
      
      This is a followup of patch "net: allow skb->head to be a page fragment"
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Maciej Żenczykowski <maze@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Cc: Tom Herbert <therbert@google.com>
      Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      Cc: Ben Hutchings <bhutchings@solarflare.com>
      Cc: Matt Carlson <mcarlson@broadcom.com>
      Cc: Michael Chan <mchan@broadcom.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d7e8883c
    • E
      net: allow skb->head to be a page fragment · d3836f21
      Eric Dumazet 提交于
      skb->head is currently allocated from kmalloc(). This is convenient but
      has the drawback the data cannot be converted to a page fragment if
      needed.
      
      We have three spots were it hurts :
      
      1) GRO aggregation
      
       When a linear skb must be appended to another skb, GRO uses the
      frag_list fallback, very inefficient since we keep all struct sk_buff
      around. So drivers enabling GRO but delivering linear skbs to network
      stack aren't enabling full GRO power.
      
      2) splice(socket -> pipe).
      
       We must copy the linear part to a page fragment.
       This kind of defeats splice() purpose (zero copy claim)
      
      3) TCP coalescing.
      
       Recently introduced, this permits to group several contiguous segments
      into a single skb. This shortens queue lengths and save kernel memory,
      and greatly reduce probabilities of TCP collapses. This coalescing
      doesnt work on linear skbs (or we would need to copy data, this would be
      too slow)
      
      Given all these issues, the following patch introduces the possibility
      of having skb->head be a fragment in itself. We use a new skb flag,
      skb->head_frag to carry this information.
      
      build_skb() is changed to accept a frag_size argument. Drivers willing
      to provide a page fragment instead of kmalloc() data will set a non zero
      value, set to the fragment size.
      
      Then, on situations we need to convert the skb head to a frag in itself,
      we can check if skb->head_frag is set and avoid the copies or various
      fallbacks we have.
      
      This means drivers currently using frags could be updated to avoid the
      current skb->head allocation and reduce their memory footprint (aka skb
      truesize). (thats 512 or 1024 bytes saved per skb). This also makes
      bpf/netfilter faster since the 'first frag' will be part of skb linear
      part, no need to copy data.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Maciej Żenczykowski <maze@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Cc: Tom Herbert <therbert@google.com>
      Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      Cc: Ben Hutchings <bhutchings@solarflare.com>
      Cc: Matt Carlson <mcarlson@broadcom.com>
      Cc: Michael Chan <mchan@broadcom.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d3836f21
  19. 24 4月, 2012 3 次提交
  20. 22 4月, 2012 1 次提交
  21. 20 4月, 2012 1 次提交
  22. 16 4月, 2012 1 次提交
  23. 11 4月, 2012 1 次提交
  24. 06 4月, 2012 1 次提交
  25. 05 4月, 2012 1 次提交
  26. 29 3月, 2012 1 次提交
  27. 26 3月, 2012 1 次提交
  28. 24 2月, 2012 1 次提交
  29. 14 2月, 2012 1 次提交
  30. 17 12月, 2011 1 次提交
  31. 05 12月, 2011 1 次提交
    • E
      tcp: take care of misalignments · 117632e6
      Eric Dumazet 提交于
      We discovered that TCP stack could retransmit misaligned skbs if a
      malicious peer acknowledged sub MSS frame. This currently can happen
      only if output interface is non SG enabled : If SG is enabled, tcp
      builds headless skbs (all payload is included in fragments), so the tcp
      trimming process only removes parts of skb fragments, header stay
      aligned.
      
      Some arches cant handle misalignments, so force a head reallocation and
      shrink headroom to MAX_TCP_HEADER.
      
      Dont care about misaligments on x86 and PPC (or other arches setting
      NET_IP_ALIGN to 0)
      
      This patch introduces __pskb_copy() which can specify the headroom of
      new head, and pskb_copy() becomes a wrapper on top of __pskb_copy()
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      117632e6
  32. 23 11月, 2011 1 次提交