1. 18 1月, 2013 1 次提交
  2. 14 1月, 2013 1 次提交
  3. 16 11月, 2012 1 次提交
  4. 04 11月, 2012 1 次提交
  5. 02 11月, 2012 1 次提交
  6. 25 9月, 2012 1 次提交
    • E
      net: use a per task frag allocator · 5640f768
      Eric Dumazet 提交于
      We currently use a per socket order-0 page cache for tcp_sendmsg()
      operations.
      
      This page is used to build fragments for skbs.
      
      Its done to increase probability of coalescing small write() into
      single segments in skbs still in write queue (not yet sent)
      
      But it wastes a lot of memory for applications handling many mostly
      idle sockets, since each socket holds one page in sk->sk_sndmsg_page
      
      Its also quite inefficient to build TSO 64KB packets, because we need
      about 16 pages per skb on arches where PAGE_SIZE = 4096, so we hit
      page allocator more than wanted.
      
      This patch adds a per task frag allocator and uses bigger pages,
      if available. An automatic fallback is done in case of memory pressure.
      
      (up to 32768 bytes per frag, thats order-3 pages on x86)
      
      This increases TCP stream performance by 20% on loopback device,
      but also benefits on other network devices, since 8x less frags are
      mapped on transmit and unmapped on tx completion. Alexander Duyck
      mentioned a probable performance win on systems with IOMMU enabled.
      
      Its possible some SG enabled hardware cant cope with bigger fragments,
      but their ndo_start_xmit() should already handle this, splitting a
      fragment in sub fragments, since some arches have PAGE_SIZE=65536
      
      Successfully tested on various ethernet devices.
      (ixgbe, igb, bnx2x, tg3, mellanox mlx4)
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Ben Hutchings <bhutchings@solarflare.com>
      Cc: Vijay Subramanian <subramanian.vijay@gmail.com>
      Cc: Alexander Duyck <alexander.h.duyck@intel.com>
      Tested-by: NVijay Subramanian <subramanian.vijay@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5640f768
  7. 11 9月, 2012 1 次提交
  8. 30 8月, 2012 1 次提交
    • P
      netfilter: nf_conntrack_ipv6: improve fragmentation handling · 4cdd3408
      Patrick McHardy 提交于
      The IPv6 conntrack fragmentation currently has a couple of shortcomings.
      Fragmentes are collected in PREROUTING/OUTPUT, are defragmented, the
      defragmented packet is then passed to conntrack, the resulting conntrack
      information is attached to each original fragment and the fragments then
      continue their way through the stack.
      
      Helper invocation occurs in the POSTROUTING hook, at which point only
      the original fragments are available. The result of this is that
      fragmented packets are never passed to helpers.
      
      This patch improves the situation in the following way:
      
      - If a reassembled packet belongs to a connection that has a helper
        assigned, the reassembled packet is passed through the stack instead
        of the original fragments.
      
      - During defragmentation, the largest received fragment size is stored.
        On output, the packet is refragmented if required. If the largest
        received fragment size exceeds the outgoing MTU, a "packet too big"
        message is generated, thus behaving as if the original fragments
        were passed through the stack from an outside point of view.
      
      - The ipv6_helper() hook function can't receive fragments anymore for
        connections using a helper, so it is switched to use ipv6_skip_exthdr()
        instead of the netfilter specific nf_ct_ipv6_skip_exthdr() and the
        reassembled packets are passed to connection tracking helpers.
      
      The result of this is that we can properly track fragmented packets, but
      still generate ICMPv6 Packet too big messages if we would have before.
      
      This patch is also required as a precondition for IPv6 NAT, where NAT
      helpers might enlarge packets up to a point that they require
      fragmentation. In that case we can't generate Packet too big messages
      since the proper MTU can't be calculated in all cases (f.i. when
      changing textual representation of a variable amount of addresses),
      so the packet is transparently fragmented iff the original packet or
      fragments would have fit the outgoing MTU.
      
      IPVS parts by Jesper Dangaard Brouer <brouer@redhat.com>.
      Signed-off-by: NPatrick McHardy <kaber@trash.net>
      4cdd3408
  9. 11 7月, 2012 1 次提交
  10. 06 7月, 2012 1 次提交
  11. 05 7月, 2012 2 次提交
  12. 13 6月, 2012 1 次提交
    • M
      net-next: add dev_loopback_xmit() to avoid duplicate code · 95603e22
      Michel Machado 提交于
      Add dev_loopback_xmit() in order to deduplicate functions
      ip_dev_loopback_xmit() (in net/ipv4/ip_output.c) and
      ip6_dev_loopback_xmit() (in net/ipv6/ip6_output.c).
      
      I was about to reinvent the wheel when I noticed that
      ip_dev_loopback_xmit() and ip6_dev_loopback_xmit() do exactly what I
      need and are not IP-only functions, but they were not available to reuse
      elsewhere.
      
      ip6_dev_loopback_xmit() does not have line "skb_dst_force(skb);", but I
      understand that this is harmless, and should be in dev_loopback_xmit().
      Signed-off-by: NMichel Machado <michel@digirati.com.br>
      CC: "David S. Miller" <davem@davemloft.net>
      CC: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
      CC: James Morris <jmorris@namei.org>
      CC: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
      CC: Patrick McHardy <kaber@trash.net>
      CC: Eric Dumazet <edumazet@google.com>
      CC: Jiri Pirko <jpirko@redhat.com>
      CC: "Michał Mirosław" <mirq-linux@rere.qmqm.pl>
      CC: Ben Hutchings <bhutchings@solarflare.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      95603e22
  13. 09 6月, 2012 1 次提交
  14. 08 6月, 2012 1 次提交
    • V
      snmp: fix OutOctets counter to include forwarded datagrams · 2d8dbb04
      Vincent Bernat 提交于
      RFC 4293 defines ipIfStatsOutOctets (similar definition for
      ipSystemStatsOutOctets):
      
         The total number of octets in IP datagrams delivered to the lower
         layers for transmission.  Octets from datagrams counted in
         ipIfStatsOutTransmits MUST be counted here.
      
      And ipIfStatsOutTransmits:
      
         The total number of IP datagrams that this entity supplied to the
         lower layers for transmission.  This includes datagrams generated
         locally and those forwarded by this entity.
      
      Therefore, IPSTATS_MIB_OUTOCTETS must be incremented when incrementing
      IPSTATS_MIB_OUTFORWDATAGRAMS.
      
      IP_UPD_PO_STATS is not used since ipIfStatsOutRequests must not
      include forwarded datagrams:
      
         The total number of IP datagrams that local IP user-protocols
         (including ICMP) supplied to IP in requests for transmission.  Note
         that this counter does not include any datagrams counted in
         ipIfStatsOutForwDatagrams.
      Signed-off-by: NVincent Bernat <bernat@luffy.cx>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2d8dbb04
  15. 27 5月, 2012 1 次提交
    • G
      ipv6: fix incorrect ipsec fragment · 0c183379
      Gao feng 提交于
      Since commit ad0081e4
      "ipv6: Fragment locally generated tunnel-mode IPSec6 packets as needed"
      the fragment of packets is incorrect.
      because tunnel mode needs IPsec headers and trailer for all fragments,
      while on transport mode it is sufficient to add the headers to the
      first fragment and the trailer to the last.
      
      so modify mtu and maxfraglen base on ipsec mode and if fragment is first
      or last.
      
      with my test,it work well(every fragment's size is the mtu)
      and does not trigger slow fragment path.
      
      Changes from v1:
      	though optimization, mtu_prev and maxfraglen_prev can be delete.
      	replace xfrm mode codes with dst_entry's new frag DST_XFRM_TUNNEL.
      	add fuction ip6_append_data_mtu to make codes clearer.
      Signed-off-by: NGao feng <gaofeng@cn.fujitsu.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0c183379
  16. 19 5月, 2012 3 次提交
  17. 16 5月, 2012 1 次提交
  18. 01 5月, 2012 1 次提交
  19. 26 4月, 2012 1 次提交
  20. 20 3月, 2012 1 次提交
  21. 28 1月, 2012 2 次提交
  22. 23 12月, 2011 1 次提交
    • E
      net: introduce DST_NOPEER dst flag · e688a604
      Eric Dumazet 提交于
      Chris Boot reported crashes occurring in ipv6_select_ident().
      
      [  461.457562] RIP: 0010:[<ffffffff812dde61>]  [<ffffffff812dde61>]
      ipv6_select_ident+0x31/0xa7
      
      [  461.578229] Call Trace:
      [  461.580742] <IRQ>
      [  461.582870]  [<ffffffff812efa7f>] ? udp6_ufo_fragment+0x124/0x1a2
      [  461.589054]  [<ffffffff812dbfe0>] ? ipv6_gso_segment+0xc0/0x155
      [  461.595140]  [<ffffffff812700c6>] ? skb_gso_segment+0x208/0x28b
      [  461.601198]  [<ffffffffa03f236b>] ? ipv6_confirm+0x146/0x15e
      [nf_conntrack_ipv6]
      [  461.608786]  [<ffffffff81291c4d>] ? nf_iterate+0x41/0x77
      [  461.614227]  [<ffffffff81271d64>] ? dev_hard_start_xmit+0x357/0x543
      [  461.620659]  [<ffffffff81291cf6>] ? nf_hook_slow+0x73/0x111
      [  461.626440]  [<ffffffffa0379745>] ? br_parse_ip_options+0x19a/0x19a
      [bridge]
      [  461.633581]  [<ffffffff812722ff>] ? dev_queue_xmit+0x3af/0x459
      [  461.639577]  [<ffffffffa03747d2>] ? br_dev_queue_push_xmit+0x72/0x76
      [bridge]
      [  461.646887]  [<ffffffffa03791e3>] ? br_nf_post_routing+0x17d/0x18f
      [bridge]
      [  461.653997]  [<ffffffff81291c4d>] ? nf_iterate+0x41/0x77
      [  461.659473]  [<ffffffffa0374760>] ? br_flood+0xfa/0xfa [bridge]
      [  461.665485]  [<ffffffff81291cf6>] ? nf_hook_slow+0x73/0x111
      [  461.671234]  [<ffffffffa0374760>] ? br_flood+0xfa/0xfa [bridge]
      [  461.677299]  [<ffffffffa0379215>] ?
      nf_bridge_update_protocol+0x20/0x20 [bridge]
      [  461.684891]  [<ffffffffa03bb0e5>] ? nf_ct_zone+0xa/0x17 [nf_conntrack]
      [  461.691520]  [<ffffffffa0374760>] ? br_flood+0xfa/0xfa [bridge]
      [  461.697572]  [<ffffffffa0374812>] ? NF_HOOK.constprop.8+0x3c/0x56
      [bridge]
      [  461.704616]  [<ffffffffa0379031>] ?
      nf_bridge_push_encap_header+0x1c/0x26 [bridge]
      [  461.712329]  [<ffffffffa037929f>] ? br_nf_forward_finish+0x8a/0x95
      [bridge]
      [  461.719490]  [<ffffffffa037900a>] ?
      nf_bridge_pull_encap_header+0x1c/0x27 [bridge]
      [  461.727223]  [<ffffffffa0379974>] ? br_nf_forward_ip+0x1c0/0x1d4 [bridge]
      [  461.734292]  [<ffffffff81291c4d>] ? nf_iterate+0x41/0x77
      [  461.739758]  [<ffffffffa03748cc>] ? __br_deliver+0xa0/0xa0 [bridge]
      [  461.746203]  [<ffffffff81291cf6>] ? nf_hook_slow+0x73/0x111
      [  461.751950]  [<ffffffffa03748cc>] ? __br_deliver+0xa0/0xa0 [bridge]
      [  461.758378]  [<ffffffffa037533a>] ? NF_HOOK.constprop.4+0x56/0x56
      [bridge]
      
      This is caused by bridge netfilter special dst_entry (fake_rtable), a
      special shared entry, where attaching an inetpeer makes no sense.
      
      Problem is present since commit 87c48fa3 (ipv6: make fragment
      identifications less predictable)
      
      Introduce DST_NOPEER dst flag and make sure ipv6_select_ident() and
      __ip_select_ident() fallback to the 'no peer attached' handling.
      Reported-by: NChris Boot <bootc@bootc.net>
      Tested-by: NChris Boot <bootc@bootc.net>
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e688a604
  23. 06 12月, 2011 1 次提交
  24. 04 12月, 2011 1 次提交
  25. 23 11月, 2011 1 次提交
  26. 19 11月, 2011 1 次提交
    • H
      ipv6: Remove all uses of LL_ALLOCATED_SPACE · a7ae1992
      Herbert Xu 提交于
      ipv6: Remove all uses of LL_ALLOCATED_SPACE
      
      The macro LL_ALLOCATED_SPACE was ill-conceived.  It applies the
      alignment to the sum of needed_headroom and needed_tailroom.  As
      the amount that is then reserved for head room is needed_headroom
      with alignment, this means that the tail room left may be too small.
      
      This patch replaces all uses of LL_ALLOCATED_SPACE in net/ipv6
      with the macro LL_RESERVED_SPACE and direct reference to
      needed_tailroom.
      
      This also fixes the problem with needed_headroom changing between
      allocating the skb and reserving the head room.
      Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a7ae1992
  27. 28 10月, 2011 1 次提交
  28. 27 10月, 2011 1 次提交
  29. 19 10月, 2011 2 次提交
  30. 25 8月, 2011 1 次提交
  31. 03 8月, 2011 1 次提交
  32. 22 7月, 2011 1 次提交
  33. 18 7月, 2011 1 次提交
  34. 17 7月, 2011 2 次提交