1. 02 Sep, 2014 1 commit
    • udp: Add support for doing checksum unnecessary conversion · 2abb7cdc
      Tom Herbert committed
      Add support for doing CHECKSUM_UNNECESSARY to CHECKSUM_COMPLETE
      conversion in the UDP tunneling path.
      
      In the normal UDP path, we call skb_checksum_try_convert after locating
      the UDP socket. The conversion is attempted only if checksum conversion
      is enabled for the socket (a new flag in the UDP socket) and the
      checksum field is non-zero.
      
      In the UDP GRO path, we call skb_gro_checksum_try_convert after the
      checksum is validated, provided the checksum field is non-zero. Since
      this is already in GRO we assume that checksum conversion is always
      wanted.
      Signed-off-by: Tom Herbert <therbert@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
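The arithmetic behind the conversion described in this commit can be sketched in user space. If a UDP checksum is known to be valid and is non-zero, the ones' complement sum of the UDP segment alone must equal the complement of the pseudo-header sum, so a "complete" checksum can be derived without touching the payload. This is a toy model under that assumption, not the kernel code: `toy_pkt`, `toy_checksum_try_convert` and `csum_fold_buf` are hypothetical names, and the `convert_enabled` argument stands in for the per-socket flag the commit mentions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Ones' complement sum of a buffer (big-endian 16-bit words), folded to
 * 16 bits. 'sum' lets the caller chain a previous partial sum. */
static uint16_t csum_fold_buf(const uint8_t *buf, size_t len, uint32_t sum)
{
    size_t i;

    assert(buf != NULL);
    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
    if (len & 1)
        sum += (uint32_t)buf[len - 1] << 8;   /* pad odd trailing byte */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);   /* end-around carry */
    return (uint16_t)sum;
}

/* Hypothetical checksum states, loosely mirroring the commit's terms. */
enum toy_csum_state { TOY_CSUM_NONE, TOY_CSUM_UNNECESSARY, TOY_CSUM_COMPLETE };

struct toy_pkt {
    enum toy_csum_state ip_summed;
    uint16_t csum;                 /* meaningful only for TOY_CSUM_COMPLETE */
};

/* Convert "already verified" into "complete" when allowed.
 * Returns 1 if the conversion happened, 0 otherwise. */
static int toy_checksum_try_convert(struct toy_pkt *p, int convert_enabled,
                                    uint16_t udp_check, uint16_t pseudo_sum)
{
    assert(p != NULL);
    if (p->ip_summed != TOY_CSUM_UNNECESSARY || !convert_enabled ||
        udp_check == 0)
        return 0;                  /* a zero checksum carries no information */
    p->csum = (uint16_t)~pseudo_sum;   /* sum of the segment, by construction */
    p->ip_summed = TOY_CSUM_COMPLETE;
    return 1;
}
```

A caller would compute `pseudo_sum` with `csum_fold_buf` over the pseudo-header bytes and pass the checksum field read from the UDP header; after a successful conversion, `p->csum` matches what `csum_fold_buf` returns for the whole UDP segment.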
  2. 25 Aug, 2014 1 commit
  3. 24 Aug, 2014 1 commit
  4. 24 Jul, 2014 1 commit
  5. 17 Jul, 2014 2 commits
  6. 15 Jul, 2014 1 commit
  7. 12 Jul, 2014 1 commit
  8. 27 Jun, 2014 1 commit
  9. 14 Jun, 2014 1 commit
  10. 05 Jun, 2014 3 commits
  11. 24 May, 2014 2 commits
    • net: Make enabling of zero UDP6 csums more restrictive · 1c19448c
      Tom Herbert committed
      RFC 6935 permits zero checksums to be used in IPv6; however, this is
      recommended only for certain tunnel protocols. It does not make
      checksums completely optional like they are in IPv4.
      
      This patch restricts the use of IPv6 zero checksums that was previously
      introduced. no_check6_tx and no_check6_rx have been added to control
      the use of checksums in the UDP6 RX and TX paths. The normal
      sk_no_check_{rx,tx} settings are not used (this avoids ambiguity when
      dealing with a dual stack socket).
      
      A helper function has been added (udp_set_no_check6) which can be
      called by tunnel implementations to allow zero checksums (send on the
      socket, and accept them as valid).
      Signed-off-by: Tom Herbert <therbert@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
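The policy this commit describes can be illustrated with a small user-space model: a UDP6 socket carries two independent flags, and a received datagram with a zero checksum passes only when the RX flag is set. This is a sketch, not the kernel implementation. `toy_udp6_sock`, `toy_set_no_check6` and `toy_udp6_zero_csum_ok` are hypothetical names, and the assumption that the helper sets both flags together is taken from the commit text ("send on the socket, and accept them as valid").

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-socket state; field names follow the commit message. */
struct toy_udp6_sock {
    bool no_check6_tx;   /* may send UDP6 datagrams with a zero checksum */
    bool no_check6_rx;   /* may accept UDP6 datagrams with a zero checksum */
};

/* Stand-in for the helper a tunnel implementation would call. */
static void toy_set_no_check6(struct toy_udp6_sock *sk, bool val)
{
    assert(sk != NULL);
    sk->no_check6_tx = val;
    sk->no_check6_rx = val;
}

/* Gate applied on receive: a non-zero checksum continues to normal
 * validation (not modelled here); a zero checksum needs the RX flag. */
static bool toy_udp6_zero_csum_ok(const struct toy_udp6_sock *sk,
                                  uint16_t check)
{
    assert(sk != NULL);
    if (check != 0)
        return true;
    return sk->no_check6_rx;
}
```

Keeping the IPv6 flags separate from the generic per-socket settings means a dual stack socket can relax checksums for one address family without implying anything about the other.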
    • net: Split sk_no_check into sk_no_check_{rx,tx} · 28448b80
      Tom Herbert committed
      Define separate fields in the sock structure for disabling checksums
      on TX and on RX: sk_no_check_tx and sk_no_check_rx.
      The SO_NO_CHECK socket option only affects sk_no_check_tx. Also
      remove the UDP_CSUM_* defines since they are no longer necessary.
      Signed-off-by: Tom Herbert <therbert@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  12. 15 May, 2014 2 commits
  13. 09 May, 2014 1 commit
  14. 06 May, 2014 1 commit
  15. 20 Feb, 2014 1 commit
  16. 19 Jan, 2014 1 commit
  17. 15 Jan, 2014 1 commit
  18. 03 Jan, 2014 1 commit
    • ipv4: fix tunneled VM traffic over hw VXLAN/GRE GSO NIC · 7a7ffbab
      Wei-Chun Chao committed
      VM to VM GSO traffic is broken if it goes through a VXLAN or GRE
      tunnel and the physical NIC on the host supports hardware VXLAN/GRE
      GSO offload (e.g. bnx2x and next-gen mlx4).
      
      Two issues:
      (VXLAN) VM traffic has SKB_GSO_DODGY and SKB_GSO_UDP_TUNNEL with
      SKB_GSO_TCP/UDP set depending on the inner protocol. The GSO header
      integrity check fails in udp4_ufo_fragment if the inner protocol is
      TCP. Also, gso_segs is calculated incorrectly using an skb->len that
      includes the tunnel header. Fix: the robust check should only be
      applied to the inner packet.
      
      (VXLAN & GRE) Once the GSO header integrity check passes, NULL segs
      is returned and the original skb is sent to hardware. However, the
      tunnel header is already pulled. Fix: the tunnel header needs to be
      restored so that hardware can perform GSO properly on the original
      packet.
      Signed-off-by: Wei-Chun Chao <weichunc@plumgrid.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
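The gso_segs problem mentioned for the VXLAN case comes down to a ceiling division over the wrong length. The sketch below assumes the segment count is the payload length divided by the MSS, rounded up; `toy_gso_segs` is a hypothetical helper and the byte counts in the usage note are illustrative only, not taken from the commit.

```c
#include <assert.h>

/* Number of segments needed to carry 'len' bytes with at most 'mss'
 * bytes per segment (ceiling division). */
static unsigned int toy_gso_segs(unsigned int len, unsigned int mss)
{
    assert(mss > 0);
    return (len + mss - 1) / mss;
}
```

With an MSS of 1400, an inner payload of 2800 bytes needs `toy_gso_segs(2800, 1400)` segments, which is 2. If a hypothetical 50-byte tunnel header is counted as well, `toy_gso_segs(2850, 1400)` yields 3, one segment too many, which is the kind of miscount that results from using a length that still includes the tunnel header.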
  19. 18 Dec, 2013 1 commit
  20. 12 Dec, 2013 2 commits
  21. 11 Dec, 2013 1 commit
    • udp: ipv4: fix an use after free in __udp4_lib_rcv() · 8afdd99a
      Eric Dumazet committed
      Dave Jones reported a use after free in the UDP stack:
      
      [ 5059.434216] =========================
      [ 5059.434314] [ BUG: held lock freed! ]
      [ 5059.434420] 3.13.0-rc3+ #9 Not tainted
      [ 5059.434520] -------------------------
      [ 5059.434620] named/863 is freeing memory ffff88005e960000-ffff88005e96061f, with a lock still held there!
      [ 5059.434815]  (slock-AF_INET){+.-...}, at: [<ffffffff8149bd21>] udp_queue_rcv_skb+0xd1/0x4b0
      [ 5059.435012] 3 locks held by named/863:
      [ 5059.435086]  #0:  (rcu_read_lock){.+.+..}, at: [<ffffffff8143054d>] __netif_receive_skb_core+0x11d/0x940
      [ 5059.435295]  #1:  (rcu_read_lock){.+.+..}, at: [<ffffffff81467a5e>] ip_local_deliver_finish+0x3e/0x410
      [ 5059.435500]  #2:  (slock-AF_INET){+.-...}, at: [<ffffffff8149bd21>] udp_queue_rcv_skb+0xd1/0x4b0
      [ 5059.435734]
      stack backtrace:
      [ 5059.435858] CPU: 0 PID: 863 Comm: named Not tainted 3.13.0-rc3+ #9 [loadavg: 0.21 0.06 0.06 1/115 1365]
      [ 5059.436052] Hardware name:                  /D510MO, BIOS MOPNV10J.86A.0175.2010.0308.0620 03/08/2010
      [ 5059.436223]  0000000000000002 ffff88007e203ad8 ffffffff8153a372 ffff8800677130e0
      [ 5059.436390]  ffff88007e203b10 ffffffff8108cafa ffff88005e960000 ffff88007b00cfc0
      [ 5059.436554]  ffffea00017a5800 ffffffff8141c490 0000000000000246 ffff88007e203b48
      [ 5059.436718] Call Trace:
      [ 5059.436769]  <IRQ>  [<ffffffff8153a372>] dump_stack+0x4d/0x66
      [ 5059.436904]  [<ffffffff8108cafa>] debug_check_no_locks_freed+0x15a/0x160
      [ 5059.437037]  [<ffffffff8141c490>] ? __sk_free+0x110/0x230
      [ 5059.437147]  [<ffffffff8112da2a>] kmem_cache_free+0x6a/0x150
      [ 5059.437260]  [<ffffffff8141c490>] __sk_free+0x110/0x230
      [ 5059.437364]  [<ffffffff8141c5c9>] sk_free+0x19/0x20
      [ 5059.437463]  [<ffffffff8141cb25>] sock_edemux+0x25/0x40
      [ 5059.437567]  [<ffffffff8141c181>] sock_queue_rcv_skb+0x81/0x280
      [ 5059.437685]  [<ffffffff8149bd21>] ? udp_queue_rcv_skb+0xd1/0x4b0
      [ 5059.437805]  [<ffffffff81499c82>] __udp_queue_rcv_skb+0x42/0x240
      [ 5059.437925]  [<ffffffff81541d25>] ? _raw_spin_lock+0x65/0x70
      [ 5059.438038]  [<ffffffff8149bebb>] udp_queue_rcv_skb+0x26b/0x4b0
      [ 5059.438155]  [<ffffffff8149c712>] __udp4_lib_rcv+0x152/0xb00
      [ 5059.438269]  [<ffffffff8149d7f5>] udp_rcv+0x15/0x20
      [ 5059.438367]  [<ffffffff81467b2f>] ip_local_deliver_finish+0x10f/0x410
      [ 5059.438492]  [<ffffffff81467a5e>] ? ip_local_deliver_finish+0x3e/0x410
      [ 5059.438621]  [<ffffffff81468653>] ip_local_deliver+0x43/0x80
      [ 5059.438733]  [<ffffffff81467f70>] ip_rcv_finish+0x140/0x5a0
      [ 5059.438843]  [<ffffffff81468926>] ip_rcv+0x296/0x3f0
      [ 5059.438945]  [<ffffffff81430b72>] __netif_receive_skb_core+0x742/0x940
      [ 5059.439074]  [<ffffffff8143054d>] ? __netif_receive_skb_core+0x11d/0x940
      [ 5059.442231]  [<ffffffff8108c81d>] ? trace_hardirqs_on+0xd/0x10
      [ 5059.442231]  [<ffffffff81430d83>] __netif_receive_skb+0x13/0x60
      [ 5059.442231]  [<ffffffff81431c1e>] netif_receive_skb+0x1e/0x1f0
      [ 5059.442231]  [<ffffffff814334e0>] napi_gro_receive+0x70/0xa0
      [ 5059.442231]  [<ffffffffa01de426>] rtl8169_poll+0x166/0x700 [r8169]
      [ 5059.442231]  [<ffffffff81432bc9>] net_rx_action+0x129/0x1e0
      [ 5059.442231]  [<ffffffff810478cd>] __do_softirq+0xed/0x240
      [ 5059.442231]  [<ffffffff81047e25>] irq_exit+0x125/0x140
      [ 5059.442231]  [<ffffffff81004241>] do_IRQ+0x51/0xc0
      [ 5059.442231]  [<ffffffff81542bef>] common_interrupt+0x6f/0x6f
      
      We need to keep a reference on the socket, by using skb_steal_sock()
      at the right place.
      
      Note that another patch is needed to fix a race in
      udp_sk_rx_dst_set(), as we hold no lock protecting the dst.
      
      Fixes: 421b3885 ("udp: ipv4: Add udp early demux")
      Reported-by: Dave Jones <davej@redhat.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Shawn Bohrer <sbohrer@rgmadvisors.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  22. 06 Dec, 2013 1 commit
  23. 30 Nov, 2013 2 commits
  24. 24 Nov, 2013 1 commit
  25. 19 Nov, 2013 1 commit
  26. 15 Nov, 2013 1 commit
  27. 20 Oct, 2013 2 commits
  28. 09 Oct, 2013 4 commits
    • udp: fix a typo in __udp4_lib_mcast_demux_lookup · f69b923a
      Eric Dumazet committed
      At this point sk might contain garbage.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ipv4 only populate IP_PKTINFO when needed · fbf8866d
      Shawn Bohrer committed
      Since the removal of the routing cache, computing
      fib_compute_spec_dst() does a fib_table lookup for each UDP multicast
      packet received.  This has introduced a performance regression for some
      UDP workloads.
      
      This change skips populating the packet info for sockets that do not have
      IP_PKTINFO set.
      
      Benchmark results from a netperf UDP_RR test:
      Before 89789.68 transactions/s
      After  90587.62 transactions/s
      
      Benchmark results from a fio 1 byte UDP multicast pingpong test
      (Multicast one way unicast response):
      Before 12.63us RTT
      After  12.48us RTT
      Signed-off-by: Shawn Bohrer <sbohrer@rgmadvisors.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • udp: ipv4: Add udp early demux · 421b3885
      Shawn Bohrer committed
      The removal of the routing cache introduced a performance regression for
      some UDP workloads since a dst lookup must be done for each packet.
      This change caches the dst per socket in a similar manner to what we do
      for TCP by implementing early_demux.
      
      For UDP multicast we can only cache the dst if there is only one
      receiving socket on the host.  Since caching only works when there is
      one receiving socket we do the multicast socket lookup using RCU.
      
      For UDP unicast we only demux sockets with an exact match in order to
      not break forwarding setups.  Additionally since the hash chains may be
      long we only check the first socket to see if it is a match and not
      waste extra time searching the whole chain when we might not find an
      exact match.
      
      Benchmark results from a netperf UDP_RR test:
      Before 87961.22 transactions/s
      After  89789.68 transactions/s
      
      Benchmark results from a fio 1 byte UDP multicast pingpong test
      (Multicast one way unicast response):
      Before 12.97us RTT
      After  12.63us RTT
      Signed-off-by: Shawn Bohrer <sbohrer@rgmadvisors.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
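The unicast rule in this commit (only demux on an exact match, and only look at the first socket in the hash chain) can be modelled in a few lines. This is a simplified user-space sketch, not the kernel lookup: `toy_sock`, `toy_exact_match` and `toy_early_demux` are hypothetical names, and addresses and ports are plain integers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified socket entry in one hash chain. */
struct toy_sock {
    uint32_t local_addr, remote_addr;
    uint16_t local_port, remote_port;
    struct toy_sock *next;
};

/* True only when all four tuple fields match (no wildcards). */
static bool toy_exact_match(const struct toy_sock *sk,
                            uint32_t laddr, uint16_t lport,
                            uint32_t raddr, uint16_t rport)
{
    assert(sk != NULL);
    return sk->local_addr == laddr && sk->local_port == lport &&
           sk->remote_addr == raddr && sk->remote_port == rport;
}

/* Early demux: inspect only the head of the chain. A miss is not an
 * error; the packet simply takes the regular, full lookup later. */
static const struct toy_sock *toy_early_demux(const struct toy_sock *head,
                                              uint32_t laddr, uint16_t lport,
                                              uint32_t raddr, uint16_t rport)
{
    if (head != NULL && toy_exact_match(head, laddr, lport, raddr, rport))
        return head;
    return NULL;
}
```

Checking only the head bounds the cost of the fast path: a socket further down the chain is still found by the normal lookup, it just does not benefit from the cached dst.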
    • udp: Only allow busy read/poll on connected sockets · 005ec974
      Shawn Bohrer committed
      UDP sockets can receive packets from multiple endpoints, and thus
      packets may be received on multiple receive queues.  Since packets can
      arrive on multiple receive queues we should not mark the napi_id for all
      packets.  This makes busy read/poll only work for connected UDP sockets.
      
      This additionally enables busy read/poll for UDP multicast packets as
      long as the socket is connected by moving the check into
      __udp_queue_rcv_skb().
      Signed-off-by: Shawn Bohrer <sbohrer@rgmadvisors.com>
      Suggested-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  29. 01 Oct, 2013 1 commit