1. 15 Jan, 2014 1 commit
  2. 07 Jan, 2014 1 commit
    • vxlan: keep original skb ownership · 8f646c92
      Eric Dumazet committed
      Sathya Perla posted a patch trying to address the following problem:
      
      <quote>
       The vxlan driver sets itself as the socket owner for all the TX flows
       it encapsulates (using vxlan_set_owner()) and assigns its own skb
       destructor. This causes all tunneled traffic to land on only one TXQ,
       as all encapsulated skbs refer to the vxlan socket and not the original
       socket. Also, the vxlan skb destructor breaks some functionality for
       tunneled traffic, such as wmem accounting, TCP small queues and the
       FQ/pacing packet scheduler.
      </quote>
      
      I reworked Sathya's patch and added some explanations.
      
      vxlan_xmit() can avoid one skb_clone()/dev_kfree_skb() pair
      and gain better drop monitor accuracy, by calling kfree_skb() when
      appropriate.
      
      The UDP socket used by vxlan to perform encapsulation of xmit packets
      does not need to be alive while packets leave vxlan code. It's better
      to keep original socket ownership to get proper feedback from qdisc and
      NIC layers.
      
      We use skb->sk for:

      A) controlling the amount of bytes/packets queued on behalf of a socket, but
      prior vxlan code did the skb->sk transfer without any limit/control
      on vxlan socket sk_sndbuf.

      B) security purposes (such as SELinux) or netfilter uses, and I do not think
      anything is prepared to handle the vxlan stacked case in this area.
      
      By not changing ownership, vxlan tunnels behave like other tunnels.
      As Stephen mentioned, we might do the same change in L2TP.
      Reported-by: Sathya Perla <sathya.perla@emulex.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Stephen Hemminger <stephen@networkplumber.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
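The ownership argument in this commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `toy_sock`, `toy_skb` and every `toy_*` function are hypothetical stand-ins invented for illustration, not kernel structures or APIs, and the queue selection is reduced to a modulo on a per-socket value.

```c
#include <stddef.h>

/* Hypothetical stand-ins for struct sock / struct sk_buff (not kernel types). */
struct toy_sock {
    unsigned int tx_hash;    /* value a hash-based TX queue selector would use */
    unsigned int wmem_alloc; /* bytes currently charged to this socket */
};

struct toy_skb {
    struct toy_sock *sk;     /* current owner, playing the role of skb->sk */
    unsigned int truesize;
    int charged;             /* 1 if truesize is accounted in sk->wmem_alloc */
};

/* Charge the packet to the sending socket. */
static void toy_set_owner(struct toy_skb *skb, struct toy_sock *sk)
{
    skb->sk = sk;
    skb->charged = 1;
    sk->wmem_alloc += skb->truesize;
}

/* Drop ownership and give the bytes back to the owner, if any were charged. */
static void toy_orphan(struct toy_skb *skb)
{
    if (skb->sk && skb->charged)
        skb->sk->wmem_alloc -= skb->truesize;
    skb->sk = NULL;
    skb->charged = 0;
}

/* Behaviour described as problematic: the tunnel socket takes over the skb.
 * The sender is credited at once and nothing is charged to the tunnel. */
static void toy_encap_steal(struct toy_skb *skb, struct toy_sock *tunnel)
{
    toy_orphan(skb);
    skb->sk = tunnel;
}

/* Behaviour after the change: encapsulate without touching ownership. */
static void toy_encap_keep(struct toy_skb *skb, struct toy_sock *tunnel)
{
    (void)skb;
    (void)tunnel;
}

/* Pick a TX queue from the owning socket. */
static unsigned int toy_pick_txq(const struct toy_skb *skb, unsigned int nqueues)
{
    return skb->sk ? skb->sk->tx_hash % nqueues : 0;
}
```

In this model, two flows from different sockets still map to different queues and stay charged to their senders after `toy_encap_keep()`, while `toy_encap_steal()` collapses them onto the tunnel socket's queue and releases the senders' accounting early.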
  3. 05 Jan, 2014 1 commit
  4. 04 Jan, 2014 1 commit
    • {vxlan, inet6} Mark vxlan_dev flags with VXLAN_F_IPV6 properly · 7bda701e
      fan.du committed
      Even if the user doesn't supply the physical netdev to attach the vxlan
      dev to, but at the same time wants vxlan to sit on top of IPv6, mark
      vxlan_dev flags with VXLAN_F_IPV6 to create an IPv6-based socket.
      Otherwise the kernel crashes reliably every time with the messages below.
      
      Steps to reproduce:
      ip link add vxlan0 type vxlan id 42 group ff0e::110
      ip link set vxlan0 up
      
      [   62.656266] BUG: unable to handle kernel NULL pointer dereference at 0000000000000046
      [   62.656320] ip (3008) used greatest stack depth: 3912 bytes left
      [   62.656423] IP: [<ffffffff816d822d>] ip6_route_output+0xbd/0xe0
      [   62.656525] PGD 2c966067 PUD 2c9a2067 PMD 0
      [   62.656674] Oops: 0000 [#1] SMP
      [   62.656781] Modules linked in: vxlan netconsole deflate zlib_deflate af_key
      [   62.657083] CPU: 1 PID: 2128 Comm: whoopsie Not tainted 3.12.0+ #182
      [   62.657083] Hardware name: innotek GmbH VirtualBox, BIOS VirtualBox 12/01/2006
      [   62.657083] task: ffff88002e2335d0 ti: ffff88002c94c000 task.ti: ffff88002c94c000
      [   62.657083] RIP: 0010:[<ffffffff816d822d>]  [<ffffffff816d822d>] ip6_route_output+0xbd/0xe0
      [   62.657083] RSP: 0000:ffff88002fd038f8  EFLAGS: 00210296
      [   62.657083] RAX: 0000000000000000 RBX: ffff88002fd039e0 RCX: 0000000000000000
      [   62.657083] RDX: ffff88002fd0eb68 RSI: ffff88002fd0d278 RDI: ffff88002fd0d278
      [   62.657083] RBP: ffff88002fd03918 R08: 0000000002000000 R09: 0000000000000000
      [   62.657083] R10: 00000000000001ff R11: 0000000000000000 R12: 0000000000000001
      [   62.657083] R13: ffff88002d96b480 R14: ffffffff81c8e2c0 R15: 0000000000000001
      [   62.657083] FS:  0000000000000000(0000) GS:ffff88002fd00000(0063) knlGS:00000000f693b740
      [   62.657083] CS:  0010 DS: 002b ES: 002b CR0: 0000000080050033
      [   62.657083] CR2: 0000000000000046 CR3: 000000002c9d2000 CR4: 00000000000006e0
      [   62.657083] Stack:
      [   62.657083]  ffff88002fd03a40 ffffffff81c8e2c0 ffff88002fd039e0 ffff88002d96b480
      [   62.657083]  ffff88002fd03958 ffffffff816cac8b ffff880019277cc0 ffff8800192b5d00
      [   62.657083]  ffff88002d5bc000 ffff880019277cc0 0000000000001821 0000000000000001
      [   62.657083] Call Trace:
      [   62.657083]  <IRQ>
      [   62.657083]  [<ffffffff816cac8b>] ip6_dst_lookup_tail+0xdb/0xf0
      [   62.657083]  [<ffffffff816caea0>] ip6_dst_lookup+0x10/0x20
      [   62.657083]  [<ffffffffa0020c13>] vxlan_xmit_one+0x193/0x9c0 [vxlan]
      [   62.657083]  [<ffffffff8137b3b7>] ? account+0xc7/0x1f0
      [   62.657083]  [<ffffffffa0021513>] vxlan_xmit+0xd3/0x400 [vxlan]
      [   62.657083]  [<ffffffff8161390d>] dev_hard_start_xmit+0x49d/0x5e0
      [   62.657083]  [<ffffffff81613d29>] dev_queue_xmit+0x2d9/0x480
      [   62.657083]  [<ffffffff817cb854>] ? _raw_write_unlock_bh+0x14/0x20
      [   62.657083]  [<ffffffff81630565>] ? eth_header+0x35/0xe0
      [   62.657083]  [<ffffffff8161bc5e>] neigh_resolve_output+0x11e/0x1e0
      [   62.657083]  [<ffffffff816ce0e0>] ? ip6_fragment+0xad0/0xad0
      [   62.657083]  [<ffffffff816cb465>] ip6_finish_output2+0x2f5/0x470
      [   62.657083]  [<ffffffff816ce166>] ip6_finish_output+0x86/0xc0
      [   62.657083]  [<ffffffff816ce218>] ip6_output+0x78/0xb0
      [   62.657083]  [<ffffffff816eadd6>] mld_sendpack+0x256/0x2a0
      [   62.657083]  [<ffffffff816ebd8c>] mld_ifc_timer_expire+0x17c/0x290
      [   62.657083]  [<ffffffff816ebc10>] ? igmp6_timer_handler+0x80/0x80
      [   62.657083]  [<ffffffff816ebc10>] ? igmp6_timer_handler+0x80/0x80
      [   62.657083]  [<ffffffff81051065>] call_timer_fn+0x45/0x150
      [   62.657083]  [<ffffffff816ebc10>] ? igmp6_timer_handler+0x80/0x80
      [   62.657083]  [<ffffffff81052353>] run_timer_softirq+0x1f3/0x2a0
      [   62.657083]  [<ffffffff8102dfd8>] ? lapic_next_event+0x18/0x20
      [   62.657083]  [<ffffffff8109e36f>] ? clockevents_program_event+0x6f/0x110
      [   62.657083]  [<ffffffff8104a2f6>] __do_softirq+0xd6/0x2b0
      [   62.657083]  [<ffffffff8104a75e>] irq_exit+0x7e/0xa0
      [   62.657083]  [<ffffffff8102ea15>] smp_apic_timer_interrupt+0x45/0x60
      [   62.657083]  [<ffffffff817d3eca>] apic_timer_interrupt+0x6a/0x70
      [   62.657083]  <EOI>
      [   62.657083]  [<ffffffff817d4a35>] ? sysenter_dispatch+0x7/0x1a
      [   62.657083] Code: 4d 8b 85 a8 02 00 00 4c 89 e9 ba 03 04 00 00 48 c7 c6 c0 be 8d 81 48 c7 c7 48 35 a3 81 31 c0 e8 db 68 0e 00 49 8b 85 a8 02 00 00 <0f> b6 40 46 c0 e8 05 0f b6 c0 c1 e0 03 41 09 c4 e9 77 ff ff ff
      [   62.657083] RIP  [<ffffffff816d822d>] ip6_route_output+0xbd/0xe0
      [   62.657083]  RSP <ffff88002fd038f8>
      [   62.657083] CR2: 0000000000000046
      [   62.657083] ---[ end trace ba8a9583d7cd1934 ]---
      [   62.657083] Kernel panic - not syncing: Fatal exception in interrupt
      Signed-off-by: Fan Du <fan.du@windriver.com>
      Reported-by: Ryan Whelan <rcwhelan@gmail.com>
      Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
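The rule stated in this commit message (the IPv6 flag must follow the remote address family, whether or not a lower device was given) can be sketched as below. This is an illustrative model only: `TOY_F_IPV6` and both functions are hypothetical names, and the "before" variant merely reflects the dependency the message implies, not the exact kernel code.

```c
#include <stdbool.h>
#include <sys/socket.h>

#define TOY_F_IPV6 0x01u /* hypothetical stand-in for VXLAN_F_IPV6 */

/* Behaviour implied as buggy: the flag is only set when a lower device
 * was supplied, so "group ff0e::110" without "dev" got an IPv4 socket. */
static unsigned int toy_flags_before(int remote_family, bool has_lowerdev)
{
    unsigned int flags = 0;

    if (has_lowerdev && remote_family == AF_INET6)
        flags |= TOY_F_IPV6;
    return flags;
}

/* Behaviour after the change: the remote address family alone decides. */
static unsigned int toy_flags_after(int remote_family, bool has_lowerdev)
{
    unsigned int flags = 0;

    (void)has_lowerdev; /* intentionally not part of the decision */
    if (remote_family == AF_INET6)
        flags |= TOY_F_IPV6;
    return flags;
}
```

With the reproduction steps above, no lower device is named, so only the second variant would mark the device as IPv6.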
  5. 23 Dec, 2013 1 commit
  6. 18 Dec, 2013 1 commit
  7. 12 Dec, 2013 2 commits
    • vxlan: leave multicast group when vxlan device down · 95ab0991
      Gao feng committed
      vxlan_group_used only allows a device to leave the multicast group
      when the remote_ip of this vxlan device is different from
      other vxlan devices' remote_ip. This causes the device not to
      leave the multicast group until the vn_sock of this vxlan device
      is released.

      The check in vxlan_group_used is not quite precise, since even if
      the remote_ip is the same, these vxlan devices may use different
      lower devices, and they may use different vn_socks.

      Only when some vxlan devices use the same vn_sock, the same lower
      device and the same remote_ip should the mc_list of the vn_sock
      not be changed.
      Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
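The stricter condition described in this commit message can be written as a predicate over the device list. The sketch below is a userspace model under assumptions: `toy_vxlan` and `toy_group_used()` are hypothetical, with integers standing in for the vn_sock pointer, the lower device and the remote address.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, flattened view of one vxlan device. */
struct toy_vxlan {
    bool running;
    int sock_id;            /* stands in for the vn_sock pointer */
    int lower_ifindex;      /* lower device the tunnel is bound to */
    unsigned int remote_ip; /* multicast group address */
};

/* Return true only if another running device shares the same socket,
 * the same lower device and the same group as "dev". Only then must the
 * socket's multicast membership be left alone when "dev" goes down. */
static bool toy_group_used(const struct toy_vxlan *devs, size_t n,
                           const struct toy_vxlan *dev)
{
    for (size_t i = 0; i < n; i++) {
        const struct toy_vxlan *other = &devs[i];

        if (other == dev || !other->running)
            continue;
        if (other->sock_id != dev->sock_id)
            continue;
        if (other->lower_ifindex != dev->lower_ifindex)
            continue;
        if (other->remote_ip != dev->remote_ip)
            continue;
        return true;
    }
    return false;
}
```

A device whose only look-alike uses a different lower device, or is not running, gets `false` here and is therefore free to leave the group.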
    • vxlan: remove vxlan_group_used in vxlan_open · 79d4a94f
      Gao feng committed
      In vxlan_open, vxlan_group_used always returns true,
      because the vxlan device that we want to open is
      already in the running state and is already
      in vxlan_list.

      Since ip_mc_join_group takes care of the reference
      count of struct ip_mc_list, removing vxlan_group_used here
      is safe.
      Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  8. 11 Dec, 2013 1 commit
  9. 06 Nov, 2013 1 commit
    • net: Explicitly initialize u64_stats_sync structures for lockdep · 827da44c
      John Stultz committed
      In order to enable lockdep on seqcount/seqlock structures, we
      must explicitly initialize any locks.
      
      The u64_stats_sync structure uses a seqcount, and thus we need
      to introduce a u64_stats_init() function and use it to initialize
      the structure.
      
      This unfortunately adds a lot of fairly trivial initialization code
      to a number of drivers. But the benefit of ensuring correctness makes
      this worthwhile.
      
      Because these changes are required for lockdep to be enabled, and the
      changes are quite trivial, I've not yet split this patch out into 30-some
      separate patches, as I figured it would be better to get the various
      maintainers' thoughts on how to best merge this change along with
      the seqcount lockdep enablement.
      
      Feedback would be appreciated!
      Signed-off-by: John Stultz <john.stultz@linaro.org>
      Acked-by: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
      Cc: James Morris <jmorris@namei.org>
      Cc: Jesse Gross <jesse@nicira.com>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: "Michael S. Tsirkin" <mst@redhat.com>
      Cc: Mirko Lindner <mlindner@marvell.com>
      Cc: Patrick McHardy <kaber@trash.net>
      Cc: Roger Luethi <rl@hellgate.ch>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Simon Horman <horms@verge.net.au>
      Cc: Stephen Hemminger <stephen@networkplumber.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
      Cc: Wensong Zhang <wensong@linux-vs.org>
      Cc: netdev@vger.kernel.org
      Link: http://lkml.kernel.org/r/1381186321-4906-2-git-send-email-john.stultz@linaro.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
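The pattern this commit message asks drivers to follow (initialize the sync object explicitly before the first writer or reader touches it) can be modelled in userspace as below. This is a single-threaded sketch with no memory barriers and no lockdep, so it shows the call order only; all `toy_*` names are hypothetical and are not the kernel's u64_stats API.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct u64_stats_sync: a bare sequence counter. */
struct toy_stats_sync {
    unsigned int seq;
};

struct toy_pcpu_stats {
    uint64_t rx_packets;
    struct toy_stats_sync syncp;
};

/* Explicit initialization, the step the commit adds to each driver. */
static void toy_stats_init(struct toy_stats_sync *s)
{
    s->seq = 0;
}

/* Writer side: an odd sequence value means an update is in progress. */
static void toy_update_begin(struct toy_stats_sync *s) { s->seq++; }
static void toy_update_end(struct toy_stats_sync *s)   { s->seq++; }

/* Reader side: retry if a writer was active or finished in between. */
static unsigned int toy_fetch_begin(const struct toy_stats_sync *s)
{
    return s->seq;
}

static int toy_fetch_retry(const struct toy_stats_sync *s, unsigned int start)
{
    return (start & 1u) || s->seq != start;
}

static uint64_t toy_read_rx(const struct toy_pcpu_stats *st)
{
    unsigned int start;
    uint64_t val;

    do {
        start = toy_fetch_begin(&st->syncp);
        val = st->rx_packets;
    } while (toy_fetch_retry(&st->syncp, start));
    return val;
}
```

In a driver, the equivalent initialization would typically run once per per-CPU statistics block at setup time, before the device can transmit or receive.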
  10. 05 Nov, 2013 1 commit
  11. 29 Oct, 2013 2 commits
  12. 01 Oct, 2013 2 commits
  13. 18 Sep, 2013 1 commit
  14. 16 Sep, 2013 1 commit
  15. 06 Sep, 2013 2 commits
    • vxlan: Fix kernel panic on device delete. · f011baf9
      Pravin B Shelar committed
      On vxlan device creation, if socket creation fails, the vxlan device is not
      added to the hash table. Therefore we need to check whether the device
      is in the hash table before we delete it from the hlist.
      The following patch avoids the crash. net-next already has this fix.
      
      ---8<---
      BUG: unable to handle kernel NULL pointer dereference at           (null)
      IP: [<ffffffffa05f9ca7>] vxlan_dellink+0x77/0xf0 [vxlan]
      PGD 42b2d9067 PUD 42e04c067 PMD 0
      Oops: 0002 [#1] SMP
      Modules linked in: vxlan(-)
      Hardware name: Dell Inc. PowerEdge R620/0KCKR5, BIOS 1.4.8 10/25/2012
      task: ffff88042ecf8760 ti: ffff88042f106000 task.ti: ffff88042f106000
      RIP: 0010:[<ffffffffa05f9ca7>]  [<ffffffffa05f9ca7>]
      vxlan_dellink+0x77/0xf0 [vxlan]
      RSP: 0018:ffff88042f107e28  EFLAGS: 00010246
      RAX: 0000000000000000 RBX: ffff88082af08000 RCX: ffff88083fd80000
      RDX: 0000000000000000 RSI: ffff88042f107e58 RDI: ffff88042e12f810
      RBP: ffff88042f107e48 R08: ffffffff8166eca0 R09: 0000000000000000
      R10: 0000000000000001 R11: 0000000000000000 R12: ffff88082af087c0
      R13: ffff88042e12f000 R14: ffff88042f107e58 R15: ffff88042f107e58
      FS:  00007f4ed2de7700(0000) GS:ffff88043fc80000(0000)
      knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000000000000 CR3: 000000042e076000 CR4: 00000000000407e0
      Stack:
       ffff88082af08000 ffffffff81654848 ffffffffa05fb4e0 ffffffff81654780
       ffff88042f107e98 ffffffff813b9c7a ffff88042f107e58 ffff88042f107e58
       ffff88042f107e88 ffffffffa05fb4e0 ffffffffa05fb780 ffff88042f107f18
      Call Trace:
       [<ffffffff813b9c7a>] __rtnl_link_unregister+0xca/0xd0
       [<ffffffff813bb0e9>] rtnl_link_unregister+0x19/0x30
       [<ffffffffa05faa4c>] vxlan_cleanup_module+0x10/0x2f [vxlan]
       [<ffffffff81099fef>] SyS_delete_module+0x1cf/0x2c0
       [<ffffffff8146c069>] ? do_page_fault+0x9/0x10
       [<ffffffff8146f012>] system_call_fastpath+0x16/0x1b
      Code: 4d 85 ed 0f 84 95 00 00 00 4c 8d a7 c0 07 00 00 49 8d bd 10 08 00
      00 e8 28 e8 e6 e0 48 8b 83 c0 07 00 00 49 8b 54 24 08 48 85 c0 <48> 89
      02 74 04 48 89 50 08 49 b8 00 02 20 00 00 00 ad de 4d 89
      RIP  [<ffffffffa05f9ca7>] vxlan_dellink+0x77/0xf0 [vxlan]
       RSP <ffff88042f107e28>
      CR2: 0000000000000000
      Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
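The guard this commit message describes (check that the node is hashed before unlinking it) can be shown with a small userspace re-implementation of an hlist-style node. This is a sketch, not the kernel's list.h: the `toy_*` names are hypothetical, and only the `pprev == NULL` convention for "never hashed" is borrowed from the kernel's design.

```c
#include <stddef.h>

struct toy_hnode {
    struct toy_hnode *next;
    struct toy_hnode **pprev; /* NULL while the node is not on any list */
};

struct toy_hhead {
    struct toy_hnode *first;
};

static void toy_init_node(struct toy_hnode *n)
{
    n->next = NULL;
    n->pprev = NULL;
}

static int toy_unhashed(const struct toy_hnode *n)
{
    return n->pprev == NULL;
}

static void toy_add_head(struct toy_hnode *n, struct toy_hhead *h)
{
    n->next = h->first;
    if (h->first)
        h->first->pprev = &n->next;
    h->first = n;
    n->pprev = &h->first;
}

/* Unconditional unlink: stores through pprev, which is a NULL write
 * (the kind of fault shown in the oops above) if n was never added. */
static void toy_del(struct toy_hnode *n)
{
    *n->pprev = n->next;
    if (n->next)
        n->next->pprev = n->pprev;
    toy_init_node(n);
}

/* Guarded unlink: safe for a device whose setup failed before hashing. */
static void toy_del_checked(struct toy_hnode *n)
{
    if (!toy_unhashed(n))
        toy_del(n);
}
```

Calling `toy_del_checked()` on a node that was only initialized is a no-op, while a node that was added is unlinked as usual.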
    • vxlan: Notify drivers for listening UDP port changes · 53cf5275
      Joseph Gasparakis committed
      This patch adds two more ndo ops: ndo_add_rx_vxlan_port() and
      ndo_del_rx_vxlan_port().
      
      Drivers can get notifications through the above functions about changes
      of the UDP listening port of VXLAN. Also, when physical ports come up,
      they can now call vxlan_get_rx_port() to obtain the port number(s)
      of any existing VXLAN interfaces that were already up before them.
      
      This information about the listening UDP port would be used for VXLAN
      related offloads.
      
      A big thank you to John Fastabend (john.r.fastabend@intel.com) for his
      input and his suggestions on this patch set.
      
      CC: John Fastabend <john.r.fastabend@intel.com>
      CC: Stephen Hemminger <stephen@networkplumber.org>
      Signed-off-by: Joseph Gasparakis <joseph.gasparakis@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
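The notification scheme in this commit message (optional per-driver callbacks invoked when a VXLAN UDP port starts or stops being listened on) can be sketched as follows. Everything here is a hypothetical userspace model: the struct layout, the callback signatures and the `toy_*` names are assumptions for illustration and do not reproduce the kernel's net_device_ops.

```c
#include <stddef.h>
#include <stdint.h>

struct toy_netdev;

/* Hypothetical ops table; a driver without offload leaves both NULL. */
struct toy_netdev_ops {
    void (*add_rx_vxlan_port)(struct toy_netdev *dev, uint16_t port);
    void (*del_rx_vxlan_port)(struct toy_netdev *dev, uint16_t port);
};

struct toy_netdev {
    const struct toy_netdev_ops *ops;
    uint16_t offload_port; /* 0 means no port programmed */
};

/* Tell every capable device that a VXLAN socket now listens on "port". */
static void toy_notify_add(struct toy_netdev *devs, size_t n, uint16_t port)
{
    for (size_t i = 0; i < n; i++)
        if (devs[i].ops && devs[i].ops->add_rx_vxlan_port)
            devs[i].ops->add_rx_vxlan_port(&devs[i], port);
}

/* Tell every capable device that "port" is no longer in use. */
static void toy_notify_del(struct toy_netdev *devs, size_t n, uint16_t port)
{
    for (size_t i = 0; i < n; i++)
        if (devs[i].ops && devs[i].ops->del_rx_vxlan_port)
            devs[i].ops->del_rx_vxlan_port(&devs[i], port);
}

/* Example driver callbacks: remember the port for an offload engine. */
static void toy_drv_add(struct toy_netdev *dev, uint16_t port)
{
    dev->offload_port = port;
}

static void toy_drv_del(struct toy_netdev *dev, uint16_t port)
{
    if (dev->offload_port == port)
        dev->offload_port = 0;
}
```

Devices that do not implement the callbacks are skipped, so drivers without VXLAN offload need no changes in this model.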
  16. 05 Sep, 2013 1 commit
  17. 04 Sep, 2013 4 commits
  18. 03 Sep, 2013 2 commits
  19. 01 Sep, 2013 3 commits
  20. 21 Aug, 2013 1 commit
  21. 20 Aug, 2013 7 commits
  22. 10 Aug, 2013 2 commits
  23. 05 Aug, 2013 1 commit