1. 29 Apr, 2013 1 commit
  2. 23 Apr, 2013 1 commit
    • VXLAN: Allow L2 redirection with L3 switching · ae884082
      David Stevens committed
      Allow L2 redirection when VXLAN L3 switching is enabled
      
      This patch restricts L3 switching to destination MAC addresses that are
      marked as routers in order to allow virtual IP appliances that do L2
      redirection to function with VXLAN L3 switching enabled.
      
      We use L3 switching on VXLAN networks to avoid extra hops when the nominal
      router for cross-subnet traffic for a VM is remote and the ultimate
      destination may be local, or closer to the local node. Currently, the
      destination IP address takes precedence over the MAC address in all cases.
      Some network appliances receive packets for a virtualized IP address and
      redirect by changing the destination MAC address (only) to be the final
      destination for packet processing. VXLAN tunnel endpoints with L3 switching
      enabled may then overwrite this destination MAC address based on the packet IP
      address, resulting in potential loops and, at least, breaking L2 redirections
      that travel through tunnel endpoints.
      
      This patch limits L3 switching to the intended case where the original
      destination MAC address is a next-hop router and relies on the destination
      MAC address for all other cases, thus allowing L2 redirection and L3 switching
      to coexist peacefully.
      Signed-Off-By: David L Stevens <dlstevens@us.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
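The forwarding rule described in this commit can be sketched in userspace C. This is a simplified model under stated assumptions, not the driver's code: `struct mac_entry`, `struct ip_entry`, `find_mac`, and `l3_switch` are hypothetical names, and the tables are plain arrays. The destination MAC is rewritten only when the frame was addressed to a MAC flagged as a router; in every other case the MAC chosen by the sender (or by an L2-redirecting appliance) is left alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6

/* Hypothetical L2 table entry: a known MAC and whether it is flagged as a router. */
struct mac_entry {
    uint8_t mac[MAC_LEN];
    bool is_router;
};

/* Hypothetical L3 table entry: a known destination IP and the MAC that owns it. */
struct ip_entry {
    uint32_t ip;
    uint8_t mac[MAC_LEN];
};

/* Linear lookup of a MAC in the L2 table; returns NULL when unknown. */
static const struct mac_entry *find_mac(const struct mac_entry *tbl, size_t n,
                                        const uint8_t *mac)
{
    for (size_t i = 0; i < n; i++)
        if (memcmp(tbl[i].mac, mac, MAC_LEN) == 0)
            return &tbl[i];
    return NULL;
}

/*
 * Rewrites dst_mac in place and returns true only when the frame was
 * addressed to a router MAC and the destination IP is known.
 * Otherwise dst_mac is untouched, which preserves L2 redirection.
 */
static bool l3_switch(const struct mac_entry *macs, size_t nmacs,
                      const struct ip_entry *ips, size_t nips,
                      uint8_t *dst_mac, uint32_t dst_ip)
{
    assert(dst_mac != NULL);

    const struct mac_entry *m = find_mac(macs, nmacs, dst_mac);
    if (m == NULL || !m->is_router)
        return false;               /* not a next-hop router: trust the MAC */

    for (size_t i = 0; i < nips; i++) {
        if (ips[i].ip == dst_ip) {
            memcpy(dst_mac, ips[i].mac, MAC_LEN);
            return true;            /* short-circuit past the router */
        }
    }
    return false;                   /* unknown IP: fall back to the MAC */
}
```

In this model a frame whose destination MAC was changed by an appliance to a non-router MAC passes through unmodified, even if its destination IP is present in the L3 table.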
  3. 17 Apr, 2013 1 commit
  4. 16 Apr, 2013 1 commit
  5. 15 Apr, 2013 1 commit
  6. 13 Apr, 2013 1 commit
  7. 08 Apr, 2013 1 commit
  8. 27 Mar, 2013 4 commits
  9. 18 Mar, 2013 1 commit
    • vxlan: generalize forwarding tables · 6681712d
      David Stevens committed
      This patch generalizes VXLAN forwarding table entries allowing an administrator
      to:
      	1) specify multiple destinations for a given MAC
      	2) specify alternate VNIs in the VXLAN header
      	3) specify alternate destination UDP ports
      	4) use multicast MAC addresses as fdb lookup keys
      	5) specify multicast destinations
      	6) specify the outgoing interface for forwarded packets
      
      The combination allows configuration of more complex topologies using VXLAN
      encapsulation.
      
      Changes since v1: rebase to 3.9.0-rc2
      Signed-Off-By: David L Stevens <dlstevens@us.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
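As an illustration of what a generalized forwarding entry might hold, the sketch below models one MAC key with several destinations, each carrying its own address, UDP port, VNI, and outgoing interface. All names (`struct remote`, `struct fdb_entry`, `fdb_add_remote`, `fdb_lookup`) and the fixed `MAX_REMOTES` bound are assumptions made for this example; the driver's actual structures may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6
#define MAX_REMOTES 4   /* arbitrary bound chosen for this sketch */

/* One encapsulation target; every field here is illustrative. */
struct remote {
    uint32_t ip;        /* unicast or multicast destination address */
    uint16_t port;      /* destination UDP port */
    uint32_t vni;       /* VNI to place in the VXLAN header */
    int ifindex;        /* outgoing interface, 0 = leave it to routing */
};

/* One forwarding entry: a MAC key (possibly multicast) and its targets. */
struct fdb_entry {
    uint8_t mac[MAC_LEN];
    size_t nremotes;
    struct remote remotes[MAX_REMOTES];
};

/* Append a destination to an entry; returns 0 on success, -1 when full. */
static int fdb_add_remote(struct fdb_entry *e, struct remote r)
{
    assert(e != NULL);
    if (e->nremotes >= MAX_REMOTES)
        return -1;
    e->remotes[e->nremotes++] = r;
    return 0;
}

/* Find the entry for a MAC by linear search; returns NULL when absent. */
static const struct fdb_entry *fdb_lookup(const struct fdb_entry *tbl, size_t n,
                                          const uint8_t *mac)
{
    for (size_t i = 0; i < n; i++)
        if (memcmp(tbl[i].mac, mac, MAC_LEN) == 0)
            return &tbl[i];
    return NULL;
}
```

A sender in this model would look up the destination MAC and emit one encapsulated copy per element of `remotes`, which is how a single key can fan out to several tunnel endpoints.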
  10. 10 Mar, 2013 2 commits
  11. 08 Mar, 2013 1 commit
    • vxlan: fix oops when delete netns containing vxlan · 9cb6cb7e
      Zang MingJie committed
      The following script will produce a kernel oops:
      
          sudo ip netns add v
          sudo ip netns exec v ip ad add 127.0.0.1/8 dev lo
          sudo ip netns exec v ip link set lo up
          sudo ip netns exec v ip ro add 224.0.0.0/4 dev lo
          sudo ip netns exec v ip li add vxlan0 type vxlan id 42 group 239.1.1.1 dev lo
          sudo ip netns exec v ip link set vxlan0 up
          sudo ip netns del v
      
      Inspecting the crash with gdb shows:
      
          Program received signal SIGSEGV, Segmentation fault.
          [Switching to Thread 107]
          0xffffffffa0289e33 in ?? ()
          (gdb) bt
          #0  vxlan_leave_group (dev=0xffff88001bafa000) at drivers/net/vxlan.c:533
          #1  vxlan_stop (dev=0xffff88001bafa000) at drivers/net/vxlan.c:1087
          #2  0xffffffff812cc498 in __dev_close_many (head=head@entry=0xffff88001f2e7dc8) at net/core/dev.c:1299
          #3  0xffffffff812cd920 in dev_close_many (head=head@entry=0xffff88001f2e7dc8) at net/core/dev.c:1335
          #4  0xffffffff812cef31 in rollback_registered_many (head=head@entry=0xffff88001f2e7dc8) at net/core/dev.c:4851
          #5  0xffffffff812cf040 in unregister_netdevice_many (head=head@entry=0xffff88001f2e7dc8) at net/core/dev.c:5752
          #6  0xffffffff812cf1ba in default_device_exit_batch (net_list=0xffff88001f2e7e18) at net/core/dev.c:6170
          #7  0xffffffff812cab27 in cleanup_net (work=<optimized out>) at net/core/net_namespace.c:302
          #8  0xffffffff810540ef in process_one_work (worker=0xffff88001ba9ed40, work=0xffffffff8167d020) at kernel/workqueue.c:2157
          #9  0xffffffff810549d0 in worker_thread (__worker=__worker@entry=0xffff88001ba9ed40) at kernel/workqueue.c:2276
          #10 0xffffffff8105870c in kthread (_create=0xffff88001f2e5d68) at kernel/kthread.c:168
          #11 <signal handler called>
          #12 0x0000000000000000 in ?? ()
          #13 0x0000000000000000 in ?? ()
          (gdb) fr 0
          #0  vxlan_leave_group (dev=0xffff88001bafa000) at drivers/net/vxlan.c:533
          533		struct sock *sk = vn->sock->sk;
          (gdb) l
          528	static int vxlan_leave_group(struct net_device *dev)
          529	{
          530		struct vxlan_dev *vxlan = netdev_priv(dev);
          531		struct vxlan_net *vn = net_generic(dev_net(dev), vxlan_net_id);
          532		int err = 0;
          533		struct sock *sk = vn->sock->sk;
          534		struct ip_mreqn mreq = {
          535			.imr_multiaddr.s_addr	= vxlan->gaddr,
          536			.imr_ifindex		= vxlan->link,
          537		};
          (gdb) p vn->sock
          $4 = (struct socket *) 0x0
      
      When a netns is deleted, the kernel calls `vxlan_exit_net` before the vxlan
      interfaces are shut down. The later removal of the vxlan interfaces then
      dereferences `vn->sock`, which is already gone, and causes the oops. The
      patch therefore shuts down all interfaces manually before releasing
      `vn->sock`.
      Signed-off-by: Zang MingJie <zealot0630@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
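The ordering problem can be reproduced in a small userspace model. Everything below is a toy: `toy_sock`, `toy_net`, `toy_dev`, and the `toy_*` functions are hypothetical stand-ins, and the model returns an error where the real code dereferenced a NULL pointer. It only shows why stopping the devices before releasing the shared socket avoids the failure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the per-namespace socket and a vxlan device. */
struct toy_sock { int unused; };
struct toy_net  { struct toy_sock *sock; };
struct toy_dev  { struct toy_net *net; bool up; };

/* Leaving the multicast group needs the socket; report an error if it is gone. */
static int toy_leave_group(struct toy_dev *dev)
{
    if (dev->net->sock == NULL)
        return -1;      /* the real code crashed at this point */
    return 0;
}

/* Stopping a device that is up leaves its multicast group. */
static int toy_stop(struct toy_dev *dev)
{
    int err = dev->up ? toy_leave_group(dev) : 0;
    dev->up = false;
    return err;
}

/* Buggy order: release the socket first, stop the devices afterwards. */
static int toy_exit_net_buggy(struct toy_net *net, struct toy_dev *devs, size_t n)
{
    assert(net != NULL);
    int failures = 0;
    net->sock = NULL;
    for (size_t i = 0; i < n; i++)
        if (toy_stop(&devs[i]) != 0)
            failures++;
    return failures;
}

/* Fixed order: stop every device while the socket still exists. */
static int toy_exit_net_fixed(struct toy_net *net, struct toy_dev *devs, size_t n)
{
    assert(net != NULL);
    int failures = 0;
    for (size_t i = 0; i < n; i++)
        if (toy_stop(&devs[i]) != 0)
            failures++;
    net->sock = NULL;
    return failures;
}
```

Both functions return the number of devices whose shutdown hit the missing socket, so the two orderings can be compared directly.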
  12. 06 Mar, 2013 1 commit
    • reset nf before xmit vxlan encapsulated packet · 88c4c066
      Zang MingJie committed
      We should reset the nf settings bound to the skb, as ipip/ipgre do.
      
      Otherwise, the conntrack/NAT info bound to the original packet may keep
      redirecting the packet to the vxlan interface, causing a routing loop.
      
      This is the scenario:
      
           VTEP     VXLAN Gateway
          /----\  /---------------\
          |    |  |               |
          |  vx+--+vx --NAT-> eth0+--> Internet
          |    |  |               |
          \----/  \---------------/
      
      When any packet arrives from the Internet for the VTEP, many garbage
      packets come out of the gateway's vxlan interface, but none are actually
      sent to the physical interface, because the postrouting chain of the NAT
      rule redirects them back to the vxlan interface, and dmesg complains:
      
          Mar  1 21:52:53 debian kernel: [ 8802.997699] Dead loop on virtual device vxlan0, fix it urgently!
          Mar  1 21:52:54 debian kernel: [ 8804.004907] Dead loop on virtual device vxlan0, fix it urgently!
          Mar  1 21:52:55 debian kernel: [ 8805.012189] Dead loop on virtual device vxlan0, fix it urgently!
          Mar  1 21:52:56 debian kernel: [ 8806.020593] Dead loop on virtual device vxlan0, fix it urgently!
      
      This patch should fix the problem.
      Signed-off-by: Zang MingJie <zealot0630@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
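A rough userspace analogy of the loop and its fix follows. It is deliberately simplified and rests on an assumption stated in the commit message only informally: that NAT state still attached to the inner packet can override where the encapsulated packet is sent. `toy_nat_state`, `toy_skb`, and the `toy_*` functions are hypothetical and do not mirror kernel data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for conntrack/NAT state attached to a packet. */
struct toy_nat_state { int redirect_ifindex; };

/* Toy stand-in for an skb: only the attached netfilter state matters here. */
struct toy_skb { const struct toy_nat_state *nfct; };

/* Analogue of resetting netfilter state: drop what the inner packet carried. */
static void toy_nf_reset(struct toy_skb *skb)
{
    assert(skb != NULL);
    skb->nfct = NULL;
}

/* In this model, stale NAT state overrides the interface chosen by routing. */
static int toy_egress(const struct toy_skb *skb, int routed_ifindex)
{
    return skb->nfct ? skb->nfct->redirect_ifindex : routed_ifindex;
}

/* Encapsulate and send toward phys_ifindex, optionally resetting state first. */
static int toy_encap_xmit(struct toy_skb *skb, int phys_ifindex, bool reset)
{
    if (reset)
        toy_nf_reset(skb);
    return toy_egress(skb, phys_ifindex);
}
```

Without the reset the model sends the encapsulated packet back to the interface recorded in the stale state; with the reset it leaves through the physical interface.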
  13. 28 Feb, 2013 1 commit
    • hlist: drop the node parameter from iterators · b67bfe0d
      Sasha Levin committed
      I'm not sure why, but the hlist for-each-entry iterators were conceived
      differently from the list ones. The list iterators look like this:
      
              list_for_each_entry(pos, head, member)
      
      The hlist ones were greedy and wanted an extra parameter:
      
              hlist_for_each_entry(tpos, pos, head, member)
      
      Why did they need an extra pos parameter? I'm not quite sure. Not only
      do they not really need it, it also prevents the iterator from looking
      exactly like the list iterator, which is unfortunate.
      
      Besides the semantic patch, there was some manual work required:
      
       - Fix up the actual hlist iterators in linux/list.h
       - Fix up the declaration of other iterators based on the hlist ones.
       - A very small number of places were using the 'node' parameter; these
       were modified to use 'obj->member' instead.
       - Coccinelle didn't handle the hlist_for_each_entry_safe iterator
       properly, so those had to be fixed up manually.
      
      The semantic patch which is mostly the work of Peter Senna Tschudin is here:
      
      @@
      iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;
      
      type T;
      expression a,c,d,e;
      identifier b;
      statement S;
      @@
      
      -T b;
          <+... when != b
      (
      hlist_for_each_entry(a,
      - b,
      c, d) S
      |
      hlist_for_each_entry_continue(a,
      - b,
      c) S
      |
      hlist_for_each_entry_from(a,
      - b,
      c) S
      |
      hlist_for_each_entry_rcu(a,
      - b,
      c, d) S
      |
      hlist_for_each_entry_rcu_bh(a,
      - b,
      c, d) S
      |
      hlist_for_each_entry_continue_rcu_bh(a,
      - b,
      c) S
      |
      for_each_busy_worker(a, c,
      - b,
      d) S
      |
      ax25_uid_for_each(a,
      - b,
      c) S
      |
      ax25_for_each(a,
      - b,
      c) S
      |
      inet_bind_bucket_for_each(a,
      - b,
      c) S
      |
      sctp_for_each_hentry(a,
      - b,
      c) S
      |
      sk_for_each(a,
      - b,
      c) S
      |
      sk_for_each_rcu(a,
      - b,
      c) S
      |
      sk_for_each_from
      -(a, b)
      +(a)
      S
      + sk_for_each_from(a) S
      |
      sk_for_each_safe(a,
      - b,
      c, d) S
      |
      sk_for_each_bound(a,
      - b,
      c) S
      |
      hlist_for_each_entry_safe(a,
      - b,
      c, d, e) S
      |
      hlist_for_each_entry_continue_rcu(a,
      - b,
      c) S
      |
      nr_neigh_for_each(a,
      - b,
      c) S
      |
      nr_neigh_for_each_safe(a,
      - b,
      c, d) S
      |
      nr_node_for_each(a,
      - b,
      c) S
      |
      nr_node_for_each_safe(a,
      - b,
      c, d) S
      |
      - for_each_gfn_sp(a, c, d, b) S
      + for_each_gfn_sp(a, c, d) S
      |
      - for_each_gfn_indirect_valid_sp(a, c, d, b) S
      + for_each_gfn_indirect_valid_sp(a, c, d) S
      |
      for_each_host(a,
      - b,
      c) S
      |
      for_each_host_safe(a,
      - b,
      c, d) S
      |
      for_each_mesh_entry(a,
      - b,
      c, d) S
      )
          ...+>
      
      [akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
      [akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
      [akpm@linux-foundation.org: checkpatch fixes]
      [akpm@linux-foundation.org: fix warnings]
      [akpm@linux-foundation.org: redo intrusive kvm changes]
      Tested-by: Peter Senna Tschudin <peter.senna@gmail.com>
      Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      Cc: Wu Fengguang <fengguang.wu@intel.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Gleb Natapov <gleb@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
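To show what the three-argument form looks like in use, here is a minimal userspace sketch of an hlist with an iterator that takes no separate node cursor. It is written for GCC or Clang (it relies on `__typeof__`) and is a simplified approximation, not a copy of `linux/list.h`; `struct item` and `sum_items` are made up for the example, and this `hlist_entry_safe` evaluates its pointer argument twice.

```c
#include <assert.h>
#include <stddef.h>

struct hlist_node { struct hlist_node *next, **pprev; };
struct hlist_head { struct hlist_node *first; };

/* Insert a node at the front of the list. */
static void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
{
    assert(n != NULL && h != NULL);
    n->next = h->first;
    if (h->first)
        h->first->pprev = &n->next;
    h->first = n;
    n->pprev = &h->first;
}

/* Recover the containing object from a pointer to its embedded node. */
#define hlist_entry(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Like hlist_entry, but maps a NULL node to a NULL object. */
#define hlist_entry_safe(ptr, type, member) \
    ((ptr) ? hlist_entry(ptr, type, member) : NULL)

/* Three-argument iterator: pos is the object cursor, no node cursor needed. */
#define hlist_for_each_entry(pos, head, member)                               \
    for (pos = hlist_entry_safe((head)->first, __typeof__(*(pos)), member);   \
         pos;                                                                 \
         pos = hlist_entry_safe((pos)->member.next, __typeof__(*(pos)), member))

/* Example payload with an embedded list node. */
struct item {
    int value;
    struct hlist_node node;
};

/* Walk the list with the same call shape as list_for_each_entry. */
static int sum_items(struct hlist_head *head)
{
    struct item *it;
    int sum = 0;

    hlist_for_each_entry(it, head, node)
        sum += it->value;
    return sum;
}
```

The loop advances through `pos->member.next` directly, which is why the extra `struct hlist_node *` cursor of the old four-argument form is unnecessary.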
  14. 26 Feb, 2013 1 commit
  15. 14 Feb, 2013 1 commit
  16. 31 Jan, 2013 1 commit
  17. 03 Jan, 2013 1 commit
  18. 27 Dec, 2012 1 commit
  19. 09 Dec, 2012 2 commits
  20. 21 Nov, 2012 1 commit
    • add DOVE extensions for VXLAN · e4f67add
      David Stevens committed
      This patch provides extensions to VXLAN for supporting Distributed
      Overlay Virtual Ethernet (DOVE) networks. The patch includes:
      
      	+ a dove flag per VXLAN device to enable DOVE extensions
      	+ ARP reduction, whereby a bridge-connected VXLAN tunnel endpoint
      		answers ARP requests from the local bridge on behalf of
      		remote DOVE clients
      	+ route short-circuiting (aka L3 switching). Known destination IP
      		addresses use the corresponding destination MAC address for
      		switching rather than going to a (possibly remote) router first.
      	+ netlink notification messages for forwarding table and L3 switching
      		misses
      
      Changes since v2
      	- combined bools into "u32 flags"
      	- replaced loop with !is_zero_ether_addr()
      Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  21. 18 Nov, 2012 1 commit
  22. 15 Nov, 2012 1 commit
  23. 14 Nov, 2012 3 commits
  24. 04 Nov, 2012 1 commit
  25. 01 Nov, 2012 1 commit
  26. 11 Oct, 2012 8 commits