1. 20 Sep, 2014 1 commit
    • net: allow macvlans to move to net namespace · 0d0162e7
      Francesco Ruggeri committed
      I cannot move a macvlan interface created on top of a bonding interface
      to a different namespace:
      
      % ip netns add dummy0
      % ip link add link bond0 mac0 type macvlan
      % ip link set mac0 netns dummy0
      RTNETLINK answers: Invalid argument
      %
      
      The problem seems to be that commit f9399814 ("bonding: Don't allow
      bond devices to change network namespaces.") sets NETIF_F_NETNS_LOCAL
      on bonding interfaces, and commit 797f87f8 ("macvlan: fix netdev
      feature propagation from lower device") causes macvlan interfaces
      to inherit their features from the lower device.
      
      NETIF_F_NETNS_LOCAL should not be inherited from the lower device
      by a macvlan.
      Patch tested on 3.16.
      Signed-off-by: Francesco Ruggeri <fruggeri@arista.com>
      Acked-by: Cong Wang <cwang@twopensource.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
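Illustration (not part of the commit): a minimal sketch of the idea that a macvlan inherits the lower device's feature bits except the namespace-local one. The bit positions and both function names are assumptions made up for this example; they are not the kernel's values or API.

```python
# Hypothetical bit positions, for illustration only; the kernel's real
# NETIF_F_* values differ.
NETIF_F_SG = 1 << 0
NETIF_F_TSO = 1 << 1
NETIF_F_NETNS_LOCAL = 1 << 2


def macvlan_inherited_features(lower_features: int) -> int:
    """Inherit the lower device's features, except the namespace-local bit."""
    return lower_features & ~NETIF_F_NETNS_LOCAL


def can_move_to_netns(features: int) -> bool:
    """A device flagged NETNS_LOCAL may not change network namespace."""
    return not (features & NETIF_F_NETNS_LOCAL)


# A bonding device carries NETNS_LOCAL; the macvlan on top should not.
bond0 = NETIF_F_SG | NETIF_F_TSO | NETIF_F_NETNS_LOCAL
mac0 = macvlan_inherited_features(bond0)
print(can_move_to_netns(bond0), can_move_to_netns(mac0))
```

In this toy model the bond stays pinned to its namespace while the macvlan stacked on it can move, which is the behaviour the commit message asks for.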
  2. 22 Aug, 2014 1 commit
  3. 15 Aug, 2014 1 commit
    • Revert "macvlan: simplify the structure port" · 5e3c516b
      David S. Miller committed
      This reverts commit a188a54d.
      
      It causes crashes
      
      ====================
      [   80.643286] BUG: unable to handle kernel NULL pointer dereference at 0000000000000878
      [   80.670103] IP: [<ffffffff810832e4>] try_to_grab_pending+0x64/0x1f0
      [   80.691289] PGD 22c102067 PUD 235bf0067 PMD 0
      [   80.706611] Oops: 0002 [#1] SMP
      [   80.717836] Modules linked in: macvlan nfsd lockd nfs_acl exportfs auth_rpcgss sunrpc oid_registry ioatdma ixgbe(-) mdio igb dca
      [   80.757935] CPU: 37 PID: 6724 Comm: rmmod Not tainted 3.16.0-net-next-08-12-2014-FCoE+ #1
      [   80.785688] Hardware name: Intel Corporation S2600CO/S2600CO, BIOS SE5C600.86B.02.03.0003.041920141333 04/19/2014
      [   80.820310] task: ffff880235a9eae0 ti: ffff88022e844000 task.ti: ffff88022e844000
      [   80.845770] RIP: 0010:[<ffffffff810832e4>]  [<ffffffff810832e4>] try_to_grab_pending+0x64/0x1f0
      [   80.875326] RSP: 0018:ffff88022e847b28  EFLAGS: 00010046
      [   80.893251] RAX: 0000000000037a6a RBX: 0000000000000878 RCX: 0000000000000000
      [   80.917187] RDX: ffff880235a9eae0 RSI: 0000000000000001 RDI: ffffffff810832db
      [   80.941125] RBP: ffff88022e847b58 R08: 0000000000000000 R09: 0000000000000000
      [   80.965056] R10: 0000000000000001 R11: 0000000000000001 R12: ffff88022e847b70
      [   80.988994] R13: 0000000000000000 R14: ffff88022e847be8 R15: ffffffff81ebe440
      [   81.012929] FS:  00007fab90b07700(0000) GS:ffff88043f7a0000(0000) knlGS:0000000000000000
      [   81.040400] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   81.059757] CR2: 0000000000000878 CR3: 0000000235a42000 CR4: 00000000001407e0
      [   81.083689] Stack:
      [   81.090739]  ffff880235a9eae0 0000000000000878 ffff88022e847b70 0000000000000000
      [   81.116253]  ffff88022e847be8 ffffffff81ebe440 ffff88022e847b98 ffffffff810847f1
      [   81.141766]  ffff88022e847b78 0000000000000286 ffff880234200000 0000000000000000
      [   81.167282] Call Trace:
      [   81.175768]  [<ffffffff810847f1>] __cancel_work_timer+0x31/0x170
      [   81.195985]  [<ffffffff8108494b>] cancel_work_sync+0xb/0x10
      [   81.214769]  [<ffffffffa015ae68>] macvlan_port_destroy+0x28/0x60 [macvlan]
      [   81.237844]  [<ffffffffa015b930>] macvlan_uninit+0x40/0x50 [macvlan]
      [   81.259209]  [<ffffffff816bf6e2>] rollback_registered_many+0x1a2/0x2c0
      [   81.281140]  [<ffffffff816bf81a>] unregister_netdevice_many+0x1a/0xb0
      [   81.302786]  [<ffffffffa015a4ff>] macvlan_device_event+0x1ef/0x240 [macvlan]
      [   81.326439]  [<ffffffff8108a13d>] notifier_call_chain+0x4d/0x70
      [   81.346366]  [<ffffffff8108a201>] raw_notifier_call_chain+0x11/0x20
      [   81.367439]  [<ffffffff816bf25b>] call_netdevice_notifiers_info+0x3b/0x70
      [   81.390228]  [<ffffffff816bf2a1>] call_netdevice_notifiers+0x11/0x20
      [   81.411587]  [<ffffffff816bf6bd>] rollback_registered_many+0x17d/0x2c0
      [   81.433518]  [<ffffffff816bf925>] unregister_netdevice_queue+0x75/0x110
      [   81.455735]  [<ffffffff816bfb2b>] unregister_netdev+0x1b/0x30
      [   81.475094]  [<ffffffffa0039b50>] ixgbe_remove+0x170/0x1d0 [ixgbe]
      [   81.495886]  [<ffffffff813512a2>] pci_device_remove+0x32/0x60
      [   81.515246]  [<ffffffff814c75c4>] __device_release_driver+0x64/0xd0
      [   81.536321]  [<ffffffff814c76f8>] driver_detach+0xc8/0xd0
      [   81.554530]  [<ffffffff814c656e>] bus_remove_driver+0x4e/0xa0
      [   81.573888]  [<ffffffff814c828b>] driver_unregister+0x2b/0x60
      [   81.593246]  [<ffffffff8135143e>] pci_unregister_driver+0x1e/0xa0
      [   81.613749]  [<ffffffffa005db18>] ixgbe_exit_module+0x1c/0x2e [ixgbe]
      [   81.635401]  [<ffffffff810e738b>] SyS_delete_module+0x15b/0x1e0
      [   81.655334]  [<ffffffff8187a395>] ? sysret_check+0x22/0x5d
      [   81.673833]  [<ffffffff810abd2d>] ? trace_hardirqs_on_caller+0x11d/0x1e0
      [   81.696339]  [<ffffffff8132bfde>] ? trace_hardirqs_on_thunk+0x3a/0x3f
      [   81.717985]  [<ffffffff8187a369>] system_call_fastpath+0x16/0x1b
      [   81.738199] Code: 00 48 83 3d 6e bb da 00 00 48 89 c2 0f 84 67 01 00 00 fa 66 0f 1f 44 00 00 49 89 14 24 e8 b5 4b 02 00 45 84 ed 0f 85 ac 00 00 00 <f0> 0f ba 2b 00 72 1d 31 c0 48 8b 5d d8 4c 8b 65 e0 4c 8b 6d e8
      [   81.807026] RIP  [<ffffffff810832e4>] try_to_grab_pending+0x64/0x1f0
      [   81.828468]  RSP <ffff88022e847b28>
      [   81.840384] CR2: 0000000000000878
      [   81.851731] ---[ end trace 9f6c7232e3464e11 ]---
      ====================
      
      This bug could be triggered by these steps:
      
      modprobe ixgbe ; modprobe macvlan
      ip link add link p96p1 address 00:1B:21:6E:06:00 macvlan0 type macvlan
      ip link add link p96p1 address 00:1B:21:6E:06:01 macvlan1 type macvlan
      ip link add link p96p1 address 00:1B:21:6E:06:02 macvlan2 type macvlan
      ip link add link p96p1 address 00:1B:21:6E:06:03 macvlan3 type macvlan
      rmmod ixgbe
      Reported-by: "Keller, Jacob E" <jacob.e.keller@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 01 Aug, 2014 1 commit
  5. 09 Jun, 2014 1 commit
  6. 05 Jun, 2014 1 commit
    • macvlan: Support bonding events · 4c991255
      Vlad Yasevich committed
      Bonding and team drivers generate specific events during failover
      that trigger switch updates.  When a macvlan device is configured
      on top of bonding, we want switches to learn about the macvlan
      devices as well.   This patch adds a handler to macvlan driver to
      propagate these events to all macvlan devices.  We let the generic
      inetdev event handler do the work.
      
      This allows macvlan to operate correctly over an active-backup
      mode bond.
      Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  7. 03 Jun, 2014 2 commits
  8. 17 May, 2014 1 commit
    • macvlan: Fix lockdep warnings with stacked macvlan devices · c674ac30
      Vlad Yasevich committed
      Macvlan devices try to avoid stacking, but that's not always
      successful or even desired.  As an example, the following
      configuration is perfectly legal and valid:
      
      eth0 <--- macvlan0 <---- vlan0.10 <--- macvlan1
      
      However, this configuration produces the following lockdep
      trace:
      [  115.620418] ======================================================
      [  115.620477] [ INFO: possible circular locking dependency detected ]
      [  115.620516] 3.15.0-rc1+ #24 Not tainted
      [  115.620540] -------------------------------------------------------
      [  115.620577] ip/1704 is trying to acquire lock:
      [  115.620604]  (&vlan_netdev_addr_lock_key/1){+.....}, at: [<ffffffff815df49c>] dev_uc_sync+0x3c/0x80
      [  115.620686]
      but task is already holding lock:
      [  115.620723]  (&macvlan_netdev_addr_lock_key){+.....}, at: [<ffffffff815da5be>] dev_set_rx_mode+0x1e/0x40
      [  115.620795]
      which lock already depends on the new lock.
      
      [  115.620853]
      the existing dependency chain (in reverse order) is:
      [  115.620894]
      -> #1 (&macvlan_netdev_addr_lock_key){+.....}:
      [  115.620935]        [<ffffffff810d57f2>] lock_acquire+0xa2/0x130
      [  115.620974]        [<ffffffff816f62e7>] _raw_spin_lock_nested+0x37/0x50
      [  115.621019]        [<ffffffffa07296c3>] vlan_dev_set_rx_mode+0x53/0x110 [8021q]
      [  115.621066]        [<ffffffff815da557>] __dev_set_rx_mode+0x57/0xa0
      [  115.621105]        [<ffffffff815da5c6>] dev_set_rx_mode+0x26/0x40
      [  115.621143]        [<ffffffff815da6be>] __dev_open+0xde/0x140
      [  115.621174]        [<ffffffff815da9ad>] __dev_change_flags+0x9d/0x170
      [  115.621174]        [<ffffffff815daaa9>] dev_change_flags+0x29/0x60
      [  115.621174]        [<ffffffff815e7f11>] do_setlink+0x321/0x9a0
      [  115.621174]        [<ffffffff815ea59f>] rtnl_newlink+0x51f/0x730
      [  115.621174]        [<ffffffff815e6e75>] rtnetlink_rcv_msg+0x95/0x250
      [  115.621174]        [<ffffffff81608b19>] netlink_rcv_skb+0xa9/0xc0
      [  115.621174]        [<ffffffff815e6dca>] rtnetlink_rcv+0x2a/0x40
      [  115.621174]        [<ffffffff81608150>] netlink_unicast+0xf0/0x1c0
      [  115.621174]        [<ffffffff8160851f>] netlink_sendmsg+0x2ff/0x740
      [  115.621174]        [<ffffffff815bc9db>] sock_sendmsg+0x8b/0xc0
      [  115.621174]        [<ffffffff815bd4b9>] ___sys_sendmsg+0x369/0x380
      [  115.621174]        [<ffffffff815bdbb2>] __sys_sendmsg+0x42/0x80
      [  115.621174]        [<ffffffff815bdc02>] SyS_sendmsg+0x12/0x20
      [  115.621174]        [<ffffffff816ffd69>] system_call_fastpath+0x16/0x1b
      [  115.621174]
      -> #0 (&vlan_netdev_addr_lock_key/1){+.....}:
      [  115.621174]        [<ffffffff810d4d43>] __lock_acquire+0x1773/0x1a60
      [  115.621174]        [<ffffffff810d57f2>] lock_acquire+0xa2/0x130
      [  115.621174]        [<ffffffff816f62e7>] _raw_spin_lock_nested+0x37/0x50
      [  115.621174]        [<ffffffff815df49c>] dev_uc_sync+0x3c/0x80
      [  115.621174]        [<ffffffffa0696d2a>] macvlan_set_mac_lists+0xca/0x110 [macvlan]
      [  115.621174]        [<ffffffff815da557>] __dev_set_rx_mode+0x57/0xa0
      [  115.621174]        [<ffffffff815da5c6>] dev_set_rx_mode+0x26/0x40
      [  115.621174]        [<ffffffff815da6be>] __dev_open+0xde/0x140
      [  115.621174]        [<ffffffff815da9ad>] __dev_change_flags+0x9d/0x170
      [  115.621174]        [<ffffffff815daaa9>] dev_change_flags+0x29/0x60
      [  115.621174]        [<ffffffff815e7f11>] do_setlink+0x321/0x9a0
      [  115.621174]        [<ffffffff815ea59f>] rtnl_newlink+0x51f/0x730
      [  115.621174]        [<ffffffff815e6e75>] rtnetlink_rcv_msg+0x95/0x250
      [  115.621174]        [<ffffffff81608b19>] netlink_rcv_skb+0xa9/0xc0
      [  115.621174]        [<ffffffff815e6dca>] rtnetlink_rcv+0x2a/0x40
      [  115.621174]        [<ffffffff81608150>] netlink_unicast+0xf0/0x1c0
      [  115.621174]        [<ffffffff8160851f>] netlink_sendmsg+0x2ff/0x740
      [  115.621174]        [<ffffffff815bc9db>] sock_sendmsg+0x8b/0xc0
      [  115.621174]        [<ffffffff815bd4b9>] ___sys_sendmsg+0x369/0x380
      [  115.621174]        [<ffffffff815bdbb2>] __sys_sendmsg+0x42/0x80
      [  115.621174]        [<ffffffff815bdc02>] SyS_sendmsg+0x12/0x20
      [  115.621174]        [<ffffffff816ffd69>] system_call_fastpath+0x16/0x1b
      [  115.621174]
      other info that might help us debug this:
      
      [  115.621174]  Possible unsafe locking scenario:
      
      [  115.621174]        CPU0                    CPU1
      [  115.621174]        ----                    ----
      [  115.621174]   lock(&macvlan_netdev_addr_lock_key);
      [  115.621174]                                lock(&vlan_netdev_addr_lock_key/1);
      [  115.621174]                                lock(&macvlan_netdev_addr_lock_key);
      [  115.621174]   lock(&vlan_netdev_addr_lock_key/1);
      [  115.621174]
       *** DEADLOCK ***
      
      [  115.621174] 2 locks held by ip/1704:
      [  115.621174]  #0:  (rtnl_mutex){+.+.+.}, at: [<ffffffff815e6dbb>] rtnetlink_rcv+0x1b/0x40
      [  115.621174]  #1:  (&macvlan_netdev_addr_lock_key){+.....}, at: [<ffffffff815da5be>] dev_set_rx_mode+0x1e/0x40
      [  115.621174]
      stack backtrace:
      [  115.621174] CPU: 3 PID: 1704 Comm: ip Not tainted 3.15.0-rc1+ #24
      [  115.621174] Hardware name: Hewlett-Packard HP xw8400 Workstation/0A08h, BIOS 786D5 v02.38 10/25/2010
      [  115.621174]  ffffffff82339ae0 ffff880465f79568 ffffffff816ee20c ffffffff82339ae0
      [  115.621174]  ffff880465f795a8 ffffffff816e9e1b ffff880465f79600 ffff880465b019c8
      [  115.621174]  0000000000000001 0000000000000002 ffff880465b019c8 ffff880465b01230
      [  115.621174] Call Trace:
      [  115.621174]  [<ffffffff816ee20c>] dump_stack+0x4d/0x66
      [  115.621174]  [<ffffffff816e9e1b>] print_circular_bug+0x200/0x20e
      [  115.621174]  [<ffffffff810d4d43>] __lock_acquire+0x1773/0x1a60
      [  115.621174]  [<ffffffff810d3172>] ? trace_hardirqs_on_caller+0xb2/0x1d0
      [  115.621174]  [<ffffffff810d57f2>] lock_acquire+0xa2/0x130
      [  115.621174]  [<ffffffff815df49c>] ? dev_uc_sync+0x3c/0x80
      [  115.621174]  [<ffffffff816f62e7>] _raw_spin_lock_nested+0x37/0x50
      [  115.621174]  [<ffffffff815df49c>] ? dev_uc_sync+0x3c/0x80
      [  115.621174]  [<ffffffff815df49c>] dev_uc_sync+0x3c/0x80
      [  115.621174]  [<ffffffffa0696d2a>] macvlan_set_mac_lists+0xca/0x110 [macvlan]
      [  115.621174]  [<ffffffff815da557>] __dev_set_rx_mode+0x57/0xa0
      [  115.621174]  [<ffffffff815da5c6>] dev_set_rx_mode+0x26/0x40
      [  115.621174]  [<ffffffff815da6be>] __dev_open+0xde/0x140
      [  115.621174]  [<ffffffff815da9ad>] __dev_change_flags+0x9d/0x170
      [  115.621174]  [<ffffffff815daaa9>] dev_change_flags+0x29/0x60
      [  115.621174]  [<ffffffff811e1db1>] ? mem_cgroup_bad_page_check+0x21/0x30
      [  115.621174]  [<ffffffff815e7f11>] do_setlink+0x321/0x9a0
      [  115.621174]  [<ffffffff810d394c>] ? __lock_acquire+0x37c/0x1a60
      [  115.621174]  [<ffffffff815ea59f>] rtnl_newlink+0x51f/0x730
      [  115.621174]  [<ffffffff815ea169>] ? rtnl_newlink+0xe9/0x730
      [  115.621174]  [<ffffffff815e6e75>] rtnetlink_rcv_msg+0x95/0x250
      [  115.621174]  [<ffffffff810d329d>] ? trace_hardirqs_on+0xd/0x10
      [  115.621174]  [<ffffffff815e6dbb>] ? rtnetlink_rcv+0x1b/0x40
      [  115.621174]  [<ffffffff815e6de0>] ? rtnetlink_rcv+0x40/0x40
      [  115.621174]  [<ffffffff81608b19>] netlink_rcv_skb+0xa9/0xc0
      [  115.621174]  [<ffffffff815e6dca>] rtnetlink_rcv+0x2a/0x40
      [  115.621174]  [<ffffffff81608150>] netlink_unicast+0xf0/0x1c0
      [  115.621174]  [<ffffffff8160851f>] netlink_sendmsg+0x2ff/0x740
      [  115.621174]  [<ffffffff815bc9db>] sock_sendmsg+0x8b/0xc0
      [  115.621174]  [<ffffffff8119d4af>] ? might_fault+0x5f/0xb0
      [  115.621174]  [<ffffffff8119d4f8>] ? might_fault+0xa8/0xb0
      [  115.621174]  [<ffffffff8119d4af>] ? might_fault+0x5f/0xb0
      [  115.621174]  [<ffffffff815cb51e>] ? verify_iovec+0x5e/0xe0
      [  115.621174]  [<ffffffff815bd4b9>] ___sys_sendmsg+0x369/0x380
      [  115.621174]  [<ffffffff816faa0d>] ? __do_page_fault+0x11d/0x570
      [  115.621174]  [<ffffffff810cfe9f>] ? up_read+0x1f/0x40
      [  115.621174]  [<ffffffff816fab04>] ? __do_page_fault+0x214/0x570
      [  115.621174]  [<ffffffff8120a10b>] ? mntput_no_expire+0x6b/0x1c0
      [  115.621174]  [<ffffffff8120a0b7>] ? mntput_no_expire+0x17/0x1c0
      [  115.621174]  [<ffffffff8120a284>] ? mntput+0x24/0x40
      [  115.621174]  [<ffffffff815bdbb2>] __sys_sendmsg+0x42/0x80
      [  115.621174]  [<ffffffff815bdc02>] SyS_sendmsg+0x12/0x20
      [  115.621174]  [<ffffffff816ffd69>] system_call_fastpath+0x16/0x1b
      
      Fix this by correctly providing macvlan lockdep class.
      Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
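Illustration (not part of the commit): a toy model of why the trace above reports a circular dependency and how a per-nesting-level lock subclass could avoid it. It assumes that lock classes form a "held, then acquired" graph and that a cycle in that graph is what gets reported. The class labels, the depth numbering and has_cycle() are hypothetical; this is not lockdep's real implementation.

```python
def has_cycle(edges):
    """Return True if the (held, acquired) lock-class graph contains a cycle."""
    graph = {}
    for held, acquired in edges:
        graph.setdefault(held, set()).add(acquired)

    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True  # reached a class that is still on the current path
        visiting.add(node)
        for nxt in graph.get(node, ()):
            if visit(nxt):
                return True
        visiting.discard(node)
        done.add(node)
        return False

    return any(visit(node) for node in list(graph))


# Stack from the message: eth0 <- macvlan0 <- vlan0.10 <- macvlan1
# One class shared by every macvlan: vlan0.10 takes macvlan0's lock while
# macvlan1 (same class) takes vlan0.10's lock, so the graph has a cycle.
shared_class = [("vlan/1", "macvlan"), ("macvlan", "vlan/1")]

# Assumed fix: tag each macvlan's lock with its stacking depth.
per_level = [("vlan/1", "macvlan/0"), ("macvlan/2", "vlan/1")]

print(has_cycle(shared_class), has_cycle(per_level))
```

With distinct labels per stacking depth, the two macvlan locks no longer collapse into one graph node, so the false cycle disappears in this model.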
  9. 16 May, 2014 1 commit
  10. 15 May, 2014 1 commit
  11. 13 May, 2014 1 commit
  12. 01 May, 2014 1 commit
    • Revert "macvlan : fix checksums error when we are in bridge mode" · f114890c
      Vlad Yasevich committed
      This reverts commit 12a2856b.
      The commit above doesn't appear to be necessary any more as the
      checksums appear to be correctly computed/validated.
      
      Additionally the above commit breaks kvm configurations where
      one VM is using a device that supports checksum offload (virtio) and
      the other VM does not.
      In this case, packets leaving the virtio device will have CHECKSUM_PARTIAL
      set.  The packet is forwarded to a macvtap that has offload features
      turned off.  Since we use CHECKSUM_UNNECESSARY, the host does not
      update the checksum and thus a bad checksum is passed up to
      the guest.
      
      CC: Daniel Lezcano <daniel.lezcano@free.fr>
      CC: Patrick McHardy <kaber@trash.net>
      CC: Andrian Nord <nightnord@gmail.com>
      CC: Eric Dumazet <eric.dumazet@gmail.com>
      CC: Michael S. Tsirkin <mst@redhat.com>
      CC: Jason Wang <jasowang@redhat.com>
      Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Acked-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  13. 24 Apr, 2014 1 commit
  14. 21 Apr, 2014 1 commit
    • macvlan: Move broadcasts into a work queue · 412ca155
      Herbert Xu committed
      Currently broadcasts are handled in network RX context, where
      the packets are sent through netif_rx.  This means that the number
      of macvlans will be constrained by the capacity of netif_rx.
      
      For example, setting up 4096 macvlans practically causes all
      broadcast packets to be dropped as the default netif_rx queue
      size simply can't handle 4096 skbs being stuffed into it all
      at once.
      
      Fundamentally, we need to ensure that the amount of work handled
      in each netif_rx backlog run is constrained.  As broadcasts are
      anything but constrained, it either needs to be limited per run
      or moved to process context.
      
      This patch picks the second option and moves all broadcast handling
      bar the trivial case of packets going to a single interface into
      a work queue.  Obviously there also needs to be a limit on how
      many broadcast packets we postpone in this way.  I've arbitrarily
      chosen tx_queue_len of the master device as the limit (act_mirred
      also happens to use this parameter in a similar way).
      
      In order to ensure we don't exceed the backlog queue we will use
      netif_rx_ni instead of netif_rx for broadcast packets.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
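Illustration (not part of the commit): a minimal sketch of the approach described above, in which broadcast packets are deferred to a bounded backlog in the receive path and fanned out later by a worker. The class, its methods and the limit parameter (standing in for tx_queue_len of the master device) are hypothetical; kernel work queues and netif_rx_ni are not modelled.

```python
from collections import deque


class BroadcastBacklog:
    """Bounded backlog of broadcast packets awaiting deferred delivery."""

    def __init__(self, limit):
        self.limit = limit      # stands in for tx_queue_len (assumption)
        self.pending = deque()
        self.dropped = 0

    def enqueue(self, packet):
        """RX-context side: defer the packet, or drop it if the backlog is full."""
        if len(self.pending) >= self.limit:
            self.dropped += 1
            return False
        self.pending.append(packet)
        return True

    def run_worker(self, receivers):
        """Process-context side: hand every deferred packet to every receiver."""
        delivered = 0
        while self.pending:
            packet = self.pending.popleft()
            for receive in receivers:
                receive(packet)
                delivered += 1
        return delivered


backlog = BroadcastBacklog(limit=2)
accepted = [backlog.enqueue(p) for p in ("arp-1", "arp-2", "arp-3")]
inboxes = [[], [], []]                       # three toy macvlan receivers
delivered = backlog.run_worker([box.append for box in inboxes])
print(accepted, backlog.dropped, delivered)
```

The receive path does a constant amount of work per packet regardless of how many macvlans exist; the per-interface fan-out happens only in the worker.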
  15. 15 Mar, 2014 1 commit
  16. 04 Mar, 2014 1 commit
    • macvlan: Add support for 'always_on' offload features · 8b4703e9
      Vlad Yasevich committed
      Macvlan currently inherits all of its features from the lower
      device.  When the lower device disables offload support, this causes
      macvlan to disable offload support as well.  This causes a
      performance regression when using macvlan/macvtap in bridge
      mode.
      
      It can be easily demonstrated by creating 2 namespaces using
      macvlan in bridge mode and running netperf between them:
      
      MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.0.0.1 () port 0 AF_INET
      Recv   Send    Send
      Socket Socket  Message  Elapsed
      Size   Size    Size     Time     Throughput
      bytes  bytes   bytes    secs.    10^6bits/sec
      
       87380  16384  16384    20.00    1204.61
      
      To restore the performance, we add software offload features
      to the list of "always_on" features for macvlan.  This way
      when a namespace or a guest using macvtap initially sends a
      packet, this packet will not be segmented at macvlan level.
      It will only be segmented when macvlan sends the packet
      to the lower device.
      
      MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.0.0.1 () port 0 AF_INET
      Recv   Send    Send
      Socket Socket  Message  Elapsed
      Size   Size    Size     Time     Throughput
      bytes  bytes   bytes    secs.    10^6bits/sec
      
       87380  16384  16384    20.00    5507.35
      
      Fixes: 6acf54f1 (macvtap: Add support of packet capture on macvtap device.)
      Fixes: 797f87f8 (macvlan: fix netdev feature propagation from lower device)
      CC: Florian Westphal <fw@strlen.de>
      CC: Christian Borntraeger <borntraeger@de.ibm.com>
      CC: Jason Wang <jasowang@redhat.com>
      CC: Michael S. Tsirkin <mst@redhat.com>
      Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  17. 15 Feb, 2014 1 commit
  18. 14 Feb, 2014 1 commit
  19. 11 Jan, 2014 2 commits
    • net: core: explicitly select a txq before doing l2 forwarding · f663dd9a
      Jason Wang committed
      Currently, the tx queue is selected implicitly in ndo_dfwd_start_xmit(). This
      causes several issues:
      
      - NETIF_F_LLTX was removed for macvlan, so txq locking is done for macvlan
        instead of the lower device, which misses the txq synchronization the
        lower device needs, such as txq stopping or freezing required by the dev
        watchdog or control path.
      - dev_hard_start_xmit() is called with a NULL txq, which bypasses the net device
        watchdog.
      - dev_hard_start_xmit() does not check txq everywhere, which leads to a crash
        when tso is disabled for the lower device.
      
      Fix this by explicitly introducing a new param for .ndo_select_queue() just for
      selecting queues in the case of l2 forwarding offload. netdev_pick_tx() is also
      extended to accept this parameter and dev_queue_xmit_accel() is used to do l2
      forwarding transmission.
      
      With this fix, NETIF_F_LLTX can be preserved for macvlan and there's no need
      to check txq against NULL in dev_hard_start_xmit(). Also there's no need to keep
      a dedicated ndo_dfwd_start_xmit() and we can just reuse the code of
      dev_queue_xmit() to do the transmission.
      
      In the future, this will also be required for macvtap l2 forwarding support since it
      provides a necessary synchronization method.
      
      Cc: John Fastabend <john.r.fastabend@intel.com>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Cc: e1000-devel@lists.sourceforge.net
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Acked-by: John Fastabend <john.r.fastabend@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • macvlan: forbid L2 fowarding offload for macvtap · b13ba1b8
      Jason Wang committed
      L2 forwarding offload bypasses the rx handler of the real device, so
      the packet cannot be forwarded to the macvtap device. Another problem is that
      the dev_hard_start_xmit() called for macvtap does not have any synchronization.
      
      Fix this by forbidding L2 forwarding for macvtap.
      
      Cc: John Fastabend <john.r.fastabend@intel.com>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Acked-by: John Fastabend <john.r.fastabend@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  20. 05 Jan, 2014 1 commit
  21. 28 Dec, 2013 1 commit
  22. 27 Dec, 2013 1 commit
    • macvlan: fix netdev feature propagation from lower device · 797f87f8
      Florian Westphal committed
      There are inconsistencies wrt. feature propagation/inheritance between
      macvlan and the underlying interface.
      
      When a feature is turned off on the real device before a macvlan is
      created on top, it will remain enabled on the macvlan device, whereas
      when the feature is turned off on the lower device after macvlan creation, the
      kernel will propagate the change to the macvlan.
      
      The second issue is that, when propagating changes from the underlying device
      to the macvlan interface, macvlan can erroneously lose its NETIF_F_LLTX flag,
      as features are ANDed with the underlying device.
      
      However, LLTX should be kept since it has no dependencies on physical
      hardware (LLTX is set on macvlan creation regardless of the lower
      device properties, see 8ffab51b
      (macvlan: lockless tx path)).
      
      The LLTX flag is now forced regardless of user settings in absence of
      layer2 hw acceleration (a6cc0cfa,
      net: Add layer 2 hardware acceleration operations for macvlan devices).
      
      Use netdev_increment_features to rebuild the feature set on capability
      changes on either the lower device or on the macvlan interface.
      
      As pointed out by Ben Hutchings, use netdev_update_features on
      NETDEV_FEAT_CHANGE event (it calls macvlan_fix_features/netdev_features_change
      if needed).
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  23. 13 Dec, 2013 1 commit
  24. 06 Dec, 2013 1 commit
    • macvlan: Support creating macvtaps from macvlans · d70f2cf5
      Kevin Wallace committed
      When running in a network namespace whose only link to the outside
      world is a macvlan device, not being able to create a macvtap off of
      it is a real pain.
      
      So modify macvtap creation to automatically forward a creation of a
      macvtap on a macvlan to become a creation of a macvtap on the
      underlying network device, just like is currently done with
      macvlan-on-macvlan devices.
      
      v2: Use netif_is_macvlan and macvlan_dev_real_dev helpers to make it
          more clear what we're doing.
      Signed-off-by: Kevin Wallace <kevin@pentabarf.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  25. 08 Nov, 2013 1 commit
  26. 06 Nov, 2013 1 commit
    • net: Explicitly initialize u64_stats_sync structures for lockdep · 827da44c
      John Stultz committed
      In order to enable lockdep on seqcount/seqlock structures, we
      must explicitly initialize any locks.
      
      The u64_stats_sync structure uses a seqcount, and thus we need
      to introduce a u64_stats_init() function and use it to initialize
      the structure.
      
      This unfortunately adds a lot of fairly trivial initialization code
      to a number of drivers. But the benefit of ensuring correctness makes
      this worthwhile.
      
      Because these changes are required for lockdep to be enabled, and the
      changes are quite trivial, I've not yet split this patch out into 30-some
      separate patches, as I figured it would be better to get the various
      maintainers' thoughts on how to best merge this change along with
      the seqcount lockdep enablement.
      
      Feedback would be appreciated!
      Signed-off-by: John Stultz <john.stultz@linaro.org>
      Acked-by: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
      Cc: James Morris <jmorris@namei.org>
      Cc: Jesse Gross <jesse@nicira.com>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: "Michael S. Tsirkin" <mst@redhat.com>
      Cc: Mirko Lindner <mlindner@marvell.com>
      Cc: Patrick McHardy <kaber@trash.net>
      Cc: Roger Luethi <rl@hellgate.ch>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Simon Horman <horms@verge.net.au>
      Cc: Stephen Hemminger <stephen@networkplumber.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
      Cc: Wensong Zhang <wensong@linux-vs.org>
      Cc: netdev@vger.kernel.org
      Link: http://lkml.kernel.org/r/1381186321-4906-2-git-send-email-john.stultz@linaro.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  27. 23 Oct, 2013 1 commit
    • macvlan: resolve ENOENT errors on creation · 47d4ab91
      John Fastabend committed
      After the commit below, attempting to create macvlan devices
      resulted in ENOENT errors:
      
      # ip link add link p3p2 type macvlan
      RTNETLINK answers: Invalid argument
      
      This happens because netdev_upper_dev_link() is called before
      register_netdevice() in the macvlan code. Through a call chain
      this results in a call to __netdev_adjacent_dev_insert() and
      finally a sysfs_create_link(). This requires the kobject of
      the macvlan to be registered which is done in register_netdevice().
      If there is no kobject, which is the case here, the ENOENT error
      is seen on the command line.
      
      To resolve this move the netdev_upper_dev_link() call below
      the register_netdevice() call. This aligns with vlan driver
      flow.
      
      Regression introduced here,
      
      commit 5831d66e
      Author: Veaceslav Falico <vfalico@redhat.com>
      Date:   Wed Sep 25 09:20:32 2013 +0200
      
          net: create sysfs symlinks for neighbour devices
      
      CC: Veaceslav Falico <vfalico@redhat.com>
      CC: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
      Acked-by: Veaceslav Falico <vfalico@redhat.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
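Illustration (not part of the commit): a toy model of the ordering problem described above, where creating the sysfs link requires the new device's kobject to exist already. ToySysfs and create_macvlan() are hypothetical stand-ins, not kernel interfaces; only the ordering is the point.

```python
import errno


class ToySysfs:
    """Minimal stand-in for sysfs: links can only point from registered objects."""

    def __init__(self):
        self.kobjects = set()
        self.links = set()

    def register(self, name):
        self.kobjects.add(name)

    def create_link(self, upper, lower):
        if upper not in self.kobjects:
            raise OSError(errno.ENOENT, "no kobject registered for " + upper)
        self.links.add((upper, lower))


def create_macvlan(sysfs, name, lower, link_first):
    """Create a device; link_first=True reproduces the failing order."""
    try:
        if link_first:
            sysfs.create_link(name, lower)   # kobject does not exist yet
            sysfs.register(name)
        else:
            sysfs.register(name)             # register first, as in the fix
            sysfs.create_link(name, lower)
    except OSError as err:
        return -err.errno
    return 0


sysfs = ToySysfs()
sysfs.register("p3p2")
broken = create_macvlan(sysfs, "macvlan0", "p3p2", link_first=True)
fixed = create_macvlan(sysfs, "macvlan1", "p3p2", link_first=False)
print(broken, fixed)
```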
  28. 12 Sep, 2013 1 commit
  29. 04 Sep, 2013 1 commit
  30. 31 Aug, 2013 1 commit
  31. 06 Aug, 2013 1 commit
    • macvlan: validate flags · 15127478
      Michael S. Tsirkin committed
      commit df8ef8f3
          macvlan: add FDB bridge ops and macvlan flags
      added a flags field to macvlan, which can be
      controlled from userspace.
      The idea is to make the interface future-proof
      so we can add flags and not new fields.
      
      However, the flags value isn't validated; as a result,
      userspace can't detect which flags are supported.
      
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: John Fastabend <john.r.fastabend@intel.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
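Illustration (not part of the commit): a minimal sketch of flag validation, assuming the only supported flag is NOPROMISC and that unknown bits are rejected with EINVAL. The numeric flag value and the function name are assumptions for this example.

```python
import errno

MACVLAN_FLAG_NOPROMISC = 1          # assumed value, for illustration
SUPPORTED_FLAGS = MACVLAN_FLAG_NOPROMISC


def validate_flags(flags: int) -> int:
    """Return 0 if every set bit is a known flag, else a negative errno."""
    if flags & ~SUPPORTED_FLAGS:
        return -errno.EINVAL
    return 0


print(validate_flags(MACVLAN_FLAG_NOPROMISC), validate_flags(1 << 1))
```

Rejecting unknown bits is what lets userspace probe for support: a request with a newer flag fails on an older kernel instead of being silently ignored.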
  32. 02 Aug, 2013 2 commits
  33. 24 Jul, 2013 1 commit
  34. 26 Jun, 2013 1 commit
  35. 13 Jun, 2013 1 commit
    • macvlan: don't touch promisc without passthrough · 99ffc3e7
      Michael S. Tsirkin committed
      commit df8ef8f3
      "macvlan: add FDB bridge ops and macvlan flags"
      added a way to control NOPROMISC macvlan flag through netlink.
      
      However, with a non passthrough device we never set promisc on open,
      even if NOPROMISC is off.  As a result:
      
      If userspace clears NOPROMISC on open, then does not clear it on a
      netlink command, promisc counter is not decremented on stop and there
      will be no way to clear it once macvlan is detached.
      
      If userspace does not clear NOPROMISC on open, then sets NOPROMISC on a
      netlink command, promisc counter will be decremented from 0 and wrap
      around to 0xffffffff with no way to clear promisc.
      
      To fix, simply ignore NOPROMISC flag in a netlink command for
      non-passthrough devices, same as we do at open/close.
      
      Since we touch this code anyway - check dev_set_promiscuity return code
      and pass it to users (though an error here is unlikely).
      
      Cc: "David S. Miller" <davem@davemloft.net>
      Reviewed-by: John Fastabend <john.r.fastabend@intel.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
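Illustration (not part of the commit): a toy model of the counter problem described above, assuming the promiscuity counter behaves like a 32-bit unsigned integer. LowerDevice and change_nopromisc_flag() are hypothetical names; the guard mirrors the stated fix of ignoring NOPROMISC changes on non-passthrough devices.

```python
U32_MASK = 0xFFFFFFFF


class LowerDevice:
    """Toy lower device with a 32-bit unsigned promiscuity counter."""

    def __init__(self):
        self.promiscuity = 0

    def adjust_promiscuity(self, inc):
        # Mask to 32 bits to mimic unsigned wraparound in C.
        self.promiscuity = (self.promiscuity + inc) & U32_MASK


def change_nopromisc_flag(lower, passthru, was_set, now_set):
    """Apply a netlink NOPROMISC change; ignore it unless in passthru mode."""
    if not passthru or was_set == now_set:
        return
    lower.adjust_promiscuity(-1 if now_set else 1)


# Unguarded: a decrement that was never matched by an increment wraps around.
unguarded = LowerDevice()
unguarded.adjust_promiscuity(-1)

# Guarded: the same request on a non-passthrough device is ignored.
guarded = LowerDevice()
change_nopromisc_flag(guarded, passthru=False, was_set=False, now_set=True)

print(hex(unguarded.promiscuity), guarded.promiscuity)
```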
  36. 29 May, 2013 1 commit
  37. 12 May, 2013 1 commit
    • macvlan: fix passthru mode race between dev removal and rx path · 233c7df0
      Jiri Pirko committed
      Currently, if a macvlan in passthru mode is created, data is received on it, and
      you then remove the device, the following panic happens:
      
      NULL pointer dereference at 0000000000000198
      IP: [<ffffffffa0196058>] macvlan_handle_frame+0x153/0x1f7 [macvlan]
      
      I'm using the following script to trigger this:
      <script>
      while [ 1 ]
      do
      	ip link add link e1 name macvtap0 type macvtap mode passthru
      	ip link set e1 up
      	ip link set macvtap0 up
      	IFINDEX=`ip link |grep macvtap0 | cut -f 1 -d ':'`
      	cat /dev/tap$IFINDEX  >/dev/null &
      	ip link del dev macvtap0
      done
      </script>
      
      I run this script while "ping -f" is running on another machine to send
      packets to e1 rx.
      
      The reason for the panic is that list_first_entry() is blindly called in
      macvlan_handle_frame() even if the list is empty. vlan is then set to an
      incorrect pointer, which leads to the crash.
      
      I'm fixing this by protecting the port->vlans list with rcu and by avoiding
      an incorrect pointer when the list is empty.
      
      Introduced by: commit eb06acdc "macvlan: Introduce 'passthru' mode to takeover the underlying device"
      Signed-off-by: Jiri Pirko <jiri@resnulli.us>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
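Illustration (not part of the commit): a toy circular list showing why taking the first entry of an empty list yields a wrong object, and how a "first or null" lookup avoids that. The ListHead class and helper functions are simplified, hypothetical stand-ins for the kernel's list primitives; RCU protection is not modelled.

```python
class ListHead:
    """Node of a circular doubly linked list; an empty head points at itself."""

    def __init__(self, owner=None):
        self.owner = owner      # object that embeds this node
        self.next = self
        self.prev = self


def list_add_tail(new, head):
    new.prev = head.prev
    new.next = head
    head.prev.next = new
    head.prev = new


def list_del(entry):
    entry.prev.next = entry.next
    entry.next.prev = entry.prev
    entry.next = entry.prev = entry


def first_entry(head):
    """Blind lookup: on an empty list this returns the head's own owner."""
    return head.next.owner


def first_or_null(head):
    """Safe lookup: returns None when the list is empty."""
    nxt = head.next
    return nxt.owner if nxt is not head else None


port_vlans = ListHead(owner="port")          # head embedded in the port
vlan = ListHead(owner="macvtap0")
list_add_tail(vlan, port_vlans)
before = first_entry(port_vlans)

list_del(vlan)                               # device removed while rx continues
blind = first_entry(port_vlans)              # wrong object: the port itself
safe = first_or_null(port_vlans)
print(before, blind, safe)
```

In C the blind lookup does not return a recognisable "wrong object" but a pointer computed from the head's address, which is why the result is a crash instead of a visible mismatch.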