1. 16 May, 2017 1 commit
    • macvlan: Fix performance issues with vlan tagged packets · 70957eae
      Committed by Vlad Yasevich
      Macvlan always turns on offload features that have a software
      fallback (NETIF_F_GSO_SOFTWARE).  This allows much higher guest-to-guest
      throughput over macvtap.
      
      However, macvtap does not turn on these features for vlan tagged traffic.
      As a result, depending on the HW that macvtap is configured on, the
      performance of guest-to-guest communication over a vlan is very
      inconsistent.  If the HW supports TSO/UFO over vlans, then the
      performance will be fine.  If not, the performance will suffer
      greatly since the VM may continue using TSO/UFO, and will force the host
      to segment the traffic and possibly overflow the macvtap queue.
      
      This patch adds the always-on offloads to vlan_features.  This
      makes sure that any vlan tagged traffic between two guests will not
      be segmented needlessly.
      Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
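The effect of the vlan_features change in 70957eae can be illustrated with a small Python model. This is only a sketch under stated assumptions, not the kernel code: the `Feature` flags, the `ALWAYS_ON_OFFLOADS` set and both helper functions are illustrative names chosen here, and the real driver operates on C feature bitmasks.

```python
from enum import IntFlag


class Feature(IntFlag):
    """Illustrative subset of offload feature bits (values are arbitrary)."""
    SG = 1
    HW_CSUM = 2
    TSO = 4
    UFO = 8


# Assumption: the offloads that have a software fallback and are always enabled.
ALWAYS_ON_OFFLOADS = Feature.SG | Feature.HW_CSUM | Feature.TSO | Feature.UFO


def macvlan_vlan_features(lower_vlan_features, patched):
    """Features the macvlan advertises for vlan tagged traffic."""
    features = lower_vlan_features  # inherited from the hardware below
    if patched:
        features |= ALWAYS_ON_OFFLOADS  # the fix: add the always-on offloads
    return features


def host_must_segment(vlan_features):
    """True when large guest packets have to be segmented by the host."""
    return not (vlan_features & Feature.TSO)


# Hardware without TSO over vlans: only the patched device avoids segmentation.
weak_hw = Feature.SG
print(host_must_segment(macvlan_vlan_features(weak_hw, patched=False)))
print(host_must_segment(macvlan_vlan_features(weak_hw, patched=True)))
```

With hardware that already offers TSO over vlans, both variants behave the same, which matches the inconsistency described in the commit message.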
  2. 25 Apr, 2017 1 commit
  3. 12 Feb, 2017 1 commit
  4. 21 Jan, 2017 1 commit
  5. 09 Jan, 2017 1 commit
  6. 08 Dec, 2016 1 commit
  7. 24 Nov, 2016 1 commit
  8. 22 Nov, 2016 1 commit
  9. 15 Nov, 2016 1 commit
  10. 10 Nov, 2016 1 commit
  11. 21 Oct, 2016 1 commit
    • net: use core MTU range checking in core net infra · 91572088
      Committed by Jarod Wilson
      geneve:
      - Merge __geneve_change_mtu back into geneve_change_mtu, set max_mtu
      - This one isn't quite as straightforward as the others and could use
        some closer inspection and testing
      
      macvlan:
      - set min/max_mtu
      
      tun:
      - set min/max_mtu, remove tun_net_change_mtu
      
      vxlan:
      - Merge __vxlan_change_mtu back into vxlan_change_mtu
      - Set max_mtu to IP_MAX_MTU and retain dynamic MTU range checks in
        change_mtu function
      - This one is also not as straightforward and could use closer inspection
        and testing from vxlan folks
      
      bridge:
      - set max_mtu of IP_MAX_MTU and retain dynamic MTU range checks in
        change_mtu function
      
      openvswitch:
      - set min/max_mtu, remove internal_dev_change_mtu
      - note: max_mtu wasn't checked previously, it's been set to 65535, which
        is the largest possible size supported
      
      sch_teql:
      - set min/max_mtu (note: max_mtu previously unchecked, used max of 65535)
      
      macsec:
      - min_mtu = 0, max_mtu = 65535
      
      macvlan:
      - min_mtu = 0, max_mtu = 65535
      
      ntb_netdev:
      - min_mtu = 0, max_mtu = 65535
      
      veth:
      - min_mtu = 68, max_mtu = 65535
      
      8021q:
      - min_mtu = 0, max_mtu = 65535
      
      CC: netdev@vger.kernel.org
      CC: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      CC: Hannes Frederic Sowa <hannes@stressinduktion.org>
      CC: Tom Herbert <tom@herbertland.com>
      CC: Daniel Borkmann <daniel@iogearbox.net>
      CC: Alexander Duyck <alexander.h.duyck@intel.com>
      CC: Paolo Abeni <pabeni@redhat.com>
      CC: Jiri Benc <jbenc@redhat.com>
      CC: WANG Cong <xiyou.wangcong@gmail.com>
      CC: Roopa Prabhu <roopa@cumulusnetworks.com>
      CC: Pravin B Shelar <pshelar@ovn.org>
      CC: Sabrina Dubroca <sd@queasysnail.net>
      CC: Patrick McHardy <kaber@trash.net>
      CC: Stephen Hemminger <stephen@networkplumber.org>
      CC: Pravin Shelar <pshelar@nicira.com>
      CC: Maxim Krasnyansky <maxk@qti.qualcomm.com>
      Signed-off-by: Jarod Wilson <jarod@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
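The core range check that 91572088 relies on can be sketched in Python as follows. This is a simplified model, not the kernel implementation: the function name `mtu_in_range` is hypothetical, and treating `max_mtu == 0` as "no upper bound" is an assumption made for this sketch. The veth and macvlan ranges are the ones listed in the commit message.

```python
def mtu_in_range(new_mtu: int, min_mtu: int, max_mtu: int) -> bool:
    """Return True when new_mtu passes a min/max range check."""
    if new_mtu < 0 or new_mtu < min_mtu:
        return False
    # Assumption: max_mtu == 0 means no upper bound was configured.
    if max_mtu > 0 and new_mtu > max_mtu:
        return False
    return True


# Ranges taken from the commit message above.
VETH_RANGE = (68, 65535)
MACVLAN_RANGE = (0, 65535)

for mtu in (67, 1500, 65536):
    print(mtu, mtu_in_range(mtu, *VETH_RANGE), mtu_in_range(mtu, *MACVLAN_RANGE))
```

Once a driver declares its bounds, a check of this shape can live in one place instead of in every driver's own change_mtu function.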
  12. 14 Aug, 2016 1 commit
    • net: remove type_check from dev_get_nest_level() · 952fcfd0
      Committed by Sabrina Dubroca
      The idea for type_check in dev_get_nest_level() was to count the number
      of nested devices of the same type (currently, only macvlan or vlan
      devices).
      This prevented the false positive lockdep warning on configurations such
      as:
      
      eth0 <--- macvlan0 <--- vlan0 <--- macvlan1
      
      However, this doesn't prevent a warning on a configuration such as:
      
      eth0 <--- macvlan0 <--- vlan0
      eth1 <--- vlan1 <--- macvlan1
      
      In this case, all the locks end up with a nesting subclass of 1, so
      lockdep thinks that there is still a deadlock:
      
      - in the first case we have (macvlan_netdev_addr_lock_key, 1) and then
        take (vlan_netdev_xmit_lock_key, 1)
      - in the second case, we have (vlan_netdev_xmit_lock_key, 1) and then
        take (macvlan_netdev_addr_lock_key, 1)
      
      By removing the linktype check in dev_get_nest_level() and always
      incrementing the nesting depth, lockdep considers this configuration
      valid.
      Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
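The lock-class argument in 952fcfd0 can be reproduced with a toy Python model. It is a deliberate simplification and an assumption of this sketch: lock classes are reduced to `(device type, nesting level)` pairs, a lock ordering is recorded only between a virtual device and the device stacked on it, and `nest_level`, `lock_class` and `has_inversion` are hypothetical helpers rather than kernel functions.

```python
# The two stacks from the commit message:
#   eth0 <--- macvlan0 <--- vlan0
#   eth1 <--- vlan1 <--- macvlan1
lower_of = {"macvlan0": "eth0", "vlan0": "macvlan0",
            "vlan1": "eth1", "macvlan1": "vlan1"}
type_of = {"eth0": "phys", "eth1": "phys",
           "macvlan0": "macvlan", "macvlan1": "macvlan",
           "vlan0": "vlan", "vlan1": "vlan"}


def nest_level(dev, type_check):
    """Walk down the stack; with type_check only same-type devices count."""
    level, cur = 0, dev
    while cur is not None:
        if not type_check or type_of[cur] == type_of[dev]:
            level += 1
        cur = lower_of.get(cur)
    return level


def lock_class(dev, type_check):
    return (type_of[dev], nest_level(dev, type_check))


def has_inversion(type_check):
    """True when the two stacks take the same two lock classes in opposite order."""
    edges = {
        (lock_class("macvlan0", type_check), lock_class("vlan0", type_check)),
        (lock_class("vlan1", type_check), lock_class("macvlan1", type_check)),
    }
    return any((b, a) in edges for (a, b) in edges)


print(has_inversion(type_check=True))   # same-type counting: every level is 1
print(has_inversion(type_check=False))  # plain depth: classes differ per stack
```

With the type check every device ends up at level 1, so the two stacks appear to take the same pair of classes in opposite order; counting plain depth gives each device a distinct class.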
  13. 10 Jun, 2016 1 commit
  14. 02 Jun, 2016 2 commits
    • macvlan: Avoid unnecessary multicast cloning · 9c127a01
      Committed by Herbert Xu
      Currently we always queue a multicast packet for further processing,
      even if none of the macvlan devices are subscribed to the address.
      
      This patch optimises this by adding a global multicast filter for
      a macvlan_port.
      
      Note that this patch doesn't handle the broadcast addresses of the
      individual macvlan devices correctly, if they are not all identical
      to vlan->lowerdev.  However, this is already broken because there
      is no mechanism in place to update the individual multicast filters
      when you change the broadcast address.
      
      If someone cares enough they should fix this by collecting all
      broadcast addresses for a macvlan as we do for multicast and unicast.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
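The idea of a port-wide multicast filter from 9c127a01 can be sketched in Python. The bitmap width, the XOR-based `mc_hash` and all function names are assumptions made for this illustration; the kernel uses its own hash and data structures.

```python
FILTER_BITS = 8  # assumption: width of the hash used to index the bitmap


def mc_hash(mac: str) -> int:
    """Hypothetical hash: XOR of the address bytes, folded to FILTER_BITS bits."""
    value = 0
    for byte in bytes.fromhex(mac.replace(":", "")):
        value ^= byte
    return value & ((1 << FILTER_BITS) - 1)


def device_filter(subscribed):
    """Bitmap with one bit set per multicast address a macvlan listens to."""
    bitmap = 0
    for mac in subscribed:
        bitmap |= 1 << mc_hash(mac)
    return bitmap


def port_filter(device_filters):
    """Port-wide filter: the union of all per-device filters."""
    bitmap = 0
    for dev_bitmap in device_filters:
        bitmap |= dev_bitmap
    return bitmap


def should_queue(port_bitmap, dest_mac):
    """Queue the packet only if some macvlan may be subscribed to dest_mac."""
    return bool(port_bitmap & (1 << mc_hash(dest_mac)))


port = port_filter([
    device_filter(["01:00:5e:00:00:01"]),
    device_filter(["33:33:00:00:00:01"]),
])
print(should_queue(port, "01:00:5e:00:00:01"))
print(should_queue(port, "01:00:5e:00:00:fb"))
```

A hash-indexed bitmap can give false positives when two addresses collide, so a filter of this kind only avoids work and does not replace the per-device check.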
    • macvlan: Fix potential use-after free for broadcasts · 260916df
      Committed by Herbert Xu
      When we postpone a broadcast packet we save the source port in
      the skb if it is local.  However, the source port can disappear
      before we get a chance to process the packet.
      
      This patch fixes this by holding a ref count on the netdev.
      
      It also delays the skb->cb modification until after we allocate
      the new skb as you should not modify shared skbs.
      
      Fixes: 412ca155 ("macvlan: Move broadcasts into a work queue")
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  15. 27 Apr, 2016 1 commit
  16. 18 Mar, 2016 1 commit
  17. 26 Feb, 2016 1 commit
  18. 18 Feb, 2016 1 commit
  19. 30 Jan, 2016 1 commit
    • macvlan: make operstate and carrier more accurate · de7d244d
      Committed by Nikolay Aleksandrov
      Currently when a macvlan is being initialized and the lower device is
      netif_carrier_ok(), the macvlan device doesn't run through
      rfc2863_policy() and is left with UNKNOWN operstate. Fix it by adding an
      unconditional linkwatch event for the new macvlan device. A similar fix is
      already used by the 8021q device (see register_vlan_dev()). Also fix the
      inconsistent state when the lower device has been down and its carrier
      was changed (when a device is down NETDEV_CHANGE doesn't get generated).
      The second issue can be seen e.g. when we have a macvlan on top of an 8021q
      device which has been down while its real device has been changing carrier
      states: after setting the 8021q device up, the macvlan device will have
      the same carrier state as it had before even though the 8021q can now
      have a different state.
      Example for case 1:
      4: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast
      state UP mode DEFAULT group default qlen 1000
      
      $ ip l add l eth2 macvl0 type macvlan
      $ ip l set macvl0 up
      $ ip l sh macvl0
      72: macvl0@eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc
      noqueue state UNKNOWN mode DEFAULT group default
          link/ether f6:0b:54:0a:9d:a3 brd ff:ff:ff:ff:ff:ff
      
      Example for case 2 (order is important):
      Prestate: eth2 UP/CARRIER, vlan1 down, vlan1-macvlan down
      $ ip l set vlan1-macvlan up
      $ ip l sh vlan1-macvlan
      71: vlan1-macvlan@vlan1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500
      qdisc noqueue state UNKNOWN mode DEFAULT group default
          link/ether 4a:b8:44:56:b9:b9 brd ff:ff:ff:ff:ff:ff
      
      [ eth2 loses CARRIER before vlan1 has been UP-ed ]
      
      $ ip l sh eth2
      4: eth2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast
      state DOWN mode DEFAULT group default qlen 1000
          link/ether 52:54:00:bf:57:16 brd ff:ff:ff:ff:ff:ff
      $ ip l sh vlan1-macvlan
      71: vlan1-macvlan@vlan1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500
      qdisc noqueue state UNKNOWN mode DEFAULT group default
          link/ether 4a:b8:44:56:b9:b9 brd ff:ff:ff:ff:ff:ff
      $ ip l set vlan1 up
      $ ip l sh vlan1
      70: vlan1@eth2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc
      noqueue state LOWERLAYERDOWN mode DEFAULT group default qlen 1000
          link/ether 52:54:00:bf:57:16 brd ff:ff:ff:ff:ff:ff
      $ ip l sh vlan1-macvlan
      71: vlan1-macvlan@vlan1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500
      qdisc noqueue state UNKNOWN mode DEFAULT group default
          link/ether 4a:b8:44:56:b9:b9 brd ff:ff:ff:ff:ff:ff
      
      vlan1-macvlan is still UP, still has carrier and is still in the same
      operstate as before. After the patch, in case 1 macvl0 has state UP as it
      should, and in case 2 vlan1-macvlan has state LOWERLAYERDOWN, again as it
      should. Note that while the lower device of a macvlan is down, the
      macvlan's carrier and thus operstate can go out of sync, but that will be
      fixed once the lower device goes up again.
      This behaviour seems to have been present since the beginning of git history.
      Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
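The two states reported in de7d244d can be modelled with a short Python sketch. It is a simplified picture that assumes the operstate stays UNKNOWN until a linkwatch event recomputes it from the carrier; the `Dev` class and both functions are hypothetical names, and details such as the dormant state are left out.

```python
from dataclasses import dataclass


@dataclass
class Dev:
    name: str
    carrier_ok: bool
    stacked: bool            # True for devices on top of another device
    operstate: str = "UNKNOWN"


def default_operstate(dev: Dev) -> str:
    """Operstate derived from the carrier, as in the examples above."""
    if not dev.carrier_ok:
        return "LOWERLAYERDOWN" if dev.stacked else "DOWN"
    return "UP"


def linkwatch_event(dev: Dev) -> None:
    """Model of the event the patch fires unconditionally for a new macvlan."""
    dev.operstate = default_operstate(dev)


# Case 1: new macvlan on a lower device that already has carrier.
macvl0 = Dev("macvl0", carrier_ok=True, stacked=True)
print(macvl0.operstate)      # no event yet
linkwatch_event(macvl0)
print(macvl0.operstate)

# Case 2: the lower device lost carrier while it was down.
stale = Dev("vlan1-macvlan", carrier_ok=False, stacked=True)
linkwatch_event(stale)
print(stale.operstate)
```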
  20. 16 Dec, 2015 2 commits
  21. 18 Nov, 2015 1 commit
  22. 13 Oct, 2015 1 commit
  23. 04 Aug, 2015 1 commit
  24. 04 May, 2015 1 commit
  25. 03 Apr, 2015 1 commit
  26. 03 Mar, 2015 1 commit
  27. 24 Jan, 2015 1 commit
  28. 10 Dec, 2014 2 commits
    • macvlan: play well with ipvlan device · d6b00fec
      Committed by Mahesh Bandewar
      If a device is already used as an ipvlan port, refuse to use it
      as a macvlan port at an early stage of port creation.
      
      	thost1:~# ip link add link eth0 ipvl0 type ipvlan
      	thost1:~# echo $?
      	0
      	thost1:~# ip link add link eth0 mvl0 type macvlan
      	RTNETLINK answers: Device or resource busy
      	thost1:~# echo $?
      	2
      	thost1:~#
      Signed-off-by: Mahesh Bandewar <maheshb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • macvlan: allow setting LRO independently of lower device · 62dbe830
      Committed by Michal Kubeček
      Since commit fbe168ba ("net: generic dev_disable_lro() stacked
      device handling"), dev_disable_lro() zeroes NETIF_F_LRO feature flag
      first for a macvlan device and then for its lower device. As an attempt
      to set NETIF_F_LRO to zero is ignored, dev_disable_lro() issues a
      warning and taints the kernel.
      
      Allowing NETIF_F_LRO to be set independently of the lower device
      consists of three parts:
      
        - add the flag to hw_features to allow toggling it
        - allow setting it to 0 even if lower device has the flag set
        - add the flag to MACVLAN_FEATURES to restore copying from lower
          device on macvlan creation
      Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  29. 03 Dec, 2014 1 commit
  30. 30 Nov, 2014 1 commit
  31. 26 Oct, 2014 1 commit
  32. 11 Oct, 2014 2 commits
    • macvlan: optimize the receive path · d1dd9119
      Committed by jbaron@akamai.com
      The netif_rx() call on the fast path of macvlan_handle_frame() appears to
      be there to ensure that we properly throttle incoming packets. However, it
      would appear as though the proper throttling is already in place for all
      possible ingress paths, and that the call is redundant. If packets are arriving
      from the physical NIC, we've already throttled them by this point. Otherwise,
      if they are coming via macvlan_queue_xmit(), it calls either
      'dev_forward_skb()', which ends up calling netif_rx_internal(), or else in
      the broadcast case, we are throttling via macvlan_broadcast_enqueue().
      
      The test results below are from off the box to an lxc instance running macvlan.
      Once the transactions/sec stop increasing, the cpu idle time has gone to 0.
      Results are from a quad core Intel E3-1270 V2@3.50GHz box with bnx2x 10G card.
      
      for i in {10,100,200,300,400,500};
      do super_netperf $i -H $ip -t TCP_RR; done
      Average of 5 runs.
      
      trans/sec                trans/sec
      (3.17-rc7-net-next)      (3.17-rc7-net-next + this patch)
      ----------               ----------
      208101                   211534 (+1.6%)
      839493                   850162 (+1.3%)
      845071                   844053 (-.12%)
      816330                   819623 (+.4%)
      778700                   789938 (+1.4%)
      735984                   754408 (+2.5%)
      Signed-off-by: Jason Baron <jbaron@akamai.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • macvlan: pass 'bool' type to macvlan_count_rx() · 4c979935
      Committed by jbaron@akamai.com
      Pass last argument to macvlan_count_rx() as the correct bool type.
      Signed-off-by: Jason Baron <jbaron@akamai.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  33. 08 Oct, 2014 1 commit
    • net: better IFF_XMIT_DST_RELEASE support · 02875878
      Committed by Eric Dumazet
      Testing xmit_more support with netperf and connected UDP sockets,
      I found strange dst refcount false sharing.
      
      Current handling of IFF_XMIT_DST_RELEASE is not optimal.
      
      Dropping dst in validate_xmit_skb() is certainly too late in case the
      packet was queued by cpu X but dequeued by cpu Y.
      
      The logical point to take care of drop/force is in __dev_queue_xmit()
      before even taking qdisc lock.
      
      As Julian Anastasov pointed out, need for skb_dst() might come from some
      packet schedulers or classifiers.
      
      This patch adds new helper to cleanly express needs of various drivers
      or qdiscs/classifiers.
      
      Drivers that need skb_dst() in their ndo_start_xmit() should call the
      following helper in their setup instead of the prior code:
      
      	dev->priv_flags &= ~IFF_XMIT_DST_RELEASE;
      ->
      	netif_keep_dst(dev);
      
      Instead of using a single bit, we use two bits, one being
      eventually rebuilt in bonding/team drivers.
      
      The other one, is permanent and blocks IFF_XMIT_DST_RELEASE being
      rebuilt in bonding/team. Eventually, we could add something
      smarter later.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: David S. Miller <davem@davemloft.net>
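The two-bit scheme described in 02875878 can be sketched in Python. The bit positions, the name `XMIT_DST_RELEASE_PERM` for the permanent bit, and the `rebuild_master_flags` helper are assumptions of this sketch; it only models the behaviour the commit message describes, where one bit is recomputed by bonding/team and the other blocks that recomputation.

```python
# Illustrative bit positions, not the kernel's values.
XMIT_DST_RELEASE = 1 << 0        # may be rebuilt by bonding/team
XMIT_DST_RELEASE_PERM = 1 << 1   # permanent: cleared once, never restored
BOTH = XMIT_DST_RELEASE | XMIT_DST_RELEASE_PERM


def keep_dst(flags: int) -> int:
    """Model of netif_keep_dst(): the device needs skb_dst(), clear both bits."""
    return flags & ~BOTH


def rebuild_master_flags(master_flags: int, slave_flags: list) -> int:
    """Recompute the rebuildable bit of a bonding/team master from its slaves."""
    combined = BOTH
    for flags in slave_flags:
        combined &= flags
    master_flags &= ~XMIT_DST_RELEASE
    # Restore the bit only if the master never asked to keep the dst
    # and every slave still releases it.
    if (master_flags & XMIT_DST_RELEASE_PERM) and combined == BOTH:
        master_flags |= XMIT_DST_RELEASE
    return master_flags


print(rebuild_master_flags(BOTH, [BOTH, BOTH]) == BOTH)
print(rebuild_master_flags(BOTH, [BOTH, keep_dst(BOTH)]) == XMIT_DST_RELEASE_PERM)
print(rebuild_master_flags(keep_dst(BOTH), [BOTH, BOTH]) == 0)
```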
  34. 30 Sep, 2014 1 commit
    • macvlan: add source mode · 79cf79ab
      Committed by Michael Braun
      This patch adds a new mode of operation to macvlan, called "source".
      It allows one to set a list of allowed MAC addresses, which is matched
      against the source MAC address of frames received on the underlying
      interface.
      This enables creating MAC-based VLAN associations, instead of the standard
      port- or tag-based ones. The feature is useful for deploying 802.1x
      MAC-based behavior where the drivers of the underlying interfaces do not
      allow that.
      
      Configuration is done through the netlink interface using e.g.:
       ip link add link eth0 name macvlan0 type macvlan mode source
       ip link add link eth0 name macvlan1 type macvlan mode source
       ip link set link dev macvlan0 type macvlan macaddr add 00:11:11:11:11:11
       ip link set link dev macvlan0 type macvlan macaddr add 00:22:22:22:22:22
       ip link set link dev macvlan0 type macvlan macaddr add 00:33:33:33:33:33
       ip link set link dev macvlan1 type macvlan macaddr add 00:33:33:33:33:33
       ip link set link dev macvlan1 type macvlan macaddr add 00:44:44:44:44:44
      
      This allows clients with MAC addresses 00:11:11:11:11:11 and
      00:22:22:22:22:22 to be part of only the VLAN associated with the macvlan0
      interface, and the client with MAC address 00:44:44:44:44:44 to be part of
      only the VLAN associated with the macvlan1 interface. The client with MAC
      address 00:33:33:33:33:33 is associated with both VLANs.
      
      Based on work of Stefan Gula <steweg@gmail.com>
      
      v8: last version of Stefan Gula for Kernel 3.2.1
      v9: rework onto linux-next 2014-03-12 by Michael Braun
          add MACADDR_SET command, enable to configure mac for source mode
          while creating interface
      v10:
        - reduce indentation level
        - rename source_list to source_entry
        - use aligned 64bit ether address
        - use hash_64 instead of addr[5]
      v11:
        - rebase for 3.14 / linux-next 20.04.2014
      v12
        - rebase for linux-next 2014-09-25
      Signed-off-by: Michael Braun <michael-dev@fami-braun.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
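The source-mode matching from 79cf79ab can be illustrated with a small Python lookup table, filled with the addresses from the configuration example in the commit message. The `SourceTable` class and its method names are hypothetical; the kernel keeps these entries in its own hash table.

```python
from collections import defaultdict


class SourceTable:
    """Maps an allowed source MAC address to the macvlan devices that accept it."""

    def __init__(self):
        self._by_mac = defaultdict(list)

    def macaddr_add(self, dev: str, mac: str) -> None:
        mac = mac.lower()
        if dev not in self._by_mac[mac]:
            self._by_mac[mac].append(dev)

    def receivers(self, src_mac: str) -> list:
        """Devices that should receive a frame sent from src_mac."""
        return list(self._by_mac.get(src_mac.lower(), []))


# Same associations as the ip link commands in the commit message.
table = SourceTable()
for mac in ("00:11:11:11:11:11", "00:22:22:22:22:22", "00:33:33:33:33:33"):
    table.macaddr_add("macvlan0", mac)
for mac in ("00:33:33:33:33:33", "00:44:44:44:44:44"):
    table.macaddr_add("macvlan1", mac)

for mac in ("00:11:11:11:11:11", "00:33:33:33:33:33", "00:55:55:55:55:55"):
    print(mac, table.receivers(mac))
```

A frame whose source address is not in the table matches no source-mode device in this model.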
  35. 23 Sep, 2014 1 commit
  36. 20 Sep, 2014 1 commit
    • net: allow macvlans to move to net namespace · 0d0162e7
      Committed by Francesco Ruggeri
      I cannot move a macvlan interface created on top of a bonding interface
      to a different namespace:
      
      % ip netns add dummy0
      % ip link add link bond0 mac0 type macvlan
      % ip link set mac0 netns dummy0
      RTNETLINK answers: Invalid argument
      %
      
      The problem seems to be that commit f9399814 ("bonding: Don't allow
      bond devices to change network namespaces.") sets NETIF_F_NETNS_LOCAL
      on bonding interfaces, and commit 797f87f8 ("macvlan: fix netdev
      feature propagation from lower device") causes macvlan interfaces
      to inherit their features from the lower device.
      
      NETIF_F_NETNS_LOCAL should not be inherited from the lower device
      by a macvlan.
      Patch tested on 3.16.
      Signed-off-by: Francesco Ruggeri <fruggeri@arista.com>
      Acked-by: Cong Wang <cwang@twopensource.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
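The inheritance problem fixed by 0d0162e7 can be shown with a minimal Python model. The `Feature` flags and the two helpers are illustrative names assumed for this sketch; the real driver masks the flag while computing its features from the lower device.

```python
from enum import IntFlag


class Feature(IntFlag):
    """Illustrative subset of feature bits (values are arbitrary)."""
    SG = 1
    TSO = 2
    NETNS_LOCAL = 4


def inherit_features(lower, patched):
    """Features a macvlan takes over from its lower device."""
    features = lower
    if patched:
        features &= ~Feature.NETNS_LOCAL  # the fix: never inherit this flag
    return features


def can_move_netns(features):
    """A device marked NETNS_LOCAL may not change network namespace."""
    return not (features & Feature.NETNS_LOCAL)


bond0 = Feature.SG | Feature.NETNS_LOCAL  # bonding sets NETNS_LOCAL
print(can_move_netns(inherit_features(bond0, patched=False)))
print(can_move_netns(inherit_features(bond0, patched=True)))
```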