1. 14 12月, 2018 1 次提交
  2. 04 12月, 2018 1 次提交
    • M
      macvlan: return correct error value · 59f997b0
      Matteo Croce 提交于
      A MAC address must be unique among all the macvlan devices with the same
      lower device. The only exception is the passthru [sic] mode,
      which shares the lower device address.
      
      When duplicate addresses are detected, EBUSY is returned when bringing
      the interface up:
      
          # ip link add macvlan0 link eth0 type macvlan
          # read addr </sys/class/net/eth0/address
          # ip link set macvlan0 address $addr
          # ip link set macvlan0 up
          RTNETLINK answers: Device or resource busy
      
      Use correct error code which is EADDRINUSE, and do the check also
      earlier, on address change:
      
          # ip link set macvlan0 address $addr
          RTNETLINK answers: Address already in use
      Signed-off-by: NMatteo Croce <mcroce@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      59f997b0
  3. 20 10月, 2018 1 次提交
    • D
      netpoll: allow cleanup to be synchronous · c9fbd71f
      Debabrata Banerjee 提交于
      This fixes a problem introduced by:
      commit 2cde6acd ("netpoll: Fix __netpoll_rcu_free so that it can hold the rtnl lock")
      
      When using netconsole on a bond, __netpoll_cleanup can asynchronously
      recurse multiple times, each __netpoll_free_async call can result in
      more __netpoll_free_async's. This means there is now a race between
      cleanup_work queues on multiple netpoll_info's on multiple devices and
      the configuration of a new netpoll. For example if a netconsole is set
      to enable 0, reconfigured, and enable 1 immediately, this netconsole
      will likely not work.
      
      Given the reason for __netpoll_free_async is it can be called when rtnl
      is not locked, if it is locked, we should be able to execute
      synchronously. It appears to be locked everywhere it's called from.
      
      Generalize the design pattern from the teaming driver for current
      callers of __netpoll_free_async.
      
      CC: Neil Horman <nhorman@tuxdriver.com>
      CC: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: NDebabrata Banerjee <dbanerje@akamai.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c9fbd71f
  4. 12 7月, 2018 1 次提交
  5. 10 7月, 2018 1 次提交
  6. 25 4月, 2018 2 次提交
  7. 12 3月, 2018 1 次提交
    • S
      macvlan: filter out unsupported feature flags · 13fbcc8d
      Shannon Nelson 提交于
      Adding a macvlan device on top of a lowerdev that supports
      the xfrm offloads fails with a new regression:
        # ip link add link ens1f0 mv0 type macvlan
        RTNETLINK answers: Operation not permitted
      
      Tracing down the failure shows that the macvlan device inherits
      the NETIF_F_HW_ESP and NETIF_F_HW_ESP_TX_CSUM feature flags
      from the lowerdev, but with no dev->xfrmdev_ops API filled
      in, it doesn't actually support xfrm.  When the request is
      made to add the new macvlan device, the XFRM listener for
      NETDEV_REGISTER calls xfrm_api_check() which fails the new
      registration because dev->xfrmdev_ops is NULL.
      
      The macvlan creation succeeds when we filter out the ESP
      feature flags in macvlan_fix_features(), so let's filter them
      out like we're already filtering out ~NETIF_F_NETNS_LOCAL.
      When XFRM support is added in the future, we can add the flags
      into MACVLAN_FEATURES.
      
      This same problem could crop up in the future with any other
      new feature flags, so let's filter out any flags that aren't
      defined as supported in macvlan.
      
      Fixes: d77e38e6 ("xfrm: Add an IPsec hardware offloading API")
      Reported-by: NAlexey Kodanev <alexey.kodanev@oracle.com>
      Signed-off-by: NShannon Nelson <shannon.nelson@oracle.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      13fbcc8d
  8. 23 2月, 2018 1 次提交
    • A
      macvlan: fix use-after-free in macvlan_common_newlink() · 4e14bf42
      Alexey Kodanev 提交于
      The following use-after-free was reported by KASan when running
      LTP macvtap01 test on 4.16-rc2:
      
      [10642.528443] BUG: KASAN: use-after-free in
                     macvlan_common_newlink+0x12ef/0x14a0 [macvlan]
      [10642.626607] Read of size 8 at addr ffff880ba49f2100 by task ip/18450
      ...
      [10642.963873] Call Trace:
      [10642.994352]  dump_stack+0x5c/0x7c
      [10643.035325]  print_address_description+0x75/0x290
      [10643.092938]  kasan_report+0x28d/0x390
      [10643.137971]  ? macvlan_common_newlink+0x12ef/0x14a0 [macvlan]
      [10643.207963]  macvlan_common_newlink+0x12ef/0x14a0 [macvlan]
      [10643.275978]  macvtap_newlink+0x171/0x260 [macvtap]
      [10643.334532]  rtnl_newlink+0xd4f/0x1300
      ...
      [10646.256176] Allocated by task 18450:
      [10646.299964]  kasan_kmalloc+0xa6/0xd0
      [10646.343746]  kmem_cache_alloc_trace+0xf1/0x210
      [10646.397826]  macvlan_common_newlink+0x6de/0x14a0 [macvlan]
      [10646.464386]  macvtap_newlink+0x171/0x260 [macvtap]
      [10646.522728]  rtnl_newlink+0xd4f/0x1300
      ...
      [10647.022028] Freed by task 18450:
      [10647.061549]  __kasan_slab_free+0x138/0x180
      [10647.111468]  kfree+0x9e/0x1c0
      [10647.147869]  macvlan_port_destroy+0x3db/0x650 [macvlan]
      [10647.211411]  rollback_registered_many+0x5b9/0xb10
      [10647.268715]  rollback_registered+0xd9/0x190
      [10647.319675]  register_netdevice+0x8eb/0xc70
      [10647.370635]  macvlan_common_newlink+0xe58/0x14a0 [macvlan]
      [10647.437195]  macvtap_newlink+0x171/0x260 [macvtap]
      
      Commit d02fd6e7 ("macvlan: Fix one possible double free") handles
      the case when register_netdevice() invokes ndo_uninit() on error and
      as a result free the port. But 'macvlan_port_get_rtnl(dev))' check
      (returns dev->rx_handler_data), which was added by this commit in order
      to prevent double free, is not quite correct:
      
      * for macvlan it always returns NULL because 'lowerdev' is the one that
        was used to register rx handler (port) in macvlan_port_create() as
        well as to unregister it in macvlan_port_destroy().
      * for macvtap it always returns a valid pointer because macvtap registers
        its own rx handler before macvlan_common_newlink().
      
      Fixes: d02fd6e7 ("macvlan: Fix one possible double free")
      Signed-off-by: NAlexey Kodanev <alexey.kodanev@oracle.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4e14bf42
  9. 03 1月, 2018 1 次提交
  10. 19 10月, 2017 1 次提交
    • A
      macvlan/macvtap: Add support for L2 forwarding offloads with macvtap · 56fd2b2c
      Alexander Duyck 提交于
      This patch reverts earlier commit b13ba1b8 ("macvlan: forbid L2
      fowarding offload for macvtap"). The reason for reverting this is because
      the original patch no longer fixes what it previously did as the
      underlying structure has changed for macvtap. Specifically macvtap
      originally pulled packets directly off of the lowerdev. However in commit
      6acf54f1 ("macvtap: Add support of packet capture on macvtap device.")
      that code was changed and instead macvtap would listen directly on the
      macvtap device itself instead of the lower device. As such, the L2
      forwarding offload should now be able to provide a performance advantage of
      skipping the checks on the lower dev while not introducing any sort of
      regression.
      Signed-off-by: NAlexander Duyck <alexander.h.duyck@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      56fd2b2c
  11. 15 10月, 2017 2 次提交
  12. 05 10月, 2017 1 次提交
  13. 21 9月, 2017 1 次提交
  14. 19 8月, 2017 1 次提交
  15. 18 7月, 2017 1 次提交
  16. 27 6月, 2017 3 次提交
  17. 22 6月, 2017 4 次提交
  18. 15 6月, 2017 1 次提交
  19. 08 6月, 2017 1 次提交
    • D
      net: Fix inconsistent teardown and release of private netdev state. · cf124db5
      David S. Miller 提交于
      Network devices can allocate reasources and private memory using
      netdev_ops->ndo_init().  However, the release of these resources
      can occur in one of two different places.
      
      Either netdev_ops->ndo_uninit() or netdev->destructor().
      
      The decision of which operation frees the resources depends upon
      whether it is necessary for all netdev refs to be released before it
      is safe to perform the freeing.
      
      netdev_ops->ndo_uninit() presumably can occur right after the
      NETDEV_UNREGISTER notifier completes and the unicast and multicast
      address lists are flushed.
      
      netdev->destructor(), on the other hand, does not run until the
      netdev references all go away.
      
      Further complicating the situation is that netdev->destructor()
      almost universally does also a free_netdev().
      
      This creates a problem for the logic in register_netdevice().
      Because all callers of register_netdevice() manage the freeing
      of the netdev, and invoke free_netdev(dev) if register_netdevice()
      fails.
      
      If netdev_ops->ndo_init() succeeds, but something else fails inside
      of register_netdevice(), it does call ndo_ops->ndo_uninit().  But
      it is not able to invoke netdev->destructor().
      
      This is because netdev->destructor() will do a free_netdev() and
      then the caller of register_netdevice() will do the same.
      
      However, this means that the resources that would normally be released
      by netdev->destructor() will not be.
      
      Over the years drivers have added local hacks to deal with this, by
      invoking their destructor parts by hand when register_netdevice()
      fails.
      
      Many drivers do not try to deal with this, and instead we have leaks.
      
      Let's close this hole by formalizing the distinction between what
      private things need to be freed up by netdev->destructor() and whether
      the driver needs unregister_netdevice() to perform the free_netdev().
      
      netdev->priv_destructor() performs all actions to free up the private
      resources that used to be freed by netdev->destructor(), except for
      free_netdev().
      
      netdev->needs_free_netdev is a boolean that indicates whether
      free_netdev() should be done at the end of unregister_netdevice().
      
      Now, register_netdevice() can sanely release all resources after
      ndo_ops->ndo_init() succeeds, by invoking both ndo_ops->ndo_uninit()
      and netdev->priv_destructor().
      
      And at the end of unregister_netdevice(), we invoke
      netdev->priv_destructor() and optionally call free_netdev().
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      cf124db5
  20. 16 5月, 2017 1 次提交
    • V
      macvlan: Fix performance issues with vlan tagged packets · 70957eae
      Vlad Yasevich 提交于
      Macvlan always turns on offload features that have sofware
      fallback (NETIF_GSO_SOFTWARE).  This allows much higher guest-guest
      communications over macvtap.
      
      However, macvtap does not turn on these features for vlan tagged traffic.
      As a result, depending on the HW that mactap is configured on, the
      performance of guest-guest communication over a vlan is very
      inconsistent.  If the HW supports TSO/UFO over vlans, then the
      performance will be fine.  If not, the the performance will suffer
      greatly since the VM may continue using TSO/UFO, and will force the host
      segment the traffic and possibly overlow the macvtap queue.
      
      This patch adds the always on offloads to vlan_features.  This
      makes sure that any vlan tagged traffic between 2 guest will not
      be segmented needlessly.
      Signed-off-by: NVladislav Yasevich <vyasevic@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      70957eae
  21. 25 4月, 2017 1 次提交
  22. 12 2月, 2017 1 次提交
  23. 21 1月, 2017 1 次提交
  24. 09 1月, 2017 1 次提交
  25. 08 12月, 2016 1 次提交
  26. 24 11月, 2016 1 次提交
  27. 22 11月, 2016 1 次提交
  28. 15 11月, 2016 1 次提交
  29. 10 11月, 2016 1 次提交
  30. 21 10月, 2016 1 次提交
    • J
      net: use core MTU range checking in core net infra · 91572088
      Jarod Wilson 提交于
      geneve:
      - Merge __geneve_change_mtu back into geneve_change_mtu, set max_mtu
      - This one isn't quite as straight-forward as others, could use some
        closer inspection and testing
      
      macvlan:
      - set min/max_mtu
      
      tun:
      - set min/max_mtu, remove tun_net_change_mtu
      
      vxlan:
      - Merge __vxlan_change_mtu back into vxlan_change_mtu
      - Set max_mtu to IP_MAX_MTU and retain dynamic MTU range checks in
        change_mtu function
      - This one is also not as straight-forward and could use closer inspection
        and testing from vxlan folks
      
      bridge:
      - set max_mtu of IP_MAX_MTU and retain dynamic MTU range checks in
        change_mtu function
      
      openvswitch:
      - set min/max_mtu, remove internal_dev_change_mtu
      - note: max_mtu wasn't checked previously, it's been set to 65535, which
        is the largest possible size supported
      
      sch_teql:
      - set min/max_mtu (note: max_mtu previously unchecked, used max of 65535)
      
      macsec:
      - min_mtu = 0, max_mtu = 65535
      
      macvlan:
      - min_mtu = 0, max_mtu = 65535
      
      ntb_netdev:
      - min_mtu = 0, max_mtu = 65535
      
      veth:
      - min_mtu = 68, max_mtu = 65535
      
      8021q:
      - min_mtu = 0, max_mtu = 65535
      
      CC: netdev@vger.kernel.org
      CC: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      CC: Hannes Frederic Sowa <hannes@stressinduktion.org>
      CC: Tom Herbert <tom@herbertland.com>
      CC: Daniel Borkmann <daniel@iogearbox.net>
      CC: Alexander Duyck <alexander.h.duyck@intel.com>
      CC: Paolo Abeni <pabeni@redhat.com>
      CC: Jiri Benc <jbenc@redhat.com>
      CC: WANG Cong <xiyou.wangcong@gmail.com>
      CC: Roopa Prabhu <roopa@cumulusnetworks.com>
      CC: Pravin B Shelar <pshelar@ovn.org>
      CC: Sabrina Dubroca <sd@queasysnail.net>
      CC: Patrick McHardy <kaber@trash.net>
      CC: Stephen Hemminger <stephen@networkplumber.org>
      CC: Pravin Shelar <pshelar@nicira.com>
      CC: Maxim Krasnyansky <maxk@qti.qualcomm.com>
      Signed-off-by: NJarod Wilson <jarod@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      91572088
  31. 14 8月, 2016 1 次提交
    • S
      net: remove type_check from dev_get_nest_level() · 952fcfd0
      Sabrina Dubroca 提交于
      The idea for type_check in dev_get_nest_level() was to count the number
      of nested devices of the same type (currently, only macvlan or vlan
      devices).
      This prevented the false positive lockdep warning on configurations such
      as:
      
      eth0 <--- macvlan0 <--- vlan0 <--- macvlan1
      
      However, this doesn't prevent a warning on a configuration such as:
      
      eth0 <--- macvlan0 <--- vlan0
      eth1 <--- vlan1 <--- macvlan1
      
      In this case, all the locks end up with a nesting subclass of 1, so
      lockdep thinks that there is still a deadlock:
      
      - in the first case we have (macvlan_netdev_addr_lock_key, 1) and then
        take (vlan_netdev_xmit_lock_key, 1)
      - in the second case, we have (vlan_netdev_xmit_lock_key, 1) and then
        take (macvlan_netdev_addr_lock_key, 1)
      
      By removing the linktype check in dev_get_nest_level() and always
      incrementing the nesting depth, lockdep considers this configuration
      valid.
      Signed-off-by: NSabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      952fcfd0
  32. 10 6月, 2016 1 次提交
  33. 02 6月, 2016 1 次提交
    • H
      macvlan: Avoid unnecessary multicast cloning · 9c127a01
      Herbert Xu 提交于
      Currently we always queue a multicast packet for further processing,
      even if none of the macvlan devices are subscribed to the address.
      
      This patch optimises this by adding a global multicast filter for
      a macvlan_port.
      
      Note that this patch doesn't handle the broadcast addresses of the
      individual macvlan devices correctly, if they are not all identical
      to vlan->lowerdev.  However, this is already broken because there
      is no mechanism in place to update the individual multicast filters
      when you change the broadcast address.
      
      If someone cares enough they should fix this by collecting all
      broadcast addresses for a macvlan as we do for multicast and unicast.
      Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9c127a01