1. 26 12月, 2019 1 次提交
  2. 26 11月, 2019 1 次提交
    • M
      macvlan: schedule bc_work even if error · 1d7ea556
      Menglong Dong 提交于
      While enqueueing a broadcast skb to port->bc_queue, schedule_work()
      is called to add port->bc_work, which processes the skbs in
      bc_queue, to "events" work queue. If port->bc_queue is full, the
      skb will be discarded and schedule_work(&port->bc_work) won't be
      called. However, if port->bc_queue is full and port->bc_work is not
      running or pending, port->bc_queue will keep full and schedule_work()
      won't be called any more, and all broadcast skbs to macvlan will be
      discarded. This case can happen:
      
      macvlan_process_broadcast() is the pending function of port->bc_work,
      it moves all the skbs in port->bc_queue to the queue "list", and
      processes the skbs in "list". During this, new skbs will keep being
      added to port->bc_queue in macvlan_broadcast_enqueue(), and
      port->bc_queue may already full when macvlan_process_broadcast()
      return. This may happen, especially when there are a lot of real-time
      threads and the process is preempted.
      
      Fix this by calling schedule_work(&port->bc_work) even if
      port->bc_work is full in macvlan_broadcast_enqueue().
      
      Fixes: 412ca155 ("macvlan: Move broadcasts into a work queue")
      Signed-off-by: NMenglong Dong <dong.menglong@zte.com.cn>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1d7ea556
  3. 25 10月, 2019 2 次提交
    • T
      net: remove unnecessary variables and callback · f3b0a18b
      Taehee Yoo 提交于
      This patch removes variables and callback these are related to the nested
      device structure.
      devices that can be nested have their own nest_level variable that
      represents the depth of nested devices.
      In the previous patch, new {lower/upper}_level variables are added and
      they replace old private nest_level variable.
      So, this patch removes all 'nest_level' variables.
      
      In order to avoid lockdep warning, ->ndo_get_lock_subclass() was added
      to get lockdep subclass value, which is actually lower nested depth value.
      But now, they use the dynamic lockdep key to avoid lockdep warning instead
      of the subclass.
      So, this patch removes ->ndo_get_lock_subclass() callback.
      Signed-off-by: NTaehee Yoo <ap420073@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f3b0a18b
    • T
      net: core: add generic lockdep keys · ab92d68f
      Taehee Yoo 提交于
      Some interface types could be nested.
      (VLAN, BONDING, TEAM, MACSEC, MACVLAN, IPVLAN, VIRT_WIFI, VXLAN, etc..)
      These interface types should set lockdep class because, without lockdep
      class key, lockdep always warn about unexisting circular locking.
      
      In the current code, these interfaces have their own lockdep class keys and
      these manage itself. So that there are so many duplicate code around the
      /driver/net and /net/.
      This patch adds new generic lockdep keys and some helper functions for it.
      
      This patch does below changes.
      a) Add lockdep class keys in struct net_device
         - qdisc_running, xmit, addr_list, qdisc_busylock
         - these keys are used as dynamic lockdep key.
      b) When net_device is being allocated, lockdep keys are registered.
         - alloc_netdev_mqs()
      c) When net_device is being free'd llockdep keys are unregistered.
         - free_netdev()
      d) Add generic lockdep key helper function
         - netdev_register_lockdep_key()
         - netdev_unregister_lockdep_key()
         - netdev_update_lockdep_key()
      e) Remove unnecessary generic lockdep macro and functions
      f) Remove unnecessary lockdep code of each interfaces.
      
      After this patch, each interface modules don't need to maintain
      their lockdep keys.
      Signed-off-by: NTaehee Yoo <ap420073@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ab92d68f
  4. 31 5月, 2019 1 次提交
  5. 29 5月, 2019 1 次提交
  6. 21 5月, 2019 1 次提交
    • G
      macvlan: Mark expected switch fall-through · 02596252
      Gustavo A. R. Silva 提交于
      In preparation to enabling -Wimplicit-fallthrough, mark switch
      cases where we are expecting to fall through.
      
      This patch fixes the following warning:
      
      drivers/net/macvlan.c: In function ‘macvlan_do_ioctl’:
      drivers/net/macvlan.c:839:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
         if (!net_eq(dev_net(dev), &init_net))
            ^
      drivers/net/macvlan.c:841:2: note: here
        case SIOCGHWTSTAMP:
        ^~~~
      
      Warning level 3 was used: -Wimplicit-fallthrough=3
      
      This patch is part of the ongoing efforts to enable
      -Wimplicit-fallthrough.
      Signed-off-by: NGustavo A. R. Silva <gustavo@embeddedor.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      02596252
  7. 10 5月, 2019 1 次提交
  8. 28 4月, 2019 1 次提交
    • M
      netlink: make nla_nest_start() add NLA_F_NESTED flag · ae0be8de
      Michal Kubecek 提交于
      Even if the NLA_F_NESTED flag was introduced more than 11 years ago, most
      netlink based interfaces (including recently added ones) are still not
      setting it in kernel generated messages. Without the flag, message parsers
      not aware of attribute semantics (e.g. wireshark dissector or libmnl's
      mnl_nlmsg_fprintf()) cannot recognize nested attributes and won't display
      the structure of their contents.
      
      Unfortunately we cannot just add the flag everywhere as there may be
      userspace applications which check nlattr::nla_type directly rather than
      through a helper masking out the flags. Therefore the patch renames
      nla_nest_start() to nla_nest_start_noflag() and introduces nla_nest_start()
      as a wrapper adding NLA_F_NESTED. The calls which add NLA_F_NESTED manually
      are rewritten to use nla_nest_start().
      
      Except for changes in include/net/netlink.h, the patch was generated using
      this semantic patch:
      
      @@ expression E1, E2; @@
      -nla_nest_start(E1, E2)
      +nla_nest_start_noflag(E1, E2)
      
      @@ expression E1, E2; @@
      -nla_nest_start_noflag(E1, E2 | NLA_F_NESTED)
      +nla_nest_start(E1, E2)
      Signed-off-by: NMichal Kubecek <mkubecek@suse.cz>
      Acked-by: NJiri Pirko <jiri@mellanox.com>
      Acked-by: NDavid Ahern <dsahern@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ae0be8de
  9. 21 3月, 2019 1 次提交
  10. 25 2月, 2019 1 次提交
  11. 01 2月, 2019 1 次提交
  12. 18 1月, 2019 2 次提交
  13. 14 12月, 2018 1 次提交
  14. 04 12月, 2018 1 次提交
    • M
      macvlan: return correct error value · 59f997b0
      Matteo Croce 提交于
      A MAC address must be unique among all the macvlan devices with the same
      lower device. The only exception is the passthru [sic] mode,
      which shares the lower device address.
      
      When duplicate addresses are detected, EBUSY is returned when bringing
      the interface up:
      
          # ip link add macvlan0 link eth0 type macvlan
          # read addr </sys/class/net/eth0/address
          # ip link set macvlan0 address $addr
          # ip link set macvlan0 up
          RTNETLINK answers: Device or resource busy
      
      Use correct error code which is EADDRINUSE, and do the check also
      earlier, on address change:
      
          # ip link set macvlan0 address $addr
          RTNETLINK answers: Address already in use
      Signed-off-by: NMatteo Croce <mcroce@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      59f997b0
  15. 20 10月, 2018 1 次提交
    • D
      netpoll: allow cleanup to be synchronous · c9fbd71f
      Debabrata Banerjee 提交于
      This fixes a problem introduced by:
      commit 2cde6acd ("netpoll: Fix __netpoll_rcu_free so that it can hold the rtnl lock")
      
      When using netconsole on a bond, __netpoll_cleanup can asynchronously
      recurse multiple times, each __netpoll_free_async call can result in
      more __netpoll_free_async's. This means there is now a race between
      cleanup_work queues on multiple netpoll_info's on multiple devices and
      the configuration of a new netpoll. For example if a netconsole is set
      to enable 0, reconfigured, and enable 1 immediately, this netconsole
      will likely not work.
      
      Given the reason for __netpoll_free_async is it can be called when rtnl
      is not locked, if it is locked, we should be able to execute
      synchronously. It appears to be locked everywhere it's called from.
      
      Generalize the design pattern from the teaming driver for current
      callers of __netpoll_free_async.
      
      CC: Neil Horman <nhorman@tuxdriver.com>
      CC: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: NDebabrata Banerjee <dbanerje@akamai.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c9fbd71f
  16. 12 7月, 2018 1 次提交
  17. 10 7月, 2018 1 次提交
  18. 25 4月, 2018 2 次提交
  19. 12 3月, 2018 1 次提交
    • S
      macvlan: filter out unsupported feature flags · 13fbcc8d
      Shannon Nelson 提交于
      Adding a macvlan device on top of a lowerdev that supports
      the xfrm offloads fails with a new regression:
        # ip link add link ens1f0 mv0 type macvlan
        RTNETLINK answers: Operation not permitted
      
      Tracing down the failure shows that the macvlan device inherits
      the NETIF_F_HW_ESP and NETIF_F_HW_ESP_TX_CSUM feature flags
      from the lowerdev, but with no dev->xfrmdev_ops API filled
      in, it doesn't actually support xfrm.  When the request is
      made to add the new macvlan device, the XFRM listener for
      NETDEV_REGISTER calls xfrm_api_check() which fails the new
      registration because dev->xfrmdev_ops is NULL.
      
      The macvlan creation succeeds when we filter out the ESP
      feature flags in macvlan_fix_features(), so let's filter them
      out like we're already filtering out ~NETIF_F_NETNS_LOCAL.
      When XFRM support is added in the future, we can add the flags
      into MACVLAN_FEATURES.
      
      This same problem could crop up in the future with any other
      new feature flags, so let's filter out any flags that aren't
      defined as supported in macvlan.
      
      Fixes: d77e38e6 ("xfrm: Add an IPsec hardware offloading API")
      Reported-by: NAlexey Kodanev <alexey.kodanev@oracle.com>
      Signed-off-by: NShannon Nelson <shannon.nelson@oracle.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      13fbcc8d
  20. 23 2月, 2018 1 次提交
    • A
      macvlan: fix use-after-free in macvlan_common_newlink() · 4e14bf42
      Alexey Kodanev 提交于
      The following use-after-free was reported by KASan when running
      LTP macvtap01 test on 4.16-rc2:
      
      [10642.528443] BUG: KASAN: use-after-free in
                     macvlan_common_newlink+0x12ef/0x14a0 [macvlan]
      [10642.626607] Read of size 8 at addr ffff880ba49f2100 by task ip/18450
      ...
      [10642.963873] Call Trace:
      [10642.994352]  dump_stack+0x5c/0x7c
      [10643.035325]  print_address_description+0x75/0x290
      [10643.092938]  kasan_report+0x28d/0x390
      [10643.137971]  ? macvlan_common_newlink+0x12ef/0x14a0 [macvlan]
      [10643.207963]  macvlan_common_newlink+0x12ef/0x14a0 [macvlan]
      [10643.275978]  macvtap_newlink+0x171/0x260 [macvtap]
      [10643.334532]  rtnl_newlink+0xd4f/0x1300
      ...
      [10646.256176] Allocated by task 18450:
      [10646.299964]  kasan_kmalloc+0xa6/0xd0
      [10646.343746]  kmem_cache_alloc_trace+0xf1/0x210
      [10646.397826]  macvlan_common_newlink+0x6de/0x14a0 [macvlan]
      [10646.464386]  macvtap_newlink+0x171/0x260 [macvtap]
      [10646.522728]  rtnl_newlink+0xd4f/0x1300
      ...
      [10647.022028] Freed by task 18450:
      [10647.061549]  __kasan_slab_free+0x138/0x180
      [10647.111468]  kfree+0x9e/0x1c0
      [10647.147869]  macvlan_port_destroy+0x3db/0x650 [macvlan]
      [10647.211411]  rollback_registered_many+0x5b9/0xb10
      [10647.268715]  rollback_registered+0xd9/0x190
      [10647.319675]  register_netdevice+0x8eb/0xc70
      [10647.370635]  macvlan_common_newlink+0xe58/0x14a0 [macvlan]
      [10647.437195]  macvtap_newlink+0x171/0x260 [macvtap]
      
      Commit d02fd6e7 ("macvlan: Fix one possible double free") handles
      the case when register_netdevice() invokes ndo_uninit() on error and
      as a result free the port. But 'macvlan_port_get_rtnl(dev))' check
      (returns dev->rx_handler_data), which was added by this commit in order
      to prevent double free, is not quite correct:
      
      * for macvlan it always returns NULL because 'lowerdev' is the one that
        was used to register rx handler (port) in macvlan_port_create() as
        well as to unregister it in macvlan_port_destroy().
      * for macvtap it always returns a valid pointer because macvtap registers
        its own rx handler before macvlan_common_newlink().
      
      Fixes: d02fd6e7 ("macvlan: Fix one possible double free")
      Signed-off-by: NAlexey Kodanev <alexey.kodanev@oracle.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4e14bf42
  21. 03 1月, 2018 1 次提交
  22. 19 10月, 2017 1 次提交
    • A
      macvlan/macvtap: Add support for L2 forwarding offloads with macvtap · 56fd2b2c
      Alexander Duyck 提交于
      This patch reverts earlier commit b13ba1b8 ("macvlan: forbid L2
      fowarding offload for macvtap"). The reason for reverting this is because
      the original patch no longer fixes what it previously did as the
      underlying structure has changed for macvtap. Specifically macvtap
      originally pulled packets directly off of the lowerdev. However in commit
      6acf54f1 ("macvtap: Add support of packet capture on macvtap device.")
      that code was changed and instead macvtap would listen directly on the
      macvtap device itself instead of the lower device. As such, the L2
      forwarding offload should now be able to provide a performance advantage of
      skipping the checks on the lower dev while not introducing any sort of
      regression.
      Signed-off-by: NAlexander Duyck <alexander.h.duyck@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      56fd2b2c
  23. 15 10月, 2017 2 次提交
  24. 05 10月, 2017 1 次提交
  25. 21 9月, 2017 1 次提交
  26. 19 8月, 2017 1 次提交
  27. 18 7月, 2017 1 次提交
  28. 27 6月, 2017 3 次提交
  29. 22 6月, 2017 4 次提交
  30. 15 6月, 2017 1 次提交
  31. 08 6月, 2017 1 次提交
    • D
      net: Fix inconsistent teardown and release of private netdev state. · cf124db5
      David S. Miller 提交于
      Network devices can allocate reasources and private memory using
      netdev_ops->ndo_init().  However, the release of these resources
      can occur in one of two different places.
      
      Either netdev_ops->ndo_uninit() or netdev->destructor().
      
      The decision of which operation frees the resources depends upon
      whether it is necessary for all netdev refs to be released before it
      is safe to perform the freeing.
      
      netdev_ops->ndo_uninit() presumably can occur right after the
      NETDEV_UNREGISTER notifier completes and the unicast and multicast
      address lists are flushed.
      
      netdev->destructor(), on the other hand, does not run until the
      netdev references all go away.
      
      Further complicating the situation is that netdev->destructor()
      almost universally does also a free_netdev().
      
      This creates a problem for the logic in register_netdevice().
      Because all callers of register_netdevice() manage the freeing
      of the netdev, and invoke free_netdev(dev) if register_netdevice()
      fails.
      
      If netdev_ops->ndo_init() succeeds, but something else fails inside
      of register_netdevice(), it does call ndo_ops->ndo_uninit().  But
      it is not able to invoke netdev->destructor().
      
      This is because netdev->destructor() will do a free_netdev() and
      then the caller of register_netdevice() will do the same.
      
      However, this means that the resources that would normally be released
      by netdev->destructor() will not be.
      
      Over the years drivers have added local hacks to deal with this, by
      invoking their destructor parts by hand when register_netdevice()
      fails.
      
      Many drivers do not try to deal with this, and instead we have leaks.
      
      Let's close this hole by formalizing the distinction between what
      private things need to be freed up by netdev->destructor() and whether
      the driver needs unregister_netdevice() to perform the free_netdev().
      
      netdev->priv_destructor() performs all actions to free up the private
      resources that used to be freed by netdev->destructor(), except for
      free_netdev().
      
      netdev->needs_free_netdev is a boolean that indicates whether
      free_netdev() should be done at the end of unregister_netdevice().
      
      Now, register_netdevice() can sanely release all resources after
      ndo_ops->ndo_init() succeeds, by invoking both ndo_ops->ndo_uninit()
      and netdev->priv_destructor().
      
      And at the end of unregister_netdevice(), we invoke
      netdev->priv_destructor() and optionally call free_netdev().
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      cf124db5