1. 08 6月, 2017 1 次提交
    • D
      net: Fix inconsistent teardown and release of private netdev state. · cf124db5
      David S. Miller 提交于
      Network devices can allocate reasources and private memory using
      netdev_ops->ndo_init().  However, the release of these resources
      can occur in one of two different places.
      
      Either netdev_ops->ndo_uninit() or netdev->destructor().
      
      The decision of which operation frees the resources depends upon
      whether it is necessary for all netdev refs to be released before it
      is safe to perform the freeing.
      
      netdev_ops->ndo_uninit() presumably can occur right after the
      NETDEV_UNREGISTER notifier completes and the unicast and multicast
      address lists are flushed.
      
      netdev->destructor(), on the other hand, does not run until the
      netdev references all go away.
      
      Further complicating the situation is that netdev->destructor()
      almost universally does also a free_netdev().
      
      This creates a problem for the logic in register_netdevice().
      Because all callers of register_netdevice() manage the freeing
      of the netdev, and invoke free_netdev(dev) if register_netdevice()
      fails.
      
      If netdev_ops->ndo_init() succeeds, but something else fails inside
      of register_netdevice(), it does call ndo_ops->ndo_uninit().  But
      it is not able to invoke netdev->destructor().
      
      This is because netdev->destructor() will do a free_netdev() and
      then the caller of register_netdevice() will do the same.
      
      However, this means that the resources that would normally be released
      by netdev->destructor() will not be.
      
      Over the years drivers have added local hacks to deal with this, by
      invoking their destructor parts by hand when register_netdevice()
      fails.
      
      Many drivers do not try to deal with this, and instead we have leaks.
      
      Let's close this hole by formalizing the distinction between what
      private things need to be freed up by netdev->destructor() and whether
      the driver needs unregister_netdevice() to perform the free_netdev().
      
      netdev->priv_destructor() performs all actions to free up the private
      resources that used to be freed by netdev->destructor(), except for
      free_netdev().
      
      netdev->needs_free_netdev is a boolean that indicates whether
      free_netdev() should be done at the end of unregister_netdevice().
      
      Now, register_netdevice() can sanely release all resources after
      ndo_ops->ndo_init() succeeds, by invoking both ndo_ops->ndo_uninit()
      and netdev->priv_destructor().
      
      And at the end of unregister_netdevice(), we invoke
      netdev->priv_destructor() and optionally call free_netdev().
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      cf124db5
  2. 25 4月, 2017 1 次提交
  3. 14 4月, 2017 2 次提交
    • J
      netlink: pass extended ACK struct where available · fe52145f
      Johannes Berg 提交于
      This is an add-on to the previous patch that passes the extended ACK
      structure where it's already available by existing genl_info or extack
      function arguments.
      
      This was done with this spatch (with some manual adjustment of
      indentation):
      
      @@
      expression A, B, C, D, E;
      identifier fn, info;
      @@
      fn(..., struct genl_info *info, ...) {
      ...
      -nlmsg_parse(A, B, C, D, E, NULL)
      +nlmsg_parse(A, B, C, D, E, info->extack)
      ...
      }
      
      @@
      expression A, B, C, D, E;
      identifier fn, info;
      @@
      fn(..., struct genl_info *info, ...) {
      <...
      -nla_parse_nested(A, B, C, D, NULL)
      +nla_parse_nested(A, B, C, D, info->extack)
      ...>
      }
      
      @@
      expression A, B, C, D, E;
      identifier fn, extack;
      @@
      fn(..., struct netlink_ext_ack *extack, ...) {
      <...
      -nlmsg_parse(A, B, C, D, E, NULL)
      +nlmsg_parse(A, B, C, D, E, extack)
      ...>
      }
      
      @@
      expression A, B, C, D, E;
      identifier fn, extack;
      @@
      fn(..., struct netlink_ext_ack *extack, ...) {
      <...
      -nla_parse(A, B, C, D, E, NULL)
      +nla_parse(A, B, C, D, E, extack)
      ...>
      }
      
      @@
      expression A, B, C, D, E;
      identifier fn, extack;
      @@
      fn(..., struct netlink_ext_ack *extack, ...) {
      ...
      -nlmsg_parse(A, B, C, D, E, NULL)
      +nlmsg_parse(A, B, C, D, E, extack)
      ...
      }
      
      @@
      expression A, B, C, D;
      identifier fn, extack;
      @@
      fn(..., struct netlink_ext_ack *extack, ...) {
      <...
      -nla_parse_nested(A, B, C, D, NULL)
      +nla_parse_nested(A, B, C, D, extack)
      ...>
      }
      
      @@
      expression A, B, C, D;
      identifier fn, extack;
      @@
      fn(..., struct netlink_ext_ack *extack, ...) {
      <...
      -nlmsg_validate(A, B, C, D, NULL)
      +nlmsg_validate(A, B, C, D, extack)
      ...>
      }
      
      @@
      expression A, B, C, D;
      identifier fn, extack;
      @@
      fn(..., struct netlink_ext_ack *extack, ...) {
      <...
      -nla_validate(A, B, C, D, NULL)
      +nla_validate(A, B, C, D, extack)
      ...>
      }
      
      @@
      expression A, B, C;
      identifier fn, extack;
      @@
      fn(..., struct netlink_ext_ack *extack, ...) {
      <...
      -nla_validate_nested(A, B, C, NULL)
      +nla_validate_nested(A, B, C, extack)
      ...>
      }
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      Reviewed-by: NJiri Pirko <jiri@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fe52145f
    • J
      netlink: pass extended ACK struct to parsing functions · fceb6435
      Johannes Berg 提交于
      Pass the new extended ACK reporting struct to all of the generic
      netlink parsing functions. For now, pass NULL in almost all callers
      (except for some in the core.)
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fceb6435
  4. 07 4月, 2017 1 次提交
  5. 09 3月, 2017 1 次提交
    • J
      team: use ETH_MAX_MTU as max mtu · 3331aa37
      Jarod Wilson 提交于
      This restores the ability to set a team device's mtu to anything higher
      than 1500. Similar to the reported issue with bonding, the team driver
      calls ether_setup(), which sets an initial max_mtu of 1500, while the
      underlying hardware can handle something much larger. Just set it to
      ETH_MAX_MTU to support all possible values, and the limitations of the
      underlying devices will prevent setting anything too large.
      
      Fixes: 91572088 ("net: use core MTU range checking in core net infra")
      CC: Cong Wang <xiyou.wangcong@gmail.com>
      CC: Jiri Pirko <jiri@resnulli.us>
      CC: netdev@vger.kernel.org
      Signed-off-by: NJarod Wilson <jarod@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3331aa37
  6. 07 2月, 2017 1 次提交
  7. 09 1月, 2017 1 次提交
  8. 28 10月, 2016 3 次提交
    • J
      genetlink: mark families as __ro_after_init · 56989f6d
      Johannes Berg 提交于
      Now genl_register_family() is the only thing (other than the
      users themselves, perhaps, but I didn't find any doing that)
      writing to the family struct.
      
      In all families that I found, genl_register_family() is only
      called from __init functions (some indirectly, in which case
      I've add __init annotations to clarifly things), so all can
      actually be marked __ro_after_init.
      
      This protects the data structure from accidental corruption.
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      56989f6d
    • J
      genetlink: statically initialize families · 489111e5
      Johannes Berg 提交于
      Instead of providing macros/inline functions to initialize
      the families, make all users initialize them statically and
      get rid of the macros.
      
      This reduces the kernel code size by about 1.6k on x86-64
      (with allyesconfig).
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      489111e5
    • J
      genetlink: no longer support using static family IDs · a07ea4d9
      Johannes Berg 提交于
      Static family IDs have never really been used, the only
      use case was the workaround I introduced for those users
      that assumed their family ID was also their multicast
      group ID.
      
      Additionally, because static family IDs would never be
      reserved by the generic netlink code, using a relatively
      low ID would only work for built-in families that can be
      registered immediately after generic netlink is started,
      which is basically only the control family (apart from
      the workaround code, which I also had to add code for so
      it would reserve those IDs)
      
      Thus, anything other than GENL_ID_GENERATE is flawed and
      luckily not used except in the cases I mentioned. Move
      those workarounds into a few lines of code, and then get
      rid of GENL_ID_GENERATE entirely, making it more robust.
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a07ea4d9
  9. 06 7月, 2016 1 次提交
  10. 23 6月, 2016 1 次提交
  11. 10 6月, 2016 1 次提交
  12. 08 6月, 2016 1 次提交
  13. 26 5月, 2016 1 次提交
    • I
      team: don't call netdev_change_features under team->lock · f6988cb6
      Ivan Vecera 提交于
      The team_device_event() notifier calls team_compute_features() to fix
      vlan_features under team->lock to protect team->port_list. The problem is
      that subsequent __team_compute_features() calls netdev_change_features()
      to propagate vlan_features to upper vlan devices while team->lock is still
      taken. This can lead to deadlock when NETIF_F_LRO is modified on lower
      devices or team device itself.
      
      Example:
      The team0 as active backup with eth0 and eth1 NICs. Both eth0 & eth1 are
      LRO capable and LRO is enabled. Thus LRO is also enabled on team0.
      
      The command 'ethtool -K team0 lro off' now hangs due to this deadlock:
      
      dev_ethtool()
      -> ethtool_set_features()
       -> __netdev_update_features(team)
        -> netdev_sync_lower_features()
         -> netdev_update_features(lower_1)
          -> __netdev_update_features(lower_1)
          -> netdev_features_change(lower_1)
           -> call_netdevice_notifiers(...)
            -> team_device_event(lower_1)
             -> team_compute_features(team) [TAKES team->lock]
              -> netdev_change_features(team)
               -> __netdev_update_features(team)
                -> netdev_sync_lower_features()
                 -> netdev_update_features(lower_2)
                  -> __netdev_update_features(lower_2)
                  -> netdev_features_change(lower_2)
                   -> call_netdevice_notifiers(...)
                    -> team_device_event(lower_2)
                     -> team_compute_features(team) [DEADLOCK]
      
      The bug is present in team from the beginning but it appeared after the commit
      fd867d51 (net/core: generic support for disabling netdev features down stack)
      that adds synchronization of features with lower devices.
      
      Fixes: fd867d51 (net/core: generic support for disabling netdev features down stack)
      Cc: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: NIvan Vecera <ivecera@redhat.com>
      Signed-off-by: NJiri Pirko <jiri@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f6988cb6
  14. 31 3月, 2016 1 次提交
  15. 19 3月, 2016 1 次提交
  16. 26 2月, 2016 1 次提交
  17. 06 2月, 2016 1 次提交
  18. 19 1月, 2016 1 次提交
  19. 18 12月, 2015 1 次提交
  20. 16 12月, 2015 1 次提交
    • T
      net: Rename NETIF_F_ALL_CSUM to NETIF_F_CSUM_MASK · a188222b
      Tom Herbert 提交于
      The name NETIF_F_ALL_CSUM is a misnomer. This does not correspond to the
      set of features for offloading all checksums. This is a mask of the
      checksum offload related features bits. It is incorrect to set both
      NETIF_F_HW_CSUM and NETIF_F_IP_CSUM or NETIF_F_IPV6 at the same time for
      features of a device.
      
      This patch:
        - Changes instances of NETIF_F_ALL_CSUM to NETIF_F_CSUM_MASK (where
          NETIF_F_ALL_CSUM is being used as a mask).
        - Changes bonding, sfc/efx, ipvlan, macvlan, vlan, and team drivers to
          use NEITF_F_HW_CSUM in features list instead of NETIF_F_ALL_CSUM.
      Signed-off-by: NTom Herbert <tom@herbertland.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a188222b
  21. 04 12月, 2015 6 次提交
  22. 19 8月, 2015 1 次提交
  23. 18 5月, 2015 1 次提交
  24. 13 5月, 2015 5 次提交
  25. 30 3月, 2015 1 次提交
  26. 05 3月, 2015 1 次提交
  27. 28 2月, 2015 1 次提交
  28. 24 2月, 2015 1 次提交
    • J
      team: fix possible null pointer dereference in team_handle_frame · 57e59563
      Jiri Pirko 提交于
      Currently following race is possible in team:
      
      CPU0                                        CPU1
                                                  team_port_del
                                                    team_upper_dev_unlink
                                                      priv_flags &= ~IFF_TEAM_PORT
      team_handle_frame
        team_port_get_rcu
          team_port_exists
            priv_flags & IFF_TEAM_PORT == 0
          return NULL (instead of port got
                       from rx_handler_data)
                                                    netdev_rx_handler_unregister
      
      The thing is that the flag is removed before rx_handler is unregistered.
      If team_handle_frame is called in between, team_port_exists returns 0
      and team_port_get_rcu will return NULL.
      So do not check the flag here. It is guaranteed by netdev_rx_handler_unregister
      that team_handle_frame will always see valid rx_handler_data pointer.
      Signed-off-by: NJiri Pirko <jiri@resnulli.us>
      Fixes: 3d249d4c ("net: introduce ethernet teaming device")
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      57e59563