1. 28 3月, 2018 1 次提交
  2. 27 3月, 2018 1 次提交
  3. 28 2月, 2018 1 次提交
  4. 21 12月, 2017 1 次提交
  5. 29 11月, 2017 1 次提交
  6. 08 11月, 2017 2 次提交
    • J
      bonding: fix slave stuck in BOND_LINK_FAIL state · 055db695
      Jay Vosburgh 提交于
      The bonding miimon logic has a flaw, in that a failure of the
      rtnl_trylock can cause a slave to become permanently stuck in
      BOND_LINK_FAIL state.
      
      	The sequence of events to cause this is as follows:
      
      	1) bond_miimon_inspect finds that a slave's link is down, and so
      calls bond_propose_link_state, setting slave->new_link_state to
      BOND_LINK_FAIL, then sets slave->new_link to BOND_LINK_DOWN and returns
      non-zero.
      
      	2) In bond_mii_monitor, the rtnl_trylock fails, and the timer is
      rescheduled.  No change is committed.
      
      	3) bond_miimon_inspect is called again, but this time the slave
      from step 1 has recovered.  slave->new_link is reset to NOCHANGE, and, as
      slave->link was never changed, the switch enters the BOND_LINK_UP case,
      and does nothing.  The pending BOND_LINK_FAIL state from step 1 remains
      pending, as new_link_state is not reset.
      
      	4) The state from step 3 persists until another slave changes link
      state and causes bond_miimon_inspect to return non-zero.  At this point,
      the BOND_LINK_FAIL state change on the slave from steps 1-3 is committed,
      and the slave will remain stuck in BOND_LINK_FAIL state even though it
      is actually link up.
      
      	The remedy for this is to initialize new_link_state on each entry
      to bond_miimon_inspect, as is already done with new_link.
      
      Fixes: fb9eb899 ("bonding: handle link transition from FAIL to UP correctly")
      Reported-by: NAlex Sidorenko <alexandre.sidorenko@hpe.com>
      Reviewed-by: NJarod Wilson <jarod@redhat.com>
      Signed-off-by: NJay Vosburgh <jay.vosburgh@canonical.com>
      Acked-by: NMahesh Bandewar <maheshb@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      055db695
    • H
      bonding: discard lowest hash bit for 802.3ad layer3+4 · b5f86218
      Hangbin Liu 提交于
      After commit 07f4c900 ("tcp/dccp: try to not exhaust ip_local_port_range
      in connect()"), we will try to use even ports for connect(). Then if an
      application (seen clearly with iperf) opens multiple streams to the same
      destination IP and port, each stream will be given an even source port.
      
      So the bonding driver's simple xmit_hash_policy based on layer3+4 addressing
      will always hash all these streams to the same interface. And the total
      throughput will limited to a single slave.
      
      Change the tcp code will impact the whole tcp behavior, only for bonding
      usage. Paolo Abeni suggested fix this by changing the bonding code only,
      which should be more reasonable, and less impact.
      
      Fix this by discarding the lowest hash bit because it contains little entropy.
      After the fix we can re-balance between slaves.
      Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: NHangbin Liu <liuhangbin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b5f86218
  7. 02 11月, 2017 1 次提交
    • G
      License cleanup: add SPDX GPL-2.0 license identifier to files with no license · b2441318
      Greg Kroah-Hartman 提交于
      Many source files in the tree are missing licensing information, which
      makes it harder for compliance tools to determine the correct license.
      
      By default all files without license information are under the default
      license of the kernel, which is GPL version 2.
      
      Update the files which contain no license information with the 'GPL-2.0'
      SPDX license identifier.  The SPDX identifier is a legally binding
      shorthand, which can be used instead of the full boiler plate text.
      
      This patch is based on work done by Thomas Gleixner and Kate Stewart and
      Philippe Ombredanne.
      
      How this work was done:
      
      Patches were generated and checked against linux-4.14-rc6 for a subset of
      the use cases:
       - file had no licensing information it it.
       - file was a */uapi/* one with no licensing information in it,
       - file was a */uapi/* one with existing licensing information,
      
      Further patches will be generated in subsequent months to fix up cases
      where non-standard license headers were used, and references to license
      had to be inferred by heuristics based on keywords.
      
      The analysis to determine which SPDX License Identifier to be applied to
      a file was done in a spreadsheet of side by side results from of the
      output of two independent scanners (ScanCode & Windriver) producing SPDX
      tag:value files created by Philippe Ombredanne.  Philippe prepared the
      base worksheet, and did an initial spot review of a few 1000 files.
      
      The 4.13 kernel was the starting point of the analysis with 60,537 files
      assessed.  Kate Stewart did a file by file comparison of the scanner
      results in the spreadsheet to determine which SPDX license identifier(s)
      to be applied to the file. She confirmed any determination that was not
      immediately clear with lawyers working with the Linux Foundation.
      
      Criteria used to select files for SPDX license identifier tagging was:
       - Files considered eligible had to be source code files.
       - Make and config files were included as candidates if they contained >5
         lines of source
       - File already had some variant of a license header in it (even if <5
         lines).
      
      All documentation files were explicitly excluded.
      
      The following heuristics were used to determine which SPDX license
      identifiers to apply.
      
       - when both scanners couldn't find any license traces, file was
         considered to have no license information in it, and the top level
         COPYING file license applied.
      
         For non */uapi/* files that summary was:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|-------
         GPL-2.0                                              11139
      
         and resulted in the first patch in this series.
      
         If that file was a */uapi/* path one, it was "GPL-2.0 WITH
         Linux-syscall-note" otherwise it was "GPL-2.0".  Results of that was:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|-------
         GPL-2.0 WITH Linux-syscall-note                        930
      
         and resulted in the second patch in this series.
      
       - if a file had some form of licensing information in it, and was one
         of the */uapi/* ones, it was denoted with the Linux-syscall-note if
         any GPL family license was found in the file or had no licensing in
         it (per prior point).  Results summary:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|------
         GPL-2.0 WITH Linux-syscall-note                       270
         GPL-2.0+ WITH Linux-syscall-note                      169
         ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
         ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
         LGPL-2.1+ WITH Linux-syscall-note                      15
         GPL-1.0+ WITH Linux-syscall-note                       14
         ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
         LGPL-2.0+ WITH Linux-syscall-note                       4
         LGPL-2.1 WITH Linux-syscall-note                        3
         ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
         ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1
      
         and that resulted in the third patch in this series.
      
       - when the two scanners agreed on the detected license(s), that became
         the concluded license(s).
      
       - when there was disagreement between the two scanners (one detected a
         license but the other didn't, or they both detected different
         licenses) a manual inspection of the file occurred.
      
       - In most cases a manual inspection of the information in the file
         resulted in a clear resolution of the license that should apply (and
         which scanner probably needed to revisit its heuristics).
      
       - When it was not immediately clear, the license identifier was
         confirmed with lawyers working with the Linux Foundation.
      
       - If there was any question as to the appropriate license identifier,
         the file was flagged for further research and to be revisited later
         in time.
      
      In total, over 70 hours of logged manual review was done on the
      spreadsheet to determine the SPDX license identifiers to apply to the
      source files by Kate, Philippe, Thomas and, in some cases, confirmation
      by lawyers working with the Linux Foundation.
      
      Kate also obtained a third independent scan of the 4.13 code base from
      FOSSology, and compared selected files where the other two scanners
      disagreed against that SPDX file, to see if there was new insights.  The
      Windriver scanner is based on an older version of FOSSology in part, so
      they are related.
      
      Thomas did random spot checks in about 500 files from the spreadsheets
      for the uapi headers and agreed with SPDX license identifier in the
      files he inspected. For the non-uapi files Thomas did random spot checks
      in about 15000 files.
      
      In initial set of patches against 4.14-rc6, 3 files were found to have
      copy/paste license identifier errors, and have been fixed to reflect the
      correct identifier.
      
      Additionally Philippe spent 10 hours this week doing a detailed manual
      inspection and review of the 12,461 patched files from the initial patch
      version early this week with:
       - a full scancode scan run, collecting the matched texts, detected
         license ids and scores
       - reviewing anything where there was a license detected (about 500+
         files) to ensure that the applied SPDX license was correct
       - reviewing anything where there was no detection but the patch license
         was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
         SPDX license was correct
      
      This produced a worksheet with 20 files needing minor correction.  This
      worksheet was then exported into 3 different .csv files for the
      different types of files to be modified.
      
      These .csv files were then reviewed by Greg.  Thomas wrote a script to
      parse the csv files and add the proper SPDX tag to the file, in the
      format that the file expected.  This script was further refined by Greg
      based on the output to detect more types of files automatically and to
      distinguish between header and source .c files (which need different
      comment types.)  Finally Greg ran the script using the .csv files to
      generate the patches.
      Reviewed-by: NKate Stewart <kstewart@linuxfoundation.org>
      Reviewed-by: NPhilippe Ombredanne <pombredanne@nexb.com>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b2441318
  8. 25 10月, 2017 2 次提交
    • M
      locking/atomics: COCCINELLE/treewide: Convert trivial ACCESS_ONCE() patterns... · 6aa7de05
      Mark Rutland 提交于
      locking/atomics: COCCINELLE/treewide: Convert trivial ACCESS_ONCE() patterns to READ_ONCE()/WRITE_ONCE()
      
      Please do not apply this to mainline directly, instead please re-run the
      coccinelle script shown below and apply its output.
      
      For several reasons, it is desirable to use {READ,WRITE}_ONCE() in
      preference to ACCESS_ONCE(), and new code is expected to use one of the
      former. So far, there's been no reason to change most existing uses of
      ACCESS_ONCE(), as these aren't harmful, and changing them results in
      churn.
      
      However, for some features, the read/write distinction is critical to
      correct operation. To distinguish these cases, separate read/write
      accessors must be used. This patch migrates (most) remaining
      ACCESS_ONCE() instances to {READ,WRITE}_ONCE(), using the following
      coccinelle script:
      
      ----
      // Convert trivial ACCESS_ONCE() uses to equivalent READ_ONCE() and
      // WRITE_ONCE()
      
      // $ make coccicheck COCCI=/home/mark/once.cocci SPFLAGS="--include-headers" MODE=patch
      
      virtual patch
      
      @ depends on patch @
      expression E1, E2;
      @@
      
      - ACCESS_ONCE(E1) = E2
      + WRITE_ONCE(E1, E2)
      
      @ depends on patch @
      expression E;
      @@
      
      - ACCESS_ONCE(E)
      + READ_ONCE(E)
      ----
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: davem@davemloft.net
      Cc: linux-arch@vger.kernel.org
      Cc: mpe@ellerman.id.au
      Cc: shuah@kernel.org
      Cc: snitzer@redhat.com
      Cc: thor.thayer@linux.intel.com
      Cc: tj@kernel.org
      Cc: viro@zeniv.linux.org.uk
      Cc: will.deacon@arm.com
      Link: http://lkml.kernel.org/r/1508792849-3115-19-git-send-email-paulmck@linux.vnet.ibm.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      6aa7de05
    • X
      bonding: remove rtmsg_ifinfo called in bond_master_upper_dev_link · 4597efe3
      Xin Long 提交于
      Since commit 42e52bf9 ("net: add netnotifier event for upper device
      change"), netdev_master_upper_dev_link has generated NETDEV_CHANGEUPPER
      event which would send a notification to userspace in rtnetlink_event.
      
      There's no need to call rtmsg_ifinfo to send the notification any more.
      So this patch is to remove it from bond_master_upper_dev_link as well
      as bond_upper_dev_unlink to avoid the redundant notifications.
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4597efe3
  9. 05 10月, 2017 3 次提交
  10. 04 10月, 2017 1 次提交
    • M
      bonding: speed/duplex update at NETDEV_UP event · 4d2c0cda
      Mahesh Bandewar 提交于
      Some NIC drivers don't have correct speed/duplex settings at the
      time they send NETDEV_UP notification and that messes up the
      bonding state. Especially 802.3ad mode which is very sensitive
      to these settings. In the current implementation we invoke
      bond_update_speed_duplex() when we receive NETDEV_UP, however,
      ignore the return value. If the values we get are invalid
      (UNKNOWN), then slave gets removed from the aggregator with
      speed and duplex set to UNKNOWN while link is still marked as UP.
      
      This patch fixes this scenario. Also 802.3ad mode is sensitive to
      these conditions while other modes are not, so making sure that it
      doesn't change the behavior for other modes.
      Signed-off-by: NMahesh Bandewar <maheshb@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4d2c0cda
  11. 29 9月, 2017 1 次提交
  12. 13 9月, 2017 1 次提交
    • N
      net: bonding: fix tlb_dynamic_lb default value · f13ad104
      Nikolay Aleksandrov 提交于
      Commit 8b426dc5 ("bonding: remove hardcoded value") changed the
      default value for tlb_dynamic_lb which lead to either broken ALB mode
      (since tlb_dynamic_lb can be changed only in TLB) or setting TLB mode
      with tlb_dynamic_lb equal to 0.
      The first issue was recently fixed by setting tlb_dynamic_lb to 1 always
      when switching to ALB mode, but the default value is still wrong and
      we'll enter TLB mode with tlb_dynamic_lb equal to 0 if the mode is
      changed via netlink or sysfs. In order to restore the previous behaviour
      and default value simply remove the mode check around the default param
      initialization for tlb_dynamic_lb which will always set it to 1 as
      before.
      
      Fixes: 8b426dc5 ("bonding: remove hardcoded value")
      Signed-off-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Acked-by: NMahesh Bandewar <maheshb@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f13ad104
  13. 12 9月, 2017 1 次提交
    • K
      net: bonding: Fix transmit load balancing in balance-alb mode if specified by sysfs · c6644d07
      Kosuke Tatsukawa 提交于
      Commit cbf5ecb3 ("net: bonding: Fix transmit load balancing in
      balance-alb mode") tried to fix transmit dynamic load balancing in
      balance-alb mode, which wasn't working after commit 8b426dc5
      ("bonding: remove hardcoded value").
      
      It turned out that my previous patch only fixed the case when
      balance-alb was specified as bonding module parameter, and not when
      balance-alb mode was set using /sys/class/net/*/bonding/mode (the most
      common usage).  In the latter case, tlb_dynamic_lb was set up according
      to the default mode of the bonding interface, which happens to be
      balance-rr.
      
      This additional patch addresses this issue by setting up tlb_dynamic_lb
      to 1 if "mode" is set to balance-alb through the sysfs interface.
      
      I didn't add code to change tlb_balance_lb back to the default value for
      other modes, because "mode" is usually set up only once during
      initialization, and it's not worthwhile to change the static variable
      bonding_defaults in bond_main.c to a global variable just for this
      purpose.
      
      Commit 8b426dc5 also changes the value of tlb_dynamic_lb for
      balance-tlb mode if it is set up using the sysfs interface.  I didn't
      change that behavior, because the value of tlb_balance_lb can be changed
      using the sysfs interface for balance-tlb, and I didn't like changing
      the default value back and forth for balance-tlb.
      
      As for balance-alb, /sys/class/net/*/bonding/tlb_balance_lb cannot be
      written to.  However, I think balance-alb with tlb_dynamic_lb set to 0
      is not an intended usage, so there is little use making it writable at
      this moment.
      
      Fixes: 8b426dc5 ("bonding: remove hardcoded value")
      Reported-by: NReinis Rozitis <r@roze.lv>
      Signed-off-by: NKosuke Tatsukawa <tatsu@ab.jp.nec.com>
      Cc: stable@vger.kernel.org  # v4.12+
      Acked-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Acked-by: NMahesh Bandewar <maheshb@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c6644d07
  14. 14 8月, 2017 1 次提交
    • A
      bonding: ratelimit failed speed/duplex update warning · 11e9d782
      Andreas Born 提交于
      bond_miimon_commit() handles the UP transition for each slave of a bond
      in the case of MII. It is triggered 10 times per second for the default
      MII Polling interval of 100ms. For device drivers that do not implement
      __ethtool_get_link_ksettings() the call to bond_update_speed_duplex()
      fails persistently while the MII status could remain UP. That is, in
      this and other cases where the speed/duplex update keeps failing over a
      longer period of time while the MII state is UP, a warning is printed
      every MII polling interval.
      
      To address these excessive warnings net_ratelimit() should be used.
      Printing a warning once would not be sufficient since the call to
      bond_update_speed_duplex() could recover to succeed and fail again
      later. In that case there would be no new indication what went wrong.
      
      Fixes: b5bf0f5b (bonding: correctly update link status during mii-commit phase)
      Signed-off-by: NAndreas Born <futur.andy@googlemail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      11e9d782
  15. 12 8月, 2017 1 次提交
  16. 27 7月, 2017 1 次提交
  17. 21 7月, 2017 1 次提交
  18. 19 7月, 2017 1 次提交
  19. 08 7月, 2017 1 次提交
    • W
      bonding: avoid NETDEV_CHANGEMTU event when unregistering slave · f51048c3
      WANG Cong 提交于
      As Hongjun/Nicolas summarized in their original patch:
      
      "
      When a device changes from one netns to another, it's first unregistered,
      then the netns reference is updated and the dev is registered in the new
      netns. Thus, when a slave moves to another netns, it is first
      unregistered. This triggers a NETDEV_UNREGISTER event which is caught by
      the bonding driver. The driver calls bond_release(), which calls
      dev_set_mtu() and thus triggers NETDEV_CHANGEMTU (the device is still in
      the old netns).
      "
      
      This is a very special case, because the device is being unregistered
      no one should still care about the NETDEV_CHANGEMTU event triggered
      at this point, we can avoid broadcasting this event on this path,
      and avoid touching inetdev_event()/addrconf_notify() path.
      
      It requires to export __dev_set_mtu() to bonding driver.
      Reported-by: NHongjun Li <hongjun.li@6wind.com>
      Reported-by: NNicolas Dichtel <nicolas.dichtel@6wind.com>
      Cc: Jay Vosburgh <j.vosburgh@gmail.com>
      Cc: Veaceslav Falico <vfalico@gmail.com>
      Cc: Andy Gospodarek <andy@greyhouse.net>
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f51048c3
  20. 30 6月, 2017 1 次提交
  21. 27 6月, 2017 4 次提交
  22. 21 6月, 2017 1 次提交
  23. 16 6月, 2017 2 次提交
    • J
      networking: make skb_put & friends return void pointers · 4df864c1
      Johannes Berg 提交于
      It seems like a historic accident that these return unsigned char *,
      and in many places that means casts are required, more often than not.
      
      Make these functions (skb_put, __skb_put and pskb_put) return void *
      and remove all the casts across the tree, adding a (u8 *) cast only
      where the unsigned char pointer was used directly, all done with the
      following spatch:
      
          @@
          expression SKB, LEN;
          typedef u8;
          identifier fn = { skb_put, __skb_put };
          @@
          - *(fn(SKB, LEN))
          + *(u8 *)fn(SKB, LEN)
      
          @@
          expression E, SKB, LEN;
          identifier fn = { skb_put, __skb_put };
          type T;
          @@
          - E = ((T *)(fn(SKB, LEN)))
          + E = fn(SKB, LEN)
      
      which actually doesn't cover pskb_put since there are only three
      users overall.
      
      A handful of stragglers were converted manually, notably a macro in
      drivers/isdn/i4l/isdn_bsdcomp.c and, oddly enough, one of the many
      instances in net/bluetooth/hci_sock.c. In the former file, I also
      had to fix one whitespace problem spatch introduced.
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4df864c1
    • J
      networking: introduce and use skb_put_data() · 59ae1d12
      Johannes Berg 提交于
      A common pattern with skb_put() is to just want to memcpy()
      some data into the new space, introduce skb_put_data() for
      this.
      
      An spatch similar to the one for skb_put_zero() converts many
      of the places using it:
      
          @@
          identifier p, p2;
          expression len, skb, data;
          type t, t2;
          @@
          (
          -p = skb_put(skb, len);
          +p = skb_put_data(skb, data, len);
          |
          -p = (t)skb_put(skb, len);
          +p = skb_put_data(skb, data, len);
          )
          (
          p2 = (t2)p;
          -memcpy(p2, data, len);
          |
          -memcpy(p, data, len);
          )
      
          @@
          type t, t2;
          identifier p, p2;
          expression skb, data;
          @@
          t *p;
          ...
          (
          -p = skb_put(skb, sizeof(t));
          +p = skb_put_data(skb, data, sizeof(t));
          |
          -p = (t *)skb_put(skb, sizeof(t));
          +p = skb_put_data(skb, data, sizeof(t));
          )
          (
          p2 = (t2)p;
          -memcpy(p2, data, sizeof(*p));
          |
          -memcpy(p, data, sizeof(*p));
          )
      
          @@
          expression skb, len, data;
          @@
          -memcpy(skb_put(skb, len), data, len);
          +skb_put_data(skb, data, len);
      
      (again, manually post-processed to retain some comments)
      Reviewed-by: NStephen Hemminger <stephen@networkplumber.org>
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      59ae1d12
  24. 11 6月, 2017 1 次提交
  25. 09 6月, 2017 3 次提交
  26. 08 6月, 2017 1 次提交
    • D
      net: Fix inconsistent teardown and release of private netdev state. · cf124db5
      David S. Miller 提交于
      Network devices can allocate reasources and private memory using
      netdev_ops->ndo_init().  However, the release of these resources
      can occur in one of two different places.
      
      Either netdev_ops->ndo_uninit() or netdev->destructor().
      
      The decision of which operation frees the resources depends upon
      whether it is necessary for all netdev refs to be released before it
      is safe to perform the freeing.
      
      netdev_ops->ndo_uninit() presumably can occur right after the
      NETDEV_UNREGISTER notifier completes and the unicast and multicast
      address lists are flushed.
      
      netdev->destructor(), on the other hand, does not run until the
      netdev references all go away.
      
      Further complicating the situation is that netdev->destructor()
      almost universally does also a free_netdev().
      
      This creates a problem for the logic in register_netdevice().
      Because all callers of register_netdevice() manage the freeing
      of the netdev, and invoke free_netdev(dev) if register_netdevice()
      fails.
      
      If netdev_ops->ndo_init() succeeds, but something else fails inside
      of register_netdevice(), it does call ndo_ops->ndo_uninit().  But
      it is not able to invoke netdev->destructor().
      
      This is because netdev->destructor() will do a free_netdev() and
      then the caller of register_netdevice() will do the same.
      
      However, this means that the resources that would normally be released
      by netdev->destructor() will not be.
      
      Over the years drivers have added local hacks to deal with this, by
      invoking their destructor parts by hand when register_netdevice()
      fails.
      
      Many drivers do not try to deal with this, and instead we have leaks.
      
      Let's close this hole by formalizing the distinction between what
      private things need to be freed up by netdev->destructor() and whether
      the driver needs unregister_netdevice() to perform the free_netdev().
      
      netdev->priv_destructor() performs all actions to free up the private
      resources that used to be freed by netdev->destructor(), except for
      free_netdev().
      
      netdev->needs_free_netdev is a boolean that indicates whether
      free_netdev() should be done at the end of unregister_netdevice().
      
      Now, register_netdevice() can sanely release all resources after
      ndo_ops->ndo_init() succeeds, by invoking both ndo_ops->ndo_uninit()
      and netdev->priv_destructor().
      
      And at the end of unregister_netdevice(), we invoke
      netdev->priv_destructor() and optionally call free_netdev().
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      cf124db5
  27. 28 5月, 2017 1 次提交
    • V
      bonding: Prevent duplicate userspace notification · 7a7e96e0
      Vlad Yasevich 提交于
      Whenever a user changes bonding options, a NETDEV_CHANGEINFODATA
      notificatin is generated which results in a rtnelink message to
      be sent.  While runnig 'ip monitor', we can actually see 2 messages,
      one a result of the event, and the other a result of state change
      that is generated bo netdev_state_change().  However, this is not
      always the case. If bonding changes were done via sysfs or ifenslave
      (old ioctl interface), then only 1 message is seen.
      
      This patch removes duplicate messages in the case of using netlink
      to configure bonding.  It introduceds a separte function that
      triggers a netdev event and uses that function in the syfs and ioctl
      cases.
      
      This was discovered while auditing all the different envents and
      continues the effort of cleaning up duplicated netlink messages.
      
      CC: David Ahern <dsa@cumulusnetworks.com>
      CC: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: NVladislav Yasevich <vyasevic@redhat.com>
      Acked-by: NDavid Ahern <dsahern@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7a7e96e0
  28. 26 5月, 2017 1 次提交
    • N
      bonding: Don't update slave->link until ready to commit · 797a9364
      Nithin Sujir 提交于
      In the loadbalance arp monitoring scheme, when a slave link change is
      detected, the slave->link is immediately updated and slave_state_changed
      is set. Later down the function, the rtnl_lock is acquired and the
      changes are committed, updating the bond link state.
      
      However, the acquisition of the rtnl_lock can fail. The next time the
      monitor runs, since slave->link is already updated, it determines that
      link is unchanged. This results in the bond link state permanently out
      of sync with the slave link.
      
      This patch modifies bond_loadbalance_arp_mon() to handle link changes
      identical to bond_ab_arp_{inspect/commit}(). The new link state is
      maintained in slave->new_link until we're ready to commit at which point
      it's copied into slave->link.
      
      NOTE: miimon_{inspect/commit}() has a more complex state machine
      requiring the use of the bond_{propose,commit}_link_state() functions
      which maintains the intermediate state in slave->link_new_state. The arp
      monitors don't require that.
      
      Testing: This bug is very easy to reproduce with the following steps.
      1. In a loop, toggle a slave link of a bond slave interface.
      2. In a separate loop, do ifconfig up/down of an unrelated interface to
      create contention for rtnl_lock.
      Within a few iterations, the bond link goes out of sync with the slave
      link.
      Signed-off-by: NNithin Nayak Sujir <nsujir@tintri.com>
      Cc: Mahesh Bandewar <maheshb@google.com>
      Cc: Jay Vosburgh <jay.vosburgh@canonical.com>
      Acked-by: NMahesh Bandewar <maheshb@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      797a9364
  29. 23 5月, 2017 2 次提交
    • J
      bonding: fix randomly populated arp target array · 72ccc471
      Jarod Wilson 提交于
      In commit dc9c4d0f, the arp_target array moved from a static global
      to a local variable. By the nature of static globals, the array used to
      be initialized to all 0. At present, it's full of random data, which
      that gets interpreted as arp_target values, when none have actually been
      specified. Systems end up booting with spew along these lines:
      
      [   32.161783] IPv6: ADDRCONF(NETDEV_UP): lacp0: link is not ready
      [   32.168475] IPv6: ADDRCONF(NETDEV_UP): lacp0: link is not ready
      [   32.175089] 8021q: adding VLAN 0 to HW filter on device lacp0
      [   32.193091] IPv6: ADDRCONF(NETDEV_UP): lacp0: link is not ready
      [   32.204892] lacp0: Setting MII monitoring interval to 100
      [   32.211071] lacp0: Removing ARP target 216.124.228.17
      [   32.216824] lacp0: Removing ARP target 218.160.255.255
      [   32.222646] lacp0: Removing ARP target 185.170.136.184
      [   32.228496] lacp0: invalid ARP target 255.255.255.255 specified for removal
      [   32.236294] lacp0: option arp_ip_target: invalid value (-255.255.255.255)
      [   32.243987] lacp0: Removing ARP target 56.125.228.17
      [   32.249625] lacp0: Removing ARP target 218.160.255.255
      [   32.255432] lacp0: Removing ARP target 15.157.233.184
      [   32.261165] lacp0: invalid ARP target 255.255.255.255 specified for removal
      [   32.268939] lacp0: option arp_ip_target: invalid value (-255.255.255.255)
      [   32.276632] lacp0: Removing ARP target 16.0.0.0
      [   32.281755] lacp0: Removing ARP target 218.160.255.255
      [   32.287567] lacp0: Removing ARP target 72.125.228.17
      [   32.293165] lacp0: Removing ARP target 218.160.255.255
      [   32.298970] lacp0: Removing ARP target 8.125.228.17
      [   32.304458] lacp0: Removing ARP target 218.160.255.255
      
      None of these were actually specified as ARP targets, and the driver does
      seem to clean up the mess okay, but it's rather noisy and confusing, leaks
      values to userspace, and the 255.255.255.255 spew shows up even when debug
      prints are disabled.
      
      The fix: just zero out arp_target at init time.
      
      While we're in here, init arp_all_targets_value in the right place.
      
      Fixes: dc9c4d0f ("bonding: reduce scope of some global variables")
      CC: Mahesh Bandewar <maheshb@google.com>
      CC: Jay Vosburgh <j.vosburgh@gmail.com>
      CC: Veaceslav Falico <vfalico@gmail.com>
      CC: Andy Gospodarek <andy@greyhouse.net>
      CC: netdev@vger.kernel.org
      CC: stable@vger.kernel.org
      Signed-off-by: NJarod Wilson <jarod@redhat.com>
      Acked-by: NAndy Gospodarek <andy@greyhouse.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      72ccc471
    • J
      bonding: fix accounting of active ports in 3ad · 751da2a6
      Jarod Wilson 提交于
      As of 7bb11dc9 and 0622cab0, bond slaves in a 3ad bond are not
      removed from the aggregator when they are down, and the active slave count
      is NOT equal to number of ports in the aggregator, but rather the number
      of ports in the aggregator that are still enabled. The sysfs spew for
      bonding_show_ad_num_ports() has a comment that says "Show number of active
      802.3ad ports.", but it's currently showing total number of ports, both
      active and inactive. Remedy it by using the same logic introduced in
      0622cab0 in __bond_3ad_get_active_agg_info(), so sysfs, procfs and
      netlink all report the number of active ports. Note that this means that
      IFLA_BOND_AD_INFO_NUM_PORTS really means NUM_ACTIVE_PORTS instead of
      NUM_PORTS, and thus perhaps should be renamed for clarity.
      
      Lightly tested on a dual i40e lacp bond, simulating link downs with an ip
      link set dev <slave2> down, was able to produce the state where I could
      see both in the same aggregator, but a number of ports count of 1.
      
      MII Status: up
      Active Aggregator Info:
              Aggregator ID: 1
              Number of ports: 2 <---
      Slave Interface: ens10
      MII Status: up <---
      Aggregator ID: 1
      Slave Interface: ens11
      MII Status: up
      Aggregator ID: 1
      
      MII Status: up
      Active Aggregator Info:
              Aggregator ID: 1
              Number of ports: 1 <---
      Slave Interface: ens10
      MII Status: down <---
      Aggregator ID: 1
      Slave Interface: ens11
      MII Status: up
      Aggregator ID: 1
      
      CC: Jay Vosburgh <j.vosburgh@gmail.com>
      CC: Veaceslav Falico <vfalico@gmail.com>
      CC: Andy Gospodarek <andy@greyhouse.net>
      CC: netdev@vger.kernel.org
      Signed-off-by: NJarod Wilson <jarod@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      751da2a6