1. 13 Jun, 2013 2 commits
    • bonding: fix igmp_retrans type and two related races · 4f5474e7
      Committed by Nikolay Aleksandrov
      First the type of igmp_retrans (which is the actual counter of
      igmp_resend parameter) is changed to u8 to be able to store values up
      to 255 (as per documentation). There are two races that were hidden
      there and which are easy to trigger after the previous fix, the first is
      between bond_resend_igmp_join_requests and bond_change_active_slave
      where igmp_retrans is set and can be altered by the periodic. The second
      race condition is between multiple running instances of the periodic
      (upon execution it can be scheduled again for immediate execution which
      can cause the counter to go < 0 which in the unsigned case leads to
      unnecessary igmp retransmissions).
      Since in bond_change_active_slave bond->lock is held for reading and
      curr_slave_lock for writing, we use curr_slave_lock for mutual
      exclusion. We can't drop them as there're cases where RTNL is not held
      when bond_change_active_slave is called. RCU is unlocked in
      bond_resend_igmp_join_requests before getting curr_slave_lock since we
      don't need it there and it's pointless to delay.
      The decrement is moved inside the "if" block because if we decrement
      unconditionally there's still a possibility for a race condition although
      it is much more difficult to hit (many changes have to happen in
      a very short period in order to trigger) which in the case of 3 parallel
      running instances of this function and igmp_retrans == 1
      (with check bond->igmp_retrans-- > 1) is:
      f1 passes, doesn't re-schedule, but decrements - igmp_retrans = 0
      f2 then passes, doesn't re-schedule, but decrements - igmp_retrans = 255
      f3 does the unnecessary retransmissions.
      Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
      Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
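The three-instance scenario in the message above can be reproduced with a small user-space model. This is a minimal sketch, not the kernel code: `fake_bond`, `resend_unconditional`, `resend_guarded` and `run_instances` are hypothetical names, and the calls run back to back rather than truly in parallel, which is enough to show the u8 wrap-around.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the bond's retransmit counter (u8 after the fix). */
struct fake_bond {
    uint8_t igmp_retrans;
};

/*
 * Variant discussed in the commit message: the decrement happens on every
 * call, so extra callers push the u8 counter below 0 and it wraps to 255.
 * Returns 1 if the caller would re-schedule itself.
 */
static int resend_unconditional(struct fake_bond *bond)
{
    return bond->igmp_retrans-- > 1;
}

/*
 * Variant with the decrement inside the "if": the counter is only touched
 * when another retransmission is actually due, so it cannot wrap.
 */
static int resend_guarded(struct fake_bond *bond)
{
    if (bond->igmp_retrans > 1) {
        bond->igmp_retrans--;
        return 1;
    }
    return 0;
}

/* Run the same check n times back to back and return the final counter. */
static uint8_t run_instances(int (*check)(struct fake_bond *),
                             uint8_t start, int n)
{
    struct fake_bond bond = { .igmp_retrans = start };

    assert(n >= 0);
    for (int i = 0; i < n; i++)
        check(&bond);
    return bond.igmp_retrans;
}
```

Starting from 1, two unconditional calls leave the counter at 255, so a third caller sees a value greater than 1 and retransmits; the guarded variant leaves the counter at 1 however many callers run. In the kernel the check additionally needs the mutual exclusion described above, which this model does not attempt to show.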
    • bonding: reset master mac on first enslave failure · b8fad459
      Committed by Nikolay Aleksandrov
      If the bond device is supposed to get the first slave's MAC address and
      the first enslavement fails then we need to reset the master's MAC
      otherwise it will stay the same as the failed slave device. We do it
      after err_undo_flags since that is the first place where the MAC can be
      changed and we check if it should've been the first slave and if the
      bond's MAC was set to it because that err place is used by multiple
      locations prior to changing the master's MAC address.
      Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
      Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
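The condition described above (this would have been the first slave, and the bond already carries its MAC) can be sketched in user space. All names here (`fake_dev`, `fake_bond`, `undo_first_slave_mac`, `fake_enslave`) are hypothetical, and zeroing the address is only this sketch's assumption for what "reset" means; the message does not say what value the kernel writes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6

/* Hypothetical, simplified models; these are not the kernel structures. */
struct fake_dev {
    uint8_t dev_addr[ETH_ALEN];
};

struct fake_bond {
    struct fake_dev dev;
    int slave_cnt;
};

/*
 * Error-path helper: if no slave is attached (so this would have been the
 * first one) and the bond already carries the slave's MAC, clear it.
 * Zeroing is an assumption of this sketch for what "reset" means.
 */
static void undo_first_slave_mac(struct fake_bond *bond,
                                 const struct fake_dev *slave)
{
    if (bond->slave_cnt == 0 &&
        memcmp(bond->dev.dev_addr, slave->dev_addr, ETH_ALEN) == 0)
        memset(bond->dev.dev_addr, 0, ETH_ALEN);
}

/*
 * Try to enslave. `fail_late` forces a failure after the bond has already
 * copied the first slave's MAC. Returns 0 on success, -1 on failure.
 */
static int fake_enslave(struct fake_bond *bond, const struct fake_dev *slave,
                        bool fail_late)
{
    assert(bond && slave);

    if (bond->slave_cnt == 0)
        memcpy(bond->dev.dev_addr, slave->dev_addr, ETH_ALEN);

    if (fail_late) {
        undo_first_slave_mac(bond, slave);
        return -1;
    }
    bond->slave_cnt++;
    return 0;
}
```

Both checks matter because the shared error path is also reached when a later slave fails, and in that case the bond's MAC must be left alone.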
  2. 08 Jun, 2013 2 commits
    • bonding: disallow change of MAC if fail_over_mac enabled · 1b5acd29
      Committed by Jay Vosburgh
      Currently, if fail_over_mac is set to active, then attempts to
      change the MAC of the bond itself silently fail.  However, if fail_over_mac
      is set to follow, changes are permitted.
      
      	Permitting the bond's MAC to change with fail_over_mac=follow
      will disrupt the follow functionality, which normally controls the
      assignment of MAC address to the bond and its slaves, and can cause
      multiple ports to be assigned the same MAC address, which will interfere
      with the functioning of the device (where the device here is a
      virtualization-aware card for s390, qeth).
      Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bonding: Convert hw addr handling to sync/unsync, support ucast addresses · 303d1cbf
      Committed by Jay Vosburgh
      This patch converts bonding to use the dev_uc/mc_sync and
      dev_uc/mc_sync_multiple functions for updating the hardware addresses
      of bonding slaves.
      
      	The existing functions to add or remove addresses are removed,
      and their functionality is replaced with calls to dev_mc_sync or
      dev_mc_sync_multiple, depending upon the bonding mode.
      
      	Calls to dev_uc_sync and dev_uc_sync_multiple are also added,
      so that unicast addresses added to a bond will be properly synced with
      its slaves.
      
      	Various functions are renamed to better reflect the new
      situation, and relevant comments are updated.
      Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
      Cc: Vlad Yasevich <vyasevic@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 29 May, 2013 1 commit
  4. 20 May, 2013 2 commits
  5. 17 May, 2013 1 commit
    • bonding: allow TSO being set on bonding master · b0ce3508
      Committed by Eric Dumazet
      In some situations, we need to disable TSO on bonding slaves.
      
      The bonding device automatically unsets TSO in bond_fix_features(), and
      performance is not good because:
      
      1) We consume more cpu cycles.
      
      2) GSO segmentation has some bugs leading to out of order TCP packets
      if this segmentation is done before virtual device. This particular
      problem will be addressed in a separate patch.
      
      This patch allows TSO being set/unset on the bonding master,
      so that GSO segmentation is done after bonding layer.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Michał Mirosław <mirqus@gmail.com>
      Cc: Jay Vosburgh <fubar@us.ibm.com>
      Cc: Andy Gospodarek <andy@greyhouse.net>
      Cc: Maciej Żenczykowski <maze@google.com>
      Cc: Tom Herbert <therbert@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  6. 25 Apr, 2013 1 commit
  7. 20 Apr, 2013 9 commits
  8. 19 Apr, 2013 1 commit
  9. 12 Apr, 2013 2 commits
  10. 09 Apr, 2013 2 commits
  11. 05 Apr, 2013 1 commit
    • bonding: remove sysfs before removing devices · 4de79c73
      Committed by Veaceslav Falico
      We have a race condition if we try to rmmod bonding and simultaneously add
      a bond master through sysfs. In bonding_exit() we first remove the devices
      (through rtnl_link_unregister() ) and only after that we remove the sysfs.
      If we manage to add a device through sysfs after the devices were
      removed, we'll end up with that device/sysfs structure and with the module
      unloaded.
      
      Fix this by first removing the sysfs and only after that calling
      rtnl_link_unregister().
      Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
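The effect of the teardown order can be illustrated with a deterministic user-space model. This is only a sketch: `fake_module` and the helper functions are hypothetical, and the concurrent sysfs write is modelled as a call placed between the two teardown steps.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the module's global state; not kernel code. */
struct fake_module {
    bool sysfs_present;   /* the sysfs control entry exists */
    int  devices;         /* bond masters currently registered */
};

/* Adding a master through sysfs only works while the sysfs entry exists. */
static bool sysfs_add_master(struct fake_module *m)
{
    if (!m->sysfs_present)
        return false;
    m->devices++;
    return true;
}

static void remove_devices(struct fake_module *m) { m->devices = 0; }
static void remove_sysfs(struct fake_module *m)   { m->sysfs_present = false; }

/*
 * Simulate module exit with a concurrent sysfs add landing between the two
 * teardown steps. Returns the number of devices left behind after unload.
 */
static int exit_with_racing_add(bool sysfs_first)
{
    struct fake_module m = { .sysfs_present = true, .devices = 2 };

    if (sysfs_first) {
        remove_sysfs(&m);
        sysfs_add_master(&m);   /* racing add: rejected */
        remove_devices(&m);
    } else {
        remove_devices(&m);
        sysfs_add_master(&m);   /* racing add: succeeds, device leaks */
        remove_sysfs(&m);
    }
    return m.devices;
}
```

With the old order one device survives the unload; with sysfs removed first the racing add has nothing to write to and no device is left behind.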
  12. 03 Apr, 2013 1 commit
  13. 27 Mar, 2013 1 commit
  14. 13 Mar, 2013 1 commit
    • bonding: don't call update_speed_duplex() under spinlocks · 876254ae
      Committed by Veaceslav Falico
      bond_update_speed_duplex() might sleep while calling underlying slave's
      routines. Move it out of atomic context in bond_enslave() and remove it
      from bond_miimon_commit() - it was introduced by commit 546add79, however
      when the slave interfaces go up/change state it's their responsibility to
      fire NETDEV_UP/NETDEV_CHANGE events so that bonding can properly update
      their speed.
      
      I've tested it on all combinations of ifup/ifdown, autoneg/speed/duplex
      changes, remote-controlled and local, on (not) MII-based cards. All changes
      are visible.
      Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  15. 08 Mar, 2013 1 commit
  16. 27 Feb, 2013 1 commit
  17. 19 Feb, 2013 2 commits
    • bonding: set sysfs device_type to 'bond' · b3f92b63
      Committed by Doug Goldstein
      Sets the sysfs device_type to 'bond' for udev. This allows udev rules to
      be created for bond devices. This is similar to how other network
      devices set their device_type.
      Signed-off-by: Doug Goldstein <cardoe@cardoe.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bonding: fix bond_release_all inconsistencies · 0896341a
      Committed by nikolay@redhat.com
      This patch fixes the following inconsistencies in bond_release_all:
      - IFF_BONDING flag is not stripped from slaves
      - MTU is not restored
      - no netdev notifiers are sent
      Instead of trying to keep bond_release and bond_release_all in sync
      I think we can re-use bond_release as the environment for calling it
      is correct (RTNL is held). I have been running tests for the past
      week and they came out successful. The only way for bond_release to fail
      is for the slave to be attached in a different bond or to not be a slave
      but that cannot happen as RTNL is held and no slave manipulations can be
      achieved.
      
      V2: As suggested bond_release is renamed to __bond_release_one with a
      new parameter "all" introduced so to avoid calling unnecessary code while
      destroying a bond, and a wrapper for it called bond_release is created
      because of ndo_del_link. bond_release_all() is removed.
      Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
      Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
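The V2 structure (one shared release path with an "all" flag, plus a thin wrapper that keeps the old signature) might look roughly like the following user-space sketch. The names `fake_bond`, `fake_release_one`, `fake_release` and `fake_destroy` are hypothetical, and the `reconfig_runs` counter is an assumed stand-in for whatever work is unnecessary while a bond is being destroyed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model; not the kernel's struct bonding. */
struct fake_bond {
    int slave_cnt;
    int reconfig_runs;   /* how often the per-release reconfiguration ran */
};

/*
 * Shared release path. With all == true the bond is being destroyed, so
 * the per-slave reconfiguration is skipped as unnecessary.
 * Returns 0 on success, -1 if there is no slave to release.
 */
static int fake_release_one(struct fake_bond *bond, bool all)
{
    assert(bond);

    if (bond->slave_cnt == 0)
        return -1;
    bond->slave_cnt--;
    if (!all)
        bond->reconfig_runs++;
    return 0;
}

/* Thin wrapper keeping the single-slave signature for existing callers. */
static int fake_release(struct fake_bond *bond)
{
    return fake_release_one(bond, false);
}

/* Teardown: release every slave through the same shared path. */
static void fake_destroy(struct fake_bond *bond)
{
    while (bond->slave_cnt > 0)
        fake_release_one(bond, true);
}
```

Routing both cases through one function is what removes the need to keep two release implementations in sync.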
  18. 12 Feb, 2013 1 commit
    • netpoll: Fix __netpoll_rcu_free so that it can hold the rtnl lock · 2cde6acd
      Committed by Neil Horman
      __netpoll_rcu_free is used to free netpoll structures when the rtnl_lock is
      already held.  The mechanism is used to asynchronously call __netpoll_cleanup
      outside of the holding of the rtnl_lock, so as to avoid deadlock.
      Unfortunately, __netpoll_cleanup modifies pointers (dev->np), which means the
      rtnl_lock must be held while calling it.  Further, it cannot be held, because
      rcu callbacks may be issued in softirq contexts, which cannot sleep.
      
      Fix this by converting the rcu callback to a work queue that is guaranteed to
      get scheduled in process context, so that we can hold the rtnl properly while
      calling __netpoll_cleanup
      
      Tested successfully by myself.
      Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
      CC: "David S. Miller" <davem@davemloft.net>
      CC: Cong Wang <amwang@redhat.com>
      CC: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  19. 05 Feb, 2013 1 commit
  20. 31 Jan, 2013 1 commit
  21. 07 Jan, 2013 1 commit
  22. 05 Jan, 2013 1 commit
  23. 15 Dec, 2012 1 commit
  24. 08 Dec, 2012 1 commit
  25. 30 Nov, 2012 2 commits
    • bonding: make arp_ip_target parameter checks consistent with sysfs · 90fb6250
      Committed by nikolay@redhat.com
      The module can be loaded with arp_ip_target="255.255.255.255" which makes
       it impossible to remove as the function in sysfs checks for that value,
       so we make the parameter checks consistent with sysfs.
      
       v2: Fix formatting
       v3: Make description text < 75 columns
      Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
      Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
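A user-space sketch of the kind of check described above: accept a target only if it parses as an IPv4 address and is not 255.255.255.255. `arp_target_ok` is a hypothetical name, and the sketch uses the POSIX `inet_pton` rather than the kernel's own parsing helpers.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdbool.h>

/*
 * Returns true if `s` is a dotted-quad IPv4 address that this sketch
 * accepts as an ARP target: it must parse and must not be the broadcast
 * address 255.255.255.255.
 */
static bool arp_target_ok(const char *s)
{
    struct in_addr addr;

    assert(s);
    if (inet_pton(AF_INET, s, &addr) != 1)
        return false;
    return addr.s_addr != htonl(INADDR_BROADCAST);
}
```

Applying the same predicate when the module is loaded and when the value is changed at runtime is what keeps a target from being accepted by one path and then refused by the other.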
    • bonding: fix miimon and arp_interval delayed work race conditions · fbb0c41b
      Committed by nikolay@redhat.com
      First I would give three observations which will be used later.
      Observation 1: if (delayed_work_pending(wq)) cancel_delayed_work(wq)
       This usage is wrong because the pending bit is cleared just before the
       work's fn is executed and if the function re-arms itself we might end up
       with the work still running. It's safe to call cancel_delayed_work_sync()
       even if the work is not queued at all.
      Observation 2: Use of INIT_DELAYED_WORK()
       Work needs to be initialized only once prior to (de/en)queueing.
      Observation 3: IFF_UP is set only after ndo_open is called
      
      Related race conditions:
      1. Race between bonding_store_miimon() and bonding_store_arp_interval()
       Because of Obs.1 we can end up having both works enqueued.
      2. Multiple races with INIT_DELAYED_WORK()
       Since the works are not protected by anything between INIT_DELAYED_WORK()
       and calls to (en/de)queue it is possible for races between the following
       functions:
       (races are also possible between the calls to INIT_DELAYED_WORK()
        and workqueue code)
       bonding_store_miimon() - bonding_store_arp_interval(), bond_close(),
      			  bond_open(), enqueued functions
       bonding_store_arp_interval() - bonding_store_miimon(), bond_close(),
      				bond_open(), enqueued functions
      3. By Obs.1 we need to change bond_cancel_all()
      
      Bugs 1 and 2 are fixed by moving all work initializations in bond_open
      which by Obs. 2 and Obs. 3 and the fact that we make sure that all works
      are cancelled in bond_close(), is guaranteed not to have any work
      enqueued.
      Also RTNL lock is now acquired in bonding_store_miimon/arp_interval so
      they can't race with bond_close and bond_open. The opposing work is
      cancelled only if the IFF_UP flag is set and it is cancelled
      unconditionally. The opposing work is already cancelled if the interface
      is down so no need to cancel it again. This way we don't need new
      synchronizations for the bonding workqueue. These bugs (and fixes) are
      tied together and belong in the same patch.
      Note: I have left 1 line intentionally over 80 characters (84) because I
            didn't like how it looks broken down. If you'd prefer it otherwise,
            then simply break it.
      
       v2: Make description text < 75 columns
      Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
      Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
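Observation 1 can be illustrated with a deterministic user-space model of the pending bit. This is a rough sketch under stated assumptions: `fake_work` and all helpers are hypothetical, the worker is split into explicit start and finish steps so the cancel can be placed mid-flight, and the "sync" cancel is approximated by a flag that blocks re-arming (the real cancel_delayed_work_sync() also waits for a running callback to finish, which this model does not show).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical single-work model; it mimics only the pending bit. */
struct fake_work {
    bool pending;    /* queued, not yet started */
    bool running;    /* callback currently executing */
    bool cancelled;  /* set by the modelled "sync" cancel to block re-arming */
};

static void queue_work_model(struct fake_work *w)
{
    if (!w->cancelled)
        w->pending = true;
}

/* Worker picks up the work: pending is cleared before the callback runs. */
static void worker_start(struct fake_work *w)
{
    w->pending = false;
    w->running = true;
}

/* End of the callback: a self-re-arming work queues itself again. */
static void worker_finish(struct fake_work *w)
{
    queue_work_model(w);
    w->running = false;
}

/* The pattern from Observation 1: cancel only if the pending bit is set. */
static void cancel_if_pending(struct fake_work *w)
{
    if (w->pending)
        w->pending = false;
}

/* Modelled "sync" cancel: unconditional, and blocks further re-arming. */
static void cancel_sync_model(struct fake_work *w)
{
    w->cancelled = true;
    w->pending = false;
}

/* Cancel while the callback is mid-flight; report if work is still queued. */
static bool still_queued_after_cancel(bool use_sync)
{
    struct fake_work w = { 0 };

    queue_work_model(&w);
    worker_start(&w);
    if (use_sync)
        cancel_sync_model(&w);
    else
        cancel_if_pending(&w);
    worker_finish(&w);
    return w.pending;
}
```

In the model, the pending-guarded cancel sees a cleared bit, does nothing, and the work re-arms itself afterwards; the unconditional variant leaves nothing queued.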