提交 · 6c8d23f78646b20bdbdad476d015b6594ac8ff1c · openeuler / raspberrypi-kernel

04 9月, 2013 1 次提交

bonding: simplify and fix peer notification · 6c8d23f7

由 nikolay@redhat.com 提交于 9月 02, 2013

This patch aims to remove a use of the bond->lock for mutual exclusion
which will later allow easier migration to RCU of the users of this
functionality. We use RTNL as a synchronizing mechanism since it's
always held when send_peer_notif is set, and when it is decremented from
the notifier function. We can also drop some locking, and fix the
leakage of the send_peer_notif counter.
Signed-off-by: NNikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

6c8d23f7

30 8月, 2013 5 次提交

bonding: pr_debug instead of pr_warn in bond_arp_send_all · 3e32582f

由 Veaceslav Falico 提交于 8月 28, 2013

They're simply annoying and will spam dmesg constantly if we hit them, so
convert to pr_debug so that we still can access them in case of debugging.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: NVeaceslav Falico <vfalico@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3e32582f

bonding: remove vlan_list/current_alb_vlan · e868b0c9

由 Veaceslav Falico 提交于 8月 28, 2013

Currently there are no real users of vlan_list/current_alb_vlan, only the
helpers which maintain them, so remove them.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: NVeaceslav Falico <vfalico@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e868b0c9

bonding: use vlan_uses_dev() in __bond_release_one() · a59d3d21

由 Veaceslav Falico 提交于 8月 28, 2013

We always hold the rtnl_lock() in __bond_release_one(), so use
vlan_uses_dev() instead of bond_vlan_used().

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: NVeaceslav Falico <vfalico@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a59d3d21

bonding: convert bond_has_this_ip() to use upper devices · 50223ce4

由 Veaceslav Falico 提交于 8月 28, 2013

Currently, bond_has_this_ip() is aware only of vlan upper devices, and thus
will return false if the address is associated with the upper bridge or any
other device, and thus will break the arp logic.

Fix this by using the upper device list. For every upper device we verify
if the address associated with it is our address, and if yes - return true.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: NVeaceslav Falico <vfalico@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

50223ce4

bonding: make bond_arp_send_all use upper device list · 27bc11e6

由 Veaceslav Falico 提交于 8月 28, 2013

Currently, bond_arp_send_all() is aware only of vlans, which breaks
configurations like bond <- bridge (or any other 'upper' device) with IP
(which is quite a common scenario for virt setups).

To fix this we convert the bond_arp_send_all() to first verify if the rt
device is the bond itself, and if not - to go through its list of upper
vlans and their respectiv upper devices (if the vlan's upper device matches
- tag the packet), if still not found - go through all of our upper list
devices to see if any of them match the route device for the target. If the
match is a vlan device - we also save its vlan_id and tag it in
bond_arp_send().

Also, clean the function a bit to be more readable.

CC: Vlad Yasevich <vyasevic@redhat.com>
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: NVeaceslav Falico <vfalico@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

27bc11e6

26 8月, 2013 1 次提交

bonding: fix error return code in bond_enslave() · b8e2fde4

由 Wei Yongjun 提交于 8月 23, 2013

Fix to return a negative error code in the add bond vlan ids error
handling case instead of 0, as done elsewhere in this function.

Introduced by commit 1ff412ad.
(bonding: change the bond's vlan syncing functions with the standard ones)
Signed-off-by: NWei Yongjun <yongjun_wei@trendmicro.com.cn>
Acked-by: NNikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b8e2fde4

09 8月, 2013 2 次提交

bonding: unwind on bond_add_vlan failure · b20903f2

由 nikolay@redhat.com 提交于 8月 06, 2013

In case of bond_add_vlan() failure currently we'll have the vlan's
refcnt bumped up in all slaves, but it will never go down because it
failed to get added to the bond, so properly unwind the added vlan if
bond_add_vlan fails.
Signed-off-by: NNikolay Aleksandrov <nikolay@redhat.com>
Acked-by: NVeaceslav Falico <vfalico@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b20903f2

bonding: change the bond's vlan syncing functions with the standard ones · 1ff412ad

由 nikolay@redhat.com 提交于 8月 06, 2013

Now we have vlan_vids_add/del_by_dev() which serve the same purpose as
bond's bond_add/del_vlans_on_slave() with the good side effect of
reverting the changes if one of the additions fails.
There's only 1 change in the behaviour of enslave: if adding of the
vlans to the slave fails, we'll fail the enslaving because otherwise we
might delete some vlan that wasn't added by the bonding.
The only way this may happen is with ENOMEM currently, so we're in trouble
anyway.
Signed-off-by: NNikolay Aleksandrov <nikolay@redhat.com>
Acked-by: NVeaceslav Falico <vfalico@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1ff412ad

06 8月, 2013 4 次提交

bonding: modify only neigh_parms owned by us · 3b380877

由 Veaceslav Falico 提交于 8月 02, 2013

Otherwise, on neighbour creation, bond_neigh_init() will be called with a
foreign netdev.
Signed-off-by: NVeaceslav Falico <vfalico@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3b380877

bonding: remove locking from bond_set_rx_mode() · 7864a1ad

由 Veaceslav Falico 提交于 8月 05, 2013

We're already protected by RTNL lock, so nothing can happen to bond/its
slaves, and thus the locking is useless here (both bond->lock and
bond->curr_active_slave).

Also, add ASSERT_RTNL() both to bond_set_rx_mode() and bond_hw_addr_swap()
to catch possible uses of it without RTNL locking.

This patch also saves us from a lockdep false-positive in
bond_set_rx_mode() vs bond_hw_addr_swap().

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: NVeaceslav Falico <vfalico@redhat.com>
Signed-off-by: NNikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

7864a1ad

bonding: add bond_time_in_interval() and use it for time comparison · e7f63f1d

由 Veaceslav Falico 提交于 8月 03, 2013

Currently we use a lot of time comparison math for arp_interval
comparisons, which are sometimes quite hard to read and understand.

All the time comparisons have one pattern:
(time - arp_interval_jiffies) <= jiffies <= (time + mod *
arp_interval_jiffies + arp_interval_jiffies/2)

Introduce a new helper - bond_time_in_interval(), which will do the math in
one place and, thus, will clean up the logical code. This helper introduces
a bit of overhead (by always calculating the jiffies from arp_interval),
however it's really not visible, considering that functions using it
usually run once in arp_interval milliseconds.

There are several lines slightly over 80 chars, however breaking them would
result in more hard-to-read code than several character after the 80 mark.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: NVeaceslav Falico <vfalico@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e7f63f1d

bonding: call slave_last_rx() only once per slave · def4460c

由 Veaceslav Falico 提交于 8月 03, 2013

Simple cleanup to not call slave_last_rx() on every time function. It won't
give any measurable boost - but looks cleaner and easier to understand.

There are no time-consuming functions in between these calls, so it's safe
to call it in the beginning only once.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: NVeaceslav Falico <vfalico@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

def4460c

03 8月, 2013 1 次提交

bonding: modify only neigh_parms owned by us · 9918d5bf

由 Veaceslav Falico 提交于 8月 02, 2013

Otherwise, on neighbour creation, bond_neigh_init() will be called with a
foreign netdev.
Signed-off-by: NVeaceslav Falico <vfalico@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9918d5bf

02 8月, 2013 6 次提交

bonding: initial RCU conversion · 278b2083

由 nikolay@redhat.com 提交于 8月 01, 2013

This patch does the initial bonding conversion to RCU. After it the
following modes are protected by RCU alone: roundrobin, active-backup,
broadcast and xor. Modes ALB/TLB and 3ad still acquire bond->lock for
reading, and will be dealt with later. curr_active_slave needs to be
dereferenced via rcu in the converted modes because the only thing
protecting the slave after this patch is rcu_read_lock, so we need the
proper barrier for weakly ordered archs and to make sure we don't have
stale pointer. It's not tagged with __rcu yet because there's still work
to be done to remove the curr_slave_lock, so sparse will complain when
rcu_assign_pointer and rcu_dereference are used, but the alternative to use
rcu_dereference_protected would've created much bigger code churn which is
more difficult to test and review. That will be converted in time.

1. Active-backup mode
 1.1 Perf recording while doing iperf -P 4
  - old bonding: iperf spent 0.55% in bonding, system spent 0.29% CPU
                 in bonding
  - new bonding: iperf spent 0.29% in bonding, system spent 0.15% CPU
                 in bonding
 1.2. Bandwidth measurements
  - old bonding: 16.1 gbps consistently
  - new bonding: 17.5 gbps consistently

2. Round-robin mode
 2.1 Perf recording while doing iperf -P 4
  - old bonding: iperf spent 0.51% in bonding, system spent 0.24% CPU
                 in bonding
  - new bonding: iperf spent 0.16% in bonding, system spent 0.11% CPU
                 in bonding
 2.2 Bandwidth measurements
  - old bonding: 8 gbps (variable due to packet reorderings)
  - new bonding: 10 gbps (variable due to packet reorderings)

Of course the latency has improved in all converted modes, and moreover
while
doing enslave/release (since it doesn't affect tx anymore).

Also I've stress tested all modes doing enslave/release in a loop while
transmitting traffic.
Signed-off-by: NNikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

278b2083

bonding: factor out slave id tx code and simplify xmit paths · 15077228

由 Nikolay Aleksandrov 提交于 8月 01, 2013

I factored out the tx xmit code which relies on slave id in
bond_xmit_slave_id. It is global because later it can be used also in
3ad mode xmit. Unnecessary obvious comments are removed. Active-backup
mode is simplified because bond_dev_queue_xmit always consumes the skb.
bond_xmit_xor becomes one line because of bond_xmit_slave_id.
bond_for_each_slave_from is not used in bond_xmit_slave_id because later
when RCU is used we can avoid important race condition by using standard
rculist routines.
Signed-off-by: NNikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

15077228

bonding: simplify broadcast_xmit function · 78a646ce

由 Nikolay Aleksandrov 提交于 8月 01, 2013

We don't need to start from the curr_active_slave as the frame will be
sent to all eligible slaves anyway, so we remove the unnecessary local
variables, checks and comments, and make it use the standard list API.
This has the nice side-effect that later when it's converted to RCU
a race condition will be avoided which could lead to double packet tx.
Signed-off-by: NNikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

78a646ce

bonding: remove unnecessary read_locks of curr_slave_lock · 71bc3b2d

由 nikolay@redhat.com 提交于 8月 01, 2013

In all the cases we already hold bond->lock for reading, so the slave
can't get away and the check != NULL is sufficient. curr_active_slave
can still change after the read_lock is unlocked prior to use of the
dereferenced value, so there's no need for it. It either contains a
valid slave which we use (and can't get away), or it is NULL which is
checked.
In some places the read_lock of curr_slave_lock was left because we need
it not to change while performing some action (e.g. syncing current
active slave's addresses, sending ARP requests through the active slave)
such cases will be dealt with individually while converting to RCU.
Signed-off-by: NNikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

71bc3b2d

bonding: convert to list API and replace bond's custom list · dec1e90e

由 nikolay@redhat.com 提交于 8月 01, 2013

This patch aims to remove struct bonding's first_slave and struct
slave's next and prev pointers, and replace them with the standard Linux
list API. The old macros are converted to list API as well and some new
primitives are available now. The checks if there're slaves that used
slave_cnt have been replaced by the list_empty macro.
Also a few small style fixes, changing longest -> shortest line in local
variable declarations, leaving an empty line before return and removing
unnecessary brackets.
This is the first step to gradual RCU conversion.
Signed-off-by: NNikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

dec1e90e

bonding: fix system hang due to fast igmp timer rescheduling · 4beac029

由 Nikolay Aleksandrov 提交于 8月 01, 2013

After commit 4aa5dee4 ("net: convert resend IGMP to notifier event")
we try to acquire rtnl in bond_resend_igmp_join_requests but it can be
scheduled with rtnl already held (e.g. when bond_change_active_slave is
called with rtnl) causing a loop of immediate reschedules + calls because
rtnl_trylock fails each time since it's being already held.
For me this issue leads to system hangs very easy:
modprobe bonding; ifconfig bond0 up; ifenslave bond0 eth0; rmmod
bonding;

The fix is to introduce a small (1 jiffy) delay which is enough for the
sections holding rtnl to finish without putting any strain on the system.
Also adjust the timer in bond_change_active_slave to be 1 jiffy, since
most of the time it's called with rtnl already held.
Signed-off-by: NNikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4beac029

28 7月, 2013 1 次提交

bonding: remove bond_resend_igmp_join_requests read_unlock leftover · dcfe8048

由 nikolay@redhat.com 提交于 7月 27, 2013

After commit 4aa5dee4 ("net: convert resend IGMP to notifier event") we
have 1 read_unlock in bond_resend_igmp_join_requests which isn't paired
with a read_lock because it's removed by that commit.
Signed-off-by: NNikolay Aleksandrov <nikolay@redhat.com>
Reviewed-by: NJiri Pirko <jiri@resnulli.us>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

dcfe8048

27 7月, 2013 2 次提交

bond: cleanup netpoll code · 10eccb46

由 stephen hemminger 提交于 7月 24, 2013

This started out with fixing a sparse warning, then I realized that
the wrapper function bond_netpoll_info could just be removed
by rolling it into the enable code.
Signed-off-by: NStephen Hemminger <stephen@networkplumber.org>
Reviewed-by: NJiri Pirko <jiri@resnulli.us>
Acked-by: NNeil Horman <nhorman@tuxdriver.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

10eccb46

bonding: use pre-defined macro in bond_mode_name instead of magic number 0 · f5280948

由 Wang Sheng-Hui 提交于 7月 24, 2013

We have BOND_MODE_ROUNDROBIN pre-defined as 0, and it's the lowest
mode number.
Use it to check the arg lower bound instead of magic number 0 in
bond_mode_name.
Signed-off-by: NWang Sheng-Hui <shhuiw@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f5280948

25 7月, 2013 2 次提交

bonding: Fixed up a error "do not initialise statics to 0 or NULL" in bond_main.c · b07ea07b

由 dingtianhong 提交于 7月 23, 2013

The error is found by the checkpatch.pl tools.
Signed-off-by: NDing Tianhong <dingtianhong@huawei.com>
Cc: Jay Vosburgh <fubar@us.ibm.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b07ea07b

bonding: don't call slave_xxx_netpoll under spinlocks · c4cdef9b

由 dingtianhong 提交于 7月 23, 2013

The slave_xxx_netpoll will call synchronize_rcu_bh(),
so the function may schedule and sleep, it should't be
called under spinlocks.

bond_netpoll_setup() and bond_netpoll_cleanup() are always
protected by rtnl lock, it is no need to take the read lock,
as the slave list couldn't be changed outside rtnl lock.
Signed-off-by: NDing Tianhong <dingtianhong@huawei.com>
Cc: Jay Vosburgh <fubar@us.ibm.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c4cdef9b

24 7月, 2013 1 次提交

net: convert resend IGMP to notifier event · 4aa5dee4

由 Jiri Pirko 提交于 7月 20, 2013

Until now, bond_resend_igmp_join_requests() looks for vlans attached to
bonding device, bridge where bonding act as port manually. It does not
care of other scenarios, like stacked bonds or team device above. Make
this more generic and use netdev notifier to propagate the event to
upper devices and to actually call ip_mc_rejoin_groups().
Signed-off-by: NJiri Pirko <jiri@resnulli.us>
Acked-by: NVeaceslav Falico <vfalico@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4aa5dee4

30 6月, 2013 1 次提交

bonding: combine pr_debugs in bond_set_dev_addr into one · 008aebde

由 Nikolay Aleksandrov 提交于 6月 29, 2013

Combine the multiple pr_debugs in bond_set_dev_addr into one pr_debug.
Signed-off-by: NNikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

008aebde

28 6月, 2013 3 次提交

bonding: when cloning a MAC use NET_ADDR_STOLEN · ae0d6750

由 nikolay@redhat.com 提交于 6月 26, 2013

A simple semantic change, when a slave's MAC is cloned by the bond
master then set addr_assign_type to NET_ADDR_STOLEN instead of
NET_ADDR_SET. Also use bond_set_dev_addr() in BOND_FOM_ACTIVE mode
to change the bond's MAC address because the assign_type has to be
set properly.
Signed-off-by: NNikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ae0d6750

bonding: remove unnecessary dev_addr_from_first member · 97a1e639

由 nikolay@redhat.com 提交于 6月 26, 2013

In struct bonding there's a member called dev_addr_from_first which is
used to denote when the bond dev should clone the first slave's MAC
address but since we have netdev's addr_assign_type variable that is not
necessary. We clone the first slave's MAC each time we have a random MAC
set to the bond device. This has the nice side-effect of also fixing an
inconsistency - when the MAC address of the bond dev is set after its
creation, but prior to having slaves, it's not kept and the first slave's
MAC is cloned. The only way to keep the MAC was to create the bond device
with the MAC address set (e.g. through ip link). In all cases if the
bond device is left without any slaves - its MAC gets reset to a random
one as before.
Signed-off-by: NNikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

97a1e639

bonding: remove unnecessary setup_by_slave member · 8d2ada77

由 nikolay@redhat.com 提交于 6月 26, 2013

We have a member called setup_by_slave in struct bonding to denote if the
bond dev has different type than ARPHRD_ETHER, but that is already denoted
in bond's netdev type variable if it was setup by the slave, so use that
instead of the member.
Signed-off-by: NNikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

8d2ada77

26 6月, 2013 5 次提交

bonding: add an option to fail when any of arp_ip_target is inaccessible · 8599b52e

由 Veaceslav Falico 提交于 6月 24, 2013

Currently, we fail only when all of the ips in arp_ip_target are gone.
However, in some situations we might need to fail if even one host from
arp_ip_target becomes unavailable.

All situations, obviously, rely on the idea that we need *completely*
functional network, with all interfaces/addresses working correctly.

One real world example might be:
vlans on top on bond (hybrid port). If bond and vlans have ips assigned
and we have their peers monitored via arp_ip_target - in case of switch
misconfiguration (trunk/access port), slave driver malfunction or
tagged/untagged traffic dropped on the way - we will be able to switch
to another slave.

Though any other configuration needs that if we need to have access to all
arp_ip_targets.

This patch adds this possibility by adding a new parameter -
arp_all_targets (both as a module parameter and as a sysfs knob). It can be
set to:

	0 or any (the default) - which works exactly as it's working now -
	the slave is up if any of the arp_ip_targets are up.

	1 or all - the slave is up if all of the arp_ip_targets are up.

This parameter can be changed on the fly (via sysfs), and requires the mode
to be active-backup and arp_validate to be enabled (it obeys the
arp_validate config on which slaves to validate).

Internally it's done through:

1) Add target_last_arp_rx[BOND_MAX_ARP_TARGETS] array to slave struct. It's
   an array of jiffies, meaning that slave->target_last_arp_rx[i] is the
   last time we've received arp from bond->params.arp_targets[i] on this
   slave.

2) If we successfully validate an arp from bond->params.arp_targets[i] in
   bond_validate_arp() - update the slave->target_last_arp_rx[i] with the
   current jiffies value.

3) When getting slave's last_rx via slave_last_rx(), we return the oldest
   time when we've received an arp from any address in
   bond->params.arp_targets[].

If the value of arp_all_targets == 0 - we still work the same way as
before.

Also, update the documentation to reflect the new parameter.

v3->v4:
Kill the forgotten rtnl_unlock(), rephrase the documentation part to be
more clear, don't fail setting arp_all_targets if arp_validate is not set -
it has no effect anyway but can be easier to set up. Also, print a warning
if the last arp_ip_target is removed while the arp_interval is on, but not
the arp_validate.

v2->v3:
Use _bh spinlock, remove useless rtnl_lock() and use jiffies for new
arp_ip_target last arp, instead of slave_last_rx(). On bond_enslave(),
use the same initialization value for target_last_arp_rx[] as is used
for the default last_arp_rx, to avoid useless interface flaps.

Also, instead of failing to remove the last arp_ip_target just print a
warning - otherwise it might break existing scripts.

v1->v2:
Correctly handle adding/removing hosts in arp_ip_target - we need to
shift/initialize all slave's target_last_arp_rx. Also, don't fail module
loading on arp_all_targets misconfiguration, just disable it, and some
minor style fixes.
Signed-off-by: NVeaceslav Falico <vfalico@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

8599b52e

bonding: don't trust arp requests unless active slave really works · aeea64ac

由 Veaceslav Falico 提交于 6月 24, 2013

Currently, if we receive any arp packet on a backup slave in active-backup
mode and arp_validate enabled, we suppose that it's an arp request, swap
source/target ip and try to validate it. This optimization gives us
virtually no downtime in the most common situation (active and backup
slaves are in the same broadcast domain and the active slave failed).

However, if we can't reach the arp_ip_target(s), we end up in an endless
loop of reselecting slaves, because we receive our arp requests, sent by
the active slave, and think that backup slaves are up, thus selecting them
as active and, again, sending arp requests, which fool our backup slaves.

Fix this by not validating the swapped arp packets if the current active
slave didn't receive any arp reply after it was selected as active. This
way we will only accept arp requests if we know that the current active
slave can actually reach arp_ip_target.

v3->v4:
Obey 80 lines and make checkpatch.pl happy, per Sergei's suggestion.

v1->v3:
No change.
Signed-off-by: NVeaceslav Falico <vfalico@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

aeea64ac

bonding: don't validate arp if we don't have to · 2c146102

由 Veaceslav Falico 提交于 6月 24, 2013

Currently, we validate all the incoming arps if arp_validate not 0.
However, we don't have to validate backup slaves if arp_validate == active
and vice versa, so return early in bond_arp_rcv() in these cases.

It works correctly now because we verify arp_validate in slave_last_rx(),
however we're just doing useless work in bond_arp_rcv().
Signed-off-by: NVeaceslav Falico <vfalico@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2c146102

bonding: don't add duplicate targets to arp_ip_target · 0afee4e8

由 Veaceslav Falico 提交于 6月 24, 2013

Print a warning and skip them.
Signed-off-by: NVeaceslav Falico <vfalico@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0afee4e8

bonding: add helper function bond_get_targets_ip(targets, ip) · 87a7b84b

由 Veaceslav Falico 提交于 6月 24, 2013

Add function bond_get_targets_ip(targets, ip) which searches through
targets array of ips (arp_targets) and returns the position of first
match. If ip == 0, returns the first free slot. On failure to find the
ip or free slot, return -1.

Use it to verify if the arp we've received is valid and in sysfs.

v1->v2:
Fix "[2/6] bonding: add helper function bond_get_targets_ip(targets, ip)",
per Nikolay's advice, to verify if source ip != 0.0.0.0, otherwise we might
update 'null' arp_ip_targets' last_rx. Also, address style.
Signed-off-by: NVeaceslav Falico <vfalico@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

87a7b84b

24 6月, 2013 1 次提交

bonding: fix slave speed reporting in bond_miimon_commit · db4e9b2b

由 Nikolay Aleksandrov 提交于 6月 20, 2013

When we have BOND_LINK_UP the speed is reported unconditionally with %u
format although it can be SPEED_UNKNOWN (-1). After this patch it returns
0 in that case in an attempt to keep the existing scripts happy.
One line is intenionally left 81 chars because it gets ugly if broken.
Signed-off-by: NNikolay Aleksandrov <nikolay@redhat.com>
Acked-by: NVeaceslav Falico <vfalico@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

db4e9b2b

13 6月, 2013 2 次提交

bonding: fix igmp_retrans type and two related races · 4f5474e7

由 Nikolay Aleksandrov 提交于 6月 12, 2013

First the type of igmp_retrans (which is the actual counter of
igmp_resend parameter) is changed to u8 to be able to store values up
to 255 (as per documentation). There are two races that were hidden
there and which are easy to trigger after the previous fix, the first is
between bond_resend_igmp_join_requests and bond_change_active_slave
where igmp_retrans is set and can be altered by the periodic. The second
race condition is between multiple running instances of the periodic
(upon execution it can be scheduled again for immediate execution which
can cause the counter to go < 0 which in the unsigned case leads to
unnecessary igmp retransmissions).
Since in bond_change_active_slave bond->lock is held for reading and
curr_slave_lock for writing, we use curr_slave_lock for mutual
exclusion. We can't drop them as there're cases where RTNL is not held
when bond_change_active_slave is called. RCU is unlocked in
bond_resend_igmp_join_requests before getting curr_slave_lock since we
don't need it there and it's pointless to delay.
The decrement is moved inside the "if" block because if we decrement
unconditionally there's still a possibility for a race condition although
it is much more difficult to hit (many changes have to happen in
a very short period in order to trigger) which in the case of 3 parallel
running instances of this function and igmp_retrans == 1
(with check bond->igmp_retrans-- > 1) is:
f1 passes, doesn't re-schedule, but decrements - igmp_retrans = 0
f2 then passes, doesn't re-schedule, but decrements - igmp_retrans = 255
f3 does the unnecessary retransmissions.
Signed-off-by: NNikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: NJay Vosburgh <fubar@us.ibm.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4f5474e7

bonding: reset master mac on first enslave failure · b8fad459

由 Nikolay Aleksandrov 提交于 6月 12, 2013

If the bond device is supposed to get the first slave's MAC address and
the first enslavement fails then we need to reset the master's MAC
otherwise it will stay the same as the failed slave device. We do it
after err_undo_flags since that is the first place where the MAC can be
changed and we check if it should've been the first slave and if the
bond's MAC was set to it because that err place is used by multiple
locations prior to changing the master's MAC address.
Signed-off-by: NNikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: NJay Vosburgh <fubar@us.ibm.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b8fad459

08 6月, 2013 2 次提交

bonding: disallow change of MAC if fail_over_mac enabled · 1b5acd29

由 Jay Vosburgh 提交于 5月 31, 2013

Currently, if fail_over_mac is set to active, then attempts to
change the MAC of the bond itself silently fail.  However, if fail_over_mac
is set to follow, changes are permitted.

	Permitting the bond's MAC to change with fail_over_mac=follow
will disrupt the follow functionality, which normally controls the
assignment of MAC address to the bond and its slaves, and can cause
multiple ports to be assigned the same MAC address. which will interfere
with the functioning of the device (where the device here is a
virtualization-aware card for s390, qeth).
Signed-off-by: NJay Vosburgh <fubar@us.ibm.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1b5acd29

bonding: Convert hw addr handling to sync/unsync, support ucast addresses · 303d1cbf

由 Jay Vosburgh 提交于 5月 31, 2013

This patch converts bonding to use the dev_uc/mc_sync and
dev_uc/mc_sync_multiple functions for updating the hardware addresses
of bonding slaves.

	The existing functions to add or remove addresses are removed,
and their functionality is replaced with calls to dev_mc_sync or
dev_mc_sync_multiple, depending upon the bonding mode.

	Calls to dev_uc_sync and dev_uc_sync_multiple are also added,
so that unicast addresses added to a bond will be properly synced with
its slaves.

	Various functions are renamed to better reflect the new
situation, and relevant comments are updated.
Signed-off-by: NJay Vosburgh <fubar@us.ibm.com>
Cc: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

303d1cbf