提交 · 63c64de7be1f762b8c6c224f2994cf5769b542ae · openanolis / cloud-kernel

18 11月, 2016 1 次提交

netns: make struct pernet_operations::id unsigned int · c7d03a00

由 Alexey Dobriyan 提交于 11月 17, 2016

Make struct pernet_operations::id unsigned.

There are 2 reasons to do so:

1)
This field is really an index into an zero based array and
thus is unsigned entity. Using negative value is out-of-bound
access by definition.

2)
On x86_64 unsigned 32-bit data which are mixed with pointers
via array indexing or offsets added or subtracted to pointers
are preffered to signed 32-bit data.

"int" being used as an array index needs to be sign-extended
to 64-bit before being used.

	void f(long *p, int i)
	{
		g(p[i]);
	}

  roughly translates to

	movsx	rsi, esi
	mov	rdi, [rsi+...]
	call 	g

MOVSX is 3 byte instruction which isn't necessary if the variable is
unsigned because x86_64 is zero extending by default.

Now, there is net_generic() function which, you guessed it right, uses
"int" as an array index:

	static inline void *net_generic(const struct net *net, int id)
	{
		...
		ptr = ng->ptr[id - 1];
		...
	}

And this function is used a lot, so those sign extensions add up.

Patch snipes ~1730 bytes on allyesconfig kernel (without all junk
messing with code generation):

	add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730)

Unfortunately some functions actually grow bigger.
This is a semmingly random artefact of code generation with register
allocator being used differently. gcc decides that some variable
needs to live in new r8+ registers and every access now requires REX
prefix. Or it is shifted into r12, so [r12+0] addressing mode has to be
used which is longer than [r8]

However, overall balance is in negative direction:

	add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730)
	function                                     old     new   delta
	nfsd4_lock                                  3886    3959     +73
	tipc_link_build_proto_msg                   1096    1140     +44
	mac80211_hwsim_new_radio                    2776    2808     +32
	tipc_mon_rcv                                1032    1058     +26
	svcauth_gss_legacy_init                     1413    1429     +16
	tipc_bcbase_select_primary                   379     392     +13
	nfsd4_exchange_id                           1247    1260     +13
	nfsd4_setclientid_confirm                    782     793     +11
		...
	put_client_renew_locked                      494     480     -14
	ip_set_sockfn_get                            730     716     -14
	geneve_sock_add                              829     813     -16
	nfsd4_sequence_done                          721     703     -18
	nlmclnt_lookup_host                          708     686     -22
	nfsd4_lockt                                 1085    1063     -22
	nfs_get_client                              1077    1050     -27
	tcf_bpf_init                                1106    1076     -30
	nfsd4_encode_fattr                          5997    5930     -67
	Total: Before=154856051, After=154854321, chg -0.00%
Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c7d03a00

31 10月, 2016 1 次提交

net: bonding: use new api ethtool_{get|set}_link_ksettings · d46b6349

由 Philippe Reynes 提交于 10月 25, 2016

The ethtool api {get|set}_settings is deprecated.
We move this driver to new api {get|set}_link_ksettings.
Signed-off-by: NPhilippe Reynes <tremyfr@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d46b6349

18 10月, 2016 1 次提交

net: bonding: Flip to the new dev walk API · b3208b20

由 David Ahern 提交于 10月 17, 2016

Convert alb_send_learning_packets and bond_has_this_ip to use the new
netdev_walk_all_upper_dev_rcu API. In both cases this is just a code
conversion; no functional change is intended.

v2
- removed typecast of data and simplified bond_upper_dev_walk
Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b3208b20

28 9月, 2016 1 次提交

bonding: quit messing with IOCTL · 4ad41c1e

由 Al Viro 提交于 9月 03, 2016

The only remaining users are issuing SIOCGMIIPHY and SIOCGMIIREG,
neither of which deals with userland pointers.  Simply calling
->ndo_do_ioctl() is fine; no messing with set_fs() is needed.
It used to mess with SIOCETHTOOL, which would've needed set_fs(),
but that has been killed in "[NET] ethtool ops are the only way"
9 years ago...
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

4ad41c1e

05 9月, 2016 1 次提交

bonding: Fix bonding crash · 24b27fc4

由 Mahesh Bandewar 提交于 9月 01, 2016

Following few steps will crash kernel -

  (a) Create bonding master
      > modprobe bonding miimon=50
  (b) Create macvlan bridge on eth2
      > ip link add link eth2 dev mvl0 address aa:0:0:0:0:01 \
	   type macvlan
  (c) Now try adding eth2 into the bond
      > echo +eth2 > /sys/class/net/bond0/bonding/slaves
      <crash>

Bonding does lots of things before checking if the device enslaved is
busy or not.

In this case when the notifier call-chain sends notifications, the
bond_netdev_event() assumes that the rx_handler /rx_handler_data is
registered while the bond_enslave() hasn't progressed far enough to
register rx_handler for the new slave.

This patch adds a rx_handler check that can be performed right at the
beginning of the enslave code to avoid getting into this situation.
Signed-off-by: NMahesh Bandewar <maheshb@google.com>
Acked-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

24b27fc4

02 9月, 2016 1 次提交

bonding: Remove deprecated create_singlethread_workqueue · f9f225eb

由 Bhaktipriya Shridhar 提交于 8月 30, 2016

alloc_ordered_workqueue() with WQ_MEM_RECLAIM set, replaces
deprecated create_singlethread_workqueue(). This is the identity
conversion.

The workqueue "wq" queues multiple work items viz
&bond->mcast_work, &nnw->work, &bond->mii_work, &bond->arp_work,
&bond->alb_work, &bond->mii_work, &bond->ad_work, &bond->slave_arr_work
which require strict execution ordering. Hence, an ordered dedicated
workqueue has been used.

Since, it is a network driver, WQ_MEM_RECLAIM has been set to
ensure forward progress under memory pressure.
Signed-off-by: NBhaktipriya Shridhar <bhaktipriya96@gmail.com>
Acked-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f9f225eb

10 8月, 2016 1 次提交

bonding: fix the typo · 0d039f33

由 Zhu Yanjun 提交于 8月 09, 2016

The message "803.ad" should be "802.3ad".
Signed-off-by: NZhu Yanjun <zyjzyj2000@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0d039f33

26 7月, 2016 1 次提交

net/bonding: Enforce active-backup policy for IPoIB bonds · 1533e773

由 Mark Bloch 提交于 7月 21, 2016

When using an IPoIB bond currently only active-backup mode is a valid
use case and this commit strengthens it.

Since commit 2ab82852 ("net/bonding: Enable bonding to enslave
netdevices not supporting set_mac_address()") was introduced till
4.7-rc1, IPoIB didn't support the set_mac_address ndo, and hence the
fail over mac policy always applied to IPoIB bonds.

With the introduction of commit 492a7e67 ("IB/IPoIB: Allow setting
the device address"), that doesn't hold and practically IPoIB bonds are
broken as of that. To fix it, lets go to fail over mac if the device
doesn't support the ndo OR this is IPoIB device.

As a by-product, this commit also prevents a stack corruption which
occurred when trying to copy 20 bytes (IPoIB) device address
to a sockaddr struct that has only 16 bytes of storage.
Signed-off-by: NMark Bloch <markb@mellanox.com>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
Acked-by: NAndy Gospodarek <gospo@cumulusnetworks.com>
Signed-off-by: NJay Vosburgh <jay.vosburgh@canonical.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1533e773

06 7月, 2016 2 次提交

bonding: fix enslavement slave link notifications · a30b0168

由 Aviv Heller 提交于 7月 05, 2016

Currently, link notifications are not sent by
bond_set_slave_link_state() upon enslavement if
the slave is enslaved when up.

This happens because slave->link default init value
is 0, which is the same as BOND_LINK_UP, resulting
in bond_set_slave_link_state() ignoring this transition.

This patch sets the default value of slave->link to
BOND_LINK_NOCHANGE, assuring it will count as a state
transition and thus trigger notification logic.
Signed-off-by: NAviv Heller <avivh@mellanox.com>
Reviewed-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a30b0168

net: introduce default neigh_construct/destroy ndo calls for L2 upper devices · 18bfb924

由 Jiri Pirko 提交于 7月 05, 2016

L2 upper device needs to propagate neigh_construct/destroy calls down to
lower devices. Do this by defining default ndo functions and use them in
team, bond, bridge and vlan.
Signed-off-by: NJiri Pirko <jiri@mellanox.com>
Reviewed-by: NIdo Schimmel <idosch@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

18bfb924

10 6月, 2016 1 次提交

net: add netdev_lockdep_set_classes() helper · d3fff6c4

由 Eric Dumazet 提交于 6月 09, 2016

It is time to add netdev_lockdep_set_classes() helper
so that lockdep annotations per device type are easier to manage.

This removes a lot of copies and missing annotations.
Signed-off-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d3fff6c4

08 6月, 2016 1 次提交

net_sched: transform qdisc running bit into a seqcount · f9eb8aea

由 Eric Dumazet 提交于 6月 06, 2016

Instead of using a single bit (__QDISC___STATE_RUNNING)
in sch->__state, use a seqcount.

This adds lockdep support, but more importantly it will allow us
to sample qdisc/class statistics without having to grab qdisc root lock.
Signed-off-by: NEric Dumazet <edumazet@google.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f9eb8aea

19 3月, 2016 2 次提交

bonding: fix bond_get_stats() · fe30937b

由 Eric Dumazet 提交于 3月 17, 2016

bond_get_stats() can be called from rtnetlink (with RTNL held)
or from /proc/net/dev seq handler (with RCU held)

The logic added in commit 5f0c5f73 ("bonding: make global bonding
stats more reliable") kind of assumed only one cpu could run there.

If multiple threads are reading /proc/net/dev, stats can be really
messed up after a while.

A second problem is that some fields are 32bit, so we need to properly
handle the wrap around problem.

Given that RTNL is not always held, we need to use
bond_for_each_slave_rcu().

Fixes: 5f0c5f73 ("bonding: make global bonding stats more reliable")
Signed-off-by: NEric Dumazet <edumazet@google.com>
Cc: Andy Gospodarek <gospo@cumulusnetworks.com>
Cc: Jay Vosburgh <j.vosburgh@gmail.com>
Cc: Veaceslav Falico <vfalico@gmail.com>
Reviewed-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

fe30937b

bonding: remove duplicate set of flag IFF_MULTICAST · 1098cee6

由 Zhang Shengju 提交于 3月 16, 2016

Remove unnecessary set of flag IFF_MULTICAST, since ether_setup
already does this.
Signed-off-by: NZhang Shengju <zhangshengju@cmss.chinamobile.com>
Reviewed-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: NAndy Gospodarek <gospo@cumulusnetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1098cee6

26 2月, 2016 1 次提交

net: bonding: use __ethtool_get_ksettings · 9856909c

由 David Decotigny 提交于 2月 24, 2016

Signed-off-by: NDavid Decotigny <decot@googlers.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9856909c

17 2月, 2016 1 次提交

bonding: don't use stale speed and duplex information · 266b495f

由 Jay Vosburgh 提交于 2月 08, 2016

There is presently a race condition between the bonding periodic
link monitor and the updating of a slave's speed and duplex.  The former
occurs on a periodic basis, and the latter in response to a driver's
calling of netif_carrier_on.

	It is possible for the periodic monitor to run between the
driver call of netif_carrier_on and the receipt of the NETDEV_CHANGE
event that causes bonding to update the slave's speed and duplex.  This
manifests most notably as a report that a slave is up and "0 Mbps full
duplex" after enslavement, but in principle could report an incorrect
speed and duplex after any link up event if the device comes up with a
different speed or duplex.  This affects the 802.3ad aggregator
selection, as the speed and duplex are selection criteria.

	This is fixed by updating the speed and duplex in the periodic
monitor, prior to using that information.

	This was done historically in bonding, but the call to
bond_update_speed_duplex was removed in commit 876254ae ("bonding:
don't call update_speed_duplex() under spinlocks"), as it might sleep
under lock.  Later, the locking was changed to only hold RTNL, and so
after commit 876254ae ("bonding: don't call update_speed_duplex()
under spinlocks") this call is again safe.
Tested-by: N"Tantilov, Emil S" <emil.s.tantilov@intel.com>
Cc: Veaceslav Falico <vfalico@gmail.com>
Cc: dingtianhong <dingtianhong@huawei.com>
Fixes: 876254ae ("bonding: don't call update_speed_duplex() under spinlocks")
Signed-off-by: NJay Vosburgh <jay.vosburgh@canonical.com>
Acked-by: NDing Tianhong <dingtianhong@huawei.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

266b495f

13 2月, 2016 1 次提交

bonding: Fix ARP monitor validation · 21a75f09

由 Jay Vosburgh 提交于 2月 02, 2016

The current logic in bond_arp_rcv will accept an incoming ARP for
validation if (a) the receiving slave is either "active" (which includes
the currently active slave, or the current ARP slave) or, (b) there is a
currently active slave, and it has received an ARP since it became active.
For case (b), the receiving slave isn't the currently active slave, and is
receiving the original broadcast ARP request, not an ARP reply from the
target.

This logic can fail if there is no currently active slave. In
this situation, the ARP probe logic cycles through all slaves, assigning
each in turn as the "current_arp_slave" for one arp_interval, then setting
that one as "active," and sending an ARP probe from that slave. The
current logic expects the ARP reply to arrive on the sending
current_arp_slave, however, due to switch FDB updating delays, the reply
may be directed to another slave.

This can arise if the bonding slaves and switch are working, but
the ARP target is not responding. When the ARP target recovers, a
condition may result wherein the ARP target host replies faster than the
switch can update its forwarding table, causing each ARP reply to be sent
to the previous current_arp_slave. This will never pass the logic in
bond_arp_rcv, as neither of the above conditions (a) or (b) are met.

Some experimentation on a LAN shows ARP reply round trips in the
200 usec range, but my available switches never update their FDB in less
than 4000 usec.

This patch changes the logic in bond_arp_rcv to additionally
accept an ARP reply for validation on any slave if there is a current ARP
slave and it sent an ARP probe during the previous arp_interval.

Fixes: aeea64ac ("bonding: don't trust arp requests unless active slave really works")
Cc: Veaceslav Falico <vfalico@gmail.com>
Cc: Andy Gospodarek <gospo@cumulusnetworks.com>
Signed-off-by: NJay Vosburgh <jay.vosburgh@canonical.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

21a75f09

11 2月, 2016 1 次提交

bonding: use return instead of goto · 1e2a8868

由 Zhang Shengju 提交于 2月 09, 2016

Replace 'goto' with 'return' to remove unnecessary check at label:
err_undo_flags.

The reason is that 'err_undo_flags' do two things for the first slave device:
1.revert bond mac address if it is set by the slave device.
2.revert bond device type if it's not ARPHRD_ETHER.

It's not necessary for the following three places, they changed neither bond
mac address nor type. It's straightforward to return directly.
Signed-off-by: NZhang Shengju <zhangshengju@cmss.chinamobile.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1e2a8868

08 2月, 2016 1 次提交

bonding: trivial: style fixes · d66bd905

由 Zhang Shengju 提交于 2月 03, 2016

remove some redudant brackets, use sizeof(*) instead of sizeof(struct x).
Signed-off-by: NZhang Shengju <zhangshengju@cmss.chinamobile.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d66bd905

06 2月, 2016 2 次提交

bonding: add slave device name for debug · c6140a29

由 Zhang Shengju 提交于 2月 02, 2016

netdev_dbg() will add bond device name, it will be helpful if we print
slave device name.
Signed-off-by: NZhang Shengju <zhangshengju@cmss.chinamobile.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c6140a29

bond: track sum of rx_nohandler for all slaves · f344b0d9

由 Jarod Wilson 提交于 2月 01, 2016

Sample output with this set applied for an active-backup bond:

$ cat /sys/devices/virtual/net/bond0/lower_p7p1/statistics/rx_nohandler
16568
$ cat /sys/devices/virtual/net/bond0/lower_p5p2/statistics/rx_nohandler
16583
$ cat /sys/devices/virtual/net/bond0/statistics/rx_nohandler
33151

CC: Jay Vosburgh <j.vosburgh@gmail.com>
CC: Veaceslav Falico <vfalico@gmail.com>
CC: Andy Gospodarek <gospo@cumulusnetworks.com>
CC: netdev@vger.kernel.org
Signed-off-by: NJarod Wilson <jarod@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f344b0d9

12 1月, 2016 1 次提交

bonding: Prevent IPv6 link local address on enslaved devices · 03d84a5f

由 Karl Heiss 提交于 1月 11, 2016

Commit 1f718f0f ("bonding: populate neighbour's private on enslave")
undoes the fix provided by commit c2edacf8 ("bonding / ipv6: no addrconf
for slaves separately from master") by effectively setting the slave flag
after the slave has been opened. If the slave comes up quickly enough, it
will go through the IPv6 addrconf before the slave flag has been set and
will get a link local IPv6 address.

In order to ensure that addrconf knows to ignore the slave devices on state
change, set IFF_SLAVE before dev_open() during bonding enslavement.

Fixes: 1f718f0f ("bonding: populate neighbour's private on enslave")
Signed-off-by: NKarl Heiss <kheiss@gmail.com>
Signed-off-by: NJay Vosburgh <jay.vosburgh@canonical.com>
Reviewed-by: NJarod Wilson <jarod@redhat.com>
Signed-off-by: NAndy Gospodarek <gospo@cumulusnetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

03d84a5f

16 12月, 2015 1 次提交

net: Rename NETIF_F_ALL_CSUM to NETIF_F_CSUM_MASK · a188222b

由 Tom Herbert 提交于 12月 14, 2015

The name NETIF_F_ALL_CSUM is a misnomer. This does not correspond to the
set of features for offloading all checksums. This is a mask of the
checksum offload related features bits. It is incorrect to set both
NETIF_F_HW_CSUM and NETIF_F_IP_CSUM or NETIF_F_IPV6 at the same time for
features of a device.

This patch:
  - Changes instances of NETIF_F_ALL_CSUM to NETIF_F_CSUM_MASK (where
    NETIF_F_ALL_CSUM is being used as a mask).
  - Changes bonding, sfc/efx, ipvlan, macvlan, vlan, and team drivers to
    use NEITF_F_HW_CSUM in features list instead of NETIF_F_ALL_CSUM.
Signed-off-by: NTom Herbert <tom@herbertland.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a188222b

04 12月, 2015 7 次提交

net: bonding: remove redudant brackets · ce3ea1c7

由 yzhu1 提交于 12月 03, 2015

It is not necessary to use two brackets. As such, the redudant brackets
are removed.

CC: Jay Vosburgh <j.vosburgh@gmail.com>
CC: Veaceslav Falico <vfalico@gmail.com>
CC: Andy Gospodarek <gospo@cumulusnetworks.com>
Signed-off-by: NZhu Yanjun <yanjun.zhu@windriver.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ce3ea1c7

bonding: set inactive flags on release · 57beaca8

由 Jiri Pirko 提交于 12月 03, 2015

Be correct and symmetric to enslave and set inactive flags during release.
That gives LAG offload drivers - lower state change listeners - possibility
to do proper cleanup.
Signed-off-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

57beaca8

bonding: implement lower state change propagation · f7c7eb7f

由 Jiri Pirko 提交于 12月 03, 2015

Let netdev notifier listeners know about link and slave state change.
Signed-off-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f7c7eb7f

bonding: allow notifications for bond_set_slave_link_state · 5d397061

由 Jiri Pirko 提交于 12月 03, 2015

Similar to state notifications.

We allow caller to indicate if the notification should happen now or later,
depending on if he holds rtnl mutex or not. Introduce bond_slave_link_notify
function (similar to bond_slave_state_notify) which is later on called
with rtnl mutex and goes over slaves and executes delayed notification.
Signed-off-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5d397061

bonding: fill-up LAG changeupper info struct and pass it along · 41f0b049

由 Jiri Pirko 提交于 12月 03, 2015

Initialize netdev_lag_upper_info structure by TX type according to
current bonding mode and pass it along via netdev_master_upper_dev_link.
Signed-off-by: NJiri Pirko <jiri@mellanox.com>
Reviewed-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

41f0b049

net: add possibility to pass information about upper device via notifier · 29bf24af

由 Jiri Pirko 提交于 12月 03, 2015

Sometimes the drivers and other code would find it handy to know some
internal information about upper device being changed. So allow upper-code
to pass information down to notifier listeners during linking.
Signed-off-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

29bf24af

net: propagate upper priv via netdev_master_upper_dev_link · 6dffb044

由 Jiri Pirko 提交于 12月 03, 2015

Eliminate netdev_master_upper_dev_link_private and pass priv directly as
a parameter of netdev_master_upper_dev_link.
Signed-off-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

6dffb044

08 11月, 2015 1 次提交

bonding: fix panic on non-ARPHRD_ETHER enslave failure · 40baec22

由 Jay Vosburgh 提交于 11月 06, 2015

Since commit 7d5cd2ce529b, when bond_enslave fails on devices that
are not ARPHRD_ETHER, if needed, it resets the bonding device back to
ARPHRD_ETHER by calling ether_setup.

	Unfortunately, ether_setup clobbers dev->flags, clearing IFF_UP
if the bond device is up, leaving it in a quasi-down state without
having actually gone through dev_close.  For bonding, if any periodic
work queue items are active (miimon, arp_interval, etc), those will
remain running, as they are stopped by bond_close.  At this point, if
the bonding module is unloaded or the bond is deleted, the system will
panic when the work function is called.

	This panic is resolved by calling dev_close on the bond itself
prior to calling ether_setup.

Cc: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: NJay Vosburgh <jay.vosburgh@canonical.com>
Fixes: 7d5cd2ce ("bonding: correctly handle bonding type change on enslave failure")
Acked-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

40baec22

03 11月, 2015 1 次提交

bonding: simplify / unify event handling code for 3ad mode. · 52bc6716

由 Mahesh Bandewar 提交于 10月 31, 2015

Old logic of updating state-machine is not required since
ad_update_actor_keys() does it implicitly. The only loss is
the notification differentiation between speed vs. duplex
change. Now only one unified notification is printed.
Signed-off-by: NMahesh Bandewar <maheshb@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

52bc6716

16 10月, 2015 1 次提交

bonding: support encapsulated ipv6 TSO · e87eb405

由 Eric Dumazet 提交于 10月 15, 2015

If using a sixtofour device on top of a bonding device,
skb segmentation of TCP traffic is done right before calling
bonding xmit, because bonding only enables TSO for IPv4.

This patch improves single flow performance by about 120 % on my hosts,
because segmentation is deferred right before calling slave xmit.
Signed-off-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e87eb405

18 9月, 2015 1 次提交

bonding: use l4 hash if available · 4b1b865e

由 Eric Dumazet 提交于 9月 15, 2015

If skb carries a l4 hash, no need to perform a flow dissection.

Performance is slightly better :

lpaa5:~# ./super_netperf 200 -H lpaa6 -t TCP_RR -l 100
2.39012e+06
lpaa5:~# ./super_netperf 200 -H lpaa6 -t TCP_RR -l 100
2.39393e+06
lpaa5:~# ./super_netperf 200 -H lpaa6 -t TCP_RR -l 100
2.39988e+06

After patch :

lpaa5:~# ./super_netperf 200 -H lpaa6 -t TCP_RR -l 100
2.43579e+06
lpaa5:~# ./super_netperf 200 -H lpaa6 -t TCP_RR -l 100
2.44304e+06
lpaa5:~# ./super_netperf 200 -H lpaa6 -t TCP_RR -l 100
2.44312e+06
Signed-off-by: NEric Dumazet <edumazet@google.com>
Cc: Tom Herbert <tom@herbertland.com>
Cc: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4b1b865e

02 9月, 2015 1 次提交

flow_dissector: Add flags argument to skb_flow_dissector functions · cd79a238

由 Tom Herbert 提交于 9月 01, 2015

The flags argument will allow control of the dissection process (for
instance whether to parse beyond L3).
Signed-off-by: NTom Herbert <tom@herbertland.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

cd79a238

29 8月, 2015 1 次提交

bonding: fix bond_poll_controller bh_enable warning · b0d4943e

由 Nikolay Aleksandrov 提交于 8月 28, 2015

The problem is rcu_read_unlock_bh() which triggers a warning when irqs are
disabled. ndo_poll_controller should run with irqs disabled always so we
can drop the rcu_read_lock_bh.

[   98.502922] bond0: making interface eth1 the new active one
[   98.503039] ------------[ cut here ]------------
[   98.503039] WARNING: CPU: 0 PID: 1744 at kernel/softirq.c:150 __local_bh_enable_ip+0x96/0xc0()
[   98.503039] Modules linked in: bonding(OE) rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache netconsole ppdev joydev parport_pc serio_raw parport i2c_piix4 video acpi_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sunrpc virtio_net e1000 ata_generic pcnet32 mii virtio_pci virtio_ring virtio pata_acpi
[   98.503039] CPU: 0 PID: 1744 Comm: ifenslave Tainted: G           OE   4.2.0-rc7+ #56
[   98.503039] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[   98.503039]  0000000000000000 00000000e96ba230 ffff880020c236b8 ffffffff8183f105
[   98.503039]  0000000000000000 0000000000000000 ffff880020c236f8 ffffffff810a9496
[   98.503039]  ffff88002ea99e08 0000000000000200 ffffffffa02a8e06 ffff88002ea99e08
[   98.503039] Call Trace:
[   98.503039]  [<ffffffff8183f105>] dump_stack+0x4c/0x65
[   98.503039]  [<ffffffff810a9496>] warn_slowpath_common+0x86/0xc0
[   98.503039]  [<ffffffffa02a8e06>] ? bond_poll_controller+0x146/0x250 [bonding]
[   98.503039]  [<ffffffff810a95ca>] warn_slowpath_null+0x1a/0x20
[   98.503039]  [<ffffffff810ae376>] __local_bh_enable_ip+0x96/0xc0
[   98.503039]  [<ffffffffa02a8e2f>] bond_poll_controller+0x16f/0x250 [bonding]
[   98.503039]  [<ffffffffa02a8cf3>] ? bond_poll_controller+0x33/0x250 [bonding]
[   98.503039]  [<ffffffff810feaed>] ? trace_hardirqs_off+0xd/0x10
[   98.503039]  [<ffffffff81848afb>] ? _raw_spin_unlock_irqrestore+0x5b/0x60
[   98.503039]  [<ffffffff816ec48e>] netpoll_poll_dev+0x6e/0x350
[   98.503039]  [<ffffffff816eb977>] ? netpoll_start_xmit+0x137/0x1d0
[   98.503039]  [<ffffffff816b2e8b>] ? __alloc_skb+0x5b/0x210
[   98.503039]  [<ffffffff816ec89d>] netpoll_send_skb_on_dev+0x12d/0x2a0
[   98.503039]  [<ffffffff816eccde>] netpoll_send_udp+0x2ce/0x430
[   98.503039]  [<ffffffffa0190850>] write_msg+0xb0/0xf0 [netconsole]
[   98.503039]  [<ffffffff81116b63>] call_console_drivers.constprop.25+0x133/0x260
[   98.503039]  [<ffffffff81117934>] console_unlock+0x2f4/0x580
[   98.503039]  [<ffffffff81117ea5>] ? vprintk_emit+0x2e5/0x630
[   98.503039]  [<ffffffff81117ee5>] vprintk_emit+0x325/0x630
[   98.503039]  [<ffffffff81118379>] vprintk_default+0x29/0x40
[   98.503039]  [<ffffffff8183de4f>] printk+0x55/0x6b
[   98.503039]  [<ffffffff816c754c>] __netdev_printk+0x16c/0x260
[   98.503039]  [<ffffffff816c7a12>] netdev_info+0x62/0x80
[   98.503039]  [<ffffffffa02ab464>] bond_change_active_slave+0x134/0x6a0 [bonding]
[   98.503039]  [<ffffffffa02aba95>] bond_select_active_slave+0xc5/0x310 [bonding]
[   98.503039]  [<ffffffffa02aeb78>] bond_enslave+0x1088/0x10c0 [bonding]
[   98.503039]  [<ffffffffa02af46b>] bond_do_ioctl+0x37b/0x400 [bonding]
[   98.503039]  [<ffffffff81101d8d>] ? trace_hardirqs_on+0xd/0x10
[   98.503039]  [<ffffffff816dc437>] ? rtnl_lock+0x17/0x20
[   98.503039]  [<ffffffff816e5fd1>] dev_ifsioc+0x331/0x3e0
[   98.503039]  [<ffffffff816e62dc>] dev_ioctl+0xec/0x6c0
[   98.503039]  [<ffffffff816a6c6a>] sock_do_ioctl+0x4a/0x60
[   98.503039]  [<ffffffff816a7300>] sock_ioctl+0x1c0/0x250
[   98.503039]  [<ffffffff81271bfe>] do_vfs_ioctl+0x2ee/0x540
[   98.503039]  [<ffffffff810fd943>] ? up_read+0x23/0x40
[   98.503039]  [<ffffffff81070993>] ? __do_page_fault+0x1d3/0x420
[   98.503039]  [<ffffffff8127e246>] ? __fget_light+0x66/0x90
[   98.503039]  [<ffffffff81271ec9>] SyS_ioctl+0x79/0x90
[   98.503039]  [<ffffffff8184936e>] entry_SYSCALL_64_fastpath+0x12/0x76
[   98.503039] ---[ end trace 00cfa804b0670051 ]---

Fixes: 616f4541 ("bonding: implement bond_poll_controller()")
Signed-off-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
Acked-by: NMahesh Bandewar <maheshb@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b0d4943e

19 8月, 2015 1 次提交

net: bonding: convert to using IFF_NO_QUEUE · 1e6f20ca

由 Phil Sutter 提交于 8月 18, 2015

Signed-off-by: NPhil Sutter <phil@nwl.cc>
Cc: Jay Vosburgh <j.vosburgh@gmail.com>
Cc: Veaceslav Falico <vfalico@gmail.com>
Cc: Andy Gospodarek <gospo@cumulusnetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1e6f20ca

13 8月, 2015 1 次提交

bonding: Gratuitous ARP gets dropped when first slave added · b02e3e94

由 Venkat Venkatsubra 提交于 8月 11, 2015

When the first slave is added (such as during bootup) the first
gratuitous ARP gets dropped. We don't see this drop during a failover.
The packet gets dropped in qdisc (noop_enqueue).

The fix is to delay the sending of gratuitous ARPs till the bond dev's
carrier is present.

It can also be worked around by setting num_grat_arp to more than 1.
Signed-off-by: NVenkat Venkatsubra <venkat.x.venkatsubra@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b02e3e94

21 7月, 2015 2 次提交

bonding: correct the MAC address for "follow" fail_over_mac policy · a951bc1e

由 dingtianhong 提交于 7月 16, 2015

The "follow" fail_over_mac policy is useful for multiport devices that
either become confused or incur a performance penalty when multiple
ports are programmed with the same MAC address, but the same MAC
address still may happened by this steps for this policy:

1) echo +eth0 > /sys/class/net/bond0/bonding/slaves
   bond0 has the same mac address with eth0, it is MAC1.

2) echo +eth1 > /sys/class/net/bond0/bonding/slaves
   eth1 is backup, eth1 has MAC2.

3) ifconfig eth0 down
   eth1 became active slave, bond will swap MAC for eth0 and eth1,
   so eth1 has MAC1, and eth0 has MAC2.

4) ifconfig eth1 down
   there is no active slave, and eth1 still has MAC1, eth2 has MAC2.

5) ifconfig eth0 up
   the eth0 became active slave again, the bond set eth0 to MAC1.

Something wrong here, then if you set eth1 up, the eth0 and eth1 will have the same
MAC address, it will break this policy for ACTIVE_BACKUP mode.

This patch will fix this problem by finding the old active slave and
swap them MAC address before change active slave.
Signed-off-by: NDing Tianhong <dingtianhong@huawei.com>
Tested-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a951bc1e

bonding: correctly handle bonding type change on enslave failure · 7d5cd2ce

由 Nikolay Aleksandrov 提交于 7月 15, 2015

If the bond is enslaving a device with different type it will be setup
by it, but if after being setup the enslave fails the bond doesn't
switch back its type and also keeps pointers to foreign structures that can
be long gone. Thus revert back any type changes if the enslave failed and
the bond had to change its type.
Example:
 Before patch:
$ echo lo > bond0/bonding/slaves
-bash: echo: write error: Cannot assign requested address
$ ip l sh bond0
20: bond0: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noop state DOWN
mode DEFAULT group default
    link/loopback 16:54:78:34:bd:41 brd 00:00:00:00:00:00
$ echo +eth1 > bond0/bonding/slaves
$ ip l sh bond0
20: bond0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode
DEFAULT group default qlen 1000
    link/ether 52:54:00:3f:47:69 brd ff:ff:ff:ff:ff:ff
(notice the MASTER flag is gone)

 After patch:
$ echo lo > bond0/bonding/slaves
-bash: echo: write error: Cannot assign requested address
$ ip l sh bond0
21: bond0: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noop state DOWN
mode DEFAULT group default qlen 1000
    link/ether 6e:66:94:f6:07:fc brd ff:ff:ff:ff:ff:ff
$ echo +eth1 > bond0/bonding/slaves
$ ip l sh bond0
21: bond0: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noop state DOWN
mode DEFAULT group default qlen 1000
    link/ether 52:54:00:3f:47:69 brd ff:ff:ff:ff:ff:ff
Signed-off-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
Fixes: e36b9d16 ("bonding: clean muticast addresses when device changes type")
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

7d5cd2ce

openanolis / cloud-kernel 接近 2 年 前同步成功

openanolis / cloud-kernel
接近 2 年前同步成功