- 26 5月, 2011 1 次提交
-
-
由 Neil Horman 提交于
This soft lockup was recently reported: [root@dell-per715-01 ~]# echo +bond5 > /sys/class/net/bonding_masters [root@dell-per715-01 ~]# echo +eth1 > /sys/class/net/bond5/bonding/slaves bonding: bond5: doing slave updates when interface is down. bonding bond5: master_dev is not up in bond_enslave [root@dell-per715-01 ~]# echo -eth1 > /sys/class/net/bond5/bonding/slaves bonding: bond5: doing slave updates when interface is down. BUG: soft lockup - CPU#12 stuck for 60s! [bash:6444] CPU 12: Modules linked in: bonding autofs4 hidp rfcomm l2cap bluetooth lockd sunrpc be2d Pid: 6444, comm: bash Not tainted 2.6.18-262.el5 #1 RIP: 0010:[<ffffffff80064bf0>] [<ffffffff80064bf0>] .text.lock.spinlock+0x26/00 RSP: 0018:ffff810113167da8 EFLAGS: 00000286 RAX: ffff810113167fd8 RBX: ffff810123a47800 RCX: 0000000000ff1025 RDX: 0000000000000000 RSI: ffff810123a47800 RDI: ffff81021b57f6f8 RBP: ffff81021b57f500 R08: 0000000000000000 R09: 000000000000000c R10: 00000000ffffffff R11: ffff81011d41c000 R12: ffff81021b57f000 R13: 0000000000000000 R14: 0000000000000282 R15: 0000000000000282 FS: 00002b3b41ef3f50(0000) GS:ffff810123b27940(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00002b3b456dd000 CR3: 000000031fc60000 CR4: 00000000000006e0 Call Trace: [<ffffffff80064af9>] _spin_lock_bh+0x9/0x14 [<ffffffff886937d7>] :bonding:tlb_clear_slave+0x22/0xa1 [<ffffffff8869423c>] :bonding:bond_alb_deinit_slave+0xba/0xf0 [<ffffffff8868dda6>] :bonding:bond_release+0x1b4/0x450 [<ffffffff8006457b>] __down_write_nested+0x12/0x92 [<ffffffff88696ae4>] :bonding:bonding_store_slaves+0x25c/0x2f7 [<ffffffff801106f7>] sysfs_write_file+0xb9/0xe8 [<ffffffff80016b87>] vfs_write+0xce/0x174 [<ffffffff80017450>] sys_write+0x45/0x6e [<ffffffff8005d28d>] tracesys+0xd5/0xe0 It occurs because we are able to change the slave configuarion of a bond while the bond interface is down. The bonding driver initializes some data structures only after its ndo_open routine is called. Among them is the initalization of the alb tx and rx hash locks. So if we add or remove a slave without first opening the bond master device, we run the risk of trying to lock/unlock a spinlock that has garbage for data in it, which results in our above softlock. Note that sometimes this works, because in many cases an unlocked spinlock has the raw_lock parameter initialized to zero (meaning that the kzalloc of the net_device private data is equivalent to calling spin_lock_init), but thats not true in all cases, and we aren't guaranteed that condition, so we need to pass the relevant spinlocks through the spin_lock_init function. Fix it by moving the spin_lock_init calls for the tx and rx hashtable locks to the ndo_init path, so they are ready for use by the bond_store_slaves path. Change notes: v2) Based on conversation with Jay and Nicolas it seems that the ability to enslave devices while the bond master is down should be safe to do. As such this is an outlier bug, and so instead we'll just initalize the errant spinlocks in the init path rather than the open path, solving the problem. We'll also remove the warnings about the bond being down during enslave operations, since it should be safe v3) Fix spelling error Signed-off-by: NNeil Horman <nhorman@tuxdriver.com> Reported-by: jtluka@redhat.com CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> CC: nicolas.2p.debian@gmail.com CC: "David S. Miller" <davem@davemloft.net> Signed-off-by: NJay Vosburgh <fubar@us.ibm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 30 4月, 2011 1 次提交
-
-
由 Ben Hutchings 提交于
For backward compatibility, we should retain the module parameters and sysfs attributes to control the number of peer notifications (gratuitous ARPs and unsolicited NAs) sent after bonding failover. Also, it is possible for failover to take place even though the new active slave does not have link up, and in that case the peer notification should be deferred until it does. Change ipv4 and ipv6 so they do not automatically send peer notifications on bonding failover. Change the bonding driver to send separate NETDEV_NOTIFY_PEERS notifications when the link is up, as many times as requested. Since it does not directly control which protocols send notifications, make num_grat_arp and num_unsol_na aliases for a single parameter. Bump the bonding version number and update its documentation. Signed-off-by: NBen Hutchings <bhutchings@solarflare.com> Signed-off-by: NJay Vosburgh <fubar@us.ibm.com> Acked-by: NBrian Haley <brian.haley@hp.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 26 4月, 2011 1 次提交
-
-
由 Jiri Pirko 提交于
Since now when bonding uses rx_handler, all traffic going into bond device goes thru bond_handle_frame. So there's no need to go back into bonding code later via ptype handlers. This patch converts original ptype handlers into "bonding receive probes". These functions are called from bond_handle_frame and they are registered per-mode. Note that vlan packets are also handled because they are always untagged thanks to vlan_untag() Note that this also allows arpmon for eth-bond-bridge-vlan topology. Signed-off-by: NJiri Pirko <jpirko@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 18 4月, 2011 1 次提交
-
-
由 Ben Hutchings 提交于
It is undesirable for the bonding driver to be poking into higher level protocols, and notifiers provide a way to avoid that. This does mean removing the ability to configure reptitition of gratuitous ARPs and unsolicited NAs. Signed-off-by: NBen Hutchings <bhutchings@solarflare.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 17 3月, 2011 3 次提交
-
-
由 Jiri Pirko 提交于
Since bond-related code was moved from net/core/dev.c into bonding, IFF_SLAVE_INACTIVE is no longer needed. Replace is with flag "inactive" stored in slave structure Signed-off-by: NJiri Pirko <jpirko@redhat.com> Reviewed-by: NNicolas de Pesloüan <nicolas.2p.debian@free.fr> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
transfers slave->state into slave->backup (that it's going to transfer into bitfield. Introduce wrapper inlines to do the work with it. Signed-off-by: NJiri Pirko <jpirko@redhat.com> Reviewed-by: NNicolas de Pesloüan <nicolas.2p.debian@free.fr> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
Now when bond-related code is moved from net/core/dev.c into bonding code, multiple priv_flags are not needed anymore. So let them rot. Signed-off-by: NJiri Pirko <jpirko@redhat.com> Reviewed-by: NNicolas de Pesloüan <nicolas.2p.debian@free.fr> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 16 3月, 2011 1 次提交
-
-
由 Phil Oester 提交于
When the bonding module is loaded, it creates bond0 by default. Then, when attempting to create bond0, the following messages are printed to syslog: kernel: bonding: bond0 is being created... kernel: bonding: Bond creation failed. Which seems to indicate a problem, when in reality there is no problem. Since the actual error code is passed down from bond_create, make use of it to print a bit less ominous message: kernel: bonding: bond0 is being created... kernel: bond0 already exists. Signed-off-by: NPhil Oester <kernel@linuxace.com> Signed-off-by: NAndy Gospodarek <andy@greyhouse.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 26 1月, 2011 1 次提交
-
-
由 Jiri Pirko 提交于
count is incorrectly returned even in case of fail. Return ret instead. Signed-off-by: NJiri Pirko <jpirko@redhat.com> Signed-off-by: NJay Vosburgh <fubar@us.ibm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 18 10月, 2010 1 次提交
-
-
由 Neil Horman 提交于
The monitoring paths in the bonding driver take write locks that are shared by the tx path. If netconsole is in use, these paths can call printk which puts us in the netpoll tx path, which, if netconsole is attached to the bonding driver, result in deadlock (the xmit_lock guards are useless in netpoll_send_skb, as the monitor paths in the bonding driver don't claim the xmit_lock, nor should they). The solution is to use a per cpu flag internal to the driver to indicate when a cpu is holding the lock in a path that might recusrse into the tx path for the driver via netconsole. By checking this flag on transmit, we can defer the sending of the netconsole frames until a later time using the retransmit feature of netpoll_send_skb that is triggered on the return code NETDEV_TX_BUSY. I've tested this and am able to transmit via netconsole while causing failover conditions on the bond slave links. Signed-off-by: NNeil Horman <nhorman@tuxdriver.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 06 10月, 2010 1 次提交
-
-
由 Flavio Leitner 提交于
Allow sysadmins to configure the number of multicast membership report sent on a link failure event. Signed-off-by: NFlavio Leitner <fleitner@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 31 7月, 2010 1 次提交
-
-
由 Andy Gospodarek 提交于
When using module options arp monitoring and balance-alb/balance-tlb are mutually exclusive options. Anytime balance-alb/balance-tlb are enabled mii monitoring is forced to 100ms if not set. When configuring via sysfs no checking is currently done. Handling these cases with sysfs has to be done a bit differently because we do not have all configuration information available at once. This patch will not allow a mode change to balance-alb/balance-tlb if arp_interval is already non-zero. It will also not allow the user to set a non-zero arp_interval value if the mode is already set to balance-alb/balance-tlb. They are still mutually exclusive on a first-come, first serve basis. Tested with initscripts on Fedora and manual setting via sysfs. Signed-off-by: NAndy Gospodarek <gospo@redhat.com> Signed-off-by: NJay Vosburgh <fubar@us.ibm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 15 7月, 2010 1 次提交
-
-
由 Nicolas de Pesloüan 提交于
The test for buffer overflow ensures we have room for 6 more bytes. sprintf, called with %s:%d, slave->dev->name, slave->queue_id may yield far more than 6 bytes. The correct test is res > (PAGE_SIZE - IFNAMSIZ - 6) . Signed-off-by: NNicolas de Pesloüan <nicolas.2p.debian@free.fr> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 05 6月, 2010 2 次提交
-
-
由 Andy Gospodarek 提交于
v2: changed bonding module version, modified to apply on top of changes from previous patch in series, and updated documentation to elaborate on multiqueue awareness that now exists in bonding driver. This patch give the user the ability to control the output slave for round-robin and active-backup bonding. Similar functionality was discussed in the past, but Jay Vosburgh indicated he would rather see a feature like this added to existing modes rather than creating a completely new mode. Jay's thoughts as well as Neil's input surrounding some of the issues with the first implementation pushed us toward a design that relied on the queue_mapping rather than skb marks. Round-robin and active-backup modes were chosen as the first users of this slave selection as they seemed like the most logical choices when considering a multi-switch environment. Round-robin mode works without any modification, but active-backup does require inclusion of the first patch in this series and setting the 'all_slaves_active' flag. This will allow reception of unicast traffic on any of the backup interfaces. This was tested with IPv4-based filters as well as VLAN-based filters with good results. More information as well as a configuration example is available in the patch to Documentation/networking/bonding.txt. Signed-off-by: NAndy Gospodarek <andy@greyhouse.net> Signed-off-by: NNeil Horman <nhorman@tuxdriver.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Andy Gospodarek 提交于
v2: changed parameter name from 'keep_all' to 'all_slaves_active' and skipped setting slaves to inactive rather than creating a new flag at Jay's suggestion. In an effort to suppress duplicate frames on certain bonding modes (specifically the modes that do not require additional configuration on the switch or switches connected to the host), code was added in the generic receive patch in 2.6.16. The current behavior works quite well for most users, but there are some times it would be nice to restore old functionality and allow all frames to make their way up the stack. This patch adds support for a new module option and sysfs file called 'all_slaves_active' that will restore pre-2.6.16 functionality if the user desires. The default value is '0' and retains existing behavior, but the user can set it to '1' and allow all frames up if desired. Signed-off-by: NAndy Gospodarek <andy@greyhouse.net> Signed-off-by: NJay Vosburgh <fubar@us.ibm.com> Signed-off-by: NNeil Horman <nhorman@tuxdriver.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 02 6月, 2010 5 次提交
-
-
由 Jiri Pirko 提交于
Move the code that copies slave's mac address in case that's the first slave into bond_enslave. Ifenslave app does this also but that's not a problem. This is something that should be done in bond_enslave, and it shound not matter from where is it called. Signed-off-by: NJiri Pirko <jpirko@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
This patch makes bonding_store_slaves function nicer and easier to understand. Signed-off-by: NJiri Pirko <jpirko@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
(it's actually the same as v1) Remove checks that duplicates similar checks in bond_enslave. Signed-off-by: NJiri Pirko <jpirko@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
V1->V2: corrected res/ret use For some reason, MTU handling (storing, and restoring) is taking place in bond_sysfs. The correct place for this code is in bond_enslave, bond_release. So move it there. Signed-off-by: NJiri Pirko <jpirko@redhat.com> Signed-off-by: NJay Vosburgh <fubar@us.ibm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
Signed-off-by: NJiri Pirko <jpirko@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 08 3月, 2010 1 次提交
-
-
由 Andi Kleen 提交于
Passing the attribute to the low level IO functions allows all kinds of cleanups, by sharing low level IO code without requiring an own function for every piece of data. Also drivers can extend the attributes with own data fields and use that in the low level function. This makes the class attributes the same as sysdev_class attributes and plain attributes. This will allow further cleanups in drivers. Full tree sweep converting all users. Signed-off-by: NAndi Kleen <ak@linux.intel.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
-
- 14 12月, 2009 1 次提交
-
-
由 Joe Perches 提交于
Add #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt Remove DRV_NAME from pr_<level>s Consolidate long format strings Remove some extra tab indents Remove some unnecessary ()s from pr_<level>s arguments Align pr_<level> arguments Signed-off-by: NJoe Perches <joe@perches.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 31 10月, 2009 2 次提交
-
-
由 Eric W. Biederman 提交于
Signed-off-by: NEric W. Biederman <ebiederm@aristanetworks.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric W. Biederman 提交于
This patch delegates the work of creating the sysfs groups to the netdev layer and ultimately to the device layer. This closes races between uevents. Signed-off-by: NEric W. Biederman <ebiederm@aristanetworks.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 12 10月, 2009 1 次提交
-
-
由 Alexey Dobriyan 提交于
After m68k's task_thread_info() doesn't refer to current, it's possible to remove sched.h from interrupt.h and not break m68k! Many thanks to Heiko Carstens for allowing this. Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
-
- 07 10月, 2009 1 次提交
-
-
由 Jiri Pirko 提交于
In some cases there is not desirable to switch back to primary interface when it's link recovers and rather stay with currently active one. We need to avoid packetloss as much as we can in some cases. This is solved by introducing primary_reselect option. Note that enslaved primary slave is set as current active no matter what. Patch modified by Jay Vosburgh as follows: fixed bug in action after change of option setting via sysfs, revised the documentation update, and bumped the bonding version number. Signed-off-by: NJiri Pirko <jpirko@redhat.com> Signed-off-by: NJay Vosburgh <fubar@us.ibm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 02 10月, 2009 1 次提交
-
-
由 Jiri Pirko 提交于
Primary module parameter passed to bonding is pernament. That means if you release the primary slave and enslave it again, it becomes the primary slave again. But if you set primary slave via sysfs, the primary slave is only set once and it's not remembered in bond->params structure. Therefore the setting is lost after releasing the primary slave. This simple one-liner fixes this. Signed-off-by: NJiri Pirko <jpirko@redhat.com> Signed-off-by: NJay Vosburgh <fubar@us.ibm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 14 8月, 2009 1 次提交
-
-
由 Jiri Pirko 提交于
I did not introduce new lines over 80 chars. I even eliminated some of them. Signed-off-by: NJiri Pirko <jpirko@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 14 6月, 2009 7 次提交
-
-
由 Stephen Hemminger 提交于
Remove bogus non-portable possibly unaligned way of testing for zero addres.. Signed-off-by: NStephen Hemminger <shemminger@vyatta.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Stephen Hemminger 提交于
The bonding device acts unlike all other Linux network device functions in that it ignores case of device names. The developer must have come from windows! Cleanup the management of names and use standard routines where possible. Flag places where bonding device still doesn't work right with network namespaces. Signed-off-by: NStephen Hemminger <shemminger@vyatta.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Stephen Hemminger 提交于
The "expected_refcount" stuff in bonding sysfs module is a mistake. Sysfs does proper refcounting, and it is okay to remove a bond device that has some user process holding the file open. Signed-off-by: NStephen Hemminger <shemminger@vyatta.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Stephen Hemminger 提交于
Resolve some of the complaints from checkpatch, and remove "magic emacs format" comments, and useless MODULE_SUPPORTED_DEVICE(). But should not change actual code. Signed-off-by: NStephen Hemminger <shemminger@vyatta.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Stephen Hemminger 提交于
It is not safe to use a network device destructor that is a function in the module, since it can be called after module is unloaded if sysfs handle is open. When eventually using netlink, the device cleanup code needs to be done via uninit function. Signed-off-by: NStephen Hemminger <shemminger@vyatta.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Stephen Hemminger 提交于
The whole read/write semaphore locking can be removed. It doesn't add any protection that isn't already done by using the RTNL mutex properly. Signed-off-by: NStephen Hemminger <shemminger@vyatta.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Stephen Hemminger 提交于
bond_create() is always called with same parameters so move the argument down. Signed-off-by: NStephen Hemminger <shemminger@vyatta.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 11 6月, 2009 1 次提交
-
-
由 Stephen Hemminger 提交于
Some users still load bond module multiple times to create bonding devices. This accidentally was broken by a later patch about the time sysfs was fixed. According to Jay, it was broken by: commit b8a9787e Author: Jay Vosburgh <fubar@us.ibm.com> Date: Fri Jun 13 18:12:04 2008 -0700 bonding: Allow setting max_bonds to zero Note: sysfs and procfs still produce WARN() messages when this is done so the sysfs method is the recommended API. Signed-off-by: NStephen Hemminger <shemminger@vyatta.com> Signed-off-by: NJay Vosburgh <fubar@us.ibm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 19 5月, 2009 1 次提交
-
-
由 Eric W. Biederman 提交于
Sysfs files for a network device can not unconditionally take the rtnl_lock as the bonding sysfs files do. If someone accesses those sysfs files while the network device is being unregistered with the rtnl_lock held we will deadlock. So use trylock and restart_syscall to avoid this problem. Signed-off-by: NEric W. Biederman <ebiederm@aristanetworks.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 13 4月, 2009 1 次提交
-
-
由 Brian Haley 提交于
Fix a zero address hole bug in the bonding arp_ip_target list that was causing the bond to ignore ARP replies (bugz 13006). Instead of just setting the array entry to zero, we now copy any additional entries down one slot, putting the zero entry at the end. With this change we can now have all the loops that walk the array stop when they hit a zero since there will be no addresses after it. Changes are based in part on code fragment provided in kernel: bugzilla 13006: http://bugzilla.kernel.org/show_bug.cgi?id=13006 by Steve Howard <steve@astutenetworks.com> Signed-off-by: NBrian Haley <brian.haley@hp.com> Signed-off-by: NJay Vosburgh <fubar@us.ibm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 26 12月, 2008 1 次提交
-
-
由 Hannes Eder 提交于
Fix this sparse warnings: drivers/net/bonding/bond_main.c:104:20: warning: symbol 'bonding_defaults' was not declared. Should it be static? drivers/net/bonding/bond_main.c:204:22: warning: symbol 'ad_select_tbl' was not declared. Should it be static? drivers/net/bonding/bond_sysfs.c:60:21: warning: symbol 'bonding_rwsem' was not declared. Should it be static? Signed-off-by: NHannes Eder <hannes@hanneseder.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 10 12月, 2008 1 次提交
-
-
由 Holger Eitzenberger 提交于
Remove some declarations from bonding.c as they are declared in bonding.h already. Signed-off-by: NHolger Eitzenberger <holger@eitzenberger.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-