netpoll: Remove netpoll blocking from uninit path

Some recent testing in netpoll with bonding showed this backtrace:

------------[ cut here ]------------
kernel BUG at drivers/net/bonding/bonding.h:134!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:1d.2/usb7/devnum
CPU 0
Pid: 1876, comm: rmmod Not tainted 2.6.36-rc3+ #10 D26928/
RIP: 0010:[<ffffffffa0514ba4>]  [<ffffffffa0514ba4>] bond_uninit+0x6f4/0x7a0
RSP: 0018:ffff88003b1b5d58  EFLAGS: 00010296
RAX: ffff88003b9b6200 RBX: ffff8800373e8e00 RCX: 00000000000f4240
RDX: 00000000ffffffff RSI: 0000000000000286 RDI: 0000000000000286
RBP: ffff88003b1b5dc8 R08: 0000000000000000 R09: 00000001af7de920
R10: 0000000000000000 R11: ffff880002495e98 R12: ffff880037922700
R13: ffff880038c31000 R14: ffff880037922730 R15: 0000000000000286
FS:  00007f90e6d72700(0000) GS:ffff880002400000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 000000346f0d9ad0 CR3: 000000003b263000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process rmmod (pid: 1876, threadinfo ffff88003b1b4000, task ffff88003b36aa80)
Stack:
 00000000ffffffff ffff88003b1b5d7a ffff8800379221e8 ffff880037922000
<0> ffff88003b1b5dc8 ffffffff813eb5fb ffff88003b1b5da8 0000000031b177a3
<0> ffff88003b1b5da8 ffff880037922000 ffff88003b1b5e48 ffff88003b1b5e48
Call Trace:
 [<ffffffff813eb5fb>] ? rtmsg_ifinfo+0xcb/0xf0
 [<ffffffff813daad8>] rollback_registered_many+0x168/0x280
 [<ffffffff813dac09>] unregister_netdevice_many+0x19/0x80
 [<ffffffff813e97b3>] __rtnl_kill_links+0x63/0x90
 [<ffffffff813e980b>] __rtnl_link_unregister+0x2b/0x60
 [<ffffffff813e9bde>] rtnl_link_unregister+0x1e/0x30
 [<ffffffffa052124b>] bonding_exit+0x37/0x51 [bonding]
 [<ffffffff81098b2e>] sys_delete_module+0x19e/0x270
 [<ffffffff810bb2b2>] ? audit_syscall_entry+0x252/0x280
 [<ffffffff8100b0b2>] system_call_fastpath+0x16/0x1b
RIP  [<ffffffffa0514ba4>] bond_uninit+0x6f4/0x7a0 [bonding]
 RSP <ffff88003b1b5d58>
---[ end trace 1395ad691cea24d1 ]---

It occurs because of my recent netpoll blocking patches, which I added
to avoid recursive deadlock in the bonding driver. It relies on some
per cpu bits, but the shutdown path forces some rescheduling as we
cancel workqueues for the driver and wait for some device refcounts.
If after the forced reschedule, we wind up on a different cpu we
trigger the bughalt in unblock_netpoll_tx.

The fix is to remove the netpoll block/unblock calls from
bond_release_all. This is safe to do because bond_uninit, which is
called via ndo_uninit in rollback_registered_many, doesn't occur until
we send a NETDEV_UNREGISTER event, which triggers netconsole to remove
us as a netpoll client, so we are guaranteed not to recurse into our
own tx path here.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Reviewed-by: WANG Cong <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
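
The failure mode described in the message can be illustrated with a small userspace sketch. It is not the bonding driver's code: every name below (sim_tx_blocked, sim_block_tx, sim_unblock_tx, sim_release) is hypothetical, the per-cpu state is modelled as a plain array, and the BUG is modelled as a -1 return. It only shows why a block/unblock pair that brackets a sleeping section can fail its own consistency check once the task moves to another cpu.

```c
#include <string.h>

#define SIM_NR_CPUS 2

/* Hypothetical stand-in for a per-cpu "tx blocked" counter. */
static int sim_tx_blocked[SIM_NR_CPUS];

/* Hypothetical stand-in for the cpu the task is currently running on. */
static int sim_cpu;

/* Mark tx as blocked on whichever cpu we are running on right now. */
static void sim_block_tx(void)
{
    sim_tx_blocked[sim_cpu]++;
}

/*
 * Undo the block on the current cpu. Returns -1 where a real
 * consistency check would fire: this cpu's counter was never raised.
 */
static int sim_unblock_tx(void)
{
    if (sim_tx_blocked[sim_cpu] == 0)
        return -1;
    sim_tx_blocked[sim_cpu]--;
    return 0;
}

/*
 * Model a release path that blocks tx, does work that may sleep
 * (and so may resume on another cpu), then unblocks tx.
 */
static int sim_release(int start_cpu, int end_cpu)
{
    memset(sim_tx_blocked, 0, sizeof(sim_tx_blocked));
    sim_cpu = start_cpu;
    sim_block_tx();
    sim_cpu = end_cpu;  /* the sleeping work may have migrated us */
    return sim_unblock_tx();
}
```

In this model sim_release(0, 0) returns 0, while sim_release(0, 1) returns -1 because the counter was raised on cpu 0 and checked on cpu 1. Dropping the block/unblock pair from a path that can sleep, as the patch does for bond_release_all, removes the mismatch instead of trying to pin the task.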