- 12 12月, 2013 4 次提交
-
-
由 Philippe De Muyter 提交于
commit 03191656 worked around errata ERR006358, but comment contains duplicated lines, impairing the readability. Remove them. Signed-off-by: NPhilippe De Muyter <phdm@macqel.be> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
This reverts commit 99023e90. Accidently checked this into 'net' instead of 'net-next'. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Matthew Whitehead 提交于
Removed the shared ei_debug variable. Replaced it by adding u32 msg_enable to the private struct ei_device. Now each 8390 ethernet instance has a per-device logging variable. Changed older style printk() calls to more canonical forms. Tested on: ne, ne2k-pci, smc-ultra, and wd hardware. V4.0 - Substituted pr_info() and pr_debug() for printk() KERN_INFO and KERN_DEBUG V3.0 - Checked for cases where pr_cont() was most appropriate choice. - Changed module parameter from 'debug' to 'msg_enable' because debug was no longer the best description. V2.0 - Changed netif_msg_(drv|probe|ifdown|rx_err|tx_err|tx_queued|intr|rx_status|hw) to netif_(dbg|info|warn|err) where possible. Signed-off-by: NMatthew Whitehead <tedheadster@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Tony Lindgren 提交于
Commit 89ce376c (drivers/net: Use of_match_ptr() macro in smc91x.c) added minimal device tree support to smc91x, but it's not working on many platforms because of the lack of some key configuration bits. Fix the issue by parsing the necessary configuration like the smc911x driver is doing. As most smc91x users seem to use 16-bit access, let's default to that if no reg-io-width is specified. Cc: Nicolas Pitre <nico@fluxnic.net> Cc: Mark Rutland <mark.rutland@arm.com> Cc: netdev@vger.kernel.org Cc: devicetree@vger.kernel.org Acked-by: NNishanth Menon <nm@ti.com> Signed-off-by: NTony Lindgren <tony@atomide.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 11 12月, 2013 2 次提交
-
-
由 Nat Gurumoorthy 提交于
The new tg3 driver leaves REG_BASE_ADDR (PCI config offset 120) uninitialized. From power on reset this register may have garbage in it. The Register Base Address register defines the device local address of a register. The data pointed to by this location is read or written using the Register Data register (PCI config offset 128). When REG_BASE_ADDR has garbage any read or write of Register Data Register (PCI 128) will cause the PCI bus to lock up. The TCO watchdog will fire and bring down the system. Signed-off-by: NNat Gurumoorthy <natg@google.com> Acked-by: NMichael Chan <mchan@broadcom.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Maxime Ripard 提交于
The sun4i-emac driver uses devm_request_irq at .ndo_open time, but relies on the managed device mechanism to actually free it. This causes an issue whenever someone wants to restart the interface, the interrupt still being held, and not yet released. Fall back to using the regular request_irq at .ndo_open time, and introduce a free_irq during .ndo_stop. Signed-off-by: NMaxime Ripard <maxime.ripard@free-electrons.com> Cc: stable@vger.kernel.org # 3.11+ Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 10 12月, 2013 2 次提交
-
-
由 Srikanth Thokala 提交于
This patch adds barriers at appropriate places to ensure the driver works on Xilinx Zynq ARM-based SoC platform. Signed-off-by: NSrikanth Thokala <sthokal@xilinx.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Srikanth Thokala 提交于
There are no specific interrupts for the PONG buffer on both transmit and receive side, same interrupt is valid for both buffers. So, this patch removes this code. Signed-off-by: NSrikanth Thokala <sthokal@xilinx.com> Reviewed-by: NMichal Simek <monstr@monstr.eu> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 07 12月, 2013 11 次提交
-
-
由 Robert Stonehouse 提交于
There is an as-yet unexplained bug that sometimes prevents (or delays) the driver seeing the completion event for a completed MCDI request on the SFC9120. The requested configuration change will have happened but the driver assumes it to have failed, and this can result in further failures. We can mitigate this by polling for completion after unsuccessfully waiting for an event. Fixes: 8127d661 ('sfc: Add support for Solarflare SFC9100 family') Signed-off-by: NBen Hutchings <bhutchings@solarflare.com>
-
由 Robert Stonehouse 提交于
Signed-off-by: NBen Hutchings <bhutchings@solarflare.com>
-
由 Andrew Rybchenko 提交于
rx_prefix_size is 4-bytes aligned on Falcon/Siena (16 bytes), but it is equal to 14 on EF10. So, it should be taken into account if arch requires IP header to be 4-bytes aligned (via NET_IP_ALIGN). Fixes: 8127d661 ('sfc: Add support for Solarflare SFC9100 family') Signed-off-by: NBen Hutchings <bhutchings@solarflare.com>
-
由 Ben Hutchings 提交于
There is a single MCDI PTP operation for setting the frequency adjustment and applying a time offset to the hardware clock. When applying a time offset we should not change the frequency adjustment. These two operations can now be requested separately but this requires a flash firmware update. Keep using the single operation, but remember and repeat the previous frequency adjustment. Fixes: 7c236c43 ('sfc: Add support for IEEE-1588 PTP') Signed-off-by: NBen Hutchings <bhutchings@solarflare.com>
-
由 Alexandre Rames 提交于
This disables PTP when we bring the interface down to avoid getting unmatched RX timestamp events, and tries to re-enable it when bringing the interface up. [bwh: Make efx_ptp_stop() safe on Falcon. Introduce efx_ptp_{start,stop}_datapath() functions; we'll expand them later.] Fixes: 7c236c43 ('sfc: Add support for IEEE-1588 PTP') Signed-off-by: NBen Hutchings <bhutchings@solarflare.com>
-
由 Ben Hutchings 提交于
In case of a flood of PTP packets, the timestamp peripheral and MC firmware on the SFN[56]322F boards may not be able to provide timestamp events for all packets. Don't complain too much about this. Fixes: 7c236c43 ('sfc: Add support for IEEE-1588 PTP') Signed-off-by: NBen Hutchings <bhutchings@solarflare.com>
-
由 Laurence Evans 提交于
Limit syslog flood if a PTP packet storm occurs. Fixes: 7c236c43 ('sfc: Add support for IEEE-1588 PTP') Signed-off-by: NBen Hutchings <bhutchings@solarflare.com>
-
由 Ezequiel Garcia 提交于
The current code unmaps the DMA mapping created for rx skb_buff's by using the data_size as the the mapping size. This is wrong since the correct size to specify should match the size used to create the mapping. This commit removes the following DMA_API_DEBUG warning: ------------[ cut here ]------------ WARNING: at lib/dma-debug.c:887 check_unmap+0x3a8/0x860() mvneta d0070000.ethernet: DMA-API: device driver frees DMA memory with different size [device address=0x000000002eb80000] [map size=1600 bytes] [unmap size=66 bytes] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.10.21-01444-ga88ae13-dirty #92 [<c0013600>] (unwind_backtrace+0x0/0xf8) from [<c0010fb8>] (show_stack+0x10/0x14) [<c0010fb8>] (show_stack+0x10/0x14) from [<c001afa0>] (warn_slowpath_common+0x48/0x68) [<c001afa0>] (warn_slowpath_common+0x48/0x68) from [<c001b01c>] (warn_slowpath_fmt+0x30/0x40) [<c001b01c>] (warn_slowpath_fmt+0x30/0x40) from [<c018d0fc>] (check_unmap+0x3a8/0x860) [<c018d0fc>] (check_unmap+0x3a8/0x860) from [<c018d734>] (debug_dma_unmap_page+0x64/0x70) [<c018d734>] (debug_dma_unmap_page+0x64/0x70) from [<c0233f78>] (mvneta_rx+0xec/0x468) [<c0233f78>] (mvneta_rx+0xec/0x468) from [<c023436c>] (mvneta_poll+0x78/0x16c) [<c023436c>] (mvneta_poll+0x78/0x16c) from [<c02db468>] (net_rx_action+0x94/0x160) [<c02db468>] (net_rx_action+0x94/0x160) from [<c0021e68>] (__do_softirq+0xe8/0x1d0) [<c0021e68>] (__do_softirq+0xe8/0x1d0) from [<c0021ff8>] (do_softirq+0x4c/0x58) [<c0021ff8>] (do_softirq+0x4c/0x58) from [<c0022228>] (irq_exit+0x58/0x90) [<c0022228>] (irq_exit+0x58/0x90) from [<c000e7c8>] (handle_IRQ+0x3c/0x94) [<c000e7c8>] (handle_IRQ+0x3c/0x94) from [<c0008548>] (armada_370_xp_handle_irq+0x4c/0xb4) [<c0008548>] (armada_370_xp_handle_irq+0x4c/0xb4) from [<c000dc20>] (__irq_svc+0x40/0x50) Exception stack(0xc04f1f70 to 0xc04f1fb8) 1f60: c1fe46f8 00000000 00001d92 00001d92 1f80: c04f0000 c04f0000 c04f84a4 c03e081c c05220e7 00000001 c05220e7 c04f0000 1fa0: 00000000 c04f1fb8 c000eaf8 c004c048 60000113 ffffffff [<c000dc20>] (__irq_svc+0x40/0x50) from [<c004c048>] (cpu_startup_entry+0x54/0x128) [<c004c048>] (cpu_startup_entry+0x54/0x128) from [<c04c1a14>] (start_kernel+0x29c/0x2f0) [<c04c1a14>] (start_kernel+0x29c/0x2f0) from [<00008074>] (0x8074) ---[ end trace d4955f6acd178110 ]--- Mapped at: [<c018d600>] debug_dma_map_page+0x4c/0x11c [<c0235d6c>] mvneta_setup_rxqs+0x398/0x598 [<c0236084>] mvneta_open+0x40/0x17c [<c02dbbd4>] __dev_open+0x9c/0x100 [<c02dbe58>] __dev_change_flags+0x7c/0x134 Signed-off-by: NEzequiel Garcia <ezequiel.garcia@free-electrons.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ben Hutchings 提交于
efx_ptp_is_ptp_tx() must be robust against skbs from raw sockets that have invalid IPv4 and UDP headers. Add checks that: - the transport header has been found - there is enough space between network and transport header offset for an IPv4 header - there is enough space after the transport header offset for a UDP header Fixes: 7c236c43 ('sfc: Add support for IEEE-1588 PTP') Signed-off-by: NBen Hutchings <bhutchings@solarflare.com>
-
由 Somnath Kotur 提交于
During suspend-resume and lancer error recovery we will cleanup and re-initialize the resources through be_clear() and be_setup() respectively. During re-initialisation in be_setup(), if be_get_config() fails, we'll again call be_clear() which will cause a NULL pointer dereference as adapter->pmac_id is already freed. Signed-off-by: NKalesh AP <kalesh.purayil@emulex.com> Signed-off-by: NSomnath Kotur <somnath.kotur@emulex.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Somnath Kotur 提交于
The Firmware update would be detected by looking at the sliport_error1/ sliport_error2 register values(0x02/0x00). If its not a FW reset the current messaging would take place. If the error is due to FW reset, log a message to user that "Firmware update in progress" and also do not log sliport_status and sliport_error register values. Signed-off-by: NKalesh AP <kalesh.purayil@emulex.com> Signed-off-by: NSomnath Kotur <somnath.kotur@emulex.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 06 12月, 2013 6 次提交
-
-
由 Ivan Vecera 提交于
The driver incorrectly run loopback test on chips that don't support it. Loopback test is only supported by chips that has DEV_HAS_TEST_EXTENDED flag and returns 4 (NV_TEST_COUNT_EXTENDED) as test count. Signed-off-by: NIvan Vecera <ivecera@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Michal Kalderon 提交于
Fixed NULL pointer dereference when dynamically activating SR-IOV after vf database failed to be allocated in probe stage (for example due to no ARI support in pci hub). Signed-off-by: NMichal Kalderon <michals@broadcom.com> Signed-off-by: NAriel Elior <ariele@broadcom.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Tony Lindgren 提交于
When booted with device tree, we may still have platform data passed as auxdata. For am3517 this is needed for passing the interrupt_enable and interrupt_disable callbacks that access the omap system control module registers. These callback functions will eventually go away when we have a separate system control module driver. Some of the things that are currently passed as platform data we don't need to set up as device tree properties as they are always the same on am3517. So let's use a new compatible flag for those so we can get those from the device tree match data. Also note that we need to fix setting of phy_dev to NULL instead of an empty string as the code later on uses that to find the first phy on the mdio bus. This seems to have been caused by 5d69e007 (net: davinci_emac: switch to new mdio). Signed-off-by: NTony Lindgren <tony@atomide.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jitendra Kalsaria 提交于
Signed-off-by: NJitendra Kalsaria <jitendra.kalsaria@qlogic.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jitendra Kalsaria 提交于
o Fix the driver to allow user to enable/disable rx/tx vlan acceleration independently. For example: ethtool -K ethX rxvlan on/off ethtool -K ethX txvlan on/off Signed-off-by: NJitendra Kalsaria <jitendra.kalsaria@qlogic.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jitendra Kalsaria 提交于
o Receive mac error stat was getting overwritten by other stats. Signed-off-by: NJitendra Kalsaria <jitendra.kalsaria@qlogic.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 04 12月, 2013 4 次提交
-
-
由 Hariprasad Shenai 提交于
Signed-off-by: NHariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Hariprasad Shenai 提交于
Signed-off-by: NHariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Hariprasad Shenai 提交于
Signed-off-by: NHariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Wei Yang 提交于
When driver registration fails, we need to clean up the resources allocated before. mlx4_core missed destroying the workqueue allocated. This patch destroys the workqueue when registration fails. Signed-off-by: NWei Yang <weiyang@linux.vnet.ibm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 03 12月, 2013 4 次提交
-
-
由 Eric Dumazet 提交于
Few network drivers really supports frag_list : virtual drivers. Some drivers wrongly advertise NETIF_F_FRAGLIST feature. If skb with a frag_list is given to them, packet on the wire will be corrupt. Remove this flag, as core networking stack will make sure to provide packets that can be sent without corruption. Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com> Cc: Anirudha Sarangi <anirudh@xilinx.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Sebastian Siewior 提交于
On tx submit the driver always dma_map_single() FEC_ENET_TX_FRSIZE (=2048) bytes. This works because we don't overwrite any memory after the data buffer, we remove it from cache if it was there. So we hurt performace in case the mapping of a smaller area makes a difference. There is also a bug: If the data area starts shortly before the end of RAM say 0xc7fffa10 and the RAM ends at 0xc8000000 then we have enough space to fit the data area (according to skb->len) but we would map beyond end of ram if we are using 2048. In v2.6.31 (against which kernel this patch made) there is the following check in dma_cache_maint(): |BUG_ON(!virt_addr_valid(start) || !virt_addr_valid(start + size - 1)); Since the area starting at 0xc8000000 is no longer virt_addr_valid() we BUG() during dma_map_single(). The BUG() statement was removed in v3.5-rc1 as per 2dc6a016 ("ARM: dma-mapping: use asm-generic/dma-mapping-common.h"). This patch was tested on v2.6.31 and then forward-ported and compile tested only against the net tree. I think it is still worth fixing mainline even after the BUG() statement is gone. Tested-by: NFugang Duan <B38611@freescale.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: NFugang Duan <B38611@freescale.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Mugunthan V N 提交于
When only one port of the two port is pinned out, then dt probe is failing because second port phy is not found. fixing this by checking the number of slaves and breaking the loop. Signed-off-by: NMugunthan V N <mugunthanvnm@ti.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Rafael J. Wysocki 提交于
Modify tg3_chip_reset() and tg3_close() to check if the PCI network adapter device is accessible at all in order to skip poking it or trying to handle a carrier loss in vain when that's not the case. Introduce a special PCI helper function pci_device_is_present() for this purpose. Of course, this uncovers the lack of the appropriate RTNL locking in tg3_suspend() and tg3_resume(), so add that locking in there too. These changes prevent tg3 from burning a CPU at 100% load level for solid several seconds after the Thunderbolt link is disconnected from a Matrox DS1 docking station. Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: NMichael Chan <mchan@broadcom.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 02 12月, 2013 1 次提交
-
-
由 Eugenia Emantayev 提交于
Remove waiting for TX queues to become empty during selftest. This check is not necessary for any purpose, and might put the driver into an infinite loop. Signed-off-by: NEugenia Emantayev <eugenia@mellanox.com> Signed-off-by: NAmir Vadai <amirv@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 30 11月, 2013 6 次提交
-
-
由 Mark Rustad 提交于
Correct a namespace complaint by making the function static and moving the prototype into the .c file. Signed-off-by: NMark Rustad <mark.d.rustad@intel.com> Tested-by: NPhil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
由 John Fastabend 提交于
NETIF_F_HW_L2FW_DOFFLOAD allows upper layer net devices such as macvlan to use queues in the hardware to directly submit and receive skbs. This creates a subtle change in the datapath though. One change being the skb may no longer use the root devices qdisc. Because users may not expect this we can't enable the feature by default unless the hardware can offload all the software functionality above it. So for now disable it by default and let users opt in. Signed-off-by: NJohn Fastabend <john.r.fastabend@intel.com> Tested-by: NPhil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
由 John Fastabend 提交于
When compiling with -Wstrict-prototypes gcc catches a static I missed. ./ixgbe_main.c:4254: warning: no previous prototype for 'ixgbe_fwd_ring_down' Reported-by: NPhillip Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: NJohn Fastabend <john.r.fastabend@intel.com> Tested-by: NPhil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
由 Vladimir Davydov 提交于
On e1000_down(), we should ensure every asynchronous work is canceled before proceeding. Since the watchdog_task can schedule other works apart from itself, it should be stopped first, but currently it is stopped after the reset_task. This can result in the following race leading to the reset_task running after the module unload: e1000_down_and_stop(): e1000_watchdog(): ---------------------- ----------------- cancel_work_sync(reset_task) schedule_work(reset_task) cancel_delayed_work_sync(watchdog_task) The patch moves cancel_delayed_work_sync(watchdog_task) at the beginning of e1000_down_and_stop() thus ensuring the race is impossible. Cc: Tushar Dave <tushar.n.dave@intel.com> Cc: Patrick McHardy <kaber@trash.net> Signed-off-by: NVladimir Davydov <vdavydov@parallels.com> Tested-by: NAaron Brown <aaron.f.brown@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
由 Vladimir Davydov 提交于
The patch fixes the following lockdep warning, which is 100% reproducible on network restart: ====================================================== [ INFO: possible circular locking dependency detected ] 3.12.0+ #47 Tainted: GF ------------------------------------------------------- kworker/1:1/27 is trying to acquire lock: ((&(&adapter->watchdog_task)->work)){+.+...}, at: [<ffffffff8108a5b0>] flush_work+0x0/0x70 but task is already holding lock: (&adapter->mutex){+.+...}, at: [<ffffffffa0177c0a>] e1000_reset_task+0x4a/0xa0 [e1000] which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&adapter->mutex){+.+...}: [<ffffffff810bdb5d>] lock_acquire+0x9d/0x120 [<ffffffff816b8cbc>] mutex_lock_nested+0x4c/0x390 [<ffffffffa017233d>] e1000_watchdog+0x7d/0x5b0 [e1000] [<ffffffff8108b972>] process_one_work+0x1d2/0x510 [<ffffffff8108ca80>] worker_thread+0x120/0x3a0 [<ffffffff81092c1e>] kthread+0xee/0x110 [<ffffffff816c3d7c>] ret_from_fork+0x7c/0xb0 -> #0 ((&(&adapter->watchdog_task)->work)){+.+...}: [<ffffffff810bd9c0>] __lock_acquire+0x1710/0x1810 [<ffffffff810bdb5d>] lock_acquire+0x9d/0x120 [<ffffffff8108a5eb>] flush_work+0x3b/0x70 [<ffffffff8108b5d8>] __cancel_work_timer+0x98/0x140 [<ffffffff8108b693>] cancel_delayed_work_sync+0x13/0x20 [<ffffffffa0170cec>] e1000_down_and_stop+0x3c/0x60 [e1000] [<ffffffffa01775b1>] e1000_down+0x131/0x220 [e1000] [<ffffffffa0177c12>] e1000_reset_task+0x52/0xa0 [e1000] [<ffffffff8108b972>] process_one_work+0x1d2/0x510 [<ffffffff8108ca80>] worker_thread+0x120/0x3a0 [<ffffffff81092c1e>] kthread+0xee/0x110 [<ffffffff816c3d7c>] ret_from_fork+0x7c/0xb0 other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&adapter->mutex); lock((&(&adapter->watchdog_task)->work)); lock(&adapter->mutex); lock((&(&adapter->watchdog_task)->work)); *** DEADLOCK *** 3 locks held by kworker/1:1/27: #0: (events){.+.+.+}, at: [<ffffffff8108b906>] process_one_work+0x166/0x510 #1: ((&adapter->reset_task)){+.+...}, at: [<ffffffff8108b906>] process_one_work+0x166/0x510 #2: (&adapter->mutex){+.+...}, at: [<ffffffffa0177c0a>] e1000_reset_task+0x4a/0xa0 [e1000] stack backtrace: CPU: 1 PID: 27 Comm: kworker/1:1 Tainted: GF 3.12.0+ #47 Hardware name: System manufacturer System Product Name/P5B-VM SE, BIOS 0501 05/31/2007 Workqueue: events e1000_reset_task [e1000] ffffffff820f6000 ffff88007b9dba98 ffffffff816b54a2 0000000000000002 ffffffff820f5e50 ffff88007b9dbae8 ffffffff810ba936 ffff88007b9dbac8 ffff88007b9dbb48 ffff88007b9d8f00 ffff88007b9d8780 ffff88007b9d8f00 Call Trace: [<ffffffff816b54a2>] dump_stack+0x49/0x5f [<ffffffff810ba936>] print_circular_bug+0x216/0x310 [<ffffffff810bd9c0>] __lock_acquire+0x1710/0x1810 [<ffffffff8108a5b0>] ? __flush_work+0x250/0x250 [<ffffffff810bdb5d>] lock_acquire+0x9d/0x120 [<ffffffff8108a5b0>] ? __flush_work+0x250/0x250 [<ffffffff8108a5eb>] flush_work+0x3b/0x70 [<ffffffff8108a5b0>] ? __flush_work+0x250/0x250 [<ffffffff8108b5d8>] __cancel_work_timer+0x98/0x140 [<ffffffff8108b693>] cancel_delayed_work_sync+0x13/0x20 [<ffffffffa0170cec>] e1000_down_and_stop+0x3c/0x60 [e1000] [<ffffffffa01775b1>] e1000_down+0x131/0x220 [e1000] [<ffffffffa0177c12>] e1000_reset_task+0x52/0xa0 [e1000] [<ffffffff8108b972>] process_one_work+0x1d2/0x510 [<ffffffff8108b906>] ? process_one_work+0x166/0x510 [<ffffffff8108ca80>] worker_thread+0x120/0x3a0 [<ffffffff8108c960>] ? manage_workers+0x2c0/0x2c0 [<ffffffff81092c1e>] kthread+0xee/0x110 [<ffffffff81092b30>] ? __init_kthread_worker+0x70/0x70 [<ffffffff816c3d7c>] ret_from_fork+0x7c/0xb0 [<ffffffff81092b30>] ? __init_kthread_worker+0x70/0x70 == The issue background == The problem occurs, because e1000_down(), which is called under adapter->mutex by e1000_reset_task(), tries to synchronously cancel e1000 auxiliary works (reset_task, watchdog_task, phy_info_task, fifo_stall_task), which take adapter->mutex in their handlers. So the question is what does adapter->mutex protect there? The adapter->mutex was introduced by commit 0ef4ee ("e1000: convert to private mutex from rtnl") as a replacement for rtnl_lock() taken in the asynchronous handlers. It targeted on fixing a similar lockdep warning issued when e1000_down() was called under rtnl_lock(), and it fixed it, but unfortunately it introduced the lockdep warning described above. Anyway, that said the source of this bug is that the asynchronous works were made to take rtnl_lock() some time ago, so let's look deeper and find why it was added there. The rtnl_lock() was added to asynchronous handlers by commit 338c15 ("e1000: fix occasional panic on unload") in order to prevent asynchronous handlers from execution after the module is unloaded (e1000_down() is called) as it follows from the comment to the commit: > Net drivers in general have an issue where timers fired > by mod_timer or work threads with schedule_work are running > outside of the rtnl_lock. > > With no other lock protection these routines are vulnerable > to races with driver unload or reset paths. > > The longer term solution to this might be a redesign with > safer locks being taken in the driver to guarantee no > reentrance, but for now a safe and effective fix is > to take the rtnl_lock in these routines. I'm not sure if this locking scheme fixed the problem or just made it unlikely, although I incline to the latter. Anyway, this was long time ago when e1000 auxiliary works were implemented as timers scheduling real work handlers in their routines. The e1000_down() function only canceled the timers, but left the real handlers running if they were running, which could result in work execution after module unload. Today, the e1000 driver uses sane delayed works instead of the pair timer+work to implement its delayed asynchronous handlers, and the e1000_down() synchronously cancels all the works so that the problem that commit 338c15 tried to cope with disappeared, and we don't need any locks in the handlers any more. Moreover, any locking there can potentially result in a deadlock. So, this patch reverts commits 0ef4ee and 338c15. Fixes: 0ef4eedc ("e1000: convert to private mutex from rtnl") Fixes: 338c15e4 ("e1000: fix occasional panic on unload") Cc: Tushar Dave <tushar.n.dave@intel.com> Cc: Patrick McHardy <kaber@trash.net> Signed-off-by: NVladimir Davydov <vdavydov@parallels.com> Tested-by: NAaron Brown <aaron.f.brown@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
由 yzhu1 提交于
This change is based on a similar change made to e1000e support in commit bb9e44d0 ("e1000e: prevent oops when adapter is being closed and reset simultaneously"). The same issue has also been observed on the older e1000 cards. Here, we have increased the RESET_COUNT value to 50 because there are too many accesses to e1000 nic on stress tests to e1000 nic, it is not enough to set RESET_COUT 25. Experimentation has shown that it is enough to set RESET_COUNT 50. Signed-off-by: Nyzhu1 <yanjun.zhu@windriver.com> Tested-by: NAaron Brown <aaron.f.brown@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-