- 30 September 2015, 7 commits
-
-
Committed by Maxime Ripard
Since the switch to per-CPU interrupts, we lost the ability to set which CPU was going to receive our RX interrupt; it was now only the CPU on which the mvneta_open function was run. We can now assign our queues to their respective CPUs, and make sure only this CPU is going to handle our traffic. This also paves the way to being able to change that at runtime, and later on to support RSS. [gregory.clement@free-electrons.com]: hardened the CPU hotplug support. Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
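For illustration, a minimal sketch of the queue-to-CPU assignment idea, assuming the mvneta register helpers (mvreg_write, MVNETA_CPU_MAP, MVNETA_CPU_RXQ_ACCESS) and a driver-global rxq_number; the bit layout and the round-robin policy are assumptions, not taken from the patch:

```c
/* Sketch: distribute RX queues across online CPUs so that each CPU
 * only receives interrupts for the queues assigned to it.  Register
 * names follow the mvneta driver; exact semantics are illustrative. */
static void mvneta_assign_rxqs_to_cpus(struct mvneta_port *pp)
{
	int cpu, i = 0;

	for_each_online_cpu(cpu) {
		u32 rxq_map = 0;
		int rxq;

		for (rxq = 0; rxq < rxq_number; rxq++)
			if (rxq % num_online_cpus() == i)
				rxq_map |= MVNETA_CPU_RXQ_ACCESS(rxq);

		mvreg_write(pp, MVNETA_CPU_MAP(cpu), rxq_map);
		i++;
	}
}
```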
-
Committed by Maxime Ripard
The mvneta driver allows changing the default RX queue through the rxq_def kernel parameter. However, the current code doesn't allow any value but 0, and actively checks for this in the driver's probe, because the driver makes a number of assumptions and takes a number of shortcuts in order to just use that RX queue. Remove these limitations in order to be able to specify any available queue. Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Maxime Ripard
Now that our interrupt controller allows us to use per-CPU interrupts, actually use them in the mvneta driver. This obviously involves reworking the driver to have a CPU-local NAPI structure and to report incoming packets using that structure. Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Maxime Ripard
The CPU_MAP register is duplicated for each CPU, each instance being at a different address. However, the code so far used CONFIG_NR_CPUS to initialise the CPU_MAP registers, while the SoCs embed at most 4 CPUs. This is especially an issue with multi_v7_defconfig, where CONFIG_NR_CPUS is currently set to 16, resulting in writes to registers that are not CPU_MAP. Fixes: c5aff182 ("net: mvneta: driver for Marvell Armada 370/XP network unit") Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com> Cc: <stable@vger.kernel.org> # v3.8+ Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
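A before/after sketch of the fix as described; the mask names follow the mvneta driver, but treat the exact values written as assumptions:

```c
/* Before: CONFIG_NR_CPUS iterations can exceed the number of CPU_MAP
 * instances the SoC actually has, clobbering unrelated registers. */
for (cpu = 0; cpu < CONFIG_NR_CPUS; cpu++)
	mvreg_write(pp, MVNETA_CPU_MAP(cpu),
		    MVNETA_CPU_RXQ_ACCESS_ALL_MASK |
		    MVNETA_CPU_TXQ_ACCESS_ALL_MASK);

/* After: only touch the CPU_MAP instances of CPUs that exist. */
for_each_present_cpu(cpu)
	mvreg_write(pp, MVNETA_CPU_MAP(cpu),
		    MVNETA_CPU_RXQ_ACCESS_ALL_MASK |
		    MVNETA_CPU_TXQ_ACCESS_ALL_MASK);
```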
-
Committed by Maxime Ripard
The MPIC driver currently has a list of interrupts to handle as per-CPU. Since the timer, fabric and neta interrupts were the only per-CPU interrupts in the system, we can now remove the switch and just check the hardware irq number to determine whether a given interrupt is per-CPU or not. Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com> Acked-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Maxime Ripard
Some drivers might use per-CPU interrupts and still be built as modules. Export request_percpu_irq and free_percpu_irq to these users, which also makes them consistent with enable_percpu_irq/disable_percpu_irq, which were already exported. Reported-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Maxime Ripard
The documentation of request_percpu_irq is confusing and suggests that the interrupt is not enabled at all, while it is actually enabled on the local CPU. Clarify that. Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
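A minimal usage sketch of the per-CPU IRQ API these two patches touch; the device name, per-CPU structure, and probe function are hypothetical:

```c
#include <linux/interrupt.h>
#include <linux/percpu.h>

struct my_pcpu_state {
	unsigned long events;
};

static DEFINE_PER_CPU(struct my_pcpu_state, my_pcpu_state);

static irqreturn_t my_percpu_handler(int irq, void *dev_id)
{
	/* For per-CPU IRQs the core hands the handler this CPU's
	 * instance of the per-CPU dev_id. */
	struct my_pcpu_state *st = dev_id;

	st->events++;
	return IRQ_HANDLED;
}

static int my_probe(unsigned int irq)
{
	int ret;

	/* Registers the handler for all CPUs but, per the clarified
	 * kerneldoc, enables the interrupt only on the local CPU. */
	ret = request_percpu_irq(irq, my_percpu_handler, "my-dev",
				 &my_pcpu_state);
	if (ret)
		return ret;

	/* Other CPUs must opt in themselves, typically from a CPU
	 * online/hotplug callback running on that CPU. */
	enable_percpu_irq(irq, IRQ_TYPE_NONE);
	return 0;
}
```

On teardown, each CPU calls disable_percpu_irq() before the driver calls free_percpu_irq(irq, &my_pcpu_state); exporting the request/free pair is what lets a module do this at all.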
-
- 29 September 2015, 28 commits
-
-
Committed by Jesper Dangaard Brouer
Noticed that the compiler (gcc version 4.8.5 20150623 (Red Hat 4.8.5-4) (GCC)) generated suboptimal assembler code in eth_get_headlen(). This early-return coding style is usually not an issue on superscalar CPUs, but the compiler chose to put the return statement after this very unlikely branch, thus creating a larger jump down to the likely code path. Performance-wise, I could measure slightly fewer L1-icache-load-misses and branch-misses, and an improvement of 1 nanosecond in an IP-forwarding use-case with 257-byte packets with ixgbe (CPU i7-4790K @ 4.00GHz). Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
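To illustrate the codegen concern, a hedged sketch of the early-return pattern with an explicit branch hint; the function and its check are made up, not the real eth_get_headlen() logic:

```c
#include <linux/compiler.h>
#include <linux/if_ether.h>

/* Sketch: marking the rare early return as unlikely() lets the
 * compiler lay out the common case as straight-line code instead of
 * jumping over a cold return. */
static unsigned int headlen_sketch(const void *data, unsigned int len)
{
	if (unlikely(len < ETH_HLEN))
		return len;	/* cold path */

	return ETH_HLEN;	/* stand-in for actual header parsing */
}
```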
-
Committed by Bendik Rønning Opstad
Application-limited streams such as thin streams, which transmit small amounts of payload in relatively few packets per RTT, can be prevented from growing the CWND when in congestion avoidance. This leads to increased sojourn times for data segments in streams that often transmit time-dependent data. Currently, a connection is considered CWND-limited only after having successfully transmitted at least one packet with new data, while at the same time failing to transmit some unsent data from the output queue because the CWND is full. Applications that produce small amounts of data may be left in a state where the connection is never considered CWND-limited, because all unsent data is successfully transmitted each time an incoming ACK opens up for more data to be transmitted in the send window. Fix by always testing whether the CWND is fully used after successful packet transmissions, such that a connection is considered CWND-limited whenever the CWND has been filled. This is the correct behavior as specified in RFC2861 (section 3.1). Cc: Andreas Petlund <apetlund@simula.no> Cc: Carsten Griwodz <griff@simula.no> Cc: Jonas Markussen <jonassm@ifi.uio.no> Cc: Kenneth Klette Jonassen <kennetkl@ifi.uio.no> Cc: Mads Johannessen <madsjoh@ifi.uio.no> Signed-off-by: Bendik Rønning Opstad <bro.devel+kernel@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Tested-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Tested-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
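A sketch of the behavioral change in kernel terms, not the literal patch; it assumes the tcp_sock fields named here:

```c
#include <net/tcp.h>

/* Sketch of the new rule: after a transmission round, consider the
 * connection cwnd-limited whenever the congestion window is fully
 * used, per RFC 2861, section 3.1 — not only when unsent data was
 * left behind because the window was full. */
static void tcp_mark_cwnd_limited_sketch(struct sock *sk)
{
	struct tcp_sock *tp = tcp_sk(sk);

	if (tcp_packets_in_flight(tp) >= tp->snd_cwnd)
		tp->is_cwnd_limited = true;
}
```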
-
Committed by Hariprasad Shenai
Add support for the ethtool get-time-stamp ioctl, which is used by tcpdump to query the supported time stamp types, e.g.:

tcpdump -i eth5 -J
Time stamp types for eth5 (use option -j to set):
  host (Host)
  adapter_unsynced (Adapter, not synced with system time)

Also add support for the adapter-unsynced mode by adding SIOCSHWTSTAMP support in the driver. Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
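For context, a hedged sketch of the ethtool hook that backs `tcpdump -J`; the capability flags below are illustrative, not the exact cxgb4 set:

```c
#include <linux/ethtool.h>
#include <linux/net_tstamp.h>
#include <linux/bitops.h>

/* Sketch: report which time stamp types the device supports.
 * tcpdump -J lists these; -j selects one. */
static int my_get_ts_info(struct net_device *dev,
			  struct ethtool_ts_info *info)
{
	info->so_timestamping = SOF_TIMESTAMPING_TX_HARDWARE |
				SOF_TIMESTAMPING_RX_HARDWARE |
				SOF_TIMESTAMPING_RAW_HARDWARE;
	info->tx_types = BIT(HWTSTAMP_TX_OFF) | BIT(HWTSTAMP_TX_ON);
	info->rx_filters = BIT(HWTSTAMP_FILTER_NONE) |
			   BIT(HWTSTAMP_FILTER_ALL);
	info->phc_index = -1;	/* no PTP hardware clock exposed here */
	return 0;
}
```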
-
Committed by huangdaode
This patch fixes the compilation error with arm allmodconfig; the error was generated due to the unavailability of readq() on 32-bit platforms, and was found during net-next daily compilation. At the same time, fix all the hns driver compilation warnings. Signed-off-by: huangdaode <huangdaode@hisilicon.com> Signed-off-by: zhaungyuzeng <Yisen.zhuang@huawei.com> Signed-off-by: kenneth Lee <liguozhu@hisilicon.com> Signed-off-by: yankejian <yankejian@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Robert Jarzmik
Convert pxaficp_ir to dmaengine. As the pxa architecture is shifting from raw DMA register access to the pxa_dma dmaengine driver, convert this driver to dmaengine. Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr> Tested-by: Petr Cvek <petr.cvek@tul.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Robert Jarzmik
Convert the pxa IRDA driver to the readl and writel primitives, and remove another set of direct register accesses. This leaves only the DMA register accesses, which will be dealt with in the dmaengine conversion. Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr> Tested-by: Petr Cvek <petr.cvek@tul.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Robert Jarzmik
Instead of using the OS timer directly through register access, use the standard sched_clock(), which ends up reading OSCR anyway. This is a first step towards removing direct register access and machine-specific code from this driver. This commit changes the behavior: previously the minimum turnaround time was counted in 76 ns steps, while with this patch it is counted in microsecond steps. The strictly equal formula would have been: while ((sched_clock() - si->last_clk) * 76 < mtt) Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
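A sketch of the converted wait loop under the assumptions stated in the message (sched_clock() in nanoseconds, mtt in microseconds, last_clk recorded at the last activity); the helper name is hypothetical:

```c
#include <linux/sched.h>	/* sched_clock() */
#include <linux/time.h>		/* NSEC_PER_USEC */

/* Busy-wait until the minimum turnaround time has elapsed since the
 * last activity timestamp, at microsecond granularity rather than
 * the old 76 ns OSCR steps. */
static void irda_wait_mtt_sketch(u64 last_clk, unsigned int mtt_us)
{
	while (sched_clock() - last_clk < (u64)mtt_us * NSEC_PER_USEC)
		cpu_relax();
}
```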
-
Committed by Fabio Estevam
There is no need to have FEATURES_NEED_QUIESCE defined, as we can simply use NETIF_F_RXCSUM instead, as done in other parts of the driver. Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by David Ahern
oif has already been checked to be non-zero; the 2 additional checks on oif within that if (oif) {...} block are redundant. CC: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Woojung Huh
lan78xx_suspend() may return a non-zero value from lan78xx_write_reg() in some scenarios. Fix it to return 0 when lan78xx_suspend() encounters no error. Signed-off-by: Woojung Huh <woojung.huh@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by David S. Miller
Or Gerlitz says:
====================
Mellanox mlx5 driver update
A bunch of changes from the team, while warming up the engines for the upcoming SRIOV support.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Eli Cohen
Update the newly monitored health syndromes and their descriptions. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Eli Cohen
The name refers to a syndrome, so use ext_synd instead of ext_sync. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Majd Dibbiny
In the new flow, we separate the PCI initialization and teardown from the initialization and teardown of the other resources. init_one calls mlx5_pci_init, which handles the PCI resource initialization; it then calls mlx5_load_one to initialize the remainder of the resources. When removing a device, remove_one is invoked; it now calls mlx5_unload_one to free all the resources except the PCI ones. When mlx5_unload_one returns, mlx5_pci_close is called to free the PCI resources. This separation will allow us to implement the PCI error handlers and the suspend and resume callbacks. Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
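A structural sketch of the split; dev/priv setup and error paths are elided, and the helper signatures are assumptions based on the description:

```c
static int init_one(struct pci_dev *pdev, const struct pci_device_id *id)
{
	struct mlx5_core_dev *dev;
	int err;

	/* allocation of dev and its priv elided */

	err = mlx5_pci_init(dev, &dev->priv);	/* PCI resources only */
	if (err)
		return err;

	err = mlx5_load_one(dev, &dev->priv);	/* everything non-PCI */
	if (err)
		mlx5_pci_close(dev, &dev->priv);
	return err;
}

static void remove_one(struct pci_dev *pdev)
{
	struct mlx5_core_dev *dev = pci_get_drvdata(pdev);

	mlx5_unload_one(dev, &dev->priv);	/* all but PCI resources */
	mlx5_pci_close(dev, &dev->priv);	/* then PCI itself */
}
```

Keeping the PCI half alive while the rest is torn down is exactly what a pci_error_handlers or suspend/resume implementation needs.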
-
Committed by Eli Cohen
Some errors did not result in notifying the firmware that the page request could not be fulfilled. Fix this and put the notification logic into a separate function. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Eli Cohen
In case of async command completion, the error code returned should take into account the command completion status. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Achiad Shochat
Cosmetic change. Do not use an err variable just to assign and return it. Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Saeed Mahameed
The output mailbox format was mistakenly used for the input mailbox. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Achiad Shochat
The private mlx5 state flag that indicates that the netdev is open is set at the beginning of the netdev open flow. In case an error occurred later in the mlx5 netdev open flow, this flag was not cleared, remaining set although the netdev is actually closed. Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Andrzej Hajda
The function always returns non-negative values. The problem has been detected using the proposed semantic patch scripts/coccinelle/tests/assign_signed_to_unsigned.cocci [1]. [1]: http://permalink.gmane.org/gmane.linux.kernel/2046107 Signed-off-by: Andrzej Hajda <a.hajda@samsung.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Eric Dumazet
We found that a TCP Fast Open passive connection was vulnerable to reorders, as the exchange might look like:

[1] C -> S  S <FO ...> <request>
[2] S -> C  S. ack request <options>
[3] S -> C  . <answer>

Packets [2] and [3] can be generated at almost the same time. If C receives the 3rd packet before the 2nd, it will drop it, as the socket is in SYN_SENT state and expects a SYNACK. S will have to retransmit the answer. Current OOO avoidance in Linux is defeated because SYNACK packets are attached to the LISTEN socket, while DATA packets are attached to the children. They might be sent by different CPUs, and different TX queues might be selected. It turns out that for TFO, we created a child, which is a full-blown socket in TCP_SYN_RECV state, and we can simply attach the SYNACK packet to this socket. This means that at the time tcp_sendmsg() pushes the DATA packet, skb->ooo_okay will be set iff the SYNACK packet had been sent and its TX completed. This removes the reorder source at the host level. We also removed the export of tcp_try_fastopen(), as it is no longer called from IPv6. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Committed by David S. Miller
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates 2015-09-28
This series contains updates to i40e, i40evf and igb to resolve issues seen and reported by Red Hat. Kiran moves i40e_get_head() in preparation for the refactor of the Tx timeout logic, so that it can be used in other areas of the driver, and refactors the driver timeout logic by issuing a writeback request via a software interrupt to the hardware the first time the driver detects a hang; this was needed because the driver was too aggressive in resetting a hung queue. Shannon adds the GRE protocol to the transmit checksum encoding. Anjali fixes an issue of forcing writeback too often, which caused us to not benefit from NAPI: we now disable force writeback in the clean routine for X710 and XL710 adapters, while X722 adapters do not enable an interrupt to force a writeback, benefit from WB_ON_ITR, and so have force WB left enabled. She also fixes a possible deadlock where sync_vsi_filters() can be called directly under RTNL or through the timer subtask without RTNL, by updating the flow to check whether we are already under RTNL before trying to grab it. Stefan Assmann provides a fix for igb where SR-IOV was not getting enabled properly and we ran into a NULL pointer if the max_vfs module parameter was specified; this is prevented by setting the IGB_FLAG_HAS_MSIX bit before calling igb_probe_vfs().
v2: added the "i40e: Fix for recursive RTNL lock during PROMISC change" patch to the series, as it resolves another issue seen and reported by Red Hat.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Stefan Assmann
In igb_sw_init() the sequence of calls was changed from

  igb_init_queue_configuration()
  igb_init_interrupt_scheme()
  igb_probe_vfs()

to

  igb_probe_vfs()
  igb_init_queue_configuration()
  igb_init_interrupt_scheme()

This results in adapter->flags not having the IGB_FLAG_HAS_MSIX bit set during igb_probe_vfs()->igb_enable_sriov(). Therefore SR-IOV does not get enabled properly and we run into a NULL pointer if the max_vfs module parameter is specified (adapter->vf_data does not get allocated, crash on accessing the structure).

[    7.419348] BUG: unable to handle kernel NULL pointer dereference at 0000000000000048
[    7.419367] IP: [<ffffffffa02161c6>] igb_reset+0xe6/0x5d0 [igb]
[    7.419370] PGD 0
[    7.419373] Oops: 0002 [#1] SMP
[    7.419381] Modules linked in: ahci(+) libahci igb(+) i40e(+) vxlan ip6_udp_tunnel udp_tunnel megaraid_sas(+) ixgbe(+) mdio
[    7.419385] CPU: 0 PID: 4 Comm: kworker/0:0 Not tainted 4.2.0+ #153
[    7.419387] Hardware name: Dell Inc. PowerEdge R720/0C4Y3R, BIOS 1.6.0 03/07/2013
[...]
[    7.419431] Call Trace:
[    7.419442] [<ffffffffa0217236>] igb_probe+0x8b6/0x1340 [igb]
[    7.419447] [<ffffffff814c7f15>] local_pci_probe+0x45/0xa0

Prevent this by setting the IGB_FLAG_HAS_MSIX bit before calling igb_probe_vfs(). The real interrupt capabilities will be checked during igb_init_interrupt_scheme(), so this is safe to do. Signed-off-by: Stefan Assmann <sassmann@kpanic.de> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
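A condensed sketch of the fix; the surrounding igb_sw_init() logic is elided, and the two-argument igb_init_interrupt_scheme() call is an assumption:

```c
/* Sketch: advertise MSI-X capability before probing VFs so that
 * igb_enable_sriov() sees IGB_FLAG_HAS_MSIX; the real capabilities
 * are re-validated later in igb_init_interrupt_scheme(). */
static int igb_sw_init_sketch(struct igb_adapter *adapter)
{
	adapter->flags |= IGB_FLAG_HAS_MSIX;	/* set before probing VFs */

	igb_probe_vfs(adapter);

	igb_init_queue_configuration(adapter);
	if (igb_init_interrupt_scheme(adapter, true))
		return -ENOMEM;

	return 0;
}
```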
-
Committed by Anjali Singhai
The sync_vsi_filters function can be called directly under RTNL or through the timer subtask without it. This was causing a deadlock: if sync_vsi_filters was called from a thread which held the lock, and in another thread the PROMISC setting got changed, we would be executing the PROMISC change in the thread which already held the lock, alongside the other filter update. The PROMISC change requires a reset if we are on a VEB, which requires it to be called under RTNL. Earlier the driver would call reset for the PROMISC change without checking whether we were already under RTNL, and would try to grab it, causing a deadlock. This patch changes the flow to see if we are already under RTNL before trying to grab it. Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com> Signed-off-by: Kiran Patil <kiran.patil@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
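A sketch of the idea; rtnl_is_locked() only reports that *someone* holds the lock, so the caller must pass this knowledge down. All names here are hypothetical, not the literal i40e code:

```c
#include <linux/rtnetlink.h>

/* Sketch: callers that already hold RTNL say so, and the function
 * only takes the lock when it is not already held by this thread. */
static void sync_filters_sketch(struct i40e_vsi *vsi, bool rtnl_held)
{
	if (!rtnl_held)
		rtnl_lock();

	/* ... apply filter/PROMISC changes; a VEB reset triggered
	 * here requires RTNL to be held ... */

	if (!rtnl_held)
		rtnl_unlock();
}
```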
-
Committed by Anjali Singhai
This patch fixes the issue of forcing WB too often, causing us to not benefit from NAPI. Without this patch we were forcing WB/arming the interrupt too often, taking away the benefits of NAPI and causing a performance impact. With this patch we disable force WB in the clean routine for X710 and XL710 adapters. X722 adapters do not enable an interrupt to force a WB, and benefit from WB_ON_ITR, hence force WB is left enabled for those adapters. For XL710 and X710 adapters, if we have fewer than 4 packets pending, a software interrupt triggered from the service task will force a WB. This patch also changes the conditions for setting the RS bit, as described in code comments. This optimizes when the HW does a tail bump and when it does a WB, and also optimizes when we do a wmb. Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Committed by Shannon Nelson
Make sure the Tx checksum encoder knows about the GRE protocol and sets the descriptor flag appropriately. Signed-off-by: Shannon Nelson <shannon.nelson@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Committed by Kiran Patil
This patch modifies the driver timeout logic by issuing a writeback request via a software interrupt to the hardware the first time the driver detects a hang. The driver was too aggressive in resetting a hung queue, so back that off by removing the logic to down the netdevice after too many hangs, and move the function to the service task. Change-ID: Ife100b9d124cd08cbdb81ab659008c1b9abbedea Signed-off-by: Kiran Patil <kiran.patil@intel.com> Signed-off-by: Shannon Nelson <shannon.nelson@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Committed by Kiran Patil
i40e_get_head needs to be called in multiple files in a subsequent patch; prepare by moving the function into a header file. Signed-off-by: Kiran Patil <kiran.patil@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
- 28 September 2015, 1 commit
-
-
Committed by Ian Wilson
Allow bridge forward delay to be configured when Spanning Tree is enabled. Signed-off-by: Ian Wilson <iwilson@brocade.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 27 September 2015, 4 commits
-
-
Committed by David S. Miller
Jiri Benc says:
====================
vxlan: support both IPv4 and IPv6 sockets
Note: this needs net merged into net-next in order to apply. It's currently not easy enough to work with metadata-based vxlan tunnels. In particular, it's necessary to create separate network interfaces for IPv4 and IPv6 tunneling. Assigning an IPv6 address to an IPv4 interface is allowed yet won't do what's expected. With route-based tunneling, one has to pay attention to use the vxlan interface opened with the correct family. Other users of this (openvswitch) would need to always create two vxlan interfaces. Furthermore, there's no sane API for creating an IPv6 vxlan metadata-based interface. This patchset simplifies this by opening both an IPv4 and an IPv6 socket if the vxlan interface has the metadata flag (IFLA_VXLAN_COLLECT_METADATA) set. Assignment of addresses etc. works as expected after this.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Jiri Benc
For metadata-based vxlan interfaces, open both an IPv4 and an IPv6 socket. This is much more user-friendly: it's not necessary to create two vxlan interfaces and pay attention to using the right one in routing rules. Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
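A sketch of the metadata-mode setup, assuming a per-family helper __vxlan_sock_add(vxlan, ipv6) and the VXLAN_F_COLLECT_METADATA flag; error handling is condensed:

```c
/* Sketch: open one UDP socket per address family so a single vxlan
 * device serves both IPv4 and IPv6 underlays in metadata mode. */
static int vxlan_open_sockets_sketch(struct vxlan_dev *vxlan)
{
	int ret = __vxlan_sock_add(vxlan, false);	/* IPv4 */

	if (ret)
		return ret;

	if (IS_ENABLED(CONFIG_IPV6) &&
	    (vxlan->flags & VXLAN_F_COLLECT_METADATA)) {
		ret = __vxlan_sock_add(vxlan, true);	/* IPv6 */
		if (ret)
			vxlan_sock_release(vxlan);	/* drop the v4 one */
	}
	return ret;
}
```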
-
Committed by Jiri Benc
Make vxlan_sock_add both allocate the socket and attach it to the vxlan_dev. Let vxlan_sock_release accept vxlan_dev as its parameter instead of vxlan_sock. This makes vxlan_sock_add and vxlan_sock_release complementary, and reduces code duplication in the next patch. Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by David Woodhouse
When fixing the TSO support I noticed we just mask ->gso_size with the MSSMask value and don't care about the consequences. Provide a .ndo_features_check() method which drops the NETIF_F_TSO feature for any skb which would exceed the maximum, and thus forces it to be segmented by software. Then we can stop the masking in cp_start_xmit() and just WARN if the maximum is exceeded, which should now never happen. Finally, Francois Romieu noticed that we didn't even have the right value for MSSMask anyway; it should be 0x7ff (11 bits), not 0xfff. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
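A sketch of the .ndo_features_check() approach described above; the function name and the CP_MSS_MAX constant are hypothetical stand-ins:

```c
#include <linux/netdevice.h>
#include <linux/skbuff.h>

#define CP_MSS_MAX	0x7ff	/* 11-bit MSS field, per the fix above */

/* Sketch: if a GSO skb's MSS exceeds what the hardware descriptor can
 * encode, drop NETIF_F_TSO for that skb so the networking core
 * segments it in software instead of the NIC. */
static netdev_features_t cp_features_check_sketch(struct sk_buff *skb,
						  struct net_device *dev,
						  netdev_features_t features)
{
	if (skb_is_gso(skb) && skb_shinfo(skb)->gso_size > CP_MSS_MAX)
		features &= ~NETIF_F_TSO;

	return features;
}
```

Wired up via the .ndo_features_check member of struct net_device_ops, this runs per packet at transmit time, which is what makes the WARN in cp_start_xmit() unreachable in normal operation.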
-