- 20 Nov, 2012 1 commit
-
-
Committed by Ben Hutchings
Commit fa37a958 ('mlx4_en: Moving to work with GRO') left behind the Kconfig depends/select, some dead code and comments referring to LRO. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Acked-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 08 Nov, 2012 1 commit
-
-
Committed by Eric Dumazet
mlx4 currently uses a too high tx coalescing setting, deferring TX completion interrupts by up to 128 us. With the recent skb_orphan() removal in commit 8112ec3b, performance of a single TCP flow is capped to ~4 Gbps, unless we increase tcp_limit_output_bytes. I suggest using 16 us instead of 128 us, allowing a finer control. Performance of a single TCP flow is restored to previous levels, while keeping TCP small queues fully enabled with default sysctl. This patch is also a BQL prereq. Reported-by: Vimalkumar <j.vimal@gmail.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Yevgeny Petrilin <yevgenyp@mellanox.com> Cc: Or Gerlitz <ogerlitz@mellanox.com> Acked-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 04 Aug, 2012 1 commit
-
-
Committed by Yevgeny Petrilin
Removing the ring->blocked flag; it is redundant and leads to a race: we close the TX queue and then set the "blocked" flag. Between those 2 operations the completion function can check the "blocked" flag, see that it is 0, and not open the TX queue. Using netif_tx_queue_stopped to check the state of the queue to avoid this race. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 20 Jul, 2012 1 commit
-
-
In its receive path, the mlx4_en driver maps each page chunk that it pushes to the hardware and unmaps it when pushing it up the stack. This limits throughput to about 3Gbps on a Power7 8-core machine. One solution is to map the entire allocated page at once. However, this requires that we keep track of every page fragment we give to a descriptor. We also need to work with the discipline that all fragments will be released (in the sense that they will not be reused by the driver anymore) in the order they are allocated to the driver. This requires that we don't reuse any fragments; every single one of them must be reallocated. We do that by releasing all the fragments that are processed, and only after we have finished processing the descriptors do we start the refill. We also must somehow guarantee that we either refill all fragments in a descriptor or none at all, without resorting to giving up a page fragment that we would have already given. Otherwise, we would break the discipline of only releasing the fragments in the order they were allocated. This has passed page allocation fault injections (restricted to the driver by using required-start and required-end) and device hotplug while 16 TCP streams were able to deliver more than 9Gbps. Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 19 Jul, 2012 1 commit
-
-
Committed by Amir Vadai
Use RFS infrastructure and flow steering in HW to keep CPU affinity of rx interrupts and application per TCP stream. A flow steering filter is added to the HW whenever the RFS ndo callback is invoked by core networking code. Because the invocation takes place in interrupt context, the actual setup of HW is done using a workqueue. Whenever a new filter is added, the driver checks for expiry of existing filters. Since there is a window in time between the point where the core RFS code invokes the ndo callback and the point where the HW is configured from the workqueue context, the 2nd, 3rd etc. packets from that stream will cause the net core to invoke the callback again and again. To prevent inefficient/double configuration of the HW, the filters are kept in a database which is indexed using a hash function to enable fast access. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
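The deduplication described in this message can be modelled in a few lines. This is only an illustrative Python sketch, not driver code: `FilterTable`, `steer` and `hw_programs` are hypothetical names, and a Python dict stands in for the hash-indexed filter database.

```python
from dataclasses import dataclass, field


@dataclass
class FilterTable:
    """Hypothetical stand-in for the driver's hash-indexed filter database."""
    filters: dict = field(default_factory=dict)  # flow tuple -> RX queue
    hw_programs: int = 0                         # how often HW setup was queued

    def steer(self, flow, rxq):
        """Model one RFS callback; return True if HW setup had to be queued."""
        if self.filters.get(flow) == rxq:
            # Same flow, same queue: a filter is already configured or pending,
            # so repeated callbacks for later packets do no extra work.
            return False
        self.filters[flow] = rxq
        self.hw_programs += 1
        return True


table = FilterTable()
flow = ("10.0.0.1", "10.0.0.2", 1234, 80)  # made-up src, dst, sport, dport
first = table.steer(flow, 3)
repeat = table.steer(flow, 3)
```

In this model the second call for the same flow and queue is answered from the table, which mirrors the goal of avoiding double configuration while the workqueue item is still pending.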
-
- 08 Jul, 2012 4 commits
-
-
Committed by Hadar Hen Zion
The drop action is implemented by allocating a QP and keeping it in a reset state such that the HW drops any packets which are steered to that QP. When a drop action is requested, we attach the relevant flow to that QP. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Hadar Hen Zion
Implement the ethtool APIs for attaching L2/L3/L4 based flow steering rules to the netdevice RX rings. Added set_rxnfc callback and enhanced the existing get_rxnfc callback. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.co.il> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Hadar Hen Zion
The driver is modified to support three operation modes. If supported by firmware, use the device managed flow steering API, which we call device managed steering mode. Else, if the firmware supports the B0 steering mode, use it, and finally, if none of the above, use the A0 steering mode. When the steering mode is device managed, the code is modified such that L2 based rules set by the mlx4_en driver for Ethernet unicast and multicast, and the IB stack multicast attach calls done through the mlx4_ib driver, are all routed to use the device managed API. When attaching a rule using the device managed flow steering API, the firmware returns a 64 bit registration id, which is to be provided during detach. Currently the firmware is always programmed during HCA initialization to use standard L2 hashing. Future work should be done to allow configuring the flow-steering hash function with common, non proprietary means. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
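The fallback order in this message amounts to a short priority chain. A minimal Python sketch, assuming two boolean firmware capability flags; `choose_steering_mode` and its parameter names are hypothetical and do not come from the driver.

```python
def choose_steering_mode(fw_device_managed, fw_b0):
    """Pick the steering mode in the priority order the message describes."""
    if fw_device_managed:
        return "device_managed"  # preferred when the firmware offers it
    if fw_b0:
        return "B0"              # next best
    return "A0"                  # fallback when neither is supported
```

Note that device managed mode wins even when B0 is also available, which is how the message orders the three cases.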
-
Committed by Yevgeny Petrilin
Currently, for every change in the net device multicast list, the driver detaches all the addresses from the HW device, and then attaches the updated list. This behavior is wrong in two respects: first, it causes a load of firmware commands, and second, there is a period of time where the correct addresses are not attached, which results in packet loss. To improve - a copy of the multicast list is saved by the driver. For every change in the multicast list, the multicast list copy is used to find the delta between those two lists and add or remove multicast addresses as needed. Reported-by: Shawn Bohrer <sbohrer@rgmadvisors.com> Cc: Shawn Bohrer <sbohrer@rgmadvisors.com> Signed-off-by: Hadar Hen Zion <hadarh@mellanox.co.il> Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
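The delta computation described here is a pair of set differences. The Python sketch below is illustrative only; `multicast_delta` is a hypothetical name and the addresses are made-up examples, whereas the driver works on its own list structures.

```python
def multicast_delta(saved, updated):
    """Return (to_add, to_remove) between the saved copy and the new list."""
    saved_set, updated_set = set(saved), set(updated)
    to_add = sorted(updated_set - saved_set)     # attach only new addresses
    to_remove = sorted(saved_set - updated_set)  # detach only stale addresses
    return to_add, to_remove


saved = ["01:00:5e:00:00:01", "01:00:5e:00:00:fb"]
updated = ["01:00:5e:00:00:01", "01:00:5e:7f:ff:fa"]
to_add, to_remove = multicast_delta(saved, updated)
```

Addresses present in both lists are never touched, so in this model they stay attached throughout the update.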
-
- 26 Jun, 2012 1 commit
-
-
Committed by Yevgeny Petrilin
Add a missing resource release in ring cleanup. Not doing this leaves a range of QPs reserved, so no one can use them. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 18 May, 2012 1 commit
-
-
Committed by Amir Vadai
Change the TX ring scheme such that the number of rings for untagged packets and for tagged packets (per each of the vlan priorities) is the same, unlike the current situation where for tagged traffic there is one ring per priority and for untagged traffic there are as many rings as cores. Queue selection is done as follows: if the mqprio qdisc operates on the interface, such that the core networking code invoked the device setup_tc ndo callback, a mapping of skb->priority => queue set is forced, for both tagged and untagged traffic. Else, the egress map skb->priority => User priority is used for tagged traffic, and all untagged traffic is sent through tx rings of UP 0. The patch follows the convergence of discussing that issue with John Fastabend over this thread http://comments.gmane.org/gmane.linux.network/229877 Cc: John Fastabend <john.r.fastabend@intel.com> Cc: Liran Liss <liranl@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 24 Apr, 2012 2 commits
-
-
Committed by Yevgeny Petrilin
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Yevgeny Petrilin
Moving to interrupts instead of polling for TX completions. Avoiding situations where an skb can be held by the driver for a long time (till the timer expires). The change is also necessary for supporting BQL. Removing comp_lock, which was required because we could handle TX completions from several contexts: interrupts, timer, polling. Now there are only interrupts. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 05 Apr, 2012 4 commits
-
-
Committed by Amir Vadai
This patch uses the DCB netlink to set a rate limit per ETS TC. Values are accepted in Kbps and rounded up to the nearest multiple of 100Mbps. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
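The rounding rule in this message is a ceiling division. A minimal Python sketch, assuming 1 Mbps = 1000 Kbps so that one step is 100000 Kbps; `round_rate_kbps` and `KBPS_PER_STEP` are hypothetical names, and how the driver treats special values is not modelled here.

```python
KBPS_PER_STEP = 100_000  # 100 Mbps expressed in Kbps (assumes 1 Mbps = 1000 Kbps)


def round_rate_kbps(rate_kbps):
    """Round a rate given in Kbps up to the next multiple of 100 Mbps."""
    steps = -(-rate_kbps // KBPS_PER_STEP)  # ceiling division on integers
    return steps * KBPS_PER_STEP
```

For example, a request of 150000 Kbps lands on 200000 Kbps, while a request that is already a multiple of 100 Mbps is returned unchanged.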
-
Committed by Amir Vadai
Set TSA, promised BW and PFC using IEEE 802.1qaz netlink commands. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Amir Vadai
Instead of relying on HW to change schedule queue by UP, schedule queue is fixed for a tx_ring, and UP in WQE is ignored in this aspect. This resolves two issues with untagged traffic: 1. untagged traffic has no UP in the packet, which is needed for QoS. The change above allows setting the schedule queue (and by that the UP) of such a stream. 2. BlueFlame uses the same field used by the vlan tag. So forcing UP from QPC allows using BF for untagged but prioritized traffic. In old firmware where force UP is not supported, untagged traffic will not be subject to QoS. Because UP is set by QP, we need to always have a tx ring per UP, even if the pfcrx module parameter is false. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
The driver uses an order-2 allocation, which is too much on architectures like ppc64, which has a 64KiB page. This particular allocation is used for large packet fragments that may have a size of 512, 1024, 4096 or fill the whole allocation. So, a minimum size of 16384 is good enough and will be the same size that is used in architectures of 4KiB sized pages. This will avoid allocation failures that we see when the system is under stress, but still has plenty of memory, like the one below. This will also allow us to set the interface MTU to higher values like 9000, which was not possible on ppc64 without this patch.
Node 1 DMA: 737*64kB 37*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB 0*8192kB 0*16384kB = 51904kB
83137 total pagecache pages
0 pages in swap cache
Swap cache stats: add 0, delete 0, find 0/0
Free swap = 10420096kB
Total swap = 10420096kB
107776 pages RAM
1184 pages reserved
147343 pages shared
28152 pages non-shared
netstat: page allocation failure. order:2, mode:0x4020
Call Trace:
[c0000001a4fa3770] [c000000000012f04] .show_stack+0x74/0x1c0 (unreliable)
[c0000001a4fa3820] [c00000000016af38] .__alloc_pages_nodemask+0x618/0x930
[c0000001a4fa39a0] [c0000000001a71a0] .alloc_pages_current+0xb0/0x170
[c0000001a4fa3a40] [d00000000dcc3e00] .mlx4_en_alloc_frag+0x200/0x240 [mlx4_en]
[c0000001a4fa3b10] [d00000000dcc3f8c] .mlx4_en_complete_rx_desc+0x14c/0x250 [mlx4_en]
[c0000001a4fa3be0] [d00000000dcc4eec] .mlx4_en_process_rx_cq+0x62c/0x850 [mlx4_en]
[c0000001a4fa3d20] [d00000000dcc5150] .mlx4_en_poll_rx_cq+0x40/0x90 [mlx4_en]
[c0000001a4fa3dc0] [c0000000004e2bb8] .net_rx_action+0x178/0x450
[c0000001a4fa3eb0] [c00000000009c9b8] .__do_softirq+0x118/0x290
[c0000001a4fa3f90] [c000000000031df8] .call_do_softirq+0x14/0x24
[c000000184c3b520] [c00000000000e700] .do_softirq+0xf0/0x110
[c000000184c3b5c0] [c00000000009c6d4] .irq_exit+0xb4/0xc0
[c000000184c3b640] [c00000000000e964] .do_IRQ+0x144/0x230
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com> Signed-off-by: Kleber Sacilotto de Souza <klebers@linux.vnet.ibm.com> Tested-by: Kleber Sacilotto de Souza <klebers@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
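The size argument in this message can be checked with a little arithmetic. The Python sketch below computes the smallest page order that covers a minimum byte count, which is similar in spirit to what the kernel's `get_order()` does; `alloc_order` is a hypothetical helper, not the driver's code.

```python
def alloc_order(min_bytes, page_size):
    """Smallest order such that page_size * 2**order covers min_bytes."""
    order = 0
    while (page_size << order) < min_bytes:
        order += 1
    return order


MIN_FRAG_ALLOC = 16384  # the minimum size named in the message

order_4k = alloc_order(MIN_FRAG_ALLOC, 4096)    # order on 4KiB pages
order_64k = alloc_order(MIN_FRAG_ALLOC, 65536)  # order on 64KiB pages
old_ppc64_bytes = 65536 << 2                    # the previous fixed order 2
```

Under these assumptions a fixed order 2 asks for 262144 contiguous bytes on a 64KiB-page system, whereas sizing by the 16384-byte minimum needs only a single page there and still yields order 2 on 4KiB pages.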
-
- 07 Mar, 2012 3 commits
-
-
Committed by Yevgeny Petrilin
The SET_PORT functions are implemented in port.c, which is part of mlx4_core, and these functions are exported. The functions are in use by the mlx4_en module (they were originally part of mlx4_en). Their declarations remained in the mlx4_en module; moving the declarations to the right location. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Or Gerlitz
Fix sparse warnings on incompatibility between the endianness of the ctrl_flags field of struct mlx4_en_priv and the srcrb_flags field of struct mlx4_wqe_ctrl_seg by changing the former to be __be32 instead of u32. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Yevgeny Petrilin
Localized the pdev->dev, and using dma_map instead of pci_map. There are multiple map/unmap operations on the data path; optimizing those by saving redundant pointer access. Those places were identified as hot-spots when running kernel profiling during some benchmarks. The fixes had the most impact when testing packet rate with small packets, reducing CPU load by several percent, and in some cases being the difference between reaching wire speed and being CPU bound. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 07 Feb, 2012 1 commit
-
-
After opening the network interface, a Mellanox ConnectX device cannot be removed by hotplug because it has not properly unmapped all DMA memory. It happens that mlx4_en_activate_rx_rings overrides the variable that keeps the size of the memory mapped. This is fixed by passing to mlx4_en_destroy_rx_ring the same size that is given to mlx4_en_create_rx_ring. After applying this patch, hot unplugging the device works after opening the interface. Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 23 Jan, 2012 1 commit
-
-
Committed by Eugenia Emantayev
In native mode display all available statistics. In SRIOV mode on a VF display only SW counter statistics; in SRIOV mode on the hypervisor display SW counters and error (obtained from FW) statistics. Signed-off-by: Eugenia Emantayev <eugenia@mellanox.co.il> Reviewed-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 19 Jan, 2012 1 commit
-
-
Committed by Yevgeny Petrilin
Value must be a power of 2 due to HW limitation. Driver supports only 'equal' mode in ethtool and can't be set by using weights. Signed-off-by: Amir Vadai <amirv@mellanox.co.il> Reviewed-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
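The power-of-two constraint stated here can be validated with a single bit operation. A minimal Python sketch; `is_power_of_two` is a hypothetical helper name, and the listing does not say which value the constraint applies to, so the sketch checks a bare integer.

```python
def is_power_of_two(value):
    """True for 1, 2, 4, 8, ...; False for zero, negatives and everything else."""
    # A power of two has exactly one bit set, so clearing the lowest set bit
    # with value & (value - 1) must leave zero.
    return value > 0 and (value & (value - 1)) == 0
```

A caller could reject a requested value early with this check instead of passing it on to the hardware.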
-
- 14 Dec, 2011 1 commit
-
-
Committed by Yevgeny Petrilin
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 28 Nov, 2011 3 commits
-
-
Committed by Amir Vadai
The device must be in promiscuous mode or the DMAC must be the same as the host MAC, or else the packet will be dropped by the HW rx filtering. Signed-off-by: Amir Vadai <amirv@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Or Gerlitz
The MLX4_EN_WOL_DO_MODIFY flag, which is defined through an enum, targets bit 63. This triggers a "cast truncates bits from constant value (8000000000000000 becomes 0)" warning from sparse; fix that by using a define instead of an enum. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
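The numbers in the sparse warning can be reproduced with plain integer arithmetic. The Python sketch below only models the truncation, assuming the constant is narrowed to 32 bits; `truncate_to_32_bits` is a hypothetical helper and this is not how the driver defines the flag.

```python
MLX4_EN_WOL_DO_MODIFY = 1 << 63  # the flag targets bit 63, per the message


def truncate_to_32_bits(value):
    """Keep only the low 32 bits, as a narrowing cast to a 32-bit type would."""
    return value & 0xFFFFFFFF


as_hex = format(MLX4_EN_WOL_DO_MODIFY, "x")
truncated = truncate_to_32_bits(MLX4_EN_WOL_DO_MODIFY)
```

Bit 63 lies entirely above the low 32 bits, so nothing of the flag survives the narrowing, which matches the "becomes 0" part of the warning.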
-
Committed by Or Gerlitz
Towards adding RSS support for IB drivers/applications that use the mlx4 HW, make the RSS related definitions global and change the mlx4_en driver to use them. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Shlomo Pongratz <shlomop@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 15 Nov, 2011 1 commit
-
-
Committed by Yevgeny Petrilin
When HW doesn't remove FCS bytes, they are reported in the completion byte count; we don't need to take them to the skb. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
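The length adjustment described here is a fixed subtraction, since the Ethernet FCS is 4 bytes long. A minimal Python sketch; `payload_length` and its `hw_strips_fcs` parameter are hypothetical names chosen for illustration.

```python
ETH_FCS_LEN = 4  # the Ethernet frame check sequence is a 4-byte CRC


def payload_length(completion_byte_count, hw_strips_fcs):
    """Bytes to hand up the stack for one completed receive."""
    if hw_strips_fcs:
        return completion_byte_count            # count already excludes FCS
    return completion_byte_count - ETH_FCS_LEN  # drop the trailing FCS bytes
```

In this model a 68-byte completion on hardware that keeps the FCS yields a 64-byte packet.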
-
- 01 Nov, 2011 1 commit
-
-
Committed by Joe Perches
Standardize the style for compiler based printf format verification. Standardized the location of __printf too. Done via script and a little typing. $ grep -rPl --include=*.[ch] -w "__attribute__" * | \ grep -vP "^(tools|scripts|include/linux/compiler-gcc.h)" | \ xargs perl -n -i -e 'local $/; while (<>) { s/\b__attribute__\s*\(\s*\(\s*format\s*\(\s*printf\s*,\s*(.+)\s*,\s*(.+)\s*\)\s*\)\s*\)/__printf($1, $2)/g ; print; }' [akpm@linux-foundation.org: revert arch bits] Signed-off-by: Joe Perches <joe@perches.com> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 19 Oct, 2011 2 commits
-
-
Committed by Yevgeny Petrilin
Driver version updated to 1.5.4.2. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Yevgeny Petrilin
Not updating common counters from the data path. The checksum counters are per ring, and they are summed when collecting statistics. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
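The pattern in this message, per-ring counters that are only added up when statistics are read, can be sketched as follows. This Python model is illustrative; `Ring`, `csum_ok`, `csum_none` and `collect_checksum_stats` are hypothetical names used for the sketch.

```python
from dataclasses import dataclass


@dataclass
class Ring:
    """Hypothetical per-ring counters, touched only by that ring's data path."""
    csum_ok: int = 0
    csum_none: int = 0


def collect_checksum_stats(rings):
    """Sum the per-ring counters at collection time, off the data path."""
    return {
        "csum_ok": sum(r.csum_ok for r in rings),
        "csum_none": sum(r.csum_none for r in rings),
    }


rings = [Ring(csum_ok=10, csum_none=1), Ring(csum_ok=5, csum_none=0)]
totals = collect_checksum_stats(rings)
```

Because each ring writes only its own counters, the hot path never touches a shared total; the cost of summing is paid only when someone asks for statistics.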
-
- 10 Oct, 2011 3 commits
-
-
Committed by Alexander Guller
Moderation is now done per ring and coalescing is enabled by set_ring_param in ethtool. Signed-off-by: Alexander Guller <alexg@mellanox.co.il> Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Alexander Guller
Fixed a bug where a ring size change caused insufficient memory upon driver restart due to unreleased EQs. Signed-off-by: Alexander Guller <alexg@mellanox.co.il> Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Alexander Guller
Until now only RX rings used an irq per ring and TX used only one per port. From now on, both of them will use an irq per ring, while RX & TX ring[i] will use the same irq. Signed-off-by: Alexander Guller <alexg@mellanox.co.il> Signed-off-by: Sharon Cohen <sharonc@mellanox.co.il> Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 11 Aug, 2011 1 commit
-
-
Committed by Jeff Kirsher
Moves the Mellanox driver into drivers/net/ethernet/mellanox/ and makes the necessary Kconfig and Makefile changes. CC: Roland Dreier <roland@kernel.org> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
- 22 Jul, 2011 1 commit
-
-
Committed by Jiri Pirko
- unify vlan and nonvlan path - kill priv->vlgrp and mlx4_en_vlan_rx_register Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 16 Apr, 2011 1 commit
-
-
Committed by Michał Mirosław
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 24 Mar, 2011 3 commits
-
-
Committed by Yevgeny Petrilin
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Yevgeny Petrilin
Doorbell is used according to usage of BlueFlame. For BlueFlame to work in Ethernet mode, the QP number should have 0 at bits 6 and 7. Allocating a range of QPs accordingly. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
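The QP number constraint in this message is a bit-mask test. A minimal Python sketch; `BF_QPN_MASK` and `qpn_allows_blueflame` are hypothetical names, and the sketch checks only the bit condition stated in the message, not how the driver reserves its QP range.

```python
BF_QPN_MASK = (1 << 6) | (1 << 7)  # bits 6 and 7, i.e. 0xC0


def qpn_allows_blueflame(qpn):
    """True when bits 6 and 7 of the QP number are both zero."""
    return (qpn & BF_QPN_MASK) == 0


usable = [qpn for qpn in range(0x100, 0x200) if qpn_allows_blueflame(qpn)]
```

Within any aligned block of 256 QP numbers, only the first 64 pass this test, which suggests why a range has to be allocated with the constraint in mind.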
-
Committed by Yevgeny Petrilin
The mlx4_en module now uses the new steering mechanism. The RX packets are now steered through the MCG table instead of the MAC table for unicast and the default entry for multicast. The feature is enabled through INIT_HCA. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
-