- 14 1月, 2015 1 次提交
-
-
由 Jiri Pirko 提交于
The same macros are used for rx as well. So rename it. Signed-off-by: NJiri Pirko <jiri@resnulli.us> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 23 12月, 2014 1 次提交
-
-
由 Amir Vadai 提交于
iowrite32() will byteswap it's argument on big endian archs. iowrite32be() will byteswap on little endian archs. Since we don't want to do this unnecessary byteswap on the fast path, doorbell is stored in the NIC's native endianness. Using the right iowrite() according to the arch endianness. CC: Wei Yang <weiyang@linux.vnet.ibm.com> CC: David Laight <david.laight@aculab.com> Fixes: 6a4e8121 ("net/mlx4_en: Avoid calling bswap in tx fast path") Signed-off-by: NAmir Vadai <amirv@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 12 12月, 2014 1 次提交
-
-
由 Eugenia Emantayev 提交于
When using BF (Blue-Flame), the QPN overrides the VLAN, CV, and SV fields in the WQE. Thus, BF may only be used for QPNs with bits 6,7 unset. The current Ethernet driver code reserves a Tx QP range with 256b alignment. This is wrong because if there are more than 64 Tx QPs in use, QPNs >= base + 65 will have bits 6/7 set. This problem is not specific for the Ethernet driver, any entity that tries to reserve more than 64 BF-enabled QPs should fail. Also, using ranges is not necessary here and is wasteful. The new mechanism introduced here will support reservation for "Eth QPs eligible for BF" for all drivers: bare-metal, multi-PF, and VFs (when hypervisors support WC in VMs). The flow we use is: 1. In mlx4_en, allocate Tx QPs one by one instead of a range allocation, and request "BF enabled QPs" if BF is supported for the function 2. In the ALLOC_RES FW command, change param1 to: a. param1[23:0] - number of QPs b. param1[31-24] - flags controlling QPs reservation Bit 31 refers to Eth blueflame supported QPs. Those QPs must have bits 6 and 7 unset in order to be used in Ethernet. Bits 24-30 of the flags are currently reserved. When a function tries to allocate a QP, it states the required attributes for this QP. Those attributes are considered "best-effort". If an attribute, such as Ethernet BF enabled QP, is a must-have attribute, the function has to check that attribute is supported before trying to do the allocation. In a lower layer of the code, mlx4_qp_reserve_range masks out the bits which are unsupported. If SRIOV is used, the PF validates those attributes and masks out unsupported attributes as well. In order to notify VFs which attributes are supported, the VF uses QUERY_FUNC_CAP command. This command's mailbox is filled by the PF, which notifies which QP allocation attributes it supports. Signed-off-by: NEugenia Emantayev <eugenia@mellanox.co.il> Signed-off-by: NMatan Barak <matanb@mellanox.com> Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 31 10月, 2014 2 次提交
-
-
由 Or Gerlitz 提交于
For VXLAN/NVGRE encapsulation, the current HW doesn't support offloading both the outer UDP TX checksum and the inner TCP/UDP TX checksum. The driver doesn't advertize SKB_GSO_UDP_TUNNEL_CSUM, however we are wrongly telling the HW to offload the outer UDP checksum for encapsulated packets, fix that. Fixes: 837052d0 ('net/mlx4_en: Add netdev support for TCP/IP offloads of vxlan tunneling') Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
mlx4_en_rx_irq() and mlx4_en_tx_irq() run from hard interrupt context. They can use napi_schedule_irqoff() instead of napi_schedule() Signed-off-by: NEric Dumazet <edumazet@google.com> Acked-By: NAmir Vadai <amirv@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 09 10月, 2014 1 次提交
-
-
由 Eric Dumazet 提交于
Add two helpers so that drivers do not have to care of BQL being available or not. Signed-off-by: NEric Dumazet <edumazet@google.com> Reported-by: NJim Davis <jim.epost@gmail.com> Fixes: 29d40c90 ("net/mlx4_en: Use prefetch in tx path") Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 08 10月, 2014 1 次提交
-
-
由 Eric Dumazet 提交于
Drivers should avoid NETDEV_TX_BUSY as much as possible. They should stop the tx queue before qdisc even tries to push another packet, to avoid requeues. For a driver supporting skb->xmit_more, this is likely to be a prereq anyway, otherwise we could have a tx deadlock : We need to force a doorbell if TX ring is full. Signed-off-by: NEric Dumazet <edumazet@google.com> Acked-by: NAmir Vadai <amirv@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 06 10月, 2014 12 次提交
-
-
由 Eric Dumazet 提交于
Instead of setting inline threshold using module parameter only on driver load, use set_tunable() to set it dynamically. No need to store the threshold per ring, using instead the netdev global priv->prof->inline_thold Initial value still is set using the module parameter, therefore backward compatability is kept. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NAmir Vadai <amirv@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Reorganize code to call is_inline() once, so compiler can inline it Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NAmir Vadai <amirv@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Properly clear tx_info->ts_requested Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NAmir Vadai <amirv@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Access skb_headlen() once in tx flow Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NAmir Vadai <amirv@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Acces skb_shinfo(skb) once in tx flow. Also, rename @i variable to @i_frag to avoid confusion, as the "goto tx_drop_unmap;" relied on this @i variable. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NAmir Vadai <amirv@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NAmir Vadai <amirv@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
mlx4_en_process_tx_cq() carefully fetches and writes ring->last_nr_txbb and ring->cons only one time to avoid false sharing Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NAmir Vadai <amirv@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
mlx4_en_free_tx_desc() uses a prefetchw(&skb->users) to speed up consume_skb() prefetchw(&ring->tx_queue->dql) to speed up BQL update Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NAmir Vadai <amirv@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Add frag0_dma/frag0_byte_count into mlx4_en_tx_info to avoid a cache line miss in TX completion for frames having one dma element. (We avoid reading back the tx descriptor) Note this could be extended to 2/3 dma elements later, as we have free room in mlx4_en_tx_info Also, mlx4_en_free_tx_desc() no longer accesses skb_shinfo(). We use a new nr_maps fields in mlx4_en_tx_info to avoid 2 or 3 cache misses. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NAmir Vadai <amirv@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Try to allocate using kmalloc_node() first, only on failure use vmalloc() Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NAmir Vadai <amirv@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
- doorbell_qpn is stored in the cpu_to_be32() way to avoid bswap() in fast path. - mdev->mr.key stored in ring->mr_key to also avoid bswap() and access to cold cache line. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NAmir Vadai <amirv@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
- Remove unused variable ring->poll_cnt - No need to set some fields if using blueflame - Add missing const's - Use unlikely - Remove unneeded new line - Make some comments more precise - struct mlx4_bf @offset field reduced to unsigned int to save space Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NAmir Vadai <amirv@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 05 10月, 2014 1 次提交
-
-
由 Eric Dumazet 提交于
ethtool -S reports a new counter, tracking number of time doorbell was not triggered, because skb->xmit_more was set. $ ethtool -S eth0 | egrep "tx_packet|xmit_more" tx_packets: 2413288400 xmit_more: 666121277 I merged the tso_packet false sharing avoidance in this patch as well. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 29 9月, 2014 1 次提交
-
-
由 Eric Dumazet 提交于
skb->xmit_more tells us if another skb is coming next. We need to send doorbell when : xmit_more is not set, or txqueue is stopped (preventing next skb to come immediately) Tested with a modified pktgen version, I got a 40% increase of throughput. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 20 9月, 2014 1 次提交
-
-
由 Ido Shamay 提交于
This function derives the base address of the CQE from the CQE size, and calculates the real CQE context segment in it from the factor (this is like before). Before this change the code used the factor to calculate the base address of the CQE as well. The factor indicates in which segment of the cqe stride the cqe information is located. For 32-byte strides, the segment is 0, and for 64 byte strides, the segment is 1 (bytes 32..63). Using the factor was ok as long as we had only 32 and 64 byte strides. However, with larger strides, the factor is zero, and so cannot be used to calculate the base of the CQE. The helper uses the same method of CQE buffer pulling made by all other components that reads the CQE buffer (mlx4_ib driver and libmlx4). Signed-off-by: NIdo Shamay <idos@mellanox.com> Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 04 9月, 2014 1 次提交
-
-
由 Rick Jones 提交于
It would appear the mlx4_en driver was still making a call to dev_kfree_skb_any() where dev_consume_skb_any() would be more appropriate. This should make dropped packet profiling/tracking easier/better over a NIC driven by mlx4_en. Signed-off-by: NRick Jones <rick.jones2@hp.com> Acked-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 23 7月, 2014 1 次提交
-
-
由 Amir Vadai 提交于
Enable the user to turn off the hardware feature called BlueFlame. Since it is something specific to mlx4_en hardware, we control the feature via ethtool private flags. Signed-off-by: NAmir Vadai <amirv@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 09 7月, 2014 1 次提交
-
-
由 Amir Vadai 提交于
It is recommended that TX work not count against the quota. The cost of TX packet liberation is a minute percentage of what it costs to process an RX frame. Furthermore, that SKB freeing makes memory available for other paths in the stack. Give the TX a larger budget and be more aggressive about cleaning up the Tx descriptors this budget could be changed using ethtool: $ ethtool -C eth1 tx-frames-irq <budget> Signed-off-by: NAmir Vadai <amirv@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 03 7月, 2014 1 次提交
-
-
由 Amir Vadai 提交于
IRQ affinity notifier can only have a single notifier - cpu_rmap notifier. Can't use it to track changes in IRQ affinity map. Detect IRQ affinity changes by comparing CPU to current IRQ affinity map during NAPI poll thread. CC: Thomas Gleixner <tglx@linutronix.de> CC: Ben Hutchings <ben@decadent.org.uk> Fixes: 2eacc23c ("net/mlx4_core: Enforce irq affinity changes immediatly") Signed-off-by: NAmir Vadai <amirv@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 03 6月, 2014 1 次提交
-
-
由 Jiri Kosina 提交于
Modify the various routines used to allocate memory resources which serve QPs in mlx4 to get an input GFP directive. Have the Ethernet driver to use GFP_KERNEL in it's QP allocations as done prior to this commit, and the IB driver to use GFP_NOIO when the IB verbs IB_QP_CREATE_USE_GFP_NOIO QP creation flag is provided. Signed-off-by: NMel Gorman <mgorman@suse.de> Signed-off-by: NJiri Kosina <jkosina@suse.cz> Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
- 15 5月, 2014 1 次提交
-
-
由 Yuval Atias 提交于
During heavy traffic, napi is constatntly polling the complition queue and no interrupt is fired. Because of that, changes to irq affinity are ignored until traffic is stopped and resumed. By registering to the irq notifier mechanism, and forcing interrupt when affinity is changed, irq affinity changes will be immediatly enforced. Signed-off-by: NYuval Atias <yuvala@mellanox.com> Signed-off-by: NAmir Vadai <amirv@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 09 5月, 2014 1 次提交
-
-
由 Joe Perches 提交于
Use a more current logging style. o Coalesce formats o Add missing spaces for coalesced formats o Align arguments for modified formats o Add missing newlines for some logging messages o Use DRV_NAME as part of format instead of %s, DRV_NAME to reduce overall text. o Use ..., ##__VA_ARGS__ instead of args... in macros o Correct a few format typos o Use a single line message where appropriate Signed-off-by: NJoe Perches <joe@perches.com> Acked-By: NAmir Vadai <amirv@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 13 3月, 2014 1 次提交
-
-
由 Eric W. Biederman 提交于
Replace dev_kfree_skb with dev_kfree_skb_any in functions that can be called in hard irq and other contexts. Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 03 3月, 2014 4 次提交
-
-
由 Amir Vadai 提交于
When BlueFlame is turned on, control segment of the TX WQE is changed, and the second line of it is used for QPN. Changed code to use a union in the mlx4_wqe_ctrl_seg instead of casting. This makes the code clearer and solves the static checker warning: drivers/net/ethernet/mellanox/mlx4/en_tx.c:839 mlx4_en_xmit() warn: potential memory corrupting cast 4 vs 2 bytes CC: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: NAmir Vadai <amirv@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eugenia Emantayev 提交于
Give accurate counters and avoids cache misses when several rings update the counters of stop/wake queue. Signed-off-by: NEugenia Emantayev <eugenia@mellanox.com> Signed-off-by: NAmir Vadai <amirv@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eugenia Emantayev 提交于
Hardware can't accept packets smaller than 17 bytes. Therefore need to pad with zeros. Signed-off-by: NAmir Vadai <amirv@mellanox.com> Signed-off-by: NEugenia Emantayev <eugenia@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eugenia Emantayev 提交于
Verify mlx4_en module parameters. In case they are out of range - reset to default values. Signed-off-by: NEugenia Emantayev <eugenia@mellanox.com> Signed-off-by: NAmir Vadai <amirv@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 17 2月, 2014 1 次提交
-
-
由 Daniel Borkmann 提交于
Add a new argument for ndo_select_queue() callback that passes a fallback handler. This gets invoked through netdev_pick_tx(); fallback handler is currently __netdev_pick_tx() as most drivers invoke this function within their customized implementation in case for skbs that don't need any special handling. This fallback handler can then be replaced on other call-sites with different queue selection methods (e.g. in packet sockets, pktgen etc). This also has the nice side-effect that __netdev_pick_tx() is then only invoked from netdev_pick_tx() and export of that function to modules can be undone. Suggested-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NDaniel Borkmann <dborkman@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 11 1月, 2014 1 次提交
-
-
由 Jason Wang 提交于
Currently, the tx queue were selected implicitly in ndo_dfwd_start_xmit(). The will cause several issues: - NETIF_F_LLTX were removed for macvlan, so txq lock were done for macvlan instead of lower device which misses the necessary txq synchronization for lower device such as txq stopping or frozen required by dev watchdog or control path. - dev_hard_start_xmit() was called with NULL txq which bypasses the net device watchdog. - dev_hard_start_xmit() does not check txq everywhere which will lead a crash when tso is disabled for lower device. Fix this by explicitly introducing a new param for .ndo_select_queue() for just selecting queues in the case of l2 forwarding offload. netdev_pick_tx() was also extended to accept this parameter and dev_queue_xmit_accel() was used to do l2 forwarding transmission. With this fixes, NETIF_F_LLTX could be preserved for macvlan and there's no need to check txq against NULL in dev_hard_start_xmit(). Also there's no need to keep a dedicated ndo_dfwd_start_xmit() and we can just reuse the code of dev_queue_xmit() to do the transmission. In the future, it was also required for macvtap l2 forwarding support since it provides a necessary synchronization method. Cc: John Fastabend <john.r.fastabend@intel.com> Cc: Neil Horman <nhorman@tuxdriver.com> Cc: e1000-devel@lists.sourceforge.net Signed-off-by: NJason Wang <jasowang@redhat.com> Acked-by: NNeil Horman <nhorman@tuxdriver.com> Acked-by: NJohn Fastabend <john.r.fastabend@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 01 1月, 2014 1 次提交
-
-
由 Or Gerlitz 提交于
When the device tunneling offloads mode is vxlan do the following - call SET_PORT with the relevant setting - add DMFS steering vxlan rule for the device self and multicast mac addresses of the form: {<ETH, outer-mac> <VXLAN, ANY vnid> <ETH, ANY mac>} --> RSS QP - set relevant QPC fields in RSS context and RX ring QPs - in TX flow, set WQE fields to generate HW checksum, and handle gso skbs which are marked for encapsulation such that the HW will segment them properly. - in RX flow, read HW offloaded checksum for encapsulated packets from the CQE - advertize hw_enc_features and NETIF_F_GSO_UDP_TUNNEL to the networking stack Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 20 12月, 2013 2 次提交
-
-
由 Eugenia Emantayev 提交于
Add NAPI for TX side, implement its support and provide NAPI callback. Signed-off-by: NYevgeny Petrilin <yevgenyp@mellanox.com> Signed-off-by: NEugenia Emantayev <eugenia@mellanox.com> Signed-off-by: NAmir Vadai <amirv@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ido Shamay 提交于
Only TX rings of User Piority 0 are mapped. TX rings of other UP's are using UP 0 mapping. XPS is not in use when num_tc is set. Signed-off-by: NIdo Shamay <idos@mellanox.com> Signed-off-by: NAmir Vadai <amirv@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 08 11月, 2013 1 次提交
-
-
由 Eugenia Emantayev 提交于
For each RX/TX ring and its CQ, allocation is done on a NUMA node that corresponds to the core that the data structure should operate on. The assumption is that the core number is reflected by the ring index. The affected allocations are the ring/CQ data structures, the TX/RX info and the shared HW/SW buffer. For TX rings, each core has rings of all UPs. Signed-off-by: NYevgeny Petrilin <yevgenyp@mellanox.com> Signed-off-by: NEugenia Emantayev <eugenia@mellanox.com> Reviewed-by: NHadar Hen Zion <hadarh@mellanox.com> Signed-off-by: NAmir Vadai <amirv@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-