提交 · 80cb0021163cb55b14c7c054073f89d63a2e1e40 · openeuler / raspberrypi-kernel

08 3月, 2013 3 次提交

net/mlx4_en: Disable RFS when running in SRIOV mode · a229e488

由 Amir Vadai 提交于 3月 07, 2013

Commit 37706996 "mlx4_en: fix allocation of CPU affinity reverse-map" fixed
a bug when mlx4_dev->caps.comp_pool is larger from the device rx rings, but
introduced a regression.

When the mlx4_core is activating its "legacy mode" (e.g when running in SRIOV
mode) w.r.t to EQs/IRQs usage, comp_pool becomes zero and we're crashing on
divide by zero alloc_cpu_rmap.

Fix that by enabling RFS only when running in non-legacy mode.
Reported-by: NYan Burman <yanb@mellanox.com>
Cc: Kleber Sacilotto de Souza <klebers@linux.vnet.ibm.com>
Signed-off-by: NAmir Vadai <amirv@mellanox.com>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a229e488

net/mlx4_en: Cleanup MAC resources on module unload or port stop · 83a5a6ce

由 Yan Burman 提交于 3月 07, 2013

Make sure we cleanup all MAC related resources (entries in the port MAC
table and steering rules) when stopping a port or when the driver is unloaded.

The leak was introduced by commit 07cb4b0a "net/mlx4_en: Manage hash of MAC
addresses per port".
Signed-off-by: NYan Burman <yanb@mellanox.com>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

83a5a6ce

net/mlx4_en: Fix race when setting the device MAC address · bfa8ab47

由 Yan Burman 提交于 3月 07, 2013

Remove unnecessary use of workqueue for the device MAC address setting
flow, and fix a race when setting MAC address which was introduced by
commit c07cb4b0 "net/mlx4_en: Manage hash of MAC addresses per port"

The race happened when mlx4_en_replace_mac was being executed in parallel
with a successive call to ndo_set_mac_address, e.g witn an A/B/A MAC
setting configuration test, the third set fails.

With this change we also properly report an error if set MAC fails.
Signed-off-by: NYan Burman <yanb@mellanox.com>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

bfa8ab47

28 2月, 2013 1 次提交

hlist: drop the node parameter from iterators · b67bfe0d

由 Sasha Levin 提交于 2月 27, 2013

I'm not sure why, but the hlist for each entry iterators were conceived

        list_for_each_entry(pos, head, member)

The hlist ones were greedy and wanted an extra parameter:

        hlist_for_each_entry(tpos, pos, head, member)

Why did they need an extra pos parameter? I'm not quite sure. Not only
they don't really need it, it also prevents the iterator from looking
exactly like the list iterator, which is unfortunate.

Besides the semantic patch, there was some manual work required:

 - Fix up the actual hlist iterators in linux/list.h
 - Fix up the declaration of other iterators based on the hlist ones.
 - A very small amount of places were using the 'node' parameter, this
 was modified to use 'obj->member' instead.
 - Coccinelle didn't handle the hlist_for_each_entry_safe iterator
 properly, so those had to be fixed up manually.

The semantic patch which is mostly the work of Peter Senna Tschudin is here:

@@
iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;

type T;
expression a,c,d,e;
identifier b;
statement S;
@@

-T b;
    <+... when != b
(
hlist_for_each_entry(a,
- b,
c, d) S
|
hlist_for_each_entry_continue(a,
- b,
c) S
|
hlist_for_each_entry_from(a,
- b,
c) S
|
hlist_for_each_entry_rcu(a,
- b,
c, d) S
|
hlist_for_each_entry_rcu_bh(a,
- b,
c, d) S
|
hlist_for_each_entry_continue_rcu_bh(a,
- b,
c) S
|
for_each_busy_worker(a, c,
- b,
d) S
|
ax25_uid_for_each(a,
- b,
c) S
|
ax25_for_each(a,
- b,
c) S
|
inet_bind_bucket_for_each(a,
- b,
c) S
|
sctp_for_each_hentry(a,
- b,
c) S
|
sk_for_each(a,
- b,
c) S
|
sk_for_each_rcu(a,
- b,
c) S
|
sk_for_each_from
-(a, b)
+(a)
S
+ sk_for_each_from(a) S
|
sk_for_each_safe(a,
- b,
c, d) S
|
sk_for_each_bound(a,
- b,
c) S
|
hlist_for_each_entry_safe(a,
- b,
c, d, e) S
|
hlist_for_each_entry_continue_rcu(a,
- b,
c) S
|
nr_neigh_for_each(a,
- b,
c) S
|
nr_neigh_for_each_safe(a,
- b,
c, d) S
|
nr_node_for_each(a,
- b,
c) S
|
nr_node_for_each_safe(a,
- b,
c, d) S
|
- for_each_gfn_sp(a, c, d, b) S
+ for_each_gfn_sp(a, c, d) S
|
- for_each_gfn_indirect_valid_sp(a, c, d, b) S
+ for_each_gfn_indirect_valid_sp(a, c, d) S
|
for_each_host(a,
- b,
c) S
|
for_each_host_safe(a,
- b,
c, d) S
|
for_each_mesh_entry(a,
- b,
c, d) S
)
    ...+>

[akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
[akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
[akpm@linux-foundation.org: checkpatch fixes]
[akpm@linux-foundation.org: fix warnings]
[akpm@linux-foudnation.org: redo intrusive kvm changes]
Tested-by: NPeter Senna Tschudin <peter.senna@gmail.com>
Acked-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: NSasha Levin <sasha.levin@oracle.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Gleb Natapov <gleb@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

b67bfe0d

24 2月, 2013 2 次提交

mlx4_en: fix allocation of CPU affinity reverse-map · 37706996

由 Kleber Sacilotto de Souza 提交于 2月 22, 2013

The mlx4_en driver allocates the number of objects for the CPU affinity
reverse-map based on the number of rx rings of the device. However,
mlx4_assign_eq() calls irq_cpu_rmap_add() as many times as IRQ's are
assigned to EQ's, which can be as large as mlx4_dev->caps.comp_pool. If
caps.comp_pool is larger than rx_ring_num we will eventually hit the
BUG_ON() in cpu_rmap_add().

Fix this problem by allocating space for the maximum number of CPU
affinity reverse-map objects we might want to add.
Signed-off-by: NKleber Sacilotto de Souza <klebers@linux.vnet.ibm.com>
Acked-by: NAmir Vadai <amirv@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

37706996

mlx4_en: fix allocation of device tx_cq · 427a9625

由 Kleber Sacilotto de Souza 提交于 2月 22, 2013

The memory to hold the network device tx_cq is not being allocated with
the correct size in mlx4_en_init_netdev(). It should use MAX_TX_RINGS
instead of MAX_RX_RINGS. This can cause problems if the number of tx
rings being used is greater than MAX_RX_RINGS.
Signed-off-by: NKleber Sacilotto de Souza <klebers@linux.vnet.ibm.com>
Acked-by: NAmir Vadai <amirv@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

427a9625

14 2月, 2013 1 次提交

bridge: Add vlan support to static neighbors · 1690be63

由 Vlad Yasevich 提交于 2月 13, 2013

When a user adds bridge neighbors, allow him to specify VLAN id.
If the VLAN id is not specified, the neighbor will be added
for VLANs currently in the ports filter list.  If no VLANs are
configured on the port, we use vlan 0 and only add 1 entry.
Signed-off-by: NVlad Yasevich <vyasevic@redhat.com>
Acked-by: NJitendra Kalsaria <jitendra.kalsaria@qlogic.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1690be63

09 2月, 2013 1 次提交

drivers: net: Remove remaining alloc/OOM messages · 14f8dc49

由 Joe Perches 提交于 2月 07, 2013

alloc failures already get standardized OOM
messages and a dump_stack.

For the affected mallocs around these OOM messages:

Converted kmallocs with multiplies to kmalloc_array.
Converted a kmalloc/memcpy to kmemdup.
Removed now unused stack variables.
Removed unnecessary parentheses.
Neatened alignment.
Signed-off-by: NJoe Perches <joe@perches.com>
Acked-by: NArend van Spriel <arend@broadcom.com>
Acked-by: NMarc Kleine-Budde <mkl@pengutronix.de>
Acked-by: NJohn W. Linville <linville@tuxdriver.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

14f8dc49

08 2月, 2013 10 次提交

mlx4_en: Fix BQL reset TX queue call point · 41b74920

由 Tom Herbert 提交于 2月 06, 2013

Fix issue in Mellanox driver related to BQL.  netdev_tx_reset_queue
was not being called in certain situations where the device was
being start and stopped.  Moved netdev_tx_reset_queue from the reset
device path to mlx4_en_free_tx_buf which is where the rings are
cleaned in a reset (specifically from device being stopped).
Signed-off-by: NTom Herbert <therbert@google.com>
Acked-By: NAmir Vadai <amirv@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

41b74920

net/mlx4_en: Implement ndo fdb functionality · 0ccddcd1

由 Yan Burman 提交于 2月 07, 2013

Add support for setting embedded switch fdb in case of SRIOV, by
implementing ndo_fdb_{add, del, dump}. This will allow to use
bridged configuration with multi-function. In order to add VM MAC
to the eSwitch fdb, the following command may be used over the relevant function interface:
bridge fdb add <MAC> permanent self dev <IFACE>
Signed-off-by: NYan Burman <yanb@mellanox.com>
Signed-off-by: NAmir Vadai <amirv@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0ccddcd1

net/mlx4_en: Add unicast MAC filtering · cc5387f7

由 Yan Burman 提交于 2月 07, 2013

Implement and advertise unicast MAC filtering, such that setting macvlan
instance over mlx4_en interfaces will not require the networking core
to put mlx4_en devices in promiscuous mode.

If for some reason adding a unicast address filter fails e.g as of missing space in
the HW mac table, the device forces itself into promiscuous mode (and out of this
forced state when enough space is available).
Signed-off-by: NYan Burman <yanb@mellanox.com>
Signed-off-by: NAmir Vadai <amirv@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

cc5387f7

net/mlx4_en: Manage hash of MAC addresses per port · c07cb4b0

由 Yan Burman 提交于 2月 07, 2013

As a preparation step for supporting multiple unicast addresses, store MAC addresses in hash table.
Remove the radix tree for MAC addresses per QP, as it's not in use.
Signed-off-by: NYan Burman <yanb@mellanox.com>
Signed-off-by: NAmir Vadai <amirv@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c07cb4b0

net/mlx4_en: Save previous MAC address of the port so we can replace it later · 90bbb74a

由 Yan Burman 提交于 2月 07, 2013

In preparation to having more than one unicast MAC per port, we need to keep track
of the previous MAC address in the flow of ndo_set_mac_address,
so that mlx4_en_replace_mac will know what to replace.
Signed-off-by: NYan Burman <yanb@mellanox.com>
Signed-off-by: NAmir Vadai <amirv@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

90bbb74a

net/mlx4_en: Re-arrange ndo_set_rx_mode related code · 0eb74fdd

由 Yan Burman 提交于 2月 07, 2013

Currently, mlx4_en_do_set_multicast serves as the ndo_set_rx_mode entry for mlx4_en,
doing all related work. Split it to few calls, one per required functionality
(e.g multicast, promiscuous, etc) and rename some structures and calls
to use rx_mode notation instead of multicast.
Signed-off-by: NYan Burman <yanb@mellanox.com>
Signed-off-by: NAmir Vadai <amirv@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0eb74fdd

net/mlx4: Move Ethernet related functionality from mlx4_core to mlx4_en · 16a10ffd

由 Yan Burman 提交于 2月 07, 2013

Move low level code that deals with management of Ethernet MACs and QPs from mlx4_core to mlx4_en.
Also convert the new functions to deal with MACs in form of char array instead of u64.

Actual functions moved:
mlx4_replace_mac
mlx4_get_eth_qp
mlx4_put_eth_qp

To conduct this change, some functionality had to be exported from the core,
the following functions were added:
mlx4_get_base_qp
__mlx4_replace_mac (low level function for CX1/A0 compatibility)
Signed-off-by: NYan Burman <yanb@mellanox.com>
Signed-off-by: NAmir Vadai <amirv@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

16a10ffd

net/mlx4_en: Cleanup multiline strings · 48e551ff

由 Yan Burman 提交于 2月 07, 2013

Make the code consistent in regard to error messages
not spanning multiple lines.
Signed-off-by: NYan Burman <yanb@mellanox.com>
Signed-off-by: NAmir Vadai <amirv@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

48e551ff

net/mlx4_en: Optimize Rx fast path filter checks · 6bbb6d99

由 Yan Burman 提交于 2月 07, 2013

Currently, RX path code that does RX filtering is not optimized
and does an expensive conversion. In order to use ether_addr_equal_64bits
which is optimized for such cases, we need the MAC address kept by the device
to be in the form of unsigned char array instead of u64. Store the MAC address
as unsigned char array and convert to/from u64 out of the fast path when needed.
Side effect of this is that we no longer need priv->mac, since it's the same
as dev->dev_addr.

This optimization was suggested by Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NYan Burman <yanb@mellanox.com>
Signed-off-by: NAmir Vadai <amirv@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

6bbb6d99

net/mlx4_en: Optimize loopback related checks in data path · 79aeaccd

由 Yan Burman 提交于 2月 07, 2013

Currently there are relatively complex conditional checks in the fast path,
for TX loopback enabling and resulting RX filter logic.
Move elaborate if's out of data path, replace them with a single flag
for each state and update that state from appropriate places.
Also, in native (non SRIOV) mode and not in loopback or in selftest,
there is no need to try and filter out packets that HW loopback-ed,
as in native mode we do not loopback packets anymore.
Signed-off-by: NYan Burman <yanb@mellanox.com>
Signed-off-by: NAmir Vadai <amirv@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

79aeaccd

01 2月, 2013 3 次提交

net/mlx4_en: Fix transmit timeout when driver restarts port · 3484aac1

由 Amir Vadai 提交于 1月 30, 2013

Under heavy CPU load, changing, ring size/mtu/etc. could result in transmit
timeout, since stop-start port might take more than 10 seconds.
Calling netif_detach_device to prevent tx queue transmit timeout.

netif_detach_device() is not called under ndo_stop, because netif_carrier_off
will prevent the timeout, and device should not be marked as not present, or
else user won't be able to start it later on.

CC: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: NEugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: NAmir Vadai <amirv@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3484aac1

net/mlx4_en: Don't reassign port mac address on firmware that supports it · 955154fa

由 Matan Barak 提交于 1月 30, 2013

Mac reassignments should only be done when not supported by the firmware. To
accomplish that, checking firmware capability bit to know whether we should
reassign macs in the driver.
Signed-off-by: NMatan Barak <matanb@mellanox.com>
Signed-off-by: NAmir Vadai <amirv@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

955154fa

net/mlx4_en: Fix ethtool rules leftovers after module unloaded · 0d256c0e

由 Hadar Hen Zion 提交于 1月 30, 2013

As part of the driver unload flow, all steering rules must be deleted,
make sure to remove the rules that were set through ethtool.
Signed-off-by: NHadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: NAmir Vadai <amirv@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0d256c0e

28 1月, 2013 4 次提交

net/mlx4_en: Initialize RFS filters lock and list in init_netdev · 78fb2de7

由 Amir Vadai 提交于 1月 24, 2013

filters_lock might have been used while it was re-initialized.
Moved filters_lock and filters_list initialization to init_netdev instead of
alloc_resources which is called every time the device is configured.
Signed-off-by: NAmir Vadai <amirv@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

78fb2de7

net/mlx4_en: Use the correct netif lock on ndo_set_rx_mode · dbd501a8

由 Eugenia Emantayev 提交于 1月 24, 2013

The device multicast list is protected by netif_addr_lock_bh in the networking core, we should
use this locking practice in mlx4_en too.
Signed-off-by: NEugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: NAmir Vadai <amirv@mellanox.com>
Reviewed-by: NYevgeny Petrilin <yevgenyp@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

dbd501a8

net/mlx4_en: Fix traffic loss under promiscuous mode · db0e7cba

由 Aviad Yehezkel 提交于 1月 24, 2013

When port is stopped and flow steering mode is not device managed: promisc QP
rule wasn't removed from MCG table.
Added code to remove it in all flow steering modes.
In addition, promsic rule removal should be in stop port and not in start
port - moved it accordingly.
Signed-off-by: NAviad Yehezkel <aviadye@mellanox.com>
Signed-off-by: NEugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: NHadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: NAmir Vadai <amirv@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

db0e7cba

net/mlx4_en: Issue the dump eth statistics command under lock · 2d51837f

由 Eugenia Emantayev 提交于 1月 24, 2013

Performing the DUMP_ETH_STATS firmware command outside the lock leads to kernel
panic when data structures such as RX/TX rings are freed in parallel, e.g when
one changes the mtu or ring sizes.
Signed-off-by: NEugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: NAmir Vadai <amirv@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2d51837f

09 1月, 2013 1 次提交

remove init of dev->perm_addr in drivers · aaeb6cdf

由 Jiri Pirko 提交于 1月 08, 2013

perm_addr is initialized correctly in register_netdevice() so to init it in
drivers is no longer needed.
Signed-off-by: NJiri Pirko <jiri@resnulli.us>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

aaeb6cdf

03 12月, 2012 1 次提交

net/mlx4_en: Set number of rx/tx channels using ethtool · d317966b

由 Amir Vadai 提交于 12月 02, 2012

Add support to changing number of rx/tx channels using
ethtool ('ethtool -[lL]'). Where the number of tx channels specified in ethtool
is the number of rings per user priority - not total number of tx rings.
Signed-off-by: NAmir Vadai <amirv@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d317966b

27 11月, 2012 1 次提交

mlx4: 64-byte CQE/EQE support · 08ff3235

由 Or Gerlitz 提交于 10月 21, 2012

ConnectX-3 devices can use either 64- or 32-byte completion queue
entries (CQEs) and event queue entries (EQEs).  Using 64-byte
EQEs/CQEs performs better because each entry is aligned to a complete
cacheline.  This patch queries the HCA's capabilities, and if it
supports 64-byte CQEs and EQES the driver will configure the HW to
work in 64-byte mode.

The 32-byte vs 64-byte mode is global per HCA and not per CQ or EQ.

Since this mode is global, userspace (libmlx4) must be updated to work
with the configured CQE size, and guests using SR-IOV virtual
functions need to know both EQE and CQE size.

In case one of the 64-byte CQE/EQE capabilities is activated, the
patch makes sure that older guest drivers that use the QUERY_DEV_FUNC
command (e.g as done in mlx4_core of Linux 3.3..3.6) will notice that
they need an update to be able to work with the PPF. This is done by
changing the returned pf_context_behaviour not to be zero any more. In
case none of these capabilities is activated that value remains zero
and older guest drivers can run OK.

The SRIOV related flow is as follows

1. the PPF does the detection of the new capabilities using
   QUERY_DEV_CAP command.

2. the PPF activates the new capabilities using INIT_HCA.

3. the VF detects if the PPF activated the capabilities using
   QUERY_HCA, and if this is the case activates them for itself too.

Note that the VF detects that it must be aware to the new PF behaviour
using QUERY_FUNC_CAP.  Steps 1 and 2 apply also for native mode.

User space notification is done through a new field introduced in
struct mlx4_ib_ucontext which holds device capabilities for which user
space must take action. This changes the binary interface so the ABI
towards libmlx4 exposed through uverbs is bumped from 3 to 4 but only
when **needed** i.e. only when the driver does use 64-byte CQEs or
future device capabilities which must be in sync by user space. This
practice allows to work with unmodified libmlx4 on older devices (e.g
A0, B0) which don't support 64-byte CQEs.

In order to keep existing systems functional when they update to a
newer kernel that contains these changes in VF and userspace ABI, a
module parameter enable_64b_cqe_eqe must be set to enable 64-byte
mode; the default is currently false.
Signed-off-by: NEli Cohen <eli@mellanox.com>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

08ff3235

08 11月, 2012 1 次提交

mlx4: change TX coalescing defaults · ecfd2ce1

由 Eric Dumazet 提交于 11月 05, 2012

mlx4 currently uses a too high tx coalescing setting, deferring
TX completion interrupts by up to 128 us.

With the recent skb_orphan() removal in commit 8112ec3b,
performance of a single TCP flow is capped to ~4 Gbps, unless
we increase tcp_limit_output_bytes.

I suggest using 16 us instead of 128 us, allowing a finer control.

Performance of a single TCP flow is restored to previous levels,
while keeping TCP small queues fully enabled with default sysctl.

This patch is also a BQL prereq.
Reported-by: NVimalkumar <j.vimal@gmail.com>
Signed-off-by: NEric Dumazet <edumazet@google.com>
Cc: Yevgeny Petrilin <yevgenyp@mellanox.com>
Cc: Or Gerlitz <ogerlitz@mellanox.com>
Acked-by: NAmir Vadai <amirv@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ecfd2ce1

26 7月, 2012 1 次提交

net/mlx4_en: Limit the RFS filter IDs to be < RPS_NO_FILTER · ee64c0ee

由 Amir Vadai 提交于 7月 25, 2012

RFS filter id can't have the special value RPS_NO_FILTER,
need to skip it when allocating id's.

CC: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: NAmir Vadai <amirv@mellanox.com>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ee64c0ee

19 7月, 2012 1 次提交

net/mlx4_en: Add accelerated RFS support · 1eb8c695

由 Amir Vadai 提交于 7月 18, 2012

Use RFS infrastructure and flow steering in HW to keep CPU
affinity of rx interrupts and application per TCP stream.

A flow steering filter is added to the HW whenever the RFS
ndo callback is invoked by core networking code.

Because the invocation takes place in interrupt context, the
actual setup of HW is done using workqueue. Whenever new filter
is added, the driver checks for expiry of existing filters.

Since there's window in time between the point where the core
RFS code invoked the ndo callback, to the point where the HW
is configured from the workqueue context, the 2nd, 3rd etc
packets from that stream will cause the net core to invoke
the callback again and again.

To prevent inefficient/double configuration of the HW, the filters
are kept in a database which is indexed using hash function to enable
fast access.
Signed-off-by: NAmir Vadai <amirv@mellanox.com>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1eb8c695

17 7月, 2012 1 次提交

net/mlx4_en: dereferencing freed memory · 9c64508a

由 Dan Carpenter 提交于 7月 10, 2012

We dereferenced "mclist" after the kfree().
Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9c64508a

08 7月, 2012 5 次提交

net/mlx4_en: Add support for drop action through ethtool · cabdc8ee

由 Hadar Hen Zion 提交于 7月 05, 2012

The drop action is implemented by allocating a QP and keeping it in a reset state
such that the HW drops any packets which are steered to that QP. When a drop action
is requested, we attach the relevant flow to that QP.
Sign-off-by: NHadar Hen Zion <hadarh@mellanox.co.il>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

cabdc8ee

net/mlx4: Implement promiscuous mode with device managed flow-steering · 592e49dd

由 Hadar Hen Zion 提交于 7月 05, 2012

The device managed flow steering API has three promiscuous modes:

1. Uplink - captures all the packets that arrive to the port.
2. Allmulti - captures all multicast packets arriving to the port.
3. Function port - for future use, this mode is not implemented yet.

Use these modes with the flow_attach and flow_detach firmware commands
according to the promiscuous state of the netdevice.
Signed-off-by: NHadar Hen Zion <hadarh@mellanox.co.il>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

592e49dd

{NET, IB}/mlx4: Add device managed flow steering firmware API · 0ff1fb65

由 Hadar Hen Zion 提交于 7月 05, 2012

The driver is modified to support three operation modes.

If supported by firmware use the device managed flow steering
API, that which we call device managed steering mode. Else, if
the firmware supports the B0 steering mode use it, and finally,
if none of the above, use the A0 steering mode.

When the steering mode is device managed, the code is modified
such that L2 based rules set by the mlx4_en driver for Ethernet
unicast and multicast, and the IB stack multicast attach calls
done through the mlx4_ib driver are all routed to use the device
managed API.

When attaching rule using device managed flow steering API,
the firmware returns a 64 bit registration id, which is to be
provided during detach.

Currently the firmware is always programmed during HCA initialization
to use standard L2 hashing. Future work should be done to allow
configuring the flow-steering hash function with common, non
proprietary means.
Signed-off-by: NHadar Hen Zion <hadarh@mellanox.co.il>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0ff1fb65

net/mlx4: Set steering mode according to device capabilities · c96d97f4

由 Hadar Hen Zion 提交于 7月 05, 2012

Instead of checking the firmware supported steering mode in various
places in the code, add a dedicated field in the mlx4 device capabilities
structure which is written once during the initialization flow and read
across the code.

This also set the grounds for add new steering modes. Currently two modes
are supported, and are named after the ConnectX HW versions A0 and B0.

A0 steering uses mac_index, vlan_index and priority to steer traffic
into pre-defined range of QPs.

B0 steering uses Ethernet L2 hashing rules and is enabled only
if the firmware supports both unicast and multicast B0 steering,

The current steering modes are relevant for Ethernet traffic only,
such that Infiniband steering remains untouched.
Signed-off-by: NHadar Hen Zion <hadarh@mellanox.co.il>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c96d97f4

net/mlx4_en: Re-design multicast attachments flow · 6d199937

由 Yevgeny Petrilin 提交于 7月 05, 2012

Currently, for every change in the net device multicast list, the driver
detaches all the addresses from the HW device, and then attaches the
updated list. This behavior is wrong from two aspects: first, it causes
a load of firmware commands and second, there is period of time where
the correct addresses are not attached, which turned into packet loss.

To improve - a copy of the multicast list is saved by the driver. For
every change in the multicast list, the multicast list copy is used
to find the delta between those two lists and add or remove multicast
addresses as needed.
Reported-by: NShawn Bohrer <sbohrer@rgmadvisors.com>
Cc: Shawn Bohrer <sbohrer@rgmadvisors.com>
Signed-off-by: NHadar Hen Zion <hadarh@mellanox.co.il>
Signed-off-by: NYevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

6d199937

26 6月, 2012 2 次提交

net/mlx4_en: Release QP range in free_resources · 044ca2a5

由 Yevgeny Petrilin 提交于 6月 25, 2012

Add a missing resource release in ring cleanup.
Not doing this leaves a range of QPs that are being reserved,
and no one can use them.
Signed-off-by: NYevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

044ca2a5

net/mlx4_en: Set correct port parameters during device initialization · 5c8e9046

由 Yevgeny Petrilin 提交于 6月 25, 2012

Set valid port parameters: MTU and flow control configuration when
configuring the port during HW device initialization,
prior to the net device open() being called.
Using  invalid parameters (such as all zeros)
could lead to bad firmware behavior.
Signed-off-by: NYevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5c8e9046

18 5月, 2012 1 次提交

net/mlx4_en: num cores tx rings for every UP · bc6a4744

由 Amir Vadai 提交于 5月 17, 2012

Change the TX ring scheme such that the number of rings for untagged packets
and for tagged packets (per each of the vlan priorities) is the same, unlike
the current situation where for tagged traffic there's one ring per priority
and for untagged rings as the number of core.

Queue selection is done as follows:

If the mqprio qdisc is operates on the interface, such that the core networking
code invoked the device setup_tc ndo callback, a mapping of skb->priority =>
queue set is forced - for both, tagged and untagged traffic.

Else, the egress map skb->priority => User priority is used for tagged traffic, and
all untagged traffic is sent through tx rings of UP 0.

The patch follows the convergence of discussing that issue with John Fastabend
over this thread http://comments.gmane.org/gmane.linux.network/229877

Cc: John Fastabend <john.r.fastabend@intel.com>
Cc: Liran Liss <liranl@mellanox.com>
Signed-off-by: NAmir Vadai <amirv@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

bc6a4744