提交 · fe9ccc785de5f8d0cb1b6113a0da387dfd8bf38c · openeuler / raspberrypi-kernel

18 5月, 2017 8 次提交

mlxsw: spectrum_switchdev: Don't batch VLAN operations · fe9ccc78

由 Ido Schimmel 提交于 5月 16, 2017

switchdev's VLAN object has the ability to describe a range of VLAN IDs,
but this is only used when VLAN operations are done using the SELF flag,
which is something we would like to remove as it allows one to bypass
the bridge driver.

Do VLAN operations on a per-VLAN basis, thereby simplifying the code and
preparing it for refactoring in a follow-up patchset.
Signed-off-by: NIdo Schimmel <idosch@mellanox.com>
Signed-off-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

fe9ccc78

mlxsw: spectrum_switchdev: Remove redundant check · d341e2ce

由 Ido Schimmel 提交于 5月 16, 2017

Since commit 97c24290 ("switchdev: Execute bridge ndos only for
bridge ports") switchdev code checks that port is bridged, so no need to
perform the same check in the driver.
Signed-off-by: NIdo Schimmel <idosch@mellanox.com>
Signed-off-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d341e2ce

mlxsw: spectrum_router: Initialize RIFs in a separate function · 348b8fc3

由 Ido Schimmel 提交于 5月 16, 2017

The router interfaces (RIFs) array is currently initialized together
with the general router configuration. However, in a follow-up patchset
we're going to introduce a common RIF core that will require us to
initialize more RIF constructs, so move the RIF initialization to its
own function.
Signed-off-by: NIdo Schimmel <idosch@mellanox.com>
Signed-off-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

348b8fc3

mlxsw: spectrum_router: Move FIB notification block to router struct · 7e39d115

由 Ido Schimmel 提交于 5月 16, 2017

The FIB notification block logically belongs inside the router specific
struct, so move it there.
Signed-off-by: NIdo Schimmel <idosch@mellanox.com>
Signed-off-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

7e39d115

mlxsw: spectrum_router: Move RIFs array to its rightful place · 5f9efffb

由 Ido Schimmel 提交于 5月 16, 2017

The router interfaces (RIFs) array is of no interest to code outside the
routing realm, so declare it inside the router specific struct instead
of the chip-wide one.
Signed-off-by: NIdo Schimmel <idosch@mellanox.com>
Signed-off-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5f9efffb

mlxsw: spectrum_switchdev: Reduce scope of bridge struct · 5f6935c6

由 Ido Schimmel 提交于 5月 16, 2017

Some attributes in the global chip struct are only relevant for bridge
operation, so encapsulate them in their own struct that isn't exposed to
non-bridge code.

This will also help us later, when we add more bridge-specific
attributes.
Signed-off-by: NIdo Schimmel <idosch@mellanox.com>
Signed-off-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5f6935c6

mlxsw: spectrum_router: Reduce scope of router struct · 9011b677

由 Ido Schimmel 提交于 5月 16, 2017

In a similar fashion to previous patch, the router structure
('mlxsw_sp_router') doesn't need to be accessible to anyone, but the
router code located at spectrum_router.c

Make this apparent and reduce its scope by defining it there.
Signed-off-by: NIdo Schimmel <idosch@mellanox.com>
Signed-off-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9011b677

mlxsw: spectrum_buffer: Reduce scope of shared buffer struct · 33cbd87c

由 Ido Schimmel 提交于 5月 16, 2017

The shared buffer structure ('mlxsw_sp_sb') doesn't need to be
accessible to anyone, but the shared buffer code located at
spectrum_buffers.c

Make this apparent and reduce its scope by defining it there.
Signed-off-by: NIdo Schimmel <idosch@mellanox.com>
Signed-off-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

33cbd87c

16 5月, 2017 1 次提交

net/mlx4_core: Use min3 to select number of MSI-X vectors · 4762010f

由 yuval.shaia@oracle.com 提交于 5月 12, 2017

Signed-off-by: NYuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: NLeon Romanovsky <leonro@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4762010f

14 5月, 2017 5 次提交

net/mlx5: Use underlay QPN from the root name space · 50854114

由 Yishai Hadas 提交于 4月 25, 2017

Root flow table is dynamically changed by the underlying flow steering
layer, and IPoIB/ULPs have no idea what will be the root flow table in
the future, hence we need a dynamic infrastructure to move Underlay QPs
with the root flow table.

Fixes: b3ba5149 ("net/mlx5: Refactor create flow table method to accept underlay QP")
Signed-off-by: NErez Shitrit <erezsh@mellanox.com>
Signed-off-by: NMaor Gottlieb <maorg@mellanox.com>
Signed-off-by: NYishai Hadas <yishaih@mellanox.com>
Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>

50854114

net/mlx5e: IPoIB, Only support regular RQ for now · 5360fd47

由 Saeed Mahameed 提交于 5月 04, 2017

IPoIB doesn't support striding RQ at the moment, for this
we need to explicitly choose non striding RQ in IPoIB init,
even if the HW supports it.

Fixes: 8f493ffd ("net/mlx5e: IPoIB, RX steering RSS RQTs and TIRs")
Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>

5360fd47

net/mlx5e: Fix setup TC ndo · 20b6a1c7

由 Saeed Mahameed 提交于 5月 09, 2017

Fail-safe support patches introduced a trivial bug,
setup tc callback is doing a wrong check of the netdevice state,
the fix is simply to invert the condition.

Fixes: 6f9485af ("net/mlx5e: Fail safe tc setup")
Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>

20b6a1c7

net/mlx5e: Fix ethtool pause support and advertise reporting · e3c19503

由 Gal Pressman 提交于 4月 19, 2017

Pause bit should set when RX pause is on, not TX pause.
Also, setting Asym_Pause is incorrect, and should be turned off.

Fixes: 665bc539 ("net/mlx5e: Use new ethtool get/set link ksettings API")
Signed-off-by: NGal Pressman <galp@mellanox.com>
Cc: kernel-team@fb.com
Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>

e3c19503

net/mlx5e: Use the correct pause values for ethtool advertising · b383b544

由 Gal Pressman 提交于 4月 03, 2017

Query the operational pause from firmware (PFCC register) instead of
always passing zeros.

Fixes: 665bc539 ("net/mlx5e: Use new ethtool get/set link ksettings API")
Signed-off-by: NGal Pressman <galp@mellanox.com>
Cc: kernel-team@fb.com
Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>

b383b544

09 5月, 2017 4 次提交

net/mlx4_core: Reduce harmless SRIOV error message to debug level · 83bd5118

由 Jack Morgenstein 提交于 5月 09, 2017

Under SRIOV resource management, extra counters are allocated to VFs
from a free pool. If that pool is empty, the ALLOC_RES command for
a counter resource fails -- and this generates a misleading error
message in the message log.

Under SRIOV, each VF is allocated (i.e., guaranteed) 2 counters --
one counter per port. For ETH ports, the RoCE driver requests an
additional counter (above the guaranteed counters). If that request
fails, the VF RoCE driver simply uses the default (i.e., guaranteed)
counter for that port.

Thus, failing to allocate an additional counter does not constitute
a  problem, and the error message on the PF when this occurs should
be reduced to debug level.

Finally, to identify the situation that the reason for the failure is
that no resources are available to grant to the VF, we modified the
error returned by mlx4_grant_resource to -EDQUOT (Quota exceeded),
which more accurately describes the error.

Fixes: c3abb51b ("IB/mlx4: Add RoCE/IB dedicated counters")
Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: NTariq Toukan <tariqt@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

83bd5118

net/mlx4_en: Avoid adding steering rules with invalid ring · 89c55768

由 Talat Batheesh 提交于 5月 09, 2017

Inserting steering rules with illegal ring is an invalid operation,
block it.

Fixes: 82067281 ('net/mlx4_en: Manage flow steering rules with ethtool')
Signed-off-by: NTalat Batheesh <talatb@mellanox.com>
Signed-off-by: NTariq Toukan <tariqt@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

89c55768

net/mlx4_en: Change the error print to debug print · 505a9249

由 Kamal Heib 提交于 5月 09, 2017

The error print within mlx4_en_calc_rx_buf() should be a debug print.

Fixes: 51151a16 ('mlx4: allow order-0 memory allocations in RX path')
Signed-off-by: NKamal Heib <kamalh@mellanox.com>
Signed-off-by: NTariq Toukan <tariqt@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

505a9249

treewide: use kv[mz]alloc* rather than opencoded variants · 752ade68

由 Michal Hocko 提交于 5月 08, 2017

There are many code paths opencoding kvmalloc.  Let's use the helper
instead.  The main difference to kvmalloc is that those users are
usually not considering all the aspects of the memory allocator.  E.g.
allocation requests <= 32kB (with 4kB pages) are basically never failing
and invoke OOM killer to satisfy the allocation.  This sounds too
disruptive for something that has a reasonable fallback - the vmalloc.
On the other hand those requests might fallback to vmalloc even when the
memory allocator would succeed after several more reclaim/compaction
attempts previously.  There is no guarantee something like that happens
though.

This patch converts many of those places to kv[mz]alloc* helpers because
they are more conservative.

Link: http://lkml.kernel.org/r/20170306103327.2766-2-mhocko@kernel.orgSigned-off-by: NMichal Hocko <mhocko@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> # Xen bits
Acked-by: NKees Cook <keescook@chromium.org>
Acked-by: NVlastimil Babka <vbabka@suse.cz>
Acked-by: Andreas Dilger <andreas.dilger@intel.com> # Lustre
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> # KVM/s390
Acked-by: Dan Williams <dan.j.williams@intel.com> # nvdim
Acked-by: David Sterba <dsterba@suse.com> # btrfs
Acked-by: Ilya Dryomov <idryomov@gmail.com> # Ceph
Acked-by: Tariq Toukan <tariqt@mellanox.com> # mlx4
Acked-by: Leon Romanovsky <leonro@mellanox.com> # mlx5
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Anton Vorontsov <anton@enomsg.org>
Cc: Colin Cross <ccross@android.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Kent Overstreet <kent.overstreet@gmail.com>
Cc: Santosh Raspatur <santosh@chelsio.com>
Cc: Hariprasad S <hariprasad@chelsio.com>
Cc: Yishai Hadas <yishaih@mellanox.com>
Cc: Oleg Drokin <oleg.drokin@intel.com>
Cc: "Yan, Zheng" <zyan@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

752ade68

05 5月, 2017 1 次提交

IB/mlx5: Enable IPoIB acceleration · 693dfd5a

由 Erez Shitrit 提交于 4月 27, 2017

Enable mlx5 IPoIB acceleration by declaring
mlx5_ib_{alloc,free}_rdma_netdev and assigning the mlx5
IPoIB rdma_netdev callbacks.

In addition, this patch brings in sync mlx5's IPoIB parts for net and IB
trees. As a precaution, we disabled IPoIB acceleration by default (in
the mlx5_core Kconfig file).
Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
Signed-off-by: NErez Shitrit <erezsh@mellanox.com>
Signed-off-by: NLeon Romanovsky <leon@kernel.org>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

693dfd5a

01 5月, 2017 1 次提交

mlxsw: spectrum_router: Simplify VRF enslavement · b1e45526

由 Ido Schimmel 提交于 4月 30, 2017

When a netdev is enslaved to a VRF master, its router interface (RIF)
needs to be destroyed (if exists) and a new one created using the
corresponding virtual router (VR).

>From the driver's perspective, the above is equivalent to an inetaddr
event sent for this netdev. Therefore, when a port netdev (or its
uppers) are enslaved to a VRF master, call the same function that
would've been called had a NETDEV_UP was sent for this netdev in the
inetaddr notification chain.

This patch also fixes a bug when a LAG netdev with an existing RIF is
enslaved to a VRF. Before this patch, each LAG port would drop the
reference on the RIF, but would re-join the same one (in the wrong VR)
soon after. With this patch, the corresponding RIF is first destroyed
and a new one is created using the correct VR.

Fixes: 7179eb5a ("mlxsw: spectrum_router: Add support for VRFs")
Signed-off-by: NIdo Schimmel <idosch@mellanox.com>
Reviewed-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b1e45526

30 4月, 2017 15 次提交

net/mlx5: E-Switch, Avoid redundant memory allocation · 0a0ab1d2

由 Eli Cohen 提交于 2月 28, 2017

struct esw_mc_addr is a small struct that can be part of struct
mlx5_eswitch. Define it as a field and not as a pointer and save the
kzalloc call and then error flow handling.
Signed-off-by: NEli Cohen <eli@mellanox.com>
Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>

0a0ab1d2

net/mlx5e: Disable HW LRO when PCI is slower than link on striding RQ · 0f6e4cf6

由 Eran Ben Elisha 提交于 4月 26, 2017

We will activate the HW LRO only on servers with PCI BW > MAX LINK BW,
or when PCI BW > 16Gbps. On other cases we do not want LRO by default as
LRO sessions might get timeout and add redundant software overhead.

Tested:
	ethtool -k <ifs-name> | grep large-receive-offload
	On systems with and without the limitations.
Signed-off-by: NEran Ben Elisha <eranbe@mellanox.com>
Cc: kernel-team@fb.com
Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>

0f6e4cf6

net/mlx5e: Use u8 as ownership type in mlx5e_get_cqe() · b1b03bde

由 Tariq Toukan 提交于 4月 03, 2017

CQE ownership indication is as small as a single bit.
Use u8 to speedup the comparison.
Signed-off-by: NTariq Toukan <tariqt@mellanox.com>
Cc: kernel-team@fb.com
Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>

b1b03bde

net/mlx5e: Use prefetchw when a write is to follow · ad78af9b

由 Tariq Toukan 提交于 2月 15, 2017

"prefetchw()" prefetches the cacheline for write. Use it for
skb->data, as soon we'll be copying the packet header there.

Performance:
Single-stream packet-rate tested with pktgen.
Packets are dropped in tc level to zoom into driver data-path.
Larger gain is expected for smaller packets, as less time
is spent on handling SKB fragments, making the path shorter
and the improvement more significant.

---------------------------------------------
packet size | before    | after     | gain  |
64B         | 4,113,306 | 4,778,720 |  16%  |
1024B       | 3,633,819 | 3,950,593 | 8.7%  |
Signed-off-by: NTariq Toukan <tariqt@mellanox.com>
Cc: kernel-team@fb.com
Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>

ad78af9b

net/mlx5e: Optimize poll ICOSQ completion queue · 1f5b1e47

由 Tariq Toukan 提交于 3月 29, 2017

UMR operations are more frequent and important.
Check them first, and add a compiler branch predictor hint.

According to current design, ICOSQ CQ can contain at most one
pending CQE per napi. Poll function is optimized accordingly.

Performance:
Single-stream packet-rate tested with pktgen.
Packets are dropped in tc level to zoom into driver data-path.
Larger gain is expected for larger packet sizes, as BW is higher
and UMR posts are more frequent.

---------------------------------------------
packet size | before    | after     | gain  |
64B         | 4,092,370 | 4,113,306 |  0.5% |
1024B       | 3,421,435 | 3,633,819 |  6.2% |
Signed-off-by: NTariq Toukan <tariqt@mellanox.com>
Cc: kernel-team@fb.com
Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>

1f5b1e47

net/mlx5e: Act on delay probe time updates · a2fa1fe5

由 Hadar Hen Zion 提交于 2月 14, 2017

The user can change delay_first_probe_time parameter through sysctl.
Listen to NETEVENT_DELAY_PROBE_TIME_UPDATE notifications and update the
intervals for updating the neighbours 'used' value periodic task and
for flow HW counters query periodic task.
Both of the intervals will be update only in case the new delay prob
time value is lower the current interval.

Since the driver saves only one min interval value and not per device,
the users will be able to set lower interval value for updating
neighbour 'used' value periodic task but they won't be able to schedule
a higher interval for this periodic task.
The used interval for scheduling neighbour 'used' value periodic task is
the minimal delay prob time parameter ever seen by the driver.
Signed-off-by: NHadar Hen Zion <hadarh@mellanox.com>
Reviewed-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>

a2fa1fe5

net/mlx5e: Update neighbour 'used' state using HW flow rules counters · f6dfb4c3

由 Hadar Hen Zion 提交于 2月 24, 2017

When IP tunnel encapsulation rules are offloaded, the kernel can't see
the traffic of the offloaded flow. The neighbour for the IP tunnel
destination of the offloaded flow can mistakenly become STALE and
deleted by the kernel since its 'used' value wasn't changed.

To make sure that a neighbour which is used by the HW won't become
STALE, we proactively update the neighbour 'used' value every
DELAY_PROBE_TIME period, when packets were matched and counted by the HW
for one of the tunnel encap flows related to this neighbour.

The periodic task that updates the used neighbours is scheduled when a
tunnel encap rule is successfully offloaded into HW and keeps re-scheduling
itself as long as the representor's neighbours list isn't empty.

Add, remove, lookup and status change operations done over the
representor's neighbours list or the neighbour hash entry encaps list
are all serialized by RTNL lock.
Signed-off-by: NHadar Hen Zion <hadarh@mellanox.com>
Reviewed-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>

f6dfb4c3

net/mlx5e: Add support to neighbour update flow · 232c0013

由 Hadar Hen Zion 提交于 3月 20, 2017

In order to offload TC encap rules, the driver does a lookup for the IP
tunnel neighbour according to the output device and the destination IP
given by the user.

To keep tracking after the validity state of such neighbours, we keep
the neighbours information (pair of device pointer and destination IP)
in a hash table maintained at the relevant egress representor and
register to get NETEVENT_NEIGH_UPDATE events. When getting neighbour update
netevent, we search for a match among the cached neighbours entries used for
encapsulation.

In case the neighbour isn't valid, we can't offload the flow into the
HW. We cache the flow (requested matching and actions) in the driver and
offload the rule later, when the neighbour is resolved and becomes
valid.

When a flow is only cached in the driver and not offloaded into HW
yet, we use EAGAIN return value to mark it internally, the TC ndo still
returns success.

Listen to kernel neighbour update netevents to trace relevant neighbours
validity state:

1. If a neighbour becomes valid, offload the related rules to HW.

2. If the neighbour becomes invalid, remove the related rules from HW.

3. If the neighbour mac address was changed, update the encap header.
   Remove all the offloaded rules using the old encap header from the HW
   and insert new rules to HW with updated encap header.

Access to the neighbors hash table is protected by RTNL lock of its
caller or by the table's spinlock.

Details of the locking/synchronization among the different actions
applied on the neighbour table:

Add/remove operations - protected by RTNL lock of its caller (all TC
commands are protected by RTNL lock). Add and remove operations are
initiated only when the user inserts/removes a TC rule into/from the driver.

Lookup/remove operations - since the lookup operation is done from
netevent notifier block, RTNL lock can't be used (atomic context).
Use the table's spin lock to protect lookups from TC user removal operation.
bh is used since netevent can be called from a softirq context.

Lookup/add operations - The hash table access functions are taking
care of the protection between lookup and add operations.

When adding/removing encap headers and rules to/from the HW, RTNL lock
is used. It can happen when:

1. The user inserts/removes a TC rule into/from the driver (TC commands
are protected by RTNL lock of it's caller).

2. The driver gets neighbour notification event, which reports about
neighbour validity status change. Before adding/removing encap headers
and rules to/from the HW, RTNL lock is taken.

A neighbour hash table entry should be freed when its encap list is empty.
Since The neighbour update netevent notification schedules a neighbour
update work that uses the neighbour hash entry, it can't be freed
unconditionally when the encap list becomes empty during TC delete rule flow.
Use reference count to protect from freeing neighbour hash table entry
while it's still in use.

When the user asks to unregister a netdvice used by one of the neigbours,
neighbour removal notification is received. Then we take a reference on the
neighbour and don't free it until the relevant encap entries (and flows) are
marked as invalid (not offloaded) and removed from HW.
As long as the encap entry is still valid (checked under RTNL lock) we
can safely access the neighbour device saved on mlx5e_neigh struct.
Signed-off-by: NHadar Hen Zion <hadarh@mellanox.com>
Reviewed-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>

232c0013

net/mlx5e: Add neighbour hash table to the representors · 37b498ff

由 Hadar Hen Zion 提交于 2月 02, 2017

Add hash table to the representors which is to be used by the next patch
to save neighbours information in the driver.

In order to offload IP tunnel encapsulation rules, the driver must find
the tunnel dst neighbour according to the output device and the
destination address given by the user. The next patch will cache the
neighbors information in the driver to allow support in neigh update
flow for tunnel encap rules.

The neighbour entries are also saved in a list so we easily iterate over
them when querying statistics in order to provide 'used' feedback to the
kernel neighbour NUD core.
Signed-off-by: NHadar Hen Zion <hadarh@mellanox.com>
Reviewed-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>

37b498ff

net/mlx5e: Read neigh parameters with proper locking · 033354d5

由 Hadar Hen Zion 提交于 2月 02, 2017

The nud_state and hardware address fields are protected by the neighbour
lock, we should acquire it before accessing those parameters.

Use this lock to avoid inconsistency between the neighbour validity state
and it's hardware address.
Signed-off-by: NHadar Hen Zion <hadarh@mellanox.com>
Reviewed-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>

033354d5

net/mlx5e: Use flag to properly monitor a flow rule offloading state · 0b67a38f

由 Hadar Hen Zion 提交于 2月 14, 2017

Instead of relaying on the 'flow->rule' pointer value which can be
valid or invalid (in case the FW returns an error while trying to offload
the rule), monitor the rule state using a flag.

In downstream patch which adds support to IP tunneling neigh update
flow, a TC rule could be cached in the driver and not offloaded into the
HW. In this case, the flow handle pointer stays NULL.

Check the offloaded flag to properly deal with rules which are currently
not offloaded when querying rule statistics.

This patch doesn't add any new functionality.
Signed-off-by: NHadar Hen Zion <hadarh@mellanox.com>
Reviewed-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>

0b67a38f

net/mlx5e: Remove output device parameter from create encap header helpers definition · 1a8552bd

由 Hadar Hen Zion 提交于 2月 02, 2017

Passing output device parameter to the helper functions that deal with
creation of encapsulation headers is redundant. Output device parameter
can be defined inside those helpers, no need to pass it. Refactor the code by
removing the parameter from the function signature.

This patch doesn't change any functionality.
Signed-off-by: NHadar Hen Zion <hadarh@mellanox.com>
Reviewed-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>

1a8552bd

net/mlx5e: Move the encap entry structure from the eswitch header · c1ae1152

由 Or Gerlitz 提交于 4月 25, 2017

The encap entry structure isn't manipulated by the eswitch code,
hence it can/needs to be removed from the eswitch header.

Do that, and change it to have mlx5e_ prefix.

This patch doesn't change any functionality.
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>

c1ae1152

net/mlx5: Remove encap entry pointer from the eswitch flow attributes · 45247bf2

由 Or Gerlitz 提交于 4月 25, 2017

Encap wise, the tc eswitch flow attribute struct needs to have
only the encap ID which is programmed later to the HW and none
of the higher level encap params, fix that.

This patch doesn't change any functionality.
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>

45247bf2

net/mlx5e: Extendable vport representor netdev private data · 1d447a39

由 Saeed Mahameed 提交于 4月 24, 2017

Make representor netdev private data extendable by adding new struct
"mlx5e_rep_priv" and use it as the rep netdev private data struct
instead of directly pointing to mlx5_eswitch_rep.

Added new en_rep.h header file to contain all representor related
definitions and prototypes, and moved all representor specific logic
into en_rep.c.

Needed for downstream patches to extend representor functionality to
support neighbour update.
Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
Reviewed-by: NOr Gerlitz <ogerlitz@mellanox.com>

1d447a39

25 4月, 2017 1 次提交

net/mlx5e: Fix race in mlx5e_sw_stats and mlx5e_vport_stats · 1510d728

由 Martin KaFai Lau 提交于 4月 20, 2017

We have observed a sudden spike in rx/tx_packets and rx/tx_bytes
reported under /proc/net/dev. There is a race in mlx5e_update_stats()
and some of the get-stats functions (the one that we hit is the
mlx5e_get_stats() which is called by ndo_get_stats64()).

In particular, the very first thing mlx5e_update_sw_counters()
does is 'memset(s, 0, sizeof(*s))'. For example, if mlx5e_get_stats()
is unlucky at one point, rx_bytes and rx_packets could be 0. One second
later, a normal (and much bigger than 0) value will be reported.

This patch is to use a 'struct mlx5e_sw_stats temp' to avoid
a direct memset zero on priv->stats.sw.

mlx5e_update_vport_counters() has a similar race. Hence, addressed
together. However, memset zero is removed instead because
it is not needed.

I am lucky enough to catch this 0-reset in rx multicast:
eth0: 41457665 76804 70 0 0 70 0 47085 15586634 87502 3 0 0 0 3 0
eth0: 41459860 76815 70 0 0 70 0 47094 15588376 87516 3 0 0 0 3 0
eth0: 41460577 76822 70 0 0 70 0 0 15589083 87521 3 0 0 0 3 0
eth0: 41463293 76838 70 0 0 70 0 47108 15595872 87538 3 0 0 0 3 0
eth0: 41463379 76839 70 0 0 70 0 47116 15596138 87539 3 0 0 0 3 0

v2: Remove memset zero from mlx5e_update_vport_counters()
v1: Use temp and memcpy

Fixes: 9218b44d ("net/mlx5e: Statistics handling refactoring")
Suggested-by: NEric Dumazet <eric.dumazet@gmail.com>
Suggested-by: NSaeed Mahameed <saeedm@mellanox.com>
Signed-off-by: NMartin KaFai Lau <kafai@fb.com>
Acked-by: NSaeed Mahameed <saeedm@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1510d728

23 4月, 2017 4 次提交

net/mlx5e: Fix ETHTOOL_GRXCLSRLALL handling · 5e82c9e4

由 Ilan Tayari 提交于 3月 02, 2017

Handler for ETHTOOL_GRXCLSRLALL must set info->data to the size
of the table, regardless of the amount of entries in it.
Existing code does not do that, and this breaks all usage of ethtool -N
or -n without explicit location, with this error:
rmgr: Invalid RX class rules table size: Success

Set info->data to the table size.

Tested:
ethtool -n ens8
ethtool -N ens8 flow-type ip4 src-ip 1.1.1.1 dst-ip 2.2.2.2 action 1
ethtool -N ens8 flow-type ip4 src-ip 1.1.1.1 dst-ip 2.2.2.2 action 1 loc 55
ethtool -n ens8
ethtool -N ens8 delete 1023
ethtool -N ens8 delete 55

Fixes: f913a72a ("net/mlx5e: Add support to get ethtool flow rules")
Signed-off-by: NIlan Tayari <ilant@mellanox.com>
Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>

5e82c9e4

net/mlx5e: Fix small packet threshold · cbad8cdd

由 Eugenia Emantayev 提交于 3月 22, 2017

RX packet headers are meant to be contained in SKB linear part,
and chose a threshold of 128.
It turns out this is not enough, i.e. for IPv6 packet over VxLAN.
In this case, UDP/IPv4 needs 42 bytes, GENEVE header is 8 bytes,
and 86 bytes for TCP/IPv6. In total 136 bytes that is more than
current 128 bytes. In this case expand header flow is reached.
The warning in skb_try_coalesce() caused by a wrong truesize
was already fixed here:
commit 158f323b ("net: adjust skb->truesize in pskb_expand_head()").
Still, we prefer to totally avoid the expand header flow for performance reasons.
Tested regular TCP_STREAM with iperf for 1 and 8 streams, no degradation was found.

Fixes: 461017cb ("net/mlx5e: Support RX multi-packet WQE (Striding RQ)")
Signed-off-by: NEugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>

cbad8cdd

net/mlx5: Fix UAR memory leak · 5ae85b0e

由 Maor Gottlieb 提交于 4月 06, 2017

When UAR is released, we deallocate the device resource, but
don't unmmap the UAR mapping memory.
Fix the leak by unmapping this memory.

Fixes: a6d51b68 ('net/mlx5: Introduce blue flame register allocator)
Signed-off-by: NMaor Gottlieb <maorg@mellanox.com>
Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>

5ae85b0e

net/mlx5e: Make sure the FW max encap size is enough for ipv6 tunnels · 225aabaf

由 Or Gerlitz 提交于 4月 06, 2017

Otherwise the code that fills the ipv6 encapsulation headers could be writing
beyond the allocated headers buffer.

Fixes: ce99f6b9 ('net/mlx5e: Support SRIOV TC encapsulation offloads for IPv6 tunnels')
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: NRoi Dayan <roid@mellanox.com>
Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>

225aabaf