Commits · 186962ebb7ecd74d305cf167e99bf1885b89ea8c · openeuler / Kernel

11 Mar, 2017 · 3 commits

mlxsw: spectrum: Associate PVID vPort with appropriate netdev · 186962eb

Authored by Ido Schimmel on Mar 10, 2017

When a VLAN device is configured on top of a LAG device (f.e.,
bond0.10), a vPort is created on top of each of the LAG's slaves and its
'dev' pointer is set to the VLAN device.

This is in contrast to the implicit PVID vPort (representing 'bond0'),
whose 'dev' pointer keeps pointing to the port netdev itself (f.e.,
'sw1p1').

Make both cases consistent by setting their 'dev' pointer to the actual
netdev they represent. Either the LAG device itself (in the case of the
PVID vPort) or the VLAN device on top of it.

This will later allow us to more easily understand for which netdev we
should create the router interface (RIF) upon enslavement to a VRF
master.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum: Don't assume upper device's type · 1f88061e

Authored by Ido Schimmel on Mar 10, 2017

When an upper device is configured on top of a vPort we make sure it's a
bridge master during PRECHANGEUPPER and fail otherwise. Therefore, when
CHANGEUPPER is later received we don't bother checking the upper's type.

Make the code more extendable in preparation for VRF uppers, by checking
the upper's type.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum: Sanitize bridge's upper devices · b414970e

Authored by Ido Schimmel on Mar 10, 2017

We're going to allow bridges stacked on top of port netdevs to be
enslaved to a VRF, but for now, only VLAN uppers of the VLAN-aware
bridge are supported.

Sanitize any other bridge upper. This is consistent with the way we
sanitize port netdevs' uppers.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

10 Mar, 2017 · 15 commits

mlxsw: spectrum: Add support for flower matches on VLAN ID, PCP · 9caab08a

Authored by Petr Machata on Mar 09, 2017

Introduce MLXSW_AFK_ELEMENT_VID, PCP and declare them in afk_element
infos that contain them.  Use the elements when VLAN ID or priority are
used in the flow.

Also add MLXSW_AFK_ELEMENT_VID, PCP to mlxsw_sp_acl_tcam_pattern_ipv4.
Both items are included in mlxsw_sp_afk_element_info_l2_dmac,
resp. _smac, and both MLXSW_AFK_ELEMENT_SMAC and _DMAC are already in
the pattern.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum: Add support for vlan modify TC action · a150201a

Authored by Petr Machata on Mar 09, 2017

Add VLAN action offloading. Invoke it from Spectrum flower handler for
"vlan modify" actions.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlx4: remove duplicate code in mlx4_en_process_rx_cq() · 68b8df46

Authored by Eric Dumazet on Mar 08, 2017

We should keep one way to build skbs, regardless of GRO being on or off.

Note that I made sure to defer as much as possible the point we need to
pull data from the frame, so that future prefetch() we might add
are more effective.

These skb attributes derive from the CQE or ring :
 ip_summed, csum
 hash
 vlan offload
 hwtstamps
 queue_mapping

As a bonus, this patch removes mlx4 dependency on eth_get_headlen()
which is very often broken enough to give us headaches.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlx4: make validate_loopback() more generic · 6969cf0f

Authored by Eric Dumazet on Mar 08, 2017

Testing a boolean in fast path is not worth duplicating
the code allocating packets, when GRO is on or off.

If this proves to be a problem, we might later use a jump label.

Next patch will remove this duplicated code and ease code review.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlx4: factorize page_address() calls · 02e6fd3e

Authored by Eric Dumazet on Mar 08, 2017

We need to compute the frame virtual address at different points.
Do it once.

The following patch will use the new va address for validate_loopback().
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlx4: do not access rx_desc from mlx4_en_process_rx_cq() · 9e8c0395

Authored by Eric Dumazet on Mar 08, 2017

Instead of fetching dma address from rx_desc->data[0].addr,
prefer using frags[0].dma + frags[0].page_offset to avoid
a potential cache line miss.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlx4: add rx_alloc_pages counter in ethtool -S · 7d7bfc6a

Authored by Eric Dumazet on Mar 08, 2017

This new counter tracks number of pages that we allocated for one port.

lpaa24:~# ethtool -S eth0 | egrep 'rx_alloc_pages|rx_packets'
     rx_packets: 306755183
     rx_alloc_pages: 932897
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
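
As a rough reading of the sample output in the commit message above, dividing rx_packets by rx_alloc_pages gives the average number of received packets per page allocation. The helper name below is hypothetical; only the two counter values come from the commit message. A high ratio would suggest that pages are mostly being reused rather than freshly allocated.

```python
def packets_per_allocated_page(rx_packets: int, rx_alloc_pages: int) -> float:
    """Average number of received packets per page allocation."""
    if rx_alloc_pages == 0:
        raise ValueError("no pages have been allocated yet")
    return rx_packets / rx_alloc_pages


# Counter values copied from the ethtool -S output in the commit message.
ratio = packets_per_allocated_page(306755183, 932897)
print(f"{ratio:.1f} packets per allocated page")
```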

mlx4: add page recycling in receive path · 34db548b

Authored by Eric Dumazet on Mar 08, 2017

Same technique as some Intel drivers, for arches where PAGE_SIZE = 4096.

In most cases, pages are reused because they were consumed
before we could loop around the RX ring.

This brings back performance, and is even better:
a single TCP flow reaches 30Gbit on my hosts.

v2: added full memset() in mlx4_en_free_frag(), as Tariq found it was needed
if we switch to large MTU, as priv->log_rx_info can dynamically be changed.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
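
The page recycling idea described above can be pictured with a small model. This is a simplified sketch, not the driver's code: the RxPage class, the helper names, and the two-halves layout are assumptions made for illustration. It assumes a 4096-byte page split into two 2048-byte halves, where the driver flips to the other half only when nobody else still holds a reference to the page.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096             # assumption: 4 KiB pages, as in the commit message
FRAG_SIZE = PAGE_SIZE // 2   # assumption: one 2048-byte fragment per half page


@dataclass
class RxPage:
    """Hypothetical stand-in for the page backing one RX slot."""
    refcount: int = 1        # 1 means only the driver still holds the page
    offset: int = 0          # half of the page the NIC writes into next


def hand_fragment_to_stack(page: RxPage) -> None:
    """The stack takes a reference on the half that was just filled."""
    page.refcount += 1


def stack_frees_fragment(page: RxPage) -> None:
    """The stack has consumed the fragment and dropped its reference."""
    page.refcount -= 1


def try_recycle(page: RxPage) -> bool:
    """Flip to the other half if the stack no longer holds any part of the page."""
    if page.refcount != 1:
        return False         # still in use elsewhere: a fresh page is needed
    page.offset ^= FRAG_SIZE
    return True


# Fragment consumed before the ring looped around: the page is reused.
page = RxPage()
hand_fragment_to_stack(page)
stack_frees_fragment(page)
recycled = try_recycle(page)

# Fragment still held by the stack: the page cannot be reused.
busy = RxPage()
hand_fragment_to_stack(busy)
recycled_busy = try_recycle(busy)
```

In this model the common case from the commit message, where pages are consumed before the RX ring loops around, corresponds to the first scenario.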

mlx4: use order-0 pages for RX · b5a54d9a

Authored by Eric Dumazet on Mar 08, 2017

Use of order-3 pages is problematic in some cases.

This patch might add three kinds of regression :

1) a CPU performance regression, but we will add later page
recycling and performance should be back.

2) TCP receiver could grow its receive window slightly slower,
   because skb->len/skb->truesize ratio will decrease.
   This is mostly ok, we prefer being conservative to not risk OOM,
   and eventually tune TCP better in the future.
   This is consistent with other drivers using 2048 per ethernet frame.

3) Because we allocate one page per RX slot, we consume more
   memory for the ring buffers. XDP already had this constraint anyway.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
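
To make point 3 above concrete, here is a back-of-the-envelope sketch of ring buffer memory. The ring size of 1024 slots is a hypothetical value and 4096-byte pages are assumed; the 2048-byte figure is the per-frame size the commit message attributes to other drivers.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def ring_buffer_bytes(rx_slots: int, bytes_per_slot: int) -> int:
    """Memory backing one RX ring when every slot owns bytes_per_slot bytes."""
    return rx_slots * bytes_per_slot


rx_slots = 1024  # hypothetical ring size, for illustration only

one_page_per_slot = ring_buffer_bytes(rx_slots, PAGE_SIZE)   # order-0 page per slot
packed_2048 = ring_buffer_bytes(rx_slots, 2048)              # 2048 bytes per frame

print(one_page_per_slot // 1024, "KiB vs", packed_2048 // 1024, "KiB")
```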

mlx4: removal of frag_sizes[] · 60c7f5ae

Authored by Eric Dumazet on Mar 08, 2017

We will soon use order-0 pages, and frag truesize will more precisely
match real sizes.

In the new model, we prefer to use <= 2048 bytes fragments, so that
we can use page-recycle technique on PAGE_SIZE=4096 arches.

We will still pack as many frames as possible on arches with big
pages, like PowerPC.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlx4: reduce rx ring page_cache size · acd7628d

Authored by Eric Dumazet on Mar 08, 2017

We only need to store the page and dma address.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlx4: rx_headroom is a per port attribute · d85f6c14

Authored by Eric Dumazet on Mar 08, 2017

No need to duplicate it per RX queue / frags.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlx4: get rid of frag_prefix_size · aaca121d

Authored by Eric Dumazet on Mar 08, 2017

Using per frag storage for frag_prefix_size is really silly.

mlx4_en_complete_rx_desc() has all needed info already.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlx4: remove order field from mlx4_en_frag_info · 159ddfd2

Authored by Eric Dumazet on Mar 08, 2017

This is really a port attribute, no need to duplicate it per
RX queue and per frag.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlx4: dma_dir is a mlx4_en_priv attribute · 69ba9431

Authored by Eric Dumazet on Mar 08, 2017

No need to duplicate it for all queues and frags.

num_frags & log_rx_info become u8 to save space.
u8 accesses are a bit faster than u16 anyway.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

09 Mar, 2017 · 2 commits

mlxsw: pci: Remove unused bit · 61793af6

Authored by Ido Schimmel on Mar 06, 2017

The overrun ignore bit isn't supported by the device's firmware and was
recently removed from the programmer's reference manual (PRM).

Remove it from the driver as well.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum: Fix helper function and port variable names · 1182e536

Authored by Jiri Pirko on Mar 06, 2017

Commit dd82364c ("mlxsw: Flip to the new dev walk API") did some
small changes in mlxsw code, but it did not respect the naming
conventions. So fix this now.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

02 Mar, 2017 · 1 commit

mlxsw: spectrum_router: Avoid potential packets loss · f7df4923

Authored by Ido Schimmel on Feb 28, 2017

When the structure of the LPM tree changes (f.e., due to the addition of
a new prefix), we unbind the old tree and then bind the new one. This
may result in temporary packet loss.

Instead, overwrite the old binding with the new one.

Fixes: 6b75c480 ("mlxsw: spectrum_router: Add virtual router management")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

27 Feb, 2017 · 1 commit

net/mlx4_en: fix overflow in mlx4_en_init_timestamp() · 47d3a075

Authored by Eric Dumazet on Feb 23, 2017

The cited commit does a great job of finding optimal shift/multiplier
values assuming a 10-second wraparound, but forgot to change the
overflow_period computation.

It overflows in cyclecounter_cyc2ns(), and the final result is 804 ms,
which is silly.

Let's simply use 5 seconds, no need to recompute this, given how it is
supposed to work.

Later, we will use a timer instead of a work queue, since the new RX
allocation scheme will no longer need mlx4_en_recover_from_oom() and the
service_task firing every 250 ms.

Fixes: 31c128b6 ("net/mlx4_en: Choose time-stamping shift value according to HW frequency")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Tariq Toukan <tariqt@mellanox.com>
Cc: Eugenia Emantayev <eugenia@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
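
The kind of overflow described above can be illustrated with a small model of a cycles-to-nanoseconds conversion done in 64-bit arithmetic. This is a sketch under stated assumptions, not the kernel's cyclecounter_cyc2ns(): it ignores the fractional-nanosecond handling, and the mult, shift, clock frequency, and 48-bit counter width are hypothetical values chosen so that 10 seconds of cycles fits in 64 bits while the full counter range does not.

```python
U64_MASK = (1 << 64) - 1


def cyc2ns_u64(cycles: int, mult: int, shift: int) -> int:
    """(cycles * mult) >> shift, with the product truncated to 64 bits."""
    return ((cycles * mult) & U64_MASK) >> shift


def cyc2ns_exact(cycles: int, mult: int, shift: int) -> int:
    """The same conversion without any 64-bit truncation."""
    return (cycles * mult) >> shift


# Hypothetical parameters: a 500 MHz clock (2 ns per cycle) and shift = 30.
shift = 30
mult = 2 << shift                        # 2 ns per cycle, scaled by 2**shift
ten_seconds_of_cycles = 5_000_000_000    # 10 s at 500 MHz
counter_mask = (1 << 48) - 1             # hypothetical 48-bit counter range

# Within the 10-second window the product still fits in 64 bits.
in_range = cyc2ns_u64(ten_seconds_of_cycles, mult, shift)

# Converting the whole counter range wraps, so the result is far too small.
wrapped = cyc2ns_u64(counter_mask, mult, shift)
exact = cyc2ns_exact(counter_mask, mult, shift)
print(in_range, wrapped, exact)
```

With these made-up numbers the truncated result is orders of magnitude smaller than the exact one, which is the general shape of the problem the commit sidesteps by using a fixed period.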

23 Feb, 2017 · 11 commits

net/mlx4_en: Use __skb_fill_page_desc() · 7f0137e2

Authored by Eric Dumazet on Feb 23, 2017

Or we might miss the fact that a page was allocated from memory reserves.

Fixes: dceeab0e ("mlx4: support __GFP_MEMALLOC for rx")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx4_core: Use cq quota in SRIOV when creating completion EQs · 6ed63d84

Authored by Jack Morgenstein on Feb 23, 2017

When creating EQs to handle CQ completion events for the PF
or for VFs, we create enough EQE entries to handle completions
for the max number of CQs that can use that EQ.

When SRIOV is activated, the max number of CQs a VF (or the PF) can
obtain is its CQ quota (determined by the Hypervisor resource tracker).
Therefore, when creating an EQ, the number of EQE entries that the VF
should request for that EQ is the CQ quota value (and not the total
number of CQs available in the FW).

Under SRIOV, the PF must also use its CQ quota, because
the resource tracker also controls how many CQs the PF can obtain.

Using the FW total CQs instead of the CQ quota when creating EQs resulted
in wasting MTT entries, due to allocating more EQEs than were needed.

Fixes: 5a0d0a61 ("mlx4: Structures and init/teardown for VF resource quotas")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Reported-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx4_core: Fix VF overwrite of module param which disables DMFS on new probed PFs · 95f1ba9a

Authored by Majd Dibbiny on Feb 23, 2017

In the VF driver, module parameter mlx4_log_num_mgm_entry_size was
mistakenly overwritten -- and in a manner which overrode the
device-managed flow steering option encoded in the parameter.

log_num_mgm_entry_size is a global module parameter which
affects all ConnectX-3 PFs installed on that host.
If a VF changes log_num_mgm_entry_size, this will affect all PFs
which are probed subsequent to the change (by disabling DMFS for
those PFs).

Fixes: 3c439b55 ("mlx4_core: Allow choosing flow steering mode")
Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx4: Spoofcheck and zero MAC can't coexist · 745d8ae4

Authored by Eugenia Emantayev on Feb 23, 2017

Spoofcheck can't be enabled if VF MAC is zero.
Vice versa, can't zero MAC if spoofcheck is on.

Fixes: 8f7ba3ca ('net/mlx4: Add set VF mac address support')
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx4: Change ENOTSUPP to EOPNOTSUPP · 423b3aec

Authored by Or Gerlitz on Feb 23, 2017

As ENOTSUPP is specific to NFS, change the return error value to
EOPNOTSUPP in various places in the mlx4 driver.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Suggested-by: Yotam Gigi <yotamg@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx5e: Fix wrong CQE decompression · 36154be4

Authored by Tariq Toukan on Feb 22, 2017

In cqe compression with striding RQ, the decompression of the CQE field
wqe_counter was done with a wrong wraparound value.
This caused handling cqes with a wrong pointer to wqe (rx descriptor)
and creating SKBs with wrong data, pointing to wrong (and already consumed)
strides/pages.

In striding RQ, the CQE field wqe_counter holds the stride index
instead of the WQE index. Hence, when decompressing a CQE,
wqe_counter should have wrapped around at the number of strides
in a single multi-packet WQE.

We dropped this wrap-around mask altogether in CQE decompression of
striding RQ. It is not needed, as in such cases the CQE compression
session would break because of a different value of the wqe_id field,
starting a new compression session.

Tested:
 ethtool -K ethxx lro off/on
 ethtool --set-priv-flags ethxx rx_cqe_compress on
 super_netperf 16 {ipv4,ipv6} -t TCP_STREAM -m 50 -D
 verified no csum errors and no page refcount issues.

Fixes: 7219ab34 ("net/mlx5e: CQE compression")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reported-by: Tom Herbert <tom@herbertland.com>
Cc: kernel-team@fb.com
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx5e: Update MPWQE stride size when modifying CQE compress state · 6dc4b54e

Authored by Saeed Mahameed on Feb 22, 2017

When the admin enables/disables cqe compression, updating
mpwqe stride size is required:
    CQE compress ON  ==> stride size = 256B
    CQE compress OFF ==> stride size = 64B

This is already done on driver load via mlx5e_set_rq_type_params, all we
need is just to call it on arbitrary admin changes of cqe compression
state via priv flags or when changing timestamping state
(as it is mutually exclusive with cqe compression).

This bug introduces no functional damage, it only makes cqe compression
occur less often, since in ConnectX4-LX CQE compression is performed
only on packets smaller than stride size.

Tested:
 ethtool --set-priv-flags ethxx rx_cqe_compress on
 pktgen with  64 < pkt size < 256 and netperf TCP_STREAM (IPv4/IPv6)
 verify `ethtool -S ethxx | grep compress` are advancing more often
 (rapidly)

Fixes: 7219ab34 ("net/mlx5e: CQE compression")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Cc: kernel-team@fb.com
Signed-off-by: David S. Miller <davem@davemloft.net>
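
The relationship between CQE compression state and stride size stated above can be written down as a tiny sketch. The function names are hypothetical; the 256B/64B values and the "only packets smaller than the stride size" rule are taken from the commit message.

```python
def mpwqe_stride_size(cqe_compress: bool) -> int:
    """Stride size in bytes for the given CQE compression state."""
    return 256 if cqe_compress else 64


def eligible_for_compression(pkt_len: int, stride_size: int) -> bool:
    """Per the commit message, only packets smaller than the stride qualify."""
    return pkt_len < stride_size


# A 128-byte packet with compression enabled by the admin:
stale_stride = mpwqe_stride_size(False)   # stride not updated after the change
fresh_stride = mpwqe_stride_size(True)    # stride updated along with the flag

missed = eligible_for_compression(128, stale_stride)
compressed = eligible_for_compression(128, fresh_stride)
```

This mirrors why the bug only made compression occur less often: packets between 64 and 256 bytes were simply never eligible while the stride stayed at 64B.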

net/mlx5e: Fix broken CQE compression initialization · b0d4660b

Authored by Tariq Toukan on Feb 22, 2017

Some of the RQ type parameters are derived from the CQE compression state
flag, but the CQE compression flag was initialized only after the RQ type
parameters were set up. This leads to loading the RQ with a stride size
smaller than what we want when CQE compression is on.

This bug introduces no functional damage, it only makes CQE compression
occur less often, since in ConnectX4-LX CQE compression is performed
only on packets smaller than stride size.

Fix this by marking default status of CQE compression in PFLAG prior to
calling mlx5e_set_rq_priv_params(), as it inits some fields based on it.

Tested:
 load driver on systems where rx CQE compress will be on (MH)
 pktgen with  64 < pkt size < 256 and netperf TCP_STREAM (IPv4/IPv6)
 verify `ethtool -S ethxx | grep compress` are advancing more often
 (rapidly)

Fixes: 2fc4bfb7 ("net/mlx5e: Dynamic RQ type infrastructure")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Cc: kernel-team@fb.com
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx5e: Do not reduce LRO WQE size when not using build_skb · 4078e637

Authored by Tariq Toukan on Feb 22, 2017

When rq_type is Striding RQ, no room of SKB_RESERVE is needed
as SKB allocation is not done via build_skb.

Fixes: e4b85508 ("net/mlx5e: Slightly reduce hardware LRO size")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx5e: Register/unregister vport representors on interface attach/detach · 6f08a22c

Authored by Saeed Mahameed on Feb 22, 2017

Currently vport representors are added only on driver load and removed on
driver unload.  Apparently we forgot to handle them when we added the
seamless reset flow feature.  This left the representor netdevs alive
and active with open HW resources on pci shutdown and on error reset
flows.

To overcome this we move their handling to interface attach/detach, so
they would be cleaned up on shutdown and recreated on reset flows.

Fixes: 26e59d80 ("net/mlx5e: Implement mlx5e interface attach/detach callbacks")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx5e: s390 system compilation fix · 18bcf742

Authored by Mohamad Haj Yahia on Feb 22, 2017

Add the header includes necessary for s390 arch compilation.

Fixes: e586b3b0 ("net/mlx5: Ethernet Datapath files")
Fixes: d605d668 ("net/mlx5e: Add support for ethtool self..")
Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

20 Feb, 2017 · 2 commits

mlx4: reduce OOM risk on arches with large pages · 3608b13c

Authored by Eric Dumazet on Feb 18, 2017

Since mlx4 NICs are used on PowerPC with 64K pages, we need to adapt
MLX4_EN_ALLOC_PREFER_ORDER definition.

Otherwise, a fragment sitting in an out of order TCP queue can hold
0.5 Mbytes and it is a serious OOM risk.

Fixes: 51151a16 ("mlx4: allow order-0 memory allocations in RX path")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
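
The 0.5 Mbytes figure above follows from simple arithmetic, sketched here. The allocation order of 3 is an assumption (the commit message does not state the value of MLX4_EN_ALLOC_PREFER_ORDER), chosen because it reproduces the quoted figure with 64K pages.

```python
def allocation_bytes(page_size: int, order: int) -> int:
    """Size of a contiguous allocation of 2**order pages."""
    return page_size << order


ASSUMED_ORDER = 3  # assumption: not stated in the commit message

large_page_alloc = allocation_bytes(64 * 1024, ASSUMED_ORDER)  # 64K pages
small_page_alloc = allocation_bytes(4 * 1024, ASSUMED_ORDER)   # 4K pages

print(large_page_alloc // 1024, "KiB vs", small_page_alloc // 1024, "KiB")
```

Since one small fragment still referencing such an allocation keeps the whole block alive, the per-fragment cost grows with the page size.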

mlx4: fix potential divide by 0 in mlx4_en_auto_moderation() · f5a57723

Authored by Eric Dumazet on Feb 16, 2017

1) In the case where rate == priv->pkt_rate_low == priv->pkt_rate_high,
mlx4_en_auto_moderation() does a divide by zero.

2) We want to properly change the moderation parameters if rx_frames
was changed (like in ethtool -C eth0 rx-frames 16)
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
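
Point 1 above can be sketched as a linear interpolation between two rate thresholds. This is an illustrative model rather than the driver's code: the function and parameter names are hypothetical, and using inclusive comparisons is just one way to make sure the division is only reached when the two thresholds differ.

```python
def moderation_time(rate: int, rate_low: int, rate_high: int,
                    usecs_low: int, usecs_high: int) -> int:
    """Interpolate a moderation time (in usecs) from the packet rate."""
    if rate <= rate_low:
        return usecs_low
    if rate >= rate_high:
        return usecs_high
    # Here rate_low < rate < rate_high, so the divisor is strictly positive.
    return ((rate - rate_low) * (usecs_high - usecs_low)
            // (rate_high - rate_low) + usecs_low)


# rate == rate_low == rate_high: handled by the first branch, no division.
degenerate = moderation_time(100, 100, 100, 0, 128)

# Halfway between the thresholds gives half of the usecs range.
midpoint = moderation_time(300, 100, 500, 0, 128)
```

With strict comparisons instead, the degenerate case would fall through to the division with a zero divisor.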

17 Feb, 2017 · 1 commit

mlx4: do not fire tasklet unless necessary · 01f0f425

Authored by Eric Dumazet on Feb 10, 2017

All rx and tx netdev interrupts are handled respectively
by mlx4_en_rx_irq() and mlx4_en_tx_irq() which simply schedule a NAPI.

But mlx4_eq_int() also fires a tasklet to service all items that were
queued via mlx4_add_cq_to_tasklet(), but this handler was not called
unless user cqe was handled.

This is very confusing, as "mpstat -I SCPU ..." shows a huge number of
tasklet invocations.

This patch saves this overhead, by carefully firing the tasklet directly
from mlx4_add_cq_to_tasklet(), removing four atomic operations per IRQ.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Tariq Toukan <tariqt@mellanox.com>
Cc: Saeed Mahameed <saeedm@mellanox.com>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

16 Feb, 2017 · 2 commits

mlxsw: acl: Use PBS type for forward action · 0c921a89

Authored by Jiri Pirko on Feb 15, 2017

The current behaviour of "mirred redirect" action (forward) offload is a
bit odd. For matched packets the action forwards them to the desired
destination, but it also lets duplicates of the packets continue down the
original path (bridge, router, etc.). That is more like "mirred mirror".
Fix this by using PBS type which behaves exactly like "mirred redirect".
Note that PBS does not support loopback mode.

Fixes: 4cda7d8d ("mlxsw: core: Introduce flexible actions support")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlx4: do not use rwlock in fast path · 99f5711e

Authored by Eric Dumazet on Feb 09, 2017

Using a reader-writer lock in fast path is silly, when we can
instead use RCU or a seqlock.

For mlx4 hwstamp clock, a seqlock is the way to go, removing
two atomic operations and false sharing.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
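
For readers unfamiliar with seqlocks, the protocol can be sketched as follows. This is a single-threaded toy model with hypothetical names, not the kernel API: it has no memory barriers and only shows the idea that the writer bumps a sequence counter before and after updating, while readers take no lock and retry if the counter was odd or changed under them.

```python
class SeqLock:
    """Toy model of a sequence lock protecting a single value."""

    def __init__(self, value):
        self.sequence = 0        # even: stable, odd: write in progress
        self.value = value

    def write(self, value) -> None:
        self.sequence += 1       # becomes odd: readers must retry
        self.value = value
        self.sequence += 1       # becomes even again: update is visible

    def read(self):
        while True:
            start = self.sequence
            if start & 1:
                continue         # a write is in progress, try again
            value = self.value
            if self.sequence == start:
                return value     # nothing changed while we were reading


clock = SeqLock(value=1000)
clock.write(2000)
latest = clock.read()
```

Readers never modify shared state in this scheme, which fits the commit's point about avoiding atomic operations and false sharing on the read side.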

15 Feb, 2017 · 2 commits

mlxsw: spectrum: Change ipv6 unregistered mc table · 1ca6270b

Authored by Nogah Frankel on Feb 13, 2017

Point the unregistered IPv6 mc table back to the bc table.
This is done since IPv6 mcast snooping is not supported for Spectrum yet.
Reported-by: Jiri Pirko <jiri@mellanox.com>
Fixes: 71c365bd ("mlxsw: spectrum: Separate bc and mc floods")
Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Tested-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx5e: Disable preemption when doing TC statistics upcall · fed06ee8

Authored by Or Gerlitz on Feb 12, 2017

When called by HW offloading drivers, the TC action (e.g
net/sched/act_mirred.c) code uses this_cpu logic, e.g

 _bstats_cpu_update(this_cpu_ptr(a->cpu_bstats), bytes, packets)

Per the kernel documentation, preemption should be disabled; add that.

Before the fix, when running with CONFIG_PREEMPT set, we get a

BUG: using smp_processor_id() in preemptible [00000000] code: tc/3793

assertion from the TC action (mirred) stats_update callback.

Fixes: aad7e08d ('net/mlx5e: Hardware offloaded flower filter statistics support')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
