1. 10 Feb 2021 (5 commits)
  2. 09 Feb 2021 (4 commits)
    • mlxsw: spectrum_router: Set offload_failed flag · a4cb1c02
      Committed by Amit Cohen
      When FIB_EVENT_ENTRY_{REPLACE, APPEND} is triggered and route insertion
      fails, a FIB abort is performed.
      
      After aborting, set the appropriate hardware flag so that the kernel emits
      an RTM_NEWROUTE notification with the RTM_F_OFFLOAD_FAILED flag.
      Signed-off-by: Amit Cohen <amcohen@nvidia.com>
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • IPv6: Add "offload failed" indication to routes · 0c5fcf9e
      Committed by Amit Cohen
      After installing a route to the kernel, user space receives an
      acknowledgment, which means the route was installed in the kernel, but not
      necessarily in hardware.
      
      The asynchronous nature of route installation in hardware can lead to a
      routing daemon advertising a route before it was actually installed in
      hardware. This can result in packet loss or mis-routed packets until the
      route is installed in hardware.
      
      To avoid such cases, a previous patch set added the ability to emit
      RTM_NEWROUTE notifications whenever the RTM_F_OFFLOAD/RTM_F_TRAP flags
      change; this behavior is controlled by a sysctl.
      
      With the above-mentioned behavior, user space can tell whether a route was
      offloaded, but if the offload fails there is no indication to user space.
      Following a failure, a routing daemon will wait indefinitely for a
      notification that will never come.
      
      This patch adds an "offload_failed" indication to IPv6 routes, so that
      users will have better visibility into the offload process.
      
      'struct fib6_info' is extended with a new field that indicates whether
      route offload failed. Note that the new field uses a previously unused
      bit, so there is no need to increase the struct size.
      Signed-off-by: Amit Cohen <amcohen@nvidia.com>
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
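The "unused bit" point above can be illustrated with a small, self-contained sketch. This is not the real 'struct fib6_info' layout; the field names and the helper are simplified stand-ins that show why packing the new flag into an already-existing flag byte leaves the struct size unchanged:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sketch, NOT the kernel's actual layout: 'fib6_info' already
 * packs several single-bit flags into one byte, so the new offload_failed
 * bit can reuse an unused bit without growing the struct. */
struct fib6_info_sketch {
	void *nexthop;                 /* stand-in for the real fields */
	uint8_t should_flush:1,
	        dst_nocount:1,
	        dst_nopolicy:1,
	        fib6_destroying:1,
	        offload:1,
	        trap:1,
	        offload_failed:1;      /* new bit fits in the same byte */
	        /* one bit still unused */
};

/* Illustrative setter in the spirit of the hardware-flag update path. */
static void fib6_sketch_hw_flags_set(struct fib6_info_sketch *f6i,
				     int offload, int trap, int offload_failed)
{
	f6i->offload = offload;
	f6i->trap = trap;
	f6i->offload_failed = offload_failed;
}
```

The sizeof comparison below confirms the bitfield byte did not grow the struct.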
    • IPv4: Add "offload failed" indication to routes · 36c5100e
      Committed by Amit Cohen
      After installing a route to the kernel, user space receives an
      acknowledgment, which means the route was installed in the kernel, but not
      necessarily in hardware.
      
      The asynchronous nature of route installation in hardware can lead to a
      routing daemon advertising a route before it was actually installed in
      hardware. This can result in packet loss or mis-routed packets until the
      route is installed in hardware.
      
      To avoid such cases, a previous patch set added the ability to emit
      RTM_NEWROUTE notifications whenever the RTM_F_OFFLOAD/RTM_F_TRAP flags
      change; this behavior is controlled by a sysctl.
      
      With the above-mentioned behavior, user space can tell whether a route was
      offloaded, but if the offload fails there is no indication to user space.
      Following a failure, a routing daemon will wait indefinitely for a
      notification that will never come.
      
      This patch adds an "offload_failed" indication to IPv4 routes, so that
      users will have better visibility into the offload process.
      
      'struct fib_alias' and 'struct fib_rt_info' are extended with a new field
      that indicates whether route offload failed. Note that the new field uses
      a previously unused bit, so there is no need to increase the structs'
      size.
      Signed-off-by: Amit Cohen <amcohen@nvidia.com>
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
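On the consumer side, a routing daemon can inspect rtm_flags from the RTM_NEWROUTE notification to distinguish the hardware states these two patches expose. A minimal sketch; the flag values are copied from linux/rtnetlink.h so the snippet is self-contained (verify them against your kernel headers), and route_hw_state() is a hypothetical helper, not a real API:

```c
#include <assert.h>
#include <string.h>

/* Flag values as defined in <linux/rtnetlink.h>; repeated here only so
 * the sketch compiles standalone. */
#define RTM_F_OFFLOAD        0x4000      /* route is offloaded to HW */
#define RTM_F_TRAP           0x8000      /* route traps packets to CPU */
#define RTM_F_OFFLOAD_FAILED 0x20000000  /* HW offload failed */

/* Hypothetical helper: classify the hardware state carried in the
 * rtm_flags field of an RTM_NEWROUTE notification. */
static const char *route_hw_state(unsigned int rtm_flags)
{
	if (rtm_flags & RTM_F_OFFLOAD_FAILED)
		return "offload failed";
	if (rtm_flags & RTM_F_OFFLOAD)
		return "offloaded";
	if (rtm_flags & RTM_F_TRAP)
		return "trapped";
	return "not offloaded";
}
```

With the new flag, the "offload failed" branch finally becomes reachable, so the daemon no longer waits forever for a success notification.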
    • cxgb4: remove unused vpd_cap_addr · 4429c5fc
      Committed by Heiner Kallweit
      It is likely that this is a leftover from T3 driver heritage. cxgb4 uses
      the PCI core VPD access code that handles detection of VPD capabilities.
      Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 07 Feb 2021 (26 commits)
  4. 06 Feb 2021 (5 commits)
    • net/mlx5e: Handle FIB events to update tunnel endpoint device · 8914add2
      Committed by Vlad Buslov
      Process FIB route update events to dynamically update the stacked device
      rules when tunnel routing changes. Use the rtnl lock to prevent the FIB
      event handler from running concurrently with the neigh update and neigh
      stats workqueue tasks. Use the encap_tbl_lock mutex to synchronize with
      the TC rule update path, which doesn't use the rtnl lock.
      
      FIB event workflow for encap flows:
      
      - Unoffload all flows attached to route encaps from slow or fast path
      depending on encap destination endpoint neigh state.
      
      - Update encap IP header according to new route dev.
      
      - Update the flows' mod_hdr action that is responsible for overwriting the
      reg_c0 source port bits with the source port of the underlying VF of the
      new route dev. This step requires changing the flow create/delete code to
      save the flow parse attribute mod_hdr_acts structure for the whole flow
      lifetime instead of deallocating it after flow creation. Refactor the
      mod_hdr code to allow saving the id of individual mod_hdr actions and
      updating them with a dedicated helper.
      
      - Offload all flows to either slow or fast path depending on encap
      destination endpoint neigh state.
      
      FIB event workflow for decap flows:
      
      - Unoffload all route flows from hardware. When the last route flow is
      deleted, all indirect table rules for the route dev are also deleted.
      
      - Update the flow attr decap_vport and destination MAC according to the
      underlying VF of the new route dev.
      
      - Offload all route flows back to hardware creating new indirect table
      rules according to updated flow attribute data.
      
      Extract some neigh update code into helper functions to be used by both
      the neigh update and route update infrastructure.
      Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
      Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com>
      Reviewed-by: Roi Dayan <roid@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Rename some encap-specific API to generic names · 021905f8
      Committed by Vlad Buslov
      Some of the encap-specific functions and fields will also be used by the
      route update infrastructure in the following patches. Rename them to
      generic names.
      Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
      Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com>
      Reviewed-by: Roi Dayan <roid@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: TC preparation refactoring for routing update event · c7b9038d
      Committed by Vlad Buslov
      The following patch in the series implements a routing update event, which
      requires the ability to modify a rule's match_to_reg modify header actions
      dynamically during the rule's lifetime. In order to accommodate such
      behavior, refactor and extend the TC infrastructure in the following ways:
      
      - Modify the mod_hdr infrastructure to preserve its parse attribute for
      the whole rule lifetime, instead of deallocating it after rule creation.
      
      - Extend the match_to_reg infrastructure with a new function
      mlx5e_tc_match_to_reg_set_and_get_id() that returns the mod_hdr action id,
      which can afterwards be used to update the action, and
      mlx5e_tc_match_to_reg_mod_hdr_change() that can modify an existing action
      by its id.
      
      - Extend the tun API with the new functions
      mlx5e_tc_tun_update_header_ipv{4|6}() that are used to update an existing
      encap entry's tunnel header.
      Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
      Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com>
      Reviewed-by: Roi Dayan <roid@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
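The "set and get an id, change the action later by that id" shape described above can be sketched in a few lines. This is an illustration of the pattern only, not the driver's real data structures; the names mirror the commit's functions in spirit:

```c
#include <assert.h>

/* Illustrative sketch (not the mlx5e driver's actual API): each mod_hdr
 * action's position in the action list serves as a stable id that a later
 * event handler can use to rewrite the action in place. */
#define MAX_ACTS 8

struct mod_hdr_acts_sketch {
	unsigned int num_actions;
	unsigned int data[MAX_ACTS];   /* stand-in for real action entries */
};

/* Append an action and return its id, in the spirit of
 * mlx5e_tc_match_to_reg_set_and_get_id(). Returns -1 when full. */
static int mod_hdr_set_and_get_id(struct mod_hdr_acts_sketch *acts,
				  unsigned int val)
{
	if (acts->num_actions >= MAX_ACTS)
		return -1;
	acts->data[acts->num_actions] = val;
	return (int)acts->num_actions++;
}

/* Rewrite an existing action in place by id, in the spirit of
 * mlx5e_tc_match_to_reg_mod_hdr_change(). */
static void mod_hdr_change(struct mod_hdr_acts_sketch *acts, int id,
			   unsigned int val)
{
	acts->data[id] = val;
}
```

This is exactly why the parse attribute must outlive rule creation: without the saved mod_hdr_acts structure there would be nothing left to index into when the route changes.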
    • net/mlx5e: Refactor neigh update infrastructure · 2221d954
      Committed by Vlad Buslov
      The following patches in the series implement route update, which can
      cause encap entries to migrate between routing devices. Consequently,
      their parent nhe's also need to be transferable between devices instead of
      having the neigh device as part of their immutable key. Move the neigh
      device from struct mlx5_neigh to struct mlx5e_neigh_hash_entry and check
      that the nhe and neigh devices are the same in the workqueue neigh update
      handler.
      
      Save the neigh net_device, which can change dynamically, in the dedicated
      nhe->dev field. With the FIB event handler that is implemented in the
      following patches changing nhe->dev, the NETEVENT_DELAY_PROBE_TIME_UPDATE
      handler can concurrently access the nhe entry when traversing the neigh
      list under the rcu read lock. Processing stale values in that handler
      doesn't change the handler logic, so just wrap all accesses to the dev
      pointer in the {WRITE|READ}_ONCE() helpers.
      Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
      Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com>
      Reviewed-by: Roi Dayan <roid@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
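The {WRITE|READ}_ONCE() discipline above can be shown with simplified userspace stand-ins for the kernel macros (the real ones in the kernel do more; the volatile-cast core is the same idea). The struct slice and helper names are hypothetical, chosen to mirror the nhe->dev usage:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-ins for the kernel's READ_ONCE/WRITE_ONCE:
 * force a single, untorn access through a volatile cast. */
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))
#define READ_ONCE(x)       (*(volatile __typeof__(x) *)&(x))

struct net_device;  /* opaque for this sketch */

/* Hypothetical slice of mlx5e_neigh_hash_entry: nhe->dev may be changed
 * by the FIB event handler while another handler walks the neigh list
 * under the rcu read lock, so every access goes through the macros. */
struct nhe_sketch {
	struct net_device *dev;
};

static void nhe_set_dev(struct nhe_sketch *nhe, struct net_device *dev)
{
	WRITE_ONCE(nhe->dev, dev);   /* writer side (FIB event handler) */
}

static struct net_device *nhe_get_dev(struct nhe_sketch *nhe)
{
	/* Reader may observe the old or the new pointer, never a torn one. */
	return READ_ONCE(nhe->dev);
}
```

Because a stale dev pointer is harmless to the handler's logic, this lockless publish/read pair is all the synchronization the field needs.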
    • net/mlx5e: Create route entry infrastructure · 777bb800
      Committed by Vlad Buslov
      Implement dedicated route entry infrastructure to be used by the route
      update event in the following patch. Both encap flows (indirectly, through
      their corresponding encap entries) and decap flows (directly) are attached
      to a routing entry. Since a route update also requires updating the encap
      (the route device's MAC address is the source MAC address of the tunnel
      encapsulation), the same encap_tbl_lock mutex is used for synchronization.
      
      The new infrastructure looks similar to existing infrastructures for shared
      encap, mod_hdr and hairpin entries:
      
      - Per-eswitch hash table is used for quick entry lookup.
      
      - Flows are attached to per-entry linked list and hold reference to entry
        during their lifetime.
      
      - Atomic reference counting and rcu mechanisms are used as synchronization
        primitives for concurrent access.
      
      The infrastructure also enables connection tracking on stacked device
      topologies by attaching the CT chain 0 flow on the tunneling dev to the
      decap route entry.
      Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
      Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com>
      Reviewed-by: Roi Dayan <roid@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
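The "flows hold a reference to the entry during their lifetime" pattern from the list above can be sketched with C11 atomics (the kernel uses its own refcount_t and RCU primitives; the names here are illustrative, not the driver's API):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Sketch of the shared-entry lifetime pattern: a route entry with an
 * atomic refcount, one reference held per attached flow. Key, flow list
 * and hash-table linkage are elided. Names are hypothetical. */
struct route_entry_sketch {
	atomic_int refcnt;
};

static struct route_entry_sketch *route_entry_create(void)
{
	struct route_entry_sketch *r = calloc(1, sizeof(*r));
	atomic_init(&r->refcnt, 1);   /* creator holds the first reference */
	return r;
}

static struct route_entry_sketch *
route_entry_get(struct route_entry_sketch *r)
{
	atomic_fetch_add(&r->refcnt, 1);   /* another flow attaches */
	return r;
}

/* Drop one reference; returns 1 when the last one was dropped and the
 * entry was freed, 0 otherwise. */
static int route_entry_put(struct route_entry_sketch *r)
{
	if (atomic_fetch_sub(&r->refcnt, 1) == 1) {
		free(r);
		return 1;
	}
	return 0;
}
```

As with the driver's encap, mod_hdr and hairpin tables, the last flow to detach tears the entry down, so lookup, attach and detach need no global teardown pass.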