Commits · e9cf8990faea42a0809b9f1e618effd6fd836e8a · openeuler / Kernel

04 Jul, 2022 10 commits

mlxsw: Add ubridge to config profile · e9cf8990

Committed by Amit Cohen on Jul 04, 2022

The unified bridge model is enabled via the CONFIG_PROFILE command
during driver initialization. Add the definition of the relevant fields
to the command's payload in preparation for unified bridge enablement.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e9cf8990

mlxsw: Add support for 802.1Q FID family · bf73904f

Committed by Amit Cohen on Jul 04, 2022

Using the legacy bridge model, there is no VID classification at egress
for 802.1Q FIDs, which means that the VID is maintained.

This behavior causes the limitation that 802.1Q FIDs cannot work with VXLAN.
This limitation stems from the fact that a decapsulated VXLAN packet should
not contain a VLAN tag. If such a packet was to egress from a local port
using a 802.1Q FID, it would "maintain" its VLAN on egress, which is no
VLAN at all.

Currently 802.1Q FIDs are emulated in mlxsw driver using 802.1D FIDs. Using
unified bridge model, there is a FID->VID mapping, so it is possible to
stop emulating 802.1Q FIDs.

The main changes are:
1. Use 'SFGC.bridge_type' = 0, to separate between 802.1Q FIDs and
   802.1D FIDs.
2. Use VLAN RIF instead of the emulated one (VLAN_EMU which is emulated
   using FID RIF).
3. Create VID->FID mapping when the FID is created. Then when a new port
   is mapped to the FID, if it is not in virtual mode, no new mapping is
   needed. Save the new port in 'port_vid_list', to be able to update a
   RIF in all {Port, VID}->FID mappings in case the port enters virtual
   mode later.
4. Add a dedicated operation function per FID family to update RIF for
   VID->FID mappings. For 802.1d and rFID families, just return. For
   802.1q family, handle the global mapping which is created for new 802.1q
   FID.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bf73904f

mlxsw: Add new FID families for unified bridge model · d4324e31

Committed by Amit Cohen on Jul 04, 2022

In the unified bridge model, mlxsw will no longer emulate 802.1Q FIDs
using 802.1D FIDs. The new FID table will look as follows:

     +---------------+
     | 802.1q FIDs   | 4K entries
     | [1..4094]     |
     +---------------+
     | 802.1d FIDs   | 1K entries
     | [4095..5118]  |
     +---------------+
     | Dummy FIDs    | 1 entry
     | [5119..5119]  |
     +---------------+
     | rFIDs         | 11K entries
     | [5120..16383] |
     +---------------+
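
As an illustration only (this is not code from the patch), the table layout
above can be modeled as a range lookup. The enum, struct and function names
are hypothetical; only the index ranges are taken from the diagram.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names; only the index ranges come from the diagram. */
enum fid_family {
	FID_FAMILY_NONE,	/* index not covered by any family */
	FID_FAMILY_8021Q,
	FID_FAMILY_8021D,
	FID_FAMILY_DUMMY,
	FID_FAMILY_RFID,
};

struct fid_range {
	enum fid_family family;
	uint16_t start;		/* first FID index, inclusive */
	uint16_t end;		/* last FID index, inclusive */
};

static const struct fid_range fid_ranges[] = {
	{ FID_FAMILY_8021Q, 1, 4094 },
	{ FID_FAMILY_8021D, 4095, 5118 },
	{ FID_FAMILY_DUMMY, 5119, 5119 },
	{ FID_FAMILY_RFID, 5120, 16383 },
};

#define FID_RANGES_COUNT (sizeof(fid_ranges) / sizeof(fid_ranges[0]))

/* Return the family whose range contains fid_index. */
enum fid_family fid_family_lookup(uint16_t fid_index)
{
	size_t i;

	for (i = 0; i < FID_RANGES_COUNT; i++) {
		if (fid_index >= fid_ranges[i].start &&
		    fid_index <= fid_ranges[i].end)
			return fid_ranges[i].family;
	}
	return FID_FAMILY_NONE;
}

/* Return the number of FID indexes reserved for a family, 0 if unknown. */
unsigned int fid_family_size(enum fid_family family)
{
	size_t i;

	for (i = 0; i < FID_RANGES_COUNT; i++) {
		if (fid_ranges[i].family == family)
			return fid_ranges[i].end - fid_ranges[i].start + 1u;
	}
	return 0;
}
```

With these ranges, fid_family_size() gives 4094 entries for 802.1q, 1024 for
802.1d, 1 for the dummy FID and 11264 for rFIDs, which matches the rounded
"4K", "1K", "1" and "11K" figures in the diagram.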

In order to make the change easier to review, four new temporary FID
families will be added (e.g., MLXSW_SP_FID_TYPE_8021D_UB) and will not
be registered with the FID core until mlxsw is flipped to use the unified
bridge model.

Add .1d, rfid and dummy FID families for unified bridge; the next patch
will add the .1q family separately, as it requires more changes.

The following changes are required:
1. Add 'smpe_index_valid' field to 'struct mlxsw_sp_fid_family' and set
   SFMR.smpe accordingly. SMPE index is reserved for rFIDs, as their
   flooding is handled by firmware, and always reserved in Spectrum-1,
   as it is configured as part of PGT table.

2. Add 'ubridge' field to 'struct mlxsw_sp_fid_family'. This field will
   be removed later; use it in mlxsw_sp_fid_family_{register,unregister}()
   to skip the registration / unregistration of the new families when the
   legacy model is used.

3. Indexes - the start and end indexes of each FID family will need to be
   changed according to the above diagram.

4. Add flood tables for unified bridge model, use 'fid_offset' as table
   type, as in the new model the access to flood tables will be using
   'fid_offset' calculation.

5. FID family operation changes:
   a. rFID is supposed to be created using SFMR, as it is not created by
      firmware using unified bridge model.
   b. port_vid_map() should perform SVFA for rFID, as the mapping is not
      created by firmware using unified bridge model.
   c. flood_index() is not aligned to the new model, as this function will
      be removed later.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d4324e31

mlxsw: Add support for VLAN RIFs · 662761d8

Committed by Amit Cohen on Jul 04, 2022

Router interfaces (RIFs) constructed on top of VLAN-aware bridges are of
'VLAN' type, whereas RIFs constructed on top of VLAN-unaware bridges are of
'FID' type.

Currently 802.1Q FIDs are emulated using 802.1D FIDs, therefore VLAN RIFs
are emulated using FID RIFs. As part of converting the driver to use
unified bridge model, 802.1Q FIDs and VLAN RIFs will be used.

The egress FID is required for VLAN RIFs in Spectrum-2 and above, but not
in Spectrum-1, as in Spectrum-1 the mapping for VLAN RIFs is VID->FID,
while in other ASICs it is FID->FID. The reason for the change is that it
is more scalable to reuse the FID->FID entry than creating multiple
{Port, VID}->FID entries for the router port. Use the existing operation
structure to separate the configuration between different ASICs.

Add support for VLAN RIFs; most of the configuration is the same as for FID
RIFs.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

662761d8

mlxsw: Configure egress FID classification after routing · 058de325

Committed by Amit Cohen on Jul 04, 2022

After routing, a packet needs to perform an L2 lookup using the DMAC it got
from the routing and a FID. In unified bridge model, the egress FID
configuration needs to be performed by software.

It is configured by RITR for both sub-port RIFs and FID RIFs. Currently
FID RIFs already configure eFID. Add eFID configuration for sub-port RIFs.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

058de325

mlxsw: spectrum_router: Do not configure VID for sub-port RIFs · 2c3ae763

Committed by Amit Cohen on Jul 04, 2022

The field 'vid' in RITR is reserved when unified bridge model is used
and the RIF's type is sub-port RIF. Instead, ingress VID is configured via
SVFA and egress VID is configured via REIV.

Set 'vid' to zero in RITR register for sub-port RIF when unified bridge
model is used.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2c3ae763

mlxsw: spectrum_fid: Configure layer 3 egress VID classification · d4b464d2

Committed by Amit Cohen on Jul 04, 2022

After routing, the device always consults a table that determines the
packet's egress VID based on {egress RIF, egress local port}. In the
unified bridge model, it is up to software to maintain this table via REIV
register.

The table needs to be updated in the following flows:
1. When a RIF is set on a FID, need to iterate over the FID's {Port, VID}
   list and issue REIV write to map the {RIF, Port} to the given VID.
2. When a {Port, VID} is mapped to a FID and the FID already has a RIF,
   need to issue REIV write with a single record to map the {RIF, Port}
   to the given VID.

REIV register supports a simultaneous update of 256 ports, so use this
capability for the first flow.

Handle the two above mentioned flows.

Add mlxsw_sp_fid_evid_map() function to handle egress VID classification
for both unicast and multicast. Layer 2 multicast configuration is already
done in the driver, just move it to the new function.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d4b464d2

mlxsw: Configure ingress RIF classification · fea20547

Committed by Amit Cohen on Jul 04, 2022

Before layer 2 forwarding, the device classifies an incoming packet to
a FID. The classification is done based on one of the following keys:

1. FID
2. VNI (after decapsulation)
3. VID / {Port, VID}

After classification, the FID is known, but also all the attributes of
the FID, such as the router interface (RIF) via which a packet that
needs to be routed will ingress the router block.

In the legacy model, when a RIF was created / destroyed, it was
firmware's responsibility to update it in the previously mentioned FID
classification records. In the unified bridge model, this responsibility
moved to software.

The third classification requires iterating over the FID's {Port, VID}
list and issue SVFA write with the correct mapping table according to the
port's mode (virtual or not). We never map multiple VLANs to the same FID
using VID->FID mapping, so such a mapping needs to be performed once.

When a new FID classification entry is configured and the FID already has
a RIF, set the RIF as part of SVFA configuration.

The reverse needs to be done when clearing a RIF from a FID. Currently,
clearing is done by issuing mlxsw_sp_fid_rif_set() with a NULL RIF pointer.
Instead, introduce mlxsw_sp_fid_rif_unset().

Note that mlxsw_sp_fid_rif_set() is called after the RIF is fully
operational, so it conforms to the internal requirement regarding
SVFA.irif_v: "Must not be set for a non-enabled RIF".

Do not set the ingress RIF for rFIDs, as the {Port, VID}->rFID entry is
configured by firmware when legacy model is used. A later patch will
handle this configuration for rFIDs and unified bridge model.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fea20547

mlxsw: spectrum_fid: Configure VNI to FID classification · 8cfc7f77

Committed by Amit Cohen on Jul 04, 2022

In the new model, SFMR no longer configures both VNI->FID and FID->VNI
classifications, but only the latter. The former needs to be configured via
SVFA.

Add SVFA configuration as part of vni_set() and vni_clear().
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

8cfc7f77

mlxsw: Configure egress VID for unicast FDB entries · 53d7ae53

Committed by Amit Cohen on Jul 04, 2022

Using unified bridge model, firmware no longer configures the egress VID
"under the hood" and moves this responsibility to software.

For layer 2, this means that software needs to determine the egress VID
for both unicast (i.e., FDB) and multicast (i.e., MDB and flooding) flows.

Unicast FDB records and unicast LAG FDB records have new fields, "set_vid"
and "vid"; set them. For records which point to the router port, do not set
these fields.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

53d7ae53

03 Jul, 2022 15 commits

net/mlx5e: TC, Support offloading police action · a8d52b02

Committed by Jianbo Liu on Jun 22, 2021

Add parsing support by implementing struct mlx5e_tc_act for police
action.

A TC rule with police actions is broken down into several rules in
different tables. One rule, with the original match, is placed in the
original flow table; it sets fte_id, does metering, and jumps to the
post_meter table. If there are more police actions, more rules are
created, one for each of them. In addition, a last rule is created at
the end.

In the post_meter table, there are two pre-defined rules: one drops the
packet if its color is RED, and the other jumps back to the post_act
table. As fte_id is updated before jumping, the rule for the next meter
is matched to do another round of metering (if there are multiple
meters in the flow rule). Otherwise, the last fte_id is matched and the
original actions are performed.
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Ariel Levkovich <lariel@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

a8d52b02

net/mlx5e: Add flow_action to parse state · 03a92a93

Committed by Jianbo Liu on Mar 01, 2022

As a preparation for validating police action, add flow_action to the
parse state, which is to be passed to parsing callbacks.
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

03a92a93

net/mlx5e: Add post meter table for flow metering · 06fe52a4

Committed by Jianbo Liu on Jun 18, 2021

A flow meter object monitors the packet rate of the flows it is
attached to, and colors packets GREEN or RED. The post meter table
is used to check the color. A packet is dropped if it is RED, or
forwarded to the post_act table if GREEN.

The packet color is written to the 8 LSBs of register C5, so these bits,
which were previously used for matching the fte id, are now reserved
for metering.
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Ariel Levkovich <lariel@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
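
As a rough sketch of the color check described in this commit (not the
driver's code), the 8 least significant bits of register C5 can be isolated
with a mask. The 32-bit register width, the numeric color values and all
names below are assumptions made for illustration.

```c
#include <stdint.h>

/* The color occupies the 8 least significant bits of register C5. */
#define REG_C5_COLOR_BITS	8
#define REG_C5_COLOR_MASK	((1u << REG_C5_COLOR_BITS) - 1u)

/* Assumed encoding, for illustration only. */
enum meter_color {
	METER_COLOR_RED = 0,
	METER_COLOR_GREEN = 1,
};

enum post_meter_verdict {
	POST_METER_DROP,
	POST_METER_FWD_POST_ACT,
};

/* Extract the color bits from a register value. */
uint32_t reg_c5_color(uint32_t reg_c5)
{
	return reg_c5 & REG_C5_COLOR_MASK;
}

/* Write a color into the low 8 bits, leaving the upper bits untouched. */
uint32_t reg_c5_set_color(uint32_t reg_c5, uint32_t color)
{
	return (reg_c5 & ~REG_C5_COLOR_MASK) | (color & REG_C5_COLOR_MASK);
}

/* Model of the two post_meter rules: drop RED, otherwise go to post_act. */
enum post_meter_verdict post_meter_classify(uint32_t reg_c5)
{
	if (reg_c5_color(reg_c5) == METER_COLOR_RED)
		return POST_METER_DROP;
	return POST_METER_FWD_POST_ACT;
}
```

In this model, anything that is not RED is forwarded, which mirrors the
"drop if RED, otherwise jump to post_act" rule pair.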

06fe52a4

net/mlx5e: Add generic macros to use metadata register mapping · 17c5da03

Committed by Jianbo Liu on Nov 01, 2021

There are many definitions to get bits and mask for different types of
metadata register mapping. Add generic macros to unify them.
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Ariel Levkovich <lariel@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

17c5da03

net/mlx5e: Get or put meter by the index of tc police action · b8acfd4f

Committed by Jianbo Liu on Jun 07, 2021

Add functions to create and destroy flow meter ASO objects.
This object supports only range allocation. 64 objects are
allocated at a time, and there are two meters in each object.
Usually only one meter is allocated for a flow, so a bitmap is used
to manage these 128 meters.

A TC police action is mapped to a hardware meter. As the index is unique
for each police action, add APIs to allocate or free a hardware meter by
the index. If the meter is already created, increment its refcnt;
otherwise, create a new one. If the police action has different
parameters, update the hardware meter accordingly.
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
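
A minimal sketch of the bitmap idea in this commit, assuming one range of
64 objects with two meters each (128 meters). The structure and function
names are hypothetical, and the mapping of a meter index to an object
(index / 2) is an assumption rather than something the commit message
states.

```c
#include <assert.h>
#include <stdint.h>

#define METERS_PER_OBJ		2
#define OBJS_PER_RANGE		64
#define METERS_PER_RANGE	(OBJS_PER_RANGE * METERS_PER_OBJ)	/* 128 */

struct meter_range {
	/* One bit per meter; a set bit means the meter is in use. */
	uint64_t bitmap[METERS_PER_RANGE / 64];
};

/* Allocate the lowest free meter; return its index, or -1 if none is free. */
int meter_alloc(struct meter_range *r)
{
	int i;

	for (i = 0; i < METERS_PER_RANGE; i++) {
		uint64_t bit = 1ULL << (i % 64);

		if (!(r->bitmap[i / 64] & bit)) {
			r->bitmap[i / 64] |= bit;
			return i;
		}
	}
	return -1;
}

/* Mark a previously allocated meter as free again. */
void meter_free(struct meter_range *r, int idx)
{
	assert(idx >= 0 && idx < METERS_PER_RANGE);
	r->bitmap[idx / 64] &= ~(1ULL << (idx % 64));
}

/* Assumed mapping: which object of the range holds this meter. */
int meter_obj_index(int idx)
{
	return idx / METERS_PER_OBJ;
}

/* Assumed mapping: which of the two meters inside that object. */
int meter_obj_slot(int idx)
{
	return idx % METERS_PER_OBJ;
}
```

A real implementation would also need the per-index reference counting the
commit message mentions; the sketch covers only the bitmap part.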

b8acfd4f

net/mlx5e: Add support to modify hardware flow meter parameters · 6ddac26c

Committed by Jianbo Liu on Jun 07, 2021

The policing rate and burst from user are converted to flow meter
parameters in hardware. These parameters are set or modified by
ACCESS_ASO WQE; add a function to support this.
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Ariel Levkovich <lariel@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

6ddac26c

net/mlx5e: Prepare for flow meter offload if hardware supports it · 74e6b2a8

Committed by Jianbo Liu on Jun 09, 2021

If the flow meter ASO object is supported, set the allocated range and
initialize the ASO WQE.

The allocated range is indicated by log_meter_aso_granularity in HW
capabilities, and is currently 6.
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Reviewed-by: Ariel Levkovich <lariel@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

74e6b2a8

net/mlx5: Implement interfaces to control ASO SQ and CQ · c491ded0

Committed by Jianbo Liu on Apr 29, 2022

Add interfaces to use the ASO object control channel. The channel consists
of a control SQ and CQ to which the user can post ACCESS_ASO work requests
to modify ASO objects. Functions are provided to get a WQE from the SQ,
fill the WQE, post the request, and poll for completion of the work.
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Ariel Levkovich <lariel@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

c491ded0

net/mlx5: Add support to create SQ and CQ for ASO · cdd04f4d

Committed by Jianbo Liu on Apr 30, 2022

Add a separate API to create SQ and CQ for advanced steering
operations (ASO).

Since the mlx5_en API to create these resources is strongly coupled
with netdev channels and datapath elements, this API provides an
alternative for creating send queues that are used for ASO.

Currently the API allows creating channels with 2 wqbbs only - meaning
the support will be for a single ACCESS_ASO wqe with data at a time.
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Ariel Levkovich <lariel@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

cdd04f4d

net/mlx5: E-switch: Change eswitch mode only via devlink command · b6f2846a

Committed by Chris Mi on May 30, 2022

Enable or disable switchdev according to the eswitch mode set by the
devlink command, so that it is no longer changed by other functions.
Signed-off-by: Chris Mi <cmi@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

b6f2846a

net/mlx5: E-switch, Remove dependency between sriov and eswitch mode · f019679e

Committed by Chris Mi on May 30, 2022

Currently, there are three eswitch modes: none, legacy and switchdev.
None is the default mode. Remove the redundant none mode, as the eswitch
mode should always be either legacy or switchdev.

With this patch, there are two behavior changes:

1. Legacy becomes the default mode. When querying eswitch mode using
   devlink, a valid mode is always returned.
2. When disabling sriov, the eswitch mode will not change, only vfs
   are unloaded.
Signed-off-by: Chris Mi <cmi@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

f019679e

net/mlx5: E-switch, Introduce flag to indicate if fdb table is created · fbd43b72

Committed by Chris Mi on May 05, 2022

Introduce flag to indicate if fdb table is created as a pre-step
to prepare for removing dependency between sriov and eswitch mode
in the downstream patches.
Signed-off-by: Chris Mi <cmi@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

fbd43b72

net/mlx5: E-switch, Introduce flag to indicate if vport acl namespace is created · ea5872dd

Committed by Chris Mi on Feb 10, 2022

Eswitch vport acl namespace is needed when loading vfs. There is
no need to free and reallocate it when switching eswitch mode.
Introduce flag to indicate if it is created or not. When needed,
create it. Only free it when the driver is unloaded or in bare
metal mode.
Signed-off-by: Chris Mi <cmi@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

ea5872dd

net/mlx5: delete dead code in mlx5_esw_unlock() · 8e755f7a

Committed by Dan Carpenter on May 30, 2022

Smatch complains about this function:

    drivers/net/ethernet/mellanox/mlx5/core/eswitch.c:2000 mlx5_esw_unlock()
    warn: inconsistent returns '&esw->mode_lock'.

Before commit ec2fa47d ("net/mlx5: Lag, use lag lock") there
used to be a matching mlx5_esw_lock() function and the lock and
unlock functions were symmetric.  But now we take the lock
unconditionally and must unlock unconditionally as well.

As near as I can tell this is dead code and can just be deleted.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

8e755f7a

net/mlx5: Delete ipsec_fs header file as not used · 9de64ae8

Committed by Leon Romanovsky on May 11, 2022

ipsec_fs.h is not used and can be safely deleted.
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

9de64ae8

02 Jul, 2022 1 commit

net: add skb_[inner_]tcp_all_headers helpers · 504148fe

Committed by Eric Dumazet on Jun 30, 2022

Most drivers use "skb_transport_offset(skb) + tcp_hdrlen(skb)"
to compute headers length for a TCP packet, but others
use more convoluted (but equivalent) ways.

Add skb_tcp_all_headers() and skb_inner_tcp_all_headers()
helpers to harmonize this a bit.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
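
The computation these helpers wrap can be modeled in user space as follows.
This is only a sketch: 'struct pkt_model' and the model_*() functions are
hypothetical stand-ins, whereas the real helpers operate on struct sk_buff.

```c
#include <assert.h>

/* Simplified stand-in for the packet metadata; NOT struct sk_buff. */
struct pkt_model {
	unsigned int transport_offset;		/* bytes before the TCP header */
	unsigned int tcp_doff;			/* TCP data offset, in 32-bit words */
	unsigned int inner_transport_offset;	/* same, for the inner packet */
	unsigned int inner_tcp_doff;
};

/* TCP header length in bytes: the data offset field counts 32-bit words. */
unsigned int model_tcp_hdrlen(unsigned int doff)
{
	assert(doff >= 5);	/* a TCP header is at least 20 bytes */
	return doff * 4;
}

/* Length of all headers up to and including the TCP header. */
unsigned int model_tcp_all_headers(const struct pkt_model *p)
{
	return p->transport_offset + model_tcp_hdrlen(p->tcp_doff);
}

/* Same computation for an encapsulated (inner) TCP packet. */
unsigned int model_inner_tcp_all_headers(const struct pkt_model *p)
{
	return p->inner_transport_offset + model_tcp_hdrlen(p->inner_tcp_doff);
}
```

For example, an Ethernet (14 bytes) plus IPv4 (20 bytes) frame carrying a
TCP header without options (data offset 5) gives 34 + 20 = 54 bytes.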

504148fe

01 Jul, 2022 1 commit

mellanox/mlxsw: fix repeated words in comments · 62783827

Committed by Jilin Yuan on Jun 30, 2022

Delete the redundant word 'action'.
Delete the redundant word 'refer'.
Delete the redundant word 'for'.
Signed-off-by: Jilin Yuan <yuanjilin@cdjrlc.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

62783827

30 Jun, 2022 1 commit

mlxsw: spectrum_router: Fix rollback in tunnel next hop init · 665030fd

Committed by Petr Machata on Jun 29, 2022

In mlxsw_sp_nexthop6_init(), a next hop is always added to the router
linked list, and mlxsw_sp_nexthop_type_init() is invoked afterwards. When
that function results in an error, the next hop will not have been removed
from the linked list. As the error is propagated upwards and the caller
frees the next hop object, the linked list ends up holding an invalid
object.

A similar issue comes up with mlxsw_sp_nexthop4_init(), where a rollback
block does exist, but does not include the linked list removal.

Both IPv6 and IPv4 next hops have a similar issue with next-hop counter
rollbacks. As these were introduced in the same patchset as the next hop
linked list, include the cleanup in this patch.

Fixes: dbe4598c ("mlxsw: spectrum_router: Keep nexthops in a linked list")
Fixes: a5390278 ("mlxsw: spectrum: Add support for setting counters on nexthops")
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/20220629070205.803952-1-idosch@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
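
The rollback pattern this fix restores can be sketched with a toy model.
The types and functions below are hypothetical and much simpler than the
driver's; the order of the counter and list steps is an assumption. The
point is that every step taken before the failing one is undone, in reverse
order, before the error is returned.

```c
#include <stddef.h>

struct nh_model {
	struct nh_model *next;
	int counter_allocated;
};

struct router_model {
	struct nh_model *nexthop_list;
};

/* Push a next hop at the head of the router's singly linked list. */
static void nh_list_add(struct router_model *r, struct nh_model *nh)
{
	nh->next = r->nexthop_list;
	r->nexthop_list = nh;
}

/* Unlink a next hop from the list, if it is there. */
static void nh_list_del(struct router_model *r, struct nh_model *nh)
{
	struct nh_model **pp = &r->nexthop_list;

	while (*pp && *pp != nh)
		pp = &(*pp)->next;
	if (*pp)
		*pp = nh->next;
	nh->next = NULL;
}

/* type_init_err simulates the result of the type-specific init step. */
int nh_model_init(struct router_model *r, struct nh_model *nh,
		  int type_init_err)
{
	int err;

	nh->counter_allocated = 1;	/* step 1: allocate a counter */
	nh_list_add(r, nh);		/* step 2: add to the router list */

	err = type_init_err;		/* step 3: may fail */
	if (err)
		goto err_type_init;

	return 0;

err_type_init:
	/* Undo steps 2 and 1 so the caller can safely free the object. */
	nh_list_del(r, nh);
	nh->counter_allocated = 0;
	return err;
}
```

Without the nh_list_del() call on the error path, the list would keep
pointing at an object that the caller is about to free.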

665030fd

29 Jun, 2022 10 commits

mlxsw: spectrum_switchdev: Convert MDB code to use PGT APIs · e28cd993

Committed by Amit Cohen on Jun 29, 2022

The previous patches added common APIs for maintaining the PGT (Port Group
Table). In the legacy model, software did not interact with this
table directly. Instead, it was accessed by firmware in response to
registers such as SFTR and SMID. In the new model, software has full
control over the PGT table using the SMID register.

The configuration of MDB entries is already done via SMID, so the new
PGT APIs can also be used with the legacy model. The only difference is
that the MID index should be aligned to the bridge model. See a previous
patch which added an API for that.

The main changes are:
- MDB code does not maintain bitmap of ports in MDB entry anymore, instead,
  it stores a list of ports with additional information.
- MDB code does not configure SMID register directly anymore, it will be
  done via PGT API when port is first added or removed.
- Today MDB code does not update SMID when port is added/removed while
  multicast is disabled. Instead, it maintains bitmap of ports and once
  multicast is enabled, it rewrites the entry to hardware. Using PGT APIs,
  the entry will be updated also when multicast is disabled, but the
  mapping between {MAC, FID}->{MID} will not appear in SFD register. It
  means that SMID will be updated all the time and disable/enable multicast
  will impact only SFD configuration.
- For multicast router, today only SMID is updated and the bitmap is not
  updated. Using the new list of ports, there is a reference count for each
  port, so it can be saved in software also. For such port,
  'struct mlxsw_sp_mdb_entry.ports_count' will not be updated and the
  port in the list will be marked as 'mrouter'.
- Finally, `struct mlxsw_sp_mid.in_hw` is not needed anymore.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e28cd993

mlxsw: spectrum_switchdev: Flush port from MDB entries according to FID index · 4c3f7442

Committed by Amit Cohen on Jun 29, 2022

Currently, flushing a port from all MDB entries is done when the last VLAN
is removed. This behavior is inaccurate, as a port can be removed while
another port still uses the same VLAN. In such a case, the removed port is
not the last one using this VLAN, yet it is still supposed to be removed
from the MDB entries.

Flush the port from MDB when it is removed, regardless of the state of other
ports. Flush only the MDB entries which are relevant for the same FID
index.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4c3f7442

mlxsw: spectrum_switchdev: Add support for getting and putting MDB entry · 7434ed61

Committed by Amit Cohen on Jun 29, 2022

A previous patch added support for init() and fini() for MDB entries. An
MDB entry can be updated: ports can be added to and removed from the entry.
Add get() and put() functions. The first checks if the entry already exists
and otherwise initializes it. The second removes the entry only if there
are no more ports in it.

Use the list of the ports which was added in a previous patch. When the
list contains only one port which is not multicast router, and this port
is removed, the MDB entry can be removed. Use
'struct mlxsw_sp_mdb_entry.ports_count' to know how many ports use the
entry, regardless of the use of multicast router ports.

When mlxsw_sp_mc_mdb_entry_put() is called with a specific port that is
supposed to be removed, check if the removal will cause a deletion of
the entry. If this is the case, call mlxsw_sp_mc_mdb_entry_fini() which
first deletes the MDB entry and then releases the PGT entry, to avoid a
temporary situation in which the MDB entry points to an empty PGT entry,
as otherwise packets will be temporarily dropped instead of being flooded.

The new functions will be used in the next patches.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7434ed61

mlxsw: spectrum_switchdev: Implement mlxsw_sp_mc_mdb_entry_{init, fini}() · ea0f58d6

Committed by Amit Cohen on Jun 29, 2022

The next patches will convert MDB code to use PGT APIs. The change will
move the responsibility of allocating MID indexes and writing PGT
configurations to hardware to PGT code. As part of this change, most of the
MDB code will be changed and improved.

As a preparation for the above mentioned change, implement
mlxsw_sp_mc_mdb_entry_{init, fini}(). Currently, there is a function
__mlxsw_sp_mc_alloc(), which does more than just allocate a MID. In
addition, there is no equivalent function to free the MID. When
mlxsw_sp_port_remove_from_mid() removes the last port, it handles MID
removal. Instead, add init() and fini() functions, which use PGT APIs.

The differences between the existing and the new functions are as follows:
1. Today MDB code does not update SMID when port is added/removed while
   multicast is disabled. It maintains a bitmap of ports and once multicast
   is enabled, it writes the entry to hardware. Instead, using PGT APIs,
   the entry will be updated also when multicast is disabled, but the
   mapping between {MAC, FID}->{MID} (which is configured using SFD) will be
   updated according to multicast state. It means that SMID will be updated
   all the time and disable/enable multicast will impact only SFD
   configuration.

2. Today the allocation of MID index is done as part of
   mlxsw_sp_mc_write_mdb_entry(). The fact that the entry will be
   written in hardware all the time, moves the allocation of the index to
   be as part of the MDB entry initialization. PGT API is used for the
   allocation.

3. Today the update of multicast router ports is done as part of
   mlxsw_sp_mc_write_mdb_entry(). Instead, add functions to add/remove
   all multicast router ports when entry is first added or removed. When
   new multicast router port will be added/removed, the dedicated API will
   be used to add/remove it from the existing entries.

4. A list of ports will be stored per MDB entry instead of the existing
   bitmap. The list will contain the multicast router ports and maintain
   reference counter per port.

Add mlxsw_sp_mdb_entry_write() which is almost identical to
mlxsw_sp_port_mdb_op(). Use a clearer name and align the MID index to
bridge model using PGT API. The existing function will be removed in the
next patches.

Note that PGT APIs configure the firmware using SMID register, like the
driver already does today for MDB entries, so PGT APIs can also be used
with the legacy bridge model.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ea0f58d6

mlxsw: spectrum_switchdev: Add support for maintaining list of ports per MDB entry · d2994e13

Committed by Amit Cohen on Jun 29, 2022

As part of converting MDB code to use PGT APIs, PGT code stores which ports
are mapped to each PGT entry. PGT code is not aware of the type of the port
(multicast router or not), as it is not relevant there.

To be able to release an MDB entry when there are no ports which are
not multicast routers, the entry should be aware of the state of its
ports. Add support for maintaining list of ports per MDB entry.

Each port will hold a reference count as multiple MDB entries can use the
same hardware MDB entry. It occurs because MDB entries in the Linux bridge
are keyed according to their multicast IP. When these entries are notified
to device drivers via switchdev, the multicast IP is converted to a
multicast MAC. This conversion might cause collisions, for example,
ff0e::1 and ff0e:1234::1 are both mapped to the multicast MAC
33:33:00:00:00:01.
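
The collision can be reproduced with the standard IPv6 multicast to
Ethernet mapping (RFC 2464): the MAC is 33:33 followed by the low-order 32
bits of the group address. The function name below is hypothetical and the
code is a user-space sketch, not the bridge or switchdev implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Map an IPv6 multicast address (16 bytes, network order) to its MAC. */
void ipv6_mc_to_mac(const uint8_t ip6[16], uint8_t mac[6])
{
	assert(ip6[0] == 0xff);		/* IPv6 multicast addresses start with ff */
	mac[0] = 0x33;
	mac[1] = 0x33;
	memcpy(&mac[2], &ip6[12], 4);	/* keep only the last 32 bits */
}
```

Since only the last four bytes of the address survive, ff0e::1 and
ff0e:1234::1 differ in bytes that are discarded and yield the same MAC.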

A multicast router port will take a reference once and will be marked as
'mrouter'. Then, when a port in the list is a multicast router and its
reference count is one, the entry can be removed, provided that
there are no other ports which are not multicast routers. For that,
maintain a counter per MDB entry to count ports in the list which were
added to the multicast group, and not because they are multicast routers.
When this counter is zero, the entry can be removed.

Add mlxsw_sp_mdb_entry_port_{get,put}() for regular ports and
mlxsw_sp_mdb_entry_mrouter_port_{get,put}() for multicast router ports.
Call PGT API to add or remove port from PGT entry when port is first added
or removed, according to the reference counting.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
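
The reference counting described in this commit can be sketched as a small
user-space model. It is a simplification under stated assumptions: the
structures and function names are hypothetical, the port list is replaced
by caller-owned port objects, and 'pgt_updates' merely counts the points
where the driver would call into the PGT API.

```c
#include <assert.h>
#include <stdbool.h>

struct mdb_entry_model {
	unsigned int ports_count;	/* ports holding a group reference */
	unsigned int pgt_updates;	/* stand-in for calls into the PGT API */
};

struct mdb_port_model {
	unsigned int refcount;	/* 0 means the port is not in the entry */
	bool mrouter;		/* one reference is held as multicast router */
};

/* References taken for group membership, excluding the mrouter one. */
static unsigned int group_refs(const struct mdb_port_model *p)
{
	return p->refcount - (p->mrouter ? 1u : 0u);
}

/* Take or drop one reference; touch the PGT on the first or last one. */
static void port_ref_update(struct mdb_entry_model *e,
			    struct mdb_port_model *p, bool take)
{
	unsigned int old = p->refcount;

	assert(take || old > 0);
	p->refcount = take ? old + 1 : old - 1;
	if (old == 0 || p->refcount == 0)
		e->pgt_updates++;
}

/* A regular port joins the group. */
void mdb_port_get(struct mdb_entry_model *e, struct mdb_port_model *p)
{
	if (group_refs(p) == 0)
		e->ports_count++;
	port_ref_update(e, p, true);
}

/* A regular port leaves the group. */
void mdb_port_put(struct mdb_entry_model *e, struct mdb_port_model *p)
{
	if (group_refs(p) == 0)
		return;
	port_ref_update(e, p, false);
	if (group_refs(p) == 0)
		e->ports_count--;
}

/* A multicast router port takes its single reference. */
void mdb_mrouter_port_get(struct mdb_entry_model *e, struct mdb_port_model *p)
{
	if (p->mrouter)
		return;
	port_ref_update(e, p, true);
	p->mrouter = true;
}

/* A multicast router port drops its single reference. */
void mdb_mrouter_port_put(struct mdb_entry_model *e, struct mdb_port_model *p)
{
	if (!p->mrouter)
		return;
	p->mrouter = false;
	port_ref_update(e, p, false);
}
```

In this model the entry can be removed when 'ports_count' reaches zero,
even if multicast router ports still hold their references.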

d2994e13

mlxsw: spectrum_switchdev: Add support for maintaining hash table of MDB entries · 5d0512e5

Committed by Amit Cohen on Jun 29, 2022

Currently MDB entries are stored in a list as part of
'struct mlxsw_sp_bridge_device'. Storing them in a hash table in
addition to the list will allow finding a specific entry more efficiently.

Add support for the required hash table; the next patches will insert
and remove MDB entries from the table. The existing code which adds and
removes entries will be removed and replaced by new code in the next
patches, so there is no point in adjusting the existing code.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5d0512e5

mlxsw: spectrum_switchdev: Save MAC and FID as a key in 'struct mlxsw_sp_mdb_entry' · 0ac98543

由 Amit Cohen 提交于 6月 29, 2022

The next patch will add support for storing all the MDB entries in a hash
table. As a preparation, save the MAC address and the FID in a
separate structure. This structure will be used later as a key for the
hash table.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0ac98543

mlxsw: spectrum_switchdev: Rename MIDs list · eaa0791a

Committed by Amit Cohen on Jun 29, 2022

Currently, the list which stores the MDB entries for a given bridge
instance is called 'mids_list'.

This name is not accurate, as a MID entry stores a bitmap of ports to
which a packet needs to be replicated, whereas an MDB entry stores the
mapping from {MAC, FID} to PGT index (MID).

Rename it to 'mdb_list'.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

eaa0791a

mlxsw: spectrum_switchdev: Rename MID structure · eede53a4

Committed by Amit Cohen on Jun 29, 2022

Currently, the structure which represents an MDB entry is called
'struct mlxsw_sp_mid'. This name is not accurate, as a MID entry stores a
bitmap of ports to which a packet needs to be replicated, whereas an MDB
entry stores the mapping from {MAC, FID} to PGT index (MID).

Rename the structure to 'struct mlxsw_sp_mdb_entry'. The structure
'mlxsw_sp_mid' is defined as part of spectrum.h. The only file which
uses it is spectrum_switchdev.c, so there is no reason to expose it to
other files. Move the definition to spectrum_switchdev.c.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

eede53a4

mlxsw: Align PGT index to legacy bridge model · 4abaa5cc

Committed by Amit Cohen on Jun 29, 2022

The FID code reserves about 15K entries in the PGT table for flooding.
These entries are only allocated and are not used yet, because the code
that uses them is currently skipped.

The next patches will convert the MDB code to use PGT APIs. Indexes for
multicast are allocated after the 15K entries reserved by the FID code.
Currently, the legacy bridge model is used and firmware manages the PGT
table, which means that indexes allocated using the PGT API are too high
for the legacy bridge model. To avoid exceeding the firmware limit for MDB
entries, add an API that returns the correct 'mid_index' based on the
bridge model. For the legacy model, subtract the number of flood entries
from the PGT index. Use it to write the correct MID to the SMID register.
This API will also be used from the MDB code in the next patches.
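
The conversion can be sketched as follows. The function name `pgt_index_to_mid`, the `ubridge` flag, and the exact flood size of 15 * 1024 entries are assumptions for illustration; the commit message only says "about 15K".

```python
FLOOD_PGT_SIZE = 15 * 1024  # assumed size of the region reserved for flooding


def pgt_index_to_mid(pgt_index: int, ubridge: bool,
                     flood_size: int = FLOOD_PGT_SIZE) -> int:
    """Return the MID to write to SMID for a given PGT index.

    Unified bridge model: software owns the PGT table, use the index as is.
    Legacy model: firmware owns the flood region, so shift the index down
    by the number of flood entries.
    """
    if ubridge:
        return pgt_index
    if pgt_index < flood_size:
        raise ValueError("index falls inside the flood region")
    return pgt_index - flood_size
```

For example, the first multicast index allocated right after the flood region maps to MID 0 in the legacy model and is left unchanged in the unified model.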

PGT should not be aware of the different ways in which MDB and FID use
it. This API is temporary and will be removed once the unified bridge
model is used.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4abaa5cc

28 Jun, 2022 2 commits

mlxsw: spectrum_fid: Configure flooding entries using PGT APIs · fe94df6d

Committed by Amit Cohen on Jun 27, 2022

The PGT (Port Group Table) maps an index to a bitmap of local ports
to which a packet needs to be replicated. This table is used for layer 2
multicast and flooding.

In the legacy model, software did not interact with the PGT table directly.
Instead, it was accessed by firmware in response to registers such as SFTR
and SMID. In the new model, the SFTR register is deprecated and software
has full control over the PGT table using the SMID register.

Use the new PGT APIs to allocate entries for flooding as part of flood
tables initialization. Add mlxsw_sp_fid_flood_tables_fini() to free the
allocated indexes. In addition, use PGT APIs to add/remove ports from PGT
table. The existing code which configures the flood entries via SFTR2 will
be removed later.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

fe94df6d

mlxsw: spectrum_fid: Set 'mid_base' as part of flood tables initialization · 9f6f467a

Committed by Amit Cohen on Jun 27, 2022

The PGT (Port Group Table) maps an index to a bitmap of local ports
to which a packet needs to be replicated. This table is used for layer 2
multicast and flooding.

The index into the PGT table, called 'mid_index', is the sum of
'mid_base' and 'fid_offset'. In the legacy bridge model, firmware
configures 'mid_base'. In the new model, however, software is
responsible for configuring it via the SFGC register. The first 15K
entries will be used for flooding and the rest for multicast. The table
will look as follows:

+----------------------------+
|                            |
| 802.1q, unicast flooding   | 4K entries
|                            |
+----------------------------+
|                            |
| 802.1q, multicast flooding | 4K entries
|                            |
+----------------------------+
|                            |
| 802.1q, broadcast flooding | 4K entries
|                            |
+----------------------------+
| 802.1d, unicast flooding   | 1K entries
+----------------------------+
| 802.1d, multicast flooding | 1K entries
+----------------------------+
| 802.1d, broadcast flooding | 1K entries
+----------------------------+
|                            |
|                            |
|    Multicast entries       | The rest of the table
|                            |
|                            |
+----------------------------+
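
The layout above can be turned into base offsets with a short sketch. The region sizes are read from the diagram, with 1K taken as 1024 entries (an assumption); the names `LAYOUT`, `build_bases`, and `mid_index` are hypothetical and not driver symbols.

```python
K = 1024  # assumed meaning of "K" in the diagram

# (FID family, packet type, number of entries), in table order.
LAYOUT = [
    ("802.1q", "unicast", 4 * K),
    ("802.1q", "multicast", 4 * K),
    ("802.1q", "broadcast", 4 * K),
    ("802.1d", "unicast", 1 * K),
    ("802.1d", "multicast", 1 * K),
    ("802.1d", "broadcast", 1 * K),
]


def build_bases(layout):
    """Return ({(family, type): mid_base}, first index free for multicast)."""
    bases = {}
    base = 0
    for family, ptype, size in layout:
        bases[(family, ptype)] = base
        base += size
    return bases, base


def mid_index(bases, family, ptype, fid_offset):
    # mid_index = mid_base + fid_offset, as described in the text.
    return bases[(family, ptype)] + fid_offset


BASES, MULTICAST_START = build_bases(LAYOUT)
```

With these sizes the flood regions end at index 15K, which is where the multicast entries begin.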

Add 'pgt_base' to 'struct mlxsw_sp_fid_family' and use it to calculate
the MID base. Set 'SFGC.mid_base' as part of flood tables initialization.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

9f6f467a

openeuler / Kernel synced successfully almost 2 years ago