1. 30 March 2021 (1 commit)
    • mlxsw: spectrum: Fix ECN marking in tunnel decapsulation · 66167c31
      Committed by Ido Schimmel
      Cited commit changed the behavior of the software data path with regards
      to the ECN marking of decapsulated packets. However, the commit did not
      change other callers of __INET_ECN_decapsulate(), namely mlxsw. The
      driver is using the function in order to ensure that the hardware and
      software data paths act the same with regards to the ECN marking of
      decapsulated packets.
      
      The discrepancy was uncovered by commit 5aa3c334 ("selftests:
      forwarding: vxlan_bridge_1d: Fix vxlan ecn decapsulate value") that
      aligned the selftest to the new behavior. Without this patch the
      selftest passes when used with veth pairs, but fails when used with
      mlxsw netdevs.
      
      Fix this by instructing the device to propagate the ECT(1) mark from the
      outer header to the inner header when the inner header is ECT(0), for
      both NVE and IP-in-IP tunnels.
      
      A helper is added in order not to duplicate the code between both tunnel
      types.
      
      Fixes: b7237487 ("tunnel: Propagate ECT(1) when decapsulating as recommended by RFC6040")
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Reviewed-by: Petr Machata <petrm@nvidia.com>
      Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
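
      The RFC 6040 rule being mirrored in hardware here can be summarized in a
      few lines. Below is a minimal standalone C sketch of the decapsulation
      decision; it is illustrative only, not the mlxsw or
      __INET_ECN_decapsulate() code.

      #include <stdbool.h>
      #include <stdint.h>

      /* ECN codepoints as carried in the two ECN bits of the IP header. */
      enum ecn { ECN_NOT_ECT = 0, ECN_ECT_1 = 1, ECN_ECT_0 = 2, ECN_CE = 3 };

      /* Returns true if the packet must be dropped. Otherwise *new_inner holds
       * the ECN codepoint to set on the decapsulated (inner) header. */
      static bool rfc6040_decap(uint8_t outer, uint8_t inner, uint8_t *new_inner)
      {
          *new_inner = inner;

          if (outer == ECN_CE) {
              if (inner == ECN_NOT_ECT)
                  return true;            /* inner not ECN-capable: drop */
              *new_inner = ECN_CE;        /* propagate the congestion mark */
          } else if (outer == ECN_ECT_1 && inner == ECN_ECT_0) {
              *new_inner = ECN_ECT_1;     /* the ECT(1) propagation this patch
                                             teaches the hardware to perform */
          }
          return false;
      }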
  2. 04 February 2021 (3 commits)
    • mlxsw: ethtool: Pass link mode in use to ethtool · 25a96f05
      Committed by Danielle Ratson
      Currently, when user space queries the link's parameters, such as speed
      and duplex, each parameter is passed separately from the driver to
      ethtool.
      
      Instead, pass the link mode bit in use.
      In Spectrum-1, simply pass the bit that is set to '1' in the PTYS
      register.
      In Spectrum-2, pass the first link mode bit in the mask of the link
      mode in use.
      Signed-off-by: Danielle Ratson <danieller@nvidia.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
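
      The "bit in use" lookup is essentially a first-set-bit operation over
      the active link mode mask; a trivial standalone illustration follows
      (the helper name is hypothetical, not the driver's code).

      #include <stdint.h>

      /* Return the index of the link mode bit in use, or -1 when there is no
       * active link mode. On Spectrum-1 the mask has exactly one bit set; on
       * Spectrum-2 the first set bit of the mask is reported. */
      static int link_mode_in_use(uint32_t active_mask)
      {
          return active_mask ? __builtin_ctz(active_mask) : -1;
      }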
    • mlxsw: ethtool: Add support for setting lanes when autoneg is off · 763ece86
      Committed by Danielle Ratson
      Currently, when auto negotiation is set to off, the user can force a
      specific speed or both speed and duplex. The user cannot influence the
      number of lanes that will be forced.
      
      Add support for setting speed along with lanes, so that the user can
      choose how many lanes will be forced.
      
      When the lanes parameter is passed from user space, choose the link
      mode whose actual width equals the requested number of lanes.
      Otherwise, default to the link mode that supports the width of the
      port.
      Signed-off-by: Danielle Ratson <danieller@nvidia.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
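
      The selection rule described above could look roughly like the sketch
      below; the table layout and names are hypothetical, not the driver's
      actual data structures.

      #include <stdint.h>

      struct link_mode_info {
          uint32_t speed_mbps;
          uint8_t  lanes;
      };

      /* Pick a link mode of the requested speed. If the user passed a lanes
       * value, require that the mode's width equals it; otherwise fall back
       * to the mode that matches the port's own width. Returns an index into
       * modes[], or -1 if nothing fits. */
      static int pick_link_mode(const struct link_mode_info *modes, int n,
                                uint32_t speed_mbps, uint8_t req_lanes,
                                uint8_t port_width)
      {
          uint8_t want = req_lanes ? req_lanes : port_width;

          for (int i = 0; i < n; i++)
              if (modes[i].speed_mbps == speed_mbps && modes[i].lanes == want)
                  return i;
          return -1;
      }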
    • mlxsw: ethtool: Remove max lanes filtering · 5fc4053d
      Committed by Danielle Ratson
      Currently, when a speed can be supported by different numbers of lanes,
      the supported link modes bitmask contains only link modes with a single
      number of lanes.
      
      This was done in order to prevent auto negotiation on number of
      lanes after 50G-1-lane and 100G-2-lanes link modes were introduced.
      
      For example, if a port's max width is 4, only link modes with 4 lanes
      will be presented as supported by that port, so 100G is always achieved by
      4 lanes of 25G.
      
      After the previous patches that allow selection of the number of lanes,
      auto negotiation on number of lanes becomes practical.
      
      Remove the filtering by maximum number of lanes, so that all the
      supported and advertised link modes are shown.
      Signed-off-by: Danielle Ratson <danieller@nvidia.com>
      Reviewed-by: Jiri Pirko <jiri@nvidia.com>
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
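
      The effect of dropping the filter can be sketched as follows: instead
      of advertising only modes whose width equals the port's maximum width,
      advertise every supported mode whose width fits into the port. Names
      and types here are illustrative only.

      #include <stdint.h>

      struct link_mode_info {
          uint32_t speed_mbps;
          uint8_t  lanes;
      };

      /* Build the advertised bitmask from device capabilities. Before this
       * change the condition was effectively "lanes == max_width"; now any
       * mode the port is wide enough to carry is kept (n is assumed <= 32). */
      static uint32_t advertised_mask(const struct link_mode_info *modes, int n,
                                      uint32_t dev_cap_mask, uint8_t max_width)
      {
          uint32_t mask = 0;

          for (int i = 0; i < n; i++)
              if ((dev_cap_mask & (1u << i)) && modes[i].lanes <= max_width)
                  mask |= 1u << i;
          return mask;
      }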
  3. 23 January 2021 (1 commit)
  4. 09 December 2020 (2 commits)
  5. 07 December 2020 (1 commit)
  6. 02 December 2020 (2 commits)
  7. 24 November 2020 (1 commit)
  8. 27 October 2020 (1 commit)
    • mlxsw: Only advertise link modes supported by both driver and device · 1601559b
      Committed by Amit Cohen
      During port creation the driver instructs the device to advertise all
      the supported link modes queried from the device.
      
      Since the cited commit, not all the link modes supported by the device are
      supported by the driver. This can result in the device negotiating a
      link mode that is not recognized by the driver causing ethtool to show
      an unsupported speed:
      
      $ ethtool swp1
      ...
      Speed: Unknown!
      
      This is especially problematic when the netdev is enslaved to a bond, as
      the bond driver uses unknown speed as an indication that the link is
      down:
      
      [13048.900895] net_ratelimit: 86 callbacks suppressed
      [13048.900902] t_bond0: (slave swp52): failed to get link speed/duplex
      [13048.912160] t_bond0: (slave swp49): failed to get link speed/duplex
      
      Fix this by making sure that only link modes that are supported by both
      the device and the driver are advertised.
      
      Fixes: b97cd891 ("mlxsw: Remove 56G speed support")
      Signed-off-by: Amit Cohen <amcohen@nvidia.com>
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
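
      The fix boils down to intersecting two masks before advertising,
      roughly as in this sketch (mask names are illustrative, not the
      driver's own):

      #include <stdint.h>

      /* Advertise only link modes that both the device reports as supported
       * (e.g. queried via the PTYS register) and the driver knows how to map. */
      static uint32_t modes_to_advertise(uint32_t dev_supported_mask,
                                         uint32_t drv_supported_mask)
      {
          return dev_supported_mask & drv_supported_mask;
      }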
  9. 28 September 2020 (1 commit)
  10. 18 September 2020 (1 commit)
    • mlxsw: spectrum_buffers: Support two headroom modes · 69e408a2
      Committed by Petr Machata
      There are two interfaces to configure ETS: qdiscs and DCB. Historically,
      DCB ETS configuration was projected to ingress as well, and configured port
      buffers. Qdisc was not.
      
      So as not to break clients that today use DCB ETS and PFC and rely on
      getting a reasonable ingress buffer priomap, keep the ETS mirroring in
      effect.
      
      Since qdiscs have not done this mirroring historically, it is reasonable
      not to introduce it, but rather permit manual ingress configuration through
      dcbnl_setbuffer only in the qdisc mode.
      
      This will require a toggle to indicate whether buffer sizes should be
      autocomputed or taken from dcbnl_setbuffer, and likewise for priomaps.
      Introduce such a toggle and initialize it, and guard port buffer size
      configuration as appropriate. The toggle is currently left in the DCB
      position. In a following patch, qdisc code will switch it.
      Signed-off-by: Petr Machata <petrm@nvidia.com>
      Reviewed-by: Jiri Pirko <jiri@nvidia.com>
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
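
      A minimal sketch of the toggle described above, with hypothetical
      names: buffer sizes are autocomputed in the DCB mode and taken from
      dcbnl_setbuffer in the qdisc (TC) mode.

      #include <stdint.h>

      enum hdroom_mode {
          HDROOM_MODE_DCB,    /* sizes and priomap autocomputed from ETS/PFC */
          HDROOM_MODE_TC,     /* sizes accepted from dcbnl_setbuffer */
      };

      struct hdroom_buf {
          uint32_t size_cells;        /* size currently configured */
          uint32_t set_size_cells;    /* size requested via dcbnl_setbuffer */
      };

      /* Decide the size of one port buffer according to the active mode. */
      static uint32_t hdroom_buf_size(enum hdroom_mode mode,
                                      const struct hdroom_buf *buf,
                                      uint32_t autocomputed_cells)
      {
          return mode == HDROOM_MODE_TC ? buf->set_size_cells
                                        : autocomputed_cells;
      }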
  11. 17 September 2020 (10 commits)
    • mlxsw: spectrum_buffers: Manage internal buffer in the hdroom code · 22881adf
      Committed by Petr Machata
      Traffic mirroring modes that are implemented in-chip on egress need an
      internal buffer to work. As the only client, the SPAN module was managing
      the buffer so far. However logically it belongs to the buffers module. E.g.
      buffer size validation needs to take the size of the internal buffer into
      account.
      
      Therefore move the related code from SPAN to spectrum_buffers. Move over
      the callbacks that determine the minimum buffer size as a function of
      maximum speed and MTU. Add a field describing the internal buffer to struct
      mlxsw_sp_hdroom. Extend mlxsw_sp_hdroom_bufs_reset_sizes() to take care of
      sizing the internal buffer as well. Change the SPAN module to invoke that
      function and mlxsw_sp_hdroom_configure() like all the other hdroom clients.
      Drop the now-unnecessary mlxsw_sp_span_port_buffer_disable().
      Signed-off-by: Petr Machata <petrm@nvidia.com>
      Reviewed-by: Jiri Pirko <jiri@nvidia.com>
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum_buffers: Introduce shared buffer ops · a41b9626
      Committed by Petr Machata
      The size of the internal buffer is currently calculated in the SPAN module.
      Logically it belongs to the spectrum_buffers module, where it should be
      moved. However, that being a chip-specific operation, it needs dynamic
      dispatch. There currently is a chip-specific structure for description of
      shared buffer values, struct mlxsw_sp_sb_vals. However placing ops into
      this structure would be confusing. Therefore introduce a new per-chip
      structure, currently empty, and initialize the ops pointer as appropriate.
      Signed-off-by: Petr Machata <petrm@nvidia.com>
      Reviewed-by: Jiri Pirko <jiri@nvidia.com>
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
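
      The dynamic dispatch described above amounts to a per-ASIC ops
      structure holding chip-specific callbacks, along the lines of this
      sketch. All names are illustrative and the formula is a placeholder,
      not the driver's actual computation.

      #include <stdint.h>

      /* Per-chip shared buffer operations; each ASIC generation provides its
       * own instance with the appropriate callbacks. */
      struct sb_ops {
          /* Size of the internal mirroring buffer for a given maximum port
           * speed (Mb/s) and maximum MTU (bytes), in bytes. */
          uint32_t (*int_buf_size_get)(uint32_t max_speed_mbps, uint32_t max_mtu);
      };

      static uint32_t spectrum1_int_buf_size_get(uint32_t max_speed_mbps,
                                                 uint32_t max_mtu)
      {
          /* Placeholder formula; the real computation is chip-specific. */
          return max_mtu + max_speed_mbps / 8;
      }

      static const struct sb_ops spectrum1_sb_ops = {
          .int_buf_size_get = spectrum1_int_buf_size_get,
      };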
    • mlxsw: spectrum_buffers: Inline mlxsw_sp_sb_max_headroom_cells() · bd3e86a5
      Committed by Petr Machata
      This function is now only used from the buffers module, and is a trivial
      field reference. Just inline it and drop the related artifacts.
      Signed-off-by: Petr Machata <petrm@nvidia.com>
      Reviewed-by: Jiri Pirko <jiri@nvidia.com>
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum: Split headroom autoresize out of buffer configuration · 2d9f703f
      Committed by Petr Machata
      Split mlxsw_sp_port_headroom_set() into three functions.
      mlxsw_sp_hdroom_bufs_reset_sizes() changes the sizes of the individual PG
      buffers, and mlxsw_sp_hdroom_configure_buffers() will actually apply the
      configuration. A third function, mlxsw_sp_hdroom_bufs_fit(), verifies that
      the requested buffer configuration matches total headroom size
      requirements.
      
      Add wrappers, mlxsw_sp_hdroom_configure() and __..., that will eventually
      perform full headroom configuration, but for now, only have them verify the
      configured headroom size, and invoke mlxsw_sp_hdroom_configure_buffers().
      Have them take the `force` argument to prepare for a later patch, even
      though it is currently unused.
      
      Note that the loop in mlxsw_sp_hdroom_configure_buffers() only goes through
      DCBX_MAX_BUFFERS. Since there is no logic to configure the control buffer,
      it needs to keep the values queried from the FW. Eventually this function
      should configure all the PGs.
      
      Note that conversion of __mlxsw_sp_dcbnl_ieee_setets() is not trivial. That
      function performs the headroom configuration in three steps: first it
      resizes the buffers and adds any new ones. Then it redirects priorities to
      the new buffers. And finally it sets the size of the now-unused buffers to
      zero. This way no packet drops are introduced.
      
      So after invoking mlxsw_sp_hdroom_bufs_reset_sizes(), tweak the
      configuration to keep the old sizes of PG buffers for those buffers whose
      size was set to zero.
      Signed-off-by: Petr Machata <petrm@nvidia.com>
      Reviewed-by: Jiri Pirko <jiri@nvidia.com>
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
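
      The resulting flow can be sketched as three steps behind a single
      entry point, with hypothetical names and stubbed bodies standing in
      for the real chip configuration.

      #include <stdbool.h>
      #include <stdint.h>

      #define HDROOM_BUF_COUNT 10             /* stands in for DCBX_MAX_BUFFERS */

      struct hdroom_buf { uint32_t size_cells; };

      struct hdroom {
          struct hdroom_buf bufs[HDROOM_BUF_COUNT];
          uint32_t max_headroom_cells;        /* total headroom the chip allows */
      };

      /* Step 1: recompute the sizes of the individual PG buffers (stubbed). */
      static void hdroom_bufs_reset_sizes(struct hdroom *h) { (void)h; }

      /* Step 2: verify the requested buffers fit into the total headroom. */
      static bool hdroom_bufs_fit(const struct hdroom *h)
      {
          uint32_t total = 0;

          for (int i = 0; i < HDROOM_BUF_COUNT; i++)
              total += h->bufs[i].size_cells;
          return total <= h->max_headroom_cells;
      }

      /* Step 3: push the buffer configuration to the chip (stubbed). */
      static int hdroom_configure_buffers(const struct hdroom *h, bool force)
      {
          (void)h; (void)force;
          return 0;
      }

      /* Wrapper: verify the configured headroom size, then apply. */
      static int hdroom_configure(struct hdroom *h, bool force)
      {
          if (!force && !hdroom_bufs_fit(h))
              return -1;
          return hdroom_configure_buffers(h, force);
      }

      /* Typical client flow: recompute sizes, then validate and apply. */
      static int client_update(struct hdroom *h)
      {
          hdroom_bufs_reset_sizes(h);
          return hdroom_configure(h, false);
      }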
    • mlxsw: spectrum: Track buffer sizes in struct mlxsw_sp_hdroom · aa7c0621
      Committed by Petr Machata
      So far, port buffers were always autoconfigured. When the
      dcbnl_setbuffer callback is implemented, it will allow the user to
      change the buffer size configuration by hand. The sizes therefore need
      to be a configuration parameter rather than always deduced, and thus
      belong in struct mlxsw_sp_hdroom, from where the configuration routine
      should take them.
      
      Update mlxsw_sp_port_headroom_set() to update these sizes. Have the
      function update the sizes even for the case that a given buffer is not
      used.
      
      Additionally, change the loop iteration end to DCBX_MAX_BUFFERS instead of
      IEEE_8021QAZ_MAX_TCS. The value is the same, but the semantics differ.
      Signed-off-by: Petr Machata <petrm@nvidia.com>
      Reviewed-by: Jiri Pirko <jiri@nvidia.com>
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum: Track lossiness in struct mlxsw_sp_hdroom · ca21e84e
      Committed by Petr Machata
      Client-side configuration has lossiness as an attribute of a priority.
      Therefore add a "lossy" attribute to struct mlxsw_sp_hdroom_prio.
      
      To a Spectrum ASIC, lossiness is a feature of a port buffer. Therefore add
      struct mlxsw_sp_hdroom_buf, which in the following patches will get more
      attributes, but right now only use it to track port buffer lossiness.
      
      Instead of passing around the primary indicators of PFC and pause_en, add a
      function mlxsw_sp_hdroom_bufs_reset_lossiness() to compute the buffer
      lossiness from the priority map and priority lossiness. Change
      mlxsw_sp_port_headroom_set() to take the buffer lossy flag from the
      headroom configuration. Have the PFC and pause handlers configure priority
      lossiness in mlxsw_sp_hdroom, from where it will propagate.
      Signed-off-by: Petr Machata <petrm@nvidia.com>
      Reviewed-by: Jiri Pirko <jiri@nvidia.com>
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
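
      The derivation of buffer lossiness from the priority configuration, as
      described above, can be sketched like this (array sizes and field
      names are illustrative, not the driver's definitions):

      #include <stdbool.h>
      #include <stdint.h>

      #define PRIO_COUNT 8
      #define BUF_COUNT  10

      struct hdroom_prio { uint8_t buf_idx; bool lossy; };
      struct hdroom_buf  { bool lossy; };

      struct hdroom {
          struct hdroom_prio prios[PRIO_COUNT];
          struct hdroom_buf  bufs[BUF_COUNT];
      };

      /* A port buffer is lossy only if every priority mapped to it is lossy;
       * one lossless (PFC- or pause-protected) priority makes it lossless. */
      static void hdroom_bufs_reset_lossiness(struct hdroom *h)
      {
          for (int i = 0; i < BUF_COUNT; i++)
              h->bufs[i].lossy = true;

          for (int prio = 0; prio < PRIO_COUNT; prio++)
              if (!h->prios[prio].lossy)
                  h->bufs[h->prios[prio].buf_idx].lossy = false;
      }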
    • mlxsw: spectrum: Track priorities in struct mlxsw_sp_hdroom · 5df825ed
      Committed by Petr Machata
      The mapping from priorities to buffers determines which buffers should be
      configured. Lossiness of these priorities combined with the mapping
      determines whether a given buffer should be lossy.
      
      Currently this configuration is stored implicitly in DCB ETS, PFC and
      ethtool PAUSE configuration. Keeping it together with the rest of the
      headroom configuration and deriving it as needed from PFC / ETS / PAUSE
      will make things clearer. To that end, add a field "prios" to struct
      mlxsw_sp_hdroom.
      
      Previously, __mlxsw_sp_port_headroom_set() took prio_tc as an argument, and
      assumed that the same mapping as we use on the egress should be used on
      ingress as well. Instead, track this configuration at each priority, so
      that it can be adjusted flexibly.
      
      In the following patches, as dcbnl_setbuffer is implemented, it will need
      to store its own mapping, and it will also be sometimes necessary to revert
      back to the original ETS mapping. Therefore track two buffer indices: the
      one for chip configuration (buf_idx), and the source one (ets_buf_idx).
      Introduce a function to configure the chip-level buffer index, and for now
      have it simply copy the ETS mapping over to the chip mapping.
      
      Update the ETS handler to project prio_tc to the ets_buf_idx and invoke the
      buf_idx recomputation.
      
      Now that there is a canonical place to look for this configuration,
      mlxsw_sp_port_headroom_set() does not need to invent def_prio_tc to use if
      DCB is compiled out.
      Signed-off-by: Petr Machata <petrm@nvidia.com>
      Reviewed-by: Jiri Pirko <jiri@nvidia.com>
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
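
      The two buffer indices described above, and the current rule of simply
      copying the ETS mapping to the chip mapping, might look roughly like
      this (field names are illustrative only):

      #include <stdint.h>

      #define PRIO_COUNT 8

      struct hdroom_prio {
          uint8_t buf_idx;        /* buffer used for chip configuration */
          uint8_t ets_buf_idx;    /* buffer dictated by the DCB ETS mapping */
      };

      struct hdroom {
          struct hdroom_prio prios[PRIO_COUNT];
      };

      /* For now the chip-level mapping simply mirrors the ETS mapping; a later
       * dcbnl_setbuffer implementation can diverge from it and revert back. */
      static void hdroom_prios_reset_buf_idx(struct hdroom *h)
      {
          for (int prio = 0; prio < PRIO_COUNT; prio++)
              h->prios[prio].buf_idx = h->prios[prio].ets_buf_idx;
      }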
    • mlxsw: spectrum: Track MTU in struct mlxsw_sp_hdroom · 0103a3e4
      Committed by Petr Machata
      MTU influences sizes of auto-allocated buffers. Make it a part of port
      buffer configuration and have __mlxsw_sp_port_headroom_set() take it from
      there, instead of as an argument.
      Signed-off-by: Petr Machata <petrm@nvidia.com>
      Reviewed-by: Jiri Pirko <jiri@nvidia.com>
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum: Unify delay handling between PFC and pause · b7e07bbd
      Committed by Petr Machata
      When a priority is marked as lossless using DCB PFC, or when pause frames
      are enabled on a port, mlxsw adds to port buffers an extra space to cover
      the traffic that will arrive between the time that a pause or PFC frame is
      emitted, and the time traffic actually stops. This is called the delay. The
      concept is the same in PFC and pause, however the way the extra buffer
      space is calculated differs.
      
      In this patch, unify this handling. Delay is to be measured in bytes of
      extra space, and will not include MTU. The PFC handler sets the delay
      directly from the parameter it gets through the DCB interface.
      
      To convert the pause handler, move MLXSW_SP_PAUSE_DELAY to the ethtool
      module, convert it to bytes, reduce it by the maximum MTU, and divide
      it by two. It then has the same meaning as the delay_bytes set by the
      PFC handler.
      
      Keep the delay_bytes value in struct mlxsw_sp_hdroom introduced in the
      previous patch. Change PFC and pause handlers to store the new delay value
      there and have __mlxsw_sp_port_headroom_set() take it from there.
      
      Instead of mlxsw_sp_pfc_delay_get() and mlxsw_sp_pg_buf_delay_get(),
      introduce mlxsw_sp_hdroom_buf_delay_get() to calculate the delay provision.
      Drop the unnecessary MLXSW_SP_CELL_FACTOR, and instead add an explanatory
      comment describing the formula used.
      Signed-off-by: Petr Machata <petrm@nvidia.com>
      Reviewed-by: Jiri Pirko <jiri@nvidia.com>
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
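
      The conversion of the pause delay into the unified delay_bytes value,
      as described above, amounts to something like the following; the
      constant value and names are placeholders, only the arithmetic follows
      the text.

      #include <stdint.h>

      /* Placeholder for the pause delay constant after conversion to bytes. */
      #define PAUSE_DELAY_BYTES 102400u

      /* Unified delay provision: measured in bytes of extra space, excluding
       * the MTU. The PFC handler stores its DCB-provided delay directly; the
       * pause handler derives an equivalent value as sketched here. */
      static uint32_t pause_delay_bytes(uint32_t max_mtu)
      {
          return (PAUSE_DELAY_BYTES - max_mtu) / 2;
      }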
    • mlxsw: spectrum_buffers: Add struct mlxsw_sp_hdroom · 3a77f5a2
      Committed by Petr Machata
      The port headroom handling is currently strewn across several modules and
      tricky to follow: MTU, DCB PFC, DCB ETS and ethtool pause all influence the
      settings, and then there is the completely separate initial configuration in
      spectrum_buffers. A following patch will implement the dcbnl_setbuffer
      callback, which is going to further complicate the landscape.
      
      In order to simplify work with port buffers, the following patches are
      going to centralize all port-buffer handling in spectrum_buffers. As a
      first step, introduce a (currently empty) struct mlxsw_sp_hdroom that will
      keep the configuration parameters, and allocate and free it in appropriate
      places.
      Signed-off-by: Petr Machata <petrm@nvidia.com>
      Reviewed-by: Jiri Pirko <jiri@nvidia.com>
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  12. 16 September 2020 (1 commit)
  13. 15 September 2020 (4 commits)
    • mlxsw: spectrum_span: Derive SBIB from maximum port speed & MTU · 532b49e4
      Committed by Petr Machata
      The SBIB register configures the size of an internal buffer that the
      Spectrum ASICs use when mirroring traffic on egress. This size should be
      taken into account when validating that the port headroom buffers are not
      larger than the chip can handle. Up until now this was not done, which is
      incidentally not a problem, because the priority group buffers that mlxsw
      auto-configures are small enough that the boundary condition could not be
      violated.
      
      However, when dcbnl_setbuffer is implemented, the user has control over
      the sizes of PG buffers and might overshoot the headroom capacity. The
      size of the SBIB buffer, on the other hand, depends on port speed, and
      that cannot be vetoed. Therefore the SBIB size should be deduced from
      the maximum port speed.
      
      Additionally, once the buffers are configured by hand, the user could get
      into an uncomfortable situation where their MTU change requests get vetoed,
      because the SBIB does not fit anymore. Therefore derive SBIB size from
      maximum permissible MTU as well.
      
      Remove all the code that adjusted the SBIB size whenever speed or MTU
      changed.
      Signed-off-by: Petr Machata <petrm@nvidia.com>
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum: Keep maximum speed around · 3232e8c6
      Committed by Petr Machata
      The maximum port speed depends on link modes supported by the port, and for
      Ethernet ports is constant. The maximum speed will be handy when setting
      SBIB, the internal buffer used for traffic mirroring. Therefore, keep it in
      struct mlxsw_sp_port for easy access.
      Signed-off-by: Petr Machata <petrm@nvidia.com>
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum: Keep maximum MTU around · 2ecf87ae
      Committed by Petr Machata
      The maximum port MTU depends on port type. On Spectrum, mlxsw configures
      all ports as Ethernet ports, and the maximum MTU therefore never changes.
      Besides checking MTU configuration, maximum MTU will also be handy when
      setting SBIB, the internal buffer used for traffic mirroring. Therefore,
      keep it in struct mlxsw_sp_port for easy access.
      Signed-off-by: Petr Machata <petrm@nvidia.com>
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum_ethtool: Introduce ptys_max_speed callback · 60fbc521
      Committed by Petr Machata
      The SBIB register configures the size of an internal buffer that the
      Spectrum ASICs use when mirroring traffic on egress. This size should be
      taken into account when validating that the port headroom buffers are not
      larger than the chip can handle. Up until now this was not done, which is
      incidentally not a problem, because the priority group buffers that mlxsw
      auto-configures are small enough that the boundary condition could not be
      violated.
      
      When dcbnl_setbuffer is implemented, the user gets control over sizes of PG
      buffers, and they might overshoot the headroom capacity. However the size
      of the SBIB buffer depends on port speed, which cannot be vetoed. There is
      obviously no way to retroactively push back on requests for overlarge PG
      buffers, or reject an overlarge MTU, or cancel losslessness of a certain
      PG.
      
      Therefore, instead of taking into account the current speed when
      calculating SBIB buffer size, take into account the maximum speed that a
      port with given Ethernet protocol capabilities can have.
      
      To that end, add a new ethtool callback, ptys_max_speed, which determines
      this maximum speed.
      Signed-off-by: Petr Machata <petrm@nvidia.com>
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
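
      Determining the maximum speed from the Ethernet protocol capabilities
      boils down to scanning the supported link mode bits and keeping the
      largest speed, roughly as below. The table contents and names are
      illustrative, not the driver's chip-specific tables.

      #include <stdint.h>

      struct link_mode_info { uint32_t speed_mbps; };

      /* Hypothetical per-bit speeds; the real table is chip-specific. */
      static const struct link_mode_info link_modes[] = {
          { 1000 }, { 10000 }, { 25000 }, { 40000 },
          { 50000 }, { 100000 }, { 200000 },
      };

      /* Return the highest speed among the link modes the port can support
       * (eth_proto_cap is the capability bitmask queried from the device). */
      static uint32_t ptys_max_speed(uint32_t eth_proto_cap)
      {
          uint32_t max = 0;

          for (uint32_t i = 0; i < sizeof(link_modes) / sizeof(link_modes[0]); i++)
              if ((eth_proto_cap & (1u << i)) && link_modes[i].speed_mbps > max)
                  max = link_modes[i].speed_mbps;
          return max;
      }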
  14. 24 August 2020 (1 commit)
  15. 04 August 2020 (3 commits)
  16. 16 July 2020 (3 commits)
  17. 14 July 2020 (4 commits)