Commits · 52697a9ede4ff9ce55930342c2a4d0f985df4ebf · openeuler / Kernel

03 Jul, 2016 1 commit

mlxsw: spectrum: Send untagged packets through a port netdev · 52697a9e

Committed by Ido Schimmel on Jul 02, 2016

Port netdevs (e.g. swXpY) that are not bridged are represented in the
device using a vPort with VID=PVID=1 (the PVID vPort), as untagged
packets entering the switch are internally tagged with the PVID VLAN.
When these packets are routed through a different port netdev they
should egress untagged.

This wasn't a problem until now, as non-bridged traffic only originated
from the CPU, which transmits packets out of the port as-is.

When a vPort is created with VID 1 mark it as egress untagged.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

21 Jun, 2016 21 commits

mlxsw: spectrum: Add debug prints · 22305378

Committed by Ido Schimmel on Jun 20, 2016

For debug purposes, it's useful to know the order in which the driver
responds to changes in the topology of its upper devices.

Add debug prints to signal these events.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum: Free resources upon vPort destruction · 1c800759

Committed by Ido Schimmel on Jun 20, 2016

There are situations in which a vPort is destroyed while still holding
references to device's resources such as FIDs and FDB records. This can
happen, for example, when a VLAN device is deleted while still being
bridged.

Instead of trying to make sure vPort destruction is invoked when it no
longer uses device's resources, just free them upon destruction. This
simplifies the code, as we no longer need to take different situations
into account when events are received - cleanup is taken care of in one
place.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum: Refactor FDB flushing logic · fe3f6d14

Committed by Ido Schimmel on Jun 20, 2016

FDB entries are learned using {Port / LAG ID, FID} and therefore should
be flushed whenever a port (vPort) leaves its FID (vFID).

However, when the bridge port is a LAG device (or a VLAN device on top),
then FDB flushing is conditional. Ports removed from such LAG
configurations must not trigger flushing, as other ports might still be
members in the LAG and therefore the bridge port is still active.

The decision whether to flush or not was previously computed in the
netdevice notification block, but in order to flush the entries when a
port leaves its FID this decision should be computed there.

Strip the notification block from this logic and instead move it to one
FDB flushing function that is invoked from both the FID / vFID leave
functions.

When a port isn't a member of a LAG, FDB flushing should always occur.
Otherwise, it should occur only when the last port (vPort) member of the
LAG leaves the FID (vFID).

This will allow us - in the next patch - to simplify the cleanup code
paths that are hit whenever the topology above the port netdevs changes.
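
The flush decision described above can be sketched as a small Python model. This is an illustration only, not the driver's code: the `Lag` class, its `members_in_fid` field and the function name are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional, Set, Tuple


@dataclass
class Lag:
    """Hypothetical LAG record: which (port, FID) memberships are still active."""
    lag_id: int
    members_in_fid: Set[Tuple[int, int]] = field(default_factory=set)


def leave_fid_needs_flush(port_id: int, fid: int, lag: Optional[Lag]) -> bool:
    """Record that port_id left fid and report whether the FDB must be flushed."""
    if lag is None:
        # Port is not a LAG member: always flush.
        return True
    lag.members_in_fid.discard((port_id, fid))
    # LAG member: flush only if no other member of the LAG is still in this FID.
    return not any(f == fid for (_, f) in lag.members_in_fid)
```

With two LAG members in the same FID, only the second leave reports that a flush is needed.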

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum: Don't count on FID being present · 56918b6b

Committed by Ido Schimmel on Jun 20, 2016

Not all vPorts will have FIDs assigned to them, so make sure functions
first test for FID presence.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum: Add FID get / set functions · 41b996cc

Committed by Ido Schimmel on Jun 20, 2016

As previously explained, not all vPorts will be assigned FIDs, so instead
of returning the FID index of a vPort, return a pointer to its FID
struct. This will allow us to know whether it's legal to access the
vPort's FID parameters such as index and device.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum: Use per-FID struct for the VLAN-aware bridge · 14d39461

Committed by Ido Schimmel on Jun 20, 2016

In a very similar way to the vFIDs, make the first 4K FIDs - used in the
VLAN-aware bridge - use the new FID struct.

Upon first use of the FID by any of the ports do the following:

1) Create the FID
2) Set up a matching flooding entry
3) Create a mapping for the FID

Unlike vFIDs, upon creation of a FID we always create a global
VID-to-FID mapping, so that ports without upper vPorts can use it
instead of creating an explicit {Port, VID} to FID mapping.

When a port leaves a FID the reverse is performed. Whenever the FID's
reference count reaches zero the FID is deleted along with the global
mapping.

The per-FID struct will later allow us to configure L3 interfaces on top
of the VLAN-aware bridge.
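
The reference counting described above can be sketched as follows. This is a minimal Python model, not the driver's implementation: the `FidTable` class and the operation strings are hypothetical, and the sketch assumes the FID index equals the VID for the VLAN-aware bridge.

```python
from typing import Dict, List


class FidTable:
    """Hypothetical bookkeeping for the VLAN-aware bridge FIDs."""

    def __init__(self) -> None:
        self.ref_count: Dict[int, int] = {}
        self.ops: List[str] = []  # device operations, in the order issued

    def join(self, fid: int) -> None:
        if fid not in self.ref_count:
            # First user of this FID: create it, set up flooding, then add
            # the global VID-to-FID mapping (FID == VID is assumed here).
            self.ops += [f"create {fid}", f"flood {fid}", f"map vid {fid}"]
            self.ref_count[fid] = 0
        self.ref_count[fid] += 1

    def leave(self, fid: int) -> None:
        self.ref_count[fid] -= 1
        if self.ref_count[fid] == 0:
            # Last user gone: undo the three steps in reverse order.
            self.ops += [f"unmap vid {fid}", f"unflood {fid}", f"destroy {fid}"]
            del self.ref_count[fid]
```

Two ports joining the same FID trigger the create / flood / map steps only once, and the teardown runs only when the second port leaves.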

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum: Remove unused function argument · 37286d25

Committed by Ido Schimmel on Jun 20, 2016

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum: Use join / leave functions for vFID operations · 0355b59f

Committed by Ido Schimmel on Jun 20, 2016

When a vPort is created or when it joins a bridge we always do the same
set of operations:

1) Create the vFID, if not already created
2) Set up flooding for the vFID
3) Map the {Port, VID} to the vFID

When a vPort is destroyed or when it leaves a bridge the reverse is
performed.

Encapsulate the above in join / leave functions and simplify the code.
FIDs and rFIDs will use a similar set of functions.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum: Make vFID struct generic · d0ec875a

Committed by Ido Schimmel on Jun 20, 2016

Up until now we had a dedicated struct only for vFIDs, but before
introducing support for L3 interfaces we need to make it generic and
use it for all three types of FIDs:

1) FIDs - 0..4K-1, used for the VLAN-aware bridge
2) vFIDs - 4K..15K-1, used for VLAN-unaware bridges
3) rFIDs - 15K..16K-1, used to direct traffic to / from the router in
the device. Will be introduced later in the series.

The three types of L3 interfaces - Router InterFaces, RIFs - that will
be introduced correspond to the three types of FIDs and are configured
using them. Therefore, we'll need to store the links between them as
well as a reference count on the underlying FID, so that the
corresponding RIF will be destroyed when it reaches zero.

Note that the lower 0.5K vFIDs are currently used for non-bridged
netdevs, so that traffic can be flooded to the CPU port. However, once
rFIDs are introduced we'll no longer need these and they too will be
used for VLAN-unaware bridges.

Make the vFID struct generic by renaming it and some of its fields. FIDs
will be converted to use it later in the series.
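
The three FID ranges listed above can be expressed as a small classifier. This is a sketch under the assumption that "K" means 1024; the constant and function names are hypothetical and not taken from the driver.

```python
K = 1024  # assumption: "4K", "15K" and "16K" in the text are multiples of 1024

VFID_BASE = 4 * K   # first vFID (VLAN-unaware bridges)
RFID_BASE = 15 * K  # first rFID (router traffic)
FID_MAX = 16 * K    # one past the last valid FID index


def fid_type(fid: int) -> str:
    """Classify a FID index into one of the three ranges from the text."""
    if not 0 <= fid < FID_MAX:
        raise ValueError(f"FID index out of range: {fid}")
    if fid < VFID_BASE:
        return "fid"   # 0..4K-1, VLAN-aware bridge
    if fid < RFID_BASE:
        return "vfid"  # 4K..15K-1, VLAN-unaware bridges
    return "rfid"      # 15K..16K-1, router


def vfid_to_fid(vfid: int) -> int:
    """Convert a zero-based vFID index into a FID index."""
    return VFID_BASE + vfid
```

Under this layout vFID 0 corresponds to FID index 4096, which matches the "pass 4K instead of 0" remark in the vFID creation commit further down.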

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum: Use FID instead of vFID to setup flooding · e6060027

Committed by Ido Schimmel on Jun 20, 2016

Use a FID index instead of vFID and ease the transition towards a
generic FID struct.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum: Create a function to map vPort's FID · 9c4d4423

Committed by Ido Schimmel on Jun 20, 2016

A FID used by a vPort (vFID, but also rFID later in the series) is
always mapped using {Port, VID} and not only VID as with the 4K FIDs of
the VLAN-aware bridge.

Instead of specifying all the arguments each time, just wrap this
operation using a dedicated function and simplify the code.

As before, the function takes FID as its argument in preparation for a
generic FID struct.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum: Use only one function to create vFIDs · c7e920b5

Committed by Ido Schimmel on Jun 20, 2016

Simplify the code and use only one function for vFID creation /
destruction.

Unlike before, the function receives a FID index as its argument and not
a vFID index. Instead of passing 0, now one would need to pass 4K, which
is the first vFID.

This is the first step in creating a generic FID struct that will be
used for all three types of FIDs.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum: Remove redundant function argument · 47a0a9e6

Committed by Ido Schimmel on Jun 20, 2016

In all call sites 'only_uc' is set to false, so strip it.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum: Centralize VLAN-aware bridge ref counting · 7117a570

Committed by Ido Schimmel on Jun 20, 2016

We hold a reference count on the number of ports that are members of the
VLAN-aware bridge, as we only support one.

Instead of always incrementing / decrementing the reference count after
joining / leaving the bridge, simply do this accounting in the join /
leave functions.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum: Remove unnecessary function argument · 27943895

Committed by Ido Schimmel on Jun 20, 2016

The argument 'br_dev' is never used, so remove it.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum: Make unlinking functions return void · 82e6db03

Committed by Ido Schimmel on Jun 20, 2016

When responding to unlinking CHANGEUPPER notifications we shouldn't
return any value, as it's not checked by upper layers.

In addition, there's nothing the driver can do in case of failure, so it
should simply continue and try to free as many resources as possible and
not stop on the first error.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum: Use WARN_ON() return value · 423b937e

Committed by Ido Schimmel on Jun 20, 2016

Instead of checking for a condition and then issuing the warning, just do
it in one go and simplify the code.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum: Remove unnecessary checks from event processing · ddbe993d

Committed by Ido Schimmel on Jun 20, 2016

When the upper device of a VLAN device changes we already made sure it's
a bridge device in PRECHANGEUPPER, so no need to check it's a master
device in CHANGEUPPER.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum: Forbid LAG slave from having VLAN uppers · 6ec43904

Committed by Ido Schimmel on Jun 20, 2016

When a port netdev is put under LAG it cannot have VLAN upper devices,
so forbid that. The LAG device itself can have VLAN upper devices.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum: Sanitize port netdev upper devices · 59fe9b3f

Committed by Ido Schimmel on Jun 20, 2016

We currently only support the following upper devices for port netdevs:
1) Bridge
2) LAG (bond / team)
3) VLAN

Any other device is forbidden, so return an error.
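
The allow-list check described above might look like the following Python sketch. The kind strings, the function name and the choice of `-EINVAL` as the error are assumptions for illustration; the commit message only says that an error is returned.

```python
import errno

# Upper device kinds the text lists as supported (bond / team are both LAGs).
ALLOWED_UPPER_KINDS = frozenset({"bridge", "bond", "team", "vlan"})


def check_port_upper(upper_kind: str) -> int:
    """Return 0 if the upper device kind is supported, else a negative errno."""
    if upper_kind not in ALLOWED_UPPER_KINDS:
        # Hypothetical error code; the real driver may use a different one.
        return -errno.EINVAL
    return 0
```

Rejecting the link up front means the later event handlers only ever see the three supported topologies.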

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum: Use notifier_from_errno() in notifier block · 80bedf1a

Committed by Ido Schimmel on Jun 20, 2016

Instead of checking the error value and returning NOTIFY_BAD, just use
notifier_from_errno() and simplify the code.
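
To show what the helper buys, here is a Python model of the errno-to-notifier encoding. It mirrors the kernel's `notifier_from_errno()` / `notifier_to_errno()` pair as commonly defined in `include/linux/notifier.h`; the constant values are reproduced from memory and should be treated as assumptions to check against the kernel tree.

```python
NOTIFY_OK = 0x0001
NOTIFY_STOP_MASK = 0x8000
NOTIFY_BAD = NOTIFY_STOP_MASK | 0x0002


def notifier_from_errno(err: int) -> int:
    """Encode a negative errno (or 0) into a notifier return value."""
    if err:
        # Stop the notifier chain and carry the errno in the low bits.
        return NOTIFY_STOP_MASK | (NOTIFY_OK - err)
    return NOTIFY_OK


def notifier_to_errno(ret: int) -> int:
    """Decode a notifier return value back into a negative errno (or 0)."""
    ret &= ~NOTIFY_STOP_MASK
    return NOTIFY_OK - ret if ret > NOTIFY_OK else 0
```

Unlike a bare NOTIFY_BAD, the encoded value lets the caller recover the original error code, so a handler can simply return `notifier_from_errno(err)` in both the success and failure cases.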

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

18 Jun, 2016 1 commit

mlxsw: spectrum: Don't count internal TX header bytes to stats · 63dcdd35

Committed by Nogah Frankel on Jun 17, 2016

Stop the SW TX counter from counting the TX header bytes
since they are not being sent out.

Fixes: 56ade8fe ("mlxsw: spectrum: Add initial support for Spectrum ASIC")
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

10 Jun, 2016 2 commits

mlxsw: spectrum: Don't sleep during ndo_get_phys_port_name() · d664b41e

Committed by Ido Schimmel on Jun 09, 2016

When rtnl_fill_ifinfo() is called for a certain netdevice it queries its
various parameters such as switch id and physical port name. The
function might get called in an atomic context, which means the
underlying driver must not sleep during the query operation.

Don't query the device and sleep during ndo_get_phys_port_name(), but
instead store the needed parameters at port creation time.

Fixes: 2bf9a586 ("mlxsw: spectrum: Add support for physical port names")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum: Make split flow match firmware requirements · be94535f

Committed by Ido Schimmel on Jun 09, 2016

When a port is created following a split / unsplit we need to map it to
the correct module and lane, enable it and then continue to initialize
its various parameters such as MTU and VLAN filters.

Under certain conditions, such as trying to split ports at the bottom
row of the front panel by four, we get firmware errors.

After evaluating this with the firmware team it was decided to alter the
split / unsplit flow, so that first all the affected ports are mapped,
then enabled and finally each is initialized separately.

Fix the split / unsplit flow by first mapping and enabling all the
affected ports. Newer firmware versions will support both flows.
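
The reordered flow can be sketched as three passes over the affected ports. This is a Python illustration only; the function name and the operation tuples are hypothetical stand-ins for the driver's map / enable / initialize steps.

```python
from typing import List, Tuple


def split_flow(ports: List[int]) -> List[Tuple[str, int]]:
    """Return the order of operations for the altered split / unsplit flow."""
    ops: List[Tuple[str, int]] = []
    # Pass 1: map every affected port to its module and lane.
    for port in ports:
        ops.append(("map", port))
    # Pass 2: enable every affected port.
    for port in ports:
        ops.append(("enable", port))
    # Pass 3: initialize each port separately (MTU, VLAN filters, ...).
    for port in ports:
        ops.append(("init", port))
    return ops
```

The previous flow, by contrast, ran map, enable and init back to back for one port before moving on to the next.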

Fixes: 18f1e70c ("mlxsw: spectrum: Introduce port splitting")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

07 May, 2016 2 commits

mlxsw: spectrum: Fix ordering in mlxsw_sp_fini · 5113bfdb

Committed by Jiri Pirko on May 06, 2016

Fixes: 0f433fa0 ("mlxsw: spectrum_buffers: Implement shared buffer configuration")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum: Fix rollback order in LAG join failure · 51554db2

Committed by Ido Schimmel on May 06, 2016

Make the leave procedure in the error path symmetric to the join
procedure and first remove the port from the collector before
potentially destroying the LAG.

Fixes: 0d65fc13 ("mlxsw: spectrum: Implement LAG port join/leave")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

15 Apr, 2016 2 commits

mlxsw: spectrum_buffers: Implement occupancy monitoring · 2d0ed39f

Committed by Jiri Pirko on Apr 14, 2016

Implement occupancy API introduced in devlink and mlxsw core. This is
done by accessing SBPM register for Port-Pool and SBSR for Port-TC
current and max occupancy values. Max clear is implemented using the
same registers.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum_buffers: Implement shared buffer configuration · 0f433fa0

Committed by Jiri Pirko on Apr 14, 2016

Implement previously introduced mlxsw core shared buffer API.
For Spectrum, that is done utilizing registers SBPR, SBCM and SBPM.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

09 Apr, 2016 3 commits

mlxsw: Do not pass around driver_priv directly · b2f10571

Committed by Jiri Pirko on Apr 08, 2016

Instead of that, pass mlxsw_core and use a helper to get driver priv
from driver code. Looks much cleaner that way.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: Pass mlxsw_core as a param of mlxsw_core_skb_transmit* · 307c2431

Committed by Jiri Pirko on Apr 08, 2016

Instead of passing around driver priv, pass struct mlxsw_core *
directly.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: Move devlink port registration into common core code · 932762b6

Committed by Jiri Pirko on Apr 08, 2016

Remove devlink port reg/unreg from spectrum and switchx2 code and rather
do the common work in core. That also ensures code separation where
devlink is only used in core.c.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

07 Apr, 2016 8 commits

mlxsw: spectrum: Add IEEE 802.1Qbb PFC support · d81a6bdb

Committed by Ido Schimmel on Apr 06, 2016

Implement the appropriate DCB ops and allow a user to configure certain
traffic classes as lossless.

The operation configures PFC for both the egress (respecting PFC frames)
and ingress (sending PFC frames) parts of the port.

At egress, when a PFC frame is received for a PFC enabled priority, then
all the priorities mapped to the same TC are stopped.

At ingress, the priority group (PG) buffers to which the enabled PFC
priorities are mapped are configured to be lossless. PFC frames will be
transmitted when the Xoff threshold is crossed.

The user-supplied delay parameter is used to determine the PG's size
according to the following formula:

PG_SIZE = PG_SIZE_LOSSY + delay * CELL_FACTOR + MTU

In the worst case scenario the delay will be made up of packets that
are all of size CELL_SIZE + 1, which means each packet will require
almost twice its true size when buffered in the switch. We therefore
multiply this value by the "cell factor", which is close to 2.

Another MTU is added in case the transmitting host already started
transmitting a maximum length frame when the PFC packet was received.
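
As a worked example of the formula above, the following Python sketch computes a PG size in cells. Several values are assumptions rather than facts from the commit message: a cell size of 96 bytes, a cell factor of exactly 2 (the text only says "close to 2"), a delay already expressed in cells, and PG_SIZE_LOSSY taken as twice the MTU, as described in the headroom commit further down.

```python
CELL_SIZE_BYTES = 96  # assumption: buffer cell size of the device
CELL_FACTOR = 2       # assumption: the text says "close to 2"


def bytes_to_cells(num_bytes: int) -> int:
    """Round a byte count up to whole buffer cells."""
    return -(-num_bytes // CELL_SIZE_BYTES)


def pfc_pg_size_cells(mtu_bytes: int, delay_cells: int) -> int:
    """PG_SIZE = PG_SIZE_LOSSY + delay * CELL_FACTOR + MTU, all in cells."""
    mtu_cells = bytes_to_cells(mtu_bytes)
    pg_size_lossy = 2 * mtu_cells  # lossy size: twice the current MTU
    return pg_size_lossy + delay_cells * CELL_FACTOR + mtu_cells
```

For an MTU of 1500 bytes (16 cells under these assumptions) and a delay of 100 cells, this gives 32 + 200 + 16 = 248 cells.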

As with PAUSE enabled ports, when the port's MTU is changed both the
PGs' size and threshold are adjusted accordingly.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: reg: Introduce per priority counters · 34dba0a5

Committed by Ido Schimmel on Apr 06, 2016

We are going to add support for PFC as part of DCB ops, which requires us
to report the number of PFC frames sent and received per priority.

Add per priority counters in order to report number of PFC frames sent
and received per priority.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum: Add support for PAUSE frames · 9f7ec052

Committed by Ido Schimmel on Apr 06, 2016

When a packet ingresses the switch it's placed in its assigned priority
group (PG) buffer in the port's headroom buffer while it goes through
the switch's pipeline. After going through the pipeline - which
determines its egress port(s) and traffic class - it's moved to the
switch's shared buffer awaiting transmission.

However, some packets are not eligible to enter the shared buffer due to
exceeded quotas or insufficient space. Marking their associated PGs as
lossless will cause the packets to accumulate in the PG buffer. Another
reason for packet accumulation is a complicated pipeline (e.g. one
involving a lot of ACLs).

To prevent packets from being dropped a user can enable PAUSE frames on
the port. This will mark all the active PGs as lossless and set their
size according to the maximum delay, as it's not configured by the user.

                         +----------------+   +
                         |                |   |
                         |                |   |
                         |                |   |
                         |                |   |
                         |                |   |
                         |                |   | Delay
                         |                |   |
                         |                |   |
                         |                |   |
                         |                |   |
                         |                |   |
    Xon/Xoff threshold   +----------------+   +
                         |                |   |
                         |                |   | 2 * MTU
                         |                |   |
                         +----------------+   +

The delay (612 [Cells]) was calculated according to worst-case scenario
involving maximum MTU and 100m cables.
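
The buffer layout in the figure can be turned into numbers with a short Python sketch. The 612-cell delay comes from the text; the 96-byte cell size and the function names are assumptions made for illustration.

```python
PAUSE_DELAY_CELLS = 612  # from the text: worst-case delay in cells
CELL_SIZE_BYTES = 96     # assumption: buffer cell size of the device


def bytes_to_cells(num_bytes: int) -> int:
    """Round a byte count up to whole buffer cells."""
    return -(-num_bytes // CELL_SIZE_BYTES)


def pause_pg_layout(mtu_bytes: int) -> tuple:
    """Return (pg_size, xon_xoff_threshold) in cells for a PAUSE-enabled PG."""
    threshold = 2 * bytes_to_cells(mtu_bytes)  # lower part of the figure
    size = threshold + PAUSE_DELAY_CELLS       # delay region sits above it
    return size, threshold
```

With an MTU of 1500 bytes this yields a threshold of 32 cells and a total PG size of 644 cells under the stated assumptions.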

After marking the PGs as lossless the device is configured to respect
incoming PAUSE frames (Rx PAUSE) and generate PAUSE frames (Tx PAUSE)
according to user's settings.

Whenever the port's headroom configuration changes we take into account
the PAUSE configuration, so that we correctly set the PG's type (lossy /
lossless), size and threshold. This can happen when:

a) The port's MTU changes, as it directly affects the PG's size.

b) A PG is created following user configuration, by binding a priority
to it.

Note that the relevant SUPPORTED flags were already mistakenly set by
the driver before this commit.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum: Allow setting maximum rate for a TC · cc7cf517

Committed by Ido Schimmel on Apr 06, 2016

Allow a user to set maximum rate for a particular TC using DCB ops.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum: Add IEEE 802.1Qaz ETS support · 8e8dfe9f

Committed by Ido Schimmel on Apr 06, 2016

Implement the appropriate DCB ops and allow a user to configure:
	* Priority to traffic class (TC) mapping with a total of 8
	  supported TCs
	* Transmission selection algorithm (TSA) for each TC and the
	  corresponding weights in case of weighted round robin (WRR)

As previously explained, we treat the priority group (PG) buffer in the
port's headroom as the ingress counterpart of the egress TC. Therefore,
when a certain priority to TC mapping is configured, we also configure
the port's headroom buffer.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum: Introduce support for Data Center Bridging (DCB) · f00817df

Committed by Ido Schimmel on Apr 06, 2016

Introduce basic infrastructure for DCB and add the missing ops in
following patches.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum: Initialize egress scheduling · 90183b98

Committed by Ido Schimmel on Apr 06, 2016

Before introducing support for DCB ops we should first make sure we
initialize the relevant parts in the device correctly. Specifically, the
egress scheduling.

The device supports a superset of the 802.1Qaz standard with 4 hierarchy
levels that can be linked to each other in multiple ways and with
different transmission selection algorithms (TSA) employed between them.

However, since we only intend to support the 802.1Qaz standard we
flatten the hierarchies and let the user configure via DCB ops the TSA
and max rate shaper at the subgroup hierarchy (see figure below) and the
mapping between switch priority to traffic class. By default, all switch
priorities are mapped to traffic class 0, strict priority is employed
and max shaper is disabled.

Default configuration:

         switch priority 0      ...         switch priority 7
                 +                                  +
                 |                                  |
                 +----------------------------------+
                 |
              +--v--+                          +-----+
Traffic Class |     |                          |     |
  Hierarchy   | TC0 |           ...            | TC7 |
              |     |                          |     |
              +--+--+                          +--+--+
                 |                                |
              +--v--+                          +--v--+
  Subgroup    | SG0 |                          | SG7 |
  Hierarchy   |     |                          |     |
              +-----+                          +-----+
              | TSA |                          | TSA |
              +-----+           ...            +-----+
              | MAX |                          | MAX |
              +--+--+                          +--+--+
                 |                                |
                 +---------------+----------------+
                                 |
                              +--v--+
                      Group   |     |
                    Hierarchy | GR0 |
                              |     |
                              +--+--+
                                 |
                              +--v--+
                      Port    |     |
                    Hierarchy | PR0 |
                              |     |
                              +-----+
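
The default configuration in the figure can also be written down as data. The following Python sketch is illustrative only; the `Subgroup` class and its field names are hypothetical, while the defaults (all priorities to TC0, strict priority, max shaper disabled) come from the text.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

NUM_PRIORITIES = 8  # switch priorities 0..7
NUM_TCS = 8         # traffic classes TC0..TC7, one subgroup each


@dataclass
class Subgroup:
    """Per-TC subgroup settings that DCB ops can later change."""
    tsa: str = "strict"             # transmission selection algorithm
    max_rate: Optional[int] = None  # None means the max shaper is disabled


def default_egress_config() -> Tuple[Dict[int, int], Dict[int, Subgroup]]:
    """Build the default priority-to-TC map and subgroup settings."""
    prio_to_tc = {prio: 0 for prio in range(NUM_PRIORITIES)}
    subgroups = {tc: Subgroup() for tc in range(NUM_TCS)}
    return prio_to_tc, subgroups
```

The group and port levels carry no user-configurable state in this flattened model, which is why only the subgroup level appears in the data.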

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum: Correctly configure headroom size · ff6551ec

Committed by Ido Schimmel on Apr 06, 2016

When packets ingress the switch they are assigned a switch priority and
directed to the corresponding priority group (PG) buffer in the port's
headroom buffer.

Since we now map all switch priorities to priority group 0 (PG0) by
default, there is no need to allocate the other priority groups during
initialization. The only exception is PG9, which is used for control
traffic.

At minimum, the PG should be able to store the currently classified
packet (pipeline latency isn't 0) and also the packets arriving during
the classification time. However, an incoming packet will not be
buffered if there is no available MTU-sized buffer space for storing it.

The buffer needed to accommodate for pipeline latency is variable and
needs to take into account both the current link speed and current
latency of the pipeline, which is time-dependent. Testing showed that
setting the PG's size to twice the current MTU is optimal.

Since PG9 is used strictly for control packets and not subject to flow
control, we are not going to resize it according to user configuration,
so we simply set it according to worst case scenario, which is twice the
maximum MTU.

In any case, later patches in the series will allow a user to direct
lossless flows to PGs other than PG0 and set their size to accommodate
round-trip propagation delay.

The above change also requires us to resize the PG buffer whenever the
port's MTU is changed.
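
The sizing rules above can be summarized in a few lines of Python. This is a sketch, not the driver's code: the maximum MTU value of 10000 bytes and the function name are assumptions, while the "twice the MTU" rule and the PG0 / PG9 roles come from the text.

```python
from typing import Dict

MAX_MTU_BYTES = 10000  # assumption: the port's maximum supported MTU
DATA_PG = 0            # all switch priorities map here by default
CONTROL_PG = 9         # control traffic, never resized


def headroom_sizes(mtu_bytes: int) -> Dict[int, int]:
    """Return the headroom size in bytes for each allocated priority group."""
    return {
        DATA_PG: 2 * mtu_bytes,         # tracks the current MTU
        CONTROL_PG: 2 * MAX_MTU_BYTES,  # fixed worst case
    }
```

Calling the function again after an MTU change models the resize: only the PG0 entry moves, while PG9 stays at its worst-case size.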

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

openeuler / Kernel: synced successfully more than 1 year ago