- 30 March 2021, 1 commit
-
-
Submitted by Ido Schimmel

The cited commit changed the behavior of the software data path with regards to the ECN marking of decapsulated packets. However, it did not change other callers of __INET_ECN_decapsulate(), namely mlxsw. The driver uses the function to ensure that the hardware and software data paths act the same with regards to the ECN marking of decapsulated packets.

The discrepancy was uncovered by commit 5aa3c334 ("selftests: forwarding: vxlan_bridge_1d: Fix vxlan ecn decapsulate value") that aligned the selftest to the new behavior. Without this patch the selftest passes when used with veth pairs, but fails when used with mlxsw netdevs.

Fix this by instructing the device to propagate the ECT(1) mark from the outer header to the inner header when the inner header is ECT(0), for both NVE and IP-in-IP tunnels. A helper is added in order not to duplicate the code between both tunnel types.

Fixes: b7237487 ("tunnel: Propagate ECT(1) when decapsulating as recommended by RFC6040")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
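For illustration, a minimal C sketch of the RFC 6040 decapsulation rule that the hardware is now instructed to mirror; the constant values follow include/net/inet_ecn.h, but the helper itself is illustrative, not the actual mlxsw or kernel function:

/* Hedged sketch of the RFC 6040 decap rule. Returns the resulting inner
 * ECN codepoint, or -1 when the packet must be dropped.
 */
#define ECN_NOT_ECT 0
#define ECN_ECT_1   1
#define ECN_ECT_0   2
#define ECN_CE      3

static int rfc6040_decap_ecn(unsigned char outer, unsigned char inner)
{
    if (outer == ECN_CE && inner == ECN_NOT_ECT)
        return -1;          /* invalid combination: drop */
    if (outer == ECN_CE)
        return ECN_CE;      /* congestion mark always wins */
    if (outer == ECN_ECT_1 && inner == ECN_ECT_0)
        return ECN_ECT_1;   /* the ECT(1) propagation this patch offloads */
    return inner;           /* otherwise keep the inner marking */
}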
-
- 27 February 2021, 2 commits
-
-
Submitted by Ido Schimmel

Routes are currently processed from a workqueue, whereas nexthop objects are processed in system call context. This can result in the driver not finding a suitable nexthop group for a route and issuing a warning [1]. Fix this by ignoring such routes earlier in the process. The subsequent deletion notification will be ignored as well.

[1]
WARNING: CPU: 2 PID: 7754 at drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:4853 mlxsw_sp_router_fib_event_work+0x1112/0x1e00 [mlxsw_spectrum]
[...]
CPU: 2 PID: 7754 Comm: kworker/u8:0 Not tainted 5.11.0-rc6-cq-20210207-1 #16
Hardware name: Mellanox Technologies Ltd. MSN2100/SA001390, BIOS 5.6.5 05/24/2018
Workqueue: mlxsw_core_ordered mlxsw_sp_router_fib_event_work [mlxsw_spectrum]
RIP: 0010:mlxsw_sp_router_fib_event_work+0x1112/0x1e00 [mlxsw_spectrum]

Fixes: cdd6cfc5 ("mlxsw: spectrum_router: Allow programming routes with nexthop objects")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reported-by: Alex Veber <alexve@nvidia.com>
Tested-by: Alex Veber <alexve@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
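A hedged sketch of the early bail-out described above. 'struct fib_entry_notifier_info' and the 'fib_info::nh' pointer are the real kernel interfaces; the helper name and the group-lookup predicate are illustrative, not the driver's actual symbols:

/* Hedged sketch: skip FIB events for routes that use a nexthop object the
 * driver has not (yet) been notified about, before the event is queued to
 * the ordered workqueue, so the work item never hits the WARN. The later
 * deletion notification is then skipped symmetrically.
 */
static bool fib4_event_should_ignore(const struct fib_entry_notifier_info *fen_info)
{
    /* nexthop_group_is_known() is an illustrative stand-in for the
     * driver's lookup of the programmed nexthop group.
     */
    return fen_info->fi->nh && !nexthop_group_is_known(fen_info->fi->nh);
}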
-
Submitted by Danielle Ratson

Currently, only external bits are added to the PTYS register, but one external bit was wrongly marked as internal and so was recently removed from the register. Add that bit to the PTYS register again, as it is not actually internal. Its removal resulted in the '100000baseLR4_ER4/Full' link mode no longer being supported, causing a regression on some setups.

Fixes: 5bf01b57 ("mlxsw: spectrum_ethtool: Remove internal speeds from PTYS register")
Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Reported-by: Eddie Shklaer <eddies@nvidia.com>
Tested-by: Eddie Shklaer <eddies@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
- 13 February 2021, 2 commits
-
-
Submitted by Vladimir Oltean

This switchdev attribute offers a counterproductive API for a driver writer, because although br_switchdev_set_port_flag gets passed a "flags" and a "mask", those are passed piecemeal to the driver. So while the PRE_BRIDGE_FLAGS listener knows what changed because it has the "mask", the BRIDGE_FLAGS listener doesn't, because it only has the final value.

But certain drivers can offload only certain combinations of settings. For example, they cannot change unicast flooding independently of multicast flooding - the two must be both on or both off. The way the information is passed to switchdev makes drivers not expressive enough, and unable to reject such a request ahead of time, in the PRE_BRIDGE_FLAGS notifier, so they are forced to reject it during the deferred BRIDGE_FLAGS attribute, where the rejection is currently ignored.

This patch also changes drivers to make use of the "mask" field for edge detection when possible.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
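A hedged sketch of what having both value and mask enables in a driver. The val/mask pair mirrors the 'struct switchdev_brport_flags' shape this series introduces, and BR_FLOOD/BR_MCAST_FLOOD are the real bridge port flags; the handler itself is illustrative:

/* Hedged sketch: with "val" and "mask" available, the driver can veto an
 * unsupported flag combination already in PRE_BRIDGE_FLAGS instead of
 * failing silently in the deferred commit.
 */
#include <linux/if_bridge.h>
#include <linux/errno.h>

struct brport_flags {
    unsigned long val;
    unsigned long mask;
};

static int port_pre_bridge_flags(unsigned long cur, struct brport_flags f)
{
    unsigned long new = (cur & ~f.mask) | (f.val & f.mask);

    /* Example constraint: unicast and multicast flooding must match. */
    if (!!(new & BR_FLOOD) != !!(new & BR_MCAST_FLOOD))
        return -EINVAL; /* rejected before the deferred commit phase */

    return 0;
}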
-
Submitted by Vladimir Oltean

When a struct switchdev_attr is notified through switchdev, there is no way to report informational messages, unlike for struct switchdev_obj.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
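As a small hedged illustration of what an extack on attr notifications enables, NL_SET_ERR_MSG_MOD and 'struct netlink_ext_ack' are the standard kernel interfaces, while the handler is illustrative:

/* Hedged sketch: an attr handler can now return a human-readable reason
 * to user space instead of a bare errno.
 */
#include <linux/netlink.h>
#include <linux/errno.h>

static int example_attr_set(bool supported, struct netlink_ext_ack *extack)
{
    if (!supported) {
        NL_SET_ERR_MSG_MOD(extack, "Attribute not supported on this port");
        return -EOPNOTSUPP;
    }
    return 0;
}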
-
- 09 February 2021, 3 commits
-
-
Submitted by Amit Cohen

When FIB_EVENT_ENTRY_{REPLACE,APPEND} is triggered and route insertion fails, the FIB abort mechanism is triggered. After aborting, set the appropriate hardware flag to make the kernel emit an RTM_NEWROUTE notification with the RTM_F_OFFLOAD_FAILED flag.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Submitted by Amit Cohen

After installing a route to the kernel, user space receives an acknowledgment, which means the route was installed in the kernel, but not necessarily in hardware. The asynchronous nature of route installation in hardware can lead to a routing daemon advertising a route before it was actually installed in hardware. This can result in packet loss or mis-routed packets until the route is installed in hardware.

To avoid such cases, a previous patch set added the ability to emit RTM_NEWROUTE notifications whenever the RTM_F_OFFLOAD/RTM_F_TRAP flags are changed; this behavior is controlled by a sysctl.

With the above mentioned behavior, it is possible to know from user space if the route was offloaded, but if the offload fails there is no indication to user space. Following a failure, a routing daemon will wait indefinitely for a notification that will never come.

This patch adds an "offload_failed" indication to IPv6 routes, so that users will have better visibility into the offload process. 'struct fib6_info' is extended with a new field that indicates if route offload failed. Note that the new field is added using an unused bit, and therefore there is no need to increase the struct size.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Submitted by Amit Cohen

After installing a route to the kernel, user space receives an acknowledgment, which means the route was installed in the kernel, but not necessarily in hardware. The asynchronous nature of route installation in hardware can lead to a routing daemon advertising a route before it was actually installed in hardware. This can result in packet loss or mis-routed packets until the route is installed in hardware.

To avoid such cases, a previous patch set added the ability to emit RTM_NEWROUTE notifications whenever the RTM_F_OFFLOAD/RTM_F_TRAP flags are changed; this behavior is controlled by a sysctl.

With the above mentioned behavior, it is possible to know from user space if the route was offloaded, but if the offload fails there is no indication to user space. Following a failure, a routing daemon will wait indefinitely for a notification that will never come.

This patch adds an "offload_failed" indication to IPv4 routes, so that users will have better visibility into the offload process. 'struct fib_alias' and 'struct fib_rt_info' are extended with a new field that indicates if route offload failed. Note that the new field is added using an unused bit, and therefore there is no need to increase the structs' size.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
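A hedged sketch of how such an indication is carried: one more 1-bit field packed into an existing hole, so the struct size does not change. The shape mirrors 'struct fib_rt_info', but the exact layout here is illustrative:

#include <linux/types.h>

/* Hedged sketch: one extra bitfield costs no extra struct size. */
struct rt_hw_info {
    u32 tb_id;
    u8  offload:1,          /* route programmed in HW */
        trap:1,             /* route trapped to CPU */
        offload_failed:1,   /* new: HW programming failed */
        unused:5;
};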
-
- 04 February 2021, 3 commits
-
-
Submitted by Danielle Ratson

Currently, when user space queries the link's parameters, such as speed and duplex, each parameter is passed from the driver to ethtool. Instead, pass the link mode bit in use: in Spectrum-1, simply pass the bit that is set to '1' in the PTYS register; in Spectrum-2, pass the first link mode bit in the mask of the used link mode.

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Submitted by Danielle Ratson

Currently, when auto negotiation is set to off, the user can force a specific speed, or both speed and duplex. The user cannot influence the number of lanes that will be forced.

Add support for setting speed along with lanes, so one can choose how many lanes will be forced. When the lanes parameter is passed from user space, choose a link mode whose actual width equals the requested number of lanes. Otherwise, the default link mode will be the one that supports the width of the port.

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
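A hedged sketch of the selection rule just described: prefer a link mode whose lane count matches the user's request, else fall back to the port's width. All types and names are illustrative, not the driver's actual code:

#include <linux/types.h>

/* Hedged sketch: pick a link mode for a forced speed, honoring a
 * user-requested lane count when one was given (req_lanes == 0 means
 * the user did not pass "lanes").
 */
struct link_mode {
    u32 speed;  /* Mb/s */
    u8  width;  /* number of lanes */
};

static const struct link_mode *
pick_link_mode(const struct link_mode *modes, int n, u32 speed,
               u8 req_lanes, u8 port_width)
{
    u8 want = req_lanes ? req_lanes : port_width;
    int i;

    for (i = 0; i < n; i++)
        if (modes[i].speed == speed && modes[i].width == want)
            return &modes[i];
    return NULL; /* no matching mode: the caller rejects the request */
}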
-
Submitted by Danielle Ratson

Currently, when a speed can be supported by different numbers of lanes, the supported link modes bitmask contains only link modes with a single number of lanes. This was done in order to prevent auto negotiation on the number of lanes after the 50G-1-lane and 100G-2-lanes link modes were introduced. For example, if a port's max width is 4, only link modes with 4 lanes are presented as supported by that port, so 100G is always achieved by 4 lanes of 25G.

After the previous patches that allow selection of the number of lanes, auto negotiation on the number of lanes becomes practical. Remove that filtering of link modes by the maximum number of lanes, so that all the supported and advertised link modes are shown.

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
- 03 February 2021, 2 commits
-
-
Submitted by Amit Cohen

With the next patch, mlxsw and netdevsim will fail to compile if CONFIG_IPV6 is disabled. Do not call fib6_info_hw_flags_set() when IPv6 is disabled.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
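A hedged sketch of the standard kernel idiom for such guards: a real declaration when IPv6 is built in and an empty inline stub otherwise, so callers compile either way. The fib6_info_hw_flags_set() name is from the source; the exact signature and the stub are illustrative of the idiom, not necessarily the patch's approach:

/* Hedged sketch of the IS_ENABLED(CONFIG_IPV6) guard idiom. */
struct net;
struct fib6_info;

#if IS_ENABLED(CONFIG_IPV6)
void fib6_info_hw_flags_set(struct net *net, struct fib6_info *f6i,
                            bool offload, bool trap);
#else
static inline void fib6_info_hw_flags_set(struct net *net,
                                          struct fib6_info *f6i,
                                          bool offload, bool trap)
{
}
#endif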
-
Submitted by Amit Cohen

The next patch will emit a notification when hardware flags are changed, in case the fib_notify_on_flag_change sysctl is set to 1. To look up the sysctl values, the net struct is needed, so pass it to fib6_info_hw_flags_set(). This change is consistent with the IPv4 version, which gets the 'net' struct as its first argument. Currently, the only callers of this function are mlxsw and netdevsim. Patch the callers to pass net.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
- 29 January 2021, 2 commits
-
-
Submitted by Ido Schimmel

Currently there are only two types of in-kernel nexthop notification, distinguished by the 'is_grp' boolean field in 'struct nh_notifier_info'. As more notification types are introduced for more nexthop group types, a boolean is not an easily extensible interface. Instead, convert it to an enum.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
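A hedged before/after sketch of the conversion; the enum value names follow the kernel's nh_notifier_info_type at the time of this change, while the trimmed-down structs are illustrative:

/* Before: a boolean that cannot grow beyond two cases. */
struct nh_notifier_info_old {
    bool is_grp;
};

/* After: an enum that new nexthop group types can extend. */
enum nh_notifier_info_type {
    NH_NOTIFIER_INFO_TYPE_SINGLE,
    NH_NOTIFIER_INFO_TYPE_GRP,
    /* future group types append here */
};

struct nh_notifier_info_new {
    enum nh_notifier_info_type type;
};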
-
Submitted by Ido Schimmel

The purpose of the delayed work in the SPAN module is to potentially update the destination port and various encapsulation parameters of SPAN agents that point to a VLAN device or a gretap. The destination port can change following the insertion of a new route, for example.

SPAN agents that point to a physical port or the CPU port are static and never change throughout the lifetime of the SPAN agent. Therefore, skip over them in the delayed work.

This fixes an issue where the delayed work overwrites the policer that was set on a SPAN agent pointing to the CPU. Modifying the delayed work to inherit the original policer configuration is error-prone, as the same would be needed for any new parameter.

Fixes: 4039504e ("mlxsw: spectrum_span: Allow setting policer on a SPAN agent")
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
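A hedged sketch of the skip: only agent types whose destination can actually change are re-resolved by the respin work. All names here are illustrative, not the driver's actual symbols:

/* Hedged sketch: the delayed "respin" work only re-resolves SPAN agents
 * whose destination may change (VLAN device, gretap). Static agents keep
 * all their parameters, including the policer, untouched.
 */
enum span_agent_type { SPAN_PHYS, SPAN_CPU, SPAN_VLAN_DEV, SPAN_GRETAP };

static bool span_agent_can_change(enum span_agent_type type)
{
    return type == SPAN_VLAN_DEV || type == SPAN_GRETAP;
}

static void span_respin_one(enum span_agent_type type)
{
    if (!span_agent_can_change(type))
        return; /* skip: would otherwise overwrite e.g. the policer */
    /* ... re-resolve destination port and encap parameters ... */
}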
-
- 23 January 2021, 1 commit
-
-
Submitted by Danielle Ratson

The switch ASIC has a limited capacity of physical ('flavour physical' in devlink terminology) ports that it can support. While each system is brought up with a different number of ports, this number can be increased via splitting, up to the ASIC's limit.

Expose physical ports as a devlink resource, so that user space has visibility into the maximum number of ports that can be supported and the current occupancy. In addition, add a "Generic Resources" section in the devlink-resource documentation, so that different drivers are aligned on the same resource name when exposing it to user space.

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
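A hedged sketch using the real devlink resource API (devlink_resource_register(), devlink_resource_occ_get_register(), devlink_resource_size_params_init()); the resource ID, name, and occupancy getter are illustrative:

#include <net/devlink.h>

#define RES_ID_PHYS_PORTS 1 /* illustrative resource ID */

/* Hedged sketch: report how many ports are currently instantiated. */
static u64 ports_occ_get(void *priv)
{
    return 52; /* placeholder; a driver would count its ports here */
}

static int ports_resource_register(struct devlink *devlink, u64 max_ports)
{
    struct devlink_resource_size_params params;
    int err;

    devlink_resource_size_params_init(&params, max_ports, max_ports, 1,
                                      DEVLINK_RESOURCE_UNIT_ENTRY);
    err = devlink_resource_register(devlink, "physical_ports", max_ports,
                                    RES_ID_PHYS_PORTS,
                                    DEVLINK_RESOURCE_ID_PARENT_TOP,
                                    &params);
    if (err)
        return err;

    devlink_resource_occ_get_register(devlink, RES_ID_PHYS_PORTS,
                                      ports_occ_get, NULL);
    return 0;
}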
-
- 15 January 2021, 1 commit
-
-
Submitted by Christophe JAILLET

The wrappers in include/linux/pci-dma-compat.h should go away.

The patch has been generated with the coccinelle script below and has been hand modified to replace GFP_ with a correct flag. It has been compile tested.

When memory is allocated in 'mlxsw_pci_queue_init()' and 'mlxsw_pci_fw_area_init()', GFP_KERNEL can be used because both functions already use this flag and no lock is acquired.

When memory is allocated in 'mlxsw_pci_mbox_alloc()', GFP_KERNEL can be used because it is only called from the probe function and no lock is acquired in between. The call chain is:

  mlxsw_pci_probe()
    --> mlxsw_pci_cmd_init()
      --> mlxsw_pci_mbox_alloc()

While at it, also replace the 'dma_set_mask()/dma_set_coherent_mask()' sequence by a less verbose 'dma_set_mask_and_coherent()' call.

@@
@@
-    PCI_DMA_BIDIRECTIONAL
+    DMA_BIDIRECTIONAL

@@
@@
-    PCI_DMA_TODEVICE
+    DMA_TO_DEVICE

@@
@@
-    PCI_DMA_FROMDEVICE
+    DMA_FROM_DEVICE

@@
@@
-    PCI_DMA_NONE
+    DMA_NONE

@@
expression e1, e2, e3;
@@
-    pci_alloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3;
@@
-    pci_zalloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3, e4;
@@
-    pci_free_consistent(e1, e2, e3, e4)
+    dma_free_coherent(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_single(e1, e2, e3, e4)
+    dma_map_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_single(e1, e2, e3, e4)
+    dma_unmap_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4, e5;
@@
-    pci_map_page(e1, e2, e3, e4, e5)
+    dma_map_page(&e1->dev, e2, e3, e4, e5)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_page(e1, e2, e3, e4)
+    dma_unmap_page(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_sg(e1, e2, e3, e4)
+    dma_map_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_sg(e1, e2, e3, e4)
+    dma_unmap_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_cpu(e1, e2, e3, e4)
+    dma_sync_single_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_device(e1, e2, e3, e4)
+    dma_sync_single_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_cpu(e1, e2, e3, e4)
+    dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_device(e1, e2, e3, e4)
+    dma_sync_sg_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2;
@@
-    pci_dma_mapping_error(e1, e2)
+    dma_mapping_error(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_dma_mask(e1, e2)
+    dma_set_mask(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_consistent_dma_mask(e1, e2)
+    dma_set_coherent_mask(&e1->dev, e2)

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Tested-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/20210114084757.490540-1-christophe.jaillet@wanadoo.fr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
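A small hedged example of the consolidation the last hunk performs. dma_set_mask(), dma_set_coherent_mask(), and dma_set_mask_and_coherent() are the real DMA API helpers; the surrounding probe fragment is illustrative:

#include <linux/pci.h>
#include <linux/dma-mapping.h>

static int example_dma_init(struct pci_dev *pdev)
{
    int err;

    /* Before: two calls that must both be checked. */
    err = dma_set_mask(&pdev->dev, DMA_BIT_MASK(64));
    if (!err)
        err = dma_set_coherent_mask(&pdev->dev, DMA_BIT_MASK(64));

    /* After: one call sets both the streaming and coherent masks. */
    err = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64));
    return err;
}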
-
- 12 January 2021, 4 commits
-
-
Submitted by Vladimir Oltean

As of commit 457e20d6 ("mlxsw: spectrum_switchdev: Avoid returning errors in commit phase"), the mlxsw driver performs the VLAN object offloading during the prepare phase. So the conversion just seems to be a matter of removing the code that was running in the commit phase.

Ido Schimmel explains that the reason why mlxsw_sp_span_respin() is called unconditionally is that the bridge driver will ignore -EOPNOTSUPP and actually add the VLAN on the bridge device - see commit 9c86ce2c ("net: bridge: Notify about bridge VLANs") and commit ea472175 ("mlxsw: spectrum_switchdev: Ignore bridge VLAN events"). Since the VLAN was successfully added on the bridge device, mlxsw_sp_span_respin_work() should be able to resolve the egress port for a packet that is mirrored to a gretap and passes through the bridge device. Therefore, keep the logic as it is.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Submitted by Vladimir Oltean

Since the introduction of the switchdev API, port attributes were transmitted to drivers for offloading using a two-step transactional model, with a prepare phase that was supposed to catch all errors, and a commit phase that was supposed to never fail.

Some classes of failures can never be avoided, like hardware access, or memory allocation. In the latter case, merely attempting to move the memory allocation to the preparation phase makes it impossible to avoid memory leaks, since commit 91cf8ece ("switchdev: Remove unused transaction item queue") has removed the unused mechanism of passing on the allocated memory between one phase and another.

It is time we admit that separating the preparation from the commit phase is something that is best left for the driver to decide, and not something that should be baked into the API, especially since there are no switchdev callers that depend on this.

This patch removes the struct switchdev_trans member from switchdev port attribute notifier structures, and converts drivers to not look at this member. In part, this patch contains a revert of my previous commit 2e554a7a ("net: dsa: propagate switchdev vlan_filtering prepare phase to drivers").

For the most part, the conversion was trivial, except for:

- Rocker's world implementation based on Broadcom OF-DPA had an odd implementation of ofdpa_port_attr_bridge_flags_set. The conversion was done mechanically, by pasting the implementation twice, then keeping only the code that would get executed during the prepare phase on top, then keeping only the code that gets executed during the commit phase on bottom, then simplifying the resulting code until this was obtained.

- DSA's offloading of STP state, bridge flags, VLAN filtering and multicast router could be converted right away. But the ageing time could not, so a shim was introduced and this was left for a further commit.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> # hellcreek
Reviewed-by: Linus Walleij <linus.walleij@linaro.org> # RTL8366RB
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
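A hedged before/after sketch of what the removal means for a driver callback. 'struct switchdev_trans' and switchdev_trans_ph_prepare() were the real API being removed; the port type and the check/apply helpers are illustrative:

#include <net/switchdev.h>

struct my_port;
int my_port_check(struct my_port *p, const struct switchdev_attr *attr);
int my_port_apply(struct my_port *p, const struct switchdev_attr *attr);

/* Before: the two-phase model forced every handler to branch. */
static int port_attr_set_old(struct my_port *p,
                             const struct switchdev_attr *attr,
                             struct switchdev_trans *trans)
{
    if (switchdev_trans_ph_prepare(trans))
        return my_port_check(p, attr);  /* phase 1: validate only */
    return my_port_apply(p, attr);      /* phase 2: must not fail */
}

/* After: one call; the driver decides internally how to stage changes. */
static int port_attr_set_new(struct my_port *p,
                             const struct switchdev_attr *attr)
{
    int err = my_port_check(p, attr);

    return err ? err : my_port_apply(p, attr);
}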
-
Submitted by Vladimir Oltean

Since the introduction of the switchdev API, port objects were transmitted to drivers for offloading using a two-step transactional model, with a prepare phase that was supposed to catch all errors, and a commit phase that was supposed to never fail.

Some classes of failures can never be avoided, like hardware access, or memory allocation. In the latter case, merely attempting to move the memory allocation to the preparation phase makes it impossible to avoid memory leaks, since commit 91cf8ece ("switchdev: Remove unused transaction item queue") has removed the unused mechanism of passing on the allocated memory between one phase and another.

It is time we admit that separating the preparation from the commit phase is something that is best left for the driver to decide, and not something that should be baked into the API, especially since there are no switchdev callers that depend on this.

This patch removes the struct switchdev_trans member from switchdev port object notifier structures, and converts drivers to not look at this member. Where the driver conversion is trivial (like in the case of the Marvell Prestera driver, the NXP DPAA2 switch, TI CPSW, and the Rocker drivers), it is done in this patch. Where the driver conversion needs more attention (DSA, Mellanox Spectrum), the conversion is left for subsequent patches, and here we only fake the prepare/commit phases at a lower level, just not in the switchdev notifier itself. Where the code has a natural structure that is best left alone as a preparation and a commit phase (as in the case of the Ocelot switch), that structure is left in place, just made to not depend upon the switchdev transactional model.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Submitted by Vladimir Oltean

The call path of a switchdev VLAN addition to the bridge looks something like this today (condensed from the original ASCII diagram):

  nbp_vlan_init -> __br_vlan_set_default_pvid -> nbp_vlan_add / br_vlan_add
  br_afspec -> br_process_vlan_info -> br_vlan_info -> nbp_vlan_add / br_vlan_add
  br_vlan_add -> br_vlan_get_master / br_vlan_add_existing
  nbp_vlan_add / br_vlan_add_existing -> __vlan_add -> __vlan_vid_add -> br_switchdev_port_vlan_add

The ranges UAPI was introduced to the bridge in commit bdced7ef ("bridge: support for multiple vlans and vlan ranges in setlink and dellink requests") (Jan 10 2015). But the VLAN ranges (parsed in br_afspec) have always been passed one by one, through struct bridge_vlan_info tmp_vinfo, to br_vlan_info. So the range never went too far in depth.

Then Scott Feldman introduced the switchdev_port_bridge_setlink function in commit 47f8328b ("switchdev: add new switchdev bridge setlink"). That marked the introduction of SWITCHDEV_OBJ_PORT_VLAN, which made full use of the range. But switchdev_port_bridge_setlink was called like this:

  br_setlink -> br_afspec -> switchdev_port_bridge_setlink

Basically, the switchdev and the bridge code were not tightly integrated. Then commit 41c498b9 ("bridge: restore br_setlink back to original") came, and switchdev drivers were required to implement .ndo_bridge_setlink = switchdev_port_bridge_setlink for a while.

In the meantime, commits such as 0944d6b5 ("bridge: try switchdev op first in __vlan_vid_add/del") finally made switchdev penetrate the br_vlan_info() barrier and start to develop the call path we have today. But remember, br_vlan_info() still receives VLANs one by one.

Then Arkadi Sharshevsky refactored the switchdev API in 2017 in commit 29ab586c ("net: switchdev: Remove bridge bypass support from switchdev") so that drivers would not implement .ndo_bridge_setlink any longer. The switchdev_port_bridge_setlink function also got deleted. This refactoring removed the parallel bridge_setlink implementation from switchdev, and left the only switchdev VLAN objects to be the ones offloaded from __vlan_vid_add (basically RX filtering) and __vlan_add (the latter coming from commit 9c86ce2c ("net: bridge: Notify about bridge VLANs")).

That is to say, today the switchdev VLAN object ranges are not used in the kernel. Refactoring the above call path is a bit complicated, when the bridge VLAN call path is already a bit complicated. Let's go off and finish the job of commit 29ab586c by deleting the bogus iteration through the VLAN ranges from the drivers.

Some aspects of this feature never made much sense in the first place. For example, what is a range of VLANs all having the BRIDGE_VLAN_INFO_PVID flag supposed to mean, when a port can obviously have a single pvid? This particular configuration _is_ denied as of commit 6623c60d ("bridge: vlan: enforce no pvid flag in vlan ranges"), but from an API perspective, the driver still has to play pretend, and only offload the vlan->vid_end as pvid. And the addition of a switchdev VLAN object can modify the flags of another, completely unrelated, switchdev VLAN object! (A VLAN that is PVID will invalidate the PVID flag from whatever other VLAN had previously been offloaded with switchdev and had that flag. Yet switchdev never notifies about that change, so drivers are supposed to guess.)

Nonetheless, having a VLAN range in the API makes error handling look scarier than it really is - unwinding on errors and all of that. When in reality, no one really calls this API with more than one VLAN. It is all unnecessary complexity.

And despite appearing pretentious (two-phase transactional model and all), the switchdev API is really sloppy, because the VLAN addition and removal operations are not paired with one another (you can add a VLAN 100 times and delete it just once). The bridge notifies through switchdev of a VLAN addition not only when the flags of an existing VLAN change, but also when nothing changes. There are switchdev drivers out there that don't like adding a VLAN that has already been added, and those checks don't really belong at driver level. But the fact that the API contains ranges is yet another factor that prevents this from being addressed in the future.

Of the existing switchdev pieces of hardware, it appears that only Mellanox Spectrum supports offloading more than one VLAN at a time, through mlxsw_sp_port_vlan_set. I have kept that code internal to the driver, because there is some more bookkeeping that makes use of it, but I deleted it from the switchdev API. But since the switchdev support for ranges has already been de facto deleted by a Mellanox employee and nobody noticed for 4 years, I'm going to assume it's not a biggie.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com> # switchdev and mlxsw
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> # hellcreek
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
- 10 January 2021, 2 commits
-
-
Submitted by Vadim Pasternak

Increase the critical threshold for the ASIC thermal zone from 110C to 140C, according to the system hardware requirements. All the supported ASICs (Spectrum-1, Spectrum-2, Spectrum-3) can still be operational with an ASIC temperature below 140C. With the old critical threshold value, the system could perform an unjustified shutdown.

All the systems equipped with the above ASICs implement a thermal protection mechanism at the firmware level, and firmware could decide to perform a system thermal shutdown in case the temperature is below 140C. So with the new threshold the system will not melt down, while the thermal operating range will be aligned with the hardware's abilities.

Fixes: 41e76084 ("mlxsw: core: Replace thermal temperature trips with defines")
Fixes: a50c1e35 ("mlxsw: core: Implement thermal zone")
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Submitted by Vadim Pasternak

Validate thresholds to avoid a single failure due to transceiver unreliability. Ignore the last readouts in case the warning temperature is above the alarm temperature, since that can cause an unexpected thermal shutdown. Stay with the previous values and refresh the thresholds within the next iteration. This is a rare scenario, but it was observed at a customer site.

Fixes: 6a79507c ("mlxsw: core: Extend thermal module with per QSFP module thermal zones")
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
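A hedged sketch of the validation: when a fresh readout puts the warning threshold at or above the alarm threshold, it is discarded and the previous values stay in effect until the next iteration. All names are illustrative:

/* Hedged sketch: ignore an implausible threshold readout from a
 * transceiver instead of arming trips that could shut the system down.
 */
struct module_tz {
    int warn_temp;
    int alarm_temp;
};

static void module_tz_update_trips(struct module_tz *tz,
                                   int warn_temp, int alarm_temp)
{
    if (warn_temp >= alarm_temp)
        return; /* unreliable readout: keep the previous thresholds
                 * and refresh on the next iteration */
    tz->warn_temp = warn_temp;
    tz->alarm_temp = alarm_temp;
}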
-
- 15 December 2020, 15 commits
-
-
Submitted by Jiri Pirko

In case the eXtended mezzanine is present on the system, use it for IPv4 router offload.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Submitted by Jiri Pirko

Set a profile option to instruct the FW to use half of the KVH for the XLT cache, not the whole of it.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Submitted by Jiri Pirko

Upon route insertion and removal, possibly cached entries need to be flushed from the XM cache. Extend the XM op context to carry the information needed for the flush. Implement the flush in a delayed work, since for HW design reasons there is a need to wait 50usec before the flush can be done. If the same flush request arrives during this time, consolidate it with the first one. Track the queued flushes in a hashtable.

v2:
* Fix GENMASK() high bit

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
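A hedged sketch of the consolidation scheme: a hashtable keyed by the flush target, plus a delayed work that honors the 50usec settle time, so a duplicate request piggybacks on the pending work. rhashtable and the delayed-work APIs are real kernel interfaces; the key layout and all names are illustrative:

#include <linux/rhashtable.h>
#include <linux/workqueue.h>
#include <linux/slab.h>
#include <linux/jiffies.h>

struct xm_flush_node {
    struct rhash_head ht_node;
    u32 key;                    /* cache region / M-index to flush */
    struct delayed_work dw;
};

static const struct rhashtable_params xm_flush_ht_params = {
    .key_offset  = offsetof(struct xm_flush_node, key),
    .head_offset = offsetof(struct xm_flush_node, ht_node),
    .key_len     = sizeof(u32),
};

static void xm_flush_work(struct work_struct *work)
{
    struct xm_flush_node *node =
        container_of(to_delayed_work(work), struct xm_flush_node, dw);

    /* Issue the actual cache flush for node->key here, then remove and
     * free the node so a later flush for the same key can be queued.
     */
}

static int xm_flush_schedule(struct rhashtable *ht, u32 key)
{
    struct xm_flush_node *node;
    int err;

    if (rhashtable_lookup_fast(ht, &key, xm_flush_ht_params))
        return 0; /* same flush already queued: consolidate */

    node = kzalloc(sizeof(*node), GFP_KERNEL);
    if (!node)
        return -ENOMEM;
    node->key = key;
    INIT_DELAYED_WORK(&node->dw, xm_flush_work);
    err = rhashtable_insert_fast(ht, &node->ht_node, xm_flush_ht_params);
    if (err) {
        kfree(node);
        return err;
    }
    /* HW needs ~50usec to settle before the flush can be issued. */
    schedule_delayed_work(&node->dw, usecs_to_jiffies(50));
    return 0;
}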
-
Submitted by Jiri Pirko

The RLPMCE register allows disabling the LPM cache. It can be changed on the fly.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Submitted by Jiri Pirko

The RLCMLD register is used to bulk delete the XLT-LPM cache ML entries. This can be used by SW when L is increased or decreased, in which case entries with old ML values need to be removed.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Submitted by Jiri Pirko

There is a table that assigns an L-value per M-index. The L-value is always the biggest among the currently inserted prefixes. Set up a hashtable to track the M-index information and the prefixes related to it, and ensure the L-value is always correctly set.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Submitted by Jiri Pirko

The XRMT register configures the M-Table for the XLT-LPM.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Submitted by Jiri Pirko

During the router init flow, call into the XM code and initialize a couple of items needed for XM functionality:

1) Query the capabilities and sizes. Check the XM device ID.
2) Initialize the M-value. Note that currently the M-value is fixed to 16 for IPv4. In the future this may change to better cover the actually inserted routes.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Submitted by Jiri Pirko

The XLTQ register is used to query the HW for XM-related info.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Submitted by Jiri Pirko

The RXLTM register configures and selects the M for the XM lookups.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Submitted by Jiri Pirko

Use the info stored in the bus_info struct about the ports connected to the eXtended mezzanine and do not expose them.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Submitted by Jiri Pirko

The output of the boardinfo command was extended to contain information about the XM: whether it is present and, in case it is, which local ports are used for the connection. Parse this info and store it in the bus_info struct passed up to the driver.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Submitted by Jiri Pirko

In order to offload entries to the XM, implement a set of low-level functions to work with LPM trees in the XM and also to pack and write FIB entries into the XM.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Submitted by Jiri Pirko

The RXLTE register enables XLT (eXtended Lookup Table) LPM lookups if a capable XM is present on the system.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Submitted by Jiri Pirko

The XMDR register allows direct access to the XM device via the switch.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
- 09 December 2020, 2 commits
-
-
Submitted by Amit Cohen

The previous patches added support for a VxLAN device enslaved to an 802.1ad bridge in the Spectrum-2 ASIC and vetoed it in Spectrum-1. Do not veto VxLAN with an 802.1ad bridge anymore.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Submitted by Amit Cohen

The implementation of Q-in-VNI differs between ASIC types; this set adds support only for Spectrum-2. Return an error when trying to create a VxLAN device and enslave it to an 802.1ad bridge in Spectrum-1.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
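A hedged sketch of such a veto with an extack message. NL_SET_ERR_MSG_MOD and the errno are the standard kernel interfaces; the mlxsw veto actually lives in the driver's bridge-join path, and the names here are illustrative:

/* Hedged sketch: Spectrum-1 cannot offload a VxLAN device under an
 * 802.1ad bridge, so veto the enslavement with a clear reason.
 */
#include <linux/netlink.h>
#include <linux/errno.h>

static int vxlan_bridge_join_veto(bool asic_is_spectrum1, bool bridge_8021ad,
                                  struct netlink_ext_ack *extack)
{
    if (asic_is_spectrum1 && bridge_8021ad) {
        NL_SET_ERR_MSG_MOD(extack,
                           "VxLAN device under an 802.1ad bridge is not supported");
        return -EOPNOTSUPP;
    }
    return 0;
}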
-