From 1cb9dfca39eb406960f8f84864ddd6ba329ec321 Mon Sep 17 00:00:00 2001 From: Rasmus Villemoes Date: Mon, 22 Jul 2019 23:37:26 +0000 Subject: [PATCH] net: dsa: mv88e6xxx: avoid some redundant vtu load/purge operations We have an ERPS (Ethernet Ring Protection Switching) setup involving mv88e6250 switches which we're in the process of switching to a BSP based on the mainline driver. Breaking any link in the ring works as expected, with the ring reconfiguring itself quickly and traffic continuing with almost no noticable drops. However, when plugging back the cable, we see 5+ second stalls. This has been tracked down to the userspace application in charge of the protocol missing a few CCM messages on the good link (the one that was not unplugged), causing it to broadcast a "signal fail". That message eventually reaches its link partner, which responds by blocking the port. Meanwhile, the first node has continued to block the port with the just plugged-in cable, breaking the network. And the reason for those missing CCM messages has in turn been tracked down to the VTU apparently being too busy servicing load/purge operations that the normal lookups are delayed. Initial state, the link between C and D is blocked in software. _____________________ / \ | | A ----- B ----- C *---- D Unplug the cable between C and D. _____________________ / \ | | A ----- B ----- C * * D Reestablish the link between C and D. _____________________ / \ | | A ----- B ----- C *---- D Somehow, enough VTU/ATU operations happen inside C that prevents the application from receving the CCM messages from B in a timely manner, so a Signal Fail message is sent by C. When B receives that, it responds by blocking its port. _____________________ / \ | | A ----- B *---* C *---- D Very shortly after this, the signal fail condition clears on the BC link (some CCM messages finally make it through), so C unblocks the port. However, a guard timer inside B prevents it from removing the blocking before 5 seconds have elapsed. It is not unlikely that our userspace ERPS implementation could be smarter and/or is simply buggy. However, this patch fixes the symptoms we see, and is a small optimization that should not break anything (knock wood). The idea is simply to avoid doing an VTU load of an entry identical to the one already present. To do that, we need to know whether mv88e6xxx_vtu_get() actually found an existing entry, or has just prepared a struct mv88e6xxx_vtu_entry for us to load. To that end, let vlan->valid be an output parameter. The other two callers of mv88e6xxx_vtu_get() are not affected by this patch since they pass new=false. Signed-off-by: Rasmus Villemoes Signed-off-by: David S. Miller --- drivers/net/dsa/mv88e6xxx/chip.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c index 6b17cd961d06..2e500428670c 100644 --- a/drivers/net/dsa/mv88e6xxx/chip.c +++ b/drivers/net/dsa/mv88e6xxx/chip.c @@ -1423,7 +1423,6 @@ static int mv88e6xxx_vtu_get(struct mv88e6xxx_chip *chip, u16 vid, /* Initialize a fresh VLAN entry */ memset(entry, 0, sizeof(*entry)); - entry->valid = true; entry->vid = vid; /* Exclude all ports */ @@ -1618,6 +1617,9 @@ static int _mv88e6xxx_port_vlan_add(struct mv88e6xxx_chip *chip, int port, if (err) return err; + if (vlan.valid && vlan.member[port] == member) + return 0; + vlan.valid = true; vlan.member[port] = member; err = mv88e6xxx_vtu_loadpurge(chip, &vlan); -- GitLab