1. 27 2月, 2022 6 次提交
    • V
      net: dsa: pass extack to .port_bridge_join driver methods · 06b9cce4
      Vladimir Oltean 提交于
      As FDB isolation cannot be enforced between VLAN-aware bridges in lack
      of hardware assistance like extra FID bits, it seems plausible that many
      DSA switches cannot do it. Therefore, they need to reject configurations
      with multiple VLAN-aware bridges from the two code paths that can
      transition towards that state:
      
      - joining a VLAN-aware bridge
      - toggling VLAN awareness on an existing bridge
      
      The .port_vlan_filtering method already propagates the netlink extack to
      the driver, let's propagate it from .port_bridge_join too, to make sure
      that the driver can use the same function for both.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      06b9cce4
    • V
      net: dsa: request drivers to perform FDB isolation · c2693363
      Vladimir Oltean 提交于
      For DSA, to encourage drivers to perform FDB isolation simply means to
      track which bridge does each FDB and MDB entry belong to. It then
      becomes the driver responsibility to use something that makes the FDB
      entry from one bridge not match the FDB lookup of ports from other
      bridges.
      
      The top-level functions where the bridge is determined are:
      - dsa_port_fdb_{add,del}
      - dsa_port_host_fdb_{add,del}
      - dsa_port_mdb_{add,del}
      - dsa_port_host_mdb_{add,del}
      
      aka the pre-crosschip-notifier functions.
      
      Changing the API to pass a reference to a bridge is not superfluous, and
      looking at the passed bridge argument is not the same as having the
      driver look at dsa_to_port(ds, port)->bridge from the ->port_fdb_add()
      method.
      
      DSA installs FDB and MDB entries on shared (CPU and DSA) ports as well,
      and those do not have any dp->bridge information to retrieve, because
      they are not in any bridge - they are merely the pipes that serve the
      user ports that are in one or multiple bridges.
      
      The struct dsa_bridge associated with each FDB/MDB entry is encapsulated
      in a larger "struct dsa_db" database. Although only databases associated
      to bridges are notified for now, this API will be the starting point for
      implementing IFF_UNICAST_FLT in DSA. There, the idea is to install FDB
      entries on the CPU port which belong to the corresponding user port's
      port database. These are supposed to match only when the port is
      standalone.
      
      It is better to introduce the API in its expected final form than to
      introduce it for bridges first, then to have to change drivers which may
      have made one or more assumptions.
      
      Drivers can use the provided bridge.num, but they can also use a
      different numbering scheme that is more convenient.
      
      DSA must perform refcounting on the CPU and DSA ports by also taking
      into account the bridge number. So if two bridges request the same local
      address, DSA must notify the driver twice, once for each bridge.
      
      In fact, if the driver supports FDB isolation, DSA must perform
      refcounting per bridge, but if the driver doesn't, DSA must refcount
      host addresses across all bridges, otherwise it would be telling the
      driver to delete an FDB entry for a bridge and the driver would delete
      it for all bridges. So introduce a bool fdb_isolation in drivers which
      would make all bridge databases passed to the cross-chip notifier have
      the same number (0). This makes dsa_mac_addr_find() -> dsa_db_equal()
      say that all bridge databases are the same database - which is
      essentially the legacy behavior.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c2693363
    • V
      net: dsa: tag_8021q: rename dsa_8021q_bridge_tx_fwd_offload_vid · b6362bdf
      Vladimir Oltean 提交于
      The dsa_8021q_bridge_tx_fwd_offload_vid is no longer used just for
      bridge TX forwarding offload, it is the private VLAN reserved for
      VLAN-unaware bridging in a way that is compatible with FDB isolation.
      
      So just rename it dsa_tag_8021q_bridge_vid.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b6362bdf
    • V
      net: dsa: tag_8021q: merge RX and TX VLANs · 04b67e18
      Vladimir Oltean 提交于
      In the old Shared VLAN Learning mode of operation that tag_8021q
      previously used for forwarding, we needed to have distinct concepts for
      an RX and a TX VLAN.
      
      An RX VLAN could be installed on all ports that were members of a given
      bridge, so that autonomous forwarding could still work, while a TX VLAN
      was dedicated for precise packet steering, so it just contained the CPU
      port and one egress port.
      
      Now that tag_8021q uses Independent VLAN Learning and imprecise RX/TX
      all over, those lines have been blurred and we no longer have the need
      to do precise TX towards a port that is in a bridge. As for standalone
      ports, it is fine to use the same VLAN ID for both RX and TX.
      
      This patch changes the tag_8021q format by shifting the VLAN range it
      reserves, and halving it. Previously, our DIR bits were encoding the
      VLAN direction (RX/TX) and were set to either 1 or 2. This meant that
      tag_8021q reserved 2K VLANs, or 50% of the available range.
      
      Change the DIR bits to a hardcoded value of 3 now, which makes tag_8021q
      reserve only 1K VLANs, and a different range now (the last 1K). This is
      done so that we leave the old format in place in case we need to return
      to it.
      
      In terms of code, the vid_is_dsa_8021q_rxvlan and vid_is_dsa_8021q_txvlan
      functions go away. Any vid_is_dsa_8021q is both a TX and an RX VLAN, and
      they are no longer distinct. For example, felix which did different
      things for different VLAN types, now needs to handle the RX and the TX
      logic for the same VLAN.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      04b67e18
    • V
      net: dsa: felix: delete workarounds present due to SVL tag_8021q bridging · 08f44db3
      Vladimir Oltean 提交于
      The felix driver, which also has a tagging protocol implementation based
      on tag_8021q, does not care about adding the RX VLAN that is pvid on one
      port on the other ports that are in the same bridge with it. It simply
      doesn't need that, because in its implementation, the RX VLAN that is
      pvid of a port is only used to install a TCAM rule that pushes that VLAN
      ID towards the CPU port.
      
      Now that tag_8021q no longer performs Shared VLAN Learning based
      forwarding, the RX VLANs are actually segregated into two types:
      standalone VLANs and VLAN-unaware bridging VLANs. Since you actually
      have to call dsa_tag_8021q_bridge_join() to get a bridging VLAN from
      tag_8021q, and felix does not do that because it doesn't need it, it
      means that it only gets standalone port VLANs from tag_8021q. Which is
      perfect because this means it can drop its workarounds that avoid the
      VLANs it does not need.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      08f44db3
    • V
      net: dsa: tag_8021q: replace the SVL bridging with VLAN-unaware IVL bridging · 91495f21
      Vladimir Oltean 提交于
      For VLAN-unaware bridging, tag_8021q uses something perhaps a bit too
      tied with the sja1105 switch: each port uses the same pvid which is also
      used for standalone operation (a unique one from which the source port
      and device ID can be retrieved when packets from that port are forwarded
      to the CPU). Since each port has a unique pvid when performing
      autonomous forwarding, the switch must be configured for Shared VLAN
      Learning (SVL) such that the VLAN ID itself is ignored when performing
      FDB lookups. Without SVL, packets would always be flooded, since FDB
      lookup in the source port's VLAN would never find any entry.
      
      First of all, to make tag_8021q more palatable to switches which might
      not support Shared VLAN Learning, let's just use a common VLAN for all
      ports that are under the same bridge.
      
      Secondly, using Shared VLAN Learning means that FDB isolation can never
      be enforced. But if all ports under the same VLAN-unaware bridge share
      the same VLAN ID, it can.
      
      The disadvantage is that the CPU port can no longer perform precise
      source port identification for these packets. But at least we have a
      mechanism which has proven to be adequate for that situation: imprecise
      RX (dsa_find_designated_bridge_port_by_vid), which is what we use for
      termination on VLAN-aware bridges.
      
      The VLAN ID that VLAN-unaware bridges will use with tag_8021q is the
      same one as we were previously using for imprecise TX (bridge TX
      forwarding offload). It is already allocated, it is just a matter of
      using it.
      
      Note that because now all ports under the same bridge share the same
      VLAN, the complexity of performing a tag_8021q bridge join decreases
      dramatically. We no longer have to install the RX VLAN of a newly
      joining port into the port membership of the existing bridge ports.
      The newly joining port just becomes a member of the VLAN corresponding
      to that bridge, and the other ports are already members of it from when
      they joined the bridge themselves. So forwarding works properly.
      
      This means that we can unhook dsa_tag_8021q_bridge_{join,leave} from the
      cross-chip notifier level dsa_switch_bridge_{join,leave}. We can put
      these calls directly into the sja1105 driver.
      
      With this new mode of operation, a port controlled by tag_8021q can have
      two pvids whereas before it could only have one. The pvid for standalone
      operation is different from the pvid used for VLAN-unaware bridging.
      This is done, again, so that FDB isolation can be enforced.
      Let tag_8021q manage this by deleting the standalone pvid when a port
      joins a bridge, and restoring it when it leaves it.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      91495f21
  2. 26 2月, 2022 5 次提交
  3. 25 2月, 2022 12 次提交
  4. 23 2月, 2022 4 次提交
    • H
      net: dsa: mv88e6xxx: Add support for bridge port locked mode · 34ea415f
      Hans Schultz 提交于
      Supporting bridge ports in locked mode using the drop on lock
      feature in Marvell mv88e6xxx switchcores is described in the
      '88E6096/88E6097/88E6097F Datasheet', sections 4.4.6, 4.4.7 and
      5.1.2.1 (Drop on Lock).
      
      This feature is implemented here facilitated by the locked port flag.
      Signed-off-by: NHans Schultz <schultz.hans+netdev@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      34ea415f
    • A
      net: dsa: realtek: rtl8365mb: serialize indirect PHY register access · 27967284
      Alvin Šipraga 提交于
      Realtek switches in the rtl8365mb family can access the PHY registers of
      the internal PHYs via the switch registers. This method is called
      indirect access. At a high level, the indirect PHY register access
      method involves reading and writing some special switch registers in a
      particular sequence. This works for both SMI and MDIO connected
      switches.
      
      Currently the rtl8365mb driver does not take any care to serialize the
      aforementioned access to the switch registers. In particular, it is
      permitted for other driver code to access other switch registers while
      the indirect PHY register access is ongoing. Locking is only done at the
      regmap level. This, however, is a bug: concurrent register access, even
      to unrelated switch registers, risks corrupting the PHY register value
      read back via the indirect access method described above.
      
      Arınç reported that the switch sometimes returns nonsense data when
      reading the PHY registers. In particular, a value of 0 causes the
      kernel's PHY subsystem to think that the link is down, but since most
      reads return correct data, the link then flip-flops between up and down
      over a period of time.
      
      The aforementioned bug can be readily observed by:
      
       1. Enabling ftrace events for regmap and mdio
       2. Polling BSMR PHY register for a connected port;
          it should always read the same (e.g. 0x79ed)
       3. Wait for step 2 to give a different value
      
      Example command for step 2:
      
          while true; do phytool read swp2/2/0x01; done
      
      On my i.MX8MM, the above steps will yield a bogus value for the BSMR PHY
      register within a matter of seconds. The interleaved register access it
      then evident in the trace log:
      
       kworker/3:4-70      [003] .......  1927.139849: regmap_reg_write: ethernet-switch reg=1004 val=bd
           phytool-16816   [002] .......  1927.139979: regmap_reg_read: ethernet-switch reg=1f01 val=0
       kworker/3:4-70      [003] .......  1927.140381: regmap_reg_read: ethernet-switch reg=1005 val=0
           phytool-16816   [002] .......  1927.140468: regmap_reg_read: ethernet-switch reg=1d15 val=a69
       kworker/3:4-70      [003] .......  1927.140864: regmap_reg_read: ethernet-switch reg=1003 val=0
           phytool-16816   [002] .......  1927.140955: regmap_reg_write: ethernet-switch reg=1f02 val=2041
       kworker/3:4-70      [003] .......  1927.141390: regmap_reg_read: ethernet-switch reg=1002 val=0
           phytool-16816   [002] .......  1927.141479: regmap_reg_write: ethernet-switch reg=1f00 val=1
       kworker/3:4-70      [003] .......  1927.142311: regmap_reg_write: ethernet-switch reg=1004 val=be
           phytool-16816   [002] .......  1927.142410: regmap_reg_read: ethernet-switch reg=1f01 val=0
       kworker/3:4-70      [003] .......  1927.142534: regmap_reg_read: ethernet-switch reg=1005 val=0
           phytool-16816   [002] .......  1927.142618: regmap_reg_read: ethernet-switch reg=1f04 val=0
           phytool-16816   [002] .......  1927.142641: mdio_access: SMI-0 read  phy:0x02 reg:0x01 val:0x0000 <- ?!
       kworker/3:4-70      [003] .......  1927.143037: regmap_reg_read: ethernet-switch reg=1001 val=0
       kworker/3:4-70      [003] .......  1927.143133: regmap_reg_read: ethernet-switch reg=1000 val=2d89
       kworker/3:4-70      [003] .......  1927.143213: regmap_reg_write: ethernet-switch reg=1004 val=be
       kworker/3:4-70      [003] .......  1927.143291: regmap_reg_read: ethernet-switch reg=1005 val=0
       kworker/3:4-70      [003] .......  1927.143368: regmap_reg_read: ethernet-switch reg=1003 val=0
       kworker/3:4-70      [003] .......  1927.143443: regmap_reg_read: ethernet-switch reg=1002 val=6
      
      The kworker here is polling MIB counters for stats, as evidenced by the
      register 0x1004 that we are writing to (RTL8365MB_MIB_ADDRESS_REG). This
      polling is performed every 3 seconds, but is just one example of such
      unsynchronized access. In Arınç's case, the driver was not using the
      switch IRQ, so the PHY subsystem was itself doing polling analogous to
      phytool in the above example.
      
      A test module was created [see second Link] to simulate such spurious
      switch register accesses while performing indirect PHY register reads
      and writes. Realtek was also consulted to confirm whether this is a
      known issue or not. The conclusion of these lines of inquiry is as
      follows:
      
      1. Reading of PHY registers via indirect access will be aborted if,
         after executing the read operation (via a write to the
         INDIRECT_ACCESS_CTRL_REG), any register is accessed, other than
         INDIRECT_ACCESS_STATUS_REG.
      
      2. The PHY register indirect read is only complete when
         INDIRECT_ACCESS_STATUS_REG reads zero.
      
      3. The INDIRECT_ACCESS_DATA_REG, which is read to get the result of the
         PHY read, will contain the result of the last successful read
         operation. If there was spurious register access and the indirect
         read was aborted, then this register is not guaranteed to hold
         anything meaningful and the PHY read will silently fail.
      
      4. PHY writes do not appear to be affected by this mechanism.
      
      5. Other similar access routines, such as for MIB counters, although
         similar to the PHY indirect access method, are actually table access.
         Table access is not affected by spurious reads or writes of other
         registers. However, concurrent table access is not allowed. Currently
         this is protected via mib_lock, so there is nothing to fix.
      
      The above statements are corroborated both via the test module and
      through consultation with Realtek. In particular, Realtek states that
      this is simply a property of the hardware design and is not a hardware
      bug.
      
      To fix this problem, one must guard against regmap access while the
      PHY indirect register read is executing. Fix this by using the newly
      introduced "nolock" regmap in all PHY-related functions, and by aquiring
      the regmap mutex at the top level of the PHY register access callbacks.
      Although no issue has been observed with PHY register _writes_, this
      change also serializes the indirect access method there. This is done
      purely as a matter of convenience and for reasons of symmetry.
      
      Fixes: 4af2950c ("net: dsa: realtek-smi: add rtl8365mb subdriver for RTL8365MB-VC")
      Link: https://lore.kernel.org/netdev/CAJq09z5FCgG-+jVT7uxh1a-0CiiFsoKoHYsAWJtiKwv7LXKofQ@mail.gmail.com/
      Link: https://lore.kernel.org/netdev/871qzwjmtv.fsf@bang-olufsen.dk/Reported-by: NArınç ÜNAL <arinc.unal@arinc9.com>
      Reported-by: NLuiz Angelo Daros de Luca <luizluca@gmail.com>
      Signed-off-by: NAlvin Šipraga <alsi@bang-olufsen.dk>
      Reviewed-by: NVladimir Oltean <olteanv@gmail.com>
      Reviewed-by: NLinus Walleij <linus.walleij@linaro.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      27967284
    • A
      net: dsa: realtek: allow subdrivers to externally lock regmap · 907e772f
      Alvin Šipraga 提交于
      Currently there is no way for Realtek DSA subdrivers to serialize
      consecutive regmap accesses. In preparation for a bugfix relating to
      indirect PHY register access - which involves a series of regmap
      reads and writes - add a facility for subdrivers to serialize their
      regmap access.
      
      Specifically, a mutex is added to the driver private data structure and
      the standard regmap is initialized with custom lock/unlock ops which use
      this mutex. Then, a "nolock" variant of the regmap is added, which is
      functionally equivalent to the existing regmap except that regmap
      locking is disabled. Functions that wish to serialize a sequence of
      regmap accesses may then lock the newly introduced driver-owned mutex
      before using the nolock regmap.
      
      Doing things this way means that subdriver code that doesn't care about
      serialized register access - i.e. the vast majority of code - needn't
      worry about synchronizing register access with an external lock: it can
      just continue to use the original regmap.
      
      Another advantage of this design is that, while regmaps with locking
      disabled do not expose a debugfs interface for obvious reasons, there
      still exists the original regmap which does expose this interface. This
      interface remains safe to use even combined with driver codepaths that
      use the nolock regmap, because said codepaths will use the same mutex
      to synchronize access.
      
      With respect to disadvantages, it can be argued that having
      near-duplicate regmaps is confusing. However, the naming is rather
      explicit, and examples will abound.
      
      Finally, while we are at it, rename realtek_smi_mdio_regmap_config to
      realtek_smi_regmap_config. This makes it consistent with the naming
      realtek_mdio_regmap_config in realtek-mdio.c.
      Signed-off-by: NAlvin Šipraga <alsi@bang-olufsen.dk>
      Reviewed-by: NVladimir Oltean <olteanv@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      907e772f
    • O
      net: dsa: microchip: ksz9477: reduce polling interval for statistics · 12c740c8
      Oleksij Rempel 提交于
      30 seconds is too long interval especially if it used with ip -s l.
      Reduce polling interval to 5 sec.
      Signed-off-by: NOleksij Rempel <o.rempel@pengutronix.de>
      Link: https://lore.kernel.org/r/20220221084129.3660124-1-o.rempel@pengutronix.deSigned-off-by: NJakub Kicinski <kuba@kernel.org>
      12c740c8
  5. 22 2月, 2022 5 次提交
  6. 20 2月, 2022 3 次提交
  7. 18 2月, 2022 5 次提交