1. 09 6月, 2021 2 次提交
    • V
      net: dsa: sja1105: register the MDIO buses for 100base-T1 and 100base-TX · 5a8f0974
      Vladimir Oltean 提交于
      The SJA1110 contains two types of integrated PHYs: one 100base-TX PHY
      and multiple 100base-T1 PHYs.
      
      The access procedure for the 100base-T1 PHYs is also different than it
      is for the 100base-TX one. So we register 2 MDIO buses, one for the
      base-TX and the other for the base-T1. Each bus has an OF node which is
      a child of the "mdio" subnode of the switch, and they are recognized by
      compatible string.
      
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Heiner Kallweit <hkallweit1@gmail.com>
      Cc: Rob Herring <robh+dt@kernel.org>
      Cc: devicetree@vger.kernel.org
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5a8f0974
    • V
      net: dsa: sja1105: add support for the SJA1110 switch family · 3e77e59b
      Vladimir Oltean 提交于
      The SJA1110 is basically an SJA1105 with more ports, some integrated
      PHYs (100base-T1 and 100base-TX) and an embedded microcontroller which
      can be disabled, and the switch core can be controlled by a host running
      Linux, over SPI.
      
      This patch contains:
      - the static and dynamic config packing functions, for the tables that
        are common with SJA1105
      - one more static config tables which is "unique" to the SJA1110
        (actually it is a rehash of stuff that was placed somewhere else in
        SJA1105): the PCP Remapping Table
      - a reset and clock configuration procedure for the SJA1110 switch.
        This resets just the switch subsystem, and gates off the clock which
        powers on the embedded microcontroller.
      - an RGMII delay configuration procedure for SJA1110, which is very
        similar to SJA1105, but different enough for us to be unable to reuse
        it (this is a pattern that repeats itself)
      - some adaptations to dynamic config table entries which are no longer
        programmed in the same way. For example, to delete a VLAN, you used to
        write an entry through the dynamic reconfiguration interface with the
        desired VLAN ID, and with the VALIDENT bit set to false. Now, the VLAN
        table entries contain a TYPE_ENTRY field, which must be set to zero
        (in a backwards-incompatible way) in order for the entry to be deleted,
        or to some other entry for the VLAN to match "inner tagged" or "outer
        tagged" packets.
      - a similar thing for the static config: the xMII Mode Parameters Table
        encoding for SGMII and MII (the latter just when attached to a
        100base-TX PHY) just isn't what it used to be in SJA1105. They are
        identical, except there is an extra "special" bit which needs to be
        set. Set it.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3e77e59b
  2. 08 6月, 2021 1 次提交
  3. 01 6月, 2021 4 次提交
    • V
      net: dsa: sja1105: add a translation table for port speeds · 41fed17f
      Vladimir Oltean 提交于
      In order to support the new speed of 2500Mbps, the SJA1110 has achieved
      the great performance of changing the encoding in the MAC Configuration
      Table for the port speeds of 10, 100, 1000 compared to SJA1105.
      
      Because this is a common driver, we need a layer of indirection in order
      to program the hardware with the right values irrespective of switch
      generation.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      41fed17f
    • V
      net: dsa: sja1105: add a PHY interface type compatibility matrix · 91a05078
      Vladimir Oltean 提交于
      On the SJA1105, all ports support the parallel "xMII" protocols (MII,
      RMII, RGMII) except for port 4 on SJA1105R/S which supports only SGMII.
      This was relatively easy to model, by special-casing the SGMII port.
      
      On the SJA1110, certain ports can be pinmuxed between SGMII and xMII, or
      between SGMII and an internal 100base-TX PHY. This creates problems,
      because the driver's assumption so far was that if a port supports
      SGMII, it uses SGMII.
      
      We allow the device tree to tell us how the port pinmuxing is done, and
      check that against a PHY interface type compatibility matrix for
      plausibility.
      
      The other big change is that instead of doing SGMII configuration based
      on what the port supports, we do it based on what is the configured
      phy_mode of the port.
      
      The 2500base-x support added in this patch is not complete.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      91a05078
    • V
      net: dsa: sja1105: cache the phy-mode port property · bf4edf4a
      Vladimir Oltean 提交于
      So far we've succeeded in operating without keeping a copy of the
      phy-mode in the driver, since we already have the static config and we
      can look at the xMII Mode Parameters Table which already holds that
      information.
      
      But with the SJA1110, we cannot make the distinction between sgmii and
      2500base-x, because to the hardware's static config, it's all SGMII.
      So add a phy_mode property per port inside struct sja1105_private.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      bf4edf4a
    • V
      net: dsa: sja1105: the 0x1F0000 SGMII "base address" is actually MDIO_MMD_VEND2 · 4c7ee010
      Vladimir Oltean 提交于
      Looking at the SGMII PCS from SJA1110, which is accessed indirectly
      through a different base address as can be seen in the next patch, it
      appears odd that the address accessed through indirection still
      references the base address from the SJA1105S register map (first MDIO
      register is at 0x1f0000), when it could index the SGMII registers
      starting from zero.
      
      Except that the 0x1f0000 is not a base address at all, it seems. It is
      0x1f << 16 | 0x0000, and 0x1f is coding for the vendor-specific MMD2.
      So, it turns out, the Synopsys PCS implements all its registers inside
      the vendor-specific MMDs 1 and 2 (0x1e and 0x1f). This explains why the
      PCS has no overlaps (for the other MMDs) with other register regions of
      the switch (because no other MMDs are implemented).
      
      Change the code to remove the SGMII "base address" and explicitly encode
      the MMD for reads/writes. This will become necessary for SJA1110 support.
      
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      4c7ee010
  4. 25 5月, 2021 3 次提交
  5. 22 5月, 2021 3 次提交
    • V
      net: dsa: sja1105: don't use burst SPI reads for port statistics · 039b167d
      Vladimir Oltean 提交于
      The current internal sja1105 driver API is optimized for retrieving many
      statistics counters at once. But the switch does not do atomic snapshotting
      for them anyway.
      
      In case we start reporting the hardware port counters through
      ndo_get_stats64 as well, not just ethtool, it would be good to be able
      to read individual port counters and not all of them.
      
      Additionally, since Arnd Bergmann's commit ae1804de ("dsa: sja1105:
      dynamically allocate stats structure"), sja1105_get_ethtool_stats
      allocates memory dynamically, since struct sja1105_port_status was
      deemed to consume too much stack memory. That is not ideal.
      The large structure is only needed because of the burst read.
      If we read statistics one by one, we can consume less memory, and
      we can avoid dynamic allocation.
      
      Additionally, latency-sensitive interfaces such as PTP operations (for
      phc2sys) might suffer if the SPI mutex is being held for too long, which
      happens in the case of SPI burst reads. By reading counters one by one,
      we give a chance for higher priority processes to preempt and take the
      SPI bus mutex for accessing the PTP clock.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      039b167d
    • V
      net: dsa: sja1105: stop reporting the queue levels in ethtool port counters · 30a2e9c0
      Vladimir Oltean 提交于
      The queue levels are not counters, but instead they represent the
      occupancy of the MAC TX queues. Having these in ethtool port counters is
      not helpful, so remove them.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      30a2e9c0
    • V
      net: dsa: sja1105: adapt to a SPI controller with a limited max transfer size · 718bad0e
      Vladimir Oltean 提交于
      The static config of the sja1105 switch is a long stream of bytes which
      is programmed to the hardware in chunks (portions with the chip select
      continuously asserted) of max 256 bytes each. Each chunk is a
      spi_message composed of 2 spi_transfers: the buffer with the data and a
      preceding buffer with the SPI access header.
      
      Only that certain SPI controllers, such as the spi-sc18is602 I2C-to-SPI
      bridge, cannot keep the chip select asserted for that long.
      The spi_max_transfer_size() and spi_max_message_size() functions are how
      the controller can impose its hardware limitations upon the SPI
      peripheral driver.
      
      For the sja1105 driver to work with these controllers, both buffers must
      be smaller than the transfer limit, and their sum must be smaller than
      the message limit.
      
      Regression-tested on a switch connected to a controller with no
      limitations (spi-fsl-dspi) as well as with one with caps for both
      max_transfer_size and max_message_size (spi-sc18is602).
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      718bad0e
  6. 17 2月, 2021 1 次提交
    • V
      net: dsa: sja1105: fix leakage of flooded frames outside bridging domain · 7f7ccdea
      Vladimir Oltean 提交于
      Quite embarrasingly, I managed to fool myself into thinking that the
      flooding domain of sja1105 source ports is restricted by the forwarding
      domain, which it isn't. Frames which match an FDB entry are forwarded
      towards that entry's DESTPORTS restricted by REACH_PORT[SRC_PORT], while
      frames that don't match any FDB entry are forwarded towards
      FL_DOMAIN[SRC_PORT] or BC_DOMAIN[SRC_PORT].
      
      This means we can't get away with doing the simple thing, and we must
      manage the flooding domain ourselves such that it is restricted by the
      forwarding domain. This new function must be called from the
      .port_bridge_join and .port_bridge_leave methods too, not just from
      .port_bridge_flags as we did before.
      
      Fixes: 4d942354 ("net: dsa: sja1105: offload bridge port flags to device")
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7f7ccdea
  7. 15 2月, 2021 1 次提交
  8. 13 2月, 2021 1 次提交
    • V
      net: dsa: sja1105: offload bridge port flags to device · 4d942354
      Vladimir Oltean 提交于
      The chip can configure unicast flooding, broadcast flooding and learning.
      Learning is per port, while flooding is per {ingress, egress} port pair
      and we need to configure the same value for all possible ingress ports
      towards the requested one.
      
      While multicast flooding is not officially supported, we can hack it by
      using a feature of the second generation (P/Q/R/S) devices, which is that
      FDB entries are maskable, and multicast addresses always have an odd
      first octet. So by putting a match-all for 00:01:00:00:00:00 addr and
      00:01:00:00:00:00 mask at the end of the FDB, we make sure that it is
      always checked last, and does not take precedence in front of any other
      MDB. So it behaves effectively as an unknown multicast entry.
      
      For the first generation switches, this feature is not available, so
      unknown multicast will always be treated the same as unknown unicast.
      So the only thing we can do is request the user to offload the settings
      for these 2 flags in tandem, i.e.
      
      ip link set swp2 type bridge_slave flood off
      Error: sja1105: This chip cannot configure multicast flooding independently of unicast.
      ip link set swp2 type bridge_slave flood off mcast_flood off
      ip link set swp2 type bridge_slave mcast_flood on
      Error: sja1105: This chip cannot configure multicast flooding independently of unicast.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4d942354
  9. 12 1月, 2021 1 次提交
    • V
      net: switchdev: remove the transaction structure from port attributes · bae33f2b
      Vladimir Oltean 提交于
      Since the introduction of the switchdev API, port attributes were
      transmitted to drivers for offloading using a two-step transactional
      model, with a prepare phase that was supposed to catch all errors, and a
      commit phase that was supposed to never fail.
      
      Some classes of failures can never be avoided, like hardware access, or
      memory allocation. In the latter case, merely attempting to move the
      memory allocation to the preparation phase makes it impossible to avoid
      memory leaks, since commit 91cf8ece ("switchdev: Remove unused
      transaction item queue") which has removed the unused mechanism of
      passing on the allocated memory between one phase and another.
      
      It is time we admit that separating the preparation from the commit
      phase is something that is best left for the driver to decide, and not
      something that should be baked into the API, especially since there are
      no switchdev callers that depend on this.
      
      This patch removes the struct switchdev_trans member from switchdev port
      attribute notifier structures, and converts drivers to not look at this
      member.
      
      In part, this patch contains a revert of my previous commit 2e554a7a
      ("net: dsa: propagate switchdev vlan_filtering prepare phase to
      drivers").
      
      For the most part, the conversion was trivial except for:
      - Rocker's world implementation based on Broadcom OF-DPA had an odd
        implementation of ofdpa_port_attr_bridge_flags_set. The conversion was
        done mechanically, by pasting the implementation twice, then only
        keeping the code that would get executed during prepare phase on top,
        then only keeping the code that gets executed during the commit phase
        on bottom, then simplifying the resulting code until this was obtained.
      - DSA's offloading of STP state, bridge flags, VLAN filtering and
        multicast router could be converted right away. But the ageing time
        could not, so a shim was introduced and this was left for a further
        commit.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Acked-by: NLinus Walleij <linus.walleij@linaro.org>
      Acked-by: NJiri Pirko <jiri@nvidia.com>
      Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> # hellcreek
      Reviewed-by: Linus Walleij <linus.walleij@linaro.org> # RTL8366RB
      Reviewed-by: NIdo Schimmel <idosch@nvidia.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      bae33f2b
  10. 05 10月, 2020 1 次提交
    • V
      net: dsa: propagate switchdev vlan_filtering prepare phase to drivers · 2e554a7a
      Vladimir Oltean 提交于
      A driver may refuse to enable VLAN filtering for any reason beyond what
      the DSA framework cares about, such as:
      - having tc-flower rules that rely on the switch being VLAN-aware
      - the particular switch does not support VLAN, even if the driver does
        (the DSA framework just checks for the presence of the .port_vlan_add
        and .port_vlan_del pointers)
      - simply not supporting this configuration to be toggled at runtime
      
      Currently, when a driver rejects a configuration it cannot support, it
      does this from the commit phase, which triggers various warnings in
      switchdev.
      
      So propagate the prepare phase to drivers, to give them the ability to
      refuse invalid configurations cleanly and avoid the warnings.
      
      Since we need to modify all function prototypes and check for the
      prepare phase from within the drivers, take that opportunity and move
      the existing driver restrictions within the prepare phase where that is
      possible and easy.
      
      Cc: Florian Fainelli <f.fainelli@gmail.com>
      Cc: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
      Cc: Hauke Mehrtens <hauke@hauke-m.de>
      Cc: Woojung Huh <woojung.huh@microchip.com>
      Cc: Microchip Linux Driver Support <UNGLinuxDriver@microchip.com>
      Cc: Sean Wang <sean.wang@mediatek.com>
      Cc: Landen Chao <Landen.Chao@mediatek.com>
      Cc: Andrew Lunn <andrew@lunn.ch>
      Cc: Vivien Didelot <vivien.didelot@gmail.com>
      Cc: Jonathan McDowell <noodles@earth.li>
      Cc: Linus Walleij <linus.walleij@linaro.org>
      Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
      Cc: Claudiu Manoil <claudiu.manoil@nxp.com>
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2e554a7a
  11. 26 9月, 2020 3 次提交
  12. 12 9月, 2020 1 次提交
    • V
      net: dsa: tag_8021q: add a context structure · 5899ee36
      Vladimir Oltean 提交于
      While working on another tag_8021q driver implementation, some things
      became apparent:
      
      - It is not mandatory for a DSA driver to offload the tag_8021q VLANs by
        using the VLAN table per se. For example, it can add custom TCAM rules
        that simply encapsulate RX traffic, and redirect & decapsulate rules
        for TX traffic. For such a driver, it makes no sense to receive the
        tag_8021q configuration through the same callback as it receives the
        VLAN configuration from the bridge and the 8021q modules.
      
      - Currently, sja1105 (the only tag_8021q user) sets a
        priv->expect_dsa_8021q variable to distinguish between the bridge
        calling, and tag_8021q calling. That can be improved, to say the
        least.
      
      - The crosschip bridging operations are, in fact, stateful already. The
        list of crosschip_links must be kept by the caller and passed to the
        relevant tag_8021q functions.
      
      So it would be nice if the tag_8021q configuration was more
      self-contained. This patch attempts to do that.
      
      Create a struct dsa_8021q_context which encapsulates a struct
      dsa_switch, and has 2 function pointers for adding and deleting a VLAN.
      These will replace the previous channel to the driver, which was through
      the .port_vlan_add and .port_vlan_del callbacks of dsa_switch_ops.
      
      Also put the list of crosschip_links into this dsa_8021q_context.
      Drivers that don't support cross-chip bridging can simply omit to
      initialize this list, as long as they dont call any cross-chip function.
      
      The sja1105_vlan_add and sja1105_vlan_del functions are refactored into
      a smaller sja1105_vlan_add_one, which now has 2 entry points:
      - sja1105_vlan_add, from struct dsa_switch_ops
      - sja1105_dsa_8021q_vlan_add, from the tag_8021q ops
      But even this change is fairly trivial. It just reflects the fact that
      for sja1105, the VLANs from these 2 channels end up in the same hardware
      table. However that is not necessarily true in the general sense (and
      that's the reason for making this change).
      
      The rest of the patch is mostly plain refactoring of "ds" -> "ctx". The
      dsa_8021q_context structure needs to be propagated because adding a VLAN
      is now done through the ops function pointers inside of it.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5899ee36
  13. 23 6月, 2020 1 次提交
  14. 31 5月, 2020 1 次提交
  15. 29 5月, 2020 1 次提交
    • V
      net: dsa: sja1105: offload the Credit-Based Shaper qdisc · 4d752508
      Vladimir Oltean 提交于
      SJA1105, being AVB/TSN switches, provide hardware assist for the
      Credit-Based Shaper as described in the IEEE 8021Q-2018 document.
      
      First generation has 10 shapers, freely assignable to any of the 4
      external ports and 8 traffic classes, and second generation has 16
      shapers.
      
      The Credit-Based Shaper tables are accessed through the dynamic
      reconfiguration interface, so we have to restore them manually after a
      switch reset. The tables are backed up by the static config only on
      P/Q/R/S, and we don't want to add custom code only for that family,
      since the procedure that is in place now works for both.
      
      Tested with the following commands:
      
      data_rate_kbps=67000
      port_transmit_rate_kbps=1000000
      idleslope=$data_rate_kbps
      sendslope=$(($idleslope - $port_transmit_rate_kbps))
      locredit=$((-0x80000000))
      hicredit=$((0x7fffffff))
      tc qdisc add dev swp2 root handle 1: mqprio hw 0 num_tc 8 \
              map 0 1 2 3 4 5 6 7 \
              queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7
      tc qdisc replace dev swp2 parent 1:1 cbs \
              idleslope $idleslope \
              sendslope $sendslope \
              hicredit $hicredit \
              locredit $locredit \
              offload 1
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4d752508
  16. 13 5月, 2020 7 次提交
    • V
      net: dsa: sja1105: implement a common frame memory partitioning function · aaa270c6
      Vladimir Oltean 提交于
      There are 2 different features that require some reserved frame memory
      space: VLAN retagging and virtual links. Create a central function that
      modifies the static config and ensures frame memory is never
      overcommitted.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      aaa270c6
    • V
      net: dsa: sja1105: add packing ops for the Retagging Table · 88cac0fa
      Vladimir Oltean 提交于
      The Retagging Table is an optional feature that allows the switch to
      match frames against a {ingress port, egress port, vid} rule and change
      their VLAN ID. The retagged frames are by default clones of the original
      ones (since the hardware-foreseen use case was to mirror traffic for
      debugging purposes and to tag it with a special VLAN for this purpose),
      but we can force the original frames to be dropped by removing the
      pre-retagging VLAN from the port membership list of the egress port.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      88cac0fa
    • V
      net: dsa: sja1105: add a new best_effort_vlan_filtering devlink parameter · 2cafa72e
      Vladimir Oltean 提交于
      This devlink parameter enables the handling of DSA tags when enslaved to
      a bridge with vlan_filtering=1. There are very good reasons to want
      this, but there are also very good reasons for not enabling it by
      default. So a devlink param named best_effort_vlan_filtering, currently
      driver-specific and exported only by sja1105, is used to configure this.
      
      In practice, this is perhaps the way that most users are going to use
      the switch in. It assumes that no more than 7 VLANs are needed per port.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2cafa72e
    • V
      net: dsa: sja1105: prepare tagger for handling DSA tags and VLAN simultaneously · 38b5beea
      Vladimir Oltean 提交于
      In VLAN-unaware mode, sja1105 uses VLAN tags with a custom TPID of
      0xdadb. While in the yet-to-be introduced best_effort_vlan_filtering
      mode, it needs to work with normal VLAN TPID values.
      
      A complication arises when we must transmit a VLAN-tagged packet to the
      switch when it's in VLAN-aware mode. We need to construct a packet with
      2 VLAN tags, and the switch will use the outer header for routing and
      pop it on egress. But sadly, here the 2 hardware generations don't
      behave the same:
      
      - E/T switches won't pop an ETH_P_8021AD tag on egress, it seems
        (packets will remain double-tagged).
      - P/Q/R/S switches will drop a packet with 2 ETH_P_8021Q tags (it looks
        like it tries to prevent VLAN hopping).
      
      But looks like the reverse is also true:
      
      - E/T switches have no problem popping the outer tag from packets with
        2 ETH_P_8021Q tags.
      - P/Q/R/S will have no problem popping a single tag even if that is
        ETH_P_8021AD.
      
      So it is clear that if we want the hardware to work with dsa_8021q
      tagging in VLAN-aware mode, we need to send different TPIDs depending on
      revision. Keep that information in priv->info->qinq_tpid.
      
      The per-port tagger structure will hold an xmit_tpid value that depends
      not only upon the qinq_tpid, but also upon the VLAN awareness state
      itself (in case we must transmit using 0xdadb).
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      38b5beea
    • V
      net: dsa: sja1105: save/restore VLANs using a delta commit method · ec5ae610
      Vladimir Oltean 提交于
      Managing the VLAN table that is present in hardware will become very
      difficult once we add a third operating state
      (best_effort_vlan_filtering). That is because correct cleanup (not too
      little, not too much) becomes virtually impossible, when VLANs can be
      added from the bridge layer, from dsa_8021q for basic tagging, for
      cross-chip bridging, as well as retagging rules for sub-VLANs and
      cross-chip sub-VLANs. So we need to rethink VLAN interaction with the
      switch in a more scalable way.
      
      In preparation for that, use the priv->expect_dsa_8021q boolean to
      classify any VLAN request received through .port_vlan_add or
      .port_vlan_del towards either one of 2 internal lists: bridge VLANs and
      dsa_8021q VLANs.
      
      Then, implement a central sja1105_build_vlan_table method that creates a
      VLAN configuration from scratch based on the 2 lists of VLANs kept by
      the driver, and based on the VLAN awareness state. Currently, if we are
      VLAN-unaware, install the dsa_8021q VLANs, otherwise the bridge VLANs.
      
      Then, implement a delta commit procedure that identifies which VLANs
      from this new configuration are actually different from the config
      previously committed to hardware. We apply the delta through the dynamic
      configuration interface (we don't reset the switch). The result is that
      the hardware should see the exact sequence of operations as before this
      patch.
      
      This also helps remove the "br" argument passed to
      dsa_8021q_crosschip_bridge_join, which it was only using to figure out
      whether it should commit the configuration back to us or not, based on
      the VLAN awareness state of the bridge. We can simplify that, by always
      allowing those VLANs inside of our dsa_8021q_vlans list, and committing
      those to hardware when necessary.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ec5ae610
    • V
      net: dsa: sja1105: deny alterations of dsa_8021q VLANs from the bridge · 60b33aeb
      Vladimir Oltean 提交于
      At the moment, this can never happen. The 2 modes that we operate in do
      not permit that:
      
       - SJA1105_VLAN_UNAWARE: we are guarded from bridge VLANs added by the
         user by the DSA core. We will later lift this restriction by setting
         ds->vlan_bridge_vtu = true, and that is where we'll need it.
      
       - SJA1105_VLAN_FILTERING_FULL: in this mode, dsa_8021q configuration is
         disabled. So the user is free to add these VLANs in the 1024-3071
         range.
      
      The reason for the patch is that we'll introduce a third VLAN awareness
      state, where both dsa_8021q as well as the bridge are going to call our
      .port_vlan_add and .port_vlan_del methods.
      
      For that, we need a good way to discriminate between the 2. The easiest
      (and less intrusive way for upper layers) is to recognize the fact that
      dsa_8021q configurations are always driven by our driver - we _know_
      when a .port_vlan_add method will be called from dsa_8021q because _we_
      initiated it.
      
      So introduce an expect_dsa_8021q boolean which is only used, at the
      moment, for blacklisting VLANs in range 1024-3071 in the modes when
      dsa_8021q is active.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      60b33aeb
    • V
      net: dsa: sja1105: keep the VLAN awareness state in a driver variable · 7f14937f
      Vladimir Oltean 提交于
      Soon we'll add a third operating mode to the driver. Introduce a
      vlan_state to make things more easy to manage, and use it where
      applicable.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7f14937f
  17. 11 5月, 2020 1 次提交
    • V
      net: dsa: sja1105: implement cross-chip bridging operations · ac02a451
      Vladimir Oltean 提交于
      sja1105 uses dsa_8021q for DSA tagging, a format which is VLAN at heart
      and which is compatible with cascading. A complete description of this
      tagging format is in net/dsa/tag_8021q.c, but a quick summary is that
      each external-facing port tags incoming frames with a unique pvid, and
      this special VLAN is transmitted as tagged towards the inside of the
      system, and as untagged towards the exterior. The tag encodes the switch
      id and the source port index.
      
      This means that cross-chip bridging for dsa_8021q only entails adding
      the dsa_8021q pvids of one switch to the RX filter of the other
      switches. Everything else falls naturally into place, as long as the
      bottom-end of ports (the leaves in the tree) is comprised exclusively of
      dsa_8021q-compatible (i.e. sja1105 switches). Otherwise, there would be
      a chance that a front-panel switch transmits a packet tagged with a
      dsa_8021q header, header which it wouldn't be able to remove, and which
      would hence "leak" out.
      
      The only use case I tested (due to lack of board availability) was when
      the sja1105 switches are part of disjoint trees (however, this doesn't
      change the fact that multiple sja1105 switches still need unique switch
      identifiers in such a system). But in principle, even "true" single-tree
      setups (with DSA links) should work just as fine, except for a small
      change which I can't test: dsa_towards_port should be used instead of
      dsa_upstream_port (I made the assumption that the routing port that any
      sja1105 should use towards its neighbours is the CPU port. That might
      not hold true in other setups).
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      ac02a451
  18. 08 5月, 2020 4 次提交
    • V
      net: dsa: sja1105: implement tc-gate using time-triggered virtual links · 834f8933
      Vladimir Oltean 提交于
      Restrict the TTEthernet hardware support on this switch to operate as
      closely as possible to IEEE 802.1Qci as possible. This means that it can
      perform PTP-time-based ingress admission control on streams identified
      by {DMAC, VID, PCP}, which is useful when trying to ensure the
      determinism of traffic scheduled via IEEE 802.1Qbv.
      
      The oddity comes from the fact that in hardware (and in TTEthernet at
      large), virtual links always need a full-blown action, including not
      only the type of policing, but also the list of destination ports. So in
      practice, a single tc-gate action will result in all packets getting
      dropped. Additional actions (either "trap" or "redirect") need to be
      specified in the same filter rule such that the conforming packets are
      actually forwarded somewhere.
      
      Apart from the VL Lookup, Policing and Forwarding tables which need to
      be programmed for each flow (virtual link), the Schedule engine also
      needs to be told to open/close the admission gates for each individual
      virtual link. A fairly accurate (and detailed) description of how that
      works is already present in sja1105_tas.c, since it is already used to
      trigger the egress gates for the tc-taprio offload (IEEE 802.1Qbv). Key
      point here, we remember that the schedule engine supports 8
      "subschedules" (execution threads that iterate through the global
      schedule in parallel, and that no 2 hardware threads must execute a
      schedule entry at the same time). For tc-taprio, each egress port used
      one of these 8 subschedules, leaving a total of 4 subschedules unused.
      In principle we could have allocated 1 subschedule for the tc-gate
      offload of each ingress port, but actually the schedules of all virtual
      links installed on each ingress port would have needed to be merged
      together, before they could have been programmed to hardware. So
      simplify our life and just merge the entire tc-gate configuration, for
      all virtual links on all ingress ports, into a single subschedule. Be
      sure to check that against the usual hardware scheduling conflicts, and
      program it to hardware alongside any tc-taprio subschedule that may be
      present.
      
      The following scenarios were tested:
      
      1. Quantitative testing:
      
         tc qdisc add dev swp2 clsact
         tc filter add dev swp2 ingress flower skip_sw \
                 dst_mac 42:be:24:9b:76:20 \
                 action gate index 1 base-time 0 \
                 sched-entry OPEN 1200 -1 -1 \
                 sched-entry CLOSE 1200 -1 -1 \
                 action trap
      
         ping 192.168.1.2 -f
         PING 192.168.1.2 (192.168.1.2) 56(84) bytes of data.
         .............................
         --- 192.168.1.2 ping statistics ---
         948 packets transmitted, 467 received, 50.7384% packet loss, time 9671ms
      
      2. Qualitative testing (with a phase-aligned schedule - the clocks are
         synchronized by ptp4l, not shown here):
      
         Receiver (sja1105):
      
         tc qdisc add dev swp2 clsact
         now=$(phc_ctl /dev/ptp1 get | awk '/clock time is/ {print $5}') && \
                 sec=$(echo $now | awk -F. '{print $1}') && \
                 base_time="$(((sec + 2) * 1000000000))" && \
                 echo "base time ${base_time}"
         tc filter add dev swp2 ingress flower skip_sw \
                 dst_mac 42:be:24:9b:76:20 \
                 action gate base-time ${base_time} \
                 sched-entry OPEN  60000 -1 -1 \
                 sched-entry CLOSE 40000 -1 -1 \
                 action trap
      
         Sender (enetc):
         now=$(phc_ctl /dev/ptp0 get | awk '/clock time is/ {print $5}') && \
                 sec=$(echo $now | awk -F. '{print $1}') && \
                 base_time="$(((sec + 2) * 1000000000))" && \
                 echo "base time ${base_time}"
         tc qdisc add dev eno0 parent root taprio \
                 num_tc 8 \
                 map 0 1 2 3 4 5 6 7 \
                 queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \
                 base-time ${base_time} \
                 sched-entry S 01  50000 \
                 sched-entry S 00  50000 \
                 flags 2
      
         ping -A 192.168.1.1
         PING 192.168.1.1 (192.168.1.1): 56 data bytes
         ...
         ^C
         --- 192.168.1.1 ping statistics ---
         1425 packets transmitted, 1424 packets received, 0% packet loss
         round-trip min/avg/max = 0.322/0.361/0.990 ms
      
         And just for comparison, with the tc-taprio schedule deleted:
      
         ping -A 192.168.1.1
         PING 192.168.1.1 (192.168.1.1): 56 data bytes
         ...
         ^C
         --- 192.168.1.1 ping statistics ---
         33 packets transmitted, 19 packets received, 42% packet loss
         round-trip min/avg/max = 0.336/0.464/0.597 ms
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      834f8933
    • V
      net: dsa: sja1105: support flow-based redirection via virtual links · dfacc5a2
      Vladimir Oltean 提交于
      Implement tc-flower offloads for redirect, trap and drop using
      non-critical virtual links.
      
      Commands which were tested to work are:
      
        # Send frames received on swp2 with a DA of 42:be:24:9b:76:20 to the
        # CPU and to swp3. This type of key (DA only) when the port's VLAN
        # awareness state is off.
        tc qdisc add dev swp2 clsact
        tc filter add dev swp2 ingress flower skip_sw dst_mac 42:be:24:9b:76:20 \
                action mirred egress redirect dev swp3 \
                action trap
      
        # Drop frames received on swp2 with a DA of 42:be:24:9b:76:20, a VID
        # of 100 and a PCP of 0.
        tc filter add dev swp2 ingress protocol 802.1Q flower skip_sw \
                dst_mac 42:be:24:9b:76:20 vlan_id 100 vlan_prio 0 action drop
      
      Under the hood, all rules match on DMAC, VID and PCP, but when VLAN
      filtering is disabled, those are set internally by the driver to the
      port-based defaults. Because we would be put in an awkward situation if
      the user were to change the VLAN filtering state while there are active
      rules (packets would no longer match on the specified keys), we simply
      deny changing vlan_filtering unless the list of flows offloaded via
      virtual links is empty. Then the user can re-add new rules.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      dfacc5a2
    • V
      net: dsa: sja1105: make room for virtual link parsing in flower offload · b70bb8d4
      Vladimir Oltean 提交于
      Virtual links are a sja1105 hardware concept of executing various flow
      actions based on a key extracted from the frame's DMAC, VID and PCP.
      
      Currently the tc-flower offload code supports only parsing the DMAC if
      that is the broadcast MAC address, and the VLAN PCP. Extract the key
      parsing logic from the L2 policers functionality and move it into its
      own function, after adding extra logic for matching on any DMAC and VID.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b70bb8d4
    • V
      net: dsa: sja1105: add static tables for virtual links · 94f94d4a
      Vladimir Oltean 提交于
      This patch adds the register definitions for the:
      - VL Lookup Table
      - VL Policing Table
      - VL Forwarding Table
      - VL Forwarding Parameters Table
      
      These are needed in order to perform TTEthernet operations: QoS
      classification, flow-based policing and/or frame redirecting with the
      switch.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      94f94d4a
  19. 21 4月, 2020 1 次提交
    • V
      net: dsa: sja1105: enable internal pull-down for RX_DV/CRS_DV/RX_CTL and RX_ER · 135e3018
      Vladimir Oltean 提交于
      Some boards do not have the RX_ER MII signal connected. Normally in such
      situation, those pins would be grounded, but then again, some boards
      left it electrically floating.
      
      When sending traffic to those switch ports, one can see that the
      N_SOFERR statistics counter is incrementing once per each packet. The
      user manual states for this counter that it may count the number of
      frames "that have the MII error input being asserted prior to or
      up to the SOF delimiter byte". So the switch MAC is sampling an
      electrically floating signal, and preventing proper traffic reception
      because of that.
      
      As a workaround, enable the internal weak pull-downs on the input pads
      for the MII control signals. This way, a floating signal would be
      internally tied to ground.
      
      The logic levels of signals which _are_ externally driven should not be
      bothered by this 40-50 KOhm internal resistor. So it is not an issue to
      enable the internal pull-down unconditionally, irrespective of PHY
      interface type (MII, RMII, RGMII, SGMII) and of board layout.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      135e3018
  20. 31 3月, 2020 1 次提交
    • V
      net: dsa: sja1105: add broadcast and per-traffic class policers · a6af7763
      Vladimir Oltean 提交于
      This patch adds complete support for manipulating the L2 Policing Tables
      from this switch. There are 45 table entries, one entry per each port
      and traffic class, and one dedicated entry for broadcast traffic for
      each ingress port.
      
      Policing entries are shareable, and we use this functionality to support
      shared block filters.
      
      We are modeling broadcast policers as simple tc-flower matches on
      dst_mac. As for the traffic class policers, the switch only deduces the
      traffic class from the VLAN PCP field, so it makes sense to model this
      as a tc-flower match on vlan_prio.
      
      How to limit broadcast traffic coming from all front-panel ports to a
      cumulated total of 10 Mbit/s:
      
      tc qdisc add dev sw0p0 ingress_block 1 clsact
      tc qdisc add dev sw0p1 ingress_block 1 clsact
      tc qdisc add dev sw0p2 ingress_block 1 clsact
      tc qdisc add dev sw0p3 ingress_block 1 clsact
      tc filter add block 1 flower skip_sw dst_mac ff:ff:ff:ff:ff:ff \
      	action police rate 10mbit burst 64k
      
      How to limit traffic with VLAN PCP 0 (also includes untagged traffic) to
      100 Mbit/s on port 0 only:
      
      tc filter add dev sw0p0 ingress protocol 802.1Q flower skip_sw \
      	vlan_prio 0 action police rate 100mbit burst 64k
      
      The broadcast, VLAN PCP and port policers are compatible with one
      another (can be installed at the same time on a port).
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a6af7763
  21. 30 3月, 2020 1 次提交