1. 09 6月, 2021 2 次提交
    • V
      net: dsa: sja1105: make sure the retagging port is enabled for SJA1110 · ceec8bc0
      Vladimir Oltean 提交于
      The SJA1110 has an extra configuration in the General Parameters Table
      through which the user can select the buffer reservation config.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ceec8bc0
    • V
      net: dsa: sja1105: add support for the SJA1110 switch family · 3e77e59b
      Vladimir Oltean 提交于
      The SJA1110 is basically an SJA1105 with more ports, some integrated
      PHYs (100base-T1 and 100base-TX) and an embedded microcontroller which
      can be disabled, and the switch core can be controlled by a host running
      Linux, over SPI.
      
      This patch contains:
      - the static and dynamic config packing functions, for the tables that
        are common with SJA1105
      - one more static config tables which is "unique" to the SJA1110
        (actually it is a rehash of stuff that was placed somewhere else in
        SJA1105): the PCP Remapping Table
      - a reset and clock configuration procedure for the SJA1110 switch.
        This resets just the switch subsystem, and gates off the clock which
        powers on the embedded microcontroller.
      - an RGMII delay configuration procedure for SJA1110, which is very
        similar to SJA1105, but different enough for us to be unable to reuse
        it (this is a pattern that repeats itself)
      - some adaptations to dynamic config table entries which are no longer
        programmed in the same way. For example, to delete a VLAN, you used to
        write an entry through the dynamic reconfiguration interface with the
        desired VLAN ID, and with the VALIDENT bit set to false. Now, the VLAN
        table entries contain a TYPE_ENTRY field, which must be set to zero
        (in a backwards-incompatible way) in order for the entry to be deleted,
        or to some other entry for the VLAN to match "inner tagged" or "outer
        tagged" packets.
      - a similar thing for the static config: the xMII Mode Parameters Table
        encoding for SGMII and MII (the latter just when attached to a
        100base-TX PHY) just isn't what it used to be in SJA1105. They are
        identical, except there is an extra "special" bit which needs to be
        set. Set it.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3e77e59b
  2. 08 6月, 2021 2 次提交
  3. 01 6月, 2021 7 次提交
    • V
      net: dsa: sja1105: always keep RGMII ports in the MAC role · f41fad3c
      Vladimir Oltean 提交于
      In SJA1105, the xMII Mode Parameters Table field called PHY_MAC denotes
      the 'role' of the port, be it a PHY or a MAC. This makes a difference in
      the MII and RMII protocols, but RGMII is symmetric, so either PHY or MAC
      settings result in the same hardware behavior.
      
      The SJA1110 is different, and the RGMII ports only work when configured
      in MAC mode, so keep the port roles in MAC mode unconditionally.
      
      Why we had an RGMII port in the PHY role in the first place was because
      we wanted to have a way in the driver to denote whether RGMII delays
      should be applied based on the phy-mode property or not. This is already
      done in sja1105_parse_rgmii_delays() based on an intermediary
      struct sja1105_dt_port (which contains the port role). So it is a
      logical fallacy to use the hardware configuration as a scratchpad for
      driver data, it isn't necessary.
      
      We can also remove the gating condition for applying RGMII delays only
      for ports in the PHY role. The .setup_rgmii_delay() method looks at
      the priv->rgmii_rx_delay[port] and priv->rgmii_tx_delay[port] properties
      which are already populated properly (in the case of a port in the MAC
      role they are false). Removing this condition generates a few more SPI
      writes for these ports (clearing the RGMII delays) which are perhaps
      useless for SJA1105P/Q/R/S, where we know that the delays are disabled
      by default. But for SJA1110, the firmware on the embedded microcontroller
      might have done something funny, so it's always a good idea to clear the
      RGMII delays if that's what Linux expects.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      f41fad3c
    • V
      net: dsa: sja1105: add a translation table for port speeds · 41fed17f
      Vladimir Oltean 提交于
      In order to support the new speed of 2500Mbps, the SJA1110 has achieved
      the great performance of changing the encoding in the MAC Configuration
      Table for the port speeds of 10, 100, 1000 compared to SJA1105.
      
      Because this is a common driver, we need a layer of indirection in order
      to program the hardware with the right values irrespective of switch
      generation.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      41fed17f
    • V
      net: dsa: sja1105: add a PHY interface type compatibility matrix · 91a05078
      Vladimir Oltean 提交于
      On the SJA1105, all ports support the parallel "xMII" protocols (MII,
      RMII, RGMII) except for port 4 on SJA1105R/S which supports only SGMII.
      This was relatively easy to model, by special-casing the SGMII port.
      
      On the SJA1110, certain ports can be pinmuxed between SGMII and xMII, or
      between SGMII and an internal 100base-TX PHY. This creates problems,
      because the driver's assumption so far was that if a port supports
      SGMII, it uses SGMII.
      
      We allow the device tree to tell us how the port pinmuxing is done, and
      check that against a PHY interface type compatibility matrix for
      plausibility.
      
      The other big change is that instead of doing SGMII configuration based
      on what the port supports, we do it based on what is the configured
      phy_mode of the port.
      
      The 2500base-x support added in this patch is not complete.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      91a05078
    • V
      net: dsa: sja1105: cache the phy-mode port property · bf4edf4a
      Vladimir Oltean 提交于
      So far we've succeeded in operating without keeping a copy of the
      phy-mode in the driver, since we already have the static config and we
      can look at the xMII Mode Parameters Table which already holds that
      information.
      
      But with the SJA1110, we cannot make the distinction between sgmii and
      2500base-x, because to the hardware's static config, it's all SGMII.
      So add a phy_mode property per port inside struct sja1105_private.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      bf4edf4a
    • V
      net: dsa: sja1105: the 0x1F0000 SGMII "base address" is actually MDIO_MMD_VEND2 · 4c7ee010
      Vladimir Oltean 提交于
      Looking at the SGMII PCS from SJA1110, which is accessed indirectly
      through a different base address as can be seen in the next patch, it
      appears odd that the address accessed through indirection still
      references the base address from the SJA1105S register map (first MDIO
      register is at 0x1f0000), when it could index the SGMII registers
      starting from zero.
      
      Except that the 0x1f0000 is not a base address at all, it seems. It is
      0x1f << 16 | 0x0000, and 0x1f is coding for the vendor-specific MMD2.
      So, it turns out, the Synopsys PCS implements all its registers inside
      the vendor-specific MMDs 1 and 2 (0x1e and 0x1f). This explains why the
      PCS has no overlaps (for the other MMDs) with other register regions of
      the switch (because no other MMDs are implemented).
      
      Change the code to remove the SGMII "base address" and explicitly encode
      the MMD for reads/writes. This will become necessary for SJA1110 support.
      
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      4c7ee010
    • V
      net: dsa: sja1105: allow SGMII PCS configuration to be per port · 84db00f2
      Vladimir Oltean 提交于
      The SJA1105 R and S switches have 1 SGMII port (port 4). Because there
      is only one such port, there is no "port" parameter in the configuration
      code for the SGMII PCS.
      
      However, the SJA1110 can have up to 4 SGMII ports, each with its own
      SGMII register map. So we need to generalize the logic.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      84db00f2
    • V
      net: dsa: sja1105: be compatible with "ethernet-ports" OF node name · 15074a36
      Vladimir Oltean 提交于
      Since commit f2f3e09396be ("net: dsa: sja1105: be compatible with
      "ethernet-ports" OF node name"), DSA supports the "ethernet-ports" name
      for the container node of the ports, but the sja1105 driver doesn't,
      because it handles some device tree parsing of its own.
      
      Add the second node name as a fallback.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      15074a36
  4. 25 5月, 2021 13 次提交
    • V
      net: dsa: sja1105: allow the frame buffer size to be customized · 1bf658ee
      Vladimir Oltean 提交于
      The shared frame buffer of the SJA1110 is larger than that of SJA1105,
      which is natural due to the fact that there are more ports.
      
      Introduce yet another property in struct sja1105_info which encodes the
      maximum number of 128 byte blocks that can be used for frame buffers.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1bf658ee
    • V
      net: dsa: sja1105: configure the multicast policers, if present · 38fbe91f
      Vladimir Oltean 提交于
      The SJA1110 policer array is similar in layout with SJA1105, except it
      contains one multicast policer per port at the end.
      
      Detect the presence of multicast policers based on the maximum number of
      supported L2 Policing Table entries, and make those policers have a
      shared index equal to the port's default policer. Letting the user
      configure these policers is not supported at the moment.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      38fbe91f
    • V
      net: dsa: sja1105: dynamically choose the number of static config table entries · fd6f2c25
      Vladimir Oltean 提交于
      Due to the fact that the port count is different, some static config
      tables have a different number of elements in SJA1105 compared to
      SJA1110. Such an example is the L2 Policing table, which has 45 entries
      in SJA1105 (one per port x traffic class, and one broadcast policer per
      port) and 110 entries in SJA1110 (one per port x traffic class, one
      broadcast and one multicast policer per port).
      
      Similarly, the MAC Configuration Table, the L2 Forwarding table, all
      have a different number of elements simply because the port count is
      different, and although this can be accounted for by looking at
      ds->ports, the policing table can't because of the presence of the extra
      multicast policers.
      
      The common denominator for the static config initializers for these
      tables is that they must set up all the entries within that table.
      So the simplest way to account for these differences in a uniform manner
      is to look at struct sja1105_table_ops::max_entry_count. For the sake of
      uniformity, this patch makes that change also for tables whose number of
      elements did not change in SJA1110, like the xMII Mode Parameters, the
      L2 Lookup Parameters, General Parameters, AVB Parameters (all of these
      are singleton tables with a single entry).
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fd6f2c25
    • V
      net: dsa: sja1105: skip CGU configuration if it's unnecessary · c5037678
      Vladimir Oltean 提交于
      There are two distinct code paths which enter sja1105_clocking.c, one
      through sja1105_clocking_setup() and the other through
      sja1105_clocking_setup_port():
      
      sja1105_static_config_reload      sja1105_setup
                    |                         |
                    |      +------------------+
                    |      |
                    v      v
         sja1105_clocking_setup               sja1105_adjust_port_config
                       |                                   |
                       v                                   |
            sja1105_clocking_setup_port <------------------+
      
      As opposed to SJA1105, the SJA1110 does not need any configuration of
      the Clock Generation Unit in order for xMII ports to work. Just RGMII
      internal delays need to be configured, and that is done inside
      sja1105_clocking_setup_port for the RGMII ports.
      
      So this patch introduces the concept of a "reserved address", which the
      CGU configuration functions from sja1105_clocking.c must check before
      proceeding to do anything. The SJA1110 will have reserved addresses for
      the CGU PLLs for MII/RMII/RGMII.
      
      Additionally, make sja1105_clocking_setup() a function pointer so it can
      be overridden by the SJA1110. Even though nothing port-related needs to
      be done in the CGU, there are some operations such as disabling the
      watchdog clock which are unique to the SJA1110.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c5037678
    • V
      net: dsa: sja1105: don't assign the host port using dsa_upstream_port() · df2a81a3
      Vladimir Oltean 提交于
      If @port is unused, then dsa_upstream_port(ds, port) returns @port,
      which means we cannot assume the CPU port can be retrieved this way.
      
      The sja1105 switches support a single CPU port, so just iterate over the
      switch ports and stop at the first CPU port we see.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      df2a81a3
    • V
      net: dsa: sja1105: dimension the data structures for a larger port count · 82760d7f
      Vladimir Oltean 提交于
      Introduce a SJA1105_MAX_NUM_PORTS macro which at the moment is equal to
      SJA1105_NUM_PORTS (5). With the introduction of SJA1110, these
      structures will need to hold information for up to 11 ports.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      82760d7f
    • V
      net: dsa: sja1105: avoid some work for unused ports · f238fef1
      Vladimir Oltean 提交于
      Do not put unused ports in the forwarding domain, and do not allocate
      FDB entries for dynamic address learning for them.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f238fef1
    • V
      net: dsa: sja1105: parameterize the number of ports · 542043e9
      Vladimir Oltean 提交于
      The sja1105 driver will gain support for the next-gen SJA1110 switch,
      which is very similar except for the fact it has more than 5 ports.
      
      So we need to replace the hardcoded SJA1105_NUM_PORTS in this driver
      with ds->num_ports. This patch is as mechanical as possible (save for
      the fact that ds->num_ports is not an integer constant expression).
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      542043e9
    • V
      net: dsa: sja1105: update existing VLANs from the bridge VLAN list · b38e659d
      Vladimir Oltean 提交于
      When running this sequence of operations:
      
      ip link add br0 type bridge vlan_filtering 1
      ip link set swp4 master br0
      bridge vlan add dev swp4 vid 1
      
      We observe the traffic sent on swp4 is still untagged, even though the
      bridge has overwritten the existing VLAN entry:
      
      port    vlan ids
      swp4     1 PVID
      
      br0      1 PVID Egress Untagged
      
      This happens because we didn't consider that the 'bridge vlan add'
      command just overwrites VLANs like it's nothing. We treat the 'vid 1
      pvid untagged' and the 'vid 1' as two separate VLANs, and the first
      still has precedence when calling sja1105_build_vlan_table. Obviously
      there is a disagreement regarding semantics, and we end up doing
      something unexpected from the PoV of the bridge.
      
      Let's actually consider an "existing VLAN" to be one which is on the
      same port, and has the same VLAN ID, as one we already have, and update
      it if it has different flags than we do.
      
      The first blamed commit is the one introducing the bug, the second one
      is the latest on top of which the bugfix still applies.
      
      Fixes: ec5ae610 ("net: dsa: sja1105: save/restore VLANs using a delta commit method")
      Fixes: 5899ee36 ("net: dsa: tag_8021q: add a context structure")
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b38e659d
    • V
      net: dsa: sja1105: use 4095 as the private VLAN for untagged traffic · ed040abc
      Vladimir Oltean 提交于
      One thing became visible when writing the blamed commit, and that was
      that STP and PTP frames injected by net/dsa/tag_sja1105.c using the
      deferred xmit mechanism are always classified to the pvid of the CPU
      port, regardless of whatever VLAN there might be in these packets.
      
      So a decision needed to be taken regarding the mechanism through which
      we should ensure that delivery of STP and PTP traffic is possible when
      we are in a VLAN awareness mode that involves tag_8021q. This is because
      tag_8021q is not concerned with managing the pvid of the CPU port, since
      as far as tag_8021q is concerned, no traffic should be sent as untagged
      from the CPU port. So we end up not actually having a pvid on the CPU
      port if we only listen to tag_8021q, and unless we do something about it.
      
      The decision taken at the time was to keep VLAN 1 in the list of
      priv->dsa_8021q_vlans, and make it a pvid of the CPU port. This ensures
      that STP and PTP frames can always be sent to the outside world.
      
      However there is a problem. If we do the following while we are in
      the best_effort_vlan_filtering=true mode:
      
      ip link add br0 type bridge vlan_filtering 1
      ip link set swp2 master br0
      bridge vlan del dev swp2 vid 1
      
      Then untagged and pvid-tagged frames should be dropped. But we observe
      that they aren't, and this is because of the precaution we took that VID
      1 is always installed on all ports.
      
      So clearly VLAN 1 is not good for this purpose. What about VLAN 0?
      Well, VLAN 0 is managed by the 8021q module, and that module wants to
      ensure that 802.1p tagged frames are always received by a port, and are
      always transmitted as VLAN-tagged (with VLAN ID 0). Whereas we want our
      STP and PTP frames to be untagged if the stack sent them as untagged -
      we don't want the driver to just decide out of the blue that it adds
      VID 0 to some packets.
      
      So what to do?
      
      Well, there is one other VLAN that is reserved, and that is 4095:
      $ ip link add link swp2 name swp2.4095 type vlan id 4095
      Error: 8021q: Invalid VLAN id.
      $ bridge vlan add dev swp2 vid 4095
      Error: bridge: Vlan id is invalid.
      
      After we made this change, VLAN 1 is indeed forwarded and/or dropped
      according to the bridge VLAN table, there are no further alterations
      done by the sja1105 driver.
      
      Fixes: ec5ae610 ("net: dsa: sja1105: save/restore VLANs using a delta commit method")
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ed040abc
    • V
      net: dsa: sja1105: error out on unsupported PHY mode · 6729188d
      Vladimir Oltean 提交于
      The driver continues probing when a port is configured for an
      unsupported PHY interface type, instead it should stop.
      
      Fixes: 8aa9ebcc ("net: dsa: Introduce driver for NXP SJA1105 5-port L2 switch")
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6729188d
    • V
      net: dsa: sja1105: add error handling in sja1105_setup() · cec279a8
      Vladimir Oltean 提交于
      If any of sja1105_static_config_load(), sja1105_clocking_setup() or
      sja1105_devlink_setup() fails, we can't just return in the middle of
      sja1105_setup() or memory will leak. Add a cleanup path.
      
      Fixes: 0a7bdbc2 ("net: dsa: sja1105: move devlink param code to sja1105_devlink.c")
      Fixes: 8aa9ebcc ("net: dsa: Introduce driver for NXP SJA1105 5-port L2 switch")
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      cec279a8
    • V
      net: dsa: sja1105: call dsa_unregister_switch when allocating memory fails · dc596e3f
      Vladimir Oltean 提交于
      Unlike other drivers which pretty much end their .probe() execution with
      dsa_register_switch(), the sja1105 does some extra stuff. When that
      fails with -ENOMEM, the driver is quick to return that, forgetting to
      call dsa_unregister_switch(). Not critical, but a bug nonetheless.
      
      Fixes: 4d752508 ("net: dsa: sja1105: offload the Credit-Based Shaper qdisc")
      Fixes: a68578c2 ("net: dsa: Make deferred_xmit private to sja1105")
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      dc596e3f
  5. 22 5月, 2021 1 次提交
    • V
      net: dsa: sja1105: adapt to a SPI controller with a limited max transfer size · 718bad0e
      Vladimir Oltean 提交于
      The static config of the sja1105 switch is a long stream of bytes which
      is programmed to the hardware in chunks (portions with the chip select
      continuously asserted) of max 256 bytes each. Each chunk is a
      spi_message composed of 2 spi_transfers: the buffer with the data and a
      preceding buffer with the SPI access header.
      
      Only that certain SPI controllers, such as the spi-sc18is602 I2C-to-SPI
      bridge, cannot keep the chip select asserted for that long.
      The spi_max_transfer_size() and spi_max_message_size() functions are how
      the controller can impose its hardware limitations upon the SPI
      peripheral driver.
      
      For the sja1105 driver to work with these controllers, both buffers must
      be smaller than the transfer limit, and their sum must be smaller than
      the message limit.
      
      Regression-tested on a switch connected to a controller with no
      limitations (spi-fsl-dspi) as well as with one with caps for both
      max_transfer_size and max_message_size (spi-sc18is602).
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      718bad0e
  6. 28 4月, 2021 1 次提交
    • Y
      net: dsa: free skb->cb usage in core driver · c4b364ce
      Yangbo Lu 提交于
      Free skb->cb usage in core driver and let device drivers decide to
      use or not. The reason having a DSA_SKB_CB(skb)->clone was because
      dsa_skb_tx_timestamp() which may set the clone pointer was called
      before p->xmit() which would use the clone if any, and the device
      driver has no way to initialize the clone pointer.
      
      This patch just put memset(skb->cb, 0, sizeof(skb->cb)) at beginning
      of dsa_slave_xmit(). Some new features in the future, like one-step
      timestamp may need more bytes of skb->cb to use in
      dsa_skb_tx_timestamp(), and p->xmit().
      Signed-off-by: NYangbo Lu <yangbo.lu@nxp.com>
      Acked-by: NRichard Cochran <richardcochran@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c4b364ce
  7. 21 3月, 2021 1 次提交
    • V
      Revert "net: dsa: sja1105: Clear VLAN filtering offload netdev feature" · a1e6f641
      Vladimir Oltean 提交于
      This reverts commit e9bf9694.
      
      The topic of the reverted patch is the support for switches with global
      VLAN filtering, added by commit 061f6a50 ("net: dsa: Add
      ndo_vlan_rx_{add, kill}_vid implementation"). Be there a switch with 4
      ports swp0 -> swp3, and the following setup:
      
      ip link add br0 type bridge vlan_filtering 1
      ip link set swp0 master br0
      ip link set swp1 master br0
      
      What would happen with VLAN-tagged traffic received on standalone ports
      swp2 and swp3? Well, it would get dropped, were it not for the
      .ndo_vlan_rx_add_vid and .ndo_vlan_rx_kill_vid implementations (called
      from vlan_vid_add and vlan_vid_del respectively). Basically, for DSA
      switches where VLAN filtering is a global attribute, we enforce the
      standalone ports to have 'rx-vlan-filter: off [fixed]' in their ethtool
      features, which lets the user know that all VLAN-tagged packets that are
      not explicitly added in the RX filtering list are dropped.
      
      As for the sja1105 driver, at the time of the reverted patch, it was
      operating in a pretty handicapped mode when it had ports under a bridge
      with vlan_filtering=1. Specifically, it was unable to terminate traffic
      through the CPU port (for further explanation see "Traffic support" in
      Documentation/networking/dsa/sja1105.rst).
      
      However, since then, the sja1105 driver has made considerable progress,
      and that limitation is no longer as severe now. Specifically, since
      commit 2cafa72e ("net: dsa: sja1105: add a new
      best_effort_vlan_filtering devlink parameter"), the driver is able to
      perform CPU termination even when some ports are under bridges with
      vlan_filtering=1. Then, since commit 8841f6e6 ("net: dsa: sja1105:
      make devlink property best_effort_vlan_filtering true by default"), this
      even became the default operating mode.
      
      So we can now take advantage of the logic in the DSA core.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a1e6f641
  8. 05 3月, 2021 2 次提交
  9. 17 2月, 2021 2 次提交
  10. 16 2月, 2021 1 次提交
    • V
      net: dsa: sja1105: make devlink property best_effort_vlan_filtering true by default · 8841f6e6
      Vladimir Oltean 提交于
      The sja1105 driver has a limitation, extensively described under
      Documentation/networking/dsa/sja1105.rst and
      Documentation/networking/devlink/sja1105.rst, which says that when the
      ports are under a bridge with vlan_filtering=1, traffic to and from
      the network stack is not possible, unless the driver-specific
      best_effort_vlan_filtering devlink parameter is enabled.
      
      For users, this creates a 'wtf' moment. They need to go to the
      documentation and find about the existence of this property, then maybe
      install devlink and set it to true.
      
      Having best_effort_vlan_filtering enabled by the kernel by default
      delays that 'wtf' moment (maybe up to the point that it never even
      happens). The user doesn't need to care that the driver supports
      addressing the ports individually by retagging VLAN IDs until he/she
      needs to use more than 32 VLAN IDs (since there can be at most 32
      retagging rules). Only then do they need to think whether they need the
      full VLAN table, at the expense of no individual port addressing, or
      not.
      
      But the odds that an sja1105 user will need more than 32 VLANs
      terminated by the CPU is probably low. And, if we were to follow the
      principle that more advanced use cases should require more advanced
      preparation steps, then it makes more sense for ping to 'just work'
      while CPU termination of > 32 VLAN IDs to require a bit more forethought
      and possibly a driver-specific devlink param.
      
      So we should be able to safely change the default here, and make this
      driver act just a little bit more sanely out of the box.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8841f6e6
  11. 15 2月, 2021 2 次提交
  12. 13 2月, 2021 1 次提交
    • V
      net: dsa: sja1105: offload bridge port flags to device · 4d942354
      Vladimir Oltean 提交于
      The chip can configure unicast flooding, broadcast flooding and learning.
      Learning is per port, while flooding is per {ingress, egress} port pair
      and we need to configure the same value for all possible ingress ports
      towards the requested one.
      
      While multicast flooding is not officially supported, we can hack it by
      using a feature of the second generation (P/Q/R/S) devices, which is that
      FDB entries are maskable, and multicast addresses always have an odd
      first octet. So by putting a match-all for 00:01:00:00:00:00 addr and
      00:01:00:00:00:00 mask at the end of the FDB, we make sure that it is
      always checked last, and does not take precedence in front of any other
      MDB. So it behaves effectively as an unknown multicast entry.
      
      For the first generation switches, this feature is not available, so
      unknown multicast will always be treated the same as unknown unicast.
      So the only thing we can do is request the user to offload the settings
      for these 2 flags in tandem, i.e.
      
      ip link set swp2 type bridge_slave flood off
      Error: sja1105: This chip cannot configure multicast flooding independently of unicast.
      ip link set swp2 type bridge_slave flood off mcast_flood off
      ip link set swp2 type bridge_slave mcast_flood on
      Error: sja1105: This chip cannot configure multicast flooding independently of unicast.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4d942354
  13. 16 1月, 2021 1 次提交
    • V
      net: dsa: set configure_vlan_while_not_filtering to true by default · 0ee2af4e
      Vladimir Oltean 提交于
      As explained in commit 54a0ed0d ("net: dsa: provide an option for
      drivers to always receive bridge VLANs"), DSA has historically been
      skipping VLAN switchdev operations when the bridge wasn't in
      vlan_filtering mode, but the reason why it was doing that has never been
      clear. So the configure_vlan_while_not_filtering option is there merely
      to preserve functionality for existing drivers. It isn't some behavior
      that drivers should opt into. Ideally, when all drivers leave this flag
      set, we can delete the dsa_port_skip_vlan_configuration() function.
      
      New drivers always seem to omit setting this flag, for some reason. So
      let's reverse the logic: the DSA core sets it by default to true before
      the .setup() callback, and legacy drivers can turn it off. This way, new
      drivers get the new behavior by default, unless they explicitly set the
      flag to false, which is more obvious during review.
      
      Remove the assignment from drivers which were setting it to true, and
      add the assignment to false for the drivers that didn't previously have
      it. This way, it should be easier to see how many we have left.
      
      The following drivers: lan9303, mv88e6060 were skipped from setting this
      flag to false, because they didn't have any VLAN offload ops in the
      first place.
      
      The Broadcom Starfighter 2 driver calls the common b53_switch_alloc and
      therefore also inherits the configure_vlan_while_not_filtering=true
      behavior.
      
      Also, print a message through netlink extack every time a VLAN has been
      skipped. This is mildly annoying on purpose, so that (a) it is at least
      clear that VLANs are being skipped - the legacy behavior in itself is
      confusing, and the extack should be much more difficult to miss, unlike
      kernel logs - and (b) people have one more incentive to convert to the
      new behavior.
      
      No behavior change except for the added prints is intended at this time.
      
      $ ip link add br0 type bridge vlan_filtering 0
      $ ip link set sw0p2 master br0
      [   60.315148] br0: port 1(sw0p2) entered blocking state
      [   60.320350] br0: port 1(sw0p2) entered disabled state
      [   60.327839] device sw0p2 entered promiscuous mode
      [   60.334905] br0: port 1(sw0p2) entered blocking state
      [   60.340142] br0: port 1(sw0p2) entered forwarding state
      Warning: dsa_core: skipping configuration of VLAN. # This was the pvid
      $ bridge vlan add dev sw0p2 vid 100
      Warning: dsa_core: skipping configuration of VLAN.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NKurt Kanzenbach <kurt@linutronix.de>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Link: https://lore.kernel.org/r/20210115231919.43834-1-vladimir.oltean@nxp.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
      0ee2af4e
  14. 12 1月, 2021 4 次提交
    • V
      net: dsa: remove the transactional logic from VLAN objects · 1958d581
      Vladimir Oltean 提交于
      It should be the driver's business to logically separate its VLAN
      offloading into a preparation and a commit phase, and some drivers don't
      need / can't do this.
      
      So remove the transactional shim from DSA and let drivers propagate
      errors directly from the .port_vlan_add callback.
      
      It would appear that the code has worse error handling now than it had
      before. DSA is the only in-kernel user of switchdev that offloads one
      switchdev object to more than one port: for every VLAN object offloaded
      to a user port, that VLAN is also offloaded to the CPU port. So the
      "prepare for user port -> check for errors -> prepare for CPU port ->
      check for errors -> commit for user port -> commit for CPU port"
      sequence appears to make more sense than the one we are using now:
      "offload to user port -> check for errors -> offload to CPU port ->
      check for errors", but it is really a compromise. In the new way, we can
      catch errors from the commit phase that we previously had to ignore.
      But we have our hands tied and cannot do any rollback now: if we add a
      VLAN on the CPU port and it fails, we can't do the rollback by simply
      deleting it from the user port, because the switchdev API is not so nice
      with us: it could have simply been there already, even with the same
      flags. So we don't even attempt to rollback anything on addition error,
      just leave whatever VLANs managed to get offloaded right where they are.
      This should not be a problem at all in practice.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Acked-by: NLinus Walleij <linus.walleij@linaro.org>
      Acked-by: NJiri Pirko <jiri@nvidia.com>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      1958d581
    • V
      net: dsa: remove the transactional logic from MDB entries · a52b2da7
      Vladimir Oltean 提交于
      For many drivers, the .port_mdb_prepare callback was not a good opportunity
      to avoid any error condition, and they would suppress errors found during
      the actual commit phase.
      
      Where a logical separation between the prepare and the commit phase
      existed, the function that used to implement the .port_mdb_prepare
      callback still exists, but now it is called directly from .port_mdb_add,
      which was modified to return an int code.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Acked-by: NLinus Walleij <linus.walleij@linaro.org>
      Acked-by: NJiri Pirko <jiri@nvidia.com>
      Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> # hellcreek
      Reviewed-by: Linus Wallei <linus.walleij@linaro.org> # RTL8366
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      a52b2da7
    • V
      net: switchdev: remove the transaction structure from port attributes · bae33f2b
      Vladimir Oltean 提交于
      Since the introduction of the switchdev API, port attributes were
      transmitted to drivers for offloading using a two-step transactional
      model, with a prepare phase that was supposed to catch all errors, and a
      commit phase that was supposed to never fail.
      
      Some classes of failures can never be avoided, like hardware access, or
      memory allocation. In the latter case, merely attempting to move the
      memory allocation to the preparation phase makes it impossible to avoid
      memory leaks, since commit 91cf8ece ("switchdev: Remove unused
      transaction item queue") which has removed the unused mechanism of
      passing on the allocated memory between one phase and another.
      
      It is time we admit that separating the preparation from the commit
      phase is something that is best left for the driver to decide, and not
      something that should be baked into the API, especially since there are
      no switchdev callers that depend on this.
      
      This patch removes the struct switchdev_trans member from switchdev port
      attribute notifier structures, and converts drivers to not look at this
      member.
      
      In part, this patch contains a revert of my previous commit 2e554a7a
      ("net: dsa: propagate switchdev vlan_filtering prepare phase to
      drivers").
      
      For the most part, the conversion was trivial except for:
      - Rocker's world implementation based on Broadcom OF-DPA had an odd
        implementation of ofdpa_port_attr_bridge_flags_set. The conversion was
        done mechanically, by pasting the implementation twice, then only
        keeping the code that would get executed during prepare phase on top,
        then only keeping the code that gets executed during the commit phase
        on bottom, then simplifying the resulting code until this was obtained.
      - DSA's offloading of STP state, bridge flags, VLAN filtering and
        multicast router could be converted right away. But the ageing time
        could not, so a shim was introduced and this was left for a further
        commit.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Acked-by: NLinus Walleij <linus.walleij@linaro.org>
      Acked-by: NJiri Pirko <jiri@nvidia.com>
      Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> # hellcreek
      Reviewed-by: Linus Walleij <linus.walleij@linaro.org> # RTL8366RB
      Reviewed-by: NIdo Schimmel <idosch@nvidia.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      bae33f2b
    • V
      net: switchdev: remove vid_begin -> vid_end range from VLAN objects · b7a9e0da
      Vladimir Oltean 提交于
      The call path of a switchdev VLAN addition to the bridge looks something
      like this today:
      
              nbp_vlan_init
              |  __br_vlan_set_default_pvid
              |  |                       |
              |  |    br_afspec          |
              |  |        |              |
              |  |        v              |
              |  | br_process_vlan_info  |
              |  |        |              |
              |  |        v              |
              |  |   br_vlan_info        |
              |  |       / \            /
              |  |      /   \          /
              |  |     /     \        /
              |  |    /       \      /
              v  v   v         v    v
            nbp_vlan_add   br_vlan_add ------+
             |              ^      ^ |       |
             |             /       | |       |
             |            /       /  /       |
             \ br_vlan_get_master/  /        v
              \        ^        /  /  br_vlan_add_existing
               \       |       /  /          |
                \      |      /  /          /
                 \     |     /  /          /
                  \    |    /  /          /
                   \   |   /  /          /
                    v  |   | v          /
                    __vlan_add         /
                       / |            /
                      /  |           /
                     v   |          /
         __vlan_vid_add  |         /
                     \   |        /
                      v  v        v
            br_switchdev_port_vlan_add
      
      The ranges UAPI was introduced to the bridge in commit bdced7ef
      ("bridge: support for multiple vlans and vlan ranges in setlink and
      dellink requests") (Jan 10 2015). But the VLAN ranges (parsed in br_afspec)
      have always been passed one by one, through struct bridge_vlan_info
      tmp_vinfo, to br_vlan_info. So the range never went too far in depth.
      
      Then Scott Feldman introduced the switchdev_port_bridge_setlink function
      in commit 47f8328b ("switchdev: add new switchdev bridge setlink").
      That marked the introduction of the SWITCHDEV_OBJ_PORT_VLAN, which made
      full use of the range. But switchdev_port_bridge_setlink was called like
      this:
      
      br_setlink
      -> br_afspec
      -> switchdev_port_bridge_setlink
      
      Basically, the switchdev and the bridge code were not tightly integrated.
      Then commit 41c498b9 ("bridge: restore br_setlink back to original")
      came, and switchdev drivers were required to implement
      .ndo_bridge_setlink = switchdev_port_bridge_setlink for a while.
      
      In the meantime, commits such as 0944d6b5 ("bridge: try switchdev op
      first in __vlan_vid_add/del") finally made switchdev penetrate the
      br_vlan_info() barrier and start to develop the call path we have today.
      But remember, br_vlan_info() still receives VLANs one by one.
      
      Then Arkadi Sharshevsky refactored the switchdev API in 2017 in commit
      29ab586c ("net: switchdev: Remove bridge bypass support from
      switchdev") so that drivers would not implement .ndo_bridge_setlink any
      longer. The switchdev_port_bridge_setlink also got deleted.
      This refactoring removed the parallel bridge_setlink implementation from
      switchdev, and left the only switchdev VLAN objects to be the ones
      offloaded from __vlan_vid_add (basically RX filtering) and  __vlan_add
      (the latter coming from commit 9c86ce2c ("net: bridge: Notify about
      bridge VLANs")).
      
      That is to say, today the switchdev VLAN object ranges are not used in
      the kernel. Refactoring the above call path is a bit complicated, when
      the bridge VLAN call path is already a bit complicated.
      
      Let's go off and finish the job of commit 29ab586c by deleting the
      bogus iteration through the VLAN ranges from the drivers. Some aspects
      of this feature never made too much sense in the first place. For
      example, what is a range of VLANs all having the BRIDGE_VLAN_INFO_PVID
      flag supposed to mean, when a port can obviously have a single pvid?
      This particular configuration _is_ denied as of commit 6623c60d
      ("bridge: vlan: enforce no pvid flag in vlan ranges"), but from an API
      perspective, the driver still has to play pretend, and only offload the
      vlan->vid_end as pvid. And the addition of a switchdev VLAN object can
      modify the flags of another, completely unrelated, switchdev VLAN
      object! (a VLAN that is PVID will invalidate the PVID flag from whatever
      other VLAN had previously been offloaded with switchdev and had that
      flag. Yet switchdev never notifies about that change, drivers are
      supposed to guess).
      
      Nonetheless, having a VLAN range in the API makes error handling look
      scarier than it really is - unwinding on errors and all of that.
      When in reality, no one really calls this API with more than one VLAN.
      It is all unnecessary complexity.
      
      And despite appearing pretentious (two-phase transactional model and
      all), the switchdev API is really sloppy because the VLAN addition and
      removal operations are not paired with one another (you can add a VLAN
      100 times and delete it just once). The bridge notifies through
      switchdev of a VLAN addition not only when the flags of an existing VLAN
      change, but also when nothing changes. There are switchdev drivers out
      there who don't like adding a VLAN that has already been added, and
      those checks don't really belong at driver level. But the fact that the
      API contains ranges is yet another factor that prevents this from being
      addressed in the future.
      
      Of the existing switchdev pieces of hardware, it appears that only
      Mellanox Spectrum supports offloading more than one VLAN at a time,
      through mlxsw_sp_port_vlan_set. I have kept that code internal to the
      driver, because there is some more bookkeeping that makes use of it, but
      I deleted it from the switchdev API. But since the switchdev support for
      ranges has already been de facto deleted by a Mellanox employee and
      nobody noticed for 4 years, I'm going to assume it's not a biggie.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Ido Schimmel <idosch@nvidia.com> # switchdev and mlxsw
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> # hellcreek
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      b7a9e0da