1. 09 Aug, 2021 1 commit
    • net: dsa: centralize fast ageing when address learning is turned off · 045c45d1
      Vladimir Oltean committed
      Currently DSA leaves it down to device drivers to fast age the FDB on a
      port when address learning is disabled on it. There are 2 reasons for
      doing that in the first place:
      
      - when address learning is disabled by user space, through
        IFLA_BRPORT_LEARNING or the brport_attr_learning sysfs, what user
        space typically wants to achieve is to operate in a mode with no
        dynamic FDB entry on that port. But if the port is already up, some
        addresses might have been already learned on it, and it seems silly to
        wait for 5 minutes for them to expire until something useful can be
        done.
      
      - when a port leaves a bridge and becomes standalone, DSA turns off
        address learning on it. This also has the nice side effect of flushing
        the dynamically learned bridge FDB entries on it, which is a good idea
        because standalone ports should not have bridge FDB entries on them.
      
      We let drivers manage fast ageing under this condition because if DSA
      were to do it, it would need to track each port's learning state, and
      act upon the transition, which it currently doesn't.
      
      But there are 2 reasons why doing it is better after all:
      
      - drivers might get it wrong and not do it (see b53_port_set_learning)
      
      - we would like to flush the dynamic entries from the software bridge
        too, and letting drivers do that would be another pain point
      
      So track the port learning state and trigger a fast age process
      automatically within DSA.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
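The state tracking described in the commit above can be sketched as follows. This is a minimal illustration, not the kernel code: `struct sketch_port` and both functions are hypothetical names, and the "flush" is reduced to a counter so the transition logic is visible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-port state for this sketch; not the kernel's struct. */
struct sketch_port {
    bool learning;          /* last learning state applied to the port */
    unsigned int fast_ages; /* number of FDB flushes triggered so far */
};

/* Stand-in for flushing the dynamically learned FDB entries of a port. */
static void sketch_port_fast_age(struct sketch_port *p)
{
    p->fast_ages++;
}

/* Record the new learning state and flush only on an on -> off transition. */
static void sketch_port_set_learning(struct sketch_port *p, bool learning)
{
    assert(p != NULL);

    if (p->learning && !learning)
        sketch_port_fast_age(p);

    p->learning = learning;
}
```

Because the core remembers the previous state, turning learning off twice in a row flushes only once, and turning it back on never flushes.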
  2. 06 Aug, 2021 1 commit
    • net: dsa: don't disable multicast flooding to the CPU even without an IGMP querier · c73c5708
      Vladimir Oltean committed
      Commit 08cc83cc ("net: dsa: add support for BRIDGE_MROUTER
      attribute") added an option for users to turn off multicast flooding
      towards the CPU if they turn off the IGMP querier on a bridge which
      already has enslaved ports (echo 0 > /sys/class/net/br0/bridge/multicast_router).
      
      And commit a8b659e7 ("net: dsa: act as passthrough for bridge port flags")
      simply papered over that issue, because it moved the decision to flood
      the CPU with multicast (or not) from the DSA core down to individual drivers,
      instead of taking a more radical position then.
      
      The truth is that disabling multicast flooding to the CPU is simply
      something we are not prepared to do now, if at all. Some reasons:
      
      - ICMP6 neighbor solicitation messages are unregistered multicast
        packets as far as the bridge is concerned. So if we stop flooding
        multicast, the outside world cannot ping the bridge device's IPv6
        link-local address.
      
      - There might be foreign interfaces bridged with our DSA switch ports
        (sending a packet towards the host does not necessarily equal
        termination, but maybe software forwarding). So if there is no one
        interested in that multicast traffic in the local network stack, that
        doesn't mean nobody is.
      
      - PTP over L4 (IPv4, IPv6) is multicast, but is unregistered as far as
        the bridge is concerned. This should reach the CPU port.
      
      - The switch driver might not do FDB partitioning. And since we don't
        even bother to do more fine-grained flood disabling (such as "disable
        flooding _from_port_N_ towards the CPU port" as opposed to "disable
        flooding _from_any_port_ towards the CPU port"), this breaks standalone
        ports, or even multiple bridges where one has an IGMP querier and one
        doesn't.
      
      Reverting the logic makes all of the above work.
      
      Fixes: a8b659e7 ("net: dsa: act as passthrough for bridge port flags")
      Fixes: 08cc83cc ("net: dsa: add support for BRIDGE_MROUTER attribute")
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 24 Jul, 2021 1 commit
  4. 23 Jul, 2021 1 commit
    • net: dsa: mv88e6xxx: map virtual bridges with forwarding offload in the PVT · ce5df689
      Vladimir Oltean committed
      The mv88e6xxx switches have the ability to receive FORWARD (data plane)
      frames from the CPU port and route them according to the FDB. We can use
      this to offload the forwarding process of packets sent by the software
      bridge.
      
      Because DSA supports bridge domain isolation between user ports, just
      sending FORWARD frames is not enough, as they might leak the intended
      broadcast domain of the bridge on behalf of which the packets are sent.
      
      It should be noted that FORWARD frames are also (and typically) used to
      forward data plane packets on DSA links in cross-chip topologies. The
      FORWARD frame header contains the source port and switch ID, and
      switches receiving this frame header forward the packet according to
      their cross-chip port-based VLAN table (PVT).
      
      To address the bridging domain isolation in the context of offloading
      the forwarding on TX, the idea is that we can reuse the parts of the PVT
      that don't have any physical switch mapped to them, one entry for each
      software bridge. The switches will therefore think that behind their
      upstream port lie many switches, all in fact backed up by software
      bridges through tag_dsa.c, which constructs FORWARD packets with the
      right switch ID corresponding to each bridge.
      
      The mapping we use is absolutely trivial: DSA gives us a unique bridge
      number, and we add the number of the physical switches in the DSA switch
      tree to that, to obtain a unique virtual bridge device number to use in
      the PVT.
      Co-developed-by: Tobias Waldekranz <tobias@waldekranz.com>
      Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
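The mapping described in the last paragraph of the commit above is simple arithmetic, sketched here. The function name and the capacity constant are assumptions made for illustration (a 5-bit device number, hence 32 PVT entries, is assumed); the real driver may bound or order the numbers differently.

```c
#include <assert.h>

/* Assumed PVT capacity for this sketch: device numbers 0..31. */
#define SKETCH_MAX_PVT_SWITCHES 32

/*
 * Map a bridge number onto a virtual device number placed right after the
 * physical switches of the tree. Returns -1 when the PVT has no entry left.
 */
static int sketch_bridge_num_to_pvt_dev(int num_physical_switches,
                                        int bridge_num)
{
    int dev;

    assert(num_physical_switches > 0 && bridge_num >= 0);

    dev = num_physical_switches + bridge_num;
    if (dev >= SKETCH_MAX_PVT_SWITCHES)
        return -1;

    return dev;
}
```

With two physical switches, bridge 0 would use virtual device 2 and bridge 3 would use virtual device 5, so no bridge can collide with a real switch ID.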
  5. 02 Jul, 2021 6 commits
  6. 22 Jun, 2021 1 commit
    • net: dsa: mv88e6xxx: Fix adding vlan 0 · b8b79c41
      Eldar Gasanov committed
      The 8021q module adds VLAN 0 to all interfaces when it starts.
      When the 8021q module is loaded, it isn't possible to create a bond
      with mv88e6xxx interfaces: the bonding module displays the error
      "Couldn't add bond vlan ids", because it tries to add VLAN 0
      to slave interfaces.
      
      There is unexpected behavior in the switch. When a PVID
      is assigned to a port, the switch changes the VID to the PVID
      in ingress frames with VID 0 on the port. The expected behavior is
      that the switch doesn't assign the PVID to tagged frames
      with VID 0, but there isn't a way to change this behavior
      in the switch.
      
      Fixes: 57e661aa ("net: dsa: mv88e6xxx: Link aggregation support")
      Signed-off-by: Eldar Gasanov <eldargasanov2@gmail.com>
      Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
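The switch behavior reported in the commit above can be modeled in a few lines. This is only a model of the observed ingress classification as the message describes it, with a hypothetical function name; it is not driver code and does not show the fix itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * VID the switch ends up using for an ingress frame on a port that has a
 * PVID assigned, following the behavior described in the commit message.
 */
static uint16_t sketch_observed_ingress_vid(bool tagged, uint16_t vid,
                                            uint16_t pvid)
{
    assert(vid < 4096 && pvid < 4096);

    /* Untagged frames and frames tagged with VID 0 both take the PVID. */
    if (!tagged || vid == 0)
        return pvid;

    return vid;
}
```

Under this model a frame tagged with VID 0 is indistinguishable from an untagged one once it has entered the switch, which is the surprise the commit message points out.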
  7. 22 Apr, 2021 1 commit
  8. 21 Apr, 2021 3 commits
  9. 13 Apr, 2021 1 commit
    • net: phy: marvell: fix detection of PHY on Topaz switches · 1fe976d3
      Pali Rohár committed
      Since commit fee2d546 ("net: phy: marvell: mv88e6390 temperature
      sensor reading"), Linux reports the temperature of Topaz hwmon as
      constant -75°C.
      
      This is because switches from the Topaz family (88E6141 / 88E6341) have
      the address of the temperature sensor register different from Peridot.
      
      This address is instead compatible with 88E1510 PHYs, as was used for
      Topaz before the above mentioned commit.
      
      Create a new mapping table between switch family and PHY ID for families
      which don't have a model number. And define PHY IDs for Topaz and Peridot
      families.
      
      Create a new PHY ID and a new PHY driver for Topaz's internal PHY.
      The only difference from Peridot's PHY driver is the HWMON probing
      method.
      
      Prior to this change, Topaz's internal PHY was detected by the kernel as:
      
        PHY [...] driver [Marvell 88E6390] (irq=63)
      
      And afterwards as:
      
        PHY [...] driver [Marvell 88E6341 Family] (irq=63)
      Signed-off-by: Pali Rohár <pali@kernel.org>
      BugLink: https://github.com/globalscaletechnologies/linux/issues/1
      Fixes: fee2d546 ("net: phy: marvell: mv88e6390 temperature sensor reading")
      Reviewed-by: Marek Behún <kabel@kernel.org>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
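A family-to-PHY-ID mapping table of the kind the commit above introduces might look like the sketch below. All names are hypothetical, and the two ID values are included for illustration only; they should be checked against the kernel's Marvell PHY header before being relied on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical family identifiers for this sketch. */
enum sketch_family {
    SKETCH_FAMILY_6341,  /* Topaz */
    SKETCH_FAMILY_6390,  /* Peridot */
    SKETCH_FAMILY_OTHER, /* family whose PHYs report a usable model number */
};

struct sketch_family_phy_id {
    enum sketch_family family;
    uint32_t phy_id; /* illustrative value, see the note above */
};

static const struct sketch_family_phy_id sketch_phy_ids[] = {
    { SKETCH_FAMILY_6341, 0x01410f41 },
    { SKETCH_FAMILY_6390, 0x01410f90 },
};

/* Return the PHY ID to report for a family, or 0 if no override applies. */
static uint32_t sketch_family_to_phy_id(enum sketch_family family)
{
    size_t i;

    assert(family <= SKETCH_FAMILY_OTHER);

    for (i = 0; i < sizeof(sketch_phy_ids) / sizeof(sketch_phy_ids[0]); i++)
        if (sketch_phy_ids[i].family == family)
            return sketch_phy_ids[i].phy_id;

    return 0;
}
```

Giving each family its own ID is what lets a separate PHY driver, with its own hwmon probing, bind to the Topaz internal PHY.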
  10. 19 Mar, 2021 7 commits
  11. 18 Mar, 2021 4 commits
  12. 15 Feb, 2021 2 commits
  13. 13 Feb, 2021 1 commit
    • net: dsa: act as passthrough for bridge port flags · a8b659e7
      Vladimir Oltean committed
      There are multiple ways in which a PORT_BRIDGE_FLAGS attribute can be
      expressed by the bridge through switchdev, and not all of them can be
      emulated by DSA mid-layer API at the same time.
      
      One possible configuration is when the bridge offloads the port flags
      using a mask that has a single bit set - therefore only one feature
      should change. However, DSA currently groups together unicast and
      multicast flooding in the .port_egress_floods method, which limits our
      options when we try to add support for turning off broadcast flooding:
      do we extend .port_egress_floods with a third parameter which b53 and
      mv88e6xxx will ignore? But that means that the DSA layer, which
      currently implements the PRE_BRIDGE_FLAGS attribute all by itself, will
      see that .port_egress_floods is implemented, and will report that all 3
      types of flooding are supported - not necessarily true.
      
      Another configuration is when the user specifies more than one flag at
      the same time, in the same netlink message. If we were to create one
      individual function per offloadable bridge port flag, we would limit the
      ability of the switch driver to refuse certain combinations of
      flag values. For example, a switch may not have an explicit knob for
      flooding of unknown multicast, just for flooding in general. In that
      case, the only correct thing to do is to allow changes to BR_FLOOD and
      BR_MCAST_FLOOD in tandem, and never allow mismatched values. But having
      a separate .port_set_unicast_flood and .port_set_multicast_flood would
      not allow the driver to possibly reject that.
      
      Also, DSA doesn't consider it necessary to inform the driver that a
      SWITCHDEV_ATTR_ID_BRIDGE_MROUTER attribute was offloaded, because it
      just calls .port_egress_floods for the CPU port. When we'll add support
      for the plain SWITCHDEV_ATTR_ID_PORT_MROUTER, that will become a real
      problem because the flood settings will need to be held statefully in
      the DSA middle layer, otherwise changing the mrouter port attribute will
      impact the flooding attribute. And that's _assuming_ that the underlying
      hardware doesn't have anything else to do when a multicast router
      attaches to a port than flood unknown traffic to it.  If it does, there
      will need to be a dedicated .port_set_mrouter anyway.
      
      So we need to let the DSA drivers see the exact form that the bridge
      passes this switchdev attribute in, otherwise we are standing in the
      way. Therefore we also need to use this form of language when
      communicating to the driver that it needs to configure its initial
      (before bridge join) and final (after bridge leave) port flags.
      
      The b53 and mv88e6xxx drivers are converted to the passthrough API and
      their implementation of .port_egress_floods is split into two: a
      function that configures unicast flooding and another for multicast.
      The mv88e6xxx implementation is quite hairy, and it turns out that
      the implementations of unknown unicast flooding are actually the same
      for 6185 and for 6352:
      
      behind the confusing names actually lie two individual bits:
      NO_UNKNOWN_MC -> FLOOD_UC = 0x4 = BIT(2)
      NO_UNKNOWN_UC -> FLOOD_MC = 0x8 = BIT(3)
      
      so there was no reason to entangle them in the first place.
      
      Whereas the 6185 writes to MV88E6185_PORT_CTL0_FORWARD_UNKNOWN of
      PORT_CTL0, which has the exact same bit index. I have left the
      implementations separate though, for the only reason that the names are
      different enough to confuse me, since I am not able to double-check with
      a user manual. The multicast flooding setting for 6185 is in a different
      register than for 6352 though.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
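The argument in the commit above for passing the flags through unchanged can be illustrated with a driver-side check for hardware that has a single "flood unknown traffic" knob. The flag bits, the structure, and the function name below are hypothetical stand-ins, not the kernel's definitions.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical flag bits for this sketch; not the kernel's BR_* values. */
#define SKETCH_FLOOD       (1UL << 0) /* flood unknown unicast */
#define SKETCH_MCAST_FLOOD (1UL << 1) /* flood unknown multicast */
#define SKETCH_BCAST_FLOOD (1UL << 2) /* flood broadcast */

struct sketch_brport_flags {
    unsigned long mask; /* which flags this request changes */
    unsigned long val;  /* new values for the flags set in mask */
};

/*
 * Validate a request on hardware with one knob covering both unknown
 * unicast and unknown multicast: the two flags must change together and
 * to the same value, and nothing else is supported.
 */
static int sketch_pre_bridge_flags(struct sketch_brport_flags flags)
{
    const unsigned long both = SKETCH_FLOOD | SKETCH_MCAST_FLOOD;
    unsigned long changed = flags.mask & both;
    unsigned long val = flags.val & both;

    if (flags.mask & ~both)
        return -EINVAL; /* e.g. broadcast flooding control is unsupported */

    if (!changed)
        return 0;       /* nothing relevant is being changed */

    if (changed != both)
        return -EINVAL; /* only one of the two flags was given */

    if (val != 0 && val != both)
        return -EINVAL; /* mismatched values */

    return 0;
}
```

A check like this is only possible when the driver sees the whole mask and value pair at once; separate per-flag callbacks would each see half of the request.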
  14. 02 Feb, 2021 1 commit
  15. 28 Jan, 2021 1 commit
  16. 27 Jan, 2021 2 commits
  17. 16 Jan, 2021 2 commits
    • net: dsa: set configure_vlan_while_not_filtering to true by default · 0ee2af4e
      Vladimir Oltean committed
      As explained in commit 54a0ed0d ("net: dsa: provide an option for
      drivers to always receive bridge VLANs"), DSA has historically been
      skipping VLAN switchdev operations when the bridge wasn't in
      vlan_filtering mode, but the reason why it was doing that has never been
      clear. So the configure_vlan_while_not_filtering option is there merely
      to preserve functionality for existing drivers. It isn't some behavior
      that drivers should opt into. Ideally, when all drivers leave this flag
      set, we can delete the dsa_port_skip_vlan_configuration() function.
      
      New drivers always seem to omit setting this flag, for some reason. So
      let's reverse the logic: the DSA core sets it by default to true before
      the .setup() callback, and legacy drivers can turn it off. This way, new
      drivers get the new behavior by default, unless they explicitly set the
      flag to false, which is more obvious during review.
      
      Remove the assignment from drivers which were setting it to true, and
      add the assignment to false for the drivers that didn't previously have
      it. This way, it should be easier to see how many we have left.
      
      The following drivers: lan9303, mv88e6060 were skipped from setting this
      flag to false, because they didn't have any VLAN offload ops in the
      first place.
      
      The Broadcom Starfighter 2 driver calls the common b53_switch_alloc and
      therefore also inherits the configure_vlan_while_not_filtering=true
      behavior.
      
      Also, print a message through netlink extack every time a VLAN has been
      skipped. This is mildly annoying on purpose, so that (a) it is at least
      clear that VLANs are being skipped - the legacy behavior in itself is
      confusing, and the extack should be much more difficult to miss, unlike
      kernel logs - and (b) people have one more incentive to convert to the
      new behavior.
      
      No behavior change except for the added prints is intended at this time.
      
      $ ip link add br0 type bridge vlan_filtering 0
      $ ip link set sw0p2 master br0
      [   60.315148] br0: port 1(sw0p2) entered blocking state
      [   60.320350] br0: port 1(sw0p2) entered disabled state
      [   60.327839] device sw0p2 entered promiscuous mode
      [   60.334905] br0: port 1(sw0p2) entered blocking state
      [   60.340142] br0: port 1(sw0p2) entered forwarding state
      Warning: dsa_core: skipping configuration of VLAN. # This was the pvid
      $ bridge vlan add dev sw0p2 vid 100
      Warning: dsa_core: skipping configuration of VLAN.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Link: https://lore.kernel.org/r/20210115231919.43834-1-vladimir.oltean@nxp.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
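The reversed default described in the commit above can be sketched as below. The structure and function names are hypothetical, and the skip condition is a simplified reading of the commit message (a VLAN is skipped only when the bridge is not VLAN-filtering and the driver opted out).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical switch object for this sketch. */
struct sketch_switch {
    bool configure_vlan_while_not_filtering;
    void (*setup)(struct sketch_switch *ds); /* driver callback */
};

/* A legacy driver explicitly opts out in its setup callback. */
static void sketch_legacy_setup(struct sketch_switch *ds)
{
    ds->configure_vlan_while_not_filtering = false;
}

/* A new driver does nothing and inherits the default. */
static void sketch_modern_setup(struct sketch_switch *ds)
{
    (void)ds;
}

/* Core: set the default to true, then let the driver override it. */
static void sketch_core_setup(struct sketch_switch *ds)
{
    assert(ds != NULL && ds->setup != NULL);

    ds->configure_vlan_while_not_filtering = true;
    ds->setup(ds);
}

/* True when a VLAN from the bridge should not be passed to the driver. */
static bool sketch_skip_vlan_configuration(const struct sketch_switch *ds,
                                           bool bridge_vlan_filtering)
{
    if (bridge_vlan_filtering)
        return false;

    return !ds->configure_vlan_while_not_filtering;
}
```

Only the legacy driver skips VLANs under a non-filtering bridge; a driver that says nothing gets the new behavior, which is the point of flipping the default.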
    • net: dsa: mv88e6xxx: Only allow LAG offload on supported hardware · b80dc51b
      Tobias Waldekranz committed
      There are chips that do not have Global 2 registers, and therefore trunk
      mapping/mask tables are not available. Refuse the offload as early as
      possible on those devices.
      
      Fixes: 57e661aa ("net: dsa: mv88e6xxx: Link aggregation support")
      Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
      Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
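An early capability check of the kind the commit above asks for could look like this sketch. The chip structure and its `has_global2` field are hypothetical simplifications; the trunk mapping/mask tables are assumed to live in the Global 2 register block, so a chip without that block cannot offload a LAG.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical chip description for this sketch. */
struct sketch_chip {
    bool has_global2; /* whether the Global 2 register block exists */
};

/* Refuse LAG offload up front on chips lacking the trunk tables. */
static bool sketch_lag_can_offload(const struct sketch_chip *chip)
{
    assert(chip != NULL);

    return chip->has_global2;
}
```

Returning false before any hardware access lets the stack fall back to software bonding instead of failing halfway through the join.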
  18. 15 Jan, 2021 1 commit
  19. 12 Jan, 2021 3 commits
    • net: dsa: remove the transactional logic from VLAN objects · 1958d581
      Vladimir Oltean committed
      It should be the driver's business to logically separate its VLAN
      offloading into a preparation and a commit phase, and some drivers don't
      need / can't do this.
      
      So remove the transactional shim from DSA and let drivers propagate
      errors directly from the .port_vlan_add callback.
      
      It would appear that the code has worse error handling now than it had
      before. DSA is the only in-kernel user of switchdev that offloads one
      switchdev object to more than one port: for every VLAN object offloaded
      to a user port, that VLAN is also offloaded to the CPU port. So the
      "prepare for user port -> check for errors -> prepare for CPU port ->
      check for errors -> commit for user port -> commit for CPU port"
      sequence appears to make more sense than the one we are using now:
      "offload to user port -> check for errors -> offload to CPU port ->
      check for errors", but it is really a compromise. In the new way, we can
      catch errors from the commit phase that we previously had to ignore.
      But we have our hands tied and cannot do any rollback now: if we add a
      VLAN on the CPU port and it fails, we can't do the rollback by simply
      deleting it from the user port, because the switchdev API is not so nice
      with us: it could have simply been there already, even with the same
      flags. So we don't even attempt to rollback anything on addition error,
      just leave whatever VLANs managed to get offloaded right where they are.
      This should not be a problem at all in practice.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Acked-by: Linus Walleij <linus.walleij@linaro.org>
      Acked-by: Jiri Pirko <jiri@nvidia.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
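The "offload to user port, check for errors, offload to CPU port, check for errors" sequence from the commit above, including its lack of rollback, can be sketched as follows. Every name here is hypothetical, and the fake driver exists only so the example can be exercised.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

typedef int (*sketch_vlan_add_fn)(int port, unsigned short vid);

/* Fake driver state: bitmap of ports holding the VLAN, and a port that fails. */
static unsigned int sketch_programmed;
static int sketch_failing_port = -1;

/* Fake driver callback: fails on the designated port, succeeds elsewhere. */
static int sketch_drv_vlan_add(int port, unsigned short vid)
{
    (void)vid;

    if (port == sketch_failing_port)
        return -ENOSPC;

    sketch_programmed |= 1u << port;
    return 0;
}

/* Single-phase add: user port first, then the CPU port, errors propagated. */
static int sketch_port_vlan_add(sketch_vlan_add_fn drv_add, int user_port,
                                int cpu_port, unsigned short vid)
{
    int err;

    assert(drv_add != NULL);

    err = drv_add(user_port, vid);
    if (err)
        return err;

    /* On failure here the user port keeps the VLAN: no rollback is tried. */
    return drv_add(cpu_port, vid);
}
```

If the CPU port fails, the caller sees the error while the user port stays programmed, which matches the compromise the commit message accepts.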
    • net: dsa: remove the transactional logic from MDB entries · a52b2da7
      Vladimir Oltean committed
      For many drivers, the .port_mdb_prepare callback was not a good opportunity
      to avoid any error condition, and they would suppress errors found during
      the actual commit phase.
      
      Where a logical separation between the prepare and the commit phase
      existed, the function that used to implement the .port_mdb_prepare
      callback still exists, but now it is called directly from .port_mdb_add,
      which was modified to return an int code.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Acked-by: Linus Walleij <linus.walleij@linaro.org>
      Acked-by: Jiri Pirko <jiri@nvidia.com>
      Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> # hellcreek
      Reviewed-by: Linus Walleij <linus.walleij@linaro.org> # RTL8366
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
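The conversion pattern described in the commit above, where the old prepare function survives but is called directly from an add callback that now returns an error code, can be sketched like this. The table size, the structure, and the function names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical multicast database entry for this sketch. */
struct sketch_mdb {
    unsigned char addr[6];
    unsigned short vid;
};

#define SKETCH_MDB_SLOTS 2
static unsigned int sketch_mdb_used; /* entries currently installed */

/* Former "prepare" step: validation that can be done before committing. */
static int sketch_mdb_prepare(const struct sketch_mdb *mdb)
{
    if (!(mdb->addr[0] & 0x01))
        return -EINVAL; /* group bit clear: not a multicast address */

    if (sketch_mdb_used >= SKETCH_MDB_SLOTS)
        return -ENOSPC; /* table full */

    return 0;
}

/* Single-phase add: validate, then commit, returning any error. */
static int sketch_port_mdb_add(const struct sketch_mdb *mdb)
{
    int err;

    assert(mdb != NULL);

    err = sketch_mdb_prepare(mdb);
    if (err)
        return err;

    sketch_mdb_used++; /* stand-in for writing the hardware entry */
    return 0;
}
```

Since the add callback returns an int, an error found while committing can reach the caller instead of being suppressed as it was under the two-phase model.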
    • net: switchdev: remove the transaction structure from port attributes · bae33f2b
      Vladimir Oltean committed
      Since the introduction of the switchdev API, port attributes were
      transmitted to drivers for offloading using a two-step transactional
      model, with a prepare phase that was supposed to catch all errors, and a
      commit phase that was supposed to never fail.
      
      Some classes of failures can never be avoided, like hardware access, or
      memory allocation. In the latter case, merely attempting to move the
      memory allocation to the preparation phase makes it impossible to avoid
      memory leaks, since commit 91cf8ece ("switchdev: Remove unused
      transaction item queue") which has removed the unused mechanism of
      passing on the allocated memory between one phase and another.
      
      It is time we admit that separating the preparation from the commit
      phase is something that is best left for the driver to decide, and not
      something that should be baked into the API, especially since there are
      no switchdev callers that depend on this.
      
      This patch removes the struct switchdev_trans member from switchdev port
      attribute notifier structures, and converts drivers to not look at this
      member.
      
      In part, this patch contains a revert of my previous commit 2e554a7a
      ("net: dsa: propagate switchdev vlan_filtering prepare phase to
      drivers").
      
      For the most part, the conversion was trivial except for:
      - Rocker's world implementation based on Broadcom OF-DPA had an odd
        implementation of ofdpa_port_attr_bridge_flags_set. The conversion was
        done mechanically, by pasting the implementation twice, then only
        keeping the code that would get executed during prepare phase on top,
        then only keeping the code that gets executed during the commit phase
        on bottom, then simplifying the resulting code until this was obtained.
      - DSA's offloading of STP state, bridge flags, VLAN filtering and
        multicast router could be converted right away. But the ageing time
        could not, so a shim was introduced and this was left for a further
        commit.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Acked-by: Linus Walleij <linus.walleij@linaro.org>
      Acked-by: Jiri Pirko <jiri@nvidia.com>
      Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> # hellcreek
      Reviewed-by: Linus Walleij <linus.walleij@linaro.org> # RTL8366RB
      Reviewed-by: Ido Schimmel <idosch@nvidia.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>