  1. 20 September 2022 (2 commits)
    • net: dsa: allow the DSA master to be seen and changed through rtnetlink · 95f510d0
      Authored by Vladimir Oltean
      Some DSA switches have multiple CPU ports, which can be used to improve
      CPU termination throughput, but DSA, through dsa_tree_setup_cpu_ports(),
      sets up only the first one, leading to suboptimal use of hardware.
      
      The desire is to not change the default configuration but to permit the
      user to create a dynamic mapping between individual user ports and the
      CPU port that they are served by, configurable through rtnetlink. It is
      also intended to permit load balancing between CPU ports, and in that
      case, the foreseen model is for the DSA master to be a bonding interface
      whose lowers are the physical DSA masters.
      
      To that end, we create a struct rtnl_link_ops for DSA user ports with
      the "dsa" kind. We expose the IFLA_DSA_MASTER link attribute that
      contains the ifindex of the newly desired DSA master.
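
      A minimal sketch of what that netlink plumbing could look like (the
      handler and helper names below are illustrative, not necessarily the
      ones in the patch):

          static const struct nla_policy dsa_policy[IFLA_DSA_MAX + 1] = {
              [IFLA_DSA_MASTER] = { .type = NLA_U32 },
          };

          static int dsa_changelink(struct net_device *dev, struct nlattr *tb[],
                                    struct nlattr *data[],
                                    struct netlink_ext_ack *extack)
          {
              if (data && data[IFLA_DSA_MASTER]) {
                  u32 ifindex = nla_get_u32(data[IFLA_DSA_MASTER]);
                  struct net_device *master;

                  master = __dev_get_by_index(dev_net(dev), ifindex);
                  if (!master)
                      return -ENODEV;

                  /* Validate and migrate the user port to its new master */
                  return dsa_slave_change_master(dev, master, extack);
              }

              return 0;
          }

          static struct rtnl_link_ops dsa_link_ops __read_mostly = {
              .kind       = "dsa",
              .maxtype    = IFLA_DSA_MAX,
              .policy     = dsa_policy,
              .changelink = dsa_changelink,
          };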
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net: dsa: introduce dsa_port_get_master() · 8f6a19c0
      Authored by Vladimir Oltean
      There is a desire to support DSA masters in a LAG.
      
      That configuration is intended to work by simply enslaving the master to
      a bonding/team device. But the physical DSA master (the LAG slave) still
      has a dev->dsa_ptr, and that cpu_dp still corresponds to the physical
      CPU port.
      
      However, we would like to be able to retrieve the LAG that's the upper
      of the physical DSA master. In preparation for that, introduce a helper
      called dsa_port_get_master() that replaces all occurrences of the
      dp->cpu_dp->master pattern. The distinction between LAG and non-LAG will
      be made later within the helper itself.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
  2. 23 August 2022 (7 commits)
  3. 30 June 2022 (1 commit)
  4. 27 June 2022 (2 commits)
  5. 10 June 2022 (1 commit)
  6. 13 May 2022 (1 commit)
    • net: dsa: felix: manage host flooding using a specific driver callback · 72c3b0c7
      Authored by Vladimir Oltean
      At the time of commit 7569459a ("net: dsa: manage flooding on the CPU
      ports"), not introducing a dedicated switch callback for host flooding
      made sense, because for the only user, the felix driver, there was
      nothing to do for the CPU port other than setting the flood flags on
      it, just as on any other bridge port.
      
      There are 2 reasons why this approach is not good enough, however.
      
      (1) Other drivers, like sja1105, support configuring flooding as a
          function of {ingress port, egress port}, whereas the DSA
          ->port_bridge_flags() function only operates on an egress port.
          So with that driver we'd have useless host flooding from user ports
          which don't need it.
      
      (2) Even with the felix driver, support for multiple CPU ports makes it
          difficult to piggyback on ->port_bridge_flags(). The way in which
          the felix driver is going to support host-filtered addresses with
          multiple CPU ports is that it will direct these addresses towards
          both CPU ports (in a sort of multicast fashion), then restrict the
          forwarding to only one of the two using the forwarding masks.
          Consequently, flooding will also be enabled towards both CPU ports.
          However, ->port_bridge_flags() gets passed the index of a single CPU
          port, and that leaves the flood settings out of sync between the 2
          CPU ports.
      
      This is to say, it's better to have a specific driver method for host
      flooding, which takes the user port as argument. This solves problem (1)
      by allowing the driver to do different things for different user ports,
      and problem (2) by abstracting the operation and letting the driver do
      whatever, rather than explicitly making the DSA core point to the CPU
      port it thinks needs to be touched.
      
      This new method also creates a problem, which is that cross-chip setups
      are not handled. However I don't have hardware right now where I can
      test what is the proper thing to do, and there isn't hardware compatible
      with multi-switch trees that supports host flooding. So it remains a
      problem to be tackled in the future.
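
      A sketch of the dedicated method this argues for (the exact name and
      signature in the patch may differ):

          struct dsa_switch_ops {
              ...
              /* Enable/disable flooding of unknown unicast/multicast
               * towards the CPU, on behalf of user port "port". */
              int (*port_set_host_flood)(struct dsa_switch *ds, int port,
                                         bool uc, bool mc);
              ...
          };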
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  7. 25 April 2022 (1 commit)
    • net: dsa: flood multicast to CPU when slave has IFF_PROMISC · 7c762e70
      Authored by Vladimir Oltean
      Certain DSA switches can eliminate flooding to the CPU when none of the
      ports have the IFF_ALLMULTI or IFF_PROMISC flags set. This is done by
      synthesizing a call to dsa_port_bridge_flags() for the CPU port, a call
      which normally comes from the bridge driver via switchdev.
      
      The bridge port flags and IFF_PROMISC|IFF_ALLMULTI have slightly
      different semantics, and due to inattention/lack of proper testing, the
      IFF_PROMISC flag allows unknown unicast to be flooded to the CPU, but
      not unknown multicast.
      
      This must be fixed by setting both BR_FLOOD (unicast) and BR_MCAST_FLOOD
      in the synthesized dsa_port_bridge_flags() call, since IFF_PROMISC means
      that packets should not be filtered regardless of their MAC DA.
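
      Sketched out, the fixed flag synthesis becomes (local variable names
      assumed):

          if (dev->flags & IFF_ALLMULTI)
              flags.val |= BR_MCAST_FLOOD;
          /* IFF_PROMISC: nothing is filtered, regardless of MAC DA */
          if (dev->flags & IFF_PROMISC)
              flags.val |= BR_FLOOD | BR_MCAST_FLOOD;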
      
      Fixes: 7569459a ("net: dsa: manage flooding on the CPU ports")
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  8. 20 April 2022 (4 commits)
  9. 23 March 2022 (1 commit)
  10. 18 March 2022 (4 commits)
  11. 17 March 2022 (1 commit)
  12. 14 March 2022 (2 commits)
    • net: dsa: report and change port dscp priority using dcbnl · 47d75f78
      Authored by Vladimir Oltean
      Similar to the port-based default priority, IEEE 802.1Q-2018 allows the
      Application Priority Table to define QoS classes (0 to 7) per IP DSCP
      value (0 to 63).
      
      In the absence of an app table entry for a packet with DSCP value X,
      QoS classification for that packet falls back to other methods (VLAN PCP
      or port-based default). The presence of an app table for DSCP value X
      with priority Y makes the hardware classify the packet to QoS class Y.
      
      As opposed to the default-prio where DSA exposes only a "set" in
      dsa_switch_ops (because the port-based default is the fallback, it
      always exists, either implicitly or explicitly), for DSCP priorities we
      expose an "add" and a "del". The addition of a DSCP entry means trusting
      that DSCP priority, the deletion means ignoring it.
      
      Drivers that already trust (at least some) DSCP values can describe
      their configuration in dsa_switch_ops :: port_get_dscp_prio(), which is
      called for each DSCP value from 0 to 63.
      
      Again, there can be more than one dcbnl app table entry for the same
      DSCP value, DSA chooses the one with the largest configured priority.
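
      The ops described above could look like this (a sketch, not
      necessarily the exact signatures):

          struct dsa_switch_ops {
              ...
              /* Return the priority configured for "dscp", or an error
               * if that DSCP value is not trusted on this port. */
              int (*port_get_dscp_prio)(struct dsa_switch *ds, int port,
                                        u8 dscp);
              int (*port_add_dscp_prio)(struct dsa_switch *ds, int port,
                                        u8 dscp, u8 prio);
              int (*port_del_dscp_prio)(struct dsa_switch *ds, int port,
                                        u8 dscp, u8 prio);
              ...
          };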
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: report and change port default priority using dcbnl · d538eca8
      Authored by Vladimir Oltean
      The port-based default QoS class is assigned to packets that lack a
      VLAN PCP (or whose ingress port is configured not to trust the VLAN
      PCP), that lack an IP DSCP (or whose ingress port is configured not
      to trust IP DSCP), and on which no tc-skbedit action has matched.
      
      Similar to other drivers, this can be exposed to user space using the
      DCB Application Priority Table. IEEE 802.1Q-2018 specifies in Table
      D-8 - Sel field values that when the Selector is 1, the Protocol ID
      value of 0 denotes the "Default application priority. For use when
      application priority is not otherwise specified."
      
      The design of the dcbnl integration in DSA follows from its
      requirements. Andrew Lunn explains that SOHO switches are
      expected to come with some sort of pre-configured QoS profile, and that
      it is desirable for this to come pre-loaded into the DSA slave interfaces'
      DCB application priority table.
      
      In the dcbnl design, this is possible because calls to
      dcb_ieee_setapp() can be initiated by anyone, including the device
      driver itself.
      
      However, what makes this challenging to implement in DSA is that the DSA
      core manages the net_devices (effectively hiding them from drivers),
      while drivers manage the hardware. The DSA core has no knowledge of what
      individual drivers' QoS policies are. DSA could export to drivers a
      wrapper over dcb_ieee_setapp() and these could call that function to
      pre-populate the app priority table, however drivers don't have a good
      moment in time to do this. The dsa_switch_ops :: setup() method gets
      called before the net_devices are created (dsa_slave_create), and so is
      dsa_switch_ops :: port_setup(). What remains is dsa_switch_ops ::
      port_enable(), but this gets called upon each ndo_open. If we add app
      table entries on every open, we'd need to remove them on close, to avoid
      duplicate entry errors. But if we delete app priority entries on close,
      what we delete may not be the initial, driver pre-populated entries, but
      rather user-added entries.
      
      So it is clear that letting drivers choose the timing of the
      dcb_ieee_setapp() call is inappropriate. The alternative which was
      chosen is to introduce hardware-specific ops in dsa_switch_ops, and
      effectively hide dcbnl details from drivers as well. For pre-populating
      the application table, dsa_slave_dcbnl_init() will call
      ds->ops->port_get_default_prio() which is supposed to read from
      hardware. If the operation succeeds, DSA creates a default-prio app
      table entry. The method is called as soon as the slave_dev is
      registered, but before we release the rtnl_mutex. This is done such that
      user space sees the app table entries as soon as it sees the interface
      being registered.
      
      The fact that we populate slave_dev->dcbnl_ops with a non-NULL pointer
      changes behavior in dcb_doit() from net/dcb/dcbnl.c, which used to
      return -EOPNOTSUPP for any dcbnl operation where netdev->dcbnl_ops is
      NULL. Because there are still dcbnl-unaware DSA drivers even if they
      have dcbnl_ops populated, the way to restore the behavior is to make all
      dcbnl_ops return -EOPNOTSUPP on absence of the hardware-specific
      dsa_switch_ops method.
      
      The dcbnl framework absurdly allows there to be more than one app table
      entry for the same selector and protocol (in other words, more than one
      port-based default priority). In the iproute2 dcb program, there is a
      "replace" syntactical sugar command which performs an "add" and a "del"
      to hide this away. But we choose the largest configured priority when we
      call ds->ops->port_set_default_prio(), using __fls(). When there is no
      default-prio app table entry left, the port-default priority is restored
      to 0.
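
      A sketch of that selection logic, assuming the existing dcbnl helper
      dcb_ieee_getapp_mask() and local names for the slave and its switch:

          struct dcb_app app = {
              .selector = IEEE_8021QAZ_APP_SEL_ETHERTYPE,
              .protocol = 0, /* default application priority */
          };
          u8 mask = dcb_ieee_getapp_mask(slave_dev, &app);
          /* Largest configured priority wins; 0 when no entry is left */
          u8 prio = mask ? __fls(mask) : 0;

          err = ds->ops->port_set_default_prio(ds, port, prio);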
      
      Link: https://patchwork.kernel.org/project/netdevbpf/patch/20210113154139.1803705-2-olteanv@gmail.com/
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  13. 09 March 2022 (1 commit)
  14. 03 March 2022 (5 commits)
    • net: dsa: manage flooding on the CPU ports · 7569459a
      Authored by Vladimir Oltean
      DSA can treat IFF_PROMISC and IFF_ALLMULTI on standalone user ports as
      signifying whether packets with an unknown MAC DA will be received or
      not. Since known MAC DAs are handled by FDB/MDB entries, this means that
      promiscuity is analogous to including/excluding the CPU port from the
      flood domain of those packets.
      
      There are two ways to signal CPU flooding to drivers.
      
      The first (chosen here) is to synthesize a call to
      ds->ops->port_bridge_flags() for the CPU port, with a mask of
      BR_FLOOD | BR_MCAST_FLOOD. This has the effect of turning on egress
      flooding on the CPU port regardless of source.
      
      The alternative would be to create a new ds->ops->port_host_flood()
      which is called per user port. Some switches (sja1105) have a flood
      domain that is managed per {ingress port, egress port} pair, so it would
      make more sense for this kind of switch to not flood the CPU from port A
      if just port B requires it. Nonetheless, the sja1105 has other quirks
      that prevent it from making use of unicast filtering, and without a
      concrete user making use of this feature, I chose not to implement it.
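
      A sketch of the chosen approach (assumed helper name), to be run
      whenever the promiscuity of a standalone user port changes:

          static int dsa_port_manage_cpu_flood(struct dsa_port *dp)
          {
              struct switchdev_brport_flags flags = {
                  .mask = BR_FLOOD | BR_MCAST_FLOOD,
              };
              struct dsa_port *cpu_dp = dp->cpu_dp;
              struct dsa_switch *ds = dp->ds;
              struct dsa_port *other_dp;

              if (!ds->ops->port_bridge_flags)
                  return -EOPNOTSUPP;

              /* Flood towards the CPU while at least one standalone user
               * port wants unknown traffic; known MAC DAs are covered by
               * FDB/MDB entries. (The IFF_PROMISC/multicast interaction
               * is refined by the later fix quoted above.) */
              dsa_switch_for_each_user_port(other_dp, ds) {
                  if (other_dp->slave->flags & IFF_ALLMULTI)
                      flags.val |= BR_MCAST_FLOOD;
                  if (other_dp->slave->flags & IFF_PROMISC)
                      flags.val |= BR_FLOOD;
              }

              return ds->ops->port_bridge_flags(ds, cpu_dp->index, flags,
                                                NULL);
          }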
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: install the primary unicast MAC address as standalone port host FDB · 499aa9e1
      Authored by Vladimir Oltean
      To be able to safely turn off CPU flooding for standalone ports, we need
      to ensure that the dev_addr of each DSA slave interface is installed as
      a standalone host FDB entry for compatible switches.
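
      The idea, sketched with the standalone helper introduced in the
      commit below:

          /* Let the switch terminate packets to our own MAC address
           * without relying on CPU flooding; VID 0 = the standalone,
           * VLAN-unaware database. */
          err = dsa_port_standalone_host_fdb_add(dp, dev->dev_addr, 0);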
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: install secondary unicast and multicast addresses as host FDB/MDB · 5e8a1e03
      Authored by Vladimir Oltean
      In preparation of disabling flooding towards the CPU in standalone ports
      mode, identify the addresses requested by upper interfaces and use the
      new API for DSA FDB isolation to request the hardware driver to offload
      these as FDB or MDB objects. The objects belong to the user port's
      database, and are installed pointing towards the CPU port.
      
      Because dev_uc_add()/dev_mc_add() is VLAN-unaware, we offload to the
      port standalone database addresses with VID 0 (also VLAN-unaware).
      So this excludes switches with global VLAN filtering from supporting
      unicast filtering, because there, it is possible for a port of a switch
      to join a VLAN-aware bridge, and this changes the VLAN awareness of
      standalone ports, requiring VLAN-aware standalone host FDB entries.
      For the same reason, hellcreek, which requires VLAN awareness in
      standalone mode, is also exempted from unicast filtering.
      
      We create "standalone" variants of dsa_port_host_fdb_add() and
      dsa_port_host_mdb_add() (and the corresponding _del functions).
      
      We also create a separate work item type for handling deferred
      standalone host FDB/MDB entries compared to the switchdev one.
      This is done for the purpose of clarity - the procedure for offloading a
      bridge FDB entry is different than offloading a standalone one, and
      the switchdev event work handles only FDBs anyway, not MDBs.
      Deferral is needed for standalone entries because ndo_set_rx_mode runs
      in atomic context. We could probably optimize things a little by first
      queuing up all entries that need to be offloaded, and scheduling the
      work item just once, but the data structures that we can pass through
      __dev_uc_sync() and __dev_mc_sync() are limiting (there is nothing like
      a void *priv), so we'd have to keep the list of queued events somewhere
      in struct dsa_switch, and possibly a lock for it. Too complicated for
      now.
      
      Adding the address to the master is handled by dev_uc_sync(), and
      adding it to the hardware is handled by __dev_uc_sync(). That is why
      dsa_port_standalone_host_fdb_add() does not call dev_uc_add(). Not
      that it would hold the rtnl_mutex anyway: ndo_set_rx_mode has it, but
      runs in atomic context.
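
      A sketch of the resulting rx_mode path (the sync/unsync callback
      names are assumed):

          static void dsa_slave_set_rx_mode(struct net_device *dev)
          {
              /* Atomic context: the callbacks only queue deferred work;
               * the work item does the actual hardware programming. */
              __dev_uc_sync(dev, dsa_slave_sync_uc, dsa_slave_unsync_uc);
              __dev_mc_sync(dev, dsa_slave_sync_mc, dsa_slave_unsync_mc);
          }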
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: rename the host FDB and MDB methods to contain the "bridge" namespace · 68d6d71e
      Authored by Vladimir Oltean
      We are preparing to add API in port.c that adds FDB and MDB entries that
      correspond to the port's standalone database. Rename the existing
      methods to make it clear that the FDB and MDB entries offloaded come
      from the bridge database.
      
      Since the function names lengthen in dsa_slave_switchdev_event_work(),
      we place "addr" and "vid" in temporary variables, to shorten those.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: remove workarounds for changing master promisc/allmulti only while up · 35aae5ab
      Authored by Vladimir Oltean
      Lennert Buytenhek explains in commit df02c6ff ("dsa: fix master
      interface allmulti/promisc handling"), dated Nov 2008, that changing the
      promiscuity of interfaces that are down (here the master) is broken.
      
      This fact regarding promisc/allmulti has changed since commit
      b6c40d68 ("net: only invoke dev->change_rx_flags when device is UP")
      by Vlad Yasevich, dated Nov 2013.
      
      Therefore, DSA now has unnecessary complexity to handle master state
      transitions from down to up. In fact, syncing the unicast and multicast
      addresses can happen completely asynchronously to the administrative
      state changes.
      
      This change reduces that complexity by effectively fully reverting
      commit df02c6ff ("dsa: fix master interface allmulti/promisc
      handling").
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  15. 25 February 2022 (5 commits)
    • net: dsa: support FDB events on offloaded LAG interfaces · e212fa7c
      Authored by Vladimir Oltean
      This change introduces support for installing static FDB entries towards
      a bridge port that is a LAG of multiple DSA switch ports, as well as
      support for filtering towards the CPU the local FDB entries emitted
      for LAG interfaces that are bridge ports.
      
      Conceptually, host addresses on LAG ports are identical to what we do
      for plain bridge ports. FDB entries _towards_ a LAG, however, can't
      simply be replicated towards all member ports the way we do for
      multicast or VLANs.
      Instead we need new driver API. Hardware usually considers a LAG to be a
      "logical port", and sets the entire LAG as the forwarding destination.
      The physical egress port selection within the LAG is made by hashing
      policy, as usual.
      
      To represent the logical port corresponding to the LAG, we pass by value
      a copy of the dsa_lag structure to all switches in the tree that have at
      least one port in that LAG.
      
      To illustrate why a refcounted list of FDB entries is needed in struct
      dsa_lag, it is enough to say that:
      - a LAG may be a bridge port and may therefore receive FDB events even
        while it isn't yet offloaded by any DSA interface
      - DSA interfaces may be removed from a LAG while that is a bridge port;
        we don't want FDB entries lingering around, but we don't want to
        remove entries that are still in use, either
      
      For all the cases below to work, the idea is to always keep an FDB entry
      on a LAG with a reference count equal to the number of DSA member
      ports. So:
      - if a port joins a LAG, it requests the bridge to replay the FDB, and
        the FDB entries get created, or their refcount gets bumped by one
      - if a port leaves a LAG, the FDB replay deletes or decrements refcount
        by one
      - if an FDB is installed towards a LAG with ports already present, that
        entry is created (if it doesn't exist) and its refcount is bumped by
        the amount of ports already present in the LAG
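
      A sketch of the refcounted FDB list this implies (field names are
      illustrative):

          struct dsa_mac_addr {
              unsigned char addr[ETH_ALEN];
              u16 vid;
              refcount_t refcount;
              struct list_head list;
          };

          struct dsa_lag {
              struct net_device *dev; /* the LAG net device */
              unsigned int id;
              struct mutex fdb_lock;
              struct list_head fdbs;  /* of struct dsa_mac_addr */
              refcount_t refcount;
          };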
      
      echo "Adding FDB entry to bond with existing ports"
      ip link del bond0
      ip link add bond0 type bond mode 802.3ad
      ip link set swp1 down && ip link set swp1 master bond0 && ip link set swp1 up
      ip link set swp2 down && ip link set swp2 master bond0 && ip link set swp2 up
      ip link del br0
      ip link add br0 type bridge
      ip link set bond0 master br0
      bridge fdb add dev bond0 00:01:02:03:04:05 master static
      
      ip link del br0
      ip link del bond0
      
      echo "Adding FDB entry to empty bond"
      ip link del bond0
      ip link add bond0 type bond mode 802.3ad
      ip link del br0
      ip link add br0 type bridge
      ip link set bond0 master br0
      bridge fdb add dev bond0 00:01:02:03:04:05 master static
      ip link set swp1 down && ip link set swp1 master bond0 && ip link set swp1 up
      ip link set swp2 down && ip link set swp2 master bond0 && ip link set swp2 up
      
      ip link del br0
      ip link del bond0
      
      echo "Adding FDB entry to empty bond, then removing ports one by one"
      ip link del bond0
      ip link add bond0 type bond mode 802.3ad
      ip link del br0
      ip link add br0 type bridge
      ip link set bond0 master br0
      bridge fdb add dev bond0 00:01:02:03:04:05 master static
      ip link set swp1 down && ip link set swp1 master bond0 && ip link set swp1 up
      ip link set swp2 down && ip link set swp2 master bond0 && ip link set swp2 up
      
      ip link set swp1 nomaster
      ip link set swp2 nomaster
      ip link del br0
      ip link del bond0
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: dsa: call SWITCHDEV_FDB_OFFLOADED for the orig_dev · 93c79823
      Authored by Vladimir Oltean
      When switchdev_handle_fdb_event_to_device() replicates an FDB event
      emitted for the bridge or for a LAG port and DSA offloads that, we
      should notify back to switchdev that the FDB entry on the original
      device is what was offloaded, not on the DSA slave devices that the
      event is replicated on.
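
      A sketch of the corrected notification (field names on the deferred
      work structure are assumed):

          struct switchdev_notifier_fdb_info info = {
              .addr = switchdev_work->addr,
              .vid = switchdev_work->vid,
              .offloaded = true,
          };

          call_switchdev_notifiers(SWITCHDEV_FDB_OFFLOADED,
                                   switchdev_work->orig_dev, &info.info,
                                   NULL);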
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: dsa: remove "ds" and "port" from struct dsa_switchdev_event_work · e35f12e9
      Authored by Vladimir Oltean
      By construction, the struct net_device *dev passed to
      dsa_slave_switchdev_event_work() via struct dsa_switchdev_event_work
      is always a DSA slave device.
      
      Therefore, it is redundant to pass struct dsa_switch and int port
      information in the deferred work structure. This can be retrieved at all
      times from the provided struct net_device via dsa_slave_to_port().
      
      For the same reason, we can drop the dsa_is_user_port() check in
      dsa_fdb_offload_notify().
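
      In other words, inside the deferred work everything can be rederived
      from the slave net_device:

          struct dsa_port *dp = dsa_slave_to_port(dev);
          struct dsa_switch *ds = dp->ds;
          int port = dp->index;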
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: switchdev: remove lag_mod_cb from switchdev_handle_fdb_event_to_device · ec638740
      Authored by Vladimir Oltean
      When the switchdev_handle_fdb_event_to_device() event replication helper
      was created, my original thought was that FDB events on LAG interfaces
      should most likely be special-cased, not just replicated towards all
      switchdev ports beneath that LAG. So this replication helper currently
      does not recurse through switchdev lower interfaces of LAG bridge ports,
      but rather calls the lag_mod_cb() if that was provided.
      
      No switchdev driver uses this helper for FDB events on LAG interfaces
      yet, so that was an assumption which was yet to be tested. It is
      certainly usable for that purpose, as my RFC series shows:
      
      https://patchwork.kernel.org/project/netdevbpf/cover/20220210125201.2859463-1-vladimir.oltean@nxp.com/
      
      however this approach is slightly convoluted because:
      
      - the switchdev driver gets a "dev" that isn't its own net device, but
        rather the LAG net device. It must call switchdev_lower_dev_find(dev)
        in order to get a handle on any of its own net devices (the ones that
        pass check_cb).
      
      - in order for FDB entries on LAG ports to be correctly refcounted per
        the number of switchdev ports beneath that LAG, we haven't escaped the
        need to iterate through the LAG's lower interfaces. Except that is now
        the responsibility of the switchdev driver, because the replication
        helper just stopped half-way.
      
      So, even though yes, FDB events on LAG bridge ports must be
      special-cased, in the end it's simpler to let switchdev_handle_fdb_*
      just iterate through the LAG port's switchdev lowers, and let the
      switchdev driver figure out that those physical ports are under a LAG.
      
      The switchdev_handle_fdb_event_to_device() helper takes a
      "foreign_dev_check" callback so it can figure out whether @dev can
      autonomously forward to @foreign_dev. DSA fills this method properly:
      if the LAG is offloaded by another port in the same tree as @dev, then
      it isn't foreign. If it is a software LAG, it is foreign - forwarding
      happens in software.
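
      A sketch of how DSA could fill that callback in (the tree-level
      helper names are assumed):

          static bool dsa_foreign_dev_check(const struct net_device *dev,
                                            const struct net_device *foreign_dev)
          {
              const struct dsa_port *dp = dsa_slave_to_port(dev);
              struct dsa_switch_tree *dst = dp->ds->dst;

              if (netif_is_bridge_master(foreign_dev))
                  return !dsa_tree_offloads_bridge_dev(dst, foreign_dev);

              if (netif_is_bridge_port(foreign_dev))
                  return !dsa_tree_offloads_bridge_port(dst, foreign_dev);

              /* Everything else is foreign */
              return true;
          }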
      
      Whether an interface is foreign or not decides whether the replication
      helper will go through the LAG's switchdev lowers or not. Since the
      lan966x doesn't properly fill this out, FDB events for software LAG
      uppers would now be delivered to it. By changing
      lan966x_foreign_dev_check(), we can suppress them.
      
      DSA, on the other hand, will now start receiving FDB events for its
      offloaded LAG uppers, so we return -EOPNOTSUPP for those, since we
      currently don't do the right thing for them.
      
      Cc: Horatiu Vultur <horatiu.vultur@microchip.com>
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: dsa: create a dsa_lag structure · dedd6a00
      Authored by Vladimir Oltean
      The main purpose of this change is to create a data structure for a LAG
      as seen by DSA. This is similar to what we have for bridging - we pass a
      copy of this structure by value to ->port_lag_join and ->port_lag_leave.
      For now we keep the lag_dev, id and a reference count in it. Future
      patches will add a list of FDB entries for the LAG (these also need to
      be refcounted to work properly).
      
      The LAG structure is created using dsa_port_lag_create() and destroyed
      using dsa_port_lag_destroy(), just like we have for bridging.
      
      Because the dsa_lag itself is now refcounted, we can simplify
      dsa_lag_map() and dsa_lag_unmap(). These functions need to keep a LAG in
      the dst->lags array only as long as at least one port uses it. The
      refcounting logic inside those functions can be removed now - they are
      called only when we should perform the operation.
      
      dsa_lag_dev() is renamed to dsa_lag_by_id() and now returns the dsa_lag
      structure instead of the lag_dev net_device.
      
      dsa_lag_foreach_port() now takes the dsa_lag structure as argument.
      
      dst->lags holds an array of dsa_lag structures.
      
      dsa_lag_map() now also saves the dsa_lag->id value, so that linear
      walking of dst->lags in drivers using dsa_lag_id() is no longer
      necessary. They can just look at lag.id.
      
      dsa_port_lag_id_get() is a helper, similar to dsa_port_bridge_num_get(),
      which can be used by drivers to get the LAG ID assigned by DSA to a
      given port.
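
      A sketch of that helper, assuming dp->lag points to the port's
      dsa_lag (or is NULL when the port is not in a LAG):

          static inline int dsa_port_lag_id_get(struct dsa_port *dp)
          {
              return dp->lag ? dp->lag->id : -1;
          }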
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  16. 16 February 2022 (2 commits)
    • net: dsa: offload bridge port VLANs on foreign interfaces · 164f861b
      Authored by Vladimir Oltean
      DSA now explicitly handles VLANs installed with the 'self' flag on the
      bridge as host VLANs, instead of just replicating every bridge port VLAN
      also on the CPU port and never deleting it, which is what it did before.
      
      However, this leaves a corner case uncovered, as explained by
      Tobias Waldekranz:
      https://patchwork.kernel.org/project/netdevbpf/patch/20220209213044.2353153-6-vladimir.oltean@nxp.com/#24735260
      
      Forwarding towards a bridge port VLAN installed on a bridge port foreign
      to DSA (separate NIC, Wi-Fi AP) used to work by virtue of the fact that
      DSA itself needed to have at least one port in that VLAN (therefore, it
      also had the CPU port in said VLAN). However, now that the CPU port may
      not be member of all VLANs that user ports are members of, we need to
      ensure this isn't the case if software forwarding to a foreign interface
      is required.
      
      The solution is to treat bridge port VLANs on standalone interfaces in
      the exact same way as host VLANs. From DSA's perspective, there is no
      difference between local termination and software forwarding; packets in
      that VLAN must reach the CPU in both cases.
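
      Sketched at the level of the switchdev VLAN handler (assumed helper
      names):

          /* Bridge port VLANs on our own user ports are forwarding VLANs;
           * everything else (VLANs on the bridge itself or on foreign
           * bridge ports) must reach the CPU for termination or software
           * forwarding. */
          if (dsa_port_offloads_bridge_port(dp, orig_dev))
              err = dsa_port_vlan_add(dp, vlan, extack);
          else
              err = dsa_port_host_vlan_add(dp, vlan, extack);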
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: add explicit support for host bridge VLANs · 134ef238
      Authored by Vladimir Oltean
      Currently, DSA programs VLANs on shared (DSA and CPU) ports each time it
      does so on user ports. This is good for basic functionality but has
      several limitations:
      
      - the VLAN group which must reach the CPU may be radically different
        from the VLAN group that must be autonomously forwarded by the switch.
        In other words, the admin may want to isolate noisy stations and avoid
        traffic from them going to the control processor of the switch, where
        it would just waste cycles. The bridge already supports
        independent control of VLAN groups on bridge ports and on the bridge
        itself, and when VLAN-aware, it will drop packets in software anyway
        if their VID isn't added as a 'self' entry towards the bridge device.
      
      - Replaying host FDB entries may depend, for some drivers like mv88e6xxx,
        on replaying the host VLANs as well. The 2 VLAN groups are
        approximately the same in most regular cases, but there are corner
        cases when timing matters, and DSA's approximation of replicating
        VLANs on shared ports simply does not work.
      
      - If a user makes the bridge (implicitly the CPU port) join a VLAN by
        accident, there is no way for the CPU port to isolate itself from that
        noisy VLAN except by rebooting the system. This is because for each
        VLAN added on a user port, DSA will add it on shared ports too, but
        for each VLAN deletion on a user port, it will remain installed on
        shared ports, since DSA has no good indication of whether the VLAN is
        still in use or not.
      
      Now that the bridge driver emits well-balanced SWITCHDEV_OBJ_ID_PORT_VLAN
      addition and removal events, DSA has a simple and straightforward task
      of separating the bridge port VLANs (these have an orig_dev which is a
      DSA slave interface, or a LAG interface) from the host VLANs (these have
      an orig_dev which is a bridge interface), and to keep a simple reference
      count of each VID on each shared port.
      
      Forwarding VLANs must be installed on the bridge ports and on all DSA
      ports interconnecting them. We don't have a good view of the exact
      topology, so we simply install forwarding VLANs on all DSA ports, which
      is what has been done until now.
      
      Host VLANs must be installed primarily on the dedicated CPU port of each
      bridge port. More subtly, they must also be installed on upstream-facing
      and downstream-facing DSA ports that are connecting the bridge ports and
      the CPU. This ensures that the mv88e6xxx's problem (VID of host FDB
      entry may be absent from VTU) is still addressed even if that switch is
      in a cross-chip setup, and it has no local CPU port.
      
      Therefore:
      - user ports contain only bridge port (forwarding) VLANs, and no
        refcounting is necessary
      - DSA ports contain both forwarding and host VLANs. Refcounting is
        necessary among these 2 types.
      - CPU ports contain only host VLANs. Refcounting is also necessary.
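
      A sketch of the per-port VLAN refcounting this implies on shared
      ports (illustrative types):

          /* One refcounted entry per VID installed on a shared port */
          struct dsa_vlan {
              u16 vid;
              refcount_t refcount;
              struct list_head list;
          };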
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>