1. 02 7月, 2022 1 次提交
  2. 30 6月, 2022 1 次提交
  3. 27 6月, 2022 3 次提交
  4. 10 6月, 2022 1 次提交
  5. 23 5月, 2022 1 次提交
  6. 13 5月, 2022 2 次提交
    • V
      net: dsa: remove port argument from ->change_tag_protocol() · bacf93b0
      Vladimir Oltean 提交于
      DSA has not supported (and probably will not support in the future
      either) independent tagging protocols per CPU port.
      
      Different switch drivers have different requirements, some may need to
      replicate some settings for each CPU port, some may need to apply some
      settings on a single CPU port, while some may have to configure some
      global settings and then some per-CPU-port settings.
      
      In any case, the current model where DSA calls ->change_tag_protocol for
      each CPU port turns out to be impractical for drivers where there are
      global things to be done. For example, felix calls dsa_tag_8021q_register(),
      which makes no sense per CPU port, so it suppresses the second call.
      
      Let drivers deal with replication towards all CPU ports, and remove the
      CPU port argument from the function prototype.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Acked-by: NLuiz Angelo Daros de Luca <luizluca@gmail.com>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      bacf93b0
    • V
      net: dsa: felix: manage host flooding using a specific driver callback · 72c3b0c7
      Vladimir Oltean 提交于
      At the time - commit 7569459a ("net: dsa: manage flooding on the CPU
      ports") - not introducing a dedicated switch callback for host flooding
      made sense, because for the only user, the felix driver, there was
      nothing different to do for the CPU port than set the flood flags on the
      CPU port just like on any other bridge port.
      
      There are 2 reasons why this approach is not good enough, however.
      
      (1) Other drivers, like sja1105, support configuring flooding as a
          function of {ingress port, egress port}, whereas the DSA
          ->port_bridge_flags() function only operates on an egress port.
          So with that driver we'd have useless host flooding from user ports
          which don't need it.
      
      (2) Even with the felix driver, support for multiple CPU ports makes it
          difficult to piggyback on ->port_bridge_flags(). The way in which
          the felix driver is going to support host-filtered addresses with
          multiple CPU ports is that it will direct these addresses towards
          both CPU ports (in a sort of multicast fashion), then restrict the
          forwarding to only one of the two using the forwarding masks.
          Consequently, flooding will also be enabled towards both CPU ports.
          However, ->port_bridge_flags() gets passed the index of a single CPU
          port, and that leaves the flood settings out of sync between the 2
          CPU ports.
      
      This is to say, it's better to have a specific driver method for host
      flooding, which takes the user port as argument. This solves problem (1)
      by allowing the driver to do different things for different user ports,
      and problem (2) by abstracting the operation and letting the driver do
      whatever, rather than explicitly making the DSA core point to the CPU
      port it thinks needs to be touched.
      
      This new method also creates a problem, which is that cross-chip setups
      are not handled. However I don't have hardware right now where I can
      test what is the proper thing to do, and there isn't hardware compatible
      with multi-switch trees that supports host flooding. So it remains a
      problem to be tackled in the future.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      72c3b0c7
  7. 10 5月, 2022 1 次提交
    • V
      net: dsa: flush switchdev workqueue on bridge join error path · 630fd482
      Vladimir Oltean 提交于
      There is a race between switchdev_bridge_port_offload() and the
      dsa_port_switchdev_sync_attrs() call right below it.
      
      When switchdev_bridge_port_offload() finishes, FDB entries have been
      replayed by the bridge, but are scheduled for deferred execution later.
      
      However dsa_port_switchdev_sync_attrs -> dsa_port_can_apply_vlan_filtering()
      may impose restrictions on the vlan_filtering attribute and refuse
      offloading.
      
      When this happens, the delayed FDB entries will dereference dp->bridge,
      which is a NULL pointer because we have stopped the process of
      offloading this bridge.
      
      Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
      Workqueue: dsa_ordered dsa_slave_switchdev_event_work
      pc : dsa_port_bridge_host_fdb_del+0x64/0x100
      lr : dsa_slave_switchdev_event_work+0x130/0x1bc
      Call trace:
       dsa_port_bridge_host_fdb_del+0x64/0x100
       dsa_slave_switchdev_event_work+0x130/0x1bc
       process_one_work+0x294/0x670
       worker_thread+0x80/0x460
      ---[ end trace 0000000000000000 ]---
      Error: dsa_core: Must first remove VLAN uppers having VIDs also present in bridge.
      
      Fix the bug by doing what we do on the normal bridge leave path as well,
      which is to wait until the deferred FDB entries complete executing, then
      exit.
      
      The placement of dsa_flush_workqueue() after switchdev_bridge_port_unoffload()
      guarantees that both the FDB additions and deletions on rollback are waited for.
      
      Fixes: d7d0d423 ("net: dsa: flush switchdev workqueue when leaving the bridge")
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Link: https://lore.kernel.org/r/20220507134550.1849834-1-vladimir.oltean@nxp.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
      630fd482
  8. 07 5月, 2022 1 次提交
  9. 25 4月, 2022 2 次提交
  10. 22 4月, 2022 1 次提交
  11. 20 4月, 2022 6 次提交
  12. 19 4月, 2022 1 次提交
  13. 13 4月, 2022 1 次提交
    • V
      Revert "net: dsa: setup master before ports" · 762c2998
      Vladimir Oltean 提交于
      This reverts commit 11fd667d.
      
      dsa_slave_change_mtu() updates the MTU of the DSA master and of the
      associated CPU port, but only if it detects a change to the master MTU.
      
      The blamed commit in the Fixes: tag below addressed a regression where
      dsa_slave_change_mtu() would return early and not do anything due to
      ds->ops->port_change_mtu() not being implemented.
      
      However, that commit also had the effect that the master MTU got set up
      to the correct value by dsa_master_setup(), but the associated CPU port's
      MTU did not get updated. This causes breakage for drivers that rely on
      the ->port_change_mtu() DSA call to account for the tagging overhead on
      the CPU port, and don't set up the initial MTU during the setup phase.
      
      Things actually worked before because they were in a fragile equilibrium
      where dsa_slave_change_mtu() was called before dsa_master_setup() was.
      So dsa_slave_change_mtu() could actually detect a change and update the
      CPU port MTU too.
      
      Restore the code to the way things used to work by reverting the reorder
      of dsa_tree_setup_master() and dsa_tree_setup_ports(). That change did
      not have a concrete motivation going for it anyway, it just looked
      better.
      
      Fixes: 066dfc42 ("Revert "net: dsa: stop updating master MTU from master.c"")
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      762c2998
  14. 01 4月, 2022 1 次提交
  15. 23 3月, 2022 1 次提交
  16. 22 3月, 2022 1 次提交
    • V
      net: dsa: fix panic on shutdown if multi-chip tree failed to probe · 8fd36358
      Vladimir Oltean 提交于
      DSA probing is atypical because a tree of devices must probe all at
      once, so out of N switches which call dsa_tree_setup_routing_table()
      during probe, for (N - 1) of them, "complete" will return false and they
      will exit probing early. The Nth switch will set up the whole tree on
      their behalf.
      
      The implication is that for (N - 1) switches, the driver binds to the
      device successfully, without doing anything. When the driver is bound,
      the ->shutdown() method may run. But if the Nth switch has failed to
      initialize the tree, there is nothing to do for the (N - 1) driver
      instances, since the slave devices have not been created, etc. Moreover,
      dsa_switch_shutdown() expects that the calling @ds has been in fact
      initialized, so it jumps at dereferencing the various data structures,
      which is incorrect.
      
      Avoid the ensuing NULL pointer dereferences by simply checking whether
      the Nth switch has previously set "ds->setup = true" for the switch
      which is currently shutting down. The entire setup is serialized under
      dsa2_mutex which we already hold.
      
      Fixes: 0650bf52 ("net: dsa: be compatible with masters which unregister on shutdown")
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Link: https://lore.kernel.org/r/20220318195443.275026-1-vladimir.oltean@nxp.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
      8fd36358
  17. 18 3月, 2022 4 次提交
  18. 17 3月, 2022 2 次提交
  19. 14 3月, 2022 2 次提交
    • V
      net: dsa: report and change port dscp priority using dcbnl · 47d75f78
      Vladimir Oltean 提交于
      Similar to the port-based default priority, IEEE 802.1Q-2018 allows the
      Application Priority Table to define QoS classes (0 to 7) per IP DSCP
      value (0 to 63).
      
      In the absence of an app table entry for a packet with DSCP value X,
      QoS classification for that packet falls back to other methods (VLAN PCP
      or port-based default). The presence of an app table for DSCP value X
      with priority Y makes the hardware classify the packet to QoS class Y.
      
      As opposed to the default-prio where DSA exposes only a "set" in
      dsa_switch_ops (because the port-based default is the fallback, it
      always exists, either implicitly or explicitly), for DSCP priorities we
      expose an "add" and a "del". The addition of a DSCP entry means trusting
      that DSCP priority, the deletion means ignoring it.
      
      Drivers that already trust (at least some) DSCP values can describe
      their configuration in dsa_switch_ops :: port_get_dscp_prio(), which is
      called for each DSCP value from 0 to 63.
      
      Again, there can be more than one dcbnl app table entry for the same
      DSCP value, DSA chooses the one with the largest configured priority.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      47d75f78
    • V
      net: dsa: report and change port default priority using dcbnl · d538eca8
      Vladimir Oltean 提交于
      The port-based default QoS class is assigned to packets that lack a
      VLAN PCP (or the port is configured to not trust the VLAN PCP),
      an IP DSCP (or the port is configured to not trust IP DSCP), and packets
      on which no tc-skbedit action has matched.
      
      Similar to other drivers, this can be exposed to user space using the
      DCB Application Priority Table. IEEE 802.1Q-2018 specifies in Table
      D-8 - Sel field values that when the Selector is 1, the Protocol ID
      value of 0 denotes the "Default application priority. For use when
      application priority is not otherwise specified."
      
      The way in which the dcbnl integration in DSA has been designed has to
      do with its requirements. Andrew Lunn explains that SOHO switches are
      expected to come with some sort of pre-configured QoS profile, and that
      it is desirable for this to come pre-loaded into the DSA slave interfaces'
      DCB application priority table.
      
      In the dcbnl design, this is possible because calls to dcb_ieee_setapp()
      can be initiated by anyone including being self-initiated by this device
      driver.
      
      However, what makes this challenging to implement in DSA is that the DSA
      core manages the net_devices (effectively hiding them from drivers),
      while drivers manage the hardware. The DSA core has no knowledge of what
      individual drivers' QoS policies are. DSA could export to drivers a
      wrapper over dcb_ieee_setapp() and these could call that function to
      pre-populate the app priority table, however drivers don't have a good
      moment in time to do this. The dsa_switch_ops :: setup() method gets
      called before the net_devices are created (dsa_slave_create), and so is
      dsa_switch_ops :: port_setup(). What remains is dsa_switch_ops ::
      port_enable(), but this gets called upon each ndo_open. If we add app
      table entries on every open, we'd need to remove them on close, to avoid
      duplicate entry errors. But if we delete app priority entries on close,
      what we delete may not be the initial, driver pre-populated entries, but
      rather user-added entries.
      
      So it is clear that letting drivers choose the timing of the
      dcb_ieee_setapp() call is inappropriate. The alternative which was
      chosen is to introduce hardware-specific ops in dsa_switch_ops, and
      effectively hide dcbnl details from drivers as well. For pre-populating
      the application table, dsa_slave_dcbnl_init() will call
      ds->ops->port_get_default_prio() which is supposed to read from
      hardware. If the operation succeeds, DSA creates a default-prio app
      table entry. The method is called as soon as the slave_dev is
      registered, but before we release the rtnl_mutex. This is done such that
      user space sees the app table entries as soon as it sees the interface
      being registered.
      
      The fact that we populate slave_dev->dcbnl_ops with a non-NULL pointer
      changes behavior in dcb_doit() from net/dcb/dcbnl.c, which used to
      return -EOPNOTSUPP for any dcbnl operation where netdev->dcbnl_ops is
      NULL. Because there are still dcbnl-unaware DSA drivers even if they
      have dcbnl_ops populated, the way to restore the behavior is to make all
      dcbnl_ops return -EOPNOTSUPP on absence of the hardware-specific
      dsa_switch_ops method.
      
      The dcbnl framework absurdly allows there to be more than one app table
      entry for the same selector and protocol (in other words, more than one
      port-based default priority). In the iproute2 dcb program, there is a
      "replace" syntactical sugar command which performs an "add" and a "del"
      to hide this away. But we choose the largest configured priority when we
      call ds->ops->port_set_default_prio(), using __fls(). When there is no
      default-prio app table entry left, the port-default priority is restored
      to 0.
      
      Link: https://patchwork.kernel.org/project/netdevbpf/patch/20210113154139.1803705-2-olteanv@gmail.com/Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d538eca8
  20. 10 3月, 2022 1 次提交
  21. 09 3月, 2022 4 次提交
    • V
      net: dsa: felix: avoid early deletion of host FDB entries · 7e580490
      Vladimir Oltean 提交于
      The Felix driver declares FDB isolation but puts all standalone ports in
      VID 0. This is mostly problem-free as discussed with Alvin here:
      https://patchwork.kernel.org/project/netdevbpf/cover/20220302191417.1288145-1-vladimir.oltean@nxp.com/#24763870
      
      however there is one catch. DSA still thinks that FDB entries are
      installed on the CPU port as many times as there are user ports, and
      this is problematic when multiple user ports share the same MAC address.
      
      Consider the default case where all user ports inherit their MAC address
      from the DSA master, and then the user runs:
      
      ip link set swp0 address 00:01:02:03:04:05
      
      The above will make dsa_slave_set_mac_address() call
      dsa_port_standalone_host_fdb_add() for 00:01:02:03:04:05 in port 0's
      standalone database, and dsa_port_standalone_host_fdb_del() for the old
      address of swp0, again in swp0's standalone database.
      
      Both the ->port_fdb_add() and ->port_fdb_del() will be propagated down
      to the felix driver, which will end up deleting the old MAC address from
      the CPU port. But this is still in use by other user ports, so we end up
      breaking unicast termination for them.
      
      There isn't a problem in the fact that DSA keeps track of host
      standalone addresses in the individual database of each user port: some
      drivers like sja1105 need this. There also isn't a problem in the fact
      that some drivers choose the same VID/FID for all standalone ports.
      It is just that the deletion of these host addresses must be delayed
      until they are known to not be in use any longer, and only the driver
      has this knowledge. Since DSA keeps these addresses in &cpu_dp->fdbs and
      &cpu_db->mdbs, it is just a matter of walking over those lists and see
      whether the same MAC address is present on the CPU port in the port db
      of another user port.
      
      I have considered reusing the generic dsa_port_walk_fdbs() and
      dsa_port_walk_mdbs() schemes for this, but locking makes it difficult.
      In the ->port_fdb_add() method and co, &dp->addr_lists_lock is held, but
      dsa_port_walk_fdbs() also acquires that lock. Also, even assuming that
      we introduce an unlocked variant of the address iterator, we'd still
      need some relatively complex data structures, and a void *ctx in the
      dsa_fdb_walk_cb_t which we don't currently pass, such that drivers are
      able to figure out, after iterating, whether the same MAC address is or
      isn't present in the port db of another port.
      
      All the above, plus the fact that I expect other drivers to follow the
      same model as felix where all standalone ports use the same FID, made me
      conclude that a generic method provided by DSA is necessary:
      dsa_fdb_present_in_other_db() and the mdb equivalent. Felix calls this
      from the ->port_fdb_del() handler for the CPU port, when the database
      was classified to either a port db, or a LAG db.
      
      For symmetry, we also call this from ->port_fdb_add(), because if the
      address was installed once, then installing it a second time serves no
      purpose: it's already in hardware in VID 0 and it affects all standalone
      ports.
      
      This change moves dsa_db_equal() from switch.c to dsa.c, since it now
      has one more caller.
      
      Fixes: 54c31984 ("net: mscc: ocelot: enforce FDB isolation when VLAN-unaware")
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7e580490
    • V
      net: dsa: be mostly no-op in dsa_slave_set_mac_address when down · e2d0576f
      Vladimir Oltean 提交于
      Since the slave unicast address is synced to hardware and to the DSA
      master during dsa_slave_open(), this means that a call to
      dsa_slave_set_mac_address() while the slave interface is down will
      result to a call to dsa_port_standalone_host_fdb_del() and to
      dev_uc_del() for the MAC address while there was no previous
      dsa_port_standalone_host_fdb_add() or dev_uc_add().
      
      This is a partial revert of the blamed commit below, which was too
      aggressive.
      
      Fixes: 35aae5ab ("net: dsa: remove workarounds for changing master promisc/allmulti only while up")
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e2d0576f
    • V
      net: dsa: move port lists initialization to dsa_port_touch · fe95784f
      Vladimir Oltean 提交于
      &cpu_db->fdbs and &cpu_db->mdbs may be uninitialized lists during some
      call paths of felix_set_tag_protocol().
      
      There was an attempt to avoid calling dsa_port_walk_fdbs() during setup
      by using a "bool change" in the felix driver, but this doesn't work when
      the tagging protocol is defined in the device tree, and a change is
      triggered by DSA at pseudo-runtime:
      
      dsa_tree_setup_switches
      -> dsa_switch_setup
         -> dsa_switch_setup_tag_protocol
            -> ds->ops->change_tag_protocol
      dsa_tree_setup_ports
      -> dsa_port_setup
         -> &dp->fdbs and &db->mdbs only get initialized here
      
      So it seems like the only way to fix this is to move the initialization
      of these lists earlier.
      
      dsa_port_touch() is called from dsa_switch_touch_ports() which is called
      from dsa_switch_parse_of(), and this runs completely before
      dsa_tree_setup(). Similarly, dsa_switch_release_ports() runs after
      dsa_tree_teardown().
      
      Fixes: f9cef64f ("net: dsa: felix: migrate host FDB and MDB entries when changing tag proto")
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fe95784f
    • V
      net: dsa: warn if port lists aren't empty in dsa_port_teardown · 0832cd9f
      Vladimir Oltean 提交于
      There has been recent work towards matching each switchdev object
      addition with a corresponding deletion.
      
      Therefore, having elements in the fdbs, mdbs, vlans lists at the time of
      a shared (DSA, CPU) port's teardown is indicative of a bug somewhere
      else, and not something that is to be expected.
      
      We shouldn't try to silently paper over that. Instead, print a warning
      and a stack trace.
      
      This change is a prerequisite for moving the initialization/teardown of
      these lists. Make it clear that clearing the lists isn't needed.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0832cd9f
  22. 08 3月, 2022 1 次提交
  23. 07 3月, 2022 1 次提交