1. 27 2月, 2022 2 次提交
    • V
      net: dsa: pass extack to .port_bridge_join driver methods · 06b9cce4
      Vladimir Oltean 提交于
      As FDB isolation cannot be enforced between VLAN-aware bridges in lack
      of hardware assistance like extra FID bits, it seems plausible that many
      DSA switches cannot do it. Therefore, they need to reject configurations
      with multiple VLAN-aware bridges from the two code paths that can
      transition towards that state:
      
      - joining a VLAN-aware bridge
      - toggling VLAN awareness on an existing bridge
      
      The .port_vlan_filtering method already propagates the netlink extack to
      the driver, let's propagate it from .port_bridge_join too, to make sure
      that the driver can use the same function for both.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      06b9cce4
    • V
      net: dsa: request drivers to perform FDB isolation · c2693363
      Vladimir Oltean 提交于
      For DSA, to encourage drivers to perform FDB isolation simply means to
      track which bridge does each FDB and MDB entry belong to. It then
      becomes the driver responsibility to use something that makes the FDB
      entry from one bridge not match the FDB lookup of ports from other
      bridges.
      
      The top-level functions where the bridge is determined are:
      - dsa_port_fdb_{add,del}
      - dsa_port_host_fdb_{add,del}
      - dsa_port_mdb_{add,del}
      - dsa_port_host_mdb_{add,del}
      
      aka the pre-crosschip-notifier functions.
      
      Changing the API to pass a reference to a bridge is not superfluous, and
      looking at the passed bridge argument is not the same as having the
      driver look at dsa_to_port(ds, port)->bridge from the ->port_fdb_add()
      method.
      
      DSA installs FDB and MDB entries on shared (CPU and DSA) ports as well,
      and those do not have any dp->bridge information to retrieve, because
      they are not in any bridge - they are merely the pipes that serve the
      user ports that are in one or multiple bridges.
      
      The struct dsa_bridge associated with each FDB/MDB entry is encapsulated
      in a larger "struct dsa_db" database. Although only databases associated
      to bridges are notified for now, this API will be the starting point for
      implementing IFF_UNICAST_FLT in DSA. There, the idea is to install FDB
      entries on the CPU port which belong to the corresponding user port's
      port database. These are supposed to match only when the port is
      standalone.
      
      It is better to introduce the API in its expected final form than to
      introduce it for bridges first, then to have to change drivers which may
      have made one or more assumptions.
      
      Drivers can use the provided bridge.num, but they can also use a
      different numbering scheme that is more convenient.
      
      DSA must perform refcounting on the CPU and DSA ports by also taking
      into account the bridge number. So if two bridges request the same local
      address, DSA must notify the driver twice, once for each bridge.
      
      In fact, if the driver supports FDB isolation, DSA must perform
      refcounting per bridge, but if the driver doesn't, DSA must refcount
      host addresses across all bridges, otherwise it would be telling the
      driver to delete an FDB entry for a bridge and the driver would delete
      it for all bridges. So introduce a bool fdb_isolation in drivers which
      would make all bridge databases passed to the cross-chip notifier have
      the same number (0). This makes dsa_mac_addr_find() -> dsa_db_equal()
      say that all bridge databases are the same database - which is
      essentially the legacy behavior.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c2693363
  2. 25 2月, 2022 4 次提交
    • V
      net: dsa: create a dsa_lag structure · dedd6a00
      Vladimir Oltean 提交于
      The main purpose of this change is to create a data structure for a LAG
      as seen by DSA. This is similar to what we have for bridging - we pass a
      copy of this structure by value to ->port_lag_join and ->port_lag_leave.
      For now we keep the lag_dev, id and a reference count in it. Future
      patches will add a list of FDB entries for the LAG (these also need to
      be refcounted to work properly).
      
      The LAG structure is created using dsa_port_lag_create() and destroyed
      using dsa_port_lag_destroy(), just like we have for bridging.
      
      Because now, the dsa_lag itself is refcounted, we can simplify
      dsa_lag_map() and dsa_lag_unmap(). These functions need to keep a LAG in
      the dst->lags array only as long as at least one port uses it. The
      refcounting logic inside those functions can be removed now - they are
      called only when we should perform the operation.
      
      dsa_lag_dev() is renamed to dsa_lag_by_id() and now returns the dsa_lag
      structure instead of the lag_dev net_device.
      
      dsa_lag_foreach_port() now takes the dsa_lag structure as argument.
      
      dst->lags holds an array of dsa_lag structures.
      
      dsa_lag_map() now also saves the dsa_lag->id value, so that linear
      walking of dst->lags in drivers using dsa_lag_id() is no longer
      necessary. They can just look at lag.id.
      
      dsa_port_lag_id_get() is a helper, similar to dsa_port_bridge_num_get(),
      which can be used by drivers to get the LAG ID assigned by DSA to a
      given port.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      dedd6a00
    • V
      net: dsa: mv88e6xxx: use dsa_switch_for_each_port in mv88e6xxx_lag_sync_masks · b99dbdf0
      Vladimir Oltean 提交于
      Make the intent of the code more clear by using the dedicated helper for
      iterating over the ports of a switch.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      b99dbdf0
    • V
      net: dsa: make LAG IDs one-based · 3d4a0a2a
      Vladimir Oltean 提交于
      The DSA LAG API will be changed to become more similar with the bridge
      data structures, where struct dsa_bridge holds an unsigned int num,
      which is generated by DSA and is one-based. We have a similar thing
      going with the DSA LAG, except that isn't stored anywhere, it is
      calculated dynamically by dsa_lag_id() by iterating through dst->lags.
      
      The idea of encoding an invalid (or not requested) LAG ID as zero for
      the purpose of simplifying checks in drivers means that the LAG IDs
      passed by DSA to drivers need to be one-based too. So back-and-forth
      conversion is needed when indexing the dst->lags array, as well as in
      drivers which assume a zero-based index.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      3d4a0a2a
    • V
      net: dsa: mv88e6xxx: rename references to "lag" as "lag_dev" · e23eba72
      Vladimir Oltean 提交于
      In preparation of converting struct net_device *dp->lag_dev into a
      struct dsa_lag *dp->lag, we need to rename, for consistency purposes,
      all occurrences of the "lag" variable in mv88e6xxx to "lag_dev".
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      e23eba72
  3. 23 2月, 2022 1 次提交
  4. 15 2月, 2022 1 次提交
  5. 14 2月, 2022 1 次提交
  6. 11 2月, 2022 2 次提交
  7. 09 2月, 2022 1 次提交
    • V
      net: dsa: mv88e6xxx: don't use devres for mdiobus · f53a2ce8
      Vladimir Oltean 提交于
      As explained in commits:
      74b6d7d1 ("net: dsa: realtek: register the MDIO bus under devres")
      5135e96a ("net: dsa: don't allocate the slave_mii_bus using devres")
      
      mdiobus_free() will panic when called from devm_mdiobus_free() <-
      devres_release_all() <- __device_release_driver(), and that mdiobus was
      not previously unregistered.
      
      The mv88e6xxx is an MDIO device, so the initial set of constraints that
      I thought would cause this (I2C or SPI buses which call ->remove on
      ->shutdown) do not apply. But there is one more which applies here.
      
      If the DSA master itself is on a bus that calls ->remove from ->shutdown
      (like dpaa2-eth, which is on the fsl-mc bus), there is a device link
      between the switch and the DSA master, and device_links_unbind_consumers()
      will unbind the Marvell switch driver on shutdown.
      
      systemd-shutdown[1]: Powering off.
      mv88e6085 0x0000000008b96000:00 sw_gl0: Link is Down
      fsl-mc dpbp.9: Removing from iommu group 7
      fsl-mc dpbp.8: Removing from iommu group 7
      ------------[ cut here ]------------
      kernel BUG at drivers/net/phy/mdio_bus.c:677!
      Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
      Modules linked in:
      CPU: 0 PID: 1 Comm: systemd-shutdow Not tainted 5.16.5-00040-gdc05f73788e5 #15
      pc : mdiobus_free+0x44/0x50
      lr : devm_mdiobus_free+0x10/0x20
      Call trace:
       mdiobus_free+0x44/0x50
       devm_mdiobus_free+0x10/0x20
       devres_release_all+0xa0/0x100
       __device_release_driver+0x190/0x220
       device_release_driver_internal+0xac/0xb0
       device_links_unbind_consumers+0xd4/0x100
       __device_release_driver+0x4c/0x220
       device_release_driver_internal+0xac/0xb0
       device_links_unbind_consumers+0xd4/0x100
       __device_release_driver+0x94/0x220
       device_release_driver+0x28/0x40
       bus_remove_device+0x118/0x124
       device_del+0x174/0x420
       fsl_mc_device_remove+0x24/0x40
       __fsl_mc_device_remove+0xc/0x20
       device_for_each_child+0x58/0xa0
       dprc_remove+0x90/0xb0
       fsl_mc_driver_remove+0x20/0x5c
       __device_release_driver+0x21c/0x220
       device_release_driver+0x28/0x40
       bus_remove_device+0x118/0x124
       device_del+0x174/0x420
       fsl_mc_bus_remove+0x80/0x100
       fsl_mc_bus_shutdown+0xc/0x1c
       platform_shutdown+0x20/0x30
       device_shutdown+0x154/0x330
       kernel_power_off+0x34/0x6c
       __do_sys_reboot+0x15c/0x250
       __arm64_sys_reboot+0x20/0x30
       invoke_syscall.constprop.0+0x4c/0xe0
       do_el0_svc+0x4c/0x150
       el0_svc+0x24/0xb0
       el0t_64_sync_handler+0xa8/0xb0
       el0t_64_sync+0x178/0x17c
      
      So the same treatment must be applied to all DSA switch drivers, which
      is: either use devres for both the mdiobus allocation and registration,
      or don't use devres at all.
      
      The Marvell driver already has a good structure for mdiobus removal, so
      just plug in mdiobus_free and get rid of devres.
      
      Fixes: ac3a68d5 ("net: phy: don't abuse devres in devm_mdiobus_register()")
      Reported-by: NRafael Richter <Rafael.Richter@gin.de>
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Tested-by: NDaniel Klauer <daniel.klauer@gin.de>
      Reviewed-by: NAndrew Lunn <andrew@lunn.ch>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      f53a2ce8
  8. 07 2月, 2022 2 次提交
  9. 03 2月, 2022 5 次提交
  10. 31 1月, 2022 1 次提交
    • T
      net: dsa: mv88e6xxx: Improve performance of busy bit polling · 35da1dfd
      Tobias Waldekranz 提交于
      Avoid a long delay when a busy bit is still set and has to be polled
      again.
      
      Measurements on a system with 2 Opals (6097F) and one Agate (6352)
      show that even with this much tighter loop, we have about a 50% chance
      of the bit being cleared on the first poll, all other accesses see the
      bit being cleared on the second poll.
      
      On a standard MDIO bus running MDC at 2.5MHz, a single access with 32
      bits of preamble plus 32 bits of data takes 64*(1/2.5MHz) = 25.6us.
      
      This means that mv88e6xxx_smi_direct_wait took 26us + CPU overhead in
      the fast scenario, but 26us + 1500us + 26us + CPU overhead in the slow
      case - bringing the average close to 1ms.
      
      With this change in place, the slow case is closer to 2*26us + CPU
      overhead, with the average well below 100us - a 10x improvement.
      
      This translates to real-world winnings. On a 3-chip 20-port system,
      the modprobe time drops by 88%:
      
      Before:
      
      root@coronet:~# time modprobe mv88e6xxx
      real    0m 15.99s
      user    0m 0.00s
      sys     0m 1.52s
      
      After:
      
      root@coronet:~# time modprobe mv88e6xxx
      real    0m 2.21s
      user    0m 0.00s
      sys     0m 1.54s
      Signed-off-by: NTobias Waldekranz <tobias@waldekranz.com>
      Reviewed-by: NAndrew Lunn <andrew@lunn.ch>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      35da1dfd
  11. 13 12月, 2021 1 次提交
  12. 12 12月, 2021 1 次提交
  13. 10 12月, 2021 1 次提交
  14. 09 12月, 2021 9 次提交
    • R
      net: dsa: mv88e6xxx: fix "don't use PHY_DETECT on internal PHY's" · 2b29cb9e
      Russell King (Oracle) 提交于
      This commit fixes a misunderstanding in commit 4a3e0aed ("net: dsa:
      mv88e6xxx: don't use PHY_DETECT on internal PHY's").
      
      For Marvell DSA switches with the PHY_DETECT bit (for non-6250 family
      devices), controls whether the PPU polls the PHY to retrieve the link,
      speed, duplex and pause status to update the port configuration. This
      applies for both internal and external PHYs.
      
      For some switches such as 88E6352 and 88E6390X, PHY_DETECT has an
      additional function of enabling auto-media mode between the internal
      PHY and SERDES blocks depending on which first gains link.
      
      The original intention of commit 5d5b231d (net: dsa: mv88e6xxx: use
      PHY_DETECT in mac_link_up/mac_link_down) was to allow this bit to be
      used to detect when this propagation is enabled, and allow software to
      update the port configuration. This has found to be necessary for some
      switches which do not automatically propagate status from the SERDES to
      the port, which includes the 88E6390. However, commit 4a3e0aed
      ("net: dsa: mv88e6xxx: don't use PHY_DETECT on internal PHY's") breaks
      this assumption.
      
      Maarten Zanders has confirmed that the issue he was addressing was for
      an 88E6250 switch, which does not have a PHY_DETECT bit in bit 12, but
      instead a link status bit. Therefore, mv88e6xxx_port_ppu_updates() does
      not report correctly.
      
      This patch resolves the above issues by reverting Maarten's change and
      instead making mv88e6xxx_port_ppu_updates() indicate whether the port
      is internal for the 88E6250 family of switches.
      
        Yes, you're right, I'm targeting the 6250 family. And yes, your
        suggestion would solve my case and is a better implementation for
        the other devices (as far as I can see).
      
      Fixes: 4a3e0aed ("net: dsa: mv88e6xxx: don't use PHY_DETECT on internal PHY's")
      Signed-off-by: NRussell King <rmk+kernel@armlinux.org.uk>
      Tested-by: NMaarten Zanders <maarten.zanders@mind.be>
      Link: https://lore.kernel.org/r/E1muXm7-00EwJB-7n@rmk-PC.armlinux.org.ukSigned-off-by: NJakub Kicinski <kuba@kernel.org>
      2b29cb9e
    • V
      net: dsa: eliminate dsa_switch_ops :: port_bridge_tx_fwd_{,un}offload · 857fdd74
      Vladimir Oltean 提交于
      We don't really need new switch API for these, and with new switches
      which intend to add support for this feature, it will become cumbersome
      to maintain.
      
      The change consists in restructuring the two drivers that implement this
      offload (sja1105 and mv88e6xxx) such that the offload is enabled and
      disabled from the ->port_bridge_{join,leave} methods instead of the old
      ->port_bridge_tx_fwd_{,un}offload.
      
      The only non-trivial change is that mv88e6xxx_map_virtual_bridge_to_pvt()
      has been moved to avoid a forward declaration, and the
      mv88e6xxx_reg_lock() calls from inside it have been removed, since
      locking is now done from mv88e6xxx_port_bridge_{join,leave}.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NAlvin Šipraga <alsi@bang-olufsen.dk>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      857fdd74
    • V
      net: dsa: add a "tx_fwd_offload" argument to ->port_bridge_join · b079922b
      Vladimir Oltean 提交于
      This is a preparation patch for the removal of the DSA switch methods
      ->port_bridge_tx_fwd_offload() and ->port_bridge_tx_fwd_unoffload().
      The plan is for the switch to report whether it offloads TX forwarding
      directly as a response to the ->port_bridge_join() method.
      
      This change deals with the noisy portion of converting all existing
      function prototypes to take this new boolean pointer argument.
      The bool is placed in the cross-chip notifier structure for bridge join,
      and a reference to it is provided to drivers. In the next change, DSA
      will then actually look at this value instead of calling
      ->port_bridge_tx_fwd_offload().
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NAlvin Šipraga <alsi@bang-olufsen.dk>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      b079922b
    • V
      net: dsa: keep the bridge_dev and bridge_num as part of the same structure · d3eed0e5
      Vladimir Oltean 提交于
      The main desire behind this is to provide coherent bridge information to
      the fast path without locking.
      
      For example, right now we set dp->bridge_dev and dp->bridge_num from
      separate code paths, it is theoretically possible for a packet
      transmission to read these two port properties consecutively and find a
      bridge number which does not correspond with the bridge device.
      
      Another desire is to start passing more complex bridge information to
      dsa_switch_ops functions. For example, with FDB isolation, it is
      expected that drivers will need to be passed the bridge which requested
      an FDB/MDB entry to be offloaded, and along with that bridge_dev, the
      associated bridge_num should be passed too, in case the driver might
      want to implement an isolation scheme based on that number.
      
      We already pass the {bridge_dev, bridge_num} pair to the TX forwarding
      offload switch API, however we'd like to remove that and squash it into
      the basic bridge join/leave API. So that means we need to pass this
      pair to the bridge join/leave API.
      
      During dsa_port_bridge_leave, first we unset dp->bridge_dev, then we
      call the driver's .port_bridge_leave with what used to be our
      dp->bridge_dev, but provided as an argument.
      
      When bridge_dev and bridge_num get folded into a single structure, we
      need to preserve this behavior in dsa_port_bridge_leave: we need a copy
      of what used to be in dp->bridge.
      
      Switch drivers check bridge membership by comparing dp->bridge_dev with
      the provided bridge_dev, but now, if we provide the struct dsa_bridge as
      a pointer, they cannot keep comparing dp->bridge to the provided
      pointer, since this only points to an on-stack copy. To make this
      obvious and prevent driver writers from forgetting and doing stupid
      things, in this new API, the struct dsa_bridge is provided as a full
      structure (not very large, contains an int and a pointer) instead of a
      pointer. An explicit comparison function needs to be used to determine
      bridge membership: dsa_port_offloads_bridge().
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NAlvin Šipraga <alsi@bang-olufsen.dk>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      d3eed0e5
    • V
      net: dsa: hide dp->bridge_dev and dp->bridge_num in drivers behind helpers · 41fb0cf1
      Vladimir Oltean 提交于
      The location of the bridge device pointer and number is going to change.
      It is not going to be kept individually per port, but in a common
      structure allocated dynamically and which will have lockdep validation.
      
      Use the helpers to access these elements so that we have a migration
      path to the new organization.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      41fb0cf1
    • V
      net: dsa: mv88e6xxx: compute port vlan membership based on dp->bridge_dev comparison · 65144067
      Vladimir Oltean 提交于
      The goal of this change is to reduce mv88e6xxx_port_vlan() to a form
      where dsa_port_bridge_same() can be used, since the dp->bridge_dev
      pointer will be hidden in a future change.
      
      To do that, we observe that the "br" pointer is deduced from a
      dp->bridge_dev in both cases (of a physical switch port as well as a
      virtual bridge). So instead of keeping the "br" pointer, we can just
      keep the "dp" pointer from which "br" gets derived.
      
      In the last iteration over switch ports, we must use another iterator
      variable, "other_dp"since now we use the "dp" variable to keep an
      indirect reference to the bridge. While at it, the old code used to
      filter only the ports which were part of the same switch as "ds".
      There exists a dedicated DSA port iterator for that:
      dsa_switch_for_each_port (which skips the ports in the tree that belong
      to non-local switches), so we can just use that.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      65144067
    • V
      net: dsa: mv88e6xxx: iterate using dsa_switch_for_each_user_port in mv88e6xxx_port_check_hw_vlan · 0493fa79
      Vladimir Oltean 提交于
      Avoid a plethora of dsa_to_port() calls (some hidden behind
      dsa_is_*_port and some in plain sight) by keeping two struct dsa_port
      references: one to the port passed as argument, and another to the other
      ports of the switch that we're iterating over.
      
      This isn't called from the DSA initialization path, so there is no risk
      that we have user ports without a dp->slave populated. So the combined
      checks that a port isn't a DSA port, a CPU port, or doesn't have a slave
      net device (therefore is unused), are strictly equivalent to the simple
      check that the port is a user port. This is already handled by the DSA
      iterator.
      
      i gets replaced by other_dp->index, dsa_is_*_port calls get replaced by
      dsa_port_is_*, and dsa_to_port gets replaced by the respective pointer
      directly.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      0493fa79
    • V
      net: dsa: assign a bridge number even without TX forwarding offload · 947c8746
      Vladimir Oltean 提交于
      The service where DSA assigns a unique bridge number for each forwarding
      domain is useful even for drivers which do not implement the TX
      forwarding offload feature.
      
      For example, drivers might use the dp->bridge_num for FDB isolation.
      
      So rename ds->num_fwd_offloading_bridges to ds->max_num_bridges, and
      calculate a unique bridge_num for all drivers that set this value.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NAlvin Šipraga <alsi@bang-olufsen.dk>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      947c8746
    • V
      net: dsa: make dp->bridge_num one-based · 3f9bb030
      Vladimir Oltean 提交于
      I have seen too many bugs already due to the fact that we must encode an
      invalid dp->bridge_num as a negative value, because the natural tendency
      is to check that invalid value using (!dp->bridge_num). Latest example
      can be seen in commit 1bec0f05 ("net: dsa: fix bridge_num not
      getting cleared after ports leaving the bridge").
      
      Convert the existing users to assume that dp->bridge_num == 0 is the
      encoding for invalid, and valid bridge numbers start from 1.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NAlvin Šipraga <alsi@bang-olufsen.dk>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      3f9bb030
  15. 10 11月, 2021 1 次提交
  16. 24 10月, 2021 1 次提交
    • S
      net: convert users of bitmap_foo() to linkmode_foo() · 4973056c
      Sean Anderson 提交于
      This converts instances of
      	bitmap_foo(args..., __ETHTOOL_LINK_MODE_MASK_NBITS)
      to
      	linkmode_foo(args...)
      
      I manually fixed up some lines to prevent them from being excessively
      long. Otherwise, this change was generated with the following semantic
      patch:
      
      // Generated with
      // echo linux/linkmode.h > includes
      // git grep -Flf includes include/ | cut -f 2- -d / | cat includes - \
      // | sort | uniq | tee new_includes | wc -l && mv new_includes includes
      // and repeating until the number stopped going up
      @i@
      @@
      
      (
       #include <linux/acpi_mdio.h>
      |
       #include <linux/brcmphy.h>
      |
       #include <linux/dsa/loop.h>
      |
       #include <linux/dsa/sja1105.h>
      |
       #include <linux/ethtool.h>
      |
       #include <linux/ethtool_netlink.h>
      |
       #include <linux/fec.h>
      |
       #include <linux/fs_enet_pd.h>
      |
       #include <linux/fsl/enetc_mdio.h>
      |
       #include <linux/fwnode_mdio.h>
      |
       #include <linux/linkmode.h>
      |
       #include <linux/lsm_audit.h>
      |
       #include <linux/mdio-bitbang.h>
      |
       #include <linux/mdio.h>
      |
       #include <linux/mdio-mux.h>
      |
       #include <linux/mii.h>
      |
       #include <linux/mii_timestamper.h>
      |
       #include <linux/mlx5/accel.h>
      |
       #include <linux/mlx5/cq.h>
      |
       #include <linux/mlx5/device.h>
      |
       #include <linux/mlx5/driver.h>
      |
       #include <linux/mlx5/eswitch.h>
      |
       #include <linux/mlx5/fs.h>
      |
       #include <linux/mlx5/port.h>
      |
       #include <linux/mlx5/qp.h>
      |
       #include <linux/mlx5/rsc_dump.h>
      |
       #include <linux/mlx5/transobj.h>
      |
       #include <linux/mlx5/vport.h>
      |
       #include <linux/of_mdio.h>
      |
       #include <linux/of_net.h>
      |
       #include <linux/pcs-lynx.h>
      |
       #include <linux/pcs/pcs-xpcs.h>
      |
       #include <linux/phy.h>
      |
       #include <linux/phy_led_triggers.h>
      |
       #include <linux/phylink.h>
      |
       #include <linux/platform_data/bcmgenet.h>
      |
       #include <linux/platform_data/xilinx-ll-temac.h>
      |
       #include <linux/pxa168_eth.h>
      |
       #include <linux/qed/qed_eth_if.h>
      |
       #include <linux/qed/qed_fcoe_if.h>
      |
       #include <linux/qed/qed_if.h>
      |
       #include <linux/qed/qed_iov_if.h>
      |
       #include <linux/qed/qed_iscsi_if.h>
      |
       #include <linux/qed/qed_ll2_if.h>
      |
       #include <linux/qed/qed_nvmetcp_if.h>
      |
       #include <linux/qed/qed_rdma_if.h>
      |
       #include <linux/sfp.h>
      |
       #include <linux/sh_eth.h>
      |
       #include <linux/smsc911x.h>
      |
       #include <linux/soc/nxp/lpc32xx-misc.h>
      |
       #include <linux/stmmac.h>
      |
       #include <linux/sunrpc/svc_rdma.h>
      |
       #include <linux/sxgbe_platform.h>
      |
       #include <net/cfg80211.h>
      |
       #include <net/dsa.h>
      |
       #include <net/mac80211.h>
      |
       #include <net/selftests.h>
      |
       #include <rdma/ib_addr.h>
      |
       #include <rdma/ib_cache.h>
      |
       #include <rdma/ib_cm.h>
      |
       #include <rdma/ib_hdrs.h>
      |
       #include <rdma/ib_mad.h>
      |
       #include <rdma/ib_marshall.h>
      |
       #include <rdma/ib_pack.h>
      |
       #include <rdma/ib_pma.h>
      |
       #include <rdma/ib_sa.h>
      |
       #include <rdma/ib_smi.h>
      |
       #include <rdma/ib_umem.h>
      |
       #include <rdma/ib_umem_odp.h>
      |
       #include <rdma/ib_verbs.h>
      |
       #include <rdma/iw_cm.h>
      |
       #include <rdma/mr_pool.h>
      |
       #include <rdma/opa_addr.h>
      |
       #include <rdma/opa_port_info.h>
      |
       #include <rdma/opa_smi.h>
      |
       #include <rdma/opa_vnic.h>
      |
       #include <rdma/rdma_cm.h>
      |
       #include <rdma/rdma_cm_ib.h>
      |
       #include <rdma/rdmavt_cq.h>
      |
       #include <rdma/rdma_vt.h>
      |
       #include <rdma/rdmavt_qp.h>
      |
       #include <rdma/rw.h>
      |
       #include <rdma/tid_rdma_defs.h>
      |
       #include <rdma/uverbs_ioctl.h>
      |
       #include <rdma/uverbs_named_ioctl.h>
      |
       #include <rdma/uverbs_std_types.h>
      |
       #include <rdma/uverbs_types.h>
      |
       #include <soc/mscc/ocelot.h>
      |
       #include <soc/mscc/ocelot_ptp.h>
      |
       #include <soc/mscc/ocelot_vcap.h>
      |
       #include <trace/events/ib_mad.h>
      |
       #include <trace/events/rdma_core.h>
      |
       #include <trace/events/rdma.h>
      |
       #include <trace/events/rpcrdma.h>
      |
       #include <uapi/linux/ethtool.h>
      |
       #include <uapi/linux/ethtool_netlink.h>
      |
       #include <uapi/linux/mdio.h>
      |
       #include <uapi/linux/mii.h>
      )
      
      @depends on i@
      expression list args;
      @@
      
      (
      - bitmap_zero(args, __ETHTOOL_LINK_MODE_MASK_NBITS)
      + linkmode_zero(args)
      |
      - bitmap_copy(args, __ETHTOOL_LINK_MODE_MASK_NBITS)
      + linkmode_copy(args)
      |
      - bitmap_and(args, __ETHTOOL_LINK_MODE_MASK_NBITS)
      + linkmode_and(args)
      |
      - bitmap_or(args, __ETHTOOL_LINK_MODE_MASK_NBITS)
      + linkmode_or(args)
      |
      - bitmap_empty(args, ETHTOOL_LINK_MODE_MASK_NBITS)
      + linkmode_empty(args)
      |
      - bitmap_andnot(args, __ETHTOOL_LINK_MODE_MASK_NBITS)
      + linkmode_andnot(args)
      |
      - bitmap_equal(args, __ETHTOOL_LINK_MODE_MASK_NBITS)
      + linkmode_equal(args)
      |
      - bitmap_intersects(args, __ETHTOOL_LINK_MODE_MASK_NBITS)
      + linkmode_intersects(args)
      |
      - bitmap_subset(args, __ETHTOOL_LINK_MODE_MASK_NBITS)
      + linkmode_subset(args)
      )
      
      Add missing linux/mii.h include to mellanox. -DaveM
      Signed-off-by: NSean Anderson <sean.anderson@seco.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4973056c
  17. 12 10月, 2021 1 次提交
  18. 09 10月, 2021 2 次提交
    • V
      net: dsa: mv88e6xxx: isolate the ATU databases of standalone and bridged ports · 5bded825
      Vladimir Oltean 提交于
      Similar to commit 6087175b ("net: dsa: mt7530: use independent VLAN
      learning on VLAN-unaware bridges"), software forwarding between an
      unoffloaded LAG port (a bonding interface with an unsupported policy)
      and a mv88e6xxx user port directly under a bridge is broken.
      
      We adopt the same strategy, which is to make the standalone ports not
      find any ATU entry learned on a bridge port.
      
      Theory: the mv88e6xxx ATU is looked up by FID and MAC address. There are
      as many FIDs as VIDs (4096). The FID is derived from the VID when
      possible (the VTU maps a VID to a FID), with a fallback to the port
      based default FID value when not (802.1Q Mode is disabled on the port,
      or the classified VID isn't present in the VTU).
      
      The mv88e6xxx driver makes the following use of FIDs and VIDs:
      
      - the port's DefaultVID (to which untagged & pvid-tagged packets get
        classified) is 0 and is absent from the VTU, so this kind of packets is
        processed in FID 0, the default FID assigned by mv88e6xxx_setup_port.
      
      - every time a bridge VLAN is created, mv88e6xxx_port_vlan_join() ->
        mv88e6xxx_atu_new() associates a FID with that VID which increases
        linearly starting from 1. Like this:
      
        bridge vlan add dev lan0 vid 100 # FID 1
        bridge vlan add dev lan1 vid 100 # still FID 1
        bridge vlan add dev lan2 vid 1024 # FID 2
      
      The FID allocation made by the driver is sub-optimal for the following
      reasons:
      
      (a) A standalone port has a DefaultPVID of 0 and a default FID of 0 too.
          A VLAN-unaware bridged port has a DefaultPVID of 0 and a default FID
          of 0 too. The difference is that the bridged ports may learn ATU
          entries, while the standalone port has the requirement that it must
          not, and must not find them either. Standalone ports must not use
          the same FID as ports belonging to a bridge. All standalone ports
          can use the same FID, since the ATU will never have an entry in
          that FID.
      
      (b) Multiple VLAN-unaware bridges will all use a DefaultPVID of 0 and a
          default FID of 0 on all their ports. The FDBs will not be isolated
          between these bridges. Every VLAN-unaware bridge must use the same
          FID on all its ports, different from the FID of other bridge ports.
      
      (c) Each bridge VLAN uses a unique FID which is useful for Independent
          VLAN Learning, but the same VLAN ID on multiple VLAN-aware bridges
          will result in the same FID being used by mv88e6xxx_atu_new().
          The correct behavior is for VLAN 1 in br0 to have a different FID
          compared to VLAN 1 in br1.
      
      This patch cannot fix all the above. Traditionally the DSA framework did
      not care about this, and the reality is that DSA core involvement is
      needed for the aforementioned issues to be solved. The only thing we can
      solve here is an issue which does not require API changes, and that is
      issue (a), aka use a different FID for standalone ports vs ports under
      VLAN-unaware bridges.
      
      The first step is deciding what VID and FID to use for standalone ports,
      and what VID and FID for bridged ports. The 0/0 pair for standalone
      ports is what they used up till now, let's keep using that. For bridged
      ports, there are 2 cases:
      
      - VLAN-aware ports will never end up using the port default FID, because
        packets will always be classified to a VID in the VTU or dropped
        otherwise. The FID is the one associated with the VID in the VTU.
      
      - On VLAN-unaware ports, we _could_ leave their DefaultVID (pvid) at
        zero (just as in the case of standalone ports), and just change the
        port's default FID from 0 to a different number (say 1).
      
      However, Tobias points out that there is one more requirement to cater to:
      cross-chip bridging. The Marvell DSA header does not carry the FID in
      it, only the VID. So once a packet crosses a DSA link, if it has a VID
      of zero it will get classified to the default FID of that cascade port.
      Relying on a port default FID for upstream cascade ports results in
      contradictions: a default FID of 0 breaks ATU isolation of bridged ports
      on the downstream switch, a default FID of 1 breaks standalone ports on
      the downstream switch.
      
      So not only must standalone ports have different FIDs compared to
      bridged ports, they must also have different DefaultVID values.
      IEEE 802.1Q defines two reserved VID values: 0 and 4095. So we simply
      choose 4095 as the DefaultVID of ports belonging to VLAN-unaware
      bridges, and VID 4095 maps to FID 1.
      
      For the xmit operation to look up the same ATU database, we need to put
      VID 4095 in DSA tags sent to ports belonging to VLAN-unaware bridges
      too. All shared ports are configured to map this VID to the bridging
      FID, because they are members of that VLAN in the VTU. Shared ports
      don't need to have 802.1QMode enabled in any way, they always parse the
      VID from the DSA header, they don't need to look at the 802.1Q header.
      
      We install VID 4095 to the VTU in mv88e6xxx_setup_port(), with the
      mention that mv88e6xxx_vtu_setup() which was located right below that
      call was flushing the VTU so those entries wouldn't be preserved.
      So we need to relocate the VTU flushing prior to the port initialization
      during ->setup(). Also note that this is why it is safe to assume that
      VID 4095 will get associated with FID 1: the user ports haven't been
      created, so there is no avenue for the user to create a bridge VLAN
      which could otherwise race with the creation of another FID which would
      otherwise use up the non-reserved FID value of 1.
      
      [ Currently mv88e6xxx_port_vlan_join() doesn't have the option of
        specifying a preferred FID, it always calls mv88e6xxx_atu_new(). ]
      
      mv88e6xxx_port_db_load_purge() is the function to access the ATU for
      FDB/MDB entries, and it used to determine the FID to use for
      VLAN-unaware FDB entries (VID=0) using mv88e6xxx_port_get_fid().
      But the driver only called mv88e6xxx_port_set_fid() once, during probe,
      so no surprises, the port FID was always 0, the call to get_fid() was
      redundant. As much as I would have wanted to not touch that code, the
      logic is broken when we add a new FID which is not the port-based
      default. Now the port-based default FID only corresponds to standalone
      ports, and FDB/MDB entries belong to the bridging service. So while in
      the future, when the DSA API will support FDB isolation, we will have to
      figure out the FID based on the bridge number, for now there's a single
      bridging FID, so hardcode that.
      
      Lastly, the tagger needs to check, when it is transmitting a VLAN
      untagged skb, whether it is sending it towards a bridged or a standalone
      port. When we see it is bridged we assume the bridge is VLAN-unaware.
      Not because it cannot be VLAN-aware but:
      
      - if we are transmitting from a VLAN-aware bridge we are likely doing so
        using TX forwarding offload. That code path guarantees that skbs have
        a vlan hwaccel tag in them, so we would not enter the "else" branch
        of the "if (skb->protocol == htons(ETH_P_8021Q))" condition.
      
      - if we are transmitting on behalf of a VLAN-aware bridge but with no TX
        forwarding offload (no PVT support, out of space in the PVT, whatever),
        we would indeed be transmitting with VLAN 4095 instead of the bridge
        device's pvid. However we would be injecting a "From CPU" frame, and
        the switch won't learn from that - it only learns from "Forward" frames.
        So it is inconsequential for address learning. And VLAN 4095 is
        absolutely enough for the frame to exit the switch, since we never
        remove that VLAN from any port.
      
      Fixes: 57e661aa ("net: dsa: mv88e6xxx: Link aggregation support")
      Reported-by: NTobias Waldekranz <tobias@waldekranz.com>
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      5bded825
    • V
      net: dsa: mv88e6xxx: keep the pvid at 0 when VLAN-unaware · 8b6836d8
      Vladimir Oltean 提交于
      The VLAN support in mv88e6xxx has a loaded history. Commit 2ea7a679
      ("net: dsa: Don't add vlans when vlan filtering is disabled") noticed
      some issues with VLAN and decided the best way to deal with them was to
      make the DSA core ignore VLANs added by the bridge while VLAN awareness
      is turned off. Those issues were never explained, just presented as
      "at least one corner case".
      
      That approach had problems of its own, presented by
      commit 54a0ed0d ("net: dsa: provide an option for drivers to always
      receive bridge VLANs") for the DSA core, followed by
      commit 1fb74191 ("net: dsa: mv88e6xxx: fix vlan setup") which
      applied ds->configure_vlan_while_not_filtering = true for mv88e6xxx in
      particular.
      
      We still don't know what corner case Andrew saw when he wrote
      commit 2ea7a679 ("net: dsa: Don't add vlans when vlan filtering is
      disabled"), but Tobias now reports that when we use TX forwarding
      offload, pinging an external station from the bridge device is broken if
      the front-facing DSA user port has flooding turned off. The full
      description is in the link below, but for short, when a mv88e6xxx port
      is under a VLAN-unaware bridge, it inherits that bridge's pvid.
      So packets ingressing a user port will be classified to e.g. VID 1
      (assuming that value for the bridge_default_pvid), whereas when
      tag_dsa.c xmits towards a user port, it always sends packets using a VID
      of 0 if that port is standalone or under a VLAN-unaware bridge - or at
      least it did so prior to commit d82f8ab0 ("net: dsa: tag_dsa:
      offload the bridge forwarding process").
      
      In any case, when there is a conversation between the CPU and a station
      connected to a user port, the station's MAC address is learned in VID 1
      but the CPU tries to transmit through VID 0. The packets reach the
      intended station, but via flooding and not by virtue of matching the
      existing ATU entry.
      
      DSA has established (and enforced in other drivers: sja1105, felix,
      mt7530) that a VLAN-unaware port should use a private pvid, and not
      inherit the one from the bridge. The bridge's pvid should only be
      inherited when that bridge is VLAN-aware, so all state transitions need
      to be handled. On the other hand, all bridge VLANs should sit in the VTU
      starting with the moment when the bridge offloads them via switchdev,
      they are just not used.
      
      This solves the problem that Tobias sees because packets ingressing on
      VLAN-unaware user ports now get classified to VID 0, which is also the
      VID used by tag_dsa.c on xmit.
      
      Fixes: d82f8ab0 ("net: dsa: tag_dsa: offload the bridge forwarding process")
      Link: https://patchwork.kernel.org/project/netdevbpf/patch/20211003222312.284175-2-vladimir.oltean@nxp.com/#24491503Reported-by: NTobias Waldekranz <tobias@waldekranz.com>
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      8b6836d8
  19. 27 9月, 2021 3 次提交