1. 25 Oct, 2021 1 commit
  2. 24 Oct, 2021 1 commit
    • net: dsa: drop rtnl_lock from dsa_slave_switchdev_event_work · 5cdfde49
      Committed by Vladimir Oltean
      After talking with Ido Schimmel, it became clear that rtnl_lock is not
      actually required for anything that is done inside the
      SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE deferred work handlers.
      
      The reason why it was probably added by Arkadi Sharshevsky in commit
      c9eb3e0f ("net: dsa: Add support for learning FDB through
      notification") was to offer the same locking/serialization guarantees as
      .ndo_fdb_{add,del} and avoid reworking any drivers.
      
      DSA has implemented .ndo_fdb_add and .ndo_fdb_del until commit
      b117e1e8 ("net: dsa: delete dsa_legacy_fdb_add and
      dsa_legacy_fdb_del") - that is to say, until fairly recently.
      
      But those methods have been deleted, so now we are free to drop the
      rtnl_lock as well.
      
      Note that exposing DSA switch drivers to an unlocked method which was
      previously serialized by the rtnl_mutex is a potentially dangerous
      affair. Driver writers couldn't ensure that their internal locking
      scheme does the right thing even if they wanted.
      
      We could err on the side of paranoia and introduce a switch-wide lock
      inside the DSA framework, but that seems way overreaching. Instead, we
      could check as many drivers for regressions as we can, fix those first,
      then let this change go in once it is assumed to be fairly safe.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5cdfde49
  3. 21 Oct, 2021 1 commit
    • net: dsa: remove the "dsa_to_port in a loop" antipattern from the core · d0004a02
      Committed by Vladimir Oltean
      Ever since Vivien's conversion of the ds->ports array into a dst->ports
      list, and the introduction of dsa_to_port, iterations through the ports
      of a switch became quadratic whenever dsa_to_port was needed.
      
      dsa_to_port can either be called directly, or indirectly through the
      dsa_is_{user,cpu,dsa,unused}_port helpers.
      
      Use the newly introduced dsa_switch_for_each_port() iteration macro
      that works with the iterator variable being a struct dsa_port *dp
      directly, and not an int i. It is expensive to go from i to dp, but
      cheap to go from dp to i.
      
      This macro iterates through the entire ds->dst->ports list and filters
      by the ports belonging just to the switch provided as argument.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d0004a02
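The cost difference described in this commit message can be illustrated with a small user-space sketch. All `iter_*` names and structures below are hypothetical stand-ins, not the kernel's definitions; the sketch only assumes what the message states, namely that ports live on one tree-wide list and that an index lookup has to walk that list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the DSA structures. */
struct iter_switch;

struct iter_port {
	struct iter_switch *ds;  /* switch that owns this port */
	int index;               /* port number within that switch */
	struct iter_port *next;  /* next port in the tree-wide list */
};

struct iter_tree {
	struct iter_port *ports; /* ports of every switch in the tree */
};

struct iter_switch {
	struct iter_tree *dst;
	int num_ports;
};

static unsigned long iter_steps; /* list nodes visited so far */

/* Index -> port lookup: one walk over the tree-wide list per call. */
static struct iter_port *iter_to_port(struct iter_switch *ds, int index)
{
	struct iter_port *dp;

	assert(ds && ds->dst);
	for (dp = ds->dst->ports; dp; dp = dp->next) {
		iter_steps++;
		if (dp->ds == ds && dp->index == index)
			return dp;
	}
	return NULL;
}

/* Antipattern: a list walk for every index, quadratic in total. */
static int iter_count_by_index(struct iter_switch *ds)
{
	int i, n = 0;

	for (i = 0; i < ds->num_ports; i++)
		if (iter_to_port(ds, i))
			n++;
	return n;
}

/* Single pass over the list, filtering by owning switch. */
static int iter_count_by_list(struct iter_switch *ds)
{
	struct iter_port *dp;
	int n = 0;

	assert(ds && ds->dst);
	for (dp = ds->dst->ports; dp; dp = dp->next) {
		iter_steps++;
		if (dp->ds == ds)
			n++;
	}
	return n;
}
```

With four ports on a single switch, the index-based loop visits 1 + 2 + 3 + 4 = 10 list nodes, while the list-based loop visits 4.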
  4. 18 Oct, 2021 1 commit
  5. 02 Oct, 2021 1 commit
  6. 16 Sep, 2021 1 commit
  7. 24 Aug, 2021 3 commits
    • net: dsa: let drivers state that they need VLAN filtering while standalone · 58adf9dc
      Committed by Vladimir Oltean
      As explained in commit e358bef7 ("net: dsa: Give drivers the chance
      to veto certain upper devices"), the hellcreek driver uses some tricks
      to comply with the network stack expectations: it enforces port
      separation in standalone mode using VLANs. For untagged traffic,
      bridging between ports is prevented by using different PVIDs, and for
      VLAN-tagged traffic, it never accepts 8021q uppers with the same VID on
      two ports, so packets with one VLAN cannot leak from one port to another.
      
      That is almost fine*, and has worked because hellcreek relied on an
      implicit behavior of the DSA core that was changed by the previous
      patch: the standalone ports declare the 'rx-vlan-filter' feature as 'on
      [fixed]'. Since most of the DSA drivers are actually VLAN-unaware in
      standalone mode, that feature was actually incorrectly reflecting the
      hardware/driver state, so there was a desire to fix it. This leaves the
      hellcreek driver in a situation where it has to explicitly request this
      behavior from the DSA framework.
      
      We configure the ports as follows:
      
      - Standalone: 'rx-vlan-filter' is on. An 8021q upper on top of a
        standalone hellcreek port will go through dsa_slave_vlan_rx_add_vid
        and will add a VLAN to the hardware tables, giving the driver the
        opportunity to refuse it through .port_prechangeupper.
      
      - Bridged with vlan_filtering=0: 'rx-vlan-filter' is off. An 8021q upper
        on top of a bridged hellcreek port will not go through
        dsa_slave_vlan_rx_add_vid, because there will not be any attempt to
        offload this VLAN. The driver already disables VLAN awareness, so that
        upper should receive the traffic it needs.
      
      - Bridged with vlan_filtering=1: 'rx-vlan-filter' is on. An 8021q upper
        on top of a bridged hellcreek port will call dsa_slave_vlan_rx_add_vid,
        and can again be vetoed through .port_prechangeupper.
      
      *It is not actually completely fine, because if I follow through
      correctly, we can have the following situation:
      
      ip link add br0 type bridge vlan_filtering 0
      ip link set lan0 master br0 # lan0 now becomes VLAN-unaware
      ip link set lan0 nomaster # lan0 fails to become VLAN-aware again, therefore breaking isolation
      
      This patch fixes that corner case by extending the DSA core logic, based
      on this requested attribute, to change the VLAN awareness state of the
      switch (port) when it leaves the bridge.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Acked-by: Kurt Kanzenbach <kurt@linutronix.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      58adf9dc
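The three port configurations listed in this commit message reduce to a small decision function. The sketch below is only an illustration: `vf_rx_vlan_filter_on` and its parameters are hypothetical names, and it deliberately ignores other cases (such as switches where VLAN filtering is a global setting) that the DSA core also has to consider.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: should 'rx-vlan-filter' be on for a port?
 *
 * bridged:                 the port is under a bridge
 * bridge_vlan_filtering:   that bridge has vlan_filtering=1
 * needs_standalone_filter: the driver asked for VLAN filtering while
 *                          standalone (the hellcreek case)
 */
static bool vf_rx_vlan_filter_on(bool bridged, bool bridge_vlan_filtering,
				 bool needs_standalone_filter)
{
	/* A standalone port has no bridge VLAN filtering setting. */
	assert(bridged || !bridge_vlan_filtering);

	if (!bridged)
		return needs_standalone_filter;

	/* Bridged: follow the bridge's vlan_filtering setting. */
	return bridge_vlan_filtering;
}
```

For a driver that does not request standalone filtering, the standalone case returns false, which matches the behavior the previous patch in the series gives to most switches.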
    • net: dsa: don't advertise 'rx-vlan-filter' when not needed · 06cfb2df
      Committed by Vladimir Oltean
      There have been multiple independent reports about
      dsa_slave_vlan_rx_add_vid being called (and consequently calling the
      drivers' .port_vlan_add) when it isn't needed, and sometimes (not
      always) causing problems in the process.
      
      Case 1:
      mv88e6xxx_port_vlan_prepare is stubborn and only accepts VLANs on
      bridged ports. That is understandably so, because standalone mv88e6xxx
      ports are VLAN-unaware, and VTU entries are said to be a scarce
      resource.
      
      Otherwise said, the following fails lamentably on mv88e6xxx:
      
      ip link add br0 type bridge vlan_filtering 1
      ip link set lan3 master br0
      ip link add link lan10 name lan10.1 type vlan id 1
      [485256.724147] mv88e6085 d0032004.mdio-mii:12: p10: hw VLAN 1 already used by port 3 in br0
      RTNETLINK answers: Operation not supported
      
      This has become a worse issue since commit 9b236d2a ("net: dsa:
      Advertise the VLAN offload netdev ability only if switch supports it").
      Up to that point, the driver was returning -EOPNOTSUPP and DSA was
      reconverting that error to 0, making the 8021q upper think all is ok
      (but obviously the error message was there even prior to this change).
      After that change the -EOPNOTSUPP is propagated to vlan_vid_add, and it
      is a hard error.
      
      Case 2:
      Ports that don't offload the Linux bridge (have a dp->bridge_dev = NULL
      because they don't implement .port_bridge_{join,leave}). Understandably,
      a standalone port should not offload VLANs either, it should remain VLAN
      unaware and any VLAN should be a software VLAN (as long as the hardware
      is not quirky, that is).
      
      In fact, dsa_slave_port_obj_add does do the right thing and rejects
      switchdev VLAN objects coming from the bridge when that bridge is not
      offloaded:
      
      	case SWITCHDEV_OBJ_ID_PORT_VLAN:
      		if (!dsa_port_offloads_bridge_port(dp, obj->orig_dev))
      			return -EOPNOTSUPP;
      
      		err = dsa_slave_vlan_add(dev, obj, extack);
      
      But it seems that the bridge is able to trick us. The __vlan_vid_add
      from br_vlan.c has:
      
      	/* Try switchdev op first. In case it is not supported, fallback to
      	 * 8021q add.
      	 */
      	err = br_switchdev_port_vlan_add(dev, v->vid, flags, extack);
      	if (err == -EOPNOTSUPP)
      		return vlan_vid_add(dev, br->vlan_proto, v->vid);
      
      So it says "no, no, you need this VLAN in your life!". And we, naive as
      we are, say "oh, this comes from the vlan_vid_add code path, it must be
      an 8021q upper, sure, I'll take that". And we end up with that bridge
      VLAN installed on our port anyway. But this time, it has the wrong flags:
      if the bridge was trying to install VLAN 1 as a pvid/untagged VLAN,
      failed via switchdev, retried via vlan_vid_add, we have this comment:
      
      	/* This API only allows programming tagged, non-PVID VIDs */
      
      So what we do makes absolutely no sense.
      
      Backtracing a bit, we see the common pattern. We allow the network stack
      to think that our standalone ports are VLAN-aware, but they aren't, for
      the vast majority of switches. The quirky ones should not dictate the
      norm. The dsa_slave_vlan_rx_add_vid and dsa_slave_vlan_rx_kill_vid
      methods exist for drivers that need the 'rx-vlan-filter: on' feature in
      ethtool -k, which can be due to any of the following reasons:
      
      1. vlan_filtering_is_global = true, and some ports are under a
         VLAN-aware bridge while others are standalone, and the standalone
         ports would otherwise drop VLAN-tagged traffic. This is described in
         commit 061f6a50 ("net: dsa: Add ndo_vlan_rx_{add, kill}_vid
         implementation").
      
      2. the ports that are under a VLAN-aware bridge should also set this
         feature, for 8021q uppers having a VID not claimed by the bridge.
         In this case, the driver will essentially not even know that the VID
         is coming from the 8021q layer and not the bridge.
      
      3. Hellcreek. This driver needs it because in standalone mode, it uses
         unique VLANs per port to ensure separation. For separation of untagged
         traffic, it uses different PVIDs for each port, and for separation of
         VLAN-tagged traffic, it never accepts 8021q uppers with the same vid
         on two ports.
      
      If a driver does not fall under any of the above 3 categories, there is
      no reason why it should advertise the 'rx-vlan-filter' feature, therefore
      no reason why it should offload the VLANs added through vlan_vid_add.
      
      This commit fixes the problem by removing the 'rx-vlan-filter' feature
      from the slave devices when they operate in standalone mode, and when
      they offload a VLAN-unaware bridge.
      
      The way it works is that vlan_vid_add will now stop its processing here:
      
      vlan_add_rx_filter_info:
      	if (!vlan_hw_filter_capable(dev, proto))
      		return 0;
      
      So the VLAN will still be saved in the interface's VLAN RX filtering
      list, but because it does not declare VLAN filtering in its features,
      the 8021q module will return zero without committing that VLAN to
      hardware.
      
      This gives the drivers what they want, since it keeps the 8021q VLANs
      away from the VLAN table until VLAN awareness is enabled (point at which
      the ports are no longer standalone, hence in the mv88e6xxx case, the
      check in mv88e6xxx_port_vlan_prepare passes).
      
      Since the issue predates the existence of the hellcreek driver, case 3
      will be dealt with in a separate patch.
      
      The main change that this patch makes is to no longer set
      NETIF_F_HW_VLAN_CTAG_FILTER unconditionally, but toggle it dynamically
      (for most switches, never).
      
      The second part of the patch addresses an issue that the first part
      introduces: because the 'rx-vlan-filter' feature is now dynamically
      toggled, and our .ndo_vlan_rx_add_vid does not get called when
      'rx-vlan-filter' is off, we need to avoid bugs such as the following by
      replaying the VLANs from 8021q uppers every time we enable VLAN
      filtering:
      
      ip link add link lan0 name lan0.100 type vlan id 100
      ip addr add 192.168.100.1/24 dev lan0.100
      ping 192.168.100.2 # should work
      ip link add br0 type bridge vlan_filtering 0
      ip link set lan0 master br0
      ping 192.168.100.2 # should still work
      ip link set br0 type bridge vlan_filtering 1
      ping 192.168.100.2 # should still work but doesn't
      
      As reported by Florian, some drivers look at ds->vlan_filtering in
      their .port_vlan_add() implementation. So this patch also makes sure
      that ds->vlan_filtering is committed before calling the driver. This is
      the reason why it is first committed, then restored on the failure path.
      Reported-by: Tobias Waldekranz <tobias@waldekranz.com>
      Reported-by: Alvin Šipraga <alsi@bang-olufsen.dk>
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Tested-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      06cfb2df
    • net: dsa: properly fall back to software bridging · 67b5fb5d
      Committed by Vladimir Oltean
      If the driver does not implement .port_bridge_{join,leave}, then we must
      fall back to standalone operation on that port, and trigger the error
      path of dsa_port_bridge_join. This sets dp->bridge_dev = NULL.
      
      In turn, having a non-NULL dp->bridge_dev when there is no offloading
      support makes the following things go wrong:
      
      - dsa_default_offload_fwd_mark makes the wrong decision in setting
        skb->offload_fwd_mark. It should set skb->offload_fwd_mark = 0 for
        ports that don't offload the bridge, which should instruct the bridge
        to forward in software. But this does not happen, dp->bridge_dev is
        incorrectly set to point to the bridge, so the bridge is told that
        packets have been forwarded in hardware, which they haven't.
      
      - switchdev objects (MDBs, VLANs) should not be offloaded by ports that
        don't offload the bridge. Standalone ports should behave as packet-in,
        packet-out and the bridge should not be able to manipulate the pvid of
        the port, or tag stripping on egress, or ingress filtering. This
        should already work fine because dsa_slave_port_obj_add has:
      
      	case SWITCHDEV_OBJ_ID_PORT_VLAN:
      		if (!dsa_port_offloads_bridge_port(dp, obj->orig_dev))
      			return -EOPNOTSUPP;
      
      		err = dsa_slave_vlan_add(dev, obj, extack);
      
        but since dsa_port_offloads_bridge_port works based on dp->bridge_dev,
        this is again sabotaging us.
      
      All the above work in case the port has an unoffloaded LAG interface, so
      this is well exercised code, we should apply it for plain unoffloaded
      bridge ports too.
      Reported-by: Alvin Šipraga <alsi@bang-olufsen.dk>
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      67b5fb5d
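The fallback described in this commit message can be sketched in user space as follows. The `fb_*` types and functions are hypothetical simplifications, not the DSA core's code; the sketch assumes only what the message says: a missing join hook must take the error path, the error path clears the bridge pointer, and the forwarding mark is derived from that pointer.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical driver operations; a NULL hook means "not implemented". */
struct fb_ops {
	int (*port_bridge_join)(int port);
};

struct fb_port {
	const void *bridge_dev;   /* non-NULL only while offloading a bridge */
	const struct fb_ops *ops;
	int index;
};

/* Example hook of a driver that can offload bridging. */
static int fb_hw_join(int port)
{
	(void)port;
	return 0;
}

static int fb_port_bridge_join(struct fb_port *dp, const void *br)
{
	int err;

	assert(dp && dp->ops && br);
	dp->bridge_dev = br;

	if (!dp->ops->port_bridge_join)
		err = -EOPNOTSUPP;        /* no offload support */
	else
		err = dp->ops->port_bridge_join(dp->index);

	if (err)
		dp->bridge_dev = NULL;    /* error path: stay standalone */
	return err;
}

/* True means "already forwarded in hardware, do not forward in software". */
static bool fb_offload_fwd_mark(const struct fb_port *dp)
{
	return dp->bridge_dev != NULL;
}
```

A port whose driver lacks the hook ends up with a cleared bridge pointer, so the mark stays false and the bridge keeps forwarding in software.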
  8. 11 Aug, 2021 1 commit
  9. 08 Aug, 2021 1 commit
    • net: dsa: don't fast age standalone ports · 39f32101
      Committed by Vladimir Oltean
      DSA drives the procedure to flush dynamic FDB entries from a port based
      on the change of STP state: whenever we go from a state where address
      learning is enabled (LEARNING, FORWARDING) to a state where it isn't
      (LISTENING, BLOCKING, DISABLED), we need to flush the existing dynamic
      entries.
      
      However, there are cases when this is not needed. Internally, when a
      DSA switch interface is not under a bridge, DSA still keeps it in the
      "FORWARDING" STP state. And when that interface joins a bridge, the
      bridge will meticulously iterate that port through all STP states,
      starting with BLOCKING and ending with FORWARDING. Because there is a
      state transition from the standalone version of FORWARDING into the
      temporary BLOCKING bridge port state, DSA calls the fast age procedure.
      
      Since commit 5e38c158 ("net: dsa: configure better brport flags when
      ports leave the bridge"), DSA asks standalone ports to disable address
      learning. Therefore, there can be no dynamic FDB entries on a standalone
      port. Therefore, it does not make sense to flush dynamic FDB entries on
      one.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      39f32101
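The rule in this commit message can be written down as a predicate. This is a minimal sketch under stated assumptions: the `age_*` names are hypothetical, and the `was_standalone` flag stands in for however the DSA core actually tracks that a port is not under a bridge.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirror of the STP port states named in the message. */
enum age_stp_state {
	AGE_DISABLED,
	AGE_LISTENING,
	AGE_LEARNING,
	AGE_FORWARDING,
	AGE_BLOCKING,
};

/* Address learning is enabled only in LEARNING and FORWARDING. */
static bool age_state_learns(enum age_stp_state s)
{
	assert(s <= AGE_BLOCKING);
	return s == AGE_LEARNING || s == AGE_FORWARDING;
}

/*
 * Flush dynamic FDB entries when a port moves from a learning state to a
 * non-learning one, unless the port was standalone: a standalone port has
 * learning disabled, so there is nothing dynamic to flush.
 */
static bool age_should_fast_age(bool was_standalone,
				enum age_stp_state old_state,
				enum age_stp_state new_state)
{
	if (was_standalone)
		return false;

	return age_state_learns(old_state) && !age_state_learns(new_state);
}
```

The standalone FORWARDING to bridged BLOCKING transition that happens when a port joins a bridge therefore no longer triggers a flush.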
  10. 06 Aug, 2021 1 commit
    • net: dsa: don't disable multicast flooding to the CPU even without an IGMP querier · c73c5708
      Committed by Vladimir Oltean
      Commit 08cc83cc ("net: dsa: add support for BRIDGE_MROUTER
      attribute") added an option for users to turn off multicast flooding
      towards the CPU if they turn off the IGMP querier on a bridge which
      already has enslaved ports (echo 0 > /sys/class/net/br0/bridge/multicast_router).
      
      And commit a8b659e7 ("net: dsa: act as passthrough for bridge port flags")
      simply papered over that issue, because it moved the decision to flood
      the CPU with multicast (or not) from the DSA core down to individual drivers,
      instead of taking a more radical position then.
      
      The truth is that disabling multicast flooding to the CPU is simply
      something we are not prepared to do now, if at all. Some reasons:
      
      - ICMP6 neighbor solicitation messages are unregistered multicast
        packets as far as the bridge is concerned. So if we stop flooding
        multicast, the outside world cannot ping the bridge device's IPv6
        link-local address.
      
      - There might be foreign interfaces bridged with our DSA switch ports
        (sending a packet towards the host does not necessarily equal
        termination, but maybe software forwarding). So if there is no one
        interested in that multicast traffic in the local network stack, that
        doesn't mean nobody is.
      
      - PTP over L4 (IPv4, IPv6) is multicast, but is unregistered as far as
        the bridge is concerned. This should reach the CPU port.
      
      - The switch driver might not do FDB partitioning. And since we don't
        even bother to do more fine-grained flood disabling (such as "disable
        flooding _from_port_N_ towards the CPU port" as opposed to "disable
        flooding _from_any_port_ towards the CPU port"), this breaks standalone
        ports, or even multiple bridges where one has an IGMP querier and one
        doesn't.
      
      Reverting the logic makes all of the above work.
      
      Fixes: a8b659e7 ("net: dsa: act as passthrough for bridge port flags")
      Fixes: 08cc83cc ("net: dsa: add support for BRIDGE_MROUTER attribute")
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c73c5708
  11. 28 Jul, 2021 1 commit
    • dev_ioctl: split out ndo_eth_ioctl · a7605370
      Committed by Arnd Bergmann
      Most users of ndo_do_ioctl are ethernet drivers that implement
      the MII commands SIOCGMIIPHY/SIOCGMIIREG/SIOCSMIIREG, or hardware
      timestamping with SIOCSHWTSTAMP/SIOCGHWTSTAMP.
      
      Separate these from the few drivers that use ndo_do_ioctl to
      implement SIOCBOND, SIOCBR and SIOCWANDEV commands.
      
      This is a purely cosmetic change intended to help readers find
      their way through the implementation.
      
      Cc: Doug Ledford <dledford@redhat.com>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Jay Vosburgh <j.vosburgh@gmail.com>
      Cc: Veaceslav Falico <vfalico@gmail.com>
      Cc: Andy Gospodarek <andy@greyhouse.net>
      Cc: Andrew Lunn <andrew@lunn.ch>
      Cc: Vivien Didelot <vivien.didelot@gmail.com>
      Cc: Florian Fainelli <f.fainelli@gmail.com>
      Cc: Vladimir Oltean <olteanv@gmail.com>
      Cc: Leon Romanovsky <leon@kernel.org>
      Cc: linux-rdma@vger.kernel.org
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Acked-by: Jason Gunthorpe <jgg@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a7605370
  12. 22 Jul, 2021 2 commits
    • net: bridge: move the switchdev object replay helpers to "push" mode · 4e51bf44
      Committed by Vladimir Oltean
      Starting with commit 4f2673b3 ("net: bridge: add helper to replay
      port and host-joined mdb entries"), DSA has introduced some bridge
      helpers that replay switchdev events (FDB/MDB/VLAN additions and
      deletions) that can be lost by the switchdev drivers in a variety of
      circumstances:
      
      - an IP multicast group was host-joined on the bridge itself before any
        switchdev port joined the bridge, leading to the host MDB entries
        missing in the hardware database.
      - during the bridge creation process, the MAC address of the bridge was
        added to the FDB as an entry pointing towards the bridge device
        itself, but with no switchdev ports being part of the bridge yet, this
        local FDB entry would remain unknown to the switchdev hardware
        database.
      - a VLAN/FDB/MDB was added to a bridge port that is a LAG interface,
        before any switchdev port joined that LAG, leading to the hardware
        database missing those entries.
      - a switchdev port left a LAG that is a bridge port, while the LAG
        remained part of the bridge, and all FDB/MDB/VLAN entries remained
        installed in the hardware database of the switchdev port.
      
      Also, since commit 0d2cfbd4 ("net: bridge: ignore switchdev events
      for LAG ports which didn't request replay"), DSA introduced a method,
      based on a const void *ctx, to ensure that two switchdev ports under the
      same LAG that is a bridge port do not see the same MDB/VLAN entry being
      replayed twice by the bridge, once for every bridge port that joins the
      LAG.
      
      With so many ordering corner cases being possible, it seems unreasonable
      to expect a switchdev driver writer to get it right from the first try.
      Therefore, now that DSA has experimented with the bridge replay helpers
      for a little bit, we can move the code to the bridge driver where it is
      more readily available to all switchdev drivers.
      
      To convert the switchdev object replay helpers from "pull mode" (where
      the driver asks for them) to a "push mode" (where the bridge offers them
      automatically), the biggest problem is that the bridge needs to be aware
      when a switchdev port joins and leaves, even when the switchdev is only
      indirectly a bridge port (for example when the bridge port is a LAG
      upper of the switchdev).
      
      Luckily, we already have a hook for that, in the form of the newly
      introduced switchdev_bridge_port_offload() and
      switchdev_bridge_port_unoffload() calls. These offer a natural place for
      hooking the object addition and deletion replays.
      
      Extend the above 2 functions with:
      - pointers to the switchdev atomic notifier (for FDB replays) and the
        blocking notifier (for MDB and VLAN replays).
      - the "const void *ctx" argument required for drivers to be able to
        disambiguate between which port is targeted, when multiple ports are
        lowers of the same LAG that is a bridge port. Most of the drivers pass
        NULL to this argument, except the ones that support LAG offload and have
        the proper context check already in place in the switchdev blocking
        notifier handler.
      
      Also unexport the replay helpers, since nobody except the bridge calls
      them directly now.
      
      Note that:
      (a) we abuse the terminology slightly, because FDB entries are not
          "switchdev objects", but we count them as objects nonetheless.
          With no direct way to prove it, I think they are not modeled as
          switchdev objects because those can only be installed by the bridge
          to the hardware (as opposed to FDB entries which can be propagated
          in the other direction too). This is merely an abuse of terms, FDB
          entries are replayed too, despite not being objects.
      (b) the bridge does not attempt to sync port attributes to newly joined
          ports, just the countable stuff (the objects). The reason for this
          is simple: no universal and symmetric way to sync and unsync them is
          known. For example, VLAN filtering: what to do on unsync, disable or
          leave it enabled? Similarly, STP state, ageing timer, etc etc. What
          a switchdev port does when it becomes standalone again is not really
          up to the bridge's competence, and the driver should deal with it.
          On the other hand, replaying deletions of switchdev objects can be
          seen as a matter of cleanup and therefore be treated by the bridge,
          hence this patch.
      
      We make the replay helpers opt-in for drivers, because they might not
      bring immediate benefits for them:
      
      - nbp_vlan_init() is called _after_ netdev_master_upper_dev_link(),
        so br_vlan_replay() should not do anything for the new drivers on
        which we call it. The existing drivers where there was even a slight
        possibility for there to exist a VLAN on a bridge port before they
        join it are already guarded against this: mlxsw and prestera deny
        joining LAG interfaces that are members of a bridge.
      
      - br_fdb_replay() should now notify of local FDB entries, but I patched
        all drivers except DSA to ignore these new entries in commit
        2c4eca3e ("net: bridge: switchdev: include local flag in FDB
        notifications"). Driver authors can lift this restriction as they
        wish, and when they do, they can also opt into the FDB replay
        functionality.
      
      - br_mdb_replay() should fix a real issue which is described in commit
        4f2673b3 ("net: bridge: add helper to replay port and host-joined
        mdb entries"). However most drivers do not offload the
        SWITCHDEV_OBJ_ID_HOST_MDB to see this issue: only cpsw and am65_cpsw
        offload this switchdev object, and I don't completely understand the
        way in which they offload this switchdev object anyway. So I'll leave
        it up to these drivers' respective maintainers to opt into
        br_mdb_replay().
      
      So most of the drivers pass NULL notifier blocks for the replay helpers,
      except:
      - dpaa2-switch which was already acked/regression-tested with the
        helpers enabled (and there isn't much of a downside in having them)
      - ocelot which already had replay logic in "pull" mode
      - DSA which already had replay logic in "pull" mode
      
      An important observation is that the drivers which don't currently
      request bridge event replays don't even have the
      switchdev_bridge_port_{offload,unoffload} calls placed in proper places
      right now. This was done to avoid unnecessary rework for drivers which
      might never even add support for this. For driver writers who wish to
      add replay support, this can be used as a tentative placement guide:
      https://patchwork.kernel.org/project/netdevbpf/patch/20210720134655.892334-11-vladimir.oltean@nxp.com/
      
      Cc: Vadym Kochan <vkochan@marvell.com>
      Cc: Taras Chornyi <tchornyi@marvell.com>
      Cc: Ioana Ciornei <ioana.ciornei@nxp.com>
      Cc: Lars Povlsen <lars.povlsen@microchip.com>
      Cc: Steen Hegelund <Steen.Hegelund@microchip.com>
      Cc: UNGLinuxDriver@microchip.com
      Cc: Claudiu Manoil <claudiu.manoil@nxp.com>
      Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
      Cc: Grygorii Strashko <grygorii.strashko@ti.com>
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Acked-by: Ioana Ciornei <ioana.ciornei@nxp.com> # dpaa2-switch
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4e51bf44
    • net: dsa: ensure linearized SKBs in case of tail taggers · 21cf377a
      Committed by Lino Sanfilippo
      The function skb_put() that is used by tail taggers to make room for the
      DSA tag must only be called for linearized SKBs. However, if the slave
      device inherited features like NETIF_F_HW_SG or NETIF_F_FRAGLIST, the
      SKB passed to the slave's transmit function may not be linearized.
      Avoid those SKBs by clearing the NETIF_F_HW_SG and NETIF_F_FRAGLIST flags
      for tail taggers.
      Furthermore, since the tagging protocol can be changed at runtime, move
      the code for setting up the slave's features into dsa_slave_setup_tagger().
      Suggested-by: Vladimir Oltean <olteanv@gmail.com>
      Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      21cf377a
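The feature handling described in this commit message amounts to masking two bits out of the inherited feature set. The sketch below uses hypothetical `TT_F_*` bit values and a hypothetical `tt_slave_features` helper; the real flags are kernel `netdev_features_t` bits, which are not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical feature bits standing in for the kernel's netdev features. */
#define TT_F_SG        (1u << 0)  /* scatter-gather: SKB may have page frags */
#define TT_F_FRAGLIST  (1u << 1)  /* SKB may carry a frag list */
#define TT_F_OTHER     (1u << 2)  /* any unrelated inherited feature */

/*
 * Compute the slave's features from the master's. A tail tagger appends
 * its tag with skb_put(), which needs a linear SKB, so the features that
 * allow non-linear SKBs are cleared for it.
 */
static unsigned int tt_slave_features(unsigned int master_features,
				      bool tail_tagger)
{
	unsigned int features = master_features;

	if (tail_tagger)
		features &= ~(TT_F_SG | TT_F_FRAGLIST);

	/* Postcondition: tail taggers never advertise non-linear SKBs. */
	assert(!tail_tagger || !(features & (TT_F_SG | TT_F_FRAGLIST)));
	return features;
}
```

Because the tagging protocol can change at runtime, a helper like this would have to be re-run whenever the tagger changes, which is why the commit moves the setup into dsa_slave_setup_tagger().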
  13. 20 Jul, 2021 2 commits
  14. 30 Jun, 2021 7 commits
    • net: dsa: ensure during dsa_fdb_offload_notify that dev_hold and dev_put are on the same dev · 4bed397c
      Committed by Vladimir Oltean
      When
      (a) "dev" is a bridge port which the DSA switch tree offloads, but is
          otherwise not a dsa slave (such as a LAG netdev), or
      (b) "dev" is the bridge net device itself
      
      then strange things happen to the dev_hold/dev_put pair:
      dsa_schedule_work() will still be called with a DSA port that offloads
      that netdev, but dev_hold() will be called on the non-DSA netdev.
      Then the "if" condition in dsa_slave_switchdev_event_work() does not
      pass, because "dev" is not a DSA netdev, so dev_put() is not called.
      
      This results in the simple fact that we have a reference counting
      mismatch on the "dev" net device.
      
      This can be seen when we add support for host addresses installed on the
      bridge net device.
      
      ip link add br1 type bridge
      ip link set br1 address 00:01:02:03:04:05
      ip link set swp0 master br1
      ip link del br1
      [  968.512278] unregister_netdevice: waiting for br1 to become free. Usage count = 5
      
      It seems foolish to do penny pinching and not add the net_device pointer
      in the dsa_switchdev_event_work structure, so let's finally do that.
      As an added bonus, when we start offloading local entries pointing
      towards the bridge, these will now properly appear as 'offloaded' in
      'bridge fdb' (this was not possible before, because 'dev' was assumed to
      only be a DSA net device):
      
      00:01:02:03:04:05 dev br0 vlan 1 offload master br0 permanent
      00:01:02:03:04:05 dev br0 offload master br0 permanent
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4bed397c
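The reference counting mismatch described in this commit message, and the fix of carrying the net device pointer in the work item, can be modeled in a few lines. Everything named `ref_*` below is a hypothetical user-space stand-in; it is not the kernel's dev_hold()/dev_put() implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a net device with a usage counter. */
struct ref_netdev {
	int refcnt;
	bool is_dsa_slave;
};

/* Hypothetical deferred work item that remembers the held device. */
struct ref_work {
	struct ref_netdev *dev;
};

static void ref_hold(struct ref_netdev *dev)
{
	dev->refcnt++;
}

static void ref_put(struct ref_netdev *dev)
{
	assert(dev->refcnt > 0);
	dev->refcnt--;
}

/* Buggy shape: hold on any dev, but put only if dev is a DSA slave. */
static void ref_event_buggy(struct ref_netdev *dev)
{
	ref_hold(dev);
	/* ... deferred work runs here ... */
	if (dev->is_dsa_slave)
		ref_put(dev);
}

/* Fixed shape: the work item stores the dev that was held ... */
static void ref_schedule(struct ref_work *work, struct ref_netdev *dev)
{
	work->dev = dev;
	ref_hold(dev);
}

/* ... and the handler drops the reference on exactly that dev. */
static void ref_work_fn(struct ref_work *work)
{
	ref_put(work->dev);
}
```

With a bridge device (not a DSA slave), the buggy shape leaves one reference behind per event, which is the kind of leak that shows up as "waiting for br1 to become free".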
    • net: dsa: include fdb entries pointing to bridge in the host fdb list · 81a619f7
      Committed by Vladimir Oltean
      The bridge supports a legacy way of adding local (non-forwarded) FDB
      entries, which works on an individual port basis:
      
      bridge fdb add dev swp0 00:01:02:03:04:05 master local
      
      As well as a new way, added by Roopa Prabhu in commit 3741873b
      ("bridge: allow adding of fdb entries pointing to the bridge device"):
      
      bridge fdb add dev br0 00:01:02:03:04:05 self local
      
      The two commands are functionally equivalent, except that the first one
      produces an entry with fdb->dst == swp0, and the other an entry with
      fdb->dst == NULL. The confusing part, though, is that even if fdb->dst
      is swp0 for the 'local on port' entry, that destination is not used.
      
      Nonetheless, the idea is that the bridge has reference counting for
      local entries, and local entries pointing towards the bridge are still
      'as local' as local entries for a port.
      
      The bridge adds the MAC addresses of the interfaces automatically as
      FDB entries with is_local=1. For the MAC address of the ports, fdb->dst
      will be equal to the port, and for the MAC address of the bridge,
      fdb->dst will point towards the bridge (i.e. be NULL). Therefore, if the
      MAC address of the bridge is not inherited from either of the physical
      ports, then we must explicitly catch local FDB entries emitted towards
      br0, otherwise we'll miss the MAC address of the bridge (and, of
      course, any entry with 'bridge fdb add dev br0 ... self local').
      Co-developed-by: Tobias Waldekranz <tobias@waldekranz.com>
      Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      81a619f7
    • T
      net: dsa: include bridge addresses which are local in the host fdb list · 10fae4ac
      Tobias Waldekranz committed
      The bridge automatically creates local (not forwarded) fdb entries
      pointing towards physical ports with their interface MAC addresses.
      For switchdev, the significance of these fdb entries is the exact
      opposite of that of non-local entries: instead of sending these frames
      outwards, we must send them inwards (towards the host).
      
      NOTE: The bridge's own MAC address is also "local". If that address is
      not shared with any port, the bridge's MAC is not added by this
      functionality - but the following commit takes care of that case.
      
      NOTE 2: We mark these addresses as host-filtered regardless of the value
      of ds->assisted_learning_on_cpu_port. This is because, as opposed to the
      speculative logic done for dynamic address learning on foreign
      interfaces, the local FDB entries are rather fixed, so there isn't any
      risk of them migrating from one bridge port to another.
      Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      10fae4ac
    • V
      net: dsa: sync static FDB entries on foreign interfaces to hardware · 3068d466
      Vladimir Oltean committed
      DSA is able to install FDB entries towards the CPU port for addresses
      which were dynamically learnt by the software bridge on foreign
      interfaces that are in the same bridge with a DSA switch interface.
      Since this behavior is opportunistic, it is guarded by the
      "assisted_learning_on_cpu_port" property which can be enabled by drivers
      and is not done automatically (since certain switches may support
      address learning of packets coming from the CPU port).
      
      But if those FDB entries added on the foreign interfaces are static
      (added by the user) instead of dynamically learnt, currently DSA does
      not do anything (and arguably it should).
      
      Because static FDB entries are not supposed to move on their own, there
      is no downside in reusing the "assisted_learning_on_cpu_port" logic to
      sync static FDB entries to the DSA CPU port unconditionally, even if
      assisted_learning_on_cpu_port is not requested by the driver.
      
      For example, this situation:
      
         br0
         / \
      swp0 dummy0
      
      $ bridge fdb add 02:00:de:ad:00:01 dev dummy0 vlan 1 master static
      
      Results in DSA adding an entry in the hardware FDB, pointing this
      address towards the CPU port.
      
      The same is true for entries added to the bridge itself, e.g.:
      
      $ bridge fdb add 02:00:de:ad:00:01 dev br0 vlan 1 self local
      
      (except that right now, DSA still ignores 'local' FDB entries; this will
      be changed in a later patch)
      Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3068d466
    • V
      net: dsa: introduce a separate cross-chip notifier type for host FDBs · 3dc80afc
      Vladimir Oltean committed
      DSA treats some bridge FDB entries by trapping them to the CPU port.
      Currently, the only class of such entries is FDB addresses learnt by
      the software bridge on a foreign interface. However, there are many more
      to be added:
      
      - FDB entries with the is_local flag (for termination) added by the
        bridge on the user ports (typically containing the MAC address of the
        bridge port)
      - FDB entries pointing towards the bridge net device (for termination).
        Typically these contain the MAC address of the bridge net device.
      - Static FDB entries installed on a foreign interface that is in the
        same bridge with a DSA user port.
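
      The classes of host FDB entries listed above can be summarized as a
      single predicate. The following is a minimal sketch and not the DSA
      implementation: enum dev_kind, struct fdb_event and is_host_fdb() are
      hypothetical names, and the notification is flattened to three fields.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical kinds of net device an FDB entry can be notified on. */
enum dev_kind { DEV_DSA_USER, DEV_BRIDGE, DEV_FOREIGN };

/* Hypothetical, simplified view of an FDB notification. */
struct fdb_event {
	enum dev_kind kind;	/* device the entry was notified on */
	bool is_local;		/* entry is for termination on the host */
	bool added_by_user;	/* static entry, as opposed to learnt */
};

/* Returns true if the entry should be trapped to the CPU port. */
bool is_host_fdb(const struct fdb_event *ev, bool assisted_learning)
{
	assert(ev != NULL);

	switch (ev->kind) {
	case DEV_DSA_USER:
		/* Local entries on user ports are for termination. */
		return ev->is_local;
	case DEV_BRIDGE:
		/* Entries pointing towards the bridge net device. */
		return true;
	case DEV_FOREIGN:
		/* Static and local entries always qualify; learnt ones only
		 * if the driver opted in to assisted learning on the CPU port.
		 */
		return ev->added_by_user || ev->is_local || assisted_learning;
	}
	return false;
}
```

      With assisted_learning set to false, a dynamically learnt entry on a
      foreign interface is the only class in this model that is not treated
      as a host address.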
      
      The reason why a separate cross-chip notifier for host FDBs is justified
      compared to normal FDBs is the same as in the case of host MDBs: the
      cross-chip notifier matching function in switch.c should avoid
      installing these entries on routing ports that route towards the
      targeted switch, but not towards the CPU. This is required in order to
      have proper support for H-like multi-chip topologies.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3dc80afc
    • V
      net: dsa: introduce a separate cross-chip notifier type for host MDBs · b8e997c4
      Vladimir Oltean committed
      Commit abd49535 ("net: dsa: execute dsa_switch_mdb_add only for
      routing port in cross-chip topologies") does a surprisingly good job
      even for the SWITCHDEV_OBJ_ID_HOST_MDB use case, where DSA simply
      translates a switchdev object received on dp into a cross-chip notifier
      for dp->cpu_dp.
      
      To visualize how that works, imagine the daisy chain topology below and
      consider a SWITCHDEV_OBJ_ID_HOST_MDB object emitted on sw2p0. How does
      the cross-chip notifier know to match on all the right ports (sw0p4, the
      dedicated CPU port, sw1p4, an upstream DSA link, and sw2p4, another
      upstream DSA link)?
      
                                                      |
             sw0p0     sw0p1     sw0p2     sw0p3     sw0p4
          [  user ] [  user ] [  user ] [  dsa  ] [  cpu  ]
          [       ] [       ] [       ] [       ] [   x   ]
                                            |
                                            +---------+
                                                      |
             sw1p0     sw1p1     sw1p2     sw1p3     sw1p4
          [  user ] [  user ] [  user ] [  dsa  ] [  dsa  ]
          [       ] [       ] [       ] [       ] [   x   ]
                                            |
                                            +---------+
                                                      |
             sw2p0     sw2p1     sw2p2     sw2p3     sw2p4
          [  user ] [  user ] [  user ] [  user ] [  dsa  ]
          [       ] [       ] [       ] [       ] [   x   ]
      
      The answer is simple: the dedicated CPU port of sw2p0 is sw0p4, and
      dsa_routing_port returns the upstream port for all switches.
      
      That is fine, but there are other topologies where this does not work as
      well. There are trees with "H" topologies in the wild, where there are 2
      or more switches with DSA links between them, but every switch has its
      dedicated CPU port. For these topologies, it seems stupid for the neighbor
      switches to install an MDB entry on the routing port, since these
      multicast addresses are fundamentally different than the usual ones we
      support (and that is the justification for this patch, to introduce the
      concept of a termination plane multicast MAC address, as opposed to a
      forwarding plane multicast MAC address).
      
      For example, when a SWITCHDEV_OBJ_ID_HOST_MDB would get added to sw0p0,
      without this patch, it would get treated as a regular port MDB on sw0p2
      and it would match on the ports below (including the sw1p3 routing port).
      
                               |                                  |
          sw0p0     sw0p1     sw0p2     sw0p3          sw1p3     sw1p2     sw1p1     sw1p0
       [  user ] [  user ] [  cpu  ] [  dsa  ]      [  dsa  ] [  cpu  ] [  user ] [  user ]
       [       ] [       ] [   x   ] [       ] ---- [   x   ] [       ] [       ] [       ]
      
      With the patch, the host MDB notifier on sw0p0 matches only on the local
      switch, which is what we want for a termination plane address.
      
                               |                                  |
          sw0p0     sw0p1     sw0p2     sw0p3          sw1p3     sw1p2     sw1p1     sw1p0
       [  user ] [  user ] [  cpu  ] [  dsa  ]      [  dsa  ] [  cpu  ] [  user ] [  user ]
       [       ] [       ] [   x   ] [       ] ---- [       ] [       ] [       ] [       ]
      
      Name this new matching function "dsa_switch_host_address_match" since we
      will be reusing it soon for host FDB entries as well.
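
      The matching rule for termination plane addresses can be modelled in
      userspace as follows. This is a sketch under stated assumptions, not
      the code of dsa_switch_host_address_match: struct tree and its
      upstream_sw/upstream_port arrays are a hypothetical description of the
      topology, and on_path_to_cpu() and host_address_match() are
      hypothetical helpers.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_SW 4

/* Hypothetical tree description: for each switch, the next switch on the
 * way to its CPU port (itself if the CPU port is local) and the local
 * port leading there (the CPU port or an upstream DSA link).
 */
struct tree {
	int n_sw;
	int upstream_sw[MAX_SW];
	int upstream_port[MAX_SW];
};

/* True if "sw" lies on the path from "target_sw" to its CPU port. */
bool on_path_to_cpu(const struct tree *t, int sw, int target_sw)
{
	int cur = target_sw;
	int hops;

	assert(sw >= 0 && sw < t->n_sw);
	assert(target_sw >= 0 && target_sw < t->n_sw);

	for (hops = 0; hops < t->n_sw; hops++) {
		if (cur == sw)
			return true;
		if (t->upstream_sw[cur] == cur)
			break;	/* reached the switch with the CPU port */
		cur = t->upstream_sw[cur];
	}
	return false;
}

/* Host address match: only the port leading towards the CPU, and only on
 * switches between the targeted switch and that CPU port.
 */
bool host_address_match(const struct tree *t, int sw, int port,
			int target_sw)
{
	return on_path_to_cpu(t, sw, target_sw) &&
	       port == t->upstream_port[sw];
}
```

      For the daisy chain, upstream_sw = {0, 0, 1} and upstream_port =
      {4, 4, 4} make an address targeted at switch 2 match on port 4 of all
      three switches. For the "H" topology, upstream_sw = {0, 1} and
      upstream_port = {2, 2} make an address targeted at switch 0 match only
      on sw0p2, and on no port of switch 1.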
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b8e997c4
    • V
      net: dsa: delete dsa_legacy_fdb_add and dsa_legacy_fdb_del · b117e1e8
      Vladimir Oltean committed
      We want to add reference counting for FDB entries in cross-chip
      topologies, and in order for that to have any chance of working and not
      be unbalanced (leading to entries which are never deleted), we need to
      ensure that higher layers are sane, because if they aren't, it's garbage
      in, garbage out.
      
      For example, if we add a bridge FDB entry twice, the bridge properly
      errors out:
      
      $ bridge fdb add dev swp0 00:01:02:03:04:07 master static
      $ bridge fdb add dev swp0 00:01:02:03:04:07 master static
      RTNETLINK answers: File exists
      
      However, the same thing cannot be said about the bridge bypass
      operations:
      
      $ bridge fdb add dev swp0 00:01:02:03:04:07
      $ bridge fdb add dev swp0 00:01:02:03:04:07
      $ bridge fdb add dev swp0 00:01:02:03:04:07
      $ bridge fdb add dev swp0 00:01:02:03:04:07
      $ echo $?
      0
      
      But one 'bridge fdb del' is enough to remove the entry, no matter how
      many times it was added.
      
      The bridge bypass operations are impossible to maintain in these
      circumstances and lack of support for reference counting the cross-chip
      notifiers is holding us back from making further progress, so just drop
      support for them. The only way left for users to install static bridge
      FDB entries is the proper one, using the "master static" flags.
      
      With this change, rtnl_fdb_add() falls back to calling
      ndo_dflt_fdb_add() which uses the duplicate-exclusive variant of
      dev_uc_add(): dev_uc_add_excl(). Because DSA does not (yet) declare
      IFF_UNICAST_FLT, this results in us going to promiscuous mode:
      
      $ bridge fdb add dev swp0 00:01:02:03:04:05
      [   28.206743] device swp0 entered promiscuous mode
      $ bridge fdb add dev swp0 00:01:02:03:04:05
      RTNETLINK answers: File exists
      
      So even if it does not completely fail, there is at least some indication
      that it is behaving differently from before, and closer to user space
      expectations, I would argue (the lack of a "local|static" specifier
      defaults to "local", or "host-only", so dev_uc_add() is a reasonable
      default implementation). If the generic implementation of .ndo_fdb_add
      provided by Vlad Yasevich is a proof of anything, it only proves that
      the implementation provided by DSA was always wrong, by not looking at
      "ndm->ndm_state & NUD_NOARP" (the "static" flag which means that the FDB
      entry points outwards) and "ndm->ndm_state & NUD_PERMANENT" (the "local"
      flag which means that the FDB entry points towards the host). It all
      used to mean the same thing to DSA.
      
      Update the documentation so that the users are not confused about what's
      going on.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b117e1e8
  15. 29 Jun, 2021 4 commits
    • V
      net: dsa: replay a deletion of switchdev objects for ports leaving a bridged LAG · 74918945
      Vladimir Oltean committed
      When a DSA switch port leaves a bonding interface that is under a
      bridge, there might be dangling switchdev objects on that port left
      behind, because the bridge is not aware that its lower interface (the
      bond) changed state in any way.
      
      Call the bridge replay helpers with adding=false before changing
      dp->bridge_dev to NULL, because we need to simulate to
      dsa_slave_port_obj_del() that these notifications were emitted by the
      bridge.
      
      We add this hook to the NETDEV_PRECHANGEUPPER event handler, because
      we are calling into switchdev (and the __switchdev_handle_port_obj_del
      fanout helpers expect the upper/lower adjacency lists to still be valid)
      and PRECHANGEUPPER is the last moment in time when they still are.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      74918945
    • V
      net: dsa: refactor the prechangeupper sanity checks into a dedicated function · 4ede74e7
      Vladimir Oltean committed
      We need to add more logic to the DSA NETDEV_PRECHANGEUPPER event
      handler; more exactly, we need to request an unsync of switchdev objects.
      In order to fit more code, refactor the existing logic into a helper.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4ede74e7
    • V
      net: bridge: ignore switchdev events for LAG ports which didn't request replay · 0d2cfbd4
      Vladimir Oltean committed
      There is a slight inconvenience in the switchdev replay helpers added
      recently, and this is when:
      
      ip link add br0 type bridge
      ip link add bond0 type bond
      ip link set bond0 master br0
      bridge vlan add dev bond0 vid 100
      ip link set swp0 master bond0
      ip link set swp1 master bond0
      
      Since the underlying driver (currently only DSA) asks for a replay of
      VLANs when swp0 and swp1 join the LAG because it is bridged, what will
      happen is that DSA will try to react twice on the VLAN event for swp0.
      This is not really a huge problem right now, because most drivers accept
      duplicates since the bridge itself does, but it will become a problem
      when we add support for replaying switchdev object deletions.
      
      Let's fix this by adding a blank void *ctx in the replay helpers, which
      will be passed on by the bridge in the switchdev notifications. If the
      context is NULL, everything is the same as before. But if the context is
      populated with a valid pointer, the underlying switchdev driver
      (currently DSA) can use the pointer to 'see through' the bridge port
      (which in the example above is bond0) and 'know' that the event is only
      for a particular physical port offloading that bridge port, and not for
      all of them.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0d2cfbd4
    • V
      net: switchdev: add a context void pointer to struct switchdev_notifier_info · 69bfac96
      Vladimir Oltean committed
      In the case where the driver asks for a replay of a certain type of
      event (port object or attribute) for a bridge port that is a LAG, it may
      do so because this port has just joined the LAG.
      
      But there might already be other switchdev ports in that LAG, and it is
      preferable that those preexisting switchdev ports do not act upon the
      replayed event.
      
      The solution is to add a context to switchdev events, which is NULL most
      of the time (when the bridge layer initiates the call) but which can be
      set to a value controlled by the switchdev driver when a replay is
      requested. The driver can then check the context to figure out if all
      ports within the LAG should act upon the switchdev event, or just the
      ones that match the context.
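
      The context check described above can be sketched as a one-line
      predicate. This is an illustration only: struct fake_port and
      port_should_handle() are hypothetical names, and using the port's own
      address as the context value is an assumption made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a switchdev port below a LAG. */
struct fake_port {
	int index;
};

/* A NULL context means the bridge initiated the event, so every port
 * acts on it. A non-NULL context names the one port that asked for the
 * replay; all other ports ignore the event.
 */
bool port_should_handle(const struct fake_port *port, const void *ctx)
{
	assert(port != NULL);
	return ctx == NULL || ctx == port;
}
```

      With two ports swp0 and swp1 in the same LAG, a replay requested with
      ctx = &swp1 is handled by swp1 only, while a regular bridge event
      (ctx = NULL) is handled by both.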
      
      We have to modify all switchdev_handle_* helper functions as well as the
      prototypes in the drivers that use these helpers too, because these
      helpers hide the underlying struct switchdev_notifier_info from us and
      there is no way to retrieve the context otherwise.
      
      The context structure will be populated and used in later patches.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      69bfac96
  16. 22 Jun, 2021 2 commits
    • V
      net: dsa: targeted MTU notifiers should only match on one port · 88faba20
      Vladimir Oltean committed
      dsa_slave_change_mtu() calls dsa_port_mtu_change() twice:
      - it sends a cross-chip notifier with the MTU of the CPU port which is
        used to update the DSA links.
      - it sends one targeted MTU notifier which is supposed to only match the
        user port on which we are changing the MTU. The "propagate_upstream"
        variable is used here to bypass the cross-chip notifier system from
        switch.c
      
      But due to a mistake, the second, targeted notifier matches not only on
      the user port, but also on the DSA link which is a member of the same
      switch, if that exists.
      
      And because the DSA links of the entire dst were programmed in a
      previous round to the largest_mtu via a "propagate_upstream == true"
      notification, then the dsa_port_mtu_change(propagate_upstream == false)
      call that is immediately upcoming will break the MTU on the one DSA link
      which is chip-wise local to the dp whose MTU is changing right now.
      
      For example, given this daisy chain topology:
      
         sw0p0     sw0p1     sw0p2     sw0p3     sw0p4
      [  cpu  ] [  user ] [  user ] [  dsa  ] [  user ]
      [   x   ] [       ] [       ] [   x   ] [       ]
                                        |
                                        +---------+
                                                  |
         sw1p0     sw1p1     sw1p2     sw1p3     sw1p4
      [  user ] [  user ] [  user ] [  dsa  ] [  dsa  ]
      [       ] [       ] [       ] [       ] [   x   ]
      
      ip link set sw0p1 mtu 9000
      ip link set sw1p1 mtu 9000 # at this stage, sw0p1 and sw1p1 can talk
                                 # to one another using jumbo frames
      ip link set sw0p2 mtu 1500 # this programs the sw0p3 DSA link first to
                                 # the largest_mtu of 9000, then reprograms it to
                                 # 1500 with the "propagate_upstream == false"
                                 # notifier, breaking communication between
                                 # sw0p1 and sw1p1
      
      To escape from this situation, make the targeted match really match on a
      single port - the user port, and rename the "propagate_upstream"
      variable to "targeted_match" to clarify the intention and avoid future
      issues.
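
      The intended matching rule can be sketched in userspace as below. It
      is a simplified model and not the function from switch.c: enum
      port_type, struct mtu_info and mtu_match() are hypothetical names,
      with only the "targeted_match" field taken from the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum port_type { PORT_USER, PORT_DSA, PORT_CPU };

/* Hypothetical, flattened view of an MTU notifier. */
struct mtu_info {
	int sw_index;		/* switch of the port being targeted */
	int port;		/* port being targeted */
	bool targeted_match;	/* true: match exactly one port */
};

/* Decides whether port "port" of switch "sw" reacts to the notifier. */
bool mtu_match(int sw, int port, enum port_type type,
	       const struct mtu_info *info)
{
	assert(info != NULL);

	/* The targeted port itself always matches. */
	if (sw == info->sw_index && port == info->port)
		return true;

	/* A targeted notifier matches nothing else, not even a DSA link
	 * on the same switch.
	 */
	if (info->targeted_match)
		return false;

	/* Non-targeted notifiers also reach all CPU ports and DSA links. */
	return type == PORT_DSA || type == PORT_CPU;
}
```

      In this model a targeted notifier for sw0p2 no longer matches the
      sw0p3 DSA link, so the link keeps the largest_mtu programmed by the
      earlier non-targeted notifier.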
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      88faba20
    • V
      net: dsa: calculate the largest_mtu across all ports in the tree · 4e4ab795
      Vladimir Oltean committed
      If we have a cross-chip topology like this:
      
         sw0p0     sw0p1     sw0p2     sw0p3     sw0p4
      [  cpu  ] [  user ] [  user ] [  dsa  ] [  user ]
                                        |
                                        +---------+
                                                  |
         sw1p0     sw1p1     sw1p2     sw1p3     sw1p4
      [  user ] [  user ] [  user ] [  dsa  ] [  dsa  ]
      
      and we issue the following commands:
      
      1. ip link set sw0p1 mtu 1700
      2. ip link set sw1p1 mtu 1600
      
      we notice the following happening:
      
      Command 1. emits a non-targeted MTU notifier for the CPU port (sw0p0)
      with the largest_mtu calculated across switch 0, of 1700. This matches
      sw0p0, sw0p3 and sw1p4 (all CPU ports and DSA links).
      Then, it emits a targeted MTU notifier for the user port (sw0p1), again
      with MTU 1700 (this doesn't matter).
      
      Command 2. emits a non-targeted MTU notifier for the CPU port (sw0p0)
      with the largest_mtu calculated across switch 1, of 1600. This matches
      the same group of ports as above, and decreases the MTU for the CPU port
      and the DSA links from 1700 to 1600.
      
      As a result, the sw0p1 user port can no longer communicate with its CPU
      port at MTU 1700.
      
      To address this, we should calculate the largest_mtu across all switches
      that may share a CPU port, and only emit MTU notifiers with that value.
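
      The difference between the two calculations can be shown with a small
      model. This is a sketch, assuming a flat array of user ports; struct
      user_port, largest_mtu_on_switch() and largest_mtu_in_tree() are
      hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flattened view of a user port in the tree. */
struct user_port {
	int sw_index;
	int mtu;
};

/* Largest MTU over the user ports of a single switch (old behaviour). */
int largest_mtu_on_switch(const struct user_port *ports, size_t n,
			  int sw_index)
{
	int largest = 0;
	size_t i;

	assert(ports != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (ports[i].sw_index == sw_index && ports[i].mtu > largest)
			largest = ports[i].mtu;
	return largest;
}

/* Largest MTU over every user port in the tree (behaviour after the fix). */
int largest_mtu_in_tree(const struct user_port *ports, size_t n)
{
	int largest = 0;
	size_t i;

	assert(ports != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (ports[i].mtu > largest)
			largest = ports[i].mtu;
	return largest;
}
```

      With sw0p1 at 1700 and sw1p1 at 1600, the per-switch calculation for
      switch 1 yields 1600 and would shrink the shared CPU port and DSA
      links, while the tree-wide calculation keeps them at 1700.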
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4e4ab795
  17. 15 Jun, 2021 1 commit
  18. 12 Jun, 2021 1 commit
    • V
      net: dsa: generalize overhead for taggers that use both headers and trailers · 4e500251
      Vladimir Oltean committed
      Some really really weird switches just couldn't decide whether to use a
      normal or a tail tagger, so they just did both.
      
      This creates problems for DSA, because we only have the concept of an
      'overhead' which can be applied to the headroom or to the tailroom of
      the skb (like for example during the central TX reallocation procedure),
      depending on the value of bool tail_tag, but not to both.
      
      We need to generalize DSA to cater for these odd switches by
      transforming the 'overhead / tail_tag' pair into 'needed_headroom /
      needed_tailroom'.
      
      The DSA master's MTU is increased to account for both.
      
      The flow dissector code is modified such that it only calls the DSA
      adjustment callback if the tagger has a non-zero header length.
      
      Taggers are trivially modified to declare either needed_headroom or
      needed_tailroom, based on the tail_tag value that they currently
      declare.
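
      The conversion from the 'overhead / tail_tag' pair can be sketched as
      follows. The field names needed_headroom and needed_tailroom come from
      the text above; struct old_tagger, struct new_tagger, convert_tagger()
      and master_mtu() are hypothetical and only model the idea.

```c
#include <assert.h>
#include <stdbool.h>

/* Old-style description: one overhead placed at the head or at the tail. */
struct old_tagger {
	unsigned int overhead;
	bool tail_tag;
};

/* New-style description: independent head and tail requirements. */
struct new_tagger {
	unsigned int needed_headroom;
	unsigned int needed_tailroom;
};

/* Trivial conversion based on the tail_tag value. */
struct new_tagger convert_tagger(struct old_tagger old)
{
	struct new_tagger t = { 0, 0 };

	if (old.tail_tag)
		t.needed_tailroom = old.overhead;
	else
		t.needed_headroom = old.overhead;
	return t;
}

/* MTU the DSA master needs so that a tagged frame of "mtu" bytes fits. */
unsigned int master_mtu(unsigned int mtu, struct new_tagger t)
{
	unsigned int total = mtu + t.needed_headroom + t.needed_tailroom;

	assert(total >= mtu);	/* guard against wrap-around */
	return total;
}
```

      A tagger that uses both a header and a trailer simply fills in both
      fields, which the old pair could not express.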
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4e500251
  19. 11 May, 2021 1 commit
  20. 28 Apr, 2021 4 commits
  21. 21 Apr, 2021 1 commit
  22. 17 Apr, 2021 1 commit
  23. 14 Apr, 2021 1 commit
    • M
      of: net: pass the dst buffer to of_get_mac_address() · 83216e39
      Michael Walle committed
      of_get_mac_address() returns a "const void*" pointer to a MAC address.
      Lately, support for fetching the MAC address from an NVMEM provider was added.
      But this will only work with platform devices. It will not work with
      PCI devices (e.g. of an integrated root complex) and esp. not with DSA
      ports.
      
      There is an of_* variant of the nvmem binding which works without
      devices. The data returned by nvmem_cell_read() has to be freed after
      use. On the other hand, the return value of of_get_mac_address() points
      to some static data without a lifetime. The trick, for now, was to
      allocate a device resource managed buffer which is then returned. This
      will only work if we have an actual device.
      
      Change it so that the caller of of_get_mac_address() has to supply a
      buffer where the MAC address is written to. Unfortunately, this will
      touch all drivers which use of_get_mac_address().
      
      Usually the code looks like:
      
        const char *addr;
        addr = of_get_mac_address(np);
        if (!IS_ERR(addr))
          ether_addr_copy(ndev->dev_addr, addr);
      
      This can then be simply rewritten as:
      
        of_get_mac_address(np, ndev->dev_addr);
      
      Sometimes is_valid_ether_addr() is used to test the MAC address.
      of_get_mac_address() already makes sure that it only returns a valid MAC
      address, so we can just test its return code. But we have to be
      careful if other sources for the MAC address are tried before
      of_get_mac_address(). In that case we have to keep the
      is_valid_ether_addr() call.
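
      The caller-supplied buffer convention can be modelled in userspace as
      below. This is a sketch and not the kernel function:
      sketch_get_mac_address() and sketch_is_valid_ether_addr() are
      hypothetical, the "prop" argument stands in for whatever source holds
      the address, and the -ENODEV error code is an assumption of the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6

/* Userspace model of the validity check: neither all-zero nor multicast. */
static bool sketch_is_valid_ether_addr(const uint8_t *addr)
{
	static const uint8_t zero[ETH_ALEN];

	return !(addr[0] & 0x01) && memcmp(addr, zero, ETH_ALEN) != 0;
}

/* Hypothetical lookup in the new style: copies the address into "dst" and
 * returns 0, or returns a negative errno and leaves "dst" untouched.
 */
int sketch_get_mac_address(const uint8_t *prop, uint8_t *dst)
{
	assert(dst != NULL);

	if (!prop || !sketch_is_valid_ether_addr(prop))
		return -ENODEV;

	memcpy(dst, prop, ETH_ALEN);
	return 0;
}
```

      Because the buffer is only written on success, an address obtained
      from an earlier source survives a failed lookup, which is why callers
      with such earlier sources still need their own validity check.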
      
      The following coccinelle patch was used to convert common cases to the
      new style. Afterwards, I manually went over the drivers and fixed the
      return code variable: either adding a new one or, if one was already
      available, using that. Mansour Moufid, thanks for that coccinelle patch!
      
      <spml>
      @a@
      identifier x;
      expression y, z;
      @@
      - x = of_get_mac_address(y);
      + x = of_get_mac_address(y, z);
        <...
      - ether_addr_copy(z, x);
        ...>
      
      @@
      identifier a.x;
      @@
      - if (<+... x ...+>) {}
      
      @@
      identifier a.x;
      @@
        if (<+... x ...+>) {
            ...
        }
      - else {}
      
      @@
      identifier a.x;
      expression e;
      @@
      - if (<+... x ...+>@e)
      -     {}
      - else
      + if (!(e))
            {...}
      
      @@
      expression x, y, z;
      @@
      - x = of_get_mac_address(y, z);
      + of_get_mac_address(y, z);
        ... when != x
      </spml>
      
      All drivers, except drivers/net/ethernet/aeroflex/greth.c, were
      compile-time tested.
      Suggested-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: Michael Walle <michael@walle.cc>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      83216e39