1. 12 Jan, 2021 4 commits
    • net: switchdev: remove the transaction structure from port object notifiers · ffb68fc5
      Committed by Vladimir Oltean
      Since the introduction of the switchdev API, port objects were
      transmitted to drivers for offloading using a two-step transactional
      model, with a prepare phase that was supposed to catch all errors, and a
      commit phase that was supposed to never fail.
      
      Some classes of failures can never be avoided, like hardware access, or
      memory allocation. In the latter case, merely attempting to move the
      memory allocation to the preparation phase makes it impossible to avoid
      memory leaks, since commit 91cf8ece ("switchdev: Remove unused
      transaction item queue") which has removed the unused mechanism of
      passing on the allocated memory between one phase and another.
      
      It is time we admit that separating the preparation from the commit
      phase is something that is best left for the driver to decide, and not
      something that should be baked into the API, especially since there are
      no switchdev callers that depend on this.
      
      This patch removes the struct switchdev_trans member from switchdev port
      object notifier structures, and converts drivers to not look at this
      member.
      
      Where driver conversion is trivial (like in the case of the Marvell
      Prestera driver, NXP DPAA2 switch, TI CPSW, and Rocker drivers), it is
      done in this patch.
      
      Where driver conversion needs more attention (DSA, Mellanox Spectrum),
      the conversion is left for subsequent patches and here we only fake the
      prepare/commit phases at a lower level, just not in the switchdev
      notifier itself.
      
      Where the code has a natural structure that is best left alone as a
      preparation and a commit phase (as in the case of the Ocelot switch),
      that structure is left in place, just made to not depend upon the
      switchdev transactional model.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Acked-by: Linus Walleij <linus.walleij@linaro.org>
      Acked-by: Jiri Pirko <jiri@nvidia.com>
      Reviewed-by: Ido Schimmel <idosch@nvidia.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
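
A minimal userspace sketch of what this could mean for a driver, assuming a made-up `demo_port` table and `demo_*` helpers (none of these names come from the kernel). The handler is called once per object, and any check/apply split lives inside the driver rather than in a transaction argument.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MAX_VLANS 4 /* hypothetical table size, for illustration only */

/* Hypothetical stand-in for a driver's VLAN table; not a kernel structure. */
struct demo_port {
	uint16_t vids[MAX_VLANS];
	int count;
};

/* Driver-internal "prepare": validate without touching state. */
static int demo_vlan_check(const struct demo_port *p, uint16_t vid)
{
	if (vid == 0 || vid >= 4095)
		return -EINVAL;
	if (p->count >= MAX_VLANS)
		return -ENOSPC;
	return 0;
}

/* Driver-internal "commit": apply the change that was already validated. */
static void demo_vlan_apply(struct demo_port *p, uint16_t vid)
{
	assert(p->count < MAX_VLANS); /* guaranteed by demo_vlan_check() */
	p->vids[p->count++] = vid;
}

/*
 * Single entry point, called once per object. There is no transaction
 * argument: the driver decides for itself whether to split the work.
 */
int demo_port_obj_add(struct demo_port *p, uint16_t vid)
{
	int err = demo_vlan_check(p, vid);

	if (err)
		return err;
	demo_vlan_apply(p, vid);
	return 0;
}
```

In this sketch, calling `demo_port_obj_add(&port, 10)` on an empty table returns 0 and records the VLAN, while VID 0 is rejected with `-EINVAL` before any state changes.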
    • net: dsa: mv88e6xxx: deny vid 0 on the CPU port and DSA links too · 3e85f580
      Committed by Vladimir Oltean
      mv88e6xxx apparently has a problem offloading VID 0, which the 8021q
      module tries to install as part of commit ad1afb00 ("vlan_dev: VLAN
      0 should be treated as "no vlan tag" (802.1p packet)"). That mv88e6xxx
      restriction seems to have been introduced by the "VTU GetNext VID-1
      trick to retrieve a single entry" - see commit 2fb5ef09 ("net: dsa:
      mv88e6xxx: extract single VLAN retrieval").
      
      There is one more problem. The mv88e6xxx CPU port and DSA links do not
      properly report in the prepare phase which VLANs they can offload.
      They'll say they can offload everything:
      
      mv88e6xxx_port_vlan_prepare
      -> mv88e6xxx_port_check_hw_vlan:
      
      	/* DSA and CPU ports have to be members of multiple vlans */
      	if (dsa_is_dsa_port(ds, port) || dsa_is_cpu_port(ds, port))
      		return 0;
      
      Except that if you actually try to commit to it, they'll error out and
      print this message:
      
      [   32.802438] mv88e6085 d0032004.mdio-mii:12: p9: failed to add VLAN 0t
      
      which comes from:
      
      mv88e6xxx_port_vlan_add
      -> mv88e6xxx_port_vlan_join:
      
      	if (!vid)
      		return -EOPNOTSUPP;
      
      What prevents this condition from triggering in real life? The fact that
      when a DSA_NOTIFIER_VLAN_ADD is emitted, it never targets a DSA link
      directly. Instead, the notifier will always target either a user port or
      a CPU port. DSA links just happen to get dragged in by:
      
      static bool dsa_switch_vlan_match(struct dsa_switch *ds, int port,
      				  struct dsa_notifier_vlan_info *info)
      {
      	...
      	if (dsa_is_dsa_port(ds, port))
      		return true;
      	...
      }
      
      So for every DSA VLAN notifier, during the prepare phase, it will just
      so happen that there will be somebody to say "no, don't do that".
      
      This will become a problem when the switchdev prepare/commit transactional
      model goes away. Every port needs to think on its own. DSA links can no
      longer bluff and rely on the fact that the prepare phase will not go
      through to the end, because there will be no prepare phase any longer.
      
      Fix this issue before it becomes a problem, by having the "vid == 0"
      check earlier than the check whether we are a CPU port / DSA link or not.
      Also, the "vid == 0" check becomes unnecessary in the .port_vlan_add
      callback, so we can remove it.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
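
The reordering described above can be sketched in plain userspace C. This is an illustration under stated assumptions, not the driver's code: `demo_port_type` and `demo_check_hw_vlan` are hypothetical names standing in for the `dsa_is_cpu_port()` / `dsa_is_dsa_port()` tests and `mv88e6xxx_port_check_hw_vlan` quoted in the message.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical port roles, standing in for the kernel's port-type helpers. */
enum demo_port_type { DEMO_PORT_USER, DEMO_PORT_CPU, DEMO_PORT_DSA };

/*
 * Sketch of the reordered check: VID 0 is rejected first, so CPU ports and
 * DSA links can no longer claim to support it via the early "return 0".
 */
int demo_check_hw_vlan(enum demo_port_type type, uint16_t vid)
{
	assert(vid < 4096); /* VLAN IDs are 12 bits wide */

	if (!vid)
		return -EOPNOTSUPP;

	/* DSA and CPU ports have to be members of multiple vlans */
	if (type == DEMO_PORT_DSA || type == DEMO_PORT_CPU)
		return 0;

	/* Further per-user-port checks would go here (omitted in this sketch). */
	return 0;
}
```

With the check in this order, every port type gives the same answer for VID 0, so no port depends on another one vetoing the request on its behalf.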
    • net: switchdev: remove vid_begin -> vid_end range from VLAN objects · b7a9e0da
      Committed by Vladimir Oltean
      The call path of a switchdev VLAN addition to the bridge looks something
      like this today:
      
              nbp_vlan_init
              |  __br_vlan_set_default_pvid
              |  |                       |
              |  |    br_afspec          |
              |  |        |              |
              |  |        v              |
              |  | br_process_vlan_info  |
              |  |        |              |
              |  |        v              |
              |  |   br_vlan_info        |
              |  |       / \            /
              |  |      /   \          /
              |  |     /     \        /
              |  |    /       \      /
              v  v   v         v    v
            nbp_vlan_add   br_vlan_add ------+
             |              ^      ^ |       |
             |             /       | |       |
             |            /       /  /       |
             \ br_vlan_get_master/  /        v
              \        ^        /  /  br_vlan_add_existing
               \       |       /  /          |
                \      |      /  /          /
                 \     |     /  /          /
                  \    |    /  /          /
                   \   |   /  /          /
                    v  |   | v          /
                    __vlan_add         /
                       / |            /
                      /  |           /
                     v   |          /
         __vlan_vid_add  |         /
                     \   |        /
                      v  v        v
            br_switchdev_port_vlan_add
      
      The ranges UAPI was introduced to the bridge in commit bdced7ef
      ("bridge: support for multiple vlans and vlan ranges in setlink and
      dellink requests") (Jan 10 2015). But the VLAN ranges (parsed in br_afspec)
      have always been passed one by one, through struct bridge_vlan_info
      tmp_vinfo, to br_vlan_info. So the range never went too far in depth.
      
      Then Scott Feldman introduced the switchdev_port_bridge_setlink function
      in commit 47f8328b ("switchdev: add new switchdev bridge setlink").
      That marked the introduction of the SWITCHDEV_OBJ_PORT_VLAN, which made
      full use of the range. But switchdev_port_bridge_setlink was called like
      this:
      
      br_setlink
      -> br_afspec
      -> switchdev_port_bridge_setlink
      
      Basically, the switchdev and the bridge code were not tightly integrated.
      Then commit 41c498b9 ("bridge: restore br_setlink back to original")
      came, and switchdev drivers were required to implement
      .ndo_bridge_setlink = switchdev_port_bridge_setlink for a while.
      
      In the meantime, commits such as 0944d6b5 ("bridge: try switchdev op
      first in __vlan_vid_add/del") finally made switchdev penetrate the
      br_vlan_info() barrier and start to develop the call path we have today.
      But remember, br_vlan_info() still receives VLANs one by one.
      
      Then Arkadi Sharshevsky refactored the switchdev API in 2017 in commit
      29ab586c ("net: switchdev: Remove bridge bypass support from
      switchdev") so that drivers would not implement .ndo_bridge_setlink any
      longer. The switchdev_port_bridge_setlink also got deleted.
      This refactoring removed the parallel bridge_setlink implementation from
      switchdev, and left the only switchdev VLAN objects to be the ones
      offloaded from __vlan_vid_add (basically RX filtering) and __vlan_add
      (the latter coming from commit 9c86ce2c ("net: bridge: Notify about
      bridge VLANs")).
      
      That is to say, today the switchdev VLAN object ranges are not used in
      the kernel. Refactoring the above call path is a bit complicated, when
      the bridge VLAN call path is already a bit complicated.
      
      Let's go off and finish the job of commit 29ab586c by deleting the
      bogus iteration through the VLAN ranges from the drivers. Some aspects
      of this feature never made too much sense in the first place. For
      example, what is a range of VLANs all having the BRIDGE_VLAN_INFO_PVID
      flag supposed to mean, when a port can obviously have a single pvid?
      This particular configuration _is_ denied as of commit 6623c60d
      ("bridge: vlan: enforce no pvid flag in vlan ranges"), but from an API
      perspective, the driver still has to play pretend, and only offload the
      vlan->vid_end as pvid. And the addition of a switchdev VLAN object can
      modify the flags of another, completely unrelated, switchdev VLAN
      object! (a VLAN that is PVID will invalidate the PVID flag from whatever
      other VLAN had previously been offloaded with switchdev and had that
      flag. Yet switchdev never notifies about that change, drivers are
      supposed to guess).
      
      Nonetheless, having a VLAN range in the API makes error handling look
      scarier than it really is - unwinding on errors and all of that.
      When in reality, no one really calls this API with more than one VLAN.
      It is all unnecessary complexity.
      
      And despite appearing pretentious (two-phase transactional model and
      all), the switchdev API is really sloppy because the VLAN addition and
      removal operations are not paired with one another (you can add a VLAN
      100 times and delete it just once). The bridge notifies through
      switchdev of a VLAN addition not only when the flags of an existing VLAN
      change, but also when nothing changes. There are switchdev drivers out
      there who don't like adding a VLAN that has already been added, and
      those checks don't really belong at driver level. But the fact that the
      API contains ranges is yet another factor that prevents this from being
      addressed in the future.
      
      Of the existing switchdev pieces of hardware, it appears that only
      Mellanox Spectrum supports offloading more than one VLAN at a time,
      through mlxsw_sp_port_vlan_set. I have kept that code internal to the
      driver, because there is some more bookkeeping that makes use of it, but
      I deleted it from the switchdev API. But since the switchdev support for
      ranges has already been de facto deleted by a Mellanox employee and
      nobody noticed for 4 years, I'm going to assume it's not a biggie.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Ido Schimmel <idosch@nvidia.com> # switchdev and mlxsw
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> # hellcreek
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
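
A rough userspace sketch of the difference between a ranged and a single-VID object, assuming simplified, hypothetical structures (`demo_vlan_range`, `demo_vlan`, `demo_port`) and a made-up `DEMO_FLAG_PVID` in place of `BRIDGE_VLAN_INFO_PVID`. It also illustrates the "play pretend" behaviour the message describes, where only the last VID of a range ends up as the pvid.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_FLAG_PVID 0x1 /* hypothetical stand-in for BRIDGE_VLAN_INFO_PVID */

/* Hypothetical, simplified shapes of the object before and after. */
struct demo_vlan_range { uint16_t flags, vid_begin, vid_end; };
struct demo_vlan       { uint16_t flags, vid; };

struct demo_port { uint16_t pvid; int vlan_count; };

/* After: one VLAN per call, so there is nothing to unwind on error. */
void demo_vlan_add(struct demo_port *p, const struct demo_vlan *v)
{
	p->vlan_count++;
	if (v->flags & DEMO_FLAG_PVID)
		p->pvid = v->vid;
}

/* Before: the driver walks the range; with PVID set, the last VID wins. */
void demo_vlan_range_add(struct demo_port *p, const struct demo_vlan_range *r)
{
	assert(r->vid_begin <= r->vid_end);

	/* 32-bit counter so the loop terminates even if vid_end is 0xffff. */
	for (uint32_t vid = r->vid_begin; vid <= r->vid_end; vid++) {
		struct demo_vlan v = { .flags = r->flags, .vid = (uint16_t)vid };

		demo_vlan_add(p, &v);
	}
}
```

In this sketch, adding the range 10..12 with the PVID flag leaves the port with three VLANs and a pvid of 12, which matches the `vlan->vid_end` convention mentioned above.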
    • r8169: deprecate support for RTL_GIGA_MAC_VER_27 · beb401ec
      Committed by Heiner Kallweit
      RTL8168dp is ancient anyway, and I haven't seen any trace of its early
      version 27 yet. This chip version needs quite some special handling,
      therefore it would facilitate driver maintenance if support for it
      could be dropped. For now just disable detection of this chip version.
      If nobody complains we can remove support for it in the near future.
      
      v2:
      - extend unknown chip version error message
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Link: https://lore.kernel.org/r/ca98f018-a0e1-8762-e95c-f0ad773a0271@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  2. 10 Jan, 2021 34 commits
  3. 09 Jan, 2021 2 commits
    • ppp: clean up endianness conversions · 09b5b5fb
      Committed by Julian Wiedmann
      sparse complains about some harmless endianness issues:
      
      > drivers/net/ppp/pptp.c:281:21: warning: incorrect type in assignment (different base types)
      > drivers/net/ppp/pptp.c:281:21:    expected unsigned int [usertype] ack
      > drivers/net/ppp/pptp.c:281:21:    got restricted __be32
      > drivers/net/ppp/pptp.c:283:23: warning: cast to restricted __be32
      
      Here 'ack' is assigned a value in network-order, and then also the
      byte-swapped value in host-order. Clean this up by doing the byte-swap
      as part of the assignment.
      
      > drivers/net/ppp/pptp.c:358:26: warning: cast from restricted __be16
      > drivers/net/ppp/pptp.c:358:26: warning: incorrect type in argument 1 (different base types)
      > drivers/net/ppp/pptp.c:358:26:    expected unsigned short [usertype] call_id
      > drivers/net/ppp/pptp.c:358:26:    got restricted __be16 [usertype]
      
      Here we use the wrong flavour of byte-swap. Use ntohs(), which of course
      gives the same result.
      
      Cc: Dmitry Kozlov <xeb@mail.ru>
      Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
      Link: https://lore.kernel.org/r/20210107143956.25549-1-jwi@linux.ibm.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
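
The two clean-ups can be illustrated with a small userspace sketch using the standard `<arpa/inet.h>` conversions. The helper names `demo_read_be32` and `demo_call_id_to_host` are hypothetical and do not appear in `pptp.c`.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Read a 32-bit big-endian field from a packet buffer and return it in
 * host order, converting as part of the assignment instead of storing the
 * raw wire value first and byte-swapping it later.
 */
uint32_t demo_read_be32(const uint8_t *buf)
{
	uint32_t wire;

	assert(buf != NULL);
	memcpy(&wire, buf, sizeof(wire)); /* avoids unaligned access */
	return ntohl(wire);
}

/*
 * Convert a 16-bit wire value to host order. htons() would produce the
 * same bits, but ntohs() states the direction that is actually meant.
 */
uint16_t demo_call_id_to_host(uint16_t wire)
{
	return ntohs(wire);
}
```

Both 16-bit conversions perform the same byte swap (or none at all on a big-endian host), which is why picking the "wrong flavour" is harmless at run time and only shows up in sparse's type checking.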
    • net: ip_tunnel: clean up endianness conversions · fda4fde2
      Committed by Julian Wiedmann
      sparse complains about some harmless endianness issues:
      
      > net/ipv4/ip_tunnel_core.c:225:43: warning: cast to restricted __be16
      > net/ipv4/ip_tunnel_core.c:225:43: warning: incorrect type in initializer (different base types)
      > net/ipv4/ip_tunnel_core.c:225:43:    expected restricted __be16 [usertype] mtu
      > net/ipv4/ip_tunnel_core.c:225:43:    got unsigned short [usertype]
      
      iptunnel_pmtud_build_icmp() uses the wrong flavour of byte-order conversion
      when storing the MTU into the ICMPv4 packet. Use htons(), just like
      iptunnel_pmtud_build_icmpv6() does.
      
      > net/ipv4/ip_tunnel_core.c:248:35: warning: cast from restricted __be16
      > net/ipv4/ip_tunnel_core.c:248:35: warning: incorrect type in argument 3 (different base types)
      > net/ipv4/ip_tunnel_core.c:248:35:    expected unsigned short type
      > net/ipv4/ip_tunnel_core.c:248:35:    got restricted __be16 [usertype]
      > net/ipv4/ip_tunnel_core.c:341:35: warning: cast from restricted __be16
      > net/ipv4/ip_tunnel_core.c:341:35: warning: incorrect type in argument 3 (different base types)
      > net/ipv4/ip_tunnel_core.c:341:35:    expected unsigned short type
      > net/ipv4/ip_tunnel_core.c:341:35:    got restricted __be16 [usertype]
      
      eth_header() wants the Ethertype in host-order, use the correct flavour of
      byte-order conversion.
      
      > net/ipv4/ip_tunnel_core.c:600:45: warning: restricted __be16 degrades to integer
      > net/ipv4/ip_tunnel_core.c:609:30: warning: incorrect type in assignment (different base types)
      > net/ipv4/ip_tunnel_core.c:609:30:    expected int type
      > net/ipv4/ip_tunnel_core.c:609:30:    got restricted __be16 [usertype]
      > net/ipv4/ip_tunnel_core.c:619:30: warning: incorrect type in assignment (different base types)
      > net/ipv4/ip_tunnel_core.c:619:30:    expected int type
      > net/ipv4/ip_tunnel_core.c:619:30:    got restricted __be16 [usertype]
      > net/ipv4/ip_tunnel_core.c:629:30: warning: incorrect type in assignment (different base types)
      > net/ipv4/ip_tunnel_core.c:629:30:    expected int type
      > net/ipv4/ip_tunnel_core.c:629:30:    got restricted __be16 [usertype]
      
      The TUNNEL_* types are big-endian, so adjust the type of the local
      variable in ip_tun_parse_opts().
      Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
      Link: https://lore.kernel.org/r/20210107144008.25777-1-jwi@linux.ibm.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
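
For the write direction (storing a host-order value such as the MTU into a packet), a minimal userspace sketch might look like the following. `demo_store_mtu` is a hypothetical helper used only for illustration; it is not the code in `ip_tunnel_core.c`.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Store an MTU into a 2-byte wire field in network (big-endian) order. */
void demo_store_mtu(uint8_t *field, uint16_t mtu)
{
	uint16_t wire = htons(mtu); /* host to network, matching the intent */

	assert(field != NULL);
	memcpy(field, &wire, sizeof(wire));
}
```

Storing an MTU of 1500 (0x05DC) this way places 0x05 in the first byte and 0xDC in the second, regardless of the host's byte order.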