1. 12 1月, 2021 2 次提交
    • V
      net: switchdev: remove the transaction structure from port attributes · bae33f2b
      Vladimir Oltean 提交于
      Since the introduction of the switchdev API, port attributes were
      transmitted to drivers for offloading using a two-step transactional
      model, with a prepare phase that was supposed to catch all errors, and a
      commit phase that was supposed to never fail.
      
      Some classes of failures can never be avoided, like hardware access, or
      memory allocation. In the latter case, merely attempting to move the
      memory allocation to the preparation phase makes it impossible to avoid
      memory leaks, since commit 91cf8ece ("switchdev: Remove unused
      transaction item queue") which has removed the unused mechanism of
      passing on the allocated memory between one phase and another.
      
      It is time we admit that separating the preparation from the commit
      phase is something that is best left for the driver to decide, and not
      something that should be baked into the API, especially since there are
      no switchdev callers that depend on this.
      
      This patch removes the struct switchdev_trans member from switchdev port
      attribute notifier structures, and converts drivers to not look at this
      member.
      
      In part, this patch contains a revert of my previous commit 2e554a7a
      ("net: dsa: propagate switchdev vlan_filtering prepare phase to
      drivers").
      
      For the most part, the conversion was trivial except for:
      - Rocker's world implementation based on Broadcom OF-DPA had an odd
        implementation of ofdpa_port_attr_bridge_flags_set. The conversion was
        done mechanically, by pasting the implementation twice, then only
        keeping the code that would get executed during prepare phase on top,
        then only keeping the code that gets executed during the commit phase
        on bottom, then simplifying the resulting code until this was obtained.
      - DSA's offloading of STP state, bridge flags, VLAN filtering and
        multicast router could be converted right away. But the ageing time
        could not, so a shim was introduced and this was left for a further
        commit.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Acked-by: NLinus Walleij <linus.walleij@linaro.org>
      Acked-by: NJiri Pirko <jiri@nvidia.com>
      Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> # hellcreek
      Reviewed-by: Linus Walleij <linus.walleij@linaro.org> # RTL8366RB
      Reviewed-by: NIdo Schimmel <idosch@nvidia.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      bae33f2b
    • V
      net: switchdev: remove vid_begin -> vid_end range from VLAN objects · b7a9e0da
      Vladimir Oltean 提交于
      The call path of a switchdev VLAN addition to the bridge looks something
      like this today:
      
              nbp_vlan_init
              |  __br_vlan_set_default_pvid
              |  |                       |
              |  |    br_afspec          |
              |  |        |              |
              |  |        v              |
              |  | br_process_vlan_info  |
              |  |        |              |
              |  |        v              |
              |  |   br_vlan_info        |
              |  |       / \            /
              |  |      /   \          /
              |  |     /     \        /
              |  |    /       \      /
              v  v   v         v    v
            nbp_vlan_add   br_vlan_add ------+
             |              ^      ^ |       |
             |             /       | |       |
             |            /       /  /       |
             \ br_vlan_get_master/  /        v
              \        ^        /  /  br_vlan_add_existing
               \       |       /  /          |
                \      |      /  /          /
                 \     |     /  /          /
                  \    |    /  /          /
                   \   |   /  /          /
                    v  |   | v          /
                    __vlan_add         /
                       / |            /
                      /  |           /
                     v   |          /
         __vlan_vid_add  |         /
                     \   |        /
                      v  v        v
            br_switchdev_port_vlan_add
      
      The ranges UAPI was introduced to the bridge in commit bdced7ef
      ("bridge: support for multiple vlans and vlan ranges in setlink and
      dellink requests") (Jan 10 2015). But the VLAN ranges (parsed in br_afspec)
      have always been passed one by one, through struct bridge_vlan_info
      tmp_vinfo, to br_vlan_info. So the range never went too far in depth.
      
      Then Scott Feldman introduced the switchdev_port_bridge_setlink function
      in commit 47f8328b ("switchdev: add new switchdev bridge setlink").
      That marked the introduction of the SWITCHDEV_OBJ_PORT_VLAN, which made
      full use of the range. But switchdev_port_bridge_setlink was called like
      this:
      
      br_setlink
      -> br_afspec
      -> switchdev_port_bridge_setlink
      
      Basically, the switchdev and the bridge code were not tightly integrated.
      Then commit 41c498b9 ("bridge: restore br_setlink back to original")
      came, and switchdev drivers were required to implement
      .ndo_bridge_setlink = switchdev_port_bridge_setlink for a while.
      
      In the meantime, commits such as 0944d6b5 ("bridge: try switchdev op
      first in __vlan_vid_add/del") finally made switchdev penetrate the
      br_vlan_info() barrier and start to develop the call path we have today.
      But remember, br_vlan_info() still receives VLANs one by one.
      
      Then Arkadi Sharshevsky refactored the switchdev API in 2017 in commit
      29ab586c ("net: switchdev: Remove bridge bypass support from
      switchdev") so that drivers would not implement .ndo_bridge_setlink any
      longer. The switchdev_port_bridge_setlink also got deleted.
      This refactoring removed the parallel bridge_setlink implementation from
      switchdev, and left the only switchdev VLAN objects to be the ones
      offloaded from __vlan_vid_add (basically RX filtering) and  __vlan_add
      (the latter coming from commit 9c86ce2c ("net: bridge: Notify about
      bridge VLANs")).
      
      That is to say, today the switchdev VLAN object ranges are not used in
      the kernel. Refactoring the above call path is a bit complicated, when
      the bridge VLAN call path is already a bit complicated.
      
      Let's go off and finish the job of commit 29ab586c by deleting the
      bogus iteration through the VLAN ranges from the drivers. Some aspects
      of this feature never made too much sense in the first place. For
      example, what is a range of VLANs all having the BRIDGE_VLAN_INFO_PVID
      flag supposed to mean, when a port can obviously have a single pvid?
      This particular configuration _is_ denied as of commit 6623c60d
      ("bridge: vlan: enforce no pvid flag in vlan ranges"), but from an API
      perspective, the driver still has to play pretend, and only offload the
      vlan->vid_end as pvid. And the addition of a switchdev VLAN object can
      modify the flags of another, completely unrelated, switchdev VLAN
      object! (a VLAN that is PVID will invalidate the PVID flag from whatever
      other VLAN had previously been offloaded with switchdev and had that
      flag. Yet switchdev never notifies about that change, drivers are
      supposed to guess).
      
      Nonetheless, having a VLAN range in the API makes error handling look
      scarier than it really is - unwinding on errors and all of that.
      When in reality, no one really calls this API with more than one VLAN.
      It is all unnecessary complexity.
      
      And despite appearing pretentious (two-phase transactional model and
      all), the switchdev API is really sloppy because the VLAN addition and
      removal operations are not paired with one another (you can add a VLAN
      100 times and delete it just once). The bridge notifies through
      switchdev of a VLAN addition not only when the flags of an existing VLAN
      change, but also when nothing changes. There are switchdev drivers out
      there who don't like adding a VLAN that has already been added, and
      those checks don't really belong at driver level. But the fact that the
      API contains ranges is yet another factor that prevents this from being
      addressed in the future.
      
      Of the existing switchdev pieces of hardware, it appears that only
      Mellanox Spectrum supports offloading more than one VLAN at a time,
      through mlxsw_sp_port_vlan_set. I have kept that code internal to the
      driver, because there is some more bookkeeping that makes use of it, but
      I deleted it from the switchdev API. But since the switchdev support for
      ranges has already been de facto deleted by a Mellanox employee and
      nobody noticed for 4 years, I'm going to assume it's not a biggie.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Ido Schimmel <idosch@nvidia.com> # switchdev and mlxsw
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> # hellcreek
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      b7a9e0da
  2. 03 12月, 2020 11 次提交
  3. 26 11月, 2020 3 次提交
  4. 13 10月, 2020 1 次提交
    • C
      net: dsa: microchip: fix race condition · 8098bd69
      Christian Eggers 提交于
      Between queuing the delayed work and finishing the setup of the dsa
      ports, the process may sleep in request_module() (via
      phy_device_create()) and the queued work may be executed prior to the
      switch net devices being registered. In ksz_mib_read_work(), a NULL
      dereference will happen within netof_carrier_ok(dp->slave).
      
      Not queuing the delayed work in ksz_init_mib_timer() makes things even
      worse because the work will now be queued for immediate execution
      (instead of 2000 ms) in ksz_mac_link_down() via
      dsa_port_link_register_of().
      
      Call tree:
      ksz9477_i2c_probe()
      \--ksz9477_switch_register()
         \--ksz_switch_register()
            +--dsa_register_switch()
            |  \--dsa_switch_probe()
            |     \--dsa_tree_setup()
            |        \--dsa_tree_setup_switches()
            |           +--dsa_switch_setup()
            |           |  +--ksz9477_setup()
            |           |  |  \--ksz_init_mib_timer()
            |           |  |     |--/* Start the timer 2 seconds later. */
            |           |  |     \--schedule_delayed_work(&dev->mib_read, msecs_to_jiffies(2000));
            |           |  \--__mdiobus_register()
            |           |     \--mdiobus_scan()
            |           |        \--get_phy_device()
            |           |           +--get_phy_id()
            |           |           \--phy_device_create()
            |           |              |--/* sleeping, ksz_mib_read_work() can be called meanwhile */
            |           |              \--request_module()
            |           |
            |           \--dsa_port_setup()
            |              +--/* Called for non-CPU ports */
            |              +--dsa_slave_create()
            |              |  +--/* Too late, ksz_mib_read_work() may be called beforehand */
            |              |  \--port->slave = ...
            |             ...
            |              +--Called for CPU port */
            |              \--dsa_port_link_register_of()
            |                 \--ksz_mac_link_down()
            |                    +--/* mib_read must be initialized here */
            |                    +--/* work is already scheduled, so it will be executed after 2000 ms */
            |                    \--schedule_delayed_work(&dev->mib_read, 0);
            \-- /* here port->slave is setup properly, scheduling the delayed work should be safe */
      
      Solution:
      1. Do not queue (only initialize) delayed work in ksz_init_mib_timer().
      2. Only queue delayed work in ksz_mac_link_down() if init is completed.
      3. Queue work once in ksz_switch_register(), after dsa_register_switch()
      has completed.
      
      Fixes: 7c6ff470 ("net: dsa: microchip: add MIB counter reading support")
      Signed-off-by: NChristian Eggers <ceggers@arri.de>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Reviewed-by: NVladimir Oltean <olteanv@gmail.com>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      8098bd69
  5. 10 10月, 2020 1 次提交
  6. 05 10月, 2020 1 次提交
    • V
      net: dsa: propagate switchdev vlan_filtering prepare phase to drivers · 2e554a7a
      Vladimir Oltean 提交于
      A driver may refuse to enable VLAN filtering for any reason beyond what
      the DSA framework cares about, such as:
      - having tc-flower rules that rely on the switch being VLAN-aware
      - the particular switch does not support VLAN, even if the driver does
        (the DSA framework just checks for the presence of the .port_vlan_add
        and .port_vlan_del pointers)
      - simply not supporting this configuration to be toggled at runtime
      
      Currently, when a driver rejects a configuration it cannot support, it
      does this from the commit phase, which triggers various warnings in
      switchdev.
      
      So propagate the prepare phase to drivers, to give them the ability to
      refuse invalid configurations cleanly and avoid the warnings.
      
      Since we need to modify all function prototypes and check for the
      prepare phase from within the drivers, take that opportunity and move
      the existing driver restrictions within the prepare phase where that is
      possible and easy.
      
      Cc: Florian Fainelli <f.fainelli@gmail.com>
      Cc: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
      Cc: Hauke Mehrtens <hauke@hauke-m.de>
      Cc: Woojung Huh <woojung.huh@microchip.com>
      Cc: Microchip Linux Driver Support <UNGLinuxDriver@microchip.com>
      Cc: Sean Wang <sean.wang@mediatek.com>
      Cc: Landen Chao <Landen.Chao@mediatek.com>
      Cc: Andrew Lunn <andrew@lunn.ch>
      Cc: Vivien Didelot <vivien.didelot@gmail.com>
      Cc: Jonathan McDowell <noodles@earth.li>
      Cc: Linus Walleij <linus.walleij@linaro.org>
      Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
      Cc: Claudiu Manoil <claudiu.manoil@nxp.com>
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2e554a7a
  7. 25 9月, 2020 1 次提交
  8. 17 9月, 2020 1 次提交
  9. 11 9月, 2020 1 次提交
  10. 10 9月, 2020 4 次提交
  11. 24 8月, 2020 1 次提交
  12. 22 7月, 2020 1 次提交
    • H
      net: dsa: microchip: call phy_remove_link_mode during probe · 3506b2f4
      Helmut Grohne 提交于
      When doing "ip link set dev ... up" for a ksz9477 backed link,
      ksz9477_phy_setup is called and it calls phy_remove_link_mode to remove
      1000baseT HDX. During phy_remove_link_mode, phy_advertise_supported is
      called. Doing so reverts any previous change to advertised link modes
      e.g. using a udevd .link file.
      
      phy_remove_link_mode is not meant to be used while opening a link and
      should be called during phy probe when the link is not yet available to
      userspace.
      
      Therefore move the phy_remove_link_mode calls into
      ksz9477_switch_register. It indirectly calls dsa_register_switch, which
      creates the relevant struct phy_devices and we update the link modes
      right after that. At that time dev->features is already initialized by
      ksz9477_switch_detect.
      
      Remove phy_setup from ksz_dev_ops as no users remain.
      
      Link: https://lore.kernel.org/netdev/20200715192722.GD1256692@lunn.ch/
      Fixes: 42fc6a4c ("net: dsa: microchip: prepare PHY for proper advertisement")
      Signed-off-by: NHelmut Grohne <helmut.grohne@intenta.de>
      Reviewed-by: NAndrew Lunn <andrew@lunn.ch>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3506b2f4
  13. 05 7月, 2020 2 次提交
  14. 03 7月, 2020 1 次提交
  15. 02 7月, 2020 1 次提交
  16. 01 4月, 2020 1 次提交
  17. 11 3月, 2020 1 次提交
  18. 08 2月, 2020 1 次提交
  19. 09 1月, 2020 1 次提交
    • F
      net: dsa: Get information about stacked DSA protocol · 4d776482
      Florian Fainelli 提交于
      It is possible to stack multiple DSA switches in a way that they are not
      part of the tree (disjoint) but the DSA master of a switch is a DSA
      slave of another. When that happens switch drivers may have to know this
      is the case so as to determine whether their tagging protocol has a
      remove chance of working.
      
      This is useful for specific switch drivers such as b53 where devices
      have been known to be stacked in the wild without the Broadcom tag
      protocol supporting that feature. This allows b53 to continue supporting
      those devices by forcing the disabling of Broadcom tags on the outermost
      switches if necessary.
      
      The get_tag_protocol() function is therefore updated to gain an
      additional enum dsa_tag_protocol argument which denotes the current
      tagging protocol used by the DSA master we are attached to, else
      DSA_TAG_PROTO_NONE for the top of the dsa_switch_tree.
      Signed-off-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4d776482
  20. 05 11月, 2019 1 次提交
    • A
      net: of_get_phy_mode: Change API to solve int/unit warnings · 0c65b2b9
      Andrew Lunn 提交于
      Before this change of_get_phy_mode() returned an enum,
      phy_interface_t. On error, -ENODEV etc, is returned. If the result of
      the function is stored in a variable of type phy_interface_t, and the
      compiler has decided to represent this as an unsigned int, comparision
      with -ENODEV etc, is a signed vs unsigned comparision.
      
      Fix this problem by changing the API. Make the function return an
      error, or 0 on success, and pass a pointer, of type phy_interface_t,
      where the phy mode should be stored.
      
      v2:
      Return with *interface set to PHY_INTERFACE_MODE_NA on error.
      Add error checks to all users of of_get_phy_mode()
      Fixup a few reverse christmas tree errors
      Fixup a few slightly malformed reverse christmas trees
      
      v3:
      Fix 0-day reported errors.
      Reported-by: NDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: NAndrew Lunn <andrew@lunn.ch>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0c65b2b9
  21. 23 10月, 2019 1 次提交
  22. 18 10月, 2019 2 次提交
    • M
      net: dsa: microchip: Add shared regmap mutex · 013572a2
      Marek Vasut 提交于
      The KSZ driver uses one regmap per register width (8/16/32), each with
      it's own lock, but accessing the same set of registers. In theory, it
      is possible to create a race condition between these regmaps, although
      the underlying bus (SPI or I2C) locking should assure nothing bad will
      really happen and the accesses would be correct.
      
      To make the driver do the right thing, add one single shared mutex for
      all the regmaps used by the driver instead. This assures that even if
      some future hardware is on a bus which does not serialize the accesses
      the same way SPI or I2C does, nothing bad will happen.
      
      Note that the status_mutex was unused and only initied, hence it was
      renamed and repurposed as the regmap mutex.
      Signed-off-by: NMarek Vasut <marex@denx.de>
      Cc: Andrew Lunn <andrew@lunn.ch>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Florian Fainelli <f.fainelli@gmail.com>
      Cc: George McCollister <george.mccollister@gmail.com>
      Cc: Tristram Ha <Tristram.Ha@microchip.com>
      Cc: Woojung Huh <woojung.huh@microchip.com>
      Reviewed-by: NAndrew Lunn <andrew@lunn.ch>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      013572a2
    • M
      net: dsa: microchip: Do not reinit mutexes on KSZ87xx · 7f238ca9
      Marek Vasut 提交于
      The KSZ87xx driver calls mutex_init() on mutexes already inited in
      ksz_common.c ksz_switch_register(). Do not do it twice, drop the
      reinitialization.
      Signed-off-by: NMarek Vasut <marex@denx.de>
      Cc: Andrew Lunn <andrew@lunn.ch>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Florian Fainelli <f.fainelli@gmail.com>
      Cc: George McCollister <george.mccollister@gmail.com>
      Cc: Tristram Ha <Tristram.Ha@microchip.com>
      Cc: Woojung Huh <woojung.huh@microchip.com>
      Reviewed-by: NAndrew Lunn <andrew@lunn.ch>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7f238ca9