1. 12 Jan, 2021 3 commits
    • V
      net: switchdev: remove the transaction structure from port attributes · bae33f2b
      Vladimir Oltean committed
      Since the introduction of the switchdev API, port attributes were
      transmitted to drivers for offloading using a two-step transactional
      model, with a prepare phase that was supposed to catch all errors, and a
      commit phase that was supposed to never fail.
      
      Some classes of failures can never be avoided, like hardware access, or
      memory allocation. In the latter case, merely attempting to move the
      memory allocation to the preparation phase makes it impossible to avoid
      memory leaks, since commit 91cf8ece ("switchdev: Remove unused
      transaction item queue") which has removed the unused mechanism of
      passing on the allocated memory between one phase and another.
      
      It is time we admit that separating the preparation from the commit
      phase is something that is best left for the driver to decide, and not
      something that should be baked into the API, especially since there are
      no switchdev callers that depend on this.
      
      This patch removes the struct switchdev_trans member from switchdev port
      attribute notifier structures, and converts drivers to not look at this
      member.
      
      In part, this patch contains a revert of my previous commit 2e554a7a
      ("net: dsa: propagate switchdev vlan_filtering prepare phase to
      drivers").
      
      For the most part, the conversion was trivial except for:
      - Rocker's world implementation based on Broadcom OF-DPA had an odd
        implementation of ofdpa_port_attr_bridge_flags_set. The conversion was
        done mechanically, by pasting the implementation twice, then only
        keeping the code that would get executed during prepare phase on top,
        then only keeping the code that gets executed during the commit phase
        on bottom, then simplifying the resulting code until this was obtained.
      - DSA's offloading of STP state, bridge flags, VLAN filtering and
        multicast router could be converted right away. But the ageing time
        could not, so a shim was introduced and this was left for a further
        commit.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Acked-by: Linus Walleij <linus.walleij@linaro.org>
      Acked-by: Jiri Pirko <jiri@nvidia.com>
      Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> # hellcreek
      Reviewed-by: Linus Walleij <linus.walleij@linaro.org> # RTL8366RB
      Reviewed-by: Ido Schimmel <idosch@nvidia.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      bae33f2b
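The change in handler shape described above can be sketched as follows. This is a minimal illustration, not the kernel code: every `demo_` name, the ageing-time attribute and the range check are hypothetical stand-ins. The old-style handler is called once per phase and branches on a prepare flag; the new-style handler is called once and decides for itself how to order validation and application.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; not the kernel's definitions. */
struct demo_trans { bool ph_prepare; };
struct demo_attr { unsigned int ageing_time_ms; };
struct demo_port {
	unsigned int ageing_time_ms;	/* currently applied value */
	unsigned int max_ageing_ms;	/* assumed hardware limit */
};

/* Old shape: the core calls this twice, first with ph_prepare set
 * (validate only), then with ph_prepare clear (apply, must not fail).
 */
static int demo_attr_set_old(struct demo_port *port,
			     const struct demo_attr *attr,
			     const struct demo_trans *trans)
{
	if (trans->ph_prepare)
		return attr->ageing_time_ms > port->max_ageing_ms ? -ERANGE : 0;

	port->ageing_time_ms = attr->ageing_time_ms;
	return 0;
}

/* New shape: a single call. The driver validates and applies in one
 * place, and may report an error at any point.
 */
static int demo_attr_set_new(struct demo_port *port,
			     const struct demo_attr *attr)
{
	if (attr->ageing_time_ms > port->max_ageing_ms)
		return -ERANGE;

	port->ageing_time_ms = attr->ageing_time_ms;
	return 0;
}
```

With the single-call shape, a rejected value leaves the port state untouched without the caller having to run a separate prepare pass.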
    • V
      net: dsa: mv88e6xxx: deny vid 0 on the CPU port and DSA links too · 3e85f580
      Vladimir Oltean committed
      mv88e6xxx apparently has a problem offloading VID 0, which the 8021q
      module tries to install as part of commit ad1afb00 ("vlan_dev: VLAN
      0 should be treated as "no vlan tag" (802.1p packet)"). That mv88e6xxx
      restriction seems to have been introduced by the "VTU GetNext VID-1
      trick to retrieve a single entry" - see commit 2fb5ef09 ("net: dsa:
      mv88e6xxx: extract single VLAN retrieval").
      
      There is one more problem. The mv88e6xxx CPU port and DSA links do not
      properly report in the prepare phase which VLANs they can offload.
      They'll say they can offload everything:
      
      mv88e6xxx_port_vlan_prepare
      -> mv88e6xxx_port_check_hw_vlan:
      
      	/* DSA and CPU ports have to be members of multiple vlans */
      	if (dsa_is_dsa_port(ds, port) || dsa_is_cpu_port(ds, port))
      		return 0;
      
      Except that if you actually try to commit to it, they'll error out and
      print this message:
      
      [   32.802438] mv88e6085 d0032004.mdio-mii:12: p9: failed to add VLAN 0t
      
      which comes from:
      
      mv88e6xxx_port_vlan_add
      -> mv88e6xxx_port_vlan_join:
      
      	if (!vid)
      		return -EOPNOTSUPP;
      
      What prevents this condition from triggering in real life? The fact that
      when a DSA_NOTIFIER_VLAN_ADD is emitted, it never targets a DSA link
      directly. Instead, the notifier will always target either a user port or
      a CPU port. DSA links just happen to get dragged in by:
      
      static bool dsa_switch_vlan_match(struct dsa_switch *ds, int port,
      				  struct dsa_notifier_vlan_info *info)
      {
      	...
      	if (dsa_is_dsa_port(ds, port))
      		return true;
      	...
      }
      
      So for every DSA VLAN notifier, during the prepare phase, it will just
      so happen that there will be somebody to say "no, don't do that".
      
      This will become a problem when the switchdev prepare/commit transactional
      model goes away. Every port needs to think on its own. DSA links can no
      longer bluff and rely on the fact that the prepare phase will not go
      through to the end, because there will be no prepare phase any longer.
      
      Fix this issue before it becomes a problem, by having the "vid == 0"
      check earlier than the check whether we are a CPU port / DSA link or not.
      Also, the "vid == 0" check becomes unnecessary in the .port_vlan_add
      callback, so we can remove it.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      3e85f580
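The reordering of the two checks can be sketched in isolation. This is a simplified model under stated assumptions, not the driver code: the `demo_` names and the port-type enum are hypothetical, and the real check performs further validation that is omitted here.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#ifndef EOPNOTSUPP
#define EOPNOTSUPP 95	/* fallback for non-POSIX toolchains */
#endif

/* Hypothetical port classification, for illustration only. */
enum demo_port_type { DEMO_PORT_USER, DEMO_PORT_CPU, DEMO_PORT_DSA };

static bool demo_is_shared_port(enum demo_port_type type)
{
	return type == DEMO_PORT_CPU || type == DEMO_PORT_DSA;
}

/* Before: CPU ports and DSA links return early, so VID 0 is only
 * rejected for user ports.
 */
static int demo_check_vlan_old(enum demo_port_type type, uint16_t vid)
{
	if (demo_is_shared_port(type))
		return 0;
	if (!vid)
		return -EOPNOTSUPP;
	return 0;
}

/* After: the VID 0 check comes first, so every port type rejects it. */
static int demo_check_vlan_new(enum demo_port_type type, uint16_t vid)
{
	if (!vid)
		return -EOPNOTSUPP;
	if (demo_is_shared_port(type))
		return 0;
	return 0;
}
```

With the new ordering, a DSA link no longer depends on some other port vetoing VID 0 on its behalf.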
    • V
      net: switchdev: remove vid_begin -> vid_end range from VLAN objects · b7a9e0da
      Vladimir Oltean committed
      The call path of a switchdev VLAN addition to the bridge looks something
      like this today:
      
              nbp_vlan_init
              |  __br_vlan_set_default_pvid
              |  |                       |
              |  |    br_afspec          |
              |  |        |              |
              |  |        v              |
              |  | br_process_vlan_info  |
              |  |        |              |
              |  |        v              |
              |  |   br_vlan_info        |
              |  |       / \            /
              |  |      /   \          /
              |  |     /     \        /
              |  |    /       \      /
              v  v   v         v    v
            nbp_vlan_add   br_vlan_add ------+
             |              ^      ^ |       |
             |             /       | |       |
             |            /       /  /       |
             \ br_vlan_get_master/  /        v
              \        ^        /  /  br_vlan_add_existing
               \       |       /  /          |
                \      |      /  /          /
                 \     |     /  /          /
                  \    |    /  /          /
                   \   |   /  /          /
                    v  |   | v          /
                    __vlan_add         /
                       / |            /
                      /  |           /
                     v   |          /
         __vlan_vid_add  |         /
                     \   |        /
                      v  v        v
            br_switchdev_port_vlan_add
      
      The ranges UAPI was introduced to the bridge in commit bdced7ef
      ("bridge: support for multiple vlans and vlan ranges in setlink and
      dellink requests") (Jan 10 2015). But the VLAN ranges (parsed in br_afspec)
      have always been passed one by one, through struct bridge_vlan_info
      tmp_vinfo, to br_vlan_info. So the range never went too far in depth.
      
      Then Scott Feldman introduced the switchdev_port_bridge_setlink function
      in commit 47f8328b ("switchdev: add new switchdev bridge setlink").
      That marked the introduction of the SWITCHDEV_OBJ_PORT_VLAN, which made
      full use of the range. But switchdev_port_bridge_setlink was called like
      this:
      
      br_setlink
      -> br_afspec
      -> switchdev_port_bridge_setlink
      
      Basically, the switchdev and the bridge code were not tightly integrated.
      Then commit 41c498b9 ("bridge: restore br_setlink back to original")
      came, and switchdev drivers were required to implement
      .ndo_bridge_setlink = switchdev_port_bridge_setlink for a while.
      
      In the meantime, commits such as 0944d6b5 ("bridge: try switchdev op
      first in __vlan_vid_add/del") finally made switchdev penetrate the
      br_vlan_info() barrier and start to develop the call path we have today.
      But remember, br_vlan_info() still receives VLANs one by one.
      
      Then Arkadi Sharshevsky refactored the switchdev API in 2017 in commit
      29ab586c ("net: switchdev: Remove bridge bypass support from
      switchdev") so that drivers would not implement .ndo_bridge_setlink any
      longer. The switchdev_port_bridge_setlink also got deleted.
      This refactoring removed the parallel bridge_setlink implementation from
      switchdev, and left the only switchdev VLAN objects to be the ones
      offloaded from __vlan_vid_add (basically RX filtering) and __vlan_add
      (the latter coming from commit 9c86ce2c ("net: bridge: Notify about
      bridge VLANs")).
      
      That is to say, today the switchdev VLAN object ranges are not used in
      the kernel. Refactoring the above call path is a bit complicated, when
      the bridge VLAN call path is already a bit complicated.
      
      Let's go off and finish the job of commit 29ab586c by deleting the
      bogus iteration through the VLAN ranges from the drivers. Some aspects
      of this feature never made too much sense in the first place. For
      example, what is a range of VLANs all having the BRIDGE_VLAN_INFO_PVID
      flag supposed to mean, when a port can obviously have a single pvid?
      This particular configuration _is_ denied as of commit 6623c60d
      ("bridge: vlan: enforce no pvid flag in vlan ranges"), but from an API
      perspective, the driver still has to play pretend, and only offload the
      vlan->vid_end as pvid. And the addition of a switchdev VLAN object can
      modify the flags of another, completely unrelated, switchdev VLAN
      object! (a VLAN that is PVID will invalidate the PVID flag from whatever
      other VLAN had previously been offloaded with switchdev and had that
      flag. Yet switchdev never notifies about that change, drivers are
      supposed to guess).
      
      Nonetheless, having a VLAN range in the API makes error handling look
      scarier than it really is - unwinding on errors and all of that.
      When in reality, no one really calls this API with more than one VLAN.
      It is all unnecessary complexity.
      
      And despite appearing pretentious (two-phase transactional model and
      all), the switchdev API is really sloppy because the VLAN addition and
      removal operations are not paired with one another (you can add a VLAN
      100 times and delete it just once). The bridge notifies through
      switchdev of a VLAN addition not only when the flags of an existing VLAN
      change, but also when nothing changes. There are switchdev drivers out
      there who don't like adding a VLAN that has already been added, and
      those checks don't really belong at driver level. But the fact that the
      API contains ranges is yet another factor that prevents this from being
      addressed in the future.
      
      Of the existing switchdev pieces of hardware, it appears that only
      Mellanox Spectrum supports offloading more than one VLAN at a time,
      through mlxsw_sp_port_vlan_set. I have kept that code internal to the
      driver, because there is some more bookkeeping that makes use of it, but
      I deleted it from the switchdev API. But since the switchdev support for
      ranges has already been de facto deleted by a Mellanox employee and
      nobody noticed for 4 years, I'm going to assume it's not a biggie.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Ido Schimmel <idosch@nvidia.com> # switchdev and mlxsw
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> # hellcreek
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      b7a9e0da
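To make the complexity argument concrete, here is a rough sketch of what a driver-side VLAN add looks like with and without a range in the object. All `demo_` names, the in-memory "hardware" table and the validity check are assumptions for illustration; no real driver is reproduced here.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_VID_MAX 4094u	/* 4095 is reserved by 802.1Q */

/* Hypothetical objects: a range-based one and a single-VID one. */
struct demo_vlan_range { uint16_t vid_begin; uint16_t vid_end; };
struct demo_vlan { uint16_t vid; };

static bool demo_hw_vlans[4096];	/* stands in for the hardware table */

static int demo_hw_add(uint16_t vid)
{
	if (vid == 0 || vid > DEMO_VID_MAX)
		return -EINVAL;
	demo_hw_vlans[vid] = true;
	return 0;
}

static void demo_hw_del(uint16_t vid)
{
	demo_hw_vlans[vid] = false;
}

/* Range-based shape: iterate, and unwind what was added on failure. */
static int demo_vlan_add_old(const struct demo_vlan_range *range)
{
	uint16_t vid;
	int err;

	for (vid = range->vid_begin; vid <= range->vid_end; vid++) {
		err = demo_hw_add(vid);
		if (err) {
			/* Remove the VIDs added before the failing one. */
			while (vid-- > range->vid_begin)
				demo_hw_del(vid);
			return err;
		}
	}
	return 0;
}

/* Single-VID shape: no loop, nothing to unwind. */
static int demo_vlan_add_new(const struct demo_vlan *vlan)
{
	return demo_hw_add(vlan->vid);
}
```

Since callers only ever pass one VLAN at a time, the loop and its unwind path in the first function guard against a case that does not occur.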
  2. 10 Jan, 2021 1 commit
  3. 08 Jan, 2021 2 commits
  4. 07 Jan, 2021 1 commit
  5. 06 Jan, 2021 1 commit
  6. 05 Jan, 2021 2 commits
  7. 17 Dec, 2020 1 commit
    • O
      net: dsa: qca: ar9331: fix sleeping function called from invalid context bug · 3e47495f
      Oleksij Rempel committed
      With lockdep enabled, we get the following warning:
      
       ar9331_switch ethernet.1:10 lan0 (uninitialized): PHY [!ahb!ethernet@1a000000!mdio!switch@10:00] driver [Qualcomm Atheros AR9331 built-in PHY] (irq=13)
       BUG: sleeping function called from invalid context at kernel/locking/mutex.c:935
       in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 18, name: kworker/0:1
       INFO: lockdep is turned off.
       irq event stamp: 602
       hardirqs last  enabled at (601): [<8073fde0>] _raw_spin_unlock_irq+0x3c/0x80
       hardirqs last disabled at (602): [<8073a4f4>] __schedule+0x184/0x800
       softirqs last  enabled at (0): [<80080f60>] copy_process+0x578/0x14c8
       softirqs last disabled at (0): [<00000000>] 0x0
       CPU: 0 PID: 18 Comm: kworker/0:1 Not tainted 5.10.0-rc3-ar9331-00734-g7d644991df0c #31
       Workqueue: events deferred_probe_work_func
       Stack : 80980000 80980000 8089ef70 80890000 804b5414 80980000 00000002 80b53728
               00000000 800d1268 804b5414 ffffffde 00000017 800afe08 81943860 0f5bfc32
               00000000 00000000 8089ef70 819436c0 ffffffea 00000000 00000000 00000000
               8194390c 808e353c 0000000f 66657272 80980000 00000000 00000000 80890000
               804b5414 80980000 00000002 80b53728 00000000 00000000 00000000 80d40000
               ...
       Call Trace:
       [<80069ce0>] show_stack+0x9c/0x140
       [<800afe08>] ___might_sleep+0x220/0x244
       [<8073bfb0>] __mutex_lock+0x70/0x374
       [<8073c2e0>] mutex_lock_nested+0x2c/0x38
       [<804b5414>] regmap_update_bits_base+0x38/0x8c
       [<804ee584>] regmap_update_bits+0x1c/0x28
       [<804ee714>] ar9331_sw_unmask_irq+0x34/0x60
       [<800d91f0>] unmask_irq+0x48/0x70
       [<800d93d4>] irq_startup+0x114/0x11c
       [<800d65b4>] __setup_irq+0x4f4/0x6d0
       [<800d68a0>] request_threaded_irq+0x110/0x190
       [<804e3ef0>] phy_request_interrupt+0x4c/0xe4
       [<804df508>] phylink_bringup_phy+0x2c0/0x37c
       [<804df7bc>] phylink_of_phy_connect+0x118/0x130
       [<806c1a64>] dsa_slave_create+0x3d0/0x578
       [<806bc4ec>] dsa_register_switch+0x934/0xa20
       [<804eef98>] ar9331_sw_probe+0x34c/0x364
       [<804eb48c>] mdio_probe+0x44/0x70
       [<8049e3b4>] really_probe+0x30c/0x4f4
       [<8049ea10>] driver_probe_device+0x264/0x26c
       [<8049bc10>] bus_for_each_drv+0xb4/0xd8
       [<8049e684>] __device_attach+0xe8/0x18c
       [<8049ce58>] bus_probe_device+0x48/0xc4
       [<8049db70>] deferred_probe_work_func+0xdc/0xf8
       [<8009ff64>] process_one_work+0x2e4/0x4a0
       [<800a0770>] worker_thread+0x2a8/0x354
       [<800a774c>] kthread+0x16c/0x174
       [<8006306c>] ret_from_kernel_thread+0x14/0x1c
      
       ar9331_switch ethernet.1:10 lan1 (uninitialized): PHY [!ahb!ethernet@1a000000!mdio!switch@10:02] driver [Qualcomm Atheros AR9331 built-in PHY] (irq=13)
       DSA: tree 0 setup
      
      To fix it, move the MDIO register access to the .irq_bus_sync_unlock
      callback.
      
      Fixes: ec6698c2 ("net: dsa: add support for Atheros AR9331 built-in switch")
      Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
      Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
      Link: https://lore.kernel.org/r/20201211110317.17061-1-o.rempel@pengutronix.de
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      3e47495f
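The general pattern behind this kind of fix can be sketched as follows: the mask and unmask callbacks, which may run in atomic context, only update a cached value, and the slow bus write is deferred to the bus-sync-unlock step, which runs where sleeping is allowed. This is a toy model, not the ar9331 driver: the `demo_` names, the interrupt bit and the `demo_in_atomic` flag are hypothetical, and the sync function returns a status here only so the sketch can be checked (the real callback returns nothing).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_INT_LINK 0x1u	/* hypothetical interrupt-enable bit */

struct demo_irq_data {
	bool irq_enabled;	/* cached desired state */
	bool bus_locked;	/* stands in for a held mutex */
	uint32_t hw_mask_reg;	/* stands in for a register behind a slow bus */
	unsigned int hw_writes;
};

static bool demo_in_atomic;	/* toy model of "may not sleep here" */

/* A bus access that may sleep; refuse it in atomic context. */
static int demo_slow_bus_write(struct demo_irq_data *d, uint32_t val)
{
	if (demo_in_atomic)
		return -1;
	d->hw_mask_reg = val;
	d->hw_writes++;
	return 0;
}

/* May run in atomic context: only touch the cached state. */
static void demo_irq_mask(struct demo_irq_data *d)
{
	d->irq_enabled = false;
}

static void demo_irq_unmask(struct demo_irq_data *d)
{
	d->irq_enabled = true;
}

static void demo_irq_bus_lock(struct demo_irq_data *d)
{
	d->bus_locked = true;
}

/* Runs in sleepable context: flush the cached state to the hardware. */
static int demo_irq_bus_sync_unlock(struct demo_irq_data *d)
{
	int ret = demo_slow_bus_write(d, d->irq_enabled ? DEMO_INT_LINK : 0);

	d->bus_locked = false;
	return ret;
}
```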
  8. 15 Dec, 2020 1 commit
  9. 13 Dec, 2020 1 commit
  10. 10 Dec, 2020 2 commits
  11. 09 Dec, 2020 1 commit
  12. 06 Dec, 2020 1 commit
    • V
      net: mscc: ocelot: fix dropping of unknown IPv4 multicast on Seville · edd2410b
      Vladimir Oltean committed
      The current assumption is that the felix DSA driver has flooding knobs
      per traffic class, while ocelot switchdev has a single flooding knob.
      This was correct for felix VSC9959 and ocelot VSC7514, but with the
      introduction of seville VSC9953, we see a switch driven by felix.c which
      has a single flooding knob.
      
      So it is clear that we must do what should have been done from the
      beginning, which is not to overwrite the configuration done by ocelot.c
      in felix, but instead to teach the common ocelot library about the
      differences in our switches, and set up the flooding PGIDs centrally.
      
      The effect that the bogus iteration through FELIX_NUM_TC has upon
      seville is quite dramatic. ANA_FLOODING is located at 0x00b548, and
      ANA_FLOODING_IPMC is located at 0x00b54c. So the bogus iteration will
      actually overwrite ANA_FLOODING_IPMC when attempting to write
      ANA_FLOODING[1]. There is no ANA_FLOODING[1] in seville, just ANA_FLOODING.
      
      And when ANA_FLOODING_IPMC is overwritten with a bogus value, the effect
      is that ANA_FLOODING_IPMC gets the value of 0x0003CF7D:
      	MC6_DATA = 61,
      	MC6_CTRL = 61,
      	MC4_DATA = 60,
      	MC4_CTRL = 0.
      Because MC4_CTRL is zero, this means that IPv4 multicast control packets
      are not flooded, but dropped. An invalid configuration, and this is how
      the issue was actually spotted.
      Reported-by: Eldar Gasanov <eldargasanov2@gmail.com>
      Reported-by: Maxim Kochetkov <fido_max@inbox.ru>
      Tested-by: Eldar Gasanov <eldargasanov2@gmail.com>
      Fixes: 84705fc1 ("net: dsa: felix: introduce support for Seville VSC9953 switch")
      Fixes: 3c7b51bd ("net: dsa: felix: allow flooding for all traffic classes")
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
      Link: https://lore.kernel.org/r/20201204175416.1445937-1-vladimir.oltean@nxp.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      edd2410b
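The register arithmetic in this message can be checked with a few lines of C. The two addresses and the value 0x0003CF7D come from the text above; the 4-byte register stride and the layout of four consecutive 6-bit fields (MC6_DATA lowest, MC4_CTRL highest) are assumptions inferred from those numbers, and the `demo_` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_ANA_FLOODING	0x00b548u	/* from the commit message */
#define DEMO_ANA_FLOODING_IPMC	0x00b54cu	/* from the commit message */
#define DEMO_REG_STRIDE		4u		/* assumed: 32-bit registers */
#define DEMO_FIELD_MASK		0x3fu		/* assumed: 6-bit fields */

struct demo_ipmc {
	unsigned int mc6_data;
	unsigned int mc6_ctrl;
	unsigned int mc4_data;
	unsigned int mc4_ctrl;
};

/* Address that a per-traffic-class write to ANA_FLOODING[tc] would hit. */
static uint32_t demo_flooding_addr(unsigned int tc)
{
	return DEMO_ANA_FLOODING + tc * DEMO_REG_STRIDE;
}

/* Split a raw register value into the four assumed 6-bit fields. */
static struct demo_ipmc demo_decode_ipmc(uint32_t val)
{
	struct demo_ipmc f;

	f.mc6_data = val & DEMO_FIELD_MASK;
	f.mc6_ctrl = (val >> 6) & DEMO_FIELD_MASK;
	f.mc4_data = (val >> 12) & DEMO_FIELD_MASK;
	f.mc4_ctrl = (val >> 18) & DEMO_FIELD_MASK;
	return f;
}
```

Under these assumptions, `demo_flooding_addr(1)` lands exactly on ANA_FLOODING_IPMC, and decoding 0x0003CF7D gives the field values listed in the message, including the zero in MC4_CTRL.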
  13. 03 Dec, 2020 11 commits
  14. 26 Nov, 2020 7 commits
  15. 24 Nov, 2020 1 commit
  16. 19 Nov, 2020 1 commit
  17. 17 Nov, 2020 1 commit
    • M
      net: lantiq: Wait for the GPHY firmware to be ready · 2a1828e3
      Martin Blumenstingl committed
      A user reports (slightly shortened from the original message):
        libphy: lantiq,xrx200-mdio: probed
        mdio_bus 1e108000.switch-mii: MDIO device at address 17 is missing.
        gswip 1e108000.switch lan: no phy at 2
        gswip 1e108000.switch lan: failed to connect to port 2: -19
        lantiq,xrx200-net 1e10b308.eth eth0: error -19 setting up slave phy
      
      This is a single-port board using the internal Fast Ethernet PHY. The
      user reports that switching to PHY scanning instead of configuring the
      PHY within device-tree works around this issue.
      
      The documentation for the standalone variant of the PHY11G (which is
      probably very similar to what is used inside the xRX200 SoCs but having
      the firmware burnt onto that standalone chip in the factory) states that
      the PHY needs 300ms to be ready for MDIO communication after releasing
      the reset.
      
      Add a 300ms delay after initializing all GPHYs to ensure that the GPHY
      firmware has enough time to initialize and to appear on the MDIO bus.
      Unfortunately there is no (known) documentation on the minimum time
      to wait after releasing the reset on an internal PHY, so play safe and
      take the one for the external variant. Only wait after the last GPHY
      firmware is loaded to not slow down the initialization too much
      (xRX200 has two GPHYs but newer SoCs have at least three GPHYs).
      
      Fixes: 14fceff4 ("net: dsa: Add Lantiq / Intel DSA driver for vrx200")
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
      Acked-by: Hauke Mehrtens <hauke@hauke-m.de>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Link: https://lore.kernel.org/r/20201115165757.552641-1-martin.blumenstingl@googlemail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      2a1828e3
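The "wait once, after the last GPHY" choice can be sketched as below. This is only an illustration of the ordering: the `demo_` names are hypothetical, the firmware load is a stub, and the sleep is replaced by a counter so the total wait can be inspected.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_GPHY_READY_DELAY_MS 300u	/* value taken from the commit message */

static unsigned int demo_slept_ms;	/* accumulates simulated sleep time */

/* Stand-in for a real sleep; records the requested delay instead. */
static void demo_msleep(unsigned int ms)
{
	demo_slept_ms += ms;
}

/* Stub firmware load: marks the GPHY as loaded and always succeeds. */
static int demo_load_gphy_fw(bool *loaded, unsigned int idx)
{
	loaded[idx] = true;
	return 0;
}

static int demo_gphy_fw_probe(bool *loaded, unsigned int num_gphy)
{
	unsigned int i;
	int err;

	for (i = 0; i < num_gphy; i++) {
		err = demo_load_gphy_fw(loaded, i);
		if (err)
			return err;
	}

	/* One delay after the last load, rather than one per GPHY. */
	demo_msleep(DEMO_GPHY_READY_DELAY_MS);
	return 0;
}
```

With three GPHYs this waits 300 ms in total instead of 900 ms, at the cost of the earlier GPHYs waiting slightly longer than strictly needed.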
  18. 15 Nov, 2020 1 commit
    • T
      net: dsa: mv88e6xxx: Avoid VTU corruption on 6097 · 92307069
      Tobias Waldekranz committed
      As soon as you add the second port to a VLAN, all other port
      membership configuration is overwritten with zeroes. The HW interprets
      this as all ports being "unmodified members" of the VLAN.
      
      In the simple case when all ports belong to the same VLAN, switching
      will still work. But using multiple VLANs or trying to set multiple
      ports as tagged members will not work.
      
      On the 6352, doing a VTU GetNext op, followed by an STU GetNext op
      will leave you with both the member- and state- data in the VTU/STU
      data registers. But on the 6097 (which uses the same implementation),
      the STU GetNext will override the information gathered from the VTU
      GetNext.
      
      Separate the two stages, parsing the result of the VTU GetNext before
      doing the STU GetNext.
      
      We opt to update the existing implementation for all applicable chips,
      as opposed to creating a separate callback for 6097, because although
      the previous implementation did work for (at least) 6352, the
      datasheet does not mention the masking behavior.
      
      Fixes: ef6fcea3 ("net: dsa: mv88e6xxx: get STU entry on VTU GetNext")
      Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
      Link: https://lore.kernel.org/r/20201112114335.27371-1-tobias@waldekranz.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      92307069
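The ordering problem can be modelled with a toy data register shared by the two operations. This is a deliberately simplified sketch, not the chip's real register layout: the `demo_` names are hypothetical, the member data is placed in the low byte and the state data in the high byte purely for illustration, and `stu_clobbers` models the 6097-like behaviour described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy chip: one data register shared by the VTU and STU operations. */
struct demo_chip {
	uint16_t data_reg;
	bool stu_clobbers;	/* true: STU GetNext wipes the member bits */
	uint8_t vtu_member;	/* what a VTU GetNext would return */
	uint8_t stu_state;	/* what an STU GetNext would return */
};

struct demo_entry { uint8_t member; uint8_t state; };

static void demo_vtu_getnext_op(struct demo_chip *c)
{
	c->data_reg = c->vtu_member;
}

static void demo_stu_getnext_op(struct demo_chip *c)
{
	uint16_t state = (uint16_t)(c->stu_state << 8);

	if (c->stu_clobbers)
		c->data_reg = state;	/* member bits lost */
	else
		c->data_reg = (uint16_t)((c->data_reg & 0x00ff) | state);
}

/* Old order: run both operations, then parse everything at the end. */
static struct demo_entry demo_getnext_old(struct demo_chip *c)
{
	struct demo_entry e;

	demo_vtu_getnext_op(c);
	demo_stu_getnext_op(c);
	e.member = (uint8_t)(c->data_reg & 0xff);
	e.state = (uint8_t)(c->data_reg >> 8);
	return e;
}

/* New order: parse the VTU result before issuing the STU operation. */
static struct demo_entry demo_getnext_new(struct demo_chip *c)
{
	struct demo_entry e;

	demo_vtu_getnext_op(c);
	e.member = (uint8_t)(c->data_reg & 0xff);
	demo_stu_getnext_op(c);
	e.state = (uint8_t)(c->data_reg >> 8);
	return e;
}
```

In this model the new order returns the same entry on both chip behaviours, while the old order returns zeroed member data on the clobbering variant.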
  19. 12 Nov, 2020 1 commit