1. 05 2月, 2021 1 次提交
  2. 13 1月, 2021 1 次提交
    • V
      net: dsa: clear devlink port type before unregistering slave netdevs · 91158e16
      Vladimir Oltean 提交于
      Florian reported a use-after-free bug in devlink_nl_port_fill found with
      KASAN:
      
      (devlink_nl_port_fill)
      (devlink_port_notify)
      (devlink_port_unregister)
      (dsa_switch_teardown.part.3)
      (dsa_tree_teardown_switches)
      (dsa_unregister_switch)
      (bcm_sf2_sw_remove)
      (platform_remove)
      (device_release_driver_internal)
      (device_links_unbind_consumers)
      (device_release_driver_internal)
      (device_driver_detach)
      (unbind_store)
      
      Allocated by task 31:
       alloc_netdev_mqs+0x5c/0x50c
       dsa_slave_create+0x110/0x9c8
       dsa_register_switch+0xdb0/0x13a4
       b53_switch_register+0x47c/0x6dc
       bcm_sf2_sw_probe+0xaa4/0xc98
       platform_probe+0x90/0xf4
       really_probe+0x184/0x728
       driver_probe_device+0xa4/0x278
       __device_attach_driver+0xe8/0x148
       bus_for_each_drv+0x108/0x158
      
      Freed by task 249:
       free_netdev+0x170/0x194
       dsa_slave_destroy+0xac/0xb0
       dsa_port_teardown.part.2+0xa0/0xb4
       dsa_tree_teardown_switches+0x50/0xc4
       dsa_unregister_switch+0x124/0x250
       bcm_sf2_sw_remove+0x98/0x13c
       platform_remove+0x44/0x5c
       device_release_driver_internal+0x150/0x254
       device_links_unbind_consumers+0xf8/0x12c
       device_release_driver_internal+0x84/0x254
       device_driver_detach+0x30/0x34
       unbind_store+0x90/0x134
      
      What happens is that devlink_port_unregister emits a netlink
      DEVLINK_CMD_PORT_DEL message which associates the devlink port that is
      getting unregistered with the ifindex of its corresponding net_device.
      Only trouble is, the net_device has already been unregistered.
      
      It looks like we can stub out the search for a corresponding net_device
      if we clear the devlink_port's type. This looks like a bit of a hack,
      but also seems to be the reason why the devlink_port_type_clear function
      exists in the first place.
      
      Fixes: 3122433e ("net: dsa: Register devlink ports before calling DSA driver setup()")
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Tested-by: NFlorian fainelli <f.fainelli@gmail.com>
      Reported-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Link: https://lore.kernel.org/r/20210112004831.3778323-1-olteanv@gmail.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
      91158e16
  3. 05 10月, 2020 2 次提交
  4. 19 9月, 2020 1 次提交
  5. 23 7月, 2020 1 次提交
  6. 10 7月, 2020 1 次提交
  7. 11 5月, 2020 1 次提交
  8. 05 5月, 2020 1 次提交
  9. 28 3月, 2020 1 次提交
    • V
      net: dsa: implement auto-normalization of MTU for bridge hardware datapath · bff33f7e
      Vladimir Oltean 提交于
      Many switches don't have an explicit knob for configuring the MTU
      (maximum transmission unit per interface).  Instead, they do the
      length-based packet admission checks on the ingress interface, for
      reasons that are easy to understand (why would you accept a packet in
      the queuing subsystem if you know you're going to drop it anyway).
      
      So it is actually the MRU that these switches permit configuring.
      
      In Linux there only exists the IFLA_MTU netlink attribute and the
      associated dev_set_mtu function. The comments like to play blind and say
      that it's changing the "maximum transfer unit", which is to say that
      there isn't any directionality in the meaning of the MTU word. So that
      is the interpretation that this patch is giving to things: MTU == MRU.
      
      When 2 interfaces having different MTUs are bridged, the bridge driver
      MTU auto-adjustment logic kicks in: what br_mtu_auto_adjust() does is it
      adjusts the MTU of the bridge net device itself (and not that of the
      slave net devices) to the minimum value of all slave interfaces, in
      order for forwarded packets to not exceed the MTU regardless of the
      interface they are received and send on.
      
      The idea behind this behavior, and why the slave MTUs are not adjusted,
      is that normal termination from Linux over the L2 forwarding domain
      should happen over the bridge net device, which _is_ properly limited by
      the minimum MTU. And termination over individual slave devices is
      possible even if those are bridged. But that is not "forwarding", so
      there's no reason to do normalization there, since only a single
      interface sees that packet.
      
      The problem with those switches that can only control the MRU is with
      the offloaded data path, where a packet received on an interface with
      MRU 9000 would still be forwarded to an interface with MRU 1500. And the
      br_mtu_auto_adjust() function does not really help, since the MTU
      configured on the bridge net device is ignored.
      
      In order to enforce the de-facto MTU == MRU rule for these switches, we
      need to do MTU normalization, which means: in order for no packet larger
      than the MTU configured on this port to be sent, then we need to limit
      the MRU on all ports that this packet could possibly come from. AKA
      since we are configuring the MRU via MTU, it means that all ports within
      a bridge forwarding domain should have the same MTU.
      
      And that is exactly what this patch is trying to do.
      
      >From an implementation perspective, we try to follow the intent of the
      user, otherwise there is a risk that we might livelock them (they try to
      change the MTU on an already-bridged interface, but we just keep
      changing it back in an attempt to keep the MTU normalized). So the MTU
      that the bridge is normalized to is either:
      
       - The most recently changed one:
      
         ip link set dev swp0 master br0
         ip link set dev swp1 master br0
         ip link set dev swp0 mtu 1400
      
         This sequence will make swp1 inherit MTU 1400 from swp0.
      
       - The one of the most recently added interface to the bridge:
      
         ip link set dev swp0 master br0
         ip link set dev swp1 mtu 1400
         ip link set dev swp1 master br0
      
         The above sequence will make swp0 inherit MTU 1400 as well.
      Suggested-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      bff33f7e
  10. 27 1月, 2020 1 次提交
    • V
      net: dsa: Fix use-after-free in probing of DSA switch tree · 6dc43cd3
      Vladimir Oltean 提交于
      DSA sets up a switch tree little by little. Every switch of the N
      members of the tree calls dsa_register_switch, and (N - 1) will just
      touch the dst->ports list with their ports and quickly exit. Only the
      last switch that calls dsa_register_switch will find all DSA links
      complete in dsa_tree_setup_routing_table, and not return zero as a
      result but instead go ahead and set up the entire DSA switch tree
      (practically on behalf of the other switches too).
      
      The trouble is that the (N - 1) switches don't clean up after themselves
      after they get an error such as EPROBE_DEFER. Their footprint left in
      dst->ports by dsa_switch_touch_ports is still there. And switch N, the
      one responsible with actually setting up the tree, is going to work with
      those stale dp, dp->ds and dp->ds->dev pointers. In particular ds and
      ds->dev might get freed by the device driver.
      
      Be there a 2-switch tree and the following calling order:
      - Switch 1 calls dsa_register_switch
        - Calls dsa_switch_touch_ports, populates dst->ports
        - Calls dsa_port_parse_cpu, gets -EPROBE_DEFER, exits.
      - Switch 2 calls dsa_register_switch
        - Calls dsa_switch_touch_ports, populates dst->ports
        - Probe doesn't get deferred, so it goes ahead.
        - Calls dsa_tree_setup_routing_table, which returns "complete == true"
          due to Switch 1 having called dsa_switch_touch_ports before.
        - Because the DSA links are complete, it calls dsa_tree_setup_switches
          now.
        - dsa_tree_setup_switches iterates through dst->ports, initializing
          the Switch 1 ds structure (invalid) and the Switch 2 ds structure
          (valid).
        - Undefined behavior (use after free, sometimes NULL pointers, etc).
      
      Real example below (debugging prints added by me, as well as guards
      against NULL pointers):
      
      [    5.477947] dsa_tree_setup_switches: Setting up port 0 of switch ffffff803df0b980 (dev ffffff803f775c00)
      [    6.313002] dsa_tree_setup_switches: Setting up port 1 of switch ffffff803df0b980 (dev ffffff803f775c00)
      [    6.319932] dsa_tree_setup_switches: Setting up port 2 of switch ffffff803df0b980 (dev ffffff803f775c00)
      [    6.329693] dsa_tree_setup_switches: Setting up port 3 of switch ffffff803df0b980 (dev ffffff803f775c00)
      [    6.339458] dsa_tree_setup_switches: Setting up port 4 of switch ffffff803df0b980 (dev ffffff803f775c00)
      [    6.349226] dsa_tree_setup_switches: Setting up port 5 of switch ffffff803df0b980 (dev ffffff803f775c00)
      [    6.358991] dsa_tree_setup_switches: Setting up port 6 of switch ffffff803df0b980 (dev ffffff803f775c00)
      [    6.368758] dsa_tree_setup_switches: Setting up port 7 of switch ffffff803df0b980 (dev ffffff803f775c00)
      [    6.378524] dsa_tree_setup_switches: Setting up port 8 of switch ffffff803df0b980 (dev ffffff803f775c00)
      [    6.388291] dsa_tree_setup_switches: Setting up port 9 of switch ffffff803df0b980 (dev ffffff803f775c00)
      [    6.398057] dsa_tree_setup_switches: Setting up port 10 of switch ffffff803df0b980 (dev ffffff803f775c00)
      [    6.407912] dsa_tree_setup_switches: Setting up port 0 of switch ffffff803da02f80 (dev 0000000000000000)
      [    6.417682] dsa_tree_setup_switches: Setting up port 1 of switch ffffff803da02f80 (dev 0000000000000000)
      [    6.427446] dsa_tree_setup_switches: Setting up port 2 of switch ffffff803da02f80 (dev 0000000000000000)
      [    6.437212] dsa_tree_setup_switches: Setting up port 3 of switch ffffff803da02f80 (dev 0000000000000000)
      [    6.446979] dsa_tree_setup_switches: Setting up port 4 of switch ffffff803da02f80 (dev 0000000000000000)
      [    6.456744] dsa_tree_setup_switches: Setting up port 5 of switch ffffff803da02f80 (dev 0000000000000000)
      [    6.466512] dsa_tree_setup_switches: Setting up port 6 of switch ffffff803da02f80 (dev 0000000000000000)
      [    6.476277] dsa_tree_setup_switches: Setting up port 7 of switch ffffff803da02f80 (dev 0000000000000000)
      [    6.486043] dsa_tree_setup_switches: Setting up port 8 of switch ffffff803da02f80 (dev 0000000000000000)
      [    6.495810] dsa_tree_setup_switches: Setting up port 9 of switch ffffff803da02f80 (dev 0000000000000000)
      [    6.505577] dsa_tree_setup_switches: Setting up port 10 of switch ffffff803da02f80 (dev 0000000000000000)
      [    6.515433] dsa_tree_setup_switches: Setting up port 0 of switch ffffff803db15b80 (dev ffffff803d8e4800)
      [    7.354120] dsa_tree_setup_switches: Setting up port 1 of switch ffffff803db15b80 (dev ffffff803d8e4800)
      [    7.361045] dsa_tree_setup_switches: Setting up port 2 of switch ffffff803db15b80 (dev ffffff803d8e4800)
      [    7.370805] dsa_tree_setup_switches: Setting up port 3 of switch ffffff803db15b80 (dev ffffff803d8e4800)
      [    7.380571] dsa_tree_setup_switches: Setting up port 4 of switch ffffff803db15b80 (dev ffffff803d8e4800)
      [    7.390337] dsa_tree_setup_switches: Setting up port 5 of switch ffffff803db15b80 (dev ffffff803d8e4800)
      [    7.400104] dsa_tree_setup_switches: Setting up port 6 of switch ffffff803db15b80 (dev ffffff803d8e4800)
      [    7.409872] dsa_tree_setup_switches: Setting up port 7 of switch ffffff803db15b80 (dev ffffff803d8e4800)
      [    7.419637] dsa_tree_setup_switches: Setting up port 8 of switch ffffff803db15b80 (dev ffffff803d8e4800)
      [    7.429403] dsa_tree_setup_switches: Setting up port 9 of switch ffffff803db15b80 (dev ffffff803d8e4800)
      [    7.439169] dsa_tree_setup_switches: Setting up port 10 of switch ffffff803db15b80 (dev ffffff803d8e4800)
      
      The solution is to recognize that the functions that call
      dsa_switch_touch_ports (dsa_switch_parse_of, dsa_switch_parse) have side
      effects, and therefore one should clean up their side effects on error
      path. The cleanup of dst->ports was taken from dsa_switch_remove and
      moved into a dedicated dsa_switch_release_ports function, which should
      really be per-switch (free only the members of dst->ports that are also
      members of ds, instead of all switch ports).
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6dc43cd3
  11. 09 1月, 2020 1 次提交
    • F
      net: dsa: Get information about stacked DSA protocol · 4d776482
      Florian Fainelli 提交于
      It is possible to stack multiple DSA switches in a way that they are not
      part of the tree (disjoint) but the DSA master of a switch is a DSA
      slave of another. When that happens switch drivers may have to know this
      is the case so as to determine whether their tagging protocol has a
      remove chance of working.
      
      This is useful for specific switch drivers such as b53 where devices
      have been known to be stacked in the wild without the Broadcom tag
      protocol supporting that feature. This allows b53 to continue supporting
      those devices by forcing the disabling of Broadcom tags on the outermost
      switches if necessary.
      
      The get_tag_protocol() function is therefore updated to gain an
      additional enum dsa_tag_protocol argument which denotes the current
      tagging protocol used by the DSA master we are attached to, else
      DSA_TAG_PROTO_NONE for the top of the dsa_switch_tree.
      Signed-off-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4d776482
  12. 18 12月, 2019 1 次提交
  13. 06 11月, 2019 1 次提交
  14. 01 11月, 2019 6 次提交
  15. 29 10月, 2019 2 次提交
  16. 23 10月, 2019 10 次提交
  17. 20 10月, 2019 1 次提交
  18. 16 9月, 2019 1 次提交
    • A
      net: dsa: Fix load order between DSA drivers and taggers · 23426a25
      Andrew Lunn 提交于
      The DSA core, DSA taggers and DSA drivers all make use of
      module_init(). Hence they get initialised at device_initcall() time.
      The ordering is non-deterministic. It can be a DSA driver is bound to
      a device before the needed tag driver has been initialised, resulting
      in the message:
      
      No tagger for this switch
      
      Rather than have this be fatal, return -EPROBE_DEFER so that it is
      tried again later once all the needed drivers have been loaded.
      
      Fixes: d3b8c049 ("dsa: Add boilerplate helper to register DSA tag driver modules")
      Signed-off-by: NAndrew Lunn <andrew@lunn.ch>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      23426a25
  19. 03 9月, 2019 1 次提交
    • V
      net: dsa: Fix off-by-one number of calls to devlink_port_unregister · 4ba0ebbc
      Vladimir Oltean 提交于
      When a function such as dsa_slave_create fails, currently the following
      stack trace can be seen:
      
      [    2.038342] sja1105 spi0.1: Probed switch chip: SJA1105T
      [    2.054556] sja1105 spi0.1: Reset switch and programmed static config
      [    2.063837] sja1105 spi0.1: Enabled switch tagging
      [    2.068706] fsl-gianfar soc:ethernet@2d90000 eth2: error -19 setting up slave phy
      [    2.076371] ------------[ cut here ]------------
      [    2.080973] WARNING: CPU: 1 PID: 21 at net/core/devlink.c:6184 devlink_free+0x1b4/0x1c0
      [    2.088954] Modules linked in:
      [    2.092005] CPU: 1 PID: 21 Comm: kworker/1:1 Not tainted 5.3.0-rc6-01360-g41b52e38d2b6-dirty #1746
      [    2.100912] Hardware name: Freescale LS1021A
      [    2.105162] Workqueue: events deferred_probe_work_func
      [    2.110287] [<c03133a4>] (unwind_backtrace) from [<c030d8cc>] (show_stack+0x10/0x14)
      [    2.117992] [<c030d8cc>] (show_stack) from [<c10b08d8>] (dump_stack+0xb4/0xc8)
      [    2.125180] [<c10b08d8>] (dump_stack) from [<c0349d04>] (__warn+0xe0/0xf8)
      [    2.132018] [<c0349d04>] (__warn) from [<c0349e34>] (warn_slowpath_null+0x40/0x48)
      [    2.139549] [<c0349e34>] (warn_slowpath_null) from [<c0f19d74>] (devlink_free+0x1b4/0x1c0)
      [    2.147772] [<c0f19d74>] (devlink_free) from [<c1064fc0>] (dsa_switch_teardown+0x60/0x6c)
      [    2.155907] [<c1064fc0>] (dsa_switch_teardown) from [<c1065950>] (dsa_register_switch+0x8e4/0xaa8)
      [    2.164821] [<c1065950>] (dsa_register_switch) from [<c0ba7fe4>] (sja1105_probe+0x21c/0x2ec)
      [    2.173216] [<c0ba7fe4>] (sja1105_probe) from [<c0b35948>] (spi_drv_probe+0x80/0xa4)
      [    2.180920] [<c0b35948>] (spi_drv_probe) from [<c0a4c1cc>] (really_probe+0x108/0x400)
      [    2.188711] [<c0a4c1cc>] (really_probe) from [<c0a4c694>] (driver_probe_device+0x78/0x1bc)
      [    2.196933] [<c0a4c694>] (driver_probe_device) from [<c0a4a3dc>] (bus_for_each_drv+0x58/0xb8)
      [    2.205414] [<c0a4a3dc>] (bus_for_each_drv) from [<c0a4c024>] (__device_attach+0xd0/0x168)
      [    2.213637] [<c0a4c024>] (__device_attach) from [<c0a4b1d0>] (bus_probe_device+0x84/0x8c)
      [    2.221772] [<c0a4b1d0>] (bus_probe_device) from [<c0a4b72c>] (deferred_probe_work_func+0x84/0xc4)
      [    2.230686] [<c0a4b72c>] (deferred_probe_work_func) from [<c03650a4>] (process_one_work+0x218/0x510)
      [    2.239772] [<c03650a4>] (process_one_work) from [<c03660d8>] (worker_thread+0x2a8/0x5c0)
      [    2.247908] [<c03660d8>] (worker_thread) from [<c036b348>] (kthread+0x148/0x150)
      [    2.255265] [<c036b348>] (kthread) from [<c03010e8>] (ret_from_fork+0x14/0x2c)
      [    2.262444] Exception stack(0xea965fb0 to 0xea965ff8)
      [    2.267466] 5fa0:                                     00000000 00000000 00000000 00000000
      [    2.275598] 5fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
      [    2.283729] 5fe0: 00000000 00000000 00000000 00000000 00000013 00000000
      [    2.290333] ---[ end trace ca5d506728a0581a ]---
      
      devlink_free is complaining right here:
      
      	WARN_ON(!list_empty(&devlink->port_list));
      
      This happens because devlink_port_unregister is no longer done right
      away in dsa_port_setup when a DSA_PORT_TYPE_USER has failed.
      Vivien said about this change that:
      
          Also no need to call devlink_port_unregister from within dsa_port_setup
          as this step is inconditionally handled by dsa_port_teardown on error.
      
      which is not really true. The devlink_port_unregister function _is_
      being called unconditionally from within dsa_port_setup, but not for
      this port that just failed, just for the previous ones which were set
      up.
      
      ports_teardown:
      	for (i = 0; i < port; i++)
      		dsa_port_teardown(&ds->ports[i]);
      
      Initially I was tempted to fix this by extending the "for" loop to also
      cover the port that failed during setup. But this could have potentially
      unforeseen consequences unrelated to devlink_port or even other types of
      ports than user ports, which I can't really test for. For example, if
      for some reason devlink_port_register itself would fail, then
      unconditionally unregistering it in dsa_port_teardown would not be a
      smart idea. The list might go on.
      
      So just make dsa_port_setup undo the setup it had done upon failure, and
      let the for loop undo the work of setting up the previous ports, which
      are guaranteed to be brought up to a consistent state.
      
      Fixes: 955222ca ("net: dsa: use a single switch statement for port setup")
      Signed-off-by: NVladimir Oltean <olteanv@gmail.com>
      Reviewed-by: NVivien Didelot <vivien.didelot@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4ba0ebbc
  20. 28 8月, 2019 1 次提交
    • V
      net: dsa: remove bitmap operations · e65d45cc
      Vivien Didelot 提交于
      The bitmap operations were introduced to simplify the switch drivers
      in the future, since most of them could implement the common VLAN and
      MDB operations (add, del, dump) with simple functions taking all target
      ports at once, and thus limiting the number of hardware accesses.
      
      Programming an MDB or VLAN this way in a single operation would clearly
      simplify the drivers a lot but would require a new get-set interface
      in DSA. The usage of such bitmap from the stack also raised concerned
      in the past, leading to the dynamic allocation of a new ds->_bitmap
      member in the dsa_switch structure. So let's get rid of them for now.
      
      This commit nicely wraps the ds->ops->port_{mdb,vlan}_{prepare,add}
      switch operations into new dsa_switch_{mdb,vlan}_{prepare,add}
      variants not using any bitmap argument anymore.
      
      New dsa_switch_{mdb,vlan}_match helpers have been introduced to make
      clear which local port of a switch must be programmed with the target
      object. While the targeted user port is an obvious candidate, the
      DSA links must also be programmed, as well as the CPU port for VLANs.
      
      While at it, also remove local variables that are only used once.
      Signed-off-by: NVivien Didelot <vivien.didelot@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e65d45cc
  21. 21 8月, 2019 2 次提交
  22. 09 6月, 2019 1 次提交
  23. 31 5月, 2019 1 次提交