1. 15 2月, 2021 1 次提交
  2. 13 2月, 2021 1 次提交
  3. 28 1月, 2021 1 次提交
    • R
      net: switchdev: don't set port_obj_info->handled true when -EOPNOTSUPP · 20776b46
      Rasmus Villemoes 提交于
      It's not true that switchdev_port_obj_notify() only inspects the
      ->handled field of "struct switchdev_notifier_port_obj_info" if
      call_switchdev_blocking_notifiers() returns 0 - there's a WARN_ON()
      triggering for a non-zero return combined with ->handled not being
      true. But the real problem here is that -EOPNOTSUPP is not being
      properly handled.
      
      The wrapper functions switchdev_handle_port_obj_add() et al change a
      return value of -EOPNOTSUPP to 0, and the treatment of ->handled in
      switchdev_port_obj_notify() seems to be designed to change that back
      to -EOPNOTSUPP in case nobody actually acted on the notifier (i.e.,
      everybody returned -EOPNOTSUPP).
      
      Currently, as soon as some device down the stack passes the check_cb()
      check, ->handled gets set to true, which means that
      switchdev_port_obj_notify() cannot actually ever return -EOPNOTSUPP.
      
      This, for example, means that the detection of hardware offload
      support in the MRP code is broken: switchdev_port_obj_add() used by
      br_mrp_switchdev_send_ring_test() always returns 0, so since the MRP
      code thinks the generation of MRP test frames has been offloaded, no
      such frames are actually put on the wire. Similarly,
      br_mrp_switchdev_set_ring_role() also always returns 0, causing
      mrp->ring_role_offloaded to be set to 1.
      
      To fix this, continue to set ->handled true if any callback returns
      success or any error distinct from -EOPNOTSUPP. But if all the
      callbacks return -EOPNOTSUPP, make sure that ->handled stays false, so
      the logic in switchdev_port_obj_notify() can propagate that
      information.
      
      Fixes: 9a9f26e8 ("bridge: mrp: Connect MRP API with the switchdev API")
      Fixes: f30f0601 ("switchdev: Add helpers to aid traversal through lower devices")
      Reviewed-by: NPetr Machata <petrm@nvidia.com>
      Signed-off-by: NRasmus Villemoes <rasmus.villemoes@prevas.dk>
      Link: https://lore.kernel.org/r/20210125124116.102928-1-rasmus.villemoes@prevas.dkSigned-off-by: NJakub Kicinski <kuba@kernel.org>
      20776b46
  4. 12 1月, 2021 3 次提交
    • V
      net: switchdev: remove the transaction structure from port attributes · bae33f2b
      Vladimir Oltean 提交于
      Since the introduction of the switchdev API, port attributes were
      transmitted to drivers for offloading using a two-step transactional
      model, with a prepare phase that was supposed to catch all errors, and a
      commit phase that was supposed to never fail.
      
      Some classes of failures can never be avoided, like hardware access, or
      memory allocation. In the latter case, merely attempting to move the
      memory allocation to the preparation phase makes it impossible to avoid
      memory leaks, since commit 91cf8ece ("switchdev: Remove unused
      transaction item queue") which has removed the unused mechanism of
      passing on the allocated memory between one phase and another.
      
      It is time we admit that separating the preparation from the commit
      phase is something that is best left for the driver to decide, and not
      something that should be baked into the API, especially since there are
      no switchdev callers that depend on this.
      
      This patch removes the struct switchdev_trans member from switchdev port
      attribute notifier structures, and converts drivers to not look at this
      member.
      
      In part, this patch contains a revert of my previous commit 2e554a7a
      ("net: dsa: propagate switchdev vlan_filtering prepare phase to
      drivers").
      
      For the most part, the conversion was trivial except for:
      - Rocker's world implementation based on Broadcom OF-DPA had an odd
        implementation of ofdpa_port_attr_bridge_flags_set. The conversion was
        done mechanically, by pasting the implementation twice, then only
        keeping the code that would get executed during prepare phase on top,
        then only keeping the code that gets executed during the commit phase
        on bottom, then simplifying the resulting code until this was obtained.
      - DSA's offloading of STP state, bridge flags, VLAN filtering and
        multicast router could be converted right away. But the ageing time
        could not, so a shim was introduced and this was left for a further
        commit.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Acked-by: NLinus Walleij <linus.walleij@linaro.org>
      Acked-by: NJiri Pirko <jiri@nvidia.com>
      Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> # hellcreek
      Reviewed-by: Linus Walleij <linus.walleij@linaro.org> # RTL8366RB
      Reviewed-by: NIdo Schimmel <idosch@nvidia.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      bae33f2b
    • V
      net: switchdev: delete switchdev_port_obj_add_now · cf6def51
      Vladimir Oltean 提交于
      After the removal of the transactional model inside
      switchdev_port_obj_add_now, it has no added value and we can just call
      switchdev_port_obj_notify directly, bypassing this function. Let's
      delete it.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Reviewed-by: NIdo Schimmel <idosch@nvidia.com>
      Acked-by: NLinus Walleij <linus.walleij@linaro.org>
      Acked-by: NJiri Pirko <jiri@nvidia.com>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      cf6def51
    • V
      net: switchdev: remove the transaction structure from port object notifiers · ffb68fc5
      Vladimir Oltean 提交于
      Since the introduction of the switchdev API, port objects were
      transmitted to drivers for offloading using a two-step transactional
      model, with a prepare phase that was supposed to catch all errors, and a
      commit phase that was supposed to never fail.
      
      Some classes of failures can never be avoided, like hardware access, or
      memory allocation. In the latter case, merely attempting to move the
      memory allocation to the preparation phase makes it impossible to avoid
      memory leaks, since commit 91cf8ece ("switchdev: Remove unused
      transaction item queue") which has removed the unused mechanism of
      passing on the allocated memory between one phase and another.
      
      It is time we admit that separating the preparation from the commit
      phase is something that is best left for the driver to decide, and not
      something that should be baked into the API, especially since there are
      no switchdev callers that depend on this.
      
      This patch removes the struct switchdev_trans member from switchdev port
      object notifier structures, and converts drivers to not look at this
      member.
      
      Where driver conversion is trivial (like in the case of the Marvell
      Prestera driver, NXP DPAA2 switch, TI CPSW, and Rocker drivers), it is
      done in this patch.
      
      Where driver conversion needs more attention (DSA, Mellanox Spectrum),
      the conversion is left for subsequent patches and here we only fake the
      prepare/commit phases at a lower level, just not in the switchdev
      notifier itself.
      
      Where the code has a natural structure that is best left alone as a
      preparation and a commit phase (as in the case of the Ocelot switch),
      that structure is left in place, just made to not depend upon the
      switchdev transactional model.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Acked-by: NLinus Walleij <linus.walleij@linaro.org>
      Acked-by: NJiri Pirko <jiri@nvidia.com>
      Reviewed-by: NIdo Schimmel <idosch@nvidia.com>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      ffb68fc5
  5. 24 9月, 2020 1 次提交
  6. 14 7月, 2020 1 次提交
  7. 27 2月, 2020 1 次提交
  8. 18 2月, 2020 1 次提交
  9. 31 5月, 2019 1 次提交
  10. 02 3月, 2019 1 次提交
  11. 28 2月, 2019 2 次提交
  12. 25 2月, 2019 1 次提交
  13. 07 2月, 2019 1 次提交
  14. 18 1月, 2019 1 次提交
  15. 13 12月, 2018 3 次提交
  16. 24 11月, 2018 3 次提交
    • P
      switchdev: Replace port obj add/del SDO with a notification · d17d9f5e
      Petr Machata 提交于
      Drop switchdev_ops.switchdev_port_obj_add and _del. Drop the uses of
      this field from all clients, which were migrated to use switchdev
      notification in the previous patches.
      
      Add a new function switchdev_port_obj_notify() that sends the switchdev
      notifications SWITCHDEV_PORT_OBJ_ADD and _DEL.
      
      Update switchdev_port_obj_del_now() to dispatch to this new function.
      Drop __switchdev_port_obj_add() and update switchdev_port_obj_add()
      likewise.
      Signed-off-by: NPetr Machata <petrm@mellanox.com>
      Reviewed-by: NIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d17d9f5e
    • P
      switchdev: Add helpers to aid traversal through lower devices · f30f0601
      Petr Machata 提交于
      After the transition from switchdev operations to notifier chain (which
      will take place in following patches), the onus is on the driver to find
      its own devices below possible layer of LAG or other uppers.
      
      The logic to do so is fairly repetitive: each driver is looking for its
      own devices among the lowers of the notified device. For those that it
      finds, it calls a handler. To indicate that the event was handled,
      struct switchdev_notifier_port_obj_info.handled is set. The differences
      lie only in what constitutes an "own" device and what handler to call.
      
      Therefore abstract this logic into two helpers,
      switchdev_handle_port_obj_add() and switchdev_handle_port_obj_del(). If
      a driver only supports physical ports under a bridge device, it will
      simply avoid this layer of indirection.
      
      One area where this helper diverges from the current switchdev behavior
      is the case of mixed lowers, some of which are switchdev ports and some
      of which are not. Previously, such scenario would fail with -EOPNOTSUPP.
      The helper could do that for lowers for which the passed-in predicate
      doesn't hold. That would however break the case that switchdev ports
      from several different drivers are stashed under one master, a scenario
      that switchdev currently happily supports. Therefore tolerate any and
      all unknown netdevices, whether they are backed by a switchdev driver
      or not.
      Signed-off-by: NPetr Machata <petrm@mellanox.com>
      Reviewed-by: NIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f30f0601
    • P
      switchdev: Add a blocking notifier chain · a93e3b17
      Petr Machata 提交于
      In general one can't assume that a switchdev notifier is called in a
      non-atomic context, and correspondingly, the switchdev notifier chain is
      an atomic one.
      
      However, port object addition and deletion messages are delivered from a
      process context. Even the MDB addition messages, whose delivery is
      scheduled from atomic context, are queued and the delivery itself takes
      place in blocking context. For VLAN messages in particular, keeping the
      blocking nature is important for error reporting.
      
      Therefore introduce a blocking notifier chain and related service
      functions to distribute the notifications for which a blocking context
      can be assumed.
      Signed-off-by: NPetr Machata <petrm@mellanox.com>
      Reviewed-by: NJiri Pirko <jiri@mellanox.com>
      Reviewed-by: NIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a93e3b17
  17. 10 11月, 2017 1 次提交
  18. 08 8月, 2017 2 次提交
  19. 09 6月, 2017 1 次提交
  20. 14 4月, 2017 1 次提交
  21. 30 10月, 2016 1 次提交
  22. 19 10月, 2016 1 次提交
    • I
      switchdev: Execute bridge ndos only for bridge ports · 97c24290
      Ido Schimmel 提交于
      We recently got the following warning after setting up a vlan device on
      top of an offloaded bridge and executing 'bridge link':
      
      WARNING: CPU: 0 PID: 18566 at drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c:81 mlxsw_sp_port_orig_get.part.9+0x55/0x70 [mlxsw_spectrum]
      [...]
       CPU: 0 PID: 18566 Comm: bridge Not tainted 4.8.0-rc7 #1
       Hardware name: Mellanox Technologies Ltd. Mellanox switch/Mellanox switch, BIOS 4.6.5 05/21/2015
        0000000000000286 00000000e64ab94f ffff880406e6f8f0 ffffffff8135eaa3
        0000000000000000 0000000000000000 ffff880406e6f930 ffffffff8108c43b
        0000005106e6f988 ffff8803df398840 ffff880403c60108 ffff880406e6f990
       Call Trace:
        [<ffffffff8135eaa3>] dump_stack+0x63/0x90
        [<ffffffff8108c43b>] __warn+0xcb/0xf0
        [<ffffffff8108c56d>] warn_slowpath_null+0x1d/0x20
        [<ffffffffa01420d5>] mlxsw_sp_port_orig_get.part.9+0x55/0x70 [mlxsw_spectrum]
        [<ffffffffa0142195>] mlxsw_sp_port_attr_get+0xa5/0xb0 [mlxsw_spectrum]
        [<ffffffff816f151f>] switchdev_port_attr_get+0x4f/0x140
        [<ffffffff816f15d0>] switchdev_port_attr_get+0x100/0x140
        [<ffffffff816f15d0>] switchdev_port_attr_get+0x100/0x140
        [<ffffffff816f1d6b>] switchdev_port_bridge_getlink+0x5b/0xc0
        [<ffffffff816f2680>] ? switchdev_port_fdb_dump+0x90/0x90
        [<ffffffff815f5427>] rtnl_bridge_getlink+0xe7/0x190
        [<ffffffff8161a1b2>] netlink_dump+0x122/0x290
        [<ffffffff8161b0df>] __netlink_dump_start+0x15f/0x190
        [<ffffffff815f5340>] ? rtnl_bridge_dellink+0x230/0x230
        [<ffffffff815fab46>] rtnetlink_rcv_msg+0x1a6/0x220
        [<ffffffff81208118>] ? __kmalloc_node_track_caller+0x208/0x2c0
        [<ffffffff815f5340>] ? rtnl_bridge_dellink+0x230/0x230
        [<ffffffff815fa9a0>] ? rtnl_newlink+0x890/0x890
        [<ffffffff8161cf54>] netlink_rcv_skb+0xa4/0xc0
        [<ffffffff815f56f8>] rtnetlink_rcv+0x28/0x30
        [<ffffffff8161c92c>] netlink_unicast+0x18c/0x240
        [<ffffffff8161ccdb>] netlink_sendmsg+0x2fb/0x3a0
        [<ffffffff815c5a48>] sock_sendmsg+0x38/0x50
        [<ffffffff815c6031>] SYSC_sendto+0x101/0x190
        [<ffffffff815c7111>] ? __sys_recvmsg+0x51/0x90
        [<ffffffff815c6b6e>] SyS_sendto+0xe/0x10
        [<ffffffff817017f2>] entry_SYSCALL_64_fastpath+0x1a/0xa4
      
      The problem is that the 8021q module propagates the call to
      ndo_bridge_getlink() via switchdev ops, but the switch driver doesn't
      recognize the netdev, as it's not offloaded.
      
      While we can ignore calls being made to non-bridge ports inside the
      driver, a better fix would be to push this check up to the switchdev
      layer.
      
      Note that these ndos can be called for non-bridged netdev, but this only
      happens in certain PF drivers which don't call the corresponding
      switchdev functions anyway.
      
      Fixes: 99f44bb3 ("mlxsw: spectrum: Enable L3 interfaces on top of bridge devices")
      Signed-off-by: NIdo Schimmel <idosch@mellanox.com>
      Reported-by: NTamir Winetroub <tamirw@mellanox.com>
      Tested-by: NTamir Winetroub <tamirw@mellanox.com>
      Signed-off-by: NJiri Pirko <jiri@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      97c24290
  23. 28 9月, 2016 2 次提交
  24. 02 9月, 2016 1 次提交
    • R
      rtnetlink: fdb dump: optimize by saving last interface markers · d297653d
      Roopa Prabhu 提交于
      fdb dumps spanning multiple skb's currently restart from the first
      interface again for every skb. This results in unnecessary
      iterations on the already visited interfaces and their fdb
      entries. In large scale setups, we have seen this to slow
      down fdb dumps considerably. On a system with 30k macs we
      see fdb dumps spanning across more than 300 skbs.
      
      To fix the problem, this patch replaces the existing single fdb
      marker with three markers: netdev hash entries, netdevs and fdb
      index to continue where we left off instead of restarting from the
      first netdev. This is consistent with link dumps.
      
      In the process of fixing the performance issue, this patch also
      re-implements fix done by
      commit 472681d5 ("net: ndo_fdb_dump should report -EMSGSIZE to rtnl_fdb_dump")
      (with an internal fix from Wilson Kok) in the following ways:
      - change ndo_fdb_dump handlers to return error code instead
      of the last fdb index
      - use cb->args strictly for dump frag markers and not error codes.
      This is consistent with other dump functions.
      
      Below results were taken on a system with 1000 netdevs
      and 35085 fdb entries:
      before patch:
      $time bridge fdb show | wc -l
      15065
      
      real    1m11.791s
      user    0m0.070s
      sys 1m8.395s
      
      (existing code does not return all macs)
      
      after patch:
      $time bridge fdb show | wc -l
      35085
      
      real    0m2.017s
      user    0m0.113s
      sys 0m1.942s
      Signed-off-by: NRoopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: NWilson Kok <wkok@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d297653d
  25. 27 8月, 2016 2 次提交
    • I
      bridge: switchdev: Add forward mark support for stacked devices · 6bc506b4
      Ido Schimmel 提交于
      switchdev_port_fwd_mark_set() is used to set the 'offload_fwd_mark' of
      port netdevs so that packets being flooded by the device won't be
      flooded twice.
      
      It works by assigning a unique identifier (the ifindex of the first
      bridge port) to bridge ports sharing the same parent ID. This prevents
      packets from being flooded twice by the same switch, but will flood
      packets through bridge ports belonging to a different switch.
      
      This method is problematic when stacked devices are taken into account,
      such as VLANs. In such cases, a physical port netdev can have upper
      devices being members in two different bridges, thus requiring two
      different 'offload_fwd_mark's to be configured on the port netdev, which
      is impossible.
      
      The main problem is that packet and netdev marking is performed at the
      physical netdev level, whereas flooding occurs between bridge ports,
      which are not necessarily port netdevs.
      
      Instead, packet and netdev marking should really be done in the bridge
      driver with the switch driver only telling it which packets it already
      forwarded. The bridge driver will mark such packets using the mark
      assigned to the ingress bridge port and will prevent the packet from
      being forwarded through any bridge port sharing the same mark (i.e.
      having the same parent ID).
      
      Remove the current switchdev 'offload_fwd_mark' implementation and
      instead implement the proposed method. In addition, make rocker - the
      sole user of the mark - use the proposed method.
      Signed-off-by: NIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: NJiri Pirko <jiri@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6bc506b4
    • I
      switchdev: Support parent ID comparison for stacked devices · 5c326ab4
      Ido Schimmel 提交于
      switchdev_port_same_parent_id() currently expects port netdevs, but we
      need it to support stacked devices in the next patch, so drop the
      NO_RECURSE flag.
      Signed-off-by: NIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: NJiri Pirko <jiri@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5c326ab4
  26. 16 8月, 2016 1 次提交
  27. 15 7月, 2016 1 次提交
  28. 18 5月, 2016 1 次提交
  29. 25 4月, 2016 1 次提交
  30. 25 3月, 2016 1 次提交