1. 11 Aug, 2021 5 commits
  2. 05 Aug, 2021 2 commits
    • net: bridge: fix ioctl old_deviceless bridge argument · cbd7ad29
      Nikolay Aleksandrov committed
      Commit ad2f99ae ("net: bridge: move bridge ioctls out of .ndo_do_ioctl")
      changed the source of the argument copy in bridge's old_deviceless() from
      args[1] (user ptr to device name) to uarg (ptr to ioctl arguments) causing
      wrong device name to be used.
      
      Example (broken, bridge exists but is up):
      $ brctl delbr bridge
      bridge bridge doesn't exist; can't delete it
      
      Example (working):
      $ brctl delbr bridge
      bridge bridge is still up; can't delete it
      
      Fixes: ad2f99ae ("net: bridge: move bridge ioctls out of .ndo_do_ioctl")
      Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
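
The difference between copying from uarg and copying from args[1] can be shown with a small user-space model. This is only a sketch: copy_name_buggy(), copy_name_fixed(), CMD_DEL_BRIDGE and the four-word argument block are assumptions made for illustration, and memcpy()/strncpy() stand in for the kernel's copy from user space.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define IFNAMSIZ 16        /* same size as the kernel constant */
#define CMD_DEL_BRIDGE 3   /* illustrative command code (assumption) */

/*
 * Model of the ioctl argument block: args[0] is the command and
 * args[1] carries a pointer to the device name (per the commit text).
 * Four words are used so the block is at least IFNAMSIZ bytes long.
 */

/* Buggy variant: copies the "name" from the argument block itself. */
static void copy_name_buggy(const uintptr_t args[4], char buf[IFNAMSIZ])
{
    memcpy(buf, args, IFNAMSIZ);
    buf[IFNAMSIZ - 1] = '\0';
}

/* Fixed variant: follows args[1] to the actual device name. */
static void copy_name_fixed(const uintptr_t args[4], char buf[IFNAMSIZ])
{
    strncpy(buf, (const char *)args[1], IFNAMSIZ - 1);
    buf[IFNAMSIZ - 1] = '\0';
}
```

With args = { CMD_DEL_BRIDGE, (uintptr_t)"bridge", 0, 0 }, the fixed variant yields "bridge", while the buggy one yields the raw bytes of the command word, which explains the "doesn't exist" message in the broken example.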
    • net: bridge: fix ioctl locking · 893b1958
      Nikolay Aleksandrov committed
      Before commit ad2f99ae ("net: bridge: move bridge ioctls out of
      .ndo_do_ioctl") the bridge ioctl calls were divided in two parts:
      one was deviceless called by sock_ioctl and didn't expect rtnl to be held,
      the other was with a device called by dev_ifsioc() and expected rtnl to be
      held. After the commit above they were united in a single ioctl stub, but
      it didn't take care of the locking expectations.
      For sock_ioctl now we acquire  (1) br_ioctl_mutex, (2) rtnl
      and for dev_ifsioc we acquire  (1) rtnl,           (2) br_ioctl_mutex
      
      The fix is to get a refcnt on the netdev for dev_ifsioc calls and drop rtnl
      then to reacquire it in the bridge ioctl stub after br_ioctl_mutex has
      been acquired. That will avoid playing locking games and make the rules
      straight-forward: we always take br_ioctl_mutex first, and then rtnl.
      
      Reported-by: syzbot+34fe5894623c4ab1b379@syzkaller.appspotmail.com
      Fixes: ad2f99ae ("net: bridge: move bridge ioctls out of .ndo_do_ioctl")
      Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
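
The lock ordering rule can be checked with a toy order tracker. This is a user-space sketch, not kernel code: struct lock_state, lock_acquire() and the two path functions are hypothetical names, and the "locks" are plain flags that only record acquisition order.

```c
#include <assert.h>
#include <stdbool.h>

enum lock_id { BR_IOCTL_MUTEX = 0, RTNL = 1, NUM_LOCKS };

struct lock_state {
    bool held[NUM_LOCKS];
};

/*
 * Rule from the commit: br_ioctl_mutex is always taken before rtnl.
 * Returns false if taking `id` now would invert that order.
 */
static bool lock_acquire(struct lock_state *s, enum lock_id id)
{
    if (id == BR_IOCTL_MUTEX && s->held[RTNL])
        return false;
    s->held[id] = true;
    return true;
}

static void lock_release(struct lock_state *s, enum lock_id id)
{
    s->held[id] = false;
}

/* Old device path: reaches the bridge stub with rtnl already held. */
static bool old_dev_ifsioc_path(struct lock_state *s)
{
    bool ok = lock_acquire(s, RTNL);

    ok = lock_acquire(s, BR_IOCTL_MUTEX) && ok;
    lock_release(s, BR_IOCTL_MUTEX);
    lock_release(s, RTNL);
    return ok;
}

/* Fixed device path: pin the netdev, drop rtnl, then lock in order. */
static bool fixed_dev_ifsioc_path(struct lock_state *s, int *dev_refcnt)
{
    bool ok = lock_acquire(s, RTNL);   /* caller context holds rtnl */

    (*dev_refcnt)++;                   /* keeps the netdev alive without rtnl */
    lock_release(s, RTNL);

    ok = lock_acquire(s, BR_IOCTL_MUTEX) && ok;
    ok = lock_acquire(s, RTNL) && ok;
    /* ... bridge ioctl work would run here ... */
    lock_release(s, RTNL);
    lock_release(s, BR_IOCTL_MUTEX);

    (*dev_refcnt)--;
    return ok;
}
```

In this model the old path reports an order violation, while the fixed path always takes br_ioctl_mutex first and leaves the reference count balanced.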
  3. 04 Aug, 2021 2 commits
    • net: make switchdev_bridge_port_{,unoffload} loosely coupled with the bridge · 957e2235
      Vladimir Oltean committed
      With the introduction of explicit offloading API in switchdev in commit
      2f5dc00f ("net: bridge: switchdev: let drivers inform which bridge
      ports are offloaded"), we started having Ethernet switch drivers calling
      directly into a function exported by net/bridge/br_switchdev.c, which is
      a function exported by the bridge driver.
      
      This means that drivers that did not have an explicit dependency on the
      bridge before, like cpsw and am65-cpsw, now do - otherwise it is not
      possible to call a symbol exported by a driver that can be built as
      module unless you are a module too.
      
      There was an attempt to solve the dependency issue in the form of commit
      b0e81817 ("net: build all switchdev drivers as modules when the
      bridge is a module"). Grygorii Strashko, however, says about it:
      
      | In my opinion, the problem is a bit bigger here than just fixing the
      | build :(
      |
      | In case, of ^cpsw the switchdev mode is kinda optional and in many
      | cases (especially for testing purposes, NFS) the multi-mac mode is
      | still preferable mode.
      |
      | There were no such tight dependency between switchdev drivers and
      | bridge core before and switchdev serviced as independent, notification
      | based layer between them, so ^cpsw still can be "Y" and bridge can be
      | "M". Now for mostly every kernel build configuration the CONFIG_BRIDGE
      | will need to be set as "Y", or we will have to update drivers to
      | support build with BRIDGE=n and maintain separate builds for
      | networking vs non-networking testing.  But is this enough?  Wouldn't
      | it cause 'chain reaction' required to add more and more "Y" options
      | (like CONFIG_VLAN_8021Q)?
      |
      | PS. Just to be sure we on the same page - ARM builds will be forced
      | (with this patch) to have CONFIG_TI_CPSW_SWITCHDEV=m and so all our
      | automation testing will just fail with omap2plus_defconfig.
      
      In the light of this, it would be desirable for some configurations to
      avoid dependencies between switchdev drivers and the bridge, and have
      the switchdev mode as completely optional within the driver.
      
      Arnd Bergmann also tried to write a patch which better expressed the
      build time dependency for Ethernet switch drivers where the switchdev
      support is optional, like cpsw/am65-cpsw, and this made the drivers
      follow the bridge (compile as module if the bridge is a module) only if
      the optional switchdev support in the driver was enabled in the first
      place:
      https://patchwork.kernel.org/project/netdevbpf/patch/20210802144813.1152762-1-arnd@kernel.org/
      
      but this still did not solve the fact that cpsw and am65-cpsw now must
      be built as modules when the bridge is a module - it just expressed
      correctly that optional dependency. But the new behavior is an apparent
      regression from Grygorii's perspective.
      
      So to support the use case where the Ethernet driver is built-in,
      NET_SWITCHDEV (a bool option) is enabled, and the bridge is a module, we
      need a framework that can handle the possible absence of the bridge from
      the running system, i.e. runtime bloatware as opposed to build-time
      bloatware.
      
      Luckily we already have this framework, since switchdev has been using
      it extensively. Events from the bridge side are transmitted to the
      driver side using notifier chains - this was originally done so that
      unrelated drivers could snoop for events emitted by the bridge towards
      ports that are implemented by other drivers (think of a switch driver
      with LAG offload that listens for switchdev events on a bonding/team
      interface that it offloads).
      
      There are also events which are transmitted from the driver side to the
      bridge side, which again are modeled using notifiers.
      SWITCHDEV_FDB_ADD_TO_BRIDGE is an example of this, and deals with
      notifying the bridge that a MAC address has been dynamically learned.
      So there is a precedent we can use for modeling the new framework.
      
      The difference compared to SWITCHDEV_FDB_ADD_TO_BRIDGE is that the work
      that the bridge needs to do when a port becomes offloaded is blocking in
      its nature: replay VLANs, MDBs etc. The calling context is indeed
      blocking (we are under rtnl_mutex), but the existing switchdev
      notification chain that the bridge is subscribed to is only the atomic
      one. So we need to subscribe the bridge to the blocking switchdev
      notification chain too.
      
      This patch:
      - keeps the driver-side perception of the switchdev_bridge_port_{,un}offload
        unchanged
      - moves the implementation of switchdev_bridge_port_{,un}offload from
        the bridge module into the switchdev module.
      - makes everybody that is subscribed to the switchdev blocking notifier
        chain "hear" offload & unoffload events
      - makes the bridge driver subscribe and handle those events
      - moves the bridge driver's handling of those events into 2 new
        functions called br_switchdev_port_{,un}offload. These functions
        contain in fact the core of the logic that was previously in
        switchdev_bridge_port_{,un}offload, just that now we go through an
        extra indirection layer to reach them.
      
      Unlike all the other switchdev notification structures, the structure
      used to carry the bridge port information, struct
      switchdev_notifier_brport_info, does not contain a "bool handled".
      This is because in the current usage pattern, we always know that a
      switchdev bridge port offloading event will be handled by the bridge,
      because the switchdev_bridge_port_offload() call was initiated by a
      NETDEV_CHANGEUPPER event in the first place, where info->upper_dev is a
      bridge. So if the bridge wasn't loaded, then the CHANGEUPPER event
      couldn't have happened.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Tested-by: Grygorii Strashko <grygorii.strashko@ti.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
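
The loose coupling described in this commit relies on the notifier pattern: the caller publishes an event on a chain and does not link against whoever handles it. A minimal user-space sketch follows. All names here (struct notifier_chain, chain_register(), port_offload(), the event enum) are hypothetical stand-ins, not the kernel's notifier API.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SUBSCRIBERS 4

enum event { BRPORT_OFFLOADED, BRPORT_UNOFFLOADED };

typedef int (*notifier_fn)(enum event ev, void *info);

struct notifier_chain {
    notifier_fn fns[MAX_SUBSCRIBERS];
    size_t count;
};

static int chain_register(struct notifier_chain *c, notifier_fn fn)
{
    if (c->count == MAX_SUBSCRIBERS)
        return -1;
    c->fns[c->count++] = fn;
    return 0;
}

/* Calls every subscriber in turn; stops at the first error. */
static int chain_call(struct notifier_chain *c, enum event ev, void *info)
{
    for (size_t i = 0; i < c->count; i++) {
        int err = c->fns[i](ev, info);

        if (err)
            return err;
    }
    return 0;
}

/* Stand-in for the bridge's handler: counts offloaded ports. */
static int bridge_handler(enum event ev, void *info)
{
    int *offloaded_ports = info;

    if (ev == BRPORT_OFFLOADED)
        (*offloaded_ports)++;
    else if (ev == BRPORT_UNOFFLOADED)
        (*offloaded_ports)--;
    return 0;
}

/* Driver-facing entry point: lives next to the chain, not in the bridge. */
static int port_offload(struct notifier_chain *c, int *offloaded_ports)
{
    return chain_call(c, BRPORT_OFFLOADED, offloaded_ports);
}
```

If no handler is registered (the bridge module is absent), port_offload() still links and simply does nothing, which is the property that lets a built-in driver coexist with a modular bridge.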
    • net: bridge: switchdev: fix incorrect use of FDB flags when picking the dst device · 2e19bb35
      Vladimir Oltean committed
      Nikolay points out that it is incorrect to assume that it is impossible
      to have an fdb entry with fdb->dst == NULL and the BR_FDB_LOCAL bit in
      fdb->flags not set. This is because there are reader-side places that
      test_bit(BR_FDB_LOCAL, &fdb->flags) without the br->hash_lock, and if
      the updating of the FDB entry happens on another CPU, there are no
      memory barriers at writer or reader side which would ensure that the
      reader sees the updates to both fdb->flags and fdb->dst in the same
      order, i.e. the reader will not see an inconsistent FDB entry.
      
      So we must be prepared to deal with FDB entries where fdb->dst and
      fdb->flags are in a potentially inconsistent state, and that means that
      fdb->dst == NULL should remain a condition to pick the net_device that
      we report to switchdev as being the bridge device, which is what the
      code did prior to the blamed patch.
      
      Fixes: 52e4bec1 ("net: bridge: switchdev: treat local FDBs the same as entries towards the bridge")
      Suggested-by: Nikolay Aleksandrov <nikolay@nvidia.com>
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
      Link: https://lore.kernel.org/r/20210802113633.189831-1-vladimir.oltean@nxp.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
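
The selection rule argued for above (treat fdb->dst == NULL as "report the bridge device", independently of the local flag) can be sketched as follows. The struct definitions and pick_notify_dev() are simplified, hypothetical stand-ins for illustration and do not reproduce the kernel's types or the exact patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct net_device { const char *name; };
struct bridge_port { struct net_device *dev; };
struct bridge { struct net_device *dev; };

/*
 * Pick the device reported to switchdev. A NULL dst is checked on its
 * own, so an entry whose flags and dst are momentarily inconsistent
 * never leads to dereferencing a NULL port.
 */
static struct net_device *pick_notify_dev(const struct bridge *br,
                                          const struct bridge_port *dst,
                                          bool is_local)
{
    return (!dst || is_local) ? br->dev : dst->dev;
}
```

The three interesting inputs are: no port and not local (bridge device, no crash), a port and not local (the port's device), and a port with the local flag set (bridge device).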
  4. 03 Aug, 2021 1 commit
    • net: bridge: validate the NUD_PERMANENT bit when adding an extern_learn FDB entry · 0541a629
      Vladimir Oltean committed
      Currently it is possible to add broken extern_learn FDB entries to the
      bridge in two ways:
      
      1. Entries pointing towards the bridge device that are not local/permanent:
      
      ip link add br0 type bridge
      bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn static
      
      2. Entries pointing towards the bridge device or towards a port that
      are marked as local/permanent, however the bridge does not process the
      'permanent' bit in any way, therefore they are recorded as though they
      aren't permanent:
      
      ip link add br0 type bridge
      bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn permanent
      
      Since commit 52e4bec1 ("net: bridge: switchdev: treat local FDBs the
      same as entries towards the bridge"), these incorrect FDB entries can
      even trigger NULL pointer dereferences inside the kernel.
      
      This is because that commit made the assumption that all FDB entries
      that are not local/permanent have a valid destination port. For context,
      local / permanent FDB entries either have fdb->dst == NULL, and these
      point towards the bridge device and are therefore local and not to be
      used for forwarding, or have fdb->dst == a net_bridge_port structure
      (but are to be treated in the same way, i.e. not for forwarding).
      
      That assumption _is_ correct as long as things are working correctly in
      the bridge driver, i.e. we cannot logically have fdb->dst == NULL under
      any circumstance for FDB entries that are not local. However, the
      extern_learn code path where FDB entries are managed by a user space
      controller shows that it is possible for the bridge kernel driver to
      misinterpret the NUD flags of an entry transmitted by user space, and
      end up having fdb->dst == NULL while not being a local entry. This is
      invalid and should be rejected.
      
      Before, the two commands listed above both crashed the kernel in this
      check from br_switchdev_fdb_notify:
      
      	struct net_device *dev = info.is_local ? br->dev : dst->dev;
      
      info.is_local == false, dst == NULL.
      
      After this patch, the invalid entry added by the first command is
      rejected:
      
      ip link add br0 type bridge && bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn static; ip link del br0
      Error: bridge: FDB entry towards bridge must be permanent.
      
      and the valid entry added by the second command is properly treated as a
      local address and does not crash br_switchdev_fdb_notify anymore:
      
      ip link add br0 type bridge && bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn permanent; ip link del br0
      
      Fixes: eb100e0e ("net: bridge: allow to add externally learned entries from user-space")
      Reported-by: syzbot+9ba1174359adba5a5b7c@syzkaller.appspotmail.com
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
      Link: https://lore.kernel.org/r/20210801231730.7493-1-vladimir.oltean@nxp.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
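
The validation described in this commit can be modeled in a few lines. This is a hedged sketch: struct fdb_entry and add_ext_learned() are hypothetical, only the NUD_PERMANENT value is taken from the uapi header linux/neighbour.h, and the real kernel code paths are more involved.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define NUD_PERMANENT 0x80   /* value from linux/neighbour.h */

struct fdb_entry {
    bool has_port;   /* false models fdb->dst == NULL */
    bool is_local;   /* models the local/permanent flag */
};

/*
 * Model of adding an extern_learn entry from user space.
 * Case 1 from the commit: an entry towards the bridge that is not
 * permanent is rejected. Case 2: the permanent bit is honored, so the
 * entry is recorded as local.
 */
static int add_ext_learned(bool towards_bridge, unsigned int nud_state,
                           struct fdb_entry *out)
{
    bool permanent = (nud_state & NUD_PERMANENT) != 0;

    if (towards_bridge && !permanent)
        return -EINVAL;

    out->has_port = !towards_bridge;
    out->is_local = permanent;

    /* Invariant the rest of the driver relies on. */
    assert(out->has_port || out->is_local);
    return 0;
}
```

After this check, an accepted entry can never combine "no destination port" with "not local", which is the state that crashed br_switchdev_fdb_notify.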
  5. 02 Aug, 2021 1 commit
    • netfilter: ebtables: do not hook tables by default · 87663c39
      Florian Westphal committed
      If any of these modules is loaded, hooks get registered in all netns:
      
      Before: 'unshare -n nft list hooks' shows:
      family bridge hook prerouting {
      	-2147483648 ebt_broute
      	-0000000300 ebt_nat_hook
      }
      family bridge hook input {
      	-0000000200 ebt_filter_hook
      }
      family bridge hook forward {
      	-0000000200 ebt_filter_hook
      }
      family bridge hook output {
      	+0000000100 ebt_nat_hook
      	+0000000200 ebt_filter_hook
      }
      family bridge hook postrouting {
      	+0000000300 ebt_nat_hook
      }
      
      This adds template 'tables' for ebtables.
      
      Each ebtable_foo registers the table as a template, with an init function
      that gets called once the first get/setsockopt call is made.
      
      ebtables core then searches the (per netns) list of tables.
      If no table is found, it searches the list of templates instead.
      If a template entry exists, the init function is called which will
      enable the table and register the hooks (so packets are diverted
      to the table).
      
      If no entry is found in the template list, request_module is called.
      
      After this, hook registration is delayed until the 'ebtables'
      (set/getsockopt) request is made for a given table and will only
      happen in the specific namespace.
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
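
The lookup order described above (per-netns tables, then templates, then module load) can be sketched in user space. The names struct netns, find_table() and the result enum are hypothetical, and the real ebtables code differs in detail. The sketch only shows that hooks end up registered lazily and per namespace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_TABLES 4

/* Per-namespace state: names of tables whose hooks are registered. */
struct netns {
    const char *tables[MAX_TABLES];
    size_t ntables;
};

enum lookup_result { FOUND_TABLE, CREATED_FROM_TEMPLATE, NEED_MODULE };

static bool ns_has_table(const struct netns *ns, const char *name)
{
    for (size_t i = 0; i < ns->ntables; i++)
        if (strcmp(ns->tables[i], name) == 0)
            return true;
    return false;
}

/* Models what a get/setsockopt request for table `name` triggers. */
static enum lookup_result find_table(struct netns *ns,
                                     const char *const *templates,
                                     size_t ntemplates, const char *name)
{
    if (ns_has_table(ns, name))
        return FOUND_TABLE;

    for (size_t i = 0; i < ntemplates; i++) {
        if (strcmp(templates[i], name) != 0)
            continue;
        /* Template init: enable the table and hook it in this netns only. */
        assert(ns->ntables < MAX_TABLES);
        ns->tables[ns->ntables++] = templates[i];
        return CREATED_FROM_TEMPLATE;
    }

    /* The real code would request the module here and retry. */
    return NEED_MODULE;
}
```

A namespace that never issues a request for a table keeps an empty table list, which corresponds to 'nft list hooks' showing no ebtables hooks there.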
  6. 29 Jul, 2021 2 commits
    • net: bridge: switchdev: treat local FDBs the same as entries towards the bridge · 52e4bec1
      Vladimir Oltean committed
      Currently the following script:
      
      1. ip link add br0 type bridge vlan_filtering 1 && ip link set br0 up
      2. ip link set swp2 up && ip link set swp2 master br0
      3. ip link set swp3 up && ip link set swp3 master br0
      4. ip link set swp4 up && ip link set swp4 master br0
      5. bridge vlan del dev swp2 vid 1
      6. bridge vlan del dev swp3 vid 1
      7. ip link set swp4 nomaster
      8. ip link set swp3 nomaster
      
      produces the following output:
      
      [  641.010738] sja1105 spi0.1: port 2 failed to delete 00:1f:7b:63:02:48 vid 1 from fdb: -2
      
      [ swp2, swp3 and br0 all have the same MAC address, the one listed above ]
      
      In short, this happens because the number of FDB entry additions
      notified to switchdev is unbalanced with the number of deletions.
      
      At step 1, the bridge has a random MAC address. At step 2, the
      br_fdb_replay of swp2 receives this initial MAC address. Then the bridge
      inherits the MAC address of swp2 via br_fdb_change_mac_address(), and it
      notifies switchdev (only swp2 at this point) of the deletion of the
      random MAC address and the addition of 00:1f:7b:63:02:48 as a local FDB
      entry with fdb->dst == swp2, in VLANs 0 and the default_pvid (1).
      
      During step 7:
      
      del_nbp
      -> br_fdb_delete_by_port(br, p, vid=0, do_all=1);
         -> fdb_delete_local(br, p, f);
      
      br_fdb_delete_by_port() deletes all entries towards the ports,
      regardless of vid, because do_all is 1.
      
      fdb_delete_local() has logic to migrate local FDB entries deleted from
      one port to another port which shares the same MAC address and is in the
      same VLAN, or to the bridge device itself. This migration happens
      without notifying switchdev of the deletion on the old port and the
      addition on the new one, just fdb->dst is changed and the added_by_user
      flag is cleared.
      
      In the example above, the del_nbp(swp4) causes the
      "addr 00:1f:7b:63:02:48 vid 1" local FDB entry with fdb->dst == swp4
      that existed up until then to be migrated directly towards the bridge
      (fdb->dst == NULL). This is because it cannot be migrated to any of the
      other ports (swp2 and swp3 are not in VLAN 1).
      
      After the migration to br0 takes place, swp4 requests a deletion replay
      of all FDB entries. Since the "addr 00:1f:7b:63:02:48 vid 1" entry now
      points towards the bridge, a deletion of it is replayed. There was just
      a prior addition of this address, so the switchdev driver deletes this
      entry.
      
      Then, the del_nbp(swp3) at step 8 triggers another br_fdb_replay, and
      switchdev is notified again to delete "addr 00:1f:7b:63:02:48 vid 1".
      But it can't because it no longer has it, so it returns -ENOENT.
      
      There are other possibilities to trigger this issue, but this is by far
      the simplest to explain.
      
      To fix this, we must avoid the situation where the addition of an FDB
      entry is notified to switchdev as a local entry on a port, and the
      deletion is notified on the bridge itself.
      
      Considering that the 2 types of FDB entries are completely equivalent
      and we cannot have the same MAC address as a local entry on 2 bridge
      ports, or on a bridge port and pointing towards the bridge at the same
      time, it makes sense to hide away from switchdev completely the fact
      that a local FDB entry is associated with a given bridge port at all.
      Just say that it points towards the bridge, it should make no difference
      whatsoever to the switchdev driver and should even lead to a simpler
      overall implementation, with fewer cases to handle.
      
      This also avoids any modification at all to the core bridge driver, just
      what is reported to switchdev changes. With the local/permanent entries
      on bridge ports being already reported to user space, it is hard to
      believe that the bridge behavior can change in any backwards-incompatible
      way such as making all local FDB entries point towards the bridge.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: bridge: switchdev: replay the entire FDB for each port · b4454bc6
      Vladimir Oltean committed
      Currently when a switchdev port joins a bridge, we replay all FDB
      entries pointing towards that port or towards the bridge.
      
      However, this is insufficient in certain situations:
      
      (a) DSA, through its assisted_learning_on_cpu_port logic, snoops
          dynamically learned FDB entries on foreign interfaces.
          These are FDB entries that are pointing neither towards the newly
          joined switchdev port, nor towards the bridge. So these addresses
          would be missed when joining a bridge where a foreign interface has
          already learned some addresses, and they would also linger on if the
          DSA port leaves the bridge before the foreign interface forgets them.
          None of this happens if we replay the entire FDB when the port joins.
      
      (b) There is a desire to treat local FDB entries on a port (i.e. the
          port's termination MAC address) identically to FDB entries pointing
          towards the bridge itself. More details on the reason behind this in
          the next patch. The point is that this cannot be done given the
          current structure of br_fdb_replay() in this situation:
            ip link set swp0 master br0  # br0 inherits its MAC address from swp0
            ip link set swp1 master br0
          What is desirable is that when swp1 joins the bridge, br_fdb_replay()
          also notifies swp1 of br0's MAC address, but this won't in fact
          happen because the MAC address of br0 does not have fdb->dst == NULL
          (it doesn't point towards the bridge), but it has fdb->dst == swp0.
          So our current logic makes it impossible for that address to be
          replayed. But if we dump the entire FDB instead of just the entries
          with fdb->dst == swp1 and fdb->dst == NULL, then the inherited MAC
          address of br0 will be replayed too, which is what we need.
      
      A natural question arises: say there is an FDB entry to be replayed,
      like a MAC address dynamically learned on a foreign interface that
      belongs to a bridge where no switchdev port has joined yet. If 10
      switchdev ports belonging to the same driver join this bridge, one by
      one, won't every port get notified 10 times of the foreign FDB entry,
      amounting to a total of 100 notifications for this FDB entry in the
      switchdev driver?
      
      Well, yes, but this is where the "void *ctx" argument for br_fdb_replay
      is useful: every port of the switchdev driver is notified whenever any
      other port requests an FDB replay, but because the replay was initiated
      by a different port, its context is different from the initiating port's
      context, so it ignores those replays.
      
      So the foreign FDB entry will be installed only 10 times, once per port.
      This is done so that the following 4 code paths are always well balanced:
      (a) addition of foreign FDB entry is replayed when port joins bridge
      (b) deletion of foreign FDB entry is replayed when port leaves bridge
      (c) addition of foreign FDB entry is notified to all ports currently in bridge
      (d) deletion of foreign FDB entry is notified to all ports currently in bridge
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
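
The "100 notifications, 10 installs" argument can be reproduced with a toy model of the ctx check. This is a sketch under simplifying assumptions: struct port, port_handle_replay() and join_all() are hypothetical, and every replay is assumed to be delivered to all ports of the driver, as in the commit's own estimate.

```c
#include <assert.h>
#include <stddef.h>

struct port {
    int installed;   /* how many times the foreign entry was installed */
};

/* Handler run by each port of the driver for every replayed event. */
static void port_handle_replay(struct port *self, const void *ctx)
{
    if (ctx != self)
        return;          /* replay was requested by another port */
    self->installed++;
}

/*
 * Ports join the bridge one by one. Each join replays the foreign FDB
 * entry with ctx set to the joining port. Returns the number of
 * notifications delivered in total.
 */
static int join_all(struct port *ports, size_t nports)
{
    int deliveries = 0;

    for (size_t joining = 0; joining < nports; joining++) {
        for (size_t i = 0; i < nports; i++) {
            port_handle_replay(&ports[i], &ports[joining]);
            deliveries++;
        }
    }
    return deliveries;
}
```

With 10 ports the model delivers 100 notifications, yet each port installs the entry exactly once, because only the replay carrying its own context is acted upon.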
  7. 28 Jul, 2021 2 commits
    • net: bridge: move bridge ioctls out of .ndo_do_ioctl · ad2f99ae
      Arnd Bergmann committed
      Working towards obsoleting the .ndo_do_ioctl operation entirely,
      stop passing the SIOCBRADDIF/SIOCBRDELIF device ioctl commands
      into this callback.
      
      My first attempt was to add another ndo_siocbr() callback, but
      as there is only a single driver that takes these commands and
      there is already a hook mechanism to call directly into this
      driver, extend this hook instead, and use it for both the
      deviceless and the device specific ioctl commands.
      
      Cc: Roopa Prabhu <roopa@nvidia.com>
      Cc: Nikolay Aleksandrov <nikolay@nvidia.com>
      Cc: bridge@lists.linux-foundation.org
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bridge: use ndo_siocdevprivate · 561d8352
      Arnd Bergmann committed
      The bridge driver has an old set of ioctls using the SIOCDEVPRIVATE
      namespace that have never worked in compat mode and are explicitly
      forbidden already.
      
      Move them over to ndo_siocdevprivate and fix compat mode for these,
      because we can.
      
      Cc: Roopa Prabhu <roopa@nvidia.com>
      Cc: Nikolay Aleksandrov <nikolay@nvidia.com>
      Cc: bridge@lists.linux-foundation.org
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  8. 27 Jul, 2021 2 commits
    • net: bridge: add a helper for retrieving port VLANs from the data path · ee80dd2e
      Vladimir Oltean committed
      Introduce a brother of br_vlan_get_info() which is protected by the RCU
      mechanism, as opposed to br_vlan_get_info() which relies on taking the
      write-side rtnl_mutex.
      
      This is needed for drivers which need to find out whether a bridge port
      has a VLAN configured or not. For example, certain DSA switches might
      not offer complete source port identification to the CPU on RX, just the
      VLAN in which the packet was received. Based on this VLAN, we cannot set
      an accurate skb->dev ingress port, but at least we can configure one
      that behaves the same as the correct one would (this is possible because
      DSA sets skb->offload_fwd_mark = 1).
      
      When we look at the bridge RX handler (br_handle_frame), we see that
      what matters regarding skb->dev is the VLAN ID and the port STP state.
      So we need to select an skb->dev that has the same bridge VLAN as the
      packet we're receiving, and is in the LEARNING or FORWARDING STP state.
      The latter is easy, but for the former, we should somehow keep a shadow
      list of the bridge VLANs on each port, and a lookup table between VLAN
      ID and the 'designated port for imprecise RX'. That is rather
      complicated to keep in sync properly (the designated port per VLAN needs
      to be updated on the addition and removal of a VLAN, as well as on the
      join/leave events of the bridge on that port).
      
      So, to avoid all that complexity, let's just iterate through our finite
      number of ports and ask the bridge, for each packet: "do you have this
      VLAN configured on this port?".
      
      Cc: Roopa Prabhu <roopa@nvidia.com>
      Cc: Nikolay Aleksandrov <nikolay@nvidia.com>
      Cc: Ido Schimmel <idosch@nvidia.com>
      Cc: Jiri Pirko <jiri@nvidia.com>
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
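
The per-packet lookup proposed at the end of this message (iterate the ports and ask whether each has the VLAN) might look like the sketch below. Everything here is a hypothetical user-space model: struct port, the stp_state enum and imprecise_rx_port() are illustrative names, and port_has_vlan() merely stands in for the RCU-protected helper the commit introduces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_VIDS 8

/* Illustrative STP states, not the kernel's definitions. */
enum stp_state { STP_DISABLED, STP_LISTENING, STP_LEARNING,
                 STP_FORWARDING, STP_BLOCKING };

struct port {
    enum stp_state state;
    uint16_t vids[MAX_VIDS];
    size_t nvids;
};

/* Stand-in for asking the bridge "is this VLAN configured on this port?". */
static bool port_has_vlan(const struct port *p, uint16_t vid)
{
    for (size_t i = 0; i < p->nvids; i++)
        if (p->vids[i] == vid)
            return true;
    return false;
}

/*
 * Pick a port that behaves like the true ingress port would: it must be
 * in the LEARNING or FORWARDING state and be a member of the VLAN.
 * Returns NULL if no port qualifies.
 */
static const struct port *imprecise_rx_port(const struct port *ports,
                                            size_t nports, uint16_t vid)
{
    for (size_t i = 0; i < nports; i++) {
        const struct port *p = &ports[i];

        if (p->state != STP_LEARNING && p->state != STP_FORWARDING)
            continue;
        if (port_has_vlan(p, vid))
            return p;
    }
    return NULL;
}
```

The linear scan trades a little per-packet work for not having to maintain a shadow VLAN table that must be kept in sync with the bridge.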
    • net: bridge: update BROPT_VLAN_ENABLED before notifying switchdev in br_vlan_filter_toggle · f7cdb3ec
      Vladimir Oltean committed
      SWITCHDEV_ATTR_ID_BRIDGE_VLAN_FILTERING is notified by the bridge from
      two places:
      - nbp_vlan_init(), during bridge port creation
      - br_vlan_filter_toggle(), during a netlink/sysfs/ioctl change requested
        by user space
      
      If a switchdev driver uses br_vlan_enabled(br_dev) inside its handler
      for the SWITCHDEV_ATTR_ID_BRIDGE_VLAN_FILTERING attribute notifier,
      different things will be seen depending on whether the bridge calls from
      the first path or the second:
      - in nbp_vlan_init(), br_vlan_enabled() reflects the current state of
        the bridge
      - in br_vlan_filter_toggle(), br_vlan_enabled() reflects the past state
        of the bridge
      
      This can lead in some cases to complications in driver implementation,
      which can be avoided if these could reliably use br_vlan_enabled().
      
      Nothing seems to depend on this behavior, and it seems overall more
      straightforward for br_vlan_enabled() to return the proper value even
      during the SWITCHDEV_ATTR_ID_BRIDGE_VLAN_FILTERING notifier, so
      temporarily enable the bridge option, then revert it if the switchdev
      notifier failed.
      
      Cc: Roopa Prabhu <roopa@nvidia.com>
      Cc: Nikolay Aleksandrov <nikolay@nvidia.com>
      Cc: Ido Schimmel <idosch@nvidia.com>
      Cc: Jiri Pirko <jiri@nvidia.com>
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
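
The "set the option first, revert on failure" ordering can be sketched as follows. This is a simplified model: struct bridge, vlan_filter_toggle() and the two driver callbacks are hypothetical names, and the kernel's handling of specific error codes is not modeled.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

struct bridge {
    bool vlan_enabled;   /* models the BROPT_VLAN_ENABLED option */
};

typedef int (*attr_notifier)(const struct bridge *br, bool requested);

/*
 * Update the option before notifying, so the handler observes the new
 * state, and roll it back if the handler reports an error.
 */
static int vlan_filter_toggle(struct bridge *br, bool val,
                              attr_notifier notify)
{
    bool old = br->vlan_enabled;
    int err;

    if (old == val)
        return 0;

    br->vlan_enabled = val;
    err = notify(br, val);
    if (err) {
        br->vlan_enabled = old;
        return err;
    }
    return 0;
}

/* A driver that checks the bridge state from inside its handler. */
static int consistent_driver(const struct bridge *br, bool requested)
{
    return br->vlan_enabled == requested ? 0 : -EINVAL;
}

/* A driver that refuses every change. */
static int rejecting_driver(const struct bridge *br, bool requested)
{
    (void)br;
    (void)requested;
    return -EINVAL;
}
```

With the old ordering, consistent_driver() would have seen the previous state during the toggle. With the new ordering it sees the requested state, and a rejected change leaves the option untouched.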
  9. 25 Jul, 2021 1 commit
  10. 23 Jul, 2021 1 commit
    • net: bridge: switchdev: allow the TX data plane forwarding to be offloaded · 47211192
      Tobias Waldekranz committed
      Allow switchdevs to forward frames from the CPU in accordance with the
      bridge configuration in the same way as is done between bridge
      ports. This means that the bridge will only send a single skb towards
      one of the ports under the switchdev's control, and expects the driver
      to deliver the packet to all eligible ports in its domain.
      
      Primarily this improves the performance of multicast flows with
      multiple subscribers, as it allows the hardware to perform the frame
      replication.
      
      The basic flow between the driver and the bridge is as follows:
      
      - When joining a bridge port, the switchdev driver calls
        switchdev_bridge_port_offload() with tx_fwd_offload = true.
      
      - The bridge sends offloadable skbs to one of the ports under the
        switchdev's control using skb->offload_fwd_mark = true.
      
      - The switchdev driver checks the skb->offload_fwd_mark field and lets
        its FDB lookup select the destination port mask for this packet.
      
      v1->v2:
      - convert br_input_skb_cb::fwd_hwdoms to a plain unsigned long
      - introduce a static key "br_switchdev_fwd_offload_used" to minimize the
        impact of the newly introduced feature on all the setups which don't
        have hardware that can make use of it
      - introduce a check for nbp->flags & BR_FWD_OFFLOAD to optimize cache
        line access
      - reorder nbp_switchdev_frame_mark_accel() and br_handle_vlan() in
        __br_forward()
      - do not strip VLAN on egress if forwarding offload on VLAN-aware bridge
        is being used
      - propagate errors from .ndo_dfwd_add_station() if not EOPNOTSUPP
      
      v2->v3:
      - replace the solution based on .ndo_dfwd_add_station with a solution
        based on switchdev_bridge_port_offload
      - rename BR_FWD_OFFLOAD to BR_TX_FWD_OFFLOAD
      v3->v4: rebase
      v4->v5:
      - make sure the static key is decremented on bridge port unoffload
      - more function and variable renaming and comments for them:
        br_switchdev_fwd_offload_used to br_switchdev_tx_fwd_offload
        br_switchdev_accels_skb to br_switchdev_frame_uses_tx_fwd_offload
        nbp_switchdev_frame_mark_tx_fwd to nbp_switchdev_frame_mark_tx_fwd_to_hwdom
        nbp_switchdev_frame_mark_accel to nbp_switchdev_frame_mark_tx_fwd_offload
        fwd_accel to tx_fwd_offload
      Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
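
The effect of TX forwarding offload on the number of skbs the bridge sends can be illustrated with a small model. It is a sketch only: struct port and flood_skb_count() are hypothetical, hardware domains are assumed to be small positive integers (0 meaning "no hardware domain"), and the bitmask merely echoes the "plain unsigned long" fwd_hwdoms mentioned in the v1->v2 notes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct port {
    int hwdom;             /* 0: software port, >0: hardware domain id */
    bool tx_fwd_offload;   /* driver asked for TX forwarding offload */
};

/*
 * Count how many skbs the bridge sends when flooding one frame to all
 * ports. Ports with TX forwarding offload share a single skb per
 * hardware domain. The hardware replicates it to the other ports.
 */
static int flood_skb_count(const struct port *ports, size_t nports)
{
    unsigned long fwd_hwdoms = 0;   /* domains already served */
    int sent = 0;

    for (size_t i = 0; i < nports; i++) {
        const struct port *p = &ports[i];

        if (p->tx_fwd_offload && p->hwdom > 0) {
            unsigned long bit;

            assert(p->hwdom < 32);  /* fits in any unsigned long */
            bit = 1UL << p->hwdom;
            if (fwd_hwdoms & bit)
                continue;           /* this domain already got the skb */
            fwd_hwdoms |= bit;
        }
        sent++;
    }
    return sent;
}
```

For four offloading ports in one hardware domain plus one software port, the model sends two skbs instead of five, which is where the multicast performance gain comes from.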
  11. 22 Jul, 2021 8 commits
    • net: bridge: move the switchdev object replay helpers to "push" mode · 4e51bf44
      Vladimir Oltean committed
      Starting with commit 4f2673b3 ("net: bridge: add helper to replay
      port and host-joined mdb entries"), DSA has introduced some bridge
      helpers that replay switchdev events (FDB/MDB/VLAN additions and
      deletions) that can be lost by the switchdev drivers in a variety of
      circumstances:
      
      - an IP multicast group was host-joined on the bridge itself before any
        switchdev port joined the bridge, leading to the host MDB entries
        missing in the hardware database.
      - during the bridge creation process, the MAC address of the bridge was
        added to the FDB as an entry pointing towards the bridge device
        itself, but with no switchdev ports being part of the bridge yet, this
        local FDB entry would remain unknown to the switchdev hardware
        database.
      - a VLAN/FDB/MDB was added to a bridge port that is a LAG interface,
        before any switchdev port joined that LAG, leading to the hardware
        database missing those entries.
      - a switchdev port left a LAG that is a bridge port, while the LAG
        remained part of the bridge, and all FDB/MDB/VLAN entries remained
        installed in the hardware database of the switchdev port.
      
      Also, since commit 0d2cfbd4 ("net: bridge: ignore switchdev events
      for LAG ports which didn't request replay"), DSA introduced a method,
      based on a const void *ctx, to ensure that two switchdev ports under the
      same LAG that is a bridge port do not see the same MDB/VLAN entry being
      replayed twice by the bridge, once for every bridge port that joins the
      LAG.
      
      With so many ordering corner cases being possible, it seems unreasonable
      to expect a switchdev driver writer to get it right on the first try.
      Therefore, now that DSA has experimented with the bridge replay helpers
      for a little bit, we can move the code to the bridge driver where it is
      more readily available to all switchdev drivers.
      
      To convert the switchdev object replay helpers from "pull mode" (where
      the driver asks for them) to a "push mode" (where the bridge offers them
      automatically), the biggest problem is that the bridge needs to be aware
      when a switchdev port joins and leaves, even when the switchdev is only
      indirectly a bridge port (for example when the bridge port is a LAG
      upper of the switchdev).
      
      Luckily, we already have a hook for that, in the form of the newly
      introduced switchdev_bridge_port_offload() and
      switchdev_bridge_port_unoffload() calls. These offer a natural place for
      hooking the object addition and deletion replays.
      
      Extend the above 2 functions with:
      - pointers to the switchdev atomic notifier (for FDB replays) and the
        blocking notifier (for MDB and VLAN replays).
      - the "const void *ctx" argument required for drivers to be able to
        disambiguate between which port is targeted, when multiple ports are
        lowers of the same LAG that is a bridge port. Most of the drivers pass
        NULL to this argument, except the ones that support LAG offload and have
        the proper context check already in place in the switchdev blocking
        notifier handler.
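
      The role of the "const void *ctx" argument can be illustrated with a
      small user-space model. This is only a sketch: struct mock_port,
      struct mock_obj_event, mock_port_obj_add() and mock_replay_vlans() are
      hypothetical names, not bridge or switchdev API, and the check in the
      handler is an assumed form of the "proper context check" mentioned
      above.

```c
#include <stddef.h>

/* Hypothetical stand-in for a switchdev port private structure. */
struct mock_port {
	int installed; /* number of objects programmed into "hardware" */
};

/* Hypothetical event: ctx is NULL for live events, or points to the
 * port on whose behalf the replay was requested.
 */
struct mock_obj_event {
	const void *ctx;
	int vid;
};

/* Per-port handler: ignore replays requested on behalf of another port. */
int mock_port_obj_add(struct mock_port *port, const struct mock_obj_event *ev)
{
	if (ev->ctx && ev->ctx != port)
		return 0;

	port->installed++;
	return 0;
}

/* Deliver every VLAN to every lower of the LAG, tagged with ctx. */
int mock_replay_vlans(const void *ctx, struct mock_port **lowers,
		      size_t n_lowers, const int *vids, size_t n_vids)
{
	size_t i, j;
	int err;

	for (i = 0; i < n_vids; i++) {
		struct mock_obj_event ev = { ctx, vids[i] };

		for (j = 0; j < n_lowers; j++) {
			err = mock_port_obj_add(lowers[j], &ev);
			if (err)
				return err;
		}
	}
	return 0;
}
```

      In this model, a replay requested with ctx pointing at one lower only
      reaches that lower, so a second port joining the same LAG later does
      not cause the first port to install the same entries twice.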
      
      Also unexport the replay helpers, since nobody except the bridge calls
      them directly now.
      
      Note that:
      (a) we abuse the terminology slightly, because FDB entries are not
          "switchdev objects", but we count them as objects nonetheless.
          With no direct way to prove it, I think they are not modeled as
          switchdev objects because those can only be installed by the bridge
          to the hardware (as opposed to FDB entries which can be propagated
          in the other direction too). This is merely an abuse of terms: FDB
          entries are replayed too, despite not being objects.
      (b) the bridge does not attempt to sync port attributes to newly joined
          ports, just the countable stuff (the objects). The reason for this
          is simple: no universal and symmetric way to sync and unsync them is
          known. For example, VLAN filtering: what to do on unsync, disable or
          leave it enabled? Similarly, STP state, ageing timer, etc etc. What
          a switchdev port does when it becomes standalone again is not really
          up to the bridge's competence, and the driver should deal with it.
          On the other hand, replaying deletions of switchdev objects can be
          seen as a matter of cleanup and therefore be treated by the bridge,
          hence this patch.
      
      We make the replay helpers opt-in for drivers, because they might not
      bring immediate benefits for them:
      
      - nbp_vlan_init() is called _after_ netdev_master_upper_dev_link(),
        so br_vlan_replay() should not do anything for the new drivers on
        which we call it. The existing drivers where there was even a slight
        possibility for there to exist a VLAN on a bridge port before they
        join it are already guarded against this: mlxsw and prestera deny
        joining LAG interfaces that are members of a bridge.
      
      - br_fdb_replay() should now notify of local FDB entries, but I patched
        all drivers except DSA to ignore these new entries in commit
        2c4eca3e ("net: bridge: switchdev: include local flag in FDB
        notifications"). Driver authors can lift this restriction as they
        wish, and when they do, they can also opt into the FDB replay
        functionality.
      
      - br_mdb_replay() should fix a real issue which is described in commit
        4f2673b3 ("net: bridge: add helper to replay port and host-joined
        mdb entries"). However most drivers do not offload the
        SWITCHDEV_OBJ_ID_HOST_MDB to see this issue: only cpsw and am65_cpsw
        offload this switchdev object, and I don't completely understand the
        way in which they offload this switchdev object anyway. So I'll leave
        it up to these drivers' respective maintainers to opt into
        br_mdb_replay().
      
      So most of the drivers pass NULL notifier blocks for the replay helpers,
      except:
      - dpaa2-switch which was already acked/regression-tested with the
        helpers enabled (and there isn't much of a downside in having them)
      - ocelot which already had replay logic in "pull" mode
      - DSA which already had replay logic in "pull" mode
      
      An important observation is that the drivers which don't currently
      request bridge event replays don't even have the
      switchdev_bridge_port_{offload,unoffload} calls placed in proper places
      right now. This was done to avoid unnecessary rework for drivers which
      might never even add support for this. For driver writers who wish to
      add replay support, this can be used as a tentative placement guide:
      https://patchwork.kernel.org/project/netdevbpf/patch/20210720134655.892334-11-vladimir.oltean@nxp.com/
      
      Cc: Vadym Kochan <vkochan@marvell.com>
      Cc: Taras Chornyi <tchornyi@marvell.com>
      Cc: Ioana Ciornei <ioana.ciornei@nxp.com>
      Cc: Lars Povlsen <lars.povlsen@microchip.com>
      Cc: Steen Hegelund <Steen.Hegelund@microchip.com>
      Cc: UNGLinuxDriver@microchip.com
      Cc: Claudiu Manoil <claudiu.manoil@nxp.com>
      Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
      Cc: Grygorii Strashko <grygorii.strashko@ti.com>
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Acked-by: Ioana Ciornei <ioana.ciornei@nxp.com> # dpaa2-switch
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4e51bf44
    • V
      net: bridge: guard the switchdev replay helpers against a NULL notifier block · 7105b50b
      Vladimir Oltean committed
      There is a desire to make the object and FDB replay helpers optional
      when moving them inside the bridge driver. For example a certain driver
      might not offload host MDBs and there is no case where the replay
      helpers would be of immediate use to it.
      
      So it would be nice if we could allow drivers to pass NULL pointers for
      the atomic and blocking notifier blocks, and the replay helpers to do
      nothing in that case.
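
      The intended behavior could look roughly like the following
      user-space sketch. struct mock_notifier, mock_count_event() and
      mock_replay() are hypothetical stand-ins for the kernel notifier block
      and the replay helpers, used only to show the early return on NULL.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct notifier_block. */
struct mock_notifier {
	int (*call)(struct mock_notifier *nb, unsigned long event,
		    const void *data);
	int calls; /* how many events this notifier has received */
};

/* Example handler that only counts the events it sees. */
int mock_count_event(struct mock_notifier *nb, unsigned long event,
		     const void *data)
{
	(void)event;
	(void)data;
	nb->calls++;
	return 0;
}

/* Replay all entries to nb. A NULL nb means the driver did not opt in,
 * so there is nothing to do and no error to report.
 */
int mock_replay(struct mock_notifier *nb, const int *entries,
		size_t n_entries)
{
	size_t i;
	int err;

	if (!nb)
		return 0;

	for (i = 0; i < n_entries; i++) {
		err = nb->call(nb, 1UL, &entries[i]);
		if (err)
			return err;
	}
	return 0;
}
```
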
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7105b50b
    • V
      net: bridge: switchdev: let drivers inform which bridge ports are offloaded · 2f5dc00f
      Vladimir Oltean committed
      On reception of an skb, the bridge checks if it was marked as 'already
      forwarded in hardware' (checks if skb->offload_fwd_mark == 1), and if it
      is, it assigns the source hardware domain of that skb based on the
      hardware domain of the ingress port. Then during forwarding, it enforces
      that the egress port must have a different hardware domain than the
      ingress one (this is done in nbp_switchdev_allowed_egress).
      
      Non-switchdev drivers don't report any physical switch id (neither
      through devlink nor .ndo_get_port_parent_id), therefore the bridge
      assigns them a hardware domain of 0, and packets coming from them will
      always have skb->offload_fwd_mark = 0. So there aren't any restrictions.
      
      Problems appear due to the fact that DSA would like to perform software
      fallback for bonding and team interfaces that the physical switch cannot
      offload.
      
             +-- br0 ---+
            / /   |      \
           / /    |       \
          /  |    |      bond0
         /   |    |     /    \
       swp0 swp1 swp2 swp3 swp4
      
      There, it is desirable that the presence of swp3 and swp4 under a
      non-offloaded LAG does not preclude us from doing hardware bridging
      between swp0, swp1 and swp2. The bandwidth of the CPU is oftentimes high
      enough that software bridging between {swp0,swp1,swp2} and bond0 is not
      impractical.
      
      But this creates an impossible paradox given the current way in which
      port hardware domains are assigned. When the driver receives a packet
      from swp0 (say, due to flooding), it must set skb->offload_fwd_mark to
      something.
      
      - If we set it to 0, then the bridge will forward it towards swp1, swp2
        and bond0. But the switch has already forwarded it towards swp1 and
        swp2 (not to bond0, remember, that isn't offloaded, so as far as the
        switch is concerned, ports swp3 and swp4 are not looking up the FDB,
        and the entire bond0 is a destination that is strictly behind the
        CPU). But we don't want duplicated traffic towards swp1 and swp2, so
        it's not ok to set skb->offload_fwd_mark = 0.
      
      - If we set it to 1, then the bridge will not forward the skb towards
        the ports with the same switchdev mark, i.e. not to swp1, swp2 and
        bond0. Towards swp1 and swp2 that's ok, but towards bond0? It should
        have forwarded the skb there.
      
      So the real issue is that bond0 will be assigned the same hardware
      domain as {swp0,swp1,swp2}, because the function that assigns hardware
      domains to bridge ports, nbp_switchdev_add(), recurses through bond0's
      lower interfaces until it finds something that implements devlink (calls
      dev_get_port_parent_id with bool recurse = true). This is a problem
      because the fact that bond0 can be offloaded by swp3 and swp4 in our
      example is merely an assumption.
      
      A solution is to give the bridge explicit hints as to what hardware
      domain it should use for each port.
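
      The paradox and the effect of an explicit hint can be reproduced with
      a minimal user-space model of the egress check described at the top
      of this message. struct mock_skb, struct mock_port and
      mock_allowed_egress() are hypothetical names. The rule itself follows
      the description above: a frame marked as already forwarded in
      hardware may not egress a port in the same hardware domain as its
      ingress port.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the relevant skb state. */
struct mock_skb {
	bool offload_fwd_mark; /* driver says: hardware already forwarded */
	int src_hwdom;         /* hardware domain of the ingress port */
};

/* Hypothetical stand-in for a bridge port. */
struct mock_port {
	const char *name;
	int hwdom; /* 0 means "no hardware domain" */
};

/* Software forwarding is allowed unless the frame was already forwarded
 * in hardware within the egress port's own hardware domain.
 */
bool mock_allowed_egress(const struct mock_port *p,
			 const struct mock_skb *skb)
{
	return !skb->offload_fwd_mark || skb->src_hwdom != p->hwdom;
}
```

      With bond0 inheriting hwdom 1 from its lowers, a marked frame from
      swp0 is suppressed towards bond0 as well as swp1. With bond0 given a
      hwdom of 0, the same frame is still suppressed towards swp1 but is
      software-forwarded to bond0.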
      
      Currently, the bridging offload is very 'silent': a driver registers a
      netdevice notifier, which is put on the netns's notifier chain, and
      which sniffs around for NETDEV_CHANGEUPPER events where the upper is a
      bridge, and the lower is an interface it knows about (one registered by
      this driver, normally). Then, from within that notifier, it does a bunch
      of stuff behind the bridge's back, without the bridge necessarily
      knowing that there's somebody offloading that port. It looks like this:
      
           ip link set swp0 master br0
                        |
                        v
       br_add_if() calls netdev_master_upper_dev_link()
                        |
                        v
              call_netdevice_notifiers
                        |
                        v
             dsa_slave_netdevice_event
                        |
                        v
              oh, hey! it's for me!
                        |
                        v
                 .port_bridge_join
      
      What we do to solve the conundrum is to be less silent, and change the
      switchdev drivers to present themselves to the bridge. Something like this:
      
           ip link set swp0 master br0
                        |
                        v
       br_add_if() calls netdev_master_upper_dev_link()
                        |
                        v                    bridge: Aye! I'll use this
              call_netdevice_notifiers           ^  ppid as the
                        |                        |  hardware domain for
                        v                        |  this port, and zero
             dsa_slave_netdevice_event           |  if I got nothing.
                        |                        |
                        v                        |
              oh, hey! it's for me!              |
                        |                        |
                        v                        |
                 .port_bridge_join               |
                        |                        |
                        +------------------------+
                   switchdev_bridge_port_offload(swp0, swp0)
      
      Then stacked interfaces (like bond0 on top of swp3/swp4) would be
      treated differently in DSA, depending on whether we can or cannot
      offload them.
      
      The offload case:
      
          ip link set bond0 master br0
                        |
                        v
       br_add_if() calls netdev_master_upper_dev_link()
                        |
                        v                    bridge: Aye! I'll use this
              call_netdevice_notifiers           ^  ppid as the
                        |                        |  switchdev mark for
                        v                        |        bond0.
             dsa_slave_netdevice_event           | Coincidentally (or not),
                        |                        | bond0 and swp0, swp1, swp2
                        v                        | all have the same switchdev
              hmm, it's not quite for me,        | mark now, since the ASIC
               but my driver has already         | is able to forward towards
                 called .port_lag_join           | all these ports in hw.
                for it, because I have           |
            a port with dp->lag_dev == bond0.    |
                        |                        |
                        v                        |
                 .port_bridge_join               |
                 for swp3 and swp4               |
                        |                        |
                        +------------------------+
                  switchdev_bridge_port_offload(bond0, swp3)
                  switchdev_bridge_port_offload(bond0, swp4)
      
      And the non-offload case:
      
          ip link set bond0 master br0
                        |
                        v
       br_add_if() calls netdev_master_upper_dev_link()
                        |
                        v                    bridge waiting:
              call_netdevice_notifiers           ^  huh, switchdev_bridge_port_offload
                        |                        |  wasn't called, okay, I'll use a
                        v                        |  hwdom of zero for this one.
             dsa_slave_netdevice_event           :  Then packets received on swp0 will
                        |                        :  not be software-forwarded towards
                        v                        :  swp1, but they will towards bond0.
               it's not for me, but
             bond0 is an upper of swp3
            and swp4, but their dp->lag_dev
             is NULL because they couldn't
                  offload it.
      
      Basically we can draw the conclusion that the lowers of a bridge port
      can come and go, so depending on the configuration of lowers for a
      bridge port, it can dynamically toggle between offloaded and unoffloaded.
      Therefore, we need an equivalent switchdev_bridge_port_unoffload too.
      
      This patch changes the way any switchdev driver interacts with the
      bridge. From now on, everybody needs to call switchdev_bridge_port_offload
      and switchdev_bridge_port_unoffload, otherwise the bridge will treat the
      port as non-offloaded and allow software flooding to other ports from
      the same ASIC.
      
      Note that these functions lay the ground for a more complex handshake
      between switchdev drivers and the bridge in the future.
      
      For drivers that will request a replay of the switchdev objects when
      they offload and unoffload a bridge port (DSA, dpaa2-switch, ocelot), we
      place the call to switchdev_bridge_port_unoffload() strategically inside
      the NETDEV_PRECHANGEUPPER notifier's code path, and not inside
      NETDEV_CHANGEUPPER. This is because the switchdev object replay helpers
      need the netdev adjacency lists to be valid, and that is only true in
      NETDEV_PRECHANGEUPPER.
      
      Cc: Vadym Kochan <vkochan@marvell.com>
      Cc: Taras Chornyi <tchornyi@marvell.com>
      Cc: Ioana Ciornei <ioana.ciornei@nxp.com>
      Cc: Lars Povlsen <lars.povlsen@microchip.com>
      Cc: Steen Hegelund <Steen.Hegelund@microchip.com>
      Cc: UNGLinuxDriver@microchip.com
      Cc: Claudiu Manoil <claudiu.manoil@nxp.com>
      Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
      Cc: Grygorii Strashko <grygorii.strashko@ti.com>
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com> # dpaa2-switch: regression
      Acked-by: Ioana Ciornei <ioana.ciornei@nxp.com> # dpaa2-switch
      Tested-by: Horatiu Vultur <horatiu.vultur@microchip.com> # ocelot-switch
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2f5dc00f
    • T
      net: bridge: switchdev: recycle unused hwdoms · 85826610
      Tobias Waldekranz committed
      Since hwdoms have only been used thus far for equality comparisons, the
      bridge has used the simplest possible assignment policy; using a
      counter to keep track of the last value handed out.
      
      With the upcoming transmit offloading, we need to perform set
      operations efficiently based on hwdoms, e.g. we want to answer
      questions like "has this skb been forwarded to any port within this
      hwdom?"
      
      Move to a bitmap-based allocation scheme that recycles hwdoms once all
      members leave the bridge. This means that we can use a single
      unsigned long to keep track of the hwdoms that have received an skb.
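
      A bitmap allocator of this kind might be sketched in user space as
      follows. All names (struct mock_bridge, mock_hwdom_get(),
      mock_hwdom_share(), mock_hwdom_put(), mock_hwdom_test()) are
      hypothetical, and the per-hwdom user counter is an assumption made to
      keep the model short. It is not claimed to match how the bridge
      decides that the last member has left.

```c
#include <limits.h>

#define MOCK_HWDOM_MAX ((int)(sizeof(unsigned long) * CHAR_BIT))

/* Hypothetical bridge state: one bit per hwdom plus a user count. */
struct mock_bridge {
	unsigned long busy_hwdoms;                   /* bit n: hwdom n in use */
	int users[sizeof(unsigned long) * CHAR_BIT]; /* ports per hwdom */
};

/* Allocate the lowest free hwdom, starting at 1 (0 means "none").
 * Returns -1 when every hwdom is busy.
 */
int mock_hwdom_get(struct mock_bridge *br)
{
	int hwdom;

	for (hwdom = 1; hwdom < MOCK_HWDOM_MAX; hwdom++) {
		if (!(br->busy_hwdoms & (1UL << hwdom))) {
			br->busy_hwdoms |= 1UL << hwdom;
			br->users[hwdom] = 1;
			return hwdom;
		}
	}
	return -1;
}

/* Another port of the same switch joins an existing hwdom. */
void mock_hwdom_share(struct mock_bridge *br, int hwdom)
{
	if (hwdom > 0 && hwdom < MOCK_HWDOM_MAX)
		br->users[hwdom]++;
}

/* A port leaves; the hwdom is recycled once its last user is gone. */
void mock_hwdom_put(struct mock_bridge *br, int hwdom)
{
	if (hwdom <= 0 || hwdom >= MOCK_HWDOM_MAX)
		return;
	if (br->users[hwdom] > 0 && --br->users[hwdom] == 0)
		br->busy_hwdoms &= ~(1UL << hwdom);
}

/* Set query: has the skb been forwarded to any port in this hwdom? */
int mock_hwdom_test(unsigned long fwd_hwdoms, int hwdom)
{
	return (int)((fwd_hwdoms >> hwdom) & 1UL);
}
```

      Because freed values are reused, the set of live hwdoms stays small
      enough to fit in one machine word, which is what makes the
      single-word membership test possible.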
      
      v1->v2: convert the typedef DECLARE_BITMAP(br_hwdom_map_t, BR_HWDOM_MAX)
              into a plain unsigned long.
      v2->v6: none
      Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      85826610
    • T
      net: bridge: disambiguate offload_fwd_mark · f7cf972f
      Tobias Waldekranz committed
      Before this change, four related - but distinct - concepts were named
      offload_fwd_mark:
      
      - skb->offload_fwd_mark: Set by the switchdev driver if the underlying
        hardware has already forwarded this frame to the other ports in the
        same hardware domain.
      
      - nbp->offload_fwd_mark: An identifier used to group ports that share
        the same hardware forwarding domain.
      
      - br->offload_fwd_mark: Counter used to make sure that unique IDs are
        used in cases where a bridge contains ports from multiple hardware
        domains.
      
      - skb->cb->offload_fwd_mark: The hardware domain on which the frame
        ingressed and was forwarded.
      
      Introduce the term "hardware forwarding domain" ("hwdom") in the
      bridge to denote a set of ports with the following property:
      
          If an skb with skb->offload_fwd_mark set is received on a port
          belonging to hwdom N, that frame has already been forwarded to all
          other ports in hwdom N.
      
      By decoupling the name from "offload_fwd_mark", we can extend the
      term's definition in the future - e.g. to add constraints that
      describe expected egress behavior - without overloading the meaning of
      "offload_fwd_mark".
      
      - nbp->offload_fwd_mark thus becomes nbp->hwdom.
      
      - br->offload_fwd_mark becomes br->last_hwdom.
      
      - skb->cb->offload_fwd_mark becomes skb->cb->src_hwdom. The slight
        change in naming here mandates a slight change in behavior of the
        nbp_switchdev_frame_mark() function. Previously, it only set this
        value in skb->cb for packets with skb->offload_fwd_mark true (ones
        which were forwarded in hardware). Whereas now we always track the
        incoming hwdom for all packets coming from a switchdev (even for the
        packets which weren't forwarded in hardware, such as STP BPDUs, IGMP
        reports etc). As all uses of skb->cb->offload_fwd_mark were already
        gated behind checks of skb->offload_fwd_mark, this will not introduce
        any functional change, but it paves the way for future changes where
        the ingressing hwdom must be known for frames coming from a switchdev
        regardless of whether they were forwarded in hardware or not
        (basically, if the skb comes from a switchdev, skb->cb->src_hwdom now
        always tracks which one).
      
        A typical example where this is relevant: the switchdev has a fixed
        configuration to trap STP BPDUs, but STP is not running on the bridge
        and the group_fwd_mask allows them to be forwarded. Say we have this
        setup:
      
              br0
             / | \
            /  |  \
        swp0 swp1 swp2
      
        A BPDU comes in on swp0 and is trapped to the CPU; the driver does not
        set skb->offload_fwd_mark. The bridge determines that the frame should
        be forwarded to swp{1,2}. It is imperative that forward offloading is
        _not_ allowed in this case, as the source hwdom is already "poisoned".
      
        Recording the source hwdom allows this case to be handled properly.
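
      The BPDU example can be modeled in user space as below. This is a
      sketch under assumptions: the struct and function names are
      hypothetical, and the egress rule (no forward offload towards the
      hwdom the frame came from) is one reading of the "poisoned" source
      hwdom described above, not a copy of the bridge code.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a bridge port. */
struct mock_port {
	int hwdom;                   /* 0 means "no hardware domain" */
	bool tx_fwd_offload_capable; /* port can offload TX forwarding */
};

/* Hypothetical stand-in for the relevant skb state. */
struct mock_skb {
	bool offload_fwd_mark; /* hardware already forwarded the frame */
	int src_hwdom;         /* hwdom of the ingress port, 0 if none */
	bool tx_fwd_offload;   /* hardware may replicate on egress */
};

/* Ingress: always record the source hwdom, even for frames that were
 * trapped to the CPU and not forwarded in hardware.
 */
void mock_frame_mark(const struct mock_port *in, struct mock_skb *skb)
{
	if (in->hwdom)
		skb->src_hwdom = in->hwdom;
}

/* Egress: allow forward offload only towards a hwdom other than the
 * one the frame ingressed on.
 */
void mock_frame_mark_tx_fwd_offload(const struct mock_port *out,
				    struct mock_skb *skb)
{
	if (out->tx_fwd_offload_capable && skb->src_hwdom != out->hwdom)
		skb->tx_fwd_offload = true;
}
```

      For a BPDU trapped on swp0 and forwarded in software to swp1 in the
      same hwdom, tx_fwd_offload stays false in this model even though
      offload_fwd_mark was never set, because src_hwdom was recorded
      unconditionally on ingress.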
      
      v2->v3: added code comments
      v3->v6: none
      Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f7cf972f
    • N
      net: bridge: multicast: add context support for host-joined groups · 58d913a3
      Nikolay Aleksandrov committed
      Adding bridge multicast context support for host-joined groups is easy
      because we only need the proper timer value. We pass the already chosen
      context and use its timer value.
      Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      58d913a3
    • N
      net: bridge: multicast: add mdb context support · 6567cb43
      Nikolay Aleksandrov committed
      Choose the proper bridge multicast context when user space is adding
      mdb entries. Currently we require the vlan to be configured on at least
      one device (port or bridge) in order to add an mdb entry if vlan
      mcast snooping is enabled (vlan snooping implies vlan filtering).
      Note that we always allow deleting an entry, regardless of the vlan state.
      Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6567cb43
    • N
      net: bridge: multicast: fix igmp/mld port context null pointer dereferences · 54cb4319
      Nikolay Aleksandrov committed
      With the recent change to use bridge/port multicast context pointers
      instead of bridge/port, I missed converting two locations which pass the
      port pointer as-is, but with the new model we need to verify the port
      context is non-NULL first and retrieve the port from it. The first
      location is when doing querier selection when a query is received, the
      second location is when leaving a group. The port context will be null
      if the packets originated from the bridge device (i.e. from the host).
      The fix is simple: check if the port context exists and retrieve
      the port pointer from it.
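
      The shape of the check can be shown with a tiny user-space sketch.
      struct mock_port, struct mock_port_mcast_ctx and mock_port_from_ctx()
      are hypothetical names chosen for illustration, not the bridge's own
      types.

```c
#include <stddef.h>

/* Hypothetical stand-in for a bridge port. */
struct mock_port {
	int port_no;
};

/* Hypothetical stand-in for a per-port multicast context. */
struct mock_port_mcast_ctx {
	struct mock_port *port;
};

/* Return the port behind a port multicast context, or NULL when the
 * packet originated from the bridge device itself (no port context).
 */
struct mock_port *mock_port_from_ctx(struct mock_port_mcast_ctx *pmctx)
{
	return pmctx ? pmctx->port : NULL;
}
```

      Callers then treat a NULL result as "from the host" instead of
      dereferencing the context unconditionally.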
      
      Fixes: adc47037 ("net: bridge: multicast: use multicast contexts instead of bridge or port")
      Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      54cb4319
  12. 20 Jul, 2021 13 commits