1. 22 7月, 2021 3 次提交
    • T
      net: bridge: switchdev: recycle unused hwdoms · 85826610
      Tobias Waldekranz 提交于
      Since hwdoms have only been used thus far for equality comparisons, the
      bridge has used the simplest possible assignment policy; using a
      counter to keep track of the last value handed out.
      
      With the upcoming transmit offloading, we need to perform set
      operations efficiently based on hwdoms, e.g. we want to answer
      questions like "has this skb been forwarded to any port within this
      hwdom?"
      
      Move to a bitmap-based allocation scheme that recycles hwdoms once all
      members leaves the bridge. This means that we can use a single
      unsigned long to keep track of the hwdoms that have received an skb.
      
      v1->v2: convert the typedef DECLARE_BITMAP(br_hwdom_map_t, BR_HWDOM_MAX)
              into a plain unsigned long.
      v2->v6: none
      Signed-off-by: NTobias Waldekranz <tobias@waldekranz.com>
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      85826610
    • T
      net: bridge: disambiguate offload_fwd_mark · f7cf972f
      Tobias Waldekranz 提交于
      Before this change, four related - but distinct - concepts where named
      offload_fwd_mark:
      
      - skb->offload_fwd_mark: Set by the switchdev driver if the underlying
        hardware has already forwarded this frame to the other ports in the
        same hardware domain.
      
      - nbp->offload_fwd_mark: An idetifier used to group ports that share
        the same hardware forwarding domain.
      
      - br->offload_fwd_mark: Counter used to make sure that unique IDs are
        used in cases where a bridge contains ports from multiple hardware
        domains.
      
      - skb->cb->offload_fwd_mark: The hardware domain on which the frame
        ingressed and was forwarded.
      
      Introduce the term "hardware forwarding domain" ("hwdom") in the
      bridge to denote a set of ports with the following property:
      
          If an skb with skb->offload_fwd_mark set, is received on a port
          belonging to hwdom N, that frame has already been forwarded to all
          other ports in hwdom N.
      
      By decoupling the name from "offload_fwd_mark", we can extend the
      term's definition in the future - e.g. to add constraints that
      describe expected egress behavior - without overloading the meaning of
      "offload_fwd_mark".
      
      - nbp->offload_fwd_mark thus becomes nbp->hwdom.
      
      - br->offload_fwd_mark becomes br->last_hwdom.
      
      - skb->cb->offload_fwd_mark becomes skb->cb->src_hwdom. The slight
        change in naming here mandates a slight change in behavior of the
        nbp_switchdev_frame_mark() function. Previously, it only set this
        value in skb->cb for packets with skb->offload_fwd_mark true (ones
        which were forwarded in hardware). Whereas now we always track the
        incoming hwdom for all packets coming from a switchdev (even for the
        packets which weren't forwarded in hardware, such as STP BPDUs, IGMP
        reports etc). As all uses of skb->cb->offload_fwd_mark were already
        gated behind checks of skb->offload_fwd_mark, this will not introduce
        any functional change, but it paves the way for future changes where
        the ingressing hwdom must be known for frames coming from a switchdev
        regardless of whether they were forwarded in hardware or not
        (basically, if the skb comes from a switchdev, skb->cb->src_hwdom now
        always tracks which one).
      
        A typical example where this is relevant: the switchdev has a fixed
        configuration to trap STP BPDUs, but STP is not running on the bridge
        and the group_fwd_mask allows them to be forwarded. Say we have this
        setup:
      
              br0
             / | \
            /  |  \
        swp0 swp1 swp2
      
        A BPDU comes in on swp0 and is trapped to the CPU; the driver does not
        set skb->offload_fwd_mark. The bridge determines that the frame should
        be forwarded to swp{1,2}. It is imperative that forward offloading is
        _not_ allowed in this case, as the source hwdom is already "poisoned".
      
        Recording the source hwdom allows this case to be handled properly.
      
      v2->v3: added code comments
      v3->v6: none
      Signed-off-by: NTobias Waldekranz <tobias@waldekranz.com>
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NGrygorii Strashko <grygorii.strashko@ti.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f7cf972f
    • N
      net: bridge: multicast: add context support for host-joined groups · 58d913a3
      Nikolay Aleksandrov 提交于
      Adding bridge multicast context support for host-joined groups is easy
      because we only need the proper timer value. We pass the already chosen
      context and use its timer value.
      Signed-off-by: NNikolay Aleksandrov <nikolay@nvidia.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      58d913a3
  2. 20 7月, 2021 11 次提交
  3. 30 6月, 2021 1 次提交
  4. 11 6月, 2021 1 次提交
    • N
      net: bridge: fix vlan tunnel dst null pointer dereference · 58e20717
      Nikolay Aleksandrov 提交于
      This patch fixes a tunnel_dst null pointer dereference due to lockless
      access in the tunnel egress path. When deleting a vlan tunnel the
      tunnel_dst pointer is set to NULL without waiting a grace period (i.e.
      while it's still usable) and packets egressing are dereferencing it
      without checking. Use READ/WRITE_ONCE to annotate the lockless use of
      tunnel_id, use RCU for accessing tunnel_dst and make sure it is read
      only once and checked in the egress path. The dst is already properly RCU
      protected so we don't need to do anything fancy than to make sure
      tunnel_id and tunnel_dst are read only once and checked in the egress path.
      
      Cc: stable@vger.kernel.org
      Fixes: 11538d03 ("bridge: vlan dst_metadata hooks in ingress and egress paths")
      Signed-off-by: NNikolay Aleksandrov <nikolay@nvidia.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      58e20717
  5. 15 5月, 2021 1 次提交
  6. 14 5月, 2021 4 次提交
  7. 15 4月, 2021 1 次提交
  8. 25 3月, 2021 1 次提交
  9. 16 2月, 2021 1 次提交
  10. 15 2月, 2021 3 次提交
  11. 13 2月, 2021 1 次提交
  12. 28 1月, 2021 1 次提交
  13. 23 1月, 2021 3 次提交
  14. 08 12月, 2020 1 次提交
    • J
      bridge: Fix a deadlock when enabling multicast snooping · 851d0a73
      Joseph Huang 提交于
      When enabling multicast snooping, bridge module deadlocks on multicast_lock
      if 1) IPv6 is enabled, and 2) there is an existing querier on the same L2
      network.
      
      The deadlock was caused by the following sequence: While holding the lock,
      br_multicast_open calls br_multicast_join_snoopers, which eventually causes
      IP stack to (attempt to) send out a Listener Report (in igmp6_join_group).
      Since the destination Ethernet address is a multicast address, br_dev_xmit
      feeds the packet back to the bridge via br_multicast_rcv, which in turn
      calls br_multicast_add_group, which then deadlocks on multicast_lock.
      
      The fix is to move the call br_multicast_join_snoopers outside of the
      critical section. This works since br_multicast_join_snoopers only deals
      with IP and does not modify any multicast data structures of the bridge,
      so there's no need to hold the lock.
      
      Steps to reproduce:
      1. sysctl net.ipv6.conf.all.force_mld_version=1
      2. have another querier
      3. ip link set dev bridge type bridge mcast_snooping 0 && \
         ip link set dev bridge type bridge mcast_snooping 1 < deadlock >
      
      A typical call trace looks like the following:
      
      [  936.251495]  _raw_spin_lock+0x5c/0x68
      [  936.255221]  br_multicast_add_group+0x40/0x170 [bridge]
      [  936.260491]  br_multicast_rcv+0x7ac/0xe30 [bridge]
      [  936.265322]  br_dev_xmit+0x140/0x368 [bridge]
      [  936.269689]  dev_hard_start_xmit+0x94/0x158
      [  936.273876]  __dev_queue_xmit+0x5ac/0x7f8
      [  936.277890]  dev_queue_xmit+0x10/0x18
      [  936.281563]  neigh_resolve_output+0xec/0x198
      [  936.285845]  ip6_finish_output2+0x240/0x710
      [  936.290039]  __ip6_finish_output+0x130/0x170
      [  936.294318]  ip6_output+0x6c/0x1c8
      [  936.297731]  NF_HOOK.constprop.0+0xd8/0xe8
      [  936.301834]  igmp6_send+0x358/0x558
      [  936.305326]  igmp6_join_group.part.0+0x30/0xf0
      [  936.309774]  igmp6_group_added+0xfc/0x110
      [  936.313787]  __ipv6_dev_mc_inc+0x1a4/0x290
      [  936.317885]  ipv6_dev_mc_inc+0x10/0x18
      [  936.321677]  br_multicast_open+0xbc/0x110 [bridge]
      [  936.326506]  br_multicast_toggle+0xec/0x140 [bridge]
      
      Fixes: 4effd28c ("bridge: join all-snoopers multicast address")
      Signed-off-by: NJoseph Huang <Joseph.Huang@garmin.com>
      Acked-by: NNikolay Aleksandrov <nikolay@nvidia.com>
      Link: https://lore.kernel.org/r/20201204235628.50653-1-Joseph.Huang@garmin.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
      851d0a73
  15. 22 11月, 2020 1 次提交
  16. 19 11月, 2020 1 次提交
  17. 10 11月, 2020 1 次提交
  18. 01 11月, 2020 1 次提交
  19. 31 10月, 2020 1 次提交
  20. 30 10月, 2020 2 次提交
    • H
      bridge: cfm: Netlink Notifications. · b6d0425b
      Henrik Bjoernlund 提交于
      This is the implementation of Netlink notifications out of CFM.
      
      Notifications are initiated whenever a state change happens in CFM.
      
      IFLA_BRIDGE_CFM:
          Points to the CFM information.
      
      IFLA_BRIDGE_CFM_MEP_STATUS_INFO:
          This indicate that the MEP instance status are following.
      IFLA_BRIDGE_CFM_CC_PEER_STATUS_INFO:
          This indicate that the peer MEP status are following.
      
      CFM nested attribute has the following attributes in next level.
      
      IFLA_BRIDGE_CFM_MEP_STATUS_INSTANCE:
          The MEP instance number of the delivered status.
          The type is NLA_U32.
      IFLA_BRIDGE_CFM_MEP_STATUS_OPCODE_UNEXP_SEEN:
          The MEP instance received CFM PDU with unexpected Opcode.
          The type is NLA_U32 (bool).
      IFLA_BRIDGE_CFM_MEP_STATUS_VERSION_UNEXP_SEEN:
          The MEP instance received CFM PDU with unexpected version.
          The type is NLA_U32 (bool).
      IFLA_BRIDGE_CFM_MEP_STATUS_RX_LEVEL_LOW_SEEN:
          The MEP instance received CCM PDU with MD level lower than
          configured level. This frame is discarded.
          The type is NLA_U32 (bool).
      
      IFLA_BRIDGE_CFM_CC_PEER_STATUS_INSTANCE:
          The MEP instance number of the delivered status.
          The type is NLA_U32.
      IFLA_BRIDGE_CFM_CC_PEER_STATUS_PEER_MEPID:
          The added Peer MEP ID of the delivered status.
          The type is NLA_U32.
      IFLA_BRIDGE_CFM_CC_PEER_STATUS_CCM_DEFECT:
          The CCM defect status.
          The type is NLA_U32 (bool).
          True means no CCM frame is received for 3.25 intervals.
          IFLA_BRIDGE_CFM_CC_CONFIG_EXP_INTERVAL.
      IFLA_BRIDGE_CFM_CC_PEER_STATUS_RDI:
          The last received CCM PDU RDI.
          The type is NLA_U32 (bool).
      IFLA_BRIDGE_CFM_CC_PEER_STATUS_PORT_TLV_VALUE:
          The last received CCM PDU Port Status TLV value field.
          The type is NLA_U8.
      IFLA_BRIDGE_CFM_CC_PEER_STATUS_IF_TLV_VALUE:
          The last received CCM PDU Interface Status TLV value field.
          The type is NLA_U8.
      IFLA_BRIDGE_CFM_CC_PEER_STATUS_SEEN:
          A CCM frame has been received from Peer MEP.
          The type is NLA_U32 (bool).
          This is cleared after GETLINK IFLA_BRIDGE_CFM_CC_PEER_STATUS_INFO.
      IFLA_BRIDGE_CFM_CC_PEER_STATUS_TLV_SEEN:
          A CCM frame with TLV has been received from Peer MEP.
          The type is NLA_U32 (bool).
          This is cleared after GETLINK IFLA_BRIDGE_CFM_CC_PEER_STATUS_INFO.
      IFLA_BRIDGE_CFM_CC_PEER_STATUS_SEQ_UNEXP_SEEN:
          A CCM frame with unexpected sequence number has been received
          from Peer MEP.
          The type is NLA_U32 (bool).
          When a sequence number is not one higher than previously received
          then it is unexpected.
          This is cleared after GETLINK IFLA_BRIDGE_CFM_CC_PEER_STATUS_INFO.
      Signed-off-by: NHenrik Bjoernlund  <henrik.bjoernlund@microchip.com>
      Reviewed-by: NHoratiu Vultur  <horatiu.vultur@microchip.com>
      Acked-by: NNikolay Aleksandrov <nikolay@nvidia.com>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      b6d0425b
    • H
      bridge: cfm: Netlink GET status Interface. · e77824d8
      Henrik Bjoernlund 提交于
      This is the implementation of CFM netlink status
      get information interface.
      
      Add new nested netlink attributes. These attributes are used by the
      user space to get status information.
      
      GETLINK:
          Request filter RTEXT_FILTER_CFM_STATUS:
          Indicating that CFM status information must be delivered.
      
          IFLA_BRIDGE_CFM:
              Points to the CFM information.
      
          IFLA_BRIDGE_CFM_MEP_STATUS_INFO:
              This indicate that the MEP instance status are following.
          IFLA_BRIDGE_CFM_CC_PEER_STATUS_INFO:
              This indicate that the peer MEP status are following.
      
      CFM nested attribute has the following attributes in next level.
      
      GETLINK RTEXT_FILTER_CFM_STATUS:
          IFLA_BRIDGE_CFM_MEP_STATUS_INSTANCE:
              The MEP instance number of the delivered status.
              The type is u32.
          IFLA_BRIDGE_CFM_MEP_STATUS_OPCODE_UNEXP_SEEN:
              The MEP instance received CFM PDU with unexpected Opcode.
              The type is u32 (bool).
          IFLA_BRIDGE_CFM_MEP_STATUS_VERSION_UNEXP_SEEN:
              The MEP instance received CFM PDU with unexpected version.
              The type is u32 (bool).
          IFLA_BRIDGE_CFM_MEP_STATUS_RX_LEVEL_LOW_SEEN:
              The MEP instance received CCM PDU with MD level lower than
              configured level. This frame is discarded.
              The type is u32 (bool).
      
          IFLA_BRIDGE_CFM_CC_PEER_STATUS_INSTANCE:
              The MEP instance number of the delivered status.
              The type is u32.
          IFLA_BRIDGE_CFM_CC_PEER_STATUS_PEER_MEPID:
              The added Peer MEP ID of the delivered status.
              The type is u32.
          IFLA_BRIDGE_CFM_CC_PEER_STATUS_CCM_DEFECT:
              The CCM defect status.
              The type is u32 (bool).
              True means no CCM frame is received for 3.25 intervals.
              IFLA_BRIDGE_CFM_CC_CONFIG_EXP_INTERVAL.
          IFLA_BRIDGE_CFM_CC_PEER_STATUS_RDI:
              The last received CCM PDU RDI.
              The type is u32 (bool).
          IFLA_BRIDGE_CFM_CC_PEER_STATUS_PORT_TLV_VALUE:
              The last received CCM PDU Port Status TLV value field.
              The type is u8.
          IFLA_BRIDGE_CFM_CC_PEER_STATUS_IF_TLV_VALUE:
              The last received CCM PDU Interface Status TLV value field.
              The type is u8.
          IFLA_BRIDGE_CFM_CC_PEER_STATUS_SEEN:
              A CCM frame has been received from Peer MEP.
              The type is u32 (bool).
              This is cleared after GETLINK IFLA_BRIDGE_CFM_CC_PEER_STATUS_INFO.
          IFLA_BRIDGE_CFM_CC_PEER_STATUS_TLV_SEEN:
              A CCM frame with TLV has been received from Peer MEP.
              The type is u32 (bool).
              This is cleared after GETLINK IFLA_BRIDGE_CFM_CC_PEER_STATUS_INFO.
          IFLA_BRIDGE_CFM_CC_PEER_STATUS_SEQ_UNEXP_SEEN:
              A CCM frame with unexpected sequence number has been received
              from Peer MEP.
              The type is u32 (bool).
              When a sequence number is not one higher than previously received
              then it is unexpected.
              This is cleared after GETLINK IFLA_BRIDGE_CFM_CC_PEER_STATUS_INFO.
      Signed-off-by: NHenrik Bjoernlund  <henrik.bjoernlund@microchip.com>
      Reviewed-by: NHoratiu Vultur  <horatiu.vultur@microchip.com>
      Acked-by: NNikolay Aleksandrov <nikolay@nvidia.com>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      e77824d8