1. 02 11月, 2019 1 次提交
  2. 30 10月, 2019 1 次提交
  3. 04 7月, 2019 1 次提交
  4. 03 7月, 2019 1 次提交
  5. 31 5月, 2019 1 次提交
  6. 17 4月, 2019 1 次提交
    • N
      net: bridge: fix per-port af_packet sockets · 3b2e2904
      Nikolay Aleksandrov 提交于
      When the commit below was introduced it changed two visible things:
       - the skb was no longer passed through the protocol handlers with the
         original device
       - the skb was passed up the stack with skb->dev = bridge
      
      The first change broke af_packet sockets on bridge ports. For example we
      use them for hostapd which listens for ETH_P_PAE packets on the ports.
      We discussed two possible fixes:
       - create a clone and pass it through NF_HOOK(), act on the original skb
         based on the result
       - somehow signal to the caller from the okfn() that it was called,
         meaning the skb is ok to be passed, which this patch is trying to
         implement via returning 1 from the bridge link-local okfn()
      
      Note that we rely on the fact that NF_QUEUE/STOLEN would return 0 and
      drop/error would return < 0 thus the okfn() is called only when the
      return was 1, so we signal to the caller that it was called by preserving
      the return value from nf_hook().
      
      Fixes: 8626c56c ("bridge: fix potential use-after-free when hook returns QUEUE or STOLEN verdict")
      Signed-off-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3b2e2904
  7. 16 4月, 2019 1 次提交
  8. 12 4月, 2019 3 次提交
  9. 28 11月, 2018 1 次提交
  10. 27 9月, 2018 1 次提交
  11. 26 5月, 2018 1 次提交
  12. 10 11月, 2017 1 次提交
  13. 09 10月, 2017 2 次提交
  14. 29 9月, 2017 1 次提交
    • N
      net: bridge: add per-port group_fwd_mask with less restrictions · 5af48b59
      Nikolay Aleksandrov 提交于
      We need to be able to transparently forward most link-local frames via
      tunnels (e.g. vxlan, qinq). Currently the bridge's group_fwd_mask has a
      mask which restricts the forwarding of STP and LACP, but we need to be able
      to forward these over tunnels and control that forwarding on a per-port
      basis thus add a new per-port group_fwd_mask option which only disallows
      mac pause frames to be forwarded (they're always dropped anyway).
      The patch does not change the current default situation - all of the others
      are still restricted unless configured for forwarding.
      We have successfully tested this patch with LACP and STP forwarding over
      VxLAN and qinq tunnels.
      Signed-off-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5af48b59
  15. 14 7月, 2017 1 次提交
  16. 14 3月, 2017 1 次提交
  17. 15 2月, 2017 1 次提交
  18. 08 2月, 2017 1 次提交
  19. 07 2月, 2017 1 次提交
  20. 04 2月, 2017 1 次提交
  21. 02 9月, 2016 2 次提交
  22. 27 8月, 2016 1 次提交
    • I
      bridge: switchdev: Add forward mark support for stacked devices · 6bc506b4
      Ido Schimmel 提交于
      switchdev_port_fwd_mark_set() is used to set the 'offload_fwd_mark' of
      port netdevs so that packets being flooded by the device won't be
      flooded twice.
      
      It works by assigning a unique identifier (the ifindex of the first
      bridge port) to bridge ports sharing the same parent ID. This prevents
      packets from being flooded twice by the same switch, but will flood
      packets through bridge ports belonging to a different switch.
      
      This method is problematic when stacked devices are taken into account,
      such as VLANs. In such cases, a physical port netdev can have upper
      devices being members in two different bridges, thus requiring two
      different 'offload_fwd_mark's to be configured on the port netdev, which
      is impossible.
      
      The main problem is that packet and netdev marking is performed at the
      physical netdev level, whereas flooding occurs between bridge ports,
      which are not necessarily port netdevs.
      
      Instead, packet and netdev marking should really be done in the bridge
      driver with the switch driver only telling it which packets it already
      forwarded. The bridge driver will mark such packets using the mark
      assigned to the ingress bridge port and will prevent the packet from
      being forwarded through any bridge port sharing the same mark (i.e.
      having the same parent ID).
      
      Remove the current switchdev 'offload_fwd_mark' implementation and
      instead implement the proposed method. In addition, make rocker - the
      sole user of the mark - use the proposed method.
      Signed-off-by: NIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: NJiri Pirko <jiri@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6bc506b4
  23. 26 7月, 2016 1 次提交
    • I
      bridge: Fix incorrect re-injection of LLDP packets · baedbe55
      Ido Schimmel 提交于
      Commit 8626c56c ("bridge: fix potential use-after-free when hook
      returns QUEUE or STOLEN verdict") caused LLDP packets arriving through a
      bridge port to be re-injected to the Rx path with skb->dev set to the
      bridge device, but this breaks the lldpad daemon.
      
      The lldpad daemon opens a packet socket with protocol set to ETH_P_LLDP
      for any valid device on the system, which doesn't not include soft
      devices such as bridge and VLAN.
      
      Since packet sockets (ptype_base) are processed in the Rx path after the
      Rx handler, LLDP packets with skb->dev set to the bridge device never
      reach the lldpad daemon.
      
      Fix this by making the bridge's Rx handler re-inject LLDP packets with
      RX_HANDLER_PASS, which effectively restores the behaviour prior to the
      mentioned commit.
      
      This means netfilter will never receive LLDP packets coming through a
      bridge port, as I don't see a way in which we can have okfn() consume
      the packet without breaking existing behaviour. I've already carried out
      a similar fix for STP packets in commit 56fae404 ("bridge: Fix
      incorrect re-injection of STP packets").
      
      Fixes: 8626c56c ("bridge: fix potential use-after-free when hook returns QUEUE or STOLEN verdict")
      Signed-off-by: NIdo Schimmel <idosch@mellanox.com>
      Reviewed-by: NJiri Pirko <jiri@mellanox.com>
      Cc: Florian Westphal <fw@strlen.de>
      Cc: John Fastabend <john.fastabend@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      baedbe55
  24. 17 7月, 2016 4 次提交
  25. 10 7月, 2016 1 次提交
  26. 30 6月, 2016 1 次提交
    • N
      net: bridge: add support for IGMP/MLD stats and export them via netlink · 1080ab95
      Nikolay Aleksandrov 提交于
      This patch adds stats support for the currently used IGMP/MLD types by the
      bridge. The stats are per-port (plus one stat per-bridge) and per-direction
      (RX/TX). The stats are exported via netlink via the new linkxstats API
      (RTM_GETSTATS). In order to minimize the performance impact, a new option
      is used to enable/disable the stats - multicast_stats_enabled, similar to
      the recent vlan stats. Also in order to avoid multiple IGMP/MLD type
      lookups and checks, we make use of the current "igmp" member of the bridge
      private skb->cb region to record the type on Rx (both host-generated and
      external packets pass by multicast_rcv()). We can do that since the igmp
      member was used as a boolean and all the valid IGMP/MLD types are positive
      values. The normal bridge fast-path is not affected at all, the only
      affected paths are the flooding ones and since we make use of the IGMP/MLD
      type, we can quickly determine if the packet should be counted using
      cache-hot data (cb's igmp member). We add counters for:
      * IGMP Queries
      * IGMP Leaves
      * IGMP v1/v2/v3 reports
      
      * MLD Queries
      * MLD Leaves
      * MLD v1/v2 reports
      
      These are invaluable when monitoring or debugging complex multicast setups
      with bridges.
      Signed-off-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1080ab95
  27. 11 6月, 2016 1 次提交
    • I
      bridge: Fix incorrect re-injection of STP packets · 56fae404
      Ido Schimmel 提交于
      Commit 8626c56c ("bridge: fix potential use-after-free when hook
      returns QUEUE or STOLEN verdict") fixed incorrect usage of NF_HOOK's
      return value by consuming packets in okfn via br_pass_frame_up().
      
      However, this function re-injects packets to the Rx path with skb->dev
      set to the bridge device, which breaks kernel's STP, as all STP packets
      appear to originate from the bridge device itself.
      
      Instead, if STP is enabled and bridge isn't a 802.1ad bridge, then learn
      packet's SMAC and inject it back to the Rx path for further processing
      by the packet handlers.
      
      The patch also makes netfilter's behavior consistent with regards to
      packets destined to the Bridge Group Address, as no hook registered at
      LOCAL_IN will ever be called, regardless if STP is enabled or not.
      
      Cc: Florian Westphal <fw@strlen.de>
      Cc: Shmulik Ladkani <shmulik.ladkani@gmail.com>
      Cc: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Fixes: 8626c56c ("bridge: fix potential use-after-free when hook returns QUEUE or STOLEN verdict")
      Signed-off-by: NJiri Pirko <jiri@mellanox.com>
      Signed-off-by: NIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      56fae404
  28. 15 3月, 2016 1 次提交
    • F
      bridge: fix potential use-after-free when hook returns QUEUE or STOLEN verdict · 8626c56c
      Florian Westphal 提交于
      Zefir Kurtisi reported kernel panic with an openwrt specific patch.
      However, it turns out that mainline has a similar bug waiting to happen.
      
      Once NF_HOOK() returns the skb is in undefined state and must not be
      used.   Moreover, the okfn must consume the skb to support async
      processing (NF_QUEUE).
      
      Current okfn in this spot doesn't consume it and caller assumes that
      NF_HOOK return value tells us if skb was freed or not, but thats wrong.
      
      It "works" because no in-tree user registers a NFPROTO_BRIDGE hook at
      LOCAL_IN that returns STOLEN or NF_QUEUE verdicts.
      
      Once we add NF_QUEUE support for nftables bridge this will break --
      NF_QUEUE holds the skb for async processing, caller will erronoulsy
      return RX_HANDLER_PASS and on reinject netfilter will access free'd skb.
      
      Fix this by pushing skb up the stack in the okfn instead.
      
      NB: It also seems dubious to use LOCAL_IN while bypassing PRE_ROUTING
      completely in this case but this is how its been forever so it seems
      preferable to not change this.
      
      Cc: Felix Fietkau <nbd@openwrt.org>
      Cc: Zefir Kurtisi <zefir.kurtisi@neratec.com>
      Signed-off-by: NFlorian Westphal <fw@strlen.de>
      Tested-by: NZefir Kurtisi <zefir.kurtisi@neratec.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8626c56c
  29. 13 10月, 2015 1 次提交
  30. 02 10月, 2015 1 次提交
  31. 30 9月, 2015 1 次提交
    • N
      bridge: vlan: add per-vlan struct and move to rhashtables · 2594e906
      Nikolay Aleksandrov 提交于
      This patch changes the bridge vlan implementation to use rhashtables
      instead of bitmaps. The main motivation behind this change is that we
      need extensible per-vlan structures (both per-port and global) so more
      advanced features can be introduced and the vlan support can be
      extended. I've tried to break this up but the moment net_port_vlans is
      changed and the whole API goes away, thus this is a larger patch.
      A few short goals of this patch are:
      - Extensible per-vlan structs stored in rhashtables and a sorted list
      - Keep user-visible behaviour (compressed vlans etc)
      - Keep fastpath ingress/egress logic the same (optimizations to come
        later)
      
      Here's a brief list of some of the new features we'd like to introduce:
      - per-vlan counters
      - vlan ingress/egress mapping
      - per-vlan igmp configuration
      - vlan priorities
      - avoid fdb entries replication (e.g. local fdb scaling issues)
      
      The structure is kept single for both global and per-port entries so to
      avoid code duplication where possible and also because we'll soon introduce
      "port0 / aka bridge as port" which should simplify things further
      (thanks to Vlad for the suggestion!).
      
      Now we have per-vlan global rhashtable (bridge-wide) and per-vlan port
      rhashtable, if an entry is added to a port it'll get a pointer to its
      global context so it can be quickly accessed later. There's also a
      sorted vlan list which is used for stable walks and some user-visible
      behaviour such as the vlan ranges, also for error paths.
      VLANs are stored in a "vlan group" which currently contains the
      rhashtable, sorted vlan list and the number of "real" vlan entries.
      A good side-effect of this change is that it resembles how hw keeps
      per-vlan data.
      One important note after this change is that if a VLAN is being looked up
      in the bridge's rhashtable for filtering purposes (or to check if it's an
      existing usable entry, not just a global context) then the new helper
      br_vlan_should_use() needs to be used if the vlan is found. In case the
      lookup is done only with a port's vlan group, then this check can be
      skipped.
      
      Things tested so far:
      - basic vlan ingress/egress
      - pvids
      - untagged vlans
      - undef CONFIG_BRIDGE_VLAN_FILTERING
      - adding/deleting vlans in different scenarios (with/without global ctx,
        while transmitting traffic, in ranges etc)
      - loading/removing the module while having/adding/deleting vlans
      - extracting bridge vlan information (user ABI), compressed requests
      - adding/deleting fdbs on vlans
      - bridge mac change, promisc mode
      - default pvid change
      - kmemleak ON during the whole time
      Signed-off-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2594e906
  32. 18 9月, 2015 2 次提交
    • E
      netfilter: Pass net into okfn · 0c4b51f0
      Eric W. Biederman 提交于
      This is immediately motivated by the bridge code that chains functions that
      call into netfilter.  Without passing net into the okfns the bridge code would
      need to guess about the best expression for the network namespace to process
      packets in.
      
      As net is frequently one of the first things computed in continuation functions
      after netfilter has done it's job passing in the desired network namespace is in
      many cases a code simplification.
      
      To support this change the function dst_output_okfn is introduced to
      simplify passing dst_output as an okfn.  For the moment dst_output_okfn
      just silently drops the struct net.
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0c4b51f0
    • E
      netfilter: Pass struct net into the netfilter hooks · 29a26a56
      Eric W. Biederman 提交于
      Pass a network namespace parameter into the netfilter hooks.  At the
      call site of the netfilter hooks the path a packet is taking through
      the network stack is well known which allows the network namespace to
      be easily and reliabily.
      
      This allows the replacement of magic code like
      "dev_net(state->in?:state->out)" that appears at the start of most
      netfilter hooks with "state->net".
      
      In almost all cases the network namespace passed in is derived
      from the first network device passed in, guaranteeing those
      paths will not see any changes in practice.
      
      The exceptions are:
      xfrm/xfrm_output.c:xfrm_output_resume()         xs_net(skb_dst(skb)->xfrm)
      ipvs/ip_vs_xmit.c:ip_vs_nat_send_or_cont()      ip_vs_conn_net(cp)
      ipvs/ip_vs_xmit.c:ip_vs_send_or_cont()          ip_vs_conn_net(cp)
      ipv4/raw.c:raw_send_hdrinc()                    sock_net(sk)
      ipv6/ip6_output.c:ip6_xmit()			sock_net(sk)
      ipv6/ndisc.c:ndisc_send_skb()                   dev_net(skb->dev) not dev_net(dst->dev)
      ipv6/raw.c:raw6_send_hdrinc()                   sock_net(sk)
      br_netfilter_hooks.c:br_nf_pre_routing_finish() dev_net(skb->dev) before skb->dev is set to nf_bridge->physindev
      
      In all cases these exceptions seem to be a better expression for the
      network namespace the packet is being processed in then the historic
      "dev_net(in?in:out)".  I am documenting them in case something odd
      pops up and someone starts trying to track down what happened.
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      29a26a56