1. 25 1月, 2017 1 次提交
    • F
      bridge: multicast to unicast · 6db6f0ea
      Felix Fietkau 提交于
      Implements an optional, per bridge port flag and feature to deliver
      multicast packets to any host on the according port via unicast
      individually. This is done by copying the packet per host and
      changing the multicast destination MAC to a unicast one accordingly.
      
      multicast-to-unicast works on top of the multicast snooping feature of
      the bridge. Which means unicast copies are only delivered to hosts which
      are interested in it and signalized this via IGMP/MLD reports
      previously.
      
      This feature is intended for interface types which have a more reliable
      and/or efficient way to deliver unicast packets than broadcast ones
      (e.g. wifi).
      
      However, it should only be enabled on interfaces where no IGMPv2/MLDv1
      report suppression takes place. This feature is disabled by default.
      
      The initial patch and idea is from Felix Fietkau.
      Signed-off-by: NFelix Fietkau <nbd@nbd.name>
      [linus.luessing@c0d3.blue: various bug + style fixes, commit message]
      Signed-off-by: NLinus Lüssing <linus.luessing@c0d3.blue>
      Reviewed-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6db6f0ea
  2. 02 9月, 2016 2 次提交
  3. 27 8月, 2016 1 次提交
    • I
      bridge: switchdev: Add forward mark support for stacked devices · 6bc506b4
      Ido Schimmel 提交于
      switchdev_port_fwd_mark_set() is used to set the 'offload_fwd_mark' of
      port netdevs so that packets being flooded by the device won't be
      flooded twice.
      
      It works by assigning a unique identifier (the ifindex of the first
      bridge port) to bridge ports sharing the same parent ID. This prevents
      packets from being flooded twice by the same switch, but will flood
      packets through bridge ports belonging to a different switch.
      
      This method is problematic when stacked devices are taken into account,
      such as VLANs. In such cases, a physical port netdev can have upper
      devices being members in two different bridges, thus requiring two
      different 'offload_fwd_mark's to be configured on the port netdev, which
      is impossible.
      
      The main problem is that packet and netdev marking is performed at the
      physical netdev level, whereas flooding occurs between bridge ports,
      which are not necessarily port netdevs.
      
      Instead, packet and netdev marking should really be done in the bridge
      driver with the switch driver only telling it which packets it already
      forwarded. The bridge driver will mark such packets using the mark
      assigned to the ingress bridge port and will prevent the packet from
      being forwarded through any bridge port sharing the same mark (i.e.
      having the same parent ID).
      
      Remove the current switchdev 'offload_fwd_mark' implementation and
      instead implement the proposed method. In addition, make rocker - the
      sole user of the mark - use the proposed method.
      Signed-off-by: NIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: NJiri Pirko <jiri@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6bc506b4
  4. 17 7月, 2016 2 次提交
  5. 10 7月, 2016 1 次提交
  6. 30 6月, 2016 1 次提交
    • N
      net: bridge: add support for IGMP/MLD stats and export them via netlink · 1080ab95
      Nikolay Aleksandrov 提交于
      This patch adds stats support for the currently used IGMP/MLD types by the
      bridge. The stats are per-port (plus one stat per-bridge) and per-direction
      (RX/TX). The stats are exported via netlink via the new linkxstats API
      (RTM_GETSTATS). In order to minimize the performance impact, a new option
      is used to enable/disable the stats - multicast_stats_enabled, similar to
      the recent vlan stats. Also in order to avoid multiple IGMP/MLD type
      lookups and checks, we make use of the current "igmp" member of the bridge
      private skb->cb region to record the type on Rx (both host-generated and
      external packets pass by multicast_rcv()). We can do that since the igmp
      member was used as a boolean and all the valid IGMP/MLD types are positive
      values. The normal bridge fast-path is not affected at all, the only
      affected paths are the flooding ones and since we make use of the IGMP/MLD
      type, we can quickly determine if the packet should be counted using
      cache-hot data (cb's igmp member). We add counters for:
      * IGMP Queries
      * IGMP Leaves
      * IGMP v1/v2/v3 reports
      
      * MLD Queries
      * MLD Leaves
      * MLD v1/v2 reports
      
      These are invaluable when monitoring or debugging complex multicast setups
      with bridges.
      Signed-off-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1080ab95
  7. 02 3月, 2016 1 次提交
  8. 30 10月, 2015 1 次提交
    • R
      bridge: set is_local and is_static before fdb entry is added to the fdb hashtable · b7af1472
      Roopa Prabhu 提交于
      Problem Description:
      We can add fdbs pointing to the bridge with NULL ->dst but that has a
      few race conditions because br_fdb_insert() is used which first creates
      the fdb and then, after the fdb has been published/linked, sets
      "is_local" to 1 and in that time frame if a packet arrives for that fdb
      it may see it as non-local and either do a NULL ptr dereference in
      br_forward() or attach the fdb to the port where it arrived, and later
      br_fdb_insert() will make it local thus getting a wrong fdb entry.
      Call chain br_handle_frame_finish() -> br_forward():
      But in br_handle_frame_finish() in order to call br_forward() the dst
      should not be local i.e. skb != NULL, whenever the dst is
      found to be local skb is set to NULL so we can't forward it,
      and here comes the problem since it's running only
      with RCU when forwarding packets it can see the entry before "is_local"
      is set to 1 and actually try to dereference NULL.
      The main issue is that if someone sends a packet to the switch while
      it's adding the entry which points to the bridge device, it may
      dereference NULL ptr. This is needed now after we can add fdbs
      pointing to the bridge.  This poses a problem for
      br_fdb_update() as well, while someone's adding a bridge fdb, but
      before it has is_local == 1, it might get moved to a port if it comes
      as a source mac and then it may get its "is_local" set to 1
      
      This patch changes fdb_create to take is_local and is_static as
      arguments to set these values in the fdb entry before it is added to the
      hash. Also adds null check for port in br_forward.
      
      Fixes: 3741873b ("bridge: allow adding of fdb entries pointing to the bridge device")
      Reported-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: NRoopa Prabhu <roopa@cumulusnetworks.com>
      Reviewed-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Acked-by: NStephen Hemminger <stephen@networkplumber.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b7af1472
  9. 13 10月, 2015 1 次提交
  10. 30 9月, 2015 1 次提交
    • N
      bridge: vlan: add per-vlan struct and move to rhashtables · 2594e906
      Nikolay Aleksandrov 提交于
      This patch changes the bridge vlan implementation to use rhashtables
      instead of bitmaps. The main motivation behind this change is that we
      need extensible per-vlan structures (both per-port and global) so more
      advanced features can be introduced and the vlan support can be
      extended. I've tried to break this up but the moment net_port_vlans is
      changed and the whole API goes away, thus this is a larger patch.
      A few short goals of this patch are:
      - Extensible per-vlan structs stored in rhashtables and a sorted list
      - Keep user-visible behaviour (compressed vlans etc)
      - Keep fastpath ingress/egress logic the same (optimizations to come
        later)
      
      Here's a brief list of some of the new features we'd like to introduce:
      - per-vlan counters
      - vlan ingress/egress mapping
      - per-vlan igmp configuration
      - vlan priorities
      - avoid fdb entries replication (e.g. local fdb scaling issues)
      
      The structure is kept single for both global and per-port entries so to
      avoid code duplication where possible and also because we'll soon introduce
      "port0 / aka bridge as port" which should simplify things further
      (thanks to Vlad for the suggestion!).
      
      Now we have per-vlan global rhashtable (bridge-wide) and per-vlan port
      rhashtable, if an entry is added to a port it'll get a pointer to its
      global context so it can be quickly accessed later. There's also a
      sorted vlan list which is used for stable walks and some user-visible
      behaviour such as the vlan ranges, also for error paths.
      VLANs are stored in a "vlan group" which currently contains the
      rhashtable, sorted vlan list and the number of "real" vlan entries.
      A good side-effect of this change is that it resembles how hw keeps
      per-vlan data.
      One important note after this change is that if a VLAN is being looked up
      in the bridge's rhashtable for filtering purposes (or to check if it's an
      existing usable entry, not just a global context) then the new helper
      br_vlan_should_use() needs to be used if the vlan is found. In case the
      lookup is done only with a port's vlan group, then this check can be
      skipped.
      
      Things tested so far:
      - basic vlan ingress/egress
      - pvids
      - untagged vlans
      - undef CONFIG_BRIDGE_VLAN_FILTERING
      - adding/deleting vlans in different scenarios (with/without global ctx,
        while transmitting traffic, in ranges etc)
      - loading/removing the module while having/adding/deleting vlans
      - extracting bridge vlan information (user ABI), compressed requests
      - adding/deleting fdbs on vlans
      - bridge mac change, promisc mode
      - default pvid change
      - kmemleak ON during the whole time
      Signed-off-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2594e906
  11. 18 9月, 2015 2 次提交
    • E
      netfilter: Pass net into okfn · 0c4b51f0
      Eric W. Biederman 提交于
      This is immediately motivated by the bridge code that chains functions that
      call into netfilter.  Without passing net into the okfns the bridge code would
      need to guess about the best expression for the network namespace to process
      packets in.
      
      As net is frequently one of the first things computed in continuation functions
      after netfilter has done it's job passing in the desired network namespace is in
      many cases a code simplification.
      
      To support this change the function dst_output_okfn is introduced to
      simplify passing dst_output as an okfn.  For the moment dst_output_okfn
      just silently drops the struct net.
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0c4b51f0
    • E
      netfilter: Pass struct net into the netfilter hooks · 29a26a56
      Eric W. Biederman 提交于
      Pass a network namespace parameter into the netfilter hooks.  At the
      call site of the netfilter hooks the path a packet is taking through
      the network stack is well known which allows the network namespace to
      be easily and reliabily.
      
      This allows the replacement of magic code like
      "dev_net(state->in?:state->out)" that appears at the start of most
      netfilter hooks with "state->net".
      
      In almost all cases the network namespace passed in is derived
      from the first network device passed in, guaranteeing those
      paths will not see any changes in practice.
      
      The exceptions are:
      xfrm/xfrm_output.c:xfrm_output_resume()         xs_net(skb_dst(skb)->xfrm)
      ipvs/ip_vs_xmit.c:ip_vs_nat_send_or_cont()      ip_vs_conn_net(cp)
      ipvs/ip_vs_xmit.c:ip_vs_send_or_cont()          ip_vs_conn_net(cp)
      ipv4/raw.c:raw_send_hdrinc()                    sock_net(sk)
      ipv6/ip6_output.c:ip6_xmit()			sock_net(sk)
      ipv6/ndisc.c:ndisc_send_skb()                   dev_net(skb->dev) not dev_net(dst->dev)
      ipv6/raw.c:raw6_send_hdrinc()                   sock_net(sk)
      br_netfilter_hooks.c:br_nf_pre_routing_finish() dev_net(skb->dev) before skb->dev is set to nf_bridge->physindev
      
      In all cases these exceptions seem to be a better expression for the
      network namespace the packet is being processed in then the historic
      "dev_net(in?in:out)".  I am documenting them in case something odd
      pops up and someone starts trying to track down what happened.
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      29a26a56
  12. 30 7月, 2015 1 次提交
    • T
      bridge: Fix network header pointer for vlan tagged packets · df356d5e
      Toshiaki Makita 提交于
      There are several devices that can receive vlan tagged packets with
      CHECKSUM_PARTIAL like tap, possibly veth and xennet.
      When (multiple) vlan tagged packets with CHECKSUM_PARTIAL are forwarded
      by bridge to a device with the IP_CSUM feature, they end up with checksum
      error because before entering bridge, the network header is set to
      ETH_HLEN (not including vlan header length) in __netif_receive_skb_core(),
      get_rps_cpu(), or drivers' rx functions, and nobody fixes the pointer later.
      
      Since the network header is exepected to be ETH_HLEN in flow-dissection
      and hash-calculation in RPS in rx path, and since the header pointer fix
      is needed only in tx path, set the appropriate network header on forwarding
      packets.
      Signed-off-by: NToshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      df356d5e
  13. 10 7月, 2015 1 次提交
  14. 08 4月, 2015 1 次提交
    • D
      netfilter: Pass socket pointer down through okfn(). · 7026b1dd
      David Miller 提交于
      On the output paths in particular, we have to sometimes deal with two
      socket contexts.  First, and usually skb->sk, is the local socket that
      generated the frame.
      
      And second, is potentially the socket used to control a tunneling
      socket, such as one the encapsulates using UDP.
      
      We do not want to disassociate skb->sk when encapsulating in order
      to fix this, because that would break socket memory accounting.
      
      The most extreme case where this can cause huge problems is an
      AF_PACKET socket transmitting over a vxlan device.  We hit code
      paths doing checks that assume they are dealing with an ipv4
      socket, but are actually operating upon the AF_PACKET one.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7026b1dd
  15. 09 3月, 2015 1 次提交
  16. 06 3月, 2015 1 次提交
    • J
      bridge: Extend Proxy ARP design to allow optional rules for Wi-Fi · 842a9ae0
      Jouni Malinen 提交于
      This extends the design in commit 95850116 ("bridge: Add support for
      IEEE 802.11 Proxy ARP") with optional set of rules that are needed to
      meet the IEEE 802.11 and Hotspot 2.0 requirements for ProxyARP. The
      previously added BR_PROXYARP behavior is left as-is and a new
      BR_PROXYARP_WIFI alternative is added so that this behavior can be
      configured from user space when required.
      
      In addition, this enables proxyarp functionality for unicast ARP
      requests for both BR_PROXYARP and BR_PROXYARP_WIFI since it is possible
      to use unicast as well as broadcast for these frames.
      
      The key differences in functionality:
      
      BR_PROXYARP:
      - uses the flag on the bridge port on which the request frame was
        received to determine whether to reply
      - block bridge port flooding completely on ports that enable proxy ARP
      
      BR_PROXYARP_WIFI:
      - uses the flag on the bridge port to which the target device of the
        request belongs
      - block bridge port flooding selectively based on whether the proxyarp
        functionality replied
      Signed-off-by: NJouni Malinen <jouni@codeaurora.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      842a9ae0
  17. 31 10月, 2014 1 次提交
    • P
      netfilter: nft_reject_bridge: don't use IP stack to reject traffic · 523b929d
      Pablo Neira Ayuso 提交于
      If the packet is received via the bridge stack, this cannot reject
      packets from the IP stack.
      
      This adds functions to build the reject packet and send it from the
      bridge stack. Comments and assumptions on this patch:
      
      1) Validate the IPv4 and IPv6 headers before further processing,
         given that the packet comes from the bridge stack, we cannot assume
         they are clean. Truncated packets are dropped, we follow similar
         approach in the existing iptables match/target extensions that need
         to inspect layer 4 headers that is not available. This also includes
         packets that are directed to multicast and broadcast ethernet
         addresses.
      
      2) br_deliver() is exported to inject the reject packet via
         bridge localout -> postrouting. So the approach is similar to what
         we already do in the iptables reject target. The reject packet is
         sent to the bridge port from which we have received the original
         packet.
      
      3) The reject packet is forged based on the original packet. The TTL
         is set based on sysctl_ip_default_ttl for IPv4 and per-net
         ipv6.devconf_all hoplimit for IPv6.
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      523b929d
  18. 28 10月, 2014 1 次提交
    • K
      bridge: Add support for IEEE 802.11 Proxy ARP · 95850116
      Kyeyoon Park 提交于
      This feature is defined in IEEE Std 802.11-2012, 10.23.13. It allows
      the AP devices to keep track of the hardware-address-to-IP-address
      mapping of the mobile devices within the WLAN network.
      
      The AP will learn this mapping via observing DHCP, ARP, and NS/NA
      frames. When a request for such information is made (i.e. ARP request,
      Neighbor Solicitation), the AP will respond on behalf of the
      associated mobile device. In the process of doing so, the AP will drop
      the multicast request frame that was intended to go out to the wireless
      medium.
      
      It was recommended at the LKS workshop to do this implementation in
      the bridge layer. vxlan.c is already doing something very similar.
      The DHCP snooping code will be added to the userspace application
      (hostapd) per the recommendation.
      
      This RFC commit is only for IPv4. A similar approach in the bridge
      layer will be taken for IPv6 as well.
      Signed-off-by: NKyeyoon Park <kyeyoonp@codeaurora.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      95850116
  19. 27 9月, 2014 1 次提交
    • P
      netfilter: bridge: move br_netfilter out of the core · 34666d46
      Pablo Neira Ayuso 提交于
      Jesper reported that br_netfilter always registers the hooks since
      this is part of the bridge core. This harms performance for people that
      don't need this.
      
      This patch modularizes br_netfilter so it can be rmmod'ed, thus,
      the hooks can be unregistered. I think the bridge netfilter should have
      been a separated module since the beginning, Patrick agreed on that.
      
      Note that this is breaking compatibility for users that expect that
      bridge netfilter is going to be available after explicitly 'modprobe
      bridge' or via automatic load through brctl.
      
      However, the damage can be easily undone by modprobing br_netfilter.
      The bridge core also spots a message to provide a clue to people that
      didn't notice that this has been deprecated.
      
      On top of that, the plan is that nftables will not rely on this software
      layer, but integrate the connection tracking into the bridge layer to
      enable stateful filtering and NAT, which is was bridge netfilter users
      seem to require.
      
      This patch still keeps the fake_dst_ops in the bridge core, since this
      is required by when the bridge port is initialized. So we can safely
      modprobe/rmmod br_netfilter anytime.
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      Acked-by: NFlorian Westphal <fw@strlen.de>
      34666d46
  20. 01 4月, 2014 1 次提交
  21. 20 12月, 2013 1 次提交
  22. 19 12月, 2013 1 次提交
  23. 11 6月, 2013 1 次提交
  24. 14 2月, 2013 2 次提交
  25. 15 8月, 2012 1 次提交
  26. 24 4月, 2012 1 次提交
  27. 16 4月, 2012 1 次提交
  28. 09 12月, 2011 1 次提交
  29. 07 1月, 2011 1 次提交
  30. 16 11月, 2010 1 次提交
  31. 20 7月, 2010 1 次提交
  32. 16 6月, 2010 2 次提交
  33. 06 5月, 2010 1 次提交
    • W
      bridge: make bridge support netpoll · c06ee961
      WANG Cong 提交于
      Based on the previous patch, make bridge support netpoll by:
      
      1) implement the 2 methods to support netpoll for bridge;
      
      2) modify netpoll during forwarding packets via bridge;
      
      3) disable netpoll support of bridge when a netpoll-unabled device
         is added to bridge;
      
      4) enable netpoll support when all underlying devices support netpoll.
      
      Cc: David Miller <davem@davemloft.net>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Cc: Stephen Hemminger <shemminger@linux-foundation.org>
      Cc: Matt Mackall <mpm@selenic.com>
      Signed-off-by: NWANG Cong <amwang@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c06ee961
  34. 28 4月, 2010 2 次提交