1. 05 Oct, 2015 2 commits
  2. 02 Oct, 2015 1 commit
  3. 30 Sep, 2015 1 commit
    • N
      bridge: vlan: add per-vlan struct and move to rhashtables · 2594e906
      Nikolay Aleksandrov committed
      This patch changes the bridge vlan implementation to use rhashtables
      instead of bitmaps. The main motivation behind this change is that we
      need extensible per-vlan structures (both per-port and global) so more
      advanced features can be introduced and the vlan support can be
      extended. I've tried to break this up, but the moment net_port_vlans is
      changed the whole API goes away, so this is a larger patch.
      A few short goals of this patch are:
      - Extensible per-vlan structs stored in rhashtables and a sorted list
      - Keep user-visible behaviour (compressed vlans etc)
      - Keep fastpath ingress/egress logic the same (optimizations to come
        later)
      
      Here's a brief list of some of the new features we'd like to introduce:
      - per-vlan counters
      - vlan ingress/egress mapping
      - per-vlan igmp configuration
      - vlan priorities
      - avoid fdb entries replication (e.g. local fdb scaling issues)
      
      The structure is kept single for both global and per-port entries so as to
      avoid code duplication where possible and also because we'll soon introduce
      "port0 / aka bridge as port" which should simplify things further
      (thanks to Vlad for the suggestion!).
      
      Now we have a per-vlan global rhashtable (bridge-wide) and a per-vlan port
      rhashtable. If an entry is added to a port, it gets a pointer to its
      global context so it can be quickly accessed later. There's also a
      sorted vlan list, which is used for stable walks, for some user-visible
      behaviour such as the vlan ranges, and for error paths.
      VLANs are stored in a "vlan group" which currently contains the
      rhashtable, sorted vlan list and the number of "real" vlan entries.
      A good side-effect of this change is that it resembles how hw keeps
      per-vlan data.
      One important note after this change is that if a VLAN is being looked up
      in the bridge's rhashtable for filtering purposes (or to check if it's an
      existing usable entry, not just a global context) then the new helper
      br_vlan_should_use() needs to be used if the vlan is found. In case the
      lookup is done only with a port's vlan group, then this check can be
      skipped.
      
      Things tested so far:
      - basic vlan ingress/egress
      - pvids
      - untagged vlans
      - undef CONFIG_BRIDGE_VLAN_FILTERING
      - adding/deleting vlans in different scenarios (with/without global ctx,
        while transmitting traffic, in ranges etc)
      - loading/removing the module while having/adding/deleting vlans
      - extracting bridge vlan information (user ABI), compressed requests
      - adding/deleting fdbs on vlans
      - bridge mac change, promisc mode
      - default pvid change
      - kmemleak ON during the whole time
      Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2594e906
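The lookup rule from this commit message (a bridge-wide entry may exist only as shared context, so callers must check br_vlan_should_use()) can be modeled in userspace. This is a minimal sketch, not the kernel implementation: `struct vlan_entry`, the two flag names, and `vlan_should_use()` are hypothetical stand-ins chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flags for this sketch; not the kernel's definitions. */
#define VLAN_FLAG_MASTER  (1u << 0) /* entry lives in the bridge-wide table */
#define VLAN_FLAG_BRENTRY (1u << 1) /* VLAN was configured on the bridge itself */

/* Hypothetical per-vlan entry shared by bridge-wide and per-port tables. */
struct vlan_entry {
	uint16_t vid;
	unsigned int flags;
	struct vlan_entry *global_ctx; /* port entries point at their bridge-wide context */
};

/*
 * A bridge-wide entry can exist purely as context for port entries.
 * It is usable for filtering only if it is also a real bridge entry.
 * Entries found through a port's own vlan group are always usable.
 */
bool vlan_should_use(const struct vlan_entry *v)
{
	assert(v != NULL);
	if (v->flags & VLAN_FLAG_MASTER)
		return (v->flags & VLAN_FLAG_BRENTRY) != 0;
	return true;
}
```

In this model a caller that looks a VLAN up in the bridge-wide table calls `vlan_should_use()` on a hit, while a caller that searched only a port's table can skip the check, mirroring the note in the commit message.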
  4. 18 Sep, 2015 1 commit
    • E
      netfilter: Pass net into okfn · 0c4b51f0
      Eric W. Biederman committed
      This is immediately motivated by the bridge code that chains functions that
      call into netfilter.  Without passing net into the okfns the bridge code would
      need to guess about the best expression for the network namespace to process
      packets in.
      
      As net is frequently one of the first things computed in continuation functions
      after netfilter has done its job, passing in the desired network namespace is in
      many cases a code simplification.
      
      To support this change the function dst_output_okfn is introduced to
      simplify passing dst_output as an okfn.  For the moment dst_output_okfn
      just silently drops the struct net.
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0c4b51f0
  5. 28 Aug, 2015 2 commits
    • N
      bridge: fdb: rearrange net_bridge_fdb_entry · b22fbf22
      Nikolay Aleksandrov committed
      While looking into fixing the local entries scalability issue I noticed
      that the structure is badly arranged because vlan_id would fall in a
      second cache line while keeping rcu which is used only when deleting
      in the first, so re-arrange the structure and push rcu to the end so we
      can get 16 bytes which can be used for other fields (by pushing rcu
      fully in the second 64 byte chunk). With this change all the core
      necessary information when doing fdb lookups will be available in a
      single cache line.
      
      pahole before (note vlan_id):
      struct net_bridge_fdb_entry {
      	struct hlist_node          hlist;                /*     0    16 */
      	struct net_bridge_port *   dst;                  /*    16     8 */
      	struct callback_head       rcu;                  /*    24    16 */
      	long unsigned int          updated;              /*    40     8 */
      	long unsigned int          used;                 /*    48     8 */
      	mac_addr                   addr;                 /*    56     6 */
      	unsigned char              is_local:1;           /*    62: 7  1 */
      	unsigned char              is_static:1;          /*    62: 6  1 */
      	unsigned char              added_by_user:1;      /*    62: 5  1 */
      	unsigned char              added_by_external_learn:1; /*    62: 4  1 */
      
      	/* XXX 4 bits hole, try to pack */
      	/* XXX 1 byte hole, try to pack */
      
      	/* --- cacheline 1 boundary (64 bytes) --- */
      	__u16                      vlan_id;              /*    64     2 */
      
      	/* size: 72, cachelines: 2, members: 11 */
      	/* sum members: 65, holes: 1, sum holes: 1 */
      	/* bit holes: 1, sum bit holes: 4 bits */
      	/* padding: 6 */
      	/* last cacheline: 8 bytes */
      }
      
      pahole after (note vlan_id):
      struct net_bridge_fdb_entry {
      	struct hlist_node          hlist;                /*     0    16 */
      	struct net_bridge_port *   dst;                  /*    16     8 */
      	long unsigned int          updated;              /*    24     8 */
      	long unsigned int          used;                 /*    32     8 */
      	mac_addr                   addr;                 /*    40     6 */
      	__u16                      vlan_id;              /*    46     2 */
      	unsigned char              is_local:1;           /*    48: 7  1 */
      	unsigned char              is_static:1;          /*    48: 6  1 */
      	unsigned char              added_by_user:1;      /*    48: 5  1 */
      	unsigned char              added_by_external_learn:1; /*    48: 4  1 */
      
      	/* XXX 4 bits hole, try to pack */
      	/* XXX 7 bytes hole, try to pack */
      
      	struct callback_head       rcu;                  /*    56    16 */
      	/* --- cacheline 1 boundary (64 bytes) was 8 bytes ago --- */
      
      	/* size: 72, cachelines: 2, members: 11 */
      	/* sum members: 65, holes: 1, sum holes: 7 */
      	/* bit holes: 1, sum bit holes: 4 bits */
      	/* last cacheline: 8 bytes */
      }
      Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b22fbf22
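The effect of the reordering shown in the pahole output can be checked in userspace with `offsetof`. This is a sketch under assumptions: the `*_model` types are stand-ins sized like the kernel types on a typical 64-bit build, the helper function names are made up for this example, and bit-fields of type `unsigned char` are a widely supported compiler extension rather than strict ISO C.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof, size_t */
#include <stdint.h>

#define CACHELINE_BYTES 64

/* Userspace stand-ins for the kernel types (assumptions for this sketch). */
struct hlist_node_model { void *next; void **pprev; };
struct callback_head_model { void *next; void (*func)(void *); };
typedef struct { unsigned char addr[6]; } mac_addr_model;

/* Field order before the patch: rcu sits early, vlan_id comes last. */
struct fdb_before {
	struct hlist_node_model hlist;
	void *dst;
	struct callback_head_model rcu;
	unsigned long updated;
	unsigned long used;
	mac_addr_model addr;
	unsigned char is_local:1;
	unsigned char is_static:1;
	unsigned char added_by_user:1;
	unsigned char added_by_external_learn:1;
	uint16_t vlan_id;
};

/* Field order after the patch: lookup fields first, rcu pushed to the end. */
struct fdb_after {
	struct hlist_node_model hlist;
	void *dst;
	unsigned long updated;
	unsigned long used;
	mac_addr_model addr;
	uint16_t vlan_id;
	unsigned char is_local:1;
	unsigned char is_static:1;
	unsigned char added_by_user:1;
	unsigned char added_by_external_learn:1;
	struct callback_head_model rcu;
};

/* The point of the patch: vlan_id must fit entirely in the first cache line. */
static_assert(offsetof(struct fdb_after, vlan_id) + sizeof(uint16_t) <= CACHELINE_BYTES,
	      "vlan_id should sit in the first cache line after the reordering");

size_t vlan_id_offset_before(void) { return offsetof(struct fdb_before, vlan_id); }
size_t vlan_id_offset_after(void)  { return offsetof(struct fdb_after, vlan_id); }
size_t rcu_offset_after(void)      { return offsetof(struct fdb_after, rcu); }
```

On an x86-64 build the three helpers should correspond to the offsets pahole reports above, but the exact numbers depend on the target ABI, so treat them as something to print and compare rather than as guaranteed values.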
    • T
      bridge: Add netlink support for vlan_protocol attribute · d2d427b3
      Toshiaki Makita committed
      This enables bridge vlan_protocol to be configured through netlink.
      
      When CONFIG_BRIDGE_VLAN_FILTERING is disabled, the kernel behaves as if
      this feature were not implemented.
      Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d2d427b3
  6. 11 Aug, 2015 1 commit
  7. 27 Jul, 2015 1 commit
  8. 21 Jul, 2015 2 commits
  9. 10 Jul, 2015 1 commit
  10. 24 Jun, 2015 1 commit
  11. 12 Jun, 2015 2 commits
    • B
      netfilter: bridge: forward IPv6 fragmented packets · efb6de9b
      Bernhard Thaler committed
      IPv6 fragmented packets are not forwarded on an ethernet bridge
      with netfilter ip6_tables loaded. Steps to reproduce:
      
      1) create a simple bridge like this
      
              modprobe br_netfilter
              brctl addbr br0
              brctl addif br0 eth0
              brctl addif br0 eth2
              ifconfig eth0 up
              ifconfig eth2 up
              ifconfig br0 up
      
      2) place a host with an IPv6 address on each side of the bridge
      
              set IPv6 address on host A:
              ip -6 addr add fd01:2345:6789:1::1/64 dev eth0
      
              set IPv6 address on host B:
              ip -6 addr add fd01:2345:6789:1::2/64 dev eth0
      
      3) run a simple ping command on host A with packets > MTU
      
              ping6 -s 4000 fd01:2345:6789:1::2
      
      4) wait some time and run e.g. "ip6tables -t nat -nvL" on the bridge
      
      IPv6 fragmented packets traverse the bridge cleanly until somebody runs
      "ip6tables -t nat -nvL". As soon as it is run (and netfilter modules are
      loaded) IPv6 fragmented packets do not traverse the bridge any more (you
      see no more responses in ping's output).
      
      After applying this patch, IPv6 fragmented packets traverse the bridge
      cleanly in the above scenario.
      Signed-off-by: Bernhard Thaler <bernhard.thaler@wvnet.at>
      [pablo@netfilter.org: small changes to br_nf_dev_queue_xmit]
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      efb6de9b
    • B
      netfilter: bridge: refactor frag_max_size · 411ffb4f
      Bernhard Thaler committed
      Currently frag_max_size is a member of br_input_skb_cb and copied back and
      forth using IPCB(skb) and BR_INPUT_SKB_CB(skb) each time it is changed or
      used.
      
      Attach frag_max_size to nf_bridge_info and set value in pre_routing and
      forward functions. Use its value in forward and xmit functions.
      Signed-off-by: Bernhard Thaler <bernhard.thaler@wvnet.at>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      411ffb4f
  12. 06 May, 2015 1 commit
    • B
      bridge: change BR_GROUPFWD_RESTRICTED to allow forwarding of LLDP frames · 784b58a3
      Bernhard Thaler committed
      BR_GROUPFWD_RESTRICTED bitmask restricts users from setting values to
      /sys/class/net/brX/bridge/group_fwd_mask that allow forwarding of
      some IEEE 802.1D Table 7-10 Reserved addresses:
      
      	(MAC Control) 802.3		01-80-C2-00-00-01
      	(Link Aggregation) 802.3	01-80-C2-00-00-02
      	802.1AB LLDP			01-80-C2-00-00-0E
      
      Change BR_GROUPFWD_RESTRICTED to allow forwarding of LLDP frames and document
      group_fwd_mask.
      
      e.g.
         echo 16384 > /sys/class/net/brX/bridge/group_fwd_mask
      allows forwarding of LLDP frames.
      
      This may be needed for bridge setups used for network troubleshooting or
      any other scenario where forwarding of LLDP frames is desired (e.g. a bridge
      connecting a virtual machine to a real switch transmitting LLDP frames that
      the virtual machine needs to receive).
      
      Tested on a simple bridge setup with two interfaces and a host transmitting
      LLDP frames on one side of this bridge (used lldpd). Setting group_fwd_mask
      as described above lets LLDP frames traverse the bridge.
      Signed-off-by: Bernhard Thaler <bernhard.thaler@wvnet.at>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      784b58a3
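The value 16384 in the example above is bit 14 of the mask, matching the last octet 0x0E of the LLDP address 01-80-C2-00-00-0E. The sketch below illustrates that arithmetic. The function names are hypothetical, and the two "restricted" masks contain only the addresses listed in this commit message; the kernel's actual BR_GROUPFWD_RESTRICTED value may cover additional bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The last octet X of 01-80-C2-00-00-0X selects bit X of group_fwd_mask. */
uint16_t group_fwd_bit(uint8_t last_octet)
{
	assert(last_octet <= 0x0F); /* only 16 reserved addresses map to mask bits */
	return (uint16_t)(1u << last_octet);
}

/* Assumption: restricted set before the patch, built only from the addresses named above. */
uint16_t restricted_mask_before(void)
{
	return group_fwd_bit(0x01) | group_fwd_bit(0x02) | group_fwd_bit(0x0E);
}

/* Assumption: the same set with the LLDP bit no longer restricted. */
uint16_t restricted_mask_after(void)
{
	return group_fwd_bit(0x01) | group_fwd_bit(0x02);
}

/* A user-supplied mask is acceptable if it touches no restricted bit. */
bool mask_is_allowed(uint16_t mask, uint16_t restricted)
{
	return (mask & restricted) == 0;
}
```

With these definitions, `group_fwd_bit(0x0E)` yields the 16384 written to group_fwd_mask in the commit message; that value is rejected against the "before" set and accepted against the "after" set.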
  13. 30 Apr, 2015 1 commit
    • N
      bridge/nl: remove wrong use of NLM_F_MULTI · 46c264da
      Nicolas Dichtel committed
      NLM_F_MULTI must be used only when a NLMSG_DONE message is sent. In fact,
      it is sent only at the end of a dump.
      
      Libraries like libnl will wait forever for NLMSG_DONE.
      
      Fixes: e5a55a89 ("net: create generic bridge ops")
      Fixes: 815cccbf ("ixgbe: add setlink, getlink support to ixgbe and ixgbevf")
      CC: John Fastabend <john.r.fastabend@intel.com>
      CC: Sathya Perla <sathya.perla@emulex.com>
      CC: Subbu Seetharaman <subbu.seetharaman@emulex.com>
      CC: Ajit Khaparde <ajit.khaparde@emulex.com>
      CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      CC: intel-wired-lan@lists.osuosl.org
      CC: Jiri Pirko <jiri@resnulli.us>
      CC: Scott Feldman <sfeldma@gmail.com>
      CC: Stephen Hemminger <stephen@networkplumber.org>
      CC: bridge@lists.linux-foundation.org
      Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      46c264da
  14. 08 Apr, 2015 1 commit
    • D
      netfilter: Pass socket pointer down through okfn(). · 7026b1dd
      David Miller committed
      On the output paths in particular, we have to sometimes deal with two
      socket contexts.  First, and usually skb->sk, is the local socket that
      generated the frame.
      
      And second, is potentially the socket used to control a tunneling
      socket, such as one that encapsulates using UDP.
      
      We do not want to disassociate skb->sk when encapsulating in order
      to fix this, because that would break socket memory accounting.
      
      The most extreme case where this can cause huge problems is an
      AF_PACKET socket transmitting over a vxlan device.  We hit code
      paths doing checks that assume they are dealing with an ipv4
      socket, but are actually operating upon the AF_PACKET one.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7026b1dd
  15. 10 Mar, 2015 2 commits
  16. 06 Mar, 2015 1 commit
    • J
      bridge: Extend Proxy ARP design to allow optional rules for Wi-Fi · 842a9ae0
      Jouni Malinen committed
      This extends the design in commit 95850116 ("bridge: Add support for
      IEEE 802.11 Proxy ARP") with an optional set of rules that are needed to
      meet the IEEE 802.11 and Hotspot 2.0 requirements for ProxyARP. The
      previously added BR_PROXYARP behavior is left as-is and a new
      BR_PROXYARP_WIFI alternative is added so that this behavior can be
      configured from user space when required.
      
      In addition, this enables proxyarp functionality for unicast ARP
      requests for both BR_PROXYARP and BR_PROXYARP_WIFI since it is possible
      to use unicast as well as broadcast for these frames.
      
      The key differences in functionality:
      
      BR_PROXYARP:
      - uses the flag on the bridge port on which the request frame was
        received to determine whether to reply
      - block bridge port flooding completely on ports that enable proxy ARP
      
      BR_PROXYARP_WIFI:
      - uses the flag on the bridge port to which the target device of the
        request belongs
      - block bridge port flooding selectively based on whether the proxyarp
        functionality replied
      Signed-off-by: Jouni Malinen <jouni@codeaurora.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      842a9ae0
  17. 02 Feb, 2015 1 commit
  18. 18 Jan, 2015 1 commit
  19. 14 Jan, 2015 1 commit
  20. 03 Dec, 2014 4 commits
  21. 28 Oct, 2014 1 commit
    • K
      bridge: Add support for IEEE 802.11 Proxy ARP · 95850116
      Kyeyoon Park committed
      This feature is defined in IEEE Std 802.11-2012, 10.23.13. It allows
      the AP devices to keep track of the hardware-address-to-IP-address
      mapping of the mobile devices within the WLAN network.
      
      The AP will learn this mapping via observing DHCP, ARP, and NS/NA
      frames. When a request for such information is made (i.e. ARP request,
      Neighbor Solicitation), the AP will respond on behalf of the
      associated mobile device. In the process of doing so, the AP will drop
      the multicast request frame that was intended to go out to the wireless
      medium.
      
      It was recommended at the LKS workshop to do this implementation in
      the bridge layer. vxlan.c is already doing something very similar.
      The DHCP snooping code will be added to the userspace application
      (hostapd) per the recommendation.
      
      This RFC commit is only for IPv4. A similar approach in the bridge
      layer will be taken for IPv6 as well.
      Signed-off-by: Kyeyoon Park <kyeyoonp@codeaurora.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      95850116
  22. 08 Oct, 2014 1 commit
    • H
      bridge: Save frag_max_size between PRE_ROUTING and POST_ROUTING · 93fdd47e
      Herbert Xu committed
      As we may defragment the packet in IPv4 PRE_ROUTING and refragment
      it after POST_ROUTING we should save the value of frag_max_size.
      
      This is still very wrong as the bridge is supposed to leave the
      packets intact, meaning that the right thing to do is to use the
      original frag_list for fragmentation.
      
      Unfortunately we don't currently guarantee that the frag_list is
      left untouched throughout netfilter so until this changes this is
      the best we can do.
      
      There is also a spot in FORWARD where it appears that we can
      forward a packet without going through fragmentation, mark it
      so that we can fix it later.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      93fdd47e
  23. 06 Oct, 2014 3 commits
  24. 02 Oct, 2014 1 commit
  25. 27 Sep, 2014 1 commit
      netfilter: bridge: move br_netfilter out of the core · 34666d46
      Pablo Neira Ayuso committed
      Jesper reported that br_netfilter always registers the hooks since
      this is part of the bridge core. This harms performance for people that
      don't need this.
      
      This patch modularizes br_netfilter so it can be rmmod'ed, thus,
      the hooks can be unregistered. I think the bridge netfilter should have
      been a separate module since the beginning; Patrick agreed on that.
      
      Note that this is breaking compatibility for users that expect that
      bridge netfilter is going to be available after explicitly 'modprobe
      bridge' or via automatic load through brctl.
      
      However, the damage can be easily undone by modprobing br_netfilter.
      The bridge core also prints a message to provide a clue to people that
      didn't notice that this has been deprecated.
      
      On top of that, the plan is that nftables will not rely on this software
      layer, but integrate the connection tracking into the bridge layer to
      enable stateful filtering and NAT, which is what bridge netfilter users
      seem to require.
      
      This patch still keeps the fake_dst_ops in the bridge core, since this
      is required when the bridge port is initialized. So we can safely
      modprobe/rmmod br_netfilter anytime.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Acked-by: Florian Westphal <fw@strlen.de>
      34666d46
  26. 14 Sep, 2014 1 commit
    • V
      bridge: Check if vlan filtering is enabled only once. · 20adfa1a
      Vlad Yasevich committed
      The bridge code checks if vlan filtering is enabled on both
      ingress and egress.   When the state flip happens, it
      is possible for the bridge to currently be forwarding packets
      and forwarding behavior becomes non-deterministic.  Bridge
      may drop packets on some interfaces, but not others.
      
      This patch solves this by caching the filtered state of the
      packet into skb_cb on ingress.  The skb_cb is guaranteed to
      not be over-written between the time the packet enters the bridge
      forwarding path and the time it leaves it.  On egress, we
      can then check the cached state to see if we need to
      apply filtering information.
      Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      20adfa1a
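The idea of sampling the filtering setting once per packet can be illustrated with a small userspace model. This is only a sketch of the approach described in the commit message: `struct bridge_model`, `struct packet_model`, and both functions are hypothetical, with the `vlan_filtered` field standing in for state kept in skb_cb.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bridge with a runtime-togglable filtering switch. */
struct bridge_model {
	bool vlan_enabled;
};

/* Hypothetical packet; vlan_filtered stands in for per-packet skb_cb state. */
struct packet_model {
	bool vlan_filtered;
};

/* Ingress: read the bridge setting once and remember it in the packet. */
void ingress(const struct bridge_model *br, struct packet_model *pkt)
{
	assert(br != NULL && pkt != NULL);
	pkt->vlan_filtered = br->vlan_enabled;
}

/* Egress: consult only the cached decision, never the live bridge setting. */
bool egress_applies_filtering(const struct packet_model *pkt)
{
	assert(pkt != NULL);
	return pkt->vlan_filtered;
}
```

Because egress never re-reads the bridge setting, a packet that entered while filtering was enabled is still treated as filtered on the way out, even if the setting is flipped while the packet is in flight.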
  27. 11 Jul, 2014 1 commit
  28. 12 Jun, 2014 3 commits