1. 31 5月, 2015 1 次提交
  2. 23 5月, 2015 2 次提交
    • E
      bridge: fix lockdep splat · 93a33a58
      Eric Dumazet 提交于
      Following lockdep splat was reported :
      
      [   29.382286] ===============================
      [   29.382315] [ INFO: suspicious RCU usage. ]
      [   29.382344] 4.1.0-0.rc0.git11.1.fc23.x86_64 #1 Not tainted
      [   29.382380] -------------------------------
      [   29.382409] net/bridge/br_private.h:626 suspicious
      rcu_dereference_check() usage!
      [   29.382455]
                     other info that might help us debug this:
      
      [   29.382507]
                     rcu_scheduler_active = 1, debug_locks = 0
      [   29.382549] 2 locks held by swapper/0/0:
      [   29.382576]  #0:  (((&p->forward_delay_timer))){+.-...}, at:
      [<ffffffff81139f75>] call_timer_fn+0x5/0x4f0
      [   29.382660]  #1:  (&(&br->lock)->rlock){+.-...}, at:
      [<ffffffffa0450dc1>] br_forward_delay_timer_expired+0x31/0x140
      [bridge]
      [   29.382754]
                     stack backtrace:
      [   29.382787] CPU: 0 PID: 0 Comm: swapper/0 Not tainted
      4.1.0-0.rc0.git11.1.fc23.x86_64 #1
      [   29.382838] Hardware name: LENOVO 422916G/LENOVO, BIOS A1KT53AUS 04/07/2015
      [   29.382882]  0000000000000000 3ebfc20364115825 ffff880666603c48
      ffffffff81892d4b
      [   29.382943]  0000000000000000 ffffffff81e124e0 ffff880666603c78
      ffffffff8110bcd7
      [   29.383004]  ffff8800785c9d00 ffff88065485ac58 ffff880c62002800
      ffff880c5fc88ac0
      [   29.383065] Call Trace:
      [   29.383084]  <IRQ>  [<ffffffff81892d4b>] dump_stack+0x4c/0x65
      [   29.383130]  [<ffffffff8110bcd7>] lockdep_rcu_suspicious+0xe7/0x120
      [   29.383178]  [<ffffffffa04520f9>] br_fill_ifinfo+0x4a9/0x6a0 [bridge]
      [   29.383225]  [<ffffffffa045266b>] br_ifinfo_notify+0x11b/0x4b0 [bridge]
      [   29.383271]  [<ffffffffa0450d90>] ? br_hold_timer_expired+0x70/0x70 [bridge]
      [   29.383320]  [<ffffffffa0450de8>]
      br_forward_delay_timer_expired+0x58/0x140 [bridge]
      [   29.383371]  [<ffffffffa0450d90>] ? br_hold_timer_expired+0x70/0x70 [bridge]
      [   29.383416]  [<ffffffff8113a033>] call_timer_fn+0xc3/0x4f0
      [   29.383454]  [<ffffffff81139f75>] ? call_timer_fn+0x5/0x4f0
      [   29.383493]  [<ffffffff8110a90f>] ? lock_release_holdtime.part.29+0xf/0x200
      [   29.383541]  [<ffffffffa0450d90>] ? br_hold_timer_expired+0x70/0x70 [bridge]
      [   29.383587]  [<ffffffff8113a6a4>] run_timer_softirq+0x244/0x490
      [   29.383629]  [<ffffffff810b68cc>] __do_softirq+0xec/0x670
      [   29.383666]  [<ffffffff810b70d5>] irq_exit+0x145/0x150
      [   29.383703]  [<ffffffff8189f506>] smp_apic_timer_interrupt+0x46/0x60
      [   29.383744]  [<ffffffff8189d523>] apic_timer_interrupt+0x73/0x80
      [   29.383782]  <EOI>  [<ffffffff816f131f>] ? cpuidle_enter_state+0x5f/0x2f0
      [   29.383832]  [<ffffffff816f131b>] ? cpuidle_enter_state+0x5b/0x2f0
      
      Problem here is that br_forward_delay_timer_expired() is a timer
      handler, calling br_ifinfo_notify() which assumes either rcu_read_lock()
      or RTNL are held.
      
      Simplest fix seems to add rcu read lock section.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: NJosh Boyer <jwboyer@fedoraproject.org>
      Reported-by: NDominick Grift <dac.override@gmail.com>
      Cc: Vlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      93a33a58
    • T
      bridge: fix parsing of MLDv2 reports · 47cc84ce
      Thadeu Lima de Souza Cascardo 提交于
      When more than a multicast address is present in a MLDv2 report, all but
      the first address is ignored, because the code breaks out of the loop if
      there has not been an error adding that address.
      
      This has caused failures when two guests connected through the bridge
      tried to communicate using IPv6. Neighbor discoveries would not be
      transmitted to the other guest when both used a link-local address and a
      static address.
      
      This only happens when there is a MLDv2 querier in the network.
      
      The fix will only break out of the loop when there is a failure adding a
      multicast address.
      
      The mdb before the patch:
      
      dev ovirtmgmt port vnet0 grp ff02::1:ff7d:6603 temp
      dev ovirtmgmt port vnet1 grp ff02::1:ff7d:6604 temp
      dev ovirtmgmt port bond0.86 grp ff02::2 temp
      
      After the patch:
      
      dev ovirtmgmt port vnet0 grp ff02::1:ff7d:6603 temp
      dev ovirtmgmt port vnet1 grp ff02::1:ff7d:6604 temp
      dev ovirtmgmt port bond0.86 grp ff02::fb temp
      dev ovirtmgmt port bond0.86 grp ff02::2 temp
      dev ovirtmgmt port bond0.86 grp ff02::d temp
      dev ovirtmgmt port vnet0 grp ff02::1:ff00:76 temp
      dev ovirtmgmt port bond0.86 grp ff02::16 temp
      dev ovirtmgmt port vnet1 grp ff02::1:ff00:77 temp
      dev ovirtmgmt port bond0.86 grp ff02::1:ff00:def temp
      dev ovirtmgmt port bond0.86 grp ff02::1:ffa1:40bf temp
      
      Fixes: 08b202b6 ("bridge br_multicast: IPv6 MLD support.")
      Reported-by: NRik Theys <Rik.Theys@esat.kuleuven.be>
      Signed-off-by: NThadeu Lima de Souza Cascardo <cascardo@redhat.com>
      Tested-by: NRik Theys <Rik.Theys@esat.kuleuven.be>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      47cc84ce
  3. 20 5月, 2015 2 次提交
    • F
      Revert "netfilter: bridge: query conntrack about skb dnat" · faecbb45
      Florian Westphal 提交于
      This reverts commit c055d5b0.
      
      There are two issues:
      'dnat_took_place' made me think that this is related to
      -j DNAT/MASQUERADE.
      
      But thats only one part of the story.  This is also relevant for SNAT
      when we undo snat translation in reverse/reply direction.
      
      Furthermore, I originally wanted to do this mainly to avoid
      storing ipv6 addresses once we make DNAT/REDIRECT work
      for ipv6 on bridges.
      
      However, I forgot about SNPT/DNPT which is stateless.
      
      So we can't escape storing address for ipv6 anyway. Might as
      well do it for ipv4 too.
      Reported-and-tested-by: NBernhard Thaler <bernhard.thaler@wvnet.at>
      Signed-off-by: NFlorian Westphal <fw@strlen.de>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      faecbb45
    • D
      netfilter: ensure number of counters is >0 in do_replace() · 1086bbe9
      Dave Jones 提交于
      After improving setsockopt() coverage in trinity, I started triggering
      vmalloc failures pretty reliably from this code path:
      
      warn_alloc_failed+0xe9/0x140
      __vmalloc_node_range+0x1be/0x270
      vzalloc+0x4b/0x50
      __do_replace+0x52/0x260 [ip_tables]
      do_ipt_set_ctl+0x15d/0x1d0 [ip_tables]
      nf_setsockopt+0x65/0x90
      ip_setsockopt+0x61/0xa0
      raw_setsockopt+0x16/0x60
      sock_common_setsockopt+0x14/0x20
      SyS_setsockopt+0x71/0xd0
      
      It turns out we don't validate that the num_counters field in the
      struct we pass in from userspace is initialized.
      
      The same problem also exists in ebtables, arptables, ipv6, and the
      compat variants.
      Signed-off-by: NDave Jones <davej@codemonkey.org.uk>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      1086bbe9
  4. 30 4月, 2015 2 次提交
    • N
      bridge/nl: remove wrong use of NLM_F_MULTI · 46c264da
      Nicolas Dichtel 提交于
      NLM_F_MULTI must be used only when a NLMSG_DONE message is sent. In fact,
      it is sent only at the end of a dump.
      
      Libraries like libnl will wait forever for NLMSG_DONE.
      
      Fixes: e5a55a89 ("net: create generic bridge ops")
      Fixes: 815cccbf ("ixgbe: add setlink, getlink support to ixgbe and ixgbevf")
      CC: John Fastabend <john.r.fastabend@intel.com>
      CC: Sathya Perla <sathya.perla@emulex.com>
      CC: Subbu Seetharaman <subbu.seetharaman@emulex.com>
      CC: Ajit Khaparde <ajit.khaparde@emulex.com>
      CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      CC: intel-wired-lan@lists.osuosl.org
      CC: Jiri Pirko <jiri@resnulli.us>
      CC: Scott Feldman <sfeldma@gmail.com>
      CC: Stephen Hemminger <stephen@networkplumber.org>
      CC: bridge@lists.linux-foundation.org
      Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      46c264da
    • N
      bridge/mdb: remove wrong use of NLM_F_MULTI · 82199679
      Nicolas Dichtel 提交于
      NLM_F_MULTI must be used only when a NLMSG_DONE message is sent. In fact,
      it is sent only at the end of a dump.
      
      Libraries like libnl will wait forever for NLMSG_DONE.
      
      Fixes: 37a393bc ("bridge: notify mdb changes via netlink")
      CC: Cong Wang <amwang@redhat.com>
      CC: Stephen Hemminger <stephen@networkplumber.org>
      CC: bridge@lists.linux-foundation.org
      Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      82199679
  5. 13 4月, 2015 7 次提交
  6. 09 4月, 2015 1 次提交
  7. 08 4月, 2015 5 次提交
    • F
      netfilter: bridge: make BRNF_PKT_TYPE flag a bool · a1e67951
      Florian Westphal 提交于
      nf_bridge_info->mask is used for several things, for example to
      remember if skb->pkt_type was set to OTHER_HOST.
      
      For a bridge, OTHER_HOST is expected case. For ip forward its a non-starter
      though -- routing expects PACKET_HOST.
      
      Bridge netfilter thus changes OTHER_HOST to PACKET_HOST before hook
      invocation and then un-does it after hook traversal.
      
      This information is irrelevant outside of br_netfilter.
      
      After this change, ->mask now only contains flags that need to be
      known outside of br_netfilter in fast-path.
      
      Future patch changes mask into a 2bit state field in sk_buff, so that
      we can remove skb->nf_bridge pointer for good and consider all remaining
      places that access nf_bridge info content a not-so fastpath.
      Signed-off-by: NFlorian Westphal <fw@strlen.de>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      a1e67951
    • F
      netfilter: bridge: start splitting mask into public/private chunks · 3eaf4025
      Florian Westphal 提交于
      ->mask is a bit info field that mixes various use cases.
      
      In particular, we have flags that are mutually exlusive, and flags that
      are only used within br_netfilter while others need to be exposed to
      other parts of the kernel.
      
      Remove BRNF_8021Q/PPPoE flags.  They're mutually exclusive and only
      needed within br_netfilter context.
      Signed-off-by: NFlorian Westphal <fw@strlen.de>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      3eaf4025
    • F
      netfilter: bridge: add and use nf_bridge_info_get helper · 38330783
      Florian Westphal 提交于
      Don't access skb->nf_bridge directly, this pointer will be removed soon.
      Signed-off-by: NFlorian Westphal <fw@strlen.de>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      38330783
    • F
      netfilter: bridge: don't use nf_bridge_info data to store mac header · e70deecb
      Florian Westphal 提交于
      br_netfilter maintains an extra state, nf_bridge_info, which is attached
      to skb via skb->nf_bridge pointer.
      
      Amongst other things we use skb->nf_bridge->data to store the original
      mac header for every processed skb.
      
      This is required for ip refragmentation when using conntrack
      on top of bridge, because ip_fragment doesn't copy it from original skb.
      
      However there is no need anymore to do this unconditionally.
      
      Move this to the one place where its needed -- when br_netfilter calls
      ip_fragment().
      
      Also switch to percpu storage for this so we can handle fragmenting
      without accessing nf_bridge meta data.
      
      Only user left is neigh resolution when DNAT is detected, to hold
      the original source mac address (neigh resolution builds new mac header
      using bridge mac), so rename ->data and reduce its size to whats needed.
      Signed-off-by: NFlorian Westphal <fw@strlen.de>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      e70deecb
    • D
      netfilter: Pass socket pointer down through okfn(). · 7026b1dd
      David Miller 提交于
      On the output paths in particular, we have to sometimes deal with two
      socket contexts.  First, and usually skb->sk, is the local socket that
      generated the frame.
      
      And second, is potentially the socket used to control a tunneling
      socket, such as one the encapsulates using UDP.
      
      We do not want to disassociate skb->sk when encapsulating in order
      to fix this, because that would break socket memory accounting.
      
      The most extreme case where this can cause huge problems is an
      AF_PACKET socket transmitting over a vxlan device.  We hit code
      paths doing checks that assume they are dealing with an ipv4
      socket, but are actually operating upon the AF_PACKET one.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7026b1dd
  8. 05 4月, 2015 2 次提交
  9. 03 4月, 2015 1 次提交
  10. 02 4月, 2015 1 次提交
  11. 23 3月, 2015 1 次提交
  12. 19 3月, 2015 1 次提交
  13. 16 3月, 2015 2 次提交
  14. 15 3月, 2015 1 次提交
  15. 10 3月, 2015 4 次提交
  16. 09 3月, 2015 3 次提交
  17. 06 3月, 2015 1 次提交
    • J
      bridge: Extend Proxy ARP design to allow optional rules for Wi-Fi · 842a9ae0
      Jouni Malinen 提交于
      This extends the design in commit 95850116 ("bridge: Add support for
      IEEE 802.11 Proxy ARP") with optional set of rules that are needed to
      meet the IEEE 802.11 and Hotspot 2.0 requirements for ProxyARP. The
      previously added BR_PROXYARP behavior is left as-is and a new
      BR_PROXYARP_WIFI alternative is added so that this behavior can be
      configured from user space when required.
      
      In addition, this enables proxyarp functionality for unicast ARP
      requests for both BR_PROXYARP and BR_PROXYARP_WIFI since it is possible
      to use unicast as well as broadcast for these frames.
      
      The key differences in functionality:
      
      BR_PROXYARP:
      - uses the flag on the bridge port on which the request frame was
        received to determine whether to reply
      - block bridge port flooding completely on ports that enable proxy ARP
      
      BR_PROXYARP_WIFI:
      - uses the flag on the bridge port to which the target device of the
        request belongs
      - block bridge port flooding selectively based on whether the proxyarp
        functionality replied
      Signed-off-by: NJouni Malinen <jouni@codeaurora.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      842a9ae0
  18. 04 3月, 2015 2 次提交
  19. 03 3月, 2015 1 次提交
    • F
      netfilter: bridge: rework reject handling · 72500bc1
      Florian Westphal 提交于
      bridge reject handling is not straightforward, there are many subtle
      differences depending on configuration.
      
      skb->dev is either the bridge port (PRE_ROUTING) or the bridge
      itself (INPUT), so we need to use indev instead.
      
      Also, checksum validation will only work reliably if we trim skb
      according to the l3 header size.
      
      While at it, add csum validation for ipv6 and skip existing tests
      if skb was already checked e.g. by GRO.
      Signed-off-by: NFlorian Westphal <fw@strlen.de>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      72500bc1