1. 05 5月, 2015 1 次提交
  2. 08 4月, 2015 1 次提交
    • D
      netfilter: Pass socket pointer down through okfn(). · 7026b1dd
      David Miller 提交于
      On the output paths in particular, we have to sometimes deal with two
      socket contexts.  First, and usually skb->sk, is the local socket that
      generated the frame.
      
      And second, is potentially the socket used to control a tunneling
      socket, such as one the encapsulates using UDP.
      
      We do not want to disassociate skb->sk when encapsulating in order
      to fix this, because that would break socket memory accounting.
      
      The most extreme case where this can cause huge problems is an
      AF_PACKET socket transmitting over a vxlan device.  We hit code
      paths doing checks that assume they are dealing with an ipv4
      socket, but are actually operating upon the AF_PACKET one.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7026b1dd
  3. 17 11月, 2014 1 次提交
    • L
      bridge: fix netfilter/NF_BR_LOCAL_OUT for own, locally generated queries · f0b4eece
      Linus Lüssing 提交于
      Ebtables on the OUTPUT chain (NF_BR_LOCAL_OUT) would not work as expected
      for both locally generated IGMP and MLD queries. The IP header specific
      filter options are off by 14 Bytes for netfilter (actual output on
      interfaces is fine).
      
      NF_HOOK() expects the skb->data to point to the IP header, not the
      ethernet one (while dev_queue_xmit() does not). Luckily there is an
      br_dev_queue_push_xmit() helper function already - let's just use that.
      
      Introduced by eb1d1641
      ("bridge: Add core IGMP snooping support")
      
      Ebtables example:
      
      $ ebtables -I OUTPUT -p IPv6 -o eth1 --logical-out br0 \
      	--log --log-level 6 --log-ip6 --log-prefix="~EBT: " -j DROP
      
      before (broken):
      
      ~EBT:  IN= OUT=eth1 MAC source = 02:04:64:a4:39:c2 \
      	MAC dest = 33:33:00:00:00:01 proto = 0x86dd IPv6 \
      	SRC=64a4:39c2:86dd:6000:0000:0020:0001:fe80 IPv6 \
      	DST=0000:0000:0000:0004:64ff:fea4:39c2:ff02, \
      	IPv6 priority=0x3, Next Header=2
      
      after (working):
      
      ~EBT:  IN= OUT=eth1 MAC source = 02:04:64:a4:39:c2 \
      	MAC dest = 33:33:00:00:00:01 proto = 0x86dd IPv6 \
      	SRC=fe80:0000:0000:0000:0004:64ff:fea4:39c2 IPv6 \
      	DST=ff02:0000:0000:0000:0000:0000:0000:0001, \
      	IPv6 priority=0x0, Next Header=0
      Signed-off-by: NLinus Lüssing <linus.luessing@web.de>
      Acked-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      f0b4eece
  4. 23 8月, 2014 1 次提交
  5. 07 8月, 2014 1 次提交
    • K
      list: fix order of arguments for hlist_add_after(_rcu) · 1d023284
      Ken Helias 提交于
      All other add functions for lists have the new item as first argument
      and the position where it is added as second argument.  This was changed
      for no good reason in this function and makes using it unnecessary
      confusing.
      
      The name was changed to hlist_add_behind() to cause unconverted code to
      generate a compile error instead of using the wrong parameter order.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: NKen Helias <kenhelias@firemail.de>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	[intel driver bits]
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      1d023284
  6. 09 7月, 2014 1 次提交
    • L
      bridge: export knowledge about the presence of IGMP/MLD queriers · c34963e2
      Linus Lüssing 提交于
      With this patch other modules are able to ask the bridge whether an
      IGMP or MLD querier exists on the according, bridged link layer.
      
      Multicast snooping can only be performed if a valid, selected querier
      exists on a link.
      
      Just like the bridge only enables its multicast snooping if a querier
      exists, e.g. batman-adv too can only activate its multicast
      snooping in bridged scenarios if a querier is present.
      
      For instance this export avoids having to reimplement IGMP/MLD
      querier message snooping and parsing in e.g. batman-adv, when
      multicast optimizations for bridged scenarios are added in the
      future.
      Signed-off-by: NLinus Lüssing <linus.luessing@web.de>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c34963e2
  7. 13 6月, 2014 2 次提交
  8. 11 6月, 2014 4 次提交
    • L
      bridge: memorize and export selected IGMP/MLD querier port · 2cd41431
      Linus Lüssing 提交于
      Adding bridge support to the batman-adv multicast optimization requires
      batman-adv knowing about the existence of bridged-in IGMP/MLD queriers
      to be able to reliably serve any multicast listener behind this same
      bridge.
      Signed-off-by: NLinus Lüssing <linus.luessing@web.de>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2cd41431
    • L
      bridge: add export of multicast database adjacent to net_dev · 07f8ac4a
      Linus Lüssing 提交于
      With this new, exported function br_multicast_list_adjacent(net_dev) a
      list of IPv4/6 addresses is returned. This list contains all multicast
      addresses sensed by the bridge multicast snooping feature on all bridge
      ports of the bridge interface of net_dev, excluding addresses from the
      specified net_device itself.
      
      Adding bridge support to the batman-adv multicast optimization requires
      batman-adv knowing about the existence of bridged-in multicast
      listeners to be able to reliably serve them with multicast packets.
      Signed-off-by: NLinus Lüssing <linus.luessing@web.de>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      07f8ac4a
    • L
      bridge: adhere to querier election mechanism specified by RFCs · dc4eb53a
      Linus Lüssing 提交于
      MLDv1 (RFC2710 section 6), MLDv2 (RFC3810 section 7.6.2), IGMPv2
      (RFC2236 section 3) and IGMPv3 (RFC3376 section 6.6.2) specify that the
      querier with lowest source address shall become the selected
      querier.
      
      So far the bridge stopped its querier as soon as it heard another
      querier regardless of its source address. This results in the "wrong"
      querier potentially becoming the active querier or a potential,
      unnecessary querying delay.
      
      With this patch the bridge memorizes the source address of the currently
      selected querier and ignores queries from queriers with a higher source
      address than the currently selected one. This slight optimization is
      supposed to make it more RFC compliant (but is rather uncritical and
      therefore probably not necessary to be queued for stable kernels).
      Signed-off-by: NLinus Lüssing <linus.luessing@web.de>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      dc4eb53a
    • L
      bridge: rename struct bridge_mcast_query/querier · 90010b36
      Linus Lüssing 提交于
      The current naming of these two structs is very random, in that
      reversing their naming would not make any semantical difference.
      
      This patch tries to make the naming less confusing by giving them a more
      specific, distinguishable naming.
      
      This is also useful for the upcoming patches reintroducing the
      "struct bridge_mcast_querier" but for storing information about the
      selected querier (no matter if our own or a foreign querier).
      Signed-off-by: NLinus Lüssing <linus.luessing@web.de>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      90010b36
  9. 12 3月, 2014 2 次提交
  10. 06 3月, 2014 1 次提交
  11. 25 2月, 2014 1 次提交
  12. 07 1月, 2014 1 次提交
    • C
      bridge: use spin_lock_bh() in br_multicast_set_hash_max · fe0d692b
      Curt Brune 提交于
      br_multicast_set_hash_max() is called from process context in
      net/bridge/br_sysfs_br.c by the sysfs store_hash_max() function.
      
      br_multicast_set_hash_max() calls spin_lock(&br->multicast_lock),
      which can deadlock the CPU if a softirq that also tries to take the
      same lock interrupts br_multicast_set_hash_max() while the lock is
      held .  This can happen quite easily when any of the bridge multicast
      timers expire, which try to take the same lock.
      
      The fix here is to use spin_lock_bh(), preventing other softirqs from
      executing on this CPU.
      
      Steps to reproduce:
      
      1. Create a bridge with several interfaces (I used 4).
      2. Set the "multicast query interval" to a low number, like 2.
      3. Enable the bridge as a multicast querier.
      4. Repeatedly set the bridge hash_max parameter via sysfs.
      
        # brctl addbr br0
        # brctl addif br0 eth1 eth2 eth3 eth4
        # brctl setmcqi br0 2
        # brctl setmcquerier br0 1
      
        # while true ; do echo 4096 > /sys/class/net/br0/bridge/hash_max; done
      Signed-off-by: NCurt Brune <curt@cumulusnetworks.com>
      Signed-off-by: NScott Feldman <sfeldma@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fe0d692b
  13. 30 10月, 2013 1 次提交
  14. 23 10月, 2013 1 次提交
    • L
      Revert "bridge: only expire the mdb entry when query is received" · 454594f3
      Linus Lüssing 提交于
      While this commit was a good attempt to fix issues occuring when no
      multicast querier is present, this commit still has two more issues:
      
      1) There are cases where mdb entries do not expire even if there is a
      querier present. The bridge will unnecessarily continue flooding
      multicast packets on the according ports.
      
      2) Never removing an mdb entry could be exploited for a Denial of
      Service by an attacker on the local link, slowly, but steadily eating up
      all memory.
      
      Actually, this commit became obsolete with
      "bridge: disable snooping if there is no querier" (b00589af)
      which included fixes for a few more cases.
      
      Therefore reverting the following commits (the commit stated in the
      commit message plus three of its follow up fixes):
      
      ====================
      Revert "bridge: update mdb expiration timer upon reports."
      This reverts commit f144febd.
      Revert "bridge: do not call setup_timer() multiple times"
      This reverts commit 1faabf2a.
      Revert "bridge: fix some kernel warning in multicast timer"
      This reverts commit c7e8e8a8.
      Revert "bridge: only expire the mdb entry when query is received"
      This reverts commit 9f00b2e7.
      ====================
      
      CC: Cong Wang <amwang@redhat.com>
      Signed-off-by: NLinus Lüssing <linus.luessing@web.de>
      Reviewed-by: NVlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      454594f3
  15. 11 10月, 2013 1 次提交
    • V
      bridge: update mdb expiration timer upon reports. · f144febd
      Vlad Yasevich 提交于
      commit 9f00b2e7
      	bridge: only expire the mdb entry when query is received
      changed the mdb expiration timer to be armed only when QUERY is
      received.  Howerver, this causes issues in an environment where
      the multicast server socket comes and goes very fast while a client
      is trying to send traffic to it.
      
      The root cause is a race where a sequence of LEAVE followed by REPORT
      messages can race against QUERY messages generated in response to LEAVE.
      The QUERY ends up starting the expiration timer, and that timer can
      potentially expire after the new REPORT message has been received signaling
      the new join operation.  This leads to a significant drop in multicast
      traffic and possible complete stall.
      
      The solution is to have REPORT messages update the expiration timer
      on entries that already exist.
      
      CC: Cong Wang <xiyou.wangcong@gmail.com>
      CC: Herbert Xu <herbert@gondor.apana.org.au>
      CC: Stephen Hemminger <stephen@networkplumber.org>
      Signed-off-by: NVlad Yasevich <vyasevic@redhat.com>
      Acked-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f144febd
  16. 03 10月, 2013 1 次提交
  17. 06 9月, 2013 2 次提交
  18. 05 9月, 2013 1 次提交
  19. 31 8月, 2013 2 次提交
    • D
      net: bridge: convert MLDv2 Query MRC into msecs_to_jiffies for max_delay · 2d98c29b
      Daniel Borkmann 提交于
      While looking into MLDv1/v2 code, I noticed that bridging code does
      not convert it's max delay into jiffies for MLDv2 messages as we do
      in core IPv6' multicast code.
      
      RFC3810, 5.1.3. Maximum Response Code says:
      
        The Maximum Response Code field specifies the maximum time allowed
        before sending a responding Report. The actual time allowed, called
        the Maximum Response Delay, is represented in units of milliseconds,
        and is derived from the Maximum Response Code as follows: [...]
      
      As we update timers that work with jiffies, we need to convert it.
      Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
      Cc: Linus Lüssing <linus.luessing@web.de>
      Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2d98c29b
    • L
      bridge: separate querier and query timer into IGMP/IPv4 and MLD/IPv6 ones · cc0fdd80
      Linus Lüssing 提交于
      Currently we would still potentially suffer multicast packet loss if there
      is just either an IGMP or an MLD querier: For the former case, we would
      possibly drop IPv6 multicast packets, for the latter IPv4 ones. This is
      because we are currently assuming that if either an IGMP or MLD querier
      is present that the other one is present, too.
      
      This patch makes the behaviour and fix added in
      "bridge: disable snooping if there is no querier" (b00589af)
      to also work if there is either just an IGMP or an MLD querier on the
      link: It refines the deactivation of the snooping to be protocol
      specific by using separate timers for the snooped IGMP and MLD queries
      as well as separate timers for our internal IGMP and MLD queriers.
      Signed-off-by: NLinus Lüssing <linus.luessing@web.de>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      cc0fdd80
  20. 06 8月, 2013 1 次提交
  21. 01 8月, 2013 1 次提交
    • L
      bridge: disable snooping if there is no querier · b00589af
      Linus Lüssing 提交于
      If there is no querier on a link then we won't get periodic reports and
      therefore won't be able to learn about multicast listeners behind ports,
      potentially leading to lost multicast packets, especially for multicast
      listeners that joined before the creation of the bridge.
      
      These lost multicast packets can appear since c5c23260
      ("bridge: Add multicast_querier toggle and disable queries by default")
      in particular.
      
      With this patch we are flooding multicast packets if our querier is
      disabled and if we didn't detect any other querier.
      
      A grace period of the Maximum Response Delay of the querier is added to
      give multicast responses enough time to arrive and to be learned from
      before disabling the flooding behaviour again.
      Signed-off-by: NLinus Lüssing <linus.luessing@web.de>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b00589af
  22. 20 7月, 2013 1 次提交
  23. 07 7月, 2013 1 次提交
    • C
      bridge: fix some kernel warning in multicast timer · c7e8e8a8
      Cong Wang 提交于
      Several people reported the warning: "kernel BUG at kernel/timer.c:729!"
      and the stack trace is:
      
      	#7 [ffff880214d25c10] mod_timer+501 at ffffffff8106d905
      	#8 [ffff880214d25c50] br_multicast_del_pg.isra.20+261 at ffffffffa0731d25 [bridge]
      	#9 [ffff880214d25c80] br_multicast_disable_port+88 at ffffffffa0732948 [bridge]
      	#10 [ffff880214d25cb0] br_stp_disable_port+154 at ffffffffa072bcca [bridge]
      	#11 [ffff880214d25ce8] br_device_event+520 at ffffffffa072a4e8 [bridge]
      	#12 [ffff880214d25d18] notifier_call_chain+76 at ffffffff8164aafc
      	#13 [ffff880214d25d50] raw_notifier_call_chain+22 at ffffffff810858f6
      	#14 [ffff880214d25d60] call_netdevice_notifiers+45 at ffffffff81536aad
      	#15 [ffff880214d25d80] dev_close_many+183 at ffffffff81536d17
      	#16 [ffff880214d25dc0] rollback_registered_many+168 at ffffffff81537f68
      	#17 [ffff880214d25de8] rollback_registered+49 at ffffffff81538101
      	#18 [ffff880214d25e10] unregister_netdevice_queue+72 at ffffffff815390d8
      	#19 [ffff880214d25e30] __tun_detach+272 at ffffffffa074c2f0 [tun]
      	#20 [ffff880214d25e88] tun_chr_close+45 at ffffffffa074c4bd [tun]
      	#21 [ffff880214d25ea8] __fput+225 at ffffffff8119b1f1
      	#22 [ffff880214d25ef0] ____fput+14 at ffffffff8119b3fe
      	#23 [ffff880214d25f00] task_work_run+159 at ffffffff8107cf7f
      	#24 [ffff880214d25f30] do_notify_resume+97 at ffffffff810139e1
      	#25 [ffff880214d25f50] int_signal+18 at ffffffff8164f292
      
      this is due to I forgot to check if mp->timer is armed in
      br_multicast_del_pg(). This bug is introduced by
      commit 9f00b2e7 (bridge: only expire the mdb entry
      when query is received).
      
      Same for __br_mdb_del().
      Tested-by: Npoma <pomidorabelisima@gmail.com>
      Reported-by: NLiYonghua <809674045@qq.com>
      Reported-by: NRobert Hancock <hancockrwd@gmail.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Stephen Hemminger <stephen@networkplumber.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: NCong Wang <amwang@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c7e8e8a8
  24. 24 6月, 2013 1 次提交
  25. 18 6月, 2013 1 次提交
  26. 23 5月, 2013 3 次提交
  27. 08 3月, 2013 2 次提交
  28. 28 2月, 2013 1 次提交
    • S
      hlist: drop the node parameter from iterators · b67bfe0d
      Sasha Levin 提交于
      I'm not sure why, but the hlist for each entry iterators were conceived
      
              list_for_each_entry(pos, head, member)
      
      The hlist ones were greedy and wanted an extra parameter:
      
              hlist_for_each_entry(tpos, pos, head, member)
      
      Why did they need an extra pos parameter? I'm not quite sure. Not only
      they don't really need it, it also prevents the iterator from looking
      exactly like the list iterator, which is unfortunate.
      
      Besides the semantic patch, there was some manual work required:
      
       - Fix up the actual hlist iterators in linux/list.h
       - Fix up the declaration of other iterators based on the hlist ones.
       - A very small amount of places were using the 'node' parameter, this
       was modified to use 'obj->member' instead.
       - Coccinelle didn't handle the hlist_for_each_entry_safe iterator
       properly, so those had to be fixed up manually.
      
      The semantic patch which is mostly the work of Peter Senna Tschudin is here:
      
      @@
      iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;
      
      type T;
      expression a,c,d,e;
      identifier b;
      statement S;
      @@
      
      -T b;
          <+... when != b
      (
      hlist_for_each_entry(a,
      - b,
      c, d) S
      |
      hlist_for_each_entry_continue(a,
      - b,
      c) S
      |
      hlist_for_each_entry_from(a,
      - b,
      c) S
      |
      hlist_for_each_entry_rcu(a,
      - b,
      c, d) S
      |
      hlist_for_each_entry_rcu_bh(a,
      - b,
      c, d) S
      |
      hlist_for_each_entry_continue_rcu_bh(a,
      - b,
      c) S
      |
      for_each_busy_worker(a, c,
      - b,
      d) S
      |
      ax25_uid_for_each(a,
      - b,
      c) S
      |
      ax25_for_each(a,
      - b,
      c) S
      |
      inet_bind_bucket_for_each(a,
      - b,
      c) S
      |
      sctp_for_each_hentry(a,
      - b,
      c) S
      |
      sk_for_each(a,
      - b,
      c) S
      |
      sk_for_each_rcu(a,
      - b,
      c) S
      |
      sk_for_each_from
      -(a, b)
      +(a)
      S
      + sk_for_each_from(a) S
      |
      sk_for_each_safe(a,
      - b,
      c, d) S
      |
      sk_for_each_bound(a,
      - b,
      c) S
      |
      hlist_for_each_entry_safe(a,
      - b,
      c, d, e) S
      |
      hlist_for_each_entry_continue_rcu(a,
      - b,
      c) S
      |
      nr_neigh_for_each(a,
      - b,
      c) S
      |
      nr_neigh_for_each_safe(a,
      - b,
      c, d) S
      |
      nr_node_for_each(a,
      - b,
      c) S
      |
      nr_node_for_each_safe(a,
      - b,
      c, d) S
      |
      - for_each_gfn_sp(a, c, d, b) S
      + for_each_gfn_sp(a, c, d) S
      |
      - for_each_gfn_indirect_valid_sp(a, c, d, b) S
      + for_each_gfn_indirect_valid_sp(a, c, d) S
      |
      for_each_host(a,
      - b,
      c) S
      |
      for_each_host_safe(a,
      - b,
      c, d) S
      |
      for_each_mesh_entry(a,
      - b,
      c, d) S
      )
          ...+>
      
      [akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
      [akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
      [akpm@linux-foundation.org: checkpatch fixes]
      [akpm@linux-foundation.org: fix warnings]
      [akpm@linux-foudnation.org: redo intrusive kvm changes]
      Tested-by: NPeter Senna Tschudin <peter.senna@gmail.com>
      Acked-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: NSasha Levin <sasha.levin@oracle.com>
      Cc: Wu Fengguang <fengguang.wu@intel.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Gleb Natapov <gleb@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b67bfe0d
  29. 14 2月, 2013 1 次提交
  30. 03 1月, 2013 1 次提交