1. 19 12月, 2015 2 次提交
    • B
      ipv6: addrconf: use stable address generator for ARPHRD_NONE · cc9da6cc
      Bjørn Mork 提交于
      Add a new address generator mode, using the stable address generator
      with an automatically generated secret. This is intended as a default
      address generator mode for device types with no EUI64 implementation.
      The new generator is used for ARPHRD_NONE interfaces initially, adding
      default IPv6 autoconf support to e.g. tun interfaces.
      
      If the addrgenmode is set to 'random', either by default or manually,
      and no stable secret is available, then a random secret is used as
      input for the stable-privacy address generator.  The secret can be
      read and modified like manually configured secrets, using the proc
      interface.  Modifying the secret will change the addrgen mode to
      'stable-privacy' to indicate that it operates on a known secret.
      
      Existing behaviour of the 'stable-privacy' mode is kept unchanged. If
      a known secret is available when the device is created, then the mode
      will default to 'stable-privacy' as before.  The mode can be manually
      set to 'random' but it will behave exactly like 'stable-privacy' in
      this case. The secret will not change.
      
      Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Cc: 吉藤英明 <hideaki.yoshifuji@miraclelinux.com>
      Signed-off-by: NBjørn Mork <bjorn@mork.no>
      Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      cc9da6cc
    • A
      ila: add NETFILTER dependency · 8cb964da
      Arnd Bergmann 提交于
      The recently added generic ILA translation facility fails to
      build when CONFIG_NETFILTER is disabled:
      
      net/ipv6/ila/ila_xlat.c:229:20: warning: 'struct nf_hook_state' declared inside parameter list
      net/ipv6/ila/ila_xlat.c:235:27: error: array type has incomplete element type 'struct nf_hook_ops'
       static struct nf_hook_ops ila_nf_hook_ops[] __read_mostly = {
      
      This adds an explicit Kconfig dependency to avoid that case.
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Fixes: 7f00feaf ("ila: Add generic ILA translation facility")
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8cb964da
  2. 18 12月, 2015 2 次提交
  3. 16 12月, 2015 5 次提交
  4. 15 12月, 2015 3 次提交
    • E
      net: fix IP early demux races · 5037e9ef
      Eric Dumazet 提交于
      David Wilder reported crashes caused by dst reuse.
      
      <quote David>
        I am seeing a crash on a distro V4.2.3 kernel caused by a double
        release of a dst_entry.  In ipv4_dst_destroy() the call to
        list_empty() finds a poisoned next pointer, indicating the dst_entry
        has already been removed from the list and freed. The crash occurs
        18 to 24 hours into a run of a network stress exerciser.
      </quote>
      
      Thanks to his detailed report and analysis, we were able to understand
      the core issue.
      
      IP early demux can associate a dst to skb, after a lookup in TCP/UDP
      sockets.
      
      When socket cache is not properly set, we want to store into
      sk->sk_dst_cache the dst for future IP early demux lookups,
      by acquiring a stable refcount on the dst.
      
      Problem is this acquisition is simply using an atomic_inc(),
      which works well, unless the dst was queued for destruction from
      dst_release() noticing dst refcount went to zero, if DST_NOCACHE
      was set on dst.
      
      We need to make sure current refcount is not zero before incrementing
      it, or risk double free as David reported.
      
      This patch, being a stable candidate, adds two new helpers, and use
      them only from IP early demux problematic paths.
      
      It might be possible to merge in net-next skb_dst_force() and
      skb_dst_force_safe(), but I prefer having the smallest patch for stable
      kernels : Maybe some skb_dst_force() callers do not expect skb->dst
      can suddenly be cleared.
      
      Can probably be backported back to linux-3.6 kernels
      Reported-by: NDavid J. Wilder <dwilder@us.ibm.com>
      Tested-by: NDavid J. Wilder <dwilder@us.ibm.com>
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5037e9ef
    • A
      ipv6: addrconf: drop ieee802154 specific things · 5241c2d7
      Alexander Aring 提交于
      This patch removes ARPHRD_IEEE802154 from addrconf handling. In the
      earlier days of 802.15.4 6LoWPAN, the interface type was ARPHRD_IEEE802154
      which introduced several issues, because 802.15.4 interfaces used the
      same type.
      
      Since commit 965e613d ("ieee802154: 6lowpan: fix ARPHRD to
      ARPHRD_6LOWPAN") we use ARPHRD_6LOWPAN for 6LoWPAN interfaces. This
      patch will remove ARPHRD_IEEE802154 which is currently deadcode, because
      ARPHRD_IEEE802154 doesn't reach the minimum 1280 MTU of IPv6.
      
      Also we use 6LoWPAN EUI64 specific defines instead using link-layer
      constanst from 802.15.4 link-layer header.
      
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
      Cc: James Morris <jmorris@namei.org>
      Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
      Cc: Patrick McHardy <kaber@trash.net>
      Signed-off-by: NAlexander Aring <alex.aring@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5241c2d7
    • H
      net: add validation for the socket syscall protocol argument · 79462ad0
      Hannes Frederic Sowa 提交于
      郭永刚 reported that one could simply crash the kernel as root by
      using a simple program:
      
      	int socket_fd;
      	struct sockaddr_in addr;
      	addr.sin_port = 0;
      	addr.sin_addr.s_addr = INADDR_ANY;
      	addr.sin_family = 10;
      
      	socket_fd = socket(10,3,0x40000000);
      	connect(socket_fd , &addr,16);
      
      AF_INET, AF_INET6 sockets actually only support 8-bit protocol
      identifiers. inet_sock's skc_protocol field thus is sized accordingly,
      thus larger protocol identifiers simply cut off the higher bits and
      store a zero in the protocol fields.
      
      This could lead to e.g. NULL function pointer because as a result of
      the cut off inet_num is zero and we call down to inet_autobind, which
      is NULL for raw sockets.
      
      kernel: Call Trace:
      kernel:  [<ffffffff816db90e>] ? inet_autobind+0x2e/0x70
      kernel:  [<ffffffff816db9a4>] inet_dgram_connect+0x54/0x80
      kernel:  [<ffffffff81645069>] SYSC_connect+0xd9/0x110
      kernel:  [<ffffffff810ac51b>] ? ptrace_notify+0x5b/0x80
      kernel:  [<ffffffff810236d8>] ? syscall_trace_enter_phase2+0x108/0x200
      kernel:  [<ffffffff81645e0e>] SyS_connect+0xe/0x10
      kernel:  [<ffffffff81779515>] tracesys_phase2+0x84/0x89
      
      I found no particular commit which introduced this problem.
      
      CVE: CVE-2015-8543
      Cc: Cong Wang <cwang@twopensource.com>
      Reported-by: N郭永刚 <guoyonggang@360.cn>
      Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      79462ad0
  5. 11 12月, 2015 1 次提交
  6. 06 12月, 2015 2 次提交
  7. 05 12月, 2015 1 次提交
  8. 04 12月, 2015 2 次提交
    • P
      net: ipv6: restrict hop_limit sysctl setting to range [1; 255] · d6df198d
      Phil Sutter 提交于
      Setting a value bigger than 255 resulted in using only the lower eight
      bits of that value as it is assigned to the u8 header field. To avoid
      this unexpected result, reject such values.
      
      Setting a value of zero is technically possible, but hosts receiving
      such a packet have to treat it like hop_limit was set to one, according
      to RFC2460. Therefore I don't see a use-case for that.
      
      Setting a route's hop_limit to zero in iproute2 means to use the sysctl
      default, which is not the case here: Setting e.g.
      net.conf.eth0.hop_limit=0 will not make the kernel use
      net.conf.all.hop_limit for outgoing packets on eth0. To avoid these
      kinds of confusion, reject zero.
      Signed-off-by: NPhil Sutter <phil@nwl.cc>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d6df198d
    • E
      ipv6: kill sk_dst_lock · 6bd4f355
      Eric Dumazet 提交于
      While testing the np->opt RCU conversion, I found that UDP/IPv6 was
      using a mixture of xchg() and sk_dst_lock to protect concurrent changes
      to sk->sk_dst_cache, leading to possible corruptions and crashes.
      
      ip6_sk_dst_lookup_flow() uses sk_dst_check() anyway, so the simplest
      way to fix the mess is to remove sk_dst_lock completely, as we did for
      IPv4.
      
      __ip6_dst_store() and ip6_dst_store() share same implementation.
      
      sk_setup_caps() being called with socket lock being held or not,
      we have to use sk_dst_set() instead of __sk_dst_set()
      
      Note that I had to move the "np->dst_cookie = rt6_get_cookie(rt);"
      in ip6_dst_store() before the sk_setup_caps(sk, dst) call.
      
      This is because ip6_dst_store() can be called from process context,
      without any lock held.
      
      As soon as the dst is installed in sk->sk_dst_cache, dst can be freed
      from another cpu doing a concurrent ip6_dst_store()
      
      Doing the dst dereference before doing the install is needed to make
      sure no use after free would trigger.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: NDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6bd4f355
  9. 03 12月, 2015 2 次提交
  10. 02 12月, 2015 1 次提交
  11. 01 12月, 2015 1 次提交
  12. 25 11月, 2015 2 次提交
  13. 23 11月, 2015 2 次提交
  14. 19 11月, 2015 1 次提交
  15. 17 11月, 2015 1 次提交
  16. 16 11月, 2015 4 次提交
    • E
      tcp: ensure proper barriers in lockless contexts · 00fd38d9
      Eric Dumazet 提交于
      Some functions access TCP sockets without holding a lock and
      might output non consistent data, depending on compiler and or
      architecture.
      
      tcp_diag_get_info(), tcp_get_info(), tcp_poll(), get_tcp4_sock() ...
      
      Introduce sk_state_load() and sk_state_store() to fix the issues,
      and more clearly document where this lack of locking is happening.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      00fd38d9
    • M
      ipv6: Check rt->dst.from for the DST_NOCACHE route · 02bcf4e0
      Martin KaFai Lau 提交于
      All DST_NOCACHE rt6_info used to have rt->dst.from set to
      its parent.
      
      After commit 8e3d5be7 ("ipv6: Avoid double dst_free"),
      DST_NOCACHE is also set to rt6_info which does not have
      a parent (i.e. rt->dst.from is NULL).
      
      This patch catches the rt->dst.from == NULL case.
      
      Fixes: 8e3d5be7 ("ipv6: Avoid double dst_free")
      Signed-off-by: NMartin KaFai Lau <kafai@fb.com>
      Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      02bcf4e0
    • M
      ipv6: Check expire on DST_NOCACHE route · 5973fb1e
      Martin KaFai Lau 提交于
      Since the expires of the DST_NOCACHE rt can be set during
      the ip6_rt_update_pmtu(), we also need to consider the expires
      value when doing ip6_dst_check().
      
      This patches creates __rt6_check_expired() to only
      check the expire value (if one exists) of the current rt.
      
      In rt6_dst_from_check(), it adds __rt6_check_expired() as
      one of the condition check.
      Signed-off-by: NMartin KaFai Lau <kafai@fb.com>
      Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5973fb1e
    • M
      ipv6: Avoid creating RTF_CACHE from a rt that is not managed by fib6 tree · 0d3f6d29
      Martin KaFai Lau 提交于
      The original bug report:
      https://bugzilla.redhat.com/show_bug.cgi?id=1272571
      
      The setup has a IPv4 GRE tunnel running in a IPSec.  The bug
      happens when ndisc starts sending router solicitation at the gre
      interface.  The simplified oops stack is like:
      
      __lock_acquire+0x1b2/0x1c30
      lock_acquire+0xb9/0x140
      _raw_write_lock_bh+0x3f/0x50
      __ip6_ins_rt+0x2e/0x60
      ip6_ins_rt+0x49/0x50
      ~~~~~~~~
      __ip6_rt_update_pmtu.part.54+0x145/0x250
      ip6_rt_update_pmtu+0x2e/0x40
      ~~~~~~~~
      ip_tunnel_xmit+0x1f1/0xf40
      __gre_xmit+0x7a/0x90
      ipgre_xmit+0x15a/0x220
      dev_hard_start_xmit+0x2bd/0x480
      __dev_queue_xmit+0x696/0x730
      dev_queue_xmit+0x10/0x20
      neigh_direct_output+0x11/0x20
      ip6_finish_output2+0x21f/0x770
      ip6_finish_output+0xa7/0x1d0
      ip6_output+0x56/0x190
      ~~~~~~~~
      ndisc_send_skb+0x1d9/0x400
      ndisc_send_rs+0x88/0xc0
      ~~~~~~~~
      
      The rt passed to ip6_rt_update_pmtu() is created by
      icmp6_dst_alloc() and it is not managed by the fib6 tree,
      so its rt6i_table == NULL.  When __ip6_rt_update_pmtu() creates
      a RTF_CACHE clone, the newly created clone also has rt6i_table == NULL
      and it causes the ip6_ins_rt() oops.
      
      During pmtu update, we only want to create a RTF_CACHE clone
      from a rt which is currently managed (or owned) by the
      fib6 tree.  It means either rt->rt6i_node != NULL or
      rt is a RTF_PCPU clone.
      
      It is worth to note that rt6i_table may not be NULL even it is
      not (yet) managed by the fib6 tree (e.g. addrconf_dst_alloc()).
      Hence, rt6i_node is a better check instead of rt6i_table.
      
      Fixes: 45e4fd26 ("ipv6: Only create RTF_CACHE routes after encountering pmtu")
      Signed-off-by: NMartin KaFai Lau <kafai@fb.com>
      Reported-by: NChris Siebenmann <cks-rhbugzilla@cs.toronto.edu>
      Cc: Chris Siebenmann <cks-rhbugzilla@cs.toronto.edu>
      Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0d3f6d29
  17. 06 11月, 2015 2 次提交
  18. 05 11月, 2015 1 次提交
  19. 03 11月, 2015 5 次提交