1. 16 8月, 2017 1 次提交
    • E
      ipv6: fix NULL dereference in ip6_route_dev_notify() · 12d94a80
      Eric Dumazet 提交于
      Based on a syzkaller report [1], I found that a per cpu allocation
      failure in snmp6_alloc_dev() would then lead to NULL dereference in
      ip6_route_dev_notify().
      
      It seems this is a very old bug, thus no Fixes tag in this submission.
      
      Let's add in6_dev_put_clear() helper, as we will probably use
      it elsewhere (once available/present in net-next)
      
      [1]
      kasan: CONFIG_KASAN_INLINE enabled
      kasan: GPF could be caused by NULL-ptr deref or user memory access
      general protection fault: 0000 [#1] SMP KASAN
      Dumping ftrace buffer:
         (ftrace buffer empty)
      Modules linked in:
      CPU: 1 PID: 17294 Comm: syz-executor6 Not tainted 4.13.0-rc2+ #10
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      task: ffff88019f456680 task.stack: ffff8801c6e58000
      RIP: 0010:__read_once_size include/linux/compiler.h:250 [inline]
      RIP: 0010:atomic_read arch/x86/include/asm/atomic.h:26 [inline]
      RIP: 0010:refcount_sub_and_test+0x7d/0x1b0 lib/refcount.c:178
      RSP: 0018:ffff8801c6e5f1b0 EFLAGS: 00010202
      RAX: 0000000000000037 RBX: dffffc0000000000 RCX: ffffc90005d25000
      RDX: ffff8801c6e5f218 RSI: ffffffff82342bbf RDI: 0000000000000001
      RBP: ffff8801c6e5f240 R08: 0000000000000001 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000000 R12: 1ffff10038dcbe37
      R13: 0000000000000006 R14: 0000000000000001 R15: 00000000000001b8
      FS:  00007f21e0429700(0000) GS:ffff8801dc100000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000001ddbc22000 CR3: 00000001d632b000 CR4: 00000000001426e0
      DR0: 0000000020000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
      Call Trace:
       refcount_dec_and_test+0x1a/0x20 lib/refcount.c:211
       in6_dev_put include/net/addrconf.h:335 [inline]
       ip6_route_dev_notify+0x1c9/0x4a0 net/ipv6/route.c:3732
       notifier_call_chain+0x136/0x2c0 kernel/notifier.c:93
       __raw_notifier_call_chain kernel/notifier.c:394 [inline]
       raw_notifier_call_chain+0x2d/0x40 kernel/notifier.c:401
       call_netdevice_notifiers_info+0x51/0x90 net/core/dev.c:1678
       call_netdevice_notifiers net/core/dev.c:1694 [inline]
       rollback_registered_many+0x91c/0xe80 net/core/dev.c:7107
       rollback_registered+0x1be/0x3c0 net/core/dev.c:7149
       register_netdevice+0xbcd/0xee0 net/core/dev.c:7587
       register_netdev+0x1a/0x30 net/core/dev.c:7669
       loopback_net_init+0x76/0x160 drivers/net/loopback.c:214
       ops_init+0x10a/0x570 net/core/net_namespace.c:118
       setup_net+0x313/0x710 net/core/net_namespace.c:294
       copy_net_ns+0x27c/0x580 net/core/net_namespace.c:418
       create_new_namespaces+0x425/0x880 kernel/nsproxy.c:107
       unshare_nsproxy_namespaces+0xae/0x1e0 kernel/nsproxy.c:206
       SYSC_unshare kernel/fork.c:2347 [inline]
       SyS_unshare+0x653/0xfa0 kernel/fork.c:2297
       entry_SYSCALL_64_fastpath+0x1f/0xbe
      RIP: 0033:0x4512c9
      RSP: 002b:00007f21e0428c08 EFLAGS: 00000216 ORIG_RAX: 0000000000000110
      RAX: ffffffffffffffda RBX: 0000000000718150 RCX: 00000000004512c9
      RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000062020200
      RBP: 0000000000000086 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000216 R12: 00000000004b973d
      R13: 00000000ffffffff R14: 000000002001d000 R15: 00000000000002dd
      Code: 50 2b 34 82 c7 00 f1 f1 f1 f1 c7 40 04 04 f2 f2 f2 c7 40 08 f3 f3
      f3 f3 e8 a1 43 39 ff 4c 89 f8 48 8b 95 70 ff ff ff 48 c1 e8 03 <0f> b6
      0c 18 4c 89 f8 83 e0 07 83 c0 03 38 c8 7c 08 84 c9 0f 85
      RIP: __read_once_size include/linux/compiler.h:250 [inline] RSP:
      ffff8801c6e5f1b0
      RIP: atomic_read arch/x86/include/asm/atomic.h:26 [inline] RSP:
      ffff8801c6e5f1b0
      RIP: refcount_sub_and_test+0x7d/0x1b0 lib/refcount.c:178 RSP:
      ffff8801c6e5f1b0
      ---[ end trace e441d046c6410d31 ]---
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: NDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      12d94a80
  2. 04 7月, 2017 2 次提交
  3. 10 6月, 2017 1 次提交
    • K
      Ipvlan should return an error when an address is already in use. · 3ad7d246
      Krister Johansen 提交于
      The ipvlan code already knows how to detect when a duplicate address is
      about to be assigned to an ipvlan device.  However, that failure is not
      propogated outward and leads to a silent failure.
      
      Introduce a validation step at ip address creation time and allow device
      drivers to register to validate the incoming ip addresses.  The ipvlan
      code is the first consumer.  If it detects an address in use, we can
      return an error to the user before beginning to commit the new ifa in
      the networking code.
      
      This can be especially useful if it is necessary to provision many
      ipvlans in containers.  The provisioning software (or operator) can use
      this to detect situations where an ip address is unexpectedly in use.
      Signed-off-by: NKrister Johansen <kjlx@templeofstupid.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3ad7d246
  4. 09 5月, 2017 1 次提交
    • W
      ipv6: reorder ip6_route_dev_notifier after ipv6_dev_notf · 242d3a49
      WANG Cong 提交于
      For each netns (except init_net), we initialize its null entry
      in 3 places:
      
      1) The template itself, as we use kmemdup()
      2) Code around dst_init_metrics() in ip6_route_net_init()
      3) ip6_route_dev_notify(), which is supposed to initialize it after
         loopback registers
      
      Unfortunately the last one still happens in a wrong order because
      we expect to initialize net->ipv6.ip6_null_entry->rt6i_idev to
      net->loopback_dev's idev, thus we have to do that after we add
      idev to loopback. However, this notifier has priority == 0 same as
      ipv6_dev_notf, and ipv6_dev_notf is registered after
      ip6_route_dev_notifier so it is called actually after
      ip6_route_dev_notifier. This is similar to commit 2f460933
      ("ipv6: initialize route null entry in addrconf_init()") which
      fixes init_net.
      
      Fix it by picking a smaller priority for ip6_route_dev_notifier.
      Also, we have to release the refcnt accordingly when unregistering
      loopback_dev because device exit functions are called before subsys
      exit functions.
      Acked-by: NDavid Ahern <dsahern@gmail.com>
      Tested-by: NDavid Ahern <dsahern@gmail.com>
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      242d3a49
  5. 26 4月, 2017 1 次提交
  6. 29 3月, 2017 1 次提交
  7. 19 1月, 2017 1 次提交
  8. 21 10月, 2016 1 次提交
  9. 30 9月, 2016 1 次提交
  10. 16 6月, 2016 2 次提交
  11. 11 2月, 2016 1 次提交
  12. 05 1月, 2016 1 次提交
    • C
      soreuseport: fast reuseport UDP socket selection · e32ea7e7
      Craig Gallek 提交于
      Include a struct sock_reuseport instance when a UDP socket binds to
      a specific address for the first time with the reuseport flag set.
      When selecting a socket for an incoming UDP packet, use the information
      available in sock_reuseport if present.
      
      This required adding an additional field to the UDP source address
      equality function to differentiate between exact and wildcard matches.
      The original use case allowed wildcard matches when checking for
      existing port uses during bind.  The new use case of adding a socket
      to a reuseport group requires exact address matching.
      
      Performance test (using a machine with 2 CPU sockets and a total of
      48 cores):  Create reuseport groups of varying size.  Use one socket
      from this group per user thread (pinning each thread to a different
      core) calling recvmmsg in a tight loop.  Record number of messages
      received per second while saturating a 10G link.
        10 sockets: 18% increase (~2.8M -> 3.3M pkts/s)
        20 sockets: 14% increase (~2.9M -> 3.3M pkts/s)
        40 sockets: 13% increase (~3.0M -> 3.4M pkts/s)
      
      This work is based off a similar implementation written by
      Ying Cai <ycai@google.com> for implementing policy-based reuseport
      selection.
      Signed-off-by: NCraig Gallek <kraig@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e32ea7e7
  13. 25 9月, 2015 1 次提交
  14. 31 8月, 2015 1 次提交
  15. 01 8月, 2015 1 次提交
    • R
      ipv6: change ipv6_stub_impl.ipv6_dst_lookup to take net argument · 343d60aa
      Roopa Prabhu 提交于
      This patch adds net argument to ipv6_stub_impl.ipv6_dst_lookup
      for use cases where sk is not available (like mpls).
      sk appears to be needed to get the namespace 'net' and is optional
      otherwise. This patch series changes ipv6_stub_impl.ipv6_dst_lookup
      to take net argument. sk remains optional.
      
      All callers of ipv6_stub_impl.ipv6_dst_lookup have been modified
      to pass net. I have modified them to use already available
      'net' in the scope of the call. I can change them to
      sock_net(sk) to avoid any unintended change in behaviour if sock
      namespace is different. They dont seem to be from code inspection.
      Signed-off-by: NRoopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      343d60aa
  16. 05 5月, 2015 1 次提交
    • L
      net: Export IGMP/MLD message validation code · 9afd85c9
      Linus Lüssing 提交于
      With this patch, the IGMP and MLD message validation functions are moved
      from the bridge code to IPv4/IPv6 multicast files. Some small
      refactoring was done to enhance readibility and to iron out some
      differences in behaviour between the IGMP and MLD parsing code (e.g. the
      skb-cloning of MLD messages is now only done if necessary, just like the
      IGMP part always did).
      
      Finally, these IGMP and MLD message validation functions are exported so
      that not only the bridge can use it but batman-adv later, too.
      Signed-off-by: NLinus Lüssing <linus.luessing@c0d3.blue>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9afd85c9
  17. 06 2月, 2015 1 次提交
    • E
      net: ipv6: allow explicitly choosing optimistic addresses · c58da4c6
      Erik Kline 提交于
      RFC 4429 ("Optimistic DAD") states that optimistic addresses
      should be treated as deprecated addresses.  From section 2.1:
      
         Unless noted otherwise, components of the IPv6 protocol stack
         should treat addresses in the Optimistic state equivalently to
         those in the Deprecated state, indicating that the address is
         available for use but should not be used if another suitable
         address is available.
      
      Optimistic addresses are indeed avoided when other addresses are
      available (i.e. at source address selection time), but they have
      not heretofore been available for things like explicit bind() and
      sendmsg() with struct in6_pktinfo, etc.
      
      This change makes optimistic addresses treated more like
      deprecated addresses than tentative ones.
      Signed-off-by: NErik Kline <ek@google.com>
      Acked-by: NLorenzo Colitti <lorenzo@google.com>
      Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c58da4c6
  18. 14 9月, 2014 1 次提交
  19. 13 9月, 2014 1 次提交
  20. 01 5月, 2014 1 次提交
  21. 28 2月, 2014 1 次提交
    • B
      ipv6: addrconf: silence sparse endianness warnings · bc861959
      Bjørn Mork 提交于
      Avoid the following sparse __CHECK_ENDIAN__ warnings:
      
       include/net/addrconf.h:318:25: warning: restricted __be64 degrades to integer
       include/net/addrconf.h:318:70: warning: restricted __be64 degrades to integer
       include/net/addrconf.h:330:25: warning: restricted __be64 degrades to integer
       include/net/addrconf.h:330:70: warning: restricted __be64 degrades to integer
       include/net/addrconf.h:347:25: warning: restricted __be64 degrades to integer
       include/net/addrconf.h:348:26: warning: restricted __be64 degrades to integer
       include/net/addrconf.h:349:18: warning: restricted __be64 degrades to integer
      
      The warnings are false but they make it harder to spot real
      bugs.
      Signed-off-by: NBjørn Mork <bjorn@mork.no>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      bc861959
  22. 23 1月, 2014 1 次提交
  23. 10 12月, 2013 1 次提交
  24. 07 12月, 2013 1 次提交
  25. 29 9月, 2013 1 次提交
  26. 01 9月, 2013 3 次提交
  27. 01 8月, 2013 1 次提交
  28. 02 7月, 2013 1 次提交
    • A
      ipv6,mcast: always hold idev->lock before mca_lock · 8965779d
      Amerigo Wang 提交于
      dingtianhong reported the following deadlock detected by lockdep:
      
       ======================================================
       [ INFO: possible circular locking dependency detected ]
       3.4.24.05-0.1-default #1 Not tainted
       -------------------------------------------------------
       ksoftirqd/0/3 is trying to acquire lock:
        (&ndev->lock){+.+...}, at: [<ffffffff8147f804>] ipv6_get_lladdr+0x74/0x120
      
       but task is already holding lock:
        (&mc->mca_lock){+.+...}, at: [<ffffffff8149d130>] mld_send_report+0x40/0x150
      
       which lock already depends on the new lock.
      
       the existing dependency chain (in reverse order) is:
      
       -> #1 (&mc->mca_lock){+.+...}:
              [<ffffffff810a8027>] validate_chain+0x637/0x730
              [<ffffffff810a8417>] __lock_acquire+0x2f7/0x500
              [<ffffffff810a8734>] lock_acquire+0x114/0x150
              [<ffffffff814f691a>] rt_spin_lock+0x4a/0x60
              [<ffffffff8149e4bb>] igmp6_group_added+0x3b/0x120
              [<ffffffff8149e5d8>] ipv6_mc_up+0x38/0x60
              [<ffffffff81480a4d>] ipv6_find_idev+0x3d/0x80
              [<ffffffff81483175>] addrconf_notify+0x3d5/0x4b0
              [<ffffffff814fae3f>] notifier_call_chain+0x3f/0x80
              [<ffffffff81073471>] raw_notifier_call_chain+0x11/0x20
              [<ffffffff813d8722>] call_netdevice_notifiers+0x32/0x60
              [<ffffffff813d92d4>] __dev_notify_flags+0x34/0x80
              [<ffffffff813d9360>] dev_change_flags+0x40/0x70
              [<ffffffff813ea627>] do_setlink+0x237/0x8a0
              [<ffffffff813ebb6c>] rtnl_newlink+0x3ec/0x600
              [<ffffffff813eb4d0>] rtnetlink_rcv_msg+0x160/0x310
              [<ffffffff814040b9>] netlink_rcv_skb+0x89/0xb0
              [<ffffffff813eb357>] rtnetlink_rcv+0x27/0x40
              [<ffffffff81403e20>] netlink_unicast+0x140/0x180
              [<ffffffff81404a9e>] netlink_sendmsg+0x33e/0x380
              [<ffffffff813c4252>] sock_sendmsg+0x112/0x130
              [<ffffffff813c537e>] __sys_sendmsg+0x44e/0x460
              [<ffffffff813c5544>] sys_sendmsg+0x44/0x70
              [<ffffffff814feab9>] system_call_fastpath+0x16/0x1b
      
       -> #0 (&ndev->lock){+.+...}:
              [<ffffffff810a798e>] check_prev_add+0x3de/0x440
              [<ffffffff810a8027>] validate_chain+0x637/0x730
              [<ffffffff810a8417>] __lock_acquire+0x2f7/0x500
              [<ffffffff810a8734>] lock_acquire+0x114/0x150
              [<ffffffff814f6c82>] rt_read_lock+0x42/0x60
              [<ffffffff8147f804>] ipv6_get_lladdr+0x74/0x120
              [<ffffffff8149b036>] mld_newpack+0xb6/0x160
              [<ffffffff8149b18b>] add_grhead+0xab/0xc0
              [<ffffffff8149d03b>] add_grec+0x3ab/0x460
              [<ffffffff8149d14a>] mld_send_report+0x5a/0x150
              [<ffffffff8149f99e>] igmp6_timer_handler+0x4e/0xb0
              [<ffffffff8105705a>] call_timer_fn+0xca/0x1d0
              [<ffffffff81057b9f>] run_timer_softirq+0x1df/0x2e0
              [<ffffffff8104e8c7>] handle_pending_softirqs+0xf7/0x1f0
              [<ffffffff8104ea3b>] __do_softirq_common+0x7b/0xf0
              [<ffffffff8104f07f>] __thread_do_softirq+0x1af/0x210
              [<ffffffff8104f1c1>] run_ksoftirqd+0xe1/0x1f0
              [<ffffffff8106c7de>] kthread+0xae/0xc0
              [<ffffffff814fff74>] kernel_thread_helper+0x4/0x10
      
      actually we can just hold idev->lock before taking pmc->mca_lock,
      and avoid taking idev->lock again when iterating idev->addr_list,
      since the upper callers of mld_newpack() already take
      read_lock_bh(&idev->lock).
      Reported-by: Ndingtianhong <dingtianhong@huawei.com>
      Cc: dingtianhong <dingtianhong@huawei.com>
      Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Tested-by: NDing Tianhong <dingtianhong@huawei.com>
      Tested-by: NChen Weilong <chenweilong@huawei.com>
      Signed-off-by: NCong Wang <amwang@redhat.com>
      Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8965779d
  29. 29 6月, 2013 1 次提交
  30. 23 5月, 2013 1 次提交
    • F
      netfilter: add nf_ipv6_ops hook to fix xt_addrtype with IPv6 · 2a7851bf
      Florian Westphal 提交于
      Quoting https://bugzilla.netfilter.org/show_bug.cgi?id=812:
      
      [ ip6tables -m addrtype ]
      When I tried to use in the nat/PREROUTING it messes up the
      routing cache even if the rule didn't matched at all.
      [..]
      If I remove the --limit-iface-in from the non-working scenario, so just
      use the -m addrtype --dst-type LOCAL it works!
      
      This happens when LOCAL type matching is requested with --limit-iface-in,
      and the default ipv6 route is via the interface the packet we test
      arrived on.
      
      Because xt_addrtype uses ip6_route_output, the ipv6 routing implementation
      creates an unwanted cached entry, and the packet won't make it to the
      real/expected destination.
      
      Silently ignoring --limit-iface-in makes the routing work but it breaks
      rule matching (--dst-type LOCAL with limit-iface-in is supposed to only
      match if the dst address is configured on the incoming interface;
      without --limit-iface-in it will match if the address is reachable
      via lo).
      
      The test should call ipv6_chk_addr() instead.  However, this would add
      a link-time dependency on ipv6.
      
      There are two possible solutions:
      
      1) Revert the commit that moved ipt_addrtype to xt_addrtype,
         and put ipv6 specific code into ip6t_addrtype.
      2) add new "nf_ipv6_ops" struct to register pointers to ipv6 functions.
      
      While the former might seem preferable, Pablo pointed out that there
      are more xt modules with link-time dependeny issues regarding ipv6,
      so lets go for 2).
      Signed-off-by: NFlorian Westphal <fw@strlen.de>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      2a7851bf
  31. 15 4月, 2013 1 次提交
    • C
      ipv6: statically link register_inet6addr_notifier() · f88c91dd
      Cong Wang 提交于
      Tomas reported the following build error:
      
      net/built-in.o: In function `ieee80211_unregister_hw':
      (.text+0x10f0e1): undefined reference to `unregister_inet6addr_notifier'
      net/built-in.o: In function `ieee80211_register_hw':
      (.text+0x10f610): undefined reference to `register_inet6addr_notifier'
      make: *** [vmlinux] Error 1
      
      when built IPv6 as a module.
      
      So we have to statically link these symbols.
      Reported-by: NTomas Melin <tomas.melin@iki.fi>
      Cc: Tomas Melin <tomas.melin@iki.fi>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: YOSHIFUJI Hidaki <yoshfuji@linux-ipv6.org>
      Signed-off-by: NCong Wang <amwang@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f88c91dd
  32. 30 1月, 2013 1 次提交
    • J
      ipv4: introduce address lifetime · 5c766d64
      Jiri Pirko 提交于
      There are some usecase when lifetime of ipv4 addresses might be helpful.
      For example:
      1) initramfs networkmanager uses a DHCP daemon to learn network
      configuration parameters
      2) initramfs networkmanager addresses, routes and DNS configuration
      3) initramfs networkmanager is requested to stop
      4) initramfs networkmanager stops all daemons including dhclient
      5) there are addresses and routes configured but no daemon running. If
      the system doesn't start networkmanager for some reason, addresses and
      routes will be used forever, which violates RFC 2131.
      
      This patch is essentially a backport of ivp6 address lifetime mechanism
      for ipv4 addresses.
      
      Current "ip" tool supports this without any patch (since it does not
      distinguish between ipv4 and ipv6 addresses in this perspective.
      
      Also, this should be back-compatible with all current netlink users.
      Reported-by: NPavel Šimerda <psimerda@redhat.com>
      Signed-off-by: NJiri Pirko <jiri@resnulli.us>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5c766d64
  33. 21 1月, 2013 4 次提交