1. 22 3月, 2019 1 次提交
    • G
      net: kernel hookers service for toa module · eff26a80
      George Zhang 提交于
      LVS fullnat will replace network traffic's source ip with its local ip,
      and thus the backend servers cannot obtain the real client ip.
      
      To solve this, LVS has introduced the tcp option address (TOA) to store
      the essential ip address information in the last tcp ack packet of the
      3-way handshake, and the backend servers need to retrieve it from the
      packet header.
      
      In this patch, we have introduced the sk_toa_data member in the sock
      structure to hold the TOA information. There used to be an in-tree
      module for TOA managing, whereas it has now been maintained as an
      standalone module.
      
      In this case, the toa module should register its hook function(s) using
      the provided interfaces in the hookers module.
      
      TOA in sock structure:
      
      	__be32 sk_toa_data[16];
      
      The hookers module only provides the sk_toa_data placeholder, and the
      toa module can use this variable through the layout it needs.
      
      Hook interfaces:
      
      The hookers module replaces the kernel's syn_recv_sock and getname
      handler with a stub that chains the toa module's hook function(s) to the
      original handling function. The hookers module allows hook functions to
      be installed and uninstalled in any order.
      
      toa module:
      
      The external toa module will be provided in separate RPM package.
      
      [xuyu@linux.alibaba.com: amend commit log]
      Signed-off-by: NGeorge Zhang <georgezhang@linux.alibaba.com>
      Signed-off-by: NXu Yu <xuyu@linux.alibaba.com>
      Reviewed-by: NCaspar Zhang <caspar@linux.alibaba.com>
      eff26a80
  2. 19 3月, 2019 5 次提交
    • P
      ipv6: route: enforce RCU protection in ip6_route_check_nh_onlink() · 2e4b2aeb
      Paolo Abeni 提交于
      [ Upstream commit bf1dc8bad1d42287164d216d8efb51c5cd381b18 ]
      
      We need a RCU critical section around rt6_info->from deference, and
      proper annotation.
      
      Fixes: 4ed591c8ab44 ("net/ipv6: Allow onlink routes to have a device mismatch if it is the default route")
      Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
      Reviewed-by: NDavid Ahern <dsahern@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2e4b2aeb
    • P
      ipv6: route: enforce RCU protection in rt6_update_exception_stamp_rt() · 96dd4ef3
      Paolo Abeni 提交于
      [ Upstream commit 193f3685d0546b0cea20c99894aadb70098e47bf ]
      
      We must access rt6_info->from under RCU read lock: move the
      dereference under such lock, with proper annotation.
      
      v1 -> v2:
       - avoid using multiple, racy, fetch operations for rt->from
      
      Fixes: a68886a6 ("net/ipv6: Make from in rt6_info rcu protected")
      Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
      Reviewed-by: NDavid Ahern <dsahern@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      96dd4ef3
    • P
      ipv6: route: purge exception on removal · b9d0cb75
      Paolo Abeni 提交于
      [ Upstream commit f5b51fe804ec2a6edce0f8f6b11ea57283f5857b ]
      
      When a netdevice is unregistered, we flush the relevant exception
      via rt6_sync_down_dev() -> fib6_ifdown() -> fib6_del() -> fib6_del_route().
      
      Finally, we end-up calling rt6_remove_exception(), where we release
      the relevant dst, while we keep the references to the related fib6_info and
      dev. Such references should be released later when the dst will be
      destroyed.
      
      There are a number of caches that can keep the exception around for an
      unlimited amount of time - namely dst_cache, possibly even socket cache.
      As a result device registration may hang, as demonstrated by this script:
      
      ip netns add cl
      ip netns add rt
      ip netns add srv
      ip netns exec rt sysctl -w net.ipv6.conf.all.forwarding=1
      
      ip link add name cl_veth type veth peer name cl_rt_veth
      ip link set dev cl_veth netns cl
      ip -n cl link set dev cl_veth up
      ip -n cl addr add dev cl_veth 2001::2/64
      ip -n cl route add default via 2001::1
      
      ip -n cl link add tunv6 type ip6tnl mode ip6ip6 local 2001::2 remote 2002::1 hoplimit 64 dev cl_veth
      ip -n cl link set tunv6 up
      ip -n cl addr add 2013::2/64 dev tunv6
      
      ip link set dev cl_rt_veth netns rt
      ip -n rt link set dev cl_rt_veth up
      ip -n rt addr add dev cl_rt_veth 2001::1/64
      
      ip link add name rt_srv_veth type veth peer name srv_veth
      ip link set dev srv_veth netns srv
      ip -n srv link set dev srv_veth up
      ip -n srv addr add dev srv_veth 2002::1/64
      ip -n srv route add default via 2002::2
      
      ip -n srv link add tunv6 type ip6tnl mode ip6ip6 local 2002::1 remote 2001::2 hoplimit 64 dev srv_veth
      ip -n srv link set tunv6 up
      ip -n srv addr add 2013::1/64 dev tunv6
      
      ip link set dev rt_srv_veth netns rt
      ip -n rt link set dev rt_srv_veth up
      ip -n rt addr add dev rt_srv_veth 2002::2/64
      
      ip netns exec srv netserver & sleep 0.1
      ip netns exec cl ping6 -c 4 2013::1
      ip netns exec cl netperf -H 2013::1 -t TCP_STREAM -l 3 & sleep 1
      ip -n rt link set dev rt_srv_veth mtu 1400
      wait %2
      
      ip -n cl link del cl_veth
      
      This commit addresses the issue purging all the references held by the
      exception at time, as we currently do for e.g. ipv6 pcpu dst entries.
      
      v1 -> v2:
       - re-order the code to avoid accessing dst and net after dst_dev_put()
      
      Fixes: 93531c67 ("net/ipv6: separate handling of FIB entries from dst based routes")
      Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
      Reviewed-by: NDavid Ahern <dsahern@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b9d0cb75
    • K
      net: Set rtm_table to RT_TABLE_COMPAT for ipv6 for tables > 255 · fe38cbc9
      Kalash Nainwal 提交于
      [ Upstream commit 97f0082a0592212fc15d4680f5a4d80f79a1687c ]
      
      Set rtm_table to RT_TABLE_COMPAT for ipv6 for tables > 255 to
      keep legacy software happy. This is similar to what was done for
      ipv4 in commit 709772e6 ("net: Fix routing tables with
      id > 255 for legacy software").
      Signed-off-by: NKalash Nainwal <kalash@arista.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      fe38cbc9
    • M
      net: sit: fix UBSAN Undefined behaviour in check_6rd · 7cfb97ba
      Miaohe Lin 提交于
      [ Upstream commit a843dc4ebaecd15fca1f4d35a97210f72ea1473b ]
      
      In func check_6rd,tunnel->ip6rd.relay_prefixlen may equal to
      32,so UBSAN complain about it.
      
      UBSAN: Undefined behaviour in net/ipv6/sit.c:781:47
      shift exponent 32 is too large for 32-bit type 'unsigned int'
      CPU: 6 PID: 20036 Comm: syz-executor.0 Not tainted 4.19.27 #2
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1
      04/01/2014
      Call Trace:
      __dump_stack lib/dump_stack.c:77 [inline]
      dump_stack+0xca/0x13e lib/dump_stack.c:113
      ubsan_epilogue+0xe/0x81 lib/ubsan.c:159
      __ubsan_handle_shift_out_of_bounds+0x293/0x2e8 lib/ubsan.c:425
      check_6rd.constprop.9+0x433/0x4e0 net/ipv6/sit.c:781
      try_6rd net/ipv6/sit.c:806 [inline]
      ipip6_tunnel_xmit net/ipv6/sit.c:866 [inline]
      sit_tunnel_xmit+0x141c/0x2720 net/ipv6/sit.c:1033
      __netdev_start_xmit include/linux/netdevice.h:4300 [inline]
      netdev_start_xmit include/linux/netdevice.h:4309 [inline]
      xmit_one net/core/dev.c:3243 [inline]
      dev_hard_start_xmit+0x17c/0x780 net/core/dev.c:3259
      __dev_queue_xmit+0x1656/0x2500 net/core/dev.c:3829
      neigh_output include/net/neighbour.h:501 [inline]
      ip6_finish_output2+0xa36/0x2290 net/ipv6/ip6_output.c:120
      ip6_finish_output+0x3e7/0xa20 net/ipv6/ip6_output.c:154
      NF_HOOK_COND include/linux/netfilter.h:278 [inline]
      ip6_output+0x1e2/0x720 net/ipv6/ip6_output.c:171
      dst_output include/net/dst.h:444 [inline]
      ip6_local_out+0x99/0x170 net/ipv6/output_core.c:176
      ip6_send_skb+0x9d/0x2f0 net/ipv6/ip6_output.c:1697
      ip6_push_pending_frames+0xc0/0x100 net/ipv6/ip6_output.c:1717
      rawv6_push_pending_frames net/ipv6/raw.c:616 [inline]
      rawv6_sendmsg+0x2435/0x3530 net/ipv6/raw.c:946
      inet_sendmsg+0xf8/0x5c0 net/ipv4/af_inet.c:798
      sock_sendmsg_nosec net/socket.c:621 [inline]
      sock_sendmsg+0xc8/0x110 net/socket.c:631
      ___sys_sendmsg+0x6cf/0x890 net/socket.c:2114
      __sys_sendmsg+0xf0/0x1b0 net/socket.c:2152
      do_syscall_64+0xc8/0x580 arch/x86/entry/common.c:290
      entry_SYSCALL_64_after_hwframe+0x49/0xbe
      Signed-off-by: Nlinmiaohe <linmiaohe@huawei.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7cfb97ba
  3. 10 3月, 2019 4 次提交
    • D
      ipv6: Return error for RTA_VIA attribute · a68d31cc
      David Ahern 提交于
      [ Upstream commit e3818541b49fb88650ba339d33cc53e4095da5b3 ]
      
      IPv6 currently does not support nexthops outside of the AF_INET6 family.
      Specifically, it does not handle RTA_VIA attribute. If it is passed
      in a route add request, the actual route added only uses the device
      which is clearly not what the user intended:
      
        $ ip -6 ro add 2001:db8:2::/64 via inet 172.16.1.1 dev eth0
        $ ip ro ls
        ...
        2001:db8:2::/64 dev eth0 metric 1024 pref medium
      
      Catch this and fail the route add:
        $ ip -6 ro add 2001:db8:2::/64 via inet 172.16.1.1 dev eth0
        Error: IPv6 does not support RTA_VIA attribute.
      
      Fixes: 03c05665 ("mpls: Netlink commands to add, remove, and dump routes")
      Signed-off-by: NDavid Ahern <dsahern@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a68d31cc
    • M
      net: sit: fix memory leak in sit_init_net() · d0bedaac
      Mao Wenan 提交于
      [ Upstream commit 07f12b26e21ab359261bf75cfcb424fdc7daeb6d ]
      
      If register_netdev() is failed to register sitn->fb_tunnel_dev,
      it will go to err_reg_dev and forget to free netdev(sitn->fb_tunnel_dev).
      
      BUG: memory leak
      unreferenced object 0xffff888378daad00 (size 512):
        comm "syz-executor.1", pid 4006, jiffies 4295121142 (age 16.115s)
        hex dump (first 32 bytes):
          00 e6 ed c0 83 88 ff ff 00 00 00 00 00 00 00 00  ................
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      backtrace:
          [<00000000d6dcb63e>] kvmalloc include/linux/mm.h:577 [inline]
          [<00000000d6dcb63e>] kvzalloc include/linux/mm.h:585 [inline]
          [<00000000d6dcb63e>] netif_alloc_netdev_queues net/core/dev.c:8380 [inline]
          [<00000000d6dcb63e>] alloc_netdev_mqs+0x600/0xcc0 net/core/dev.c:8970
          [<00000000867e172f>] sit_init_net+0x295/0xa40 net/ipv6/sit.c:1848
          [<00000000871019fa>] ops_init+0xad/0x3e0 net/core/net_namespace.c:129
          [<00000000319507f6>] setup_net+0x2ba/0x690 net/core/net_namespace.c:314
          [<0000000087db4f96>] copy_net_ns+0x1dc/0x330 net/core/net_namespace.c:437
          [<0000000057efc651>] create_new_namespaces+0x382/0x730 kernel/nsproxy.c:107
          [<00000000676f83de>] copy_namespaces+0x2ed/0x3d0 kernel/nsproxy.c:165
          [<0000000030b74bac>] copy_process.part.27+0x231e/0x6db0 kernel/fork.c:1919
          [<00000000fff78746>] copy_process kernel/fork.c:1713 [inline]
          [<00000000fff78746>] _do_fork+0x1bc/0xe90 kernel/fork.c:2224
          [<000000001c2e0d1c>] do_syscall_64+0xc8/0x580 arch/x86/entry/common.c:290
          [<00000000ec48bd44>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
          [<0000000039acff8a>] 0xffffffffffffffff
      Signed-off-by: NMao Wenan <maowenan@huawei.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d0bedaac
    • H
      ipv4: Add ICMPv6 support when parse route ipproto · 99ed9458
      Hangbin Liu 提交于
      [ Upstream commit 5e1a99eae84999a2536f50a0beaf5d5262337f40 ]
      
      For ip rules, we need to use 'ipproto ipv6-icmp' to match ICMPv6 headers.
      But for ip -6 route, currently we only support tcp, udp and icmp.
      
      Add ICMPv6 support so we can match ipv6-icmp rules for route lookup.
      
      v2: As David Ahern and Sabrina Dubroca suggested, Add an argument to
      rtm_getroute_parse_ip_proto() to handle ICMP/ICMPv6 with different family.
      Reported-by: NJianlin Shi <jishi@redhat.com>
      Fixes: eacb9384 ("ipv6: support sport, dport and ip_proto in RTM_GETROUTE")
      Signed-off-by: NHangbin Liu <liuhangbin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      99ed9458
    • I
      ip6mr: Do not call __IP6_INC_STATS() from preemptible context · b5ff77dd
      Ido Schimmel 提交于
      [ Upstream commit 87c11f1ddbbad38ad8bad47af133a8208985fbdf ]
      
      Similar to commit 44f49dd8 ("ipmr: fix possible race resulting from
      improper usage of IP_INC_STATS_BH() in preemptible context."), we cannot
      assume preemption is disabled when incrementing the counter and
      accessing a per-CPU variable.
      
      Preemption can be enabled when we add a route in process context that
      corresponds to packets stored in the unresolved queue, which are then
      forwarded using this route [1].
      
      Fix this by using IP6_INC_STATS() which takes care of disabling
      preemption on architectures where it is needed.
      
      [1]
      [  157.451447] BUG: using __this_cpu_add() in preemptible [00000000] code: smcrouted/2314
      [  157.460409] caller is ip6mr_forward2+0x73e/0x10e0
      [  157.460434] CPU: 3 PID: 2314 Comm: smcrouted Not tainted 5.0.0-rc7-custom-03635-g22f2712113f1 #1336
      [  157.460449] Hardware name: Mellanox Technologies Ltd. MSN2100-CB2FO/SA001017, BIOS 5.6.5 06/07/2016
      [  157.460461] Call Trace:
      [  157.460486]  dump_stack+0xf9/0x1be
      [  157.460553]  check_preemption_disabled+0x1d6/0x200
      [  157.460576]  ip6mr_forward2+0x73e/0x10e0
      [  157.460705]  ip6_mr_forward+0x9a0/0x1510
      [  157.460771]  ip6mr_mfc_add+0x16b3/0x1e00
      [  157.461155]  ip6_mroute_setsockopt+0x3cb/0x13c0
      [  157.461384]  do_ipv6_setsockopt.isra.8+0x348/0x4060
      [  157.462013]  ipv6_setsockopt+0x90/0x110
      [  157.462036]  rawv6_setsockopt+0x4a/0x120
      [  157.462058]  __sys_setsockopt+0x16b/0x340
      [  157.462198]  __x64_sys_setsockopt+0xbf/0x160
      [  157.462220]  do_syscall_64+0x14d/0x610
      [  157.462349]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Fixes: 0912ea38 ("[IPV6] MROUTE: Add stats in multicast routing module method ip6_mr_forward().")
      Signed-off-by: NIdo Schimmel <idosch@mellanox.com>
      Reported-by: NAmit Cohen <amitc@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b5ff77dd
  4. 27 2月, 2019 4 次提交
  5. 23 2月, 2019 2 次提交
    • L
      net: ip6_gre: initialize erspan_ver just for erspan tunnels · 4523bc86
      Lorenzo Bianconi 提交于
      [ Upstream commit 4974d5f678abb34401558559d47e2ea3d1c15cba ]
      
      After commit c706863bc890 ("net: ip6_gre: always reports o_key to
      userspace"), ip6gre and ip6gretap tunnels started reporting TUNNEL_KEY
      output flag even if it is not configured.
      ip6gre_fill_info checks erspan_ver value to add TUNNEL_KEY for
      erspan tunnels, however in commit 84581bda ("erspan: set
      erspan_ver to 1 by default when adding an erspan dev")
      erspan_ver is initialized to 1 even for ip6gre or ip6gretap
      Fix the issue moving erspan_ver initialization in a dedicated routine
      
      Fixes: c706863bc890 ("net: ip6_gre: always reports o_key to userspace")
      Signed-off-by: NLorenzo Bianconi <lorenzo.bianconi@redhat.com>
      Reviewed-by: NGreg Rose <gvrose8192@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      4523bc86
    • Z
      net: fix IPv6 prefix route residue · 4c1b91b8
      Zhiqiang Liu 提交于
      [ Upstream commit e75913c93f7cd5f338ab373c34c93a655bd309cb ]
      
      Follow those steps:
       # ip addr add 2001:123::1/32 dev eth0
       # ip addr add 2001:123:456::2/64 dev eth0
       # ip addr del 2001:123::1/32 dev eth0
       # ip addr del 2001:123:456::2/64 dev eth0
      and then prefix route of 2001:123::1/32 will still exist.
      
      This is because ipv6_prefix_equal in check_cleanup_prefix_route
      func does not check whether two IPv6 addresses have the same
      prefix length. If the prefix of one address starts with another
      shorter address prefix, even though their prefix lengths are
      different, the return value of ipv6_prefix_equal is true.
      
      Here I add a check of whether two addresses have the same prefix
      to decide whether their prefixes are equal.
      
      Fixes: 5b84efec ("ipv6 addrconf: don't cleanup prefix route for IFA_F_NOPREFIXROUTE")
      Signed-off-by: NZhiqiang Liu <liuzhiqiang26@huawei.com>
      Reported-by: NWenhao Zhang <zhangwenhao8@huawei.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      4c1b91b8
  6. 13 2月, 2019 1 次提交
  7. 07 2月, 2019 5 次提交
    • N
      ip6mr: Fix notifiers call on mroute_clean_tables() · 505e5f3d
      Nir Dotan 提交于
      [ Upstream commit 146820cc240f4389cf33481c058d9493aef95e25 ]
      
      When the MC route socket is closed, mroute_clean_tables() is called to
      cleanup existing routes. Mistakenly notifiers call was put on the cleanup
      of the unresolved MC route entries cache.
      In a case where the MC socket closes before an unresolved route expires,
      the notifier call leads to a crash, caused by the driver trying to
      increment a non initialized refcount_t object [1] and then when handling
      is done, to decrement it [2]. This was detected by a test recently added in
      commit 6d4efada3b82 ("selftests: forwarding: Add multicast routing test").
      
      Fix that by putting notifiers call on the resolved entries traversal,
      instead of on the unresolved entries traversal.
      
      [1]
      
      [  245.748967] refcount_t: increment on 0; use-after-free.
      [  245.754829] WARNING: CPU: 3 PID: 3223 at lib/refcount.c:153 refcount_inc_checked+0x2b/0x30
      ...
      [  245.802357] Hardware name: Mellanox Technologies Ltd. MSN2740/SA001237, BIOS 5.6.5 06/07/2016
      [  245.811873] RIP: 0010:refcount_inc_checked+0x2b/0x30
      ...
      [  245.907487] Call Trace:
      [  245.910231]  mlxsw_sp_router_fib_event.cold.181+0x42/0x47 [mlxsw_spectrum]
      [  245.917913]  notifier_call_chain+0x45/0x7
      [  245.922484]  atomic_notifier_call_chain+0x15/0x20
      [  245.927729]  call_fib_notifiers+0x15/0x30
      [  245.932205]  mroute_clean_tables+0x372/0x3f
      [  245.936971]  ip6mr_sk_done+0xb1/0xc0
      [  245.940960]  ip6_mroute_setsockopt+0x1da/0x5f0
      ...
      
      [2]
      
      [  246.128487] refcount_t: underflow; use-after-free.
      [  246.133859] WARNING: CPU: 0 PID: 7 at lib/refcount.c:187 refcount_sub_and_test_checked+0x4c/0x60
      [  246.183521] Hardware name: Mellanox Technologies Ltd. MSN2740/SA001237, BIOS 5.6.5 06/07/2016
      ...
      [  246.193062] Workqueue: mlxsw_core_ordered mlxsw_sp_router_fibmr_event_work [mlxsw_spectrum]
      [  246.202394] RIP: 0010:refcount_sub_and_test_checked+0x4c/0x60
      ...
      [  246.298889] Call Trace:
      [  246.301617]  refcount_dec_and_test_checked+0x11/0x20
      [  246.307170]  mlxsw_sp_router_fibmr_event_work.cold.196+0x47/0x78 [mlxsw_spectrum]
      [  246.315531]  process_one_work+0x1fa/0x3f0
      [  246.320005]  worker_thread+0x2f/0x3e0
      [  246.324083]  kthread+0x118/0x130
      [  246.327683]  ? wq_update_unbound_numa+0x1b0/0x1b0
      [  246.332926]  ? kthread_park+0x80/0x80
      [  246.337013]  ret_from_fork+0x1f/0x30
      
      Fixes: 088aa3ee ("ip6mr: Support fib notifications")
      Signed-off-by: NNir Dotan <nird@mellanox.com>
      Reviewed-by: NIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      505e5f3d
    • L
      net: ip6_gre: always reports o_key to userspace · 9f7d849b
      Lorenzo Bianconi 提交于
      [ Upstream commit c706863bc8902d0c2d1a5a27ac8e1ead5d06b79d ]
      
      As Erspan_v4, Erspan_v6 protocol relies on o_key to configure
      session id header field. However TUNNEL_KEY bit is cleared in
      ip6erspan_tunnel_xmit since ERSPAN protocol does not set the key field
      of the external GRE header and so the configured o_key is not reported
      to userspace. The issue can be triggered with the following reproducer:
      
      $ip link add ip6erspan1 type ip6erspan local 2000::1 remote 2000::2 \
          key 1 seq erspan_ver 1
      $ip link set ip6erspan1 up
      ip -d link sh ip6erspan1
      
      ip6erspan1@NONE: <BROADCAST,MULTICAST> mtu 1422 qdisc noop state DOWN mode DEFAULT
          link/ether ba:ff:09:24:c3:0e brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 1500
          ip6erspan remote 2000::2 local 2000::1 encaplimit 4 flowlabel 0x00000 ikey 0.0.0.1 iseq oseq
      
      Fix the issue adding TUNNEL_KEY bit to the o_flags parameter in
      ip6gre_fill_info
      
      Fixes: 5a963eb6 ("ip6_gre: Add ERSPAN native tunnel support")
      Signed-off-by: NLorenzo Bianconi <lorenzo.bianconi@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9f7d849b
    • L
      net: ip_gre: use erspan key field for tunnel lookup · 0a198e0b
      Lorenzo Bianconi 提交于
      [ Upstream commit cb73ee40b1b381eaf3749e6dbeed567bb38e5258 ]
      
      Use ERSPAN key header field as tunnel key in gre_parse_header routine
      since ERSPAN protocol sets the key field of the external GRE header to
      0 resulting in a tunnel lookup fail in ip6gre_err.
      In addition remove key field parsing and pskb_may_pull check in
      erspan_rcv and ip6erspan_rcv
      
      Fixes: 5a963eb6 ("ip6_gre: Add ERSPAN native tunnel support")
      Signed-off-by: NLorenzo Bianconi <lorenzo.bianconi@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0a198e0b
    • Y
      ipv6: sr: clear IP6CB(skb) on SRH ip4ip6 encapsulation · 2f704348
      Yohei Kanemaru 提交于
      [ Upstream commit ef489749aae508e6f17886775c075f12ff919fb1 ]
      
      skb->cb may contain data from previous layers (in an observed case
      IPv4 with L3 Master Device). In the observed scenario, the data in
      IPCB(skb)->frags was misinterpreted as IP6CB(skb)->frag_max_size,
      eventually caused an unexpected IPv6 fragmentation in ip6_fragment()
      through ip6_finish_output().
      
      This patch clears IP6CB(skb), which potentially contains garbage data,
      on the SRH ip4ip6 encapsulation.
      
      Fixes: 32d99d0b ("ipv6: sr: add support for ip4ip6 encapsulation")
      Signed-off-by: NYohei Kanemaru <yohei.kanemaru@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2f704348
    • D
      ipv6: Consider sk_bound_dev_if when binding a socket to an address · 7e9a6476
      David Ahern 提交于
      [ Upstream commit c5ee066333ebc322a24a00a743ed941a0c68617e ]
      
      IPv6 does not consider if the socket is bound to a device when binding
      to an address. The result is that a socket can be bound to eth0 and then
      bound to the address of eth1. If the device is a VRF, the result is that
      a socket can only be bound to an address in the default VRF.
      
      Resolve by considering the device if sk_bound_dev_if is set.
      
      This problem exists from the beginning of git history.
      Signed-off-by: NDavid Ahern <dsahern@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7e9a6476
  8. 31 1月, 2019 4 次提交
  9. 26 1月, 2019 2 次提交
  10. 23 1月, 2019 3 次提交
    • E
      ipv6: make icmp6_send() robust against null skb->dev · a72e572f
      Eric Dumazet 提交于
      commit 8d933670452107e41165bea70a30dffbd281bef1 upstream.
      
      syzbot was able to crash one host with the following stack trace :
      
      kasan: GPF could be caused by NULL-ptr deref or user memory access
      general protection fault: 0000 [#1] PREEMPT SMP KASAN
      CPU: 0 PID: 8625 Comm: syz-executor4 Not tainted 4.20.0+ #8
      RIP: 0010:dev_net include/linux/netdevice.h:2169 [inline]
      RIP: 0010:icmp6_send+0x116/0x2d30 net/ipv6/icmp.c:426
       icmpv6_send
       smack_socket_sock_rcv_skb
       security_sock_rcv_skb
       sk_filter_trim_cap
       __sk_receive_skb
       dccp_v6_do_rcv
       release_sock
      
      This is because a RX packet found socket owned by user and
      was stored into socket backlog. Before leaving RCU protected section,
      skb->dev was cleared in __sk_receive_skb(). When socket backlog
      was finally handled at release_sock() time, skb was fed to
      smack_socket_sock_rcv_skb() then icmp6_send()
      
      We could fix the bug in smack_socket_sock_rcv_skb(), or simply
      make icmp6_send() more robust against such possibility.
      
      In the future we might provide to icmp6_send() the net pointer
      instead of infering it.
      
      Fixes: d66a8acb ("Smack: Inform peer that IPv6 traffic has been blocked")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Piotr Sawicki <p.sawicki2@partner.samsung.com>
      Cc: Casey Schaufler <casey@schaufler-ca.com>
      Reported-by: Nsyzbot <syzkaller@googlegroups.com>
      Acked-by: NCasey Schaufler <casey@schaufler-ca.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a72e572f
    • W
      ip: on queued skb use skb_header_pointer instead of pskb_may_pull · eb02c17f
      Willem de Bruijn 提交于
      [ Upstream commit 4a06fa67c4da20148803525151845276cdb995c1 ]
      
      Commit 2efd4fca ("ip: in cmsg IP(V6)_ORIGDSTADDR call
      pskb_may_pull") avoided a read beyond the end of the skb linear
      segment by calling pskb_may_pull.
      
      That function can trigger a BUG_ON in pskb_expand_head if the skb is
      shared, which it is when when peeking. It can also return ENOMEM.
      
      Avoid both by switching to safer skb_header_pointer.
      
      Fixes: 2efd4fca ("ip: in cmsg IP(V6)_ORIGDSTADDR call pskb_may_pull")
      Reported-by: Nsyzbot <syzkaller@googlegroups.com>
      Suggested-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NWillem de Bruijn <willemb@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      eb02c17f
    • E
      ipv6: fix kernel-infoleak in ipv6_local_error() · c0e1392e
      Eric Dumazet 提交于
      [ Upstream commit 7d033c9f6a7fd3821af75620a0257db87c2b552a ]
      
      This patch makes sure the flow label in the IPv6 header
      forged in ipv6_local_error() is initialized.
      
      BUG: KMSAN: kernel-infoleak in _copy_to_user+0x16b/0x1f0 lib/usercopy.c:32
      CPU: 1 PID: 24675 Comm: syz-executor1 Not tainted 4.20.0-rc7+ #4
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x173/0x1d0 lib/dump_stack.c:113
       kmsan_report+0x12e/0x2a0 mm/kmsan/kmsan.c:613
       kmsan_internal_check_memory+0x455/0xb00 mm/kmsan/kmsan.c:675
       kmsan_copy_to_user+0xab/0xc0 mm/kmsan/kmsan_hooks.c:601
       _copy_to_user+0x16b/0x1f0 lib/usercopy.c:32
       copy_to_user include/linux/uaccess.h:177 [inline]
       move_addr_to_user+0x2e9/0x4f0 net/socket.c:227
       ___sys_recvmsg+0x5d7/0x1140 net/socket.c:2284
       __sys_recvmsg net/socket.c:2327 [inline]
       __do_sys_recvmsg net/socket.c:2337 [inline]
       __se_sys_recvmsg+0x2fa/0x450 net/socket.c:2334
       __x64_sys_recvmsg+0x4a/0x70 net/socket.c:2334
       do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291
       entry_SYSCALL_64_after_hwframe+0x63/0xe7
      RIP: 0033:0x457ec9
      Code: 6d b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 3b b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007f8750c06c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002f
      RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000457ec9
      RDX: 0000000000002000 RSI: 0000000020000400 RDI: 0000000000000005
      RBP: 000000000073bf00 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 00007f8750c076d4
      R13: 00000000004c4a60 R14: 00000000004d8140 R15: 00000000ffffffff
      
      Uninit was stored to memory at:
       kmsan_save_stack_with_flags mm/kmsan/kmsan.c:204 [inline]
       kmsan_save_stack mm/kmsan/kmsan.c:219 [inline]
       kmsan_internal_chain_origin+0x134/0x230 mm/kmsan/kmsan.c:439
       __msan_chain_origin+0x70/0xe0 mm/kmsan/kmsan_instr.c:200
       ipv6_recv_error+0x1e3f/0x1eb0 net/ipv6/datagram.c:475
       udpv6_recvmsg+0x398/0x2ab0 net/ipv6/udp.c:335
       inet_recvmsg+0x4fb/0x600 net/ipv4/af_inet.c:830
       sock_recvmsg_nosec net/socket.c:794 [inline]
       sock_recvmsg+0x1d1/0x230 net/socket.c:801
       ___sys_recvmsg+0x4d5/0x1140 net/socket.c:2278
       __sys_recvmsg net/socket.c:2327 [inline]
       __do_sys_recvmsg net/socket.c:2337 [inline]
       __se_sys_recvmsg+0x2fa/0x450 net/socket.c:2334
       __x64_sys_recvmsg+0x4a/0x70 net/socket.c:2334
       do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291
       entry_SYSCALL_64_after_hwframe+0x63/0xe7
      
      Uninit was created at:
       kmsan_save_stack_with_flags mm/kmsan/kmsan.c:204 [inline]
       kmsan_internal_poison_shadow+0x92/0x150 mm/kmsan/kmsan.c:158
       kmsan_kmalloc+0xa6/0x130 mm/kmsan/kmsan_hooks.c:176
       kmsan_slab_alloc+0xe/0x10 mm/kmsan/kmsan_hooks.c:185
       slab_post_alloc_hook mm/slab.h:446 [inline]
       slab_alloc_node mm/slub.c:2759 [inline]
       __kmalloc_node_track_caller+0xe18/0x1030 mm/slub.c:4383
       __kmalloc_reserve net/core/skbuff.c:137 [inline]
       __alloc_skb+0x309/0xa20 net/core/skbuff.c:205
       alloc_skb include/linux/skbuff.h:998 [inline]
       ipv6_local_error+0x1a7/0x9e0 net/ipv6/datagram.c:334
       __ip6_append_data+0x129f/0x4fd0 net/ipv6/ip6_output.c:1311
       ip6_make_skb+0x6cc/0xcf0 net/ipv6/ip6_output.c:1775
       udpv6_sendmsg+0x3f8e/0x45d0 net/ipv6/udp.c:1384
       inet_sendmsg+0x54a/0x720 net/ipv4/af_inet.c:798
       sock_sendmsg_nosec net/socket.c:621 [inline]
       sock_sendmsg net/socket.c:631 [inline]
       __sys_sendto+0x8c4/0xac0 net/socket.c:1788
       __do_sys_sendto net/socket.c:1800 [inline]
       __se_sys_sendto+0x107/0x130 net/socket.c:1796
       __x64_sys_sendto+0x6e/0x90 net/socket.c:1796
       do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291
       entry_SYSCALL_64_after_hwframe+0x63/0xe7
      
      Bytes 4-7 of 28 are uninitialized
      Memory access of size 28 starts at ffff8881937bfce0
      Data copied to user address 0000000020000000
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: Nsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c0e1392e
  11. 10 1月, 2019 8 次提交
    • S
      ipv6: route: Fix return value of ip6_neigh_lookup() on neigh_create() error · ccc8b374
      Stefano Brivio 提交于
      [ Upstream commit 7adf3246092f5e87ed0fa610e8088fae416c581f ]
      
      In ip6_neigh_lookup(), we must not return errors coming from
      neigh_create(): if creation of a neighbour entry fails, the lookup should
      return NULL, in the same way as it's done in __neigh_lookup().
      
      Otherwise, callers legitimately checking for a non-NULL return value of
      the lookup function might dereference an invalid pointer.
      
      For instance, on neighbour table overflow, ndisc_router_discovery()
      crashes ndisc_update() by passing ERR_PTR(-ENOBUFS) as 'neigh' argument.
      Reported-by: NJianlin Shi <jishi@redhat.com>
      Fixes: f8a1b43b ("net/ipv6: Create a neigh_lookup for FIB entries")
      Signed-off-by: NStefano Brivio <sbrivio@redhat.com>
      Reviewed-by: NDavid Ahern <dsahern@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ccc8b374
    • C
      net/ipv6: Fix a test against 'ipv6_find_idev()' return value · fe3f820c
      Christophe JAILLET 提交于
      [ Upstream commit 178fe94405bffbd1acd83b6ff3b40211185ae9c9 ]
      
      'ipv6_find_idev()' returns NULL on error, not an error pointer.
      Update the test accordingly and return -ENOBUFS, as already done in
      'addrconf_add_dev()', if NULL is returned.
      
      Fixes: ("ipv6: allow userspace to add IFA_F_OPTIMISTIC addresses")
      Signed-off-by: NChristophe JAILLET <christophe.jaillet@wanadoo.fr>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      fe3f820c
    • H
      ipv6: frags: Fix bogus skb->sk in reassembled packets · 5ac4cc33
      Herbert Xu 提交于
      [ Upstream commit d15f5ac8deea936d3adf629421a66a88b42b8a2f ]
      
      It was reported that IPsec would crash when it encounters an IPv6
      reassembled packet because skb->sk is non-zero and not a valid
      pointer.
      
      This is because skb->sk is now a union with ip_defrag_offset.
      
      This patch fixes this by resetting skb->sk when exiting from
      the reassembly code.
      Reported-by: NXiumei Mu <xmu@redhat.com>
      Fixes: 219badfa ("ipv6: frags: get rid of ip6frag_skb_cb/...")
      Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5ac4cc33
    • E
      net: clear skb->tstamp in forwarding paths · 281731c8
      Eric Dumazet 提交于
      [ Upstream commit 8203e2d844d34af247a151d8ebd68553a6e91785 ]
      
      Sergey reported that forwarding was no longer working
      if fq packet scheduler was used.
      
      This is caused by the recent switch to EDT model, since incoming
      packets might have been timestamped by __net_timestamp()
      
      __net_timestamp() uses ktime_get_real(), while fq expects packets
      using CLOCK_MONOTONIC base.
      
      The fix is to clear skb->tstamp in forwarding paths.
      
      Fixes: 80b14dee ("net: Add a new socket option for a future transmit time.")
      Fixes: fb420d5d91c1 ("tcp/fq: move back to CLOCK_MONOTONIC")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: NSergey Matyukevich <geomatsi@gmail.com>
      Tested-by: NSergey Matyukevich <geomatsi@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      281731c8
    • W
      ip: validate header length on virtual device xmit · cde81154
      Willem de Bruijn 提交于
      [ Upstream commit cb9f1b783850b14cbd7f87d061d784a666dfba1f ]
      
      KMSAN detected read beyond end of buffer in vti and sit devices when
      passing truncated packets with PF_PACKET. The issue affects additional
      ip tunnel devices.
      
      Extend commit 76c0ddd8 ("ip6_tunnel: be careful when accessing the
      inner header") and commit ccfec9e5 ("ip_tunnel: be careful when
      accessing the inner header").
      
      Move the check to a separate helper and call at the start of each
      ndo_start_xmit function in net/ipv4 and net/ipv6.
      
      Minor changes:
      - convert dev_kfree_skb to kfree_skb on error path,
        as dev_kfree_skb calls consume_skb which is not for error paths.
      - use pskb_network_may_pull even though that is pedantic here,
        as the same as pskb_may_pull for devices without llheaders.
      - do not cache ipv6 hdrs if used only once
        (unsafe across pskb_may_pull, was more relevant to earlier patch)
      Reported-by: Nsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: NWillem de Bruijn <willemb@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      cde81154
    • E
      ipv6: tunnels: fix two use-after-free · 0d2b652b
      Eric Dumazet 提交于
      [ Upstream commit cbb49697d5512ce9e61b45ce75d3ee43d7ea5524 ]
      
      xfrm6_policy_check() might have re-allocated skb->head, we need
      to reload ipv6 header pointer.
      
      sysbot reported :
      
      BUG: KASAN: use-after-free in __ipv6_addr_type+0x302/0x32f net/ipv6/addrconf_core.c:40
      Read of size 4 at addr ffff888191b8cb70 by task syz-executor2/1304
      
      CPU: 0 PID: 1304 Comm: syz-executor2 Not tainted 4.20.0-rc7+ #356
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       <IRQ>
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x244/0x39d lib/dump_stack.c:113
       print_address_description.cold.7+0x9/0x1ff mm/kasan/report.c:256
       kasan_report_error mm/kasan/report.c:354 [inline]
       kasan_report.cold.8+0x242/0x309 mm/kasan/report.c:412
       __asan_report_load4_noabort+0x14/0x20 mm/kasan/report.c:432
       __ipv6_addr_type+0x302/0x32f net/ipv6/addrconf_core.c:40
       ipv6_addr_type include/net/ipv6.h:403 [inline]
       ip6_tnl_get_cap+0x27/0x190 net/ipv6/ip6_tunnel.c:727
       ip6_tnl_rcv_ctl+0xdb/0x2a0 net/ipv6/ip6_tunnel.c:757
       vti6_rcv+0x336/0x8f3 net/ipv6/ip6_vti.c:321
       xfrm6_ipcomp_rcv+0x1a5/0x3a0 net/ipv6/xfrm6_protocol.c:132
       ip6_protocol_deliver_rcu+0x372/0x1940 net/ipv6/ip6_input.c:394
       ip6_input_finish+0x84/0x170 net/ipv6/ip6_input.c:434
       NF_HOOK include/linux/netfilter.h:289 [inline]
       ip6_input+0xe9/0x600 net/ipv6/ip6_input.c:443
      IPVS: ftp: loaded support on port[0] = 21
       ip6_mc_input+0x514/0x11c0 net/ipv6/ip6_input.c:537
       dst_input include/net/dst.h:450 [inline]
       ip6_rcv_finish+0x17a/0x330 net/ipv6/ip6_input.c:76
       NF_HOOK include/linux/netfilter.h:289 [inline]
       ipv6_rcv+0x115/0x640 net/ipv6/ip6_input.c:272
       __netif_receive_skb_one_core+0x14d/0x200 net/core/dev.c:4973
       __netif_receive_skb+0x2c/0x1e0 net/core/dev.c:5083
       process_backlog+0x24e/0x7a0 net/core/dev.c:5923
       napi_poll net/core/dev.c:6346 [inline]
       net_rx_action+0x7fa/0x19b0 net/core/dev.c:6412
       __do_softirq+0x308/0xb7e kernel/softirq.c:292
       do_softirq_own_stack+0x2a/0x40 arch/x86/entry/entry_64.S:1027
       </IRQ>
       do_softirq.part.14+0x126/0x160 kernel/softirq.c:337
       do_softirq+0x19/0x20 kernel/softirq.c:340
       netif_rx_ni+0x521/0x860 net/core/dev.c:4569
       dev_loopback_xmit+0x287/0x8c0 net/core/dev.c:3576
       NF_HOOK include/linux/netfilter.h:289 [inline]
       ip6_finish_output2+0x193a/0x2930 net/ipv6/ip6_output.c:84
       ip6_fragment+0x2b06/0x3850 net/ipv6/ip6_output.c:727
       ip6_finish_output+0x6b7/0xc50 net/ipv6/ip6_output.c:152
       NF_HOOK_COND include/linux/netfilter.h:278 [inline]
       ip6_output+0x232/0x9d0 net/ipv6/ip6_output.c:171
       dst_output include/net/dst.h:444 [inline]
       ip6_local_out+0xc5/0x1b0 net/ipv6/output_core.c:176
       ip6_send_skb+0xbc/0x340 net/ipv6/ip6_output.c:1727
       ip6_push_pending_frames+0xc5/0xf0 net/ipv6/ip6_output.c:1747
       rawv6_push_pending_frames net/ipv6/raw.c:615 [inline]
       rawv6_sendmsg+0x3a3e/0x4b40 net/ipv6/raw.c:945
      kobject: 'queues' (0000000089e6eea2): kobject_add_internal: parent: 'tunl0', set: '<NULL>'
      kobject: 'queues' (0000000089e6eea2): kobject_uevent_env
       inet_sendmsg+0x1a1/0x690 net/ipv4/af_inet.c:798
      kobject: 'queues' (0000000089e6eea2): kobject_uevent_env: filter function caused the event to drop!
       sock_sendmsg_nosec net/socket.c:621 [inline]
       sock_sendmsg+0xd5/0x120 net/socket.c:631
       sock_write_iter+0x35e/0x5c0 net/socket.c:900
       call_write_iter include/linux/fs.h:1857 [inline]
       new_sync_write fs/read_write.c:474 [inline]
       __vfs_write+0x6b8/0x9f0 fs/read_write.c:487
      kobject: 'rx-0' (00000000e2d902d9): kobject_add_internal: parent: 'queues', set: 'queues'
      kobject: 'rx-0' (00000000e2d902d9): kobject_uevent_env
       vfs_write+0x1fc/0x560 fs/read_write.c:549
       ksys_write+0x101/0x260 fs/read_write.c:598
      kobject: 'rx-0' (00000000e2d902d9): fill_kobj_path: path = '/devices/virtual/net/tunl0/queues/rx-0'
       __do_sys_write fs/read_write.c:610 [inline]
       __se_sys_write fs/read_write.c:607 [inline]
       __x64_sys_write+0x73/0xb0 fs/read_write.c:607
       do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
      kobject: 'tx-0' (00000000443b70ac): kobject_add_internal: parent: 'queues', set: 'queues'
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x457669
      Code: fd b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 cb b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007f9bd200bc78 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
      RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000457669
      RDX: 000000000000058f RSI: 00000000200033c0 RDI: 0000000000000003
      kobject: 'tx-0' (00000000443b70ac): kobject_uevent_env
      RBP: 000000000072bf00 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 00007f9bd200c6d4
      R13: 00000000004c2dcc R14: 00000000004da398 R15: 00000000ffffffff
      
      Allocated by task 1304:
       save_stack+0x43/0xd0 mm/kasan/kasan.c:448
       set_track mm/kasan/kasan.c:460 [inline]
       kasan_kmalloc+0xc7/0xe0 mm/kasan/kasan.c:553
       __do_kmalloc_node mm/slab.c:3684 [inline]
       __kmalloc_node_track_caller+0x50/0x70 mm/slab.c:3698
       __kmalloc_reserve.isra.41+0x41/0xe0 net/core/skbuff.c:140
       __alloc_skb+0x155/0x760 net/core/skbuff.c:208
      kobject: 'tx-0' (00000000443b70ac): fill_kobj_path: path = '/devices/virtual/net/tunl0/queues/tx-0'
       alloc_skb include/linux/skbuff.h:1011 [inline]
       __ip6_append_data.isra.49+0x2f1a/0x3f50 net/ipv6/ip6_output.c:1450
       ip6_append_data+0x1bc/0x2d0 net/ipv6/ip6_output.c:1619
       rawv6_sendmsg+0x15ab/0x4b40 net/ipv6/raw.c:938
       inet_sendmsg+0x1a1/0x690 net/ipv4/af_inet.c:798
       sock_sendmsg_nosec net/socket.c:621 [inline]
       sock_sendmsg+0xd5/0x120 net/socket.c:631
       ___sys_sendmsg+0x7fd/0x930 net/socket.c:2116
       __sys_sendmsg+0x11d/0x280 net/socket.c:2154
       __do_sys_sendmsg net/socket.c:2163 [inline]
       __se_sys_sendmsg net/socket.c:2161 [inline]
       __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2161
       do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      kobject: 'gre0' (00000000cb1b2d7b): kobject_add_internal: parent: 'net', set: 'devices'
      
      Freed by task 1304:
       save_stack+0x43/0xd0 mm/kasan/kasan.c:448
       set_track mm/kasan/kasan.c:460 [inline]
       __kasan_slab_free+0x102/0x150 mm/kasan/kasan.c:521
       kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528
       __cache_free mm/slab.c:3498 [inline]
       kfree+0xcf/0x230 mm/slab.c:3817
       skb_free_head+0x93/0xb0 net/core/skbuff.c:553
       pskb_expand_head+0x3b2/0x10d0 net/core/skbuff.c:1498
       __pskb_pull_tail+0x156/0x18a0 net/core/skbuff.c:1896
       pskb_may_pull include/linux/skbuff.h:2188 [inline]
       _decode_session6+0xd11/0x14d0 net/ipv6/xfrm6_policy.c:150
       __xfrm_decode_session+0x71/0x140 net/xfrm/xfrm_policy.c:3272
      kobject: 'gre0' (00000000cb1b2d7b): kobject_uevent_env
       __xfrm_policy_check+0x380/0x2c40 net/xfrm/xfrm_policy.c:3322
       __xfrm_policy_check2 include/net/xfrm.h:1170 [inline]
       xfrm_policy_check include/net/xfrm.h:1175 [inline]
       xfrm6_policy_check include/net/xfrm.h:1185 [inline]
       vti6_rcv+0x4bd/0x8f3 net/ipv6/ip6_vti.c:316
       xfrm6_ipcomp_rcv+0x1a5/0x3a0 net/ipv6/xfrm6_protocol.c:132
       ip6_protocol_deliver_rcu+0x372/0x1940 net/ipv6/ip6_input.c:394
       ip6_input_finish+0x84/0x170 net/ipv6/ip6_input.c:434
       NF_HOOK include/linux/netfilter.h:289 [inline]
       ip6_input+0xe9/0x600 net/ipv6/ip6_input.c:443
       ip6_mc_input+0x514/0x11c0 net/ipv6/ip6_input.c:537
       dst_input include/net/dst.h:450 [inline]
       ip6_rcv_finish+0x17a/0x330 net/ipv6/ip6_input.c:76
       NF_HOOK include/linux/netfilter.h:289 [inline]
       ipv6_rcv+0x115/0x640 net/ipv6/ip6_input.c:272
       __netif_receive_skb_one_core+0x14d/0x200 net/core/dev.c:4973
       __netif_receive_skb+0x2c/0x1e0 net/core/dev.c:5083
       process_backlog+0x24e/0x7a0 net/core/dev.c:5923
      kobject: 'gre0' (00000000cb1b2d7b): fill_kobj_path: path = '/devices/virtual/net/gre0'
       napi_poll net/core/dev.c:6346 [inline]
       net_rx_action+0x7fa/0x19b0 net/core/dev.c:6412
       __do_softirq+0x308/0xb7e kernel/softirq.c:292
      
      The buggy address belongs to the object at ffff888191b8cac0
       which belongs to the cache kmalloc-512 of size 512
      The buggy address is located 176 bytes inside of
       512-byte region [ffff888191b8cac0, ffff888191b8ccc0)
      The buggy address belongs to the page:
      page:ffffea000646e300 count:1 mapcount:0 mapping:ffff8881da800940 index:0x0
      flags: 0x2fffc0000000200(slab)
      raw: 02fffc0000000200 ffffea0006eaaa48 ffffea00065356c8 ffff8881da800940
      raw: 0000000000000000 ffff888191b8c0c0 0000000100000006 0000000000000000
      page dumped because: kasan: bad access detected
      kobject: 'queues' (000000005fd6226e): kobject_add_internal: parent: 'gre0', set: '<NULL>'
      
      Memory state around the buggy address:
       ffff888191b8ca00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
       ffff888191b8ca80: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
      >ffff888191b8cb00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                                                   ^
       ffff888191b8cb80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
       ffff888191b8cc00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      
      Fixes: 0d3c703a ("ipv6: Cleanup IPv6 tunnel receive path")
      Fixes: ed1efb2a ("ipv6: Add support for IPsec virtual tunnel interfaces")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0d2b652b
    • C
      ipv6: explicitly initialize udp6_addr in udp_sock_create6() · cae3c9cf
      Cong Wang 提交于
      [ Upstream commit fb24274546310872eeeaf3d1d53799d8414aa0f2 ]
      
      syzbot reported the use of uninitialized udp6_addr::sin6_scope_id.
      We can just set ::sin6_scope_id to zero, as tunnels are unlikely
      to use an IPv6 address that needs a scope id and there is no
      interface to bind in this context.
      
      For net-next, it looks different as we have cfg->bind_ifindex there
      so we can probably call ipv6_iface_scope_id().
      
      Same for ::sin6_flowinfo, tunnels don't use it.
      
      Fixes: 8024e028 ("udp: Add udp_sock_create for UDP tunnels to open listener socket")
      Reported-by: syzbot+c56449ed3652e6720f30@syzkaller.appspotmail.com
      Cc: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      cae3c9cf
    • G
      ip6mr: Fix potential Spectre v1 vulnerability · 32403fd3
      Gustavo A. R. Silva 提交于
      [ Upstream commit 69d2c86766da2ded2b70281f1bf242cb0d58a778 ]
      
      vr.mifi is indirectly controlled by user-space, hence leading to
      a potential exploitation of the Spectre variant 1 vulnerability.
      
      This issue was detected with the help of Smatch:
      
      net/ipv6/ip6mr.c:1845 ip6mr_ioctl() warn: potential spectre issue 'mrt->vif_table' [r] (local cap)
      net/ipv6/ip6mr.c:1919 ip6mr_compat_ioctl() warn: potential spectre issue 'mrt->vif_table' [r] (local cap)
      
      Fix this by sanitizing vr.mifi before using it to index mrt->vif_table'
      
      Notice that given that speculation windows are large, the policy is
      to kill the speculation on the first load and not worry if it can be
      completed with a dependent load/store [1].
      
      [1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2Signed-off-by: NGustavo A. R. Silva <gustavo@embeddedor.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      32403fd3
  12. 17 12月, 2018 1 次提交
    • T
      netfilter: nat: fix double register in masquerade modules · 5517d4c6
      Taehee Yoo 提交于
      [ Upstream commit 095faf45e64be00bff4da2d6182dface3d69c9b7 ]
      
      There is a reference counter to ensure that masquerade modules register
      notifiers only once. However, the existing reference counter approach is
      not safe, test commands are:
      
         while :
         do
         	   modprobe ip6t_MASQUERADE &
      	   modprobe nft_masq_ipv6 &
      	   modprobe -rv ip6t_MASQUERADE &
      	   modprobe -rv nft_masq_ipv6 &
         done
      
      numbers below represent the reference counter.
      --------------------------------------------------------
      CPU0        CPU1        CPU2        CPU3        CPU4
      [insmod]    [insmod]    [rmmod]     [rmmod]     [insmod]
      --------------------------------------------------------
      0->1
      register    1->2
                  returns     2->1
      			returns     1->0
                                                      0->1
                                                      register <--
                                          unregister
      --------------------------------------------------------
      
      The unregistation of CPU3 should be processed before the
      registration of CPU4.
      
      In order to fix this, use a mutex instead of reference counter.
      
      splat looks like:
      [  323.869557] watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [modprobe:1381]
      [  323.869574] Modules linked in: nf_tables(+) nf_nat_ipv6(-) nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 n]
      [  323.869574] irq event stamp: 194074
      [  323.898930] hardirqs last  enabled at (194073): [<ffffffff90004a0d>] trace_hardirqs_on_thunk+0x1a/0x1c
      [  323.898930] hardirqs last disabled at (194074): [<ffffffff90004a29>] trace_hardirqs_off_thunk+0x1a/0x1c
      [  323.898930] softirqs last  enabled at (182132): [<ffffffff922006ec>] __do_softirq+0x6ec/0xa3b
      [  323.898930] softirqs last disabled at (182109): [<ffffffff90193426>] irq_exit+0x1a6/0x1e0
      [  323.898930] CPU: 0 PID: 1381 Comm: modprobe Not tainted 4.20.0-rc2+ #27
      [  323.898930] RIP: 0010:raw_notifier_chain_register+0xea/0x240
      [  323.898930] Code: 3c 03 0f 8e f2 00 00 00 44 3b 6b 10 7f 4d 49 bc 00 00 00 00 00 fc ff df eb 22 48 8d 7b 10 488
      [  323.898930] RSP: 0018:ffff888101597218 EFLAGS: 00000206 ORIG_RAX: ffffffffffffff13
      [  323.898930] RAX: 0000000000000000 RBX: ffffffffc04361c0 RCX: 0000000000000000
      [  323.898930] RDX: 1ffffffff26132ae RSI: ffffffffc04aa3c0 RDI: ffffffffc04361d0
      [  323.898930] RBP: ffffffffc04361c8 R08: 0000000000000000 R09: 0000000000000001
      [  323.898930] R10: ffff8881015972b0 R11: fffffbfff26132c4 R12: dffffc0000000000
      [  323.898930] R13: 0000000000000000 R14: 1ffff110202b2e44 R15: ffffffffc04aa3c0
      [  323.898930] FS:  00007f813ed41540(0000) GS:ffff88811ae00000(0000) knlGS:0000000000000000
      [  323.898930] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  323.898930] CR2: 0000559bf2c9f120 CR3: 000000010bc80000 CR4: 00000000001006f0
      [  323.898930] Call Trace:
      [  323.898930]  ? atomic_notifier_chain_register+0x2d0/0x2d0
      [  323.898930]  ? down_read+0x150/0x150
      [  323.898930]  ? sched_clock_cpu+0x126/0x170
      [  323.898930]  ? nf_tables_core_module_init+0xe4/0xe4 [nf_tables]
      [  323.898930]  ? nf_tables_core_module_init+0xe4/0xe4 [nf_tables]
      [  323.898930]  register_netdevice_notifier+0xbb/0x790
      [  323.898930]  ? __dev_close_many+0x2d0/0x2d0
      [  323.898930]  ? __mutex_unlock_slowpath+0x17f/0x740
      [  323.898930]  ? wait_for_completion+0x710/0x710
      [  323.898930]  ? nf_tables_core_module_init+0xe4/0xe4 [nf_tables]
      [  323.898930]  ? up_write+0x6c/0x210
      [  323.898930]  ? nf_tables_core_module_init+0xe4/0xe4 [nf_tables]
      [  324.127073]  ? nf_tables_core_module_init+0xe4/0xe4 [nf_tables]
      [  324.127073]  nft_chain_filter_init+0x1e/0xe8a [nf_tables]
      [  324.127073]  nf_tables_module_init+0x37/0x92 [nf_tables]
      [ ... ]
      
      Fixes: 8dd33cc9 ("netfilter: nf_nat: generalize IPv4 masquerading support for nf_tables")
      Fixes: be6b635c ("netfilter: nf_nat: generalize IPv6 masquerading support for nf_tables")
      Signed-off-by: NTaehee Yoo <ap420073@gmail.com>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      5517d4c6