1. 05 Oct, 2018 4 commits
  2. 03 Oct, 2018 8 commits
  3. 27 Sep, 2018 3 commits
  4. 22 Sep, 2018 4 commits
    • net/ipfrag: let ip[6]frag_high_thresh in ns be higher than in init_net · 83619623
      Peter Oskolkov committed
      Currently, ip[6]frag_high_thresh sysctl values in new namespaces are
      hard-limited to those of the root/init ns.
      
      There are at least two use cases when it would be desirable to
      set the high_thresh values higher in a child namespace vs the global hard
      limit:
      
      - a security/ddos protection policy may lower the thresholds in the
        root/init ns but allow for a special exception in a child namespace
      - testing: a test running in a namespace may want to set these
        thresholds higher in its namespace than what is in the root/init ns
      
      The new behavior:
      
       # ip netns add testns
       # ip netns exec testns bash
      
       # sysctl -w net.ipv4.ipfrag_high_thresh=9000000
       net.ipv4.ipfrag_high_thresh = 9000000
      
       # sysctl net.ipv4.ipfrag_high_thresh
       net.ipv4.ipfrag_high_thresh = 9000000
      
       # sysctl -w net.ipv6.ip6frag_high_thresh=9000000
       net.ipv6.ip6frag_high_thresh = 9000000
      
       # sysctl net.ipv6.ip6frag_high_thresh
       net.ipv6.ip6frag_high_thresh = 9000000
      
      The old behavior:
      
       # ip netns add testns
       # ip netns exec testns bash
      
       # sysctl -w net.ipv4.ipfrag_high_thresh=9000000
       net.ipv4.ipfrag_high_thresh = 9000000
      
       # sysctl net.ipv4.ipfrag_high_thresh
       net.ipv4.ipfrag_high_thresh = 4194304
      
       # sysctl -w net.ipv6.ip6frag_high_thresh=9000000
       net.ipv6.ip6frag_high_thresh = 9000000
      
       # sysctl net.ipv6.ip6frag_high_thresh
       net.ipv6.ip6frag_high_thresh = 4194304
      Signed-off-by: Peter Oskolkov <posk@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
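The difference between the two transcripts in the ip[6]frag_high_thresh commit above can be modeled in user space. This is a minimal sketch, not the kernel code: `struct frag_ns` and `write_high_thresh()` are hypothetical names, and the model assumes (consistent with the old-behavior transcript, where the write appears to succeed but the value read back is unchanged) that an out-of-range write is skipped silently.

```c
#include <stddef.h>

/* Hypothetical userspace model of a per-namespace threshold. */
struct frag_ns {
    unsigned long high_thresh;
};

/*
 * Store newval unless an upper bound is supplied and exceeded.
 * An out-of-range value is skipped without reporting an error
 * (an assumption of this model, matching the transcript).
 */
static void write_high_thresh(struct frag_ns *ns, unsigned long newval,
                              const unsigned long *max)
{
    if (max != NULL && newval > *max)
        return;
    ns->high_thresh = newval;
}
```

Passing the init namespace's value as `max` models the old behavior; passing `NULL` models the new behavior shown above, where the child namespace accepts the higher value.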
    • ipv6: discard IP frag queue on more errors · 2475f59c
      Peter Oskolkov committed
      This is similar to how ipv4 now behaves:
      commit 0ff89efb ("ip: fail fast on IP defrag errors").
      Signed-off-by: Peter Oskolkov <posk@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/ipv6: Display all addresses in output of /proc/net/if_inet6 · 86f9bd1f
      Jeff Barnhill committed
      The backend handling for /proc/net/if_inet6 in addrconf.c doesn't properly
      handle starting/stopping the iteration.  The problem is that at some point
      during the iteration, an overflow is detected and the process is
      subsequently stopped.  The item being shown via seq_printf() when the
      overflow occurs is not actually shown, though.  When start() is
      subsequently called to resume iterating, it returns the next item, and
      thus the item that was being processed when the overflow occurred never
      gets printed.
      
      Alter the meaning of the private data member "offset".  Currently, when it
      is not 0 (which only happens at the very beginning), "offset" represents
      the next hlist item to be printed.  After this change, "offset" always
      represents the current item.
      
      This is also consistent with the private data member "bucket", which
      represents the current bucket, and also the use of "pos" as defined in
      seq_file.txt:
          The pos passed to start() will always be either zero, or the most
          recent pos used in the previous session.
      Signed-off-by: Jeff Barnhill <0xeffeff@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
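The lost-item problem described in the /proc/net/if_inet6 commit above can be illustrated with a simplified model. This sketch is not the seq_file API or the addrconf.c code: `emit_session()`, `dump_all()` and the `resume_on_next` switch are hypothetical, and a fixed `room` per session stands in for the output buffer filling up.

```c
#include <stddef.h>

/*
 * Copy items[start..] into out until `room` entries were emitted in
 * this session. Returns the position to resume from. When the buffer
 * overflows on item i, that item has not been emitted.
 *   resume_on_next = 1: resume at i + 1, so item i is lost (old meaning)
 *   resume_on_next = 0: resume at i, the current item (new meaning)
 */
static size_t emit_session(const int *items, size_t n, size_t start,
                           int *out, size_t *out_len, size_t room,
                           int resume_on_next)
{
    size_t i = start;
    size_t used = 0;

    while (i < n) {
        if (used == room)   /* overflow detected while handling item i */
            return resume_on_next ? i + 1 : i;
        out[(*out_len)++] = items[i];
        used++;
        i++;
    }
    return n;
}

/* Run sessions until all positions are consumed; returns items emitted.
 * room must be greater than zero. */
static size_t dump_all(const int *items, size_t n, int *out, size_t room,
                       int resume_on_next)
{
    size_t pos = 0;
    size_t out_len = 0;

    while (pos < n)
        pos = emit_session(items, n, pos, out, &out_len, room,
                           resume_on_next);
    return out_len;
}
```

With five items and room for two per session, the "current item" variant emits all five, while the "next item" variant drops the item on which the first overflow occurred.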
    • ipv6: remove redundant null pointer check before kfree_skb · f2a2f216
      zhong jiang committed
      kfree_skb() already handles a NULL pointer, so it is safe to remove
      the redundant NULL pointer check before calling it.
      Signed-off-by: zhong jiang <zhongjiang@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  5. 20 Sep, 2018 2 commits
    • ip6_tunnel: be careful when accessing the inner header · 76c0ddd8
      Paolo Abeni committed
      The ip6 tunnel xmit ndo assumes that the processed skb always
      contains an ip[v6] header, but syzbot has found a way to send
      frames that fall short of this assumption, leading to the following splat:
      
      BUG: KMSAN: uninit-value in ip6ip6_tnl_xmit net/ipv6/ip6_tunnel.c:1307
      [inline]
      BUG: KMSAN: uninit-value in ip6_tnl_start_xmit+0x7d2/0x1ef0
      net/ipv6/ip6_tunnel.c:1390
      CPU: 0 PID: 4504 Comm: syz-executor558 Not tainted 4.16.0+ #87
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
      Google 01/01/2011
      Call Trace:
        __dump_stack lib/dump_stack.c:17 [inline]
        dump_stack+0x185/0x1d0 lib/dump_stack.c:53
        kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1067
        __msan_warning_32+0x6c/0xb0 mm/kmsan/kmsan_instr.c:683
        ip6ip6_tnl_xmit net/ipv6/ip6_tunnel.c:1307 [inline]
        ip6_tnl_start_xmit+0x7d2/0x1ef0 net/ipv6/ip6_tunnel.c:1390
        __netdev_start_xmit include/linux/netdevice.h:4066 [inline]
        netdev_start_xmit include/linux/netdevice.h:4075 [inline]
        xmit_one net/core/dev.c:3026 [inline]
        dev_hard_start_xmit+0x5f1/0xc70 net/core/dev.c:3042
        __dev_queue_xmit+0x27ee/0x3520 net/core/dev.c:3557
        dev_queue_xmit+0x4b/0x60 net/core/dev.c:3590
        packet_snd net/packet/af_packet.c:2944 [inline]
        packet_sendmsg+0x7c70/0x8a30 net/packet/af_packet.c:2969
        sock_sendmsg_nosec net/socket.c:630 [inline]
        sock_sendmsg net/socket.c:640 [inline]
        ___sys_sendmsg+0xec0/0x1310 net/socket.c:2046
        __sys_sendmmsg+0x42d/0x800 net/socket.c:2136
        SYSC_sendmmsg+0xc4/0x110 net/socket.c:2167
        SyS_sendmmsg+0x63/0x90 net/socket.c:2162
        do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287
        entry_SYSCALL_64_after_hwframe+0x3d/0xa2
      RIP: 0033:0x441819
      RSP: 002b:00007ffe58ee8268 EFLAGS: 00000213 ORIG_RAX: 0000000000000133
      RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000441819
      RDX: 0000000000000002 RSI: 0000000020000100 RDI: 0000000000000003
      RBP: 00000000006cd018 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000213 R12: 0000000000402510
      R13: 00000000004025a0 R14: 0000000000000000 R15: 0000000000000000
      
      Uninit was created at:
        kmsan_save_stack_with_flags mm/kmsan/kmsan.c:278 [inline]
        kmsan_internal_poison_shadow+0xb8/0x1b0 mm/kmsan/kmsan.c:188
        kmsan_kmalloc+0x94/0x100 mm/kmsan/kmsan.c:314
        kmsan_slab_alloc+0x11/0x20 mm/kmsan/kmsan.c:321
        slab_post_alloc_hook mm/slab.h:445 [inline]
        slab_alloc_node mm/slub.c:2737 [inline]
        __kmalloc_node_track_caller+0xaed/0x11c0 mm/slub.c:4369
        __kmalloc_reserve net/core/skbuff.c:138 [inline]
        __alloc_skb+0x2cf/0x9f0 net/core/skbuff.c:206
        alloc_skb include/linux/skbuff.h:984 [inline]
        alloc_skb_with_frags+0x1d4/0xb20 net/core/skbuff.c:5234
        sock_alloc_send_pskb+0xb56/0x1190 net/core/sock.c:2085
        packet_alloc_skb net/packet/af_packet.c:2803 [inline]
        packet_snd net/packet/af_packet.c:2894 [inline]
        packet_sendmsg+0x6454/0x8a30 net/packet/af_packet.c:2969
        sock_sendmsg_nosec net/socket.c:630 [inline]
        sock_sendmsg net/socket.c:640 [inline]
        ___sys_sendmsg+0xec0/0x1310 net/socket.c:2046
        __sys_sendmmsg+0x42d/0x800 net/socket.c:2136
        SYSC_sendmmsg+0xc4/0x110 net/socket.c:2167
        SyS_sendmmsg+0x63/0x90 net/socket.c:2162
        do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287
        entry_SYSCALL_64_after_hwframe+0x3d/0xa2
      
      This change addresses the issue by adding the needed check before
      accessing the inner header.
      
      The ipv4 side of the issue is apparently there since the ipv4 over ipv6
      initial support, and the ipv6 side predates git history.
      
      Fixes: c4d3efaf ("[IPV6] IP6TUNNEL: Add support to IPv4 over IPv6 tunnel.")
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Reported-by: syzbot+3fde91d4d394747d6db4@syzkaller.appspotmail.com
      Tested-by: Alexander Potapenko <glider@google.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
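The idea of validating the inner header before reading it, as in the ip6_tunnel commit above, can be sketched in user space. This is an illustration only and not the check the patch adds: `inner_header_ok()` is a hypothetical helper that works on a plain byte buffer and keys on the version nibble of the first byte, which is an assumption of this sketch.

```c
#include <stddef.h>
#include <stdint.h>

#define IPV4_MIN_HDR_LEN 20  /* IPv4 header without options */
#define IPV6_HDR_LEN     40  /* fixed IPv6 header */

/*
 * Return 1 and store the IP version when buf is long enough to hold
 * the fixed header for that version; return 0 otherwise, so callers
 * never read header fields beyond the data that was actually supplied.
 */
static int inner_header_ok(const uint8_t *buf, size_t len,
                           unsigned int *version)
{
    unsigned int v;

    if (len < 1)
        return 0;
    v = buf[0] >> 4;
    if (v == 4 && len >= IPV4_MIN_HDR_LEN) {
        *version = 4;
        return 1;
    }
    if (v == 6 && len >= IPV6_HDR_LEN) {
        *version = 6;
        return 1;
    }
    return 0;
}
```

A frame that is too short is rejected before any header field is touched, which is the property the uninit-value report shows was missing.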
    • ipv6: Allow the l3mdev to be a loopback · 3ede0bbc
      Robert Shearman committed
      There is currently no way for an IPv6 client to connect using a loopback
      address in a VRF, whereas for IPv4 the loopback address can be added:
      
          $ sudo ip addr add dev vrfred 127.0.0.1/8
          $ sudo ip -6 addr add ::1/128 dev vrfred
          RTNETLINK answers: Cannot assign requested address
      
      So allow ::1 to be configured on an L3 master device. In order for
      this to be usable ip_route_output_flags needs to not consider ::1 to
      be a link scope address (since oif == l3mdev and so it would be
      dropped), and ipv6_rcv needs to consider the l3mdev to be a loopback
      device so that it doesn't drop the packets.
      Signed-off-by: Robert Shearman <rshearma@vyatta.att-mail.com>
      Signed-off-by: Mike Manning <mmanning@vyatta.att-mail.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  6. 19 Sep, 2018 2 commits
    • ipv6: fix memory leak on dst->_metrics · ce7ea4af
      Wei Wang committed
      When dst->_metrics and f6i->fib6_metrics share the same memory, both
      take a reference on the dst_metrics structure. However, when the dst is
      destroyed, ip6_dst_destroy() only invokes dst_destroy_metrics_generic(),
      which does not handle READONLY metrics and does not release the refcnt.
      This causes a memory leak.
      Similar to the ipv4 logic, the fix is to properly release the refcnt and
      free the memory pointed to by dst->_metrics if the refcnt becomes 0.
      
      Fixes: 93531c67 ("net/ipv6: separate handling of FIB entries from dst based routes")
      Reported-by: Sabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: Wei Wang <weiwan@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
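The release rule described in the dst->_metrics commit above (every holder drops its reference, and the last one frees the memory) can be sketched as follows. This is a userspace model with hypothetical names (`struct metrics_model`, `metrics_get()`, `metrics_put()`); it uses a plain integer counter rather than the kernel's atomic reference counting.

```c
#include <stdlib.h>

/* Hypothetical shared, reference-counted metrics block. */
struct metrics_model {
    int refcnt;
    unsigned int mtu;
};

/* Allocate a block owned by one holder. Returns NULL on failure. */
static struct metrics_model *metrics_alloc(unsigned int mtu)
{
    struct metrics_model *m = malloc(sizeof(*m));

    if (m != NULL) {
        m->refcnt = 1;
        m->mtu = mtu;
    }
    return m;
}

/* A second holder takes its own reference to the same block. */
static struct metrics_model *metrics_get(struct metrics_model *m)
{
    m->refcnt++;
    return m;
}

/* Drop one reference; free on the last one. Returns 1 if freed. */
static int metrics_put(struct metrics_model *m)
{
    if (--m->refcnt == 0) {
        free(m);
        return 1;
    }
    return 0;
}
```

If one of the two holders is destroyed without calling the put step, the count never reaches zero and the block is never freed, which is the leak the commit message describes.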
    • Revert "ipv6: fix double refcount of fib6_metrics" · 86758605
      Wei Wang committed
      This reverts commit e70a3aad.
      
      This change causes use-after-free on dst->_metrics.
      The crash trace looks like this:
      [   97.763269] BUG: KASAN: use-after-free in ip6_mtu+0x116/0x140
      [   97.769038] Read of size 4 at addr ffff881781d2cf84 by task svw_NetThreadEv/8801
      
      [   97.777954] CPU: 76 PID: 8801 Comm: svw_NetThreadEv Not tainted 4.15.0-smp-DEV #11
      [   97.777956] Hardware name: Default string Default string/Indus_QC_02, BIOS 5.46.4 03/29/2018
      [   97.777957] Call Trace:
      [   97.777971]  [<ffffffff895709db>] dump_stack+0x4d/0x72
      [   97.777985]  [<ffffffff881651df>] print_address_description+0x6f/0x260
      [   97.777997]  [<ffffffff88165747>] kasan_report+0x257/0x370
      [   97.778001]  [<ffffffff894488e6>] ? ip6_mtu+0x116/0x140
      [   97.778004]  [<ffffffff881658b9>] __asan_report_load4_noabort+0x19/0x20
      [   97.778008]  [<ffffffff894488e6>] ip6_mtu+0x116/0x140
      [   97.778013]  [<ffffffff892bb91e>] tcp_current_mss+0x12e/0x280
      [   97.778016]  [<ffffffff892bb7f0>] ? tcp_mtu_to_mss+0x2d0/0x2d0
      [   97.778022]  [<ffffffff887b45b8>] ? depot_save_stack+0x138/0x4a0
      [   97.778037]  [<ffffffff87c38985>] ? __mmdrop+0x145/0x1f0
      [   97.778040]  [<ffffffff881643b1>] ? save_stack+0xb1/0xd0
      [   97.778046]  [<ffffffff89264c82>] tcp_send_mss+0x22/0x220
      [   97.778059]  [<ffffffff89273a49>] tcp_sendmsg_locked+0x4f9/0x39f0
      [   97.778062]  [<ffffffff881642b4>] ? kasan_check_write+0x14/0x20
      [   97.778066]  [<ffffffff89273550>] ? tcp_sendpage+0x60/0x60
      [   97.778070]  [<ffffffff881cb359>] ? rw_copy_check_uvector+0x69/0x280
      [   97.778075]  [<ffffffff8873c65f>] ? import_iovec+0x9f/0x430
      [   97.778078]  [<ffffffff88164be7>] ? kasan_slab_free+0x87/0xc0
      [   97.778082]  [<ffffffff8873c5c0>] ? memzero_page+0x140/0x140
      [   97.778085]  [<ffffffff881642b4>] ? kasan_check_write+0x14/0x20
      [   97.778088]  [<ffffffff89276f6c>] tcp_sendmsg+0x2c/0x50
      [   97.778092]  [<ffffffff89276f6c>] ? tcp_sendmsg+0x2c/0x50
      [   97.778098]  [<ffffffff89352d43>] inet_sendmsg+0x103/0x480
      [   97.778102]  [<ffffffff89352c40>] ? inet_gso_segment+0x15b0/0x15b0
      [   97.778105]  [<ffffffff890294da>] sock_sendmsg+0xba/0xf0
      [   97.778108]  [<ffffffff8902ab6a>] ___sys_sendmsg+0x6ca/0x8e0
      [   97.778113]  [<ffffffff87dccac1>] ? hrtimer_try_to_cancel+0x71/0x3b0
      [   97.778116]  [<ffffffff8902a4a0>] ? copy_msghdr_from_user+0x3d0/0x3d0
      [   97.778119]  [<ffffffff881646d1>] ? memset+0x31/0x40
      [   97.778123]  [<ffffffff87a0cff5>] ? schedule_hrtimeout_range_clock+0x165/0x380
      [   97.778127]  [<ffffffff87a0ce90>] ? hrtimer_nanosleep_restart+0x250/0x250
      [   97.778130]  [<ffffffff87dcc700>] ? __hrtimer_init+0x180/0x180
      [   97.778133]  [<ffffffff87dd1f82>] ? ktime_get_ts64+0x172/0x200
      [   97.778137]  [<ffffffff8822b8ec>] ? __fget_light+0x8c/0x2f0
      [   97.778141]  [<ffffffff8902d5c6>] __sys_sendmsg+0xe6/0x190
      [   97.778144]  [<ffffffff8902d5c6>] ? __sys_sendmsg+0xe6/0x190
      [   97.778147]  [<ffffffff8902d4e0>] ? SyS_shutdown+0x20/0x20
      [   97.778152]  [<ffffffff87cd4370>] ? wake_up_q+0xe0/0xe0
      [   97.778155]  [<ffffffff8902d670>] ? __sys_sendmsg+0x190/0x190
      [   97.778158]  [<ffffffff8902d683>] SyS_sendmsg+0x13/0x20
      [   97.778162]  [<ffffffff87a1600c>] do_syscall_64+0x2ac/0x430
      [   97.778166]  [<ffffffff87c17515>] ? do_page_fault+0x35/0x3d0
      [   97.778171]  [<ffffffff8960131f>] ? page_fault+0x2f/0x50
      [   97.778174]  [<ffffffff89600071>] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
      [   97.778177] RIP: 0033:0x7f83fa36000d
      [   97.778178] RSP: 002b:00007f83ef9229e0 EFLAGS: 00000293 ORIG_RAX: 000000000000002e
      [   97.778180] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f83fa36000d
      [   97.778182] RDX: 0000000000004000 RSI: 00007f83ef922f00 RDI: 0000000000000036
      [   97.778183] RBP: 00007f83ef923040 R08: 00007f83ef9231f8 R09: 00007f83ef923168
      [   97.778184] R10: 0000000000000000 R11: 0000000000000293 R12: 00007f83f69c5b40
      [   97.778185] R13: 000000000000001c R14: 0000000000000001 R15: 0000000000004000
      
      [   97.779684] Allocated by task 5919:
      [   97.783185]  save_stack+0x46/0xd0
      [   97.783187]  kasan_kmalloc+0xad/0xe0
      [   97.783189]  kmem_cache_alloc_trace+0xdf/0x580
      [   97.783190]  ip6_convert_metrics.isra.79+0x7e/0x190
      [   97.783192]  ip6_route_info_create+0x60a/0x2480
      [   97.783193]  ip6_route_add+0x1d/0x80
      [   97.783195]  inet6_rtm_newroute+0xdd/0xf0
      [   97.783198]  rtnetlink_rcv_msg+0x641/0xb10
      [   97.783200]  netlink_rcv_skb+0x27b/0x3e0
      [   97.783202]  rtnetlink_rcv+0x15/0x20
      [   97.783203]  netlink_unicast+0x4be/0x720
      [   97.783204]  netlink_sendmsg+0x7bc/0xbf0
      [   97.783205]  sock_sendmsg+0xba/0xf0
      [   97.783207]  ___sys_sendmsg+0x6ca/0x8e0
      [   97.783208]  __sys_sendmsg+0xe6/0x190
      [   97.783209]  SyS_sendmsg+0x13/0x20
      [   97.783211]  do_syscall_64+0x2ac/0x430
      [   97.783213]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
      
      [   97.784709] Freed by task 0:
      [   97.785056] knetbase: Error: /proc/sys/net/core/txcs_enable does not exist
      [   97.794497]  save_stack+0x46/0xd0
      [   97.794499]  kasan_slab_free+0x71/0xc0
      [   97.794500]  kfree+0x7c/0xf0
      [   97.794501]  fib6_info_destroy_rcu+0x24f/0x310
      [   97.794504]  rcu_process_callbacks+0x38b/0x1730
      [   97.794506]  __do_softirq+0x1c8/0x5d0
      Reported-by: John Sperbeck <jsperbeck@google.com>
      Signed-off-by: Wei Wang <weiwan@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  7. 18 Sep, 2018 1 commit
    • net/ipv6: do not copy dst flags on rt init · 30bfd930
      Peter Oskolkov committed
      DST_NOCOUNT in dst_entry::flags tracks whether the entry counts
      toward route cache size (net->ipv6.sysctl.ip6_rt_max_size).
      
      If the flag is NOT set, dst_ops::pcpuc_entries counter is incremented
      in dst_init() and decremented in dst_destroy().
      
      This flag is tied to allocation/deallocation of dst_entry and
      should not be copied from another dst/route. Otherwise it can happen
      that dst_ops::pcpuc_entries counter grows until no new routes can
      be allocated because the counter reached ip6_rt_max_size due to
      DST_NOCOUNT not set and thus no counter decrements on gc-ed routes.
      
      Fixes: 3b6761d1 ("net/ipv6: Move dst flags to booleans in fib entries")
      Cc: David Ahern <dsahern@gmail.com>
      Acked-by: Wei Wang <weiwan@google.com>
      Signed-off-by: Peter Oskolkov <posk@google.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
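The counter imbalance described in the DST_NOCOUNT commit above can be reproduced in a small model. Everything here is hypothetical (`MODEL_NOCOUNT`, `struct dst_ops_model`, `struct dst_model`); the sketch only mirrors the rule stated in the message: count on init unless the flag is set, uncount on destroy unless the flag is set.

```c
/* Hypothetical stand-in for the "do not count this entry" flag. */
#define MODEL_NOCOUNT 0x1u

struct dst_ops_model {
    long entries;          /* models the route cache size counter */
};

struct dst_model {
    unsigned int flags;
};

/* Count the entry at allocation time unless the flag is set. */
static void dst_model_init(struct dst_ops_model *ops, struct dst_model *dst,
                           unsigned int flags)
{
    dst->flags = flags;
    if (!(flags & MODEL_NOCOUNT))
        ops->entries++;
}

/* Uncount the entry at destruction time unless the flag is set. */
static void dst_model_destroy(struct dst_ops_model *ops,
                              struct dst_model *dst)
{
    if (!(dst->flags & MODEL_NOCOUNT))
        ops->entries--;
}
```

If an entry is counted at init and the flag is then copied in from another route, destroy skips the decrement and the counter stays one too high for every such route.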
  8. 17 Sep, 2018 3 commits
  9. 14 Sep, 2018 2 commits
  10. 13 Sep, 2018 2 commits
    • ipv6: use rt6_info members when dst is set in rt6_fill_node · 22d0bd82
      Xin Long committed
      In inet6_rtm_getroute, since commit 93531c67 ("net/ipv6: separate
      handling of FIB entries from dst based routes"), rt->from has been used
      to dump route info instead of rt.

      However, for some routes, such as cached ones, information like flags
      or gateway is not the same as that of the 'from' route. This caused 'ip
      route get' to dump the wrong route information.

      In Jianlin's testing, the output even lost the expiration time for a
      pmtu route cache entry due to the wrong fib6_flags.

      So use the rt6_info members for dst addr, src addr, flags and gateway
      when dumping a route entry without fibmatch set.
      
      v1->v2:
        - not use rt6i_prefsrc.
        - also fix the gw dump issue.
      
      Fixes: 93531c67 ("net/ipv6: separate handling of FIB entries from dst based routes")
      Reported-by: Jianlin Shi <jishi@redhat.com>
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: Add sockopt IPV6_MULTICAST_ALL analogue to IP_MULTICAST_ALL · 15033f04
      Andre Naujoks committed
      The socket option is enabled by default to ensure that current behaviour
      is not changed. This is the same as for the IPv4 version.

      A socket bound to in6addr_any and a specific port will receive all traffic
      on that port. Analogous to IP_MULTICAST_ALL, disabling the option stops
      this behaviour if one or more multicast groups were joined (using said
      socket), and only passes on multicast traffic from groups that were
      explicitly joined via this socket.

      Unless this option is disabled, a socket (or even a system) joined to
      multiple multicast groups is very hard to get right. Filtering by
      destination address has to take place in user space to avoid receiving
      multicast traffic from other multicast groups, which might have traffic
      on the same port.

      The IP_MULTICAST_ALL socket option is not simply extended to apply to
      ipv6 as well, to avoid changing the behaviour of current applications.
      Signed-off-by: Andre Naujoks <nautsch2@gmail.com>
      Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
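From user space, disabling the option described in the IPV6_MULTICAST_ALL commit above might look like the following sketch. `disable_ipv6_multicast_all()` is a hypothetical wrapper; the fallback definition of `IPV6_MULTICAST_ALL` as 29 is taken from the Linux UAPI header and is only used when the C library headers do not provide the constant.

```c
#include <netinet/in.h>
#include <sys/socket.h>

#ifndef IPV6_MULTICAST_ALL
#define IPV6_MULTICAST_ALL 29   /* fallback for older libc headers */
#endif

/*
 * Ask the kernel to deliver only multicast traffic for groups that
 * were joined through this socket. Returns 0 on success, or -1 with
 * errno set, for example on kernels that lack the option.
 */
static int disable_ipv6_multicast_all(int fd)
{
    int off = 0;

    return setsockopt(fd, IPPROTO_IPV6, IPV6_MULTICAST_ALL,
                      &off, sizeof(off));
}
```

Since the option is enabled by default, a caller that wants per-socket filtering would make this call once after creating the socket; checking the return value lets it fall back to user-space filtering where the option is unavailable.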
  11. 11 Sep, 2018 2 commits
  12. 10 Sep, 2018 1 commit
    • ip: frags: fix crash in ip_do_fragment() · 5d407b07
      Taehee Yoo committed
      A kernel crash occurs when a defragmented packet is fragmented
      in ip_do_fragment().
      In the defragment routine, skb_orphan() is called and
      skb->ip_defrag_offset is set. But skb->sk and
      skb->ip_defrag_offset are members of the same union, so
      frag->sk is not NULL.
      Hence the crash occurs in the skb->sk check in ip_do_fragment() when
      a defragmented packet is fragmented.
      
      test commands:
         %iptables -t nat -I POSTROUTING -j MASQUERADE
         %hping3 192.168.4.2 -s 1000 -p 2000 -d 60000
      
      splat looks like:
      [  261.069429] kernel BUG at net/ipv4/ip_output.c:636!
      [  261.075753] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
      [  261.083854] CPU: 1 PID: 1349 Comm: hping3 Not tainted 4.19.0-rc2+ #3
      [  261.100977] RIP: 0010:ip_do_fragment+0x1613/0x2600
      [  261.106945] Code: e8 e2 38 e3 fe 4c 8b 44 24 18 48 8b 74 24 08 e9 92 f6 ff ff 80 3c 02 00 0f 85 da 07 00 00 48 8b b5 d0 00 00 00 e9 25 f6 ff ff <0f> 0b 0f 0b 44 8b 54 24 58 4c 8b 4c 24 18 4c 8b 5c 24 60 4c 8b 6c
      [  261.127015] RSP: 0018:ffff8801031cf2c0 EFLAGS: 00010202
      [  261.134156] RAX: 1ffff1002297537b RBX: ffffed0020639e6e RCX: 0000000000000004
      [  261.142156] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880114ba9bd8
      [  261.150157] RBP: ffff880114ba8a40 R08: ffffed0022975395 R09: ffffed0022975395
      [  261.158157] R10: 0000000000000001 R11: ffffed0022975394 R12: ffff880114ba9ca4
      [  261.166159] R13: 0000000000000010 R14: ffff880114ba9bc0 R15: dffffc0000000000
      [  261.174169] FS:  00007fbae2199700(0000) GS:ffff88011b400000(0000) knlGS:0000000000000000
      [  261.183012] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  261.189013] CR2: 00005579244fe000 CR3: 0000000119bf4000 CR4: 00000000001006e0
      [  261.198158] Call Trace:
      [  261.199018]  ? dst_output+0x180/0x180
      [  261.205011]  ? save_trace+0x300/0x300
      [  261.209018]  ? ip_copy_metadata+0xb00/0xb00
      [  261.213034]  ? sched_clock_local+0xd4/0x140
      [  261.218158]  ? kill_l4proto+0x120/0x120 [nf_conntrack]
      [  261.223014]  ? rt_cpu_seq_stop+0x10/0x10
      [  261.227014]  ? find_held_lock+0x39/0x1c0
      [  261.233008]  ip_finish_output+0x51d/0xb50
      [  261.237006]  ? ip_fragment.constprop.56+0x220/0x220
      [  261.243011]  ? nf_ct_l4proto_register_one+0x5b0/0x5b0 [nf_conntrack]
      [  261.250152]  ? rcu_is_watching+0x77/0x120
      [  261.255010]  ? nf_nat_ipv4_out+0x1e/0x2b0 [nf_nat_ipv4]
      [  261.261033]  ? nf_hook_slow+0xb1/0x160
      [  261.265007]  ip_output+0x1c7/0x710
      [  261.269005]  ? ip_mc_output+0x13f0/0x13f0
      [  261.273002]  ? __local_bh_enable_ip+0xe9/0x1b0
      [  261.278152]  ? ip_fragment.constprop.56+0x220/0x220
      [  261.282996]  ? nf_hook_slow+0xb1/0x160
      [  261.287007]  raw_sendmsg+0x21f9/0x4420
      [  261.291008]  ? dst_output+0x180/0x180
      [  261.297003]  ? sched_clock_cpu+0x126/0x170
      [  261.301003]  ? find_held_lock+0x39/0x1c0
      [  261.306155]  ? stop_critical_timings+0x420/0x420
      [  261.311004]  ? check_flags.part.36+0x450/0x450
      [  261.315005]  ? _raw_spin_unlock_irq+0x29/0x40
      [  261.320995]  ? _raw_spin_unlock_irq+0x29/0x40
      [  261.326142]  ? cyc2ns_read_end+0x10/0x10
      [  261.330139]  ? raw_bind+0x280/0x280
      [  261.334138]  ? sched_clock_cpu+0x126/0x170
      [  261.338995]  ? check_flags.part.36+0x450/0x450
      [  261.342991]  ? __lock_acquire+0x4500/0x4500
      [  261.348994]  ? inet_sendmsg+0x11c/0x500
      [  261.352989]  ? dst_output+0x180/0x180
      [  261.357012]  inet_sendmsg+0x11c/0x500
      [ ... ]
      
      v2:
       - clear skb->sk at reassembly routine. (Eric Dumazet)
      
      Fixes: fa0f5273 ("ip: use rb trees for IP frag queue.")
      Suggested-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
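The union aliasing at the heart of the ip_do_fragment() commit above can be shown with a tiny model. `struct skb_model` and `frag_sk_is_set()` are hypothetical; only the two member names follow the commit message. The sketch relies on reading one union member after writing another, which GCC and Clang support, and on a null pointer being all zero bits.

```c
#include <stddef.h>

struct sock_model {
    int unused;
};

/* Two members sharing the same storage, as described in the message. */
struct skb_model {
    union {
        struct sock_model *sk;
        int ip_defrag_offset;
    };
};

/*
 * Orphan the buffer (sk = NULL), then record a defrag offset in the
 * shared storage. Optionally clear sk again afterwards, as the v2 note
 * suggests for the reassembly routine. Returns 1 if sk reads as set.
 */
static int frag_sk_is_set(int defrag_offset, int clear_sk_afterwards)
{
    struct skb_model skb;

    skb.sk = NULL;
    skb.ip_defrag_offset = defrag_offset;
    if (clear_sk_afterwards)
        skb.sk = NULL;
    return skb.sk != NULL;
}
```

With a non-zero offset and no clearing step, `sk` reads back as non-NULL even though no socket was ever attached, which is what trips the later check.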
  13. 06 Sep, 2018 2 commits
    • ipv6: add inet6_fill_args · 203651b6
      Christian Brauner committed
      inet6_fill_if{addr,mcaddr, acaddr}() already took 6 arguments which
      meant the 7th argument would need to be pushed onto the stack on x86.
      Add a new struct inet6_fill_args which holds common information passed
      to inet6_fill_if{addr,mcaddr, acaddr}() and shortens the functions to
      three pointer arguments.
      Signed-off-by: Christian Brauner <christian@brauner.io>
      Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
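The refactoring pattern in the inet6_fill_args commit above, bundling several scalar parameters into one struct so the function takes three pointers, can be sketched generically. The types and the function below are hypothetical models and do not reproduce the kernel's struct inet6_fill_args or its fill functions.

```c
#include <stdint.h>

/* Hypothetical bundle of per-request parameters. */
struct fill_args_model {
    uint32_t portid;
    uint32_t seq;
    int event;
    unsigned int flags;
};

/* Hypothetical address record to be reported. */
struct addr_model {
    uint32_t addr;
};

/* Hypothetical output message. */
struct msg_model {
    uint32_t portid;
    uint32_t seq;
    int event;
    unsigned int flags;
    uint32_t addr;
};

/* Three pointer arguments instead of a long scalar parameter list. */
static int fill_addr_model(struct msg_model *msg,
                           const struct addr_model *ifa,
                           const struct fill_args_model *args)
{
    msg->portid = args->portid;
    msg->seq = args->seq;
    msg->event = args->event;
    msg->flags = args->flags;
    msg->addr = ifa->addr;
    return 0;
}
```

Adding another per-request field later only touches the struct and the callers that set it, rather than every function signature in the call chain.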
    • ipv6: enable IFA_TARGET_NETNSID for RTM_GETADDR · 6ecf4c37
      Christian Brauner committed
      - Backwards Compatibility:
        If userspace wants to determine whether ipv6 RTM_GETADDR requests
        support the new IFA_TARGET_NETNSID property, it should verify that the
        reply includes the IFA_TARGET_NETNSID property. If it does not,
        userspace should assume that IFA_TARGET_NETNSID is not supported for
        ipv6 RTM_GETADDR requests on this kernel.
      - From what I gather from current userspace tools that make use of
        RTM_GETADDR requests some of them pass down struct ifinfomsg when they
        should actually pass down struct ifaddrmsg. To not break existing
        tools that pass down the wrong struct we will do the same as for
        RTM_GETLINK | NLM_F_DUMP requests and not error out when the
        nlmsg_parse() fails.
      
      - Security:
        Callers must have CAP_NET_ADMIN in the owning user namespace of the
        target network namespace.
      Signed-off-by: Christian Brauner <christian@brauner.io>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  14. 04 Sep, 2018 2 commits
  15. 03 Sep, 2018 2 commits
    • xfrm6: call kfree_skb when skb is toobig · 215ab0f0
      Thadeu Lima de Souza Cascardo committed
      After commit d6990976 ("vti6: fix PMTU caching and reporting on xmit"),
      some too big skbs might be passed down to __xfrm6_output, causing it to
      fail to transmit but not free the skb, which leaks the skb and
      consequently leaks dst references.
      
      After running pmtu.sh, that shows as failure to unregister devices in a namespace:
      
      [  311.397671] unregister_netdevice: waiting for veth_b to become free. Usage count = 1
      
      The fix is to call kfree_skb in case of transmit failures.
      
      Fixes: dd767856 ("xfrm6: Don't call icmpv6_send on local error")
      Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
      Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
    • net/ipv6: Only update MTU metric if it set · 15a81b41
      David Ahern committed
      Jan reported a regression after an update to 4.18.5. In this case the ipv6
      default route is set up by systemd-networkd based on data from an RA. The
      RA contains an MTU of 1492, which is used when the route is first inserted,
      but then systemd-networkd pushes down updates to the default route
      without the mtu set.
      
      Prior to the change to fib6_info, metrics such as MTU were held in the
      dst_entry and rt6i_pmtu in rt6_info contained an update to the mtu if
      any. ip6_mtu would look at rt6i_pmtu first and use it if set. If not,
      the value from the metrics is used if it is set and finally falling
      back to the idev value.
      
      After the fib6_info change metrics are contained in the fib6_info struct
      and there is no equivalent to rt6i_pmtu. To maintain consistency with
      the old behavior the new code should only reset the MTU in the metrics
      if the route update has it set.
      
      Fixes: d4ead6b3 ("net/ipv6: move metrics from dst to rt6_info")
      Reported-by: Jan Janssen <medhefgo@web.de>
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
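The rule stated in the MTU metric commit above (reset the MTU in the metrics only when the route update carries one) can be sketched as a small model. `struct route_model` and `route_update_mtu()` are hypothetical, and treating 0 as "MTU not set in the update" is an assumption of this sketch.

```c
/* Hypothetical route with a single MTU metric. */
struct route_model {
    unsigned int mtu;
};

/*
 * Apply a route update. An update MTU of 0 is treated as "not set",
 * so the previously learned value is kept instead of being reset.
 */
static void route_update_mtu(struct route_model *rt, unsigned int update_mtu)
{
    if (update_mtu != 0)
        rt->mtu = update_mtu;
}
```

In the reported scenario, the route is first inserted with an MTU of 1492 from the RA; a later update without an MTU then leaves 1492 in place rather than clearing it.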