1. 18 5月, 2017 1 次提交
    • C
      ipv6: Prevent overrun when parsing v6 header options · 2423496a
      Craig Gallek 提交于
      The KASAN warning repoted below was discovered with a syzkaller
      program.  The reproducer is basically:
        int s = socket(AF_INET6, SOCK_RAW, NEXTHDR_HOP);
        send(s, &one_byte_of_data, 1, MSG_MORE);
        send(s, &more_than_mtu_bytes_data, 2000, 0);
      
      The socket() call sets the nexthdr field of the v6 header to
      NEXTHDR_HOP, the first send call primes the payload with a non zero
      byte of data, and the second send call triggers the fragmentation path.
      
      The fragmentation code tries to parse the header options in order
      to figure out where to insert the fragment option.  Since nexthdr points
      to an invalid option, the calculation of the size of the network header
      can made to be much larger than the linear section of the skb and data
      is read outside of it.
      
      This fix makes ip6_find_1stfrag return an error if it detects
      running out-of-bounds.
      
      [   42.361487] ==================================================================
      [   42.364412] BUG: KASAN: slab-out-of-bounds in ip6_fragment+0x11c8/0x3730
      [   42.365471] Read of size 840 at addr ffff88000969e798 by task ip6_fragment-oo/3789
      [   42.366469]
      [   42.366696] CPU: 1 PID: 3789 Comm: ip6_fragment-oo Not tainted 4.11.0+ #41
      [   42.367628] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.1-1ubuntu1 04/01/2014
      [   42.368824] Call Trace:
      [   42.369183]  dump_stack+0xb3/0x10b
      [   42.369664]  print_address_description+0x73/0x290
      [   42.370325]  kasan_report+0x252/0x370
      [   42.370839]  ? ip6_fragment+0x11c8/0x3730
      [   42.371396]  check_memory_region+0x13c/0x1a0
      [   42.371978]  memcpy+0x23/0x50
      [   42.372395]  ip6_fragment+0x11c8/0x3730
      [   42.372920]  ? nf_ct_expect_unregister_notifier+0x110/0x110
      [   42.373681]  ? ip6_copy_metadata+0x7f0/0x7f0
      [   42.374263]  ? ip6_forward+0x2e30/0x2e30
      [   42.374803]  ip6_finish_output+0x584/0x990
      [   42.375350]  ip6_output+0x1b7/0x690
      [   42.375836]  ? ip6_finish_output+0x990/0x990
      [   42.376411]  ? ip6_fragment+0x3730/0x3730
      [   42.376968]  ip6_local_out+0x95/0x160
      [   42.377471]  ip6_send_skb+0xa1/0x330
      [   42.377969]  ip6_push_pending_frames+0xb3/0xe0
      [   42.378589]  rawv6_sendmsg+0x2051/0x2db0
      [   42.379129]  ? rawv6_bind+0x8b0/0x8b0
      [   42.379633]  ? _copy_from_user+0x84/0xe0
      [   42.380193]  ? debug_check_no_locks_freed+0x290/0x290
      [   42.380878]  ? ___sys_sendmsg+0x162/0x930
      [   42.381427]  ? rcu_read_lock_sched_held+0xa3/0x120
      [   42.382074]  ? sock_has_perm+0x1f6/0x290
      [   42.382614]  ? ___sys_sendmsg+0x167/0x930
      [   42.383173]  ? lock_downgrade+0x660/0x660
      [   42.383727]  inet_sendmsg+0x123/0x500
      [   42.384226]  ? inet_sendmsg+0x123/0x500
      [   42.384748]  ? inet_recvmsg+0x540/0x540
      [   42.385263]  sock_sendmsg+0xca/0x110
      [   42.385758]  SYSC_sendto+0x217/0x380
      [   42.386249]  ? SYSC_connect+0x310/0x310
      [   42.386783]  ? __might_fault+0x110/0x1d0
      [   42.387324]  ? lock_downgrade+0x660/0x660
      [   42.387880]  ? __fget_light+0xa1/0x1f0
      [   42.388403]  ? __fdget+0x18/0x20
      [   42.388851]  ? sock_common_setsockopt+0x95/0xd0
      [   42.389472]  ? SyS_setsockopt+0x17f/0x260
      [   42.390021]  ? entry_SYSCALL_64_fastpath+0x5/0xbe
      [   42.390650]  SyS_sendto+0x40/0x50
      [   42.391103]  entry_SYSCALL_64_fastpath+0x1f/0xbe
      [   42.391731] RIP: 0033:0x7fbbb711e383
      [   42.392217] RSP: 002b:00007ffff4d34f28 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
      [   42.393235] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fbbb711e383
      [   42.394195] RDX: 0000000000001000 RSI: 00007ffff4d34f60 RDI: 0000000000000003
      [   42.395145] RBP: 0000000000000046 R08: 00007ffff4d34f40 R09: 0000000000000018
      [   42.396056] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000400aad
      [   42.396598] R13: 0000000000000066 R14: 00007ffff4d34ee0 R15: 00007fbbb717af00
      [   42.397257]
      [   42.397411] Allocated by task 3789:
      [   42.397702]  save_stack_trace+0x16/0x20
      [   42.398005]  save_stack+0x46/0xd0
      [   42.398267]  kasan_kmalloc+0xad/0xe0
      [   42.398548]  kasan_slab_alloc+0x12/0x20
      [   42.398848]  __kmalloc_node_track_caller+0xcb/0x380
      [   42.399224]  __kmalloc_reserve.isra.32+0x41/0xe0
      [   42.399654]  __alloc_skb+0xf8/0x580
      [   42.400003]  sock_wmalloc+0xab/0xf0
      [   42.400346]  __ip6_append_data.isra.41+0x2472/0x33d0
      [   42.400813]  ip6_append_data+0x1a8/0x2f0
      [   42.401122]  rawv6_sendmsg+0x11ee/0x2db0
      [   42.401505]  inet_sendmsg+0x123/0x500
      [   42.401860]  sock_sendmsg+0xca/0x110
      [   42.402209]  ___sys_sendmsg+0x7cb/0x930
      [   42.402582]  __sys_sendmsg+0xd9/0x190
      [   42.402941]  SyS_sendmsg+0x2d/0x50
      [   42.403273]  entry_SYSCALL_64_fastpath+0x1f/0xbe
      [   42.403718]
      [   42.403871] Freed by task 1794:
      [   42.404146]  save_stack_trace+0x16/0x20
      [   42.404515]  save_stack+0x46/0xd0
      [   42.404827]  kasan_slab_free+0x72/0xc0
      [   42.405167]  kfree+0xe8/0x2b0
      [   42.405462]  skb_free_head+0x74/0xb0
      [   42.405806]  skb_release_data+0x30e/0x3a0
      [   42.406198]  skb_release_all+0x4a/0x60
      [   42.406563]  consume_skb+0x113/0x2e0
      [   42.406910]  skb_free_datagram+0x1a/0xe0
      [   42.407288]  netlink_recvmsg+0x60d/0xe40
      [   42.407667]  sock_recvmsg+0xd7/0x110
      [   42.408022]  ___sys_recvmsg+0x25c/0x580
      [   42.408395]  __sys_recvmsg+0xd6/0x190
      [   42.408753]  SyS_recvmsg+0x2d/0x50
      [   42.409086]  entry_SYSCALL_64_fastpath+0x1f/0xbe
      [   42.409513]
      [   42.409665] The buggy address belongs to the object at ffff88000969e780
      [   42.409665]  which belongs to the cache kmalloc-512 of size 512
      [   42.410846] The buggy address is located 24 bytes inside of
      [   42.410846]  512-byte region [ffff88000969e780, ffff88000969e980)
      [   42.411941] The buggy address belongs to the page:
      [   42.412405] page:ffffea000025a780 count:1 mapcount:0 mapping:          (null) index:0x0 compound_mapcount: 0
      [   42.413298] flags: 0x100000000008100(slab|head)
      [   42.413729] raw: 0100000000008100 0000000000000000 0000000000000000 00000001800c000c
      [   42.414387] raw: ffffea00002a9500 0000000900000007 ffff88000c401280 0000000000000000
      [   42.415074] page dumped because: kasan: bad access detected
      [   42.415604]
      [   42.415757] Memory state around the buggy address:
      [   42.416222]  ffff88000969e880: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      [   42.416904]  ffff88000969e900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      [   42.417591] >ffff88000969e980: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [   42.418273]                    ^
      [   42.418588]  ffff88000969ea00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [   42.419273]  ffff88000969ea80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [   42.419882] ==================================================================
      Reported-by: NAndrey Konovalov <andreyknvl@google.com>
      Signed-off-by: NCraig Gallek <kraig@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2423496a
  2. 16 5月, 2017 1 次提交
    • M
      ipv6: avoid dad-failures for addresses with NODAD · 66eb9f86
      Mahesh Bandewar 提交于
      Every address gets added with TENTATIVE flag even for the addresses with
      IFA_F_NODAD flag and dad-work is scheduled for them. During this DAD process
      we realize it's an address with NODAD and complete the process without
      sending any probe. However the TENTATIVE flags stays on the
      address for sometime enough to cause misinterpretation when we receive a NS.
      While processing NS, if the address has TENTATIVE flag, we mark it DADFAILED
      and endup with an address that was originally configured as NODAD with
      DADFAILED.
      
      We can't avoid scheduling dad_work for addresses with NODAD but we can
      avoid adding TENTATIVE flag to avoid this racy situation.
      Signed-off-by: NMahesh Bandewar <maheshb@google.com>
      Acked-by: NDavid Ahern <dsahern@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      66eb9f86
  3. 12 5月, 2017 1 次提交
  4. 09 5月, 2017 2 次提交
  5. 06 5月, 2017 1 次提交
    • E
      tcp: randomize timestamps on syncookies · 84b114b9
      Eric Dumazet 提交于
      Whole point of randomization was to hide server uptime, but an attacker
      can simply start a syn flood and TCP generates 'old style' timestamps,
      directly revealing server jiffies value.
      
      Also, TSval sent by the server to a particular remote address vary
      depending on syncookies being sent or not, potentially triggering PAWS
      drops for innocent clients.
      
      Lets implement proper randomization, including for SYNcookies.
      
      Also we do not need to export sysctl_tcp_timestamps, since it is not
      used from a module.
      
      In v2, I added Florian feedback and contribution, adding tsoff to
      tcp_get_cookie_sock().
      
      v3 removed one unused variable in tcp_v4_connect() as Florian spotted.
      
      Fixes: 95a22cae ("tcp: randomize tcp timestamp offsets for each connection")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reviewed-by: NFlorian Westphal <fw@strlen.de>
      Tested-by: NFlorian Westphal <fw@strlen.de>
      Cc: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      84b114b9
  6. 05 5月, 2017 1 次提交
  7. 04 5月, 2017 1 次提交
    • A
      ipv4, ipv6: ensure raw socket message is big enough to hold an IP header · 86f4c90a
      Alexander Potapenko 提交于
      raw_send_hdrinc() and rawv6_send_hdrinc() expect that the buffer copied
      from the userspace contains the IPv4/IPv6 header, so if too few bytes are
      copied, parts of the header may remain uninitialized.
      
      This bug has been detected with KMSAN.
      
      For the record, the KMSAN report:
      
      ==================================================================
      BUG: KMSAN: use of unitialized memory in nf_ct_frag6_gather+0xf5a/0x44a0
      inter: 0
      CPU: 0 PID: 1036 Comm: probe Not tainted 4.11.0-rc5+ #2455
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:16
       dump_stack+0x143/0x1b0 lib/dump_stack.c:52
       kmsan_report+0x16b/0x1e0 mm/kmsan/kmsan.c:1078
       __kmsan_warning_32+0x5c/0xa0 mm/kmsan/kmsan_instr.c:510
       nf_ct_frag6_gather+0xf5a/0x44a0 net/ipv6/netfilter/nf_conntrack_reasm.c:577
       ipv6_defrag+0x1d9/0x280 net/ipv6/netfilter/nf_defrag_ipv6_hooks.c:68
       nf_hook_entry_hookfn ./include/linux/netfilter.h:102
       nf_hook_slow+0x13f/0x3c0 net/netfilter/core.c:310
       nf_hook ./include/linux/netfilter.h:212
       NF_HOOK ./include/linux/netfilter.h:255
       rawv6_send_hdrinc net/ipv6/raw.c:673
       rawv6_sendmsg+0x2fcb/0x41a0 net/ipv6/raw.c:919
       inet_sendmsg+0x3f8/0x6d0 net/ipv4/af_inet.c:762
       sock_sendmsg_nosec net/socket.c:633
       sock_sendmsg net/socket.c:643
       SYSC_sendto+0x6a5/0x7c0 net/socket.c:1696
       SyS_sendto+0xbc/0xe0 net/socket.c:1664
       do_syscall_64+0x72/0xa0 arch/x86/entry/common.c:285
       entry_SYSCALL64_slow_path+0x25/0x25 arch/x86/entry/entry_64.S:246
      RIP: 0033:0x436e03
      RSP: 002b:00007ffce48baf38 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
      RAX: ffffffffffffffda RBX: 00000000004002b0 RCX: 0000000000436e03
      RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000003
      RBP: 00007ffce48baf90 R08: 00007ffce48baf50 R09: 000000000000001c
      R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
      R13: 0000000000401790 R14: 0000000000401820 R15: 0000000000000000
      origin: 00000000d9400053
       save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
       kmsan_save_stack_with_flags mm/kmsan/kmsan.c:362
       kmsan_internal_poison_shadow+0xb1/0x1a0 mm/kmsan/kmsan.c:257
       kmsan_poison_shadow+0x6d/0xc0 mm/kmsan/kmsan.c:270
       slab_alloc_node mm/slub.c:2735
       __kmalloc_node_track_caller+0x1f4/0x390 mm/slub.c:4341
       __kmalloc_reserve net/core/skbuff.c:138
       __alloc_skb+0x2cd/0x740 net/core/skbuff.c:231
       alloc_skb ./include/linux/skbuff.h:933
       alloc_skb_with_frags+0x209/0xbc0 net/core/skbuff.c:4678
       sock_alloc_send_pskb+0x9ff/0xe00 net/core/sock.c:1903
       sock_alloc_send_skb+0xe4/0x100 net/core/sock.c:1920
       rawv6_send_hdrinc net/ipv6/raw.c:638
       rawv6_sendmsg+0x2918/0x41a0 net/ipv6/raw.c:919
       inet_sendmsg+0x3f8/0x6d0 net/ipv4/af_inet.c:762
       sock_sendmsg_nosec net/socket.c:633
       sock_sendmsg net/socket.c:643
       SYSC_sendto+0x6a5/0x7c0 net/socket.c:1696
       SyS_sendto+0xbc/0xe0 net/socket.c:1664
       do_syscall_64+0x72/0xa0 arch/x86/entry/common.c:285
       return_from_SYSCALL_64+0x0/0x6a arch/x86/entry/entry_64.S:246
      ==================================================================
      
      , triggered by the following syscalls:
        socket(PF_INET6, SOCK_RAW, IPPROTO_RAW) = 3
        sendto(3, NULL, 0, 0, {sa_family=AF_INET6, sin6_port=htons(0), inet_pton(AF_INET6, "ff00::", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = -1 EPERM
      
      A similar report is triggered in net/ipv4/raw.c if we use a PF_INET socket
      instead of a PF_INET6 one.
      Signed-off-by: NAlexander Potapenko <glider@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      86f4c90a
  8. 03 5月, 2017 1 次提交
    • D
      net: ipv6: Do not duplicate DAD on link up · 6d717134
      David Ahern 提交于
      Andrey reported a warning triggered by the rcu code:
      
      ------------[ cut here ]------------
      WARNING: CPU: 1 PID: 5911 at lib/debugobjects.c:289
      debug_print_object+0x175/0x210
      ODEBUG: activate active (active state 1) object type: rcu_head hint:
              (null)
      Modules linked in:
      CPU: 1 PID: 5911 Comm: a.out Not tainted 4.11.0-rc8+ #271
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:16
       dump_stack+0x192/0x22d lib/dump_stack.c:52
       __warn+0x19f/0x1e0 kernel/panic.c:549
       warn_slowpath_fmt+0xe0/0x120 kernel/panic.c:564
       debug_print_object+0x175/0x210 lib/debugobjects.c:286
       debug_object_activate+0x574/0x7e0 lib/debugobjects.c:442
       debug_rcu_head_queue kernel/rcu/rcu.h:75
       __call_rcu.constprop.76+0xff/0x9c0 kernel/rcu/tree.c:3229
       call_rcu_sched+0x12/0x20 kernel/rcu/tree.c:3288
       rt6_rcu_free net/ipv6/ip6_fib.c:158
       rt6_release+0x1ea/0x290 net/ipv6/ip6_fib.c:188
       fib6_del_route net/ipv6/ip6_fib.c:1461
       fib6_del+0xa42/0xdc0 net/ipv6/ip6_fib.c:1500
       __ip6_del_rt+0x100/0x160 net/ipv6/route.c:2174
       ip6_del_rt+0x140/0x1b0 net/ipv6/route.c:2187
       __ipv6_ifa_notify+0x269/0x780 net/ipv6/addrconf.c:5520
       addrconf_ifdown+0xe60/0x1a20 net/ipv6/addrconf.c:3672
      ...
      
      Andrey's reproducer program runs in a very tight loop, calling
      'unshare -n' and then spawning 2 sets of 14 threads running random ioctl
      calls. The relevant networking sequence:
      
      1. New network namespace created via unshare -n
      - ip6tnl0 device is created in down state
      
      2. address added to ip6tnl0
      - equivalent to ip -6 addr add dev ip6tnl0 fd00::bb/1
      - DAD is started on the address and when it completes the host
        route is inserted into the FIB
      
      3. ip6tnl0 is brought up
      - the new fixup_permanent_addr function restarts DAD on the address
      
      4. exit namespace
      - teardown / cleanup sequence starts
      - once in a blue moon, lo teardown appears to happen BEFORE teardown
        of ip6tunl0
        + down on 'lo' removes the host route from the FIB since the dst->dev
          for the route is loobback
        + host route added to rcu callback list
          * rcu callback has not run yet, so rt is NOT on the gc list so it has
            NOT been marked obsolete
      
      5. in parallel to 4. worker_thread runs addrconf_dad_completed
      - DAD on the address on ip6tnl0 completes
      - calls ipv6_ifa_notify which inserts the host route
      
      All of that happens very quickly. The result is that a host route that
      has been deleted from the IPv6 FIB and added to the RCU list is re-inserted
      into the FIB.
      
      The exit namespace eventually gets to cleaning up ip6tnl0 which removes the
      host route from the FIB again, calls the rcu function for cleanup -- and
      triggers the double rcu trace.
      
      The root cause is duplicate DAD on the address -- steps 2 and 3. Arguably,
      DAD should not be started in step 2. The interface is in the down state,
      so it can not really send out requests for the address which makes starting
      DAD pointless.
      
      Since the second DAD was introduced by a recent change, seems appropriate
      to use it for the Fixes tag and have the fixup function only start DAD for
      addresses in the PREDAD state which occurs in addrconf_ifdown if the
      address is retained.
      
      Big thanks to Andrey for isolating a reliable reproducer for this problem.
      Fixes: f1705ec1 ("net: ipv6: Make address flushing on ifdown optional")
      Reported-by: NAndrey Konovalov <andreyknvl@google.com>
      Signed-off-by: NDavid Ahern <dsahern@gmail.com>
      Tested-by: NAndrey Konovalov <andreyknvl@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6d717134
  9. 02 5月, 2017 2 次提交
    • D
      ipv6: Need to export ipv6_push_frag_opts for tunneling now. · 5b8481fa
      David S. Miller 提交于
      Since that change also made the nfrag function not necessary
      for exports, remove it.
      
      Fixes: 89a23c8b ("ip6_tunnel: Fix missing tunnel encapsulation limit option")
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5b8481fa
    • C
      ip6_tunnel: Fix missing tunnel encapsulation limit option · 89a23c8b
      Craig Gallek 提交于
      The IPv6 tunneling code tries to insert IPV6_TLV_TNL_ENCAP_LIMIT and
      IPV6_TLV_PADN options when an encapsulation limit is defined (the
      default is a limit of 4).  An MTU adjustment is done to account for
      these options as well.  However, the options are never present in the
      generated packets.
      
      The issue appears to be a subtlety between IPV6_DSTOPTS and
      IPV6_RTHDRDSTOPTS defined in RFC 3542.  When the IPIP tunnel driver was
      written, the encap limit options were included as IPV6_RTHDRDSTOPTS in
      dst0opt of struct ipv6_txoptions.  Later, ipv6_push_nfrags_opts was
      (correctly) updated to require IPV6_RTHDR options when IPV6_RTHDRDSTOPTS
      are to be used.  This caused the options to no longer be included in v6
      encapsulated packets.
      
      The fix is to use IPV6_DSTOPTS (in dst1opt of struct ipv6_txoptions)
      instead.  IPV6_DSTOPTS do not have the additional IPV6_RTHDR requirement.
      
      Fixes: 1df64a8569c7: ("[IPV6]: Add ip6ip6 tunnel driver.")
      Fixes: 333fad53: ("[IPV6]: Support several new sockopt / ancillary data in Advanced API (RFC3542)")
      Signed-off-by: NCraig Gallek <kraig@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      89a23c8b
  10. 27 4月, 2017 2 次提交
    • J
      ipv6: check raw payload size correctly in ioctl · 105f5528
      Jamie Bainbridge 提交于
      In situations where an skb is paged, the transport header pointer and
      tail pointer can be the same because the skb contents are in frags.
      
      This results in ioctl(SIOCINQ/FIONREAD) incorrectly returning a
      length of 0 when the length to receive is actually greater than zero.
      
      skb->len is already correctly set in ip6_input_finish() with
      pskb_pull(), so use skb->len as it always returns the correct result
      for both linear and paged data.
      Signed-off-by: NJamie Bainbridge <jbainbri@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      105f5528
    • W
      ipv6: check skb->protocol before lookup for nexthop · 199ab00f
      WANG Cong 提交于
      Andrey reported a out-of-bound access in ip6_tnl_xmit(), this
      is because we use an ipv4 dst in ip6_tnl_xmit() and cast an IPv4
      neigh key as an IPv6 address:
      
              neigh = dst_neigh_lookup(skb_dst(skb),
                                       &ipv6_hdr(skb)->daddr);
              if (!neigh)
                      goto tx_err_link_failure;
      
              addr6 = (struct in6_addr *)&neigh->primary_key; // <=== HERE
              addr_type = ipv6_addr_type(addr6);
      
              if (addr_type == IPV6_ADDR_ANY)
                      addr6 = &ipv6_hdr(skb)->daddr;
      
              memcpy(&fl6->daddr, addr6, sizeof(fl6->daddr));
      
      Also the network header of the skb at this point should be still IPv4
      for 4in6 tunnels, we shold not just use it as IPv6 header.
      
      This patch fixes it by checking if skb->protocol is ETH_P_IPV6: if it
      is, we are safe to do the nexthop lookup using skb_dst() and
      ipv6_hdr(skb)->daddr; if not (aka IPv4), we have no clue about which
      dest address we can pick here, we have to rely on callers to fill it
      from tunnel config, so just fall to ip6_route_output() to make the
      decision.
      
      Fixes: ea3dc960 ("ip6_tunnel: Add support for wildcard tunnel endpoints.")
      Reported-by: NAndrey Konovalov <andreyknvl@google.com>
      Tested-by: NAndrey Konovalov <andreyknvl@google.com>
      Cc: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      199ab00f
  11. 26 4月, 2017 6 次提交
    • F
      netfilter: don't attach a nat extension by default · 9a08ecfe
      Florian Westphal 提交于
      nowadays the NAT extension only stores the interface index
      (used to purge connections that got masqueraded when interface goes down)
      and pptp nat information.
      
      Previous patches moved nf_ct_nat_ext_add to those places that need it.
      Signed-off-by: NFlorian Westphal <fw@strlen.de>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      9a08ecfe
    • F
      netfilter: masquerade: attach nat extension if not present · ff459018
      Florian Westphal 提交于
      Currently the nat extension is always attached as soon as nat module is
      loaded.  However, most NAT uses do not need the nat extension anymore.
      
      Prepare to remove the add-nat-by-default by making those places that need
      it attach it if its not present yet.
      Signed-off-by: NFlorian Westphal <fw@strlen.de>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      ff459018
    • G
      netfilter: SYNPROXY: Return NF_STOLEN instead of NF_DROP during handshaking · 495dcb56
      Gao Feng 提交于
      Current SYNPROXY codes return NF_DROP during normal TCP handshaking,
      it is not friendly to caller. Because the nf_hook_slow would treat
      the NF_DROP as an error, and return -EPERM.
      As a result, it may cause the top caller think it meets one error.
      
      For example, the following codes are from cfv_rx_poll()
      	err = netif_receive_skb(skb);
      	if (unlikely(err)) {
      		++cfv->ndev->stats.rx_dropped;
      	} else {
      		++cfv->ndev->stats.rx_packets;
      		cfv->ndev->stats.rx_bytes += skb_len;
      	}
      When SYNPROXY returns NF_DROP, then netif_receive_skb returns -EPERM.
      As a result, the cfv driver would treat it as an error, and increase
      the rx_dropped counter.
      
      So use NF_STOLEN instead of NF_DROP now because there is no error
      happened indeed, and free the skb directly.
      Signed-off-by: NGao Feng <fgao@ikuai8.com>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      495dcb56
    • F
      netfilter: synproxy: only register hooks when needed · 1fefe147
      Florian Westphal 提交于
      Defer registration of the synproxy hooks until the first SYNPROXY rule is
      added.  Also means we only register hooks in namespaces that need it.
      Signed-off-by: NFlorian Westphal <fw@strlen.de>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      1fefe147
    • D
      net: ipv6: regenerate host route if moved to gc list · 8048ced9
      David Ahern 提交于
      Taking down the loopback device wreaks havoc on IPv6 routing. By
      extension, taking down a VRF device wreaks havoc on its table.
      
      Dmitry and Andrey both reported heap out-of-bounds reports in the IPv6
      FIB code while running syzkaller fuzzer. The root cause is a dead dst
      that is on the garbage list gets reinserted into the IPv6 FIB. While on
      the gc (or perhaps when it gets added to the gc list) the dst->next is
      set to an IPv4 dst. A subsequent walk of the ipv6 tables causes the
      out-of-bounds access.
      
      Andrey's reproducer was the key to getting to the bottom of this.
      
      With IPv6, host routes for an address have the dst->dev set to the
      loopback device. When the 'lo' device is taken down, rt6_ifdown initiates
      a walk of the fib evicting routes with the 'lo' device which means all
      host routes are removed. That process moves the dst which is attached to
      an inet6_ifaddr to the gc list and marks it as dead.
      
      The recent change to keep global IPv6 addresses added a new function,
      fixup_permanent_addr, that is called on admin up. That function restarts
      dad for an inet6_ifaddr and when it completes the host route attached
      to it is inserted into the fib. Since the route was marked dead and
      moved to the gc list, re-inserting the route causes the reported
      out-of-bounds accesses. If the device with the address is taken down
      or the address is removed, the WARN_ON in fib6_del is triggered.
      
      All of those faults are fixed by regenerating the host route if the
      existing one has been moved to the gc list, something that can be
      determined by checking if the rt6i_ref counter is 0.
      
      Fixes: f1705ec1 ("net: ipv6: Make address flushing on ifdown optional")
      Reported-by: NDmitry Vyukov <dvyukov@google.com>
      Reported-by: NAndrey Konovalov <andreyknvl@google.com>
      Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com>
      Acked-by: NMartin KaFai Lau <kafai@fb.com>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8048ced9
    • S
      ipv6: fix source routing · ec9c4215
      Sabrina Dubroca 提交于
      Commit a149e7c7 ("ipv6: sr: add support for SRH injection through
      setsockopt") introduced handling of IPV6_SRCRT_TYPE_4, but at the same
      time restricted it to only IPV6_SRCRT_TYPE_0 and
      IPV6_SRCRT_TYPE_4. Previously, ipv6_push_exthdr() and fl6_update_dst()
      would also handle other values (ie STRICT and TYPE_2).
      
      Restore previous source routing behavior, by handling IPV6_SRCRT_STRICT
      and IPV6_SRCRT_TYPE_2 the same way as IPV6_SRCRT_TYPE_0 in
      ipv6_push_exthdr() and fl6_update_dst().
      
      Fixes: a149e7c7 ("ipv6: sr: add support for SRH injection through setsockopt")
      Signed-off-by: NSabrina Dubroca <sd@queasysnail.net>
      Reviewed-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ec9c4215
  12. 25 4月, 2017 3 次提交
  13. 24 4月, 2017 1 次提交
  14. 22 4月, 2017 5 次提交
    • N
      ip6mr: fix notification device destruction · 723b929c
      Nikolay Aleksandrov 提交于
      Andrey Konovalov reported a BUG caused by the ip6mr code which is caused
      because we call unregister_netdevice_many for a device that is already
      being destroyed. In IPv4's ipmr that has been resolved by two commits
      long time ago by introducing the "notify" parameter to the delete
      function and avoiding the unregister when called from a notifier, so
      let's do the same for ip6mr.
      
      The trace from Andrey:
      ------------[ cut here ]------------
      kernel BUG at net/core/dev.c:6813!
      invalid opcode: 0000 [#1] SMP KASAN
      Modules linked in:
      CPU: 1 PID: 1165 Comm: kworker/u4:3 Not tainted 4.11.0-rc7+ #251
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs
      01/01/2011
      Workqueue: netns cleanup_net
      task: ffff880069208000 task.stack: ffff8800692d8000
      RIP: 0010:rollback_registered_many+0x348/0xeb0 net/core/dev.c:6813
      RSP: 0018:ffff8800692de7f0 EFLAGS: 00010297
      RAX: ffff880069208000 RBX: 0000000000000002 RCX: 0000000000000001
      RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88006af90569
      RBP: ffff8800692de9f0 R08: ffff8800692dec60 R09: 0000000000000000
      R10: 0000000000000006 R11: 0000000000000000 R12: ffff88006af90070
      R13: ffff8800692debf0 R14: dffffc0000000000 R15: ffff88006af90000
      FS:  0000000000000000(0000) GS:ffff88006cb00000(0000)
      knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007fe7e897d870 CR3: 00000000657e7000 CR4: 00000000000006e0
      Call Trace:
       unregister_netdevice_many.part.105+0x87/0x440 net/core/dev.c:7881
       unregister_netdevice_many+0xc8/0x120 net/core/dev.c:7880
       ip6mr_device_event+0x362/0x3f0 net/ipv6/ip6mr.c:1346
       notifier_call_chain+0x145/0x2f0 kernel/notifier.c:93
       __raw_notifier_call_chain kernel/notifier.c:394
       raw_notifier_call_chain+0x2d/0x40 kernel/notifier.c:401
       call_netdevice_notifiers_info+0x51/0x90 net/core/dev.c:1647
       call_netdevice_notifiers net/core/dev.c:1663
       rollback_registered_many+0x919/0xeb0 net/core/dev.c:6841
       unregister_netdevice_many.part.105+0x87/0x440 net/core/dev.c:7881
       unregister_netdevice_many net/core/dev.c:7880
       default_device_exit_batch+0x4fa/0x640 net/core/dev.c:8333
       ops_exit_list.isra.4+0x100/0x150 net/core/net_namespace.c:144
       cleanup_net+0x5a8/0xb40 net/core/net_namespace.c:463
       process_one_work+0xc04/0x1c10 kernel/workqueue.c:2097
       worker_thread+0x223/0x19c0 kernel/workqueue.c:2231
       kthread+0x35e/0x430 kernel/kthread.c:231
       ret_from_fork+0x31/0x40 arch/x86/entry/entry_64.S:430
      Code: 3c 32 00 0f 85 70 0b 00 00 48 b8 00 02 00 00 00 00 ad de 49 89
      47 78 e9 93 fe ff ff 49 8d 57 70 49 8d 5f 78 eb 9e e8 88 7a 14 fe <0f>
      0b 48 8b 9d 28 fe ff ff e8 7a 7a 14 fe 48 b8 00 00 00 00 00
      RIP: rollback_registered_many+0x348/0xeb0 RSP: ffff8800692de7f0
      ---[ end trace e0b29c57e9b3292c ]---
      Reported-by: NAndrey Konovalov <andreyknvl@google.com>
      Signed-off-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Tested-by: NAndrey Konovalov <andreyknvl@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      723b929c
    • D
      net: ipv6: RTF_PCPU should not be settable from userspace · 557c44be
      David Ahern 提交于
      Andrey reported a fault in the IPv6 route code:
      
      kasan: GPF could be caused by NULL-ptr deref or user memory access
      general protection fault: 0000 [#1] SMP KASAN
      Modules linked in:
      CPU: 1 PID: 4035 Comm: a.out Not tainted 4.11.0-rc7+ #250
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
      task: ffff880069809600 task.stack: ffff880062dc8000
      RIP: 0010:ip6_rt_cache_alloc+0xa6/0x560 net/ipv6/route.c:975
      RSP: 0018:ffff880062dced30 EFLAGS: 00010206
      RAX: dffffc0000000000 RBX: ffff8800670561c0 RCX: 0000000000000006
      RDX: 0000000000000003 RSI: ffff880062dcfb28 RDI: 0000000000000018
      RBP: ffff880062dced68 R08: 0000000000000001 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
      R13: ffff880062dcfb28 R14: dffffc0000000000 R15: 0000000000000000
      FS:  00007feebe37e7c0(0000) GS:ffff88006cb00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00000000205a0fe4 CR3: 000000006b5c9000 CR4: 00000000000006e0
      Call Trace:
       ip6_pol_route+0x1512/0x1f20 net/ipv6/route.c:1128
       ip6_pol_route_output+0x4c/0x60 net/ipv6/route.c:1212
      ...
      
      Andrey's syzkaller program passes rtmsg.rtmsg_flags with the RTF_PCPU bit
      set. Flags passed to the kernel are blindly copied to the allocated
      rt6_info by ip6_route_info_create making a newly inserted route appear
      as though it is a per-cpu route. ip6_rt_cache_alloc sees the flag set
      and expects rt->dst.from to be set - which it is not since it is not
      really a per-cpu copy. The subsequent call to __ip6_dst_alloc then
      generates the fault.
      
      Fix by checking for the flag and failing with EINVAL.
      
      Fixes: d52d3997 ("ipv6: Create percpu rt6_info")
      Reported-by: NAndrey Konovalov <andreyknvl@google.com>
      Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com>
      Acked-by: NMartin KaFai Lau <kafai@fb.com>
      Tested-by: NAndrey Konovalov <andreyknvl@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      557c44be
    • C
      ip_tunnel: Allow policy-based routing through tunnels · 9830ad4c
      Craig Gallek 提交于
      This feature allows the administrator to set an fwmark for
      packets traversing a tunnel.  This allows the use of independent
      routing tables for tunneled packets without the use of iptables.
      
      There is no concept of per-packet routing decisions through IPv4
      tunnels, so this implementation does not need to work with
      per-packet route lookups as the v6 implementation may
      (with IP6_TNL_F_USE_ORIG_FWMARK).
      
      Further, since the v4 tunnel ioctls share datastructures
      (which can not be trivially modified) with the kernel's internal
      tunnel configuration structures, the mark attribute must be stored
      in the tunnel structure itself and passed as a parameter when
      creating or changing tunnel attributes.
      Signed-off-by: NCraig Gallek <kraig@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9830ad4c
    • C
      ip6_tunnel: Allow policy-based routing through tunnels · 0a473b82
      Craig Gallek 提交于
      This feature allows the administrator to set an fwmark for
      packets traversing a tunnel.  This allows the use of independent
      routing tables for tunneled packets without the use of iptables.
      Signed-off-by: NCraig Gallek <kraig@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0a473b82
    • D
      ipv6: sr: fix double free of skb after handling invalid SRH · 95b9b88d
      David Lebrun 提交于
      The icmpv6_param_prob() function already does a kfree_skb(),
      this patch removes the duplicate one.
      
      Fixes: 1ababeba ("ipv6: implement dataplane support for rthdr type 4 (Segment Routing Header)")
      Reported-by: NDan Carpenter <dan.carpenter@oracle.com>
      Cc: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: NDavid Lebrun <david.lebrun@uclouvain.be>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      95b9b88d
  15. 21 4月, 2017 2 次提交
  16. 19 4月, 2017 3 次提交
    • I
      esp4/6: Fix GSO path for non-GSO SW-crypto packets · 8f92e03e
      Ilan Tayari 提交于
      If esp*_offload module is loaded, outbound packets take the
      GSO code path, being encapsulated at layer 3, but encrypted
      in layer 2. validate_xmit_xfrm calls esp*_xmit for that.
      
      esp*_xmit was wrongfully detecting these packets as going
      through hardware crypto offload, while in fact they should
      be encrypted in software, causing plaintext leakage to
      the network, and also dropping at the receiver side.
      
      Perform the encryption in esp*_xmit, if the SA doesn't have
      a hardware offload_handle.
      
      Also, align esp6 code to esp4 logic.
      
      Fixes: fca11ebd ("esp4: Reorganize esp_output")
      Fixes: 383d0350 ("esp6: Reorganize esp_output")
      Signed-off-by: NIlan Tayari <ilant@mellanox.com>
      Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com>
      8f92e03e
    • C
      esp6: fix incorrect null pointer check on xo · ffa6f571
      Colin Ian King 提交于
      The check for xo being null is incorrect, currently it is checking
      for non-null, it should be checking for null.
      
      Detected with CoverityScan, CID#1429349 ("Dereference after null check")
      
      Fixes: 7862b405 ("esp: Add gso handlers for esp4 and esp6")
      Signed-off-by: NColin Ian King <colin.king@canonical.com>
      Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com>
      ffa6f571
    • P
      mm: Rename SLAB_DESTROY_BY_RCU to SLAB_TYPESAFE_BY_RCU · 5f0d5a3a
      Paul E. McKenney 提交于
      A group of Linux kernel hackers reported chasing a bug that resulted
      from their assumption that SLAB_DESTROY_BY_RCU provided an existence
      guarantee, that is, that no block from such a slab would be reallocated
      during an RCU read-side critical section.  Of course, that is not the
      case.  Instead, SLAB_DESTROY_BY_RCU only prevents freeing of an entire
      slab of blocks.
      
      However, there is a phrase for this, namely "type safety".  This commit
      therefore renames SLAB_DESTROY_BY_RCU to SLAB_TYPESAFE_BY_RCU in order
      to avoid future instances of this sort of confusion.
      Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: <linux-mm@kvack.org>
      Acked-by: NJohannes Weiner <hannes@cmpxchg.org>
      Acked-by: NVlastimil Babka <vbabka@suse.cz>
      [ paulmck: Add comments mentioning the old name, as requested by Eric
        Dumazet, in order to help people familiar with the old name find
        the new one. ]
      Acked-by: NDavid Rientjes <rientjes@google.com>
      5f0d5a3a
  17. 18 4月, 2017 5 次提交
    • D
      net: rtnetlink: plumb extended ack to doit function · c21ef3e3
      David Ahern 提交于
      Add netlink_ext_ack arg to rtnl_doit_func. Pass extack arg to nlmsg_parse
      for doit functions that call it directly.
      
      This is the first step to using extended error reporting in rtnetlink.
      >From here individual subsystems can be updated to set netlink_ext_ack as
      needed.
      Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c21ef3e3
    • D
      ipv6: sr: fix BUG due to headroom too small after SRH push · af3b5158
      David Lebrun 提交于
      When a locally generated packet receives an SRH with two or more segments,
      the remaining headroom is too small to push an ethernet header. This patch
      ensures that the headroom is large enough after SRH push.
      
      The BUG generated the following trace.
      
      [  192.950285] skbuff: skb_under_panic: text:ffffffff81809675 len:198 put:14 head:ffff88006f306400 data:ffff88006f3063fa tail:0xc0 end:0x2c0 dev:A-1
      [  192.952456] ------------[ cut here ]------------
      [  192.953218] kernel BUG at net/core/skbuff.c:105!
      [  192.953411] invalid opcode: 0000 [#1] PREEMPT SMP
      [  192.953411] Modules linked in:
      [  192.953411] CPU: 5 PID: 3433 Comm: ping6 Not tainted 4.11.0-rc3+ #237
      [  192.953411] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.1-0-g8891697-prebuilt.qemu-project.org 04/01/2014
      [  192.953411] task: ffff88007c2d42c0 task.stack: ffffc90000ef4000
      [  192.953411] RIP: 0010:skb_panic+0x61/0x70
      [  192.953411] RSP: 0018:ffffc90000ef7900 EFLAGS: 00010286
      [  192.953411] RAX: 0000000000000085 RBX: 00000000000086dd RCX: 0000000000000201
      [  192.953411] RDX: 0000000080000201 RSI: ffffffff81d104c5 RDI: 00000000ffffffff
      [  192.953411] RBP: ffffc90000ef7920 R08: 0000000000000001 R09: 0000000000000000
      [  192.953411] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
      [  192.953411] R13: ffff88007c5a4000 R14: ffff88007b363d80 R15: 00000000000000b8
      [  192.953411] FS:  00007f94b558b700(0000) GS:ffff88007fd40000(0000) knlGS:0000000000000000
      [  192.953411] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  192.953411] CR2: 00007fff5ecd5080 CR3: 0000000074141000 CR4: 00000000001406e0
      [  192.953411] Call Trace:
      [  192.953411]  skb_push+0x3b/0x40
      [  192.953411]  eth_header+0x25/0xc0
      [  192.953411]  neigh_resolve_output+0x168/0x230
      [  192.953411]  ? ip6_finish_output2+0x242/0x8f0
      [  192.953411]  ip6_finish_output2+0x242/0x8f0
      [  192.953411]  ? ip6_finish_output2+0x76/0x8f0
      [  192.953411]  ip6_finish_output+0xa8/0x1d0
      [  192.953411]  ip6_output+0x64/0x2d0
      [  192.953411]  ? ip6_output+0x73/0x2d0
      [  192.953411]  ? ip6_dst_check+0xb5/0xc0
      [  192.953411]  ? dst_cache_per_cpu_get.isra.2+0x40/0x80
      [  192.953411]  seg6_output+0xb0/0x220
      [  192.953411]  lwtunnel_output+0xcf/0x210
      [  192.953411]  ? lwtunnel_output+0x59/0x210
      [  192.953411]  ip6_local_out+0x38/0x70
      [  192.953411]  ip6_send_skb+0x2a/0xb0
      [  192.953411]  ip6_push_pending_frames+0x48/0x50
      [  192.953411]  rawv6_sendmsg+0xa39/0xf10
      [  192.953411]  ? __lock_acquire+0x489/0x890
      [  192.953411]  ? __mutex_lock+0x1fc/0x970
      [  192.953411]  ? __lock_acquire+0x489/0x890
      [  192.953411]  ? __mutex_lock+0x1fc/0x970
      [  192.953411]  ? tty_ioctl+0x283/0xec0
      [  192.953411]  inet_sendmsg+0x45/0x1d0
      [  192.953411]  ? _copy_from_user+0x54/0x80
      [  192.953411]  sock_sendmsg+0x33/0x40
      [  192.953411]  SYSC_sendto+0xef/0x170
      [  192.953411]  ? entry_SYSCALL_64_fastpath+0x5/0xc2
      [  192.953411]  ? trace_hardirqs_on_caller+0x12b/0x1b0
      [  192.953411]  ? trace_hardirqs_on_thunk+0x1a/0x1c
      [  192.953411]  SyS_sendto+0x9/0x10
      [  192.953411]  entry_SYSCALL_64_fastpath+0x1f/0xc2
      [  192.953411] RIP: 0033:0x7f94b453db33
      [  192.953411] RSP: 002b:00007fff5ecd0578 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
      [  192.953411] RAX: ffffffffffffffda RBX: 00007fff5ecd16e0 RCX: 00007f94b453db33
      [  192.953411] RDX: 0000000000000040 RSI: 000055a78352e9c0 RDI: 0000000000000003
      [  192.953411] RBP: 00007fff5ecd1690 R08: 000055a78352c940 R09: 000000000000001c
      [  192.953411] R10: 0000000000000000 R11: 0000000000000246 R12: 000055a783321e10
      [  192.953411] R13: 000055a7839890c0 R14: 0000000000000004 R15: 0000000000000000
      [  192.953411] Code: 00 00 48 89 44 24 10 8b 87 c4 00 00 00 48 89 44 24 08 48 8b 87 d8 00 00 00 48 c7 c7 90 58 d2 81 48 89 04 24 31 c0 e8 4f 70 9a ff <0f> 0b 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 48 8b 97 d8 00 00
      [  192.953411] RIP: skb_panic+0x61/0x70 RSP: ffffc90000ef7900
      [  193.000186] ---[ end trace bd0b89fabdf2f92c ]---
      [  193.000951] Kernel panic - not syncing: Fatal exception in interrupt
      [  193.001137] Kernel Offset: disabled
      [  193.001169] ---[ end Kernel panic - not syncing: Fatal exception in interrupt
      
      Fixes: 19d5a26f ("ipv6: sr: expand skb head only if necessary")
      Signed-off-by: NDavid Lebrun <david.lebrun@uclouvain.be>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      af3b5158
    • F
      ipv6: drop non loopback packets claiming to originate from ::1 · 0aa8c13e
      Florian Westphal 提交于
      We lack a saddr check for ::1. This causes security issues e.g. with acls
      permitting connections from ::1 because of assumption that these originate
      from local machine.
      
      Assuming a source address of ::1 is local seems reasonable.
      RFC4291 doesn't allow such a source address either, so drop such packets.
      Reported-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NFlorian Westphal <fw@strlen.de>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0aa8c13e
    • W
      net-timestamp: avoid use-after-free in ip_recv_error · 1862d620
      Willem de Bruijn 提交于
      Syzkaller reported a use-after-free in ip_recv_error at line
      
          info->ipi_ifindex = skb->dev->ifindex;
      
      This function is called on dequeue from the error queue, at which
      point the device pointer may no longer be valid.
      
      Save ifindex on enqueue in __skb_complete_tx_timestamp, when the
      pointer is valid or NULL. Store it in temporary storage skb->cb.
      
      It is safe to reference skb->dev here, as called from device drivers
      or dev_queue_xmit. The exception is when called from tcp_ack_tstamp;
      in that case it is NULL and ifindex is set to 0 (invalid).
      
      Do not return a pktinfo cmsg if ifindex is 0. This maintains the
      current behavior of not returning a cmsg if skb->dev was NULL.
      
      On dequeue, the ipv4 path will cast from sock_exterr_skb to
      in_pktinfo. Both have ifindex as their first element, so no explicit
      conversion is needed. This is by design, introduced in commit
      0b922b7a ("net: original ingress device index in PKTINFO"). For
      ipv6 ip6_datagram_support_cmsg converts to in6_pktinfo.
      
      Fixes: 829ae9d6 ("net-timestamp: allow reading recv cmsg on errqueue with origin tstamp")
      Reported-by: NAndrey Konovalov <andreyknvl@google.com>
      Signed-off-by: NWillem de Bruijn <willemb@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1862d620
    • D
      net: ipv6: send unsolicited NA on admin up · 4a6e3c5d
      David Ahern 提交于
      ndisc_notify is the ipv6 equivalent to arp_notify. When arp_notify is
      set to 1, gratuitous arp requests are sent when the device is brought up.
      The same is expected when ndisc_notify is set to 1 (per ndisc_notify in
      Documentation/networking/ip-sysctl.txt). The NA is not sent on NETDEV_UP
      event; add it.
      
      Fixes: 5cb04436 ("ipv6: add knob to send unsolicited ND on link-layer address change")
      Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com>
      Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4a6e3c5d
  18. 15 4月, 2017 2 次提交
    • F
      netfilter: remove nf_ct_is_untracked · ab8bc7ed
      Florian Westphal 提交于
      This function is now obsolete and always returns false.
      This change has no effect on generated code.
      Signed-off-by: NFlorian Westphal <fw@strlen.de>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      ab8bc7ed
    • F
      netfilter: kill the fake untracked conntrack objects · cc41c84b
      Florian Westphal 提交于
      resurrect an old patch from Pablo Neira to remove the untracked objects.
      
      Currently, there are four possible states of an skb wrt. conntrack.
      
      1. No conntrack attached, ct is NULL.
      2. Normal (kmem cache allocated) ct attached.
      3. a template (kmalloc'd), not in any hash tables at any point in time
      4. the 'untracked' conntrack, a percpu nf_conn object, tagged via
         IPS_UNTRACKED_BIT in ct->status.
      
      Untracked is supposed to be identical to case 1.  It exists only
      so users can check
      
      -m conntrack --ctstate UNTRACKED vs.
      -m conntrack --ctstate INVALID
      
      e.g. attempts to set connmark on INVALID or UNTRACKED conntracks is
      supposed to be a no-op.
      
      Thus currently we need to check
       ct == NULL || nf_ct_is_untracked(ct)
      
      in a lot of places in order to avoid altering untracked objects.
      
      The other consequence of the percpu untracked object is that all
      -j NOTRACK (and, later, kfree_skb of such skbs) result in an atomic op
      (inc/dec the untracked conntracks refcount).
      
      This adds a new kernel-private ctinfo state, IP_CT_UNTRACKED, to
      make the distinction instead.
      
      The (few) places that care about packet invalid (ct is NULL) vs.
      packet untracked now need to test ct == NULL vs. ctinfo == IP_CT_UNTRACKED,
      but all other places can omit the nf_ct_is_untracked() check.
      Signed-off-by: NFlorian Westphal <fw@strlen.de>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      cc41c84b