1. 12 5月, 2017 1 次提交
  2. 10 5月, 2017 1 次提交
  3. 09 5月, 2017 5 次提交
  4. 06 5月, 2017 1 次提交
    • E
      tcp: randomize timestamps on syncookies · 84b114b9
      Eric Dumazet 提交于
      Whole point of randomization was to hide server uptime, but an attacker
      can simply start a syn flood and TCP generates 'old style' timestamps,
      directly revealing server jiffies value.
      
      Also, TSval sent by the server to a particular remote address vary
      depending on syncookies being sent or not, potentially triggering PAWS
      drops for innocent clients.
      
      Lets implement proper randomization, including for SYNcookies.
      
      Also we do not need to export sysctl_tcp_timestamps, since it is not
      used from a module.
      
      In v2, I added Florian feedback and contribution, adding tsoff to
      tcp_get_cookie_sock().
      
      v3 removed one unused variable in tcp_v4_connect() as Florian spotted.
      
      Fixes: 95a22cae ("tcp: randomize tcp timestamp offsets for each connection")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reviewed-by: NFlorian Westphal <fw@strlen.de>
      Tested-by: NFlorian Westphal <fw@strlen.de>
      Cc: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      84b114b9
  5. 04 5月, 2017 2 次提交
    • A
      ipv4, ipv6: ensure raw socket message is big enough to hold an IP header · 86f4c90a
      Alexander Potapenko 提交于
      raw_send_hdrinc() and rawv6_send_hdrinc() expect that the buffer copied
      from the userspace contains the IPv4/IPv6 header, so if too few bytes are
      copied, parts of the header may remain uninitialized.
      
      This bug has been detected with KMSAN.
      
      For the record, the KMSAN report:
      
      ==================================================================
      BUG: KMSAN: use of unitialized memory in nf_ct_frag6_gather+0xf5a/0x44a0
      inter: 0
      CPU: 0 PID: 1036 Comm: probe Not tainted 4.11.0-rc5+ #2455
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:16
       dump_stack+0x143/0x1b0 lib/dump_stack.c:52
       kmsan_report+0x16b/0x1e0 mm/kmsan/kmsan.c:1078
       __kmsan_warning_32+0x5c/0xa0 mm/kmsan/kmsan_instr.c:510
       nf_ct_frag6_gather+0xf5a/0x44a0 net/ipv6/netfilter/nf_conntrack_reasm.c:577
       ipv6_defrag+0x1d9/0x280 net/ipv6/netfilter/nf_defrag_ipv6_hooks.c:68
       nf_hook_entry_hookfn ./include/linux/netfilter.h:102
       nf_hook_slow+0x13f/0x3c0 net/netfilter/core.c:310
       nf_hook ./include/linux/netfilter.h:212
       NF_HOOK ./include/linux/netfilter.h:255
       rawv6_send_hdrinc net/ipv6/raw.c:673
       rawv6_sendmsg+0x2fcb/0x41a0 net/ipv6/raw.c:919
       inet_sendmsg+0x3f8/0x6d0 net/ipv4/af_inet.c:762
       sock_sendmsg_nosec net/socket.c:633
       sock_sendmsg net/socket.c:643
       SYSC_sendto+0x6a5/0x7c0 net/socket.c:1696
       SyS_sendto+0xbc/0xe0 net/socket.c:1664
       do_syscall_64+0x72/0xa0 arch/x86/entry/common.c:285
       entry_SYSCALL64_slow_path+0x25/0x25 arch/x86/entry/entry_64.S:246
      RIP: 0033:0x436e03
      RSP: 002b:00007ffce48baf38 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
      RAX: ffffffffffffffda RBX: 00000000004002b0 RCX: 0000000000436e03
      RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000003
      RBP: 00007ffce48baf90 R08: 00007ffce48baf50 R09: 000000000000001c
      R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
      R13: 0000000000401790 R14: 0000000000401820 R15: 0000000000000000
      origin: 00000000d9400053
       save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
       kmsan_save_stack_with_flags mm/kmsan/kmsan.c:362
       kmsan_internal_poison_shadow+0xb1/0x1a0 mm/kmsan/kmsan.c:257
       kmsan_poison_shadow+0x6d/0xc0 mm/kmsan/kmsan.c:270
       slab_alloc_node mm/slub.c:2735
       __kmalloc_node_track_caller+0x1f4/0x390 mm/slub.c:4341
       __kmalloc_reserve net/core/skbuff.c:138
       __alloc_skb+0x2cd/0x740 net/core/skbuff.c:231
       alloc_skb ./include/linux/skbuff.h:933
       alloc_skb_with_frags+0x209/0xbc0 net/core/skbuff.c:4678
       sock_alloc_send_pskb+0x9ff/0xe00 net/core/sock.c:1903
       sock_alloc_send_skb+0xe4/0x100 net/core/sock.c:1920
       rawv6_send_hdrinc net/ipv6/raw.c:638
       rawv6_sendmsg+0x2918/0x41a0 net/ipv6/raw.c:919
       inet_sendmsg+0x3f8/0x6d0 net/ipv4/af_inet.c:762
       sock_sendmsg_nosec net/socket.c:633
       sock_sendmsg net/socket.c:643
       SYSC_sendto+0x6a5/0x7c0 net/socket.c:1696
       SyS_sendto+0xbc/0xe0 net/socket.c:1664
       do_syscall_64+0x72/0xa0 arch/x86/entry/common.c:285
       return_from_SYSCALL_64+0x0/0x6a arch/x86/entry/entry_64.S:246
      ==================================================================
      
      , triggered by the following syscalls:
        socket(PF_INET6, SOCK_RAW, IPPROTO_RAW) = 3
        sendto(3, NULL, 0, 0, {sa_family=AF_INET6, sin6_port=htons(0), inet_pton(AF_INET6, "ff00::", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = -1 EPERM
      
      A similar report is triggered in net/ipv4/raw.c if we use a PF_INET socket
      instead of a PF_INET6 one.
      Signed-off-by: NAlexander Potapenko <glider@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      86f4c90a
    • E
      tcp: do not inherit fastopen_req from parent · 8b485ce6
      Eric Dumazet 提交于
      Under fuzzer stress, it is possible that a child gets a non NULL
      fastopen_req pointer from its parent at accept() time, when/if parent
      morphs from listener to active session.
      
      We need to make sure this can not happen, by clearing the field after
      socket cloning.
      
      BUG: Double free or freeing an invalid pointer
      Unexpected shadow byte: 0xFB
      CPU: 3 PID: 20933 Comm: syz-executor3 Not tainted 4.11.0+ #306
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs
      01/01/2011
      Call Trace:
       <IRQ>
       __dump_stack lib/dump_stack.c:16 [inline]
       dump_stack+0x292/0x395 lib/dump_stack.c:52
       kasan_object_err+0x1c/0x70 mm/kasan/report.c:164
       kasan_report_double_free+0x5c/0x70 mm/kasan/report.c:185
       kasan_slab_free+0x9d/0xc0 mm/kasan/kasan.c:580
       slab_free_hook mm/slub.c:1357 [inline]
       slab_free_freelist_hook mm/slub.c:1379 [inline]
       slab_free mm/slub.c:2961 [inline]
       kfree+0xe8/0x2b0 mm/slub.c:3882
       tcp_free_fastopen_req net/ipv4/tcp.c:1077 [inline]
       tcp_disconnect+0xc15/0x13e0 net/ipv4/tcp.c:2328
       inet_child_forget+0xb8/0x600 net/ipv4/inet_connection_sock.c:898
       inet_csk_reqsk_queue_add+0x1e7/0x250
      net/ipv4/inet_connection_sock.c:928
       tcp_get_cookie_sock+0x21a/0x510 net/ipv4/syncookies.c:217
       cookie_v4_check+0x1a19/0x28b0 net/ipv4/syncookies.c:384
       tcp_v4_cookie_check net/ipv4/tcp_ipv4.c:1384 [inline]
       tcp_v4_do_rcv+0x731/0x940 net/ipv4/tcp_ipv4.c:1421
       tcp_v4_rcv+0x2dc0/0x31c0 net/ipv4/tcp_ipv4.c:1715
       ip_local_deliver_finish+0x4cc/0xc20 net/ipv4/ip_input.c:216
       NF_HOOK include/linux/netfilter.h:257 [inline]
       ip_local_deliver+0x1ce/0x700 net/ipv4/ip_input.c:257
       dst_input include/net/dst.h:492 [inline]
       ip_rcv_finish+0xb1d/0x20b0 net/ipv4/ip_input.c:396
       NF_HOOK include/linux/netfilter.h:257 [inline]
       ip_rcv+0xd8c/0x19c0 net/ipv4/ip_input.c:487
       __netif_receive_skb_core+0x1ad1/0x3400 net/core/dev.c:4210
       __netif_receive_skb+0x2a/0x1a0 net/core/dev.c:4248
       process_backlog+0xe5/0x6c0 net/core/dev.c:4868
       napi_poll net/core/dev.c:5270 [inline]
       net_rx_action+0xe70/0x18e0 net/core/dev.c:5335
       __do_softirq+0x2fb/0xb99 kernel/softirq.c:284
       do_softirq_own_stack+0x1c/0x30 arch/x86/entry/entry_64.S:899
       </IRQ>
       do_softirq.part.17+0x1e8/0x230 kernel/softirq.c:328
       do_softirq kernel/softirq.c:176 [inline]
       __local_bh_enable_ip+0x1cf/0x1e0 kernel/softirq.c:181
       local_bh_enable include/linux/bottom_half.h:31 [inline]
       rcu_read_unlock_bh include/linux/rcupdate.h:931 [inline]
       ip_finish_output2+0x9ab/0x15e0 net/ipv4/ip_output.c:230
       ip_finish_output+0xa35/0xdf0 net/ipv4/ip_output.c:316
       NF_HOOK_COND include/linux/netfilter.h:246 [inline]
       ip_output+0x1f6/0x7b0 net/ipv4/ip_output.c:404
       dst_output include/net/dst.h:486 [inline]
       ip_local_out+0x95/0x160 net/ipv4/ip_output.c:124
       ip_queue_xmit+0x9a8/0x1a10 net/ipv4/ip_output.c:503
       tcp_transmit_skb+0x1ade/0x3470 net/ipv4/tcp_output.c:1057
       tcp_write_xmit+0x79e/0x55b0 net/ipv4/tcp_output.c:2265
       __tcp_push_pending_frames+0xfa/0x3a0 net/ipv4/tcp_output.c:2450
       tcp_push+0x4ee/0x780 net/ipv4/tcp.c:683
       tcp_sendmsg+0x128d/0x39b0 net/ipv4/tcp.c:1342
       inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:762
       sock_sendmsg_nosec net/socket.c:633 [inline]
       sock_sendmsg+0xca/0x110 net/socket.c:643
       SYSC_sendto+0x660/0x810 net/socket.c:1696
       SyS_sendto+0x40/0x50 net/socket.c:1664
       entry_SYSCALL_64_fastpath+0x1f/0xbe
      RIP: 0033:0x446059
      RSP: 002b:00007faa6761fb58 EFLAGS: 00000282 ORIG_RAX: 000000000000002c
      RAX: ffffffffffffffda RBX: 0000000000000017 RCX: 0000000000446059
      RDX: 0000000000000001 RSI: 0000000020ba3fcd RDI: 0000000000000017
      RBP: 00000000006e40a0 R08: 0000000020ba4ff0 R09: 0000000000000010
      R10: 0000000020000000 R11: 0000000000000282 R12: 0000000000708150
      R13: 0000000000000000 R14: 00007faa676209c0 R15: 00007faa67620700
      Object at ffff88003b5bbcb8, in cache kmalloc-64 size: 64
      Allocated:
      PID = 20909
       save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
       save_stack+0x43/0xd0 mm/kasan/kasan.c:513
       set_track mm/kasan/kasan.c:525 [inline]
       kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:616
       kmem_cache_alloc_trace+0x82/0x270 mm/slub.c:2745
       kmalloc include/linux/slab.h:490 [inline]
       kzalloc include/linux/slab.h:663 [inline]
       tcp_sendmsg_fastopen net/ipv4/tcp.c:1094 [inline]
       tcp_sendmsg+0x221a/0x39b0 net/ipv4/tcp.c:1139
       inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:762
       sock_sendmsg_nosec net/socket.c:633 [inline]
       sock_sendmsg+0xca/0x110 net/socket.c:643
       SYSC_sendto+0x660/0x810 net/socket.c:1696
       SyS_sendto+0x40/0x50 net/socket.c:1664
       entry_SYSCALL_64_fastpath+0x1f/0xbe
      Freed:
      PID = 20909
       save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
       save_stack+0x43/0xd0 mm/kasan/kasan.c:513
       set_track mm/kasan/kasan.c:525 [inline]
       kasan_slab_free+0x73/0xc0 mm/kasan/kasan.c:589
       slab_free_hook mm/slub.c:1357 [inline]
       slab_free_freelist_hook mm/slub.c:1379 [inline]
       slab_free mm/slub.c:2961 [inline]
       kfree+0xe8/0x2b0 mm/slub.c:3882
       tcp_free_fastopen_req net/ipv4/tcp.c:1077 [inline]
       tcp_disconnect+0xc15/0x13e0 net/ipv4/tcp.c:2328
       __inet_stream_connect+0x20c/0xf90 net/ipv4/af_inet.c:593
       tcp_sendmsg_fastopen net/ipv4/tcp.c:1111 [inline]
       tcp_sendmsg+0x23a8/0x39b0 net/ipv4/tcp.c:1139
       inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:762
       sock_sendmsg_nosec net/socket.c:633 [inline]
       sock_sendmsg+0xca/0x110 net/socket.c:643
       SYSC_sendto+0x660/0x810 net/socket.c:1696
       SyS_sendto+0x40/0x50 net/socket.c:1664
       entry_SYSCALL_64_fastpath+0x1f/0xbe
      
      Fixes: e994b2f0 ("tcp: do not lock listener to process SYN packets")
      Fixes: 7db92362 ("tcp: fix potential double free issue for fastopen_req")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: NAndrey Konovalov <andreyknvl@google.com>
      Acked-by: NWei Wang <weiwan@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8b485ce6
  6. 03 5月, 2017 1 次提交
  7. 02 5月, 2017 1 次提交
    • I
      net/esp4: Fix invalid esph pointer crash · 67d349ed
      Ilan Tayari 提交于
      Both esp_output and esp_xmit take a pointer to the ESP header
      and place it in esp_info struct prior to calling esp_output_head.
      
      Inside esp_output_head, the call to esp_output_udp_encap
      makes sure to update the pointer if it gets invalid.
      However, if esp_output_head itself calls skb_cow_data, the
      pointer is not updated and stays invalid, causing a crash
      after esp_output_head returns.
      
      Update the pointer if it becomes invalid in esp_output_head
      
      Fixes: fca11ebd ("esp4: Reorganize esp_output")
      Signed-off-by: NIlan Tayari <ilant@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      67d349ed
  8. 01 5月, 2017 3 次提交
  9. 29 4月, 2017 2 次提交
  10. 28 4月, 2017 1 次提交
  11. 27 4月, 2017 12 次提交
  12. 26 4月, 2017 5 次提交
  13. 25 4月, 2017 5 次提交
    • W
      net/tcp_fastopen: Remove mss check in tcp_write_timeout() · 59450f8d
      Wei Wang 提交于
      Christoph Paasch from Apple found another firewall issue for TFO:
      After successful 3WHS using TFO, server and client starts to exchange
      data. Afterwards, a 10s idle time occurs on this connection. After that,
      firewall starts to drop every packet on this connection.
      
      The fix for this issue is to extend existing firewall blackhole detection
      logic in tcp_write_timeout() by removing the mss check.
      Signed-off-by: NWei Wang <weiwan@google.com>
      Acked-by: NYuchung Cheng <ycheng@google.com>
      Acked-by: NNeal Cardwell <ncardwell@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      59450f8d
    • W
      net/tcp_fastopen: Add snmp counter for blackhole detection · 46c2fa39
      Wei Wang 提交于
      This counter records the number of times the firewall blackhole issue is
      detected and active TFO is disabled.
      Signed-off-by: NWei Wang <weiwan@google.com>
      Acked-by: NYuchung Cheng <ycheng@google.com>
      Acked-by: NNeal Cardwell <ncardwell@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      46c2fa39
    • W
      net/tcp_fastopen: Disable active side TFO in certain scenarios · cf1ef3f0
      Wei Wang 提交于
      Middlebox firewall issues can potentially cause server's data being
      blackholed after a successful 3WHS using TFO. Following are the related
      reports from Apple:
      https://www.nanog.org/sites/default/files/Paasch_Network_Support.pdf
      Slide 31 identifies an issue where the client ACK to the server's data
      sent during a TFO'd handshake is dropped.
      C ---> syn-data ---> S
      C <--- syn/ack ----- S
      C (accept & write)
      C <---- data ------- S
      C ----- ACK -> X     S
      		[retry and timeout]
      
      https://www.ietf.org/proceedings/94/slides/slides-94-tcpm-13.pdf
      Slide 5 shows a similar situation that the server's data gets dropped
      after 3WHS.
      C ---- syn-data ---> S
      C <--- syn/ack ----- S
      C ---- ack --------> S
      S (accept & write)
      C?  X <- data ------ S
      		[retry and timeout]
      
      This is the worst failure b/c the client can not detect such behavior to
      mitigate the situation (such as disabling TFO). Failing to proceed, the
      application (e.g., SSL library) may simply timeout and retry with TFO
      again, and the process repeats indefinitely.
      
      The proposed solution is to disable active TFO globally under the
      following circumstances:
      1. client side TFO socket detects out of order FIN
      2. client side TFO socket receives out of order RST
      
      We disable active side TFO globally for 1hr at first. Then if it
      happens again, we disable it for 2h, then 4h, 8h, ...
      And we reset the timeout to 1hr if a client side TFO sockets not opened
      on loopback has successfully received data segs from server.
      And we examine this condition during close().
      
      The rational behind it is that when such firewall issue happens,
      application running on the client should eventually close the socket as
      it is not able to get the data it is expecting. Or application running
      on the server should close the socket as it is not able to receive any
      response from client.
      In both cases, out of order FIN or RST will get received on the client
      given that the firewall will not block them as no data are in those
      frames.
      And we want to disable active TFO globally as it helps if the middle box
      is very close to the client and most of the connections are likely to
      fail.
      
      Also, add a debug sysctl:
        tcp_fastopen_blackhole_detect_timeout_sec:
          the initial timeout to use when firewall blackhole issue happens.
          This can be set and read.
          When setting it to 0, it means to disable the active disable logic.
      Signed-off-by: NWei Wang <weiwan@google.com>
      Acked-by: NYuchung Cheng <ycheng@google.com>
      Acked-by: NNeal Cardwell <ncardwell@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      cf1ef3f0
    • D
      net: add rcu locking when changing early demux · 58c4c6a3
      David Ahern 提交于
      systemd-sysctl is triggering a suspicious RCU usage message when
      net.ipv4.tcp_early_demux or net.ipv4.udp_early_demux is changed via
      a sysctl config file:
      
      [   33.896184] ===============================
      [   33.899558] [ ERR: suspicious RCU usage.  ]
      [   33.900624] 4.11.0-rc7+ #104 Not tainted
      [   33.901698] -------------------------------
      [   33.903059] /home/dsa/kernel-2.git/net/ipv4/sysctl_net_ipv4.c:305 suspicious rcu_dereference_check() usage!
      [   33.905724]
      other info that might help us debug this:
      
      [   33.907656]
      rcu_scheduler_active = 2, debug_locks = 0
      [   33.909288] 1 lock held by systemd-sysctl/143:
      [   33.910373]  #0:  (sb_writers#5){.+.+.+}, at: [<ffffffff8123a370>] file_start_write+0x45/0x48
      [   33.912407]
      stack backtrace:
      [   33.914018] CPU: 0 PID: 143 Comm: systemd-sysctl Not tainted 4.11.0-rc7+ #104
      [   33.915631] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
      [   33.917870] Call Trace:
      [   33.918431]  dump_stack+0x81/0xb6
      [   33.919241]  lockdep_rcu_suspicious+0x10f/0x118
      [   33.920263]  proc_configure_early_demux+0x65/0x10a
      [   33.921391]  proc_udp_early_demux+0x3a/0x41
      
      add rcu locking to proc_configure_early_demux.
      
      Fixes: dddb64bc ("net: Add sysctl to toggle early demux for tcp and udp")
      Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      58c4c6a3
    • A
      udp: disable inner UDP checksum offloads in IPsec case · b40c5f4f
      Ansis Atteka 提交于
      Otherwise, UDP checksum offloads could corrupt ESP packets by attempting
      to calculate UDP checksum when this inner UDP packet is already protected
      by IPsec.
      
      One way to reproduce this bug is to have a VM with virtio_net driver (UFO
      set to ON in the guest VM); and then encapsulate all guest's Ethernet
      frames in Geneve; and then further encrypt Geneve with IPsec.  In this
      case following symptoms are observed:
      1. If using ixgbe NIC, then it will complain with following error message:
         ixgbe 0000:01:00.1: partial checksum but l4 proto=32!
      2. Receiving IPsec stack will drop all the corrupted ESP packets and
         increase XfrmInStateProtoError counter in /proc/net/xfrm_stat.
      3. iperf UDP test from the VM with packet sizes above MTU will not work at
         all.
      4. iperf TCP test from the VM will get ridiculously low performance because.
      Signed-off-by: NAnsis Atteka <aatteka@ovn.org>
      Co-authored-by: NSteffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b40c5f4f