1. 23 May, 2018 2 commits
  2. 19 May, 2018 1 commit
  3. 18 May, 2018 16 commits
  4. 17 May, 2018 1 commit
    • tcp: purge write queue in tcp_connect_init() · 7f582b24
      Committed by Eric Dumazet
      syzkaller found a reliable way to crash the host, hitting a BUG()
      in __tcp_retransmit_skb()
      
      Malicious MSG_FASTOPEN is the root cause. We need to purge write queue
      in tcp_connect_init() at the point we init snd_una/write_seq.
      
      This patch also replaces the BUG() by a less intrusive WARN_ON_ONCE()
      
      kernel BUG at net/ipv4/tcp_output.c:2837!
      invalid opcode: 0000 [#1] SMP KASAN
      Dumping ftrace buffer:
         (ftrace buffer empty)
      Modules linked in:
      CPU: 0 PID: 5276 Comm: syz-executor0 Not tainted 4.17.0-rc3+ #51
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:__tcp_retransmit_skb+0x2992/0x2eb0 net/ipv4/tcp_output.c:2837
      RSP: 0000:ffff8801dae06ff8 EFLAGS: 00010206
      RAX: ffff8801b9fe61c0 RBX: 00000000ffc18a16 RCX: ffffffff864e1a49
      RDX: 0000000000000100 RSI: ffffffff864e2e12 RDI: 0000000000000005
      RBP: ffff8801dae073a0 R08: ffff8801b9fe61c0 R09: ffffed0039c40dd2
      R10: ffffed0039c40dd2 R11: ffff8801ce206e93 R12: 00000000421eeaad
      R13: ffff8801ce206d4e R14: ffff8801ce206cc0 R15: ffff8801cd4f4a80
      FS:  0000000000000000(0000) GS:ffff8801dae00000(0063) knlGS:00000000096bc900
      CS:  0010 DS: 002b ES: 002b CR0: 0000000080050033
      CR2: 0000000020000000 CR3: 00000001c47b6000 CR4: 00000000001406f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       <IRQ>
       tcp_retransmit_skb+0x2e/0x250 net/ipv4/tcp_output.c:2923
       tcp_retransmit_timer+0xc50/0x3060 net/ipv4/tcp_timer.c:488
       tcp_write_timer_handler+0x339/0x960 net/ipv4/tcp_timer.c:573
       tcp_write_timer+0x111/0x1d0 net/ipv4/tcp_timer.c:593
       call_timer_fn+0x230/0x940 kernel/time/timer.c:1326
       expire_timers kernel/time/timer.c:1363 [inline]
       __run_timers+0x79e/0xc50 kernel/time/timer.c:1666
       run_timer_softirq+0x4c/0x70 kernel/time/timer.c:1692
       __do_softirq+0x2e0/0xaf5 kernel/softirq.c:285
       invoke_softirq kernel/softirq.c:365 [inline]
       irq_exit+0x1d1/0x200 kernel/softirq.c:405
       exiting_irq arch/x86/include/asm/apic.h:525 [inline]
       smp_apic_timer_interrupt+0x17e/0x710 arch/x86/kernel/apic/apic.c:1052
       apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:863
      
      Fixes: cf60af03 ("net-tcp: Fast Open client - sendmsg(MSG_FASTOPEN)")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  5. 14 May, 2018 1 commit
  6. 12 May, 2018 4 commits
    • erspan: auto detect truncated ipv6 packets. · d5db21a3
      Committed by William Tu
      Currently the truncated bit is set only when 1) the mirrored packet
      is larger than mtu and 2) the ipv4 packet tot_len is larger than
      the actual skb->len.  This patch adds another case for detecting
      whether ipv6 packet is truncated or not, by checking the ipv6 header
      payload_len and the skb->len.
      Reported-by: Xiaoyan Jin <xiaoyanj@vmware.com>
      Signed-off-by: William Tu <u9012063@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
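The IPv6 check described in the erspan commit above can be illustrated with a small userspace sketch. It is not the kernel code: `struct mirrored_pkt`, its fields, and `ipv6_pkt_truncated()` are hypothetical names, and the sketch assumes the payload length has already been converted to host byte order. It only shows the comparison between the length the IPv6 header claims and the bytes actually present.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define IPV6_HDR_LEN 40u /* size of the fixed IPv6 header in bytes */

/* Hypothetical stand-in for the fields the check needs; not the kernel's sk_buff. */
struct mirrored_pkt {
    size_t   len;         /* bytes actually present, counted from the link-layer header */
    size_t   l3_offset;   /* offset of the IPv6 header within the packet */
    uint16_t payload_len; /* IPv6 payload_len field, already in host byte order */
};

/* Returns true when the IPv6 header claims more payload than is present. */
bool ipv6_pkt_truncated(const struct mirrored_pkt *p)
{
    size_t hdr_end = p->l3_offset + IPV6_HDR_LEN;

    if (p->len < hdr_end)
        return true; /* not even a complete IPv6 header */
    return p->payload_len > p->len - hdr_end;
}
```

With a 14-byte Ethernet header, a packet that claims 100 bytes of payload but carries only 60 would be reported as truncated, while one carrying all 100 bytes would not.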
    • udp: avoid refcount_t saturation in __udp_gso_segment() · 575b65bc
      Committed by Eric Dumazet
      For some reason, Willem thought that the issue we fixed for TCP
      in commit 7ec318fe ("tcp: gso: avoid refcount_t warning from
      tcp_gso_segment()") was not relevant for UDP GSO.
      
      But syzbot found its way.
      
      refcount_t: saturated; leaking memory.
      WARNING: CPU: 0 PID: 10261 at lib/refcount.c:78 refcount_add_not_zero+0x2d4/0x320 lib/refcount.c:78
      Kernel panic - not syncing: panic_on_warn set ...
      
      CPU: 0 PID: 10261 Comm: syz-executor5 Not tainted 4.17.0-rc3+ #38
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x1b9/0x294 lib/dump_stack.c:113
       panic+0x22f/0x4de kernel/panic.c:184
       __warn.cold.8+0x163/0x1b3 kernel/panic.c:536
       report_bug+0x252/0x2d0 lib/bug.c:186
       fixup_bug arch/x86/kernel/traps.c:178 [inline]
       do_error_trap+0x1de/0x490 arch/x86/kernel/traps.c:296
       do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:315
       invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:992
      RIP: 0010:refcount_add_not_zero+0x2d4/0x320 lib/refcount.c:78
      RSP: 0018:ffff880196db6b90 EFLAGS: 00010282
      RAX: 0000000000000026 RBX: 00000000ffffff01 RCX: ffffc900040d9000
      RDX: 0000000000004a29 RSI: ffffffff8160f6f1 RDI: ffff880196db66f0
      RBP: ffff880196db6c78 R08: ffff8801b33d6740 R09: 0000000000000002
      R10: ffff8801b33d6740 R11: 0000000000000000 R12: 0000000000000000
      R13: 00000000ffffffff R14: ffff880196db6c50 R15: 0000000000020101
       refcount_add+0x1b/0x70 lib/refcount.c:102
       __udp_gso_segment+0xaa5/0xee0 net/ipv4/udp_offload.c:272
       udp4_ufo_fragment+0x592/0x7a0 net/ipv4/udp_offload.c:301
       inet_gso_segment+0x639/0x12b0 net/ipv4/af_inet.c:1342
       skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
       __skb_gso_segment+0x3bb/0x870 net/core/dev.c:2865
       skb_gso_segment include/linux/netdevice.h:4050 [inline]
       validate_xmit_skb+0x54d/0xd90 net/core/dev.c:3122
       __dev_queue_xmit+0xbf8/0x34c0 net/core/dev.c:3579
       dev_queue_xmit+0x17/0x20 net/core/dev.c:3620
       neigh_direct_output+0x15/0x20 net/core/neighbour.c:1401
       neigh_output include/net/neighbour.h:483 [inline]
       ip_finish_output2+0xa5f/0x1840 net/ipv4/ip_output.c:229
       ip_finish_output+0x828/0xf80 net/ipv4/ip_output.c:317
       NF_HOOK_COND include/linux/netfilter.h:277 [inline]
       ip_output+0x21b/0x850 net/ipv4/ip_output.c:405
       dst_output include/net/dst.h:444 [inline]
       ip_local_out+0xc5/0x1b0 net/ipv4/ip_output.c:124
       ip_send_skb+0x40/0xe0 net/ipv4/ip_output.c:1434
       udp_send_skb.isra.37+0x5eb/0x1000 net/ipv4/udp.c:825
       udp_push_pending_frames+0x5c/0xf0 net/ipv4/udp.c:853
       udp_v6_push_pending_frames+0x380/0x3e0 net/ipv6/udp.c:1105
       udp_lib_setsockopt+0x59a/0x600 net/ipv4/udp.c:2403
       udpv6_setsockopt+0x95/0xa0 net/ipv6/udp.c:1447
       sock_common_setsockopt+0x9a/0xe0 net/core/sock.c:3046
       __sys_setsockopt+0x1bd/0x390 net/socket.c:1903
       __do_sys_setsockopt net/socket.c:1914 [inline]
       __se_sys_setsockopt net/socket.c:1911 [inline]
       __x64_sys_setsockopt+0xbe/0x150 net/socket.c:1911
       do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Fixes: ad405857 ("udp: better wmem accounting on gso")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Willem de Bruijn <willemb@google.com>
      Cc: Alexander Duyck <alexander.h.duyck@intel.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Acked-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
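The udp commit message above does not spell out how the counter saturates. One plausible reading, stated here as an assumption, is that a negative size adjustment gets passed to an add routine that takes an unsigned value; the RBX value 0xffffff01 in the trace is what -255 looks like as a 32-bit unsigned number, which would fit that reading. The sketch below models the idea in userspace with a hypothetical `struct sat_counter`; none of these names are kernel APIs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for a saturating reference counter. */
struct sat_counter {
    uint32_t value;
    bool     saturated;
};

/* Add an unsigned amount; saturate instead of wrapping on overflow. */
void sat_add(struct sat_counter *c, uint32_t n)
{
    if (c->saturated)
        return;
    if (n > UINT32_MAX - c->value) {
        c->value = UINT32_MAX;
        c->saturated = true;
        return;
    }
    c->value += n;
}

/* Subtract an unsigned amount, clamping at zero. */
void sat_sub(struct sat_counter *c, uint32_t n)
{
    if (c->saturated)
        return;
    c->value = (n > c->value) ? 0 : c->value - n;
}

/* Apply a signed adjustment by choosing add or sub based on its sign. */
void adjust_charge(struct sat_counter *c, int delta)
{
    if (delta >= 0)
        sat_add(c, (uint32_t)delta);
    else
        sat_sub(c, (uint32_t)(-(int64_t)delta));
}
```

Passing `(uint32_t)-255` straight to `sat_add()` saturates a counter holding 1000, whereas `adjust_charge(&c, -255)` leaves it at 745.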
    • tcp: switch pacing timer to softirq based hrtimer · 73a6bab5
      Committed by Eric Dumazet
      linux-4.16 got support for softirq based hrtimers.
      TCP can switch its pacing hrtimer to this variant, since this
      avoids going through a tasklet and some atomic operations.
      
      pacing timer logic looks like other (jiffies based) tcp timers.
      
      v2: use hrtimer_try_to_cancel() in tcp_clear_xmit_timers()
          to correctly release reference on socket if needed.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: fix memory leaks in udp_sendmsg, ping_v4_sendmsg · 1b97013b
      Committed by Andrey Ignatov
      Fix more memory leaks in ip_cmsg_send() callers. Some of them were fixed
      earlier in 91948309.
      
      * the udp_sendmsg one has been there since the beginning, when the linux
        sources were first added to git;
      * the ping_v4_sendmsg one was copy/pasted in c319b4d7.
      
      Whenever udp_sendmsg() or ping_v4_sendmsg() returns, IP options
      have to be freed if they were allocated previously.
      
      Add a label so that future callers (if any) can jump to it instead of
      calling kfree() before each return, which is easy to forget.
      
      Fixes: c319b4d7 (net: ipv4: add IPPROTO_ICMP socket kind)
      Signed-off-by: Andrey Ignatov <rdna@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
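The cleanup-label approach described in the ipv4 commit above is a common C idiom: every error path after the allocation jumps to one label that releases the buffer, so no early return can skip the free. The following is a minimal userspace sketch, not the kernel change itself. `send_with_options()`, the `out_free` label name, the size limit, and the `freed` counter (present only so the behaviour can be observed) are all illustrative assumptions.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical send routine. Copies the caller's options, validates the
 * payload length, and releases the copy on every exit path via one label.
 * *freed is incremented each time the options copy is released.
 */
int send_with_options(const char *opts, size_t opts_len,
                      size_t payload_len, int *freed)
{
    char *opt_copy = NULL;
    int err = 0;

    if (opts_len) {
        opt_copy = malloc(opts_len);
        if (!opt_copy)
            return -ENOMEM; /* nothing allocated yet, a plain return is safe */
        memcpy(opt_copy, opts, opts_len);
    }

    if (payload_len == 0) {
        err = -EINVAL;
        goto out_free; /* a bare "return err" here would leak opt_copy */
    }
    if (payload_len > 65535) {
        err = -EMSGSIZE;
        goto out_free;
    }

    /* ... the actual transmission would happen here ... */

out_free:
    if (opt_copy) {
        free(opt_copy);
        (*freed)++;
    }
    return err;
}
```

The success path falls through the same label, so the copy is released exactly once regardless of which check fails.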
  7. 11 May, 2018 6 commits
  8. 09 May, 2018 6 commits
  9. 08 May, 2018 1 commit
  10. 07 May, 2018 1 commit
  11. 03 May, 2018 1 commit
    • tcp: restore autocorking · 114f39fe
      Committed by Eric Dumazet
      When adding rb-tree for TCP retransmit queue, we inadvertently broke
      TCP autocorking.
      
      tcp_should_autocork() should really check if the rtx queue is not empty.
      
      Tested:
      
      Before the fix :
      $ nstat -n;./netperf -H 10.246.7.152 -Cc -- -m 500;nstat | grep AutoCork
      MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.246.7.152 () port 0 AF_INET
      Recv   Send    Send                          Utilization       Service Demand
      Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
      Size   Size    Size     Time     Throughput  local    remote   local   remote
      bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB
      
      540000 262144    500    10.00      2682.85   2.47     1.59     3.618   2.329
      TcpExtTCPAutoCorking            33                 0.0
      
      // Same test, but forcing TCP_NODELAY
      $ nstat -n;./netperf -H 10.246.7.152 -Cc -- -D -m 500;nstat | grep AutoCork
      MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.246.7.152 () port 0 AF_INET : nodelay
      Recv   Send    Send                          Utilization       Service Demand
      Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
      Size   Size    Size     Time     Throughput  local    remote   local   remote
      bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB
      
      540000 262144    500    10.00      1408.75   2.44     2.96     6.802   8.259
      TcpExtTCPAutoCorking            1                  0.0
      
      After the fix :
      $ nstat -n;./netperf -H 10.246.7.152 -Cc -- -m 500;nstat | grep AutoCork
      MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.246.7.152 () port 0 AF_INET
      Recv   Send    Send                          Utilization       Service Demand
      Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
      Size   Size    Size     Time     Throughput  local    remote   local   remote
      bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB
      
      540000 262144    500    10.00      5472.46   2.45     1.43     1.761   1.027
      TcpExtTCPAutoCorking            361293             0.0
      
      // With TCP_NODELAY option
      $ nstat -n;./netperf -H 10.246.7.152 -Cc -- -D -m 500;nstat | grep AutoCork
      MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.246.7.152 () port 0 AF_INET : nodelay
      Recv   Send    Send                          Utilization       Service Demand
      Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
      Size   Size    Size     Time     Throughput  local    remote   local   remote
      bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB
      
      540000 262144    500    10.00      5454.96   2.46     1.63     1.775   1.174
      TcpExtTCPAutoCorking            315448             0.0
      
      Fixes: 75c119af ("tcp: implement rb-tree based retransmit queue")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: Michael Wenig <mwenig@vmware.com>
      Tested-by: Michael Wenig <mwenig@vmware.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
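The autocorking commit above says only that tcp_should_autocork() should check whether the rtx queue is not empty. The sketch below models that decision in userspace. The structures and field names are hypothetical, and the other three conditions (packet smaller than the size goal, feature enabled, send-side memory above the packet's own size) are assumptions about what such a predicate typically looks at, not a copy of the kernel function.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical snapshot of the socket state relevant to the decision. */
struct sock_state {
    bool   autocorking_enabled; /* feature switch */
    size_t rtx_queue_len;       /* packets sent but not yet acknowledged */
    size_t wmem_alloc;          /* bytes currently charged to the send side */
};

/* Hypothetical description of the packet waiting to be sent. */
struct pending_skb {
    size_t len;      /* payload bytes in the packet */
    size_t truesize; /* memory charged for the packet */
};

/*
 * Hold back a small packet only while earlier data is still in flight,
 * so that later writes can be coalesced into it.
 */
bool should_autocork(const struct sock_state *sk,
                     const struct pending_skb *skb, size_t size_goal)
{
    return skb->len < size_goal &&
           sk->autocorking_enabled &&
           sk->rtx_queue_len > 0 &&         /* the check the fix calls for */
           sk->wmem_alloc > skb->truesize;
}
```

If the in-flight check never succeeds, small writes go out one by one, which would be consistent with the low TcpExtTCPAutoCorking count (33) in the "before" run compared with 361293 after the fix.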