1. 17 5月, 2018 1 次提交
    • E
      tcp: purge write queue in tcp_connect_init() · 7f582b24
      Eric Dumazet 提交于
      syzkaller found a reliable way to crash the host, hitting a BUG()
      in __tcp_retransmit_skb()
      
      Malicous MSG_FASTOPEN is the root cause. We need to purge write queue
      in tcp_connect_init() at the point we init snd_una/write_seq.
      
      This patch also replaces the BUG() by a less intrusive WARN_ON_ONCE()
      
      kernel BUG at net/ipv4/tcp_output.c:2837!
      invalid opcode: 0000 [#1] SMP KASAN
      Dumping ftrace buffer:
         (ftrace buffer empty)
      Modules linked in:
      CPU: 0 PID: 5276 Comm: syz-executor0 Not tainted 4.17.0-rc3+ #51
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:__tcp_retransmit_skb+0x2992/0x2eb0 net/ipv4/tcp_output.c:2837
      RSP: 0000:ffff8801dae06ff8 EFLAGS: 00010206
      RAX: ffff8801b9fe61c0 RBX: 00000000ffc18a16 RCX: ffffffff864e1a49
      RDX: 0000000000000100 RSI: ffffffff864e2e12 RDI: 0000000000000005
      RBP: ffff8801dae073a0 R08: ffff8801b9fe61c0 R09: ffffed0039c40dd2
      R10: ffffed0039c40dd2 R11: ffff8801ce206e93 R12: 00000000421eeaad
      R13: ffff8801ce206d4e R14: ffff8801ce206cc0 R15: ffff8801cd4f4a80
      FS:  0000000000000000(0000) GS:ffff8801dae00000(0063) knlGS:00000000096bc900
      CS:  0010 DS: 002b ES: 002b CR0: 0000000080050033
      CR2: 0000000020000000 CR3: 00000001c47b6000 CR4: 00000000001406f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       <IRQ>
       tcp_retransmit_skb+0x2e/0x250 net/ipv4/tcp_output.c:2923
       tcp_retransmit_timer+0xc50/0x3060 net/ipv4/tcp_timer.c:488
       tcp_write_timer_handler+0x339/0x960 net/ipv4/tcp_timer.c:573
       tcp_write_timer+0x111/0x1d0 net/ipv4/tcp_timer.c:593
       call_timer_fn+0x230/0x940 kernel/time/timer.c:1326
       expire_timers kernel/time/timer.c:1363 [inline]
       __run_timers+0x79e/0xc50 kernel/time/timer.c:1666
       run_timer_softirq+0x4c/0x70 kernel/time/timer.c:1692
       __do_softirq+0x2e0/0xaf5 kernel/softirq.c:285
       invoke_softirq kernel/softirq.c:365 [inline]
       irq_exit+0x1d1/0x200 kernel/softirq.c:405
       exiting_irq arch/x86/include/asm/apic.h:525 [inline]
       smp_apic_timer_interrupt+0x17e/0x710 arch/x86/kernel/apic/apic.c:1052
       apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:863
      
      Fixes: cf60af03 ("net-tcp: Fast Open client - sendmsg(MSG_FASTOPEN)")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Reported-by: Nsyzbot <syzkaller@googlegroups.com>
      Acked-by: NNeal Cardwell <ncardwell@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7f582b24
  2. 02 3月, 2018 1 次提交
    • E
      tcp_bbr: better deal with suboptimal GSO (II) · dcb8c9b4
      Eric Dumazet 提交于
      This is second part of dealing with suboptimal device gso parameters.
      In first patch (350c9f48 "tcp_bbr: better deal with suboptimal GSO")
      we dealt with devices having low gso_max_segs
      
      Some devices lower gso_max_size from 64KB to 16 KB (r8152 is an example)
      
      In order to probe an optimal cwnd, we want BBR being not sensitive
      to whatever GSO constraint a device can have.
      
      This patch removes tso_segs_goal() CC callback in favor of
      min_tso_segs() for CC wanting to override sysctl_tcp_min_tso_segs
      
      Next patch will remove bbr->tso_segs_goal since it does not have
      to be persistent.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Acked-by: NNeal Cardwell <ncardwell@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      dcb8c9b4
  3. 23 2月, 2018 1 次提交
    • E
      tcp_bbr: better deal with suboptimal GSO · 350c9f48
      Eric Dumazet 提交于
      BBR uses tcp_tso_autosize() in an attempt to probe what would be the
      burst sizes and to adjust cwnd in bbr_target_cwnd() with following
      gold formula :
      
      /* Allow enough full-sized skbs in flight to utilize end systems. */
      cwnd += 3 * bbr->tso_segs_goal;
      
      But GSO can be lacking or be constrained to very small
      units (ip link set dev ... gso_max_segs 2)
      
      What we really want is to have enough packets in flight so that both
      GSO and GRO are efficient.
      
      So in the case GSO is off or downgraded, we still want to have the same
      number of packets in flight as if GSO/TSO was fully operational, so
      that GRO can hopefully be working efficiently.
      
      To fix this issue, we make tcp_tso_autosize() unaware of
      sk->sk_gso_max_segs
      
      Only tcp_tso_segs() has to enforce the gso_max_segs limit.
      
      Tested:
      
      ethtool -K eth0 tso off gso off
      tc qd replace dev eth0 root pfifo_fast
      
      Before patch:
      for f in {1..5}; do ./super_netperf 1 -H lpaa24 -- -K bbr; done
          691  (ss -temoi shows cwnd is stuck around 6 )
          667
          651
          631
          517
      
      After patch :
      # for f in {1..5}; do ./super_netperf 1 -H lpaa24 -- -K bbr; done
         1733 (ss -temoi shows cwnd is around 386 )
         1778
         1746
         1781
         1718
      
      Fixes: 0f8782ea ("tcp_bbr: add BBR congestion control")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: NOleksandr Natalenko <oleksandr@natalenko.name>
      Acked-by: NNeal Cardwell <ncardwell@google.com>
      Acked-by: NSoheil Hassas Yeganeh <soheil@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      350c9f48
  4. 22 2月, 2018 2 次提交
  5. 13 2月, 2018 1 次提交
  6. 26 1月, 2018 2 次提交
  7. 09 1月, 2018 1 次提交
  8. 14 12月, 2017 1 次提交
    • N
      tcp: allow TLP in ECN CWR · b4f70c3d
      Neal Cardwell 提交于
      This patch enables tail loss probe in cwnd reduction (CWR) state
      to detect potential losses. Prior to this patch, since the sender
      uses PRR to determine the cwnd in CWR state, the combination of
      CWR+PRR plus tcp_tso_should_defer() could cause unnecessary stalls
      upon losses: PRR makes cwnd so gentle that tcp_tso_should_defer()
      defers sending wait for more ACKs. The ACKs may not come due to
      packet losses.
      
      Disallowing TLP when there is unused cwnd had the primary effect
      of disallowing TLP when there is TSO deferral, Nagle deferral,
      or we hit the rwin limit. Because basically every application
      write() or incoming ACK will cause us to run tcp_write_xmit()
      to see if we can send more, and then if we sent something we call
      tcp_schedule_loss_probe() to see if we should schedule a TLP. At
      that point, there are a few common reasons why some cwnd budget
      could still be unused:
      
      (a) rwin limit
      (b) nagle check
      (c) TSO deferral
      (d) TSQ
      
      For (d), after the next packet tx completion the TSQ mechanism
      will allow us to send more packets, so we don't really need a
      TLP (in practice it shouldn't matter whether we schedule one
      or not). But for (a), (b), (c) the sender won't send any more
      packets until it gets another ACK. But if the whole flight was
      lost, or all the ACKs were lost, then we won't get any more ACKs,
      and ideally we should schedule and send a TLP to get more feedback.
      In particular for a long time we have wanted some kind of timer for
      TSO deferral, and at least this would give us some kind of timer
      Reported-by: NSteve Ibanez <sibanez@stanford.edu>
      Signed-off-by: NNeal Cardwell <ncardwell@google.com>
      Signed-off-by: NYuchung Cheng <ycheng@google.com>
      Reviewed-by: NNandita Dukkipati <nanditad@google.com>
      Reviewed-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b4f70c3d
  9. 19 11月, 2017 1 次提交
    • N
      tcp: when scheduling TLP, time of RTO should account for current ACK · ed66dfaf
      Neal Cardwell 提交于
      Fix the TLP scheduling logic so that when scheduling a TLP probe, we
      ensure that the estimated time at which an RTO would fire accounts for
      the fact that ACKs indicating forward progress should push back RTO
      times.
      
      After the following fix:
      
      df92c839 ("tcp: fix xmit timer to only be reset if data ACKed/SACKed")
      
      we had an unintentional behavior change in the following kind of
      scenario: suppose the RTT variance has been very low recently. Then
      suppose we send out a flight of N packets and our RTT is 100ms:
      
      t=0: send a flight of N packets
      t=100ms: receive an ACK for N-1 packets
      
      The response before df92c839 that was:
        -> schedule a TLP for now + RTO_interval
      
      The response after df92c839 is:
        -> schedule a TLP for t=0 + RTO_interval
      
      Since RTO_interval = srtt + RTT_variance, this means that we have
      scheduled a TLP timer at a point in the future that only accounts for
      RTT_variance. If the RTT_variance term is small, this means that the
      timer fires soon.
      
      Before df92c839 this would not happen, because in that code, when
      we receive an ACK for a prefix of flight, we did:
      
          1) Near the top of tcp_ack(), switch from TLP timer to RTO
             at write_queue_head->paket_tx_time + RTO_interval:
                  if (icsk->icsk_pending == ICSK_TIME_LOSS_PROBE)
                         tcp_rearm_rto(sk);
      
          2) In tcp_clean_rtx_queue(), update the RTO to now + RTO_interval:
                  if (flag & FLAG_ACKED) {
                         tcp_rearm_rto(sk);
      
          3) In tcp_ack() after tcp_fastretrans_alert() switch from RTO
             to TLP at now + RTO_interval:
                  if (icsk->icsk_pending == ICSK_TIME_RETRANS)
                         tcp_schedule_loss_probe(sk);
      
      In df92c839 we removed that 3-phase dance, and instead directly
      set the TLP timer once: we set the TLP timer in cases like this to
      write_queue_head->packet_tx_time + RTO_interval. So if the RTT
      variance is small, then this means that this is setting the TLP timer
      to fire quite soon. This means if the ACK for the tail of the flight
      takes longer than an RTT to arrive (often due to delayed ACKs), then
      the TLP timer fires too quickly.
      
      Fixes: df92c839 ("tcp: fix xmit timer to only be reset if data ACKed/SACKed")
      Signed-off-by: NNeal Cardwell <ncardwell@google.com>
      Signed-off-by: NYuchung Cheng <ycheng@google.com>
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Acked-by: NSoheil Hassas Yeganeh <soheil@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ed66dfaf
  10. 14 11月, 2017 1 次提交
    • E
      tcp: allow drivers to tweak TSQ logic · 3a9b76fd
      Eric Dumazet 提交于
      I had many reports that TSQ logic breaks wifi aggregation.
      
      Current logic is to allow up to 1 ms of bytes to be queued into qdisc
      and drivers queues.
      
      But Wifi aggregation needs a bigger budget to allow bigger rates to
      be discovered by various TCP Congestion Controls algorithms.
      
      This patch adds an extra socket field, allowing wifi drivers to select
      another log scale to derive TCP Small Queue credit from current pacing
      rate.
      
      Initial value is 10, meaning that this patch does not change current
      behavior.
      
      We expect wifi drivers to set this field to smaller values (tests have
      been done with values from 6 to 9)
      
      They would have to use following template :
      
      if (skb->sk && skb->sk->sk_pacing_shift != MY_PACING_SHIFT)
           skb->sk->sk_pacing_shift = MY_PACING_SHIFT;
      
      Ref: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1670041Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Johannes Berg <johannes.berg@intel.com>
      Cc: Toke Høiland-Jørgensen <toke@toke.dk>
      Cc: Kir Kolyshkin <kir@openvz.org>
      Acked-by: NNeal Cardwell <ncardwell@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3a9b76fd
  11. 11 11月, 2017 2 次提交
  12. 10 11月, 2017 1 次提交
  13. 05 11月, 2017 1 次提交
  14. 03 11月, 2017 3 次提交
  15. 01 11月, 2017 1 次提交
  16. 28 10月, 2017 5 次提交
  17. 27 10月, 2017 3 次提交
  18. 26 10月, 2017 2 次提交
  19. 25 10月, 2017 1 次提交
    • M
      locking/atomics: COCCINELLE/treewide: Convert trivial ACCESS_ONCE() patterns... · 6aa7de05
      Mark Rutland 提交于
      locking/atomics: COCCINELLE/treewide: Convert trivial ACCESS_ONCE() patterns to READ_ONCE()/WRITE_ONCE()
      
      Please do not apply this to mainline directly, instead please re-run the
      coccinelle script shown below and apply its output.
      
      For several reasons, it is desirable to use {READ,WRITE}_ONCE() in
      preference to ACCESS_ONCE(), and new code is expected to use one of the
      former. So far, there's been no reason to change most existing uses of
      ACCESS_ONCE(), as these aren't harmful, and changing them results in
      churn.
      
      However, for some features, the read/write distinction is critical to
      correct operation. To distinguish these cases, separate read/write
      accessors must be used. This patch migrates (most) remaining
      ACCESS_ONCE() instances to {READ,WRITE}_ONCE(), using the following
      coccinelle script:
      
      ----
      // Convert trivial ACCESS_ONCE() uses to equivalent READ_ONCE() and
      // WRITE_ONCE()
      
      // $ make coccicheck COCCI=/home/mark/once.cocci SPFLAGS="--include-headers" MODE=patch
      
      virtual patch
      
      @ depends on patch @
      expression E1, E2;
      @@
      
      - ACCESS_ONCE(E1) = E2
      + WRITE_ONCE(E1, E2)
      
      @ depends on patch @
      expression E;
      @@
      
      - ACCESS_ONCE(E)
      + READ_ONCE(E)
      ----
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: davem@davemloft.net
      Cc: linux-arch@vger.kernel.org
      Cc: mpe@ellerman.id.au
      Cc: shuah@kernel.org
      Cc: snitzer@redhat.com
      Cc: thor.thayer@linux.intel.com
      Cc: tj@kernel.org
      Cc: viro@zeniv.linux.org.uk
      Cc: will.deacon@arm.com
      Link: http://lkml.kernel.org/r/1508792849-3115-19-git-send-email-paulmck@linux.vnet.ibm.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      6aa7de05
  20. 24 10月, 2017 1 次提交
  21. 23 10月, 2017 1 次提交
    • K
      tcp: do tcp_mstamp_refresh before retransmits on TSQ handler · 3a91d29f
      Koichiro Den 提交于
      When retransmission on TSQ handler was introduced in the commit
      f9616c35 ("tcp: implement TSQ for retransmits"), the retransmitted
      skbs' timestamps were updated on the actual transmission. In the later
      commit 385e2070 ("tcp: use tp->tcp_mstamp in output path"), it stops
      being done so. In the commit, the comment says "We try to refresh
      tp->tcp_mstamp only when necessary", and at present tcp_tsq_handler and
      tcp_v4_mtu_reduced applies to this. About the latter, it's okay since
      it's rare enough.
      
      About the former, even though possible retransmissions on the tasklet
      comes just after the destructor run in NET_RX softirq handling, the time
      between them could be nonnegligibly large to the extent that
      tcp_rack_advance or rto rearming be affected if other (remaining) RX,
      BLOCK and (preceding) TASKLET sofirq handlings are unexpectedly heavy.
      
      So in the same way as tcp_write_timer_handler does, doing tcp_mstamp_refresh
      ensures the accuracy of algorithms relying on it.
      
      Fixes: 385e2070 ("tcp: use tp->tcp_mstamp in output path")
      Signed-off-by: NKoichiro Den <den@klaipeden.com>
      Reviewed-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3a91d29f
  22. 21 10月, 2017 1 次提交
  23. 18 10月, 2017 1 次提交
  24. 15 10月, 2017 1 次提交
    • C
      tcp: add a tracepoint for tcp retransmission · e086101b
      Cong Wang 提交于
      We need a real-time notification for tcp retransmission
      for monitoring.
      
      Of course we could use ftrace to dynamically instrument this
      kernel function too, however we can't retrieve the connection
      information at the same time, for example perf-tools [1] reads
      /proc/net/tcp for socket details, which is slow when we have
      a lots of connections.
      
      Therefore, this patch adds a tracepoint for __tcp_retransmit_skb()
      and exposes src/dst IP addresses and ports of the connection.
      This also makes it easier to integrate into perf.
      
      Note, I expose both IPv4 and IPv6 addresses at the same time:
      for a IPv4 socket, v4 mapped address is used as IPv6 addresses,
      for a IPv6 socket, LOOPBACK4_IPV6 is already filled by kernel.
      Also, add sk and skb pointers as they are useful for BPF.
      
      1. https://github.com/brendangregg/perf-tools/blob/master/net/tcpretrans
      
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
      Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      Acked-by: NAlexei Starovoitov <ast@kernel.org>
      Acked-by: NBrendan Gregg <bgregg@netflix.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e086101b
  25. 07 10月, 2017 1 次提交
    • E
      tcp: implement rb-tree based retransmit queue · 75c119af
      Eric Dumazet 提交于
      Using a linear list to store all skbs in write queue has been okay
      for quite a while : O(N) is not too bad when N < 500.
      
      Things get messy when N is the order of 100,000 : Modern TCP stacks
      want 10Gbit+ of throughput even with 200 ms RTT flows.
      
      40 ns per cache line miss means a full scan can use 4 ms,
      blowing away CPU caches.
      
      SACK processing often can use various hints to avoid parsing
      whole retransmit queue. But with high packet losses and/or high
      reordering, hints no longer work.
      
      Sender has to process thousands of unfriendly SACK, accumulating
      a huge socket backlog, burning a cpu and massively dropping packets.
      
      Using an rb-tree for retransmit queue has been avoided for years
      because it added complexity and overhead, but now is the time
      to be more resistant and say no to quadratic behavior.
      
      1) RTX queue is no longer part of the write queue : already sent skbs
      are stored in one rb-tree.
      
      2) Since reaching the head of write queue no longer needs
      sk->sk_send_head, we added an union of sk_send_head and tcp_rtx_queue
      
      Tested:
      
       On receiver :
       netem on ingress : delay 150ms 200us loss 1
       GRO disabled to force stress and SACK storms.
      
      for f in `seq 1 10`
      do
       ./netperf -H lpaa6 -l30 -- -K bbr -o THROUGHPUT|tail -1
      done | awk '{print $0} {sum += $0} END {printf "%7u\n",sum}'
      
      Before patch :
      
      323.87
      351.48
      339.59
      338.62
      306.72
      204.07
      304.93
      291.88
      202.47
      176.88
         2840
      
      After patch:
      
      1700.83
      2207.98
      2070.17
      1544.26
      2114.76
      2124.89
      1693.14
      1080.91
      2216.82
      1299.94
        18053
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      75c119af
  26. 06 10月, 2017 1 次提交
    • E
      tcp: new list for sent but unacked skbs for RACK recovery · e2080072
      Eric Dumazet 提交于
      This patch adds a new queue (list) that tracks the sent but not yet
      acked or SACKed skbs for a TCP connection. The list is chronologically
      ordered by skb->skb_mstamp (the head is the oldest sent skb).
      
      This list will be used to optimize TCP Rack recovery, which checks
      an skb's timestamp to judge if it has been lost and needs to be
      retransmitted. Since TCP write queue is ordered by sequence instead
      of sent time, RACK has to scan over the write queue to catch all
      eligible packets to detect lost retransmission, and iterates through
      SACKed skbs repeatedly.
      
      Special cares for rare events:
      1. TCP repair fakes skb transmission so the send queue needs adjusted
      2. SACK reneging would require re-inserting SACKed skbs into the
         send queue. For now I believe it's not worth the complexity to
         make RACK work perfectly on SACK reneging, so we do nothing here.
      3. Fast Open: currently for non-TFO, send-queue correctly queues
         the pure SYN packet. For TFO which queues a pure SYN and
         then a data packet, send-queue only queues the data packet but
         not the pure SYN due to the structure of TFO code. This is okay
         because the SYN receiver would never respond with a SACK on a
         missing SYN (i.e. SYN is never fast-retransmitted by SACK/RACK).
      
      In order to not grow sk_buff, we use an union for the new list and
      _skb_refdst/destructor fields. This is a bit complicated because
      we need to make sure _skb_refdst and destructor are properly zeroed
      before skb is cloned/copied at transmit, and before being freed.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NYuchung Cheng <ycheng@google.com>
      Signed-off-by: NNeal Cardwell <ncardwell@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e2080072
  27. 20 9月, 2017 1 次提交
    • E
      tcp: fastopen: fix on syn-data transmit failure · b5b7db8d
      Eric Dumazet 提交于
      Our recent change exposed a bug in TCP Fastopen Client that syzkaller
      found right away [1]
      
      When we prepare skb with SYN+DATA, we attempt to transmit it,
      and we update socket state as if the transmit was a success.
      
      In socket RTX queue we have two skbs, one with the SYN alone,
      and a second one containing the DATA.
      
      When (malicious) ACK comes in, we now complain that second one had no
      skb_mstamp.
      
      The proper fix is to make sure that if the transmit failed, we do not
      pretend we sent the DATA skb, and make it our send_head.
      
      When 3WHS completes, we can now send the DATA right away, without having
      to wait for a timeout.
      
      [1]
      WARNING: CPU: 0 PID: 100189 at net/ipv4/tcp_input.c:3117 tcp_clean_rtx_queue+0x2057/0x2ab0 net/ipv4/tcp_input.c:3117()
      
       WARN_ON_ONCE(last_ackt == 0);
      
      Modules linked in:
      CPU: 0 PID: 100189 Comm: syz-executor1 Not tainted
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
       0000000000000000 ffff8800b35cb1d8 ffffffff81cad00d 0000000000000000
       ffffffff828a4347 ffff88009f86c080 ffffffff8316eb20 0000000000000d7f
       ffff8800b35cb220 ffffffff812c33c2 ffff8800baad2440 00000009d46575c0
      Call Trace:
       [<ffffffff81cad00d>] __dump_stack
       [<ffffffff81cad00d>] dump_stack+0xc1/0x124
       [<ffffffff812c33c2>] warn_slowpath_common+0xe2/0x150
       [<ffffffff812c361e>] warn_slowpath_null+0x2e/0x40
       [<ffffffff828a4347>] tcp_clean_rtx_queue+0x2057/0x2ab0 n
       [<ffffffff828ae6fd>] tcp_ack+0x151d/0x3930
       [<ffffffff828baa09>] tcp_rcv_state_process+0x1c69/0x4fd0
       [<ffffffff828efb7f>] tcp_v4_do_rcv+0x54f/0x7c0
       [<ffffffff8258aacb>] sk_backlog_rcv
       [<ffffffff8258aacb>] __release_sock+0x12b/0x3a0
       [<ffffffff8258ad9e>] release_sock+0x5e/0x1c0
       [<ffffffff8294a785>] inet_wait_for_connect
       [<ffffffff8294a785>] __inet_stream_connect+0x545/0xc50
       [<ffffffff82886f08>] tcp_sendmsg_fastopen
       [<ffffffff82886f08>] tcp_sendmsg+0x2298/0x35a0
       [<ffffffff82952515>] inet_sendmsg+0xe5/0x520
       [<ffffffff8257152f>] sock_sendmsg_nosec
       [<ffffffff8257152f>] sock_sendmsg+0xcf/0x110
      
      Fixes: 8c72c65b ("tcp: update skb->skb_mstamp more carefully")
      Fixes: 783237e8 ("net-tcp: Fast Open client - sending SYN-data")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: NDmitry Vyukov <dvyukov@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Acked-by: NYuchung Cheng <ycheng@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b5b7db8d
  28. 19 9月, 2017 1 次提交