1. 24 8月, 2016 1 次提交
  2. 01 7月, 2016 1 次提交
  3. 30 6月, 2016 1 次提交
    • A
      tcp: add an ability to dump and restore window parameters · b1ed4c4f
      Andrey Vagin 提交于
      We found that sometimes a restored tcp socket doesn't work.
      
      A reason of this bug is incorrect window parameters and in this case
      tcp_acceptable_seq() returns tcp_wnd_end(tp) instead of tp->snd_nxt. The
      other side drops packets with this seq, because seq is less than
      tp->rcv_nxt ( tcp_sequence() ).
      
      Data from a send queue is sent only if there is enough space in a
      window, so when we restore unacked data, we need to expand a window to
      fit this data.
      
      This was in a first version of this patch:
      "tcp: extend window to fit all restored unacked data in a send queue"
      
      Then Alexey recommended me to restore window parameters instead of
      adjusted them according with data in a sent queue. This sounds resonable.
      
      rcv_wnd has to be restored, because it was reported to another side
      and the offered window is never shrunk.
      One of reasons why we need to restore snd_wnd was described above.
      
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
      Cc: James Morris <jmorris@namei.org>
      Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
      Cc: Patrick McHardy <kaber@trash.net>
      Signed-off-by: NAndrey Vagin <avagin@openvz.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b1ed4c4f
  4. 05 5月, 2016 1 次提交
  5. 03 5月, 2016 3 次提交
    • E
      tcp: make tcp_sendmsg() aware of socket backlog · d41a69f1
      Eric Dumazet 提交于
      Large sendmsg()/write() hold socket lock for the duration of the call,
      unless sk->sk_sndbuf limit is hit. This is bad because incoming packets
      are parked into socket backlog for a long time.
      Critical decisions like fast retransmit might be delayed.
      Receivers have to maintain a big out of order queue with additional cpu
      overhead, and also possible stalls in TX once windows are full.
      
      Bidirectional flows are particularly hurt since the backlog can become
      quite big if the copy from user space triggers IO (page faults)
      
      Some applications learnt to use sendmsg() (or sendmmsg()) with small
      chunks to avoid this issue.
      
      Kernel should know better, right ?
      
      Add a generic sk_flush_backlog() helper and use it right
      before a new skb is allocated. Typically we put 64KB of payload
      per skb (unless MSG_EOR is requested) and checking socket backlog
      every 64KB gives good results.
      
      As a matter of fact, tests with TSO/GSO disabled give very nice
      results, as we manage to keep a small write queue and smaller
      perceived rtt.
      
      Note that sk_flush_backlog() maintains socket ownership,
      so is not equivalent to a {release_sock(sk); lock_sock(sk);},
      to ensure implicit atomicity rules that sendmsg() was
      giving to (possibly buggy) applications.
      
      In this simple implementation, I chose to not call tcp_release_cb(),
      but we might consider this later.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Alexei Starovoitov <ast@fb.com>
      Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Acked-by: NSoheil Hassas Yeganeh <soheil@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d41a69f1
    • E
      tcp: do not block bh during prequeue processing · fb3477c0
      Eric Dumazet 提交于
      AFAIK, nothing in current TCP stack absolutely wants BH
      being disabled once socket is owned by a thread running in
      process context.
      
      As mentioned in my prior patch ("tcp: give prequeue mode some care"),
      processing a batch of packets might take time, better not block BH
      at all.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Acked-by: NSoheil Hassas Yeganeh <soheil@google.com>
      Acked-by: NAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fb3477c0
    • E
      tcp: do not assume TCP code is non preemptible · c10d9310
      Eric Dumazet 提交于
      We want to to make TCP stack preemptible, as draining prequeue
      and backlog queues can take lot of time.
      
      Many SNMP updates were assuming that BH (and preemption) was disabled.
      
      Need to convert some __NET_INC_STATS() calls to NET_INC_STATS()
      and some __TCP_INC_STATS() to TCP_INC_STATS()
      
      Before using this_cpu_ptr(net->ipv4.tcp_sk) in tcp_v4_send_reset()
      and tcp_v4_send_ack(), we add an explicit preempt disabled section.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Acked-by: NSoheil Hassas Yeganeh <soheil@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c10d9310
  6. 29 4月, 2016 3 次提交
    • M
      tcp: Make use of MSG_EOR in tcp_sendmsg · c134ecb8
      Martin KaFai Lau 提交于
      This patch adds an eor bit to the TCP_SKB_CB.  When MSG_EOR
      is passed to tcp_sendmsg, the eor bit will be set at the skb
      containing the last byte of the userland's msg.  The eor bit
      will prevent data from appending to that skb in the future.
      
      The change in do_tcp_sendpages is to honor the eor set
      during the previous tcp_sendmsg(MSG_EOR) call.
      
      This patch handles the tcp_sendmsg case.  The followup patches
      will handle other skb coalescing and fragment cases.
      
      One potential use case is to use MSG_EOR with
      SOF_TIMESTAMPING_TX_ACK to get a more accurate
      TCP ack timestamping on application protocol with
      multiple outgoing response messages (e.g. HTTP2).
      
      Packetdrill script for testing:
      ~~~~~~
      +0 `sysctl -q -w net.ipv4.tcp_min_tso_segs=10`
      +0 `sysctl -q -w net.ipv4.tcp_no_metrics_save=1`
      +0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
      +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
      +0 bind(3, ..., ...) = 0
      +0 listen(3, 1) = 0
      
      0.100 < S 0:0(0) win 32792 <mss 1460,sackOK,nop,nop,nop,wscale 7>
      0.100 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 7>
      0.200 < . 1:1(0) ack 1 win 257
      0.200 accept(3, ..., ...) = 4
      +0 setsockopt(4, SOL_TCP, TCP_NODELAY, [1], 4) = 0
      
      0.200 write(4, ..., 14600) = 14600
      0.200 sendto(4, ..., 730, MSG_EOR, ..., ...) = 730
      0.200 sendto(4, ..., 730, MSG_EOR, ..., ...) = 730
      
      0.200 > .  1:7301(7300) ack 1
      0.200 > P. 7301:14601(7300) ack 1
      
      0.300 < . 1:1(0) ack 14601 win 257
      0.300 > P. 14601:15331(730) ack 1
      0.300 > P. 15331:16061(730) ack 1
      
      0.400 < . 1:1(0) ack 16061 win 257
      0.400 close(4) = 0
      0.400 > F. 16061:16061(0) ack 1
      0.400 < F. 1:1(0) ack 16062 win 257
      0.400 > . 16062:16062(0) ack 2
      Signed-off-by: NMartin KaFai Lau <kafai@fb.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Cc: Soheil Hassas Yeganeh <soheil@google.com>
      Cc: Willem de Bruijn <willemb@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Suggested-by: NEric Dumazet <edumazet@google.com>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Acked-by: NSoheil Hassas Yeganeh <soheil@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c134ecb8
    • S
      tcp: remove SKBTX_ACK_TSTAMP since it is redundant · 0a2cf20c
      Soheil Hassas Yeganeh 提交于
      The SKBTX_ACK_TSTAMP flag is set in skb_shinfo->tx_flags when
      the timestamp of the TCP acknowledgement should be reported on
      error queue. Since accessing skb_shinfo is likely to incur a
      cache-line miss at the time of receiving the ack, the
      txstamp_ack bit was added in tcp_skb_cb, which is set iff
      the SKBTX_ACK_TSTAMP flag is set for an skb. This makes
      SKBTX_ACK_TSTAMP flag redundant.
      
      Remove the SKBTX_ACK_TSTAMP and instead use the txstamp_ack bit
      everywhere.
      
      Note that this frees one bit in shinfo->tx_flags.
      Signed-off-by: NSoheil Hassas Yeganeh <soheil@google.com>
      Acked-by: NMartin KaFai Lau <kafai@fb.com>
      Suggested-by: NWillem de Bruijn <willemb@google.com>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0a2cf20c
    • S
      tcp: remove an unnecessary check in tcp_tx_timestamp · 863c1fd9
      Soheil Hassas Yeganeh 提交于
      Remove the redundant check for sk->sk_tsflags in tcp_tx_timestamp.
      
      tcp_tx_timestamp() receives the tsflags as a parameter. As a
      result the "sk->sk_tsflags || tsflags" is redundant, since
      tsflags already includes sk->sk_tsflags plus overrides from
      control messages.
      Signed-off-by: NSoheil Hassas Yeganeh <soheil@google.com>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      863c1fd9
  7. 28 4月, 2016 3 次提交
  8. 05 4月, 2016 2 次提交
    • S
      sock: enable timestamping using control messages · c14ac945
      Soheil Hassas Yeganeh 提交于
      Currently, SOL_TIMESTAMPING can only be enabled using setsockopt.
      This is very costly when users want to sample writes to gather
      tx timestamps.
      
      Add support for enabling SO_TIMESTAMPING via control messages by
      using tsflags added in `struct sockcm_cookie` (added in the previous
      patches in this series) to set the tx_flags of the last skb created in
      a sendmsg. With this patch, the timestamp recording bits in tx_flags
      of the skbuff is overridden if SO_TIMESTAMPING is passed in a cmsg.
      
      Please note that this is only effective for overriding the recording
      timestamps flags. Users should enable timestamp reporting (e.g.,
      SOF_TIMESTAMPING_SOFTWARE | SOF_TIMESTAMPING_OPT_ID) using
      socket options and then should ask for SOF_TIMESTAMPING_TX_*
      using control messages per sendmsg to sample timestamps for each
      write.
      Signed-off-by: NSoheil Hassas Yeganeh <soheil@google.com>
      Acked-by: NWillem de Bruijn <willemb@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c14ac945
    • S
      tcp: use one bit in TCP_SKB_CB to mark ACK timestamps · 6b084928
      Soheil Hassas Yeganeh 提交于
      Currently, to avoid a cache line miss for accessing skb_shinfo,
      tcp_ack_tstamp skips socket that do not have
      SOF_TIMESTAMPING_TX_ACK bit set in sk_tsflags. This is
      implemented based on an implicit assumption that the
      SOF_TIMESTAMPING_TX_ACK is set via socket options for the
      duration that ACK timestamps are needed.
      
      To implement per-write timestamps, this check should be
      removed and replaced with a per-packet alternative that
      quickly skips packets missing ACK timestamps marks without
      a cache-line miss.
      
      To enable per-packet marking without a cache line miss, use
      one bit in TCP_SKB_CB to mark a whether a SKB might need a
      ack tx timestamp or not. Further checks in tcp_ack_tstamp are not
      modified and work as before.
      Signed-off-by: NSoheil Hassas Yeganeh <soheil@google.com>
      Acked-by: NWillem de Bruijn <willemb@google.com>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6b084928
  9. 15 3月, 2016 1 次提交
    • M
      tcp: Add RFC4898 tcpEStatsPerfDataSegsOut/In · a44d6eac
      Martin KaFai Lau 提交于
      Per RFC4898, they count segments sent/received
      containing a positive length data segment (that includes
      retransmission segments carrying data).  Unlike
      tcpi_segs_out/in, tcpi_data_segs_out/in excludes segments
      carrying no data (e.g. pure ack).
      
      The patch also updates the segs_in in tcp_fastopen_add_skb()
      so that segs_in >= data_segs_in property is kept.
      
      Together with retransmission data, tcpi_data_segs_out
      gives a better signal on the rxmit rate.
      
      v6: Rebase on the latest net-next
      
      v5: Eric pointed out that checking skb->len is still needed in
      tcp_fastopen_add_skb() because skb can carry a FIN without data.
      Hence, instead of open coding segs_in and data_segs_in, tcp_segs_in()
      helper is used.  Comment is added to the fastopen case to explain why
      segs_in has to be reset and tcp_segs_in() has to be called before
      __skb_pull().
      
      v4: Add comment to the changes in tcp_fastopen_add_skb()
      and also add remark on this case in the commit message.
      
      v3: Add const modifier to the skb parameter in tcp_segs_in()
      
      v2: Rework based on recent fix by Eric:
      commit a9d99ce2 ("tcp: fix tcpi_segs_in after connection establishment")
      Signed-off-by: NMartin KaFai Lau <kafai@fb.com>
      Cc: Chris Rapier <rapier@psc.edu>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Marcelo Ricardo Leitner <mleitner@redhat.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a44d6eac
  10. 10 3月, 2016 1 次提交
  11. 18 2月, 2016 1 次提交
  12. 17 2月, 2016 1 次提交
  13. 09 2月, 2016 1 次提交
  14. 08 2月, 2016 3 次提交
  15. 06 2月, 2016 1 次提交
  16. 29 1月, 2016 1 次提交
  17. 27 1月, 2016 1 次提交
  18. 15 1月, 2016 1 次提交
  19. 23 12月, 2015 1 次提交
  20. 19 12月, 2015 1 次提交
  21. 16 12月, 2015 3 次提交
  22. 02 12月, 2015 1 次提交
    • E
      net: rename SOCK_ASYNC_NOSPACE and SOCK_ASYNC_WAITDATA · 9cd3e072
      Eric Dumazet 提交于
      This patch is a cleanup to make following patch easier to
      review.
      
      Goal is to move SOCK_ASYNC_NOSPACE and SOCK_ASYNC_WAITDATA
      from (struct socket)->flags to a (struct socket_wq)->flags
      to benefit from RCU protection in sock_wake_async()
      
      To ease backports, we rename both constants.
      
      Two new helpers, sk_set_bit(int nr, struct sock *sk)
      and sk_clear_bit(int net, struct sock *sk) are added so that
      following patch can change their implementation.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9cd3e072
  23. 16 11月, 2015 1 次提交
  24. 21 10月, 2015 1 次提交
    • Y
      tcp: track min RTT using windowed min-filter · f6722583
      Yuchung Cheng 提交于
      Kathleen Nichols' algorithm for tracking the minimum RTT of a
      data stream over some measurement window. It uses constant space
      and constant time per update. Yet it almost always delivers
      the same minimum as an implementation that has to keep all
      the data in the window. The measurement window is tunable via
      sysctl.net.ipv4.tcp_min_rtt_wlen with a default value of 5 minutes.
      
      The algorithm keeps track of the best, 2nd best & 3rd best min
      values, maintaining an invariant that the measurement time of
      the n'th best >= n-1'th best. It also makes sure that the three
      values are widely separated in the time window since that bounds
      the worse case error when that data is monotonically increasing
      over the window.
      
      Upon getting a new min, we can forget everything earlier because
      it has no value - the new min is less than everything else in the
      window by definition and it's the most recent. So we restart fresh
      on every new min and overwrites the 2nd & 3rd choices. The same
      property holds for the 2nd & 3rd best.
      
      Therefore we have to maintain two invariants to maximize the
      information in the samples, one on values (1st.v <= 2nd.v <=
      3rd.v) and the other on times (now-win <=1st.t <= 2nd.t <= 3rd.t <=
      now). These invariants determine the structure of the code
      
      The RTT input to the windowed filter is the minimum RTT measured
      from ACK or SACK, or as the last resort from TCP timestamps.
      
      The accessor tcp_min_rtt() returns the minimum RTT seen in the
      window. ~0U indicates it is not available. The minimum is 1usec
      even if the true RTT is below that.
      Signed-off-by: NYuchung Cheng <ycheng@google.com>
      Signed-off-by: NNeal Cardwell <ncardwell@google.com>
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f6722583
  25. 07 10月, 2015 1 次提交
  26. 30 9月, 2015 1 次提交
    • E
      tcp: prepare fastopen code for upcoming listener changes · 0536fcc0
      Eric Dumazet 提交于
      While auditing TCP stack for upcoming 'lockless' listener changes,
      I found I had to change fastopen_init_queue() to properly init the object
      before publishing it.
      
      Otherwise an other cpu could try to lock the spinlock before it gets
      properly initialized.
      
      Instead of adding appropriate barriers, just remove dynamic memory
      allocations :
      - Structure is 28 bytes on 64bit arches. Using additional 8 bytes
        for holding a pointer seems overkill.
      - Two listeners can share same cache line and performance would suffer.
      
      If we really want to save few bytes, we would instead dynamically allocate
      whole struct request_sock_queue in the future.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0536fcc0
  27. 26 8月, 2015 1 次提交
    • E
      tcp: fix slow start after idle vs TSO/GSO · 6f021c62
      Eric Dumazet 提交于
      slow start after idle might reduce cwnd, but we perform this
      after first packet was cooked and sent.
      
      With TSO/GSO, it means that we might send a full TSO packet
      even if cwnd should have been reduced to IW10.
      
      Moving the SSAI check in skb_entail() makes sense, because
      we slightly reduce number of times this check is done,
      especially for large send() and TCP Small queue callbacks from
      softirq context.
      
      As Neal pointed out, we also need to perform the check
      if/when receive window opens.
      
      Tested:
      
      Following packetdrill test demonstrates the problem
      // Test of slow start after idle
      
      `sysctl -q net.ipv4.tcp_slow_start_after_idle=1`
      
      0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
      +0    setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
      +0    bind(3, ..., ...) = 0
      +0    listen(3, 1) = 0
      
      +0    < S 0:0(0) win 65535 <mss 1000,sackOK,nop,nop,nop,wscale 7>
      +0    > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 6>
      +.100 < . 1:1(0) ack 1 win 511
      +0    accept(3, ..., ...) = 4
      +0    setsockopt(4, SOL_SOCKET, SO_SNDBUF, [200000], 4) = 0
      
      +0    write(4, ..., 26000) = 26000
      +0    > . 1:5001(5000) ack 1
      +0    > . 5001:10001(5000) ack 1
      +0    %{ assert tcpi_snd_cwnd == 10 }%
      
      +.100 < . 1:1(0) ack 10001 win 511
      +0    %{ assert tcpi_snd_cwnd == 20, tcpi_snd_cwnd }%
      +0    > . 10001:20001(10000) ack 1
      +0    > P. 20001:26001(6000) ack 1
      
      +.100 < . 1:1(0) ack 26001 win 511
      +0    %{ assert tcpi_snd_cwnd == 36, tcpi_snd_cwnd }%
      
      +4 write(4, ..., 20000) = 20000
      // If slow start after idle works properly, we should send 5 MSS here (cwnd/2)
      +0    > . 26001:31001(5000) ack 1
      +0    %{ assert tcpi_snd_cwnd == 10, tcpi_snd_cwnd }%
      +0    > . 31001:36001(5000) ack 1
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Acked-by: NNeal Cardwell <ncardwell@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6f021c62
  28. 27 7月, 2015 1 次提交
  29. 23 6月, 2015 1 次提交
    • C
      tcp: Do not call tcp_fastopen_reset_cipher from interrupt context · dfea2aa6
      Christoph Paasch 提交于
      tcp_fastopen_reset_cipher really cannot be called from interrupt
      context. It allocates the tcp_fastopen_context with GFP_KERNEL and
      calls crypto_alloc_cipher, which allocates all kind of stuff with
      GFP_KERNEL.
      
      Thus, we might sleep when the key-generation is triggered by an
      incoming TFO cookie-request which would then happen in interrupt-
      context, as shown by enabling CONFIG_DEBUG_ATOMIC_SLEEP:
      
      [   36.001813] BUG: sleeping function called from invalid context at mm/slub.c:1266
      [   36.003624] in_atomic(): 1, irqs_disabled(): 0, pid: 1016, name: packetdrill
      [   36.004859] CPU: 1 PID: 1016 Comm: packetdrill Not tainted 4.1.0-rc7 #14
      [   36.006085] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
      [   36.008250]  00000000000004f2 ffff88007f8838a8 ffffffff8171d53a ffff880075a084a8
      [   36.009630]  ffff880075a08000 ffff88007f8838c8 ffffffff810967d3 ffff88007f883928
      [   36.011076]  0000000000000000 ffff88007f8838f8 ffffffff81096892 ffff88007f89be00
      [   36.012494] Call Trace:
      [   36.012953]  <IRQ>  [<ffffffff8171d53a>] dump_stack+0x4f/0x6d
      [   36.014085]  [<ffffffff810967d3>] ___might_sleep+0x103/0x170
      [   36.015117]  [<ffffffff81096892>] __might_sleep+0x52/0x90
      [   36.016117]  [<ffffffff8118e887>] kmem_cache_alloc_trace+0x47/0x190
      [   36.017266]  [<ffffffff81680d82>] ? tcp_fastopen_reset_cipher+0x42/0x130
      [   36.018485]  [<ffffffff81680d82>] tcp_fastopen_reset_cipher+0x42/0x130
      [   36.019679]  [<ffffffff81680f01>] tcp_fastopen_init_key_once+0x61/0x70
      [   36.020884]  [<ffffffff81680f2c>] __tcp_fastopen_cookie_gen+0x1c/0x60
      [   36.022058]  [<ffffffff816814ff>] tcp_try_fastopen+0x58f/0x730
      [   36.023118]  [<ffffffff81671788>] tcp_conn_request+0x3e8/0x7b0
      [   36.024185]  [<ffffffff810e3872>] ? __module_text_address+0x12/0x60
      [   36.025327]  [<ffffffff8167b2e1>] tcp_v4_conn_request+0x51/0x60
      [   36.026410]  [<ffffffff816727e0>] tcp_rcv_state_process+0x190/0xda0
      [   36.027556]  [<ffffffff81661f97>] ? __inet_lookup_established+0x47/0x170
      [   36.028784]  [<ffffffff8167c2ad>] tcp_v4_do_rcv+0x16d/0x3d0
      [   36.029832]  [<ffffffff812e6806>] ? security_sock_rcv_skb+0x16/0x20
      [   36.030936]  [<ffffffff8167cc8a>] tcp_v4_rcv+0x77a/0x7b0
      [   36.031875]  [<ffffffff816af8c3>] ? iptable_filter_hook+0x33/0x70
      [   36.032953]  [<ffffffff81657d22>] ip_local_deliver_finish+0x92/0x1f0
      [   36.034065]  [<ffffffff81657f1a>] ip_local_deliver+0x9a/0xb0
      [   36.035069]  [<ffffffff81657c90>] ? ip_rcv+0x3d0/0x3d0
      [   36.035963]  [<ffffffff81657569>] ip_rcv_finish+0x119/0x330
      [   36.036950]  [<ffffffff81657ba7>] ip_rcv+0x2e7/0x3d0
      [   36.037847]  [<ffffffff81610652>] __netif_receive_skb_core+0x552/0x930
      [   36.038994]  [<ffffffff81610a57>] __netif_receive_skb+0x27/0x70
      [   36.040033]  [<ffffffff81610b72>] process_backlog+0xd2/0x1f0
      [   36.041025]  [<ffffffff81611482>] net_rx_action+0x122/0x310
      [   36.042007]  [<ffffffff81076743>] __do_softirq+0x103/0x2f0
      [   36.042978]  [<ffffffff81723e3c>] do_softirq_own_stack+0x1c/0x30
      
      This patch moves the call to tcp_fastopen_init_key_once to the places
      where a listener socket creates its TFO-state, which always happens in
      user-context (either from the setsockopt, or implicitly during the
      listen()-call)
      
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Fixes: 222e83d2 ("tcp: switch tcp_fastopen key generation to net_get_random_once")
      Signed-off-by: NChristoph Paasch <cpaasch@apple.com>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      dfea2aa6