- 02 7月, 2017 2 次提交
-
-
由 Lawrence Brakmo 提交于
Added callbacks to BPF SOCK_OPS type program before an active connection is intialized and after a passive or active connection is established. The following patch demostrates how they can be used to set send and receive buffer sizes. Signed-off-by: NLawrence Brakmo <brakmo@fb.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Lawrence Brakmo 提交于
This patch adds support for setting a per connection SYN and SYN_ACK RTOs from within a BPF_SOCK_OPS program. For example, to set small RTOs when it is known both hosts are within a datacenter. Signed-off-by: NLawrence Brakmo <brakmo@fb.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 08 6月, 2017 4 次提交
-
-
由 Eric Dumazet 提交于
Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
We want to move some TCP sysctls to net namespaces in the future. tcp_window_scaling, tcp_sack and tcp_timestamps being fetched from tcp_parse_options(), we need to pass an extra parameter. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 03 6月, 2017 1 次提交
-
-
由 Yuchung Cheng 提交于
Currently when a data packet is retransmitted, we do not compute an RTT sample for congestion control due to Kern's check. Therefore the congestion control that uses RTT signals may not receive any update during loss recovery which could last many round trips. For example, BBR and Vegas may not be able to update its min RTT estimation if the network path has shortened until it recovers from losses. This patch mitigates that by using TCP timestamp options for RTT measurement for congestion control. Note that we already use timestamps for RTT estimation. Signed-off-by: NYuchung Cheng <ycheng@google.com> Signed-off-by: NSoheil Hassas Yeganeh <soheil@google.com> Signed-off-by: NNeal Cardwell <ncardwell@google.com> Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 26 5月, 2017 1 次提交
-
-
由 Eric Dumazet 提交于
Paul Fiterau Brostean reported : <quote> Linux TCP stack we analyze exhibits behavior that seems odd to me. The scenario is as follows (all packets have empty payloads, no window scaling, rcv/snd window size should not be a factor): TEST HARNESS (CLIENT) LINUX SERVER 1. - LISTEN (server listen, then accepts) 2. - --> <SEQ=100><CTL=SYN> --> SYN-RECEIVED 3. - <-- <SEQ=300><ACK=101><CTL=SYN,ACK> <-- SYN-RECEIVED 4. - --> <SEQ=101><ACK=301><CTL=ACK> --> ESTABLISHED 5. - <-- <SEQ=301><ACK=101><CTL=FIN,ACK> <-- FIN WAIT-1 (server opts to close the data connection calling "close" on the connection socket) 6. - --> <SEQ=101><ACK=99999><CTL=FIN,ACK> --> CLOSING (client sends FIN,ACK with not yet sent acknowledgement number) 7. - <-- <SEQ=302><ACK=102><CTL=ACK> <-- CLOSING (ACK is 102 instead of 101, why?) ... (silence from CLIENT) 8. - <-- <SEQ=301><ACK=102><CTL=FIN,ACK> <-- CLOSING (retransmission, again ACK is 102) Now, note that packet 6 while having the expected sequence number, acknowledges something that wasn't sent by the server. So I would expect the packet to maybe prompt an ACK response from the server, and then be ignored. Yet it is not ignored and actually leads to an increase of the acknowledgement number in the server's retransmission of the FIN,ACK packet. The explanation I found is that the FIN in packet 6 was processed, despite the acknowledgement number being unacceptable. Further experiments indeed show that the server processes this FIN, transitioning to CLOSING, then on receiving an ACK for the FIN it had send in packet 5, the server (or better said connection) transitions from CLOSING to TIME_WAIT (as signaled by netstat). </quote> Indeed, tcp_rcv_state_process() calls tcp_ack() but does not exploit the @acceptable status but for TCP_SYN_RECV state. What we want here is to send a challenge ACK, if not in TCP_SYN_RECV state. TCP_FIN_WAIT1 state is not the only state we should fix. Add a FLAG_NO_CHALLENGE_ACK so that tcp_rcv_state_process() can choose to send a challenge ACK and discard the packet instead of wrongly change socket state. With help from Neal Cardwell. Signed-off-by: NEric Dumazet <edumazet@google.com> Reported-by: NPaul Fiterau Brostean <p.fiterau-brostean@science.ru.nl> Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 20 5月, 2017 1 次提交
-
-
由 Soheil Hassas Yeganeh 提交于
Commit bafbb9c7 ("tcp: eliminate negative reordering in tcp_clean_rtx_queue") fixes an issue for negative reordering metrics. To be resilient to such errors, warn and return when a negative metric is passed to tcp_update_reordering(). Signed-off-by: NSoheil Hassas Yeganeh <soheil@google.com> Signed-off-by: NNeal Cardwell <ncardwell@google.com> Signed-off-by: NYuchung Cheng <ycheng@google.com> Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 19 5月, 2017 1 次提交
-
-
由 Eric Dumazet 提交于
skbs in (re)transmit queue no longer have a copy of jiffies at the time of the transmit : skb->skb_mstamp is now in usec unit, with no correlation to tcp_jiffies32. We have to convert rto from jiffies to usec, compute a time difference in usec, then convert the delta to HZ units. Fixes: 9a568de4 ("tcp: switch TCP TS option (RFC 7323) to 1ms clock") Signed-off-by: NEric Dumazet <edumazet@google.com> Acked-by: NSoheil Hassas Yeganeh <soheil@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 18 5月, 2017 6 次提交
-
-
由 Eric Dumazet 提交于
TCP Timestamps option is defined in RFC 7323 Traditionally on linux, it has been tied to the internal 'jiffies' variable, because it had been a cheap and good enough generator. For TCP flows on the Internet, 1 ms resolution would be much better than 4ms or 10ms (HZ=250 or HZ=100 respectively) For TCP flows in the DC, Google has used usec resolution for more than two years with great success [1] Receive size autotuning (DRS) is indeed more precise and converges faster to optimal window size. This patch converts tp->tcp_mstamp to a plain u64 value storing a 1 usec TCP clock. This choice will allow us to upstream the 1 usec TS option as discussed in IETF 97. [1] https://www.ietf.org/proceedings/97/slides/slides-97-tcpm-tcp-options-for-low-latency-00.pdfSigned-off-by: NEric Dumazet <edumazet@google.com> Acked-by: NSoheil Hassas Yeganeh <soheil@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
After this patch, all uses of tcp_time_stamp will require a change when we introduce 1 ms and/or 1 us TCP TS option. Signed-off-by: NEric Dumazet <edumazet@google.com> Acked-by: NSoheil Hassas Yeganeh <soheil@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
This place wants to use tcp_jiffies32, this is good enough. Signed-off-by: NEric Dumazet <edumazet@google.com> Acked-by: NSoheil Hassas Yeganeh <soheil@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Use tcp_jiffies32 instead of tcp_time_stamp, since tcp_time_stamp will soon be only used for TCP TS option. Signed-off-by: NEric Dumazet <edumazet@google.com> Acked-by: NSoheil Hassas Yeganeh <soheil@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Use tcp_jiffies32 instead of tcp_time_stamp to feed tp->snd_cwnd_stamp. tcp_time_stamp will soon be a litle bit more expensive than simply reading 'jiffies'. Signed-off-by: NEric Dumazet <edumazet@google.com> Acked-by: NSoheil Hassas Yeganeh <soheil@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Use tcp_jiffies32 instead of tcp_time_stamp to feed tp->lsndtime. tcp_time_stamp will soon be a litle bit more expensive than simply reading 'jiffies'. Signed-off-by: NEric Dumazet <edumazet@google.com> Acked-by: NSoheil Hassas Yeganeh <soheil@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 17 5月, 2017 1 次提交
-
-
由 Soheil Hassas Yeganeh 提交于
tcp_ack() can call tcp_fragment() which may dededuct the value tp->fackets_out when MSS changes. When prior_fackets is larger than tp->fackets_out, tcp_clean_rtx_queue() can invoke tcp_update_reordering() with negative values. This results in absurd tp->reodering values higher than sysctl_tcp_max_reordering. Note that tcp_update_reordering indeeds sets tp->reordering to min(sysctl_tcp_max_reordering, metric), but because the comparison is signed, a negative metric always wins. Fixes: c7caf8d3 ("[TCP]: Fix reord detection due to snd_una covered holes") Reported-by: NRebecca Isaacs <risaacs@google.com> Signed-off-by: NSoheil Hassas Yeganeh <soheil@google.com> Signed-off-by: NNeal Cardwell <ncardwell@google.com> Signed-off-by: NYuchung Cheng <ycheng@google.com> Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 12 5月, 2017 1 次提交
-
-
由 Yuchung Cheng 提交于
This patch fixes a bug in splitting an SKB during SACK processing. Specifically if an skb contains multiple packets and is only partially sacked in the higher sequences, tcp_match_sack_to_skb() splits the skb and marks the second fragment as SACKed. The current code further attempts rounding up the first fragment to MSS boundaries. But it misses a boundary condition when the rounded-up fragment size (pkt_len) is exactly skb size. Spliting such an skb is pointless and causses a kernel warning and aborts the SACK processing. This patch universally checks such over-split before calling tcp_fragment to prevent these unnecessary warnings. Fixes: adb92db8 ("tcp: Make SACK code to split only at mss boundaries") Signed-off-by: NYuchung Cheng <ycheng@google.com> Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NSoheil Hassas Yeganeh <soheil@google.com> Acked-by: NNeal Cardwell <ncardwell@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 06 5月, 2017 1 次提交
-
-
由 Eric Dumazet 提交于
Whole point of randomization was to hide server uptime, but an attacker can simply start a syn flood and TCP generates 'old style' timestamps, directly revealing server jiffies value. Also, TSval sent by the server to a particular remote address vary depending on syncookies being sent or not, potentially triggering PAWS drops for innocent clients. Lets implement proper randomization, including for SYNcookies. Also we do not need to export sysctl_tcp_timestamps, since it is not used from a module. In v2, I added Florian feedback and contribution, adding tsoff to tcp_get_cookie_sock(). v3 removed one unused variable in tcp_v4_connect() as Florian spotted. Fixes: 95a22cae ("tcp: randomize tcp timestamp offsets for each connection") Signed-off-by: NEric Dumazet <edumazet@google.com> Reviewed-by: NFlorian Westphal <fw@strlen.de> Tested-by: NFlorian Westphal <fw@strlen.de> Cc: Yuchung Cheng <ycheng@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 27 4月, 2017 9 次提交
-
-
由 Eric Dumazet 提交于
Some devices or distributions use HZ=100 or HZ=250 TCP receive buffer autotuning has poor behavior caused by this choice. Since autotuning happens after 4 ms or 10 ms, short distance flows get their receive buffer tuned to a very high value, but after an initial period where it was frozen to (too small) initial value. With tp->tcp_mstamp introduction, we can switch to high resolution timestamps almost for free (at the expense of 8 additional bytes per TCP structure) Note that some TCP stacks use usec TCP timestamps where this patch makes even more sense : Many TCP flows have < 500 usec RTT. Hopefully this finer TS option can be standardized soon. Tested: HZ=100 kernel ./netperf -H lpaa24 -t TCP_RR -l 1000 -- -r 10000,10000 & Peer without patch : lpaa24:~# ss -tmi dst lpaa23 ... skmem:(r0,rb8388608,...) rcv_rtt:10 rcv_space:3210000 minrtt:0.017 Peer with the patch : lpaa23:~# ss -tmi dst lpaa24 ... skmem:(r0,rb428800,...) rcv_rtt:0.069 rcv_space:30000 minrtt:0.017 We can see saner RCVBUF, and more precise rcv_rtt information. Signed-off-by: NEric Dumazet <edumazet@google.com> Acked-by: NSoheil Hassas Yeganeh <soheil@google.com> Acked-by: NNeal Cardwell <ncardwell@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
It is no longer needed, everything uses tp->tcp_mstamp instead. Signed-off-by: NEric Dumazet <edumazet@google.com> Acked-by: NSoheil Hassas Yeganeh <soheil@google.com> Acked-by: NNeal Cardwell <ncardwell@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Following patch will remove ack_time from struct tcp_sacktag_state Same info is now found in tp->tcp_mstamp Signed-off-by: NEric Dumazet <edumazet@google.com> Acked-by: NSoheil Hassas Yeganeh <soheil@google.com> Acked-by: NNeal Cardwell <ncardwell@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
No longer needed, since tp->tcp_mstamp holds the information. This is needed to remove sack_state.ack_time in a following patch. Signed-off-by: NEric Dumazet <edumazet@google.com> Acked-by: NSoheil Hassas Yeganeh <soheil@google.com> Acked-by: NNeal Cardwell <ncardwell@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
No longer needed, since tp->tcp_mstamp holds the information. This is needed to remove sack_state.ack_time in a following patch. Signed-off-by: NEric Dumazet <edumazet@google.com> Acked-by: NSoheil Hassas Yeganeh <soheil@google.com> Acked-by: NNeal Cardwell <ncardwell@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Not used anymore now tp->tcp_mstamp holds the information. This is needed to remove sack_state.ack_time in a following patch. Signed-off-by: NEric Dumazet <edumazet@google.com> Acked-by: NSoheil Hassas Yeganeh <soheil@google.com> Acked-by: NNeal Cardwell <ncardwell@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Not used anymore now tp->tcp_mstamp holds the information. Signed-off-by: NEric Dumazet <edumazet@google.com> Acked-by: NSoheil Hassas Yeganeh <soheil@google.com> Acked-by: NNeal Cardwell <ncardwell@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
This is no longer used, since tcp_rack_detect_loss() takes the timestamp from tp->tcp_mstamp Signed-off-by: NEric Dumazet <edumazet@google.com> Acked-by: NSoheil Hassas Yeganeh <soheil@google.com> Acked-by: NNeal Cardwell <ncardwell@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
We want to use precise timestamps in TCP stack, but we do not want to call possibly expensive kernel time services too often. tp->tcp_mstamp is guaranteed to be updated once per incoming packet. We will use it in the following patches, removing specific skb_mstamp_get() calls, and removing ack_time from struct tcp_sacktag_state. Signed-off-by: NEric Dumazet <edumazet@google.com> Acked-by: NSoheil Hassas Yeganeh <soheil@google.com> Acked-by: NNeal Cardwell <ncardwell@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 25 4月, 2017 2 次提交
-
-
由 Wei Wang 提交于
This counter records the number of times the firewall blackhole issue is detected and active TFO is disabled. Signed-off-by: NWei Wang <weiwan@google.com> Acked-by: NYuchung Cheng <ycheng@google.com> Acked-by: NNeal Cardwell <ncardwell@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Wei Wang 提交于
Middlebox firewall issues can potentially cause server's data being blackholed after a successful 3WHS using TFO. Following are the related reports from Apple: https://www.nanog.org/sites/default/files/Paasch_Network_Support.pdf Slide 31 identifies an issue where the client ACK to the server's data sent during a TFO'd handshake is dropped. C ---> syn-data ---> S C <--- syn/ack ----- S C (accept & write) C <---- data ------- S C ----- ACK -> X S [retry and timeout] https://www.ietf.org/proceedings/94/slides/slides-94-tcpm-13.pdf Slide 5 shows a similar situation that the server's data gets dropped after 3WHS. C ---- syn-data ---> S C <--- syn/ack ----- S C ---- ack --------> S S (accept & write) C? X <- data ------ S [retry and timeout] This is the worst failure b/c the client can not detect such behavior to mitigate the situation (such as disabling TFO). Failing to proceed, the application (e.g., SSL library) may simply timeout and retry with TFO again, and the process repeats indefinitely. The proposed solution is to disable active TFO globally under the following circumstances: 1. client side TFO socket detects out of order FIN 2. client side TFO socket receives out of order RST We disable active side TFO globally for 1hr at first. Then if it happens again, we disable it for 2h, then 4h, 8h, ... And we reset the timeout to 1hr if a client side TFO sockets not opened on loopback has successfully received data segs from server. And we examine this condition during close(). The rational behind it is that when such firewall issue happens, application running on the client should eventually close the socket as it is not able to get the data it is expecting. Or application running on the server should close the socket as it is not able to receive any response from client. In both cases, out of order FIN or RST will get received on the client given that the firewall will not block them as no data are in those frames. And we want to disable active TFO globally as it helps if the middle box is very close to the client and most of the connections are likely to fail. Also, add a debug sysctl: tcp_fastopen_blackhole_detect_timeout_sec: the initial timeout to use when firewall blackhole issue happens. This can be set and read. When setting it to 0, it means to disable the active disable logic. Signed-off-by: NWei Wang <weiwan@google.com> Acked-by: NYuchung Cheng <ycheng@google.com> Acked-by: NNeal Cardwell <ncardwell@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 21 4月, 2017 2 次提交
-
-
由 Eric Dumazet 提交于
When using TCP FastOpen for an active session, we send one wakeup event from tcp_finish_connect(), right before the data eventually contained in the received SYNACK is queued to sk->sk_receive_queue. This means that depending on machine load or luck, poll() users might receive POLLOUT events instead of POLLIN|POLLOUT To fix this, we need to move the call to sk->sk_state_change() after the (optional) call to tcp_rcv_fastopen_synack() Signed-off-by: NEric Dumazet <edumazet@google.com> Acked-by: NYuchung Cheng <ycheng@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
When a RST packet is processed, we send two wakeup events to interested polling users. First one by a sk->sk_error_report(sk) from tcp_reset(), followed by a sk->sk_state_change(sk) from tcp_done(). Depending on machine load and luck, poll() can either return POLLERR, or POLLIN|POLLOUT|POLLERR|POLLHUP (this happens on 99 % of the cases) This is probably fine, but we can avoid the confusion by reordering things so that we have more TCP fields updated before the first wakeup. This might even allow us to remove some barriers we added in the past. Signed-off-by: NEric Dumazet <edumazet@google.com> Acked-by: NSoheil Hassas Yeganeh <soheil@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 08 4月, 2017 1 次提交
-
-
由 Yuchung Cheng 提交于
The recent extension of F-RTO 89fe18e4 ("tcp: extend F-RTO to catch more spurious timeouts") interacts badly with certain broken middle-boxes. These broken boxes modify and falsely raise the receive window on the ACKs. During a timeout induced recovery, F-RTO would send new data packets to probe if the timeout is false or not. Since the receive window is falsely raised, the receiver would silently drop these F-RTO packets. The recovery would take N (exponentially backoff) timeouts to repair N packet losses. A TCP performance killer. Due to this unfortunate situation, this patch removes this extension to revert F-RTO back to the RFC specification. Fixes: 89fe18e4 ("tcp: extend F-RTO to catch more spurious timeouts") Signed-off-by: NYuchung Cheng <ycheng@google.com> Signed-off-by: NNeal Cardwell <ncardwell@google.com> Signed-off-by: NSoheil Hassas Yeganeh <soheil@google.com> Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 06 4月, 2017 1 次提交
-
-
由 Yuchung Cheng 提交于
Currently the reordering SNMP counters only increase if a connection sees a higher degree then it has previously seen. It ignores if the reordering degree is not greater than the default system threshold. This significantly under-counts the number of reordering events and falsely convey that reordering is rare on the network. This patch properly and faithfully records the number of reordering events detected by the TCP stack, just like the comment says "this exciting event is worth to be remembered". Note that even so TCP still under-estimate the actual reordering events because TCP requires TS options or certain packet sequences to detect reordering (i.e. ACKing never-retransmitted sequence in recovery or disordered state). Signed-off-by: NYuchung Cheng <ycheng@google.com> Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NNeal Cardwell <ncardwell@google.com> Signed-off-by: NSoheil Hassas Yeganeh <soheil@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 05 4月, 2017 1 次提交
-
-
由 Gao Feng 提交于
Define one new macro TCP_MAX_WSCALE instead of literal number '14', and use U16_MAX instead of 65535 as the max value of TCP window. There is another minor change, use rounddown(space, mss) instead of (space / mss) * mss; Signed-off-by: NGao Feng <fgao@ikuai8.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 04 4月, 2017 1 次提交
-
-
由 Marcelo Ricardo Leitner 提交于
Markus Trippelsdorf reported that after commit dcb17d22 ("tcp: warn on bogus MSS and try to amend it") the kernel started logging the warning for a NIC driver that doesn't even support GRO. It was diagnosed that it was possibly caused on connections that were using TCP Timestamps but some packets lacked the Timestamps option. As we reduce rcv_mss when timestamps are used, the lack of them would cause the packets to be bigger than expected, although this is a valid case. As this warning is more as a hint, getting a clean-cut on the threshold is probably not worth the execution time spent on it. This patch thus alleviates the false-positives with 2 quick checks: by accounting for the entire TCP option space and also checking against the interface MTU if it's available. These changes, specially the MTU one, might mask some real positives, though if they are really happening, it's possible that sooner or later it will be triggered anyway. Reported-by: NMarkus Trippelsdorf <markus@trippelsdorf.de> Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 23 3月, 2017 1 次提交
-
-
由 Eric Dumazet 提交于
icsk_ack.lrcvtime has a 0 value at socket creation time. tcpi_last_data_recv can have bogus value if no payload is ever received. This patch initializes icsk_ack.lrcvtime for active sessions in tcp_finish_connect(), and for passive sessions in tcp_create_openreq_child() Signed-off-by: NEric Dumazet <edumazet@google.com> Acked-by: NNeal Cardwell <ncardwell@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 17 3月, 2017 2 次提交
-
-
由 Soheil Hassas Yeganeh 提交于
The tcp_tw_recycle was already broken for connections behind NAT, since the per-destination timestamp is not monotonically increasing for multiple machines behind a single destination address. After the randomization of TCP timestamp offsets in commit 8a5bd45f6616 (tcp: randomize tcp timestamp offsets for each connection), the tcp_tw_recycle is broken for all types of connections for the same reason: the timestamps received from a single machine is not monotonically increasing, anymore. Remove tcp_tw_recycle, since it is not functional. Also, remove the PAWSPassive SNMP counter since it is only used for tcp_tw_recycle, and simplify tcp_v4_route_req and tcp_v6_route_req since the strict argument is only set when tcp_tw_recycle is enabled. Signed-off-by: NSoheil Hassas Yeganeh <soheil@google.com> Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NNeal Cardwell <ncardwell@google.com> Signed-off-by: NYuchung Cheng <ycheng@google.com> Cc: Lutz Vieweg <lvml@5t9.de> Cc: Florian Westphal <fw@strlen.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Soheil Hassas Yeganeh 提交于
Commit 8a5bd45f6616 (tcp: randomize tcp timestamp offsets for each connection) randomizes TCP timestamps per connection. After this commit, there is no guarantee that the timestamps received from the same destination are monotonically increasing. As a result, the per-destination timestamp cache in TCP metrics (i.e., tcpm_ts in struct tcp_metrics_block) is broken and cannot be relied upon. Remove the per-destination timestamp cache and all related code paths. Note that this cache was already broken for caching timestamps of multiple machines behind a NAT sharing the same address. Signed-off-by: NSoheil Hassas Yeganeh <soheil@google.com> Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NNeal Cardwell <ncardwell@google.com> Signed-off-by: NYuchung Cheng <ycheng@google.com> Cc: Lutz Vieweg <lvml@5t9.de> Cc: Florian Westphal <fw@strlen.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 10 3月, 2017 1 次提交
-
-
由 Alexey Kodanev 提交于
The functions that are returning tcp sequence number also setup TS offset value, so rename them to better describe their purpose. No functional changes in this patch. Suggested-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NAlexey Kodanev <alexey.kodanev@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-