Commits · 367a8ce896f14018cc2c6cf2681aa440fff274f4 · openanolis / cloud-kernel

20 May, 2017 1 commit

tcp: warn on negative reordering values · 6f5b24ee

Committed by Soheil Hassas Yeganeh on May 16, 2017

Commit bafbb9c7 ("tcp: eliminate negative reordering
in tcp_clean_rtx_queue") fixes an issue for negative
reordering metrics.

To be resilient to such errors, warn and return
when a negative metric is passed to tcp_update_reordering().

Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
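
The "warn and return" guard described in this message can be sketched in user-space C. This is only an illustration under stated assumptions: struct demo_sock and demo_update_reordering() are hypothetical stand-ins, not the kernel's struct tcp_sock or tcp_update_reordering(), and a plain fprintf() stands in for whatever warning macro the patch actually uses.

```c
#include <stdio.h>

/* Hypothetical stand-in for per-socket state; not the kernel's struct tcp_sock. */
struct demo_sock {
    unsigned int reordering;     /* current reordering estimate, in packets */
    unsigned int max_reordering; /* stand-in for sysctl_tcp_max_reordering */
};

/* Returns 1 if the estimate was raised, 0 if the metric was rejected or ignored. */
static int demo_update_reordering(struct demo_sock *sk, int metric)
{
    if (metric < 0) {
        /* Assumption: the kernel would warn once; here we simply log and bail out. */
        fprintf(stderr, "negative reordering metric %d ignored\n", metric);
        return 0;
    }
    if ((unsigned int)metric <= sk->reordering)
        return 0; /* only ever raise the estimate */

    /* The metric is known to be non-negative here, so the clamp is safe. */
    sk->reordering = (unsigned int)metric < sk->max_reordering
                         ? (unsigned int)metric
                         : sk->max_reordering;
    return 1;
}
```

With the guard in place, a negative metric leaves the estimate untouched, while a large positive metric is still clamped to the configured maximum.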

19 May, 2017 1 commit

tcp: fix tcp_rearm_rto() · b17b8a20

Committed by Eric Dumazet on May 18, 2017

skbs in the (re)transmit queue no longer carry a copy of jiffies
taken at transmit time: skb->skb_mstamp is now in usec units,
with no correlation to tcp_jiffies32.

We have to convert rto from jiffies to usec, compute a time difference
in usec, then convert the delta to HZ units.

Fixes: 9a568de4 ("tcp: switch TCP TS option (RFC 7323) to 1ms clock")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
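
The three conversion steps named in this message (rto from jiffies to usec, a time difference in usec, then the delta back to HZ units) can be sketched as follows. This is a user-space illustration, not the patched tcp_rearm_rto(): DEMO_HZ and every demo_* name are assumptions, and returning 0 for an already expired timer is an illustrative choice.

```c
#include <stdint.h>

#define DEMO_HZ 250u                /* assumed tick rate (HZ=250, i.e. 4 ms per jiffy) */
#define DEMO_USEC_PER_SEC 1000000u

/* Step 1: convert a duration in jiffies to microseconds. */
static uint64_t demo_jiffies_to_usecs(uint32_t j)
{
    return (uint64_t)j * (DEMO_USEC_PER_SEC / DEMO_HZ);
}

/* Step 3: convert microseconds back to jiffies, rounding up so a timer never fires early. */
static uint32_t demo_usecs_to_jiffies(uint64_t us)
{
    uint64_t per_tick = DEMO_USEC_PER_SEC / DEMO_HZ;
    return (uint32_t)((us + per_tick - 1) / per_tick);
}

/*
 * Remaining RTO in jiffies for a packet sent at sent_us, evaluated at now_us.
 * Both timestamps are on the same microsecond clock.
 */
static uint32_t demo_rto_remaining(uint32_t rto_jiffies, uint64_t sent_us, uint64_t now_us)
{
    uint64_t deadline_us = sent_us + demo_jiffies_to_usecs(rto_jiffies);

    if (now_us >= deadline_us)
        return 0; /* already due; illustrative handling only */

    /* Step 2: the time difference is computed entirely in microseconds. */
    return demo_usecs_to_jiffies(deadline_us - now_us);
}
```

For example, with an RTO of 50 jiffies (200 ms at HZ=250) and 100 ms elapsed since the send, the remaining delay comes out as 25 jiffies.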

18 May, 2017 6 commits

tcp: switch TCP TS option (RFC 7323) to 1ms clock · 9a568de4

Committed by Eric Dumazet on May 16, 2017

The TCP Timestamps option is defined in RFC 7323.

Traditionally on Linux, it has been tied to the internal
'jiffies' variable, because it had been a cheap and good enough
generator.

For TCP flows on the Internet, 1 ms resolution would be much better
than 4 ms or 10 ms (HZ=250 or HZ=100 respectively).

For TCP flows in the DC, Google has used usec resolution for more
than two years with great success [1].

Receive size autotuning (DRS) is indeed more precise and converges
faster to optimal window size.

This patch converts tp->tcp_mstamp to a plain u64 value storing
a 1 usec TCP clock.

This choice will allow us to upstream the 1 usec TS option as
discussed in IETF 97.

[1] https://www.ietf.org/proceedings/97/slides/slides-97-tcpm-tcp-options-for-low-latency-00.pdf

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
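
To make the resolution argument above concrete, here is a small sketch that derives a timestamp value from a microsecond clock at different tick rates and measures the same interval with each. The demo_* helpers are hypothetical and are not kernel functions; the sketch assumes the tick rate divides 1000 evenly.

```c
#include <stdint.h>

/* Timestamp value at a given tick rate, derived from a microsecond clock. */
static uint32_t demo_ts_value(uint64_t clock_us, uint32_t ts_hz)
{
    return (uint32_t)(clock_us / (1000000u / ts_hz));
}

/* Elapsed time in ms as seen through timestamps of the given tick rate. */
static uint32_t demo_elapsed_ms(uint64_t t0_us, uint64_t t1_us, uint32_t ts_hz)
{
    uint32_t ticks = demo_ts_value(t1_us, ts_hz) - demo_ts_value(t0_us, ts_hz);
    return ticks * (1000u / ts_hz);
}
```

For a true interval of 3.5 ms (from 10000 us to 13500 us), the 1000 Hz clock reports 3 ms, the 250 Hz clock reports 4 ms, and the 100 Hz clock reports 0 ms.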

tcp: replace misc tcp_time_stamp to tcp_jiffies32 · ac9517fc

Committed by Eric Dumazet on May 16, 2017

After this patch, all uses of tcp_time_stamp will require
a change when we introduce 1 ms and/or 1 us TCP TS option.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: use tcp_jiffies32 in __tcp_oow_rate_limited() · 594208af

Committed by Eric Dumazet on May 16, 2017

This place wants to use tcp_jiffies32; that is good enough.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: use tcp_jiffies32 for rcv_tstamp and lrcvtime · 70eabf0e

Committed by Eric Dumazet on May 16, 2017

Use tcp_jiffies32 instead of tcp_time_stamp, since
tcp_time_stamp will soon be only used for TCP TS option.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: use tcp_jiffies32 to feed tp->snd_cwnd_stamp · c2203cf7

Committed by Eric Dumazet on May 16, 2017

Use tcp_jiffies32 instead of tcp_time_stamp to feed
tp->snd_cwnd_stamp.

tcp_time_stamp will soon be a little bit more expensive
than simply reading 'jiffies'.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: use tcp_jiffies32 to feed tp->lsndtime · d635fbe2

Committed by Eric Dumazet on May 16, 2017

Use tcp_jiffies32 instead of tcp_time_stamp to feed
tp->lsndtime.

tcp_time_stamp will soon be a little bit more expensive
than simply reading 'jiffies'.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

17 May, 2017 1 commit

tcp: eliminate negative reordering in tcp_clean_rtx_queue · bafbb9c7

Committed by Soheil Hassas Yeganeh on May 15, 2017

tcp_ack() can call tcp_fragment(), which may deduct from
tp->fackets_out when the MSS changes. When prior_fackets
is larger than tp->fackets_out, tcp_clean_rtx_queue() can
invoke tcp_update_reordering() with negative values. This
results in absurd tp->reordering values higher than
sysctl_tcp_max_reordering.

Note that tcp_update_reordering() indeed sets tp->reordering
to min(sysctl_tcp_max_reordering, metric), but because
the comparison is signed, a negative metric always wins.

Fixes: c7caf8d3 ("[TCP]: Fix reord detection due to snd_una covered holes")
Reported-by: Rebecca Isaacs <risaacs@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
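
The signed-comparison pitfall described in this message can be reproduced in a few lines of user-space C. This sketch assumes the destination field is an unsigned 32-bit integer; demo_min_int() and demo_buggy_clamp() are hypothetical names, not kernel code.

```c
#include <stdint.h>

/* Signed minimum, mirroring min() applied to two int operands. */
static int demo_min_int(int a, int b)
{
    return a < b ? a : b;
}

/*
 * Clamp a metric to a maximum using a signed comparison, then store the
 * result in an unsigned field. A negative metric "wins" the min() and
 * wraps around to a huge unsigned value on assignment.
 */
static uint32_t demo_buggy_clamp(int max_reordering, int metric)
{
    return (uint32_t)demo_min_int(max_reordering, metric);
}
```

With a maximum of 300, a metric of 1000 is clamped to 300 as intended, but a metric of -1 passes through the signed min() and is stored as 4294967295, far above the limit.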

12 May, 2017 1 commit

tcp: avoid fragmenting peculiar skbs in SACK · b451e5d2

Committed by Yuchung Cheng on May 10, 2017

This patch fixes a bug in splitting an SKB during SACK
processing. Specifically if an skb contains multiple
packets and is only partially sacked in the higher sequences,
tcp_match_sack_to_skb() splits the skb and marks the second fragment
as SACKed.

The current code further attempts rounding up the first fragment
to MSS boundaries. But it misses a boundary condition when the
rounded-up fragment size (pkt_len) is exactly the skb size. Splitting
such an skb is pointless, causes a kernel warning, and aborts
the SACK processing. This patch checks for such an over-split in all
cases before calling tcp_fragment() to prevent these unnecessary warnings.

Fixes: adb92db8 ("tcp: Make SACK code to split only at mss boundaries")
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
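
The boundary condition described in this message can be sketched as a standalone helper. It covers only the case where the head of the skb is not SACKed and the first fragment is rounded up to an MSS boundary. demo_split_len() is a hypothetical name, not the kernel's tcp_match_sack_to_skb(), and mss is assumed to be nonzero.

```c
/*
 * Decide where to split an skb of skb_len bytes whose first pkt_len bytes
 * are not SACKed. Returns the split offset, or 0 when no split is needed.
 * Assumption: mss > 0.
 */
static unsigned int demo_split_len(unsigned int skb_len,
                                   unsigned int pkt_len,
                                   unsigned int mss)
{
    if (pkt_len > mss) {
        /* Round the first fragment up to the next MSS boundary. */
        unsigned int rounded = (pkt_len / mss) * mss;

        if (rounded < pkt_len)
            rounded += mss;
        pkt_len = rounded;
    }

    /* Over-split check: the rounded fragment already covers the whole skb. */
    if (pkt_len >= skb_len)
        return 0;

    return pkt_len;
}
```

For a 3000-byte skb with an MSS of 1000, an unsacked head of 2500 bytes rounds up to exactly 3000, so the helper reports that no split is needed rather than requesting a fragment the size of the whole skb.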

06 May, 2017 1 commit

tcp: randomize timestamps on syncookies · 84b114b9

Committed by Eric Dumazet on May 05, 2017

The whole point of randomization was to hide server uptime, but an attacker
can simply start a SYN flood, and TCP then generates 'old style' timestamps,
directly revealing the server jiffies value.

Also, the TSval sent by the server to a particular remote address varies
depending on whether syncookies are being sent, potentially triggering PAWS
drops for innocent clients.

Let's implement proper randomization, including for SYN cookies.

Also we do not need to export sysctl_tcp_timestamps, since it is not
used from a module.

In v2, I added Florian's feedback and contribution, adding tsoff to
tcp_get_cookie_sock().

v3 removed one unused variable in tcp_v4_connect() as Florian spotted.

Fixes: 95a22cae ("tcp: randomize tcp timestamp offsets for each connection")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Florian Westphal <fw@strlen.de>
Tested-by: Florian Westphal <fw@strlen.de>
Cc: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

27 April, 2017 9 commits

tcp: switch rcv_rtt_est and rcvq_space to high resolution timestamps · 645f4c6f