提交 98e09386 编写于 作者: E Eric Dumazet 提交者: David S. Miller

tcp: tsq: restore minimal amount of queueing

After commit c9eeec26 ("tcp: TSQ can use a dynamic limit"), several
users reported throughput regressions, notably on mvneta and wifi
adapters.

802.11 AMPDU requires a fair amount of queueing to be effective.

This patch partially reverts the change done in tcp_write_xmit()
so that the minimal amount is sysctl_tcp_limit_output_bytes.

It also remove the use of this sysctl while building skb stored
in write queue, as TSO autosizing does the right thing anyway.

Users with well behaving NICS and correct qdisc (like sch_fq),
can then lower the default sysctl_tcp_limit_output_bytes value from
128KB to 8KB.

This new usage of sysctl_tcp_limit_output_bytes permits each driver
authors to check how their driver performs when/if the value is set
to a minimum of 4KB.

Normally, line rate for a single TCP flow should be possible,
but some drivers rely on timers to perform TX completion and
too long TX completion delays prevent reaching full throughput.

Fixes: c9eeec26 ("tcp: TSQ can use a dynamic limit")
Signed-off-by: NEric Dumazet <edumazet@google.com>
Reported-by: NSujith Manoharan <sujith@msujith.org>
Reported-by: NArnaud Ebalard <arno@natisbad.org>
Tested-by: NSujith Manoharan <sujith@msujith.org>
Cc: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
上级 6afae645
...@@ -577,9 +577,6 @@ tcp_limit_output_bytes - INTEGER ...@@ -577,9 +577,6 @@ tcp_limit_output_bytes - INTEGER
typical pfifo_fast qdiscs. typical pfifo_fast qdiscs.
tcp_limit_output_bytes limits the number of bytes on qdisc tcp_limit_output_bytes limits the number of bytes on qdisc
or device to reduce artificial RTT/cwnd and reduce bufferbloat. or device to reduce artificial RTT/cwnd and reduce bufferbloat.
Note: For GSO/TSO enabled flows, we try to have at least two
packets in flight. Reducing tcp_limit_output_bytes might also
reduce the size of individual GSO packet (64KB being the max)
Default: 131072 Default: 131072
tcp_challenge_ack_limit - INTEGER tcp_challenge_ack_limit - INTEGER
......
...@@ -808,12 +808,6 @@ static unsigned int tcp_xmit_size_goal(struct sock *sk, u32 mss_now, ...@@ -808,12 +808,6 @@ static unsigned int tcp_xmit_size_goal(struct sock *sk, u32 mss_now,
xmit_size_goal = min_t(u32, gso_size, xmit_size_goal = min_t(u32, gso_size,
sk->sk_gso_max_size - 1 - hlen); sk->sk_gso_max_size - 1 - hlen);
/* TSQ : try to have at least two segments in flight
* (one in NIC TX ring, another in Qdisc)
*/
xmit_size_goal = min_t(u32, xmit_size_goal,
sysctl_tcp_limit_output_bytes >> 1);
xmit_size_goal = tcp_bound_to_half_wnd(tp, xmit_size_goal); xmit_size_goal = tcp_bound_to_half_wnd(tp, xmit_size_goal);
/* We try hard to avoid divides here */ /* We try hard to avoid divides here */
......
...@@ -1875,8 +1875,12 @@ static bool tcp_write_xmit(struct sock *sk, unsigned int mss_now, int nonagle, ...@@ -1875,8 +1875,12 @@ static bool tcp_write_xmit(struct sock *sk, unsigned int mss_now, int nonagle,
* - better RTT estimation and ACK scheduling * - better RTT estimation and ACK scheduling
* - faster recovery * - faster recovery
* - high rates * - high rates
* Alas, some drivers / subsystems require a fair amount
* of queued bytes to ensure line rate.
* One example is wifi aggregation (802.11 AMPDU)
*/ */
limit = max(skb->truesize, sk->sk_pacing_rate >> 10); limit = max_t(unsigned int, sysctl_tcp_limit_output_bytes,
sk->sk_pacing_rate >> 10);
if (atomic_read(&sk->sk_wmem_alloc) > limit) { if (atomic_read(&sk->sk_wmem_alloc) > limit) {
set_bit(TSQ_THROTTLED, &tp->tsq_flags); set_bit(TSQ_THROTTLED, &tp->tsq_flags);
......
Markdown is supported
0% .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册