Commits · 95224ac1801cbfadc2c587be15fded69a13c4e3b · openeuler / raspberrypi-kernel

29 Jan, 2015 3 commits

tcp: fix timing issue in CUBIC slope calculation · d6b1a8a9

Authored by Neal Cardwell on Jan 28, 2015

This patch fixes a bug in CUBIC that causes cwnd to increase slightly
too slowly when multiple ACKs arrive in the same jiffy.

If cwnd is supposed to increase at a rate of more than once per jiffy,
then CUBIC was sometimes too slow. Because the bic_target is
calculated for a future point in time, calculated with time in
jiffies, the cwnd can increase over the course of the jiffy while the
bic_target calculated as the proper CUBIC cwnd at time
t=tcp_time_stamp+rtt does not increase, because tcp_time_stamp only
increases on jiffy tick boundaries.

So since the cnt is set to:
	ca->cnt = cwnd / (bic_target - cwnd);
as cwnd increases but bic_target does not increase due to jiffy
granularity, the cnt becomes too large, causing cwnd to increase
too slowly.

For example:
- suppose at the beginning of a jiffy, cwnd=40, bic_target=44
- so CUBIC sets:
   ca->cnt =  cwnd / (bic_target - cwnd) = 40 / (44 - 40) = 40/4 = 10
- suppose we get 10 acks, each for 1 segment, so tcp_cong_avoid_ai()
   increases cwnd to 41
- so CUBIC sets:
   ca->cnt =  cwnd / (bic_target - cwnd) = 41 / (44 - 41) = 41 / 3 = 13

So now CUBIC will wait for 13 packets to be ACKed before increasing
cwnd to 42, instead of 10 as it should.

The fix is to avoid adjusting the slope (determined by ca->cnt)
multiple times within a jiffy, and instead skip to compute the Reno
cwnd, the "TCP friendliness" code path.
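
The drift in the worked example can be reproduced with a small standalone sketch. The helper names below are hypothetical and are not kernel functions; the model assumes bic_target stays frozen for the whole jiffy and that cnt is recomputed after every cwnd increment, as the message describes.

```c
#include <stdint.h>

/* Hypothetical helper: the slope expression quoted in the commit message.
 * Returns how many ACKed packets CUBIC waits for before cwnd grows by 1.
 * Assumes bic_target > cwnd. */
static uint32_t cubic_cnt(uint32_t cwnd, uint32_t bic_target)
{
	return cwnd / (bic_target - cwnd);
}

/* Hypothetical model: total ACKed packets needed to grow cwnd up to
 * bic_target when cnt is recomputed after each increment while
 * bic_target does not move within the jiffy. */
static uint32_t acks_with_recompute(uint32_t cwnd, uint32_t bic_target)
{
	uint32_t acks = 0;

	while (cwnd < bic_target) {
		acks += cubic_cnt(cwnd, bic_target);
		cwnd++;
	}
	return acks;
}
```

With cwnd=40 and bic_target=44, the first call to cubic_cnt() gives 10 and the second, at cwnd=41, gives 13, matching the numbers in the example.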

Reported-by: Eyal Perry <eyalpe@mellanox.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d6b1a8a9

tcp: fix stretch ACK bugs in CUBIC · 9cd981dc

Authored by Neal Cardwell on Jan 28, 2015

Change CUBIC to properly handle stretch ACKs in additive increase mode
by passing in the count of ACKed packets to tcp_cong_avoid_ai().

In addition, because we are now precisely accounting for stretch ACKs,
including delayed ACKs, we can now remove the delayed ACK tracking and
estimation code that tracked recent delayed ACK behavior in
ca->delayed_ack.
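
A minimal sketch of additive increase that credits every packet covered by one ACK is given below. The struct and function names are hypothetical stand-ins, not the kernel's definitions, and the body is a simplified model rather than the exact code of this patch. It assumes w is nonzero.

```c
#include <stdint.h>

/* Hypothetical stand-in for the few socket fields used here. */
struct cwnd_state {
	uint32_t snd_cwnd;      /* congestion window, in packets */
	uint32_t snd_cwnd_cnt;  /* ACKed packets not yet turned into cwnd growth */
};

/* Additive increase: grow cwnd by 1 for every w packets ACKed,
 * counting all `acked` packets carried by a single (stretch) ACK. */
static void cong_avoid_ai(struct cwnd_state *s, uint32_t w, uint32_t acked)
{
	s->snd_cwnd_cnt += acked;
	if (s->snd_cwnd_cnt >= w) {
		uint32_t delta = s->snd_cwnd_cnt / w;

		s->snd_cwnd_cnt -= delta * w;  /* keep the remainder as credit */
		s->snd_cwnd += delta;
	}
}
```

Because the ACKed count is carried in explicitly, a sender modelled this way does not need a separate estimate of how many packets each ACK typically covers.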

Reported-by: Eyal Perry <eyalpe@mellanox.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9cd981dc

tcp: stretch ACK fixes prep · e73ebb08

Authored by Neal Cardwell on Jan 28, 2015

LRO, GRO, delayed ACKs, and middleboxes can cause "stretch ACKs" that
cover more than the RFC-specified maximum of 2 packets. These stretch
ACKs can cause serious performance shortfalls in common congestion
control algorithms that were designed and tuned years ago with
receiver hosts that were not using LRO or GRO, and were instead
politely ACKing every other packet.

This patch series fixes Reno and CUBIC to handle stretch ACKs.

This patch prepares for the upcoming stretch ACK bug fix patches. It
adds an "acked" parameter to tcp_cong_avoid_ai() to allow for future
fixes to tcp_cong_avoid_ai() to correctly handle stretch ACKs, and
changes all congestion control algorithms to pass in 1 for the ACKed
count. It also changes tcp_slow_start() to return the number of packet
ACK "credits" that were not processed in slow start mode, and can be
processed by the congestion control module in additive increase mode.

In future patches we will fix tcp_cong_avoid_ai() to handle stretch
ACKs, and fix Reno and CUBIC handling of stretch ACKs in slow start
and additive increase mode.
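
The idea of returning unused ACK "credits" from slow start can be sketched as follows. The struct and function names are hypothetical, and the sketch assumes the caller only invokes it while snd_cwnd is at or below snd_ssthresh.

```c
#include <stdint.h>

/* Hypothetical stand-in for the socket fields used here. */
struct ss_state {
	uint32_t snd_cwnd;      /* congestion window, in packets */
	uint32_t snd_ssthresh;  /* slow start threshold, in packets */
};

/* Slow start: grow cwnd by the number of ACKed packets, stop just past
 * ssthresh, and return the ACKed packets that were not consumed. */
static uint32_t slow_start(struct ss_state *s, uint32_t acked)
{
	uint32_t cwnd = s->snd_cwnd + acked;

	if (cwnd > s->snd_ssthresh)
		cwnd = s->snd_ssthresh + 1;
	acked -= cwnd - s->snd_cwnd;   /* credits left for additive increase */
	s->snd_cwnd = cwnd;
	return acked;
}
```

A caller could hand the returned remainder to its additive increase step, which is the division of labour the message describes.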

Reported-by: Eyal Perry <eyalpe@mellanox.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e73ebb08

10 Dec, 2014 2 commits

tcp_cubic: refine Hystart delay threshold · 42eef7a0

Authored by Eric Dumazet on Dec 04, 2014

In commit 2b4636a5 ("tcp_cubic: make the delay threshold of HyStart
less sensitive"), HYSTART_DELAY_MIN was changed to 4 ms.

The remaining problem is that using delay_min + (delay_min/16) as the
threshold is too sensitive.

6.25 % of variation is too small for rtt above 60 ms, which are not
uncommon.

Lets use 12.5 % instead (delay_min + (delay_min/8))
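
The effect of the two thresholds can be sketched in plain milliseconds. This is an illustration only: the function names are hypothetical, the kernel keeps these values in its own scaled units, and any upper bound the kernel applies to the threshold is omitted here. The 4 ms floor is the HYSTART_DELAY_MIN value named above.

```c
#include <stdint.h>

#define HYSTART_DELAY_MIN_MS 4u  /* lower bound named in the commit message */

/* Delay margin added on top of delay_min: delay_min >> shift, but never
 * below the 4 ms floor. shift = 4 is 6.25 %, shift = 3 is 12.5 %. */
static uint32_t hystart_delay_thresh_ms(uint32_t delay_min_ms, unsigned int shift)
{
	uint32_t thresh = delay_min_ms >> shift;

	return thresh < HYSTART_DELAY_MIN_MS ? HYSTART_DELAY_MIN_MS : thresh;
}

/* Nonzero when the current RTT sample exceeds delay_min plus the margin,
 * which is the point where delay-based HyStart would end slow start. */
static int hystart_delay_exceeded(uint32_t curr_rtt_ms, uint32_t delay_min_ms,
				  unsigned int shift)
{
	return curr_rtt_ms >
	       delay_min_ms + hystart_delay_thresh_ms(delay_min_ms, shift);
}
```

For the 80 ms path used in the test below, the margin is 5 ms with the old shift and 10 ms with the new one, so an 87 ms sample trips the old threshold but not the new one.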

Tested:
 80 ms RTT between peers, FQ/pacing packet scheduler on sender.
 10 bulk transfers of 10 seconds :

nstat >/dev/null
for i in `seq 1 10`
 do
   netperf -H remote -- -k THROUGHPUT | grep THROUGHPUT
 done
nstat | grep Hystart

With the 6.25 % threshold :

THROUGHPUT=20.66
THROUGHPUT=249.38
THROUGHPUT=254.10
THROUGHPUT=14.94
THROUGHPUT=251.92
THROUGHPUT=237.73
THROUGHPUT=19.18
THROUGHPUT=252.89
THROUGHPUT=21.32
THROUGHPUT=15.58
TcpExtTCPHystartTrainDetect     2                  0.0
TcpExtTCPHystartTrainCwnd       4756               0.0
TcpExtTCPHystartDelayDetect     5                  0.0
TcpExtTCPHystartDelayCwnd       180                0.0

With the 12.5 % threshold
THROUGHPUT=251.09
THROUGHPUT=247.46
THROUGHPUT=250.92
THROUGHPUT=248.91
THROUGHPUT=250.88
THROUGHPUT=249.84
THROUGHPUT=250.51
THROUGHPUT=254.15
THROUGHPUT=250.62
THROUGHPUT=250.89
TcpExtTCPHystartTrainDetect     1                  0.0
TcpExtTCPHystartTrainCwnd       3175               0.0

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Tested-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

42eef7a0

tcp_cubic: add SNMP counters to track how effective is Hystart · 6e3a8a93

Authored by Eric Dumazet on Dec 04, 2014

When deploying FQ pacing, one thing we noticed is that CUBIC Hystart
triggers too soon.

Having SNMP counters that show how often the various Hystart
methods trigger is useful before making any modifications.

This patch adds SNMP counters tracking how many times the "ack train" or
"Delay" based Hystart triggers, and the cumulative sum of cwnd at the time
Hystart decided to end SS (Slow Start).

myhost:~# nstat -a | grep Hystart
TcpExtTCPHystartTrainDetect     9                  0.0
TcpExtTCPHystartTrainCwnd       20650              0.0
TcpExtTCPHystartDelayDetect     10                 0.0
TcpExtTCPHystartDelayCwnd       360                0.0

->
 Train detection was triggered 9 times, and average cwnd was
 20650/9=2294,
 Delay detection was triggered 10 times and average cwnd was 36

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6e3a8a93

02 Sep, 2014 1 commit

tcp: whitespace fixes · 688d1945

Authored by stephen hemminger on Aug 29, 2014

Fix places where there is space before tab, long lines, and
awkward if(){, double spacing etc. Add blank line after declaration/initialization.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

688d1945

04 May, 2014 1 commit

tcp: remove in_flight parameter from cong_avoid() methods · 24901551

Authored by Eric Dumazet on May 02, 2014

Commit e114a710 ("tcp: fix cwnd limited checking to improve
congestion control") obsoleted in_flight parameter from
tcp_is_cwnd_limited() and its callers.

This patch does the removal as promised.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

24901551

01 May, 2014 1 commit

tcp_cubic: fix the range of delayed_ack · 0cda345d

Authored by Liu Yu on Apr 30, 2014

Commit b9f47a3a (tcp_cubic: limit delayed_ack ratio to prevent
divide error) tried to prevent a divide error, but there is still a small
chance that delayed_ack can reach zero. If the param cnt gets a
negative value, then ratio+cnt would overflow and may happen to be zero.
As a result, min(ratio, ACK_RATIO_LIMIT) will evaluate to zero.

In some old kernels, such as 2.6.32, there is a bug that would
pass a negative param, which then ultimately leads to this divide error.

Commit 5b35e1e6 (tcp: fix tcp_trim_head() to adjust segment count
with skb MSS) fixed the negative param issue. However,
it is safer to fix the range of delayed_ack as well,
to make sure we do not hit a divide by zero.
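
The wrap to zero can be illustrated with a standalone sketch. The function names are hypothetical, and the constants are assumptions chosen to mirror the description (a limit of 32 scaled by a shift of 4); the update is a simplified model of the moving average, not the exact kernel code.

```c
#include <stdint.h>

#define ACK_RATIO_SHIFT 4u                        /* assumed scaling shift */
#define ACK_RATIO_LIMIT (32u << ACK_RATIO_SHIFT)  /* assumed upper limit */

/* Moving-average update of the delayed-ACK ratio with an upper bound only. */
static uint32_t delayed_ack_min_only(uint32_t delayed_ack, uint32_t cnt)
{
	uint32_t ratio = delayed_ack;

	ratio -= delayed_ack >> ACK_RATIO_SHIFT;
	ratio += cnt;  /* wraps modulo 2^32 if cnt holds a "negative" value */
	return ratio < ACK_RATIO_LIMIT ? ratio : ACK_RATIO_LIMIT;
}

/* Same update, additionally bounded below by 1 so the result can never
 * be zero and can safely be used as a divisor. */
static uint32_t delayed_ack_clamped(uint32_t delayed_ack, uint32_t cnt)
{
	uint32_t ratio = delayed_ack_min_only(delayed_ack, cnt);

	return ratio < 1u ? 1u : ratio;
}
```

With delayed_ack = 32 and cnt = (uint32_t)-30, the first variant returns 0 while the second returns 1.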

CC: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Liu Yu <allanyuliu@tencent.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0cda345d

27 Feb, 2014 1 commit

tcp: switch rtt estimations to usec resolution · 740b0f18

Authored by Eric Dumazet on Feb 26, 2014

Upcoming congestion controls for TCP require usec resolution for RTT
estimations. Millisecond resolution is simply not enough these days.

FQ/pacing in DC environments also requires this change for finer control
and removal of bimodal behavior due to the current hack in
tcp_update_pacing_rate() for 'small rtt'.

TCP_CONG_RTT_STAMP is no longer needed.

As Julian Anastasov pointed out, we need to keep user compatibility :
tcp_metrics used to export RTT and RTTVAR in msec resolution,
so we added RTT_US and RTTVAR_US. An iproute2 patch is needed
to use the new attributes if provided by the kernel.

In this example ss command displays a srtt of 32 usecs (10Gbit link)

lpk51:~# ./ss -i dst lpk52
Netid  State      Recv-Q Send-Q   Local Address:Port       Peer Address:Port
tcp    ESTAB      0      1         10.246.11.51:42959      10.246.11.52:64614
         cubic wscale:6,6 rto:201 rtt:0.032/0.001 ato:40 mss:1448 cwnd:10 send 3620.0Mbps pacing_rate 7240.0Mbps unacked:1 rcv_rtt:993 rcv_space:29559

Updated iproute2 ip command displays :

lpk51:~# ./ip tcp_metrics | grep 10.246.11.52
10.246.11.52 age 561.914sec cwnd 10 rtt 274us rttvar 213us source 10.246.11.51

Old binary displays :

lpk51:~# ip tcp_metrics | grep 10.246.11.52
10.246.11.52 age 561.914sec cwnd 10 rtt 250us rttvar 125us source 10.246.11.51

With help from Julian Anastasov, Stephen Hemminger and Yuchung Cheng

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Larry Brakmo <brakmo@google.com>
Cc: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>

740b0f18

05 Nov, 2013 1 commit

tcp: properly handle stretch acks in slow start · 9f9843a7

Authored by Yuchung Cheng on Oct 31, 2013

Slow start currently increases cwnd by 1 if an ACK acknowledges some packets,
regardless of the number of packets. Consequently slow start performance
is highly dependent on the degree of the stretch ACKs caused by
receiver or network ACK compression mechanisms (e.g., delayed-ACK,
GRO, etc).  But the slow start algorithm is meant to send twice the number
of packets that have left the network, so it should process a stretch ACK of
degree N as if it were N ACKs of degree 1, and then exit when cwnd exceeds
ssthresh. A follow up patch will use the remainder of the N (if greater
than 1) to adjust cwnd in the congestion avoidance phase.

In addition this patch retires the experimental limited slow start
(LSS) feature. LSS has multiple drawbacks but questionable benefit. The
fractional cwnd increase in LSS requires a loop in slow start even
though it's rarely used. Configuring such an increase step via a global
sysctl on different BDPs seems hard. Finally and most importantly the
slow start overshoot concern is now better covered by the Hybrid slow
start (hystart) enabled by default.

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9f9843a7

08 Aug, 2013 2 commits

tcp: cubic: fix bug in bictcp_acked() · cd6b423a

Authored by Eric Dumazet on Aug 05, 2013

While investigating a strange increase of retransmit rates
on hosts ~24 days after boot, Van found hystart was disabled
if ca->epoch_start was 0, as the following condition is true
when the tcp_time_stamp high order bit is set.

(s32)(tcp_time_stamp - ca->epoch_start) < HZ

Quoting Van :

 At initialization & after every loss ca->epoch_start is set to zero so
 I believe that the above line will turn off hystart as soon as the 2^31
 bit is set in tcp_time_stamp & hystart will stay off for 24 days.
 I think we've observed that cubic's restart is too aggressive without
 hystart so this might account for the higher drop rate we observe.
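
The behaviour Van describes can be checked with a standalone sketch. The function names are hypothetical and HZ=1000 is assumed. The guarded variant is one way to treat a zero epoch_start as "no epoch yet"; it is shown as an illustration and is not claimed to be the exact form of the patch.

```c
#include <stdint.h>

#define HZ 1000  /* assumed tick rate */

/* The condition quoted above: is `now` within one second of epoch_start? */
static int within_first_second(uint32_t now, uint32_t epoch_start)
{
	return (int32_t)(now - epoch_start) < HZ;
}

/* Variant that ignores the check while epoch_start is still zero. */
static int within_first_second_guarded(uint32_t now, uint32_t epoch_start)
{
	return epoch_start != 0 && (int32_t)(now - epoch_start) < HZ;
}
```

Once the top bit of the 32-bit timestamp is set and epoch_start is 0, the signed difference is negative, so the unguarded condition stays true until the timestamp wraps.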

Diagnosed-by: Van Jacobson <vanj@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cd6b423a

tcp: cubic: fix overflow error in bictcp_update() · 2ed0edf9

Authored by Eric Dumazet on Aug 05, 2013

commit 17a6e9f1 ("tcp_cubic: fix clock dependency") added an
overflow error in bictcp_update() in following code :

/* change the unit from HZ to bictcp_HZ */
t = ((tcp_time_stamp + msecs_to_jiffies(ca->delay_min>>3) -
      ca->epoch_start) << BICTCP_HZ) / HZ;

Because msecs_to_jiffies() returns unsigned long, the compiler does
implicit type promotion.

We really want to constrain (tcp_time_stamp - ca->epoch_start)
to a signed 32bit value, or else 't' has unexpected high values.

This bug triggers an increase of retransmit rates ~24 days after
boot [1], as the high order bit of tcp_time_stamp flips.

[1] for hosts with HZ=1000

Big thanks to Van Jacobson for spotting this problem.
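
The promotion effect on its own can be shown with a short sketch. The function names are hypothetical, and unsigned long on a 64-bit kernel is modelled as uint64_t. The sketch only illustrates how widening before the subtraction changes the result; it does not try to reproduce the exact condition that appears ~24 days after boot.

```c
#include <stdint.h>

/* Mixed-width expression: `now` is widened to 64 bits before the
 * subtraction, so a 32-bit wraparound between epoch_start and now
 * is not cancelled out. */
static uint64_t elapsed_promoted(uint32_t now, uint32_t epoch_start,
				 uint64_t delay)
{
	return now + delay - epoch_start;
}

/* Difference first constrained to a signed 32-bit value, then widened. */
static uint64_t elapsed_constrained(uint32_t now, uint32_t epoch_start,
				    uint64_t delay)
{
	int32_t diff = (int32_t)(now - epoch_start);  /* 32-bit modular difference */

	return (uint64_t)diff + delay;
}
```

When the 32-bit counter has wrapped between epoch_start and now, the first form yields a value above 2^32 while the second yields the small elapsed time.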

Diagnosed-by: Van Jacobson <vanj@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2ed0edf9

21 Jan, 2012 1 commit

tcp: fix undo after RTO for CUBIC · 5a45f008

Authored by Neal Cardwell on Jan 18, 2012

This patch fixes CUBIC so that cwnd reductions made during RTOs can be
undone (just as they already can be undone when using the default/Reno
behavior).

When undoing cwnd reductions, BIC-derived congestion control modules
were restoring the cwnd from last_max_cwnd. There were two problems
with using last_max_cwnd to restore a cwnd during undo:

(a) last_max_cwnd was set to 0 on state transitions into TCP_CA_Loss
(by calling the module's reset() functions), so cwnd reductions from
RTOs could not be undone.

(b) when fast_convergence is enabled (which it is by default)
last_max_cwnd does not actually hold the value of snd_cwnd before the
loss; instead, it holds a scaled-down version of snd_cwnd.

This patch makes the following changes:

(1) upon undo, revert snd_cwnd to ca->loss_cwnd, which is already, as
the existing comment notes, the "congestion window at last loss"

(2) stop forgetting ca->loss_cwnd on TCP_CA_Loss events

(3) use ca->last_max_cwnd to check if we're in slow start
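
Problem (b) and change (1) can be illustrated with a small sketch. The function names are hypothetical, and the beta and scale constants are assumed defaults rather than values stated in this message. Taking the larger of the current cwnd and loss_cwnd on undo is also an assumption of the sketch.

```c
#include <stdint.h>

#define BICTCP_BETA_SCALE 1024u  /* assumed scale factor */
#define BETA 717u                /* assumed default decrease factor */

/* Value fast convergence would store in last_max_cwnd: a scaled-down
 * version of the cwnd seen before the loss. */
static uint32_t fast_convergence_max(uint32_t snd_cwnd)
{
	return (snd_cwnd * (BICTCP_BETA_SCALE + BETA)) / (2 * BICTCP_BETA_SCALE);
}

/* Undo per change (1): restore from loss_cwnd, without shrinking the
 * window if it has already grown past that value. */
static uint32_t undo_cwnd(uint32_t snd_cwnd, uint32_t loss_cwnd)
{
	return snd_cwnd > loss_cwnd ? snd_cwnd : loss_cwnd;
}
```

For a cwnd of 100 before the loss, the scaled value is 85, so restoring from last_max_cwnd would undershoot, whereas restoring from loss_cwnd returns the full 100.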

Signed-off-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Sangtae Ha <sangtae.ha@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5a45f008

09 May, 2011 1 commit

tcp_cubic: limit delayed_ack ratio to prevent divide error · b9f47a3a

Authored by stephen hemminger on May 04, 2011

TCP Cubic keeps a metric that estimates the amount of delayed
acknowledgements to use in adjusting the window. If an abnormally
large number of packets are acknowledged at once, then the update
could wrap and reach zero. This kind of ACK could only
happen when there was a large window and a huge number of
ACKs were lost.

This patch limits the value of the delayed ack ratio. The choice of 32
is just a conservative value since normally it should be in the range of
1 to 4 packets.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b9f47a3a

15 Mar, 2011 6 commits

tcp_cubic: fix low utilization of CUBIC with HyStart · b5ccd073

Authored by Sangtae Ha on Mar 14, 2011

HyStart sets the initial exit point of slow start.
Suppose that HyStart exits at 0.5BDP in a BDP network and no history exists.
If the BDP of a network is large, CUBIC's initial cwnd growth may be
too conservative to utilize the link.
CUBIC increases the cwnd 20% per RTT in this case.

Signed-off-by: Sangtae Ha <sangtae.ha@gmail.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b5ccd073

tcp_cubic: make the delay threshold of HyStart less sensitive · 2b4636a5

Authored by Sangtae Ha on Mar 14, 2011

Make HyStart less sensitive to abrupt delay variations due to buffer bloat.

Signed-off-by: Sangtae Ha <sangtae.ha@gmail.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Reported-by: Lucas Nussbaum <lucas.nussbaum@loria.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>

2b4636a5

tcp_cubic: enable high resolution ack time if needed · 3b585b34

Authored by stephen hemminger on Mar 14, 2011

This is a refined version of an earlier patch by Lucas Nussbaum.
Cubic needs RTT values in milliseconds. If HZ < 1000 then
the values will be too coarse.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Reported-by: Lucas Nussbaum <lucas.nussbaum@loria.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>

3b585b34

tcp_cubic: fix clock dependency · 17a6e9f1

Authored by stephen hemminger on Mar 14, 2011

The hystart code was written with the assumption that HZ=1000.
Replace the use of jiffies with bictcp_clock as a millisecond
real time clock.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Reported-by: Lucas Nussbaum <lucas.nussbaum@loria.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>

17a6e9f1

tcp_cubic: make ack train delta value a parameter · aac46324

Authored by stephen hemminger on Mar 14, 2011

Make the spacing between ACK's that indicates a train a tuneable
value like other hystart values.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

aac46324

tcp_cubic: fix comparison of jiffies · c54b4b76

Authored by stephen hemminger on Mar 14, 2011

Jiffies wraps around therefore the correct way to compare is
to use cast to signed value.
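
A wraparound-safe ordering test on 32-bit tick values can be sketched as below. The function name is hypothetical, and the comparison is only meaningful while the two values are less than 2^31 ticks apart.

```c
#include <stdint.h>

/* Nonzero if 32-bit tick value a is later than b, even when the counter
 * has wrapped between them. The subtraction is done modulo 2^32 and the
 * result is then interpreted as a signed value. */
static int tick_after(uint32_t a, uint32_t b)
{
	return (int32_t)(b - a) < 0;
}
```

A plain `a > b` gets the order wrong across the wrap: 5 is later than 0xFFFFFFF0 in tick terms, but compares as smaller.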

Note: cubic is not using full jiffies value on 64 bit arch
because using full unsigned long makes struct bictcp grow too
large for the available ca_priv area.

Includes correction from Sangtae Ha to improve ack train detection.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c54b4b76

10 Mar, 2011 1 commit

tcp: mark tcp_congestion_ops read_mostly · a252bebe

Authored by Stephen Hemminger on Mar 10, 2011

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a252bebe

02 Mar, 2009 1 commit

tcp: add helper for AI algorithm · 758ce5c8

Authored by Ilpo Järvinen on Feb 28, 2009

It seems that the implementation in yeah was inconsistent with what
the others did, as it would increase cwnd one ack earlier than the
others do.

Size benefits:

  bictcp_cong_avoid |  -36
  tcp_cong_avoid_ai |  +52
  bictcp_cong_avoid |  -34
  tcp_scalable_cong_avoid |  -36
  tcp_veno_cong_avoid |  -12
  tcp_yeah_cong_avoid |  -38

= -104 bytes total

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>

758ce5c8

02 Nov, 2008 1 commit

[TCP] CUBIC v2.3 · ae27e98a

Authored by Sangtae Ha on Oct 29, 2008

Signed-off-by: Sangtae Ha <sha2@ncsu.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>

ae27e98a

01 May, 2008 1 commit

rename div64_64 to div64_u64 · 6f6d6a1a

Authored by Roman Zippel on May 01, 2008

Rename div64_64 to div64_u64 to make it consistent with the other divide
functions, so it clearly includes the type of the divide.  Move its definition
to math64.h as currently no architecture overrides the generic implementation.
 They can still override it of course, but the duplicated declarations are
avoided.

Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Cc: Avi Kivity <avi@qumranet.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

6f6d6a1a

05 Mar, 2008 1 commit

[TCP]: TCP cubic v2.2 · 6b3d6263

Authored by Sangtae Ha on Mar 04, 2008

We have updated CUBIC to fix some issues with slow increase in large
BDP networks. We also improved its convergence speed. The fix is in
fact very simple -- the window increase limit of smax during the
window probing phase (i.e., convex growth phase) is removed. We found
that this does not affect TCP friendliness, but only improves its
scalability. We have run some tests in our lab and also over the
Internet path from NCSU to Japan. These results can be seen from the
following page:

http://netsrv.csc.ncsu.edu/wiki/index.php/Intra_protocol_fairness_testing_with_linux-2.6.23.9
http://netsrv.csc.ncsu.edu/wiki/index.php/RTT_fairness_testing_with_linux-2.6.23.9
http://netsrv.csc.ncsu.edu/wiki/index.php/TCP_friendliness_testing_with_linux-2.6.23.9

Signed-off-by: Sangtae Ha <sha2@ncsu.edu>
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6b3d6263

29 Jan, 2008 1 commit

[TCP]: Cong.ctrl modules: remove unused good_ack from cong_avoid · c3a05c60

Authored by Ilpo Järvinen on Dec 02, 2007

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

c3a05c60

11 Oct, 2007 1 commit

[TCP]: Remove num_acked>0 checks from cong.ctrl mods pkts_acked · 35e86941

Authored by Ilpo Järvinen on May 31, 2007

There is no need for such check in pkts_acked because the
callback is not invoked unless at least one segment got fully
ACKed (i.e., the snd_una moved past skb's end_seq) by the
cumulative ACK's snd_una advancement.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>

35e86941

31 Jul, 2007 2 commits

[TCP]: cubic - eliminate use of receive time stamp · e7d0c885

Authored by Stephen Hemminger on Jul 25, 2007

Remove use of received timestamp option value from RTT calculation in Cubic.
A hostile receiver may be returning a larger timestamp option than the original
value. This would cause the sender to believe the malevolent receiver had
a larger RTT and because Cubic tries to provide some RTT friendliness, the
sender would then favor the liar.

Instead, use the jiffies-resolution RTT value already computed and
passed back after ack.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

e7d0c885

[TCP]: congestion control API pass RTT in microseconds · 30cfd0ba

Authored by Stephen Hemminger on Jul 25, 2007

This patch changes the API for the callback that is done after an ACK is
received. It solves a couple of issues:

  * Some congestion controls want higher resolution value of RTT
    (controlled by TCP_CONG_RTT_SAMPLE flag). These don't really want a ktime, but
    all compute a RTT in microseconds.

  * Other congestion control could use RTT at jiffies resolution.

To keep API consistent the units should be the same for both cases, just the
resolution should change.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

30cfd0ba

18 Jul, 2007 1 commit

[TCP]: remove unused argument to cong_avoid op · 16751347

Authored by Stephen Hemminger on Jul 16, 2007

None of the existing TCP congestion controls use the rtt value passed
in the ca_ops->cong_avoid interface, which is lucky because seq_rtt
could have been -1 when handling a duplicate ack.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

16751347

13 Jun, 2007 1 commit

[TCP]: Set initial_ssthresh default to zero in Cubic and BIC. · 66e1e3b2

Authored by David S. Miller on Jun 13, 2007

Because of the current default of 100, Cubic and BIC perform very
poorly compared to standard Reno.

In the worst case, this change makes Cubic and BIC as aggressive as
Reno.  So this change should be very safe.

Signed-off-by: David S. Miller <davem@davemloft.net>

66e1e3b2

26 Apr, 2007 5 commits

[TCP]: Congestion control API update. · 164891aa

Authored by Stephen Hemminger on Apr 23, 2007

Do some simple changes to make congestion control API faster/cleaner.
* use ktime_t rather than timeval
* merge rtt sampling into existing ack callback
  this means one indirect call versus two per ack.
* use flags bits to store options/settings

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

164891aa

[TCP]: cubic update for net-2.6.22 · e1c3e7ab

Authored by Stephen Hemminger on Mar 24, 2007

The following update received from Injong updates TCP cubic to the latest
version. I am running more complete tests and will have results after 4/1.

According to Injong: the new version improves on its scalability,
fairness and stability. So in all properties, we confirmed it shows better
performance.

NCSU results (for 2.6.18 and 2.6.20) available:
http://netsrv.csc.ncsu.edu/wiki/index.php/TCP_Testing

This version is described in a new Internet draft for CUBIC.
http://www.ietf.org/internet-drafts/draft-rhee-tcp-cubic-00.txt

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

e1c3e7ab

[TCP]: cubic optimization · 7e58886b

Authored by Stephen Hemminger on Mar 22, 2007

Use willy's work in optimizing cube root by having table for small values.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

7e58886b

[TCP] tcp_cubic: faster cube root · c5f5877c

Authored by Stephen Hemminger on Mar 25, 2007

The Newton-Raphson method is quadratically convergent so
only a small fixed number of steps are necessary.
Therefore it is faster to unroll the loop. Since div64_64 is no longer
inline it won't cause code explosion.

Also fixes a bug that can occur if x^2 was bigger than 32 bits.
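
An unrolled Newton-Raphson cube root in integer arithmetic might look like the sketch below. The function names, the bit-length-based initial estimate and the choice of three steps are assumptions made for illustration, not a copy of the patch. Because of integer truncation the result can differ from the exact root by one for some inputs.

```c
#include <stdint.h>

/* Position of the highest set bit, 1-based; 0 for a == 0. */
static unsigned int bit_length(uint64_t a)
{
	unsigned int n = 0;

	while (a) {
		n++;
		a >>= 1;
	}
	return n;
}

/* Integer cube root estimate: x <- (2x + a / x^2) / 3, unrolled three times. */
static uint32_t cube_root_newton(uint64_t a)
{
	uint32_t x;

	if (a == 0)
		return 0;

	/* Initial estimate: 2^(bits/3), roughly exp(log(a) / 3). */
	x = 1u << (bit_length(a) / 3);

	/* x * x is computed in 64 bits so it cannot overflow 32 bits. */
	x = (2 * x + (uint32_t)(a / ((uint64_t)x * x))) / 3;
	x = (2 * x + (uint32_t)(a / ((uint64_t)x * x))) / 3;
	x = (2 * x + (uint32_t)(a / ((uint64_t)x * x))) / 3;
	return x;
}
```

For a = 1000000 the estimate starts at 64 and moves through 124 and 104 to 100 over the three steps.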

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

c5f5877c

[NET]: div64_64 consolidate (rev3) · 3927f2e8

Authored by Stephen Hemminger on Mar 25, 2007

Here is the current version of the 64 bit divide common code.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

3927f2e8

13 Feb, 2007 1 commit

[TCP]: Use read mostly for CUBIC parameters. · 59758f44

Authored by Stephen Hemminger on Feb 12, 2007

These module parameters should be in the read mostly area to avoid
cache pollution.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

59758f44

11 Feb, 2007 1 commit

[NET] IPV4: Fix whitespace errors. · e905a9ed

Authored by YOSHIFUJI Hideaki on Feb 09, 2007

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

e905a9ed

26 Oct, 2006 1 commit

[TCP] cubic: scaling error · 22119240

Authored by Stephen Hemminger on Oct 25, 2006

Doug Leith observed a discrepancy between the version of CUBIC described
in the papers and the version in 2.6.18. A math error related to scaling
causes Cubic to grow too slowly.

Patch is from "Sangtae Ha" <sha2@ncsu.edu>. I validated that
it does fix the problems.

See the following to show behavior over 500ms 100 Mbit link.

Sender (2.6.19-rc3) --- Bridge (2.6.18-rt7) ------- Receiver (2.6.19-rc3)
1G [netem] 100M

http://developer.osdl.org/shemminger/tcp/2.6.19-rc3/cubic-orig.png
http://developer.osdl.org/shemminger/tcp/2.6.19-rc3/cubic-fix.png

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

22119240

23 Sep, 2006 1 commit

[TCP] Congestion control (modulo lp, bic): use BUILD_BUG_ON · 74975d40

Authored by Alexey Dobriyan on Aug 25, 2006

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

74975d40