1. 28 Nov 2011, 4 commits
    • tcp: allow undo from reordered DSACKs · f698204b
      Authored by Neal Cardwell
      Previously, SACK-enabled connections hung around in TCP_CA_Disorder
      state while snd_una==high_seq, just waiting to accumulate DSACKs and
      hopefully undo a cwnd reduction. This could and did lead to the
      following unfortunate scenario: if some incoming ACKs advanced snd_una
      beyond high_seq, we set undo_marker to 0 and moved to
      TCP_CA_Open, so that if (due to reordering in the ACK return path) we
      shortly thereafter received a DSACK, we were no longer able to
      undo the cwnd reduction.
      
      The change: Simplify the congestion avoidance state machine by
      removing the behavior where SACK-enabled connections hung around in
      the TCP_CA_Disorder state just waiting for DSACKs. Instead, when
      snd_una advances to high_seq or beyond we typically move to
      TCP_CA_Open immediately and allow an undo in either TCP_CA_Open or
      TCP_CA_Disorder if we later receive enough DSACKs.
      
      Other patches in this series will provide other changes that are
      necessary to fully fix this problem.
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
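
      A minimal user-space sketch of the undo idea described above, assuming
      simplified state: each DSACK that proves a retransmission spurious
      decrements a counter, and once every retransmission is proven
      unnecessary the prior cwnd/ssthresh are restored, whether the
      connection is in Open or Disorder. Names such as struct cc_state and
      dsack_undo() are illustrative, not kernel APIs.

      #include <stdbool.h>
      #include <stdio.h>

      /* Illustrative state only; field names loosely mirror tcp_sock. */
      struct cc_state {
              unsigned int cwnd, ssthresh;
              unsigned int prior_cwnd, prior_ssthresh;
              unsigned int undo_retrans;   /* retransmits not yet proven spurious */
      };

      static void enter_recovery(struct cc_state *s, unsigned int retrans)
      {
              s->prior_cwnd = s->cwnd;
              s->prior_ssthresh = s->ssthresh;
              s->ssthresh = s->cwnd / 2;
              s->cwnd = s->ssthresh;
              s->undo_retrans = retrans;
      }

      /* Called for every DSACK that covers one retransmitted segment. */
      static bool dsack_undo(struct cc_state *s)
      {
              if (s->undo_retrans == 0 || --s->undo_retrans > 0)
                      return false;            /* not all retransmits proven spurious yet */
              s->cwnd = s->prior_cwnd;         /* every retransmit was unnecessary: undo */
              s->ssthresh = s->prior_ssthresh;
              return true;
      }

      int main(void)
      {
              struct cc_state s = { .cwnd = 10, .ssthresh = 20 };

              enter_recovery(&s, 2);           /* two segments retransmitted */
              dsack_undo(&s);                  /* first DSACK (possibly after moving to Open) */
              if (dsack_undo(&s))              /* second DSACK: reduction undone */
                      printf("undone: cwnd=%u ssthresh=%u\n", s.cwnd, s.ssthresh);
              return 0;
      }
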
    • tcp: use SACKs and DSACKs that arrive on ACKs below snd_una · e95ae2f2
      Authored by Neal Cardwell
      The bug: When the ACK field is below snd_una (which can happen when
      ACKs are reordered), senders ignored DSACKs (preventing undo) and did
      not call tcp_fastretrans_alert, so they did not increment
      prr_delivered to reflect newly-SACKed sequence ranges, and did not
      call tcp_xmit_retransmit_queue, thus passing up chances to send out
      more retransmitted and new packets based on any newly-SACKed packets.
      
      The change: When the ACK field is below snd_una (the "old_ack" goto
      label), call tcp_fastretrans_alert to allow undo based on any
      newly-arrived DSACKs and try to send out more packets based on
      newly-SACKed packets.
      
      Other patches in this series will provide other changes that are
      necessary to fully fix this problem.
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: use DSACKs that arrive when packets_out is 0 · 5628adf1
      Authored by Neal Cardwell
      The bug: Senders ignored DSACKs after recovery when there were no
      outstanding packets (a common scenario for HTTP servers).
      
      The change: when there are no outstanding packets (the "no_queue" goto
      label), call tcp_fastretrans_alert() in order to use DSACKs to undo
      congestion window reductions.
      
      Other patches in this series will provide other changes that are
      necessary to fully fix this problem.
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: make is_dupack a parameter to tcp_fastretrans_alert() · 7d2b55f8
      Authored by Neal Cardwell
      Allow callers to decide whether an ACK is a duplicate ACK. This is a
      prerequisite to allowing fastretrans_alert to be called from new
      contexts, such as the no_queue and old_ack code paths, from which we
      have extra info that tells us whether an ACK is a dupack.
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
      Signed-off-by: David S. Miller <davem@davemloft.net>
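
      Taken together with the two commits above, the refactor can be pictured
      with a small user-space sketch: the caller computes its own is_dupack
      verdict and a single alert routine becomes reachable from the normal,
      old_ack and no_queue paths. All names here (ack_kind, handle_ack,
      fastretrans_alert_like) are illustrative stand-ins, not the kernel's
      functions.

      #include <stdbool.h>
      #include <stdio.h>

      enum ack_kind { ACK_NORMAL, ACK_OLD, ACK_NO_QUEUE };

      /* Stand-in for tcp_fastretrans_alert(): the caller now supplies the
       * is_dupack verdict instead of the function deriving it internally. */
      static void fastretrans_alert_like(enum ack_kind kind, bool is_dupack,
                                         bool dsack_seen)
      {
              if (dsack_seen)
                      printf("path %d: DSACK counted toward a possible undo\n", kind);
              if (is_dupack)
                      printf("path %d: dupack processing\n", kind);
      }

      /* Toy ACK dispatcher: is_dupack is decided once by the caller, and the
       * alert routine is now reachable from the old_ack and no_queue paths. */
      static void handle_ack(enum ack_kind kind, bool acked_or_sacked_new_data,
                             bool dsack_seen)
      {
              bool is_dupack = !acked_or_sacked_new_data;

              switch (kind) {
              case ACK_NORMAL:
                      fastretrans_alert_like(kind, is_dupack, dsack_seen);
                      break;
              case ACK_OLD:       /* ACK field below snd_una */
                      if (acked_or_sacked_new_data || dsack_seen)
                              fastretrans_alert_like(kind, is_dupack, dsack_seen);
                      break;
              case ACK_NO_QUEUE:  /* nothing left in flight */
                      if (dsack_seen)
                              fastretrans_alert_like(kind, is_dupack, dsack_seen);
                      break;
              }
      }

      int main(void)
      {
              handle_ack(ACK_OLD, false, true);      /* reordered ACK carrying a DSACK */
              handle_ack(ACK_NO_QUEUE, false, true); /* idle connection, late DSACK   */
              return 0;
      }
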
  2. 21 Oct 2011, 3 commits
  3. 20 Oct 2011, 1 commit
  4. 14 Oct 2011, 1 commit
    • net: more accurate skb truesize · 87fb4b7b
      Authored by Eric Dumazet
      skb truesize currently accounts for the sk_buff struct and only part of
      the skb head; kmalloc() roundings are also ignored.
      
      Considering that skb_shared_info is larger than sk_buff, it's time to
      take it into account for better memory accounting.
      
      This patch introduces the SKB_TRUESIZE(X) macro to centralize these
      assumptions in a single place.
      
      At skb allocation time, we put the skb_shared_info struct at the exact
      end of the skb head, to allow better use of memory (lowering the number
      of reallocations), since kmalloc() gives us power-of-two memory blocks.
      
      Unless SLUB/SLUB debug is active, both skb->head and skb_shared_info are
      aligned to cache lines, as before.
      
      Note: This patch might trigger performance regressions because of
      misconfigured protocol stacks hitting per-socket or global memory
      limits that were previously not reached. But it's a necessary step
      toward more accurate memory accounting.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      CC: Andi Kleen <ak@linux.intel.com>
      CC: Ben Hutchings <bhutchings@solarflare.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
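
      A user-space approximation of the accounting the macro centralizes; the
      struct sizes and the cache-line rounding below are placeholders, while
      the real SKB_TRUESIZE(X) adds SKB_DATA_ALIGN(sizeof(struct sk_buff)) and
      SKB_DATA_ALIGN(sizeof(struct skb_shared_info)) to X.

      #include <stdio.h>

      /* Placeholder sizes; the kernel uses sizeof(struct sk_buff) and
       * sizeof(struct skb_shared_info) with its own alignment rounding. */
      #define CACHE_LINE            64u
      #define ALIGN_UP(x, a)        (((x) + (a) - 1) & ~((a) - 1))
      #define FAKE_SKB_SIZE         232u
      #define FAKE_SHARED_INFO_SIZE 320u

      /* Mirrors the shape of SKB_TRUESIZE(X): payload plus both structs,
       * each rounded up, so every user of the constant agrees. */
      #define SKB_TRUESIZE_APPROX(X)                          \
              ((X) + ALIGN_UP(FAKE_SKB_SIZE, CACHE_LINE) +    \
               ALIGN_UP(FAKE_SHARED_INFO_SIZE, CACHE_LINE))

      int main(void)
      {
              /* e.g. truesize charged for a bare ACK of ~66 header bytes */
              printf("truesize(66) = %u\n", SKB_TRUESIZE_APPROX(66u));
              return 0;
      }
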
  5. 05 Oct 2011, 1 commit
    • tcp: properly update lost_cnt_hint during shifting · 1e5289e1
      Authored by Yan, Zheng
      lost_skb_hint is used by tcp_mark_head_lost() to mark the first unhandled skb,
      and lost_cnt_hint is the number of packets or sacked packets before lost_skb_hint.
      When shifting an skb that is before lost_skb_hint: if tcp_is_fack() is true,
      the skb has already been counted in lost_cnt_hint; if tcp_is_fack() is false,
      tcp_sacktag_one() will increase lost_cnt_hint. So tcp_shifted_skb() does not
      need to adjust lost_cnt_hint itself. When shifting the skb that is equal to
      lost_skb_hint, the shifted packets will not be counted by tcp_mark_head_lost(),
      so tcp_shifted_skb() should adjust lost_cnt_hint even if tcp_is_fack(tp) is true.
      Signed-off-by: Zheng Yan <zheng.z.yan@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
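
      A toy restatement of that rule (not the kernel diff): the shifting code
      itself should only bump lost_cnt_hint when the skb being merged away is
      lost_skb_hint itself, since skbs before the hint are accounted for by
      FACK accounting or by the sacktag path. The helper name is hypothetical.

      #include <stdbool.h>
      #include <stdio.h>

      /* Returns how much the shifting code itself should add to
       * lost_cnt_hint, per the reasoning in the message above. */
      static unsigned int hint_adjust(bool skb_is_lost_skb_hint,
                                      unsigned int shifted_pcount)
      {
              /* skbs before the hint are already (or will be) counted by
               * FACK accounting or tcp_sacktag_one(), so add nothing here. */
              if (!skb_is_lost_skb_hint)
                      return 0;
              /* The hint skb itself is merged away, so its packets would
               * otherwise never be counted: adjust here, FACK or not. */
              return shifted_pcount;
      }

      int main(void)
      {
              printf("before hint: +%u\n", hint_adjust(false, 2));
              printf("at hint:     +%u\n", hint_adjust(true, 2));
              return 0;
      }
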
  6. 28 Sep 2011, 1 commit
  7. 27 Sep 2011, 2 commits
  8. 19 Sep 2011, 1 commit
  9. 25 Aug 2011, 1 commit
    • Proportional Rate Reduction for TCP. · a262f0cd
      Authored by Nandita Dukkipati
      This patch implements Proportional Rate Reduction (PRR) for TCP.
      PRR is an algorithm that determines TCP's sending rate in fast
      recovery. PRR avoids excessive window reductions and aims for
      the actual congestion window size at the end of recovery to be as
      close as possible to the window determined by the congestion control
      algorithm. PRR also improves accuracy of the amount of data sent
      during loss recovery.
      
      The patch implements the recommended flavor of PRR called PRR-SSRB
      (Proportional rate reduction with slow start reduction bound) and
      replaces the existing rate halving algorithm. PRR improves upon the
      existing Linux fast recovery under a number of conditions including:
        1) burst losses where the losses implicitly reduce the amount of
      outstanding data (pipe) below the ssthresh value selected by the
      congestion control algorithm and,
        2) losses near the end of short flows where the application runs out
      of data to send.
      
      As an example, with the existing rate halving implementation a single
      loss event can cause a connection carrying short Web transactions to
      go into slow start after recovery. This is because during
      recovery Linux pulls the congestion window down to packets_in_flight+1
      on every ACK. A short Web response often runs out of new data to send
      and its pipe reduces to zero by the end of recovery when all its packets
      are drained from the network. Subsequent HTTP responses using the same
      connection will have to slow start to raise cwnd to ssthresh. PRR on
      the other hand aims for the cwnd to be as close as possible to ssthresh
      by the end of recovery.
      
      A description of PRR and a discussion of its performance can be found at
      the following links:
      - IETF Draft:
          http://tools.ietf.org/html/draft-mathis-tcpm-proportional-rate-reduction-01
      - IETF Slides:
          http://www.ietf.org/proceedings/80/slides/tcpm-6.pdf
          http://tools.ietf.org/agenda/81/slides/tcpm-2.pdf
      - Paper to appear in the Internet Measurement Conference (IMC) 2011:
          Improving TCP Loss Recovery
          Nandita Dukkipati, Matt Mathis, Yuchung Cheng
      Signed-off-by: Nandita Dukkipati <nanditad@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
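
      One PRR-SSRB step, sketched as standalone C following the equations in
      the draft cited above: a proportional part while pipe is still above
      ssthresh, and a slow-start-bounded part to grow back toward ssthresh.
      The struct layout and function names are illustrative, not the kernel's
      implementation.

      #include <stdio.h>

      /* All values are in packets. */
      struct prr {
              unsigned int ssthresh;      /* target cwnd chosen at loss      */
              unsigned int recover_fs;    /* flight size entering recovery   */
              unsigned int prr_delivered; /* data delivered during recovery  */
              unsigned int prr_out;       /* data sent during recovery       */
      };

      static unsigned int prr_sndcnt(struct prr *p, unsigned int pipe,
                                     unsigned int newly_delivered)
      {
              int sndcnt;

              p->prr_delivered += newly_delivered;
              if (pipe > p->ssthresh) {
                      /* proportional part: pace the window reduction */
                      sndcnt = (int)((p->prr_delivered * p->ssthresh +
                                      p->recover_fs - 1) / p->recover_fs) -
                               (int)p->prr_out;
              } else {
                      /* slow start reduction bound: grow back toward ssthresh */
                      int limit = (int)p->prr_delivered - (int)p->prr_out;

                      if (limit < (int)newly_delivered)
                              limit = (int)newly_delivered;
                      limit += 1;
                      sndcnt = (int)(p->ssthresh - pipe);
                      if (sndcnt > limit)
                              sndcnt = limit;
              }
              if (sndcnt < 0)
                      sndcnt = 0;
              p->prr_out += (unsigned int)sndcnt;
              return (unsigned int)sndcnt;   /* cwnd for this ACK = pipe + sndcnt */
      }

      int main(void)
      {
              struct prr p = { .ssthresh = 10, .recover_fs = 20 };

              /* an ACK delivering 2 packets while 18 are still in flight */
              printf("sndcnt = %u\n", prr_sndcnt(&p, 18, 2));
              return 0;
      }
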
  10. 09 Jun 2011, 1 commit
    • tcp: RFC2988bis + taking RTT sample from 3WHS for the passive open side · 9ad7c049
      Authored by Jerry Chu
      This patch lowers the default initRTO from 3 secs to 1 sec per
      RFC 2988bis. It falls back to 3 secs if the SYN or SYN-ACK packet
      has been retransmitted, AND the TCP timestamp option is not on.
      
      It also adds support for taking an RTT sample during the 3WHS on the
      passive open side, just like its active open counterpart, and uses it,
      if valid, to seed the initRTO for the data transmission phase.
      
      The patch also resets ssthresh to its initial default at the
      beginning of the data transmission phase, and reduces cwnd to 1 if
      there has been MORE THAN ONE retransmission during the 3WHS per RFC 5681.
      Signed-off-by: H.K. Jerry Chu <hkchu@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
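
      A small sketch of the fallback policy only (the 3WHS RTT seeding and the
      real srtt/rttvar math are omitted); HZ and the function name are
      illustrative assumptions, not kernel definitions.

      #include <stdbool.h>
      #include <stdio.h>

      #define HZ 1000u   /* example tick rate: 1000 jiffies per second */

      /* Initial RTO policy described above: 1 s by default (RFC 2988bis),
       * 3 s only when the SYN or SYN-ACK was retransmitted and timestamps
       * are off, so the true handshake RTT cannot be told apart. */
      static unsigned int initial_rto_jiffies(bool syn_retransmitted,
                                              bool timestamps_enabled)
      {
              if (syn_retransmitted && !timestamps_enabled)
                      return 3 * HZ;
              return 1 * HZ;
      }

      int main(void)
      {
              printf("clean 3WHS:           %u\n", initial_rto_jiffies(false, false));
              printf("retransmitted, no TS: %u\n", initial_rto_jiffies(true, false));
              printf("retransmitted, TS on: %u\n", initial_rto_jiffies(true, true));
              return 0;
      }
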
  11. 23 Mar 2011, 2 commits
  12. 15 Mar 2011, 1 commit
    • tcp: fix RTT for quick packets in congestion control · febf0819
      Authored by Stephen Hemminger
      In the congestion control interface, the callback for each ACK
      includes an estimated round trip time in microseconds.
      Some algorithms need high resolution (Vegas style) but most only
      need jiffy resolution. If the RTT is not accurate (as with a retransmission),
      -1 is used as a flag value.
      
      When doing coarse resolution, if the RTT is less than a jiffy
      then 0 should be returned rather than no estimate. Otherwise algorithms
      that expect good ACKs to trigger slow start (like CUBIC HyStart)
      will be confused.
      Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
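
      A standalone sketch of the convention described above, using a
      hypothetical conversion helper: -1 stays the "no estimate" flag, while a
      valid sub-jiffy RTT maps to 0 so coarse-resolution consumers still see a
      sample.

      #include <stdio.h>

      #define USEC_PER_JIFFY 1000u   /* example: HZ = 1000 */

      /* Convert a microsecond RTT to jiffy resolution for coarse consumers.
       * -1 still means "no valid sample"; a valid sub-jiffy RTT must map to
       * 0, not to the flag value, or HyStart-style heuristics never see it. */
      static long rtt_in_jiffies(long rtt_us)
      {
              if (rtt_us < 0)
                      return -1;                   /* no estimate at all   */
              return rtt_us / USEC_PER_JIFFY;      /* 0 for quick packets  */
      }

      int main(void)
      {
              printf("%ld %ld %ld\n", rtt_in_jiffies(-1),
                     rtt_in_jiffies(300), rtt_in_jiffies(2500));
              return 0;
      }
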
  13. 22 Feb 2011, 1 commit
  14. 03 Feb 2011, 1 commit
  15. 26 Jan 2011, 1 commit
    • TCP: fix a bug that triggers large number of TCP RST by mistake · 44f5324b
      Authored by Jerry Chu
      This patch fixes a bug that causes TCP RST packets to be generated
      for otherwise correctly behaved applications, e.g., no unread data
      on close. To trigger the bug, at least two conditions must
      be met:
      
      1. The FIN flag is set on the last data packet, i.e., it's not on a
      separate, FIN only packet.
      2. The size of the last data chunk on the receive side matches
      exactly with the size of buffer posted by the receiver, and the
      receiver closes the socket without any further read attempt.
      
      This bug was first noticed on our netperf-based testbed for our IW10
      proposal to the IETF, where a large number of RST packets were observed.
      netperf's read-side code meets condition 2 above 100% of the time.
      
      Before the fix, tcp_data_queue() would queue the last skb that meets
      condition 1 to sk_receive_queue even though its data had been fully
      copied out (via skb_copy_datagram_iovec()). Then if condition 2 is also met,
      tcp_recvmsg() often returns all the copied-out data successfully
      without actually consuming the skb, due to the check
      "if ((chunk = len - tp->ucopy.len) != 0) {"
      and
      "len -= chunk;"
      after tcp_prequeue_process(), which cause "len" to become 0 and force an
      early exit from the big while loop.
      
      I don't see any reason not to free the skb whose data have been fully
      consumed in tcp_data_queue(), regardless of the FIN flag.  We won't
      get there if MSG_PEEK is on. Am I missing some arcane cases related
      to urgent data?
      Signed-off-by: H.K. Jerry Chu <hkchu@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  16. 24 Dec 2010, 1 commit
  17. 10 Dec 2010, 1 commit
    • net: Abstract away all dst_entry metrics accesses. · defb3519
      Authored by David S. Miller
      Use helper functions to hide all direct accesses, especially writes,
      to dst_entry metrics values.
      
      This will allow us to:
      
      1) More easily change how the metrics are stored.
      
      2) Implement COW for metrics.
      
      In particular this will help us put metrics into the inetpeer
      cache if that is what we end up doing.  We can make the _metrics
      member a pointer instead of an array, initially have it point
      at the read-only metrics in the FIB, and then on the first set
      grab an inetpeer entry and point the _metrics member there.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
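
      An illustrative user-space sketch of the accessor pattern (fake_dst,
      dst_metric_get/set are stand-ins, not the kernel helpers): callers never
      touch the _metrics storage directly, so it can later become a pointer to
      shared, copy-on-write data without touching any caller.

      #include <stdio.h>

      enum { METRIC_RTT, METRIC_CWND, METRIC_MAX };

      struct fake_dst {
              unsigned long _metrics[METRIC_MAX];  /* callers never touch this */
      };

      static unsigned long dst_metric_get(const struct fake_dst *d, int m)
      {
              return d->_metrics[m];
      }

      static void dst_metric_set(struct fake_dst *d, int m, unsigned long v)
      {
              /* a later version could copy shared storage here before writing */
              d->_metrics[m] = v;
      }

      int main(void)
      {
              struct fake_dst d = { { 0 } };

              dst_metric_set(&d, METRIC_RTT, 40);
              printf("rtt metric = %lu\n", dst_metric_get(&d, METRIC_RTT));
              return 0;
      }
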
  18. 11 Nov 2010, 1 commit
  19. 18 Oct 2010, 2 commits
  20. 07 Oct 2010, 1 commit
  21. 30 Sep 2010, 1 commit
  22. 28 Sep 2010, 1 commit
  23. 24 Sep 2010, 1 commit
  24. 21 Sep 2010, 1 commit
  25. 31 Aug 2010, 1 commit
  26. 08 Aug 2010, 1 commit
  27. 13 Jul 2010, 1 commit
  28. 16 Jun 2010, 1 commit
    • tcp: unify tcp flag macros · a3433f35
      Authored by Changli Gao
      Unify the TCP flag macros: TCPHDR_FIN, TCPHDR_SYN, TCPHDR_RST, TCPHDR_PSH,
      TCPHDR_ACK, TCPHDR_URG, TCPHDR_ECE and TCPHDR_CWR. The TCPCB_FLAG_* macros
      are replaced with the corresponding TCPHDR_*.
      Signed-off-by: Changli Gao <xiaosuo@gmail.com>
      ----
       include/net/tcp.h                      |   24 ++++++-------
       net/ipv4/tcp.c                         |    8 ++--
       net/ipv4/tcp_input.c                   |    2 -
       net/ipv4/tcp_output.c                  |   59 ++++++++++++++++-----------------
       net/netfilter/nf_conntrack_proto_tcp.c |   32 ++++++-----------
       net/netfilter/xt_TCPMSS.c              |    4 --
       6 files changed, 58 insertions(+), 71 deletions(-)
      Signed-off-by: David S. Miller <davem@davemloft.net>
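
      A standalone snippet showing the flag layout these macros encode; the
      bit values follow the on-the-wire TCP flag byte (this is a sketch, not a
      copy of the kernel header).

      #include <stdio.h>

      /* One flag bit per position in the TCP header flag byte. */
      #define TCPHDR_FIN 0x01
      #define TCPHDR_SYN 0x02
      #define TCPHDR_RST 0x04
      #define TCPHDR_PSH 0x08
      #define TCPHDR_ACK 0x10
      #define TCPHDR_URG 0x20
      #define TCPHDR_ECE 0x40
      #define TCPHDR_CWR 0x80

      int main(void)
      {
              unsigned char flags = TCPHDR_SYN | TCPHDR_ACK;  /* a SYN-ACK */

              printf("flags = 0x%02x\n", flags);              /* prints 0x12 */
              return 0;
      }
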
  29. 01 Jun 2010, 1 commit
  30. 18 May 2010, 1 commit
  31. 29 Apr 2010, 1 commit
  32. 13 Apr 2010, 1 commit
    • net: sk_dst_cache RCUification · b6c6712a
      Authored by Eric Dumazet
      With the latest CONFIG_PROVE_RCU infrastructure, I felt comfortable
      enough to do this work.
      
      sk->sk_dst_cache is currently protected by an rwlock (sk_dst_lock).
      
      This rwlock is read-locked for a very short time, and dst
      entries are already freed after an RCU grace period. This calls for RCU
      again :)
      
      This patch converts sk_dst_lock to a spinlock, and uses RCU for readers.
      
      __sk_dst_get() is supposed to be called with rcu_read_lock() held or with
      the socket locked by the user, so use the appropriate rcu_dereference_check()
      condition (rcu_read_lock_held() || sock_owned_by_user(sk)).
      
      This patch avoids two atomic ops per tx packet on connected UDP sockets,
      for example, and allows sk_dst_lock to be dirtied much less.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
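
      A toy analogue of the pattern using C11 atomics: readers load the cached
      dst pointer without taking any lock, while writers publish a replacement.
      The kernel uses rcu_read_lock()/rcu_dereference() for readers, a spinlock
      plus rcu_assign_pointer() for writers, and frees the old entry only after
      an RCU grace period; none of that machinery is reproduced here, and all
      names are illustrative.

      #include <stdatomic.h>
      #include <stdio.h>
      #include <stdlib.h>

      struct fake_dst { int id; };

      static _Atomic(struct fake_dst *) sk_dst_cache;

      static struct fake_dst *sk_dst_get_like(void)
      {
              /* kernel: rcu_read_lock(); rcu_dereference(sk->sk_dst_cache); */
              return atomic_load_explicit(&sk_dst_cache, memory_order_acquire);
      }

      static struct fake_dst *sk_dst_set_like(struct fake_dst *new_dst)
      {
              /* kernel: writers serialize on a spinlock, then
               * rcu_assign_pointer(); the old entry is freed after a grace period */
              return atomic_exchange_explicit(&sk_dst_cache, new_dst,
                                              memory_order_acq_rel);
      }

      int main(void)
      {
              struct fake_dst *d = malloc(sizeof(*d));

              d->id = 1;
              sk_dst_set_like(d);
              printf("reader sees dst id=%d\n", sk_dst_get_like()->id);
              free(sk_dst_set_like(NULL));   /* "grace period" elided in this toy */
              return 0;
      }
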