Commits · 132adf54639cf7dd9315e8df89c2faa59f6e46d9 · openeuler / raspberrypi-kernel

26 Apr, 2007 2 commits

[TCP]: Abstract out all write queue operations. · fe067e8a

Authored by David S. Miller on Mar 07, 2007

This allows the write queue implementation to be changed,
for example, to one which allows fast interval searching.
Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP]: Add two new spurious RTO responses to FRTO · 3cfe3baa

Authored by Ilpo Järvinen on Feb 27, 2007

New sysctl tcp_frto_response is added to select amongst these
responses:
	- Rate halving based; reuses CA_CWR state (default)
	- Very conservative; used to be the only one available (=1)
	- Undo cwr; undoes ssthresh and cwnd reductions (=2)

The response with rate halving requires a new parameter to
tcp_enter_cwr because FRTO has already reduced ssthresh and
doing a second reduction there has to be prevented. In addition,
to keep things nice on 80 cols screen, a local variable was
added.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
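
The selection among the three responses listed in this commit can be sketched as a small dispatch on the sysctl value. This is only an illustration under stated assumptions: the enum and the function `select_frto_response` are hypothetical names, and treating every value other than 1 and 2 as the rate-halving default is an assumption drawn from the "(default)" note in the message.

```c
/* Hypothetical labels for the three responses named in the commit. */
enum frto_response {
	FRTO_RATE_HALVING,	/* reuses CA_CWR state (default) */
	FRTO_CONSERVATIVE,	/* the previously available response (=1) */
	FRTO_UNDO_CWR		/* undoes ssthresh and cwnd reductions (=2) */
};

/* Map a tcp_frto_response-style value to a response.
 * Assumption: anything other than 1 or 2 falls back to the default. */
static enum frto_response select_frto_response(int sysctl_value)
{
	switch (sysctl_value) {
	case 2:
		return FRTO_UNDO_CWR;
	case 1:
		return FRTO_CONSERVATIVE;
	default:
		return FRTO_RATE_HALVING;
	}
}
```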

10 Apr, 2007 1 commit

[TCP]: slow_start_after_idle should influence cwnd validation too · 15d33c07

Authored by David S. Miller on Apr 09, 2007

For the cases that slow_start_after_idle is meant to deal
with, it is almost a certainty that the congestion window
tests will think the connection is application limited and
we'll thus decrease the cwnd there too.  This defeats the
whole point of setting slow_start_after_idle to zero.

So test it there too.

We do not cancel out the entire tcp_cwnd_validate() function
so that if the sysctl is changed we still have the validation
state maintained.
Signed-off-by: David S. Miller <davem@davemloft.net>
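
The idea of "testing it there too" can be sketched as a guard in front of the application-limited window reduction. The function name and parameters below are hypothetical and the timing comparison is a simplification; the sketch only assumes that the reduction is considered once the connection has been idle for at least one retransmission timeout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: decide whether the application-limited cwnd
 * reduction should run. With slow_start_after_idle set to zero the
 * reduction is skipped, while the caller can still keep its
 * validation state up to date. */
static bool should_reduce_app_limited(bool slow_start_after_idle,
				      uint32_t now, uint32_t cwnd_stamp,
				      uint32_t rto)
{
	if (!slow_start_after_idle)
		return false;
	/* Signed difference tolerates timestamp wraparound. */
	return (int32_t)(now - cwnd_stamp) >= (int32_t)rto;
}
```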

03 Apr, 2007 1 commit

[TCP]: Do receiver-side SWS avoidance for rcvbuf < MSS. · 84565070

Authored by John Heffner on Apr 02, 2007

Signed-off-by: John Heffner <jheffner@psc.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>

14 Feb, 2007 1 commit

[TCP]: Prevent pseudo garbage in SYN's advertized window · 600ff0c2

Authored by Ilpo Järvinen on Feb 13, 2007

TCP may advertize up to 16-bits window in SYN packets (no window
scaling allowed). At the same time, TCP may have rcv_wnd
(32-bits) that does not fit to 16-bits without window scaling
resulting in pseudo garbage into advertized window from the
low-order bits of rcv_wnd. This can happen at least when
mss <= (1<<wscale) (see tcp_select_initial_window). This patch
fixes the handling of SYN advertized windows (compile tested
only).

In the worst case (which is unlikely to occur, though), the receiver
advertized window could be just a couple of bytes. I'm not sure
that such a situation would be handled very well at all by the
receiver!? Fortunately, the situation normalizes after the
first non-SYN ACK is received because it has the correct,
scaled window.

Alternatively, tcp_select_initial_window could be changed to
prevent too large rcv_wnd in the first place.

[ tcp_make_synack() has the same bug, and I've added a fix for
  that to this patch -DaveM ]
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
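
The truncation this commit describes can be illustrated with a small user-space sketch. Both helper names are hypothetical and are not taken from the kernel source; the sketch only assumes that a 32-bit `rcv_wnd` has to be written into the 16-bit window field of a SYN, where no scaling applies.

```c
#include <stdint.h>

/* Hypothetical helper: a plain cast keeps only the low-order 16 bits,
 * which is the "pseudo garbage" the commit message refers to. */
static uint16_t syn_window_truncated(uint32_t rcv_wnd)
{
	return (uint16_t)rcv_wnd;
}

/* Hypothetical helper: clamp to the largest unscaled window instead. */
static uint16_t syn_window_clamped(uint32_t rcv_wnd)
{
	return rcv_wnd > 65535U ? (uint16_t)65535U : (uint16_t)rcv_wnd;
}
```

With `rcv_wnd` equal to 65538, the cast yields a window of 2 bytes while the clamp yields 65535, which matches the "just a couple of bytes" worst case mentioned above.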

11 Feb, 2007 1 commit

[NET] IPV4: Fix whitespace errors. · e905a9ed

Authored by YOSHIFUJI Hideaki on Feb 09, 2007

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

09 Feb, 2007 1 commit

[TCP]: Don't apply FIN exception to full TSO segments. · 104439a8

Authored by John Heffner on Feb 05, 2007

Signed-off-by: John Heffner <jheffner@psc.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>

26 Jan, 2007 1 commit

[TCP]: Restore SKB socket owner setting in tcp_transmit_skb(). · e89862f4

Authored by David S. Miller on Jan 26, 2007

Revert 93173112

We can't elide the skb_set_owner_w() here because things like certain
netfilter targets (such as owner MATCH) need a socket to be set on the
SKB for correct operation.

Thanks to Jan Engelhardt and other netfilter list members for
pointing this out.
Signed-off-by: David S. Miller <davem@davemloft.net>

24 Jan, 2007 1 commit

[TCP]: rare bad TCP checksum with 2.6.19 · 52d570aa

Authored by Jarek Poplawski on Jan 23, 2007

The patch "Replace CHECKSUM_HW by CHECKSUM_PARTIAL/CHECKSUM_COMPLETE"
changed to unconditional copying of ip_summed field from collapsed
skb. This patch reverts this change.

The majority of substantial work including heavy testing
and diagnosing by: Michael Tokarev <mjt@tls.msk.ru>
Possible reasons pointed by: Herbert Xu and Patrick McHardy.
Signed-off-by: Jarek Poplawski <jarkao2@o2.pl>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

03 Dec, 2006 3 commits

[TCP]: MD5 Signature Option (RFC2385) support. · cfb6eeb4

Authored by YOSHIFUJI Hideaki on Nov 14, 2006

Based on implementation by Rick Payne.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP/DCCP]: Introduce net_xmit_eval · b9df3cb8

Authored by Gerrit Renker on Nov 14, 2006

Throughout the TCP/DCCP (and tunnelling) code, it often happens that the
return code of a transmit function needs to be tested against NET_XMIT_CN
which is a value that does not indicate a strict error condition.

This patch uses a macro for these recurring situations which is consistent
with the already existing macro net_xmit_errno, saving on duplicated code.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
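
A user-space sketch of the pattern this commit describes: a return code equal to NET_XMIT_CN is folded into success, and any other value passes through unchanged. The numeric value given to `NET_XMIT_CN` here and the wrapper `send_and_eval` are assumptions for illustration; the kernel supplies its own definitions.

```c
/* Assumed value, for illustration only. */
#define NET_XMIT_CN 2

/* Treat "congestion notification" as a non-error, keep other codes. */
#define net_xmit_eval(e)	((e) == NET_XMIT_CN ? 0 : (e))

/* Hypothetical caller: evaluates the result of some transmit function. */
static int send_and_eval(int xmit_result)
{
	return net_xmit_eval(xmit_result);
}
```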

[TCP]: Don't set SKB owner in tcp_transmit_skb(). · 93173112

Authored by David S. Miller on Nov 09, 2006

The data itself is already charged to the SKB, doing
the skb_set_owner_w() just generates a lot of noise and
extra atomics we don't really need.

Lmbench improvements on lat_tcp are minimal:

before:
TCP latency using localhost: 23.2701 microseconds
TCP latency using localhost: 23.1994 microseconds
TCP latency using localhost: 23.2257 microseconds

after:
TCP latency using localhost: 22.8380 microseconds
TCP latency using localhost: 22.9465 microseconds
TCP latency using localhost: 22.8462 microseconds
Signed-off-by: David S. Miller <davem@davemloft.net>

19 Oct, 2006 1 commit

[TCP]: Bound TSO defer time · ae8064ac

Authored by John Heffner on Oct 18, 2006

This patch limits the amount of time you will defer sending a TSO segment
to less than two clock ticks, or the time between two acks, whichever is
longer.

On slow links, deferring causes significant bursts.  See attached plots,
which show RTT through a 1 Mbps link with a 100 ms RTT and ~100 ms queue
for (a) non-TSO, (b) current TSO, and (c) patched TSO.  This burstiness
causes significant jitter, tends to overflow queues early (bad for short
queues), and makes delay-based congestion control more difficult.

Deferring by a couple clock ticks I believe will have a relatively small
impact on performance.
Signed-off-by: John Heffner <jheffner@psc.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
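
The tick-based half of this bound can be sketched as follows. The struct and function names are hypothetical, and the sketch deliberately leaves out the "time between two acks" alternative; it only assumes a free-running tick counter that may wrap around.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-connection deferral state. */
struct tso_defer_state {
	bool deferring;		/* currently holding back a TSO segment */
	uint32_t start_tick;	/* tick at which deferral began */
};

/* Return true if sending may still be deferred, false once the
 * deferral has lasted two ticks or more and the segment should go out. */
static bool tso_may_defer(struct tso_defer_state *st, uint32_t now_tick)
{
	if (!st->deferring) {
		st->deferring = true;
		st->start_tick = now_tick;
		return true;
	}
	/* Unsigned subtraction stays correct across counter wraparound. */
	if ((uint32_t)(now_tick - st->start_tick) >= 2) {
		st->deferring = false;
		return false;
	}
	return true;
}
```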

12 Oct, 2006 1 commit

[NET]: Use hton{l,s}() for non-initializers. · 496c98df

Authored by YOSHIFUJI Hideaki on Oct 10, 2006

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

29 Sep, 2006 1 commit

[TCP] net/ipv4/tcp_output.c: trivial annotations · df7a3b07

Authored by Al Viro on Sep 27, 2006

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

23 Sep, 2006 2 commits

[NET/IPV4/IPV6]: Change some sysctl variables to __read_mostly · ab32ea5d

Authored by Brian Haley on Sep 22, 2006

Change net/core, ipv4 and ipv6 sysctl variables to __read_mostly.

Couldn't actually measure any performance increase while testing (.3%
I consider noise), but seems like the right thing to do.
Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NET]: Replace CHECKSUM_HW by CHECKSUM_PARTIAL/CHECKSUM_COMPLETE · 84fa7933

Authored by Patrick McHardy on Aug 29, 2006

Replace CHECKSUM_HW by CHECKSUM_PARTIAL (for outgoing packets, whose
checksum still needs to be completed) and CHECKSUM_COMPLETE (for
incoming packets, device supplied full checksum).

Patch originally from Herbert Xu, updated by myself for 2.6.18-rc3.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

23 Aug, 2006 1 commit

[TCP]: Limit window scaling if window is clamped. · 316c1592

Authored by Stephen Hemminger on Aug 22, 2006

This small change allows for easy per-route workarounds for broken hosts or
middleboxes that are not compliant with TCP standards for window scaling.
Rather than having to turn off window scaling globally, this patch allows
reducing or disabling window scaling if a window clamp is present.

Example: Mark Lord reported a problem with 2.6.17 kernel being unable to
access http://www.everymac.com

# ip route add 216.145.246.23/32 via 10.8.0.1 window 65535
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
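
How a window clamp can reduce or disable window scaling is easiest to see in a sketch. The function `pick_rcv_wscale` is hypothetical, and treating a clamp of 0 as "no clamp" is an assumption of this example; the limit of 14 is the maximum shift the window scale option permits.

```c
#include <stdint.h>

/* Hypothetical helper: choose the smallest shift such that the
 * receive space, limited by the per-route clamp, fits in 16 bits. */
static unsigned int pick_rcv_wscale(uint32_t space, uint32_t window_clamp)
{
	unsigned int wscale = 0;

	/* Assumption: a clamp of 0 means no clamp is configured. */
	if (window_clamp != 0 && window_clamp < space)
		space = window_clamp;

	while (space > 65535U && wscale < 14) {
		space >>= 1;
		wscale++;
	}
	return wscale;
}
```

With a 4 MiB buffer and no clamp the sketch picks a shift of 7; with the `window 65535` route from the example above it picks 0, which disables scaling for that destination only.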

08 Aug, 2006 1 commit

[TCP]: SNMPv2 tcpOutSegs counter error · bd37a088

Authored by Wei Yongjun on Aug 07, 2006

Do not count retransmitted segments.
Signed-off-by: Wei Yongjun <yjwei@nanjing-fnst.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

01 Jul, 2006 1 commit

[NET]: Generalise TSO-specific bits from skb_setup_caps · bcd76111

Authored by Herbert Xu on Jun 30, 2006

This patch generalises the TSO-specific bits from sk_setup_caps by adding
the sk_gso_type member to struct sock.  This makes sk_setup_caps generic
so that it can be used by TCPv6 or UFO.

The only catch is that whoever uses this must provide a GSO implementation
for their protocol which I think is a fair deal :) For now UFO continues to
live without a GSO implementation which is OK since it doesn't use the sock
caps field at the moment.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

30 Jun, 2006 1 commit

[NET]: Add ECN support for TSO · b0da8537

Authored by Michael Chan on Jun 29, 2006

In the current TSO implementation, NETIF_F_TSO and ECN cannot be
turned on together in a TCP connection.  The problem is that most
hardware that supports TSO does not handle CWR correctly if it is set
in the TSO packet.  Correct handling requires CWR to be set in the
first packet only if it is set in the TSO header.

This patch adds the ability to turn on NETIF_F_TSO and ECN using
GSO if necessary to handle TSO packets with CWR set.  Hardware
that handles CWR correctly can turn on NETIF_F_TSO_ECN in the dev->
features flag.

All TSO packets with CWR set will have the SKB_GSO_TCPV4_ECN set.  If
the output device does not have the NETIF_F_TSO_ECN feature set, GSO
will split the packet up correctly with CWR only set in the first
segment.

With help from Herbert Xu <herbert@gondor.apana.org.au>.

Since ECN can always be enabled with TSO, the SOCK_NO_LARGESEND sock
flag is completely removed.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
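
The segmentation rule stated in this commit (CWR set in the first packet only, and only if it was set in the TSO header) fits in a few lines. The function name and the flag-array representation are hypothetical; real segmentation operates on packet headers, not on a boolean array.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: for a TSO packet split into nsegs segments,
 * record which segments should carry the CWR flag. */
static void split_cwr(bool tso_has_cwr, size_t nsegs, bool *cwr_out)
{
	for (size_t i = 0; i < nsegs; i++)
		cwr_out[i] = tso_has_cwr && i == 0;
}
```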

23 Jun, 2006 1 commit

[NET]: Merge TSO/UFO fields in sk_buff · 7967168c

Authored by Herbert Xu on Jun 22, 2006

Having separate fields in sk_buff for TSO/UFO (tso_size/ufo_size) is not
going to scale if we add any more segmentation methods (e.g., DCCP).  So
let's merge them.

They were used to tell the protocol of a packet.  This function has been
subsumed by the new gso_type field.  This is essentially a set of netdev
feature bits (shifted by 16 bits) that are required to process a specific
skb.  As such it's easy to tell whether a given device can process a GSO
skb: you just have to AND the gso_type field with the netdev's features
field.

I've made gso_type a conjunction.  The idea is that you have a base type
(e.g., SKB_GSO_TCPV4) that can be modified further to support new features.
For example, if we add a hardware TSO type that supports ECN, they would
declare NETIF_F_TSO | NETIF_F_TSO_ECN.  All TSO packets with CWR set would
have a gso_type of SKB_GSO_TCPV4 | SKB_GSO_TCPV4_ECN while all other TSO
packets would be SKB_GSO_TCPV4.  This means that only the CWR packets need
to be emulated in software.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
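
The device check described in this commit can be sketched as a shift followed by a mask comparison. The bit positions assigned to the constants below are assumptions chosen for the example, and `dev_can_gso` is a hypothetical name; only the shift by 16 bits and the use of AND come from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit assignments, not the kernel's actual values. */
#define GSO_SHIFT		16
#define SKB_GSO_TCPV4		(1u << 0)
#define SKB_GSO_TCPV4_ECN	(1u << 1)
#define NETIF_F_TSO		(SKB_GSO_TCPV4 << GSO_SHIFT)
#define NETIF_F_TSO_ECN		(SKB_GSO_TCPV4_ECN << GSO_SHIFT)

/* A device can handle a GSO skb only if every feature bit implied by
 * the skb's gso_type is present in the device's feature mask. */
static bool dev_can_gso(uint32_t dev_features, uint32_t gso_type)
{
	uint32_t needed = gso_type << GSO_SHIFT;

	return (dev_features & needed) == needed;
}
```

A device that declares only `NETIF_F_TSO` passes the check for plain `SKB_GSO_TCPV4` packets and fails it for packets that also carry `SKB_GSO_TCPV4_ECN`, so only the CWR packets fall back to software.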

18 Jun, 2006 1 commit

[TCP]: Add tcp_slow_start_after_idle sysctl. · 35089bb2

Authored by David S. Miller on Jun 13, 2006

A lot of people have asked for a way to disable tcp_cwnd_restart(),
and it seems reasonable to add a sysctl to do that.
Signed-off-by: David S. Miller <davem@davemloft.net>

06 Jun, 2006 1 commit

[TCP]: Avoid skb_pull if possible when trimming head · f2911969

Authored by Herbert Xu on Jun 05, 2006

Trimming the head of an skb by calling skb_pull can cause the packet
to become unaligned if the length pulled is odd.  Since the length is
entirely arbitrary for a FIN packet carrying data, this is actually
quite common.

Unaligned data is not the end of the world, but we should avoid it if
it's easily done.  In this case it is trivial.  Since we're discarding
all of the head data it doesn't matter whether we move skb->data forward
or back.

However, it is still possible to have unaligned skb->data in general.
So network drivers should be prepared to handle it instead of crashing.

This patch also adds an unlikely marking on len < headlen since partial
ACKs on head data are extremely rare in the wild.  As the return value
of __pskb_trim_head is no longer ever NULL that has been removed.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

30 Apr, 2006 1 commit

[TCP]: Fix unlikely usage in tcp_transmit_skb() · 83de47cd

Authored by Hua Zhong on Apr 28, 2006

The following unlikely should be replaced by likely because the
condition happens every time unless there is a hard error to transmit
a packet.
Signed-off-by: Hua Zhong <hzhong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

20 Apr, 2006 1 commit

[TCP]: Account skb overhead in tcp_fragment · b60b49ea

Authored by Herbert Xu on Apr 19, 2006

Make sure that we get the full sizeof(struct sk_buff)
plus the data size accounted for in skb->truesize.

This will create invariants that will allow adding
assertion checks on skb->truesize.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

19 Apr, 2006 1 commit

[TCP]: Fix truesize underflow · ef5cb973

Authored by Herbert Xu on Apr 18, 2006

There is a problem with the TSO packet trimming code.  The cause of
this lies in the tcp_fragment() function.

When we allocate a fragment for a completely non-linear packet the
truesize is calculated for a payload length of zero.  This means that
truesize could in fact be less than the real payload length.

When that happens the TSO packet trimming can cause truesize to become
negative.  This in turn can cause sk_forward_alloc to be -n * PAGE_SIZE
which would trigger the warning.

I've copied the code DaveM used in tso_fragment which should work here.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

15 Apr, 2006 1 commit

[IPV4]: Possible cleanups. · 6c97e72a

Authored by Adrian Bunk on Apr 12, 2006

This patch contains the following possible cleanups:
- make the following needlessly global function static:
  - arp.c: arp_rcv()
- remove the following unused EXPORT_SYMBOL's:
  - devinet.c: devinet_ioctl
  - fib_frontend.c: ip_rt_ioctl
  - inet_hashtables.c: inet_bind_bucket_create
  - inet_hashtables.c: inet_bind_hash
  - tcp_input.c: sysctl_tcp_abc
  - tcp_ipv4.c: sysctl_tcp_tw_reuse
  - tcp_output.c: sysctl_tcp_mtu_probing
  - tcp_output.c: sysctl_tcp_base_mss
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

21 Mar, 2006 3 commits

[TCP]: sysctl to allow TCP window > 32767 sans wscale · 15d99e02

Authored by Rick Jones on Mar 20, 2006

Back in the dark ages, we had to be conservative and only allow 15-bit
window fields if the window scale option was not negotiated.  Some
ancient stacks used a signed 16-bit quantity for the window field of
the TCP header and would get confused.

Those days are long gone, so we can use the full 16-bits by default
now.

There is a sysctl added so that we can still interact with such old
stacks.
Signed-off-by: Rick Jones <rick.jones2@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
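
The effect of such a switch on the unscaled window can be sketched as a choice between two upper limits. The function and its boolean parameter are hypothetical stand-ins for the sysctl this commit adds; the two limits are the largest values of a signed and an unsigned 16-bit window field.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: cap the advertised window when no window scale
 * option was negotiated. The conservative 15-bit limit is used only
 * when the workaround for old signed-window stacks is enabled. */
static uint32_t unscaled_window_limit(uint32_t space,
				      bool workaround_signed_windows)
{
	uint32_t limit = workaround_signed_windows ? 32767U : 65535U;

	return space < limit ? space : limit;
}
```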

[TCP] mtu probing: move tcp-specific data out of inet_connection_sock · 0e7b1368

Authored by John Heffner on Mar 20, 2006

This moves some TCP-specific MTU probing state out of
inet_connection_sock back to tcp_sock.
Signed-off-by: John Heffner <jheffner@psc.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP]: MTU probing · 5d424d5a

Authored by John Heffner on Mar 20, 2006

Implementation of packetization layer path mtu discovery for TCP, based on
the internet-draft currently found at
<http://www.ietf.org/internet-drafts/draft-ietf-pmtud-method-05.txt>.
Signed-off-by: John Heffner <jheffner@psc.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>

12 Mar, 2006 1 commit

[TCP]: Fix tcp_tso_should_defer() when limit>=65536 · ba244fe9

Authored by David S. Miller on Mar 11, 2006

That's >= a full sized TSO frame, so we should always
return 0 in that case.

Based upon a report and initial patch from Lachlan
Andrew, final patch suggested by Herbert Xu.
Signed-off-by: David S. Miller <davem@davemloft.net>

04 Jan, 2006 3 commits

[TCP]: less inline's · 40efc6fa

Authored by Stephen Hemminger on Jan 03, 2006

TCP inline usage cleanup:
 * get rid of inline in several places
 * replace __inline__ with inline where possible
 * move functions used in one file out of tcp.h
 * let compiler decide on used once cases

On x86_64: 
   text	   data	    bss	    dec	    hex	filename
3594701	 648348	 567400	4810449	 4966d1	vmlinux.orig
3593133	 648580	 567400	4809113	 496199	vmlinux

On sparc64:
   text	   data	    bss	    dec	    hex	filename
2538278	 406152	 530392	3474822	 350586	vmlinux.ORIG
2536382	 406384	 530392	3473158	 34ff06	vmlinux
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[IP_SOCKGLUE]: Remove most of the tcp specific calls · d83d8461

Authored by Arnaldo Carvalho de Melo on Dec 13, 2005

As DCCP needs to be called in the same spots.

Now we have a member in inet_sock (is_icsk), set at sock creation time from
struct inet_protosw->flags (if INET_PROTOSW_ICSK is set, like for TCP and
DCCP) to see if a struct sock instance is a inet_connection_sock for places
like the ones in ip_sockglue.c (v4 and v6) where we previously were looking if
sk_type was SOCK_STREAM, that is insufficient because we now use the same code
for DCCP, that has sk_type SOCK_DCCP.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[ICSK]: Rename struct tcp_func to struct inet_connection_sock_af_ops · 8292a17a

Authored by Arnaldo Carvalho de Melo on Dec 13, 2005

And move it to struct inet_connection_sock. DCCP will use it in the
upcoming changesets.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

07 Dec, 2005 1 commit

[TCP] Vegas: timestamp before clone · dfb4b9dc

Authored by David S. Miller on Dec 06, 2005

We have to store the congestion control timestamp on the SKB before we
clone it, not after.  Else we get no timestamping information at all.

tcp_transmit_skb() has been reworked so that we can do the timestamp
still in one spot, instead of at all the call sites.

Problem discovered, and initial fix, from Tom Young
<tyo@ee.unimelb.edu.au>.
Signed-off-by: David S. Miller <davem@davemloft.net>

11 Nov, 2005 3 commits

[TCP]: speed up SACK processing · 6a438bbe

Authored by Stephen Hemminger on Nov 10, 2005

Use "hints" to speed up the SACK processing. Various forms 
of this have been used by TCP developers (Web100, STCP, BIC)
to avoid the 2x linear search of outstanding segments.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP]: spelling fixes · caa20d9a

Authored by Stephen Hemminger on Nov 10, 2005

Minor spelling fixes for TCP code.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP]: fix congestion window update when using TSO deferal · f4805ede

Authored by Stephen Hemminger on Nov 10, 2005

TCP performance with TSO over networks with delay is awful.
On a 100Mbit link with 150ms delay, we get 4Mbits/sec with TSO and
50Mbits/sec without TSO.

The problem is that with TSO, we intentionally do not keep the maximum
number of packets in flight to fill the window; we hold out until
we can send an MSS chunk. But we also don't update the congestion window
unless we have filled it, as per RFC2861.

This patch replaces the check for the congestion window being full
with something smarter that accounts for TSO.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

21 Oct, 2005 1 commit

[TCP] Allow len == skb->len in tcp_fragment · b2cc99f0

Authored by Herbert Xu on Oct 20, 2005

It is legitimate to call tcp_fragment with len == skb->len since
that is done for FIN packets and the FIN flag counts as one byte.
So we should only check for the len > skb->len case.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
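
The relaxed check amounts to rejecting only split lengths that exceed the buffer length. The helper below is a hypothetical stand-alone version of that test, not the code in tcp_fragment itself.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: a split point is acceptable when it does not
 * exceed the buffer length. Equality is allowed because a FIN occupies
 * one unit of sequence space beyond the data. */
static bool split_len_is_valid(size_t len, size_t skb_len)
{
	return len <= skb_len;
}
```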