Commits · 985990137e81ca9fd6561cd0f7d1a9695ec57d5a · openeuler / raspberrypi-kernel

21 Oct, 2005 · 1 commit

[TCP] Allow len == skb->len in tcp_fragment · b2cc99f0

Authored by Herbert Xu on Oct 20, 2005

It is legitimate to call tcp_fragment with len == skb->len since
that is done for FIN packets and the FIN flag counts as one byte.
So we should only check for the len > skb->len case.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
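The relaxed check can be sketched as a small user-space model. This is not the kernel code: `struct skb_model` and `fragment_len_ok()` are hypothetical names introduced for illustration, and only the comparison itself follows the commit message.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the single sk_buff field the check reads. */
struct skb_model {
	unsigned int len;	/* bytes held by the buffer */
};

/*
 * Returns true when splitting at 'len' is acceptable.
 * len == skb->len is allowed (the FIN case from the message);
 * only len > skb->len counts as a caller error.
 */
static bool fragment_len_ok(const struct skb_model *skb, unsigned int len)
{
	return len <= skb->len;
}
```

A caller would treat a `false` result as the condition worth warning about, and let the equal-length case through.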

14 Oct, 2005 · 1 commit

[TCP]: Ratelimit debugging warning. · 046d20b7

Authored by Herbert Xu on Oct 13, 2005

Better safe than sorry.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

13 Oct, 2005 · 1 commit

[TCP]: Add code to help track down "BUG at net/ipv4/tcp_output.c:438!" · 9ff5c59c

Authored by Herbert Xu on Oct 12, 2005

This is the second report of this bug.  Unfortunately the first
reporter hasn't been able to reproduce it since to provide more
debugging info.

So let's apply this patch for 2.6.14 to

1) Make this non-fatal.
2) Provide the info we need to track it down.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

09 Oct, 2005 · 1 commit

[PATCH] gfp flags annotations - part 1 · dd0fc66f

Authored by Al Viro on Oct 07, 2005

 - added typedef unsigned int __nocast gfp_t;

 - replaced __nocast uses for gfp flags with gfp_t - it gives exactly
   the same warnings as far as sparse is concerned, doesn't change
   generated code (from gcc's point of view we replaced unsigned int with
   a typedef) and documents what's going on far better.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

30 Sep, 2005 · 1 commit

[TCP]: Revert · 01ff367e

Authored by David S. Miller on Sep 29, 2005

But retain the comment fix.

Alexey Kuznetsov has explained the situation as follows:

--------------------

I think the fix is incorrect. Look, the RFC function init_cwnd(mss) is
not continuous: f.e. for mss=1095 it needs initial window 1095*4, but
for mss=1096 it is 1096*3. We do not know exactly what mss sender used
for calculations. If we advertised 1096 (and calculate initial window
3*1096), the sender could limit it to some value < 1096 and then it
will need window his_mss*4 > 3*1096 to send initial burst.

See?

So, the honest function for initial rcv_wnd derived from
tcp_init_cwnd() is:

	init_rcv_wnd(mss)=
	  min { init_cwnd(mss1)*mss1 for mss1 <= mss }

It is something sort of:

	if (mss < 1096)
		return mss*4;
	if (mss < 1096*2)
		return 1096*4;
	return mss*2;

(I just scribbled a graph on a piece of paper, it is difficult to see or
to explain without this)

I selected it differently giving more window than it is strictly
required.  Initial receive window must be large enough to allow sender
following to the rfc (or just setting initial cwnd to 2) to send
initial burst.  But besides that it is arbitrary, so I decided to give
slack space of one segment.

Actually, the logic was:

If mss is low/normal (<=ethernet), set window to receive more than
initial burst allowed by rfc under the worst conditions
i.e. mss*4. This gives slack space of 1 segment for ethernet frames.

For msses slightly more than ethernet frame, take 3. Try to give slack
space of 1 frame again.

If mss is huge, force 2*mss. No slack space.

Value 1460*3 is really confusing. Minimal one is 1096*2, but besides
that it is an arbitrary value. It was meant to be ~4096. 1460*3 is
just the magic number from RFC, 1460*3 = 1095*4 is the magic :-), so
that I guess hands typed this themselves.

--------------------
Signed-off-by: David S. Miller <davem@davemloft.net>
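The discontinuity described in the quoted explanation can be checked with a short user-space sketch. It assumes the RFC 2414 upper bound on the initial window, min(4*MSS, max(2*MSS, 4380 bytes)). The function names are hypothetical, and `init_rcv_wnd_sketch()` copies the piecewise function from the message as written, not the code that was finally kept in the kernel.

```c
/* RFC 2414 upper bound on the initial window, in bytes. */
static unsigned int rfc2414_iw_bytes(unsigned int mss)
{
	unsigned int hi = 4 * mss;
	unsigned int lo = 2 * mss > 4380 ? 2 * mss : 4380;

	return hi < lo ? hi : lo;
}

/* The same bound expressed in whole segments of size mss. */
static unsigned int rfc2414_iw_segs(unsigned int mss)
{
	return rfc2414_iw_bytes(mss) / mss;
}

/* Piecewise receive-window function quoted in the message. */
static unsigned int init_rcv_wnd_sketch(unsigned int mss)
{
	if (mss < 1096)
		return mss * 4;
	if (mss < 1096 * 2)
		return 1096 * 4;
	return mss * 2;
}
```

With these definitions, an MSS of 1095 allows four segments while an MSS of 1096 allows only three, which is the jump the message points out.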

29 Sep, 2005 · 1 commit

[TCP]: Fix init_cwnd calculations in tcp_select_initial_window() · 6b251858

Authored by David S. Miller on Sep 28, 2005

Match it up to what RFC2414 really specifies.
Noticed by Rick Jones.
Signed-off-by: David S. Miller <davem@davemloft.net>

23 Sep, 2005 · 1 commit

[TCP]: Adjust Reno SACK estimate in tcp_fragment · 83ca28be

Authored by Herbert Xu on Sep 22, 2005

Since the introduction of TSO pcount a year ago, it has been possible
for tcp_fragment() to cause packets_out to decrease.  Prior to that,
tcp_retrans_try_collapse() was the only way for that to happen on the
retransmission path.

When this happens with Reno, it is possible for sacked_out to become
invalid because it is only an estimate and not tied to any particular
packet on the retransmission queue.

Therefore we need to adjust sacked_out as well as left_out in the Reno
case.  The following patch does exactly that.

This bug is pretty difficult to trigger in practice though since you
need a SACKless peer with a retransmission that occurs just as the
cached MTU value expires.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
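A minimal user-space sketch of the adjustment described above. `struct tcp_counters` and `account_pcount_shrink()` are hypothetical names, and the relation left_out = sacked_out + lost_out is an assumption about the accounting of that era rather than something stated in the message.

```c
/* Hypothetical subset of the per-connection counters named in the message. */
struct tcp_counters {
	unsigned int packets_out;
	unsigned int sacked_out;
	unsigned int lost_out;
	unsigned int left_out;
	int sack_ok;		/* non-zero when the peer negotiated SACK */
};

/* Account for a fragment whose packet count dropped by 'diff'. */
static void account_pcount_shrink(struct tcp_counters *tp, unsigned int diff)
{
	tp->packets_out -= diff;

	if (!tp->sack_ok) {
		/* Reno: sacked_out is only an estimate, so clamp it at zero. */
		tp->sacked_out = tp->sacked_out > diff ? tp->sacked_out - diff : 0;
		/* Assumed relation: left_out tracks sacked_out + lost_out. */
		tp->left_out = tp->sacked_out + tp->lost_out;
	}
}
```

With SACK negotiated, the sketch leaves sacked_out alone, since in that case the count is tied to marked packets rather than estimated.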

20 Sep, 2005 · 1 commit

[TCP]: Handle SACK'd packets properly in tcp_fragment(). · e14c3caf

Authored by Herbert Xu on Sep 19, 2005

The problem is that we're now calling tcp_fragment() in a context
where the packets might be marked as SACKED_ACKED or SACKED_RETRANS.
This was not possible before as you never retransmitted packets that
are so marked.

Because of this, we need to adjust sacked_out and retrans_out in
tcp_fragment().  This is exactly what the following patch does.

We also need to preserve the SACKED_ACKED/SACKED_RETRANS marking
if they exist.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
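The two requirements in this message (adjust the counters, keep the marking) can be modelled in user space as follows. The flag values, struct names and `split_segment()` are all hypothetical; the sketch only illustrates the bookkeeping and does not reproduce the kernel function.

```c
/* Hypothetical flag values and types, defined only for this sketch. */
enum {
	MODEL_SACKED_ACKED   = 0x01,
	MODEL_SACKED_RETRANS = 0x02,
};

struct seg_model {
	unsigned int pcount;	/* packets this buffer counts as */
	unsigned char sacked;	/* MODEL_SACKED_* marking */
};

struct sack_counters {
	unsigned int sacked_out;
	unsigned int retrans_out;
};

/* Split 'head' in two; 'tail' receives the second part. */
static void split_segment(struct sack_counters *tp, struct seg_model *head,
			  struct seg_model *tail,
			  unsigned int head_pcount, unsigned int tail_pcount)
{
	unsigned int before = head->pcount;
	unsigned int after = head_pcount + tail_pcount;

	head->pcount = head_pcount;
	tail->pcount = tail_pcount;
	tail->sacked = head->sacked;	/* preserve the marking on both halves */

	if (before > after) {
		unsigned int diff = before - after;

		/* Assumes each counter was at least 'diff' before the split. */
		if (head->sacked & MODEL_SACKED_ACKED)
			tp->sacked_out -= diff;
		if (head->sacked & MODEL_SACKED_RETRANS)
			tp->retrans_out -= diff;
	}
}
```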

15 Sep, 2005 · 1 commit

[TCP]: Compute in_sacked properly when we split up a TSO frame. · 3c05d92e

Authored by Herbert Xu on Sep 14, 2005

The problem is that the SACK fragmenting code may incorrectly call
tcp_fragment() with a length larger than the skb->len.  This happens
when the skb on the transmit queue completely falls to the LHS of the
SACK.

And add a BUG() check to tcp_fragment() so we can spot this kind of
error more quickly in the future.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

11 Sep, 2005 · 1 commit

[TCP]: Fix double adjustment of tp->{lost,left}_out in tcp_fragment(). · e130af5d

Authored by Herbert Xu on Sep 10, 2005

There is an extra left_out/lost_out adjustment in tcp_fragment which
means that the lost_out accounting is always wrong.  This patch removes
that chunk of code.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

09 Sep, 2005 · 1 commit

[TCP]: Fix off by one in tcp_fragment() "already sent" test. · cf0b450c

Authored by Herbert Xu on Sep 08, 2005

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

02 Sep, 2005 · 1 commit

[TCP]: Keep TSO enabled even during loss events. · 6475be16

Authored by David S. Miller on Sep 01, 2005

All we need to do is resegment the queue so that
we record SACK information accurately.  The edges
of the SACK blocks guide our resegmenting decisions.

With help from Herbert Xu.
Signed-off-by: David S. Miller <davem@davemloft.net>

30 Aug, 2005 · 6 commits

[NET]: Implement SKB fast cloning. · d179cd12

Authored by David S. Miller on Aug 17, 2005

Protocols that make extensive use of SKB cloning,
for example TCP, eat at least 2 allocations per
packet sent as a result.

To cut the kmalloc() count in half, we implement
a pre-allocation scheme wherein we allocate
2 sk_buff objects in advance, then use a simple
reference count to free up the memory at the
correct time.

Based upon an initial patch by Thomas Graf and
suggestions from Herbert Xu.
Signed-off-by: David S. Miller <davem@davemloft.net>
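The pre-allocation idea can be illustrated with a small user-space sketch: one allocation holds both the original and its future clone, and a reference count decides when the memory goes away. All names here are hypothetical, `calloc()`/`free()` stand in for the kernel allocator, and the layout is not the one the kernel uses.

```c
#include <stdlib.h>

struct buf_model {
	int in_use;
};

/* One allocation that carries the original and room for one clone. */
struct buf_pair {
	struct buf_model orig;
	struct buf_model clone;
	int refcnt;		/* number of halves currently in use */
};

static struct buf_pair *pair_alloc(void)
{
	struct buf_pair *p = calloc(1, sizeof(*p));

	if (!p)
		return NULL;
	p->orig.in_use = 1;
	p->refcnt = 1;
	return p;
}

/* Hand out the pre-allocated clone slot; no second allocation needed. */
static struct buf_model *pair_clone(struct buf_pair *p)
{
	if (p->clone.in_use)
		return NULL;	/* slot taken; a real caller would allocate */
	p->clone.in_use = 1;
	p->refcnt++;
	return &p->clone;
}

/* Release one half; returns 1 when the whole pair was freed. */
static int pair_put(struct buf_pair *p, struct buf_model *b)
{
	b->in_use = 0;
	if (--p->refcnt > 0)
		return 0;
	free(p);
	return 1;
}
```

In this model a send path that clones every buffer performs one allocation per packet instead of two, which is the halving the message refers to.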

[NET]: Store skb->timestamp as offset to a base timestamp · a61bbcf2

Authored by Patrick McHardy on Aug 14, 2005

Reduces skb size by 8 bytes on 64-bit.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

[ICSK]: Move TCP congestion avoidance members to icsk · 6687e988

Authored by Arnaldo Carvalho de Melo on Aug 10, 2005

This changeset basically moves tcp_sk()->{ca_ops,ca_state,etc} to inet_csk(),
minimal renaming/moving done in this changeset to ease review.

Most of it is just changes of struct tcp_sock * to struct sock * parameters.

With this we move to a state closer to two interesting goals:

1. Generalisation of net/ipv4/tcp_diag.c, becoming inet_diag.c, being used
for any INET transport protocol that has struct inet_hashinfo and are
derived from struct inet_connection_sock. Keeps the userspace API, that will
just not display DCCP sockets, while newer versions of tools can support
DCCP.

2. INET generic transport pluggable Congestion Avoidance infrastructure, using
the current TCP CA infrastructure with DCCP.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NET]: Just move the inet_connection_sock function from tcp sources · 3f421baa

Authored by Arnaldo Carvalho de Melo on Aug 09, 2005

Completing the previous changeset, this also generalises tcp_v4_synq_add,
renaming it to inet_csk_reqsk_queue_hash_add, already being used in the
DCCP tree, which I plan to merge RSN.
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NET]: Introduce inet_connection_sock · 463c84b9

Authored by Arnaldo Carvalho de Melo on Aug 09, 2005

This creates struct inet_connection_sock, moving members out of struct
tcp_sock that are shareable with other INET connection oriented
protocols, such as DCCP, that in my private tree already uses most of
these members.

The functions that operate on these members were renamed, using an
inet_csk_ prefix while not being moved yet to a new file, so as to
ease the review of these changes.
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NET]: Kill skb->list · 8728b834

Authored by David S. Miller on Aug 09, 2005

Remove the "list" member of struct sk_buff, as it is entirely
redundant.  All SKB list removal callers know which list the
SKB is on, so storing this in sk_buff does nothing other than
taking up some space.

Two tricky bits were SCTP, which I took care of, and two ATM
drivers which Francois Romieu <romieu@fr.zoreil.com> fixed
up.
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>

24 Aug, 2005 · 1 commit

[TCP]: Do TSO deferral even if tail SKB can go out now. · 14869c38

Authored by Dmitry Yusupov on Aug 23, 2005

If the tail SKB fits into the window, it is still
beneficial to defer until the goal percentage of
the window is available.  This gives the application
time to feed more data into the send queue and thus
results in larger TSO frames going out.

Patch from Dmitry Yusupov <dima@neterion.com>.
Signed-off-by: David S. Miller <davem@davemloft.net>

18 Aug, 2005 · 1 commit

[TCP]: Fix bug #5070: kernel BUG at net/ipv4/tcp_output.c:864 · 35d59efd

Authored by Herbert Xu on Aug 17, 2005

1) We send out a normal sized packet with TSO on to start off.
2) ICMP is received indicating a smaller MTU.
3) We send the current sk_send_head which needs to be fragmented
since it was created before the ICMP event.  The first fragment
is then sent out.

At this point the remaining fragment is allocated by tcp_fragment.
However, its size is padded to fit the L1 cache-line size therefore
creating tail-room up to 124 bytes long.

This fragment will also be sitting at sk_send_head.

4) tcp_sendmsg is called again and it stores data in the tail-room of
the fragment.
5) tcp_push_one is called by tcp_sendmsg which then calls tso_fragment
since the packet as a whole exceeds the MTU.

At this point we have a packet that has data in the head area being
fed to tso_fragment which bombs out.

My take on this is that we shouldn't ever call tcp_fragment on a TSO
socket for a packet that is yet to be transmitted since this creates
a packet on sk_send_head that cannot be extended.

So here is a patch to change it so that tso_fragment is always used
in this case.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

17 Aug, 2005 · 1 commit

[TCP]: Fix bug #5070: kernel BUG at net/ipv4/tcp_output.c:864 · c8ac3774

Authored by Herbert Xu on Aug 16, 2005

1) We send out a normal sized packet with TSO on to start off.
2) ICMP is received indicating a smaller MTU.
3) We send the current sk_send_head which needs to be fragmented
since it was created before the ICMP event.  The first fragment
is then sent out.

At this point the remaining fragment is allocated by tcp_fragment.
However, its size is padded to fit the L1 cache-line size therefore
creating tail-room up to 124 bytes long.

This fragment will also be sitting at sk_send_head.

4) tcp_sendmsg is called again and it stores data in the tail-room of
the fragment.
5) tcp_push_one is called by tcp_sendmsg which then calls tso_fragment
since the packet as a whole exceeds the MTU.

At this point we have a packet that has data in the head area being
fed to tso_fragment which bombs out.

My take on this is that we shouldn't ever call tcp_fragment on a TSO
socket for a packet that is yet to be transmitted since this creates
a packet on sk_send_head that cannot be extended.

So here is a patch to change it so that tso_fragment is always used
in this case.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

11 Aug, 2005 · 1 commit

[TCP]: Adjust {p,f}ackets_out correctly in tcp_retransmit_skb() · b5da623a

Authored by Herbert Xu on Aug 10, 2005

Well I've only found one potential cause for the assertion
failure in tcp_mark_head_lost.  First of all, this can only
occur if cnt > 1 since tp->packets_out is never zero here.
If it did hit zero we'd have much bigger problems.

So cnt is equal to fackets_out - reordering.  Normally
fackets_out is less than packets_out.  The only reason
I've found that might cause fackets_out to exceed packets_out
is if tcp_fragment is called from tcp_retransmit_skb with a
TSO skb and the current MSS is greater than the MSS stored
in the TSO skb.  This might occur as the result of an expiring
dst entry.

In that case, packets_out may decrease (line 1380-1381 in
tcp_output.c).  However, fackets_out is unchanged which means
that it may in fact exceed packets_out.

Previously tcp_retrans_try_collapse was the only place where
packets_out can go down and it takes care of this by decrementing
fackets_out.

So we should make sure that fackets_out is reduced by an appropriate
amount here as well.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
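The invariant at stake (fackets_out must not exceed packets_out) can be sketched in user space. `struct pkt_counters` and `shrink_packets_out()` are hypothetical names; the sketch assumes `diff` is the number of packets by which the fragmented skb's count dropped and that packets_out is at least `diff`.

```c
/* Hypothetical subset of the counters discussed in the message. */
struct pkt_counters {
	unsigned int packets_out;
	unsigned int fackets_out;
};

/* Apply a drop of 'diff' packets to both counters. */
static void shrink_packets_out(struct pkt_counters *tp, unsigned int diff)
{
	tp->packets_out -= diff;
	/* Reduce fackets_out by the same amount, clamping at zero. */
	tp->fackets_out = tp->fackets_out > diff ? tp->fackets_out - diff : 0;
}
```

If fackets_out was no larger than packets_out before the call, it stays that way afterwards, so cnt = fackets_out - reordering cannot grow past what the queue holds.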

05 Aug, 2005 · 2 commits

[PATCH] tcp: fix TSO cwnd caching bug · b68e9f85

Authored by Herbert Xu on Aug 04, 2005

tcp_write_xmit caches the cwnd value indirectly in cwnd_quota.  When
tcp_transmit_skb reduces the cwnd because of tcp_enter_cwr, the cached
value becomes invalid.

This patch ensures that the cwnd value is always reread after each
tcp_transmit_skb call.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
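The effect of re-reading the window on every iteration can be shown with a toy send loop. Everything here is a hypothetical model: `transmit_one()` stands in for tcp_transmit_skb, and the halving of `snd_cwnd` is only a simple way to simulate a reduction happening during a send.

```c
struct conn_model {
	unsigned int snd_cwnd;	/* congestion window, in packets */
	unsigned int in_flight;	/* packets sent and not yet acked */
	unsigned int queued;	/* packets waiting to be sent */
};

/* How many more packets the current window allows. */
static unsigned int cwnd_quota(const struct conn_model *c)
{
	return c->in_flight < c->snd_cwnd ? c->snd_cwnd - c->in_flight : 0;
}

/* Send one packet; optionally shrink the window as a side effect. */
static void transmit_one(struct conn_model *c, int shrink)
{
	if (shrink)
		c->snd_cwnd /= 2;
	c->in_flight++;
	c->queued--;
}

/* Send loop that recomputes the quota after every transmit. */
static unsigned int write_xmit(struct conn_model *c, int shrink_on_first)
{
	unsigned int sent = 0;

	while (c->queued > 0 && cwnd_quota(c) > 0) {
		transmit_one(c, shrink_on_first && sent == 0);
		sent++;
	}
	return sent;
}
```

Had the loop computed the quota once up front, it would keep sending against the old window after the reduction; recomputing it each time stops at the reduced limit.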

[PATCH] tcp: fix TSO sizing bugs · 846998ae

Authored by David S. Miller on Aug 04, 2005

MSS changes can be lost since we preemptively initialize the tso_segs count
for an SKB before we 100% commit to sending it out.

So, by the time we send it out, the tso_size information can be stale due
to PMTU events.  This mucks up all of the logic in our send engine, and can
even result in the BUG() triggering in tcp_tso_should_defer().

Another problem we have is that we're storing the tp->mss_cache, not the
SACK block normalized MSS, as the tso_size.  That's wrong too.
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

09 Jul, 2005 · 1 commit

[NET]: Fix sparse warnings · 86a76caf

Authored by Victor Fusco on Jul 08, 2005

From: Victor Fusco <victor@cetuc.puc-rio.br>

Fix the sparse warning "implicit cast to nocast type"
Signed-off-by: Victor Fusco <victor@cetuc.puc-rio.br>
Signed-off-by: Domen Puncer <domen@coderock.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

06 Jul, 2005 · 12 commits

[TCP]: Never TSO defer under periods of congestion. · 908a75c1

Authored by David S. Miller on Jul 05, 2005

Congestion window recovery after loss depends upon the fact
that if we have a full MSS sized frame at the head of the
send queue, we will send it.  TSO deferral can defeat the
ACK clocking necessary to exit cleanly from recovery.
Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP]: Move to new TSO segmenting scheme. · c1b4a7e6

Authored by David S. Miller on Jul 05, 2005

Make TSO segment transmit size decisions at send time not earlier.

The basic scheme is that we try to build as large a TSO frame as
possible when pulling in the user data, but the size of the TSO frame
output to the card is determined at transmit time.

This is guided by tp->xmit_size_goal. It is always set to a multiple
of MSS and tells sendmsg/sendpage how large an SKB to try and build.

Later, tcp_write_xmit() and tcp_push_one() chop up the packet if
necessary and conditions warrant. These routines can also decide to
"defer" in order to wait for more ACKs to arrive and thus allow larger
TSO frames to be emitted.

A general observation is that TSO elongates the pipe, thus requiring a
larger congestion window and larger buffering especially at the sender
side. Therefore, it is important that applications 1) get a large
enough socket send buffer (this is accomplished by our dynamic send
buffer expansion code) 2) do large enough writes.
Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP]: Eliminate redundant computations in tcp_write_xmit(). · aa93466b

Authored by David S. Miller on Jul 05, 2005

tcp_snd_test() is run for every packet output by a single
call to tcp_write_xmit(), but this is not necessary.

For one, the congestion window space needs to only be
calculated one time, then used throughout the duration
of the loop.

This cleanup also makes experimenting with different TSO
packetization schemes much easier.
Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP]: Break out tcp_snd_test() into it's constituent parts. · 7f4dd0a9

Authored by David S. Miller on Jul 05, 2005

tcp_snd_test() does several different things, use inline
functions to express this more clearly.

1) It initializes the TSO count of SKB, if necessary.
2) It performs the Nagle test.
3) It makes sure the congestion window is adhered to.
4) It makes sure SKB fits into the send window.

This cleanup also sets things up so that things like the
available packets in the congestion window do not need
to be calculated multiple times by packet sending loops
such as tcp_write_xmit().
Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP]: Fix __tcp_push_pending_frames() 'nonagle' handling. · 55c97f3e

Authored by David S. Miller on Jul 05, 2005

'nonagle' should be passed to the tcp_snd_test() function
as 'TCP_NAGLE_PUSH' if we are checking an SKB not at the
tail of the write_queue.  This is because Nagle does not
apply to such frames since we cannot possibly tack more
data onto them.

However, while doing this __tcp_push_pending_frames() makes
all of the packets in the write_queue use this modified
'nonagle' value.

Fix the bug and simplify this function by just calling
tcp_write_xmit() directly if sk_send_head is non-NULL.

As a result, we can now make tcp_data_snd_check() just call
tcp_push_pending_frames() instead of the specialized
__tcp_data_snd_check().
Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP]: Fix redundant calculations of tcp_current_mss() · a2e2a59c

Authored by David S. Miller on Jul 05, 2005

tcp_write_xmit() uses tcp_current_mss(), but some of its callers,
namely __tcp_push_pending_frames(), already have this value
available.

While we're here, fix the "cur_mss" argument to be "unsigned int"
instead of plain "unsigned".
Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP]: tcp_write_xmit() tabbing cleanup · 92df7b51

Authored by David S. Miller on Jul 05, 2005

Put the main basic block of work at the top-level of
tabbing, and mark the TCP_CLOSE test with unlikely().
Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP]: Kill extra cwnd validate in __tcp_push_pending_frames(). · a762a980

Authored by David S. Miller on Jul 05, 2005

The tcp_cwnd_validate() function should only be invoked
if we actually send some frames, yet __tcp_push_pending_frames()
will always invoke it.  tcp_write_xmit() does the call for us,
so the call here can simply be removed.

Also, tcp_write_xmit() can be marked static.
Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP]: Add missing skb_header_release() call to tcp_fragment(). · f44b5271

Authored by David S. Miller on Jul 05, 2005

When we add any new packet to the TCP socket write queue,
we must call skb_header_release() on it in order for the
TSO sharing checks in the drivers to work.
Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP]: Move __tcp_data_snd_check into tcp_output.c · 84d3e7b9

Authored by David S. Miller on Jul 05, 2005

It reimplements portions of tcp_snd_check(), so if
we move it to tcp_output.c we can consolidate its
logic much more easily in a later change.
Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP]: Move send test logic out of net/tcp.h · f6302d1d

Authored by David S. Miller on Jul 05, 2005

This just moves the code into tcp_output.c, no code logic changes are
made by this patch.

Using this as a baseline, we can begin to untangle the mess of
comparisons for the Nagle test et al.  We will also be able to reduce
all of the redundant computation that occurs when outputting data
packets.
Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP]: Fix quick-ack decrementing with TSO. · fc6415bc

Authored by David S. Miller on Jul 05, 2005

On each packet output, we call tcp_dec_quickack_mode()
if the ACK flag is set.  It drops tp->ack.quick until
it hits zero, at which time we deflate the ATO value.

When doing TSO, we are emitting multiple packets with
ACK set, so we should decrement tp->ack.quick that many
segments.

Note that, unlike this case, tcp_enter_cwr() should not
take the tcp_skb_pcount(skb) into consideration.  That
function, one time, readjusts tp->snd_cwnd and moves
into TCP_CA_CWR state.
Signed-off-by: David S. Miller <davem@davemloft.net>
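A user-space sketch of decrementing the quick-ack budget by the number of segments in a TSO frame. `struct ack_model`, `dec_quickack()` and the `MODEL_ATO_MIN` value are hypothetical stand-ins for tp->ack.quick, tcp_dec_quickack_mode() and the kernel's minimum ATO; the real constant is not given in the message.

```c
#define MODEL_ATO_MIN 4U	/* hypothetical floor for the ATO value */

struct ack_model {
	unsigned int quick;	/* remaining quick ACKs */
	unsigned int ato;	/* delayed-ACK timeout estimate */
};

/* Consume 'pkts' quick ACKs; deflate the ATO once the budget runs out. */
static void dec_quickack(struct ack_model *ack, unsigned int pkts)
{
	if (!ack->quick)
		return;

	if (pkts >= ack->quick) {
		ack->quick = 0;
		ack->ato = MODEL_ATO_MIN;
	} else {
		ack->quick -= pkts;
	}
}
```

A caller sending a TSO frame would pass the frame's segment count as `pkts`, where a non-TSO send passes 1.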

24 Jun, 2005 · 1 commit

[TCP]: Add pluggable congestion control algorithm infrastructure. · 317a76f9

Authored by Stephen Hemminger on Jun 23, 2005

Allow TCP to have multiple pluggable congestion control algorithms.
Algorithms are defined by a set of operations and can be built in
or modules.  The legacy "new RENO" algorithm is used as a starting
point and fallback.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
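"Defined by a set of operations" can be pictured as a table of function pointers looked up by name, with Reno as the fallback. The sketch below is a user-space illustration only: `struct cc_ops_model`, `cc_find()` and the "demo" algorithm are hypothetical, and the single `ssthresh` hook is a simplification of whatever operations the real infrastructure defines.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical operations table for one congestion control algorithm. */
struct cc_ops_model {
	const char *name;
	unsigned int (*ssthresh)(unsigned int cwnd);
};

/* Reno-style response to loss: halve the window, but keep at least 2. */
static unsigned int reno_ssthresh(unsigned int cwnd)
{
	return cwnd / 2 > 2 ? cwnd / 2 : 2;
}

/* A made-up alternative that backs off to 70% of the window. */
static unsigned int demo_ssthresh(unsigned int cwnd)
{
	unsigned int v = cwnd * 7 / 10;

	return v > 2 ? v : 2;
}

static const struct cc_ops_model cc_table[] = {
	{ "reno", reno_ssthresh },
	{ "demo", demo_ssthresh },
};

/* Look an algorithm up by name; unknown names fall back to Reno. */
static const struct cc_ops_model *cc_find(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(cc_table) / sizeof(cc_table[0]); i++)
		if (strcmp(cc_table[i].name, name) == 0)
			return &cc_table[i];
	return &cc_table[0];
}
```

The calling code only ever goes through the returned pointer, so adding an algorithm means adding a table entry rather than touching the send path.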

19 Jun, 2005 · 2 commits

[NET] Rename open_request to request_sock · 60236fdd

Authored by Arnaldo Carvalho de Melo on Jun 18, 2005

Ok, this one just renames some stuff to have a better namespace and to
dissociate it from TCP:

struct open_request  -> struct request_sock
tcp_openreq_alloc    -> reqsk_alloc
tcp_openreq_free     -> reqsk_free
tcp_openreq_fastfree -> __reqsk_free

With this most of the infrastructure closely resembles a struct
sock methods subset.
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NET] Generalise TCP's struct open_request minisock infrastructure · 2e6599cb

Authored by Arnaldo Carvalho de Melo on Jun 18, 2005

Kept this first changeset minimal, without changing existing names to
ease peer review.

Basically tcp_openreq_alloc now receives the or_calltable, that in turn
has two new members:

->slab, that replaces tcp_openreq_cachep
->obj_size, to inform the size of the openreq descendant for
  a specific protocol

The protocol specific fields in struct open_request were moved to a
class hierarchy, with the things that are common to all connection
oriented PF_INET protocols in struct inet_request_sock, the TCP ones
in tcp_request_sock, that is an inet_request_sock, that is an
open_request.

I.e. this uses the same approach used for the struct sock class
hierarchy, with sk_prot indicating if the protocol wants to use the
open_request infrastructure by filling in sk_prot->rsk_prot with an
or_calltable.

Results? Performance is improved and TCP v4 now uses only 64 bytes per
open request minisock, down from 96 without this patch :-)

Next changeset will rename some of the structs, fields and functions
mentioned above, struct or_calltable is way unclear, better name it
struct request_sock_ops, s/struct open_request/struct request_sock/g,
etc.
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
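The class hierarchy and the role of obj_size can be sketched in user space with nested structs. All type and function names below are hypothetical, the fields are placeholders, and `calloc()` stands in for the per-protocol slab cache mentioned in the message.

```c
#include <stdlib.h>

/* Hypothetical per-protocol operations: just the object size here. */
struct req_ops_model {
	size_t obj_size;	/* size of the most derived request type */
};

/* Base "class": what every request carries. */
struct req_model {
	const struct req_ops_model *ops;
};

/* Fields common to connection-oriented INET protocols (placeholder). */
struct inet_req_model {
	struct req_model req;	/* must stay first so casts are valid */
	unsigned int rmt_addr;
};

/* TCP-specific fields on top of the INET ones (placeholder). */
struct tcp_req_model {
	struct inet_req_model ireq;	/* must stay first */
	unsigned int rcv_isn;
};

static const struct req_ops_model tcp_req_ops = {
	.obj_size = sizeof(struct tcp_req_model),
};

/* Generic allocator: sized by the protocol, typed as the base class. */
static struct req_model *req_alloc(const struct req_ops_model *ops)
{
	struct req_model *req = calloc(1, ops->obj_size);

	if (req)
		req->ops = ops;
	return req;
}
```

Because each derived struct embeds its parent as the first member, generic code can hold a `struct req_model *` while protocol code casts the same pointer to its own type.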