1. 06 Jul 2005 (11 commits)
    • [TCP]: Move to new TSO segmenting scheme. · c1b4a7e6
      By David S. Miller
      Make TSO segment transmit size decisions at send time not earlier.
      
      The basic scheme is that we try to build as large a TSO frame as
      possible when pulling in the user data, but the size of the TSO frame
      output to the card is determined at transmit time.
      
      This is guided by tp->xmit_size_goal.  It is always set to a multiple
      of MSS and tells sendmsg/sendpage how large an SKB to try and build.
      
      Later, tcp_write_xmit() and tcp_push_one() chop up the packet if
      necessary and conditions warrant.  These routines can also decide to
      "defer" in order to wait for more ACKs to arrive and thus allow larger
      TSO frames to be emitted.
      
      A general observation is that TSO elongates the pipe, thus requiring a
      larger congestion window and larger buffering especially at the sender
      side.  Therefore, it is important that applications 1) get a large
      enough socket send buffer (this is accomplished by our dynamic send
      buffer expansion code) 2) do large enough writes.
      Signed-off-by: David S. Miller <davem@davemloft.net>
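The idea above, building to a goal size at sendmsg time and chopping at transmit time, can be sketched in a few lines. This is a hypothetical user-space illustration, not the kernel's code: the rounding of tp->xmit_size_goal down to a multiple of MSS is the only behavior shown, and the names are borrowed for clarity.

```c
#include <assert.h>

/* Illustrative sketch: sendmsg/sendpage build SKBs up to a goal that
 * is always a whole multiple of MSS; the transmit path later decides
 * how much of that actually goes out to the card. */
static unsigned int xmit_size_goal(unsigned int mss, unsigned int max_frame)
{
    /* round the largest buildable frame down to a multiple of MSS */
    unsigned int goal = max_frame - (max_frame % mss);

    /* never fall below a single MSS */
    return goal ? goal : mss;
}
```

With a 1448-byte MSS and a 64 KB TSO frame limit, the goal lands on 65160 bytes (45 full segments) rather than an odd-sized frame.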
    • [TCP]: Eliminate redundant computations in tcp_write_xmit(). · aa93466b
      By David S. Miller
      tcp_snd_test() is run for every packet output by a single
      call to tcp_write_xmit(), but this is not necessary.
      
      For one, the congestion window space needs to only be
      calculated one time, then used throughout the duration
      of the loop.
      
      This cleanup also makes experimenting with different TSO
      packetization schemes much easier.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [TCP]: Break out tcp_snd_test() into its constituent parts. · 7f4dd0a9
      By David S. Miller
      tcp_snd_test() does several different things, use inline
      functions to express this more clearly.
      
      1) It initializes the TSO count of SKB, if necessary.
      2) It performs the Nagle test.
      3) It makes sure the congestion window is adhered to.
      4) It makes sure SKB fits into the send window.
      
      This cleanup also sets things up so that quantities like the
      available packet space in the congestion window do not need
      to be calculated multiple times by packet sending loops
      such as tcp_write_xmit().
      Signed-off-by: David S. Miller <davem@davemloft.net>
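The four constituent parts listed above can be sketched as four small helpers. This is an illustrative user-space model under assumed names (struct seg, init_tso_segs, and friends are not the kernel's exact API); it only shows the shape of each check.

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal stand-in for the fields each check consults. */
struct seg {
    unsigned int len;     /* payload bytes in this skb */
    unsigned int pcount;  /* TSO segment count, 0 if not yet set */
};

/* 1) Initialize the TSO count of the skb, if necessary. */
static void init_tso_segs(struct seg *s, unsigned int mss)
{
    if (!s->pcount)
        s->pcount = (s->len + mss - 1) / mss;  /* ceil(len / mss) */
}

/* 2) Nagle test: hold back a sub-MSS segment while data is in flight,
 *    unless the caller forces a push. */
static bool nagle_allows(const struct seg *s, unsigned int mss,
                         unsigned int packets_out, bool force_push)
{
    if (force_push || s->len >= mss)
        return true;
    return packets_out == 0;
}

/* 3) Congestion window: segments in flight plus this skb must fit. */
static bool cwnd_allows(const struct seg *s, unsigned int in_flight,
                        unsigned int cwnd)
{
    return in_flight + s->pcount <= cwnd;
}

/* 4) Send window: the segment's end sequence must not pass it. */
static bool snd_wnd_allows(unsigned int seq, unsigned int len,
                           unsigned int wnd_end)
{
    return seq + len <= wnd_end;
}
```

Splitting the test this way is what lets a sending loop compute the congestion-window headroom once and reuse it across iterations, as the previous commit's cleanup exploits.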
    • [TCP]: Fix __tcp_push_pending_frames() 'nonagle' handling. · 55c97f3e
      By David S. Miller
      'nonagle' should be passed to the tcp_snd_test() function
      as 'TCP_NAGLE_PUSH' if we are checking an SKB not at the
      tail of the write_queue.  This is because Nagle does not
      apply to such frames since we cannot possibly tack more
      data onto them.
      
      However, while doing this __tcp_push_pending_frames() makes
      all of the packets in the write_queue use this modified
      'nonagle' value.
      
      Fix the bug and simplify this function by just calling
      tcp_write_xmit() directly if sk_send_head is non-NULL.
      
      As a result, we can now make tcp_data_snd_check() just call
      tcp_push_pending_frames() instead of the specialized
      __tcp_data_snd_check().
      Signed-off-by: David S. Miller <davem@davemloft.net>
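The rule being fixed, only the tail of the write queue is subject to Nagle, reduces to a one-line decision. A hedged sketch (the flag values match the kernel's TCP_NAGLE_* defines, but effective_nonagle is an illustrative helper, not a kernel function):

```c
#include <assert.h>

#define TCP_NAGLE_OFF  1  /* Nagle disabled on the socket */
#define TCP_NAGLE_CORK 2  /* socket is corked */
#define TCP_NAGLE_PUSH 4  /* one-shot: force this frame out */

/* A frame that is not at the tail of the write queue can never have
 * more data tacked onto it, so Nagle must not hold it back: treat it
 * as a forced push. Only the tail keeps the caller's nonagle value. */
static int effective_nonagle(int nonagle, int skb_is_queue_tail)
{
    return skb_is_queue_tail ? nonagle : TCP_NAGLE_PUSH;
}
```

The bug was the reverse leakage: once one non-tail frame upgraded the value to TCP_NAGLE_PUSH, the loop reused it for every later frame, including the tail.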
    • [TCP]: Fix redundant calculations of tcp_current_mss() · a2e2a59c
      By David S. Miller
      tcp_write_xmit() uses tcp_current_mss(), but some of its callers,
      namely __tcp_push_pending_frames(), already have this value
      available.
      
      While we're here, fix the "cur_mss" argument to be "unsigned int"
      instead of plain "unsigned".
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [TCP]: tcp_write_xmit() tabbing cleanup · 92df7b51
      By David S. Miller
      Put the main basic block of work at the top-level of
      tabbing, and mark the TCP_CLOSE test with unlikely().
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [TCP]: Kill extra cwnd validate in __tcp_push_pending_frames(). · a762a980
      By David S. Miller
      The tcp_cwnd_validate() function should only be invoked
      if we actually send some frames, yet __tcp_push_pending_frames()
      will always invoke it.  tcp_write_xmit() does the call for us,
      so the call here can simply be removed.
      
      Also, tcp_write_xmit() can be marked static.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [TCP]: Add missing skb_header_release() call to tcp_fragment(). · f44b5271
      By David S. Miller
      When we add any new packet to the TCP socket write queue,
      we must call skb_header_release() on it in order for the
      TSO sharing checks in the drivers to work.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [TCP]: Move __tcp_data_snd_check into tcp_output.c · 84d3e7b9
      By David S. Miller
      It reimplements portions of tcp_snd_check(), so if we move it
      to tcp_output.c we can consolidate its logic much more easily
      in a later change.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [TCP]: Move send test logic out of net/tcp.h · f6302d1d
      By David S. Miller
      This just moves the code into tcp_output.c, no code logic changes are
      made by this patch.
      
      Using this as a baseline, we can begin to untangle the mess of
      comparisons for the Nagle test et al.  We will also be able to reduce
      all of the redundant computation that occurs when outputting data
      packets.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [TCP]: Fix quick-ack decrementing with TSO. · fc6415bc
      By David S. Miller
      On each packet output, we call tcp_dec_quickack_mode()
      if the ACK flag is set.  It drops tp->ack.quick until
      it hits zero, at which time we deflate the ATO value.
      
      When doing TSO, we are emitting multiple packets with
      ACK set, so we should decrement tp->ack.quick by that many
      segments.
      
      Note that, unlike this case, tcp_enter_cwr() should not
      take tcp_skb_pcount(skb) into consideration.  That function
      runs once, readjusting tp->snd_cwnd and moving into the
      TCP_CA_CWR state.
      Signed-off-by: David S. Miller <davem@davemloft.net>
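The fix described above amounts to decrementing the quick-ack budget by the skb's segment count instead of by one. A minimal sketch, assuming an illustrative helper name (the kernel decrements tp->ack.quick in place rather than returning a value):

```c
#include <assert.h>

/* With TSO, one transmitted skb carries tcp_skb_pcount(skb) on-wire
 * segments, each with the ACK flag set, so the quick-ack counter must
 * drop by pcount per transmission, clamped at zero. */
static unsigned int dec_quickack(unsigned int quick, unsigned int pcount)
{
    return quick > pcount ? quick - pcount : 0;
}
```

Once the counter reaches zero, the ATO deflation that previously fired after N single-segment sends now fires after the equivalent amount of TSO traffic, restoring the pre-TSO pacing behavior.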
  2. 24 Jun 2005 (1 commit)
  3. 19 Jun 2005 (2 commits)
    • [NET] Rename open_request to request_sock · 60236fdd
      By Arnaldo Carvalho de Melo
      Ok, this one just renames some stuff to have a better namespace and to
      disassociate it from TCP:
      
      struct open_request  -> struct request_sock
      tcp_openreq_alloc    -> reqsk_alloc
      tcp_openreq_free     -> reqsk_free
      tcp_openreq_fastfree -> __reqsk_free
      
      With this most of the infrastructure closely resembles a struct
      sock methods subset.
      Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [NET] Generalise TCP's struct open_request minisock infrastructure · 2e6599cb
      By Arnaldo Carvalho de Melo
      Kept this first changeset minimal, without changing existing names to
      ease peer review.
      
      Basically tcp_openreq_alloc now receives the or_calltable, which in
      turn has two new members:
      
      ->slab, that replaces tcp_openreq_cachep
      ->obj_size, to inform the size of the openreq descendant for
        a specific protocol
      
      The protocol specific fields in struct open_request were moved to a
      class hierarchy, with the things that are common to all connection
      oriented PF_INET protocols in struct inet_request_sock, the TCP ones
      in tcp_request_sock, that is an inet_request_sock, that is an
      open_request.
      
      I.e. this uses the same approach used for the struct sock class
      hierarchy, with sk_prot indicating if the protocol wants to use the
      open_request infrastructure by filling in sk_prot->rsk_prot with an
      or_calltable.
      
      Results? Performance is improved and TCP v4 now uses only 64 bytes per
      open request minisock, down from 96 without this patch :-)
      
      Next changeset will rename some of the structs, fields and functions
      mentioned above, struct or_calltable is way unclear, better name it
      struct request_sock_ops, s/struct open_request/struct request_sock/g,
      etc.
      Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
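The class hierarchy described above relies on C's guarantee that a struct and its first member share an address: embedding the generic struct first lets a pointer to the derived type be safely viewed as the base type. A sketch with illustrative fields (only the embedding layout mirrors the commit; the members shown are placeholders):

```c
#include <assert.h>
#include <stddef.h>

struct request_sock_ops;  /* per-protocol ops table (or_calltable) */

/* Generic connection-request minisock. */
struct request_sock {
    struct request_sock *dl_next;          /* listen-queue linkage */
    const struct request_sock_ops *rsk_ops;
};

/* Common to all connection-oriented PF_INET protocols. */
struct inet_request_sock {
    struct request_sock req;   /* must be first: enables upcasting */
    unsigned int loc_addr;     /* illustrative field */
    unsigned int rmt_addr;     /* illustrative field */
};

/* TCP-specific request sock: an inet_request_sock, which is a
 * request_sock, mirroring the struct sock hierarchy. */
struct tcp_request_sock {
    struct inet_request_sock ireq;  /* must be first */
    unsigned int snt_isn;           /* illustrative field */
    unsigned int rcv_isn;           /* illustrative field */
};
```

The ->obj_size member mentioned above is what lets a single slab allocator hand out the right-sized object for whichever leaf of this hierarchy a protocol registers, which is where the 96-to-64-byte saving comes from.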
  4. 06 May 2005 (1 commit)
  5. 25 Apr 2005 (1 commit)
    • [TCP]: skb pcount with MTU discovery · d5ac99a6
      By David S. Miller
      The problem is that when doing MTU discovery, the too-large segments in
      the write queue will be calculated as having a pcount of >1.  When
      tcp_write_xmit() is trying to send, tcp_snd_test() fails the cwnd test
      when pcount > cwnd.
      
      The segments are eventually transmitted one at a time by keepalive, but
      this can take a long time.
      
      This patch checks if TSO is enabled when setting pcount.
      Signed-off-by: John Heffner <jheffner@psc.edu>
      Signed-off-by: David S. Miller <davem@davemloft.net>
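The fix above can be summarized as: an over-MSS segment only counts as multiple cwnd slots when the device will actually split it via TSO; otherwise it occupies one slot like any other skb. A hedged sketch with an assumed helper name:

```c
#include <assert.h>

/* Illustrative model of the fix: without TSO, a too-large segment
 * left in the write queue by MTU discovery must keep pcount == 1,
 * or the cwnd test (pcount > cwnd headroom) can wedge it so it only
 * dribbles out via keepalive. With TSO it counts as ceil(len/mss). */
static unsigned int skb_pcount(unsigned int len, unsigned int mss,
                               int tso_enabled)
{
    if (!tso_enabled || len <= mss)
        return 1;
    return (len + mss - 1) / mss;
}
```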
  6. 17 Apr 2005 (1 commit)
    • Linux-2.6.12-rc2 · 1da177e4
      By Linus Torvalds
      Initial git repository build. I'm not bothering with the full history,
      even though we have it. We can create a separate "historical" git
      archive of that later if we want to, and in the meantime it's about
      3.2GB when imported into git - space that would just make the early
      git days unnecessarily complicated, when we don't have a lot of good
      infrastructure for it.
      
      Let it rip!