Commits · ec0a196626bd12e0ba108d7daa6d95a4fb25c2c5 · openeuler / Kernel

13 Jun, 2008 · 1 commit

tcp: Revert 'process defer accept as established' changes. · ec0a1966

Committed by David S. Miller on Jun 12, 2008

This reverts two changesets, ec3c0982
("[TCP]: TCP_DEFER_ACCEPT updates - process as established") and
the follow-on bug fix 9ae27e0a
("tcp: Fix slab corruption with ipv6 and tcp6fuzz").

This change causes several problems, first reported by Ingo Molnar
as a distcc-over-loopback regression where connections were getting
stuck.

Ilpo Järvinen first spotted the locking problems.  The new function
added by this code, tcp_defer_accept_check(), only has the
child socket locked, yet it is modifying state of the parent
listening socket.

Fixing that is non-trivial at best, because we can't simply just grab
the parent listening socket lock at this point, because it would
create an ABBA deadlock.  The normal ordering is parent listening
socket --> child socket, but this code path would require the
reverse lock ordering.

Next is a problem noticed by Vitaliy Gusev, who noted:

----------------------------------------
>--- a/net/ipv4/tcp_timer.c
>+++ b/net/ipv4/tcp_timer.c
>@@ -481,6 +481,11 @@ static void tcp_keepalive_timer (unsigned long data)
> 		goto death;
> 	}
>
>+	if (tp->defer_tcp_accept.request && sk->sk_state == TCP_ESTABLISHED) {
>+		tcp_send_active_reset(sk, GFP_ATOMIC);
>+		goto death;

Here socket sk is not attached to listening socket's request queue. tcp_done()
will not call inet_csk_destroy_sock() (and tcp_v4_destroy_sock() which should
release this sk) as socket is not DEAD. Therefore socket sk will be lost for
freeing.
----------------------------------------

Finally, Alexey Kuznetsov argues that there might not even be any
real value or advantage to these new semantics even if we fix all
of the bugs:

----------------------------------------
Hiding from accept() sockets with only out-of-order data only
is the only thing which is impossible with old approach. Is this really
so valuable? My opinion: no, this is nothing but a new loophole
to consume memory without control.
----------------------------------------

So revert this thing for now.
Signed-off-by: David S. Miller <davem@davemloft.net>

22 May, 2008 · 1 commit

tcp: Make prior_ssthresh a u32 · 4b749440

Committed by Ilpo Järvinen on May 21, 2008

If previous window was above representable values of u16,
strange things will happen if undo with the truncated value
is called for. Alternatively, this could be fixed by some
max trickery but that would limit undoing high-speed undos.

Adds 16-bit hole but there isn't anything to fill it with.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
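The truncation that the prior_ssthresh commit (4b749440) guards against can be illustrated with a minimal user-space sketch. The helper names and the example values are hypothetical and are not kernel code; the only point is that a value above 65535 does not survive a round trip through a 16-bit field, so an undo would restore the wrong threshold.

```c
#include <stdint.h>

/* Hypothetical helper: stores a 32-bit threshold in a 16-bit field,
 * as a u16 prior_ssthresh would, and returns what an undo reads back. */
uint32_t roundtrip_prior_ssthresh_u16(uint32_t ssthresh)
{
    uint16_t stored = (uint16_t)ssthresh; /* high 16 bits are discarded */
    return stored;
}

/* The same round trip through a 32-bit field preserves the value. */
uint32_t roundtrip_prior_ssthresh_u32(uint32_t ssthresh)
{
    uint32_t stored = ssthresh;
    return stored;
}
```

For example, a threshold of 70000 comes back as 70000 - 65536 = 4464 from the 16-bit variant, while values at or below 65535 are unaffected.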

22 Mar, 2008 · 1 commit

[TCP]: TCP_DEFER_ACCEPT updates - process as established · ec3c0982

Committed by Patrick McManus on Mar 21, 2008

Change TCP_DEFER_ACCEPT implementation so that it transitions a
connection to ESTABLISHED after handshake is complete instead of
leaving it in SYN-RECV until some data arrives. Place connection in
accept queue when first data packet arrives from slow path.

Benefits:
 - established connection is now reset if it never makes it
   to the accept queue

 - diagnostic state of established matches with the packet traces
   showing completed handshake

 - TCP_DEFER_ACCEPT timeouts are expressed in seconds and can now be
   enforced with reasonable accuracy instead of rounding up to next
   exponential back-off of syn-ack retry.
Signed-off-by: Patrick McManus <mcmanus@ducksong.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
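For context on the option that the ec3c0982 commit message discusses, here is a minimal sketch of how an application requests TCP_DEFER_ACCEPT from user space on Linux. It shows only the socket-option call, not the kernel-side behaviour that this commit changed and that was later reverted. The function name `enable_defer_accept` is a hypothetical wrapper; `setsockopt()`, `IPPROTO_TCP` and `TCP_DEFER_ACCEPT` are the standard Linux interfaces.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical wrapper: ask the kernel to delay waking accept() on a
 * TCP socket until data arrives or roughly `seconds` have passed.
 * Returns 0 on success, -1 on error with errno set by setsockopt(). */
int enable_defer_accept(int listen_fd, int seconds)
{
    return setsockopt(listen_fd, IPPROTO_TCP, TCP_DEFER_ACCEPT,
                      &seconds, sizeof(seconds));
}
```

A server would typically call this on its listening socket before `listen()`. The kernel may round the timeout internally, so a value read back with `getsockopt()` is not guaranteed to equal the value that was set.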

29 Jan, 2008 · 3 commits

[TCP]: Rewrite SACK block processing & sack_recv_cache use · 68f8353b

Committed by Ilpo Järvinen on Nov 15, 2007

Key points of this patch are:

  - In case new SACK information is advance only type, no skb
    processing below previously discovered highest point is done
  - Optimize cases below highest point too since there's no need
    to always go up to highest point (which is very likely still
    present in that SACK), this is not entirely true though
    because I'm dropping the fastpath_skb_hint which could
    previously optimize those cases even better. Whether that's
    significant, I'm not too sure.

Currently it will provide skipping by walking. Combined with
RB-tree, all skipping would become fast too regardless of window
size (can be done incrementally later).

Previously a number of cases in TCP SACK processing fail to
take advantage of costly stored information in sack_recv_cache,
most importantly, expected events such as cumulative ACK and new
hole ACKs. Processing such ACKs results in rather long walks,
building up latencies (which easily get nasty when the window is
huge). Those latencies are often completely unnecessary
compared with the amount of _new_ information received; usually
for a cumulative ACK there's no new information at all, yet TCP
walks the whole queue unnecessarily, potentially taking a number of
costly cache misses on the way, etc.!

Since the inclusion of highest_sack, there's a lot of information
that is very likely redundant (SACK fastpath hint stuff,
fackets_out, highest_sack), though there's no ultimate guarantee
that they'll remain the same the whole time (in all unearthly
scenarios). Take advantage of this knowledge here and drop
fastpath hint and use direct access to highest SACKed skb as
a replacement.

Effectively "special cased" fastpath is dropped. This change
adds some complexity to introduce a better-covered "fastpath",
though the added complexity should make TCP behave more cache
friendly.

The current ACK's SACK blocks are compared against each cached
block individually and only ranges that are new are then scanned
by the high constant walk. For other parts of write queue, even
when in previously known part of the SACK blocks, a faster skip
function is used (if necessary at all). In addition, whenever
possible, TCP fast-forwards to highest_sack skb that was made
available by an earlier patch. In typical case, no other things
but this fast-forward and mandatory markings after that occur
making the access pattern quite similar to the former fastpath
"special case".

DSACKs are special case that must always be walked.

The local to recv_sack_cache copying could be more intelligent
w.r.t DSACKs which are likely to be there only once but that
is left to a separate patch.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP]: Earlier SACK block verification & simplify access to them · fd6dad61

Committed by Ilpo Järvinen on Nov 15, 2007

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP]: Convert highest_sack to sk_buff to allow direct access · a47e5a98

Committed by Ilpo Järvinen on Nov 15, 2007

It is going to replace the sack fastpath hint quite soon... :-)
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>

16 Oct, 2007 · 1 commit

[TCP]: Make snd_cwnd_cnt 32-bit · f78a1b38

Committed by Ilpo Järvinen on Oct 15, 2007

Very little point of having 32-bit snd_cwnd if this is not
32-bit as well, as a number of snd_cwnd incrementation formulas
assume that snd_cwnd_cnt can be at least as large as snd_cwnd.

Whether 32-bit is useful was discussed when e0ef57cc
was made:
  http://marc.info/?l=linux-netdev&m=117218144409825&w=2
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>

12 Oct, 2007 · 1 commit

[TCP]: Limit processing lost_retrans loop to work-to-do cases · b08d6cb2

Committed by Ilpo Järvinen on Oct 11, 2007

This addition of lost_retrans_low to tcp_sock might be
unnecessary, it's not clear how often lost_retrans worker is
executed when there wasn't work to do.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>

11 Oct, 2007 · 5 commits

[TCP]: Comment fastpath_cnt_hint off-by-one trap · c79e3357

Committed by Ilpo Järvinen on Oct 07, 2007

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP]: Update comment about highest_sack validity · 13dae426

Committed by Ilpo Järvinen on Aug 10, 2007

This stale info came from the original idea, which proved to be
unnecessarily complex, sacked_out > 0 is easy to do and that when
it's going to be needed anyway (it _can_ be valid also when
sacked_out == 0 but there's not going to be a guarantee about it
for now).
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP]: Tighten tcp_sock's belt, drop left_out · b5860bba

Committed by Ilpo Järvinen on Aug 09, 2007

It is easily calculable when needed and users are not that many
after all.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP]: Access to highest_sack obsoletes forward_cnt_hint · 539d243f

Committed by Ilpo Järvinen on May 27, 2007

In addition, added a reference about the purpose of the loop.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP]: Add highest_sack seqno, points to globally highest SACK · d738cd8f

Committed by Ilpo Järvinen on Mar 24, 2007

It is guaranteed to be valid only when !tp->sacked_out. In most
cases this seqno is available in the last ACK but there is no
guarantee for that. The new fast recovery loss marking algorithm
needs this as entry point.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>

26 Apr, 2007 · 5 commits

[SK_BUFF]: Introduce skb_transport_header(skb) · 9c70220b

Committed by Arnaldo Carvalho de Melo on Apr 25, 2007

For the places where we need a pointer to the transport header, it is
still legal to touch skb->h.raw directly if just adding to,
subtracting from or setting it to another layer header.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[SK_BUFF]: Introduce tcp_hdr(), remove skb->h.th · aa8223c7

Committed by Arnaldo Carvalho de Melo on Apr 10, 2007

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP]: Introduce tcp_hdrlen() and tcp_optlen() · ab6a5bb6

Committed by Arnaldo Carvalho de Melo on Mar 18, 2007

The ip_hdrlen() buddy, created to reduce the number of skb->h.th-> uses and to
avoid the longer, open coded equivalent.

Ditched a no-op in bnx2 in the process.

I wonder if we should have a BUG_ON(skb->h.th->doff < 5) in tcp_optlen()...
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
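The arithmetic behind the two helpers introduced in ab6a5bb6 can be sketched in user space as follows. The kernel versions take an skb and read the data-offset field from the TCP header. The functions below are hypothetical stand-ins that take the `doff` value directly. The guard for `doff < 5` is an assumption added here to keep the sketch well defined, in the spirit of the BUG_ON question raised in the commit message; it does not describe what the kernel helper does.

```c
#include <stdint.h>

#define SKETCH_TCP_MIN_DOFF 5u /* 5 words * 4 bytes = 20-byte base header */

/* doff is the TCP data-offset field: header length in 32-bit words. */
unsigned int sketch_tcp_hdrlen(uint8_t doff)
{
    return doff * 4u;
}

/* Option bytes = total header length minus the fixed 20-byte header.
 * Returns 0 for a malformed doff < 5 instead of underflowing. */
unsigned int sketch_tcp_optlen(uint8_t doff)
{
    if (doff < SKETCH_TCP_MIN_DOFF)
        return 0;
    return (doff - SKETCH_TCP_MIN_DOFF) * 4u;
}
```

With `doff = 8`, the header is 32 bytes long, of which 12 bytes are options.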

[TCP]: Make snd_cwnd_clamp a u32. · e0ef57cc

Committed by David S. Miller on Feb 22, 2007

Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP]: Keep copied_seq, rcv_wup and rcv_next together. · 54287cc1

Committed by Eric Dumazet on Feb 22, 2007

I noticed in oprofile study a cache miss in tcp_rcv_established() to read
copied_seq.

ffffffff80400a80 <tcp_rcv_established>: /* tcp_rcv_established total: 4034293  2.0400 */

 55493  0.0281 :ffffffff80400bc9:   mov    0x4c8(%r12),%eax copied_seq
543103  0.2746 :ffffffff80400bd1:   cmp    0x3e0(%r12),%eax   rcv_nxt    

if (tp->copied_seq == tp->rcv_nxt &&
        len - tcp_header_len <= tp->ucopy.len) {

In this function, the cache line 0x4c0 -> 0x500 is used only for this
reading 'copied_seq' field.

rcv_wup and copied_seq should be next to rcv_nxt field, to lower number of
active cache lines in hot paths. (tcp_rcv_established(), tcp_poll(), ...)

As you suggested, I changed tcp_create_openreq_child() so that these fields
are changed together, to avoid adding a new store buffer stall.

Patch is 64bit friendly (no new hole because of alignment constraints)
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
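The cache-line argument in 54287cc1 can be checked with a little arithmetic. The sketch below assumes 64-byte cache lines, which matches the 0x4c0 -> 0x500 span quoted in the commit message. `cache_line_of` and `struct rx_seq_group` are hypothetical names used only for illustration; the struct is not the real `struct tcp_sock` layout.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_CACHE_LINE_BYTES 64u /* assumption: 64-byte cache lines */

/* Index of the cache line that contains a given byte offset. */
size_t cache_line_of(size_t offset)
{
    return offset / SKETCH_CACHE_LINE_BYTES;
}

/* Hypothetical stand-in for the regrouped fields: the three receive-side
 * sequence numbers sit next to each other, 12 bytes in total. */
struct rx_seq_group {
    uint32_t rcv_nxt;
    uint32_t copied_seq;
    uint32_t rcv_wup;
};
```

The offsets from the profile, 0x4c8 for copied_seq and 0x3e0 for rcv_nxt, fall into different lines under this assumption, so the comparison touches two cache lines. Once the fields are adjacent they share a line, provided the group does not straddle a line boundary.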

09 Feb, 2007 · 1 commit

[TCP]: Seperate DSACK from SACK fast path · 6f74651a

Committed by Baruch Even on Feb 04, 2007

Move DSACK code outside the SACK fast-path checking code. If the DSACK
determined that the information was too old we stayed with a partial cache
copied. Most likely this matters very little since the next packet will not be
DSACK and we will find it in the cache, but it's still not good form and there
is little reason to couple the two checks.

Since the SACK receive cache doesn't need the data to be in host order we also
remove the ntohl in the checking loop.
Signed-off-by: Baruch Even <baruch@ev-en.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

03 Dec, 2006 · 4 commits

[TCP]: Renove the __ prefix on the struct tcp_sock members · 3a137d20

Committed by Arnaldo Carvalho de Melo on Nov 28, 2006

As this struct is not userland visible at all.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>

[TCP]: Change tcp_header_len member in tcp_sock to u16 · 2ff52f28

Committed by Arnaldo Carvalho de Melo on Nov 28, 2006

With this we eliminate the last hole in struct tcp_sock.

End result:

[acme@newtoy net-2.6.20]$ codiff -sV /tmp/tcp.o.before net/ipv4/tcp.o
/pub/scm/linux/kernel/git/acme/net-2.6.20/net/ipv4/tcp.c:
  struct tcp_sock |   -4
    tcp_header_len;
     from: int                   /*  1000(0)     4(0) */
     to:   u16                   /*  1000(0)     2(0) */
 1 struct changed
[acme@newtoy net-2.6.20]$

Now sizeof(tcp_sock) is just...

[acme@newtoy net-2.6.20]$ pahole --sizes ../OUTPUT/qemu/net-2.6.20/net/ipv4/tcp.o | grep -w tcp_sock
struct tcp_sock: 1500 0

1500 bytes ;-)
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
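The effect of narrowing a field to close a structure hole, as 2ff52f28 does for tcp_header_len, can be reproduced with a small sketch. Both structs below are hypothetical and much simpler than `struct tcp_sock`; the exact sizes depend on the ABI, so the sketch only relies on the narrowed layout being smaller where `int` has stricter alignment than `uint16_t`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: a 4-byte int between two 16-bit fields. Where int
 * requires 4-byte alignment, the compiler pads after each 16-bit field. */
struct layout_with_int {
    uint16_t flags;
    int      header_len;
    uint16_t window;
};

/* Same fields with header_len narrowed to 16 bits: no padding is needed. */
struct layout_with_u16 {
    uint16_t flags;
    uint16_t header_len;
    uint16_t window;
};

/* Bytes recovered by narrowing the field (ABI-dependent). */
size_t layout_bytes_saved(void)
{
    return sizeof(struct layout_with_int) - sizeof(struct layout_with_u16);
}
```

For real kernel structures, tools such as pahole and codiff, used in the commit message, report holes and size changes directly from the object files.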

[NET]: Annotate checksums in on-the-wire packets. · 9981a0e3

Committed by Al Viro on Nov 14, 2006

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP]: MD5 Signature Option (RFC2385) support. · cfb6eeb4

Committed by YOSHIFUJI Hideaki on Nov 14, 2006

Based on implementation by Rick Payne.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

19 Oct, 2006 · 1 commit

[TCP]: Bound TSO defer time · ae8064ac

Committed by John Heffner on Oct 18, 2006

This patch limits the amount of time you will defer sending a TSO segment
to less than two clock ticks, or the time between two acks, whichever is
longer.

On slow links, deferring causes significant bursts.  See attached plots,
which show RTT through a 1 Mbps link with a 100 ms RTT and ~100 ms queue
for (a) non-TSO, (b) current TSO, and (c) patched TSO.  This burstiness
causes significant jitter, tends to overflow queues early (bad for short
queues), and makes delay-based congestion control more difficult.

Deferring by a couple clock ticks I believe will have a relatively small
impact on performance.
Signed-off-by: John Heffner <jheffner@psc.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>

29 Sep, 2006 · 3 commits

[TCP]: struct tcp_sock .pred_flags is net-endian · dddc93c0

Committed by Al Viro on Sep 27, 2006

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP]: struct tcp_sack_block annotations · 269bd27e

Committed by Al Viro on Sep 27, 2006

Some of the instances of tcp_sack_block are host-endian, some - net-endian.
Define struct tcp_sack_block_wire identical to struct tcp_sack_block
with u32 replaced with __be32; annotate uses of tcp_sack_block replacing
net-endian ones with tcp_sack_block_wire. Change is obviously safe since
for cc(1) __be32 is typedefed to u32.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
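The host-endian versus net-endian split described in 269bd27e can be mirrored in user space. The struct and function names below are hypothetical stand-ins. Plain `uint32_t` is used for the wire fields because `__be32` is a kernel annotation checked by sparse; as the commit message notes, it is the same type as far as the C compiler is concerned.

```c
#include <arpa/inet.h>
#include <stdint.h>

/* SACK block as it appears in the packet: network byte order. */
struct sack_block_wire {
    uint32_t start_seq;
    uint32_t end_seq;
};

/* SACK block as used for sequence comparisons: host byte order. */
struct sack_block {
    uint32_t start_seq;
    uint32_t end_seq;
};

/* Convert one block from wire order to host order. */
struct sack_block sack_block_from_wire(struct sack_block_wire w)
{
    struct sack_block h = {
        .start_seq = ntohl(w.start_seq),
        .end_seq   = ntohl(w.end_seq),
    };
    return h;
}
```

Keeping two distinct types makes the conversion point explicit, which is what allows a checker to flag code that mixes the two byte orders.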

[IPV4]: TCP headers annotated · 46a97324

Committed by Al Viro on Sep 27, 2006

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

23 Jun, 2006 · 1 commit

[TCP]: Move inclusion of <linux/dmaengine.h> to correct place in <linux/tcp.h> · c8a553ad

Committed by David Woodhouse on Jun 22, 2006

The new <linux/dmaengine.h> header shouldn't be included from
the !__KERNEL__ portion of tcp.h
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

18 Jun, 2006 · 1 commit

[I/OAT]: Structure changes for TCP recv offload to I/OAT · 97fc2f08

Committed by Chris Leech on May 23, 2006

Adds an async_wait_queue and some additional fields to tcp_sock, and a
dma_cookie_t to sk_buff.
Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

26 Apr, 2006 · 1 commit

Don't include linux/config.h from anywhere else in include/ · 62c4f0a2

Committed by David Woodhouse on Apr 26, 2006

Signed-off-by: David Woodhouse <dwmw2@infradead.org>

21 Mar, 2006 · 1 commit

[TCP] mtu probing: move tcp-specific data out of inet_connection_sock · 0e7b1368

Committed by John Heffner on Mar 20, 2006

This moves some TCP-specific MTU probing state out of
inet_connection_sock back to tcp_sock.
Signed-off-by: John Heffner <jheffner@psc.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>

04 Jan, 2006 · 3 commits

[IP_SOCKGLUE]: Remove most of the tcp specific calls · d83d8461

Committed by Arnaldo Carvalho de Melo on Dec 13, 2005

As DCCP needs to be called in the same spots.

Now we have a member in inet_sock (is_icsk), set at sock creation time from
struct inet_protosw->flags (if INET_PROTOSW_ICSK is set, like for TCP and
DCCP) to see if a struct sock instance is a inet_connection_sock for places
like the ones in ip_sockglue.c (v4 and v6) where we previously were looking if
sk_type was SOCK_STREAM, that is insufficient because we now use the same code
for DCCP, that has sk_type SOCK_DCCP.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP]: Move the TCPF_ enum to tcp_states.h · 22712813

Committed by Arnaldo Carvalho de Melo on Dec 13, 2005

Upcoming patches will make, for instance, ip_sockglue.c need just this enum
and not all of tcp.h.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[ICSK]: Rename struct tcp_func to struct inet_connection_sock_af_ops · 8292a17a

Committed by Arnaldo Carvalho de Melo on Dec 13, 2005

And move it to struct inet_connection_sock. DCCP will use it in the
upcoming changesets.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

11 Nov, 2005 · 2 commits

[TCP]: speed up SACK processing · 6a438bbe

Committed by Stephen Hemminger on Nov 10, 2005

Use "hints" to speed up the SACK processing. Various forms 
of this have been used by TCP developers (Web100, STCP, BIC)
to avoid the 2x linear search of outstanding segments.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP]: Appropriate Byte Count support · 9772efb9

Committed by Stephen Hemminger on Nov 10, 2005

This is an updated version of the RFC3465 ABC patch originally
for Linux 2.6.11-rc4 by Yee-Ting Li. ABC is a way of counting
bytes ack'd rather than packets when updating congestion control.

The original ABC described in the RFC applied to a Reno style
algorithm. For advanced congestion control there is little
change after leaving slow start.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
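The byte-counting idea behind 9772efb9 can be sketched from the rules in RFC 3465. This is a simplified Reno-style illustration working in bytes, not the kernel implementation from the patch; `struct abc_state` and `abc_on_ack` are hypothetical names, and `limit` stands for the RFC's per-ACK limit L expressed in segments.

```c
#include <stdint.h>

struct abc_state {
    uint32_t cwnd;        /* congestion window, bytes */
    uint32_t ssthresh;    /* slow-start threshold, bytes */
    uint32_t bytes_acked; /* ACKed bytes not yet credited to cwnd */
    uint32_t smss;        /* sender maximum segment size, bytes */
    uint32_t limit;       /* L: per-ACK growth cap, in segments */
};

/* Update cwnd for an ACK that newly acknowledges `newly_acked` bytes. */
void abc_on_ack(struct abc_state *s, uint32_t newly_acked)
{
    if (s->cwnd < s->ssthresh) {
        /* Slow start: grow by the bytes ACKed, capped at L * SMSS. */
        uint32_t cap = s->limit * s->smss;
        s->cwnd += newly_acked < cap ? newly_acked : cap;
        return;
    }
    /* Congestion avoidance: add one SMSS per cwnd's worth of ACKed bytes. */
    s->bytes_acked += newly_acked;
    if (s->bytes_acked >= s->cwnd) {
        s->bytes_acked -= s->cwnd;
        s->cwnd += s->smss;
    }
}
```

Counting bytes rather than ACK packets means that a receiver sending fewer, larger ACKs (for example with delayed ACKs) does not slow window growth.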

30 Aug, 2005 · 4 commits

[ICSK]: Move TCP congestion avoidance members to icsk · 6687e988

Committed by Arnaldo Carvalho de Melo on Aug 10, 2005

This changeset basically moves tcp_sk()->{ca_ops,ca_state,etc} to inet_csk(),
minimal renaming/moving done in this changeset to ease review.

Most of it is just changes of struct tcp_sock * to struct sock * parameters.

With this we move to a state closer to two interesting goals:

1. Generalisation of net/ipv4/tcp_diag.c, becoming inet_diag.c, being used
for any INET transport protocol that has struct inet_hashinfo and are
derived from struct inet_connection_sock. Keeps the userspace API, that will
just not display DCCP sockets, while newer versions of tools can support
DCCP.

2. INET generic transport pluggable Congestion Avoidance infrastructure, using
the current TCP CA infrastructure with DCCP.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[ICSK]: Introduce reqsk_queue_prune from code in tcp_synack_timer · 295f7324

Committed by Arnaldo Carvalho de Melo on Aug 09, 2005

With this we're very close to getting all of the current TCP
refactorings in my dccp-2.6 tree merged, next changeset will export
some functions needed by the current DCCP code and then dccp-2.6.git
will be born!
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NET]: Introduce inet_connection_sock · 463c84b9

Committed by Arnaldo Carvalho de Melo on Aug 09, 2005

This creates struct inet_connection_sock, moving members out of struct
tcp_sock that are shareable with other INET connection oriented
protocols, such as DCCP, that in my private tree already uses most of
these members.

The functions that operate on these members were renamed, using a
inet_csk_ prefix while not being moved yet to a new file, so as to
ease the review of these changes.
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

[INET]: Generalise tcp_tw_bucket, aka TIME_WAIT sockets · 8feaf0c0

Committed by Arnaldo Carvalho de Melo on Aug 09, 2005

This paves the way to generalise the rest of the sock ID lookup
routines and saves some bytes in TCPv4 TIME_WAIT sockets on distro
kernels (where IPv6 is always built as a module):

[root@qemu ~]# grep tw_sock /proc/slabinfo
tw_sock_TCPv6  0  0  128  31  1
tw_sock_TCP    0  0   96  41  1
[root@qemu ~]#

Now if a protocol wants to use the TIME_WAIT generic infrastructure it
only has to set the sk_prot->twsk_obj_size field with the size of its
inet_timewait_sock derived sock and proto_register will create
sk_prot->twsk_slab, for now it's only for INET sockets, but we can
introduce timewait_sock later if some non INET transport protocol
wants to use this stuff.

Next changesets will take advantage of this new infrastructure to
generalise even more TCP code.

[acme@toy net-2.6.14]$ grep built-in /tmp/before.size /tmp/after.size
/tmp/before.size: 188646   11764    5068  205478   322a6 net/ipv4/built-in.o
/tmp/after.size:  188144   11764    5068  204976   320b0 net/ipv4/built-in.o
[acme@toy net-2.6.14]$

Tested with both IPv4 & IPv6 (::1 (localhost) & ::ffff:172.20.0.1
(qemu host)).
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

openeuler / Kernel · synced successfully about 1 year ago