1. 17 Jul 2008 (3 commits)
  2. 15 Jun 2008 (1 commit)
  3. 13 Jun 2008 (1 commit)
    • tcp: Revert 'process defer accept as established' changes. · ec0a1966
      Authored by David S. Miller
      This reverts two changesets, ec3c0982
      ("[TCP]: TCP_DEFER_ACCEPT updates - process as established") and
      the follow-on bug fix 9ae27e0a
      ("tcp: Fix slab corruption with ipv6 and tcp6fuzz").
      
      These changes cause several problems, first reported by Ingo Molnar
      as a distcc-over-loopback regression where connections were getting
      stuck.
      
      Ilpo Järvinen first spotted the locking problems.  The new function
      added by this code, tcp_defer_accept_check(), only has the
      child socket locked, yet it modifies the state of the parent
      listening socket.
      
      Fixing that is non-trivial at best, because we can't simply grab
      the parent listening socket lock at this point; doing so would
      create an ABBA deadlock.  The normal ordering is parent listening
      socket --> child socket, but this code path would require the
      reverse lock ordering.
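      
      To make the ordering problem concrete, here is a minimal userspace
      sketch (pthread mutexes standing in for the parent/child socket
      locks; all names are illustrative, not kernel code):
      
      #include <pthread.h>
      
      static pthread_mutex_t listener_lock = PTHREAD_MUTEX_INITIALIZER; /* parent */
      static pthread_mutex_t child_lock    = PTHREAD_MUTEX_INITIALIZER; /* child  */
      
      /* Normal path: lock the parent listening socket, then the child. */
      static void accept_path(void)
      {
              pthread_mutex_lock(&listener_lock);
              pthread_mutex_lock(&child_lock);
              /* ... move the request onto the accept queue ... */
              pthread_mutex_unlock(&child_lock);
              pthread_mutex_unlock(&listener_lock);
      }
      
      /* What tcp_defer_accept_check() would have needed: child first,
       * then parent.  Run concurrently with accept_path(), each thread
       * can hold one lock while waiting forever for the other (ABBA). */
      static void defer_accept_check_path(void)
      {
              pthread_mutex_lock(&child_lock);
              pthread_mutex_lock(&listener_lock);
              /* ... modify parent listening socket state ... */
              pthread_mutex_unlock(&listener_lock);
              pthread_mutex_unlock(&child_lock);
      }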
      
      Next is a problem noticed by Vitaliy Gusev:
      
      ----------------------------------------
      >--- a/net/ipv4/tcp_timer.c
      >+++ b/net/ipv4/tcp_timer.c
      >@@ -481,6 +481,11 @@ static void tcp_keepalive_timer (unsigned long data)
      > 		goto death;
      > 	}
      >
      >+	if (tp->defer_tcp_accept.request && sk->sk_state == TCP_ESTABLISHED) {
      >+		tcp_send_active_reset(sk, GFP_ATOMIC);
      >+		goto death;
      
      Here socket sk is not attached to the listening socket's request queue.
      tcp_done() will not call inet_csk_destroy_sock() (and tcp_v4_destroy_sock(),
      which should release this sk), as the socket is not DEAD. Therefore socket
      sk will be leaked, never freed.
      ----------------------------------------
      
      Finally, Alexey Kuznetsov argues that there might not even be any
      real value or advantage to these new semantics even if we fix all
      of the bugs:
      
      ----------------------------------------
      Hiding sockets with only out-of-order data from accept() is the
      only thing which is impossible with the old approach. Is this really
      so valuable? My opinion: no, this is nothing but a new loophole
      to consume memory without control.
      ----------------------------------------
      
      So revert this thing for now.
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 12 Jun 2008 (4 commits)
  5. 11 Jun 2008 (1 commit)
  6. 16 Apr 2008 (1 commit)
  7. 14 Apr 2008 (5 commits)
  8. 10 Apr 2008 (1 commit)
    • [Syncookies]: Add support for TCP options via timestamps. · 4dfc2817
      Authored by Florian Westphal
      Allow the use of SACK and window scaling when syncookies are used
      and the client supports TCP timestamps. The options are encoded into
      the timestamp sent in the SYN-ACK and restored from the timestamp
      echo when the ACK is received.
      
      Based on earlier work by Glenn Griffin.
      This patch avoids increasing the size of structs by encoding TCP
      options into the least significant bits of the timestamp and
      by not using any 'timestamp offset'.
      
      The downside is that the timestamp sent in the packet after the SYN-ACK
      will increase by several seconds.
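      
      As a sketch of the mechanism (the bit layout and names below are
      illustrative; the actual kernel constants may differ):
      
      #include <stdint.h>
      
      #define TSBITS      6u                    /* low bits carry the options */
      #define TSMASK      ((1u << TSBITS) - 1)
      #define WSCALE_MASK 0x0fu                 /* bits 0-3: window scale     */
      #define SACK_BIT    0x10u                 /* bit 4: SACK permitted      */
      
      /* Fold the client's options into the low bits of the current
       * timestamp.  Timestamps must keep increasing, so round up past
       * 'now' when needed -- which is why the value sent after the
       * SYN-ACK jumps forward. */
      static uint32_t cookie_init_timestamp(uint32_t now, uint8_t wscale,
                                            int sack_ok)
      {
              uint32_t opts = (wscale & WSCALE_MASK) | (sack_ok ? SACK_BIT : 0u);
              uint32_t ts = (now & ~TSMASK) | opts;
      
              if (ts <= now)
                      ts += TSMASK + 1;
              return ts;
      }
      
      /* Recover the options from the timestamp echoed in the final ACK. */
      static void cookie_decode_options(uint32_t tsecr, uint8_t *wscale,
                                        int *sack_ok)
      {
              *wscale  = tsecr & WSCALE_MASK;
              *sack_ok = !!(tsecr & SACK_BIT);
      }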
      
      Changes since v1:
       - don't duplicate the timestamp echo decoding function; put it into
         ipv4/syncookies.c and have ipv6/syncookies.c use it
       - feedback from Glenn Griffin: fix a line indented with spaces,
         kill a redundant if ()
      Reviewed-by: Hagen Paul Pfeifer <hagen@jauu.net>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  9. 08 Apr 2008 (1 commit)
    • [TCP]: tcp_simple_retransmit can cause S+L · 882bebaa
      Authored by Ilpo Järvinen
      This fixes Bugzilla #10384
      
      tcp_simple_retransmit increments L (lost_out) without any check
      whatsoever for S+L overflowing packets_out when Reno is in use.
      
      The simplest scenario I can currently think of is rather
      complex in practice (there might be some more straightforward
      cases, though). I.e., if the MSS is reduced during MTU probing,
      it may end up marking everything lost, and if some duplicate ACKs
      arrived prior to that, sacked_out will be non-zero as well,
      leading to S+L > packets_out; tcp_clean_rtx_queue on the next
      cumulative ACK, or tcp_fastretrans_alert on the next duplicate
      ACK, will then fix the S counter.
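      
      (For reference: S is sacked_out and L is lost_out, and the invariant
      is that their sum never exceeds packets_out.  A minimal sketch of
      that check, loosely modeled on the kernel's helpers rather than
      copied from them:)
      
      /* Illustrative only; field names follow the kernel's tcp_sock. */
      struct tcp_counters {
              unsigned int packets_out; /* segments currently in flight */
              unsigned int sacked_out;  /* "S": segments SACKed         */
              unsigned int lost_out;    /* "L": segments marked lost    */
      };
      
      /* The invariant tcp_simple_retransmit failed to preserve: */
      static int sl_invariant_holds(const struct tcp_counters *tp)
      {
              return tp->sacked_out + tp->lost_out <= tp->packets_out;
      }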
      
      A more straightforward (but questionable) solution would be to
      just call tcp_reset_reno_sack() in tcp_simple_retransmit, but
      it would negatively impact the probe's retransmission; i.e.,
      the retransmission would not occur if some duplicate ACKs
      had arrived.
      
      So I had to add resetting of the Reno sacked_out to the CA_Loss
      state when the first cumulative ACK arrives (this stale sacked_out
      might actually be the explanation for the reports of left_out
      overflows in kernels prior to 2.6.23 and the S+L overflow reports
      in 2.6.24). However, this alone won't be enough to fix kernels
      before 2.6.24, because it builds on top of commit
      1b6d427b ([TCP]: Reduce sacked_out with reno when purging
      write_queue) to keep sacked_out from overflowing.
      Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
      Reported-by: Alessandro Suardi <alessandro.suardi@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  10. 24 Mar 2008 (1 commit)
  11. 22 Mar 2008 (1 commit)
    • [TCP]: TCP_DEFER_ACCEPT updates - process as established · ec3c0982
      Authored by Patrick McManus
      Change the TCP_DEFER_ACCEPT implementation so that it transitions a
      connection to ESTABLISHED once the handshake is complete, instead of
      leaving it in SYN-RECV until some data arrives. Place the connection
      in the accept queue when the first data packet arrives from the slow
      path (a usage sketch follows the list of benefits below).
      
      Benefits:
      
       - an established connection is now reset if it never makes it to
         the accept queue
      
       - the diagnostic state of ESTABLISHED matches the packet traces
         showing a completed handshake
      
       - TCP_DEFER_ACCEPT timeouts are expressed in seconds and can now
         be enforced with reasonable accuracy instead of rounding up to
         the next exponential back-off of the SYN-ACK retry
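      
      For reference, a server opts in to deferred accept on the listening
      socket with a plain setsockopt() call; the timeout value here is
      illustrative:
      
      #include <netinet/in.h>
      #include <netinet/tcp.h>
      #include <sys/socket.h>
      
      /* Ask the kernel not to wake accept() until data has arrived on
       * the new connection, waiting at most ~5 seconds for it. */
      static int enable_defer_accept(int listen_fd)
      {
              int timeout_secs = 5;
      
              return setsockopt(listen_fd, IPPROTO_TCP, TCP_DEFER_ACCEPT,
                                &timeout_secs, sizeof(timeout_secs));
      }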
      Signed-off-by: Patrick McManus <mcmanus@ducksong.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  12. 21 Mar 2008 (2 commits)
  13. 04 Mar 2008 (1 commit)
  14. 01 Mar 2008 (1 commit)
  15. 29 Jan 2008 (13 commits)
    • [TCP]: Uninline tcp_is_cwnd_limited · cea14e0e
      Authored by Ilpo Järvinen
      net/ipv4/tcp_cong.c:
        tcp_reno_cong_avoid |  -65
       1 function changed, 65 bytes removed, diff: -65
      
      net/ipv4/arp.c:
        arp_ignore |   -5
       1 function changed, 5 bytes removed, diff: -5
      
      net/ipv4/tcp_bic.c:
        bictcp_cong_avoid |  -57
       1 function changed, 57 bytes removed, diff: -57
      
      net/ipv4/tcp_cubic.c:
        bictcp_cong_avoid |  -61
       1 function changed, 61 bytes removed, diff: -61
      
      net/ipv4/tcp_highspeed.c:
        hstcp_cong_avoid |  -63
       1 function changed, 63 bytes removed, diff: -63
      
      net/ipv4/tcp_hybla.c:
        hybla_cong_avoid |  -85
       1 function changed, 85 bytes removed, diff: -85
      
      net/ipv4/tcp_htcp.c:
        htcp_cong_avoid |  -57
       1 function changed, 57 bytes removed, diff: -57
      
      net/ipv4/tcp_veno.c:
        tcp_veno_cong_avoid |  -52
       1 function changed, 52 bytes removed, diff: -52
      
      net/ipv4/tcp_scalable.c:
        tcp_scalable_cong_avoid |  -61
       1 function changed, 61 bytes removed, diff: -61
      
      net/ipv4/tcp_yeah.c:
        tcp_yeah_cong_avoid |  -75
       1 function changed, 75 bytes removed, diff: -75
      
      net/ipv4/tcp_illinois.c:
        tcp_illinois_cong_avoid |  -54
       1 function changed, 54 bytes removed, diff: -54
      
      net/dccp/ccids/ccid3.c:
        ccid3_update_send_interval |   -7
        ccid3_hc_tx_packet_recv    |   +7
       2 functions changed, 7 bytes added, 7 bytes removed, diff: +0
      
      net/ipv4/tcp_cong.c:
        tcp_is_cwnd_limited |  +88
       1 function changed, 88 bytes added, diff: +88
      
      built-in.o:
       14 functions changed, 95 bytes added, 642 bytes removed, diff: -547
      
      ...Again, some gcc artifacts are visible as well.
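      
      The transformation behind these numbers is mechanical; a toy
      before/after sketch (the real body, which also considers ssthresh
      and burst limits, is elided, and the names are illustrative):
      
      #include <stdint.h>
      
      struct tp_sketch { uint32_t snd_cwnd; };
      
      /* Before: a static inline in a widely included header, so each of
       * the callers above carries its own copy of the body. */
      static inline int is_cwnd_limited_inlined(const struct tp_sketch *tp,
                                                uint32_t in_flight)
      {
              return in_flight >= tp->snd_cwnd; /* real body is larger */
      }
      
      /* After: one out-of-line definition in a single .c file (exported
       * in the kernel so modules can link to it); the header keeps only
       * a declaration, and every call site shrinks to a plain call. */
      int is_cwnd_limited_uninlined(const struct tp_sketch *tp,
                                    uint32_t in_flight)
      {
              return in_flight >= tp->snd_cwnd;
      }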
      Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [TCP]: Uninline tcp_set_state · 490d5046
      Authored by Ilpo Järvinen
      net/ipv4/tcp.c:
        tcp_close_state | -226
        tcp_done        | -145
        tcp_close       | -564
        tcp_disconnect  | -141
       4 functions changed, 1076 bytes removed, diff: -1076
      
      net/ipv4/tcp_input.c:
        tcp_fin               |  -86
        tcp_rcv_state_process | -164
       2 functions changed, 250 bytes removed, diff: -250
      
      net/ipv4/tcp_ipv4.c:
        tcp_v4_connect | -209
       1 function changed, 209 bytes removed, diff: -209
      
      net/ipv4/arp.c:
        arp_ignore |   +5
       1 function changed, 5 bytes added, diff: +5
      
      net/ipv6/tcp_ipv6.c:
        tcp_v6_connect | -158
       1 function changed, 158 bytes removed, diff: -158
      
      net/sunrpc/xprtsock.c:
        xs_sendpages |   -2
       1 function changed, 2 bytes removed, diff: -2
      
      net/dccp/ccids/ccid3.c:
        ccid3_update_send_interval |   +7
       1 function changed, 7 bytes added, diff: +7
      
      net/ipv4/tcp.c:
        tcp_set_state | +238
       1 function changed, 238 bytes added, diff: +238
      
      built-in.o:
       12 functions changed, 250 bytes added, 1695 bytes removed, diff: -1445
      
      I have no explanation for why some unrelated changes seem to occur
      consistently as well (arp_ignore, ccid3_update_send_interval;
      I checked the arp_ignore asm, and it seems to be due to some
      reordering of operations causing some extra opcodes to be
      generated). Still, the benefits are pretty obvious from
      codiff's results.
      Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [TCP]: Remove TCPCB_URG & TCPCB_AT_TAIL as unnecessary · 4828e7f4
      Authored by Ilpo Järvinen
      The snd_up check should be enough. I suspect this had been
      there to provide a minor optimization in clean_rtx_queue, which
      used to have a small if (!->sacked) block that could skip the
      snd_up check among the other work.
      Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [NET] CORE: Introducing new memory accounting interface. · 3ab224be
      Authored by Hideo Aoki
      This patch introduces new memory accounting functions for each network
      protocol. Most of them are renamed from the memory accounting functions
      for stream protocols. At the same time, some stream memory accounting
      functions are removed, since other functions do the same thing.
      
      Renamed:
      	sk_stream_free_skb()        -> sk_wmem_free_skb()
      	__sk_stream_mem_reclaim()   -> __sk_mem_reclaim()
      	sk_stream_mem_reclaim()     -> sk_mem_reclaim()
      	sk_stream_mem_schedule()    -> __sk_mem_schedule()
      	sk_stream_pages()           -> sk_mem_pages()
      	sk_stream_rmem_schedule()   -> sk_rmem_schedule()
      	sk_stream_wmem_schedule()   -> sk_wmem_schedule()
      	sk_charge_skb()             -> sk_mem_charge()
      
      Removed:
      	sk_stream_rfree():       consolidated into sock_rfree()
      	sk_stream_set_owner_r(): consolidated into skb_set_owner_r()
      	sk_stream_mem_schedule()
      
      Added:
      	sk_has_account():  check whether the protocol supports accounting
      	sk_mem_uncharge(): do the opposite of sk_mem_charge()
      
      In addition, to achieve consolidation, the updating of sk_wmem_queued
      is removed from sk_mem_charge().
      
      Next, to consolidate the memory accounting functions, this patch adds
      memory accounting calls to the network core functions, and the
      existing memory accounting calls are renamed to the new ones.
      
      Finally, we replace the existing memory accounting calls with the new
      interface in TCP and SCTP.
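      
      A hypothetical receive-path use of the new interface (simplified
      kernel-style sketch; real callers also deal with partial charges
      and error paths):
      
      /* Queue an incoming skb only if the protocol's memory limits
       * allow it.  sk_rmem_schedule() reserves accounted memory for the
       * skb, skb_set_owner_r() charges skb->truesize to the socket via
       * sk_mem_charge(), and sk_mem_reclaim() later returns any unused
       * reservation to the protocol's pool. */
      static int queue_skb_accounted(struct sock *sk, struct sk_buff *skb)
      {
              if (!sk_rmem_schedule(sk, skb->truesize))
                      return -ENOBUFS; /* over the protocol's memory limit */
      
              skb_set_owner_r(skb, sk);
              __skb_queue_tail(&sk->sk_receive_queue, skb);
              return 0;
      }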
      Signed-off-by: Takahiro Yasui <tyasui@redhat.com>
      Signed-off-by: Hideo Aoki <haoki@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [TCP]: Convert several length variables to unsigned. · 9cb5734e
      Authored by YOSHIFUJI Hideaki
      Several length variables cannot be negative, so convert int to
      unsigned int.  This also allows us to do sane shift operations
      on those variables.
      Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [TCP]: Abstract tp->highest_sack accessing & point to next skb · 6859d494
      Authored by Ilpo Järvinen
      Pointing to the next skb is necessary to avoid referencing
      already SACKed skbs, which will soon be on a separate list.
      Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [TCP]: Move FRTO checks out from write queue abstraction funcs · 8512430e
      Authored by Ilpo Järvinen
      A better place exists in update_send_head (other non-queue-related
      adjustments are done there as well), which is the only caller of
      tcp_advance_send_head (now that the bogus call from mtu_probe is
      gone).
      Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [TCP]: Rewrite SACK block processing & sack_recv_cache use · 68f8353b
      Authored by Ilpo Järvinen
      The key points of this patch are:
      
        - When the new SACK information is advance-only, no skb
          processing is done below the previously discovered highest point.
        - Cases below the highest point are optimized too, since there's
          no need to always go up to the highest point (which is very
          likely still present in that SACK). This is not entirely true,
          though, because I'm dropping the fastpath_skb_hint, which could
          previously optimize those cases even better. Whether that's
          significant, I'm not too sure.
      
      Currently it provides skipping by walking. Combined with an
      RB-tree, all skipping would become fast too, regardless of window
      size (this can be done incrementally later).
      
      Previously, a number of cases in TCP SACK processing failed to
      take advantage of the costly stored information in sack_recv_cache,
      most importantly for expected events such as cumulative ACKs and new
      hole ACKs. Processing such ACKs resulted in rather long walks,
      building up latencies (which easily get nasty when the window is
      huge). Those latencies are often completely unnecessary
      compared with the amount of _new_ information received; usually
      for a cumulative ACK there's no new information at all, yet TCP
      walks the whole queue unnecessarily, potentially taking a number of
      costly cache misses on the way, etc.!
      
      Since the inclusion of highest_sack, there's a lot of information
      that is very likely redundant (the SACK fastpath hint stuff,
      fackets_out, highest_sack), though there's no ultimate guarantee
      that they'll remain the same the whole time (in all unearthly
      scenarios). Take advantage of this knowledge here and drop the
      fastpath hint, using direct access to the highest SACKed skb as
      a replacement.
      
      Effectively "special cased" fastpath is dropped. This change
      adds some complexity to introduce better coveraged "fastpath",
      though the added complexity should make TCP behave more cache
      friendly.
      
      The current ACK's SACK blocks are compared against each cached
      block individually, and only the ranges that are new are then
      scanned by the expensive walk. For other parts of the write queue,
      even when inside a previously known part of the SACK blocks, a
      faster skip function is used (if necessary at all). In addition,
      whenever possible, TCP fast-forwards to the highest_sack skb that
      was made available by an earlier patch. In the typical case, nothing
      but this fast-forward and the mandatory markings after it occurs,
      making the access pattern quite similar to the former fastpath
      "special case".
      
      DSACKs are a special case that must always be walked.
      
      The copying of the local SACK blocks into recv_sack_cache could be
      more intelligent w.r.t. DSACKs, which are likely to be there only
      once, but that is left to a separate patch.
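      
      A simplified sketch of the cache comparison (hypothetical types and
      names; the real code also handles sequence-number wraparound,
      DSACKs, and the faster skip function mentioned above):
      
      #include <stdint.h>
      
      struct sack_block { uint32_t start_seq, end_seq; };
      
      /* The costly per-skb walk; only invoked on ranges not seen before. */
      static void walk_and_mark(uint32_t start, uint32_t end);
      
      static void process_sack_blocks(const struct sack_block *cur, int n_cur,
                                      const struct sack_block *cache,
                                      int n_cache)
      {
              for (int i = 0; i < n_cur; i++) {
                      uint32_t start = cur[i].start_seq;
                      uint32_t end   = cur[i].end_seq;
      
                      /* Skip any prefix already covered by a cached block. */
                      for (int j = 0; j < n_cache && start < end; j++)
                              if (cache[j].start_seq <= start &&
                                  start < cache[j].end_seq)
                                      start = cache[j].end_seq;
      
                      if (start < end)
                              walk_and_mark(start, end); /* new range only */
              }
      }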
      Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [TCP]: Convert highest_sack to sk_buff to allow direct access · a47e5a98
      Authored by Ilpo Järvinen
      It is going to replace the sack fastpath hint quite soon... :-)
      Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [TCP]: Splice receive support. · 9c55e01c
      Authored by Jens Axboe
      Support for network splice receive.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  16. 20 Nov 2007 (1 commit)
  17. 24 Oct 2007 (1 commit)
  18. 11 Oct 2007 (1 commit)