Commits · 3516ffb0fef710749daf288c0fe146503e0cf9d4 · openeuler / Kernel

03 Aug, 2007 1 commit

[TCP]: Invoke tcp_sendmsg() directly, do not use inet_sendmsg(). · 3516ffb0

Authored by David S. Miller on Aug 02, 2007

As discovered by Evgeniy Polyakov, if we try to sendmsg after
a connection reset, we can do incredibly stupid things.

The core issue is that inet_sendmsg() tries to autobind the
socket, but we should never do that for TCP.  Instead we should
just go straight into TCP's sendmsg() code which will do all
of the necessary state and pending socket error checks.

TCP's sendpage already directly vectors to tcp_sendpage(), so this
merely brings sendmsg() in line with that.
Signed-off-by: David S. Miller <davem@davemloft.net>

3516ffb0

11 Jul, 2007 1 commit

[TCPv4]: Improve BH latency in /proc/net/tcp · a7ab4b50

Authored by Herbert Xu on Jun 10, 2007

Currently the code for /proc/net/tcp disables BH while iterating
over the entire established hash table.  Even though we call
cond_resched_softirq for each entry, we still won't process
softirq's as regularly as we would otherwise do which results
in poor performance when the system is loaded near capacity.

This anomaly comes from the 2.4 code where this was all in a
single function and the local_bh_disable might have made sense
as a small optimisation.

The cost of each local_bh_disable is so small when compared
against the increased latency in keeping it disabled over a
large but mostly empty TCP established hash table that we
should just move it to the individual read_lock/read_unlock
calls as we do in inet_diag.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

a7ab4b50

13 Jun, 2007 1 commit

[TCP]: Disable TSO if MD5SIG is enabled. · 3d7dbeac

Authored by David S. Miller on Jun 12, 2007

Signed-off-by: David S. Miller <davem@davemloft.net>

3d7dbeac

08 Jun, 2007 1 commit

[TCP]: Honour sk_bound_dev_if in tcp_v4_send_ack · f0e48dbf

Authored by Patrick McHardy on Jun 04, 2007

A time_wait socket inherits sk_bound_dev_if from the original socket,
but it is not used when sending ACK packets using ip_send_reply.

Fix by passing the oif to ip_send_reply in struct ip_reply_arg and
use it for output routing.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

f0e48dbf

04 Jun, 2007 1 commit

[IPV4]: Fix "ipOutNoRoutes" counter error for TCP and UDP · 584bdf8c

Authored by Wei Dong on May 31, 2007

Signed-off-by: Wei Dong <weidong@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

584bdf8c

26 Apr, 2007 10 commits

[NET]: Treat CHECKSUM_PARTIAL as CHECKSUM_UNNECESSARY · 60476372

Authored by Herbert Xu on Apr 09, 2007

When a transmitted packet is looped back directly, CHECKSUM_PARTIAL
maps to the semantics of CHECKSUM_UNNECESSARY.  Therefore we should
treat it as such in the stack.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

60476372

[NET]: Use csum_start offset instead of skb_transport_header · 663ead3b

Authored by Herbert Xu on Apr 09, 2007

The skb transport pointer is currently used to specify the start
of the checksum region for transmit checksum offload.  Unfortunately,
the same pointer is also used during receive side processing.

This creates a problem when we want to retransmit a received
packet with partial checksums since the skb transport pointer
would be overwritten.

This patch solves this problem by creating a new 16-bit csum_start
offset value to replace the skb transport header for the purpose
of checksums.  This offset is calculated from skb->head so that
it does not have to change when skb->data changes.

No extra space is required since csum_offset itself fits within
a 16-bit word so we can use the other 16 bits for csum_start.

For backwards compatibility, just before we push a packet with
partial checksums off into the device driver, we set the skb
transport header to what it would have been under the old scheme.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

663ead3b
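
The idea behind csum_start can be illustrated with a small user-space sketch. It is not the kernel code: `toy_skb` and both helper functions are hypothetical stand-ins, reduced to the fields the commit message mentions. The point it tries to show is that an offset measured from `head` stays valid while `data` moves, and that two 16-bit fields fit in the space of one 32-bit word.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, heavily reduced stand-in for struct sk_buff. */
struct toy_skb {
	unsigned char *head;	/* start of the allocated buffer, never moves */
	unsigned char *data;	/* current packet start, moves as headers are pushed or pulled */
	uint16_t csum_start;	/* offset from head where checksumming begins */
	uint16_t csum_offset;	/* offset from csum_start where the checksum is stored */
};

/* Record the start of the checksum region as an offset from head. */
static void toy_set_csum_start(struct toy_skb *skb, unsigned char *start)
{
	skb->csum_start = (uint16_t)(start - skb->head);
}

/* Recover a pointer to the start of the checksum region. */
static unsigned char *toy_csum_start_ptr(const struct toy_skb *skb)
{
	return skb->head + skb->csum_start;
}
```

Because the offset is relative to `head`, changing `skb->data` afterwards does not affect the value returned by `toy_csum_start_ptr()`.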

[TCP]: tcp_memory_pressure and tcp_socket are __read_mostly candidates · 4103f8cd

Authored by Eric Dumazet on Mar 27, 2007

tcp_memory_pressure and tcp_socket currently share a cache line with tcp_memory_allocated, tcp_sockets_allocated.
(Very hot cache line)
It makes sense to declare these variables as __read_mostly, to avoid false sharing on SMP.

ffffffff8081d9c0 B tcp_orphan_count
ffffffff8081d9c4 B tcp_memory_allocated
ffffffff8081d9c8 B tcp_sockets_allocated
ffffffff8081d9cc B tcp_memory_pressure
ffffffff8081d9d0 b tcp_md5sig_users
ffffffff8081d9d8 b tcp_md5sig_pool
ffffffff8081d9e0 b warntime.31570
ffffffff8081d9e8 b tcp_socket
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4103f8cd

[SK_BUFF]: Introduce tcp_hdr(), remove skb->h.th · aa8223c7

Authored by Arnaldo Carvalho de Melo on Apr 10, 2007

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

aa8223c7

[TCP]: Introduce tcp_hdrlen() and tcp_optlen() · ab6a5bb6

Authored by Arnaldo Carvalho de Melo on Mar 18, 2007

The ip_hdrlen() buddy, created to reduce the number of skb->h.th-> uses and to
avoid the longer, open coded equivalent.

Ditched a no-op in bnx2 in the process.

I wonder if we should have a BUG_ON(skb->h.th->doff < 5) in tcp_optlen()...
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ab6a5bb6
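
A rough user-space sketch of what such helpers compute is given here. It is an illustration under assumptions, not the kernel source: `toy_tcphdr` is a hypothetical stand-in that keeps only the data-offset field, and the helpers take the header directly rather than an skb. The `assert` plays the role of the BUG_ON the commit message muses about.

```c
#include <assert.h>

/* Hypothetical stand-in for struct tcphdr: only the data offset,
 * i.e. the header length counted in 32-bit words, is kept. */
struct toy_tcphdr {
	unsigned int doff;
};

/* Total TCP header length in bytes (doff counts 32-bit words). */
static unsigned int toy_tcp_hdrlen(const struct toy_tcphdr *th)
{
	return th->doff * 4;
}

/* Length of the TCP options in bytes: whatever follows the fixed
 * 20-byte (5-word) header. */
static unsigned int toy_tcp_optlen(const struct toy_tcphdr *th)
{
	assert(th->doff >= 5);	/* a well-formed header has at least 5 words */
	return (th->doff - 5) * 4;
}
```

For a header with `doff` equal to 8, the sketch yields a 32-byte header with 12 bytes of options.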

[SK_BUFF]: Introduce icmp_hdr(), remove skb->h.icmph · 88c7664f

Authored by Arnaldo Carvalho de Melo on Mar 13, 2007

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

88c7664f

[SK_BUFF]: Introduce ip_hdr(), remove skb->nh.iph · eddc9ec5

Authored by Arnaldo Carvalho de Melo on Apr 20, 2007

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

eddc9ec5

[TCP]: Abstract out all write queue operations. · fe067e8a

Authored by David S. Miller on Mar 07, 2007

This allows the write queue implementation to be changed,
for example, to one which allows fast interval searching.
Signed-off-by: David S. Miller <davem@davemloft.net>

fe067e8a

[NET]: Convert xtime.tv_sec to get_seconds() · 9d729f72

Authored by James Morris on Mar 04, 2007

Where appropriate, convert references to xtime.tv_sec to the
get_seconds() helper function.
Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

9d729f72

[TCP]: struct *sock argument renamed: sp -> sk · cf4c6bf8

Authored by Ilpo Järvinen on Feb 22, 2007

In general, TCP code uses "sk" for struct sock pointer.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>

cf4c6bf8

11 Feb, 2007 1 commit

[NET] IPV4: Fix whitespace errors. · e905a9ed

Authored by YOSHIFUJI Hideaki on Feb 09, 2007

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

e905a9ed

09 Feb, 2007 3 commits

[NET]: change layout of ehash table · dbca9b27

Authored by Eric Dumazet on Feb 08, 2007

ehash table layout is currently this one :

First half of this table is used by sockets not in TIME_WAIT state
Second half of it is used by sockets in TIME_WAIT state.

This is not optimal because, for a given hash or socket, the two chain heads
are located in separate cache lines.
Moreover the locks of the second half are never used.

If instead of this halving, we use two list heads in inet_ehash_bucket instead
of only one, we probably can avoid one cache miss, and reduce ram usage,
particularly if sizeof(rwlock_t) is big (various CONFIG_DEBUG_SPINLOCK,
CONFIG_DEBUG_LOCK_ALLOC settings). So we still halve the table but we keep
related chains together to speed up lookups and socket state changes.

In this patch I did not try to align struct inet_ehash_bucket, but a future
patch could try to make this structure have a convenient size (a power of two
or a multiple of L1_CACHE_SIZE).
I guess rwlock will just vanish as soon as RCU is plugged into ehash :) , so
maybe we don't need to scratch our heads to align the bucket...

Note : In case struct inet_ehash_bucket is not a power of two, we could
probably change alloc_large_system_hash() (in case it uses __get_free_pages())
to free the unused space. It currently allocates a big zone, but the last
quarter of it could be freed. Again, this should be a temporary 'problem'.

Patch tested on ipv4 tcp only, but should be OK for IPV6 and DCCP.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dbca9b27
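
The layout change described in this commit can be pictured with the sketch below. All `toy_` types are hypothetical stand-ins, and the 24-byte lock size is an arbitrary assumption chosen to mimic a debug-enlarged rwlock_t; the real structure is inet_ehash_bucket and its actual members are not reproduced here. The sketch only compares total table size when one lock guards one chain in a doubled table versus one lock guarding two adjacent chains.

```c
#include <assert.h>
#include <stddef.h>

struct toy_chain { void *first; };		/* stand-in for a list head */
struct toy_lock  { unsigned char pad[24]; };	/* stand-in for a large rwlock_t (assumed size) */

/* Old scheme: one chain per bucket, table holds 2 * n buckets.
 * Established sockets use the first half, TIME_WAIT the second. */
struct toy_old_bucket {
	struct toy_lock lock;
	struct toy_chain chain;
};

/* New scheme: both chains share one bucket (and one lock),
 * table holds n buckets. */
struct toy_new_bucket {
	struct toy_lock lock;
	struct toy_chain chain;		/* sockets not in TIME_WAIT */
	struct toy_chain twchain;	/* sockets in TIME_WAIT */
};

static size_t toy_old_table_bytes(size_t n)
{
	return 2 * n * sizeof(struct toy_old_bucket);
}

static size_t toy_new_table_bytes(size_t n)
{
	return n * sizeof(struct toy_new_bucket);
}
```

With these assumed sizes the combined layout needs less memory for the same number of hash values, and the two chain heads for a given hash sit next to each other in memory.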

[IPV4/IPV6]: Always wait for IPSEC SA resolution in socket contexts. · 8eb9086f

Authored by David S. Miller on Feb 08, 2007

Do this even for non-blocking sockets.  This avoids the silly -EAGAIN
that applications can see now, even for non-blocking sockets in some
cases (f.e. connect()).

With help from Venkat Tekkirala.
Signed-off-by: David S. Miller <davem@davemloft.net>

8eb9086f

[TCP]: remove tcp header from tcp_v4_check (take #2) · ba7808ea

Authored by Frederik Deweerdt on Feb 04, 2007

The tcphdr struct passed to tcp_v4_check is not used, the following
patch removes it from the parameter list.

This adds the netfilter modifications missing in the patch I sent
for rc3-mm1.
Signed-off-by: Frederik Deweerdt <frederik.deweerdt@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ba7808ea

09 Jan, 2007 1 commit

[TCP]: Fix iov_len calculation in tcp_v4_send_ack(). · cb48cfe8

Authored by Craig Schlenter on Jan 09, 2007

This fixes the ftp stalls present in the current kernels.

All credit goes to Komuro <komurojun-mbn@nifty.com> for tracking
this down. The patch is untested but it looks *cough* obviously
correct.
Signed-off-by: Craig Schlenter <craig@codefountain.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

cb48cfe8

18 Dec, 2006 2 commits

[TCP]: Trivial fix to message in tcp_v4_inbound_md5_hash · a9fc00cc

Authored by Leigh Brown on Dec 17, 2006

The message logged in tcp_v4_inbound_md5_hash when the hash was expected
but not found was reversed.
Signed-off-by: Leigh Brown <leigh@solinno.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

a9fc00cc

[TCP]: Fix oops caused by tcp_v4_md5_do_del · 8228a18d

Authored by Leigh Brown on Dec 17, 2006

md5sig_info.alloced4 must be set to zero when freeing keys4, otherwise
it will not be alloc'd again when another key is added to the same
socket by tcp_v4_md5_do_add.
Signed-off-by: Leigh Brown <leigh@solinno.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

8228a18d
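
The bug class described in this commit can be reproduced in miniature. The sketch below is a hypothetical user-space model, not the kernel code: `toy_md5_info`, `toy_md5_do_add()` and `toy_md5_do_del_last()` are invented names, and keys are plain integers. It assumes, as the commit message implies, that the add path only allocates when the allocated count equals the entry count.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical model of the per-socket key bookkeeping. */
struct toy_md5_info {
	int *keys4;	/* key array (plain ints here) */
	int entries4;	/* number of keys in use */
	int alloced4;	/* number of slots allocated */
};

/* Add a key, growing the array only when it is full. */
static int toy_md5_do_add(struct toy_md5_info *info, int key)
{
	if (info->alloced4 == info->entries4) {
		int *keys = malloc(sizeof(*keys) * (info->entries4 + 1));

		if (!keys)
			return -1;
		for (int i = 0; i < info->entries4; i++)
			keys[i] = info->keys4[i];
		free(info->keys4);
		info->keys4 = keys;
		info->alloced4++;
	}
	info->keys4[info->entries4++] = key;
	return 0;
}

/* Remove the most recently added key. */
static void toy_md5_do_del_last(struct toy_md5_info *info)
{
	if (info->entries4 == 0)
		return;
	if (--info->entries4 == 0) {
		free(info->keys4);
		info->keys4 = NULL;
		/* Without this reset, the next add would see
		 * alloced4 != entries4, skip the allocation and
		 * write through a NULL pointer. */
		info->alloced4 = 0;
	}
}
```

Adding a key, deleting it, and adding another one exercises exactly the path that would crash if the reset were missing.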

03 Dec, 2006 13 commits

[TCP]: Fix warnings with TCP_MD5SIG disabled. · b6332e6c

Authored by Andrew Morton on Nov 30, 2006

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

b6332e6c

[NET]: Possible cleanups. · f5b99bcd

Authored by Adrian Bunk on Nov 30, 2006

This patch contains the following possible cleanups:
- make the following needlessly global functions static:
  - ipv4/tcp.c: __tcp_alloc_md5sig_pool()
  - ipv4/tcp_ipv4.c: tcp_v4_reqsk_md5_lookup()
  - ipv4/udplite.c: udplite_rcv()
  - ipv4/udplite.c: udplite_err()
- make the following needlessly global structs static:
  - ipv4/tcp_ipv4.c: tcp_request_sock_ipv4_ops
  - ipv4/tcp_ipv4.c: tcp_sock_ipv4_specific
  - ipv6/tcp_ipv6.c: tcp_request_sock_ipv6_ops
- net/ipv{4,6}/udplite.c: remove inline's from static functions
                          (gcc should know best when to inline them)
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

f5b99bcd

[TCP] MD5SIG: Kill CONFIG_TCP_MD5SIG_DEBUG. · 08dd1a50

Authored by David S. Miller on Nov 30, 2006

It just obfuscates the code and adds limited value.  And as Adrian
Bunk noticed, it lacked Kconfig help text too, so just kill it.
Signed-off-by: David S. Miller <davem@davemloft.net>

08dd1a50

[NET]: Split skb->csum · ff1dcadb

Authored by Al Viro on Nov 20, 2006

... into anonymous union of __wsum and __u32 (csum and csum_offset resp.)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

ff1dcadb

[NET]: Fix assorted misannotations (from md5 and udplite merges). · 8e5200f5

Authored by Al Viro on Nov 20, 2006

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

8e5200f5

[TCP_IPV4]: Use kmemdup where appropriate · f6685938

Authored by Arnaldo Carvalho de Melo on Nov 17, 2006

Also use a variable to avoid the longish tp->md5sig_info-> use
in tcp_v4_md5_do_add.

Code diff stats:

[acme@newtoy net-2.6.20]$ codiff /tmp/tcp_ipv4.o.before /tmp/tcp_ipv4.o.after
/pub/scm/linux/kernel/git/acme/net-2.6.20/net/ipv4/tcp_ipv4.c:
  tcp_v4_md5_do_add     |  -62
  tcp_v4_syn_recv_sock  |  -32
  tcp_v4_parse_md5_keys |  -86
 3 functions changed, 180 bytes removed
[acme@newtoy net-2.6.20]$
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>

f6685938

[TCP_IPV4]: CodingStyle cleanups, no code change · 7174259e

Authored by Arnaldo Carvalho de Melo on Nov 17, 2006

Mostly related to CONFIG_TCP_MD5SIG recent merge.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>

7174259e

[NET]: Annotate __skb_checksum_complete() and friends. · b51655b9

Authored by Al Viro on Nov 14, 2006

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

b51655b9

[IPV6]: Assorted trivial endianness annotations. · 714e85be

Authored by Al Viro on Nov 14, 2006

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

714e85be

[TCP]: MD5 Signature Option (RFC2385) support. · cfb6eeb4

Authored by YOSHIFUJI Hideaki on Nov 14, 2006

Based on implementation by Rick Payne.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

cfb6eeb4

[TCP/DCCP]: Introduce net_xmit_eval · b9df3cb8

Authored by Gerrit Renker on Nov 14, 2006

Throughout the TCP/DCCP (and tunnelling) code, it often happens that the
return code of a transmit function needs to be tested against NET_XMIT_CN
which is a value that does not indicate a strict error condition.

This patch uses a macro for these recurring situations which is consistent
with the already existing macro net_xmit_errno, saving on duplicated code.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>

b9df3cb8
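
A macro of the kind this commit describes might look like the sketch below. The `TOY_` constants and their numeric values are assumptions made for illustration, and `toy_net_xmit_eval` is a hypothetical name; consult the kernel headers for the real definitions. The sketch only shows the pattern: a congestion notification is mapped to 0, and every other return code passes through unchanged.

```c
#include <assert.h>

/* Illustrative return codes (values assumed for this sketch). */
#define TOY_NET_XMIT_SUCCESS	0
#define TOY_NET_XMIT_DROP	1
#define TOY_NET_XMIT_CN		2	/* congestion notification, not a strict error */

/* Map "congestion notified" to success, keep all other codes. */
#define toy_net_xmit_eval(e)	((e) == TOY_NET_XMIT_CN ? 0 : (e))

/* Example caller: replaces the open-coded
 * "if (err == TOY_NET_XMIT_CN) err = 0;" test. */
static int toy_transmit_result(int err)
{
	return toy_net_xmit_eval(err);
}
```

Callers that previously compared the result against the congestion code by hand can wrap the transmit call in the macro instead.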

[TCP]: Remove dead code in init_sequence · a94f723d

Authored by Gerrit Renker on Nov 10, 2006

This removes two redundancies:

1) The test (skb->protocol == htons(ETH_P_IPV6)) in tcp_v6_init_sequence()
   is always true, due to
	* tcp_v6_conn_request() is the only function calling this one
	* tcp_v6_conn_request() redirects all skb's with ETH_P_IP protocol to
	  tcp_v4_conn_request() [ cf. top of tcp_v6_conn_request()]

2) The first argument, `struct sock *sk' of tcp_v{4,6}_init_sequence() is
   never used.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

a94f723d

[NET]: Size listen hash tables using backlog hint · 72a3effa

Authored by Eric Dumazet on Nov 16, 2006

We currently allocate a fixed size (TCP_SYNQ_HSIZE=512) slots hash table for
each LISTEN socket, regardless of various parameters (listen backlog for
example)

On x86_64, this means order-1 allocations (might fail), even for 'small'
sockets, expecting few connections. On the contrary, a huge server wanting a
backlog of 50000 is slowed down a bit because of this fixed limit.

This patch makes the sizing of listen hash table a dynamic parameter,
depending on:
- net.core.somaxconn tunable (default is 128)
- net.ipv4.tcp_max_syn_backlog tunable (default : 256, 1024 or 128)
- backlog value given by user application  (2nd parameter of listen())

For large allocations (bigger than PAGE_SIZE), we use vmalloc() instead of
kmalloc().

We still limit memory allocation with the two existing tunables (somaxconn &
tcp_max_syn_backlog). So for standard setups, this patch actually reduces RAM
usage.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

72a3effa
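
One way to picture the dynamic sizing described in this commit is the sketch below. It is a guess at the general shape, not the patch itself: the function names are hypothetical, and both the floor of 8 slots and the rounding up to a power of two are assumptions that the commit message does not state. What it does take from the message is that the user's backlog is bounded by the two tunables.

```c
#include <assert.h>

static unsigned int toy_min(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Smallest power of two that is >= n, for 1 <= n <= 2^31. */
static unsigned int toy_roundup_pow_of_two(unsigned int n)
{
	unsigned int p = 1;

	assert(n >= 1 && n <= 0x80000000u);
	while (p < n)
		p <<= 1;
	return p;
}

/* Pick a listen hash table size from the listen() backlog,
 * bounded by the somaxconn and tcp_max_syn_backlog tunables. */
static unsigned int toy_listen_hash_size(unsigned int backlog,
					 unsigned int somaxconn,
					 unsigned int max_syn_backlog)
{
	unsigned int n = toy_min(backlog, toy_min(somaxconn, max_syn_backlog));

	if (n < 8)	/* assumed minimum table size */
		n = 8;
	return toy_roundup_pow_of_two(n);
}
```

Under these assumptions, a small server calling listen() with a backlog of 5 gets a tiny table, while a backlog of 50000 is still capped by the tunables.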

20 Oct, 2006 1 commit

[TCP]: One NET_INC_STATS() could be NET_INC_STATS_BH in tcp_v4_err() · 06ca719f

Authored by Eric Dumazet on Oct 20, 2006

I believe this NET_INC_STATS() call can be replaced by
NET_INC_STATS_BH(), a little bit cheaper.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

06ca719f

12 Oct, 2006 2 commits

[NET]: Use typesafe inet_twsk() inline function instead of cast. · 9469c7b4

Authored by YOSHIFUJI Hideaki on Oct 10, 2006

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

9469c7b4

[TCP]: Use TCPOLEN_TSTAMP_ALIGNED macro instead of magic number. · 4244f8a9

Authored by YOSHIFUJI Hideaki on Oct 10, 2006

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

4244f8a9

29 Sep, 2006 2 commits

[IPV4]: struct inet_timewait_sock annotations · 23f33c2d

Authored by Al Viro on Sep 27, 2006

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

23f33c2d

[IPV4]: annotate address in inet_request_sock · adaf345b

Authored by Al Viro on Sep 27, 2006

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

adaf345b
