Commits · c940587bf603b4295a7f5e9ff8fed123368a1ef7 · openeuler / raspberrypi-kernel

26 Oct, 2007 1 commit

[TCP]: Fix scatterlist handling in MD5 signature support. · c7da57a1

Authored by David S. Miller on Oct 26, 2007

Use sg_init_table() and sg_mark_end() as needed.
Signed-off-by: David S. Miller <davem@davemloft.net>

11 Oct, 2007 2 commits

[INET]: local port range robustness · 227b60f5

Authored by Stephen Hemminger on Oct 10, 2007

Expansion of original idea from Denis V. Lunev <den@openvz.org>

Add robustness and locking to the local_port_range sysctl.
1. Enforce that low < high when setting.
2. Use seqlock to ensure atomic update.

The locking might seem like overkill, but there are
cases where sysadmin might want to change value in the
middle of a DoS attack.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
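
The two measures described in this message (reject low >= high, publish both values atomically) can be sketched in userspace C. This is a minimal illustration under stated assumptions, not the kernel code: `struct port_range`, `port_range_set()` and `port_range_get()` are hypothetical names, C11 atomics stand in for the kernel seqlock, and writers are assumed to be serialized by the caller.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a seqlock-protected port range. */
struct port_range {
    atomic_uint seq;  /* even: stable, odd: update in progress */
    atomic_int low;
    atomic_int high;
};

/* Writer: enforce low < high, then publish both values inside one
 * sequence window. Assumes callers never run two writers at once. */
static bool port_range_set(struct port_range *r, int low, int high)
{
    assert(r != NULL);
    if (low >= high)
        return false;             /* invalid range, nothing is changed */
    atomic_fetch_add(&r->seq, 1); /* sequence becomes odd */
    atomic_store(&r->low, low);
    atomic_store(&r->high, high);
    atomic_fetch_add(&r->seq, 1); /* sequence becomes even again */
    return true;
}

/* Reader: retry until both values come from the same stable window. */
static void port_range_get(struct port_range *r, int *low, int *high)
{
    unsigned int start;

    assert(r != NULL && low != NULL && high != NULL);
    do {
        start = atomic_load(&r->seq);
        *low = atomic_load(&r->low);
        *high = atomic_load(&r->high);
    } while ((start & 1u) || start != atomic_load(&r->seq));
}
```

A reader never blocks the writer here; it only repeats its read when an update overlapped it, which is why this pattern suits a value that is read often and changed rarely.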

[NET]: Make /proc/net per network namespace · 457c4cbc

Authored by Eric W. Biederman on Sep 12, 2007

This patch makes /proc/net per network namespace. It modifies the global
variables proc_net and proc_net_stat to be per network namespace.
The proc_net file helpers are modified to take a network namespace argument,
and all of their callers are fixed to pass &init_net for that argument.
This ensures that all of the /proc/net files are only visible and
usable in the initial network namespace until the code behind them
has been updated to handle multiple network namespaces.

Making /proc/net per namespace is necessary as at least some files
in /proc/net depend upon the set of network devices which is per
network namespace, and even more files in /proc/net have contents
that are relevant to a single network namespace.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

29 Sep, 2007 1 commit

[TCP]: Fix MD5 signature handling on big-endian. · f8ab18d2

Authored by David S. Miller on Sep 28, 2007

Based upon a report and initial patch by Peter Lieven.

tcp4_md5sig_key and tcp6_md5sig_key need to start with
the exact same members as tcp_md5sig_key, because they
are both cast to that type by tcp_v{4,6}_md5_do_lookup().

Unfortunately tcp{4,6}_md5sig_key use a u16 for the key
length instead of a u8, which is what tcp_md5sig_key
uses.  This just so happens to work by accident on
little-endian, but on big-endian it doesn't.

Instead of casting, just place tcp_md5sig_key as the first member of
the address-family specific structures, adjust the access sites, and
kill off the ugly casts.
Signed-off-by: David S. Miller <davem@davemloft.net>
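
The fix described above, embedding the common structure as the first member instead of casting between look-alike layouts, can be illustrated as follows. This is a sketch with a simplified, assumed field layout (the real structures hold more state); `md5_keylen()` and `first_byte_of_u16()` are hypothetical helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified, assumed layout of the shared key structure. */
struct tcp_md5sig_key {
    uint8_t *key;
    uint8_t keylen;
};

/* Address-family specific structure with the shared part embedded. */
struct tcp4_md5sig_key {
    struct tcp_md5sig_key base; /* must stay the first member */
    uint32_t addr;
};

static_assert(offsetof(struct tcp4_md5sig_key, base) == 0,
              "shared key must be the first member");

/* Generic code reaches the shared fields through the member, no cast. */
static uint8_t md5_keylen(const struct tcp4_md5sig_key *k4)
{
    const struct tcp_md5sig_key *k = &k4->base;
    return k->keylen;
}

/* The failure mode: reading a u16 length through a u8 view returns the
 * first byte in memory. That is the low byte on little-endian machines
 * and the high byte (0 for small lengths) on big-endian ones. */
static uint8_t first_byte_of_u16(uint16_t v)
{
    uint8_t b;
    memcpy(&b, &v, 1);
    return b;
}
```

With the embedded member, the compiler guarantees that both views agree on the field types, so the result no longer depends on byte order.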

03 Aug, 2007 1 commit

[TCP]: Invoke tcp_sendmsg() directly, do not use inet_sendmsg(). · 3516ffb0

Authored by David S. Miller on Aug 02, 2007

As discovered by Evgeniy Polyakov, if we try to sendmsg after
a connection reset, we can do incredibly stupid things.

The core issue is that inet_sendmsg() tries to autobind the
socket, but we should never do that for TCP.  Instead we should
just go straight into TCP's sendmsg() code which will do all
of the necessary state and pending socket error checks.

TCP's sendpage already directly vectors to tcp_sendpage(), so this
merely brings sendmsg() in line with that.
Signed-off-by: David S. Miller <davem@davemloft.net>

11 Jul, 2007 1 commit

[TCPv4]: Improve BH latency in /proc/net/tcp · a7ab4b50

Authored by Herbert Xu on Jun 10, 2007

Currently the code for /proc/net/tcp disables BH while iterating
over the entire established hash table.  Even though we call
cond_resched_softirq for each entry, we still won't process
softirqs as regularly as we would otherwise do, which results
in poor performance when the system is loaded near capacity.

This anomaly comes from the 2.4 code where this was all in a
single function and the local_bh_disable might have made sense
as a small optimisation.

The cost of each local_bh_disable is so small when compared
against the increased latency in keeping it disabled over a
large but mostly empty TCP established hash table that we
should just move it to the individual read_lock/read_unlock
calls as we do in inet_diag.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

13 Jun, 2007 1 commit

[TCP]: Disable TSO if MD5SIG is enabled. · 3d7dbeac

Authored by David S. Miller on Jun 12, 2007

Signed-off-by: David S. Miller <davem@davemloft.net>

08 Jun, 2007 1 commit

[TCP]: Honour sk_bound_dev_if in tcp_v4_send_ack · f0e48dbf

Authored by Patrick McHardy on Jun 04, 2007

A time_wait socket inherits sk_bound_dev_if from the original socket,
but it is not used when sending ACK packets using ip_send_reply.

Fix by passing the oif to ip_send_reply in struct ip_reply_arg and
use it for output routing.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

04 Jun, 2007 1 commit

[IPV4]: Fix "ipOutNoRoutes" counter error for TCP and UDP · 584bdf8c

Authored by Wei Dong on May 31, 2007

Signed-off-by: Wei Dong <weidong@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

26 Apr, 2007 10 commits

[NET]: Treat CHECKSUM_PARTIAL as CHECKSUM_UNNECESSARY · 60476372

Authored by Herbert Xu on Apr 09, 2007

When a transmitted packet is looped back directly, CHECKSUM_PARTIAL
maps to the semantics of CHECKSUM_UNNECESSARY.  Therefore we should
treat it as such in the stack.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
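
One way to make "treat CHECKSUM_PARTIAL as CHECKSUM_UNNECESSARY" cheap is to number the states so that both share a bit, and test that bit on receive. The sketch below assumes such a numbering for illustration only; the values and the helper `csum_unnecessary()` are assumptions and should be checked against the kernel headers before relying on them.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed numbering, chosen so PARTIAL shares bit 0 with UNNECESSARY. */
enum {
    CHECKSUM_NONE = 0,
    CHECKSUM_UNNECESSARY = 1,
    CHECKSUM_COMPLETE = 2,
    CHECKSUM_PARTIAL = 3
};

static_assert((CHECKSUM_PARTIAL & CHECKSUM_UNNECESSARY) != 0,
              "PARTIAL must imply UNNECESSARY");

/* True when the receive path may skip software verification. */
static bool csum_unnecessary(int ip_summed)
{
    return (ip_summed & CHECKSUM_UNNECESSARY) != 0;
}
```

A looped-back packet that still carries CHECKSUM_PARTIAL then passes the same single-bit test as one the hardware already verified.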

[NET]: Use csum_start offset instead of skb_transport_header · 663ead3b

Authored by Herbert Xu on Apr 09, 2007

The skb transport pointer is currently used to specify the start
of the checksum region for transmit checksum offload.  Unfortunately,
the same pointer is also used during receive side processing.

This creates a problem when we want to retransmit a received
packet with partial checksums since the skb transport pointer
would be overwritten.

This patch solves this problem by creating a new 16-bit csum_start
offset value to replace the skb transport header for the purpose
of checksums.  This offset is calculated from skb->head so that
it does not have to change when skb->data changes.

No extra space is required since csum_offset itself fits within
a 16-bit word so we can use the other 16 bits for csum_start.

For backwards compatibility, just before we push a packet with
partial checksums off into the device driver, we set the skb
transport header to what it would have been under the old scheme.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
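
The key property described above is that csum_start is measured from the buffer head, so moving the data pointer does not invalidate it. A minimal sketch, assuming a hypothetical `struct mini_skb` that keeps only the fields needed here:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical miniature of an skb; only the fields used in this sketch. */
struct mini_skb {
    unsigned char *head;  /* start of the buffer, fixed for its lifetime */
    unsigned char *data;  /* start of packet data, moves on push and pull */
    uint16_t csum_start;  /* offset from head where checksumming begins */
    uint16_t csum_offset; /* offset from csum_start of the checksum field */
};

/* Record the checksum region relative to head rather than as a pointer. */
static void set_csum_region(struct mini_skb *skb,
                            unsigned char *transport_hdr,
                            uint16_t csum_offset)
{
    assert(transport_hdr >= skb->head);
    assert(transport_hdr - skb->head <= UINT16_MAX);
    skb->csum_start = (uint16_t)(transport_hdr - skb->head);
    skb->csum_offset = csum_offset;
}

/* Locate the checksum field; independent of where data currently points. */
static unsigned char *csum_field(const struct mini_skb *skb)
{
    return skb->head + skb->csum_start + skb->csum_offset;
}
```

The two 16-bit fields together occupy 32 bits, which matches the message's point that no extra space is needed.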

[TCP]: tcp_memory_pressure and tcp_socket are __read_mostly candidates · 4103f8cd

Authored by Eric Dumazet on Mar 27, 2007

tcp_memory_pressure and tcp_socket currently share a cache line with tcp_memory_allocated, tcp_sockets_allocated.
(Very hot cache line)
It makes sense to declare these variables as __read_mostly, to avoid false sharing on SMP.

ffffffff8081d9c0 B tcp_orphan_count
ffffffff8081d9c4 B tcp_memory_allocated
ffffffff8081d9c8 B tcp_sockets_allocated
ffffffff8081d9cc B tcp_memory_pressure
ffffffff8081d9d0 b tcp_md5sig_users
ffffffff8081d9d8 b tcp_md5sig_pool
ffffffff8081d9e0 b warntime.31570
ffffffff8081d9e8 b tcp_socket
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
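
The symbol addresses listed above can be checked for cache-line sharing with simple arithmetic. The sketch below assumes 64-byte cache lines, which the message does not state; `cache_line_of()` and `same_cache_line()` are hypothetical helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CACHE_LINE_BYTES 64u /* assumption: 64-byte cache lines */

static_assert((CACHE_LINE_BYTES & (CACHE_LINE_BYTES - 1u)) == 0,
              "cache line size must be a power of two");

/* Index of the cache line that contains the given address. */
static uint64_t cache_line_of(uint64_t addr)
{
    return addr / CACHE_LINE_BYTES;
}

/* Two variables can suffer false sharing when they land on one line. */
static bool same_cache_line(uint64_t a, uint64_t b)
{
    return cache_line_of(a) == cache_line_of(b);
}
```

Under that assumption, every address from tcp_orphan_count (…d9c0) through tcp_socket (…d9e8) falls in the line starting at …d9c0, so a write to any of the hot counters invalidates the line holding the mostly-read variables on other CPUs.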

[SK_BUFF]: Introduce tcp_hdr(), remove skb->h.th · aa8223c7

Authored by Arnaldo Carvalho de Melo on Apr 10, 2007

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP]: Introduce tcp_hdrlen() and tcp_optlen() · ab6a5bb6

Authored by Arnaldo Carvalho de Melo on Mar 18, 2007

The ip_hdrlen() buddy, created to reduce the number of skb->h.th-> uses and to
avoid the longer, open coded equivalent.

Ditched a no-op in bnx2 in the process.

I wonder if we should have a BUG_ON(skb->h.th->doff < 5) in tcp_optlen()...
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
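
The arithmetic behind the two helpers is short: the TCP data offset field counts 32-bit words, and a header without options is 5 words long. The sketch below uses a hypothetical `struct mini_tcphdr` and `mini_` prefixed names to make clear it is not the kernel definition; the assertion mirrors the check the message wonders about.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reduced header: doff is the data offset in 32-bit words. */
struct mini_tcphdr {
    uint8_t doff;
};

#define TCP_MIN_HDR_WORDS 5u /* a TCP header without options is 20 bytes */

/* Total header length in bytes. */
static unsigned int mini_tcp_hdrlen(const struct mini_tcphdr *th)
{
    return th->doff * 4u;
}

/* Length of the options area in bytes. */
static unsigned int mini_tcp_optlen(const struct mini_tcphdr *th)
{
    assert(th->doff >= TCP_MIN_HDR_WORDS); /* a smaller value is malformed */
    return (th->doff - TCP_MIN_HDR_WORDS) * 4u;
}
```

Without the lower-bound check, a malformed header with doff below 5 would make the unsigned subtraction wrap around to a very large option length.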

[SK_BUFF]: Introduce icmp_hdr(), remove skb->h.icmph · 88c7664f

Authored by Arnaldo Carvalho de Melo on Mar 13, 2007

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[SK_BUFF]: Introduce ip_hdr(), remove skb->nh.iph · eddc9ec5

Authored by Arnaldo Carvalho de Melo on Apr 20, 2007

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP]: Abstract out all write queue operations. · fe067e8a

Authored by David S. Miller on Mar 07, 2007

This allows the write queue implementation to be changed,
for example, to one which allows fast interval searching.
Signed-off-by: David S. Miller <davem@davemloft.net>

[NET]: Convert xtime.tv_sec to get_seconds() · 9d729f72

Authored by James Morris on Mar 04, 2007

Where appropriate, convert references to xtime.tv_sec to the
get_seconds() helper function.
Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP]: struct *sock argument renamed: sp -> sk · cf4c6bf8

Authored by Ilpo Järvinen on Feb 22, 2007

In general, TCP code uses "sk" for struct sock pointer.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>

11 Feb, 2007 1 commit

[NET] IPV4: Fix whitespace errors. · e905a9ed

Authored by YOSHIFUJI Hideaki on Feb 09, 2007

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

09 Feb, 2007 3 commits

[NET]: change layout of ehash table · dbca9b27

Authored by Eric Dumazet on Feb 08, 2007

ehash table layout is currently this one :

First half of this table is used by sockets not in TIME_WAIT state
Second half of it is used by sockets in TIME_WAIT state.

This is suboptimal because, for a given hash or socket, the two chain heads
are located in separate cache lines.
Moreover the locks of the second half are never used.

If instead of this halving, we use two list heads in inet_ehash_bucket instead
of only one, we probably can avoid one cache miss, and reduce RAM usage,
particularly if sizeof(rwlock_t) is big (various CONFIG_DEBUG_SPINLOCK,
CONFIG_DEBUG_LOCK_ALLOC settings). So we still halve the table but we keep
related chains together to speed up lookups and socket state changes.

In this patch I did not try to align struct inet_ehash_bucket, but a future
patch could try to make this structure have a convenient size (a power of two
or a multiple of L1_CACHE_SIZE).
I guess rwlock will just vanish as soon as RCU is plugged into ehash :) , so
maybe we don't need to scratch our heads to align the bucket...

Note: In case the size of struct inet_ehash_bucket is not a power of two, we
could probably change alloc_large_system_hash() (in case it uses
__get_free_pages()) to free the unused space. It currently allocates a big
zone, but the last quarter of it could be freed. Again, this should be a
temporary 'problem'.

Patch tested on ipv4 tcp only, but should be OK for IPV6 and DCCP.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
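
The layout change can be pictured as follows. This is a sketch only: the field names, the `int` stand-in for rwlock_t, and the helpers `bucket_for()` and `old_tw_index()` are assumptions made for illustration, and the table size is assumed to be a power of two.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    struct node *next;
};

/* Hypothetical bucket: both chains for one hash value sit side by side,
 * guarded by a single lock. */
struct ehash_bucket {
    int lock;             /* stand-in for rwlock_t */
    struct node *chain;   /* sockets not in TIME_WAIT */
    struct node *twchain; /* sockets in TIME_WAIT */
};

/* New layout: one lookup reaches both chain heads. */
static struct ehash_bucket *bucket_for(struct ehash_bucket *table,
                                       size_t nbuckets, unsigned int hash)
{
    assert(nbuckets != 0 && (nbuckets & (nbuckets - 1)) == 0);
    return &table[hash & (nbuckets - 1)];
}

/* Old layout for comparison: the TIME_WAIT chain for a hash lived
 * half_size buckets further on, typically in another cache line. */
static size_t old_tw_index(unsigned int hash, size_t half_size)
{
    assert(half_size != 0 && (half_size & (half_size - 1)) == 0);
    return (hash & (half_size - 1)) + half_size;
}
```

Moving a socket from established to TIME_WAIT then touches one bucket and one lock instead of two distant buckets.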

[IPV4/IPV6]: Always wait for IPSEC SA resolution in socket contexts. · 8eb9086f

Authored by David S. Miller on Feb 08, 2007

Do this even for non-blocking sockets.  This avoids the silly -EAGAIN
that applications can see now, even for non-blocking sockets in some
cases (e.g. connect()).

With help from Venkat Tekkirala.
Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP]: remove tcp header from tcp_v4_check (take #2) · ba7808ea

Authored by Frederik Deweerdt on Feb 04, 2007

The tcphdr struct passed to tcp_v4_check is not used; the following
patch removes it from the parameter list.

This adds the netfilter modifications missing in the patch I sent
for rc3-mm1.
Signed-off-by: Frederik Deweerdt <frederik.deweerdt@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

09 Jan, 2007 1 commit

[TCP]: Fix iov_len calculation in tcp_v4_send_ack(). · cb48cfe8

Authored by Craig Schlenter on Jan 09, 2007

This fixes the ftp stalls present in the current kernels.

All credit goes to Komuro <komurojun-mbn@nifty.com> for tracking
this down. The patch is untested but it looks *cough* obviously
correct.
Signed-off-by: Craig Schlenter <craig@codefountain.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

18 Dec, 2006 2 commits

[TCP]: Trivial fix to message in tcp_v4_inbound_md5_hash · a9fc00cc

Authored by Leigh Brown on Dec 17, 2006

The message logged in tcp_v4_inbound_md5_hash when the hash was expected
but not found was reversed.
Signed-off-by: Leigh Brown <leigh@solinno.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP]: Fix oops caused by tcp_v4_md5_do_del · 8228a18d

Authored by Leigh Brown on Dec 17, 2006

md5sig_info.alloced4 must be set to zero when freeing keys4, otherwise
it will not be alloc'd again when another key is added to the same
socket by tcp_v4_md5_do_add.
Signed-off-by: Leigh Brown <leigh@solinno.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
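
The bug class here is a capacity counter that outlives the array it describes. The sketch below reproduces the pattern in userspace with a hypothetical `struct md5_info` and helpers `md5_free_keys()` and `md5_add_key()`; only the field names keys4 and alloced4 come from the message, everything else is assumed.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reduced bookkeeping for a growable key array. */
struct md5_info {
    int *keys4;
    int entries4; /* slots in use */
    int alloced4; /* slots allocated */
};

/* Free the array and reset the counters together. */
static void md5_free_keys(struct md5_info *info)
{
    free(info->keys4);
    info->keys4 = NULL;
    info->entries4 = 0;
    info->alloced4 = 0; /* omit this and add() will skip reallocating */
}

/* Append a key, growing the array when it is full. Returns 0 on success. */
static int md5_add_key(struct md5_info *info, int key)
{
    assert(info->entries4 <= info->alloced4);
    if (info->entries4 == info->alloced4) {
        int ncap = info->alloced4 + 1;
        int *nkeys = realloc(info->keys4, (size_t)ncap * sizeof(*nkeys));

        if (nkeys == NULL)
            return -1;
        info->keys4 = nkeys;
        info->alloced4 = ncap;
    }
    info->keys4[info->entries4++] = key;
    return 0;
}
```

If alloced4 were left non-zero after the free, the next add would see spare capacity, skip the allocation, and write through a NULL pointer, which matches the oops described above.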

03 Dec, 2006 13 commits

[TCP]: Fix warnings with TCP_MD5SIG disabled. · b6332e6c

Authored by Andrew Morton on Nov 30, 2006

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NET]: Possible cleanups. · f5b99bcd

Authored by Adrian Bunk on Nov 30, 2006

This patch contains the following possible cleanups:
- make the following needlessly global functions static:
  - ipv4/tcp.c: __tcp_alloc_md5sig_pool()
  - ipv4/tcp_ipv4.c: tcp_v4_reqsk_md5_lookup()
  - ipv4/udplite.c: udplite_rcv()
  - ipv4/udplite.c: udplite_err()
- make the following needlessly global structs static:
  - ipv4/tcp_ipv4.c: tcp_request_sock_ipv4_ops
  - ipv4/tcp_ipv4.c: tcp_sock_ipv4_specific
  - ipv6/tcp_ipv6.c: tcp_request_sock_ipv6_ops
- net/ipv{4,6}/udplite.c: remove inline's from static functions
                          (gcc should know best when to inline them)
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP] MD5SIG: Kill CONFIG_TCP_MD5SIG_DEBUG. · 08dd1a50

Authored by David S. Miller on Nov 30, 2006

It just obfuscates the code and adds limited value.  And as Adrian
Bunk noticed, it lacked Kconfig help text too, so just kill it.
Signed-off-by: David S. Miller <davem@davemloft.net>

[NET]: Split skb->csum · ff1dcadb

Authored by Al Viro on Nov 20, 2006

... into anonymous union of __wsum and __u32 (csum and csum_offset respectively)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NET]: Fix assorted misannotations (from md5 and udplite merges). · 8e5200f5

Authored by Al Viro on Nov 20, 2006

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP_IPV4]: Use kmemdup where appropriate · f6685938

Authored by Arnaldo Carvalho de Melo on Nov 17, 2006

Also use a variable to avoid the longish tp->md5sig_info-> use
in tcp_v4_md5_do_add.

Code diff stats:

[acme@newtoy net-2.6.20]$ codiff /tmp/tcp_ipv4.o.before /tmp/tcp_ipv4.o.after
/pub/scm/linux/kernel/git/acme/net-2.6.20/net/ipv4/tcp_ipv4.c:
  tcp_v4_md5_do_add     |  -62
  tcp_v4_syn_recv_sock  |  -32
  tcp_v4_parse_md5_keys |  -86
 3 functions changed, 180 bytes removed
[acme@newtoy net-2.6.20]$
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>

[TCP_IPV4]: CodingStyle cleanups, no code change · 7174259e

Authored by Arnaldo Carvalho de Melo on Nov 17, 2006

Mostly related to CONFIG_TCP_MD5SIG recent merge.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>

[NET]: Annotate __skb_checksum_complete() and friends. · b51655b9

Authored by Al Viro on Nov 14, 2006

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

[IPV6]: Assorted trivial endianness annotations. · 714e85be

Authored by Al Viro on Nov 14, 2006

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP]: MD5 Signature Option (RFC2385) support. · cfb6eeb4

Authored by YOSHIFUJI Hideaki on Nov 14, 2006

Based on implementation by Rick Payne.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP/DCCP]: Introduce net_xmit_eval · b9df3cb8

Authored by Gerrit Renker on Nov 14, 2006

Throughout the TCP/DCCP (and tunnelling) code, it often happens that the
return code of a transmit function needs to be tested against NET_XMIT_CN
which is a value that does not indicate a strict error condition.

This patch uses a macro for these recurring situations which is consistent
with the already existing macro net_xmit_errno, saving on duplicated code.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
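
A macro of the kind described can be sketched as below. The numeric values are assumptions for illustration and should be checked against the kernel headers; `send_and_eval()` is a hypothetical caller showing the intended use.

```c
#include <assert.h>

/* Assumed values for illustration only. */
#define NET_XMIT_SUCCESS 0
#define NET_XMIT_CN 2 /* congestion notification, not a hard error */

static_assert(NET_XMIT_CN != NET_XMIT_SUCCESS,
              "CN must be distinguishable from success");

/* Map the "congestion noticed" result to success, pass the rest through. */
#define net_xmit_eval(e) ((e) == NET_XMIT_CN ? 0 : (e))

/* Hypothetical caller: turn a transmit result into a return code. */
static int send_and_eval(int xmit_result)
{
    return net_xmit_eval(xmit_result);
}
```

Callers that previously open-coded the comparison against NET_XMIT_CN can return the macro's result directly, while real errors such as negative errno values are left untouched.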

[TCP]: Remove dead code in init_sequence · a94f723d

Authored by Gerrit Renker on Nov 10, 2006

This removes two redundancies:

1) The test (skb->protocol == htons(ETH_P_IPV6)) in tcp_v6_init_sequence()
   is always true, due to
	* tcp_v6_conn_request() is the only function calling this one
	* tcp_v6_conn_request() redirects all skb's with ETH_P_IP protocol to
	  tcp_v4_conn_request() [ cf. top of tcp_v6_conn_request()]

2) The first argument, `struct sock *sk' of tcp_v{4,6}_init_sequence() is
   never used.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NET]: Size listen hash tables using backlog hint · 72a3effa

Authored by Eric Dumazet on Nov 16, 2006

We currently allocate a fixed-size hash table (TCP_SYNQ_HSIZE=512 slots) for
each LISTEN socket, regardless of various parameters (the listen backlog, for
example).

On x86_64, this means order-1 allocations (might fail), even for 'small'
sockets, expecting few connections. On the contrary, a huge server wanting a
backlog of 50000 is slowed down a bit because of this fixed limit.

This patch makes the sizing of the listen hash table a dynamic parameter,
depending on:
- net.core.somaxconn tunable (default is 128)
- net.ipv4.tcp_max_syn_backlog tunable (default : 256, 1024 or 128)
- backlog value given by user application  (2nd parameter of listen())

For large allocations (bigger than PAGE_SIZE), we use vmalloc() instead of
kmalloc().

We still limit memory allocation with the two existing tunables (somaxconn &
tcp_max_syn_backlog). So for standard setups, this patch actually reduces RAM
usage.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
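
The sizing policy described above can be sketched as a small function: clamp the application's backlog by the two tunables, then round up to a power of two so the table can be indexed with a mask. This is an illustration, not the patch itself; `listen_hash_slots()`, the floor of 8 slots, and the rounding of n + 1 are assumptions made for this sketch.

```c
#include <assert.h>

static unsigned int min_u(unsigned int a, unsigned int b)
{
    return a < b ? a : b;
}

/* Smallest power of two that is >= n. */
static unsigned int roundup_pow2(unsigned int n)
{
    unsigned int p = 1;

    assert(n <= (1u << 31)); /* keeps the shift below from overflowing */
    while (p < n)
        p <<= 1;
    return p;
}

/* Hypothetical policy: slot count derived from the backlog hint. */
static unsigned int listen_hash_slots(unsigned int backlog,
                                      unsigned int somaxconn,
                                      unsigned int max_syn_backlog)
{
    unsigned int n = min_u(backlog, somaxconn);

    n = min_u(n, max_syn_backlog);
    if (n < 8)
        n = 8; /* arbitrary floor chosen for this sketch */
    return roundup_pow2(n + 1);
}
```

With tunables of 128 and 256, a small server asking for a backlog of 5 gets 16 slots rather than 512, while a large request is capped by the tunables unless the administrator raises them.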

20 Oct, 2006 1 commit

[TCP]: One NET_INC_STATS() could be NET_INC_STATS_BH in tcp_v4_err() · 06ca719f

Authored by Eric Dumazet on Oct 20, 2006

I believe this NET_INC_STATS() call can be replaced by
NET_INC_STATS_BH(), a little bit cheaper.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>