Commits · d6701191329b51793bc56724548f0863d2149c29 · openeuler / Kernel

21 Nov, 2007 2 commits

[IPV4] TCPMD5: Use memmove() instead of memcpy() because we have overlaps. · 354faf09

Committed by YOSHIFUJI Hideaki on Nov 20, 2007

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[IPV4] TCPMD5: Omit redundant NULL check for kfree() argument. · a80cc20d

Committed by YOSHIFUJI Hideaki on Nov 20, 2007

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

07 Nov, 2007 2 commits

[INET]: Remove per bucket rwlock in tcp/dccp ehash table. · 230140cf

Committed by Eric Dumazet on Nov 07, 2007

As done two years ago on the IP route cache table (commit
22c047cc), we can avoid using one
lock per hash bucket for the huge TCP/DCCP hash tables.

On a typical x86_64 platform, this saves about 2MB or 4MB of RAM, for
little performance difference. (We hit a different cache line for the
rwlock, but then the bucket cache line has a better sharing factor
among CPUs, since we dirty it less often.) For netstat or ss commands
that want a full scan of the hash table, we perform fewer memory accesses.

Using a 'small' table of hashed rwlocks should be more than enough to
provide correct SMP concurrency between different buckets, without
using too much memory. Sizing of this table depends on
num_possible_cpus() and various CONFIG settings.

This patch provides some locking abstraction that may ease future
work using a different model for the TCP/DCCP table.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
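As a rough illustration of the "small table of hashed rwlocks" idea in the commit above, the sketch below maps a socket hash both to a bucket and to one of a much smaller set of locks. It is not the kernel code: the table sizes, the `demo_` names, and the placeholder lock type are assumptions made for this example only.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sizes, chosen only for illustration (both powers of two). */
#define DEMO_EHASH_BUCKETS 65536u  /* number of hash buckets */
#define DEMO_EHASH_LOCKS   256u    /* much smaller table of shared locks */

/* Placeholder for rwlock_t; a real lock would go here. */
struct demo_lock {
    int unused;
};

static struct demo_lock demo_lock_table[DEMO_EHASH_LOCKS];

/* Map a socket hash to its bucket index. */
static inline uint32_t demo_bucket_index(uint32_t hash)
{
    return hash & (DEMO_EHASH_BUCKETS - 1u);
}

/* Map the same hash to one of the shared locks: many buckets per lock. */
static inline struct demo_lock *demo_bucket_lock(uint32_t hash)
{
    return &demo_lock_table[hash & (DEMO_EHASH_LOCKS - 1u)];
}

/* Memory no longer spent on locks, compared with one lock per bucket. */
static inline size_t demo_lock_bytes_saved(size_t lock_size)
{
    return (size_t)(DEMO_EHASH_BUCKETS - DEMO_EHASH_LOCKS) * lock_size;
}
```

With these assumed sizes, the hashes 0x12345 and 0x12A45 fall into different buckets but share a lock, which is the trade-off the commit message describes: less memory in exchange for occasional sharing of a lock between unrelated buckets.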

[IPV4]: Use the {DEFINE|REF}_PROTO_INUSE infrastructure · 47a31a6f

Committed by Eric Dumazet on Nov 05, 2007

Trivial patch to make the "tcp,udp,udplite,raw" protocols use the fast
"inuse sockets" infrastructure.

Each protocol then uses a static percpu var, instead of a dynamic one.
This saves some RAM and some CPU cycles.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

02 Nov, 2007 1 commit

[SG] Get rid of __sg_mark_end() · c46f2334

Committed by Jens Axboe on Oct 31, 2007

sg_mark_end() overwrites the page_link information, but all users want
__sg_mark_end() behaviour where we just set the end bit. That is the most
natural way to use the sg list, since you'll fill it in and then mark the
end point.

So change sg_mark_end() to only set the termination bit. Add a sg_magic
debug check as well, and clear a chain pointer if it is set.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
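The behaviour described in the commit above (set only the termination bit, keep the page pointer, clear a chain marker) can be sketched with a tagged pointer. This is a simplified model, not the kernel's struct scatterlist: the `demo_` names and the reduced structure are assumptions, and the flag values are chosen for illustration.

```c
#include <stdint.h>

/* Hypothetical, reduced entry: a page pointer with flags in its low 2 bits. */
struct demo_sg {
    uintptr_t page_link;
};

#define DEMO_SG_CHAIN ((uintptr_t)0x1)  /* entry points to another list */
#define DEMO_SG_END   ((uintptr_t)0x2)  /* entry terminates the list */
#define DEMO_SG_FLAGS (DEMO_SG_CHAIN | DEMO_SG_END)

/* Store a page pointer (assumed at least 4-byte aligned), keeping the flags. */
static inline void demo_sg_set_page(struct demo_sg *sg, void *page)
{
    sg->page_link = (sg->page_link & DEMO_SG_FLAGS) | (uintptr_t)page;
}

/* Recover the page pointer by masking off the flag bits. */
static inline void *demo_sg_page(const struct demo_sg *sg)
{
    return (void *)(sg->page_link & ~DEMO_SG_FLAGS);
}

/* Mark the entry as last without touching the page pointer. */
static inline void demo_sg_mark_end(struct demo_sg *sg)
{
    sg->page_link |= DEMO_SG_END;
    sg->page_link &= ~DEMO_SG_CHAIN;
}

static inline int demo_sg_is_last(const struct demo_sg *sg)
{
    return (sg->page_link & DEMO_SG_END) != 0;
}
```

The point of the sketch is the order of use the message calls natural: fill in the entry first, then mark the end, and the pointer stored earlier is still readable afterwards.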

31 Oct, 2007 1 commit

[NET]: Fix incorrect sg_mark_end() calls. · 51c739d1

Committed by David S. Miller on Oct 30, 2007

This fixes scatterlist corruptions added by

	commit 68e3f5dd
	[CRYPTO] users: Fix up scatterlist conversion errors

The issue is that the code calls sg_mark_end() which clobbers the
sg_page() pointer of the final scatterlist entry.

The first part of the fix makes skb_to_sgvec() do __sg_mark_end().

After considering all skb_to_sgvec() call sites the most correct
solution is to call __sg_mark_end() in skb_to_sgvec() since that is
what all of the callers would end up doing anyways.

I suspect this might have fixed some problems in virtio_net which is
the sole non-crypto user of skb_to_sgvec().

Other similar sg_mark_end() cases were converted over to
__sg_mark_end() as well.

Arguably sg_mark_end() is a poorly named function because it doesn't
just "mark", it clears out the page pointer as a side effect, which is
what led to these bugs in the first place.

The one remaining plain sg_mark_end() call is in scsi_alloc_sgtable()
and arguably it could be converted to __sg_mark_end() if only so that
we can delete this confusing interface from linux/scatterlist.h
Signed-off-by: David S. Miller <davem@davemloft.net>

30 Oct, 2007 1 commit

[TCP] MD5: Remove some more unnecessary casting. · b0a713e9

Committed by Matthias M. Dellweg on Oct 29, 2007

While reviewing the tcp_md5-related code further I came across
another two of these casts which you have probably missed. I don't
actually think that they pose a problem for now, but as you said we
should remove them.
Signed-off-by: Matthias M. Dellweg <2500@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

26 Oct, 2007 1 commit

[TCP]: Fix scatterlist handling in MD5 signature support. · c7da57a1

Committed by David S. Miller on Oct 26, 2007

Use sg_init_table() and sg_mark_end() as needed.
Signed-off-by: David S. Miller <davem@davemloft.net>

11 Oct, 2007 2 commits

[INET]: local port range robustness · 227b60f5

Committed by Stephen Hemminger on Oct 10, 2007

Expansion of original idea from Denis V. Lunev <den@openvz.org>

Add robustness and locking to the local_port_range sysctl.
1. Enforce that low < high when setting.
2. Use seqlock to ensure atomic update.

The locking might seem like overkill, but there are
cases where sysadmin might want to change value in the
middle of a DoS attack.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
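The two points in the commit above (reject an inverted range, and let readers see low and high as a consistent pair) can be sketched with a sequence counter in user-space C11. This is only a model of the idea, not the kernel's seqlock API: the `demo_` names and the structure are assumptions, a single writer at a time is assumed, and the default sequentially consistent atomics are used for simplicity.

```c
#include <stdatomic.h>

/* Hypothetical container for the two sysctl values. */
struct demo_port_range {
    atomic_uint seq;   /* even: stable, odd: update in progress */
    atomic_int  low;
    atomic_int  high;
};

/* Writer: validate first, then bump the counter around the update. */
static int demo_set_range(struct demo_port_range *r, int low, int high)
{
    if (low >= high)
        return -1;                    /* enforce low < high */

    unsigned int s = atomic_load(&r->seq);
    atomic_store(&r->seq, s + 1u);    /* odd: readers must retry */
    atomic_store(&r->low, low);
    atomic_store(&r->high, high);
    atomic_store(&r->seq, s + 2u);    /* even again: update complete */
    return 0;
}

/* Reader: retry until the counter is even and unchanged across the reads. */
static void demo_get_range(struct demo_port_range *r, int *low, int *high)
{
    unsigned int before, after;

    do {
        before = atomic_load(&r->seq);
        *low   = atomic_load(&r->low);
        *high  = atomic_load(&r->high);
        after  = atomic_load(&r->seq);
    } while ((before & 1u) || before != after);
}
```

A reader never blocks the writer in this scheme; it only repeats its reads if an update raced with it, which fits the case in the message where the value is changed while the system is under heavy load.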

[NET]: Make /proc/net per network namespace · 457c4cbc

Committed by Eric W. Biederman on Sep 12, 2007

This patch makes /proc/net per network namespace. It modifies the global
variables proc_net and proc_net_stat to be per network namespace.
The proc_net file helpers are modified to take a network namespace argument,
and all of their callers are fixed to pass &init_net for that argument.
This ensures that all of the /proc/net files are only visible and
usable in the initial network namespace until the code behind them
has been updated to handle multiple network namespaces.

Making /proc/net per namespace is necessary as at least some files
in /proc/net depend upon the set of network devices which is per
network namespace, and even more files in /proc/net have contents
that are relevant to a single network namespace.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

29 Sep, 2007 1 commit

[TCP]: Fix MD5 signature handling on big-endian. · f8ab18d2

Committed by David S. Miller on Sep 28, 2007

Based upon a report and initial patch by Peter Lieven.

tcp4_md5sig_key and tcp6_md5sig_key need to start with
the exact same members as tcp_md5sig_key, because they
are both cast to that type by tcp_v{4,6}_md5_do_lookup().

Unfortunately tcp{4,6}_md5sig_key use a u16 for the key
length instead of a u8, which is what tcp_md5sig_key
uses.  This just so happens to work by accident on
little-endian, but on big-endian it doesn't.

Instead of casting, just place tcp_md5sig_key as the first member of
the address-family specific structures, adjust the access sites, and
kill off the ugly casts.
Signed-off-by: David S. Miller <davem@davemloft.net>
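The fix described in the commit above replaces a cast between look-alike structures with embedding the common structure as the first member. The sketch below shows that pattern and, separately, why reading a u16 length through a u8 field depends on byte order. The `demo_` names and the reduced field sets are assumptions for illustration, not the kernel definitions.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, reduced version of the common key structure. */
struct demo_md5sig_key {
    uint8_t *key;
    uint8_t  keylen;
};

/* Address-family specific keys embed the common part as their first member. */
struct demo_tcp4_md5sig_key {
    struct demo_md5sig_key base;
    uint32_t addr;        /* IPv4 peer address */
};

struct demo_tcp6_md5sig_key {
    struct demo_md5sig_key base;
    uint8_t addr[16];     /* IPv6 peer address */
};

/* Generic code takes the common type; callers pass &key->base, no cast. */
static inline unsigned int demo_keylen(const struct demo_md5sig_key *k)
{
    return k->keylen;
}

/*
 * What the old cast effectively did: look at the first byte of a 16-bit
 * length. On little-endian that byte is the low byte (the real length for
 * small values); on big-endian it is the high byte, typically 0.
 */
static inline uint8_t demo_first_byte_of_u16(uint16_t v)
{
    uint8_t b;

    memcpy(&b, &v, 1);
    return b;
}
```

Because `base` is the first member, a pointer to the outer structure and a pointer to `base` refer to the same address, so lookups that return the common type remain cheap while the layout mismatch is ruled out by the compiler.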

03 Aug, 2007 1 commit

[TCP]: Invoke tcp_sendmsg() directly, do not use inet_sendmsg(). · 3516ffb0

Committed by David S. Miller on Aug 02, 2007

As discovered by Evegniy Polyakov, if we try to sendmsg after
a connection reset, we can do incredibly stupid things.

The core issue is that inet_sendmsg() tries to autobind the
socket, but we should never do that for TCP.  Instead we should
just go straight into TCP's sendmsg() code which will do all
of the necessary state and pending socket error checks.

TCP's sendpage already directly vectors to tcp_sendpage(), so this
merely brings sendmsg() in line with that.
Signed-off-by: David S. Miller <davem@davemloft.net>

11 Jul, 2007 1 commit

[TCPv4]: Improve BH latency in /proc/net/tcp · a7ab4b50

Committed by Herbert Xu on Jun 10, 2007

Currently the code for /proc/net/tcp disables BH while iterating
over the entire established hash table.  Even though we call
cond_resched_softirq for each entry, we still won't process
softirq's as regularly as we would otherwise do which results
in poor performance when the system is loaded near capacity.

This anomaly comes from the 2.4 code where this was all in a
single function and the local_bh_disable might have made sense
as a small optimisation.

The cost of each local_bh_disable is so small when compared
against the increased latency in keeping it disabled over a
large but mostly empty TCP established hash table that we
should just move it to the individual read_lock/read_unlock
calls as we do in inet_diag.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

13 Jun, 2007 1 commit

[TCP]: Disable TSO if MD5SIG is enabled. · 3d7dbeac

Committed by David S. Miller on Jun 12, 2007

Signed-off-by: David S. Miller <davem@davemloft.net>

08 Jun, 2007 1 commit

[TCP]: Honour sk_bound_dev_if in tcp_v4_send_ack · f0e48dbf

Committed by Patrick McHardy on Jun 04, 2007

A time_wait socket inherits sk_bound_dev_if from the original socket,
but it is not used when sending ACK packets using ip_send_reply.

Fix by passing the oif to ip_send_reply in struct ip_reply_arg and
use it for output routing.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

04 Jun, 2007 1 commit

[IPV4]: Fix "ipOutNoRoutes" counter error for TCP and UDP · 584bdf8c

Committed by Wei Dong on May 31, 2007

Signed-off-by: Wei Dong <weidong@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

26 Apr, 2007 10 commits

[NET]: Treat CHECKSUM_PARTIAL as CHECKSUM_UNNECESSARY · 60476372

Committed by Herbert Xu on Apr 09, 2007

When a transmitted packet is looped back directly, CHECKSUM_PARTIAL
maps to the semantics of CHECKSUM_UNNECESSARY.  Therefore we should
treat it as such in the stack.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NET]: Use csum_start offset instead of skb_transport_header · 663ead3b

Committed by Herbert Xu on Apr 09, 2007

The skb transport pointer is currently used to specify the start
of the checksum region for transmit checksum offload.  Unfortunately,
the same pointer is also used during receive side processing.

This creates a problem when we want to retransmit a received
packet with partial checksums since the skb transport pointer
would be overwritten.

This patch solves this problem by creating a new 16-bit csum_start
offset value to replace the skb transport header for the purpose
of checksums.  This offset is calculated from skb->head so that
it does not have to change when skb->data changes.

No extra space is required since csum_offset itself fits within
a 16-bit word so we can use the other 16 bits for csum_start.

For backwards compatibility, just before we push a packet with
partial checksums off into the device driver, we set the skb
transport header to what it would have been under the old scheme.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
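To make the csum_start idea in the commit above concrete, the sketch below stores the start of the checksum region as a 16-bit offset from the buffer head, so that moving the data pointer does not disturb it. The structure is a heavily reduced stand-in for struct sk_buff, and all `demo_` names are assumptions for this example.

```c
#include <stdint.h>

/* Hypothetical, heavily reduced packet buffer. */
struct demo_skb {
    unsigned char *head;   /* start of the allocated buffer, never moves */
    unsigned char *data;   /* start of the current packet data, moves */
    uint16_t csum_start;   /* offset from head where checksumming starts */
    uint16_t csum_offset;  /* offset from csum_start where the sum is stored */
};

/* Record the checksum start relative to head, not as a header pointer. */
static inline void demo_set_csum_start(struct demo_skb *skb,
                                       unsigned char *transport_header)
{
    skb->csum_start = (uint16_t)(transport_header - skb->head);
}

/* Recover the address where checksumming begins. */
static inline unsigned char *demo_csum_region(const struct demo_skb *skb)
{
    return skb->head + skb->csum_start;
}

/* Pulling headers moves data only; csum_start stays valid. */
static inline void demo_skb_pull(struct demo_skb *skb, unsigned int len)
{
    skb->data += len;
}
```

The two 16-bit fields together occupy one 32-bit word, which matches the message's remark that no extra space is needed.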

[TCP]: tcp_memory_pressure and tcp_socket are __read_mostly candidates · 4103f8cd

Committed by Eric Dumazet on Mar 27, 2007

tcp_memory_pressure and tcp_socket currently share a cache line with tcp_memory_allocated, tcp_sockets_allocated.
(Very hot cache line)
It makes sense to declare these variables as __read_mostly, to avoid false sharing on SMP.

ffffffff8081d9c0 B tcp_orphan_count
ffffffff8081d9c4 B tcp_memory_allocated
ffffffff8081d9c8 B tcp_sockets_allocated
ffffffff8081d9cc B tcp_memory_pressure
ffffffff8081d9d0 b tcp_md5sig_users
ffffffff8081d9d8 b tcp_md5sig_pool
ffffffff8081d9e0 b warntime.31570
ffffffff8081d9e8 b tcp_socket
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[SK_BUFF]: Introduce tcp_hdr(), remove skb->h.th · aa8223c7

Committed by Arnaldo Carvalho de Melo on Apr 10, 2007

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP]: Introduce tcp_hdrlen() and tcp_optlen() · ab6a5bb6

Committed by Arnaldo Carvalho de Melo on Mar 18, 2007

The ip_hdrlen() buddy, created to reduce the number of skb->h.th-> uses and to
avoid the longer, open coded equivalent.

Ditched a no-op in bnx2 in the process.

I wonder if we should have a BUG_ON(skb->h.th->doff < 5) in tcp_optlen()...
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
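The helpers named in the commit above boil down to arithmetic on the TCP data offset field, which counts 32-bit words. The sketch below models that arithmetic on a reduced header; the `demo_` names and the single-field structure are assumptions, not the kernel's struct tcphdr.

```c
#include <stdint.h>

/* Hypothetical reduced header: only the data offset is modeled. */
struct demo_tcphdr {
    uint8_t doff;   /* header length in 32-bit words, 5..15 when valid */
};

/* Total TCP header length in bytes. */
static inline unsigned int demo_tcp_hdrlen(const struct demo_tcphdr *th)
{
    return th->doff * 4u;
}

/*
 * Length of the options in bytes: everything beyond the fixed 20-byte
 * header. A doff below 5 would wrap around here, which is the case the
 * commit message wonders about guarding with a BUG_ON().
 */
static inline unsigned int demo_tcp_optlen(const struct demo_tcphdr *th)
{
    return (th->doff - 5u) * 4u;
}
```

For example, a header with a data offset of 8 is 32 bytes long and carries 12 bytes of options, while a data offset of 5 means no options at all.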

[SK_BUFF]: Introduce icmp_hdr(), remove skb->h.icmph · 88c7664f

Committed by Arnaldo Carvalho de Melo on Mar 13, 2007

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[SK_BUFF]: Introduce ip_hdr(), remove skb->nh.iph · eddc9ec5

Committed by Arnaldo Carvalho de Melo on Apr 20, 2007

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP]: Abstract out all write queue operations. · fe067e8a

Committed by David S. Miller on Mar 07, 2007

This allows the write queue implementation to be changed,
for example, to one which allows fast interval searching.
Signed-off-by: David S. Miller <davem@davemloft.net>

[NET]: Convert xtime.tv_sec to get_seconds() · 9d729f72

Committed by James Morris on Mar 04, 2007

Where appropriate, convert references to xtime.tv_sec to the
get_seconds() helper function.
Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP]: struct *sock argument renamed: sp -> sk · cf4c6bf8

Committed by Ilpo Järvinen on Feb 22, 2007

In general, TCP code uses "sk" for struct sock pointer.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>

11 Feb, 2007 1 commit

[NET] IPV4: Fix whitespace errors. · e905a9ed

Committed by YOSHIFUJI Hideaki on Feb 09, 2007

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

09 Feb, 2007 3 commits

[NET]: change layout of ehash table · dbca9b27

Committed by Eric Dumazet on Feb 08, 2007

The ehash table layout is currently as follows:

The first half of the table is used by sockets not in TIME_WAIT state.
The second half is used by sockets in TIME_WAIT state.

This is not optimal because, for a given hash or socket, the two chain heads
are located in separate cache lines.
Moreover, the locks of the second half are never used.

If instead of this halving, we use two list heads in inet_ehash_bucket instead
of only one, we probably can avoid one cache miss, and reduce RAM usage,
particularly if sizeof(rwlock_t) is big (various CONFIG_DEBUG_SPINLOCK,
CONFIG_DEBUG_LOCK_ALLOC settings). So we still halve the table but we keep
related chains together to speed up lookups and socket state changes.

In this patch I did not try to align struct inet_ehash_bucket, but a future
patch could try to make this structure have a convenient size (a power of two
or a multiple of L1_CACHE_SIZE).
I guess rwlock will just vanish as soon as RCU is plugged into ehash :) , so
maybe we don't need to scratch our heads to align the bucket...

Note: In case struct inet_ehash_bucket is not a power of two, we could
probably change alloc_large_system_hash() (in case it uses __get_free_pages())
to free the unused space. It currently allocates a big zone, but the last
quarter of it could be freed. Again, this should be a temporary 'problem'.

Patch tested on ipv4 tcp only, but should be OK for IPV6 and DCCP.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
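The layout change in the commit above can be pictured by comparing the two bucket shapes. The sketch below is a model only: the lock and list-head types are stand-ins, the `demo_` names are assumptions, and the sizes it yields depend on the platform rather than on any kernel configuration.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's list head and rwlock types. */
struct demo_hlist_head { void *first; };
struct demo_rwlock     { unsigned int raw; };

/*
 * Old layout: a table of 2*N buckets with one chain each. Bucket i holds
 * the sockets not in TIME_WAIT, bucket i+N the TIME_WAIT ones, and the
 * locks in the second half are never taken.
 */
struct demo_old_bucket {
    struct demo_rwlock     lock;
    struct demo_hlist_head chain;
};

/* New layout: N buckets, each holding both chains next to each other. */
struct demo_new_bucket {
    struct demo_rwlock     lock;
    struct demo_hlist_head chain;    /* sockets not in TIME_WAIT */
    struct demo_hlist_head twchain;  /* sockets in TIME_WAIT */
};

static inline size_t demo_old_table_bytes(size_t n)
{
    return 2 * n * sizeof(struct demo_old_bucket);
}

static inline size_t demo_new_table_bytes(size_t n)
{
    return n * sizeof(struct demo_new_bucket);
}
```

In the new shape a lookup for a given hash touches one bucket for both chains, and each unused lock from the old second half is no longer allocated; the larger the lock type, the larger that saving.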

[IPV4/IPV6]: Always wait for IPSEC SA resolution in socket contexts. · 8eb9086f

Committed by David S. Miller on Feb 08, 2007

Do this even for non-blocking sockets.  This avoids the silly -EAGAIN
that applications can see now, even for non-blocking sockets in some
cases (f.e. connect()).

With help from Venkat Tekkirala.
Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP]: remove tcp header from tcp_v4_check (take #2) · ba7808ea

Committed by Frederik Deweerdt on Feb 04, 2007

The tcphdr struct passed to tcp_v4_check is not used; the following
patch removes it from the parameter list.

This adds the netfilter modifications missing in the patch I sent
for rc3-mm1.
Signed-off-by: Frederik Deweerdt <frederik.deweerdt@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

09 Jan, 2007 1 commit

[TCP]: Fix iov_len calculation in tcp_v4_send_ack(). · cb48cfe8

Committed by Craig Schlenter on Jan 09, 2007

This fixes the ftp stalls present in the current kernels.

All credit goes to Komuro <komurojun-mbn@nifty.com> for tracking
this down. The patch is untested but it looks *cough* obviously
correct.
Signed-off-by: Craig Schlenter <craig@codefountain.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

18 Dec, 2006 2 commits

[TCP]: Trivial fix to message in tcp_v4_inbound_md5_hash · a9fc00cc

Committed by Leigh Brown on Dec 17, 2006

The message logged in tcp_v4_inbound_md5_hash when the hash was expected
but not found was reversed.
Signed-off-by: Leigh Brown <leigh@solinno.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP]: Fix oops caused by tcp_v4_md5_do_del · 8228a18d

Committed by Leigh Brown on Dec 17, 2006

md5sig_info.alloced4 must be set to zero when freeing keys4, otherwise
it will not be alloc'd again when another key is added to the same
socket by tcp_v4_md5_do_add.
Signed-off-by: Leigh Brown <leigh@solinno.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
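The bug class described in the commit above is a capacity counter left stale after its array is freed. The sketch below reproduces the pattern in user-space C with the fix applied; the `demo_` names, the int-valued keys, and the use of realloc() are assumptions for illustration and do not mirror the kernel functions.

```c
#include <stdlib.h>

/* Hypothetical key table: an array plus "used" and "allocated" counts. */
struct demo_keys {
    int *keys4;
    int  entries4;   /* slots in use */
    int  alloced4;   /* slots allocated */
};

/* Add a key, growing the array only when all allocated slots are used. */
static int demo_add(struct demo_keys *k, int key)
{
    if (k->alloced4 == k->entries4) {
        int *grown = realloc(k->keys4,
                             (size_t)(k->alloced4 + 1) * sizeof(*grown));
        if (!grown)
            return -1;
        k->keys4 = grown;
        k->alloced4++;
    }
    k->keys4[k->entries4++] = key;
    return 0;
}

/* Remove the last key; free the array when it becomes empty. */
static void demo_del_last(struct demo_keys *k)
{
    if (k->entries4 == 0)
        return;
    if (--k->entries4 == 0) {
        free(k->keys4);
        k->keys4 = NULL;
        k->alloced4 = 0;   /* without this reset, the next add skips the
                              allocation and writes through a NULL pointer */
    }
}
```

Adding a key, deleting it, and adding another exercises exactly the path the message describes: the second add must see a zero capacity in order to allocate again.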

03 Dec, 2006 7 commits

[TCP]: Fix warnings with TCP_MD5SIG disabled. · b6332e6c

Committed by Andrew Morton on Nov 30, 2006

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NET]: Possible cleanups. · f5b99bcd

Committed by Adrian Bunk on Nov 30, 2006

This patch contains the following possible cleanups:
- make the following needlessly global functions static:
  - ipv4/tcp.c: __tcp_alloc_md5sig_pool()
  - ipv4/tcp_ipv4.c: tcp_v4_reqsk_md5_lookup()
  - ipv4/udplite.c: udplite_rcv()
  - ipv4/udplite.c: udplite_err()
- make the following needlessly global structs static:
  - ipv4/tcp_ipv4.c: tcp_request_sock_ipv4_ops
  - ipv4/tcp_ipv4.c: tcp_sock_ipv4_specific
  - ipv6/tcp_ipv6.c: tcp_request_sock_ipv6_ops
- net/ipv{4,6}/udplite.c: remove inline's from static functions
                          (gcc should know best when to inline them)
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP] MD5SIG: Kill CONFIG_TCP_MD5SIG_DEBUG. · 08dd1a50

Committed by David S. Miller on Nov 30, 2006

It just obfuscates the code and adds limited value.  And as Adrian
Bunk noticed, it lacked Kconfig help text too, so just kill it.
Signed-off-by: David S. Miller <davem@davemloft.net>

[NET]: Split skb->csum · ff1dcadb

Committed by Al Viro on Nov 20, 2006

... into anonymous union of __wsum and __u32 (csum and csum_offset resp.)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NET]: Fix assorted misannotations (from md5 and udplite merges). · 8e5200f5

Committed by Al Viro on Nov 20, 2006

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP_IPV4]: Use kmemdup where appropriate · f6685938

Committed by Arnaldo Carvalho de Melo on Nov 17, 2006

Also use a variable to avoid the longish tp->md5sig_info-> use
in tcp_v4_md5_do_add.

Code diff stats:

[acme@newtoy net-2.6.20]$ codiff /tmp/tcp_ipv4.o.before /tmp/tcp_ipv4.o.after
/pub/scm/linux/kernel/git/acme/net-2.6.20/net/ipv4/tcp_ipv4.c:
  tcp_v4_md5_do_add     |  -62
  tcp_v4_syn_recv_sock  |  -32
  tcp_v4_parse_md5_keys |  -86
 3 functions changed, 180 bytes removed
[acme@newtoy net-2.6.20]$
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>

[TCP_IPV4]: CodingStyle cleanups, no code change · 7174259e

Committed by Arnaldo Carvalho de Melo on Nov 17, 2006

Mostly related to CONFIG_TCP_MD5SIG recent merge.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>

openeuler / Kernel: last synced successfully about 1 year ago
