Commits · 629ca23c331ec75ac87b016debbb3c4d2fe62650 · OpenHarmony / kernel_linux

06 Jul, 2008 10 commits

MIB: add struct net to UDP_INC_STATS_USER · 629ca23c

Committed by Pavel Emelyanov on Jul 05, 2008

Nothing special - all the places already have a struct sock
at hand, so use the sock_net() net.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

629ca23c

netns: selective flush of rt_cache · 32cb5b4e

Committed by Denis V. Lunev on Jul 05, 2008

The dst cache is marked as expired on a per-namespace basis by the previous
patch. Now we have to implement selective cache shrinking. This
procedure has been ported from the older OpenVz codebase.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

32cb5b4e

netns: place rt_genid into struct net · e84f84f2

Committed by Denis V. Lunev on Jul 05, 2008

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

e84f84f2

ipv4: pass current value of rt_genid into rt_hash · b00180de

Committed by Denis V. Lunev on Jul 05, 2008

Basically, there is no difference between calling atomic_read internally
and passing the value as a parameter, since rt_hash is inline.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

b00180de

netns: add struct net parameter to rt_cache_invalidate · 86c657f6

Committed by Denis V. Lunev on Jul 05, 2008

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

86c657f6

netns: make rt_secret_rebuild timer per namespace · 9f5e97e5

Committed by Denis V. Lunev on Jul 05, 2008

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

9f5e97e5

netns: register net.ipv4.route.flush in each namespace · 39a23e75

Committed by Denis V. Lunev on Jul 05, 2008

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

39a23e75

ipv4: remove static flush_delay variable · 639e104f

Committed by Denis V. Lunev on Jul 05, 2008

flush delay is used as an external storage for net.ipv4.route.flush sysctl
entry. It is write-only.

The ctl_table->data for this entry is used once. Fix this case to point
to the stack to remove global variable. Do this to avoid additional
variable on struct net in the next patch.

Possible race (as it was before) accessing this local variable is removed
using flush_mutex.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

639e104f

net: add fib_rules_ops to flush_cache method · ae299fc0

Committed by Denis V. Lunev on Jul 05, 2008

This is required to pass namespace context into rt_cache_flush called from
->flush_cache.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

ae299fc0

netns: add namespace parameter to rt_cache_flush · 76e6ebfb

Committed by Denis V. Lunev on Jul 05, 2008

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

76e6ebfb

03 Jul, 2008 2 commits

ipv4: Do cleanup for ip_mr_init · 03d2f897

Committed by Wang Chen on Jul 03, 2008

Same as ip6_mr_init(), make ip_mr_init() return an errno if it fails.
But do not do error handling in inet_init(), just print a message.
Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

03d2f897

tcp: de-bloat a bit with factoring NET_INC_STATS_BH out · 40b215e5

Committed by Pavel Emelyanov on Jul 03, 2008

There are some places in TCP that select one MIB index to
bump snmp statistics like this:

	if (<something>)
		NET_INC_STATS_BH(<some_id>);
	else if (<something_else>)
		NET_INC_STATS_BH(<some_other_id>);
	...
	else
		NET_INC_STATS_BH(<default_id>);

or in a more tricky but still similar way.

On the other hand, this NET_INC_STATS_BH is a camouflaged
increment of percpu variable, which is not that small.

Factoring those cases out de-bloats 235 bytes on non-preemptible
i386 config and drives parts of the code into 80 columns.

add/remove: 0/0 grow/shrink: 0/7 up/down: 0/-235 (-235)
function                                     old     new   delta
tcp_fastretrans_alert                       1437    1424     -13
tcp_dsack_set                                137     124     -13
tcp_xmit_retransmit_queue                    690     676     -14
tcp_try_undo_recovery                        283     265     -18
tcp_sacktag_write_queue                     1550    1515     -35
tcp_update_reordering                        162     106     -56
tcp_retransmit_timer                         990     904     -86
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

40b215e5
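
The factoring described in 40b215e5 can be illustrated outside the kernel. The sketch below is a userspace analogy, not the kernel code: `stats`, `inc_stat()`, `account_before()`, `account_after()` and the `MIB_*` indices are hypothetical stand-ins for the per-CPU SNMP counters and `NET_INC_STATS_BH`. Instead of expanding the increment in every branch, the branches only choose an index and the increment appears once.

```c
/* Hypothetical MIB indices; the real ones are defined by the kernel. */
enum { MIB_A, MIB_B, MIB_DEFAULT, MIB_MAX };

/* Stand-in for the per-CPU counters that NET_INC_STATS_BH bumps. */
static unsigned long stats[MIB_MAX];

/* Stand-in for the (not so small) increment macro. */
static void inc_stat(int idx)
{
	stats[idx]++;
}

/* Before: the increment is repeated in every branch. */
static void account_before(int cond_a, int cond_b)
{
	if (cond_a)
		inc_stat(MIB_A);
	else if (cond_b)
		inc_stat(MIB_B);
	else
		inc_stat(MIB_DEFAULT);
}

/* After: the branches only select an index; one increment site remains. */
static void account_after(int cond_a, int cond_b)
{
	int mib_idx;

	if (cond_a)
		mib_idx = MIB_A;
	else if (cond_b)
		mib_idx = MIB_B;
	else
		mib_idx = MIB_DEFAULT;

	inc_stat(mib_idx);
}
```

Both functions update the same counter for the same inputs; only the number of expanded increment sites differs, which is where the size saving reported above would come from.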

02 Jul, 2008 1 commit

icmp: fix units for ratelimit · 6dbf4bca

Committed by Stephen Hemminger on Jul 01, 2008

Convert the sysctl values for icmp ratelimit to use milliseconds instead
of jiffies which is based on kernel configured HZ.
Internal kernel jiffies are not a proper unit for any userspace API.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6dbf4bca
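
To see why jiffies make a poor userspace unit, consider the following userspace sketch. It is an illustration only: `ticks_to_ms()` and `ms_to_ticks()` are hypothetical helpers, and the `hz` parameter stands in for the kernel's compile-time HZ. The same tick count means a different duration on differently configured kernels, while a millisecond value does not.

```c
/* Convert a tick count to milliseconds for a given tick rate (hz). */
static unsigned long ticks_to_ms(unsigned long ticks, unsigned long hz)
{
	return ticks * 1000UL / hz;
}

/* Convert milliseconds to a tick count for a given tick rate (hz). */
static unsigned long ms_to_ticks(unsigned long ms, unsigned long hz)
{
	return ms * hz / 1000UL;
}
```

For example, a raw value of 250 ticks is one second when hz is 250 but only a quarter of a second when hz is 1000, whereas 1000 ms converts to the appropriate tick count on either configuration.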

28 Jun, 2008 4 commits

net/inet_lro: remove setting skb->ip_summed when not LRO-able · 251a4b32

Committed by Eli Cohen on Jun 27, 2008

When an SKB cannot be chained to a session, the current code attempts
to "restore" its ip_summed field from lro_mgr->ip_summed. However,
lro_mgr->ip_summed does not hold the original value; in fact, we'd
better not touch skb->ip_summed since it is not modified by the code
in the path leading to a failure to chain it.  Also use a clearer
comment to describe the ip_summed field of struct net_lro_mgr.

Issue raised by Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>

251a4b32

inet fragments: fix race between inet_frag_find and inet_frag_secret_rebuild · 9a375803

Committed by Pavel Emelyanov on Jun 27, 2008

The problem is that while we work w/o the inet_frags.lock even
read-locked the secret rebuild timer may occur (on another CPU, since
BHs are still disabled in the inet_frag_find) and change the rnd seed
for ipv4/6 fragments.

It was caused by my patch fd9e6354
([INET]: Omit double hash calculations in xxx_frag_intern) late 
in the 2.6.24 kernel, so this should probably be queued to -stable.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

9a375803

tcp: /proc/net/tcp rto,ato values not scaled properly (v2) · 7be87351

Committed by Stephen Hemminger on Jun 27, 2008

I found another case where we are sending information to userspace
in the wrong HZ scale.  This should have been fixed back in 2.5 :-(

This means an ABI change but as it stands there is no way for an application
like ss to get the right value.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7be87351

tcp: calculate tcp_mem based on low memory instead of all memory · 57413ebc

Committed by Miquel van Smoorenburg on Jun 27, 2008

The tcp_mem array which contains limits on the total amount of memory
used by TCP sockets is calculated based on nr_all_pages. On a 32-bit
x86 system, we should base this on the number of lowmem pages.
Signed-off-by: Miquel van Smoorenburg <miquels@cistron.nl>
Signed-off-by: David S. Miller <davem@davemloft.net>

57413ebc

20 Jun, 2008 2 commits

net: Discard and warn about LRO'd skbs received for forwarding · 4497b076

Committed by Ben Hutchings on Jun 19, 2008

Add skb_warn_if_lro() to test whether an skb was received with LRO and
warn if so.

Change br_forward(), ip_forward() and ip6_forward() to call it and
discard the skb if it returns true.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4497b076

net: Disable LRO on devices that are forwarding · 0187bdfb

Committed by Ben Hutchings on Jun 19, 2008

Large Receive Offload (LRO) is only appropriate for packets that are
destined for the host, and should be disabled if received packets may be
forwarded.  It can also confuse the GSO on output.

Add dev_disable_lro() function which uses the appropriate ethtool ops to
disable LRO if enabled.

Add calls to dev_disable_lro() in br_add_if() and functions that enable
IPv4 and IPv6 forwarding.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0187bdfb

18 Jun, 2008 3 commits

udp: sk_drops handling · cb61cb9b

Committed by Eric Dumazet on Jun 17, 2008

In commits 33c732c3 ([IPV4]: Add raw
drops counter) and a92aa318 ([IPV6]:
Add raw drops counter), Wang Chen added raw drops counter for
/proc/net/raw & /proc/net/raw6

This patch adds this capability to UDP sockets too (/proc/net/udp &
/proc/net/udp6).

This means that the 'RcvbufErrors' errors found in /proc/net/snmp can
also be examined for each udp socket.

# grep Udp: /proc/net/snmp
Udp: InDatagrams NoPorts InErrors OutDatagrams RcvbufErrors SndbufErrors
Udp: 23971006 75 899420 16390693 146348 0

# cat /proc/net/udp
 sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt  ---
uid  timeout inode ref pointer drops
 75: 00000000:02CB 00000000:0000 07 00000000:00000000 00:00000000 00000000  ---
  0        0 2358 2 ffff81082a538c80 0
111: 00000000:006F 00000000:0000 07 00000000:00000000 00:00000000 00000000  ---
  0        0 2286 2 ffff81042dd35c80 146348

In this example, only port 111 (0x006F) was flooded by messages that
user program could not read fast enough. 146348 messages were lost.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cb61cb9b

xfrm: fix fragmentation for ipv4 xfrm tunnel · fe833fca

Committed by Steffen Klassert on Jun 17, 2008

When generating the ip header for the transformed packet we just copy
the frag_off field of the ip header from the original packet to the ip
header of the new generated packet. If we receive a packet as a chain
of fragments, all but the last of the new generated packets have the
IP_MF flag set. We have to mask the frag_off field to only keep the
IP_DF flag from the original packet. This got lost with git commit
36cf9acf ("[IPSEC]: Separate
inner/outer mode processing on output")
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

fe833fca
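
The masking described in fe833fca can be sketched as follows. This is an illustration under simplifying assumptions, not the actual patch: `outer_frag_off()` is a hypothetical helper, and the flag values are shown in host byte order, whereas the kernel works on the big-endian header field. The point is that only the DF bit of the inner header survives into the outer header, so IP_MF and the fragment offset are dropped.

```c
#include <stdint.h>

/* IPv4 frag_off layout (RFC 791), host byte order for simplicity. */
#define IP_DF     0x4000 /* don't fragment */
#define IP_MF     0x2000 /* more fragments */
#define IP_OFFSET 0x1FFF /* fragment offset, in 8-byte units */

/* Keep only the DF bit when building the outer tunnel header. */
static uint16_t outer_frag_off(uint16_t inner_frag_off)
{
	return inner_frag_off & IP_DF;
}
```

With a plain copy, an inner packet that arrived as a non-final fragment would leak IP_MF into the newly generated header; with the mask, the result is either `IP_DF` or 0.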

netfilter: nf_nat: fix RCU races · 68b80f11

Committed by Patrick McHardy on Jun 17, 2008

Fix three ct_extend/NAT extension related races:

- When cleaning up the extension area and removing it from the bysource hash,
  the nat->ct pointer must not be set to NULL since it may still be used in
  a RCU read side

- When replacing a NAT extension area in the bysource hash, the nat->ct
  pointer must be assigned before performing the replacement

- When reallocating extension storage in ct_extend, the old memory must
  not be freed immediately since it may still be used by a RCU read side

Possibly fixes https://bugzilla.redhat.com/show_bug.cgi?id=449315
and/or http://bugzilla.kernel.org/show_bug.cgi?id=10875
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

68b80f11

17 Jun, 2008 9 commits

inet: add struct net argument to inet_ehashfn · 9f26b3ad

Committed by Pavel Emelyanov on Jun 16, 2008

Although this hash takes addresses into account, the ehash chains
can also be too long when, for instance, communications via lo occur.
So, prepare the inet_hashfn to take struct net into account.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

9f26b3ad

inet: add struct net argument to inet_lhashfn · 2086a650

Committed by Pavel Emelyanov on Jun 16, 2008

Listening-on-one-port sockets in many namespaces produce long 
chains in the listening_hash-es, so prepare the inet_lhashfn to 
take struct net into account.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

2086a650

inet: add struct net argument to inet_bhashfn · 7f635ab7

Committed by Pavel Emelyanov on Jun 16, 2008

Binding to some port in many namespaces may create too long
chains in bhash-es, so prepare the hashfn to take struct net
into account.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

7f635ab7

udp: add struct net argument to udp_hashfn · 19c7578f

Committed by Pavel Emelyanov on Jun 16, 2008

Every caller already has this one. The new argument is currently 
unused, but this will be fixed shortly.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

19c7578f

udp: provide a struct net pointer for __udp[46]_lib_mcast_deliver · e3163493

Committed by Pavel Emelyanov on Jun 16, 2008

They both calculate the hash chain, but currently do not have
a struct net pointer, so pass one in via an additional argument,
especially since their callers already have one.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

e3163493

udp: introduce a udp_hashfn function · d6266281

Committed by Pavel Emelyanov on Jun 16, 2008

Currently the chain to store a UDP socket is calculated with
simple (x & (UDP_HTABLE_SIZE - 1)). But taking net into account
would make this calculation a bit more complex, so moving it into
a function would help.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

d6266281
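
The calculation named in d6266281 can be sketched in userspace as below. This is a minimal illustration, not the kernel source: `udp_hashfn_sketch()` is a hypothetical name, and the table size of 128 is an assumption chosen only because the masking trick requires a power of two. How the network namespace is later mixed into the hash is not described here, so it is left out.

```c
/* Assumed table size; the mask below only works for a power of two. */
#define UDP_HTABLE_SIZE 128

/* Map a port number to a hash chain index by masking the low bits. */
static unsigned int udp_hashfn_sketch(unsigned int num)
{
	return num & (UDP_HTABLE_SIZE - 1);
}
```

Wrapping the expression in one function means a later change to the hash (for example, adding a namespace-dependent term) touches a single place instead of every caller.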

ipv4: Remove unused definitions in net/ipv4/tcp_ipv4.c. · a9d246db

Committed by Rami Rosen on Jun 16, 2008

1) Remove ICMP_MIN_LENGTH, as it is unused.

2) Remove unneeded tcp_v4_send_check() declaration.
Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a9d246db

raw: Restore /proc/net/raw correct behavior · 68be802c

Committed by Eric Dumazet on Jun 16, 2008

I just noticed "cat /proc/net/raw" was buggy, missing '\n' separators.

I believe this was introduced by commit 8cd850ef 
([RAW]: Cleanup IPv4 raw_seq_show.)

This trivial patch restores correct behavior, and applies to the current
Linus tree (it should be applied to the stable tree as well.)
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

68be802c

tcp: Revert reset of deferred accept changes in 2.6.26 · 93653e04

Committed by David S. Miller on Jun 16, 2008

Ingo's system is still seeing strange behavior, and he
reports that it goes away if the rest of the deferred
accept changes are reverted too.

Therefore this reverts e4c78840
("[TCP]: TCP_DEFER_ACCEPT updates - dont retxmt synack") and
539fae89 ("[TCP]: TCP_DEFER_ACCEPT
updates - defer timeout conflicts with max_thresh").

Just like the other revert, these ideas can be revisited for
2.6.27
Signed-off-by: David S. Miller <davem@davemloft.net>

93653e04

15 Jun, 2008 1 commit

net: change proto destroy method to return void · 7d06b2e0

Committed by Brian Haley on Jun 14, 2008

Change struct proto destroy function pointer to return void.  Noticed
by Al Viro.
Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7d06b2e0

13 Jun, 2008 1 commit

tcp: Revert 'process defer accept as established' changes. · ec0a1966

Committed by David S. Miller on Jun 12, 2008

This reverts two changesets, ec3c0982
("[TCP]: TCP_DEFER_ACCEPT updates - process as established") and
the follow-on bug fix 9ae27e0a
("tcp: Fix slab corruption with ipv6 and tcp6fuzz").

This change causes several problems, first reported by Ingo Molnar
as a distcc-over-loopback regression where connections were getting
stuck.

Ilpo Järvinen first spotted the locking problems.  The new function
added by this code, tcp_defer_accept_check(), only has the
child socket locked, yet it is modifying state of the parent
listening socket.

Fixing that is non-trivial at best, because we can't simply just grab
the parent listening socket lock at this point, because it would
create an ABBA deadlock.  The normal ordering is parent listening
socket --> child socket, but this code path would require the
reverse lock ordering.

Next is a problem noticed by Vitaliy Gusev, he noted:

----------------------------------------
>--- a/net/ipv4/tcp_timer.c
>+++ b/net/ipv4/tcp_timer.c
>@@ -481,6 +481,11 @@ static void tcp_keepalive_timer (unsigned long data)
> 		goto death;
> 	}
>
>+	if (tp->defer_tcp_accept.request && sk->sk_state == TCP_ESTABLISHED) {
>+		tcp_send_active_reset(sk, GFP_ATOMIC);
>+		goto death;

Here socket sk is not attached to listening socket's request queue. tcp_done()
will not call inet_csk_destroy_sock() (and tcp_v4_destroy_sock() which should
release this sk) as socket is not DEAD. Therefore socket sk will be lost for
freeing.
----------------------------------------

Finally, Alexey Kuznetsov argues that there might not even be any
real value or advantage to these new semantics even if we fix all
of the bugs:

----------------------------------------
Hiding from accept() sockets with only out-of-order data only
is the only thing which is impossible with old approach. Is this really
so valuable? My opinion: no, this is nothing but a new loophole
to consume memory without control.
----------------------------------------

So revert this thing for now.
Signed-off-by: David S. Miller <davem@davemloft.net>

ec0a1966

12 Jun, 2008 5 commits

net: remove CVS keywords · 0b040829

Committed by Adrian Bunk on Jun 10, 2008

This patch removes CVS keywords that weren't updated for a long time
from comments.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

0b040829

tcp md5sig: Let the caller pass appropriate key for tcp_v{4,6}_do_calc_md5_hash(). · 9501f972

Committed by YOSHIFUJI Hideaki on Apr 18, 2008

As we do for other socket/timewait-socket specific parameters,
let the callers pass appropriate arguments to
tcp_v{4,6}_do_calc_md5_hash().
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

9501f972

tcp md5sig: Share most of hash calcucaltion bits between IPv4 and IPv6. · 8d26d76d

Committed by YOSHIFUJI Hideaki on Apr 17, 2008

We can share most of the hash calculation code because
the only difference between IPv4 and IPv6 is their pseudo headers.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

8d26d76d

tcp md5sig: Remove redundant protocol argument. · 076fb722

Committed by YOSHIFUJI Hideaki on Apr 17, 2008

The protocol is always TCP, so remove the useless protocol argument.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

076fb722

tcp md5sig: Share MD5 Signature option parser between IPv4 and IPv6. · 7d5d5525

Committed by YOSHIFUJI Hideaki on Apr 17, 2008

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

7d5d5525

11 Jun, 2008 2 commits

net: Fix routing tables with id > 255 for legacy software · 709772e6

Committed by Krzysztof Piotr Oledzki on Jun 10, 2008

Most legacy software does not like tables > 255 because rtm_table is a u8,
so tb_id is sent as tb_id & 0xff and it is possible to confuse, for example,
table 510 with table 254 (main).

This patch introduces RT_TABLE_COMPAT=252 so the code uses it if
tb_id > 255. It makes such old applications happy, new
ones are still able to use RTA_TABLE to get a proper table id.
Signed-off-by: Krzysztof Piotr Oledzki <ole@ans.pl>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

709772e6
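
The table-id collision described in 709772e6 is plain 8-bit truncation, which the sketch below reproduces in userspace. It is an illustration only: `legacy_table_old()` and `legacy_table_new()` are hypothetical helper names, while the value 252 for `RT_TABLE_COMPAT` and 254 for the main table come from the commit message above.

```c
#include <stdint.h>

#define RT_TABLE_COMPAT 252 /* value given in the commit message */
#define RT_TABLE_MAIN   254 /* "main" table, per the commit message */

/* Old behaviour: the 32-bit table id is truncated to the u8 field. */
static uint8_t legacy_table_old(uint32_t tb_id)
{
	return (uint8_t)(tb_id & 0xff);
}

/* New behaviour: ids that do not fit are reported as RT_TABLE_COMPAT. */
static uint8_t legacy_table_new(uint32_t tb_id)
{
	return tb_id > 255 ? RT_TABLE_COMPAT : (uint8_t)tb_id;
}
```

Since 510 is 0x1FE, truncation yields 0xFE, which is 254 and therefore indistinguishable from the main table; with the compatibility value, the legacy field reads 252 instead, and newer applications can still obtain the real id through RTA_TABLE.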

ipv4 addr: Send netlink notification for address label changes · 573bf470

Committed by Thomas Graf on Jun 10, 2008

Makes people happy who try to keep a list of addresses up to date by
listening to notifications.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

573bf470
