Commits · 72e09ad107e78d69ff4d3b97a69f0aad2b77280f · openeuler / Kernel

05 Jun, 2010 1 commit

ipv6: avoid high order allocations · 72e09ad1

Authored by Eric Dumazet on Jun 05, 2010

With mtu=9000, mld_newpack() uses order-2 GFP_ATOMIC allocations, which
are very unreliable on machines where PAGE_SIZE=4K.

Limit allocated skbs to at most one page (order-0 allocations).
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

72e09ad1
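
The arithmetic behind "order-2" can be checked with a small userland sketch. The function names below are hypothetical and this is not the kernel's code: it only models how a 9000-byte request on 4K pages rounds up to four contiguous pages, and how capping the request at one page brings it back to order 0. The real patch may additionally subtract skb bookkeeping overhead from the cap.

```c
#include <assert.h>
#include <stddef.h>

/* Smallest order such that (page_size << order) >= size.
 * Userland model of the page-order idea; not the kernel helper. */
unsigned int alloc_order(size_t size, size_t page_size)
{
	unsigned int order = 0;
	size_t span = page_size;

	while (span < size) {
		span <<= 1;
		order++;
	}
	return order;
}

/* Hypothetical helper: cap a requested buffer size at one page. */
size_t clamp_to_one_page(size_t size, size_t page_size)
{
	return size < page_size ? size : page_size;
}
```

Calling alloc_order(9000, 4096) yields order 2 because three pages are needed and allocations come in powers of two; after clamp_to_one_page() the same request fits in a single page.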

01 Jun, 2010 1 commit

net: sock_queue_err_skb() dont mess with sk_forward_alloc · b1faf566

Authored by Eric Dumazet on May 31, 2010

Correct sk_forward_alloc handling for error_queue would need to use a
backlog of frames that softirq handler could not deliver because socket
is owned by user thread. Or extend backlog processing to be able to
process normal and error packets.

Another possibility is to not use mem charge for error queue, this is
what I implemented in this patch.

Note: this reverts commit 29030374
(net: fix sk_forward_alloc corruptions), since we don't need to lock
the socket anymore.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b1faf566

31 May, 2010 1 commit

netfilter: xtables: stackptr should be percpu · 7489aec8

Authored by Eric Dumazet on May 31, 2010

commit f3c5c1bf (netfilter: xtables: make ip_tables reentrant)
introduced a performance regression, because stackptr array is shared by
all cpus, adding cache line ping pongs. (16 cpus share a 64 bytes cache
line)

Fix this using alloc_percpu()
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-By: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>

7489aec8
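
The "16 cpus share a 64 bytes cache line" remark follows from packing 4-byte counters side by side. The sketch below is a userland illustration with hypothetical names; it uses alignment padding as a stand-in for alloc_percpu(), which in the kernel places each CPU's copy in a separate per-CPU area rather than padding an array.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_LINE_BYTES 64 /* assumed line size, as in the commit message */

/* Packed layout: one counter per CPU, adjacent in memory, so writes
 * from different CPUs keep invalidating the same cache line. */
struct packed_stackptr {
	unsigned int ptr[16];
};

/* Padded layout: each CPU's counter owns a whole cache line. */
struct padded_slot {
	_Alignas(CACHE_LINE_BYTES) unsigned int ptr;
};

/* How many packed slots of slot_size bytes share one cache line. */
size_t slots_per_line(size_t slot_size)
{
	return CACHE_LINE_BYTES / slot_size;
}
```

With 4-byte slots, slots_per_line(4) gives the 16 CPUs mentioned in the message, while each struct padded_slot occupies a full line on its own.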

29 May, 2010 2 commits

net: fix sk_forward_alloc corruptions · 29030374

Authored by Eric Dumazet on May 29, 2010

As David found out, sock_queue_err_skb() should be called with the socket
lock held, or we risk sk_forward_alloc corruption, since we use non-atomic
operations to update this field.

This patch adds bh_lock_sock()/bh_unlock_sock() pair to three spots.
(BH already disabled)

1) skb_tstamp_tx() 
2) Before calling ip_icmp_error(), in __udp4_lib_err() 
3) Before calling ipv6_icmp_error(), in __udp6_lib_err()
Reported-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

29030374

IPv6: fix Mobile IPv6 regression · 6057fd78

Authored by Brian Haley on May 28, 2010

Commit f4f914b5 (net: ipv6 bind to device issue) caused
a regression with Mobile IPv6 when it changed the meaning
of fl->oif to become a strict requirement of the route
lookup.  Instead, only force strict mode when
sk->sk_bound_dev_if is set on the calling socket, getting
the intended behavior and fixing the regression.
Tested-by: Arnaud Ebalard <arno@natisbad.org>
Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6057fd78

28 May, 2010 1 commit

ipv6: Add GSO support on forwarding path · 0aa68271

Authored by Herbert Xu on May 27, 2010

Currently we disallow GSO packets on the IPv6 forward path.
This patch fixes this.

Note that I discovered that our existing GSO MTU checks (e.g.,
IPv4 forwarding) are buggy in that they skip the check altogether,
when they really should be checking gso_size + header instead.

I have also been lazy here in that I haven't bothered to segment
the GSO packet by hand before generating an ICMP message.  Someone
should add that to be 100% correct.
Reported-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

0aa68271
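
The difference between skipping the MTU check for GSO packets and checking "gso_size + header" can be sketched as follows. All names and the struct are hypothetical; the lenient function models a check that lets every GSO packet through, as the message describes for the existing code, and the strict one models the per-segment check the message says would be correct. Neither is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct pkt_model {
	size_t len;       /* total length; may exceed the MTU for GSO packets */
	size_t gso_size;  /* payload bytes per segment, 0 if not a GSO packet */
	size_t hdr_len;   /* headers replicated in front of every segment */
};

/* Lenient check: GSO packets are never considered too big. */
bool too_big_lenient(const struct pkt_model *p, size_t mtu)
{
	return p->len > mtu && p->gso_size == 0;
}

/* Stricter check: test the size of each segment that would be emitted. */
bool too_big_strict(const struct pkt_model *p, size_t mtu)
{
	if (p->gso_size == 0)
		return p->len > mtu;
	return p->gso_size + p->hdr_len > mtu;
}
```

A 60000-byte GSO packet with 1448-byte segments and 60 bytes of headers passes the lenient check on a 1500-byte MTU, yet each resulting segment would be 1508 bytes, which the strict check catches.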

27 May, 2010 1 commit

net: fix lock_sock_bh/unlock_sock_bh · 8a74ad60

Authored by Eric Dumazet on May 26, 2010

This new sock lock primitive was introduced to speed up some user context
socket manipulation. But it does not safely exclude two threads, one using
regular lock_sock/release_sock, the other using lock_sock_bh/unlock_sock_bh.

This patch changes lock_sock_bh to be careful against 'owned' state.
If owned is found to be set, we must take the slow path.
lock_sock_bh() now returns a boolean to say if the slow path was taken,
and this boolean is used at unlock_sock_bh time to call the appropriate
unlock function.

After this change, BH are either disabled or enabled during the
lock_sock_bh/unlock_sock_bh protected section. This might be misleading,
so we rename these functions to lock_sock_fast()/unlock_sock_fast().
Reported-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Tested-by: Anton Blanchard <anton@samba.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

8a74ad60

26 May, 2010 1 commit

ipmr: off by one in __ipmr_fill_mroute() · ed0f160a

Authored by Dan Carpenter on May 26, 2010

This fixes a smatch warning:
	net/ipv4/ipmr.c +1917 __ipmr_fill_mroute(12) error: buffer overflow
	'(mrt)->vif_table' 32 <= 32

The ipv6 version had the same issue.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ed0f160a
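
The smatch warning "32 <= 32" describes an index equal to the array size slipping past a bounds check. The sketch below shows the general pattern with hypothetical names, assuming the fix amounts to rejecting an index equal to the table size; it is not the kernel function.

```c
#include <assert.h>
#include <stdbool.h>

#define MAXVIFS_MODEL 32 /* table size taken from the smatch warning */

struct vif_model { int in_use; };
struct vif_model vif_table_model[MAXVIFS_MODEL];

/* Valid indexes are 0..31; index 32 is one past the end.
 * A check written as "idx > MAXVIFS_MODEL" would wrongly accept 32. */
bool vif_index_ok(unsigned int idx)
{
	return idx < MAXVIFS_MODEL;
}
```

Only indexes for which vif_index_ok() returns true may be used to access vif_table_model.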

19 May, 2010 4 commits

ipv6: Never schedule DAD timer on dead address · 622ccdf1

Authored by Herbert Xu on May 18, 2010

This patch ensures that all places that schedule the DAD timer
look at the address state in a safe manner before scheduling the
timer.  This ensures that we don't end up with pending timers
after deleting an address.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

622ccdf1

ipv6: Use POSTDAD state · f2344a13

Authored by Herbert Xu on May 18, 2010

This patch makes use of the new POSTDAD state.  This prevents
a race between DAD completion and failure.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

f2344a13

ipv6: Use state_lock to protect ifa state · 4c5ff6a6

Authored by Herbert Xu on May 18, 2010

This patch makes use of the new state_lock to synchronise between
updates to the ifa state.  This fixes the issue where a remotely
triggered address deletion (through DAD failure) coincides with a
local administrative address deletion, causing certain actions to
be performed twice incorrectly.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

4c5ff6a6

ipv6: Replace inet6_ifaddr->dead with state · e9d3e084

Authored by Herbert Xu on May 18, 2010

This patch replaces the boolean dead flag on inet6_ifaddr with
a state enum.  This allows us to roll back changes when deleting
an address according to whether DAD has completed or not.

This patch only adds the state field and does not change the logic.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

e9d3e084
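
The four commits above replace a boolean with a state and then serialise state changes so that two concurrent deletions do not both run the cleanup. A rough userland model of that idea is sketched below. The type and function names are hypothetical, and a C11 atomic exchange is swapped in for the kernel's state_lock spinlock purely to keep the example self-contained.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of the address life cycle. */
enum ifa_state_model { IFA_DAD, IFA_POSTDAD, IFA_UP, IFA_DEAD };

struct ifaddr_model {
	_Atomic int state;
};

/* Move the address to DEAD in one step and return the previous state,
 * so the caller knows what (if anything) has to be rolled back. */
enum ifa_state_model ifa_mark_dead(struct ifaddr_model *ifa)
{
	return (enum ifa_state_model)atomic_exchange(&ifa->state, IFA_DEAD);
}

/* Only the caller that actually performed the transition does cleanup. */
bool ifa_should_cleanup(enum ifa_state_model previous)
{
	return previous != IFA_DEAD;
}
```

If a DAD failure and an administrative deletion race, exactly one of them observes a previous state other than IFA_DEAD, so the cleanup runs once. Unlike a boolean, the returned state also tells that caller whether DAD had completed.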

18 May, 2010 4 commits

net: Remove unnecessary returns from void function()s · 3fa21e07

Authored by Joe Perches on May 17, 2010

This patch removes from net/ (but not any netfilter files)
all the unnecessary return; statements that precede the
last closing brace of void functions.

It does not remove the returns that are immediately
preceded by a label as gcc doesn't like that.

Done via:
$ grep -rP --include=*.[ch] -l "return;\n}" net/ | \
  xargs perl -i -e 'local $/ ; while (<>) { s/\n[ \t\n]+return;\n}/\n}/g; print; }'
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

3fa21e07

net: Introduce skb_tunnel_rx() helper · d19d56dd

Authored by Eric Dumazet on May 17, 2010

skb rxhash should be cleared when a skb is handled by a tunnel before
being delivered again, so that correct packet steering can take place.

There are other cleanups and accounting that we can factorize in a new
helper, skb_tunnel_rx()
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d19d56dd
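
The shape of such a helper can be sketched in userland: receive accounting plus clearing of the cached hash in one place, so that every tunnel does the same thing. The structs and the function name below are hypothetical models; the real helper operates on struct sk_buff and struct net_device and may perform further cleanups.

```c
#include <assert.h>
#include <stddef.h>

struct skb_model {
	size_t len;           /* packet length after decapsulation */
	unsigned int rxhash;  /* cached flow hash of the outer packet */
};

struct dev_stats_model {
	unsigned long rx_packets;
	unsigned long rx_bytes;
};

/* Common work after a tunnel strips the outer header. */
void tunnel_rx_model(struct skb_model *skb, struct dev_stats_model *stats)
{
	stats->rx_packets++;
	stats->rx_bytes += skb->len;
	skb->rxhash = 0; /* force the hash to be recomputed on the inner packet */
}
```

Clearing the hash matters because it was computed over the outer headers; steering the inner packet with it would send every flow in the tunnel to the same CPU.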

ipv6: fix the bug of address check · eedf042a

Authored by Stephen Hemminger on May 17, 2010

The duplicate address check code got broken in the conversion
to hlist (2.6.35).  The earlier patch did not fix the case where
two addresses match the same hash value. Use two exit paths,
rather than depending on the state of loop variables (from the macro).

Based on earlier fix by Shan Wei.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Reviewed-by: Shan Wei <shanwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

eedf042a
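
The "two exit paths" idea can be illustrated with a plain linked-list bucket. This is a hypothetical userland model, not the hlist macros: the point is that the result comes from an explicit return inside the loop or after it, rather than from inspecting the iterator once the loop has finished.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct addr_node {
	unsigned int addr;       /* stand-in for an IPv6 address */
	struct addr_node *next;  /* next entry in the same hash bucket */
};

/* Walk one bucket; entries may share a hash, so compare the full key. */
bool bucket_has_addr(const struct addr_node *head, unsigned int addr)
{
	const struct addr_node *n;

	for (n = head; n != NULL; n = n->next) {
		if (n->addr == addr)
			return true;  /* exit path 1: explicit match */
	}
	return false;                 /* exit path 2: bucket exhausted */
}
```

With two different addresses chained in one bucket, a lookup for the second still succeeds and a lookup for an absent address fails, whatever value the loop variable held last.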

ipv6 addrlabel: permit deletion of labels assigned to removed dev · 0771275b

Authored by Florian Westphal on May 07, 2010

Because addrlabels with an interface index are left alone when the
interface gets removed, this results in addrlabels that can no
longer be removed.

Restrict validation of index to adding new addrlabels.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

0771275b

16 May, 2010 1 commit

net: Introduce sk_route_nocaps · a465419b

Authored by Eric Dumazet on May 16, 2010

TCP-MD5 sessions have intermittent failures when the route cache is
invalidated. ip_queue_xmit() has to find a new route and calls
sk_setup_caps(sk, &rt->u.dst), destroying the

sk->sk_route_caps &= ~NETIF_F_GSO_MASK

that MD5 desperately tries to maintain all along its path (from
tcp_transmit_skb() for example).

So we send a few bad packets, and everything is fine when
tcp_transmit_skb() is called again for this socket.

Since ip_queue_xmit() is at a lower level than TCP-MD5, I chose to use a
socket field, sk_route_nocaps, containing bits to mask on sk_route_caps.
Reported-by: Bhaskar Dutta <bhaskie@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a465419b
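
The design can be modelled as a persistent "never use these" mask that is reapplied every time the capabilities are recomputed from a route. The sketch below uses hypothetical names and feature bit values, not the kernel's; it only shows why a mask stored on the socket survives a route refresh while a one-off bit clear does not.

```c
#include <assert.h>

/* Hypothetical feature bits, not the kernel values. */
#define F_SG  0x1u
#define F_GSO 0x2u
#define F_TSO 0x4u
#define F_GSO_MASK_MODEL (F_GSO | F_TSO)

struct sock_model {
	unsigned int route_caps;   /* features usable on the current route */
	unsigned int route_nocaps; /* features this socket must never use */
};

/* Re-run whenever the route changes; the mask survives every refresh. */
void setup_caps_model(struct sock_model *sk, unsigned int dev_features)
{
	sk->route_caps = dev_features & ~sk->route_nocaps;
}

/* Record a permanent restriction and apply it to the current caps. */
void nocaps_add_model(struct sock_model *sk, unsigned int flags)
{
	sk->route_nocaps |= flags;
	sk->route_caps &= ~flags;
}
```

Once the MD5 path has called nocaps_add_model() with the GSO mask, a later setup_caps_model() triggered by a route change can no longer switch GSO back on.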

13 May, 2010 3 commits

netfilter: remove unnecessary returns from void function()s · 736d58e3

Authored by Joe Perches on May 13, 2010

This patch removes from net/ netfilter files
all the unnecessary return; statements that precede the
last closing brace of void functions.

It does not remove the returns that are immediately
preceded by a label as gcc doesn't like that.

Done via:
$ grep -rP --include=*.[ch] -l "return;\n}" net/ | \
  xargs perl -i -e 'local $/ ; while (<>) { s/\n[ \t\n]+return;\n}/\n}/g; print; }'
Signed-off-by: Joe Perches <joe@perches.com>
[Patrick: changed to keep return statements in otherwise empty function bodies]
Signed-off-by: Patrick McHardy <kaber@trash.net>

736d58e3

netfilter: cleanup printk messages · 654d0fbd

Authored by Stephen Hemminger on May 13, 2010

Make sure all printk messages have a severity level.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>

654d0fbd

netfilter: change NF_ASSERT to WARN_ON · af567603

Authored by Stephen Hemminger on May 13, 2010

Change netfilter asserts to standard WARN_ON. This has the
benefit of backtrace info and also causes netfilter errors
to show up on kerneloops.org.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>

af567603

12 May, 2010 5 commits

netfilter: xtables: combine built-in extension structs · 4538506b

Authored by Jan Engelhardt on Jul 04, 2009

Prepare the arrays for use with the multiregister function. The
future layer-3 xt matches can then be easily added to it without
needing more (un)register code.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>

4538506b

netfilter: xtables: change hotdrop pointer to direct modification · b4ba2611

Authored by Jan Engelhardt on Jul 07, 2009

Since xt_action_param is writable, let's use it. The pointer to
'bool hotdrop' always worried me (8 bytes (64-bit) to write 1 byte!).
Surprisingly, this results in a reduction in size:

   text    data     bss filename
5457066  692730  357892 vmlinux.o-prev
5456554  692730  357892 vmlinux.o
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>

b4ba2611
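
The change from a pointer to a directly writable field can be pictured as below. The struct, the length rule, and the function name are hypothetical illustrations rather than the xtables definitions: the match receives a non-const parameter block and sets the drop flag in place instead of writing through a separate bool pointer.

```c
#include <assert.h>
#include <stdbool.h>

#define MIN_HDR_MODEL 8 /* hypothetical minimum header length */

struct action_param_model {
	unsigned int len;  /* bytes available for the header being matched */
	bool hotdrop;      /* set by a match to request an immediate drop */
};

/* Hypothetical match: a truncated header cannot be inspected, so the
 * packet is flagged for dropping and the match reports "no match". */
bool match_model(struct action_param_model *par)
{
	if (par->len < MIN_HDR_MODEL) {
		par->hotdrop = true; /* written in place, no bool pointer */
		return false;
	}
	return true;
}
```

Storing the flag as a one-byte field costs less space per parameter block than carrying an 8-byte pointer to it on 64-bit systems, which is the concern the message raises.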

netfilter: xtables: deconstify struct xt_action_param for matches · 62fc8051

Authored by Jan Engelhardt on Jul 07, 2009

In future, layer-3 matches will be an xt module of their own, and
need to set the fragoff and thoff fields. Adding more pointers would
needlessly increase memory requirements (especially on 64-bit, where
pointers are wider).
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>

62fc8051

netfilter: xtables: substitute temporary defines by final name · 4b560b44

Authored by Jan Engelhardt on Jul 05, 2009

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>

4b560b44

netfilter: xtables: combine struct xt_match_param and xt_target_param · de74c169

Authored by Jan Engelhardt on Jul 05, 2009

The structures carried - besides match/target - almost the same data.
It is possible to combine them, as extensions are evaluated serially,
and so, the callers end up a little smaller.

  text  data  bss  filename
-15318   740  104  net/ipv4/netfilter/ip_tables.o
+15286   740  104  net/ipv4/netfilter/ip_tables.o
-15333   540  152  net/ipv6/netfilter/ip6_tables.o
+15269   540  152  net/ipv6/netfilter/ip6_tables.o
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>

de74c169

11 May, 2010 6 commits

ipv6: ip6mr: add support for dumping routing tables over netlink · 5b285cac

Authored by Patrick McHardy on May 11, 2010

The ip6mr /proc interface (ip6_mr_cache) can't be extended to dump routes
from any tables but the main table in a backwards compatible fashion since
the output format ends in a variable number of output interfaces.

Introduce a new netlink interface to dump multicast routes from all tables,
similar to the netlink interface for regular routes.
Signed-off-by: Patrick McHardy <kaber@trash.net>

5b285cac

ipv6: ip6mr: support multiple tables · d1db275d

Authored by Patrick McHardy on May 11, 2010

This patch adds support for multiple independent multicast routing instances,
named "tables".

Userspace multicast routing daemons can bind to a specific table instance by
issuing a setsockopt call using a new option MRT6_TABLE. The table number is
stored in the raw socket data and affects all following ip6mr setsockopt(),
getsockopt() and ioctl() calls. By default, a single table (RT6_TABLE_DFLT)
is created with a default routing rule pointing to it. Newly created pim6reg
devices have the table number appended ("pim6regX"), with the exception of
devices created in the default table, which are named just "pim6reg" for
compatibility reasons.

Packets are directed to a specific table instance using routing rules,
similar to how regular routing rules work. Currently iif, oif and mark
are supported as keys; source and destination addresses could be supported
additionally.

Example usage:

- bind pimd/xorp/... to a specific table:

uint32_t table = 123;
setsockopt(fd, SOL_IPV6, MRT6_TABLE, &table, sizeof(table));

- create routing rules directing packets to the new table:

# ip -6 mrule add iif eth0 lookup 123
# ip -6 mrule add oif eth0 lookup 123
Signed-off-by: Patrick McHardy <kaber@trash.net>

d1db275d

ipv6: ip6mr: move mroute data into seperate structure · 6bd52143

Authored by Patrick McHardy on May 11, 2010

Signed-off-by: Patrick McHardy <kaber@trash.net>

6bd52143

ipv6: ip6mr: convert struct mfc_cache to struct list_head · f30a7784

Authored by Patrick McHardy on May 11, 2010

Signed-off-by: Patrick McHardy <kaber@trash.net>

f30a7784

ipv6: ip6mr: remove net pointer from struct mfc6_cache · b5aa30b1

Authored by Patrick McHardy on May 11, 2010

Now that cache entries in unres_queue don't need to be distinguished by their
network namespace pointer anymore, we can remove it from struct mfc6_cache
and pass the namespace as a function argument to the functions that need it.
Signed-off-by: Patrick McHardy <kaber@trash.net>

b5aa30b1

ipv6: ip6mr: move unres_queue and timer to per-namespace data · c476efbc

Authored by Patrick McHardy on May 11, 2010

The unres_queue is currently shared between all namespaces. Following patches
will additionally allow creating multiple multicast routing tables in each
namespace. Having a single shared queue for all these users seems excessive;
move the queue and the cleanup timer to the per-namespace data to unshare it.

As a side-effect, this fixes a bug in the seq file iteration functions: the
first entry returned is always from the current namespace, but entries returned
after that may belong to any namespace.
Signed-off-by: Patrick McHardy <kaber@trash.net>

c476efbc

07 May, 2010 1 commit

ipv6: udp: make short packet logging consistent with ipv4 · d6bc0149

Authored by Bjørn Mork on May 06, 2010

Adding addresses and ports to the short packet log message,
like ipv4/udp.c does it, makes these messages a lot more useful:

[  822.182450] UDPv6: short packet: From [2001:db8:ffb4:3::1]:47839 23715/178 to [2001:db8:ffb4:3:5054:ff:feff:200]:1234

This requires us to drop logging in case pskb_may_pull() fails,
which also is consistent with ipv4/udp.c
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d6bc0149

06 May, 2010 1 commit

IPv6: fix IPV6_RECVERR handling of locally-generated errors · d40a4de0

Authored by Brian Haley on May 03, 2010

I noticed when I added support for IPV6_DONTFRAG that if you set
IPV6_RECVERR and tried to send a UDP packet larger than 64K to an
IPv6 destination, you'd correctly get an EMSGSIZE, but reading from
MSG_ERRQUEUE returned the incorrect address in the cmsg:

struct msghdr:
	 msg_name         0x7fff8f3c96d0
	 msg_namelen      28
struct sockaddr_in6:
	 sin6_family      10
	 sin6_port        7639
	 sin6_flowinfo    0
	 sin6_addr        ::ffff:38.32.0.0
	 sin6_scope_id    0  ((null))

It should have returned this in my case:

struct msghdr:
	 msg_name         0x7fffd866b510
	 msg_namelen      28
struct sockaddr_in6:
	 sin6_family      10
	 sin6_port        7639
	 sin6_flowinfo    0
	 sin6_addr        2620:0:a09:e000:21f:29ff:fe57:f88b
	 sin6_scope_id    0  ((null))

The problem is that ipv6_recv_error() assumes that if the error
wasn't generated by ICMPv6, it's an IPv4 address sitting there,
and proceeds to create a v4-mapped address from it.

Change ipv6_icmp_error() and ipv6_local_error() to set skb->protocol
to htons(ETH_P_IPV6) so that ipv6_recv_error() knows the address
sitting right after the extended error is IPv6, else it will
incorrectly map the first octet into an IPv4-mapped IPv6 address
in the cmsg structure returned in a recvmsg() call to obtain
the error.
Signed-off-by: Brian Haley <brian.haley@hp.com>

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Signed-off-by: David S. Miller <davem@davemloft.net>

d40a4de0
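
The wrong address in the report above can be reproduced with a few lines of byte manipulation. The sketch below is a userland model with hypothetical names: the is_ipv6 flag stands in for the skb->protocol test, and when it is false the first four stored bytes are treated as an IPv4 address and wrapped into a v4-mapped IPv6 address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Build the 16-byte address reported to the user from the bytes stored
 * after the extended error. is_ipv6 plays the role of the protocol test. */
void report_addr_model(const uint8_t *stored, bool is_ipv6, uint8_t out[16])
{
	if (is_ipv6) {
		memcpy(out, stored, 16);
		return;
	}
	/* Treat the first four bytes as IPv4: ::ffff:a.b.c.d */
	memset(out, 0, 10);
	out[10] = 0xff;
	out[11] = 0xff;
	memcpy(out + 12, stored, 4);
}
```

For 2620:0:a09:e000:21f:29ff:fe57:f88b the first four bytes are 0x26, 0x20, 0x00, 0x00, so the mistaken IPv4 branch produces ::ffff:38.32.0.0, which matches the incorrect output quoted in the message.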

04 May, 2010 2 commits

ipv6: Fix default multicast hops setting. · f935aa9e

Authored by David S. Miller on May 03, 2010

As per RFC 3493 the default multicast hops setting
for a socket should be "1" just like ipv4.

Ironically, we have an IPV6_DEFAULT_MCASTHOPS macro;
it just wasn't being used.
Reported-by: Elliot Hughes <enh@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f935aa9e

net: rcu fixes · 4f70ecca

Authored by Eric Dumazet on May 03, 2010

Add hlist_for_each_entry_rcu_bh() and
hlist_for_each_entry_continue_rcu_bh() macros, and use them in
ipv6_get_ifaddr(), if6_get_first() and if6_get_next() to fix lockdeps
warnings.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Reviewed-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4f70ecca

02 May, 2010 1 commit

netfilter: xtables: dissolve do_match function · ef53d702

Authored by Jan Engelhardt on Jul 09, 2009

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>

ef53d702

01 May, 2010 1 commit

ipv6: cleanup: remove unneeded null check · 83d7eb29

Authored by Dan Carpenter on Apr 30, 2010

We dereference "sk" unconditionally elsewhere in the function.  

This was left over from:  b30bd282 "ip6_xmit: remove unnecessary NULL
ptr check".  According to that commit message, "the sk argument to 
ip6_xmit is never NULL nowadays since the skb->priority assigment 
expects a valid socket."
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

83d7eb29

29 Apr, 2010 3 commits

net: ip_queue_rcv_skb() helper · f84af32c

Authored by Eric Dumazet on Apr 28, 2010

When queueing a skb to a socket, we can immediately release its dst if
the target socket does not use IP_CMSG_PKTINFO.

tcp_data_queue() can drop dst too.

This benefits from a hot cache line and avoids having the receiver, possibly
on another CPU, dirty this cache line itself.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f84af32c

net: speedup udp receive path · 4b0b72f7

Authored by Eric Dumazet on Apr 28, 2010

Since commit 95766fff ([UDP]: Add memory accounting.),
each received packet needs one extra lock_sock()/release_sock() pair.

This added latency because of possible backlog handling. Then later,
ticket spinlocks added yet another latency source in case of DDOS.

This patch introduces lock_sock_bh() and unlock_sock_bh()
synchronization primitives, avoiding one atomic operation and backlog
processing.

skb_free_datagram_locked() uses them instead of full blown
lock_sock()/release_sock(). skb is orphaned inside locked section for
proper socket memory reclaim, and finally freed outside of it.

The UDP receive path now takes the socket spinlock only once.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4b0b72f7

Revert "tcp: bind() fix when many ports are bound" · 8d238b25

Authored by David S. Miller on Apr 28, 2010

This reverts two commits:

fda48a0d
tcp: bind() fix when many ports are bound

and a follow-on fix for it:

6443bb1f
ipv6: Fix inet6_csk_bind_conflict()

It causes problems with binding listening sockets when time-wait
sockets from a previous instance still are alive.

It's too late to keep fiddling with this so late in the -rc
series, and we'll deal with it in net-next-2.6 instead.
Signed-off-by: David S. Miller <davem@davemloft.net>

8d238b25

openeuler / Kernel: synced successfully more than 1 year ago
