Commits · 05d8402576c9c1b85bfc9e4f9d6a21c27ccbd5b1 · openeuler / raspberrypi-kernel

23 Feb, 2011 1 commit

xfrm: Mark flowi arg to ->get_tos() const. · 05d84025

Committed by David S. Miller on Feb 22, 2011

Signed-off-by: David S. Miller <davem@davemloft.net>

05d84025

21 Feb, 2011 1 commit

tcp: Remove debug macro of TCP_CHECK_TIMER · 089c3482

Committed by Shan Wei on Feb 19, 2011

TCP_CHECK_TIMER is no longer used for debugging; it does nothing.
It has been there for several years, maybe 6 years.

Remove it to keep the code clearer.
Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

089c3482

19 Feb, 2011 1 commit

net: provide default_advmss() methods to blackhole dst_ops · 214f45c9

Committed by Eric Dumazet on Feb 18, 2011

Commit 0dbaee3b (net: Abstract default ADVMSS behind an
accessor.) introduced a possible crash in tcp_connect_init(), when
dst->default_advmss() is called from dst_metric_advmss().
Reported-by: George Spelvin <linux@horizon.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

214f45c9

18 Feb, 2011 1 commit

net: Add initial_ref arg to dst_alloc(). · 3c7bd1a1

Committed by David S. Miller on Feb 16, 2011

This allows avoiding multiple writes to the initial __refcnt.

The simplest cases of wanting an initial reference of "1"
in ipv4 and ipv6 have been converted; the rest have been left
alone and kept at the existing "0".
Signed-off-by: David S. Miller <davem@davemloft.net>

3c7bd1a1

17 Feb, 2011 1 commit

netfilter: ip6t_LOG: fix a flaw in printing the MAC · 0af320fb

Committed by Joerg Marx on Feb 17, 2011

The flaw was in skipping the second byte in MAC header due to increasing
the pointer AND indexed access starting at '1'.
Signed-off-by: Joerg Marx <joerg.marx@secunet.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>

0af320fb
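
The flaw described above (advancing the pointer and also indexing from 1) can be illustrated with a small user-space sketch. The helper name `format_mac` and its buffer-based interface are hypothetical and are not taken from ip6t_LOG; the point is only that each byte is read exactly once, by advancing the pointer and dereferencing it, rather than combining `p++` with `p[i]`.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: format `len` bytes as colon-separated hex.
 * The buggy pattern would be "for (i = 1; ...; i++, p++) use p[i]",
 * which moves two positions per iteration relative to the start.
 * Here the pointer alone tracks the position.
 */
static void format_mac(char *out, size_t outsz,
                       const unsigned char *p, size_t len)
{
    size_t used = 0;

    if (outsz == 0)
        return;
    out[0] = '\0';

    for (size_t i = 0; i < len && used < outsz; i++, p++) {
        const char *fmt = i ? ":%02x" : "%02x";
        int n = snprintf(out + used, outsz - used, fmt, (unsigned int)*p);

        if (n < 0)
            break;
        used += (size_t)n;
    }
}
```

With a 32-byte buffer, six MAC bytes produce a 17-character string plus the terminating NUL.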

11 Feb, 2011 2 commits

inet: Create a mechanism for upward inetpeer propagation into routes. · 6431cbc2

Committed by David S. Miller on Feb 07, 2011

If we didn't have a routing cache, we would not be able to properly
propagate certain kinds of dynamic path attributes, for example
PMTU information and redirects.

The reason is that if we didn't have a routing cache, then there would
be no way to look up all of the active cached routes hanging off of
sockets, tunnels, IPSEC bundles, etc.

Consider the case where we created a cached route, but no inetpeer
entry existed and also we were not asked to pre-COW the route metrics
and therefore did not force the creation of a new inetpeer entry.

If we later get a PMTU message, or a redirect, and store this
information in a new inetpeer entry, there is no way to teach that
cached route about the newly existing inetpeer entry.

The facilities implemented here handle this problem.

First we create a generation ID.  When we create a cached route of any
kind, we remember the generation ID at the time of attachment.  Any
time we force-create an inetpeer entry in response to new path
information, we bump that generation ID.

The dst_ops->check() callback is where the knowledge of this event
is propagated.  If the global generation ID does not equal the one
stored in the cached route, and the cached route has not attached
to an inetpeer yet, we look it up and attach if one is found.  Now
that we've updated the cached route's information, we update the
route's generation ID too.

This clears the way for implementing PMTU and redirects directly in
the inetpeer cache.  There is absolutely no need to consult cached
route information in order to maintain this information.

At this point nothing bumps the inetpeer genids; that comes in
later changes which handle PMTUs and redirects using inetpeers.
Signed-off-by: David S. Miller <davem@davemloft.net>

6431cbc2
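
The generation-ID scheme described in this commit can be sketched in user-space C. Everything here is a simplified stand-in: `struct peer`, `struct cached_route`, the one-slot `peer_table`, and the function names are hypothetical and do not mirror the kernel's actual structures. The sketch only shows the flow: remember the generation at route creation, bump it on forced peer creation, and re-check lazily in a `check()`-style callback.

```c
#include <stddef.h>

/* Hypothetical stand-ins, not the kernel's real types. */
struct peer {
    unsigned int pmtu;
};

struct cached_route {
    struct peer *peer;        /* NULL until a peer entry is attached */
    unsigned int peer_genid;  /* generation observed at the last check */
};

static unsigned int peer_genid;   /* global generation ID */
static struct peer *peer_table;   /* one-slot "peer cache" for the demo */

/* Force-create a peer entry in response to new path information. */
static struct peer *peer_force_create(struct peer *storage, unsigned int pmtu)
{
    storage->pmtu = pmtu;
    peer_table = storage;
    peer_genid++;             /* tells cached routes to look again */
    return storage;
}

/* Creating a cached route records the generation at attachment time. */
static void route_init(struct cached_route *rt)
{
    rt->peer = peer_table;    /* may still be NULL */
    rt->peer_genid = peer_genid;
}

/* Analogue of the dst_ops->check() step described above. */
static void route_check(struct cached_route *rt)
{
    if (rt->peer_genid != peer_genid) {
        if (!rt->peer)
            rt->peer = peer_table;   /* look up, never create */
        rt->peer_genid = peer_genid;
    }
}
```

A route created before any peer exists picks up the peer on its first `route_check()` after `peer_force_create()` runs; routes whose generation already matches skip the lookup.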

inetpeer: Abstract address representation further. · 7a71ed89

Committed by David S. Miller on Feb 09, 2011

Future changes will add caching information, and some of
these new elements will be addresses.

Since the family is implicit via the ->daddr.family member,
replicating the family in every address we store is entirely
redundant.
Signed-off-by: David S. Miller <davem@davemloft.net>

7a71ed89

09 Feb, 2011 1 commit

net: Kill NETEVENT_PMTU_UPDATE. · 8d13a2a9

Committed by David S. Miller on Feb 08, 2011

Nobody actually does anything in response to the event,
so just kill it off.
Signed-off-by: David S. Miller <davem@davemloft.net>

8d13a2a9

05 Feb, 2011 1 commit

inetpeer: Move ICMP rate limiting state into inet_peer entries. · 92d86829

Committed by David S. Miller on Feb 04, 2011

Like metrics, the ICMP rate limiting bits are cached state about
a destination.  So move it into the inet_peer entries.

If an inet_peer cannot be bound (the reason is memory allocation
failure or similar), the policy is to allow.
Signed-off-by: David S. Miller <davem@davemloft.net>

92d86829

04 Feb, 2011 1 commit

net: Provide compat support for SIOCGETMIFCNT_IN6 and SIOCGETSGCNT_IN6. · e2d57766

Committed by David S. Miller on Feb 03, 2011

Signed-off-by: David S. Miller <davem@davemloft.net>

e2d57766

01 Feb, 2011 2 commits

net: Fix ipv6 neighbour unregister_sysctl_table warning · bf36076a

Committed by Eric W. Biederman on Jan 31, 2011

In my testing of 2.6.37 I was occasionally getting a warning about
sysctl table entries being unregistered in the wrong order.  Digging
in, it turns out this dates back to the last great sysctl reorg done
where Al Viro introduced the requirement that sysctl directories
needed to be created before and destroyed after the files in them.

It turns out that in that great reorg /proc/sys/net/ipv6/neigh was
overlooked.  So this patch fixes that oversight and makes an annoying
warning message go away.

>------------[ cut here ]------------
>WARNING: at kernel/sysctl.c:1992 unregister_sysctl_table+0x134/0x164()
>Pid: 23951, comm: kworker/u:3 Not tainted 2.6.37-350888.2010AroraKernelBeta.fc14.x86_64 #1
>Call Trace:
> [<ffffffff8103e034>] warn_slowpath_common+0x80/0x98
> [<ffffffff8103e061>] warn_slowpath_null+0x15/0x17
> [<ffffffff810452f8>] unregister_sysctl_table+0x134/0x164
> [<ffffffff810e7834>] ? kfree+0xc4/0xd1
> [<ffffffff813439b2>] neigh_sysctl_unregister+0x22/0x3a
> [<ffffffffa02cd14e>] addrconf_ifdown+0x33f/0x37b [ipv6]
> [<ffffffff81331ec2>] ? skb_dequeue+0x5f/0x6b
> [<ffffffffa02ce4a5>] addrconf_notify+0x69b/0x75c [ipv6]
> [<ffffffffa02eb953>] ? ip6mr_device_event+0x98/0xa9 [ipv6]
> [<ffffffff813d2413>] notifier_call_chain+0x32/0x5e
> [<ffffffff8105bdea>] raw_notifier_call_chain+0xf/0x11
> [<ffffffff8133cdac>] call_netdevice_notifiers+0x45/0x4a
> [<ffffffff8133d2b0>] rollback_registered_many+0x118/0x201
> [<ffffffff8133d3af>] unregister_netdevice_many+0x16/0x6d
> [<ffffffff8133d571>] default_device_exit_batch+0xa4/0xb8
> [<ffffffff81337c42>] ? cleanup_net+0x0/0x194
> [<ffffffff81337a2a>] ops_exit_list+0x4e/0x56
> [<ffffffff81337d36>] cleanup_net+0xf4/0x194
> [<ffffffff81053318>] process_one_work+0x187/0x280
> [<ffffffff8105441b>] worker_thread+0xff/0x19f
> [<ffffffff8105431c>] ? worker_thread+0x0/0x19f
> [<ffffffff8105776d>] kthread+0x7d/0x85
> [<ffffffff81003824>] kernel_thread_helper+0x4/0x10
> [<ffffffff810576f0>] ? kthread+0x0/0x85
> [<ffffffff81003820>] ? kernel_thread_helper+0x0/0x10
>---[ end trace 8a7e9310b35e9486 ]---
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bf36076a

net: Add default_mtu() methods to blackhole dst_ops · ec831ea7

Committed by Roland Dreier on Jan 31, 2011

When an IPSEC SA is still being set up, __xfrm_lookup() will return
-EREMOTE and so ip_route_output_flow() will return a blackhole route.
This can happen in a sendmsg call, and after d33e4553 ("net: Abstract
default MTU metric calculation behind an accessor.") this leads to a
crash in ip_append_data() because the blackhole dst_ops have no
default_mtu() method and so dst_mtu() calls a NULL pointer.

Fix this by adding default_mtu() methods (that simply return 0, matching
the old behavior) to the blackhole dst_ops.

The IPv4 part of this patch fixes a crash that I saw when using an IPSEC
VPN; the IPv6 part is untested because I don't have an IPv6 VPN, but it
looks to be needed as well.
Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ec831ea7
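
The failure mode and the fix can be shown with a minimal sketch of an ops table holding a function pointer. The type and function names below (`struct dst_sketch`, `dst_ops_sketch`, `dst_mtu_sketch`) are hypothetical simplifications, not the kernel's definitions. The assumption is that an MTU lookup falls back to the ops method when no explicit metric is stored, so an ops table that leaves the method unset would be called through a NULL pointer.

```c
#include <stddef.h>

struct dst_sketch;

/* Hypothetical ops table with a single method. */
struct dst_ops_sketch {
    unsigned int (*default_mtu)(const struct dst_sketch *dst);
};

struct dst_sketch {
    const struct dst_ops_sketch *ops;
    unsigned int mtu_metric;   /* 0 means "no explicit metric stored" */
};

/* The fix: a method that simply returns 0, matching the old behavior. */
static unsigned int blackhole_default_mtu(const struct dst_sketch *dst)
{
    (void)dst;
    return 0;
}

static const struct dst_ops_sketch blackhole_ops = {
    .default_mtu = blackhole_default_mtu,
};

/* Falls back to the ops method when the stored metric is zero. */
static unsigned int dst_mtu_sketch(const struct dst_sketch *dst)
{
    unsigned int mtu = dst->mtu_metric;

    if (!mtu)
        mtu = dst->ops->default_mtu(dst);  /* would crash if left NULL */
    return mtu;
}
```

Providing a trivial method keeps the call site free of NULL checks, which is the approach the commit message describes.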

28 Jan, 2011 2 commits

net: Store ipv4/ipv6 COW'd metrics in inetpeer cache. · 06582540

Committed by David S. Miller on Jan 27, 2011

Please note that the IPSEC dst entry metrics keep using
the generic metrics COW'ing mechanism using kmalloc/kfree.

This gives the IPSEC routes an opportunity to use metrics
which are unique to their encapsulated paths.
Signed-off-by: David S. Miller <davem@davemloft.net>

06582540

ipv6: Remove route peer binding assertions. · 8f2771f2

Committed by David S. Miller on Jan 27, 2011

They are bogus.  The basic idea is that I wanted to make sure
that prefixed routes never bind to peers.

The test I used was whether RTF_CACHE was set.

But first of all, the RTF_CACHE flag is set at different spots
depending upon which ip6_rt_copy() caller you're talking about.

I've validated all of the code paths, and even in the future
where we bind peers more aggressively (for route metric COW'ing)
we never bind to prefix'd routes, only fully specified ones.
This even applies when addrconf or icmp6 routes are allocated.
Signed-off-by: David S. Miller <davem@davemloft.net>

8f2771f2

27 Jan, 2011 2 commits

net: Implement read-only protection and COW'ing of metrics. · 62fa8a84

Committed by David S. Miller on Jan 26, 2011

Routing metrics are now copy-on-write.

Initially a route entry points its metrics at a read-only location.
If a routing table entry exists, it will point there.  Else it will
point at the all zero metric place-holder called 'dst_default_metrics'.

The writeability state of the metrics is stored in the low bits of the
metrics pointer, we have two bits left to spare if we want to store
more states.

For the initial implementation, COW is implemented simply via kmalloc.
However future enhancements will change this to place the writable
metrics somewhere else, in order to increase sharing.  Very likely
this "somewhere else" will be the inetpeer cache.

Note also that this means that metrics updates may transiently fail
if we cannot COW the metrics successfully.

But even by itself, this patch should decrease memory usage and
increase cache locality especially for routing workloads.  In those
cases the read-only metric copies stay in place and never get written
to.

TCP workloads where metrics get updated, and those rare cases where
PMTU triggers occur, will take a very slight performance hit.  But
that hit will be alleviated when the long-term writable metrics
move to a more sharable location.

Since the metrics storage went from a u32 array of RTAX_MAX entries to
what is essentially a pointer, some retooling of the dst_entry layout
was necessary.

Most importantly, we need to preserve the alignment of the reference
count so that it doesn't share cache lines with the read-mostly state,
as per Eric Dumazet's alignment assertion checks.

The only non-trivial bit here is the move of the 'flags' member into
the writeable cacheline.  This is OK since we are always accessing the
flags around the same moment when we made a modification to the
reference count.
Signed-off-by: David S. Miller <davem@davemloft.net>

62fa8a84
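
Storing the writeability state in the low bits of the metrics pointer, with copy-on-write on first update, can be sketched as follows. This is a user-space illustration under stated assumptions: the names (`struct route_sketch`, `METRICS_READ_ONLY`, `metrics_write_ptr`), the array size, and the use of `malloc` are hypothetical, and the sketch relies on the metrics arrays being at least 4-byte aligned so that the two low pointer bits are free.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define METRICS_MAX        4       /* hypothetical array size */
#define METRICS_READ_ONLY  0x1UL   /* hypothetical flag in the low bits */
#define METRICS_FLAGS      0x3UL   /* low bits reserved for state */

/* Shared all-zero placeholder; uint32_t arrays are 4-byte aligned. */
static const uint32_t default_metrics[METRICS_MAX];

struct route_sketch {
    uintptr_t metrics;   /* pointer value with state in the low bits */
};

static void route_init(struct route_sketch *rt, const uint32_t *shared)
{
    rt->metrics = (uintptr_t)shared | METRICS_READ_ONLY;
}

static const uint32_t *metrics_ptr(const struct route_sketch *rt)
{
    return (const uint32_t *)(rt->metrics & ~METRICS_FLAGS);
}

static int metrics_read_only(const struct route_sketch *rt)
{
    return (rt->metrics & METRICS_READ_ONLY) != 0;
}

/*
 * Return a writable metrics array, copying the shared one on first use.
 * Returns NULL if the copy cannot be allocated, so an update can
 * transiently fail, as the commit message notes.
 */
static uint32_t *metrics_write_ptr(struct route_sketch *rt)
{
    if (metrics_read_only(rt)) {
        uint32_t *copy = malloc(METRICS_MAX * sizeof(*copy));

        if (!copy)
            return NULL;
        memcpy(copy, metrics_ptr(rt), METRICS_MAX * sizeof(*copy));
        rt->metrics = (uintptr_t)copy;   /* flag bits now clear */
    }
    return (uint32_t *)(rt->metrics & ~METRICS_FLAGS);
}
```

Routes that never write keep pointing at the shared array, which is where the memory and cache-locality savings described above come from. In this sketch the caller owns the private copy and must free it.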

xfrm6: Don't forget to propagate peer into ipsec route. · 7cc2edb8

Committed by David S. Miller on Jan 26, 2011

Like ipv4, we have to propagate the ipv6 route peer into
the ipsec top-level route during instantiation.
Signed-off-by: David S. Miller <davem@davemloft.net>

7cc2edb8

26 Jan, 2011 1 commit

ipv6: Revert 'administrative down' address handling changes. · 73a8bd74

Committed by David S. Miller on Jan 23, 2011

This reverts the following set of commits:

d1ed113f ("ipv6: remove duplicate neigh_ifdown")
29ba5fed ("ipv6: don't flush routes when setting loopback down")
9d82ca98 ("ipv6: fix missing in6_ifa_put in addrconf")
2de79570 ("ipv6: addrconf: don't remove address state on ifdown if the address is being kept")
8595805a ("IPv6: only notify protocols if address is compeletely gone")
27bdb2ab ("IPv6: keep tentative addresses in hash table")
93fa159a ("IPv6: keep route for tentative address")
8f37ada5 ("IPv6: fix race between cleanup and add/delete address")
84e8b803 ("IPv6: addrconf notify when address is unavailable")
dc2b99f7 ("IPv6: keep permanent addresses on admin down")

because the core semantic change to ipv6 address handling on ifdown
has broken some things, in particular "disable_ipv6" sysctl handling.

Stephen has made several attempts to get things back in working order,
but nothing has restored disable_ipv6 fully yet.
Reported-by: Eric W. Biederman <ebiederm@xmission.com>
Tested-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

73a8bd74

25 Jan, 2011 2 commits

ipv6: Always clone offlink routes. · d80bc0fd

Committed by David S. Miller on Jan 24, 2011

Do not handle PMTU vs. route lookup creation any differently
wrt. offlink routes, always clone them.
Reported-by: PK <runningdoglackey@yahoo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d80bc0fd

net: change netdev->features to u32 · 04ed3e74

Committed by Michał Mirosław on Jan 24, 2011

Quoting Ben Hutchings: we presumably won't be defining features that
can only be enabled on 64-bit architectures.

Occurrences found by `grep -r` on net/, drivers/net, include/

[ Move features and vlan_features next to each other in
  struct netdev, as per Eric Dumazet's suggestion -DaveM ]
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>

04ed3e74

21 Jan, 2011 3 commits

ipv6: raw: rcu annotations · f2eda47d

Committed by Eric Dumazet on Jan 20, 2011

Remove sparse warnings, using a function typedef to be able to use __rcu
annotation on mh_filter pointer.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f2eda47d

net: ipv6: sit: fix rcu annotations · 753ea8e9

Committed by Eric Dumazet on Jan 20, 2011

Fix minor __rcu annotations and remove sparse warnings.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

753ea8e9

netfilter: add a missing include in nf_conntrack_reasm.c · bced94ed

Committed by Eric Dumazet on Jan 20, 2011

After commit ae90bdea (netfilter: fix compilation when conntrack is
disabled but tproxy is enabled) we have the following warnings:

net/ipv6/netfilter/nf_conntrack_reasm.c:520:16: warning: symbol
'nf_ct_frag6_gather' was not declared. Should it be static?
net/ipv6/netfilter/nf_conntrack_reasm.c:591:6: warning: symbol
'nf_ct_frag6_output' was not declared. Should it be static?
net/ipv6/netfilter/nf_conntrack_reasm.c:612:5: warning: symbol
'nf_ct_frag6_init' was not declared. Should it be static?
net/ipv6/netfilter/nf_conntrack_reasm.c:640:6: warning: symbol
'nf_ct_frag6_cleanup' was not declared. Should it be static?

Fix this by including net/netfilter/ipv6/nf_defrag_ipv6.h
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: KOVACS Krisztian <hidden@balabit.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>

bced94ed

19 Jan, 2011 1 commit

ipv6: Silence privacy extensions initialization · 2fdc1c80

Committed by Romain Francoise on Jan 17, 2011

When a network namespace is created (via CLONE_NEWNET), the loopback
interface is automatically added to the new namespace, triggering a
printk in ipv6_add_dev() if CONFIG_IPV6_PRIVACY is set.

This is problematic for applications which use CLONE_NEWNET as
part of a sandbox, like Chromium's suid sandbox or recent versions of
vsftpd. On a busy machine, it can lead to thousands of useless
"lo: Disabled Privacy Extensions" messages appearing in dmesg.

It's easy enough to check the status of privacy extensions via the
use_tempaddr sysctl, so just removing the printk seems like the most
sensible solution.
Signed-off-by: Romain Francoise <romain@orebokech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2fdc1c80

13 Jan, 2011 3 commits

netfilter: x_table: speedup compat operations · 255d0dc3

Committed by Eric Dumazet on Dec 18, 2010

One iptables invocation with 135000 rules takes 35 seconds of cpu time
on a recent server, using a 32bit distro and a 64bit kernel.

We eventually trigger NMI/RCU watchdog.

INFO: rcu_sched_state detected stall on CPU 3 (t=6000 jiffies)

COMPAT mode has quadratic behavior and consumes 16 bytes of memory per
rule.

Switch the xt_compat algos to use an array instead of a list, and use a
binary search to locate an offset in the sorted array.

This halves memory need (8 bytes per rule), and removes quadratic
behavior [ O(N*N) -> O(N*log2(N)) ]

Time of iptables goes from 35 s to 150 ms.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

255d0dc3
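
The switch from a list walk to a binary search over a sorted array can be sketched as below. The struct layout and the function name `delta_before` are hypothetical, and the lookup semantics (return the cumulative delta of the last entry whose offset is strictly below the queried offset) are an assumption made for illustration, not a statement of what the xt_compat code does.

```c
#include <stddef.h>

/* Hypothetical entry: 8 bytes per rule on common ABIs. */
struct compat_delta_sketch {
    unsigned int offset;   /* rule offset; the array is sorted by this */
    int delta;             /* cumulative size difference up to this rule */
};

/*
 * Binary search over the sorted array: O(log N) per lookup instead of
 * walking a list. Returns the cumulative delta of the last entry whose
 * offset is below `offset`, or 0 if there is none.
 */
static int delta_before(const struct compat_delta_sketch *tab, size_t n,
                        unsigned int offset)
{
    size_t lo = 0, hi = n;   /* search the half-open range [lo, hi) */

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (tab[mid].offset < offset)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo ? tab[lo - 1].delta : 0;
}
```

With N rules each needing one lookup, the total cost drops from O(N*N) to O(N*log2(N)), which matches the complexity change quoted in the commit message.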

inet6: prevent network storms caused by linux IPv6 routers · 72b43d08

Committed by Alexey Kuznetsov on Jan 12, 2011

Linux IPv6 forwards unicast packets that are link-layer multicasts...
The hole has been present since day one. I was 100% sure this check was there, but it is not.

The problem shows itself, e.g., when Microsoft Network Load Balancer runs on a network.
This software resolves IPv6 unicast addresses to multicast MAC addresses.
Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>

72b43d08

netfilter: fix compilation when conntrack is disabled but tproxy is enabled · 2fc72c7b

Committed by KOVACS Krisztian on Jan 12, 2011

The IPv6 tproxy patches split IPv6 defragmentation off of conntrack, but
failed to update the #ifdef stanzas guarding the defragmentation related
fields and code in skbuff and conntrack related code in nf_defrag_ipv6.c.

This patch adds the required #ifdefs so that IPv6 tproxy can truly be used
without connection tracking.

Original report:
http://marc.info/?l=linux-netdev&m=129010118516341&w=2
Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: KOVACS Krisztian <hidden@balabit.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

2fc72c7b

12 Jan, 2011 2 commits

ah: reload pointers to skb data after calling skb_cow_data() · 4b0ef1f2

Committed by Dang Hongwu on Jan 11, 2011

skb_cow_data() may allocate a new data buffer, so pointers on
skb should be set after this function.

Bug was introduced by commit dff3bb06 ("ah4: convert to ahash")
and 8631e9bd ("ah6: convert to ahash").
Signed-off-by: Wang Xuefu <xuefu.wang@6wind.com>
Acked-by: Krzysztof Witek <krzysztof.witek@6wind.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4b0ef1f2
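
The general rule behind this fix (derive pointers into a buffer only after any call that may replace the buffer) can be shown with a user-space analogue. `struct buf_sketch`, `buf_cow`, and `buf_at` are hypothetical names, and `buf_cow` always reallocates, which is an assumption chosen to make the hazard visible; it is not a model of skb_cow_data() itself.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical buffer type standing in for packet data. */
struct buf_sketch {
    unsigned char *data;
    size_t len;
};

static int buf_init(struct buf_sketch *b, const void *src, size_t len)
{
    b->data = malloc(len);
    if (!b->data)
        return -1;
    memcpy(b->data, src, len);
    b->len = len;
    return 0;
}

/* Makes a private copy; the old storage is freed, so old pointers dangle. */
static int buf_cow(struct buf_sketch *b)
{
    unsigned char *copy = malloc(b->len);

    if (!copy)
        return -1;
    memcpy(copy, b->data, b->len);
    free(b->data);
    b->data = copy;
    return 0;
}

/* Recompute a header pointer from the current base and a saved offset. */
static unsigned char *buf_at(const struct buf_sketch *b, size_t off)
{
    return b->data + off;
}
```

The safe pattern is to keep an offset across the call and call `buf_at()` afterwards, rather than caching a raw pointer obtained before `buf_cow()`.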

tcp: disallow bind() to reuse addr/port · c191a836

Committed by Eric Dumazet on Jan 11, 2011

inet_csk_bind_conflict() logic currently disallows a bind() if
it finds a friend socket (a socket bound on the same address/port)
satisfying a set of conditions:

1) Current (to be bound) socket doesn't have sk_reuse set
OR
2) other socket doesn't have sk_reuse set
OR
3) other socket is in LISTEN state

We should add the CLOSE state in the 3) condition, in order to avoid two
REUSEADDR sockets in CLOSE state with same local address/port, since
this can deny further operations.

Note: a prior patch tried to address the problem in a different (and
buggy) way (commit fda48a0d, "tcp: bind() fix when many ports
are bound").
Reported-by: Gaspar Chilingarov <gasparch@gmail.com>
Reported-by: Daniel Baluta <daniel.baluta@gmail.com>
Tested-by: Daniel Baluta <daniel.baluta@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c191a836
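
The three conditions listed in this commit, plus the CLOSE state it adds to condition 3, can be written as a single predicate. This is a sketch with hypothetical types (`struct sock_sketch`, `enum state_sketch`) and it assumes the two sockets have already been found to share the same local address and port; it is not the kernel's inet_csk_bind_conflict().

```c
#include <stdbool.h>

/* Hypothetical, reduced socket state. */
enum state_sketch { ST_ESTABLISHED, ST_LISTEN, ST_CLOSE };

struct sock_sketch {
    bool reuse;                /* stands in for sk_reuse */
    enum state_sketch state;
};

/*
 * Returns true if binding `sk` must be refused because of `other`,
 * which is assumed to be bound on the same address/port.
 */
static bool bind_conflict(const struct sock_sketch *sk,
                          const struct sock_sketch *other)
{
    return !sk->reuse ||                 /* condition 1 */
           !other->reuse ||              /* condition 2 */
           other->state == ST_LISTEN ||  /* condition 3 */
           other->state == ST_CLOSE;     /* added by this change */
}
```

Under this predicate, two REUSEADDR sockets in CLOSE state can no longer share an address/port, while a REUSEADDR socket can still bind alongside an established REUSEADDR one.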

11 Jan, 2011 1 commit

netfilter: x_tables: dont block BH while reading counters · 83723d60

Committed by Eric Dumazet on Jan 10, 2011

Using "iptables -L" with a lot of rules causes excessive BH latency.
Jesper mentioned ~6 ms and worried about frame drops.

Switch to a per_cpu seqlock scheme, so that taking a snapshot of
counters doesn't need to block BH (for this cpu, but also other cpus).

This adds two increments on the seqlock sequence per ipt_do_table() call;
it's a reasonable cost for allowing "iptables -L" not to block BH
processing.
Reported-by: Jesper Dangaard Brouer <hawk@comx.dk>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Patrick McHardy <kaber@trash.net>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Jesper Dangaard Brouer <hawk@comx.dk>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
83723d60

20 Dec, 2010 1 commit

ipv6: Fragment locally generated tunnel-mode IPSec6 packets as needed. · ad0081e4

Committed by David Stevens on Dec 17, 2010

This patch modifies IPsec6 to fragment IPv6 packets that are
locally generated as needed.

This version of the patch only fragments in tunnel mode, so that fragment
headers will not be obscured by ESP in transport mode.
Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

ad0081e4

19 Dec, 2010 2 commits

ipv6: remove duplicate neigh_ifdown · d1ed113f

Committed by stephen hemminger on Dec 16, 2010

When device is being set to down, neigh_ifdown was being called
twice. Once from addrconf notifier and once from ndisc notifier.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d1ed113f

ipv6: fib6_ifdown cleanup · bc3ef660

Committed by stephen hemminger on Dec 16, 2010

Remove (unnecessary) casts to make code cleaner.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bc3ef660

17 Dec, 2010 3 commits

ipv6: don't flush routes when setting loopback down · 29ba5fed

Committed by stephen hemminger on Dec 16, 2010

When loopback device is being brought down, then keep the route table
entries because they are special. The entries in the local table for
linklocal routes and ::1 address should not be purged.

This is a suboptimal solution to the problem and should be replaced
by a better fix in the future.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

29ba5fed

net: fix nulls list corruptions in sk_prot_alloc · fcbdf09d

Committed by Octavian Purdila on Dec 16, 2010

Special care is taken inside sk_prot_alloc to avoid overwriting
skc_node/skc_nulls_node. We should also avoid overwriting
skc_bind_node/skc_portaddr_node.

The patch fixes the following crash:

 BUG: unable to handle kernel paging request at fffffffffffffff0
 IP: [<ffffffff812ec6dd>] udp4_lib_lookup2+0xad/0x370
 [<ffffffff812ecc22>] __udp4_lib_lookup+0x282/0x360
 [<ffffffff812ed63e>] __udp4_lib_rcv+0x31e/0x700
 [<ffffffff812bba45>] ? ip_local_deliver_finish+0x65/0x190
 [<ffffffff812bbbf8>] ? ip_local_deliver+0x88/0xa0
 [<ffffffff812eda35>] udp_rcv+0x15/0x20
 [<ffffffff812bba45>] ip_local_deliver_finish+0x65/0x190
 [<ffffffff812bbbf8>] ip_local_deliver+0x88/0xa0
 [<ffffffff812bb2cd>] ip_rcv_finish+0x32d/0x6f0
 [<ffffffff8128c14c>] ? netif_receive_skb+0x99c/0x11c0
 [<ffffffff812bb94b>] ip_rcv+0x2bb/0x350
 [<ffffffff8128c14c>] netif_receive_skb+0x99c/0x11c0
Signed-off-by: Leonard Crestez <lcrestez@ixiacom.com>
Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fcbdf09d

ipv6: delete expired route in ip6_pmtu_deliver · d3052b55

Committed by Andrey Vagin on Dec 11, 2010

The first big packets sent to a "low-MTU" client correctly
trigger the creation of a temporary route containing the reduced MTU.

But after the temporary route has expired, a new ICMP6 "packet too big"
will be sent; rt6_pmtu_discovery will find the previous EXPIRED route,
check that its MTU isn't bigger than the one in the ICMP packet, and do
nothing until the temporary route is deleted by gc.

I ran a simple experiment:
while :; do
    time ( dd if=/dev/zero bs=10K count=1 | ssh hostname dd of=/dev/null ) || break;
done

The "time" reports real 0m0.197s if the temporary route hasn't expired, but
it reports real 0m52.837s (!!!!) immediately after a temporary route has
expired.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

d3052b55

16 Dec, 2010 1 commit

netfilter: fix compilation when conntrack is disabled but tproxy is enabled · ae90bdea

Committed by KOVACS Krisztian on Dec 15, 2010

The IPv6 tproxy patches split IPv6 defragmentation off of conntrack, but
failed to update the #ifdef stanzas guarding the defragmentation related
fields and code in skbuff and conntrack related code in nf_defrag_ipv6.c.

This patch adds the required #ifdefs so that IPv6 tproxy can truly be used
without connection tracking.

Original report:
http://marc.info/?l=linux-netdev&m=129010118516341&w=2
Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: KOVACS Krisztian <hidden@balabit.hu>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>

ae90bdea

15 Dec, 2010 1 commit

net: Abstract default MTU metric calculation behind an accessor. · d33e4553

Committed by David S. Miller on Dec 14, 2010

Like RTAX_ADVMSS, make the default calculation go through a dst_ops
method rather than caching the computation in the routing cache
entries.

Now dst metrics are pretty much left as-is when new entries are
created, thus optimizing metric sharing becomes a real possibility.
Signed-off-by: David S. Miller <davem@davemloft.net>

d33e4553

14 Dec, 2010 1 commit

net: Abstract default ADVMSS behind an accessor. · 0dbaee3b

Committed by David S. Miller on Dec 13, 2010

Make all RTAX_ADVMSS metric accesses go through a new helper function,
dst_metric_advmss().

Leave the actual default metric as "zero" in the real metric slot,
and compute the actual default value dynamically via a new dst_ops
AF specific callback.

For stacked IPSEC routes, we use the advmss of the path which
preserves existing behavior.

Unlike ipv4/ipv6, DecNET ties the advmss to the mtu and thus updates
advmss on pmtu updates.  This inconsistency in advmss handling
results in more raw metric accesses than I wish we ended up with.
Signed-off-by: David S. Miller <davem@davemloft.net>

0dbaee3b

13 Dec, 2010 2 commits

ipv6: Demark default hoplimit as zero. · a02e4b7d

Committed by David S. Miller on Dec 12, 2010

This is for consistency with ipv4.  Using "-1" makes
no sense.

It was made this way a long time ago merely to be consistent
with how the ipv6 socket hoplimit "default" is stored.
Signed-off-by: David S. Miller <davem@davemloft.net>

a02e4b7d

net: Abstract RTAX_HOPLIMIT metric accesses behind helper. · 5170ae82

Committed by David S. Miller on Dec 12, 2010

Signed-off-by: David S. Miller <davem@davemloft.net>

5170ae82