1. 14 January 2011 (1 commit)
  2. 17 February 2010 (1 commit)
    • percpu: add __percpu sparse annotations to net · 7d720c3e
      Authored by Tejun Heo
      Add __percpu sparse annotations to net.
      
      These annotations are to make sparse consider percpu variables to be
      in a different address space and warn if accessed without going
      through percpu accessors.  This patch doesn't affect normal builds.
      
      The macro and type tricks around the snmp stats make things a bit
      interesting. The DEFINE/DECLARE_SNMP_STAT() macros mark the target
      field as __percpu, and the SNMP_UPD_PO_STATS() macro is updated
      accordingly. All snmp_mib_*() users which used to cast the argument
      to (void **) are updated to cast it to (void __percpu **); a sketch
      of the annotation pattern follows this entry.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: David S. Miller <davem@davemloft.net>
      Cc: Patrick McHardy <kaber@trash.net>
      Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
      Cc: Vlad Yasevich <vladislav.yasevich@hp.com>
      Cc: netdev@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7d720c3e
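
      A minimal sketch of the annotation pattern, assuming an illustrative
      pkt_stats structure that is not from the patch (alloc_percpu(),
      this_cpu_inc() and __percpu are the real kernel APIs):

          #include <linux/errno.h>
          #include <linux/percpu.h>

          /* illustrative per-cpu stats; __percpu places the pointer in the
           * percpu address space so sparse flags plain dereferences */
          struct pkt_stats {
                  unsigned long rx;
          };

          static struct pkt_stats __percpu *pkt_stats;

          static int stats_init(void)
          {
                  pkt_stats = alloc_percpu(struct pkt_stats);
                  return pkt_stats ? 0 : -ENOMEM;
          }

          static void stats_inc_rx(void)
          {
                  /* the percpu accessor; a plain pkt_stats->rx++ would both
                   * be racy and draw a sparse address-space warning */
                  this_cpu_inc(pkt_stats->rx);
          }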
  3. 9 February 2010 (3 commits)
    • netfilter: nf_conntrack: fix hash resizing with namespaces · d696c7bd
      Authored by Patrick McHardy
      As noticed by Jon Masters <jonathan@jonmasters.org>, the conntrack hash
      size is global rather than per namespace, yet it is modifiable at
      runtime through /sys/module/nf_conntrack/hashsize. Changing the hash
      size, however, only resizes the hash in the current namespace, so other
      namespaces are left with an invalid hash size. This can cause crashes
      when enlarging the hash size, or false negative lookups when shrinking
      it.

      Move the hash size into the per-namespace data and only use the global
      hash size to initialize the per-namespace value when instantiating a
      new namespace. Additionally, restrict hash resizing to init_net for
      now, since other namespaces are not currently handled (a sketch of the
      resulting shape follows this entry).
      
      Cc: stable@kernel.org
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d696c7bd
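
      A rough sketch of the shape of this fix, assuming the per-namespace
      struct and helper names below are illustrative (net_eq(), init_net
      and -EOPNOTSUPP are real kernel symbols):

          #include <linux/errno.h>
          #include <net/net_namespace.h>

          /* sketch: the hash size becomes per-namespace state */
          struct netns_ct_sketch {
                  unsigned int htable_size;       /* was a single global */
          };

          static unsigned int nf_conntrack_htable_size __read_mostly;

          static void sketch_ct_net_init(struct netns_ct_sketch *ct)
          {
                  /* a new namespace copies the global default at creation */
                  ct->htable_size = nf_conntrack_htable_size;
          }

          static int sketch_set_hashsize(struct net *net, unsigned int size)
          {
                  /* runtime resizing is refused outside init_net, since
                   * other namespaces are not handled yet */
                  if (!net_eq(net, &init_net))
                          return -EOPNOTSUPP;

                  /* rehashing into a table of 'size' buckets goes here */
                  return 0;
          }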
    • netfilter: nf_conntrack: per netns nf_conntrack_cachep · 5b3501fa
      Authored by Eric Dumazet
      nf_conntrack_cachep is currently shared by all netns instances, but
      because of SLAB_DESTROY_BY_RCU's special semantics, this is wrong.

      If we use a shared slab cache, one object can instantly move from one
      hash table (netns ONE) to another (netns TWO), and a concurrent reader
      (doing a lookup in netns ONE and 'finding' an object of netns TWO) can
      be fooled without notice, because no RCU grace period has to be
      observed between an object's freeing and its reuse.

      We don't have this problem with the UDP/TCP slab caches because the
      TCP/UDP hash tables are global to the machine (and each object has a
      pointer to its netns).

      If we use per-netns conntrack hash tables, we also *must* use per-netns
      conntrack slab caches, to guarantee an object cannot escape from one
      namespace to another; a sketch of per-netns cache creation follows
      this entry.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      [Patrick: added unique slab name allocation]
      Cc: stable@kernel.org
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      5b3501fa
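
      What per-namespace cache creation with a unique slab name can look
      like, as a sketch (the function is illustrative; kasprintf(),
      kmem_cache_create() and SLAB_DESTROY_BY_RCU are the real APIs, and
      the net->ct field follows the patch description):

          #include <linux/errno.h>
          #include <linux/slab.h>
          #include <net/netfilter/nf_conntrack.h>

          static int sketch_ct_cache_init(struct net *net)
          {
                  /* kmem_cache_create() requires a unique name, so derive
                   * one from the namespace pointer; the string must stay
                   * allocated for the cache's whole lifetime */
                  char *name = kasprintf(GFP_KERNEL, "nf_conntrack_%p", net);

                  if (!name)
                          return -ENOMEM;

                  net->ct.nf_conntrack_cachep =
                          kmem_cache_create(name, sizeof(struct nf_conn), 0,
                                            SLAB_DESTROY_BY_RCU, NULL);
                  if (!net->ct.nf_conntrack_cachep) {
                          kfree(name);
                          return -ENOMEM;
                  }
                  return 0;
          }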
    • netfilter: nf_conntrack: fix hash resizing with namespaces · 9ab48ddc
      Authored by Patrick McHardy
      As noticed by Jon Masters <jonathan@jonmasters.org>, the conntrack hash
      size is global rather than per namespace, yet it is modifiable at
      runtime through /sys/module/nf_conntrack/hashsize. Changing the hash
      size, however, only resizes the hash in the current namespace, so other
      namespaces are left with an invalid hash size. This can cause crashes
      when enlarging the hash size, or false negative lookups when shrinking
      it.

      Move the hash size into the per-namespace data and only use the global
      hash size to initialize the per-namespace value when instantiating a
      new namespace. Additionally, restrict hash resizing to init_net for
      now, since other namespaces are not currently handled.
      
      Cc: stable@kernel.org
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      9ab48ddc
  4. 4 February 2010 (1 commit)
    • netfilter: nf_conntrack: per netns nf_conntrack_cachep · ab59b19b
      Authored by Eric Dumazet
      nf_conntrack_cachep is currently shared by all netns instances, but
      because of SLAB_DESTROY_BY_RCU's special semantics, this is wrong.

      If we use a shared slab cache, one object can instantly move from one
      hash table (netns ONE) to another (netns TWO), and a concurrent reader
      (doing a lookup in netns ONE and 'finding' an object of netns TWO) can
      be fooled without notice, because no RCU grace period has to be
      observed between an object's freeing and its reuse.

      We don't have this problem with the UDP/TCP slab caches because the
      TCP/UDP hash tables are global to the machine (and each object has a
      pointer to its netns).

      If we use per-netns conntrack hash tables, we also *must* use per-netns
      conntrack slab caches, to guarantee an object cannot escape from one
      namespace to another.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      [Patrick: added unique slab name allocation]
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      ab59b19b
  5. 13 June 2009 (2 commits)
    • netfilter: conntrack: optional reliable conntrack event delivery · dd7669a9
      Authored by Pablo Neira Ayuso
      This patch improves ctnetlink event reliability if one broadcast
      listener has set the NETLINK_BROADCAST_ERROR socket option.

      The logic is the following: if an event delivery fails, we keep
      the undelivered events in the missed event cache. Once the next
      packet arrives, we add the new events (if any) to the missed
      events in the cache and try a new delivery, and so on. Thus, if
      ctnetlink fails to deliver an event, we retry the delivery once
      we see a new packet. We may therefore lose state transitions, but
      the userspace process gets back in sync at some point.

      In the worst case, if no events were delivered to userspace, we
      make sure that destroy events are successfully delivered.
      Basically, if ctnetlink fails to deliver the destroy event, we
      remove the conntrack entry from the hashes and insert it into the
      dying list, which contains inactive entries. Then, the conntrack
      timer is re-added with an extra grace timeout of random32() % 15
      seconds to trigger the event again (this grace timeout is tunable
      via /proc); a sketch of this resend step follows the entry. The
      use of a limited random timeout value distributes the "destroy"
      resends, thus avoiding an accumulation of lots of "destroy"
      events at the same time. Events may be delivered out of order,
      but we can identify them by means of the tuple plus the conntrack
      ID.

      The maximum number of conntrack entries (active or inactive) is
      still handled by nf_conntrack_max. Thus, we may start dropping
      packets at some point if we accumulate a lot of inactive
      conntrack entries that did not successfully report the destroy
      event to userspace.

      During my stress tests, consisting of setting a very small buffer
      of 2048 bytes for conntrackd, setting the NETLINK_BROADCAST_ERROR
      socket flag, and generating lots of very small connections, I
      noticed very few destroy entries in flight waiting to be resent.

      A simple way to test this patch consists of creating a lot of
      entries, setting a very small Netlink buffer in conntrackd (plus
      a patch, which is not in the git tree, to set the BROADCAST_ERROR
      flag) and invoking `conntrack -F'.

      For expectations, no changes are introduced by this patch.
      Currently, event delivery is only done for new expectations
      (there are no events for expectation expiration, removal and
      confirmation). To cover those, a per-expectation event cache
      would be needed to implement the same idea exposed in this patch.

      This patch can be useful to provide reliable flow accounting. We
      still have to add a new conntrack extension to store the creation
      and destroy times.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      dd7669a9
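
      The destroy-event resend step, reduced to a sketch (the dying list
      and function name here are illustrative; random32(), mod_timer()
      and the hlist_nulls helpers are real):

          #include <linux/random.h>
          #include <linux/timer.h>
          #include <net/netfilter/nf_conntrack.h>

          /* inactive entries parked here; initialized elsewhere with
           * INIT_HLIST_NULLS_HEAD() */
          static struct hlist_nulls_head sketch_dying;

          static void sketch_defer_destroy_event(struct nf_conn *ct)
          {
                  /* the entry was already removed from the main hashes;
                   * keep it reachable on the dying list */
                  hlist_nulls_add_head(&ct->tuplehash[IP_CT_DIR_ORIGINAL].hnnode,
                                       &sketch_dying);

                  /* re-arm the timer with a 0..14 second random grace
                   * period so failed "destroy" events are resent spread
                   * over time rather than all at once */
                  mod_timer(&ct->timeout, jiffies + (random32() % 15) * HZ);
          }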
    • netfilter: conntrack: move event caching to conntrack extension infrastructure · a0891aa6
      Authored by Pablo Neira Ayuso
      This patch reworks the per-cpu event caching to use the conntrack
      extension infrastructure.

      The main drawback is that we consume more memory per conntrack if
      event delivery is enabled. This patch is required by the reliable
      event delivery that follows it; a sketch of the extension
      registration follows this entry.

      BTW, this patch allows you to enable/disable event delivery at
      runtime via /proc/sys/net/netfilter/nf_conntrack_events, although
      you can still disable event caching as a compile-time option.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      a0891aa6
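
      Roughly, hanging the event cache off the extension infrastructure
      looks as follows; the struct layout is illustrative, while
      nf_ct_extend_register() and NF_CT_EXT_ECACHE are the real
      registration API:

          #include <net/netfilter/nf_conntrack_extend.h>

          struct sketch_ecache {
                  unsigned long cache;    /* pending events for this ct */
                  unsigned long missed;   /* events whose delivery failed */
          };

          static struct nf_ct_ext_type sketch_ecache_extend = {
                  .len    = sizeof(struct sketch_ecache),
                  .align  = __alignof__(struct sketch_ecache),
                  .id     = NF_CT_EXT_ECACHE,
          };

          static int __init sketch_ecache_init(void)
          {
                  /* once registered, the cache is attached per conntrack
                   * via nf_ct_ext_add(), so the extra memory is only paid
                   * when event delivery is enabled */
                  return nf_ct_extend_register(&sketch_ecache_extend);
          }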
  6. 26 March 2009 (1 commit)
    • netfilter: nf_conntrack: use SLAB_DESTROY_BY_RCU and get rid of call_rcu() · ea781f19
      Authored by Eric Dumazet
      Use "hlist_nulls" infrastructure we added in 2.6.29 for RCUification of UDP & TCP.
      
      This permits an easy conversion from call_rcu() based hash lists to a
      SLAB_DESTROY_BY_RCU one.
      
      Avoiding call_rcu() delay at nf_conn freeing time has numerous gains.
      
      First, it doesnt fill RCU queues (up to 10000 elements per cpu).
      This reduces OOM possibility, if queued elements are not taken into account
      This reduces latency problems when RCU queue size hits hilimit and triggers
      emergency mode.
      
      - It allows fast reuse of just freed elements, permitting better use of
      CPU cache.
      
      - We delete rcu_head from "struct nf_conn", shrinking size of this structure
      by 8 or 16 bytes.
      
      This patch only takes care of "struct nf_conn".
      call_rcu() is still used for less critical conntrack parts, that may
      be converted later if necessary.
      Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      ea781f19
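
      The lookup pattern this conversion requires, in sketch form (closely
      modeled on the conntrack find loop; the helpers are real, the
      function shape is illustrative):

          #include <linux/rculist_nulls.h>
          #include <net/netfilter/nf_conntrack.h>

          /* must run under rcu_read_lock() */
          static struct nf_conntrack_tuple_hash *
          sketch_find(struct net *net, const struct nf_conntrack_tuple *tuple,
                      unsigned int bucket)
          {
                  struct nf_conntrack_tuple_hash *h;
                  struct hlist_nulls_node *n;

          begin:
                  hlist_nulls_for_each_entry_rcu(h, n, &net->ct.hash[bucket],
                                                 hnnode) {
                          if (nf_ct_tuple_equal(tuple, &h->tuple))
                                  return h;       /* the caller must still
                                                   * take a reference and
                                                   * re-check the tuple: a
                                                   * freed object may be
                                                   * reused with no grace
                                                   * period */
                  }
                  /* a nulls value that does not match the bucket means the
                   * entry we were walking moved to another chain; restart */
                  if (get_nulls_value(n) != bucket)
                          goto begin;
                  return NULL;
          }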
  7. 8 October 2008 (11 commits)