Commits · f5c7e1a47aeca2b31106aa94e7f4daa218e6c478 · openanolis / cloud-kernel

02 Sep, 2014 2 commits

xfrm: configure policy hash table thresholds by netlink · 880a6fab

Authored by Christophe Gouault on Aug 29, 2014

Allow local and remote prefix length thresholds for the policy hash
table to be specified via a netlink XFRM_MSG_NEWSPDINFO message.

Prefix length thresholds are specified by the optional
XFRMA_SPD_IPV4_HTHRESH and XFRMA_SPD_IPV6_HTHRESH attributes
(struct xfrmu_spdhthresh).

Example:

    struct xfrmu_spdhthresh thresh4 = {
        .lbits = 0,
        .rbits = 24,
    };
    struct xfrmu_spdhthresh thresh6 = {
        .lbits = 0,
        .rbits = 56,
    };
    struct nlmsghdr *hdr;
    struct nl_msg *msg;

    /* sk: a struct nl_sock * already connected to NETLINK_XFRM */
    msg = nlmsg_alloc();
    /* The message type is XFRM_MSG_NEWSPDINFO; thresholds travel as attributes. */
    hdr = nlmsg_put(msg, NL_AUTO_PORT, NL_AUTO_SEQ, XFRM_MSG_NEWSPDINFO, sizeof(__u32), NLM_F_REQUEST);
    nla_put(msg, XFRMA_SPD_IPV4_HTHRESH, sizeof(thresh4), &thresh4);
    nla_put(msg, XFRMA_SPD_IPV6_HTHRESH, sizeof(thresh6), &thresh6);
    nl_send_auto(sk, msg);

The values are the minimum selector prefix lengths a policy must have
to be put in the hash table.

- lbits is the local threshold (source address for out policies,
  destination address for in and fwd policies).

- rbits is the remote threshold (destination address for out
  policies, source address for in and fwd policies).

The default values are:

XFRMA_SPD_IPV4_HTHRESH: 32 32
XFRMA_SPD_IPV6_HTHRESH: 128 128

The SPD is rebuilt dynamically when the threshold values are changed.

The current thresholds can be read via an XFRM_MSG_GETSPDINFO request:
the kernel replies to XFRM_MSG_GETSPDINFO requests with an
XFRM_MSG_NEWSPDINFO message that carries both the
XFRMA_SPD_IPV4_HTHRESH and XFRMA_SPD_IPV6_HTHRESH attributes.
Signed-off-by: Christophe Gouault <christophe.gouault@6wind.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
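The way lbits and rbits map onto source and destination addresses, as described in the commit message above, can be sketched in C. This is a minimal illustration and not the kernel's code: `struct spd_hthresh` is a local stand-in for struct xfrmu_spdhthresh (assumed here to hold two 8-bit fields), and `enum policy_dir` and `thresholds_for_dir()` are hypothetical names used only for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for struct xfrmu_spdhthresh (assumption: two 8-bit fields). */
struct spd_hthresh {
    uint8_t lbits;  /* local prefix length threshold */
    uint8_t rbits;  /* remote prefix length threshold */
};

/* Hypothetical direction enum, for this sketch only. */
enum policy_dir { DIR_IN, DIR_OUT, DIR_FWD };

/*
 * Translate local/remote thresholds into source/destination thresholds.
 * For out policies the local side is the source address; for in and fwd
 * policies the local side is the destination address.
 */
void thresholds_for_dir(enum policy_dir dir, const struct spd_hthresh *t,
                        uint8_t *sbits, uint8_t *dbits)
{
    assert(t && sbits && dbits);
    if (dir == DIR_OUT) {
        *sbits = t->lbits;
        *dbits = t->rbits;
    } else {
        *sbits = t->rbits;
        *dbits = t->lbits;
    }
}
```

With the thresh4 values from the example (lbits = 0, rbits = 24), an out policy gets a source threshold of 0 and a destination threshold of 24, while in and fwd policies get the reverse.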

xfrm: hash prefixed policies based on preflen thresholds · b58555f1

Authored by Christophe Gouault on Aug 29, 2014

The idea is an extension of the current policy hashing.

Today only non-prefixed policies are stored in a hash table. This
patch relaxes the constraint, and hashes policies whose prefix
lengths are greater than or equal to a configurable threshold.

Each hash table (one per direction) maintains its own set of IPv4 and
IPv6 thresholds (dbits4, sbits4, dbits6, sbits6), by default (32, 32,
128, 128).

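Example: if the output hash table is configured with the values (16, 24,
56, 64), read here as source and destination thresholds (sbits4 = 16,
dbits4 = 24, sbits6 = 56, dbits6 = 64), which is the reading that
matches the results below: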

ip xfrm policy add dir out src 10.22.0.0/20 dst 10.24.1.0/24 ... => hashed
ip xfrm policy add dir out src 10.22.0.0/16 dst 10.24.1.1/32 ... => hashed
ip xfrm policy add dir out src 10.22.0.0/16 dst 10.24.0.0/16 ... => unhashed

ip xfrm policy add dir out \
    src 3ffe:304:124:2200::/60 dst 3ffe:304:124:2401::/64 ...    => hashed
ip xfrm policy add dir out \
    src 3ffe:304:124:2200::/56 dst 3ffe:304:124:2401::2/128 ...  => hashed
ip xfrm policy add dir out \
    src 3ffe:304:124:2200::/56 dst 3ffe:304:124:2400::/56 ...    => unhashed

The high order bits of the addresses (up to the threshold) are used to
compute the hash key.
Signed-off-by: Christophe Gouault <christophe.gouault@6wind.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
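The hashing rule from the commit message above can be sketched as a small C check. This is an illustration under stated assumptions, not the kernel's implementation: `struct hash_thresholds`, `policy4_is_hashed()`, `policy6_is_hashed()` and `ipv4_hash_prefix()` are hypothetical names, addresses are taken in host byte order, and the example values (16, 24, 56, 64) are assumed to mean source 16 / destination 24 for IPv4 and source 56 / destination 64 for IPv6, since that reading reproduces the hashed/unhashed results listed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Per-direction thresholds; field names follow the commit message. */
struct hash_thresholds {
    uint8_t dbits4, sbits4;  /* IPv4 destination/source thresholds */
    uint8_t dbits6, sbits6;  /* IPv6 destination/source thresholds */
};

/* A policy is hashed only if both selector prefixes reach their threshold. */
bool policy4_is_hashed(const struct hash_thresholds *t,
                       uint8_t src_prefixlen, uint8_t dst_prefixlen)
{
    return src_prefixlen >= t->sbits4 && dst_prefixlen >= t->dbits4;
}

bool policy6_is_hashed(const struct hash_thresholds *t,
                       uint8_t src_prefixlen, uint8_t dst_prefixlen)
{
    return src_prefixlen >= t->sbits6 && dst_prefixlen >= t->dbits6;
}

/* Keep only the top `bits` bits of a host-order IPv4 address for the hash key. */
uint32_t ipv4_hash_prefix(uint32_t addr, uint8_t bits)
{
    assert(bits <= 32);
    if (bits == 0)
        return 0;  /* a 32-bit shift by 32 would be undefined */
    return addr & (UINT32_MAX << (32 - bits));
}
```

Under these assumptions, a policy with src 10.22.0.0/20 and dst 10.24.1.0/24 passes both comparisons, while src 10.22.0.0/16 with dst 10.24.0.0/16 fails the destination comparison and stays in the unhashed list.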

11 Mar, 2014 1 commit

flowcache: restore a single flow_cache kmem_cache · d32d9bb8

Authored by Eric Dumazet on Mar 10, 2014

It is not legal to create multiple kmem_caches with the same name.

flowcache can use a single kmem_cache; there is no need for a
per-netns one.

Fixes: ca925cf1 ("flowcache: Make flow cache name space aware")
Reported-by: Jakub Kicinski <moorray3@wp.pl>
Tested-by: Jakub Kicinski <moorray3@wp.pl>
Tested-by: Fan Du <fan.du@windriver.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

19 Feb, 2014 1 commit

xfrm: Remove caching of xfrm_policy_sk_bundles · 1a1ccc96

Authored by Steffen Klassert on Feb 19, 2014

We currently cache socket policy bundles at xfrm_policy_sk_bundles.
These cached bundles are never used. Instead we create and cache
a new one whenever xfrm_lookup() is called on a socket policy.

Most protocols cache the used routes to the socket, so let's
remove the unused caching of socket policy bundles in xfrm.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

12 Feb, 2014 1 commit

flowcache: Make flow cache name space aware · ca925cf1

Authored by Fan Du on Jan 18, 2014

Inserting an entry into the flowcache, or flushing the flowcache, should
be scoped per net namespace. In the original implementation, a flush
issued from a fat netns crammed with flow entries also makes the entries
of a slim netns, which holds only a few, go away.

Since the flowcache is tightly coupled with IPsec, it is easier to put
the flow cache global parameters into the xfrm namespace part. One last
thing that needs doing: bumping the flow cache genid and flushing the
flow cache should also be done per net namespace.
Signed-off-by: Fan Du <fan.du@windriver.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

06 Dec, 2013 2 commits

xfrm: Remove ancient sleeping when the SA is in acquire state · 5b8ef341

Authored by Steffen Klassert on Aug 27, 2013

We now queue packets to the policy if the states are not yet resolved;
this replaces the ancient sleeping code. The sleeping can also cause
indefinite task hangs if the needed state does not get resolved.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

xfrm: Namespacify xfrm state/policy locks · 283bc9f3

Authored by Fan Du on Nov 07, 2013

By semantics, the xfrm layer is fully namespace aware, and so
should its locks be, e.g. xfrm_state_lock and xfrm_policy_lock.
Guarding the state/policy linked lists of different
namespaces with one global lock is semantically wrong
in the first place, since the namespaces are mutually
independent of each other; more seriously, it causes a
scalability problem.

One practical scenario is an Open Network Stack host
where hundreds of lxc tenants act as routers: a global
xfrm_state_lock/xfrm_policy_lock becomes the bottleneck.
Once those locks are decoupled per namespace, lock
contention stays within the scope of a single namespace
and causes no additional SPD/SAD access delay for the
other namespaces.

This patch also improves scalability without changing
the original xfrm behavior.
Signed-off-by: Fan Du <fan.du@windriver.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
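The effect of moving from one global lock to one lock per namespace can be illustrated with a userspace analogy in C11. This is only a sketch: the kernel uses its own locking primitives, and `struct demo_netns`, its `policy_lock` field and the `demo_*` functions are hypothetical names, with an `atomic_flag` standing in for the real lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-namespace state: each namespace owns its policy lock. */
struct demo_netns {
    atomic_flag policy_lock;    /* stands in for a per-netns policy lock */
    unsigned int policy_count;  /* data protected by policy_lock */
};

void demo_netns_init(struct demo_netns *ns)
{
    assert(ns);
    atomic_flag_clear(&ns->policy_lock);
    ns->policy_count = 0;
}

/* Try to take this namespace's lock; returns true on success. */
bool demo_policy_trylock(struct demo_netns *ns)
{
    return !atomic_flag_test_and_set_explicit(&ns->policy_lock,
                                              memory_order_acquire);
}

void demo_policy_unlock(struct demo_netns *ns)
{
    atomic_flag_clear_explicit(&ns->policy_lock, memory_order_release);
}

/* Insert a policy while holding only this namespace's lock. */
void demo_policy_insert(struct demo_netns *ns)
{
    while (!demo_policy_trylock(ns))
        ;  /* spin: only users of the same namespace contend here */
    ns->policy_count++;
    demo_policy_unlock(ns);
}
```

Holding the lock of one `demo_netns` does not prevent `demo_policy_trylock()` from succeeding on another, which is the decoupling the commit message describes.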

12 Dec, 2011 1 commit

net: use IS_ENABLED(CONFIG_IPV6) · dfd56b8b

Authored by Eric Dumazet on Dec 10, 2011

Instead of testing defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
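The difference between the two styles of test can be shown with a userspace sketch. The kernel's real IS_ENABLED() lives in include/linux/kconfig.h and is defined differently; the `DEMO_*` macros below are a simplified re-creation written for this example, and they only recognise options defined as 1. The file pretends that IPv6 was configured as a module, so only CONFIG_IPV6_MODULE is defined.

```c
#include <assert.h>

/* Pretend IPv6 was configured as a module: only the _MODULE symbol exists. */
#define CONFIG_IPV6_MODULE 1

/* Simplified stand-ins for the kconfig helpers; DEMO_ names are not kernel names. */
#define DEMO_ARG_PLACEHOLDER_1 0,
#define DEMO_TAKE_SECOND(ignored, val, ...) val
#define DEMO_IS_DEFINED(x)    DEMO_IS_DEFINED_2(x)
#define DEMO_IS_DEFINED_2(v)  DEMO_IS_DEFINED_3(DEMO_ARG_PLACEHOLDER_##v)
#define DEMO_IS_DEFINED_3(arg_or_junk) DEMO_TAKE_SECOND(arg_or_junk 1, 0, 0)
#define DEMO_IS_ENABLED(option) \
    (DEMO_IS_DEFINED(option) || DEMO_IS_DEFINED(option##_MODULE))

/* Old style: two explicit preprocessor tests. */
#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)
#define OLD_STYLE_IPV6 1
#else
#define OLD_STYLE_IPV6 0
#endif

/* New style: one test that covers both built-in and module. */
#if DEMO_IS_ENABLED(CONFIG_IPV6)
#define NEW_STYLE_IPV6 1
#else
#define NEW_STYLE_IPV6 0
#endif

/* Returns nonzero when both styles reach the same conclusion. */
int ipv6_checks_agree(void)
{
    assert(OLD_STYLE_IPV6 == NEW_STYLE_IPV6);
    return OLD_STYLE_IPV6 == NEW_STYLE_IPV6;
}
```

Because the macro expands to a constant expression, the same test can also be used inside ordinary C conditions rather than only in `#if` directives.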

18 Oct, 2010 1 commit

netns: reorder fields in struct net · 8e602ce2

Authored by Eric Dumazet on Oct 14, 2010

In a network bench, I noticed an unfortunate false sharing between
'loopback_dev' and 'count' fields in "struct net".

'count' is written each time a socket is created or destroyed, while
loopback_dev might be often read in routing code.

Move loopback_dev into a read-mostly section of "struct net".

Note: struct netns_xfrm is cache line aligned on SMP.
(It contains a "struct dst_ops")
Move it to the end to avoid holes, and reduce sizeof(struct net) by 128
bytes on ia32.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
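The false sharing described in the commit message above comes from a frequently written field and a frequently read field sharing one cache line. A minimal C11 sketch of the fix follows; `struct demo_net` is a heavily trimmed, hypothetical stand-in for "struct net", the 64-byte line size is an assumption, and `_Alignas` is used in place of the kernel's own alignment annotations.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_CACHE_LINE 64  /* assumption: 64-byte cache lines */

/* Hypothetical, heavily trimmed stand-in for "struct net". */
struct demo_net {
    /* Written each time a socket is created or destroyed. */
    int count;

    /* Read-mostly section, pushed onto its own cache line. */
    _Alignas(DEMO_CACHE_LINE) void *loopback_dev;
};

/* Index of the cache line that holds a given byte offset in the struct. */
size_t demo_cache_line_of(size_t offset)
{
    return offset / DEMO_CACHE_LINE;
}

/* Nonzero when the hot counter and the read-mostly pointer are on different lines. */
int demo_fields_separated(void)
{
    size_t hot = demo_cache_line_of(offsetof(struct demo_net, count));
    size_t cold = demo_cache_line_of(offsetof(struct demo_net, loopback_dev));

    /* The aligned member forces the struct size to a multiple of the line size. */
    assert(sizeof(struct demo_net) % DEMO_CACHE_LINE == 0);
    return hot != cold;
}
```

The price of the separation is padding after `count`, which is why the commit also pays attention to where aligned members sit in the structure.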

25 Jan, 2010 1 commit

netns xfrm: deal with dst entries in netns · d7c7544c

Authored by Alexey Dobriyan on Jan 24, 2010

GC is non-existent in netns, so after you hit GC threshold, no new
dst entries will be created until someone triggers cleanup in init_net.

Make xfrm4_dst_ops and xfrm6_dst_ops per-netns.
This is not done in a generic way, because it would waste
(AF_MAX - 2) * sizeof(struct dst_ops) bytes per-netns.

Reorder GC threshold initialization so it'd be done before registering
XFRM policies.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

04 Dec, 2009 1 commit

net: Allow xfrm_user_net_exit to batch efficiently. · d79d792e

Authored by Eric W. Biederman on Dec 03, 2009

xfrm.nlsk is provided by the xfrm_user module and is accessed via RCU from
other parts of the xfrm code.  Add xfrm.nlsk_stash, a copy of xfrm.nlsk that
will never be set to NULL.  This allows the synchronize_net and
netlink_kernel_release to be deferred until a whole batch of xfrm.nlsk sockets
have been set to NULL.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

26 Nov, 2008 20 commits

netns xfrm: per-netns sysctls · b27aeadb

Authored by Alexey Dobriyan on Nov 25, 2008

Make
	net.core.xfrm_aevent_etime
	net.core.xfrm_acq_expires
	net.core.xfrm_aevent_rseqth
	net.core.xfrm_larval_drop

sysctls per-netns.

To do that, make net_core_path[] global, register it to prevent two
/proc/net/core entries, and change the initcall position: xfrm_init() is
called from fs_initcall, so this one should be fs_initcall at least.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netns xfrm: per-netns NETLINK_XFRM socket · a6483b79

Authored by Alexey Dobriyan on Nov 25, 2008

Stub senders to init_net's one temporarily.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netns xfrm: per-netns policy hash resizing work · 66caf628

Authored by Alexey Dobriyan on Nov 25, 2008

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netns xfrm: per-netns policy counts · dc2caba7

Authored by Alexey Dobriyan on Nov 25, 2008

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netns xfrm: per-netns xfrm_policy_bydst hash · a35f6c5d

Authored by Alexey Dobriyan on Nov 25, 2008

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netns xfrm: per-netns inexact policies · 8b18f8ea

Authored by Alexey Dobriyan on Nov 25, 2008

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netns xfrm: per-netns xfrm_policy_byidx hashmask · 8100bea7

Authored by Alexey Dobriyan on Nov 25, 2008

Per-netns hashes are independently resizeable.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netns xfrm: per-netns xfrm_policy_byidx hash · 93b851c1

Authored by Alexey Dobriyan on Nov 25, 2008

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netns xfrm: per-netns policy list · adfcf0b2

Authored by Alexey Dobriyan on Nov 25, 2008

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netns xfrm: per-netns km_waitq · 50a30657

Authored by Alexey Dobriyan on Nov 25, 2008

Disallow spurious wakeups in __xfrm_lookup().
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netns xfrm: per-netns state GC work · c7837144

Authored by Alexey Dobriyan on Nov 25, 2008

State GC is per-netns, and this is part of it.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netns xfrm: per-netns state GC list · b8a0ae20

Authored by Alexey Dobriyan on Nov 25, 2008

km_waitq is going to be made per-netns to disallow spurious wakeups
in __xfrm_lookup().

To not wakeup after every garbage-collected xfrm_state (which potentially
can be from different netns) make state GC list per-netns.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netns xfrm: per-netns xfrm_hash_work · 63082733

Authored by Alexey Dobriyan on Nov 25, 2008

All of this implicitly passes which netns's hashes should be resized.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netns xfrm: per-netns xfrm_state counts · 0bf7c5b0

Authored by Alexey Dobriyan on Nov 25, 2008

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netns xfrm: per-netns xfrm_state_hmask · 529983ec

Authored by Alexey Dobriyan on Nov 25, 2008

Since hashtables are per-netns, they can be independently resized.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netns xfrm: per-netns xfrm_state_byspi hash · b754a4fd

Authored by Alexey Dobriyan on Nov 25, 2008

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netns xfrm: per-netns xfrm_state_bysrc hash · d320bbb3

Authored by Alexey Dobriyan on Nov 25, 2008

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netns xfrm: per-netns xfrm_state_bydst hash · 73d189dc

Authored by Alexey Dobriyan on Nov 25, 2008

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netns xfrm: per-netns xfrm_state_all list · 9d4139c7

Authored by Alexey Dobriyan on Nov 25, 2008

This is done to get
a) simple "something leaked" check
b) cover possible DoSes when other netns puts many, many xfrm_states
   onto a list.
c) not miss "alien xfrm_state" check in some of list iterators in future.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netns xfrm: add netns boilerplate · d62ddc21

Authored by Alexey Dobriyan on Nov 25, 2008

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
