Commits · a43dce93587bfb5f65fa40647977ef72a7ba6699 · openeuler / Kernel

11 Aug, 2017 · 1 commit

net: xfrm: support setting an output mark. · 077fbac4

Committed by Lorenzo Colitti on Aug 11, 2017

On systems that use mark-based routing it may be necessary for
routing lookups to use marks in order for packets to be routed
correctly. An example of such a system is Android, which uses
socket marks to route packets via different networks.

Currently, routing lookups in tunnel mode always use a mark of
zero, making routing incorrect on such systems.

This patch adds a new output_mark element to the xfrm state and
a corresponding XFRMA_OUTPUT_MARK netlink attribute. The output
mark differs from the existing xfrm mark in two ways:

1. The xfrm mark is used to match xfrm policies and states, while
the xfrm output mark is used to set the mark (and influence
the routing) of the packets emitted by those states.
2. The existing mark is constrained to be a subset of the bits of
the originating socket or transformed packet, but the output
mark is arbitrary and depends only on the state.

The use of a separate mark provides additional flexibility. For
example:

- A packet subject to two transforms (e.g., transport mode inside
tunnel mode) can have two different output marks applied to it,
one for the transport mode SA and one for the tunnel mode SA.
- On a system where socket marks determine routing, the packets
emitted by an IPsec tunnel can be routed based on a mark that
is determined by the tunnel, not by the marks of the
unencrypted packets.
- Support for setting the output marks can be introduced without
breaking any existing setups that employ both mark-based
routing and xfrm tunnel mode. Simply changing the code to use
the xfrm mark for routing output packets could change behaviour
in a way that breaks these setups.

If the output mark is unspecified or set to zero, the mark is not
set or changed.

Tested: make allyesconfig; make -j64
Tested: https://android-review.googlesource.com/452776
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

077fbac4
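
The output-mark rule described in commit 077fbac4 above (the mark is applied only when it is non-zero, and it is independent of the mark used for matching) can be sketched in plain C. This is a minimal userspace illustration under stated assumptions, not the kernel change itself: `sketch_state`, `sketch_packet` and `sketch_apply_output_mark` are hypothetical names standing in for the xfrm state and the emitted packet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for an xfrm state. */
struct sketch_state {
    uint32_t match_mark;   /* existing xfrm mark: only used to match policies/states */
    uint32_t output_mark;  /* output mark: 0 means "do not set or change the mark" */
};

/* Hypothetical stand-in for an emitted packet. */
struct sketch_packet {
    uint32_t mark;         /* mark that a mark-based routing lookup would consult */
};

/* Apply the state's output mark to a packet emitted by that state. */
static void sketch_apply_output_mark(const struct sketch_state *x,
                                     struct sketch_packet *pkt)
{
    assert(x != NULL && pkt != NULL);
    if (x->output_mark != 0)
        pkt->mark = x->output_mark;   /* arbitrary value, depends only on the state */
}
```

With two nested transforms, calling the helper once per state would let each SA stamp its own mark, which is the flexibility the message refers to.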

19 Jul, 2017 · 2 commits

xfrm: remove flow cache · 09c75704

Committed by Florian Westphal on Jul 17, 2017

After rcu conversions performance degradation in forward tests isn't that
noticeable anymore.

See next patch for some numbers.

A followup patch could then also remove genid from the policies
as we do not cache bundles anymore.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

09c75704

net: xfrm: revert to lower xfrm dst gc limit · 3c2a89dd

Committed by Florian Westphal on Jul 17, 2017

Revert c386578f ("xfrm: Let the flowcache handle its size by default.").

Once we remove flow cache, we don't have a flow cache limit anymore.
We must not allow (virtually) unlimited allocations of xfrm dst entries.
Revert back to the old xfrm dst gc limits.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

3c2a89dd

09 Feb, 2017 · 3 commits

xfrm: policy: make policy backend const · 37b10383

Committed by Florian Westphal on Feb 07, 2017

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

37b10383

xfrm: policy: remove family field · a2817d8b

Committed by Florian Westphal on Feb 07, 2017

It was only needed to register the policy backend at init time.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

a2817d8b

xfrm: policy: remove garbage_collect callback · 3d7d25a6

Committed by Florian Westphal on Feb 07, 2017

Just call xfrm_garbage_collect_deferred() directly.
This gets rid of a write to afinfo in register/unregister and allows
afinfo to be constified later on.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

3d7d25a6

11 Sep, 2016 · 1 commit

net: l3mdev: remove redundant calls · e0d56fdd

Committed by David Ahern on Sep 10, 2016

A previous patch added l3mdev flow update making these hooks
redundant. Remove them.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e0d56fdd

22 Aug, 2016 · 1 commit

xfrm: Only add l3mdev oif to dst lookups · 11d7a0bb

Committed by David Ahern on Aug 14, 2016

Subash reported that commit 42a7b32b ("xfrm: Add oif to dst lookups")
broke a wifi use case that uses fib rules and xfrms. The intent of
42a7b32b was driven by VRFs with IPsec. As a compromise, relax the
use of oif in xfrm lookups to L3 master devices only (i.e., oif is either
an L3 master device or is enslaved to a master device).

Fixes: 42a7b32b ("xfrm: Add oif to dst lookups")
Reported-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

11d7a0bb

17 Jun, 2016 · 1 commit

net: xfrm: fix old-style declaration · 318d3cc0

Committed by Arnd Bergmann on Jun 16, 2016

Modern C standards expect the '__inline__' keyword to come before the return
type in a declaration, and we get a couple of warnings for this with "make W=1"
in the xfrm{4,6}_policy.c files:

net/ipv6/xfrm6_policy.c:369:1: error: 'inline' is not at beginning of declaration [-Werror=old-style-declaration]
static int inline xfrm6_net_sysctl_init(struct net *net)
net/ipv6/xfrm6_policy.c:374:1: error: 'inline' is not at beginning of declaration [-Werror=old-style-declaration]
static void inline xfrm6_net_sysctl_exit(struct net *net)
net/ipv4/xfrm4_policy.c:339:1: error: 'inline' is not at beginning of declaration [-Werror=old-style-declaration]
static int inline xfrm4_net_sysctl_init(struct net *net)
net/ipv4/xfrm4_policy.c:344:1: error: 'inline' is not at beginning of declaration [-Werror=old-style-declaration]
static void inline xfrm4_net_sysctl_exit(struct net *net)
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

318d3cc0

03 Nov, 2015 · 1 commit

xfrm: dst_entries_init() per-net dst_ops · a8a572a6

Committed by Dan Streetman on Oct 29, 2015

Remove the dst_entries_init/destroy calls for xfrm4 and xfrm6 dst_ops
templates; their dst_entries counters will never be used. Move the
xfrm dst_ops initialization from the common xfrm/xfrm_policy.c to
xfrm4/xfrm4_policy.c and xfrm6/xfrm6_policy.c, and call dst_entries_init
and dst_entries_destroy for each net namespace.

The ipv4 and ipv6 xfrms each create a dst_ops template, and perform
dst_entries_init on the templates. The template values are copied to each
net namespace's xfrm.xfrm*_dst_ops. The problem there is the dst_ops
pcpuc_entries field is a percpu counter and cannot be used correctly by
simply copying it to another object.

The result of this is a very subtle bug; changes to the dst entries
counter from one net namespace may sometimes get applied to a different
net namespace dst entries counter. This is because of how the percpu
counter works; it has a main count field as well as a pointer to the
percpu variables. Each net namespace maintains its own main count
variable, but all point to one set of percpu variables. When any net
namespace happens to change one of the percpu variables to outside its
small batch range, its count is moved to the net namespace's main count
variable. So with multiple net namespaces operating concurrently, the
dst_ops entries counter can stray from the actual value that it should
be; if counts are consistently moved from one net namespace to another
(which my testing showed is likely), then one net namespace winds up
with a negative dst_ops count while another winds up with a continually
increasing count, eventually reaching its gc_thresh limit, which causes
all new traffic on the net namespace to fail with -ENOBUFS.
Signed-off-by: Dan Streetman <dan.streetman@canonical.com>
Signed-off-by: Dan Streetman <ddstreet@ieee.org>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

a8a572a6
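
The counter drift described in commit a8a572a6 above can be reproduced in a small userspace model. This is only a sketch under stated assumptions: `sketch_counter` imitates the shape of a percpu counter (a main count plus a pointer to per-CPU deltas), and the batch size, CPU count and all `sketch_*` names are hypothetical. Copying an initialised template by value aliases the per-CPU storage, so a delta added by one copy can be folded into another copy's main count.

```c
#include <assert.h>
#include <stdlib.h>

#define SKETCH_NCPUS 2
#define SKETCH_BATCH 4

/* Hypothetical model of a percpu counter. */
struct sketch_counter {
    long count;     /* main count: one per struct copy */
    long *percpu;   /* per-CPU deltas: shared if the struct is copied by value */
};

static int sketch_counter_init(struct sketch_counter *c)
{
    c->count = 0;
    c->percpu = calloc(SKETCH_NCPUS, sizeof(*c->percpu));
    return c->percpu ? 0 : -1;
}

/* Add n on the given CPU; fold the delta into the main count once it
 * leaves the batch range. */
static void sketch_counter_add(struct sketch_counter *c, int cpu, long n)
{
    assert(cpu >= 0 && cpu < SKETCH_NCPUS);
    long v = c->percpu[cpu] + n;
    if (v >= SKETCH_BATCH || v <= -SKETCH_BATCH) {
        c->count += v;
        c->percpu[cpu] = 0;
    } else {
        c->percpu[cpu] = v;
    }
}

/* Returns namespace B's main count and stores namespace A's in *a_count. */
static long sketch_shared_percpu_demo(long *a_count)
{
    struct sketch_counter tmpl, ns_a, ns_b;

    if (sketch_counter_init(&tmpl) != 0)
        return -1;
    ns_a = tmpl;                        /* buggy pattern: both copies alias */
    ns_b = tmpl;                        /* the template's percpu storage    */

    sketch_counter_add(&ns_a, 0, 3);    /* A adds 3: stays in the shared slot */
    sketch_counter_add(&ns_b, 0, 1);    /* B adds 1: slot hits the batch and  */
                                        /* all 4 are folded into B's count    */
    *a_count = ns_a.count;
    long b_count = ns_b.count;
    free(tmpl.percpu);
    return b_count;
}
```

In this model B ends up credited with A's three entries, which mirrors the "counts moved from one net namespace to another" effect; initialising the counter separately for each namespace avoids the aliasing.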

23 Oct, 2015 · 2 commits

xfrm4: Reload skb header pointers after calling pskb_may_pull. · ea673a4d

Committed by Steffen Klassert on Oct 23, 2015

A call to pskb_may_pull may change the pointers into the packet,
so reload the pointers after the call.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

ea673a4d
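
The reload pattern from commit ea673a4d above can be illustrated with a userspace buffer that may move when it grows. This is a sketch under assumptions: `sketch_buf` and `sketch_may_pull` are hypothetical stand-ins for the skb and for pskb_may_pull, and the stand-in always reallocates in order to make the pointer invalidation visible.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the linear part of a packet buffer. */
struct sketch_buf {
    unsigned char *data;
    size_t len;
};

/* Hypothetical stand-in for pskb_may_pull(): make sure at least 'need'
 * bytes are available. Growing moves the data, so pointers into the old
 * buffer become invalid. Returns 1 on success, 0 on failure. */
static int sketch_may_pull(struct sketch_buf *b, size_t need)
{
    if (need <= b->len)
        return 1;
    unsigned char *p = malloc(need);
    if (!p)
        return 0;
    memcpy(p, b->data, b->len);
    memset(p + b->len, 0, need - b->len);   /* zero-fill the new tail */
    free(b->data);
    b->data = p;
    b->len = need;
    return 1;
}

/* Read byte 'idx' of the header that starts at offset 'hdr_off'. */
static int sketch_read_after_pull(struct sketch_buf *b, size_t hdr_off, size_t idx)
{
    assert(hdr_off < b->len);
    const unsigned char *hdr = b->data + hdr_off;   /* valid only until the pull */

    if (!sketch_may_pull(b, hdr_off + idx + 1))
        return -1;

    hdr = b->data + hdr_off;    /* reload: the old value may point at freed memory */
    return hdr[idx];
}
```

Keeping the header position as an offset rather than a pointer makes the reload a one-line recomputation.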

xfrm4: Fix header checks in _decode_session4. · 1a14f1e5

Committed by Steffen Klassert on Oct 23, 2015

We skip the header information if the data pointer already points
behind the header in question for some protocols. This is because,
in this case, we call pskb_may_pull with a negative value that is
converted to unsigned int. Skipping the header information can lead
to incorrect policy lookups, so fix it by checking the data pointer
position before we call pskb_may_pull.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

1a14f1e5
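
The arithmetic problem behind commit 1a14f1e5 above is that a negative pointer difference, once converted to an unsigned length, becomes a very large value. The sketch below shows the unchecked and the checked variants side by side; the helper names are hypothetical and the code only models the length computation, not the kernel's _decode_session4 itself.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Unchecked: if 'data' already points behind 'hdr_end', the negative
 * difference wraps to a huge unsigned length. */
static unsigned int sketch_len_unchecked(const unsigned char *data,
                                         const unsigned char *hdr_end)
{
    return (unsigned int)(hdr_end - data);
}

/* Checked: test the data pointer position first. A result of 0 means the
 * header already lies before 'data', so nothing needs to be pulled. */
static unsigned int sketch_len_checked(const unsigned char *data,
                                       const unsigned char *hdr_end)
{
    if (hdr_end <= data)
        return 0;
    ptrdiff_t len = hdr_end - data;
    assert((size_t)len <= UINT_MAX);
    return (unsigned int)len;
}
```

Both pointers are assumed to point into the same buffer, which is what makes the comparison and the subtraction well defined in C.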

08 Oct, 2015 · 2 commits

ipv4: Merge __ip_local_out and __ip_local_out_sk · b92dacd4

Committed by Eric W. Biederman on Oct 07, 2015

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b92dacd4

dst: Pass a sk into .local_out · 4ebdfba7

Committed by Eric W. Biederman on Oct 07, 2015

For consistency with the other similar methods in the kernel, pass a
struct sock into the dst_ops .local_out method.

Simplifying the socket passing case is needed as a prequel to passing a
struct net reference into .local_out.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4ebdfba7

30 Sep, 2015 · 1 commit

net: Replace vrf_master_ifindex{, _rcu} with l3mdev equivalents · 385add90

Committed by David Ahern on Sep 29, 2015

Replace calls to vrf_master_ifindex_rcu and vrf_master_ifindex with either
l3mdev_master_ifindex_rcu or l3mdev_master_ifindex.

The pattern:
    oif = vrf_master_ifindex(dev) ? : dev->ifindex;
is replaced with
    oif = l3mdev_fib_oif(dev);

And remove the now unused vrf macros.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

385add90

29 Sep, 2015 · 1 commit

xfrm: Let the flowcache handle its size by default. · c386578f

Committed by Steffen Klassert on Sep 29, 2015

The xfrm flowcache size is limited by the flowcache limit
(4096 * number of online cpus) and the xfrm garbage collector
threshold (2 * 32768), whichever is reached first. This means
that we can hit the garbage collector limit only on systems
with more than 16 cpus. On such systems we simply refuse
new allocations if we reach the limit, so new flows are dropped.
On systems with 16 or fewer cpus, we hit the flowcache limit.
In this case, we shrink the flow cache instead of refusing new
flows.

We increase the xfrm garbage collector threshold to INT_MAX
to get the same behaviour, independent of the number of cpus.

The xfrm garbage collector threshold can still be set below
the flowcache limit to reduce the memory usage of the flowcache.
Tested-by: Dan Streetman <dan.streetman@canonical.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

c386578f
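
The 16-CPU boundary mentioned in commit c386578f above follows from comparing the two limits: 4096 * 16 = 65536 = 2 * 32768. The sketch below encodes that comparison using the constants quoted in the message; the macro and function names are hypothetical.

```c
#include <assert.h>

/* Constants quoted in the commit message; the names are hypothetical. */
#define SKETCH_FLOWCACHE_PER_CPU 4096L
#define SKETCH_XFRM_GC_LIMIT     (2L * 32768L)

/* Returns 1 if the flowcache limit is reached no later than the xfrm
 * garbage collector limit for this CPU count, 0 otherwise. */
static int sketch_flowcache_limit_hit_first(long online_cpus)
{
    assert(online_cpus > 0);
    return SKETCH_FLOWCACHE_PER_CPU * online_cpus <= SKETCH_XFRM_GC_LIMIT;
}
```

With 16 or fewer CPUs the function returns 1 (the flow cache is shrunk); with 17 or more it returns 0 (new allocations are refused), which is the CPU-count-dependent behaviour the patch removes.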

18 Sep, 2015 · 1 commit

net: Fix vti use case with oif in dst lookups · 58189ca7

Committed by David Ahern on Sep 15, 2015

Steffen reported that the recent change to add oif to dst lookups breaks
the VTI use case. The problem is that with the oif set in the flow struct
the comparison to the nh_oif is triggered. Fix by splitting the
FLOWI_FLAG_VRFSRC into 2 flags -- one that triggers the vrf device cache
bypass (FLOWI_FLAG_VRFSRC) and another telling the lookup to not compare
nh oif (FLOWI_FLAG_SKIP_NH_OIF).

Fixes: 42a7b32b ("xfrm: Add oif to dst lookups")
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

58189ca7

16 Sep, 2015 · 1 commit

net: Add FIB table id to rtable · b7503e0c

Committed by David Ahern on Sep 02, 2015

Add the FIB table id to rtable to make the information available for
IPv4 as it is for IPv6.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b7503e0c

26 Aug, 2015 · 1 commit

xfrm: Use VRF master index if output device is enslaved · 4ec3b28c

Committed by David Ahern on Aug 20, 2015

Directs route lookups to the VRF table. Compiles out if NET_VRF is not
enabled. With this patch, ipsec tunnels can be brought up successfully
in VRFs, even with duplicate network configuration.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4ec3b28c

11 Aug, 2015 · 1 commit

xfrm: Add oif to dst lookups · 42a7b32b

Committed by David Ahern on Aug 10, 2015

Rules can be installed that direct route lookups to specific tables based
on oif. Plumb the oif through the xfrm lookups so it gets set in the flow
struct and passed to the resolver routines.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

42a7b32b

04 Apr, 2015 · 1 commit

ipv4: coding style: comparison for equality with NULL · 51456b29

Committed by Ian Morris on Apr 03, 2015

The ipv4 code uses a mixture of coding styles. In some instances the
check for a NULL pointer is done as x == NULL and in others as !x. !x is
preferred according to checkpatch and this patch makes the code
consistent by adopting the latter form.

No changes detected by objdiff.
Signed-off-by: Ian Morris <ipm@chirality.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

51456b29

10 Mar, 2015 · 1 commit

net: Remove protocol from struct dst_ops · ddb3b603

Committed by Eric W. Biederman on Mar 09, 2015

After my change to neigh_hh_init to obtain the protocol from the
neigh_table there are no more users of protocol in struct dst_ops.
Remove the protocol field from dst_ops and all of its initializers.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ddb3b603

14 Mar, 2014 · 1 commit

xfrm: Introduce xfrm_input_afinfo to access the the callbacks properly · 2f32b51b

Committed by Steffen Klassert on Mar 14, 2014

IPv6 can be built as a module, so we need a mechanism to access
the address family dependent callback functions properly.
Therefore we introduce xfrm_input_afinfo, similar to what
we have for the address family dependent part of
policies and states.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

2f32b51b

01 Nov, 2013 · 1 commit

xfrm: Fix null pointer dereference when decoding sessions · 84502b5e

Committed by Steffen Klassert on Oct 30, 2013

On some codepaths the skb does not have a dst entry
when xfrm_decode_session() is called. So check for
a valid skb_dst() before dereferencing the device
interface index. We use 0 as the device index if
there is no valid skb_dst(), or at reverse decoding
we use skb_iif as device interface index.

Bug was introduced with git commit bafd4bd4
("xfrm: Decode sessions with output interface.").
Reported-by: Meelis Roos <mroos@linux.ee>
Tested-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

84502b5e

28 Oct, 2013 · 1 commit

xfrm: Increase the garbage collector threshold · eeb1b733

Committed by Steffen Klassert on Oct 25, 2013

With the removal of the routing cache, we lost the
option to tweak the garbage collector threshold
along with the maximum routing cache size. So git
commit 703fb94e ("xfrm: Fix the gc threshold value
for ipv4") moved back to a static threshold.

It turned out that the current threshold before we
start garbage collecting is much too small for some
workloads, so increase it from 1024 to 32768. This
means that we start the garbage collector if we have
more than 32768 dst entries in the system and refuse
new allocations if we are above 65536.
Reported-by: Wolfgang Walter <linux@stwm.de>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

eeb1b733
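
The two thresholds named in commit eeb1b733 above (start collecting above 32768 entries, refuse above 65536) can be written as a small decision function. This is a simplified sketch, assuming the refusal limit is twice the gc threshold as the numbers in the message suggest; the enum and function names are hypothetical and the real allocation path is more involved.

```c
#include <assert.h>

/* Hypothetical outcome of a dst allocation attempt. */
enum sketch_gc_action {
    SKETCH_ALLOC,           /* below the threshold: just allocate */
    SKETCH_ALLOC_AND_GC,    /* above the threshold: run the collector, still allocate */
    SKETCH_REFUSE           /* above twice the threshold: refuse the allocation */
};

static enum sketch_gc_action sketch_dst_alloc_action(long entries, long gc_thresh)
{
    assert(gc_thresh > 0);
    if (entries > 2 * gc_thresh)
        return SKETCH_REFUSE;
    if (entries > gc_thresh)
        return SKETCH_ALLOC_AND_GC;
    return SKETCH_ALLOC;
}
```

Calling it with `gc_thresh` set to 32768 reproduces the limits in the message; with the earlier value of 1024, refusal would already start above 2048 entries.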

16 Sep, 2013 · 1 commit

xfrm: Decode sessions with output interface. · bafd4bd4

Committed by Steffen Klassert on Sep 09, 2013

The output interface matching does not work on forward
policy lookups because the output interface of the flowi is
always 0. Fix this by setting the output interface when
we decode the session.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

bafd4bd4

06 Feb, 2013 · 2 commits

xfrm: make gc_thresh configurable in all namespaces · 8d068875

Committed by Michal Kubecek on Feb 06, 2013

The xfrm gc threshold can be configured via xfrm{4,6}_gc_thresh
sysctl but currently only in init_net, other namespaces always
use the default value. This can substantially limit the number
of IPsec tunnels that can be effectively used.
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

8d068875

xfrm: remove unused xfrm4_policy_fini() · 1f53c808

Committed by Michal Kubecek on Feb 06, 2013

Function xfrm4_policy_fini() is unused since xfrm4_fini() was
removed in 2.6.11.
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

1f53c808

13 Nov, 2012 · 1 commit

xfrm: Fix the gc threshold value for ipv4 · 703fb94e

Committed by Steffen Klassert on Nov 13, 2012

The xfrm gc threshold value depends on ip_rt_max_size. This
value was set to INT_MAX with the routing cache removal patch,
so we start garbage collection only when we have INT_MAX/2
IPsec routes cached. Fix this by going back to the static
threshold of 1024 routes.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

703fb94e

09 Oct, 2012 · 1 commit

ipv4: introduce rt_uses_gateway · 155e8336

Committed by Julian Anastasov on Oct 08, 2012

Add a new flag to remember when a route is via a gateway.
We will use it to allow rt_gateway to contain the address of a
directly connected host for the cases when DST_NOCACHE is
used or when the NH exception caches a per-destination route
without the DST_NOCACHE flag, i.e. when routes are not used for
other destinations. This way we force the neighbour
resolving to work with the routed destination but we
can use a different address in the packet, a feature needed
for IPVS-DR where the original packet for the virtual IP is routed
via the route to the real IP.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>

155e8336

01 Aug, 2012 · 1 commit

ipv4: Properly purge netdev references on uncached routes. · caacf05e

Committed by David S. Miller on Jul 31, 2012

When a device is unregistered, we have to purge all of the
references to it that may exist in the entire system.

If a route is uncached, we currently have no way of accomplishing
this.

So create a global list that is scanned when a network device goes
down. This mirrors the logic in net/core/dst.c's dst_ifdown().
Signed-off-by: David S. Miller <davem@davemloft.net>

caacf05e

21 Jul, 2012 · 6 commits

ipv4: Turn rt->rt_route_iif into rt->rt_is_input. · 9917e1e8

Committed by David S. Miller on Jul 17, 2012

That is this value's only use, as a boolean to indicate whether
a route is an input route or not.

So implement it that way, using a u16 gap present in the struct
already.
Signed-off-by: David S. Miller <davem@davemloft.net>

9917e1e8

ipv4: Kill rt->rt_oif · 4fd551d7

Committed by David S. Miller on Jul 17, 2012

Never actually used.

It was being set on output routes to the original OIF specified in the
flow key used for the lookup.

Adjust the only user, ipmr_rt_fib_lookup(), for greater correctness of
the flowi4_oif and flowi4_iif values, thanks to feedback from Julian
Anastasov.
Signed-off-by: David S. Miller <davem@davemloft.net>

4fd551d7

ipv4: Remove 'rt_dst' from 'struct rtable' · f1ce3062

Committed by David S. Miller on Jul 12, 2012

Signed-off-by: David S. Miller <davem@davemloft.net>

f1ce3062

ipv4: Remove 'rt_mark' from 'struct rtable' · b4869889

Committed by David Miller on Jul 01, 2012

Signed-off-by: David S. Miller <davem@davemloft.net>

b4869889

ipv4: Kill 'rt_src' from 'struct rtable' · d6c0a4f6

Committed by David Miller on Jul 01, 2012

Signed-off-by: David S. Miller <davem@davemloft.net>

d6c0a4f6

ipv4: Remove rt_key_{src,dst,tos} from struct rtable. · 1a00fee4

Committed by David Miller on Jul 01, 2012

They are always used in contexts where they can be reconstituted,
or where the finally resolved rt->rt_{src,dst} is semantically
equivalent.
Signed-off-by: David S. Miller <davem@davemloft.net>

1a00fee4

17 Jul, 2012 · 1 commit

net: Pass optional SKB and SK arguments to dst_ops->{update_pmtu,redirect}() · 6700c270

Committed by David S. Miller on Jul 17, 2012

This will be used so that we can compose a full flow key.

Even though we have a route in this context, we need more. In the
future the routes will be without destination address, source address,
etc. keying. One ipv4 route will cover entire subnets, etc.

In this environment we have to have a way to possess persistent storage
for redirects and PMTU information. This persistent storage will exist
in the FIB tables, and that's why we'll need to be able to rebuild a
full lookup flow key here. Using that flow key will do a fib_lookup()
and create/update the persistent entry.
Signed-off-by: David S. Miller <davem@davemloft.net>

6700c270

12 Jul, 2012 · 2 commits

net: Remove checks for dst_ops->redirect being NULL. · 1ed5c48f

Committed by David S. Miller on Jul 12, 2012

No longer necessary.
Signed-off-by: David S. Miller <davem@davemloft.net>

1ed5c48f

ipv4: Add redirect support to all protocol icmp error handlers. · 55be7a9c

Committed by David S. Miller on Jul 11, 2012

Signed-off-by: David S. Miller <davem@davemloft.net>

55be7a9c

openeuler / Kernel · synced successfully nearly 2 years ago