Commits · 8f01cb0827c84bd9c4866b849415b3aa6f0428df · openeuler / Kernel

11 May, 2011 1 commit

ipv4: xfrm: Eliminate ->rt_src reference in policy code. · 8f01cb08

Committed by David S. Miller on May 09, 2011

Rearrange xfrm4_dst_lookup() so that it works by calling a helper
function __xfrm_dst_lookup() that takes an explicit flow key storage
area as an argument.

Use this new helper in xfrm4_get_saddr() so we can fetch the selected
source address from the flow instead of from rt->rt_src.
Signed-off-by: David S. Miller <davem@davemloft.net>

8f01cb08

04 May, 2011 1 commit

ipv4: Rename struct rtable's rt_tos to rt_key_tos. · 475949d8

Committed by David S. Miller on May 03, 2011

To more accurately reflect that it is purely a routing
cache lookup key and is used in no other context.
Signed-off-by: David S. Miller <davem@davemloft.net>

475949d8

23 Apr, 2011 1 commit

inet: constify ip headers and in6_addr · b71d1d42

Committed by Eric Dumazet on Apr 22, 2011

Add const qualifiers to structs iphdr, ipv6hdr and in6_addr pointers
where possible, to make code intention more obvious.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b71d1d42

08 Apr, 2011 1 commit

ipv4: Fix "Set rt->rt_iif more sanely on output routes." · 1b86a58f

Committed by OGAWA Hirofumi on Apr 07, 2011

Commit 1018b5c0 ("Set rt->rt_iif more sanely on output routes.")
breaks rt_is_{output,input}_route.

As a result, IP_PKTINFO returned ->ipi_ifindex == 0.

To fix it, this does:

1) Add "int rt_route_iif;" to struct rtable

2) For input routes, always set rt_route_iif to same value as rt_iif

3) For output routes, always set rt_route_iif to zero.  Set rt_iif
   as it is done currently.

4) Change rt_is_{output,input}_route() to test rt_route_iif
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>

1b86a58f
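
The four steps listed in this commit message can be sketched in plain C. This is a simplified userspace illustration and not the kernel code: `struct rtable_sketch` and every helper ending in `_sketch` are hypothetical names, only the two interface-index fields named in the message are modelled, and the value stored in `rt_iif` for output routes is an assumption.

```c
#include <stdbool.h>

/* Hypothetical, trimmed-down stand-in for struct rtable. */
struct rtable_sketch {
    int rt_iif;        /* interface index reported to users (e.g. IP_PKTINFO) */
    int rt_route_iif;  /* input interface of the route; 0 for output routes */
};

/* Step 2: input routes carry the same value in both fields. */
static void init_input_route_sketch(struct rtable_sketch *rt, int in_ifindex)
{
    rt->rt_iif = in_ifindex;
    rt->rt_route_iif = in_ifindex;
}

/* Step 3: output routes zero rt_route_iif and leave rt_iif meaningful.
 * Assumption: rt_iif is set to the output device index. */
static void init_output_route_sketch(struct rtable_sketch *rt, int out_ifindex)
{
    rt->rt_iif = out_ifindex;
    rt->rt_route_iif = 0;
}

/* Step 4: the route-type tests look only at rt_route_iif. */
static bool rt_is_input_route_sketch(const struct rtable_sketch *rt)
{
    return rt->rt_route_iif != 0;
}

static bool rt_is_output_route_sketch(const struct rtable_sketch *rt)
{
    return rt->rt_route_iif == 0;
}
```

Splitting the two fields lets `rt_iif` stay non-zero on output routes without confusing the input/output test.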

13 Mar, 2011 5 commits

net: Put fl4_* macros to struct flowi4 and use them again. · 9cce96df

Committed by David S. Miller on Mar 12, 2011

Signed-off-by: David S. Miller <davem@davemloft.net>

9cce96df

net: Use flowi4 and flowi6 in xfrm layer. · 7e1dc7b6

Committed by David S. Miller on Mar 12, 2011

Signed-off-by: David S. Miller <davem@davemloft.net>

7e1dc7b6

ipv4: Use flowi4 in public route lookup interfaces. · 9d6ec938

Committed by David S. Miller on Mar 12, 2011

Signed-off-by: David S. Miller <davem@davemloft.net>

9d6ec938

net: Make flowi ports AF dependent. · 6281dcc9

Committed by David S. Miller on Mar 12, 2011

Create two sets of port member accessors, one set prefixed by fl4_*
and the other prefixed by fl6_*.

This will let us create AF-optimal flow instances.

It will work because in every context in which we access the ports,
we have to be fully aware of which AF the flowi is anyway.
Signed-off-by: David S. Miller <davem@davemloft.net>

6281dcc9

net: Put flowi_* prefix on AF independent members of struct flowi · 1d28f42c

Committed by David S. Miller on Mar 12, 2011

I intend to turn struct flowi into a union of AF specific flowi
structs.  There will be a common structure that each variant includes
first, much like struct sock_common.

This is the first step to move in that direction.
Signed-off-by: David S. Miller <davem@davemloft.net>

1d28f42c
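
The layout this commit message describes, a union of address-family-specific structs that each begin with the same common structure, might look roughly like the sketch below. All type and member names here are hypothetical and chosen for illustration; they are not the definitions that landed in the kernel.

```c
#include <stdint.h>

/* Hypothetical AF-independent part, placed first in every variant. */
struct flowi_common_sketch {
    int      oif;    /* output interface index */
    int      iif;    /* input interface index */
    uint32_t mark;   /* packet mark */
    uint8_t  proto;  /* upper-layer protocol */
};

/* Hypothetical IPv4 variant: common part first, then AF-specific keys. */
struct flowi4_sketch {
    struct flowi_common_sketch common;
    uint32_t daddr;
    uint32_t saddr;
};

/* Hypothetical IPv6 variant with the same leading member. */
struct flowi6_sketch {
    struct flowi_common_sketch common;
    uint8_t daddr[16];
    uint8_t saddr[16];
};

/* The generic flow key is a union of the variants. */
struct flowi_sketch {
    union {
        struct flowi_common_sketch common;
        struct flowi4_sketch       ip4;
        struct flowi6_sketch       ip6;
    } u;
};

/* AF-independent code can read the common part without knowing the AF,
 * because every variant shares the same initial member. */
static uint32_t flowi_mark_sketch(const struct flowi_sketch *fl)
{
    return fl->u.common.mark;
}
```

Code that only needs the IPv4 keys can then work with the smaller `struct flowi4_sketch` directly instead of the full union.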

05 Mar, 2011 1 commit

ipv4: Remove flowi from struct rtable. · 5e2b61f7

Committed by David S. Miller on Mar 04, 2011

The only necessary parts are the src/dst addresses, the
interface indexes, the TOS, and the mark.

The rest is unnecessary bloat, which amounts to nearly
50 bytes on 64-bit.
Signed-off-by: David S. Miller <davem@davemloft.net>

5e2b61f7

03 Mar, 2011 1 commit

ipv4: Make output route lookup return rtable directly. · b23dd4fe

Committed by David S. Miller on Mar 02, 2011

Instead of on the stack.
Signed-off-by: David S. Miller <davem@davemloft.net>

b23dd4fe

02 Mar, 2011 1 commit

xfrm: Handle blackhole route creation via afinfo. · 2774c131

Committed by David S. Miller on Mar 01, 2011

That way we don't have to potentially do this in every xfrm_lookup()
caller.
Signed-off-by: David S. Miller <davem@davemloft.net>

2774c131

24 Feb, 2011 1 commit

xfrm: Const'ify address arguments to ->dst_lookup() · 5e6b930f

Committed by David S. Miller on Feb 24, 2011

Signed-off-by: David S. Miller <davem@davemloft.net>

5e6b930f

23 Feb, 2011 2 commits

xfrm: Mark flowi arg to ->fill_dst() const. · 0c7b3eef

Committed by David S. Miller on Feb 22, 2011

Signed-off-by: David S. Miller <davem@davemloft.net>

0c7b3eef

xfrm: Mark flowi arg to ->get_tos() const. · 05d84025

Committed by David S. Miller on Feb 22, 2011

Signed-off-by: David S. Miller <davem@davemloft.net>

05d84025

27 Jan, 2011 1 commit

net: Implement read-only protection and COW'ing of metrics. · 62fa8a84

Committed by David S. Miller on Jan 26, 2011

Routing metrics are now copy-on-write.

Initially a route entry points its metrics at a read-only location.
If a routing table entry exists, it will point there.  Else it will
point at the all zero metric place-holder called 'dst_default_metrics'.

The writeability state of the metrics is stored in the low bits of the
metrics pointer; we have two bits left to spare if we want to store
more states.

For the initial implementation, COW is implemented simply via kmalloc.
However future enhancements will change this to place the writable
metrics somewhere else, in order to increase sharing.  Very likely
this "somewhere else" will be the inetpeer cache.

Note also that this means that metrics updates may transiently fail
if we cannot COW the metrics successfully.

But even by itself, this patch should decrease memory usage and
increase cache locality especially for routing workloads.  In those
cases the read-only metric copies stay in place and never get written
to.

TCP workloads where metrics get updated, and those rare cases where
PMTU triggers occur, will take a very slight performance hit.  But
that hit will be alleviated when the long-term writable metrics
move to a more sharable location.

Since the metrics storage went from a u32 array of RTAX_MAX entries to
what is essentially a pointer, some retooling of the dst_entry layout
was necessary.

Most importantly, we need to preserve the alignment of the reference
count so that it doesn't share cache lines with the read-mostly state,
as per Eric Dumazet's alignment assertion checks.

The only non-trivial bit here is the move of the 'flags' member into
the writeable cacheline.  This is OK since we are always accessing the
flags around the same moment when we made a modification to the
reference count.
Signed-off-by: David S. Miller <davem@davemloft.net>

62fa8a84
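
The copy-on-write scheme described in this commit message, with the writability state kept in the low bits of the metrics pointer, can be illustrated with a small userspace sketch. It assumes `malloc` in place of `kmalloc`, uses hypothetical names throughout (`dst_sketch`, `METRICS_MAX_SKETCH`, and the `_sketch` helpers), and ignores concurrency, which the kernel has to handle.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define METRICS_MAX_SKETCH        4               /* stand-in for RTAX_MAX */
#define METRICS_READ_ONLY_SKETCH  ((uintptr_t)1)  /* tag bit: do not write */
#define METRICS_FLAGS_SKETCH      ((uintptr_t)3)  /* two low bits hold tags */

/* All-zero placeholder shared by entries with no metrics of their own. */
static const uint32_t default_metrics_sketch[METRICS_MAX_SKETCH] = {0};

struct dst_sketch {
    uintptr_t metrics;  /* pointer to a u32 array, tag in the low bits */
};

/* Point the entry at a shared read-only array. uint32_t arrays are at
 * least 4-byte aligned, so the two low pointer bits are free for tags. */
static void dst_init_metrics_sketch(struct dst_sketch *dst, const uint32_t *ro)
{
    dst->metrics = (uintptr_t)ro | METRICS_READ_ONLY_SKETCH;
}

static int dst_metrics_read_only_sketch(const struct dst_sketch *dst)
{
    return (dst->metrics & METRICS_READ_ONLY_SKETCH) != 0;
}

/* Readers strip the tag bits and never trigger a copy. */
static const uint32_t *dst_metrics_ptr_sketch(const struct dst_sketch *dst)
{
    return (const uint32_t *)(dst->metrics & ~METRICS_FLAGS_SKETCH);
}

/* Writers get a private copy on first use. Returns NULL if the copy
 * cannot be allocated, so a metrics update may transiently fail. */
static uint32_t *dst_metrics_write_ptr_sketch(struct dst_sketch *dst)
{
    if (dst_metrics_read_only_sketch(dst)) {
        uint32_t *copy = malloc(sizeof(uint32_t) * METRICS_MAX_SKETCH);
        if (!copy)
            return NULL;
        memcpy(copy, dst_metrics_ptr_sketch(dst),
               sizeof(uint32_t) * METRICS_MAX_SKETCH);
        dst->metrics = (uintptr_t)copy;  /* read-only tag is now clear */
    }
    return (uint32_t *)(dst->metrics & ~METRICS_FLAGS_SKETCH);
}
```

In this sketch an entry that is only ever read keeps sharing the read-only array, which is where the memory and cache-locality savings mentioned in the message would come from.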

18 Nov, 2010 1 commit

net: use the macros defined for the members of flowi · 5811662b

Committed by Changli Gao on Nov 12, 2010

Use the macros defined for the members of flowi to clean the code up.
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5811662b

16 Nov, 2010 1 commit

xfrm: use gre key as flow upper protocol info · cc9ff19d

Committed by Timo Teräs on Nov 03, 2010

The GRE Key field is intended to be used for identifying an individual
traffic flow within a tunnel. It is useful to let XFRM policy
selectors match on it, so that different GRE tunnels can have
different policies.
Signed-off-by: Timo Teräs <timo.teras@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>

cc9ff19d

12 Nov, 2010 1 commit

net: get rid of rtable->idev · 72cdd1d9

Committed by Eric Dumazet on Nov 11, 2010

It seems the idev field in struct rtable has no special purpose other
than adding extra atomic ops.

We hold refcounts on the device itself (using percpu data, so pretty
cheap in current kernel).

infiniband case is solved using dst.dev instead of idev->dev

Removal of this field means routing without route cache is now using
shared data, percpu data, and only potential contention is a pair of
atomic ops on struct neighbour per forwarded packet.

About 5% speedup on routing test.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Roland Dreier <rolandd@cisco.com>
Cc: Sean Hefty <sean.hefty@intel.com>
Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

72cdd1d9

12 Oct, 2010 1 commit

net dst: use a percpu_counter to track entries · fc66f95c

Committed by Eric Dumazet on Oct 08, 2010

struct dst_ops tracks number of allocated dst in an atomic_t field,
subject to high cache line contention in stress workload.

Switch to a percpu_counter, to reduce the number of times we need to
dirty a central location. Place it on a separate cache line to avoid
dirtying read only fields.

Stress test :

(Sending 160.000.000 UDP frames,
IP route cache disabled, dual E5540 @2.53GHz,
32bit kernel, FIB_TRIE, SLUB/NUMA)

Before:

real    0m51.179s
user    0m15.329s
sys     10m15.942s

After:

real	0m45.570s
user	0m15.525s
sys	9m56.669s

With a small reordering of struct neighbour fields, subject of a
following patch, (to separate refcnt from other read mostly fields)

real	0m41.841s
user	0m15.261s
sys	8m45.949s
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fc66f95c
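
The idea behind replacing a single atomic counter with a per-CPU counter can be shown with a single-threaded userspace model. This is only a sketch of the batching technique, not the kernel's `percpu_counter` implementation: the names, the fixed CPU count, and the batch size are all assumptions, and the locking a real shared total needs is omitted.

```c
#include <stdint.h>

#define NR_CPUS_SKETCH 4   /* assumed number of CPUs */
#define BATCH_SKETCH   32  /* assumed fold threshold */

struct percpu_counter_sketch {
    int64_t count;                  /* shared total: the contended location */
    int32_t local[NR_CPUS_SKETCH];  /* per-CPU deltas, cheap to update */
};

/* Add to the calling CPU's local delta. The shared total is touched only
 * when the local delta reaches the batch size in either direction. */
static void pcs_add_sketch(struct percpu_counter_sketch *c, int cpu,
                           int32_t amount)
{
    int32_t v = c->local[cpu] + amount;

    if (v >= BATCH_SKETCH || v <= -BATCH_SKETCH) {
        c->count += v;  /* a real implementation would lock here */
        c->local[cpu] = 0;
    } else {
        c->local[cpu] = v;
    }
}

/* Cheap read: may lag behind by up to the batch size per CPU. */
static int64_t pcs_read_approx_sketch(const struct percpu_counter_sketch *c)
{
    return c->count;
}

/* Exact read: folds in every CPU's pending delta. */
static int64_t pcs_sum_sketch(const struct percpu_counter_sketch *c)
{
    int64_t sum = c->count;

    for (int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
        sum += c->local[cpu];
    return sum;
}
```

The trade-off is that the cheap read is approximate, which is acceptable for a threshold check such as deciding when to start garbage collection.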

23 Sep, 2010 1 commit

xfrm4: strip ECN bits from tos field · 94e22389

Committed by Ulrich Weber on Sep 22, 2010

Otherwise the ECT(1) bit will get interpreted as RTO_ONLINK
and routing will fail with XfrmOutBundleGenError.
Signed-off-by: Ulrich Weber <uweber@astaro.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

94e22389
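
The collision this commit message describes comes from the ECN field occupying the two low bits of the IPv4 TOS byte, while the routing code treats bit 0 of the lookup TOS as the RTO_ONLINK flag. A minimal sketch of the masking follows; the macro and function names are hypothetical, and the exact mask the kernel patch applies is not shown in the message, so clearing both ECN bits here is an assumption.

```c
#include <stdint.h>

#define RTO_ONLINK_SKETCH 0x01u  /* flag bit carried in the lookup TOS */
#define ECN_MASK_SKETCH   0x03u  /* two low bits of the TOS byte are ECN */

/* Clear the ECN bits before the TOS value is used as a routing key, so
 * that an ECT(1) packet (ECN field 01) is not mistaken for RTO_ONLINK. */
static uint8_t tos_strip_ecn_sketch(uint8_t tos)
{
    return (uint8_t)(tos & ~ECN_MASK_SKETCH);
}
```

For example, a TOS byte of 0x11 has bit 0 set because of ECT(1); after stripping it becomes 0x10 and no longer matches the on-link flag.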

05 Jul, 2010 1 commit

xfrm: fix xfrm by MARK logic · 44b451f1

Committed by Peter Kosyh on Jul 02, 2010

While using the xfrm by MARK feature in 2.6.34 - 2.6.35 kernels, the
mark is always cleared in the flowi structure via memset in
_decode_session4 (net/ipv4/xfrm4_policy.c), so the policy lookup fails.
IPv6 code is affected by this bug too.
Signed-off-by: Peter Kosyh <p.kosyh@gmail.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

44b451f1

11 Jun, 2010 1 commit

net-next: remove useless union keyword · d8d1f30b

Committed by Changli Gao on Jun 10, 2010

Remove the useless union keyword in rtable, rt6_info and dn_route.

Since there is only one member in a union, the union keyword isn't useful.
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d8d1f30b

07 Apr, 2010 1 commit

xfrm: cache bundles instead of policies for outgoing flows · 80c802f3

Committed by Timo Teräs on Apr 07, 2010

__xfrm_lookup() is called for each packet transmitted out of the
system. The xfrm_find_bundle() does a linear search which can
kill system performance depending on how many bundles are
required per policy.

This modifies __xfrm_lookup() to store bundles directly in
the flow cache. If we did not get a hit, we just create a new
bundle instead of doing a slow search. This means that we can now
get multiple xfrm_dst's for the same flow (on a per-cpu basis).
Signed-off-by: Timo Teras <timo.teras@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>

80c802f3

03 Mar, 2010 1 commit

ipsec: Fix bogus bundle flowi · 87c1e12b

Committed by Herbert Xu on Mar 02, 2010

When I merged the bundle creation code, I introduced a bogus
flowi value in the bundle.  Instead of getting it from the caller,
it was set to the flow in the route object, which is
totally different.

The end result is that the bundles we created never match, and
we instead end up with an ever growing bundle list.

Thanks to Jamal for finding this problem.
Reported-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>

87c1e12b

25 Jan, 2010 1 commit

netns xfrm: deal with dst entries in netns · d7c7544c

Committed by Alexey Dobriyan on Jan 24, 2010

GC is non-existent in netns, so after you hit GC threshold, no new
dst entries will be created until someone triggers cleanup in init_net.

Make xfrm4_dst_ops and xfrm6_dst_ops per-netns.
This is not done in a generic way, because it would waste
(AF_MAX - 2) * sizeof(struct dst_ops) bytes per-netns.

Reorder GC threshold initialization so it'd be done before registering
XFRM policies.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d7c7544c

12 Nov, 2009 1 commit

sysctl net: Remove unused binary sysctl code · f8572d8f

Committed by Eric W. Biederman on Nov 05, 2009

Now that sys_sysctl is a compatibility wrapper around /proc/sys,
all sysctl strategy routines, and all ctl_name and strategy
entries in the sysctl tables, are unused and can be
removed.

In addition neigh_sysctl_register has been modified to no longer
take a strategy argument and its callers have been modified not
to pass one.

Cc: "David Miller" <davem@davemloft.net>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: netdev@vger.kernel.org
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

f8572d8f

05 Aug, 2009 1 commit

xfrm4: fix build when SYSCTLs are disabled · f816700a

Committed by Randy Dunlap on Aug 04, 2009

Fix build errors when SYSCTLs are not enabled:
(.init.text+0x5154): undefined reference to `net_ipv4_ctl_path'
(.init.text+0x5176): undefined reference to `register_net_sysctl_table'
xfrm4_policy.c:(.exit.text+0x573): undefined reference to `unregister_net_sysctl_table'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f816700a

31 Jul, 2009 1 commit

xfrm: select sane defaults for xfrm[4|6] gc_thresh · a33bc5c1

Committed by Neil Horman on Jul 30, 2009

Choose saner defaults for xfrm[4|6] gc_thresh values on init

Currently, the xfrm[4|6] code has hard-coded initial gc_thresh values
(set to 1024).  Given that the ipv4 and ipv6 routing caches are sized
dynamically at boot time, the static selections can be non-sensical.
This patch dynamically selects an appropriate gc threshold based on
the corresponding main routing table size, using the assumption that
we should in the worst case be able to handle as many connections as
the routing table can.

For ipv4, the maximum route cache size is 16 * the number of hash
buckets in the route cache.  Given that xfrm4 starts garbage
collection at the gc_thresh and prevents new allocations at 2 *
gc_thresh, we set gc_thresh to half the maximum route cache size.

For ipv6, it's a bit trickier.  There is no maximum route cache size,
but the ipv6 dst_ops gc_thresh is statically set to 1024.  It seems
sane to select a similar gc_thresh for the xfrm6 code that is half
the number of hash buckets in the v6 route cache times 16 (like the v4
code does).
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a33bc5c1
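
The IPv4 sizing rule in this commit message reduces to a few lines of arithmetic. The sketch below only restates the numbers given in the message (16 entries per hash bucket, garbage collection starting at gc_thresh, allocations refused at twice gc_thresh); the function names are hypothetical and this is not the kernel's initialization code.

```c
/* Maximum IPv4 route cache size: 16 entries per hash bucket. */
static unsigned int rt_max_size_sketch(unsigned int hash_buckets)
{
    return 16u * hash_buckets;
}

/* xfrm4 starts garbage collection at gc_thresh, chosen as half the
 * maximum route cache size. */
static unsigned int xfrm4_gc_thresh_sketch(unsigned int hash_buckets)
{
    return rt_max_size_sketch(hash_buckets) / 2u;
}

/* New allocations are refused at twice gc_thresh, which therefore lines
 * up with the route cache maximum. */
static unsigned int xfrm4_alloc_limit_sketch(unsigned int hash_buckets)
{
    return 2u * xfrm4_gc_thresh_sketch(hash_buckets);
}
```

With 1024 hash buckets, for instance, the maximum cache size is 16384 entries, gc_thresh is 8192, and the allocation limit is 16384.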

28 Jul, 2009 1 commit

xfrm: export xfrm garbage collector thresholds via sysctl · a44a4a00

Committed by Neil Horman on Jul 27, 2009

Export garbage collector thresholds for xfrm[4|6]_dst_ops

Had a problem reported to me recently in which a high volume of ipsec
connections on a system began reporting ENOBUFS for new connections
eventually.

It seemed that after about 2000 connections we started being unable to
create more.  A quick look revealed that the xfrm code used a dst_ops
structure that limited the gc_thresh value to 1024, and always
dropped route cache entries after 2x the gc_thresh.

It seems the most direct solution is to export the gc_thresh values in
the xfrm[4|6] dst_ops as sysctls, like the main routing table does, so
that higher volumes of connections can be supported.  This patch has
been tested and allows the reporter to increase their ipsec connection
volume successfully.
Reported-by: Joe Nall <joe@nall.com>
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>

ipv4/xfrm4_policy.c |   18 ++++++++++++++++++
ipv6/xfrm6_policy.c |   18 ++++++++++++++++++
2 files changed, 36 insertions(+)
Signed-off-by: David S. Miller <davem@davemloft.net>

a44a4a00

04 Jul, 2009 1 commit

xfrm4: fix the ports decode of sctp protocol · c615c9f3

Committed by Wei Yongjun on Jul 02, 2009

The SCTP pushed the skb data above the sctp chunk header, so the check
of pskb_may_pull(skb, xprth + 4 - skb->data) in _decode_session4() will
never return 0 because xprth + 4 - skb->data < 0, the ports decode of
sctp will always fail.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

c615c9f3

01 Feb, 2009 1 commit

net: replace uses of __constant_{endian} · 09640e63

Committed by Harvey Harrison on Feb 01, 2009

Base versions handle constant folding now.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

09640e63

26 Nov, 2008 3 commits

netns xfrm: ->get_saddr in netns · fbda33b2

Committed by Alexey Dobriyan on Nov 25, 2008

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fbda33b2

netns xfrm: ->dst_lookup in netns · c5b3cf46

Committed by Alexey Dobriyan on Nov 25, 2008

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c5b3cf46

netns xfrm: dst garbage-collecting in netns · ddcfd796

Committed by Alexey Dobriyan on Nov 25, 2008

Pass netns pointer to struct xfrm_policy_afinfo::garbage_collect()

	[This needs more thoughts on what to do with dst_ops]
	[Currently stub to init_net]
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ddcfd796

12 Nov, 2008 1 commit

net: remove struct dst_entry::entry_size · 6bb3ce25

Committed by Alexey Dobriyan on Nov 11, 2008

Unused after kmem_cache_zalloc() conversion.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6bb3ce25

03 Nov, 2008 1 commit

net: clean up net/ipv4/ipip.c raw.c tcp.c tcp_minisocks.c tcp_yeah.c xfrm4_policy.c · 5a5f3a8d

Committed by Jianjun Kong on Nov 03, 2008

Signed-off-by: Jianjun Kong <jianjun@zeuux.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

5a5f3a8d

26 Mar, 2008 1 commit

[NET] NETNS: Omit net_device->nd_net without CONFIG_NET_NS. · c346dca1

Committed by YOSHIFUJI Hideaki on Mar 25, 2008

Introduce per-net_device inlines: dev_net(), dev_net_set().
Without CONFIG_NET_NS, no namespace other than &init_net exists.
Let's explicitly define them to help compiler optimizations.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

c346dca1

01 Feb, 2008 1 commit

[NET]: should explicitly initialize atomic_t field in struct dst_ops · e2422970

Committed by Eric Dumazet on Jan 30, 2008

All but one struct dst_ops static initializations miss explicit
initialization of the entries field.

As this field is atomic_t, we should use ATOMIC_INIT(0), and not
rely on the atomic_t implementation.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e2422970

29 Jan, 2008 1 commit

[NETNS]: Add namespace parameter to __ip_route_output_key. · 611c183e

Committed by Denis V. Lunev on Jan 22, 2008

This is only required to propagate it down to the
ip_route_output_slow.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

611c183e

openeuler / Kernel synced successfully more than 1 year ago
