Commits · 437de07ced703c2d171b43bd63cf47e0af09a241 · openeuler / Kernel

30 Mar, 2014 3 commits

ipv6: fix checkpatch errors of "foo*" and "foo * bar" · 437de07c

Committed by Wang Yufen on Mar 28, 2014

ERROR: "(foo*)" should be "(foo *)"
ERROR: "foo * bar" should be "foo *bar"
Suggested-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: Wang Yufen <wangyufen@huawei.com>
Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv6: fix checkpatch errors of brace and trailing statements · 49e253e3

Committed by Wang Yufen on Mar 28, 2014

ERROR: open brace '{' following enum go on the same line
ERROR: open brace '{' following struct go on the same line
ERROR: trailing statements should be on next line
Signed-off-by: Wang Yufen <wangyufen@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv6: fix checkpatch errors comments and space · 8db46f1d

Committed by Wang Yufen on Mar 28, 2014

WARNING: please, no space before tabs
WARNING: please, no spaces at the start of a line
ERROR: spaces required around that ':' (ctx:VxW)
ERROR: spaces required around that '>' (ctx:VxV)
ERROR: spaces required around that '>=' (ctx:VxV)
Signed-off-by: Wang Yufen <wangyufen@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

28 Mar, 2014 1 commit

ipv6: do not overwrite inetpeer metrics prematurely · e5fd387a

Committed by Michal Kubeček on Mar 27, 2014

If an IPv6 host route with metrics exists, an attempt to add a
new route for the same target with different metrics fails but
rewrites the metrics anyway:

12sp0:~ # ip route add fec0::1 dev eth0 rto_min 1000
12sp0:~ # ip -6 route show
fe80::/64 dev eth0  proto kernel  metric 256
fec0::1 dev eth0  metric 1024  rto_min lock 1s
12sp0:~ # ip route add fec0::1 dev eth0 rto_min 1500
RTNETLINK answers: File exists
12sp0:~ # ip -6 route show
fe80::/64 dev eth0  proto kernel  metric 256
fec0::1 dev eth0  metric 1024  rto_min lock 1.5s

This is caused by all IPv6 host routes using the metrics in
their inetpeer (or the shared default). This also holds for the
new route created in ip6_route_add() which shares the metrics
with the already existing route and thus ip6_route_add()
rewrites the metrics even if the new route ends up not being
used at all.

Another problem is that old metrics in inetpeer can reappear
unexpectedly for a new route, e.g.

12sp0:~ # ip route add fec0::1 dev eth0 rto_min 1000
12sp0:~ # ip route del fec0::1
12sp0:~ # ip route add fec0::1 dev eth0
12sp0:~ # ip route change fec0::1 dev eth0 hoplimit 10
12sp0:~ # ip -6 route show
fe80::/64 dev eth0  proto kernel  metric 256
fec0::1 dev eth0  metric 1024  hoplimit 10 rto_min lock 1s

Resolve the first problem by moving the setting of metrics down
into fib6_add_rt2node() to the point we are sure we are
inserting the new route into the tree. Second problem is
addressed by introducing new flag DST_METRICS_FORCE_OVERWRITE
which is set for a new host route in ip6_route_add() and makes
ipv6_cow_metrics() always overwrite the metrics in inetpeer
(even if they are not "new"); it is reset after that.

v5: use a flag in _metrics member rather than one in flags

v4: fix a typo making a condition always true (thanks to Hannes
Frederic Sowa)

v3: rewritten based on David Miller's idea to move setting the
metrics (and allocation in non-host case) down to the point we
already know the route is to be inserted. Also rebased to
net-next as it is quite late in the cycle.
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

02 Jan, 2014 1 commit

ipv6: remove prune parameter for fib6_clean_all · 0c3584d5

Committed by Li RongQing on Dec 27, 2013

Since the prune parameter of fib6_clean_all() is always 0, remove it.
Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

28 Sep, 2013 2 commits

ipv6: compare sernum when walking fib for /proc/net/ipv6_route as safety net · 0a67d3ef

Committed by Hannes Frederic Sowa on Sep 21, 2013

This patch provides an additional safety net against NULL
pointer dereferences while walking the fib trie for the new
/proc/net/ipv6_route walkers. I never needed it myself and am unsure
if it is needed at all, but the same checks were introduced in
2bec5a36 ("ipv6: fib: fix crash when
changing large fib while dumping it") to fix NULL pointer bugs.

This patch is separated from the first patch to make it easier to revert
if we are sure we can drop this logic.

Cc: Ben Greear <greearb@candelatech.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv6: avoid high order memory allocations for /proc/net/ipv6_route · 8d2ca1d7

Committed by Hannes Frederic Sowa on Sep 21, 2013

Dumping routes on a system with lots of rt6_infos in the fibs causes up to
11-order allocations in seq_file (which fail). While we could switch
there to vmalloc we could just implement the streaming interface for
/proc/net/ipv6_route. This patch switches /proc/net/ipv6_route from
single_open_net to seq_open_net.

loff_t *pos tracks dst entries.

Also kill never used struct rt6_proc_arg and now unused function
fib6_clean_all_ro.

Cc: Ben Greear <greearb@candelatech.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

12 Sep, 2013 1 commit

net: fib: fib6_add: fix potential NULL pointer dereference · ae7b4e1f

Committed by Daniel Borkmann on Sep 07, 2013

When the kernel is compiled with CONFIG_IPV6_SUBTREES, and we return
with an error in fn = fib6_add_1(), then error codes are encoded into
the return pointer e.g. ERR_PTR(-ENOENT). In such an error case, we
write the error code into err and jump to out, hence enter the if(err)
condition. Now, if CONFIG_IPV6_SUBTREES is enabled, we check for:

  if (pn != fn && pn->leaf == rt)
    ...
  if (pn != fn && !pn->leaf && !(pn->fn_flags & RTN_RTINFO))
    ...

Since pn is NULL and fn is, e.g., ERR_PTR(-ENOENT), pn != fn
evaluates to true and causes a NULL pointer dereference in the further
checks on pn. Fix it by setting both to NULL in the error case, so that
pn != fn evaluates to false and no further dereference
takes place.

This was first correctly implemented in 4a287eba ("IPv6 routing,
NLM_F_* flag support: REPLACE and EXCL flags support, warn about
missing CREATE flag"), but the bug was later introduced by
188c517a ("ipv6: return errno pointers consistently for fib6_add_1()").
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Lin Ming <mlin@ss.pku.edu.cn>
Cc: Matti Vaittinen <matti.vaittinen@nsn.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Matti Vaittinen <matti.vaittinen@nsn.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
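
The failure mode described in ae7b4e1f can be illustrated in user space. This is a minimal sketch, not the kernel code: `ERR_PTR`, `PTR_ERR` and `IS_ERR` are re-implemented here in simplified form, and `struct node`, `lookup()` and `error_path_touches_pn()` are hypothetical names chosen for the illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified user-space stand-ins for the kernel's ERR_PTR helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct node { struct node *leaf; };	/* hypothetical stand-in for fib6_node */

/* Hypothetical lookup that can fail by returning an encoded errno. */
static struct node *lookup(int fail)
{
	static struct node n;

	return fail ? ERR_PTR(-ENOENT) : &n;
}

/*
 * Returns 1 if the error path would go on to dereference pn, 0 otherwise.
 * With reset_on_error set, fn is set to NULL (pn already is), mirroring
 * the fix: "pn != fn" is then false and pn is never touched.
 */
static int error_path_touches_pn(int reset_on_error)
{
	struct node *pn = NULL;
	struct node *fn = lookup(1);

	assert(IS_ERR(fn));
	if (reset_on_error)
		fn = NULL;

	return pn != fn;	/* true means "pn->leaf" would be evaluated next */
}
```

An error pointer is non-NULL, so comparing it against a NULL `pn` says nothing about whether `pn` is safe to use. That is why the fix normalises both pointers before the comparison.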

08 Aug, 2013 1 commit

ipv6: don't stop backtracking in fib6_lookup_1 if subtree does not match · 3e3be275

Committed by Hannes Frederic Sowa on Aug 07, 2013

In case a subtree did not match we currently stop backtracking and return
NULL (root table from fib_lookup). This could yield invalid routing
table lookups when using subtrees.

Instead continue to backtrack until a valid subtree or node is found
and return this match.

Also remove unneeded NULL check.
Reported-by: Teco Boot <teco@inf-net.nl>
Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Cc: David Lamparter <equinox@diac24.net>
Cc: <boutier@pps.univ-paris-diderot.fr>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

02 Aug, 2013 2 commits

ipv6: update ip6_rt_last_gc every time GC is run · 49a18d86

Committed by Michal Kubeček on Aug 01, 2013

As pointed out by Eric Dumazet, net->ipv6.ip6_rt_last_gc should
hold the last time garbage collector was run so that we should
update it whenever fib6_run_gc() calls fib6_clean_all(), not only
if we got there from ip6_dst_gc().
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv6: prevent fib6_run_gc() contention · 2ac3ac8f

Committed by Michal Kubeček on Aug 01, 2013

On a high-traffic router with many processors and many IPv6 dst
entries, soft lockup in fib6_run_gc() can occur when number of
entries reaches gc_thresh.

This happens because fib6_run_gc() uses fib6_gc_lock to allow
only one thread to run the garbage collector but ip6_dst_gc()
doesn't update net->ipv6.ip6_rt_last_gc until fib6_run_gc()
returns. On a system with many entries, this can take some time
so that in the meantime, other threads pass the tests in
ip6_dst_gc() (ip6_rt_last_gc is still not updated) and wait for
the lock. They then have to run the garbage collector one after
another which blocks them for quite long.

Resolve this by replacing special value ~0UL of expire parameter
to fib6_run_gc() by explicit "force" parameter to choose between
spin_lock_bh() and spin_trylock_bh() and call fib6_run_gc() with
force=false if gc_thresh is reached but not max_size.
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
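
The "force" idea from 2ac3ac8f can be sketched in user space with a pthread mutex in place of the kernel spinlock. This is an illustration under assumptions, not the kernel implementation: `gc_lock`, `gc_runs` and `run_gc()` are hypothetical names, and the real collector walk is reduced to a counter increment.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t gc_lock = PTHREAD_MUTEX_INITIALIZER; /* stands in for fib6_gc_lock */
static unsigned long gc_runs;

/*
 * Hypothetical analogue of fib6_run_gc(): returns true if this caller ran
 * the collector, false if it skipped because another caller holds the lock.
 */
static bool run_gc(bool force)
{
	int rc;

	if (force) {
		rc = pthread_mutex_lock(&gc_lock);	/* must run: wait for our turn */
		assert(rc == 0);
	} else if (pthread_mutex_trylock(&gc_lock) != 0) {
		return false;				/* someone else is already collecting */
	}

	gc_runs++;					/* placeholder for the real table walk */
	rc = pthread_mutex_unlock(&gc_lock);
	assert(rc == 0);
	(void)rc;
	return true;
}
```

Callers that only hit the soft threshold would pass `force = false` and return immediately when a collection is already in progress. Only callers that truly need the collection to happen would wait for the lock.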

25 Jul, 2013 1 commit

net: ipv6 eliminate parameter "int addrlen" in function fib6_add_1 · 9225b230

Committed by fan.du on Jul 22, 2013

The "int addrlen" in fib6_add_1 is redundant, as we can get it from
parameter "struct in6_addr *addr" once we modify its type.
Also fix some coding style issues in fib6_add_1.
Signed-off-by: Fan Du <fan.du@windriver.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

13 Jul, 2013 1 commit

ipv6: only static routes qualify for equal cost multipathing · 307f2fb9

Committed by Hannes Frederic Sowa on Jul 12, 2013

Static routes in this case are non-expiring routes which did not get
configured by autoconf or by icmpv6 redirects.

To make sure we actually get an ecmp route while searching for the first
one in this fib6_node's leafs, also make sure it matches the ecmp route
assumptions.

v2:
a) Removed RTF_EXPIRE check in dst.from chain. The check of RTF_ADDRCONF
   already ensures that this route, even if added again without
   RTF_EXPIRES (in case of a RA announcement with infinite timeout),
   does not cause the rt6i_nsiblings logic to go wrong if a later RA
   updates the expiration time later.

v3:
a) Allow RTF_EXPIRES routes to enter the ecmp route set. We have to do so,
   because a PMTU event could update the RTF_EXPIRES flag and we would
   not count this route, if another route joins this set. We now filter
   only for RTF_GATEWAY|RTF_ADDRCONF|RTF_DYNAMIC, which are flags that
   don't get changed after rt6_info construction.

Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

28 Feb, 2013 1 commit

hlist: drop the node parameter from iterators · b67bfe0d

Committed by Sasha Levin on Feb 27, 2013

I'm not sure why, but the hlist for-each-entry iterators were conceived
differently from the list ones. The list iterator takes three arguments:

        list_for_each_entry(pos, head, member)

The hlist ones were greedy and wanted an extra parameter:

        hlist_for_each_entry(tpos, pos, head, member)

Why did they need an extra pos parameter? I'm not quite sure. Not only
they don't really need it, it also prevents the iterator from looking
exactly like the list iterator, which is unfortunate.

Besides the semantic patch, there was some manual work required:

 - Fix up the actual hlist iterators in linux/list.h
 - Fix up the declaration of other iterators based on the hlist ones.
 - A small number of places were using the 'node' parameter; these
 were modified to use 'obj->member' instead.
 - Coccinelle didn't handle the hlist_for_each_entry_safe iterator
 properly, so those had to be fixed up manually.

The semantic patch which is mostly the work of Peter Senna Tschudin is here:

@@
iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;

type T;
expression a,c,d,e;
identifier b;
statement S;
@@

-T b;
    <+... when != b
(
hlist_for_each_entry(a,
- b,
c, d) S
|
hlist_for_each_entry_continue(a,
- b,
c) S
|
hlist_for_each_entry_from(a,
- b,
c) S
|
hlist_for_each_entry_rcu(a,
- b,
c, d) S
|
hlist_for_each_entry_rcu_bh(a,
- b,
c, d) S
|
hlist_for_each_entry_continue_rcu_bh(a,
- b,
c) S
|
for_each_busy_worker(a, c,
- b,
d) S
|
ax25_uid_for_each(a,
- b,
c) S
|
ax25_for_each(a,
- b,
c) S
|
inet_bind_bucket_for_each(a,
- b,
c) S
|
sctp_for_each_hentry(a,
- b,
c) S
|
sk_for_each(a,
- b,
c) S
|
sk_for_each_rcu(a,
- b,
c) S
|
sk_for_each_from
-(a, b)
+(a)
S
+ sk_for_each_from(a) S
|
sk_for_each_safe(a,
- b,
c, d) S
|
sk_for_each_bound(a,
- b,
c) S
|
hlist_for_each_entry_safe(a,
- b,
c, d, e) S
|
hlist_for_each_entry_continue_rcu(a,
- b,
c) S
|
nr_neigh_for_each(a,
- b,
c) S
|
nr_neigh_for_each_safe(a,
- b,
c, d) S
|
nr_node_for_each(a,
- b,
c) S
|
nr_node_for_each_safe(a,
- b,
c, d) S
|
- for_each_gfn_sp(a, c, d, b) S
+ for_each_gfn_sp(a, c, d) S
|
- for_each_gfn_indirect_valid_sp(a, c, d, b) S
+ for_each_gfn_indirect_valid_sp(a, c, d) S
|
for_each_host(a,
- b,
c) S
|
for_each_host_safe(a,
- b,
c, d) S
|
for_each_mesh_entry(a,
- b,
c, d) S
)
    ...+>

[akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
[akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
[akpm@linux-foundation.org: checkpatch fixes]
[akpm@linux-foundation.org: fix warnings]
[akpm@linux-foudnation.org: redo intrusive kvm changes]
Tested-by: Peter Senna Tschudin <peter.senna@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
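
To see why the extra cursor is unnecessary, here is a minimal user-space sketch of a three-argument hlist iterator. It is an illustration only: the list is simplified to a singly linked `next` pointer (the kernel's `hlist_node` also carries `pprev`), the macro is named `my_hlist_for_each_entry` to avoid suggesting it is the kernel definition, `struct item` and `sum_items()` are hypothetical, and `__typeof__` is a GNU C extension.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified list types for the sketch. */
struct hlist_node { struct hlist_node *next; };
struct hlist_head { struct hlist_node *first; };

/* Recover the containing structure from a pointer to its embedded node. */
#define entry_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* NULL-safe variant so the loop can terminate on a NULL node. */
#define entry_or_null(ptr, type, member) \
	((ptr) ? entry_of(ptr, type, member) : NULL)

/* Three-argument form: the entry pointer itself is the only cursor. */
#define my_hlist_for_each_entry(pos, head, member)                            \
	for (pos = entry_or_null((head)->first, __typeof__(*(pos)), member);  \
	     pos;                                                             \
	     pos = entry_or_null((pos)->member.next, __typeof__(*(pos)), member))

struct item {
	int value;
	struct hlist_node link;
};

/* Walk the list and add up the values. */
static int sum_items(struct hlist_head *head)
{
	struct item *it;
	int sum = 0;

	assert(head != NULL);
	my_hlist_for_each_entry(it, head, link)
		sum += it->value;
	return sum;
}
```

Because the next node is reachable through `pos->member.next`, the loop can advance from the entry pointer alone, which is what lets the call site look like `list_for_each_entry(pos, head, member)`.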

23 Oct, 2012 1 commit

ipv6: add support of equal cost multipath (ECMP) · 51ebd318

Committed by Nicolas Dichtel on Oct 22, 2012

Each nexthop is added like a single route in the routing table. All routes
that have the same metric/weight and destination but not the same gateway
are considered ECMP routes. They are linked together through a list called
rt6i_siblings.

ECMP routes can be added in one shot with the RTA_MULTIPATH attribute, or one after
the other (in both cases, the flag NLM_F_EXCL should not be set).

The patch is based on a previous work from
Luc Saillard <luc.saillard@6wind.com>.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

29 Sep, 2012 1 commit

ipv6: return errno pointers consistently for fib6_add_1() · 188c517a

Committed by Lin Ming on Sep 25, 2012

fib6_add_1() should consistently return errno pointers,
rather than a mixture of NULL and errno pointers.
Signed-off-by: Lin Ming <mlin@ss.pku.edu.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>

22 Sep, 2012 1 commit

ipv6: fix return value check in fib6_add() · f950c0ec

Committed by Wei Yongjun on Sep 20, 2012

In case of error, the function fib6_add_1() returns ERR_PTR()
or NULL pointer. The ERR_PTR() case check is missing in fib6_add().

The dpatch engine was used to generate this patch.
(https://github.com/weiyj/dpatch)
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>

26 Jun, 2012 1 commit

ipv6: fib: fix fib dump restart · fa809e2f

Committed by Eric Dumazet on Jun 25, 2012

Commit 2bec5a36 (ipv6: fib: fix crash when changing large fib
while dumping it) introduced the ability to restart the dump at the tree root,
but failed to correctly skip the count of already dumped entries. The code
didn't match Patrick's intent.

We must skip exactly the number of already dumped entries.

Note that, like other /proc/net files or netlink producers, we could
still dump some duplicate entries.
Reported-by: Debabrata Banerjee <dbavatar@gmail.com>
Reported-by: Josh Hunt <johunt@akamai.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
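
The "skip exactly the number of already dumped entries" rule can be sketched as a resumable dump over a flat array. This is a simplified illustration, not the fib walker: `dump_chunk()` and its parameters are hypothetical, and a real tree walk would replace the array loop.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical resumable dump: emit at most `budget` entries per call,
 * restarting from the beginning each time and skipping exactly `*dumped`
 * entries that earlier calls already produced.
 */
static size_t dump_chunk(const int *entries, size_t n, size_t *dumped,
			 int *out, size_t budget)
{
	size_t skip;
	size_t emitted = 0;

	assert(entries && dumped && out);
	skip = *dumped;

	for (size_t i = 0; i < n && emitted < budget; i++) {
		if (skip > 0) {		/* already dumped in a previous call */
			skip--;
			continue;
		}
		out[emitted++] = entries[i];
	}

	*dumped += emitted;		/* remember progress for the next restart */
	return emitted;
}
```

If entries are inserted or removed between calls, a count-based skip can still repeat or miss an entry, which matches the note above about possible duplicates.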

16 Jun, 2012 2 commits

Revert "ipv6: Prevent access to uninitialized fib_table_hash via /proc/net/ipv6_route" · e8803b6c

Committed by David S. Miller on Jun 16, 2012

This reverts commit 2a0c451a.

It causes crashes, because now ip6_null_entry is used before
it is initialized.
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv6: Prevent access to uninitialized fib_table_hash via /proc/net/ipv6_route · 2a0c451a

Committed by Thomas Graf on Jun 14, 2012

/proc/net/ipv6_route reflects the contents of fib_table_hash. The proc
handler is installed in ip6_route_net_init() whereas fib_table_hash is
allocated in fib6_net_init() _after_ the proc handler has been installed.

This opens up a short time frame to access fib_table_hash with its pants
down.

fib6_init() as a whole can't be moved to an earlier position as it also
registers the rtnetlink message handlers which should be registered at
the end. Therefore split it into fib6_init() which is run early and
fib6_init_late() to register the rtnetlink message handlers.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Reviewed-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

11 Jun, 2012 1 commit

inet: Add inetpeer tree roots to the FIB tables. · 8e773277

Committed by David S. Miller on Jun 11, 2012

Signed-off-by: David S. Miller <davem@davemloft.net>

08 Jun, 2012 1 commit

ipv6: fib: Restore NTF_ROUTER exception in fib6_age() · 8bd74516

Committed by Thomas Graf on Jun 07, 2012

Commit 5339ab8b (ipv6: fib: Convert fib6_age() to
dst_neigh_lookup().) seems to have mistakenly inverted the
exception for cached NTF_ROUTER routes.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

16 May, 2012 2 commits

net: ipv4 and ipv6: Convert printk(KERN_DEBUG to pr_debug · 91df42be

Committed by Joe Perches on May 15, 2012

Use the current debugging style and enable dynamic_debug.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ipv6: Standardize prefixes for message logging · f3213831

Committed by Joe Perches on May 15, 2012

Add #define pr_fmt(fmt) as appropriate.

Add "IPv6: " to appropriate files.

Convert printk(KERN_<LEVEL> to pr_<level> (but not KERN_DEBUG).
Standardize on "%s: " not "%s(): " when emitting __func__.
Use "%s: ", __func__ instead of embedding function name.
Coalesce formats, align arguments.

ADDRCONF output is now prefixed with "IPv6: "
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

14 Apr, 2012 1 commit

ipv6: fix problem with expired dst cache · 1716a961

Committed by Gao feng on Apr 06, 2012

If an IPv6 dst cache entry is copied from a dst generated by an ICMPv6 RA packet,
the copy is never checked for expiry because it has no RTF_EXPIRES flag,
so it keeps being used until the dst GC runs.

Change struct dst_entry: add a union containing the new pointer from and expires.
When rt6_info.rt6i_flags has no RTF_EXPIRES flag, dst.expires is unused,
so the field can point to the dst the cache entry was copied from.
dst.from is only used in IPv6.

rt6_check_expired() checks whether rt6_info.dst.from has expired.

ip6_rt_copy() only sets dst.from when the ort has the RTF_ADDRCONF
and RTF_DEFAULT flags, and then holds the ort.

ip6_dst_destroy() releases the ort.

Add some functions to operate on the RTF_EXPIRES flag and expires (from) together,
and change the code to use these new functions.

Changes from v5:
modify ip6_route_add() and ndisc_router_discovery() to use the new functions.

Only set dst.from when the ort has the RTF_ADDRCONF
and RTF_DEFAULT flags, then hold the ort.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

28 Jan, 2012 1 commit

ipv6: fib: Convert fib6_age() to dst_neigh_lookup(). · 5339ab8b

Committed by David S. Miller on Jan 27, 2012

In this specific situation we know we are dealing with a gatewayed route
and therefore rt6i_gateway is not going to be in6addr_any even in future
interpretations.
Signed-off-by: David S. Miller <davem@davemloft.net>

31 Dec, 2011 1 commit

IPv6: Avoid taking write lock for /proc/net/ipv6_route · 32b293a5

Committed by Josh Hunt on Dec 28, 2011

During some debugging I needed to look into how /proc/net/ipv6_route
operated and in my digging I found it calls fib6_clean_all() which uses
"write_lock_bh(&table->tb6_lock)" before doing the walk of the table. I
found this on 2.6.32, but reading the code I believe the same basic idea
exists currently. Looking at the rtnetlink code they are only calling
"read_lock_bh(&table->tb6_lock);" via fib6_dump_table(). While I realize
reading from proc isn't the recommended way of fetching the ipv6 route
table; taking a write lock seems unnecessary and would probably cause
network performance issues.

To verify this I loaded up the ipv6 route table and then ran iperf in 3
cases:
  * doing nothing
  * reading ipv6 route table via proc
    (while :; do cat /proc/net/ipv6_route > /dev/null; done)
  * reading ipv6 route table via rtnetlink
    (while :; do ip -6 route show table all > /dev/null; done)

* Load the ipv6 route table up with:
  * for ((i = 0;i < 4000;i++)); do ip route add unreachable 2000::$i; done

* iperf commands:
  * client: iperf -i 1 -V -c <ipv6 addr>
  * server: iperf -V -s

* iperf results - 3 runs each (in Mbits/sec)
  * nothing: client: 927,927,927 server: 927,927,927
  * proc: client: 179,97,96,113 server: 142,112,133
  * iproute: client: 928,927,928 server: 927,927,927

lock_stat shows taking the write lock is causing the slowdown. Using this
info I decided to write a version of fib6_clean_all() which replaces
write_lock_bh(&table->tb6_lock) with read_lock_bh(&table->tb6_lock). With
this new function I see the same results as with my rtnetlink iperf test.
Signed-off-by: Josh Hunt <joshhunt00@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

29 Dec, 2011 1 commit

ipv6: Kill rt6i_dev and rt6i_expires defines. · d1918542

Committed by David S. Miller on Dec 28, 2011

It just obscures that the netdevice pointer and the expires value are
implemented in the dst_entry sub-object of the ipv6 route.

And it makes grepping for dst_entry member uses much harder too.
Signed-off-by: David S. Miller <davem@davemloft.net>

06 Dec, 2011 1 commit

net: Rename dst_get_neighbour{, _raw} to dst_get_neighbour_noref{, _raw}. · 27217455

Committed by David Miller on Dec 02, 2011

To reflect the fact that a reference is not obtained to the
resulting neighbour entry.
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Roland Dreier <roland@purestorage.com>

04 Dec, 2011 1 commit

ipv6: Various cleanups in ip6_route.c · 507c9b1e

Committed by David S. Miller on Dec 03, 2011

1) x == NULL --> !x
2) x != NULL --> x
3) if() --> if ()
4) while() --> while ()
5) (x & BIT) == 0 --> !(x & BIT)
6) (x&BIT) --> (x & BIT)
7) x=y --> x = y
8) (BIT1|BIT2) --> (BIT1 | BIT2)
9) if ((x & BIT)) --> if (x & BIT)
10) proper argument and struct member alignment
Signed-off-by: David S. Miller <davem@davemloft.net>

17 Nov, 2011 2 commits

ipv6: Use pr_warn() in ip6_fib.c · 8d26784c

Committed by David S. Miller on Nov 17, 2011

Signed-off-by: David S. Miller <davem@davemloft.net>

IPV6 Fix a crash when trying to replace non existing route · 14df015b

Committed by Matti Vaittinen on Nov 16, 2011

This patch fixes a crash when an attempt is made to change a non-existent IPv6 route.

When a new destination node was inserted in the middle of the FIB6 tree, no relevant
sanity checks were performed. The subsequent route insertion might then have been rejected
due to an invalid request, leaving a node with no rt info in the tree.
When this node was accessed, a crash occurred.

The patch adds the missing checks in fib6_add_1().
Signed-off-by: Matti Vaittinen <Mazziesaccount@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

16 Nov, 2011 1 commit

IPv6: Removing unnecessary NULL checks. · 229a66e3

Committed by Matti Vaittinen on Nov 15, 2011

This patch removes unnecessary NULL checks noticed by Dan Carpenter.
Checks were introduced in commit
4a287eba to net-next.
Signed-off-by: Matti Vaittinen <Mazziesaccount@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

15 Nov, 2011 1 commit

IPv6 routing, NLM_F_* flag support: REPLACE and EXCL flags support, warn about missing CREATE flag · 4a287eba

Committed by Matti Vaittinen on Nov 14, 2011

Add support for NLM_F_* flags in IPv6 routing requests.

If the NLM_F_CREATE flag is not set for an RTM_NEWROUTE request,
a warning is printed, but no error is returned. Instead the new route is
added. Later NLM_F_CREATE may be required for
new route creation.

The exception is when the NLM_F_REPLACE flag is given without NLM_F_CREATE and
no matching route is found. In this case it should be safe to assume
that the request issuer is familiar with NLM_F_* flags and really does
not want the route to be created.

Specifying the NLM_F_REPLACE flag will now make the kernel search for a
matching route and replace it with the new one. If no route is found and
NLM_F_CREATE is specified as well, then a new route is created.

Also, specifying NLM_F_EXCL will cause an error to be returned if a matching
route is found.

Patch created against linux-3.2-rc1
Signed-off-by: Matti Vaittinen <Mazziesaccount@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
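
The flag rules described in 4a287eba can be written as a small decision function. This is a sketch under stated assumptions, not the kernel logic: the `F_*` bits are placeholders (real code would use the `NLM_F_*` constants from `<linux/netlink.h>`), `decide()` and `enum action` are hypothetical, the `-ENOENT` return for the REPLACE-without-CREATE case is an assumed error code, and the case "match found, neither EXCL nor REPLACE" is not specified by the message, so the sketch simply adds.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder bits for this sketch; real code would use NLM_F_* values. */
enum {
	F_REPLACE = 1 << 0,
	F_EXCL    = 1 << 1,
	F_CREATE  = 1 << 2,
};

enum action { ACT_ADD, ACT_ADD_WITH_WARNING, ACT_REPLACE };

/*
 * Decide what to do with an RTM_NEWROUTE-style request.
 * Returns 0 and sets *act, or a negative errno.
 */
static int decide(unsigned int flags, bool match_found, enum action *act)
{
	assert(act != NULL);

	if (match_found) {
		if (flags & F_EXCL)
			return -EEXIST;		/* EXCL: an existing match is an error */
		if (flags & F_REPLACE) {
			*act = ACT_REPLACE;
			return 0;
		}
		/* Not specified by the message above; this sketch simply adds. */
		*act = ACT_ADD;
		return 0;
	}

	if (flags & F_CREATE) {
		*act = ACT_ADD;
		return 0;
	}
	if (flags & F_REPLACE)
		return -ENOENT;			/* assumed code: REPLACE without CREATE must not create */

	*act = ACT_ADD_WITH_WARNING;		/* missing CREATE: warn, but still add */
	return 0;
}
```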

20 Oct, 2011 1 commit

cleanup: remove unnecessary include. · 25c8295b

Committed by Kevin Wilson on Oct 16, 2011

This cleanup patch removes unnecessary include from net/ipv6/ip6_fib.c.
Signed-off-by: Kevin Wilson <wkevils@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

03 Aug, 2011 1 commit

net: fix NULL dereferences in check_peer_redir() · f2c31e32

Committed by Eric Dumazet on Jul 29, 2011

Gergely Kalman reported crashes in check_peer_redir().

It appears commit f39925db (ipv4: Cache learned redirect
information in inetpeer.) added a race, leading to possible NULL ptr
dereference.

Since we can now change dst neighbour, we should make sure a reader can
safely use a neighbour.

Add RCU protection to dst neighbour, and make sure check_peer_redir()
can be called safely by different cpus in parallel.

As neighbours are already freed after one RCU grace period, this patch
should not add typical RCU penalty (cache cold effects)

Many thanks to Gergely for providing a pretty report pointing to the
bug.
Reported-by: Gergely Kalman <synapse@hippy.csoma.elte.hu>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

18 Jul, 2011 2 commits

net: Abstract dst->neighbour accesses behind helpers. · 69cce1d1

Committed by David S. Miller on Jul 17, 2011

dst_{get,set}_neighbour()
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv6: Get rid of rt6i_nexthop macro. · 9cbb7ecb

Committed by David S. Miller on Jul 17, 2011

It just makes it harder to see 1) what the code is doing
and 2) grep for all users of dst{->,.}neighbour
Signed-off-by: David S. Miller <davem@davemloft.net>

10 Jun, 2011 1 commit

rtnetlink: Compute and store minimum ifinfo dump size · c7ac8679

Committed by Greg Rose on Jun 10, 2011

The message size allocated for rtnl ifinfo dumps was limited to
a single page.  This is not enough for additional interface info
available with devices that support SR-IOV and caused a bug in
which VF info would not be displayed if more than approximately
40 VFs were created per interface.

Implement a new function pointer for the rtnl_register service that will
calculate the amount of data required for the ifinfo dump and allocate
enough data to satisfy the request.
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

03 May, 2011 1 commit

net: dont hold rtnl mutex during netlink dump callbacks · e67f88dd

Committed by Eric Dumazet on Apr 27, 2011

Four years ago, Patrick made a change to hold rtnl mutex during netlink
dump callbacks.

I believe it was a wrong move. This slows down concurrent dumps, making
good old /proc/net/ files faster than rtnetlink in some situations.

This occurred to me because one "ip link show dev ..." was _very_ slow
on a workload adding/removing network devices in background.

All dump callbacks are able to use RCU locking now, so this patch does
roughly a revert of commits :

1c2d670f : [RTNETLINK]: Hold rtnl_mutex during netlink dump callbacks
6313c1e0 : [RTNETLINK]: Remove unnecessary locking in dump callbacks

This let writers fight for rtnl mutex and readers going full speed.

It also takes care of phonet : phonet_route_get() is now called from rcu
read section. I renamed it to phonet_route_get_rcu()
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Remi Denis-Courmont <remi.denis-courmont@nokia.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

openeuler / Kernel synced successfully about 1 year ago