Commits · 64347f786d13349d6a6f812f3a83c269e26c0136 · openeuler / Kernel

Jan 29, 2008 · 40 commits

[IPV4] fib_trie: dump message multiple part flag · 64347f78

Authored by Stephen Hemminger on Jan 22, 2008

Match fib_hash, and set NLM_F_MULTI to handle multiple part messages.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[IPV4] fib_trie: use hash list · 1328042e

Authored by Stephen Hemminger on Jan 22, 2008

The code to dump can use the existing hash chain rather than doing
repeated lookups.
Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[IPV4] fib_trie: compute size when needed · 93672292

Authored by Stephen Hemminger on Jan 22, 2008

Compute the number of prefixes when needed, rather than doing bookkeeping.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[IPV4] fib_trie: style cleanup · a07f5f50

Authored by Stephen Hemminger on Jan 22, 2008

Style cleanups:
      * make check_leaf return -1 or plen, rather than by reference
      * get rid of #ifdef that is always set
      * split out embedded function calls in if statements
      * fix checkpatch warnings
Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[IPV4] fib_trie: put leaf nodes in a slab cache · bc3c8c1e

Authored by Stephen Hemminger on Jan 22, 2008

This improves locality for operations that touch all the leaves. It also
saves space since these entries don't need to be hardware cache aligned.
Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[DST]: shrinks sizeof(struct rtable) by 64 bytes on x86_64 · 69a73829

Authored by Eric Dumazet on Jan 22, 2008

On x86_64, sizeof(struct rtable) is 0x148, which is rounded up to
0x180 bytes by SLAB allocator.

We can reduce this to exactly 0x140 bytes, without alignment overhead,
and store 12 struct rtable per PAGE instead of 10.

rate_tokens is currently defined as an "unsigned long", while its
content should not exceed 6*HZ. It can safely be converted to an
unsigned int.

Moving tclassid right after rate_tokens fills the 4-byte hole and
saves 8 bytes in 'struct dst_entry', which in turn saves 8 bytes in
'struct rtable'.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
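The arithmetic in this commit message can be reproduced with a small sketch. It assumes a 4096-byte page and 64-byte cache-line alignment, and it ignores any per-slab management overhead; the helper names `round_up_pow2` and `objs_per_page` are hypothetical and not kernel functions.

```c
#include <assert.h>
#include <stddef.h>

/* Round size up to the next multiple of align.
 * Assumption: align is a power of two (e.g. a 64-byte cache line). */
size_t round_up_pow2(size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (size + align - 1) & ~(align - 1);
}

/* Number of whole objects of obj_size bytes that fit in one page.
 * Assumption: no per-slab bookkeeping is stored inside the page. */
size_t objs_per_page(size_t page_size, size_t obj_size)
{
    assert(obj_size != 0);
    return page_size / obj_size;
}
```

Under these assumptions, `round_up_pow2(0x148, 64)` gives 0x180 and `objs_per_page(4096, 0x180)` gives 10, while 0x140 is already a multiple of 64 and `objs_per_page(4096, 0x140)` gives 12, matching the figures quoted in the message.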

[NETNS][FRAGS]: Make the pernet subsystem for fragments. · 81566e83

Authored by Pavel Emelyanov on Jan 22, 2008

On namespace start we mainly prepare the ctl variables.

When the namespace is stopped we have to kill all the fragments that
point to this namespace. inet_frags_exit_net() handles it.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETNS][FRAGS]: Make the LRU list per namespace. · 3140c25c

Authored by Pavel Emelyanov on Jan 22, 2008

The inet_frags.lru_list is used for evicting only, so we have
to make it per-namespace, to evict only those fragments whose
namespace exceeded its high threshold, but not the whole hash.
Besides, this helps to avoid long loops in the evictor.

The spinlock is not per-namespace because it protects the
hash table as well, which is global.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
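The idea of a per-namespace LRU list can be illustrated with a simplified userspace sketch. The structures below are hypothetical stand-ins, not the kernel definitions: a singly linked list replaces the kernel list, the low-water mark `low_thresh` is an assumption of this sketch, and the global spinlock mentioned in the message is omitted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct frag_queue {
    struct frag_queue *lru_next; /* next-oldest queue in the same namespace */
    int truesize;                /* memory charged to this queue */
};

struct netns_frags {
    struct frag_queue *lru_head; /* oldest queue first */
    int mem;                     /* memory charged to this namespace */
    int low_thresh;              /* eviction stops at or below this level */
};

/* Evict the oldest queues of ONE namespace until it is back at or under
 * its low threshold. Queues of other namespaces are never visited.
 * Returns the number of queues evicted. */
int frag_evictor(struct netns_frags *nf)
{
    int evicted = 0;

    assert(nf != NULL);
    while (nf->mem > nf->low_thresh && nf->lru_head != NULL) {
        struct frag_queue *q = nf->lru_head;

        nf->lru_head = q->lru_next; /* unlink the oldest queue */
        nf->mem -= q->truesize;     /* uncharge its memory */
        q->lru_next = NULL;
        evicted++;
    }
    return evicted;
}
```

Because each namespace walks only its own list, a namespace that is under its threshold loses no fragments when another namespace is over its limit, and the loop length is bounded by the offending namespace's own queues.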

[NETNS][FRAGS]: Isolate the secret interval from namespaces. · 3b4bc4a2

Authored by Pavel Emelyanov on Jan 22, 2008

Since we have one hashtable to look up the fragment, having
different secret_interval-s for hash rebuild doesn't make
sense, so move this one to inet_frags.

The inet_frags_ctl becomes empty after this, so remove it.
The appropriate ctl table is kept read-only in namespaces.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETNS][FRAGS]: Make thresholds work in namespaces. · e31e0bdc

Authored by Pavel Emelyanov on Jan 22, 2008

This is the same as with the timeout variable.

Currently, after exceeding the high threshold _all_
the fragments are evicted, but it will be fixed in
a later patch.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETNS][FRAGS]: Make the net.ipv4.ipfrag_timeout work in namespaces. · b2fd5321

Authored by Pavel Emelyanov on Jan 22, 2008

Move it to the netns_frags, adjust the usage and
make the appropriate ctl table writable.

Now fragments that live in different namespaces can
live for different times.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETNS][FRAGS]: Duplicate sysctl tables for new namespaces. · e4a2d5c2

Authored by Pavel Emelyanov on Jan 22, 2008

Each namespace has to have its own tables to tune its
parameters, so duplicate the tables and
register them.

All the tables in sub-namespaces are temporarily made
read-only.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETNS][FRAGS]: Make the mem counter per-namespace. · 6ddc0822

Authored by Pavel Emelyanov on Jan 22, 2008

This is also simple, but introduces more changes, since
the mem counter is altered in more places.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETNS][FRAGS]: Make the nqueues counter per-namespace. · e5a2bb84

Authored by Pavel Emelyanov on Jan 22, 2008

This is simple - just move the variable from struct inet_frags
to struct netns_frags and adjust the usage appropriately.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETNS][FRAGS]: Make the inet_frag_queue lookup work in namespaces. · ac18e750

Authored by Pavel Emelyanov on Jan 22, 2008

Since fragment management code is consolidated, we cannot have the
pointer from inet_frag_queue to struct net, since we must know what
kind of fragment this is.

So, I introduce the netns_frags structure. This one is currently
empty, but will be eventually filled with per-namespace
attributes. Each inet_frag_queue is tagged with this one.

The conntrack_reasm is not "netns-izated", so it has one static
netns_frags instance to keep working in init namespace.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETNS][FRAGS]: Move ctl tables around. · 8d8354d2

Authored by Pavel Emelyanov on Jan 22, 2008

This is a preparation for sysctl netns-ization.
Move the ctl tables to the files where the tuning
variables reside. Plus make the helpers to register
the tables.

This will simplify the later patches and will keep
similar things closer to each other.

ipv4, ipv6 and conntrack_reasm are patched differently,
but the result is that all the tables are in appropriate files.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[IPV4] UDP,UDPLITE: Sparse: {__udp4_lib,udp,udplite}_err() are of void. · fc80be87

Authored by YOSHIFUJI Hideaki on Jan 22, 2008

Fix the following sparse warnings:
| net/ipv4/udp.c:421:2: warning: returning void-valued expression
| net/ipv4/udplite.c:38:2: warning: returning void-valued expression
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

[NETNS]: Pass correct namespace in ip_rt_get_source. · ecfdc8c5

Authored by Denis V. Lunev on Jan 21, 2008

ip_rt_get_source is the infamous place for which dst_ifdown kludges
have been implemented. This means that rt->u.dst.dev can be safely
dereferenced to obtain nd_net.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETNS]: Pass correct namespace in ip_route_input_slow. · 84a885f4

Authored by Denis V. Lunev on Jan 21, 2008

The packet on the input path always has a reference to the input
network device it is passed from. Extract the network namespace from it.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETNS]: Pass correct namespace in context fib_check_nh. · 86167a37

Authored by Denis V. Lunev on Jan 21, 2008

The correct network namespace is already used in fib_check_nh. Rework its
usage for better readability and pass it into fib_lookup &
inetdev_by_index.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETNS]: Pass correct namespace in fib_validate_source. · 5b707aaa

Authored by Denis V. Lunev on Jan 21, 2008

The correct network namespace is available inside fib_validate_source. It
can be obtained from the device passed in. The device is not NULL as
in_device is obtained from it just above.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETNS]: Add netns parameter to inetdev_by_index. · 7fee0ca2

Authored by Denis V. Lunev on Jan 21, 2008

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETNS]: Add netns parameter to fib_lookup. · da0e28cb

Authored by Denis V. Lunev on Jan 21, 2008

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[IPV4]: ipmr sparse warnings · ba93ef74

Authored by Stephen Hemminger on Jan 21, 2008

Get rid of some of the sparse warnings.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[IPV4]: igmp sparse warnings · dd329bfa

Authored by Stephen Hemminger on Jan 21, 2008

Partial sparse warning fix. The other conditional locking
is too much for sparse to handle.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[IPV4]: Enable use of 240/4 address space. · 1e637c74

Authored by Jan Engelhardt on Jan 21, 2008

This short patch modifies the IPv4 networking to enable use of the
240.0.0.0/4 (aka "class-E") address space as proposed in the internet
draft draft-fuller-240space-00.txt.
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
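For orientation, membership in 240.0.0.0/4 is a test on the top four bits of the address. The sketch below is a userspace illustration only; the helper names `ipv4_addr`, `in_240_slash_4` and `is_limited_broadcast` are hypothetical and are not the functions touched by the patch. It assumes addresses in host byte order and that the limited broadcast address 255.255.255.255, which falls inside the same /4, still needs to be told apart.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Build a host-byte-order IPv4 address from four octets. */
uint32_t ipv4_addr(unsigned a, unsigned b, unsigned c, unsigned d)
{
    assert(a < 256 && b < 256 && c < 256 && d < 256);
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
           ((uint32_t)c << 8) | (uint32_t)d;
}

/* True if addr lies in 240.0.0.0/4, i.e. its top four bits are 1111. */
bool in_240_slash_4(uint32_t addr)
{
    return (addr & 0xF0000000u) == 0xF0000000u;
}

/* True only for the limited broadcast address 255.255.255.255. */
bool is_limited_broadcast(uint32_t addr)
{
    return addr == 0xFFFFFFFFu;
}
```

For example, `in_240_slash_4(ipv4_addr(240, 0, 0, 1))` is true and `in_240_slash_4(ipv4_addr(239, 255, 255, 255))` is false.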

[NETNS]: Process FIB rule action in the context of the namespace. · 51314a17

Authored by Denis V. Lunev on Jan 20, 2008

Save the namespace context on the fib rule at rule creation time and
call the routing lookup in the correct namespace.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETNS]: FIB rules API cleanup. · 9e3a5487

Authored by Denis V. Lunev on Jan 20, 2008

Remove struct net from fib_rules_register(unregister)/notify_change
paths and trim code size a bit.

add/remove: 0/0 grow/shrink: 10/12 up/down: 35/-100 (-65)
function                         old     new   delta
notify_rule_change               273     280      +7
trie_show_stats                  471     475      +4
fn_trie_delete                   473     477      +4
fib_rules_unregister             144     148      +4
fib4_rule_compare                119     123      +4
resize                          2842    2845      +3
fn_trie_select_default           515     518      +3
inet_sk_rebuild_header           836     838      +2
fib_trie_seq_show                764     766      +2
__devinet_sysctl_register        276     278      +2
fn_trie_lookup                  1124    1123      -1
ip_fib_check_default             133     131      -2
devinet_conf_sysctl              223     221      -2
snmp_fold_field                  126     123      -3
fn_trie_insert                  2091    2086      -5
inet_create                      876     870      -6
fib4_rules_init                  197     191      -6
fib_sync_down                    452     444      -8
inet_gso_send_check              334     325      -9
fib_create_info                 3003    2991     -12
fib_nl_delrule                   568     553     -15
fib_nl_newrule                   883     852     -31
Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[FIB]: Add netns to fib_rules_ops. · 03592383

Authored by Denis V. Lunev on Jan 20, 2008

The backward link from FIB rules operations to the network namespace
will allow the API to be simplified a bit.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETNS]: Namespace stop vs 'ip r l' race. · 775516bf

Authored by Denis V. Lunev on Jan 18, 2008

During the network namespace stop process, kernel-side netlink sockets
belonging to a namespace should be closed. They should not prevent the
namespace from stopping, so they do not increment the namespace usage
counter. Though this counter will be put during last sock_put.

The replacement of the correct netns for init_ns solves the problem
only partially, as the socket to be stopped is, until proper stop, a valid
netlink kernel socket and can be looked up by the user processes. This
is not a problem until it resides in initial namespace (no processes
inside this net), but this is not true for init_net.

So, hold the reference for a socket, remove it from lookup tables and
only after that change namespace and perform a last put.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Tested-by: Alexey Dobriyan <adobriyan@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETNS]: Consolidate kernel netlink socket destruction. · b7c6ba6e

Authored by Denis V. Lunev on Jan 28, 2008

Create a specific helper for netlink kernel socket disposal. This just
makes the code look better and provides a ground for proper disposal
inside a namespace.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Tested-by: Alexey Dobriyan <adobriyan@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETNS]: Memory leak on network namespace stop. · 4f84d82f

Authored by Denis V. Lunev on Jan 18, 2008

A network namespace allocates 2 kernel netlink sockets, fibnl &
rtnl. These sockets should be disposed of properly, i.e. by
sock_release. Plain sock_put is not enough.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Tested-by: Alexey Dobriyan <adobriyan@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETNS][DST] dst: pass the dst_ops as parameter to the gc functions · 569d3645

Authored by Daniel Lezcano on Jan 18, 2008

The garbage collection function receives the dst_ops structure as a
parameter. This is useful for the next incoming patchset because it
will need the dst_ops (there will be several instances) and the
network namespace pointer (contained in the dst_ops).

The protocols which do not take care of the namespaces will not be
impacted by this change (except for the function signature); they
just ignore the parameter.
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[IPV4] FIB_HASH: Reduce memory needs and speedup lookups · a6501e08

Authored by Eric Dumazet on Jan 18, 2008

Currently, sizeof(struct fib_alias) is 24 or 48 bytes on 32/64 bits
arches.

Because of SLAB_HWCACHE_ALIGN requirement, these are rounded to 32 and
64 bytes respectively.

This patch moves rcu to the end of fib_alias, and conditionally
defines it only for CONFIG_IP_FIB_TRIE.

We also remove SLAB_HWCACHE_ALIGN requirement for fib_alias and
fib_node objects because it is not necessary.

(BTW SLUB currently denies it for objects smaller than
cache_line_size() / 2, but not SLAB)

Finally, sizeof(fib_alias) goes back to 16 and 32 bytes.

Then, we can embed one fib_alias on each fib_node, to favor locality.
Most of the time access to the fib_alias will be free because one
cache line contains both the list head (fn_alias) and (one of) the
list element.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
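The size figures in this message follow from the structure layout. The sketch below uses userspace stand-ins for `list_head` and `rcu_head` (two pointers each) and an illustrative field set, so it is an approximation of the layout rather than the kernel definition; `fib_alias_saving` is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel types: two pointers each. */
struct list_head { struct list_head *next, *prev; };
struct rcu_head  { struct rcu_head *next; void (*func)(struct rcu_head *); };

/* Layout with the rcu member present (the CONFIG_IP_FIB_TRIE case). */
struct fib_alias_with_rcu {
    struct list_head fa_list;
    void *fa_info;
    unsigned char fa_tos, fa_type, fa_scope, fa_state;
    struct rcu_head rcu;      /* placed last so it can be compiled out */
};

/* Layout with the rcu member compiled out (the fib_hash case). */
struct fib_alias_no_rcu {
    struct list_head fa_list;
    void *fa_info;
    unsigned char fa_tos, fa_type, fa_scope, fa_state;
};

static_assert(sizeof(struct fib_alias_with_rcu) > sizeof(struct fib_alias_no_rcu),
              "dropping the rcu member must shrink the object");

/* Bytes saved per object when the rcu member is compiled out. */
size_t fib_alias_saving(void)
{
    return sizeof(struct fib_alias_with_rcu) - sizeof(struct fib_alias_no_rcu);
}
```

On a typical 64-bit build the two structures come out at 48 and 32 bytes, and on a typical 32-bit build at 24 and 16 bytes, which lines up with the numbers in the message.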

[FIB]: Fix rcu_dereference() abuses in fib_trie.c · b59cfbf7

Authored by Eric Dumazet on Jan 18, 2008

node_parent() and tnode_get_child() currently use rcu_dereference().

These functions are called from both
- readers only paths (where rcu_dereference() is needed), and
- writer path (where rcu_dereference() is not needed)

To make explicit where rcu_dereference() is really needed, I
introduced new node_parent_rcu() and tnode_get_child_rcu() functions
which use rcu_dereference(), while node_parent() and tnode_get_child()
don't use it.

Then I changed calling sites where rcu_dereference() was really needed
to call the _rcu() variants.

This should have no impact but for alpha architecture, and may help
future sparse checks.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
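The reader/writer split described above can be mimicked in userspace with C11 atomics. This is an analogy only, not kernel code: a `memory_order_consume` load stands in for rcu_dereference(), a release store stands in for rcu_assign_pointer(), and the simplified `struct tnode` and the helper `node_set_parent` are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Simplified node: only the parent link matters for this sketch. */
struct tnode {
    _Atomic(struct tnode *) parent;
};

/* Reader path: dependency-ordered load, the role rcu_dereference() plays. */
struct tnode *node_parent_rcu(struct tnode *n)
{
    assert(n != NULL);
    return atomic_load_explicit(&n->parent, memory_order_consume);
}

/* Writer path: the caller already serializes updates, so no ordering
 * beyond a plain (relaxed) load is needed. */
struct tnode *node_parent(struct tnode *n)
{
    assert(n != NULL);
    return atomic_load_explicit(&n->parent, memory_order_relaxed);
}

/* Writer publishes a new parent so that readers see an initialized node. */
void node_set_parent(struct tnode *n, struct tnode *p)
{
    assert(n != NULL);
    atomic_store_explicit(&n->parent, p, memory_order_release);
}
```

Both accessors return the same pointer; the difference is only in the ordering guarantee, which is why the change is described as having no impact except on architectures such as alpha where dependent loads need an explicit barrier.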

[NETFILTER]: nf_conntrack: make print_conntrack function optional for l4protos · c71e9167

Authored by Patrick McHardy on Jan 14, 2008

This allows five empty implementations to be removed.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETFILTER]: nf_conntrack: remove print_conntrack function from l3protos · c56cc9c0

Authored by Patrick McHardy on Jan 14, 2008

It's unused and unlikely to ever be used.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETFILTER]: kill nf_sysctl.c · 4f536522

Authored by Patrick McHardy on Jan 14, 2008

Since there now is generic support for shared sysctl paths, the only
remains are the net/netfilter and net/ipv4/netfilter paths. Move them
to net/netfilter/core.c and net/ipv4/netfilter.c and kill nf_sysctl.c.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETFILTER]: ipt_REJECT: properly handle IP options · 9ba99b0d

Authored by Patrick McHardy on Jan 14, 2008

The current TCP RST construction reuses the old packet and can't
deal with IP options as a consequence of that. Construct the
RST from scratch instead.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETFILTER]: {ip,ip6}_tables: remove some inlines · 022748a9

Authored by Denys Vlasenko on Jan 14, 2008

This patch removes inlines except those which are used
by packet matching code and thus are performance-critical.

Before:

$ size */*/*/ip*tables*.o
   text    data     bss     dec     hex filename
   6402     500      16    6918    1b06 net/ipv4/netfilter/ip_tables.o
   7130     500      16    7646    1dde net/ipv6/netfilter/ip6_tables.o

After:

$ size */*/*/ip*tables*.o
   text    data     bss     dec     hex filename
   6307     500      16    6823    1aa7 net/ipv4/netfilter/ip_tables.o
   7010     500      16    7526    1d66 net/ipv6/netfilter/ip6_tables.o
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

openeuler / Kernel synced successfully about 1 year ago