提交 · 3dc43e3e4d0b52197d3205214fe8f162f9e0c334 · openanolis / cloud-kernel

13 12月, 2011 1 次提交

per-netns ipv4 sysctl_tcp_mem · 3dc43e3e

由 Glauber Costa 提交于 12月 11, 2011

This patch allows each namespace to independently set up
its levels for tcp memory pressure thresholds. This patch
alone does not buy much: we need to make this values
per group of process somehow. This is achieved in the
patches that follows in this patchset.
Signed-off-by: NGlauber Costa <glommer@parallels.com>
Reviewed-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
CC: David S. Miller <davem@davemloft.net>
CC: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3dc43e3e

12 12月, 2011 1 次提交

net: use IS_ENABLED(CONFIG_IPV6) · dfd56b8b

由 Eric Dumazet 提交于 12月 10, 2011

Instead of testing defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

dfd56b8b

22 11月, 2011 1 次提交

netfilter: nf_conntrack: make event callback registration per-netns · 70e9942f

由 Pablo Neira Ayuso 提交于 11月 22, 2011

This patch fixes an oops that can be triggered following this recipe:

0) make sure nf_conntrack_netlink and nf_conntrack_ipv4 are loaded.
1) container is started.
2) connect to it via lxc-console.
3) generate some traffic with the container to create some conntrack
   entries in its table.
4) stop the container: you hit one oops because the conntrack table
   cleanup tries to report the destroy event to user-space but the
   per-netns nfnetlink socket has already gone (as the nfnetlink
   socket is per-netns but event callback registration is global).

To fix this situation, we make the ctnl_notifier per-netns so the
callback is registered/unregistered if the container is
created/destroyed.

Alex Bligh and Alexey Dobriyan originally proposed one small patch to
check if the nfnetlink socket is gone in nfnetlink_has_listeners,
but this is a very visited path for events, thus, it may reduce
performance and it looks a bit hackish to check for the nfnetlink
socket only to workaround this situation. As a result, I decided
to follow the bigger path choice, which seems to look nicer to me.

Cc: Alexey Dobriyan <adobriyan@gmail.com>
Reported-by: NAlex Bligh <alex@alex.org.uk>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

70e9942f

14 11月, 2011 1 次提交

ipv6: reduce percpu needs for icmpv6msg mibs · 2a24444f

由 Eric Dumazet 提交于 11月 13, 2011

Reading /proc/net/snmp6 on a machine with a lot of cpus is very
expensive (can be ~88000 us).

This is because ICMPV6MSG MIB uses 4096 bytes per cpu, and folding
values for all possible cpus can read 16 Mbytes of memory (32MBytes on
non x86 arches)

ICMP messages are not considered as fast path on a typical server, and
eventually few cpus handle them anyway. We can afford an atomic
operation instead of using percpu data.

This saves 4096 bytes per cpu and per network namespace.
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2a24444f

10 11月, 2011 1 次提交

ipv4: reduce percpu needs for icmpmsg mibs · acb32ba3

由 Eric Dumazet 提交于 11月 08, 2011

Reading /proc/net/snmp on a machine with a lot of cpus is very expensive
(can be ~88000 us).

This is because ICMPMSG MIB uses 4096 bytes per cpu, and folding values
for all possible cpus can read 16 Mbytes of memory.

ICMP messages are not considered as fast path on a typical server, and
eventually few cpus handle them anyway. We can afford an atomic
operation instead of using percpu data.

This saves 4096 bytes per cpu and per network namespace.
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

acb32ba3

27 7月, 2011 1 次提交

atomic: use <linux/atomic.h> · 60063497

由 Arun Sharma 提交于 7月 26, 2011

This allows us to move duplicated code in <asm/atomic.h>
(atomic_inc_not_zero() for now) to <linux/atomic.h>
Signed-off-by: NArun Sharma <asharma@fb.com>
Reviewed-by: NEric Dumazet <eric.dumazet@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: NMike Frysinger <vapier@gentoo.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

60063497

14 5月, 2011 1 次提交

net: ipv4: add IPPROTO_ICMP socket kind · c319b4d7

由 Vasiliy Kulikov 提交于 5月 13, 2011

This patch adds IPPROTO_ICMP socket kind.  It makes it possible to send
ICMP_ECHO messages and receive the corresponding ICMP_ECHOREPLY messages
without any special privileges.  In other words, the patch makes it
possible to implement setuid-less and CAP_NET_RAW-less /bin/ping.  In
order not to increase the kernel's attack surface, the new functionality
is disabled by default, but is enabled at bootup by supporting Linux
distributions, optionally with restriction to a group or a group range
(see below).

Similar functionality is implemented in Mac OS X:
http://www.manpagez.com/man/4/icmp/

A new ping socket is created with

    socket(PF_INET, SOCK_DGRAM, PROT_ICMP)

Message identifiers (octets 4-5 of ICMP header) are interpreted as local
ports. Addresses are stored in struct sockaddr_in. No port numbers are
reserved for privileged processes, port 0 is reserved for API ("let the
kernel pick a free number"). There is no notion of remote ports, remote
port numbers provided by the user (e.g. in connect()) are ignored.

Data sent and received include ICMP headers. This is deliberate to:
1) Avoid the need to transport headers values like sequence numbers by
other means.
2) Make it easier to port existing programs using raw sockets.

ICMP headers given to send() are checked and sanitized. The type must be
ICMP_ECHO and the code must be zero (future extensions might relax this,
see below). The id is set to the number (local port) of the socket, the
checksum is always recomputed.

ICMP reply packets received from the network are demultiplexed according
to their id's, and are returned by recv() without any modifications.
IP header information and ICMP errors of those packets may be obtained
via ancillary data (IP_RECVTTL, IP_RETOPTS, and IP_RECVERR). ICMP source
quenches and redirects are reported as fake errors via the error queue
(IP_RECVERR); the next hop address for redirects is saved to ee_info (in
network order).

socket(2) is restricted to the group range specified in
"/proc/sys/net/ipv4/ping_group_range".  It is "1 0" by default, meaning
that nobody (not even root) may create ping sockets.  Setting it to "100
100" would grant permissions to the single group (to either make
/sbin/ping g+s and owned by this group or to grant permissions to the
"netadmins" group), "0 4294967295" would enable it for the world, "100
4294967295" would enable it for the users, but not daemons.

The existing code might be (in the unlikely case anyone needs it)
extended rather easily to handle other similar pairs of ICMP messages
(Timestamp/Reply, Information Request/Reply, Address Mask Request/Reply
etc.).

Userspace ping util & patch for it:
http://openwall.info/wiki/people/segoon/ping

For Openwall GNU/*/Linux it was the last step on the road to the
setuid-less distro.  A revision of this patch (for RHEL5/OpenVZ kernels)
is in use in Owl-current, such as in the 2011/03/12 LiveCD ISOs:
http://mirrors.kernel.org/openwall/Owl/current/iso/

Initially this functionality was written by Pavel Kankovsky for
Linux 2.4.32, but unfortunately it was never made public.

All ping options (-b, -p, -Q, -R, -s, -t, -T, -M, -I), are tested with
the patch.

PATCH v3:
    - switched to flowi4.
    - minor changes to be consistent with raw sockets code.

PATCH v2:
    - changed ping_debug() to pr_debug().
    - removed CONFIG_IP_PING.
    - removed ping_seq_fops.owner field (unused for procfs).
    - switched to proc_net_fops_create().
    - switched to %pK in seq_printf().

PATCH v1:
    - fixed checksumming bug.
    - CAP_NET_RAW may not create icmp sockets anymore.

RFC v2:
    - minor cleanups.
    - introduced sysctl'able group range to restrict socket(2).
Signed-off-by: NVasiliy Kulikov <segoon@openwall.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c319b4d7

25 3月, 2011 1 次提交

ipv4: Invalidate nexthop cache nh_saddr more correctly. · 436c3b66

由 David S. Miller 提交于 3月 24, 2011

Any operation that:

1) Brings up an interface
2) Adds an IP address to an interface
3) Deletes an IP address from an interface

can potentially invalidate the nh_saddr value, requiring
it to be recomputed.

Perform the recomputation lazily using a generation ID.
Reported-by: NJulian Anastasov <ja@ssi.bg>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

436c3b66

15 3月, 2011 1 次提交

ipvs: move struct netns_ipvs · 2553d064

由 Julian Anastasov 提交于 3月 04, 2011

 	Remove include/net/netns/ip_vs.h because it depends on
structures from include/net/ip_vs.h. As ipvs is pointer in
struct net it is better to move struct netns_ipvs into
include/net/ip_vs.h, so that we can easily use other structures
in struct netns_ipvs.
Signed-off-by: NJulian Anastasov <ja@ssi.bg>
Signed-off-by: NSimon Horman <horms@verge.net.au>

2553d064

19 1月, 2011 1 次提交

netfilter: nf_conntrack_tstamp: add flow-based timestamp extension · a992ca2a

由 Pablo Neira Ayuso 提交于 1月 19, 2011

This patch adds flow-based timestamping for conntracks. This
conntrack extension is disabled by default. Basically, we use
two 64-bits variables to store the creation timestamp once the
conntrack has been confirmed and the other to store the deletion
time. This extension is disabled by default, to enable it, you
have to:

echo 1 > /proc/sys/net/netfilter/nf_conntrack_timestamp

This patch allows to save memory for user-space flow-based
loogers such as ulogd2. In short, ulogd2 does not need to
keep a hashtable with the conntrack in user-space to know
when they were created and destroyed, instead we use the
kernel timestamp. If we want to have a sane IPFIX implementation
in user-space, this nanosecs resolution timestamps are also
useful. Other custom user-space applications can benefit from
this via libnetfilter_conntrack.

This patch modifies the /proc output to display the delta time
in seconds since the flow start. You can also obtain the
flow-start date by means of the conntrack-tools.
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: NPatrick McHardy <kaber@trash.net>

a992ca2a

14 1月, 2011 1 次提交

netfilter: nf_conntrack: use is_vmalloc_addr() · d862a662

由 Patrick McHardy 提交于 1月 14, 2011

Use is_vmalloc_addr() in nf_ct_free_hashtable() and get rid of
the vmalloc flags to indicate that a hash table has been allocated
using vmalloc().
Signed-off-by: NPatrick McHardy <kaber@trash.net>

d862a662

13 1月, 2011 17 次提交

IPVS: netns, svc counters moved in ip_vs_ctl,c · 763f8d0e

由 Hans Schillstrom 提交于 1月 03, 2011

Last two global vars to be moved,
ip_vs_ftpsvc_counter and ip_vs_nullsvc_counter.

[horms@verge.net.au: removed whitespace-change-only hunk]
Signed-off-by: NHans Schillstrom <hans.schillstrom@ericsson.com>
Acked-by: NJulian Anastasov <ja@ssi.bg>
Signed-off-by: NSimon Horman <horms@verge.net.au>

763f8d0e

IPVS: netns, trash handling · f2431e6e

由 Hans Schillstrom 提交于 1月 03, 2011

trash list per namspace,
and reordering of some params in dst struct.

[ horms@verge.net.au: Use cancel_delayed_work_sync() instead of
	              cancel_rearming_delayed_work(). Found during
		      merge conflict resoliution ]
Signed-off-by: NHans Schillstrom <hans.schillstrom@ericsson.com>
Acked-by: NJulian Anastasov <ja@ssi.bg>
Signed-off-by: NSimon Horman <horms@verge.net.au>

f2431e6e

IPVS: netns, defense work timer. · f6340ee0

由 Hans Schillstrom 提交于 1月 03, 2011

This patch makes defense work timer per name-space,
A net ptr had to be added to the ipvs struct,
since it's needed by defense_work_handler.

[ horms@verge.net.au: Use cancel_delayed_work_sync() instead of
	              cancel_rearming_delayed_work(). Found during
		      merge conflict resoliution ]
Signed-off-by: NHans Schillstrom <hans.schillstrom@ericsson.com>
Acked-by: NJulian Anastasov <ja@ssi.bg>
Signed-off-by: NSimon Horman <horms@verge.net.au>

f6340ee0

IPVS: netns, ip_vs_ctl local vars moved to ipvs struct. · a0840e2e

由 Hans Schillstrom 提交于 1月 03, 2011

Moving global vars to ipvs struct, except for svc table lock.
Next patch for ctl will be drop-rate handling.

*v3
__ip_vs_mutex remains global
 ip_vs_conntrack_enabled(struct netns_ipvs *ipvs)
Signed-off-by: NHans Schillstrom <hans.schillstrom@ericsson.com>
Acked-by: NJulian Anastasov <ja@ssi.bg>
Signed-off-by: NSimon Horman <horms@verge.net.au>

a0840e2e

IPVS: netns, connection hash got net as param. · 6e67e586

由 Hans Schillstrom 提交于 1月 03, 2011

Connection hash table is now name space aware.
i.e. net ptr >> 8 is xor:ed to the hash,
and this is the first param to be compared.
The net struct is 0xa40 in size ( a little bit smaller for 32 bit arch:s)
and cache-line aligned, so a ptr >> 5 might be a more clever solution ?

All lookups where net is compared uses net_eq() which returns 1 when netns
is disabled, and the compiler seems to do something clever in that case.

ip_vs_conn_fill_param() have *net as first param now.

Three new inlines added to keep conn struct smaller
when names space is disabled.
- ip_vs_conn_net()
- ip_vs_conn_net_set()
- ip_vs_conn_net_eq()

*v3
  moved net compare to the end in "fast path"
Signed-off-by: NHans Schillstrom <hans.schillstrom@ericsson.com>
Acked-by: NJulian Anastasov <ja@ssi.bg>
Signed-off-by: NSimon Horman <horms@verge.net.au>

6e67e586

IPVS: netns, ip_vs_stats and its procfs · b17fc996

由 Hans Schillstrom 提交于 1月 03, 2011

The statistic counter locks for every packet are now removed,
and that statistic is now per CPU, i.e. no locks needed.
However summing is made in ip_vs_est into ip_vs_stats struct
which is moved to ipvs struc.

procfs, ip_vs_stats now have a "per cpu" count and a grand total.
A new function seq_file_single_net() in ip_vs.h created for handling of
single_open_net() since it does not place net ptr in a struct, like others.

/var/lib/lxc # cat /proc/net/ip_vs_stats_percpu
       Total Incoming Outgoing         Incoming         Outgoing
CPU    Conns  Packets  Packets            Bytes            Bytes
  0        0        3        1               9D               34
  1        0        1        2               49               70
  2        0        1        2               34               76
  3        1        2        2               70               74
  ~        1        7        7              18A              18E

     Conns/s   Pkts/s   Pkts/s          Bytes/s          Bytes/s
           0        0        0                0                0

*v3
ip_vs_stats reamains as before, instead ip_vs_stats_percpu is added.
u64 seq lock added

*v4
Bug correction inbytes and outbytes as own vars..
per_cpu counter for all stats now as suggested by Julian.

[horms@verge.net.au: removed whitespace-change-only hunk]
Signed-off-by: NHans Schillstrom <hans.schillstrom@ericsson.com>
Acked-by: NJulian Anastasov <ja@ssi.bg>
Signed-off-by: NSimon Horman <horms@verge.net.au>

b17fc996

IPVS: netns awareness to ip_vs_sync · f131315f

由 Hans Schillstrom 提交于 1月 03, 2011

All global variables moved to struct ipvs,
most external changes fixed (i.e. init_net removed)
in sync_buf create  + 4 replaced by sizeof(struct..)
Signed-off-by: NHans Schillstrom <hans.schillstrom@ericsson.com>
Acked-by: NJulian Anastasov <ja@ssi.bg>
Signed-off-by: NSimon Horman <horms@verge.net.au>

f131315f

IPVS: netns awareness to ip_vs_est · 29c2026f

由 Hans Schillstrom 提交于 1月 03, 2011

All variables moved to struct ipvs,
most external changes fixed (i.e. init_net removed)

*v3
 timer per ns instead of a common timer in estimator.
Signed-off-by: NHans Schillstrom <hans.schillstrom@ericsson.com>
Acked-by: NJulian Anastasov <ja@ssi.bg>
Signed-off-by: NSimon Horman <horms@verge.net.au>

29c2026f

IPVS: netns awareness to ip_vs_app · ab8a5e84

由 Hans Schillstrom 提交于 1月 03, 2011

All variables moved to struct ipvs,
most external changes fixed (i.e. init_net removed)

in ip_vs_protocol param struct net *net added to:
 - register_app()
 - unregister_app()
This affected almost all proto_xxx.c files
Signed-off-by: NHans Schillstrom <hans.schillstrom@ericsson.com>
Acked-by: NJulian Anastasov <ja@ssi.bg>
Signed-off-by: NSimon Horman <horms@verge.net.au>

ab8a5e84

IPVS: netns preparation for proto_sctp · 9d934878

由 Hans Schillstrom 提交于 1月 03, 2011

In this phase (one), all local vars will be moved to ipvs struct.

Remaining work, add param struct net *net to a couple of
functions that is common for all protos and use ip_vs_proto_data

*v3
 Removed unuset function set_state_timeout()
Signed-off-by: NHans Schillstrom <hans.schillstrom@ericsson.com>
Acked-by: NJulian Anastasov <ja@ssi.bg>
Signed-off-by: NSimon Horman <horms@verge.net.au>

9d934878

IPVS: netns preparation for proto_udp · 78b16bde

由 Hans Schillstrom 提交于 1月 03, 2011

In this phase (one), all local vars will be moved to ipvs struct.

Remaining work, add param struct net *net to a couple of
functions that is common for all protos and use ip_vs_proto_data

*v3
Removed unused function set_state_timeout()
Signed-off-by: NHans Schillstrom <hans.schillstrom@ericsson.com>
Acked-by: NJulian Anastasov <ja@ssi.bg>
Signed-off-by: NSimon Horman <horms@verge.net.au>

78b16bde

IPVS: netns preparation for proto_tcp · 4a85b96c

由 Hans Schillstrom 提交于 1月 03, 2011

In this phase (one), all local vars will be moved to ipvs struct.

Remaining work, add param struct net *net to a couple of
functions that is common for all protos and use all
ip_vs_proto_data

*v3
Removed unused function as sugested by Simon
Signed-off-by: NHans Schillstrom <hans.schillstrom@ericsson.com>
Acked-by: NJulian Anastasov <ja@ssi.bg>
Signed-off-by: NSimon Horman <horms@verge.net.au>

4a85b96c

IPVS: netns, prepare protocol · 252c6410

由 Hans Schillstrom 提交于 1月 03, 2011

Add support for protocol data per name-space.
in struct ip_vs_protocol, appcnt will be removed when all protos
are modified for network name-space.

This patch causes warnings of unused functions, they will be used
when next patch will be applied.
Signed-off-by: NHans Schillstrom <hans.schillstrom@ericsson.com>
Acked-by: NJulian Anastasov <ja@ssi.bg>
Signed-off-by: NSimon Horman <horms@verge.net.au>

252c6410

IPVS: netns awarness to lblc sheduler · b6e885dd

由 Hans Schillstrom 提交于 1月 03, 2011

var sysctl_ip_vs_lblc_expiration moved to ipvs struct as
    sysctl_lblc_expiration

procfs updated to handle this.
Signed-off-by: NHans Schillstrom <hans.schillstrom@ericsson.com>
Acked-by: NJulian Anastasov <ja@ssi.bg>
Signed-off-by: NSimon Horman <horms@verge.net.au>

b6e885dd

IPVS: netns awarness to lblcr sheduler · d0a1eef9

由 Hans Schillstrom 提交于 1月 03, 2011

var sysctl_ip_vs_lblcr_expiration moved to ipvs struct as
    sysctl_lblcr_expiration

procfs updated to handle this.
Signed-off-by: NHans Schillstrom <hans.schillstrom@ericsson.com>
Acked-by: NJulian Anastasov <ja@ssi.bg>
Signed-off-by: NSimon Horman <horms@verge.net.au>

d0a1eef9

IPVS: netns to services part 1 · fc723250

由 Hans Schillstrom 提交于 1月 03, 2011

Services hash tables got netns ptr a hash arg,
While Real Servers (rs) has been moved to ipvs struct.
Two new inline functions added to get net ptr from skb.

Since ip_vs is called from different contexts there is two
places to dig for the net ptr skb->dev or skb->sk
this is handled in skb_net() and skb_sknet()

Global functions, ip_vs_service_get() ip_vs_lookup_real_service()
etc have got  struct net *net as first param.
If possible get net ptr skb etc,
 - if not &init_net is used at this early stage of patching.

ip_vs_ctl.c  procfs not ready for netns yet.

*v3
 Comments by Julian
- __ip_vs_service_find and __ip_vs_svc_fwm_find are fast path,
  net_eq(svc->net, net) so the check is at the end now.
- net = skb_net(skb) in ip_vs_out moved after check for skb_dst.
Signed-off-by: NHans Schillstrom <hans.schillstrom@ericsson.com>
Acked-by: NJulian Anastasov <ja@ssi.bg>
Signed-off-by: NSimon Horman <horms@verge.net.au>

fc723250

IPVS: netns, add basic init per netns. · 61b1ab45

由 Hans Schillstrom 提交于 1月 03, 2011

Preparation for network name-space init, in this stage
some empty functions exists.

In most files there is a check if it is root ns i.e. init_net
if (!net_eq(net, &init_net))
        return ...
this will be removed by the last patch, when enabling name-space.

*v3
 ip_vs_conn.c merge error corrected.
 net_ipvs #ifdef removed as sugested by Jan Engelhardt

[ horms@verge.net.au: Removed whitespace-change-only hunks ]
Signed-off-by: NHans Schillstrom <hans.schillstrom@ericsson.com>
Acked-by: NJulian Anastasov <ja@ssi.bg>
Signed-off-by: NSimon Horman <horms@verge.net.au>

61b1ab45

22 11月, 2010 1 次提交

netns: let net_generic take pointer-to-const args · 20a95a21

由 Jan Engelhardt 提交于 11月 20, 2010

This commit is same in nature as v2.6.37-rc1-755-g3654654f; the network
namespace itself is not modified when calling net_generic, so the
parameter can be const.
Signed-off-by: NJan Engelhardt <jengelh@medozas.de>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

20a95a21

18 10月, 2010 1 次提交

netns: reorder fields in struct net · 8e602ce2

由 Eric Dumazet 提交于 10月 14, 2010

In a network bench, I noticed an unfortunate false sharing between
'loopback_dev' and 'count' fields in "struct net".

'count' is written each time a socket is created or destroyed, while
loopback_dev might be often read in routing code.

Move loopback_dev in a read mostly section of "struct net"

Note: struct netns_xfrm is cache line aligned on SMP.
(It contains a "struct dst_ops")
Move it at the end to avoid holes, and reduce sizeof(struct net) by 128
bytes on ia32.
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

8e602ce2

11 5月, 2010 4 次提交

ipv6: ip6mr: support multiple tables · d1db275d

由 Patrick McHardy 提交于 5月 11, 2010

This patch adds support for multiple independant multicast routing instances,
named "tables".

Userspace multicast routing daemons can bind to a specific table instance by
issuing a setsockopt call using a new option MRT6_TABLE. The table number is
stored in the raw socket data and affects all following ip6mr setsockopt(),
getsockopt() and ioctl() calls. By default, a single table (RT6_TABLE_DFLT)
is created with a default routing rule pointing to it. Newly created pim6reg
devices have the table number appended ("pim6regX"), with the exception of
devices created in the default table, which are named just "pim6reg" for
compatibility reasons.

Packets are directed to a specific table instance using routing rules,
similar to how regular routing rules work. Currently iif, oif and mark
are supported as keys, source and destination addresses could be supported
additionally.

Example usage:

- bind pimd/xorp/... to a specific table:

uint32_t table = 123;
setsockopt(fd, SOL_IPV6, MRT6_TABLE, &table, sizeof(table));

- create routing rules directing packets to the new table:

# ip -6 mrule add iif eth0 lookup 123
# ip -6 mrule add oif eth0 lookup 123
Signed-off-by: NPatrick McHardy <kaber@trash.net>

d1db275d

P
ipv6: ip6mr: move mroute data into seperate structure · 6bd52143
由 Patrick McHardy 提交于 5月 11, 2010
```
Signed-off-by: NPatrick McHardy <kaber@trash.net>
```
6bd52143
P
ipv6: ip6mr: convert struct mfc_cache to struct list_head · f30a7784
由 Patrick McHardy 提交于 5月 11, 2010
```
Signed-off-by: NPatrick McHardy <kaber@trash.net>
```
f30a7784

ipv6: ip6mr: move unres_queue and timer to per-namespace data · c476efbc

由 Patrick McHardy 提交于 5月 11, 2010

The unres_queue is currently shared between all namespaces. Following patches
will additionally allow to create multiple multicast routing tables in each
namespace. Having a single shared queue for all these users seems to excessive,
move the queue and the cleanup timer to the per-namespace data to unshare it.

As a side-effect, this fixes a bug in the seq file iteration functions: the
first entry returned is always from the current namespace, entries returned
after that may belong to any namespace.
Signed-off-by: NPatrick McHardy <kaber@trash.net>

c476efbc

08 5月, 2010 1 次提交

ipv4: remove ip_rt_secret timer (v4) · 3ee94372

由 Neil Horman 提交于 5月 08, 2010

A while back there was a discussion regarding the rt_secret_interval timer.
Given that we've had the ability to do emergency route cache rebuilds for awhile
now, based on a statistical analysis of the various hash chain lengths in the
cache, the use of the flush timer is somewhat redundant. This patch removes the
rt_secret_interval sysctl, allowing us to rely solely on the statistical
analysis mechanism to determine the need for route cache flushes.
Signed-off-by: NNeil Horman <nhorman@tuxdriver.com>
Acked-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3ee94372

28 4月, 2010 1 次提交

net: disallow to use net_assign_generic externally · 05fceb4a

由 Jiri Pirko 提交于 4月 23, 2010

Now there's no need to use this fuction directly because it's handled by
register_pernet_device. So to make this simple and easy to understand,
make this static to do not tempt potentional users.
Signed-off-by: NJiri Pirko <jpirko@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

05fceb4a

14 4月, 2010 4 次提交

ipv4: ipmr: support multiple tables · f0ad0860

由 Patrick McHardy 提交于 4月 13, 2010

This patch adds support for multiple independant multicast routing instances,
named "tables".

Userspace multicast routing daemons can bind to a specific table instance by
issuing a setsockopt call using a new option MRT_TABLE. The table number is
stored in the raw socket data and affects all following ipmr setsockopt(),
getsockopt() and ioctl() calls. By default, a single table (RT_TABLE_DEFAULT)
is created with a default routing rule pointing to it. Newly created pimreg
devices have the table number appended ("pimregX"), with the exception of
devices created in the default table, which are named just "pimreg" for
compatibility reasons.

Packets are directed to a specific table instance using routing rules,
similar to how regular routing rules work. Currently iif, oif and mark
are supported as keys, source and destination addresses could be supported
additionally.

Example usage:

- bind pimd/xorp/... to a specific table:

uint32_t table = 123;
setsockopt(fd, IPPROTO_IP, MRT_TABLE, &table, sizeof(table));

- create routing rules directing packets to the new table:

# ip mrule add iif eth0 lookup 123
# ip mrule add oif eth0 lookup 123
Signed-off-by: NPatrick McHardy <kaber@trash.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f0ad0860

P
ipv4: ipmr: move mroute data into seperate structure · 0c12295a
由 Patrick McHardy 提交于 4月 13, 2010
```
Signed-off-by: NPatrick McHardy <kaber@trash.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
```
0c12295a
P
ipv4: ipmr: convert struct mfc_cache to struct list_head · 862465f2
由 Patrick McHardy 提交于 4月 13, 2010
```
Signed-off-by: NPatrick McHardy <kaber@trash.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
```
862465f2

ipv4: ipmr: move unres_queue and timer to per-namespace data · e258beb2

由 Patrick McHardy 提交于 4月 13, 2010

e258beb2

openanolis / cloud-kernel 12 个月 前同步成功

openanolis / cloud-kernel
12 个月前同步成功