Commits · d956798d82d2d331c031301965d69e17a1a48a2b · openanolis / cloud-kernel

01 Feb, 2011 13 commits

netfilter: xtables: "set" match and "SET" target support · d956798d

Committed by Jozsef Kadlecsik on Feb 01, 2011

The patch adds the combined module of the "SET" target and "set" match
to netfilter. Both the previous and the current revisions are supported.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>

d956798d

netfilter: ipset: list:set set type support · f830837f

Committed by Jozsef Kadlecsik on Feb 01, 2011

The module implements the list:set type support in two flavours:
without and with timeout. The sets have two sides: for the userspace,
they store the names of other (non list:set type of) sets: one can add,
delete and test set names. For the kernel, it forms an ordered union of
the member sets: the member sets are tried in order when elements are
added, deleted and tested and the process stops at the first success.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>

f830837f
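
The "ordered union" behaviour described for list:set can be sketched in userspace C as follows. This is only an illustration: `struct range_set`, `range_set_test()` and `list_set_test()` are hypothetical names, and a simple value range stands in for a real member set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical member set: a closed range of element values. */
struct range_set {
	unsigned int lo;
	unsigned int hi;
};

/* Stand-in for the "test" operation of one member set. */
static bool range_set_test(const struct range_set *s, unsigned int elem)
{
	return elem >= s->lo && elem <= s->hi;
}

/*
 * Try the member sets in order and stop at the first success.
 * Returns the index of the member that matched, or -1 if none did.
 */
int list_set_test(const struct range_set *members, size_t n, unsigned int elem)
{
	assert(members != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (range_set_test(&members[i], elem))
			return (int)i;
	}
	return -1;
}
```

Because the walk stops at the first success, the order of the member sets matters when two members could both accept the same element.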

netfilter: ipset: hash:net,port set type support · 21f45020

Committed by Jozsef Kadlecsik on Feb 01, 2011

The module implements the hash:net,port type support in four flavours:
for IPv4 and IPv6, both without and with timeout support. The elements
are two dimensional: IPv4/IPv6 network address/prefix and protocol/port
pairs.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>

21f45020

netfilter: ipset: hash:net set type support · b3837029

Committed by Jozsef Kadlecsik on Feb 01, 2011

The module implements the hash:net type support in four flavours:
for IPv4 and IPv6, both without and with timeout support. The elements
are one dimensional: IPv4/IPv6 network address/prefixes.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>

b3837029

netfilter: ipset: hash:ip,port,net set type support · 41d22f7b

Committed by Jozsef Kadlecsik on Feb 01, 2011

The module implements the hash:ip,port,net type support in four flavours:
for IPv4 and IPv6, both without and with timeout support. The elements
are three dimensional: IPv4/IPv6 address, protocol/port and IPv4/IPv6
network address/prefix triples. The different prefixes are searched/matched
from the longest prefix to the shortest one (most specific to least).
In other words, the processing time grows linearly with the number of
different prefixes in the set.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>

41d22f7b
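
The longest-to-shortest prefix search described above can be illustrated with a small IPv4-only sketch. All names here (`struct net_entry`, `prefix_mask()`, `match_longest_first()`) are hypothetical, addresses are in host byte order, and a linear scan stands in for the hash lookup a real set would perform for each prefix length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical set element: a network address plus its prefix length. */
struct net_entry {
	uint32_t network;	/* host byte order, already masked */
	uint8_t cidr;		/* prefix length, 0..32 */
};

static uint32_t prefix_mask(uint8_t cidr)
{
	assert(cidr <= 32);
	/* Shifting a 32-bit value by 32 is undefined, so handle /0 separately. */
	return cidr == 0 ? 0 : (uint32_t)(~(uint32_t)0 << (32 - cidr));
}

/*
 * prefixes[] holds the distinct prefix lengths present in the set,
 * sorted from longest to shortest. One lookup is done per prefix
 * length, so the cost grows linearly with nprefixes.
 * Returns the matching prefix length, or -1 if nothing matches.
 */
int match_longest_first(const struct net_entry *set, size_t n,
			const uint8_t *prefixes, size_t nprefixes,
			uint32_t addr)
{
	for (size_t p = 0; p < nprefixes; p++) {
		uint32_t masked = addr & prefix_mask(prefixes[p]);

		/* Stand-in for a hash lookup of (masked, prefix). */
		for (size_t i = 0; i < n; i++) {
			if (set[i].cidr == prefixes[p] &&
			    set[i].network == masked)
				return prefixes[p];
		}
	}
	return -1;
}
```

With 10.0.0.0/8 and 10.1.0.0/16 in the set, the address 10.1.2.3 is matched by the /16 entry first, while 10.2.3.4 falls through to the /8 entry.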

netfilter: ipset: hash:ip,port,ip set type support · 5663bc30

Committed by Jozsef Kadlecsik on Feb 01, 2011

The module implements the hash:ip,port,ip type support in four flavours:
for IPv4 and IPv6, both without and with timeout support. The elements
are three dimensional: IPv4/IPv6 address, protocol/port and IPv4/IPv6
address triples.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>

5663bc30

netfilter: ipset: hash:ip,port set type support · 07896ed3

Committed by Jozsef Kadlecsik on Feb 01, 2011

The module implements the hash:ip,port type support in four flavours:
for IPv4 and IPv6, both without and with timeout support. The elements
are two dimensional: IPv4/IPv6 address and protocol/port pairs. The port
is interpreted for TCP, UDP, ICMP and ICMPv6 (for the latter two as
type/code, of course).
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>

07896ed3

netfilter: ipset: hash:ip set type support · 6c027889

Committed by Jozsef Kadlecsik on Feb 01, 2011

The module implements the hash:ip type support in four flavours:
for IPv4 or IPv6, both without and with timeout support.

All the hash types are based on the "array hash" or ahash structure
and functions as a good compromise between minimal memory footprint
and speed. The hashing uses arrays to resolve clashes. The hash table
is resized (doubled) when searching becomes too long. Resizing can be
triggered by userspace add commands only and those are serialized by
the nfnl mutex. During resizing the set is read-locked, so the only
possible concurrent operations are the kernel side readers. Those are
protected by RCU locking.

Because of the four flavours and the other hash types, the functions
are implemented in general forms in the ip_set_ahash.h header file
and the real functions are generated before compiling by macro expansion.
Thus the dereferencing of low-level functions and void pointer arguments
could be avoided: the low-level functions are inlined, the function
arguments are pointers of type-specific structures.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>

6c027889
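
The idea of writing a function once in general form and generating type-specific, inlinable variants by macro expansion can be sketched as below. This is not the content of ip_set_ahash.h; `DEFINE_CONTAINS`, `struct ip4_elem`, `struct ip6_elem` and the generated function names are hypothetical and only demonstrate the technique.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical type-specific element structures. */
struct ip4_elem { uint32_t ip; };
struct ip6_elem { uint8_t ip[16]; };

/* Low-level, type-specific comparison functions. */
static inline bool ip4_equal(const struct ip4_elem *a, const struct ip4_elem *b)
{
	return a->ip == b->ip;
}

static inline bool ip6_equal(const struct ip6_elem *a, const struct ip6_elem *b)
{
	return memcmp(a->ip, b->ip, sizeof(a->ip)) == 0;
}

/*
 * General form of an array search. Each expansion produces a function
 * with typed pointer arguments that calls equal_fn directly, so no
 * function pointers or void pointers are involved.
 */
#define DEFINE_CONTAINS(fn, type, equal_fn)				\
static inline bool fn(const struct type *arr, size_t n,		\
		      const struct type *e)				\
{									\
	assert(arr != NULL || n == 0);					\
	for (size_t i = 0; i < n; i++) {				\
		if (equal_fn(&arr[i], e))				\
			return true;					\
	}								\
	return false;							\
}

DEFINE_CONTAINS(ip4_contains, ip4_elem, ip4_equal)
DEFINE_CONTAINS(ip6_contains, ip6_elem, ip6_equal)
```

The trade-off is larger object code (one copy per type) in exchange for avoiding indirect calls on the packet path.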

netfilter: ipset; bitmap:port set type support · 54326190

Committed by Jozsef Kadlecsik on Feb 01, 2011

The module implements the bitmap:port type in two flavours, without
and with timeout support to store TCP/UDP ports from a range.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>

54326190

netfilter: ipset: bitmap:ip,mac type support · de76021a

Committed by Jozsef Kadlecsik on Feb 01, 2011

The module implements the bitmap:ip,mac set type in two flavours,
without and with timeout support. In this kind of set one can store
IPv4 address and (source) MAC address pairs. The type supports elements
added without the MAC part filled out: when the first match from the
kernel happens, the MAC part is automatically filled out. The timing out
of the elements starts when an element is complete in the IP,MAC pair.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>

de76021a
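
The "fill in the MAC on first match" behaviour of bitmap:ip,mac can be pictured with the following sketch. `struct ipmac_elem` and `ipmac_test()` are hypothetical; the real module keeps its elements in a bitmap-indexed array and also handles timeouts, which are only hinted at in a comment here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-IP slot of an IP,MAC set. */
struct ipmac_elem {
	bool present;		/* the IP address was added to the set */
	bool mac_known;		/* the MAC part has been filled out */
	uint8_t mac[6];
};

/*
 * Kernel-side test against the source MAC of the current packet.
 * An element added without a MAC learns it from the first packet that
 * is tested against it; from then on the MAC must match.
 */
bool ipmac_test(struct ipmac_elem *e, const uint8_t src_mac[6])
{
	assert(e != NULL && src_mac != NULL);
	if (!e->present)
		return false;
	if (!e->mac_known) {
		memcpy(e->mac, src_mac, sizeof(e->mac));
		e->mac_known = true;	/* element complete: timeout would start here */
		return true;
	}
	return memcmp(e->mac, src_mac, sizeof(e->mac)) == 0;
}
```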

netfilter: ipset: bitmap:ip set type support · 72205fc6

Committed by Jozsef Kadlecsik on Feb 01, 2011

The module implements the bitmap:ip set type in two flavours, without
and with timeout support. In this kind of set one can store IPv4
addresses (or network addresses) from a given range.

In order not to waste memory, the timeout version does not rely on
the kernel timer for every element to be timed out but on garbage
collection. All set types use this mechanism.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>

72205fc6
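
A rough sketch of the garbage-collection approach to timeouts: instead of arming one timer per element, each slot stores an expiry time and a single periodic pass clears the expired slots. The names (`struct bitmap_ip_tmo`, `RANGE_SIZE`, `bitmap_add()`, `bitmap_test()`, `bitmap_gc()`) and the convention that an expiry of 0 means "not a member" are assumptions made for this example only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RANGE_SIZE 256	/* hypothetical size of the covered IPv4 range */

struct bitmap_ip_tmo {
	uint32_t first_ip;			/* first address of the range */
	unsigned long expires[RANGE_SIZE];	/* 0 means "not a member" */
};

/* Add (or refresh) an address; fails if it lies outside the range. */
bool bitmap_add(struct bitmap_ip_tmo *s, uint32_t ip,
		unsigned long now, unsigned long timeout)
{
	uint32_t idx = ip - s->first_ip;	/* wraps to a large value if ip < first_ip */

	assert(timeout > 0);
	if (idx >= RANGE_SIZE)
		return false;
	s->expires[idx] = now + timeout;
	return true;
}

/* A member counts only while it has not yet expired. */
bool bitmap_test(const struct bitmap_ip_tmo *s, uint32_t ip, unsigned long now)
{
	uint32_t idx = ip - s->first_ip;

	return idx < RANGE_SIZE && s->expires[idx] != 0 && now < s->expires[idx];
}

/* One periodic pass replaces per-element timers; returns slots freed. */
unsigned int bitmap_gc(struct bitmap_ip_tmo *s, unsigned long now)
{
	unsigned int freed = 0;

	for (size_t i = 0; i < RANGE_SIZE; i++) {
		if (s->expires[i] != 0 && now >= s->expires[i]) {
			s->expires[i] = 0;
			freed++;
		}
	}
	return freed;
}
```

Note that the test operation must check the expiry itself, because an expired element may still occupy its slot until the next collection pass.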

netfilter: ipset: IP set core support · a7b4f989

Committed by Jozsef Kadlecsik on Feb 01, 2011

The patch adds the IP set core support to the kernel.

The IP set core implements a netlink (nfnetlink) based protocol by which
one can create, destroy, flush, rename, swap, list, save, restore sets,
and add, delete, test elements from userspace. For simplicity (and for
backward compatibility, and so as not to force ip(6)tables to be linked with
a netlink library), a small getsockopt-based protocol is also kept in order
to communicate with the ip(6)tables match and target.

The netlink protocol passes all u16, etc values in network order with
NLA_F_NET_BYTEORDER flag. The protocol enforces the proper use of the
NLA_F_NESTED and NLA_F_NET_BYTEORDER flags.

For other kernel subsystems (netfilter match and target) the API contains
the functions to add, delete and test elements in sets and the required calls
to get/put references to the sets before those operations can be performed.

The set types (which are implemented in independent modules) are stored
in a simple RCU protected list. A set type may have variants: for example
without timeout or with timeout support, for IPv4 or for IPv6. The sets
(i.e. the pointers to the sets) are stored in an array. The sets are
identified by their index in the array, which makes easy and fast
swapping of sets possible. The array is protected indirectly by the nfnl
mutex from nfnetlink. The content of the sets is protected by the rwlock
of the set.

There are functional differences between the add/del/test functions
for the kernel and userspace:

- kernel add/del/test: works on the current packet (i.e. one element)
- kernel test: may trigger an "add" operation in order to fill
  out unspecified parts of the element from the packet (like MAC address)
- userspace add/del: works on the netlink message and thus possibly
  on multiple elements from the IPSET_ATTR_ADT container attribute.
- userspace add: may trigger resizing of a set
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>

a7b4f989

netfilter: NFNL_SUBSYS_IPSET id and NLA_PUT_NET* macros · f703651e

Committed by Jozsef Kadlecsik on Feb 01, 2011

The patch adds the NFNL_SUBSYS_IPSET id and NLA_PUT_NET* macros to the
vanilla kernel.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>

f703651e

29 Jan, 2011 1 commit

netfilter: xt_iprange: add IPv6 match debug print code · 6a4ddef2

Committed by Thomas Jacob on Jan 28, 2011

Signed-off-by: Thomas Jacob <jacob@internet24.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>

6a4ddef2

27 Jan, 2011 1 commit

netfilter: xt_iprange: typo in IPv4 match debug print code · 705ca147

Committed by Thomas Jacob on Jan 27, 2011

Signed-off-by: Thomas Jacob <jacob@internet24.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>

705ca147

26 Jan, 2011 2 commits

Merge branch 'connlimit' of git://dev.medozas.de/linux · 2e0348c4

Committed by Patrick McHardy on Jan 26, 2011

2e0348c4

netfilter: xt_connlimit: pick right dstaddr in NAT scenario · ad86e1f2

Committed by Jan Engelhardt on Jan 26, 2011

xt_connlimit normally records the "original" tuples in a hashlist
(such as "1.2.3.4 -> 5.6.7.8"), and looks in this list for iph->daddr
when counting.

When the user however uses DNAT in PREROUTING, looking for
iph->daddr -- which is now 192.168.9.10 -- will not match. Thus in
daddr mode, we need to record the reverse direction tuple
("192.168.9.10 -> 1.2.3.4") instead. In the reverse tuple, the dst
addr is on the src side, which is convenient, as count_them still uses
&conn->tuple.src.u3.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>

ad86e1f2

25 Jan, 2011 2 commits

netfilter: ipvs: fix compiler warnings · 9f4e1ccd

Committed by Changli Gao on Jan 25, 2011

Fix compiler warnings when IP_VS_DBG() isn't defined.
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Acked-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Signed-off-by: Simon Horman <horms@verge.net.au>

9f4e1ccd

IPVS netns BUG, register sysctl for root ns · 07924709

Committed by Hans Schillstrom on Jan 24, 2011

The newly created table was not used when registering sysctl for a new
namespace, i.e. sysctl did not work for any namespace other than the root
namespace (init_net).
Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Signed-off-by: Simon Horman <horms@verge.net.au>

07924709

22 Jan, 2011 2 commits

IPVS: Change sock_create_kernel() to __sock_create() · 4b3fd571

Committed by Simon Horman on Jan 22, 2011

The recent netns changes omitted to change
sock_create_kernel() to __sock_create() in ip_vs_sync.c.

The effect of this is that the interface will be selected in the
root namespace; from my point of view it's a major bug.
Reported-by: Hans Schillstrom <hans@schillstrom.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>

4b3fd571

netfilter: ipvs: fix compiler warnings · 091bb34c

Committed by Changli Gao on Jan 21, 2011

Fix compiler warnings when no transport protocol load balancing support
is configured.

[horms@verge.net.au: removed spurious __ip_vs_cleanup() clean-up hunk]
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Simon Horman <horms@verge.net.au>

091bb34c

21 Jan, 2011 3 commits

netfilter: add a missing include in nf_conntrack_reasm.c · bced94ed

Committed by Eric Dumazet on Jan 20, 2011

After commit ae90bdea (netfilter: fix compilation when conntrack is
disabled but tproxy is enabled) we have the following warnings:

net/ipv6/netfilter/nf_conntrack_reasm.c:520:16: warning: symbol
'nf_ct_frag6_gather' was not declared. Should it be static?
net/ipv6/netfilter/nf_conntrack_reasm.c:591:6: warning: symbol
'nf_ct_frag6_output' was not declared. Should it be static?
net/ipv6/netfilter/nf_conntrack_reasm.c:612:5: warning: symbol
'nf_ct_frag6_init' was not declared. Should it be static?
net/ipv6/netfilter/nf_conntrack_reasm.c:640:6: warning: symbol
'nf_ct_frag6_cleanup' was not declared. Should it be static?

Fix this by including net/netfilter/ipv6/nf_defrag_ipv6.h
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: KOVACS Krisztian <hidden@balabit.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>

bced94ed

netfilter: nf_conntrack: fix linker error with NF_CONNTRACK_TIMESTAMP=n · 2f1e3176

Committed by Patrick McHardy on Jan 20, 2011

net/built-in.o: In function `nf_conntrack_init_net':
net/netfilter/nf_conntrack_core.c:1521:
	undefined reference to `nf_conntrack_tstamp_init'
net/netfilter/nf_conntrack_core.c:1531:
	undefined reference to `nf_conntrack_tstamp_fini'

Add dummy inline functions for the =n case to fix this.
Reported-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>

2f1e3176

netfilter: xtables: add missing header inclusions for headers_check · 06988b06

Committed by Jan Engelhardt on Jan 20, 2011

Resolve these warnings on `make headers_check`:

usr/include/linux/netfilter/xt_CT.h:7: found __[us]{8,16,32,64} type
without #include <linux/types.h>
...
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>

06988b06

20 Jan, 2011 16 commits

netfilter: nf_nat: place conntrack in source hash after SNAT is done · 41a7cab6

Committed by Changli Gao on Jan 20, 2011

If SNAT isn't done yet, other conntracks (cts) may get the wrong info.

As the filter table comes after the DNAT table, packets dropped in the
filter table also affect the bysource hash table.
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>

41a7cab6

Merge branch 'connlimit' of git://dev.medozas.de/linux · 4cda47d2

Committed by Patrick McHardy on Jan 20, 2011

4cda47d2

netfilter: xtables: remove duplicate member · ba12b130

Committed by Jan Engelhardt on Jan 20, 2011

Accidentally missed removing the old out-of-union "inverse" member,
which caused the struct size to change which then gives size mismatch
warnings when using an old iptables.

It is interesting to see that gcc did not warn about this before.
(Filed http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47376 )
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>

ba12b130


Merge branch 'connlimit' of git://dev.medozas.de/linux · 82d800d8

Committed by Patrick McHardy on Jan 20, 2011

Conflicts:
	Documentation/feature-removal-schedule.txt
Signed-off-by: Patrick McHardy <kaber@trash.net>

82d800d8

netfilter: do not omit re-route check on NF_QUEUE verdict · 28a51ba5

Committed by Florian Westphal on Jan 20, 2011

ret != NF_QUEUE only works in the "--queue-num 0" case; for
queues > 0 the test should be '(ret & NF_VERDICT_MASK) != NF_QUEUE'.

However, NF_QUEUE no longer DROPs the skb unconditionally if queueing
fails (due to NF_VERDICT_FLAG_QUEUE_BYPASS verdict flag), so the
re-route test should also be performed if this flag is set in the
verdict.

The full test would then look something like

&& ((ret & NF_VERDICT_MASK) == NF_QUEUE && (ret & NF_VERDICT_FLAG_QUEUE_BYPASS))

This is rather ugly, so just remove the NF_QUEUE test altogether.

The only effect is that we might perform an unnecessary route lookup
in the NF_QUEUE case.

ip6table_mangle did not have such a check.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>

28a51ba5
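
The difference between the plain and the masked comparison can be shown with a small standalone example. The constants below are local definitions whose values are assumptions chosen for illustration (verdict code in the low bits, queue number in the upper 16 bits); consult the kernel's netfilter headers for the authoritative values. `queue_verdict()`, `is_queue_naive()` and `is_queue_masked()` are hypothetical helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values only; assumed, not copied from a specific kernel. */
#define NF_QUEUE		3u
#define NF_VERDICT_MASK		0x000000ffu
#define NF_VERDICT_QBITS	16

/* Build a queue verdict that carries the queue number in the upper bits. */
static uint32_t queue_verdict(uint16_t queue_num)
{
	return ((uint32_t)queue_num << NF_VERDICT_QBITS) | NF_QUEUE;
}

/* Only correct for queue 0, where the upper bits happen to be zero. */
bool is_queue_naive(uint32_t ret)
{
	return ret == NF_QUEUE;
}

/* Correct for any queue number: compare the verdict code only. */
bool is_queue_masked(uint32_t ret)
{
	return (ret & NF_VERDICT_MASK) == NF_QUEUE;
}
```

For queue 5 the naive test reports "not queued" even though the verdict code is NF_QUEUE, which is the failure mode the commit message describes.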

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6 · a07aa004

Committed by David S. Miller on Jan 20, 2011

a07aa004

netfilter: xtables: remove extraneous header that slipped in · 5d844928

Committed by Jan Engelhardt on Jan 20, 2011

Commit 0b8ad876 (netfilter: xtables: add missing header files to export
list) erroneously added this.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>

5d844928

net_sched: cleanups · cc7ec456

Committed by Eric Dumazet on Jan 19, 2011

Clean up net/sched code to current CodingStyle and practices.

Reduce inline abuse.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cc7ec456

af_unix: coding style: remove one level of indentation in unix_shutdown() · 7180a031

Committed by Alban Crequy on Jan 19, 2011

Signed-off-by: Alban Crequy <alban.crequy@collabora.co.uk>
Reviewed-by: Ian Molton <ian.molton@collabora.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

7180a031

net_sched: implement a root container qdisc sch_mqprio · b8970f0b

Committed by John Fastabend on Jan 17, 2011

This implements a mqprio queueing discipline that by default creates
a pfifo_fast qdisc per tx queue and provides the needed configuration
interface.

Using the mqprio qdisc, the number of tcs currently in use along
with the range of queues allotted to each class can be configured. By
default skbs are mapped to traffic classes using the skb priority.
This mapping is configurable.

Configurable parameters:

struct tc_mqprio_qopt {
	__u8    num_tc;
	__u8    prio_tc_map[TC_BITMASK + 1];
	__u8    hw;
	__u16   count[TC_MAX_QUEUE];
	__u16   offset[TC_MAX_QUEUE];
};

Here the count/offset pairing gives the queue alignment and the
prio_tc_map gives the mapping from skb->priority to tc.

The hw bit determines if the hardware should configure the count
and offset values. If the hardware bit is set then the operation
will fail if the hardware does not implement the ndo_setup_tc
operation. This is to avoid undetermined states where the hardware
may or may not control the queue mapping. Also minimal bounds
checking is done on the count/offset to verify a queue does not
exceed num_tx_queues and that queue ranges do not overlap. Otherwise
it is left to user policy or hardware configuration to create
useful mappings.

It is expected that hardware QOS schemes can be implemented by
creating appropriate mappings of queues in ndo_setup_tc().

One expected use case is drivers will use the ndo_setup_tc to map
queue ranges onto 802.1Q traffic classes. This provides a generic
mechanism to map network traffic onto these traffic classes and
removes the need for lower layer drivers to know specifics about
traffic types.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b8970f0b

net: implement mechanism for HW based QOS · 4f57c087

Committed by John Fastabend on Jan 17, 2011

This patch provides a mechanism for lower layer devices to
steer traffic using skb->priority to tx queues. This allows
hardware based QOS schemes to use the default qdisc without
incurring the penalties related to global state and the qdisc
lock, while reliably receiving skbs on the correct tx ring
to avoid head of line blocking resulting from shuffling in
the LLD. Finally, all the goodness from txq caching and xps/rps
can still be leveraged.

Many drivers and hardware exist with the ability to implement
QOS schemes in the hardware but currently these drivers tend
to rely on firmware to reroute specific traffic, a driver
specific select_queue or the queue_mapping action in the
qdisc.

If select_queue is used for this, drivers need to be updated for
each and every traffic type and we lose the goodness of much
of the upstream work. Firmware solutions are inherently
inflexible. And finally if admins are expected to build a
qdisc and filter rules to steer traffic this requires knowledge
of how the hardware is currently configured. The number of tx
queues and the queue offsets may change depending on resources.
Also this approach incurs all the overhead of a qdisc with filters.

With the mechanism in this patch, users can set the skb priority using
expected methods, i.e. setsockopt(), or the stack can set the priority
directly. Then the skb will be steered to the correct tx queues
aligned with hardware QOS traffic classes. In the normal case with
single traffic class and all queues in this class everything
works as is until the LLD enables multiple tcs.

To steer the skb we mask out the lower 4 bits of the priority
and allow the hardware to configure up to 15 distinct classes
of traffic. This is expected to be sufficient for most applications;
at any rate, it is more than the 802.1Q spec designates and is
equal to the number of prio bands currently implemented in
the default qdisc.

This, in conjunction with a userspace application such as
lldpad, can be used to implement 802.1Q transmission selection
algorithms, one of these being the extended transmission
selection algorithm currently used for DCB.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4f57c087

netlink: support setting devgroup parameters · e7ed828f

Committed by Vlad Dogaru on Jan 13, 2011

If a rtnetlink request specifies a negative or zero ifindex and has no
interface name attribute, but has a group attribute, then the changes
are made to all the interfaces belonging to the specified group.
Signed-off-by: Vlad Dogaru <ddvlad@rosedu.org>
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>

e7ed828f
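
The selection rule in this commit (index first, then name, then group) can be modelled with a short sketch. The structures and the function `apply_link_req()` are hypothetical; a real rtnetlink request carries attributes rather than a C struct, and an MTU change merely stands in for "the changes".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily reduced view of a network device. */
struct netdev {
	int ifindex;
	const char *name;
	int group;
	unsigned int mtu;
};

/* Hypothetical decoded request. */
struct link_req {
	int ifindex;		/* <= 0: not specified */
	const char *ifname;	/* NULL: no interface name attribute */
	bool has_group;		/* group attribute present */
	int group;
	unsigned int new_mtu;
};

/* Applies the change and returns the number of devices modified. */
int apply_link_req(struct netdev *devs, size_t n, const struct link_req *req)
{
	int changed = 0;

	assert(req != NULL && (devs != NULL || n == 0));
	for (size_t i = 0; i < n; i++) {
		bool match;

		if (req->ifindex > 0)
			match = devs[i].ifindex == req->ifindex;
		else if (req->ifname != NULL)
			match = strcmp(devs[i].name, req->ifname) == 0;
		else if (req->has_group)
			match = devs[i].group == req->group;
		else
			match = false;

		if (match) {
			devs[i].mtu = req->new_mtu;
			changed++;
		}
	}
	return changed;
}
```

Only when neither a positive index nor a name is given does the group attribute select the targets, so existing single-device requests behave as before.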

net_device: add support for network device groups · cbda10fa

Committed by Vlad Dogaru on Jan 13, 2011

Net devices can now be grouped, enabling simpler manipulation from
userspace. This patch adds a group field to the net_device structure, as
well as rtnetlink support to query and modify it.
Signed-off-by: Vlad Dogaru <ddvlad@rosedu.org>
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>

cbda10fa

net: cleanup unused macros in net directory · 441c793a

Committed by Shan Wei on Jan 13, 2011

Clean up some unused macros in net/*.
1. Left over after code changes, e.g. PGV_FROM_VMALLOC, PGV_FROM_VMALLOC, KMEM_SAFETYZONE.
2. Never used since being introduced to the kernel,
   e.g. P9_RDMA_MAX_SGE, UTIL_CTRL_PKT_SIZE.
Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Acked-by: Sjur Braendeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

441c793a

vxge: update driver version · 6997e618

Committed by Jon Mason on Jan 18, 2011

Update vxge driver version to 2.5.2
Signed-off-by: Jon Mason <jon.mason@exar.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6997e618

vxge: MSIX one shot mode · 16fded7d

Committed by Jon Mason on Jan 18, 2011

To reduce the possibility of losing an interrupt in the handler due to a
race between an interrupt processing and disable/enable of interrupts,
enable MSIX one shot.

Also, add support for adaptive interrupt coalescing.
Signed-off-by: Jon Mason <jon.mason@exar.com>
Signed-off-by: Masroor Vettuparambil <masroor.vettuparambil@exar.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

16fded7d

openanolis / cloud-kernel synced successfully about 1 year ago