提交 · 9dff2c966a0a79a4222553a851f17e679fc28a43 · Linux-御风守护者 / linux

18 9月, 2015 1 次提交

netfilter: Use nf_hook_state.net · 9dff2c96

由 Eric W. Biederman 提交于 9月 15, 2015

Instead of saying "net = dev_net(state->in?state->in:state->out)"
just say "state->net".  As that information is now availabe,
much less confusing and much less error prone.
Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9dff2c96

03 9月, 2015 1 次提交

netfilter: nf_dup{4, 6}: fix build error when nf_conntrack disabled · a82b0e63

由 Daniel Borkmann 提交于 9月 02, 2015

While testing various Kconfig options on another issue, I found that
the following one triggers as well on allmodconfig and nf_conntrack
disabled:

  net/ipv4/netfilter/nf_dup_ipv4.c: In function ‘nf_dup_ipv4’:
  net/ipv4/netfilter/nf_dup_ipv4.c:72:20: error: ‘nf_skb_duplicated’ undeclared (first use in this function)
    if (this_cpu_read(nf_skb_duplicated))
  [...]
  net/ipv6/netfilter/nf_dup_ipv6.c: In function ‘nf_dup_ipv6’:
  net/ipv6/netfilter/nf_dup_ipv6.c:66:20: error: ‘nf_skb_duplicated’ undeclared (first use in this function)
    if (this_cpu_read(nf_skb_duplicated))

Fix it by including directly the header where it is defined.

Fixes: bbde9fc1 ("netfilter: factor out packet duplication for IPv4/IPv6")
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a82b0e63

29 8月, 2015 1 次提交

Revert "netfilter: xtables: compute exact size needed for jumpstack" · 98dbbfc3

由 Florian Westphal 提交于 8月 26, 2015

This reverts commit 98d1bd80.

mark_source_chains will not re-visit chains, so

*filter
:INPUT ACCEPT [365:25776]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [217:45832]
:t1 - [0:0]
:t2 - [0:0]
:t3 - [0:0]
:t4 - [0:0]
-A t1 -i lo -j t2
-A t2 -i lo -j t3
-A t3 -i lo -j t4
# -A INPUT -j t4
# -A INPUT -j t3
# -A INPUT -j t2
-A INPUT -j t1
COMMIT

Will compute a chain depth of 2 if the comments are removed.
Revert back to counting the number of chains for the time being.
Reported-by: NCong Wang <cwang@twopensource.com>
Reported-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: NFlorian Westphal <fw@strlen.de>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

98dbbfc3

22 8月, 2015 1 次提交

netfilter: nf_dup: fix sparse warnings · 59e26423

由 Pablo Neira Ayuso 提交于 8月 21, 2015

>> net/ipv4/netfilter/nft_dup_ipv4.c:29:37: sparse: incorrect type in initializer (different base types)
   net/ipv4/netfilter/nft_dup_ipv4.c:29:37:    expected restricted __be32 [user type] s_addr
   net/ipv4/netfilter/nft_dup_ipv4.c:29:37:    got unsigned int [unsigned] <noident>

>> net/ipv6/netfilter/nf_dup_ipv6.c:48:23: sparse: incorrect type in assignment (different base types)
   net/ipv6/netfilter/nf_dup_ipv6.c:48:23:    expected restricted __be32 [addressable] [assigned] [usertype] flowlabel
   net/ipv6/netfilter/nf_dup_ipv6.c:48:23:    got int
Reported-by: Nkbuild test robot <fengguang.wu@intel.com>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

59e26423

18 8月, 2015 3 次提交

net: Change pseudohdr argument of inet_proto_csum_replace* to be a bool · 4b048d6d

由 Tom Herbert 提交于 8月 17, 2015

inet_proto_csum_replace4,2,16 take a pseudohdr argument which indicates
the checksum field carries a pseudo header. This argument should be a
boolean instead of an int.
Signed-off-by: NTom Herbert <tom@herbertland.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4b048d6d

netfilter: nf_conntrack: add efficient mark to zone mapping · 5e8018fc

由 Daniel Borkmann 提交于 8月 14, 2015

This work adds the possibility of deriving the zone id from the skb->mark
field in a scalable manner. This allows for having only a single template
serving hundreds/thousands of different zones, for example, instead of the
need to have one match for each zone as an extra CT jump target.

Note that we'd need to have this information attached to the template as at
the time when we're trying to lookup a possible ct object, we already need
to know zone information for a possible match when going into
__nf_conntrack_find_get(). This work provides a minimal implementation for
a possible mapping.

In order to not add/expose an extra ct->status bit, the zone structure has
been extended to carry a flag for deriving the mark.
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

5e8018fc

netfilter: nf_conntrack: add direction support for zones · deedb590

由 Daniel Borkmann 提交于 8月 14, 2015

This work adds a direction parameter to netfilter zones, so identity
separation can be performed only in original/reply or both directions
(default). This basically opens up the possibility of doing NAT with
conflicting IP address/port tuples from multiple, isolated tenants
on a host (e.g. from a netns) without requiring each tenant to NAT
twice resp. to use its own dedicated IP address to SNAT to, meaning
overlapping tuples can be made unique with the zone identifier in
original direction, where the NAT engine will then allocate a unique
tuple in the commonly shared default zone for the reply direction.
In some restricted, local DNAT cases, also port redirection could be
used for making the reply traffic unique w/o requiring SNAT.

The consensus we've reached and discussed at NFWS and since the initial
implementation [1] was to directly integrate the direction meta data
into the existing zones infrastructure, as opposed to the ct->mark
approach we proposed initially.

As we pass the nf_conntrack_zone object directly around, we don't have
to touch all call-sites, but only those, that contain equality checks
of zones. Thus, based on the current direction (original or reply),
we either return the actual id, or the default NF_CT_DEFAULT_ZONE_ID.
CT expectations are direction-agnostic entities when expectations are
being compared among themselves, so we can only use the identifier
in this case.

Note that zone identifiers can not be included into the hash mix
anymore as they don't contain a "stable" value that would be equal
for both directions at all times, f.e. if only zone->id would
unconditionally be xor'ed into the table slot hash, then replies won't
find the corresponding conntracking entry anymore.

If no particular direction is specified when configuring zones, the
behaviour is exactly as we expect currently (both directions).

Support has been added for the CT netlink interface as well as the
x_tables raw CT target, which both already offer existing interfaces
to user space for the configuration of zones.

Below a minimal, simplified collision example (script in [2]) with
netperf sessions:

  +--- tenant-1 ---+   mark := 1
  |    netperf     |--+
  +----------------+  |                CT zone := mark [ORIGINAL]
   [ip,sport] := X   +--------------+  +--- gateway ---+
                     | mark routing |--|     SNAT      |-- ... +
                     +--------------+  +---------------+       |
  +--- tenant-2 ---+  |                                     ~~~|~~~
  |    netperf     |--+                +-----------+           |
  +----------------+   mark := 2       | netserver |------ ... +
   [ip,sport] := X                     +-----------+
                                        [ip,port] := Y
On the gateway netns, example:

  iptables -t raw -A PREROUTING -j CT --zone mark --zone-dir ORIGINAL
  iptables -t nat -A POSTROUTING -o <dev> -j SNAT --to-source <ip> --random-fully

  iptables -t mangle -A PREROUTING -m conntrack --ctdir ORIGINAL -j CONNMARK --save-mark
  iptables -t mangle -A POSTROUTING -m conntrack --ctdir REPLY -j CONNMARK --restore-mark

conntrack dump from gateway netns:

  netperf -H 10.1.1.2 -t TCP_STREAM -l60 -p12865,5555 from each tenant netns

  tcp 6 431995 ESTABLISHED src=40.1.1.1 dst=10.1.1.2 sport=5555 dport=12865 zone-orig=1
                           src=10.1.1.2 dst=10.1.1.1 sport=12865 dport=1024
               [ASSURED] mark=1 secctx=system_u:object_r:unlabeled_t:s0 use=1

  tcp 6 431994 ESTABLISHED src=40.1.1.1 dst=10.1.1.2 sport=5555 dport=12865 zone-orig=2
                           src=10.1.1.2 dst=10.1.1.1 sport=12865 dport=5555
               [ASSURED] mark=2 secctx=system_u:object_r:unlabeled_t:s0 use=1

  tcp 6 299 ESTABLISHED src=40.1.1.1 dst=10.1.1.2 sport=39438 dport=33768 zone-orig=1
                        src=10.1.1.2 dst=10.1.1.1 sport=33768 dport=39438
               [ASSURED] mark=1 secctx=system_u:object_r:unlabeled_t:s0 use=1

  tcp 6 300 ESTABLISHED src=40.1.1.1 dst=10.1.1.2 sport=32889 dport=40206 zone-orig=2
                        src=10.1.1.2 dst=10.1.1.1 sport=40206 dport=32889
               [ASSURED] mark=2 secctx=system_u:object_r:unlabeled_t:s0 use=2

Taking this further, test script in [2] creates 200 tenants and runs
original-tuple colliding netperf sessions each. A conntrack -L dump in
the gateway netns also confirms 200 overlapping entries, all in ESTABLISHED
state as expected.

I also did run various other tests with some permutations of the script,
to mention some: SNAT in random/random-fully/persistent mode, no zones (no
overlaps), static zones (original, reply, both directions), etc.

  [1] http://thread.gmane.org/gmane.comp.security.firewalls.netfilter.devel/57412/
  [2] https://paste.fedoraproject.org/242835/65657871/Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

deedb590

11 8月, 2015 1 次提交

netfilter: nf_conntrack: push zone object into functions · 308ac914

由 Daniel Borkmann 提交于 8月 08, 2015

This patch replaces the zone id which is pushed down into functions
with the actual zone object. It's a bigger one-time change, but
needed for later on extending zones with a direction parameter, and
thus decoupling this additional information from all call-sites.

No functional changes in this patch.

The default zone becomes a global const object, namely nf_ct_zone_dflt
and will be returned directly in various cases, one being, when there's
f.e. no zoning support.
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

308ac914

10 8月, 2015 1 次提交

netfilter: SYNPROXY: fix sending window update to client · 3c16241c

由 Phil Sutter 提交于 7月 28, 2015

Upon receipt of SYNACK from the server, ipt_SYNPROXY first sends back an ACK to
finish the server handshake, then calls nf_ct_seqadj_init() to initiate
sequence number adjustment of forwarded packets to the client and finally sends
a window update to the client to unblock it's TX queue.

Since synproxy_send_client_ack() does not set synproxy_send_tcp()'s nfct
parameter, no sequence number adjustment happens and the client receives the
window update with incorrect sequence number. Depending on client TCP
implementation, this leads to a significant delay (until a window probe is
being sent).
Signed-off-by: NPhil Sutter <phil@nwl.cc>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

3c16241c

07 8月, 2015 2 次提交

netfilter: nf_tables: add nft_dup expression · d877f071

由 Pablo Neira Ayuso 提交于 5月 31, 2015

This new expression uses the nf_dup engine to clone packets to a given gateway.
Unlike xt_TEE, we use an index to indicate output interface which should be
fine at this stage.

Moreover, change to the preemtion-safe this_cpu_read(nf_skb_duplicated) from
nf_dup_ipv{4,6} to silence a lockdep splat.

Based on the original tee expression from Arturo Borrero Gonzalez, although
this patch has diverted quite a bit from this initial effort due to the
change to support maps.
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

d877f071

netfilter: factor out packet duplication for IPv4/IPv6 · bbde9fc1

由 Pablo Neira Ayuso 提交于 5月 31, 2015

Extracted from the xtables TEE target. This creates two new modules for IPv4
and IPv6 that are shared between the TEE target and the new nf_tables dup
expressions.
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

bbde9fc1

30 7月, 2015 1 次提交

netfilter: bridge: reduce nf_bridge_info to 32 bytes again · 72b1e5e4

由 Florian Westphal 提交于 7月 23, 2015

We can use union for most of the temporary cruft (original ipv4/ipv6
address, source mac, physoutdev) since they're used during different
stages of br netfilter traversal.

Also get rid of the last two ->mask users.

Shrinks struct from 48 to 32 on 64bit arch.
Signed-off-by: NFlorian Westphal <fw@strlen.de>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

72b1e5e4

16 7月, 2015 4 次提交

netfilter: xtables: remove __pure annotation · 6c7941de

由 Florian Westphal 提交于 7月 14, 2015

sparse complains:
ip_tables.c:361:27: warning: incorrect type in assignment (different modifiers)
ip_tables.c:361:27:    expected struct ipt_entry *[assigned] e
ip_tables.c:361:27:    got struct ipt_entry [pure] *

doesn't change generated code.
Signed-off-by: NFlorian Westphal <fw@strlen.de>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

6c7941de

netfilter: add and use jump label for xt_tee · dcebd315

由 Florian Westphal 提交于 7月 14, 2015

Don't bother testing if we need to switch to alternate stack
unless TEE target is used.
Suggested-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NFlorian Westphal <fw@strlen.de>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

dcebd315

netfilter: xtables: don't save/restore jumpstack offset · 7814b6ec

由 Florian Westphal 提交于 7月 14, 2015

In most cases there is no reentrancy into ip/ip6tables.

For skbs sent by REJECT or SYNPROXY targets, there is one level
of reentrancy, but its not relevant as those targets issue an absolute
verdict, i.e. the jumpstack can be clobbered since its not used
after the target issues absolute verdict (ACCEPT, DROP, STOLEN, etc).

So the only special case where it is relevant is the TEE target, which
returns XT_CONTINUE.

This patch changes ip(6)_do_table to always use the jump stack starting
from 0.

When we detect we're operating on an skb sent via TEE (percpu
nf_skb_duplicated is 1) we switch to an alternate stack to leave
the original one alone.

Since there is no TEE support for arptables, it doesn't need to
test if tee is active.

The jump stack overflow tests are no longer needed as well --
since ->stacksize is the largest call depth we cannot exceed it.

A much better alternative to the external jumpstack would be to just
declare a jumps[32] stack on the local stack frame, but that would mean
we'd have to reject iptables rulesets that used to work before.

Another alternative would be to start rejecting rulesets with a larger
call depth, e.g. 1000 -- in this case it would be feasible to allocate the
entire stack in the percpu area which would avoid one dereference.
Signed-off-by: NFlorian Westphal <fw@strlen.de>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

7814b6ec

netfilter: xtables: compute exact size needed for jumpstack · 98d1bd80

由 Florian Westphal 提交于 7月 14, 2015

The {arp,ip,ip6tables} jump stack is currently sized based
on the number of user chains.

However, its rather unlikely that every user defined chain jumps to the
next, so lets use the existing loop detection logic to also track the
chain depths.

The stacksize is then set to the largest chain depth seen.
Signed-off-by: NFlorian Westphal <fw@strlen.de>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

98d1bd80

02 7月, 2015 1 次提交

netfilter: arptables: use percpu jumpstack · 3bd22997

由 Florian Westphal 提交于 6月 30, 2015

commit 482cfc31 ("netfilter: xtables: avoid percpu ruleset duplication")

Unlike ip and ip6tables, arp tables were never converted to use the percpu
jump stack.

It still uses the rule blob to store return address, which isn't safe
anymore since we now share this blob among all processors.

Because there is no TEE support for arptables, we don't need to cope
with reentrancy, so we can use loocal variable to hold stack offset.

Fixes: 482cfc31 ("netfilter: xtables: avoid percpu ruleset duplication")
Signed-off-by: NFlorian Westphal <fw@strlen.de>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

3bd22997

24 6月, 2015 1 次提交

net: ipv4 sysctl option to ignore routes when nexthop link is down · 0eeb075f

由 Andy Gospodarek 提交于 6月 23, 2015

This feature is only enabled with the new per-interface or ipv4 global
sysctls called 'ignore_routes_with_linkdown'.

net.ipv4.conf.all.ignore_routes_with_linkdown = 0
net.ipv4.conf.default.ignore_routes_with_linkdown = 0
net.ipv4.conf.lo.ignore_routes_with_linkdown = 0
...

When the above sysctls are set, will report to userspace that a route is
dead and will no longer resolve to this nexthop when performing a fib
lookup.  This will signal to userspace that the route will not be
selected.  The signalling of a RTNH_F_DEAD is only passed to userspace
if the sysctl is enabled and link is down.  This was done as without it
the netlink listeners would have no idea whether or not a nexthop would
be selected.   The kernel only sets RTNH_F_DEAD internally if the
interface has IFF_UP cleared.

With the new sysctl set, the following behavior can be observed
(interface p8p1 is link-down):

default via 10.0.5.2 dev p9p1
10.0.5.0/24 dev p9p1  proto kernel  scope link  src 10.0.5.15
70.0.0.0/24 dev p7p1  proto kernel  scope link  src 70.0.0.1
80.0.0.0/24 dev p8p1  proto kernel  scope link  src 80.0.0.1 dead linkdown
90.0.0.0/24 via 80.0.0.2 dev p8p1  metric 1 dead linkdown
90.0.0.0/24 via 70.0.0.2 dev p7p1  metric 2
90.0.0.1 via 70.0.0.2 dev p7p1  src 70.0.0.1
    cache
local 80.0.0.1 dev lo  src 80.0.0.1
    cache <local>
80.0.0.2 via 10.0.5.2 dev p9p1  src 10.0.5.15
    cache

While the route does remain in the table (so it can be modified if
needed rather than being wiped away as it would be if IFF_UP was
cleared), the proper next-hop is chosen automatically when the link is
down.  Now interface p8p1 is linked-up:

default via 10.0.5.2 dev p9p1
10.0.5.0/24 dev p9p1  proto kernel  scope link  src 10.0.5.15
70.0.0.0/24 dev p7p1  proto kernel  scope link  src 70.0.0.1
80.0.0.0/24 dev p8p1  proto kernel  scope link  src 80.0.0.1
90.0.0.0/24 via 80.0.0.2 dev p8p1  metric 1
90.0.0.0/24 via 70.0.0.2 dev p7p1  metric 2
192.168.56.0/24 dev p2p1  proto kernel  scope link  src 192.168.56.2
90.0.0.1 via 80.0.0.2 dev p8p1  src 80.0.0.1
    cache
local 80.0.0.1 dev lo  src 80.0.0.1
    cache <local>
80.0.0.2 dev p8p1  src 80.0.0.1
    cache

and the output changes to what one would expect.

If the sysctl is not set, the following output would be expected when
p8p1 is down:

default via 10.0.5.2 dev p9p1
10.0.5.0/24 dev p9p1  proto kernel  scope link  src 10.0.5.15
70.0.0.0/24 dev p7p1  proto kernel  scope link  src 70.0.0.1
80.0.0.0/24 dev p8p1  proto kernel  scope link  src 80.0.0.1 linkdown
90.0.0.0/24 via 80.0.0.2 dev p8p1  metric 1 linkdown
90.0.0.0/24 via 70.0.0.2 dev p7p1  metric 2

Since the dead flag does not appear, there should be no expectation that
the kernel would skip using this route due to link being down.

v2: Split kernel changes into 2 patches, this actually makes a
behavioral change if the sysctl is set.  Also took suggestion from Alex
to simplify code by only checking sysctl during fib lookup and
suggestion from Scott to add a per-interface sysctl.

v3: Code clean-ups to make it more readable and efficient as well as a
reverse path check fix.

v4: Drop binary sysctl

v5: Whitespace fixups from Dave

v6: Style changes from Dave and checkpatch suggestions

v7: One more checkpatch fixup
Signed-off-by: NAndy Gospodarek <gospo@cumulusnetworks.com>
Signed-off-by: NDinesh Dutt <ddutt@cumulusnetworks.com>
Acked-by: NScott Feldman <sfeldma@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0eeb075f

16 6月, 2015 1 次提交

netfilter: x_tables: remove XT_TABLE_INFO_SZ and a dereference. · 711bdde6

由 Eric Dumazet 提交于 6月 15, 2015

After Florian patches, there is no need for XT_TABLE_INFO_SZ anymore :
Only one copy of table is kept, instead of one copy per cpu.

We also can avoid a dereference if we put table data right after
xt_table_info. It reduces register pressure and helps compiler.

Then, we attempt a kmalloc() if total size is under order-3 allocation,
to reduce TLB pressure, as in many cases, rules fit in 32 KB.
Signed-off-by: NEric Dumazet <edumazet@google.com>
Cc: Florian Westphal <fw@strlen.de>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

711bdde6

15 6月, 2015 1 次提交

netfilter: Kconfig: get rid of parens around depends on · f09becc7

由 Pablo Neira Ayuso 提交于 6月 12, 2015

According to the reporter, they are not needed.
Reported-by: NSergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

f09becc7

12 6月, 2015 2 次提交

netfilter: xtables: avoid percpu ruleset duplication · 482cfc31

由 Florian Westphal 提交于 6月 11, 2015

We store the rule blob per (possible) cpu.  Unfortunately this means we can
waste lot of memory on big smp machines. ipt_entry structure ('rule head')
is 112 byte, so e.g. with maxcpu=64 one single rule eats
close to 8k RAM.

Since previous patch made counters percpu it appears there is nothing
left in the rule blob that needs to be percpu.

On my test system (144 possible cpus, 400k dummy rules) this
change saves close to 9 Gigabyte of RAM.
Reported-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: NJesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: NFlorian Westphal <fw@strlen.de>
Acked-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

482cfc31

netfilter: xtables: use percpu rule counters · 71ae0dff

由 Florian Westphal 提交于 6月 11, 2015

The binary arp/ip/ip6tables ruleset is stored per cpu.

The only reason left as to why we need percpu duplication are the rule
counters embedded into ipt_entry et al -- since each cpu has its own copy
of the rules, all counters can be lockless.

The downside is that the more cpus are supported, the more memory is
required.  Rules are not just duplicated per online cpu but for each
possible cpu, i.e. if maxcpu is 144, then rule is duplicated 144 times,
not for the e.g. 64 cores present.

To save some memory and also improve utilization of shared caches it
would be preferable to only store the rule blob once.

So we first need to separate counters and the rule blob.

Instead of using entry->counters, allocate this percpu and store the
percpu address in entry->counters.pcnt on CONFIG_SMP.

This change makes no sense as-is; it is merely an intermediate step to
remove the percpu duplication of the rule set in a followup patch.
Suggested-by: NEric Dumazet <edumazet@google.com>
Acked-by: NJesper Dangaard Brouer <brouer@redhat.com>
Reported-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: NFlorian Westphal <fw@strlen.de>
Acked-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

71ae0dff

27 5月, 2015 1 次提交
- F
  netfilter: remove unused comefrom hookmask argument · 2f06550b
  由 Florian Westphal 提交于 5月 24, 2015
```
Signed-off-by: NFlorian Westphal <fw@strlen.de>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
```
  2f06550b
20 5月, 2015 1 次提交

netfilter: ensure number of counters is >0 in do_replace() · 1086bbe9

由 Dave Jones 提交于 5月 19, 2015

After improving setsockopt() coverage in trinity, I started triggering
vmalloc failures pretty reliably from this code path:

warn_alloc_failed+0xe9/0x140
__vmalloc_node_range+0x1be/0x270
vzalloc+0x4b/0x50
__do_replace+0x52/0x260 [ip_tables]
do_ipt_set_ctl+0x15d/0x1d0 [ip_tables]
nf_setsockopt+0x65/0x90
ip_setsockopt+0x61/0xa0
raw_setsockopt+0x16/0x60
sock_common_setsockopt+0x14/0x20
SyS_setsockopt+0x71/0xd0

It turns out we don't validate that the num_counters field in the
struct we pass in from userspace is initialized.

The same problem also exists in ebtables, arptables, ipv6, and the
compat variants.
Signed-off-by: NDave Jones <davej@codemonkey.org.uk>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

1086bbe9

18 5月, 2015 1 次提交

netfilter: synproxy: fix sparse errors · ba6d0564

由 Eric Dumazet 提交于 5月 15, 2015

Fix verbose sparse errors :

make C=2 CF=-D__CHECK_ENDIAN__ net/ipv4/netfilter/ipt_SYNPROXY.o
Signed-off-by: NEric Dumazet <edumazet@google.com>
Acked-by: NPablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ba6d0564

16 5月, 2015 1 次提交

netfilter: x_tables: add context to know if extension runs from nft_compat · 55917a21

由 Pablo Neira Ayuso 提交于 5月 14, 2015

Currently, we have four xtables extensions that cannot be used from the
xt over nft compat layer. The problem is that they need real access to
the full blown xt_entry to validate that the rule comes with the right
dependencies. This check was introduced to overcome the lack of
sufficient userspace dependency validation in iptables.

To resolve this problem, this patch introduces a new field to the
xt_tgchk_param structure that tell us if the extension is run from
nft_compat context.

The three affected extensions are:

1) CLUSTERIP, this target has been superseded by xt_cluster. So just
   bail out by returning -EINVAL.

2) TCPMSS. Relax the checking when used from nft_compat. If used with
   the wrong configuration, it will corrupt !syn packets by adding TCP
   MSS option.

3) ebt_stp. Relax the check to make sure it uses the reserved
   destination MAC address for STP.
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
Tested-by: NArturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>

55917a21

13 4月, 2015 2 次提交

netfilter: nf_tables: switch registers to 32 bit addressing · 49499c3e

由 Patrick McHardy 提交于 4月 11, 2015

Switch the nf_tables registers from 128 bit addressing to 32 bit
addressing to support so called concatenations, where multiple values
can be concatenated over multiple registers for O(1) exact matches of
multiple dimensions using sets.

The old register values are mapped to areas of 128 bits for compatibility.
When dumping register numbers, values are expressed using the old values
if they refer to the beginning of a 128 bit area for compatibility.

To support concatenations, register loads of less than a full 32 bit
value need to be padded. This mainly affects the payload and exthdr
expressions, which both unconditionally zero the last word before
copying the data.

Userspace fully passes the testsuite using both old and new register
addressing.
Signed-off-by: NPatrick McHardy <kaber@trash.net>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

49499c3e

netfilter: nf_tables: get rid of NFT_REG_VERDICT usage · a55e22e9

由 Patrick McHardy 提交于 4月 11, 2015

Replace the array of registers passed to expressions by a struct nft_regs,
containing the verdict as a seperate member, which aliases to the
NFT_REG_VERDICT register.

This is needed to seperate the verdict from the data registers completely,
so their size can be changed.
Signed-off-by: NPatrick McHardy <kaber@trash.net>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

a55e22e9

09 4月, 2015 1 次提交

netfilter: Fix switch statement warnings with recent gcc. · c1f86676

由 David Miller 提交于 4月 07, 2015

More recent GCC warns about two kinds of switch statement uses:

1) Switching on an enumeration, but not having an explicit case
   statement for all members of the enumeration.  To show the
   compiler this is intentional, we simply add a default case
   with nothing more than a break statement.

2) Switching on a boolean value.  I think this warning is dumb
   but nevertheless you get it wholesale with -Wswitch.

This patch cures all such warnings in netfilter.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Acked-by: NPablo Neira Ayuso <pablo@netfilter.org>

c1f86676

08 4月, 2015 1 次提交

netfilter: bridge: add helpers for fetching physin/outdev · c737b7c4

由 Florian Westphal 提交于 4月 02, 2015

right now we store this in the nf_bridge_info struct, accessible
via skb->nf_bridge.  This patch prepares removal of this pointer from skb:

Instead of using skb->nf_bridge->x, we use helpers to obtain the in/out
device (or ifindexes).

Followup patches to netfilter will then allow nf_bridge_info to be
obtained by a call into the br_netfilter core, rather than keeping a
pointer to it in sk_buff.
Signed-off-by: NFlorian Westphal <fw@strlen.de>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

c737b7c4

05 4月, 2015 5 次提交
- D
  netfilter: Pass nf_hook_state through arpt_do_table(). · b85c3dc9
  由 David S. Miller 提交于 4月 03, 2015
```
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
```
  b85c3dc9
- D
  netfilter: Pass nf_hook_state through nft_set_pktinfo*(). · 073bfd56
  由 David S. Miller 提交于 4月 03, 2015
```
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
```
  073bfd56
- D
  netfilter: Pass nf_hook_state through ipt_do_table(). · 1c491ba2
  由 David S. Miller 提交于 4月 03, 2015
```
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
```
  1c491ba2
- D
  netfilter: Pass nf_hook_state through nf_nat_ipv4_{in,out,fn,local_fn}(). · d7cf4081
  由 David S. Miller 提交于 4月 03, 2015
```
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
```
  d7cf4081
- D
  netfilter: Make nf_hookfn use nf_hook_state. · 238e54c9
  由 David S. Miller 提交于 4月 03, 2015
```
Pass the nf_hook_state all the way down into the hook
functions themselves.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
```
  238e54c9
01 4月, 2015 2 次提交

netlink: implement nla_get_in_addr and nla_get_in6_addr · 67b61f6c

由 Jiri Benc 提交于 3月 29, 2015

Those are counterparts to nla_put_in_addr and nla_put_in6_addr.
Signed-off-by: NJiri Benc <jbenc@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

67b61f6c

netlink: implement nla_put_in_addr and nla_put_in6_addr · 930345ea

由 Jiri Benc 提交于 3月 29, 2015

IP addresses are often stored in netlink attributes. Add generic functions
to do that.

For nla_put_in_addr, it would be nicer to pass struct in_addr but this is
not used universally throughout the kernel, in way too many places __be32 is
used to store IPv4 address.
Signed-off-by: NJiri Benc <jbenc@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

930345ea

25 3月, 2015 1 次提交

netfilter: Use LOGLEVEL_<FOO> defines · a81b2ce8

由 Joe Perches 提交于 3月 23, 2015

Use the #defines where appropriate.

Miscellanea:

Add explicit #include <linux/kernel.h> where it was not
previously used so that these #defines are a bit more
explicitly defined instead of indirectly included via:
	module.h->moduleparam.h->kernel.h
Signed-off-by: NJoe Perches <joe@perches.com>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

a81b2ce8

19 3月, 2015 1 次提交

netfilter: restore rule tracing via nfnetlink_log · 4017a7ee

由 Pablo Neira Ayuso 提交于 3月 02, 2015

Since fab4085f ("netfilter: log: nf_log_packet() as real unified
interface"), the loginfo structure that is passed to nf_log_packet() is
used to explicitly indicate the logger type you want to use.

This is a problem for people tracing rules through nfnetlink_log since
packets are always routed to the NF_LOG_TYPE logger after the
aforementioned patch.

We can fix this by removing the trace loginfo structures, but that still
changes the log level from 4 to 5 for tracing messages and there may be
someone relying on this outthere. So let's just introduce a new
nf_log_trace() function that restores the former behaviour.
Reported-by: NMarkus Kötter <koetter@rrzn.uni-hannover.de>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

4017a7ee

18 3月, 2015 1 次提交

netfilter: Remove uses of seq_<foo> return values · 1ca9e417

由 Joe Perches 提交于 3月 16, 2015

The seq_printf/seq_puts/seq_putc return values, because they
are frequently misused, will eventually be converted to void.

See: commit 1f33c41c ("seq_file: Rename seq_overflow() to
     seq_has_overflowed() and make public")

Miscellanea:

o realign arguments
Signed-off-by: NJoe Perches <joe@perches.com>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

1ca9e417

Linux-御风守护者 / linux 与 Fork 源项目一致

Linux-御风守护者 / linux
与 Fork 源项目一致