Commits · f4c756b4ea7d2921391febcaed4ce2511872a0e1 · openeuler / raspberrypi-kernel

29 Dec, 2015 3 commits

netfilter: nf_tables: remove check against removal of inactive objects · f4c756b4

Authored by Pablo Neira Ayuso on Dec 15, 2015

The following sequence inside a batch, although not very useful, is
valid:

 add table foo
 ...
 delete table foo

This may be generated by some robot while applying some incremental
upgrade, so remove the defensive checks against this.

This patch keeps the check on the get/dump path for now; we have to
replace the inactive flag by introducing object generations.
Reported-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

f4c756b4

netfilter: nf_tables: destroy basechain and rules on netdevice removal · 5ebe0b0e

Authored by Pablo Neira Ayuso on Dec 15, 2015

If the netdevice is destroyed, the resources that are attached should
be released too as they belong to the device that is now gone.
Suggested-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

5ebe0b0e

netfilter: nf_tables: release objects on netns destruction · df05ef87

Authored by Pablo Neira Ayuso on Dec 15, 2015

We have to release the existing objects on netns removal, otherwise we
leak them. Chains are unregistered first to make sure no packets are
walking on our rules and sets anymore.

The object release happens when we unregister the family via
nft_release_afinfo(), which is called from nft_unregister_afinfo() in
the corresponding __net_exit path of every family.
Reported-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

df05ef87

18 Dec, 2015 1 commit

netfilter: meta: add support for setting skb->pkttype · b4aae759

Authored by Florian Westphal on Dec 10, 2015

This allows redirecting bridged packets to the local machine:

ether type ip ether daddr set aa:53:08:12:34:56 meta pkttype set unicast

Without 'set unicast', the IP stack discards PACKET_OTHERHOST skbs.

It is also useful for adding support for a '-m cluster'-like nft rule
(where the switch floods packets to several nodes, and each cluster
node processes a subset of packets for load distribution).

Mangling is restricted to HOST/OTHER/BROAD/MULTICAST, i.e. you cannot set
skb->pkt_type to PACKET_KERNEL or change PACKET_LOOPBACK to PACKET_HOST.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

b4aae759
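
The mangling restriction described in b4aae759 can be illustrated with a small userspace sketch. This is not the kernel's code: the helper names pkt_type_mangle_ok() and pkt_type_set() are hypothetical, and the constants are redefined locally with the values used by linux/if_packet.h so the sketch builds anywhere. It assumes the restriction applies to both the current and the requested packet type, as the commit message suggests.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the PACKET_* values from linux/if_packet.h. */
enum {
	PKT_HOST      = 0, /* PACKET_HOST */
	PKT_BROADCAST = 1, /* PACKET_BROADCAST */
	PKT_MULTICAST = 2, /* PACKET_MULTICAST */
	PKT_OTHERHOST = 3, /* PACKET_OTHERHOST */
	PKT_OUTGOING  = 4, /* PACKET_OUTGOING */
	PKT_LOOPBACK  = 5, /* PACKET_LOOPBACK */
	PKT_USER      = 6, /* PACKET_USER */
	PKT_KERNEL    = 7, /* PACKET_KERNEL */
};

/* Hypothetical helper: true only for HOST/OTHER/BROAD/MULTICAST. */
bool pkt_type_mangle_ok(uint32_t type)
{
	return type == PKT_HOST || type == PKT_BROADCAST ||
	       type == PKT_MULTICAST || type == PKT_OTHERHOST;
}

/*
 * Hypothetical setter: returns the requested type when both the current
 * and the requested type are in the allowed set, otherwise leaves the
 * current type unchanged.
 */
uint8_t pkt_type_set(uint8_t current, uint32_t wanted)
{
	if (pkt_type_mangle_ok(current) && pkt_type_mangle_ok(wanted))
		return (uint8_t)wanted;
	return current;
}
```

With these definitions, pkt_type_set(PKT_OTHERHOST, PKT_HOST) yields PKT_HOST, while attempts to set PKT_KERNEL or to rewrite a PKT_LOOPBACK packet leave the type untouched.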

16 Dec, 2015 1 commit

sctp: Rename NETIF_F_SCTP_CSUM to NETIF_F_SCTP_CRC · 53692b1d

Authored by Tom Herbert on Dec 14, 2015

The SCTP checksum is really a CRC and is very different from the
standard 1's complement checksum that serves as the checksum
for IP protocols. This offload interface is also very different.
Rename NETIF_F_SCTP_CSUM to NETIF_F_SCTP_CRC to highlight these
differences. The term CSUM should be reserved in the stack to refer
to the standard 1's complement IP checksum.
Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

53692b1d

15 Dec, 2015 3 commits

nfnetlink: add nfnl_dereference_protected helper · 9c55d3b5

Authored by Florian Westphal on Dec 03, 2015

This avoids an overly long line in a follow-up patch.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

9c55d3b5

netfilter: implement xt_cgroup cgroup2 path match · c38c4597

Authored by Tejun Heo on Dec 07, 2015

This patch implements xt_cgroup path match which matches cgroup2
membership of the associated socket.  The match is recursive and
invertible.

For the rationale behind introducing another cgroup-based match, please
refer to a preceding commit "sock, cgroup: add sock->sk_cgroup".

v3: Folded into xt_cgroup as a new revision interface as suggested by
    Pablo.

v2: Included linux/limits.h from xt_cgroup2.h for PATH_MAX.  Added
    explicit alignment to the priv field.  Both suggested by Jan.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Daniel Wagner <daniel.wagner@bmw-carit.de>
CC: Neil Horman <nhorman@tuxdriver.com>
Cc: Jan Engelhardt <jengelh@inai.de>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

c38c4597

netfilter: prepare xt_cgroup for multi revisions · 4ec8ff0e

Authored by Tejun Heo on Dec 07, 2015

xt_cgroup will grow a cgroup2 path-based match.  Suffix existing
symbols with _v0 and prepare for multi-revision registration.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Daniel Wagner <daniel.wagner@bmw-carit.de>
CC: Neil Horman <nhorman@tuxdriver.com>
Cc: Jan Engelhardt <jengelh@inai.de>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

4ec8ff0e

14 Dec, 2015 2 commits

netfilter: cttimeout: add netns support · 19576c94

Authored by Pablo Neira on Dec 09, 2015

Add a per-netns list of timeout objects and adjust code to use it.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

19576c94

netfilter: nf_tables: use reverse traversal commit_list in nf_tables_abort · a907e36d

Authored by Xin Long on Dec 07, 2015

When we use 'nft -f' to submit rules, it packs multiple rules into one
netlink skb and sends it to the kernel, which processes them one by one.
Meanwhile, the kernel adds a trans to commit_list to record every commit.
If any of them returns -EAGAIN, status |= NFNL_BATCH_REPLAY is set, and
once all processing is done the kernel rolls back all the commits.

Currently the kernel uses list_add_tail to add a trans to commit_list and
list_for_each_entry_safe to roll back, which means the rollback order is
the same as the order of addition. This breaks some cases and can even
trigger a call trace, for example:

1. add a set into table foo  [return -EAGAIN]:
   commit_list = 'add set trans'
2. del foo:
   commit_list = 'add set trans' -> 'del set trans' -> 'del tab trans'
Then nf_tables_abort is called to roll back.
It first processes 'add set trans':
                   case NFT_MSG_NEWSET:
                        trans->ctx.table->use--;
                        list_del_rcu(&nft_trans_set(trans)->list);

  This deletes the set from table foo, but the set was already removed
  when table foo was deleted [step 2], so the kernel panics.

The right rollback order is:
  'del tab trans' -> 'del set trans' -> 'add set trans'.
which is the reverse of the commit_list order.

So fix it by rolling back commits in reverse order in nf_tables_abort.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

a907e36d
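
The ordering problem fixed by a907e36d can be modelled in a few lines of userspace C. This is only a toy model under stated assumptions: the struct state, the trans_type enum and the abort_forward()/abort_reverse() functions are hypothetical stand-ins, not kernel structures, and a "bad access" counter replaces the real panic. It shows that undoing the batch 'add set' -> 'del set' -> 'del table' front to back touches a table that is still deleted, while undoing it back to front does not.

```c
#include <stddef.h>

/* Hypothetical transaction kinds, in the spirit of the commit's example. */
enum trans_type { ADD_SET, DEL_SET, DEL_TABLE };

/* Toy view of table foo and its set after the batch has been applied. */
struct state {
	int table_present; /* 1 while table foo is visible */
	int set_present;   /* 1 while the set is linked into the table */
	int bad_access;    /* undo steps that touched a missing table */
};

/* Undo a single transaction; set operations need the table to exist. */
static void undo(struct state *s, enum trans_type t)
{
	switch (t) {
	case ADD_SET:
		if (!s->table_present)
			s->bad_access++;
		s->set_present = 0;
		break;
	case DEL_SET:
		if (!s->table_present)
			s->bad_access++;
		s->set_present = 1;
		break;
	case DEL_TABLE:
		s->table_present = 1;
		break;
	}
}

/* Roll back in the order the transactions were added (the old behaviour). */
void abort_forward(struct state *s, const enum trans_type *ops, size_t n)
{
	for (size_t i = 0; i < n; i++)
		undo(s, ops[i]);
}

/* Roll back in reverse order (the behaviour the commit describes). */
void abort_reverse(struct state *s, const enum trans_type *ops, size_t n)
{
	for (size_t i = n; i-- > 0; )
		undo(s, ops[i]);
}
```

Starting from the post-batch state (table and set both gone), abort_reverse() restores the table without any bad access and leaves the set absent, which matches the state before the batch; abort_forward() records two accesses to the missing table and leaves the set linked.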

11 Dec, 2015 1 commit

netfilter: nfnetlink: fix splat due to incorrect socket memory accounting in skbuff clones · bd678e09

Authored by Pablo Neira Ayuso on Dec 09, 2015

If we attach the sk to the skb from nfnetlink_rcv_batch(), then
netlink_skb_destructor() will underflow the socket receive memory
counter and we get a warning splat when releasing the socket.

$ cat /proc/net/netlink
sk       Eth Pid    Groups   Rmem     Wmem     Dump     Locks     Drops     Inode
ffff8800ca903000 12  0      00000000 -54144   0        0 2        0        17942
                                     ^^^^^^

Rmem above shows an underflow.

Here is the warning splat:

[ 1363.815976] WARNING: CPU: 2 PID: 1356 at net/netlink/af_netlink.c:958 netlink_sock_destruct+0x80/0xb9()
[...]
[ 1363.816152] CPU: 2 PID: 1356 Comm: kworker/u16:1 Tainted: G        W       4.4.0-rc1+ #153
[ 1363.816155] Hardware name: LENOVO 23259H1/23259H1, BIOS G2ET32WW (1.12 ) 05/30/2012
[ 1363.816160] Workqueue: netns cleanup_net
[ 1363.816163]  0000000000000000 ffff880119203dd0 ffffffff81240204 0000000000000000
[ 1363.816169]  ffff880119203e08 ffffffff8104db4b ffffffff813d49a1 ffff8800ca771000
[ 1363.816174]  ffffffff81a42b00 0000000000000000 ffff8800c0afe1e0 ffff880119203e18
[ 1363.816179] Call Trace:
[ 1363.816181]  <IRQ>  [<ffffffff81240204>] dump_stack+0x4e/0x79
[ 1363.816193]  [<ffffffff8104db4b>] warn_slowpath_common+0x9a/0xb3
[ 1363.816197]  [<ffffffff813d49a1>] ? netlink_sock_destruct+0x80/0xb9

skb->sk was only needed to look up the netns; however, we don't need
this anymore since 633c9a84 ("netfilter: nfnetlink: avoid recurrent
netns lookups in call_batch"), so this patch removes the manual socket
assignment to resolve this problem.
Reported-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
Reported-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Tested-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>

bd678e09

10 Dec, 2015 1 commit

netfilter: nfnetlink: avoid recurrent netns lookups in call_batch · 633c9a84

Authored by Pablo Neira Ayuso on Dec 09, 2015

Pass the net pointer to the call_batch callback functions so we can skip
recurrent lookups.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Tested-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>

633c9a84

09 Dec, 2015 6 commits

netfilter: nf_tables: fix nf_log_trace based tracing · 9fb0b519

Authored by Florian Westphal on Dec 09, 2015

nf_log_trace() outputs bogus 'TRACE:' strings because I forgot to update
the comments array.

Fixes: 33d5a7b1 ("netfilter: nf_tables: extend tracing infrastructure")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

9fb0b519

netfilter: nfnetlink_log: Change setter functions to be void · 23509fcd

Authored by Rosen, Rami on Dec 08, 2015

Change return type of nfulnl_set_timeout() and nfulnl_set_qthresh() to
be void.

This patch changes the return type of the static methods
nfulnl_set_timeout() and nfulnl_set_qthresh() to be void, as there is no
justification and no need for these methods to return int.
Signed-off-by: Rami Rosen <rami.rosen@intel.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

23509fcd

netfilter: nfnetlink_queue: Unregister pernet subsys in case of init failure · 639e077b

Authored by Nikolay Borisov on Dec 07, 2015

Commit 3bfe0498 ("netfilter: nfnetlink_{log,queue}:
Register pernet in first place") reorganised the initialisation
order of the pernet_subsys to avoid "use-before-initialised"
condition. However, in doing so the cleanup logic in nfnetlink_queue
got botched in that the pernet_subsys wasn't cleaned in case
nfnetlink_subsys_register failed. This patch adds the necessary
cleanup routine call.

Fixes: 3bfe0498 ("netfilter: nfnetlink_{log,queue}: Register pernet in first place")
Signed-off-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

639e077b

netfilter: nf_tables: wrap tracing with a static key · e639f7ab

Authored by Florian Westphal on Nov 28, 2015

Only needed when meta nftrace rule(s) were added.
The assumption is that no such rules are active, so the call to
nft_trace_init is "never" needed.

When nftrace rules are active, we always call the nft_trace_* functions,
but will only send netlink messages when all of the following are true:

 - traceinfo structure was initialised
 - skb->nf_trace == 1
 - at least one subscriber to trace group.

Adding an extra conditional
(static_branch ... && skb->nf_trace)
	nft_trace_init( ..)

is possible but results in a larger nft_do_chain footprint.
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

e639f7ab

netfilter: nf_tables: extend tracing infrastructure · 33d5a7b1

Authored by Florian Westphal on Nov 28, 2015

nft monitor mode can then decode and display this trace data.

Parts of LL/Network/Transport headers are provided as separate
attributes.

Otherwise, printing IP address data becomes virtually impossible
for userspace since in the case of the netdev family we really don't
want userspace to have to know all the possible link layer types
and/or sizes just to display/print an ip address.

We also don't want userspace to have to follow ipv6 header chains
to get the s/dport info, the kernel already did this work for us.

To avoid bloating nft_do_chain all data required for tracing is
encapsulated in nft_traceinfo.

The structure is initialized unconditionally(!) for each nft_do_chain
invocation.

This unconditional call will be moved under a static key in a
follow-up patch.

With lots of help from Patrick McHardy and Pablo Neira.
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

33d5a7b1

net: wrap sock->sk_cgrp_prioidx and ->sk_classid inside a struct · 2a56a1fe

Authored by Tejun Heo on Dec 07, 2015

Introduce sock->sk_cgrp_data which is a struct sock_cgroup_data.
->sk_cgrp_prioidx and ->sk_classid are moved into it.  The struct
and its accessors are defined in cgroup-defs.h.  This is to prepare
for overloading the fields with a cgroup pointer.

This patch mostly performs equivalent conversions but the following
are noteworthy.

* Equality test before updating classid is removed from
  sock_update_classid().  This shouldn't make any noticeable
  difference and a similar test will be implemented on the helper side
  later.

* sock_update_netprioidx() now takes struct sock_cgroup_data and can
  be moved to netprio_cgroup.h without causing include dependency
  loop.  Moved.

* The dummy version of sock_update_netprioidx() converted to a static
  inline function while at it.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

2a56a1fe

25 Nov, 2015 2 commits

netfilter: nft_payload: add packet mangling support · 7ec3f7b4

Authored by Patrick McHardy on Nov 24, 2015

Add support for mangling packet payload. Checksum for the specified base
header is updated automatically if requested, however no updates for any
kind of pseudo headers are supported, meaning no stateless NAT is supported.

For checksum updates different checksumming methods can be specified. The
currently supported methods are NONE for no checksum updates, and INET for
internet type checksums.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

7ec3f7b4
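
To illustrate what an "internet type" checksum update after mangling involves (7ec3f7b4), here is a userspace sketch of the incremental update from RFC 1624, HC' = ~(~HC + ~m + m'). It is not the code from the patch: inet_csum() and inet_csum_update() are hypothetical names, and the sketch treats the header as an array of 16-bit values without dealing with byte order or pseudo headers.

```c
#include <stddef.h>
#include <stdint.h>

/* Full 1's complement checksum over n 16-bit words. */
uint16_t inet_csum(const uint16_t *words, size_t n)
{
	uint32_t sum = 0;

	for (size_t i = 0; i < n; i++)
		sum += words[i];
	/* Fold the carries back into the low 16 bits. */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}

/*
 * Incremental update per RFC 1624 (eqn. 3): given the old checksum and
 * the old and new value of one 16-bit word, compute the new checksum
 * without re-reading the rest of the header.
 */
uint16_t inet_csum_update(uint16_t old_csum, uint16_t old_word,
			  uint16_t new_word)
{
	uint32_t sum = (uint16_t)~old_csum;

	sum += (uint16_t)~old_word;
	sum += new_word;
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}
```

For a header whose checksum field is zeroed, changing one word and calling inet_csum_update() with the old checksum gives the same result as running inet_csum() over the modified header again.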

netfilter: Set /proc/net entries owner to root in namespace · f13f2aee

Authored by Philip Whineray on Nov 22, 2015

Various files are owned by root with 0440 permission. Reading them is
impossible in an unprivileged user namespace, interfering with firewall
tools. For instance, iptables-save relies on /proc/net/ip_tables_names
contents to dump only loaded tables.

This patch assigns ownership of the following files to root in the
current namespace:

- /proc/net/*_tables_names
- /proc/net/*_tables_matches
- /proc/net/*_tables_targets
- /proc/net/nf_conntrack
- /proc/net/nf_conntrack_expect
- /proc/net/netfilter/nfnetlink_log

A mapping for root must be available, so this order should be followed:

unshare(CLONE_NEWUSER);
/* Setup the mapping */
unshare(CLONE_NEWNET);
Signed-off-by: Philip Whineray <phil@firehol.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

f13f2aee

23 Nov, 2015 1 commit

netfilter: nfnetlink_queue: avoid harmless unnitialized variable warnings · 8e662164

Authored by Arnd Bergmann on Nov 19, 2015

Several ARM default configurations give us warnings on recent
compilers about potentially uninitialized variables in the
nfnetlink code in two functions:

net/netfilter/nfnetlink_queue.c: In function 'nfqnl_build_packet_message':
net/netfilter/nfnetlink_queue.c:519:19: warning: 'nfnl_ct' may be used uninitialized in this function [-Wmaybe-uninitialized]
if (ct && nfnl_ct->build(skb, ct, ctinfo, NFQA_CT, NFQA_CT_INFO) < 0)

Moving the rcu_dereference(nfnl_ct_hook) call outside of the
conditional code avoids the warning without forcing us to
preinitialize the variable.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: a4b4766c ("netfilter: nfnetlink_queue: rename related to nfqueue attaching conntrack info")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

8e662164

16 Nov, 2015 1 commit

ipvs: use skb_to_full_sk() helper · 340c78e5

Authored by Eric Dumazet on Nov 12, 2015

SYNACK packets might be attached to request sockets.

Use skb_to_full_sk() helper to avoid illegal accesses to
inet_sk(skb->sk)

Fixes: ca6fb065 ("tcp: attach SYNACK messages to request sockets instead of listener")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
Acked-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Simon Horman <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

340c78e5

11 Nov, 2015 3 commits

netfilter: nf_tables: add clone interface to expression operations · 086f3321

Authored by Pablo Neira Ayuso on Nov 10, 2015

With the conversion of the counter expression to percpu, we need to
clone the percpu memory area, otherwise we crash when using counters
from flow tables.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

086f3321

netfilter: fix xt_TEE and xt_TPROXY dependencies · 74ec4d55

Authored by Arnd Bergmann on Nov 10, 2015

Kconfig is too smart for its own good: a Kconfig line that states

	select NF_DEFRAG_IPV6 if IP6_NF_IPTABLES

means that if IP6_NF_IPTABLES is set to 'm', then NF_DEFRAG_IPV6 will
also be set to 'm', regardless of the state of the symbol from which
it is selected. When the xt_TEE driver is built-in and nothing else
forces NF_DEFRAG_IPV6 to be built-in, this causes a link-time error:

net/built-in.o: In function `tee_tg6':
net/netfilter/xt_TEE.c:46: undefined reference to `nf_dup_ipv6'

This works around that behavior by changing the dependency to
'if IP6_NF_IPTABLES != n', which is interpreted as boolean expression
rather than a tristate and causes the NF_DEFRAG_IPV6 symbol to
be built-in as well.

The bug only occurs once in thousands of 'randconfig' builds and
does not really impact real users. From inspecting the other
surrounding Kconfig symbols, I am guessing that NETFILTER_XT_TARGET_TPROXY
and NETFILTER_XT_MATCH_SOCKET have the same issue. If not, this
change should still be harmless.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

74ec4d55

netfilter: nfnetlink_log: work around uninitialized variable warning · c872a2d9

Authored by Arnd Bergmann on Nov 10, 2015

After a recent (correct) change, gcc started warning about the use
of the 'flags' variable in nfulnl_recv_config():

net/netfilter/nfnetlink_log.c: In function 'nfulnl_recv_config':
net/netfilter/nfnetlink_log.c:320:14: warning: 'flags' may be used uninitialized in this function [-Wmaybe-uninitialized]
net/netfilter/nfnetlink_log.c:828:6: note: 'flags' was declared here

The warning first shows up in ARM s3c2410_defconfig with gcc-4.3 or
higher (including 5.2.1, which is the latest version I checked). I
tried working around it by rearranging the code but had no success
with that.

As a last resort, this initializes the variable to zero, which shuts
up the warning, but means that we don't get a warning if the code
is ever changed in a way that actually causes the variable to be
used without first being written.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 8cbc8708 ("netfilter: nfnetlink_log: validate dependencies to avoid breaking atomicity")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

c872a2d9

09 Nov, 2015 2 commits

netfilter: nft_meta: use skb_to_full_sk() helper · 3aed8225

Authored by Eric Dumazet on Nov 08, 2015

SYNACK packets might be attached to request sockets.

Fixes: ca6fb065 ("tcp: attach SYNACK messages to request sockets instead of listener")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

3aed8225

netfilter: xt_owner: use skb_to_full_sk() helper · fdd723e2

Authored by Eric Dumazet on Nov 08, 2015

SYNACK packets might be attached to a request socket;
xt_owner wants to get the listener in this case.

Fixes: ca6fb065 ("tcp: attach SYNACK messages to request sockets instead of listener")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fdd723e2

07 Nov, 2015 3 commits

netfilter: ipset: Fix hash type expire: release empty hash bucket block · 0aae24eb

Authored by Jozsef Kadlecsik on Nov 07, 2015

When all entries are expired/all slots are empty, release the bucket.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

0aae24eb

netfilter: ipset: Fix hash:* type expiration · e9dfdc05

Authored by Jozsef Kadlecsik on Nov 07, 2015

An incorrect index was used when the data blob was shrunk at expiration,
which could lead to falsely expired entries and a memory leak when
the comment extension was used too.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

e9dfdc05

netfilter: ipset: Fix extension alignment · 95ad1f4a

Authored by Jozsef Kadlecsik on Nov 07, 2015

The data extensions in ipset lacked the proper memory alignment and
thus could lead to a kernel crash on several architectures. Therefore
the structures have been reorganized and alignment attributes added
where needed. The patch was tested on armv7h by Gerhard Wiesinger and
on x86_64, sparc64 by Jozsef Kadlecsik.
Reported-by: Gerhard Wiesinger <lists@wiesinger.com>
Tested-by: Gerhard Wiesinger <lists@wiesinger.com>
Tested-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

95ad1f4a

28 Oct, 2015 1 commit

netfilter: nfnetlink: don't probe module if it exists · dbc3617f

Authored by Florian Westphal on Oct 27, 2015

nfnetlink_bind request_module()s all the time as nfnetlink_get_subsys()
shifts the argument by 8 to obtain the subsys id.

So using type instead of type << 8 always returns NULL.

Fixes: 03292745 ("netlink: add nlk->netlink_bind hook for module auto-loading")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

dbc3617f
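
The lookup bug described in dbc3617f comes down to which byte of the 16-bit message type carries the subsystem id. The sketch below is a simplified userspace model, not the kernel function: get_subsys(), the struct subsys type, the table size and the single registered entry are assumptions made for illustration; only the bit layout (subsystem id in the high byte) and the value 10 for the nftables subsystem follow the kernel's nfnetlink definitions.

```c
#include <stddef.h>
#include <stdint.h>

#define SUBSYS_COUNT 12                      /* assumed table size */
#define SUBSYS_ID(t) (((t) & 0xff00) >> 8)   /* id lives in the high byte */

struct subsys {
	const char *name;
};

static const struct subsys nftables = { "nftables" };

/* Toy registry with a single entry at index 10. */
static const struct subsys *const subsys_table[SUBSYS_COUNT] = {
	[10] = &nftables,
};

/* Look up a subsystem from a full message type (id << 8 | msg). */
const struct subsys *get_subsys(uint16_t type)
{
	uint8_t id = SUBSYS_ID(type);

	if (id >= SUBSYS_COUNT)
		return NULL;
	return subsys_table[id];
}
```

Passing the bare id, get_subsys(10), extracts id 0 from the high byte and finds nothing, which is why the module was probed every time; get_subsys(10 << 8) finds the registered entry.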

27 Oct, 2015 1 commit

netfilter: nf_nat_redirect: add missing NULL pointer check · 94f9cd81

Authored by Munehisa Kamata on Oct 26, 2015

Commit 8b13eddf ("netfilter: refactor NAT
redirect IPv4 to use it from nf_tables") has introduced a trivial logic
change which can result in the following crash.

BUG: unable to handle kernel NULL pointer dereference at 0000000000000030
IP: [<ffffffffa033002d>] nf_nat_redirect_ipv4+0x2d/0xa0 [nf_nat_redirect]
PGD 3ba662067 PUD 3ba661067 PMD 0
Oops: 0000 [#1] SMP
Modules linked in: ipv6(E) xt_REDIRECT(E) nf_nat_redirect(E) xt_tcpudp(E) iptable_nat(E) nf_conntrack_ipv4(E) nf_defrag_ipv4(E) nf_nat_ipv4(E) nf_nat(E) nf_conntrack(E) ip_tables(E) x_tables(E) binfmt_misc(E) xfs(E) libcrc32c(E) evbug(E) evdev(E) psmouse(E) i2c_piix4(E) i2c_core(E) acpi_cpufreq(E) button(E) ext4(E) crc16(E) jbd2(E) mbcache(E) dm_mirror(E) dm_region_hash(E) dm_log(E) dm_mod(E)
CPU: 0 PID: 2536 Comm: ip Tainted: G            E   4.1.7-15.23.amzn1.x86_64 #1
Hardware name: Xen HVM domU, BIOS 4.2.amazon 05/06/2015
task: ffff8800eb438000 ti: ffff8803ba664000 task.ti: ffff8803ba664000
[...]
Call Trace:
 <IRQ>
 [<ffffffffa0334065>] redirect_tg4+0x15/0x20 [xt_REDIRECT]
 [<ffffffffa02e2e99>] ipt_do_table+0x2b9/0x5e1 [ip_tables]
 [<ffffffffa0328045>] iptable_nat_do_chain+0x25/0x30 [iptable_nat]
 [<ffffffffa031777d>] nf_nat_ipv4_fn+0x13d/0x1f0 [nf_nat_ipv4]
 [<ffffffffa0328020>] ? iptable_nat_ipv4_fn+0x20/0x20 [iptable_nat]
 [<ffffffffa031785e>] nf_nat_ipv4_in+0x2e/0x90 [nf_nat_ipv4]
 [<ffffffffa03280a5>] iptable_nat_ipv4_in+0x15/0x20 [iptable_nat]
 [<ffffffff81449137>] nf_iterate+0x57/0x80
 [<ffffffff814491f7>] nf_hook_slow+0x97/0x100
 [<ffffffff814504d4>] ip_rcv+0x314/0x400

unsigned int
nf_nat_redirect_ipv4(struct sk_buff *skb,
...
{
...
		rcu_read_lock();
		indev = __in_dev_get_rcu(skb->dev);
		if (indev != NULL) {
			ifa = indev->ifa_list;
			newdst = ifa->ifa_local; <---
		}
		rcu_read_unlock();
...
}

Before the commit, 'ifa' had been always checked before access. After the
commit, however, it could be accessed even if it's NULL. Interestingly,
this was once fixed in 2003.

http://marc.info/?l=netfilter-devel&m=106668497403047&w=2

In addition to the original one, we have seen the crash when packets that
need to be redirected somehow arrive on an interface which hasn't yet
been fully configured.

This change just reverts the logic to the old behavior to avoid the crash.

Fixes: 8b13eddf ("netfilter: refactor NAT redirect IPv4 to use it from nf_tables")
Signed-off-by: Munehisa Kamata <kamatam@amazon.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

94f9cd81
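
The missing check from 94f9cd81 can be shown with mock types in userspace. This is a sketch under stated assumptions, not the patched function: struct ifaddr_sketch, struct indev_sketch and pick_redirect_addr() are hypothetical stand-ins for the kernel's in_device/in_ifaddr handling, and a return value of 0 simply stands for "no address found".

```c
#include <stddef.h>
#include <stdint.h>

/* Mock of an interface address entry (only the field the sketch needs). */
struct ifaddr_sketch {
	uint32_t local;
};

/* Mock of a per-device IPv4 state whose address list may be empty. */
struct indev_sketch {
	struct ifaddr_sketch *ifa_list;
};

/*
 * Pick the redirect destination. Both the device state and its first
 * address entry are checked before dereferencing, so an interface that
 * has no address configured yet yields 0 instead of a NULL dereference.
 */
uint32_t pick_redirect_addr(const struct indev_sketch *indev)
{
	uint32_t newdst = 0;

	if (indev != NULL && indev->ifa_list != NULL)
		newdst = indev->ifa_list->local;
	return newdst;
}
```

A device without any configured address, or no device state at all, gives 0 here; what the caller does with that result is outside the sketch.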

22 Oct, 2015 1 commit

netfilter: xt_TEE: fix NULL dereference · 45efccdb

Authored by Eric Dumazet on Oct 19, 2015

iptables -I INPUT ... -j TEE --gateway 10.1.2.3

<crash> because --oif was not specified

tee_tg_check() sets ->priv pointer to NULL in this case.

Fixes: bbde9fc1 ("netfilter: factor out packet duplication for IPv4/IPv6")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

45efccdb

17 Oct, 2015 4 commits

netfilter: ipset: Fix sleeping memory allocation in atomic context · 00db674b

Authored by Nikolay Borisov on Oct 16, 2015

Commit 00590fdd introduced RCU locking in list type and in
doing so introduced a memory allocation in list_set_add, which
is done in an atomic context, due to the fact that ipset rcu
list modifications are serialised with a spin lock. The reason
why we can't use a mutex is that in addition to modifying the
list with ipset commands, it's also being modified when a
particular ipset rule timeout expires aka garbage collection.
This gc is triggered from set_cleanup_entries, which in turn
is invoked from a timer thus requiring the lock to be bh-safe.

Concretely the following call chain can lead to "sleeping function
called in atomic context" splat:
call_ad -> list_set_uadt -> list_set_uadd -> kzalloc(, GFP_KERNEL).
GFP_KERNEL allows initiating direct reclaim, so the allocation path
can potentially sleep.

To fix the issue, change the allocation type to GFP_ATOMIC to
correctly reflect that it is occurring in an atomic context.

Fixes: 00590fdd ("netfilter: ipset: Introduce RCU locking in list type")
Signed-off-by: Nikolay Borisov <kernel@kyup.com>
Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

00db674b

netfilter: nf_queue: remove rcu_read_lock calls · 81b4325e

Authored by Florian Westphal on Oct 13, 2015

All verdict handlers make use of the nfnetlink .call_rcu callback,
so the rcu read lock is already held.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

81b4325e

netfilter: make nf_queue_entry_get_refs return void · ed78d09d

Authored by Florian Westphal on Oct 13, 2015

We don't care if the module is being unloaded anymore since hook
unregister handling will destroy queue entries using that hook.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

ed78d09d

netfilter: remove hook owner refcounting · 2ffbceb2

Authored by Florian Westphal on Oct 13, 2015

Since commit 8405a8ff ("netfilter: nf_qeueue: Drop queue entries on
nf_unregister_hook") all pending queued entries are discarded.

So we can simply remove all of the owner handling: when a module is
removed it also needs to unregister all its hooks.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

2ffbceb2

15 Oct, 2015 2 commits

netfilter: nfnetlink_log: validate dependencies to avoid breaking atomicity · 8cbc8708

Authored by Pablo Neira on Oct 13, 2015

Check that dependencies are fulfilled before updating the logger
instance, otherwise we can leave things in intermediate state on errors
in nfulnl_recv_config().

[ Ken-ichirou reports that this is also fixing missing instance refcnt drop
  on error introduced in his patch 914eebf2 ("netfilter: nfnetlink_log:
  autoload nf_conntrack_netlink module NFQA_CFG_F_CONNTRACK config flag"). ]
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Tested-by: Ken-ichirou MATSUZAWA <chamaken@gmail.com>

8cbc8708

netfilter: nfnetlink_log: consolidate check for instance in nfulnl_recv_config() · 336a3b3e

Authored by Pablo Neira Ayuso on Oct 13, 2015

This patch consolidates the check for valid logger instance once we have
passed the command handling:

The config message that we receive may contain the following info:

1) Command only: We always get a valid instance pointer if we just
   created it. In case that the instance is being destroyed or the
   command is unknown, we jump to exit path of nfulnl_recv_config().
   This patch doesn't modify this handling.

2) Config only: In this case, the instance must always exist since the
   user is asking for configuration updates. If the instance doesn't exist
   this returns -ENODEV.

3) No command and no configs are specified: This case is rare. The
   user is sending us a config message with neither commands nor
   config options. In this case, we have to check if the instance exists
   and bail out otherwise. Before this patch, it was possible to send a
   config message with no command and no config updates for a
   nonexistent instance without triggering an error. So this is the only
   case that changes.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Tested-by: Ken-ichirou MATSUZAWA <chamaken@gmail.com>

336a3b3e

13 Oct, 2015 1 commit

netfilter: sync with packet rx also after removing queue entries · 514ed62e

Authored by Florian Westphal on Oct 08, 2015

We need to sync packet rx again after flushing the queue entries.
Otherwise, the following race could happen:

cpu1: nf_unregister_hook(H) called, H unlinked from lists, calls
synchronize_net() to wait for packet rx completion.

Problem is that while no new nf_queue_entry structs that use H can be
allocated, another CPU might receive a verdict from userspace just before
cpu1 calls nf_queue_nf_hook_drop to remove this entry:

cpu2: receive verdict from userspace, lock queue
cpu2: unlink nf_queue_entry struct E, which references H, from queue list
cpu1: calls nf_queue_nf_hook_drop, blocks on queue spinlock
cpu2: unlock queue
cpu1: nf_queue_nf_hook_drop drops affected queue entries
cpu2: call nf_reinject for E
cpu1: kfree(H)
cpu2: potential use-after-free for H

Cc: Eric W. Biederman <ebiederm@xmission.com>
Fixes: 085db2c0 ("netfilter: Per network namespace netfilter hooks.")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

514ed62e