1. 29 Dec, 2015 3 commits
  2. 18 Dec, 2015 1 commit
    • F
      netfilter: meta: add support for setting skb->pkttype · b4aae759
      Florian Westphal committed
      This allows redirecting bridged packets to the local machine:
      
      ether type ip ether daddr set aa:53:08:12:34:56 meta pkttype set unicast
      Without 'set unicast', the IP stack discards PACKET_OTHERHOST skbs.
      
      It is also useful for adding support for a '-m cluster'-like nft rule
      (where a switch floods packets to several nodes, and each cluster
      node processes a subset of packets for load distribution).
      
      Mangling is restricted to HOST/OTHER/BROAD/MULTICAST, i.e. you cannot set
      skb->pkt_type to PACKET_KERNEL or change PACKET_LOOPBACK to PACKET_HOST.
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      b4aae759
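The mangling restriction described in this commit message can be sketched in userspace C. This is a minimal illustration, not the kernel code: the `PKT_*` enum is a local stand-in mirroring the `PACKET_*` values from `<linux/if_packet.h>`, and `pkt_type_ok()` and `pkt_type_set()` are hypothetical helper names chosen for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-ins for the PACKET_* values in <linux/if_packet.h>. */
enum pkt_type {
	PKT_HOST      = 0,
	PKT_BROADCAST = 1,
	PKT_MULTICAST = 2,
	PKT_OTHERHOST = 3,
	PKT_OUTGOING  = 4,
	PKT_LOOPBACK  = 5,
	PKT_USER      = 6,
	PKT_KERNEL    = 7,
};

/* True only for the four types the commit message allows mangling between. */
static bool pkt_type_ok(unsigned int t)
{
	return t == PKT_HOST || t == PKT_BROADCAST ||
	       t == PKT_MULTICAST || t == PKT_OTHERHOST;
}

/*
 * Hypothetical helper: returns the packet type that results from a
 * 'meta pkttype set' attempt. The type changes only if both the current
 * and the requested type are in the allowed group.
 */
static unsigned int pkt_type_set(unsigned int current, unsigned int wanted)
{
	assert(wanted <= PKT_KERNEL); /* values outside the enum are a caller bug */

	if (pkt_type_ok(current) && pkt_type_ok(wanted))
		return wanted;
	return current;
}
```

Under these assumptions, `pkt_type_set(PKT_OTHERHOST, PKT_HOST)` yields `PKT_HOST` (the bridged-packet redirect case), while a request for `PKT_KERNEL`, or any request on a `PKT_LOOPBACK` packet, leaves the type unchanged.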
  3. 16 Dec, 2015 1 commit
  4. 15 Dec, 2015 3 commits
  5. 14 Dec, 2015 2 commits
    • P
      netfilter: cttimeout: add netns support · 19576c94
      Pablo Neira committed
      Add a per-netns list of timeout objects and adjust code to use it.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      19576c94
    • X
      netfilter: nf_tables: use reverse traversal commit_list in nf_tables_abort · a907e36d
      Xin Long committed
      When 'nft -f' is used to submit rules, it packs multiple rules into
      one netlink skb and sends it to the kernel, which processes them one
      by one. Meanwhile, the kernel adds a trans to commit_list to record
      every commit. If any of them returns -EAGAIN,
      status |= NFNL_BATCH_REPLAY is set, and after all processing is done,
      all the commits are rolled back.
      
      Currently the kernel uses list_add_tail to add a trans to commit_list
      and list_for_each_entry_safe to roll back, which means that rollback
      happens in the same order as insertion. Some cases then fail to work,
      and can even trigger a call trace, for example:
      
      1. add a set into table foo  [return -EAGAIN]:
         commit_list = 'add set trans'
      2. del foo:
         commit_list = 'add set trans' -> 'del set trans' -> 'del tab trans'
      nf_tables_abort is then called to roll back.
      First, 'add set trans' is processed:
                         case NFT_MSG_NEWSET:
                              trans->ctx.table->use--;
                              list_del_rcu(&nft_trans_set(trans)->list);
      
        This deletes the set from table foo, but the set was already removed
        when table foo was deleted [step 2], so the kernel panics.
      
      The correct rollback order is:
        'del tab trans' -> 'del set trans' -> 'add set trans',
      which is the reverse of the commit_list order.
      
      Fix this by rolling back commits in reverse order in nf_tables_abort.
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      a907e36d
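The ordering problem in this commit can be reproduced with a small userspace model. This is only a sketch under simplifying assumptions: `struct state`, `apply()`, `undo()` and `run_scenario()` are hypothetical names, and the linked lists of the kernel are reduced to boolean "linked" flags. It replays the three-step journal from the commit message and then undoes it in either the same or the reverse order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum op { ADD_SET, DEL_SET, DEL_TABLE };

struct state {
	bool table_linked; /* table foo is on the table list */
	bool set_linked;   /* the set is on the table's set list */
	int  use;          /* table use counter */
};

/* Apply one journal entry in forward direction. */
static void apply(struct state *s, enum op op)
{
	switch (op) {
	case ADD_SET:   s->set_linked = true;  s->use++; break;
	case DEL_SET:   s->set_linked = false; s->use--; break;
	case DEL_TABLE: s->table_linked = false;         break;
	}
}

/* Undo one entry; returns false if it would unlink an already unlinked set. */
static bool undo(struct state *s, enum op op)
{
	switch (op) {
	case ADD_SET:
		if (!s->set_linked)
			return false; /* models the double list_del_rcu() */
		s->set_linked = false;
		s->use--;
		return true;
	case DEL_SET:
		s->set_linked = true;
		s->use++;
		return true;
	case DEL_TABLE:
		s->table_linked = true;
		return true;
	}
	return false;
}

/* Replays the journal, rolls it back, and returns the number of bad undos. */
int run_scenario(bool reverse)
{
	const enum op journal[] = { ADD_SET, DEL_SET, DEL_TABLE };
	const size_t n = sizeof(journal) / sizeof(journal[0]);
	struct state s = { .table_linked = true, .set_linked = false, .use = 0 };
	int bad = 0;

	for (size_t i = 0; i < n; i++)
		apply(&s, journal[i]);
	/* After the batch, the set was added and deleted again with the table. */
	assert(!s.table_linked && !s.set_linked && s.use == 0);

	for (size_t i = 0; i < n; i++)
		if (!undo(&s, journal[reverse ? n - 1 - i : i]))
			bad++;
	return bad;
}
```

In this model, `run_scenario(false)` hits one invalid undo (the set is unlinked a second time), while `run_scenario(true)` rolls back cleanly and restores the initial state.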
  6. 11 Dec, 2015 1 commit
    • P
      netfilter: nfnetlink: fix splat due to incorrect socket memory accounting in skbuff clones · bd678e09
      Pablo Neira Ayuso committed
      If we attach the sk to the skb from nfnetlink_rcv_batch(), then
      netlink_skb_destructor() will underflow the socket receive memory
      counter and we get a warning splat when releasing the socket.
      
      $ cat /proc/net/netlink
      sk       Eth Pid    Groups   Rmem     Wmem     Dump     Locks     Drops     Inode
      ffff8800ca903000 12  0      00000000 -54144   0        0 2        0        17942
                                           ^^^^^^
      
      Rmem above shows an underflow.
      
      The warning splat follows:
      
      [ 1363.815976] WARNING: CPU: 2 PID: 1356 at net/netlink/af_netlink.c:958 netlink_sock_destruct+0x80/0xb9()
      [...]
      [ 1363.816152] CPU: 2 PID: 1356 Comm: kworker/u16:1 Tainted: G        W       4.4.0-rc1+ #153
      [ 1363.816155] Hardware name: LENOVO 23259H1/23259H1, BIOS G2ET32WW (1.12 ) 05/30/2012
      [ 1363.816160] Workqueue: netns cleanup_net
      [ 1363.816163]  0000000000000000 ffff880119203dd0 ffffffff81240204 0000000000000000
      [ 1363.816169]  ffff880119203e08 ffffffff8104db4b ffffffff813d49a1 ffff8800ca771000
      [ 1363.816174]  ffffffff81a42b00 0000000000000000 ffff8800c0afe1e0 ffff880119203e18
      [ 1363.816179] Call Trace:
      [ 1363.816181]  <IRQ>  [<ffffffff81240204>] dump_stack+0x4e/0x79
      [ 1363.816193]  [<ffffffff8104db4b>] warn_slowpath_common+0x9a/0xb3
      [ 1363.816197]  [<ffffffff813d49a1>] ? netlink_sock_destruct+0x80/0xb9
      
      skb->sk was only needed to look up the netns. We no longer need it
      since 633c9a84 ("netfilter: nfnetlink: avoid recurrent netns lookups
      in call_batch"), so this patch removes the manual socket assignment
      to resolve the problem.
      Reported-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
      Reported-by: Ben Hutchings <ben@decadent.org.uk>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Tested-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
      bd678e09
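The accounting underflow can be illustrated with a toy model. This is a hedged sketch, not the kernel's data structures: `struct sock`, `struct skb`, `skb_set_owner_charged()`, `skb_destruct()` and `rmem_after_batch()` are simplified, hypothetical stand-ins. The assumption is that the destructor uncharges whatever socket is attached, so attaching a socket to a clone without charging it drives the counter negative.

```c
#include <assert.h>
#include <stddef.h>

struct sock { long rmem_alloc; };              /* receive memory counter */
struct skb  { struct sock *sk; long truesize; };

/* Proper ownership: attach the socket and charge its receive memory. */
static void skb_set_owner_charged(struct skb *skb, struct sock *sk)
{
	skb->sk = sk;
	sk->rmem_alloc += skb->truesize;
}

/* Destructor: uncharges the attached socket, whether it was charged or not. */
static void skb_destruct(struct skb *skb)
{
	if (skb->sk != NULL)
		skb->sk->rmem_alloc -= skb->truesize;
	skb->sk = NULL;
}

/* Returns the counter after the original skb and one clone are both freed. */
long rmem_after_batch(int attach_sk_to_clone)
{
	struct sock sk = { 0 };
	struct skb orig  = { NULL, 512 };
	struct skb clone = { NULL, 512 };

	skb_set_owner_charged(&orig, &sk);
	assert(sk.rmem_alloc == 512);

	if (attach_sk_to_clone)
		clone.sk = &sk; /* manual assignment, no charge */

	skb_destruct(&clone);
	skb_destruct(&orig);
	return sk.rmem_alloc;
}
```

With the manual assignment the model ends at a negative counter, analogous to the negative Rmem shown in /proc/net/netlink above; without it the counter returns to zero.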
  7. 10 Dec, 2015 1 commit
  8. 09 Dec, 2015 6 commits
  9. 25 Nov, 2015 2 commits
    • P
      netfilter: nft_payload: add packet mangling support · 7ec3f7b4
      Patrick McHardy committed
      Add support for mangling packet payload. The checksum for the specified
      base header is updated automatically if requested. However, updates for
      pseudo headers of any kind are not supported, meaning that stateless NAT
      is not supported.
      
      For checksum updates, different checksumming methods can be specified. The
      currently supported methods are NONE for no checksum updates, and INET for
      internet type checksums.
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      7ec3f7b4
    • P
      netfilter: Set /proc/net entries owner to root in namespace · f13f2aee
      Philip Whineray committed
      Various files are owned by root with 0440 permission. Reading them is
      impossible in an unprivileged user namespace, interfering with firewall
      tools. For instance, iptables-save relies on /proc/net/ip_tables_names
      contents to dump only loaded tables.
      
      This patch assigns ownership of the following files to root in the
      current namespace:
      
      - /proc/net/*_tables_names
      - /proc/net/*_tables_matches
      - /proc/net/*_tables_targets
      - /proc/net/nf_conntrack
      - /proc/net/nf_conntrack_expect
      - /proc/net/netfilter/nfnetlink_log
      
      A mapping for root must be available, so this order should be followed:
      
      unshare(CLONE_NEWUSER);
      /* Setup the mapping */
      unshare(CLONE_NEWNET);
      Signed-off-by: Philip Whineray <phil@firehol.org>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      f13f2aee
  10. 23 Nov, 2015 1 commit
    • A
      netfilter: nfnetlink_queue: avoid harmless unnitialized variable warnings · 8e662164
      Arnd Bergmann committed
      Several ARM default configurations give us warnings on recent
      compilers about potentially uninitialized variables in the
      nfnetlink code in two functions:
      
      net/netfilter/nfnetlink_queue.c: In function 'nfqnl_build_packet_message':
      net/netfilter/nfnetlink_queue.c:519:19: warning: 'nfnl_ct' may be used uninitialized in this function [-Wmaybe-uninitialized]
        if (ct && nfnl_ct->build(skb, ct, ctinfo, NFQA_CT, NFQA_CT_INFO) < 0)
      
      Moving the rcu_dereference(nfnl_ct_hook) call outside of the
      conditional code avoids the warning without forcing us to
      preinitialize the variable.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Fixes: a4b4766c ("netfilter: nfnetlink_queue: rename related to nfqueue attaching conntrack info")
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      8e662164
  11. 16 Nov, 2015 1 commit
  12. 11 Nov, 2015 3 commits
    • P
      netfilter: nf_tables: add clone interface to expression operations · 086f3321
      Pablo Neira Ayuso committed
      With the conversion of the counter expression to percpu, we need to
      clone the percpu memory area; otherwise we crash when using counters
      from flow tables.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      086f3321
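Why an optional clone operation helps can be sketched as follows. This is a simplified userspace model, not the nf_tables API: `struct expr`, `struct expr_ops`, `expr_clone()`, `counter_clone()` and `NCPUS` are hypothetical, and the percpu area is modelled as a heap array with one slot per CPU. An expression that owns separately allocated memory needs a deep copy, while a plain byte copy leaves both expressions pointing at the same area.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NCPUS 4 /* hypothetical CPU count for this sketch */

struct expr;

struct expr_ops {
	/* Optional deep-copy hook; NULL means a plain byte copy is enough. */
	int (*clone)(struct expr *dst, const struct expr *src);
};

struct expr {
	const struct expr_ops *ops;
	unsigned long *percpu; /* separately allocated, one slot per CPU */
};

/* Deep copy: give the clone its own per-CPU area. */
static int counter_clone(struct expr *dst, const struct expr *src)
{
	dst->percpu = malloc(NCPUS * sizeof(*dst->percpu));
	if (dst->percpu == NULL)
		return -1;
	memcpy(dst->percpu, src->percpu, NCPUS * sizeof(*dst->percpu));
	return 0;
}

const struct expr_ops counter_ops = { .clone = counter_clone };
const struct expr_ops shallow_ops = { .clone = NULL };

/* Use the clone hook when the expression provides one, else copy bytes. */
static int expr_clone(struct expr *dst, const struct expr *src)
{
	assert(src->ops != NULL);

	if (src->ops->clone != NULL) {
		dst->ops = src->ops;
		return src->ops->clone(dst, src);
	}
	memcpy(dst, src, sizeof(*dst)); /* shallow: pointers are shared */
	return 0;
}

/* Returns 1 if bumping the clone's counter leaves the original untouched. */
int clone_is_independent(const struct expr_ops *ops)
{
	unsigned long area[NCPUS] = { 0 };
	struct expr src = { .ops = ops, .percpu = area };
	struct expr dst;
	int independent;

	if (expr_clone(&dst, &src) < 0)
		return -1;
	dst.percpu[0]++;
	independent = (src.percpu[0] == 0);
	if (dst.percpu != src.percpu)
		free(dst.percpu);
	return independent;
}
```

In this model the clone made through `counter_ops` is independent of the original, whereas the shallow copy made through `shallow_ops` shares the counter area, which is the kind of aliasing a clone hook is meant to avoid.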
    • A
      netfilter: fix xt_TEE and xt_TPROXY dependencies · 74ec4d55
      Arnd Bergmann committed
      Kconfig is too smart for its own good: a Kconfig line that states
      
      	select NF_DEFRAG_IPV6 if IP6_NF_IPTABLES
      
      means that if IP6_NF_IPTABLES is set to 'm', then NF_DEFRAG_IPV6 will
      also be set to 'm', regardless of the state of the symbol from which
      it is selected. When the xt_TEE driver is built-in and nothing else
      forces NF_DEFRAG_IPV6 to be built-in, this causes a link-time error:
      
      net/built-in.o: In function `tee_tg6':
      net/netfilter/xt_TEE.c:46: undefined reference to `nf_dup_ipv6'
      
      This works around that behavior by changing the dependency to
      'if IP6_NF_IPTABLES != n', which is interpreted as boolean expression
      rather than a tristate and causes the NF_DEFRAG_IPV6 symbol to
      be built-in as well.
      
      The bug only occurs once in thousands of 'randconfig' builds and
      does not really impact real users. From inspecting the other
      surrounding Kconfig symbols, I am guessing that NETFILTER_XT_TARGET_TPROXY
      and NETFILTER_XT_MATCH_SOCKET have the same issue. If not, this
      change should still be harmless.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      74ec4d55
    • A
      netfilter: nfnetlink_log: work around uninitialized variable warning · c872a2d9
      Arnd Bergmann committed
      After a recent (correct) change, gcc started warning about the use
      of the 'flags' variable in nfulnl_recv_config():
      
      net/netfilter/nfnetlink_log.c: In function 'nfulnl_recv_config':
      net/netfilter/nfnetlink_log.c:320:14: warning: 'flags' may be used uninitialized in this function [-Wmaybe-uninitialized]
      net/netfilter/nfnetlink_log.c:828:6: note: 'flags' was declared here
      
      The warning first shows up in ARM s3c2410_defconfig with gcc-4.3 or
      higher (including 5.2.1, which is the latest version I checked). I
      tried working around it by rearranging the code but had no success
      with that.
      
      As a last resort, this initializes the variable to zero, which shuts
      up the warning, but means that we don't get a warning if the code
      is ever changed in a way that actually causes the variable to be
      used without first being written.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Fixes: 8cbc8708 ("netfilter: nfnetlink_log: validate dependencies to avoid breaking atomicity")
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      c872a2d9
  13. 09 Nov, 2015 2 commits
  14. 07 Nov, 2015 3 commits
  15. 28 Oct, 2015 1 commit
  16. 27 Oct, 2015 1 commit
    • M
      netfilter: nf_nat_redirect: add missing NULL pointer check · 94f9cd81
      Munehisa Kamata committed
      Commit 8b13eddf ("netfilter: refactor NAT redirect IPv4 to use it
      from nf_tables") introduced a trivial logic change that can result
      in the following crash.
      
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000030
      IP: [<ffffffffa033002d>] nf_nat_redirect_ipv4+0x2d/0xa0 [nf_nat_redirect]
      PGD 3ba662067 PUD 3ba661067 PMD 0
      Oops: 0000 [#1] SMP
      Modules linked in: ipv6(E) xt_REDIRECT(E) nf_nat_redirect(E) xt_tcpudp(E) iptable_nat(E) nf_conntrack_ipv4(E) nf_defrag_ipv4(E) nf_nat_ipv4(E) nf_nat(E) nf_conntrack(E) ip_tables(E) x_tables(E) binfmt_misc(E) xfs(E) libcrc32c(E) evbug(E) evdev(E) psmouse(E) i2c_piix4(E) i2c_core(E) acpi_cpufreq(E) button(E) ext4(E) crc16(E) jbd2(E) mbcache(E) dm_mirror(E) dm_region_hash(E) dm_log(E) dm_mod(E)
      CPU: 0 PID: 2536 Comm: ip Tainted: G            E   4.1.7-15.23.amzn1.x86_64 #1
      Hardware name: Xen HVM domU, BIOS 4.2.amazon 05/06/2015
      task: ffff8800eb438000 ti: ffff8803ba664000 task.ti: ffff8803ba664000
      [...]
      Call Trace:
       <IRQ>
       [<ffffffffa0334065>] redirect_tg4+0x15/0x20 [xt_REDIRECT]
       [<ffffffffa02e2e99>] ipt_do_table+0x2b9/0x5e1 [ip_tables]
       [<ffffffffa0328045>] iptable_nat_do_chain+0x25/0x30 [iptable_nat]
       [<ffffffffa031777d>] nf_nat_ipv4_fn+0x13d/0x1f0 [nf_nat_ipv4]
       [<ffffffffa0328020>] ? iptable_nat_ipv4_fn+0x20/0x20 [iptable_nat]
       [<ffffffffa031785e>] nf_nat_ipv4_in+0x2e/0x90 [nf_nat_ipv4]
       [<ffffffffa03280a5>] iptable_nat_ipv4_in+0x15/0x20 [iptable_nat]
       [<ffffffff81449137>] nf_iterate+0x57/0x80
       [<ffffffff814491f7>] nf_hook_slow+0x97/0x100
       [<ffffffff814504d4>] ip_rcv+0x314/0x400
      
      unsigned int
      nf_nat_redirect_ipv4(struct sk_buff *skb,
      ...
      {
      ...
      		rcu_read_lock();
      		indev = __in_dev_get_rcu(skb->dev);
      		if (indev != NULL) {
      			ifa = indev->ifa_list;
      			newdst = ifa->ifa_local; <---
      		}
      		rcu_read_unlock();
      ...
      }
      
      Before the commit, 'ifa' was always checked before access. After the
      commit, however, it can be accessed even if it is NULL. Interestingly,
      this was once fixed in 2003.
      
      http://marc.info/?l=netfilter-devel&m=106668497403047&w=2
      
      In addition to the original case, we have seen the crash when packets that
      need to be redirected somehow arrive on an interface which hasn't yet
      been fully configured.
      
      This change just reverts the logic to the old behavior to avoid the crash.
      
      Fixes: 8b13eddf ("netfilter: refactor NAT redirect IPv4 to use it from nf_tables")
      Signed-off-by: Munehisa Kamata <kamatam@amazon.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      94f9cd81
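The guarded lookup that this commit restores can be sketched in userspace C. The structs below are reduced, hypothetical stand-ins that reuse the field names from the excerpt above (`ifa_list`, `ifa_local`); `pick_redirect_addr()` and the self-check function are names invented for this example, and treating 0 as "no usable address" is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reduced stand-ins for the kernel structures named in the excerpt. */
struct in_ifaddr { uint32_t ifa_local; };
struct in_device { struct in_ifaddr *ifa_list; };

/*
 * Returns the first local address of the device, or 0 when either the
 * device or its address list is missing. Both pointers are checked
 * before 'ifa' is dereferenced.
 */
static uint32_t pick_redirect_addr(const struct in_device *indev)
{
	uint32_t newdst = 0;

	if (indev != NULL && indev->ifa_list != NULL)
		newdst = indev->ifa_list->ifa_local;
	return newdst;
}

/* Small self-check covering the interface-without-address case. */
void pick_redirect_addr_selfcheck(void)
{
	struct in_device unconfigured = { NULL };

	assert(pick_redirect_addr(&unconfigured) == 0);
}
```

A caller in this sketch would treat a return value of 0 as "nothing to redirect to" and handle the packet accordingly, instead of dereferencing a NULL address entry on an interface that is not yet configured.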
  17. 22 Oct, 2015 1 commit
  18. 17 Oct, 2015 4 commits
  19. 15 Oct, 2015 2 commits
    • P
      netfilter: nfnetlink_log: validate dependencies to avoid breaking atomicity · 8cbc8708
      Pablo Neira committed
      Check that dependencies are fulfilled before updating the logger
      instance; otherwise we can leave things in an intermediate state on
      errors in nfulnl_recv_config().
      
      [ Ken-ichirou reports that this is also fixing missing instance refcnt drop
        on error introduced in his patch 914eebf2 ("netfilter: nfnetlink_log:
        autoload nf_conntrack_netlink module NFQA_CFG_F_CONNTRACK config flag"). ]
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Tested-by: Ken-ichirou MATSUZAWA <chamaken@gmail.com>
      8cbc8708
    • P
      netfilter: nfnetlink_log: consolidate check for instance in nfulnl_recv_config() · 336a3b3e
      Pablo Neira Ayuso committed
      This patch consolidates the check for a valid logger instance once we
      have passed the command handling.
      
      The config message that we receive may contain the following info:
      
      1) Command only: We always get a valid instance pointer if we just
         created it. If the instance is being destroyed or the command is
         unknown, we jump to the exit path of nfulnl_recv_config().
         This patch doesn't modify this handling.
      
      2) Config only: In this case, the instance must always exist since the
         user is asking for configuration updates. If the instance doesn't exist
         this returns -ENODEV.
      
      3) No command and no configs are specified: This case is rare. The
         user is sending us a config message with neither commands nor
         config options. In this case, we have to check if the instance exists
         and bail out otherwise. Before this patch, it was possible to send a
         config message with no command and no config updates for a
         nonexistent instance without triggering an error. So this is the only
         case that changes.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Tested-by: Ken-ichirou MATSUZAWA <chamaken@gmail.com>
      336a3b3e
  20. 13 Oct, 2015 1 commit
    • F
      netfilter: sync with packet rx also after removing queue entries · 514ed62e
      Florian Westphal committed
      We need to sync packet rx again after flushing the queue entries.
      Otherwise, the following race could happen:
      
      cpu1: nf_unregister_hook(H) called, H unlinked from lists, calls
      synchronize_net() to wait for packet rx completion.
      
      Problem is that while no new nf_queue_entry structs that use H can be
      allocated, another CPU might receive a verdict from userspace just before
      cpu1 calls nf_queue_nf_hook_drop to remove this entry:
      
      cpu2: receive verdict from userspace, lock queue
      cpu2: unlink nf_queue_entry struct E, which references H, from queue list
      cpu1: calls nf_queue_nf_hook_drop, blocks on queue spinlock
      cpu2: unlock queue
      cpu1: nf_queue_nf_hook_drop drops affected queue entries
      cpu2: call nf_reinject for E
      cpu1: kfree(H)
      cpu2: potential use-after-free for H
      
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Fixes: 085db2c0 ("netfilter: Per network namespace netfilter hooks.")
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      514ed62e