1. 20 May, 2015 1 commit
    • D
      netfilter: ensure number of counters is >0 in do_replace() · 1086bbe9
      Dave Jones committed
      After improving setsockopt() coverage in trinity, I started triggering
      vmalloc failures pretty reliably from this code path:
      
      warn_alloc_failed+0xe9/0x140
      __vmalloc_node_range+0x1be/0x270
      vzalloc+0x4b/0x50
      __do_replace+0x52/0x260 [ip_tables]
      do_ipt_set_ctl+0x15d/0x1d0 [ip_tables]
      nf_setsockopt+0x65/0x90
      ip_setsockopt+0x61/0xa0
      raw_setsockopt+0x16/0x60
      sock_common_setsockopt+0x14/0x20
      SyS_setsockopt+0x71/0xd0
      
      It turns out we don't validate that the num_counters field in the
      struct we pass in from userspace is initialized.
      
      The same problem also exists in ebtables, arptables, ipv6, and the
      compat variants.
      Signed-off-by: Dave Jones <davej@codemonkey.org.uk>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
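The missing validation described in this commit can be sketched in plain userspace C. This is a minimal illustration under stated assumptions, not the kernel patch itself: `struct replace_req`, `struct counter_pair` and `check_num_counters()` are hypothetical stand-ins for the structure copied in from userspace, and the overflow guard is an assumption about what a complete check would include.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a per-rule packet/byte counter pair. */
struct counter_pair {
    uint64_t pcnt;
    uint64_t bcnt;
};

/* Hypothetical stand-in for the replace request copied from userspace. */
struct replace_req {
    unsigned int num_counters;
};

/*
 * Validate num_counters before it is used to size an allocation.
 * Returns 0 if the value is usable, or a negative errno otherwise.
 */
static int check_num_counters(const struct replace_req *req)
{
    assert(req != NULL);

    /* Guard against overflow when multiplying by the element size. */
    if (req->num_counters >= INT_MAX / sizeof(struct counter_pair))
        return -ENOMEM;

    /* A zero count would turn into a zero-sized allocation request. */
    if (req->num_counters == 0)
        return -EINVAL;

    return 0;
}
```

A caller would run this check right after copying the request from userspace and before any allocation sized by `num_counters`.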
  2. 05 Apr, 2015 1 commit
  3. 19 Mar, 2015 1 commit
    • P
      netfilter: restore rule tracing via nfnetlink_log · 4017a7ee
      Pablo Neira Ayuso committed
      Since fab4085f ("netfilter: log: nf_log_packet() as real unified
      interface"), the loginfo structure that is passed to nf_log_packet() is
      used to explicitly indicate the logger type you want to use.
      
      This is a problem for people tracing rules through nfnetlink_log since
      packets are always routed to the NF_LOG_TYPE logger after the
      aforementioned patch.
      
      We can fix this by removing the trace loginfo structures, but that still
      changes the log level from 4 to 5 for tracing messages and there may be
      someone relying on this out there. So let's just introduce a new
      nf_log_trace() function that restores the former behaviour.
      Reported-by: Markus Kötter <koetter@rrzn.uni-hannover.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
  4. 05 Apr, 2014 1 commit
    • T
      netfilter: Can't fail and free after table replacement · c58dd2dd
      Thomas Graf committed
      All xtables variants suffer from the defect that the copy_to_user()
      to copy the counters to user memory may fail after the table has
      already been exchanged and thus exposed. Returning an error at this
      point will result in freeing the already exposed table. Any
      subsequent packet processing will result in a kernel panic.
      
      We can't copy the counters before exposing the new tables as we
      want to provide the counter state after the old table has been
      unhooked. Therefore convert this into a silent error.
      
      Cc: Florian Westphal <fw@strlen.de>
      Signed-off-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
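The "silent error" idea from this commit can be illustrated with a small userspace sketch. Everything here is hypothetical scaffolding rather than kernel code: `struct table`, `struct table_info`, `replace_table()` and the `copy_out_fn` callback (standing in for `copy_to_user()`) are assumptions made for illustration. The point is only the ordering: once the new table is exposed, a failed counter copy must not lead to freeing it or to reporting a failure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, heavily simplified table state. */
struct table_info {
    unsigned long counters[4];
};

struct table {
    struct table_info *private;
};

/* Hypothetical stand-in for copy_to_user(): 0 on success, nonzero on failure. */
typedef int (*copy_out_fn)(void *dst, const void *src, size_t len);

static int copy_ok(void *dst, const void *src, size_t len)
{
    memcpy(dst, src, len);
    return 0;
}

static int copy_fail(void *dst, const void *src, size_t len)
{
    (void)dst; (void)src; (void)len;
    return 1;
}

/*
 * Swap in newinfo, then try to hand the old counters back to the caller.
 * Assumes t->private holds a heap-allocated old table.
 */
static int replace_table(struct table *t, struct table_info *newinfo,
                         unsigned long *out, copy_out_fn copy_out)
{
    struct table_info *oldinfo;

    assert(t != NULL && newinfo != NULL && copy_out != NULL);

    oldinfo = t->private;
    t->private = newinfo;            /* the new table is now exposed */

    /* Best effort only: a failure here can no longer undo the swap. */
    if (copy_out(out, oldinfo->counters, sizeof(oldinfo->counters)) != 0) {
        /* Silent error: never free newinfo, never report failure. */
    }

    free(oldinfo);                   /* only the unhooked table is freed */
    return 0;
}
```

In this sketch the caller learns nothing about the failed copy, which mirrors the trade-off the commit message describes: a lost counter snapshot is preferable to freeing a table that packet processing may already be using.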
  5. 22 Oct, 2013 1 commit
    • W
      netfilter: x_tables: fix ordering of jumpstack allocation and table update · b416c144
      Will Deacon committed
      During kernel stability testing on an SMP ARMv7 system, Yalin Wang
      reported the following panic from the netfilter code:
      
        1fe0: 0000001c 5e2d3b10 4007e779 4009e110 60000010 00000032 ff565656 ff545454
        [<c06c48dc>] (ipt_do_table+0x448/0x584) from [<c0655ef0>] (nf_iterate+0x48/0x7c)
        [<c0655ef0>] (nf_iterate+0x48/0x7c) from [<c0655f7c>] (nf_hook_slow+0x58/0x104)
        [<c0655f7c>] (nf_hook_slow+0x58/0x104) from [<c0683bbc>] (ip_local_deliver+0x88/0xa8)
        [<c0683bbc>] (ip_local_deliver+0x88/0xa8) from [<c0683718>] (ip_rcv_finish+0x418/0x43c)
        [<c0683718>] (ip_rcv_finish+0x418/0x43c) from [<c062b1c4>] (__netif_receive_skb+0x4cc/0x598)
        [<c062b1c4>] (__netif_receive_skb+0x4cc/0x598) from [<c062b314>] (process_backlog+0x84/0x158)
        [<c062b314>] (process_backlog+0x84/0x158) from [<c062de84>] (net_rx_action+0x70/0x1dc)
        [<c062de84>] (net_rx_action+0x70/0x1dc) from [<c0088230>] (__do_softirq+0x11c/0x27c)
        [<c0088230>] (__do_softirq+0x11c/0x27c) from [<c008857c>] (do_softirq+0x44/0x50)
        [<c008857c>] (do_softirq+0x44/0x50) from [<c0088614>] (local_bh_enable_ip+0x8c/0xd0)
        [<c0088614>] (local_bh_enable_ip+0x8c/0xd0) from [<c06b0330>] (inet_stream_connect+0x164/0x298)
        [<c06b0330>] (inet_stream_connect+0x164/0x298) from [<c061d68c>] (sys_connect+0x88/0xc8)
        [<c061d68c>] (sys_connect+0x88/0xc8) from [<c000e340>] (ret_fast_syscall+0x0/0x30)
        Code: 2a000021 e59d2028 e59de01c e59f011c (e7824103)
        ---[ end trace da227214a82491bd ]---
        Kernel panic - not syncing: Fatal exception in interrupt
      
      This comes about because CPU1 is executing xt_replace_table in response
      to a setsockopt syscall, resulting in:
      
      	ret = xt_jumpstack_alloc(newinfo);
      		--> newinfo->jumpstack = kzalloc(size, GFP_KERNEL);
      
      	[...]
      
      	table->private = newinfo;
      	newinfo->initial_entries = private->initial_entries;
      
      Meanwhile, CPU0 is handling the network receive path and ends up in
      ipt_do_table, resulting in:
      
      	private = table->private;
      
      	[...]
      
      	jumpstack  = (struct ipt_entry **)private->jumpstack[cpu];
      
      On weakly ordered memory architectures, the writes to table->private
      and newinfo->jumpstack from CPU1 can be observed out of order by CPU0.
      Furthermore, on architectures which don't respect ordering of address
      dependencies (i.e. Alpha), the reads from CPU0 can also be re-ordered.
      
      This patch adds an smp_wmb() before the assignment to table->private
      (which is essentially publishing newinfo) to ensure that all writes to
      newinfo will be observed before plugging it into the table structure.
      A dependent-read barrier is also added on the consumer sides, to ensure
      the same ordering requirements are also respected there.
      
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reported-by: Wang, Yalin <Yalin.Wang@sonymobile.com>
      Tested-by: Wang, Yalin <Yalin.Wang@sonymobile.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
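The publish/consume ordering this commit enforces has a close userspace analogue in C11 atomics. The sketch below is an illustration under stated assumptions, not the kernel change: the kernel patch uses `smp_wmb()` and a dependent-read barrier, whereas this example substitutes a release store and an acquire load, and `publish_table_info()` and `read_jumpstack()` are hypothetical helper names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical, simplified versions of the structures named in the commit. */
struct table_info {
    void **jumpstack;
    unsigned int stacksize;
};

struct table {
    _Atomic(struct table_info *) private;
};

/*
 * Writer side: fully initialise newinfo, then publish it.
 * The release store orders the initialising writes before the pointer
 * becomes visible, which is the role smp_wmb() plays in the commit.
 */
static void publish_table_info(struct table *t, struct table_info *newinfo,
                               void **jumpstack, unsigned int stacksize)
{
    assert(t != NULL && newinfo != NULL);

    newinfo->jumpstack = jumpstack;
    newinfo->stacksize = stacksize;
    atomic_store_explicit(&t->private, newinfo, memory_order_release);
}

/*
 * Reader side: the acquire load pairs with the release store, so a reader
 * that sees the new pointer also sees the initialised jumpstack field.
 */
static void **read_jumpstack(struct table *t)
{
    struct table_info *info;

    assert(t != NULL);

    info = atomic_load_explicit(&t->private, memory_order_acquire);
    return info ? info->jumpstack : NULL;
}
```

An acquire load is stronger than the dependency ordering the kernel relies on; it is used here because it is the portable C11 way to express the same guarantee.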
  6. 19 Apr, 2013 1 commit
    • P
      netfilter: add my copyright statements · f229f6ce
      Patrick McHardy committed
      Add copyright statements to all netfilter files which have had significant
      changes done by myself in the past.
      
      Some notes:
      
      - nf_conntrack_ecache.c was incorrectly attributed to Rusty and Netfilter
        Core Team when it got split out of nf_conntrack_core.c. The copyrights
        even state a date which lies six years before it was written. It was
        written in 2005 by Harald and myself.
      
      - net/ipv{4,6}/netfilter.c, net/netfilter/nf_queue.c were missing copyright
        statements. I've added the copyright statement from net/netfilter/core.c,
        where this code originated.
      
      - for nf_conntrack_proto_tcp.c I've also added Jozsef, since I didn't want
        it to give the wrong impression
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
  7. 06 Apr, 2013 1 commit
    • G
      netfilter: nf_log: prepare net namespace support for loggers · 30e0c6a6
      Gao feng committed
      This patch adds netns support to nf_log and it prepares netns
      support for existing loggers. It is composed of four major
      changes.
      
      1) nf_log_register has been split into two functions: nf_log_register
         and nf_log_set. The new nf_log_register is used to globally
         register the nf_logger and nf_log_set is used for enabling
         pernet support from nf_loggers.
      
         Per netns is not yet complete after this patch, it comes in
         separate follow up patches.
      
      2) Add net as a parameter of nf_log_bind_pf. Per netns is not
         yet complete after this patch, it only allows to bind the
         nf_logger to the protocol family from init_net and it skips
         other cases.
      
      3) Adapt all nf_log_packet callers to pass netns as parameter.
         After this patch, this function only works for init_net.
      
      4) Make the sysctl net/netfilter/nf_log pernet.
      Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
  8. 02 Apr, 2013 1 commit
  9. 23 Jan, 2013 1 commit
  10. 19 Nov, 2012 1 commit
    • E
      net: Allow userns root to control ipv4 · 52e804c6
      Eric W. Biederman committed
      Allow an unprivileged user who has created a user namespace, and then
      created a network namespace, to effectively use the new network
      namespace, by reducing capable(CAP_NET_ADMIN) and
      capable(CAP_NET_RAW) calls to be ns_capable(net->user_ns,
      CAP_NET_ADMIN), or ns_capable(net->user_ns, CAP_NET_RAW) calls.
      
      Settings that merely control a single network device are allowed.
      Either the network device is a logical network device where
      restrictions make no difference or the network device is a hardware NIC
      that has been explicitly moved from the initial network namespace.
      
      In general policy and network stack state changes are allowed
      while resource control is left unchanged.
      
      Allow creating raw sockets.
      Allow the SIOCSARP ioctl to control the arp cache.
      Allow the SIOCSIFFLAGS ioctl to allow setting network device flags.
      Allow the SIOCSIFADDR ioctl to allow setting a netdevice ipv4 address.
      Allow the SIOCSIFBRDADDR ioctl to allow setting a netdevice ipv4 broadcast address.
      Allow the SIOCSIFDSTADDR ioctl to allow setting a netdevice ipv4 destination address.
      Allow the SIOCSIFNETMASK ioctl to allow setting a netdevice ipv4 netmask.
      Allow the SIOCADDRT and SIOCDELRT ioctls to allow adding and deleting ipv4 routes.
      
      Allow the SIOCADDTUNNEL, SIOCCHGTUNNEL and SIOCDELTUNNEL ioctls for
      adding, changing and deleting gre tunnels.
      
      Allow the SIOCADDTUNNEL, SIOCCHGTUNNEL and SIOCDELTUNNEL ioctls for
      adding, changing and deleting ipip tunnels.
      
      Allow the SIOCADDTUNNEL, SIOCCHGTUNNEL and SIOCDELTUNNEL ioctls for
      adding, changing and deleting ipsec virtual tunnel interfaces.
      
      Allow setting the MRT_INIT, MRT_DONE, MRT_ADD_VIF, MRT_DEL_VIF, MRT_ADD_MFC,
      MRT_DEL_MFC, MRT_ASSERT, MRT_PIM, MRT_TABLE socket options on multicast routing
      sockets.
      
      Allow setting and receiving IPOPT_CIPSO, IP_OPT_SEC, IP_OPT_SID and
      arbitrary ip options.
      
      Allow setting IP_SEC_POLICY/IP_XFRM_POLICY ipv4 socket option.
      Allow setting the IP_TRANSPARENT ipv4 socket option.
      Allow setting the TCP_REPAIR socket option.
      Allow setting the TCP_CONGESTION socket option.
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  11. 16 May, 2012 1 commit
  12. 16 Apr, 2012 1 commit
  13. 16 Jun, 2011 1 commit
  14. 04 Apr, 2011 1 commit
    • E
      netfilter: get rid of atomic ops in fast path · 7f5c6d4f
      Eric Dumazet committed
      We currently use a percpu spinlock to 'protect' rule bytes/packets
      counters, after various attempts to use RCU instead.
      
      Lately we added a seqlock so that get_counters() can run without
      blocking BH or 'writers'. But we really only need the seqcount in it.
      
      Spinlock itself is only locked by the current/owner cpu, so we can
      remove it completely.
      
      This cleans up the API, using correct 'writer' vs 'reader' semantics.
      
      At replace time, the get_counters() call makes sure all cpus are done
      using the old table.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Jan Engelhardt <jengelh@medozas.de>
      Signed-off-by: Patrick McHardy <kaber@trash.net>
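The seqcount-only scheme described in this commit can be approximated in userspace with C11 atomics. This is a sketch under stated assumptions, not the kernel implementation: it models a single counter slot with exactly one writer (the owning CPU in the commit's terms), and `struct seq_counters`, `counters_add()` and `counters_read()` are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-CPU counter slot guarded only by a sequence count. */
struct seq_counters {
    atomic_uint seq;             /* odd while an update is in progress */
    _Atomic uint64_t packets;
    _Atomic uint64_t bytes;
};

/*
 * Writer: assumed to run only on the owning CPU, so no lock is taken.
 * The sequence count is bumped before and after the update.
 */
static void counters_add(struct seq_counters *c, uint64_t nbytes)
{
    unsigned int s;
    uint64_t p, b;

    assert(c != NULL);

    s = atomic_load_explicit(&c->seq, memory_order_relaxed);
    atomic_store_explicit(&c->seq, s + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);

    p = atomic_load_explicit(&c->packets, memory_order_relaxed);
    b = atomic_load_explicit(&c->bytes, memory_order_relaxed);
    atomic_store_explicit(&c->packets, p + 1, memory_order_relaxed);
    atomic_store_explicit(&c->bytes, b + nbytes, memory_order_relaxed);

    atomic_store_explicit(&c->seq, s + 2, memory_order_release);
}

/*
 * Reader: never blocks the writer. It retries until it observes the same
 * even sequence value before and after reading the counters.
 */
static void counters_read(struct seq_counters *c,
                          uint64_t *packets, uint64_t *bytes)
{
    unsigned int s1, s2;

    assert(c != NULL && packets != NULL && bytes != NULL);

    do {
        s1 = atomic_load_explicit(&c->seq, memory_order_acquire);
        *packets = atomic_load_explicit(&c->packets, memory_order_relaxed);
        *bytes = atomic_load_explicit(&c->bytes, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
        s2 = atomic_load_explicit(&c->seq, memory_order_relaxed);
    } while ((s1 & 1u) || s1 != s2);
}
```

Because only one writer ever touches a slot, the writer path needs no atomic read-modify-write operations, which is the property the commit title refers to.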
  15. 31 Mar, 2011 1 commit
  16. 20 Mar, 2011 1 commit
    • E
      netfilter: xtables: fix reentrancy · db856674
      Eric Dumazet committed
      commit f3c5c1bf (make ip_tables reentrant) introduced a race in
      handling the stackptr restore, at the end of ipt_do_table()
      
      We should do it before the call to xt_info_rdunlock_bh(), or we allow
      cpu preemption and another cpu overwrites stackptr of original one.
      
      A second fix is to change the underflow test to check the origptr value
      instead of 0 to detect underflow, or else we allow a jump from different
      hooks.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Jan Engelhardt <jengelh@medozas.de>
      Signed-off-by: Patrick McHardy <kaber@trash.net>
  17. 15 Mar, 2011 1 commit
    • V
      netfilter: ip_tables: fix infoleak to userspace · 78b79876
      Vasiliy Kulikov committed
      Structures ipt_replace, compat_ipt_replace, and xt_get_revision are
      copied from userspace.  Fields of these structs that are
      zero-terminated strings are not checked.  When they are used as argument
      to a format string containing "%s" in request_module(), some sensitive
      information is leaked to userspace via argument of spawned modprobe
      process.
      
      The first and the third bugs were introduced before the git epoch; the
      second was introduced in 2722971c (v2.6.17-rc1).  To trigger the bug
      one should have CAP_NET_ADMIN.
      Signed-off-by: Vasiliy Kulikov <segoon@openwall.com>
      Signed-off-by: Patrick McHardy <kaber@trash.net>
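One common way to close this kind of leak is to force NUL termination of every string field right after it is copied from userspace, before it reaches any "%s" format. The sketch below illustrates that idea in userspace C; `struct name_req`, `NAME_MAXLEN` and `terminate_name()` are hypothetical, and the commit message itself does not spell out the exact form of the fix.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NAME_MAXLEN 32   /* hypothetical size of the fixed name field */

/* Hypothetical stand-in for a structure copied verbatim from userspace. */
struct name_req {
    char name[NAME_MAXLEN];
};

/*
 * Force NUL termination so that later "%s" formatting cannot read past
 * the end of the field into adjacent memory.
 */
static void terminate_name(struct name_req *req)
{
    assert(req != NULL);

    req->name[sizeof(req->name) - 1] = '\0';
}
```

A name that fills the whole field is truncated by one byte, which is harmless for lookup purposes and bounds every later string operation on the field.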
  18. 13 Jan, 2011 1 commit
    • E
      netfilter: x_table: speedup compat operations · 255d0dc3
      Eric Dumazet committed
      One iptables invocation with 135000 rules takes 35 seconds of cpu time
      on a recent server, using a 32bit distro and a 64bit kernel.
      
      We eventually trigger NMI/RCU watchdog.
      
      INFO: rcu_sched_state detected stall on CPU 3 (t=6000 jiffies)
      
      COMPAT mode has quadratic behavior and consumes 16 bytes of memory per
      rule.
      
      Switch the xt_compat algos to use an array instead of list, and use a
      binary search to locate an offset in the sorted array.
      
      This halves the memory needed (8 bytes per rule), and removes quadratic
      behavior [ O(N*N) -> O(N*log2(N)) ]
      
      Time of iptables goes from 35 s to 150 ms.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
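The array-plus-binary-search approach described in this commit can be sketched as follows. This is an illustrative userspace version, not the kernel code: `struct offset_delta` and `lookup_delta()` are hypothetical names, and the sketch assumes each entry stores a cumulative delta and that the array is sorted by ascending offset.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical 8-byte entry: a rule offset and the cumulative size delta. */
struct offset_delta {
    unsigned int offset;
    int delta;
};

/*
 * Return the cumulative delta contributed by all entries whose offset is
 * strictly below the given offset, or 0 if there are none.
 * tab must be sorted by ascending offset. Runs in O(log n).
 */
static int lookup_delta(const struct offset_delta *tab, size_t n,
                        unsigned int offset)
{
    size_t lo = 0, hi = n;               /* search window is [lo, hi) */

    assert(tab != NULL || n == 0);

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (tab[mid].offset < offset)
            lo = mid + 1;
        else
            hi = mid;
    }

    /* lo is now the number of entries with offset below the target. */
    return lo ? tab[lo - 1].delta : 0;
}
```

With one lookup per rule, the total cost is O(N*log2(N)) instead of the O(N*N) of walking a list for every rule, which matches the complexity change quoted in the commit message.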
  19. 11 Jan, 2011 1 commit
  20. 03 Nov, 2010 1 commit
  21. 14 Oct, 2010 3 commits
  22. 24 Aug, 2010 1 commit
  23. 18 Aug, 2010 1 commit
  24. 02 Aug, 2010 1 commit
  25. 23 Jul, 2010 1 commit
  26. 04 Jun, 2010 1 commit
  27. 31 May, 2010 1 commit
  28. 13 May, 2010 2 commits
  29. 12 May, 2010 5 commits
  30. 02 May, 2010 2 commits
  31. 22 Apr, 2010 1 commit
  32. 19 Apr, 2010 1 commit