1. 11 10月, 2016 1 次提交
  2. 01 10月, 2016 2 次提交
  3. 25 9月, 2016 3 次提交
  4. 19 9月, 2016 1 次提交
  5. 17 10月, 2015 1 次提交
  6. 13 10月, 2015 1 次提交
    • F
      netfilter: sync with packet rx also after removing queue entries · 514ed62e
      Florian Westphal 提交于
      We need to sync packet rx again after flushing the queue entries.
      Otherwise, the following race could happen:
      
      cpu1: nf_unregister_hook(H) called, H unliked from lists, calls
      synchronize_net() to wait for packet rx completion.
      
      Problem is that while no new nf_queue_entry structs that use H can be
      allocated, another CPU might receive a verdict from userspace just before
      cpu1 calls nf_queue_nf_hook_drop to remove this entry:
      
      cpu2: receive verdict from userspace, lock queue
      cpu2: unlink nf_queue_entry struct E, which references H, from queue list
      cpu1: calls nf_queue_nf_hook_drop, blocks on queue spinlock
      cpu2: unlock queue
      cpu1: nf_queue_nf_hook_drop drops affected queue entries
      cpu2: call nf_reinject for E
      cpu1: kfree(H)
      cpu2: potential use-after-free for H
      
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Fixes: 085db2c0 ("netfilter: Per network namespace netfilter hooks.")
      Signed-off-by: NFlorian Westphal <fw@strlen.de>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      514ed62e
  7. 05 10月, 2015 2 次提交
  8. 19 9月, 2015 1 次提交
  9. 03 9月, 2015 1 次提交
    • D
      netfilter: nf_conntrack: make nf_ct_zone_dflt built-in · 62da9865
      Daniel Borkmann 提交于
      Fengguang reported, that some randconfig generated the following linker
      issue with nf_ct_zone_dflt object involved:
      
        [...]
        CC      init/version.o
        LD      init/built-in.o
        net/built-in.o: In function `ipv4_conntrack_defrag':
        nf_defrag_ipv4.c:(.text+0x93e95): undefined reference to `nf_ct_zone_dflt'
        net/built-in.o: In function `ipv6_defrag':
        nf_defrag_ipv6_hooks.c:(.text+0xe3ffe): undefined reference to `nf_ct_zone_dflt'
        make: *** [vmlinux] Error 1
      
      Given that configurations exist where we have a built-in part, which is
      accessing nf_ct_zone_dflt such as the two handlers nf_ct_defrag_user()
      and nf_ct6_defrag_user(), and a part that configures nf_conntrack as a
      module, we must move nf_ct_zone_dflt into a fixed, guaranteed built-in
      area when netfilter is configured in general.
      
      Therefore, split the more generic parts into a common header under
      include/linux/netfilter/ and move nf_ct_zone_dflt into the built-in
      section that already holds parts related to CONFIG_NF_CONNTRACK in the
      netfilter core. This fixes the issue on my side.
      
      Fixes: 308ac914 ("netfilter: nf_conntrack: push zone object into functions")
      Reported-by: NFengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      62da9865
  10. 29 8月, 2015 1 次提交
    • F
      netfilter: reduce sparse warnings · 851345c5
      Florian Westphal 提交于
      bridge/netfilter/ebtables.c:290:26: warning: incorrect type in assignment (different modifiers)
      -> remove __pure annotation.
      
      ipv6/netfilter/ip6t_SYNPROXY.c:240:27: warning: cast from restricted __be16
      -> switch ntohs to htons and vice versa.
      
      netfilter/core.c:391:30: warning: symbol 'nfq_ct_nat_hook' was not declared. Should it be static?
      -> delete it, got removed
      
      net/netfilter/nf_synproxy_core.c:221:48: warning: cast to restricted __be32
      -> Use __be32 instead of u32.
      
      Tested with objdiff that these changes do not affect generated code.
      Signed-off-by: NFlorian Westphal <fw@strlen.de>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      851345c5
  11. 23 7月, 2015 3 次提交
    • P
      netfilter: rename local nf_hook_list to hook_list · 3bbd14e0
      Pablo Neira Ayuso 提交于
      085db2c0 ("netfilter: Per network namespace netfilter hooks.") introduced a
      new nf_hook_list that is global, so let's avoid this overlap.
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      Acked-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      3bbd14e0
    • P
      netfilter: fix possible removal of wrong hook · 7181ebaf
      Pablo Neira Ayuso 提交于
      nf_unregister_net_hook() uses the nf_hook_ops fields as tuple to look up for
      the corresponding hook in the list. However, we may have two hooks with exactly
      the same configuration.
      
      This shouldn't be a problem for nftables since every new chain has an unique
      priv field set, but this may still cause us problems in the future, so better
      address this problem now by keeping a reference to the original nf_hook_ops
      structure to make sure we delete the right hook from nf_unregister_net_hook().
      
      Fixes: 085db2c0 ("netfilter: Per network namespace netfilter hooks.")
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      Acked-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      7181ebaf
    • P
      netfilter: nf_queue: fix nf_queue_nf_hook_drop() · 2385eb0c
      Pablo Neira Ayuso 提交于
      This function reacquires the rtnl_lock() which is already held by
      nf_unregister_hook().
      
      This can be triggered via: modprobe nf_conntrack_ipv4 && rmmod nf_conntrack_ipv4
      
      [  720.628746] INFO: task rmmod:3578 blocked for more than 120 seconds.
      [  720.628749]       Not tainted 4.2.0-rc2+ #113
      [  720.628752] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [  720.628754] rmmod           D ffff8800ca46fd58     0  3578   3571 0x00000080
      [...]
      [  720.628783] Call Trace:
      [  720.628790]  [<ffffffff8152ea0b>] schedule+0x6b/0x90
      [  720.628795]  [<ffffffff8152ecb3>] schedule_preempt_disabled+0x13/0x20
      [  720.628799]  [<ffffffff8152ff55>] mutex_lock_nested+0x1f5/0x380
      [  720.628803]  [<ffffffff81462622>] ? rtnl_lock+0x12/0x20
      [  720.628807]  [<ffffffff81462622>] ? rtnl_lock+0x12/0x20
      [  720.628812]  [<ffffffff81462622>] rtnl_lock+0x12/0x20
      [  720.628817]  [<ffffffff8148ab25>] nf_queue_nf_hook_drop+0x15/0x160
      [  720.628825]  [<ffffffff81488d48>] nf_unregister_net_hook+0x168/0x190
      [  720.628831]  [<ffffffff81488e24>] nf_unregister_hook+0x64/0x80
      [  720.628837]  [<ffffffff81488e60>] nf_unregister_hooks+0x20/0x30
      [...]
      
      Moreover, nf_unregister_net_hook() should only destroy the queue for this
      netns, not for every netns.
      Reported-by: NFengguang Wu <fengguang.wu@intel.com>
      Fixes: 085db2c0 ("netfilter: Per network namespace netfilter hooks.")
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      Acked-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      2385eb0c
  12. 20 7月, 2015 1 次提交
  13. 16 7月, 2015 2 次提交
    • F
      netfilter: move tee_active to core · e7c8899f
      Florian Westphal 提交于
      This prepares for a TEE like expression in nftables.
      We want to ensure only one duplicate is sent, so both will
      use the same percpu variable to detect duplication.
      
      The other use case is detection of recursive call to xtables, but since
      we don't want dependency from nft to xtables core its put into core.c
      instead of the x_tables core.
      Signed-off-by: NFlorian Westphal <fw@strlen.de>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      e7c8899f
    • E
      netfilter: Per network namespace netfilter hooks. · 085db2c0
      Eric W. Biederman 提交于
      - Add a new set of functions for registering and unregistering per
        network namespace hooks.
      
      - Modify the old global namespace hook functions to use the per
        network namespace hooks in their implementation, so their remains a
        single list that needs to be walked for any hook (this is important
        for keeping the hook priority working and for keeping the code
        walking the hooks simple).
      
      - Only allow registering the per netdevice hooks in the network
        namespace where the network device lives.
      
      - Dynamically allocate the structures in the per network namespace
        hook list in nf_register_net_hook, and unregister them in
        nf_unregister_net_hook.
      
        Dynamic allocate is required somewhere as the number of network
        namespaces are not fixed so we might as well allocate them in the
        registration function.
      
        The chain of registered hooks on any list is expected to be small so
        the cost of walking that list to find the entry we are unregistering
        should also be small.
      
        Performing the management of the dynamically allocated list entries
        in the registration and unregistration functions keeps the complexity
        from spreading.
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      085db2c0
  14. 15 7月, 2015 2 次提交
  15. 23 6月, 2015 1 次提交
    • E
      netfilter: nf_qeueue: Drop queue entries on nf_unregister_hook · 8405a8ff
      Eric W. Biederman 提交于
      Add code to nf_unregister_hook to flush the nf_queue when a hook is
      unregistered.  This guarantees that the pointer that the nf_queue code
      retains into the nf_hook list will remain valid while a packet is
      queued.
      
      I tested what would happen if we do not flush queued packets and was
      trivially able to obtain the oops below.  All that was required was
      to stop the nf_queue listening process, to delete all of the nf_tables,
      and to awaken the nf_queue listening process.
      
      > BUG: unable to handle kernel paging request at 0000000100000001
      > IP: [<0000000100000001>] 0x100000001
      > PGD b9c35067 PUD 0
      > Oops: 0010 [#1] SMP
      > Modules linked in:
      > CPU: 0 PID: 519 Comm: lt-nfqnl_test Not tainted
      > task: ffff8800b9c8c050 ti: ffff8800ba9d8000 task.ti: ffff8800ba9d8000
      > RIP: 0010:[<0000000100000001>]  [<0000000100000001>] 0x100000001
      > RSP: 0018:ffff8800ba9dba40  EFLAGS: 00010a16
      > RAX: ffff8800bab48a00 RBX: ffff8800ba9dba90 RCX: ffff8800ba9dba90
      > RDX: ffff8800b9c10128 RSI: ffff8800ba940900 RDI: ffff8800bab48a00
      > RBP: ffff8800b9c10128 R08: ffffffff82976660 R09: ffff8800ba9dbb28
      > R10: dead000000100100 R11: dead000000200200 R12: ffff8800ba940900
      > R13: ffffffff8313fd50 R14: ffff8800b9c95200 R15: 0000000000000000
      > FS:  00007fb91fc34700(0000) GS:ffff8800bfa00000(0000) knlGS:0000000000000000
      > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      > CR2: 0000000100000001 CR3: 00000000babfb000 CR4: 00000000000007f0
      > Stack:
      >  ffffffff8206ab0f ffffffff82982240 ffff8800bab48a00 ffff8800b9c100a8
      >  ffff8800b9c10100 0000000000000001 ffff8800ba940900 ffff8800b9c10128
      >  ffffffff8206bd65 ffff8800bfb0d5e0 ffff8800bab48a00 0000000000014dc0
      > Call Trace:
      >  [<ffffffff8206ab0f>] ? nf_iterate+0x4f/0xa0
      >  [<ffffffff8206bd65>] ? nf_reinject+0x125/0x190
      >  [<ffffffff8206dee5>] ? nfqnl_recv_verdict+0x255/0x360
      >  [<ffffffff81386290>] ? nla_parse+0x80/0xf0
      >  [<ffffffff8206c42c>] ? nfnetlink_rcv_msg+0x13c/0x240
      >  [<ffffffff811b2fec>] ? __memcg_kmem_get_cache+0x4c/0x150
      >  [<ffffffff8206c2f0>] ? nfnl_lock+0x20/0x20
      >  [<ffffffff82068159>] ? netlink_rcv_skb+0xa9/0xc0
      >  [<ffffffff820677bf>] ? netlink_unicast+0x12f/0x1c0
      >  [<ffffffff82067ade>] ? netlink_sendmsg+0x28e/0x650
      >  [<ffffffff81fdd814>] ? sock_sendmsg+0x44/0x50
      >  [<ffffffff81fde07b>] ? ___sys_sendmsg+0x2ab/0x2c0
      >  [<ffffffff810e8f73>] ? __wake_up+0x43/0x70
      >  [<ffffffff8141a134>] ? tty_write+0x1c4/0x2a0
      >  [<ffffffff81fde9f4>] ? __sys_sendmsg+0x44/0x80
      >  [<ffffffff823ff8d7>] ? system_call_fastpath+0x12/0x6a
      > Code:  Bad RIP value.
      > RIP  [<0000000100000001>] 0x100000001
      >  RSP <ffff8800ba9dba40>
      > CR2: 0000000100000001
      > ---[ end trace 08eb65d42362793f ]---
      
      Cc: stable@vger.kernel.org
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8405a8ff
  16. 14 5月, 2015 2 次提交
    • P
      netfilter: add netfilter ingress hook after handle_ing() under unique static key · e687ad60
      Pablo Neira 提交于
      This patch adds the Netfilter ingress hook just after the existing tc ingress
      hook, that seems to be the consensus solution for this.
      
      Note that the Netfilter hook resides under the global static key that enables
      ingress filtering. Nonetheless, Netfilter still also has its own static key for
      minimal impact on the existing handle_ing().
      
      * Without this patch:
      
      Result: OK: 6216490(c6216338+d152) usec, 100000000 (60byte,0frags)
        16086246pps 7721Mb/sec (7721398080bps) errors: 100000000
      
          42.46%  kpktgend_0   [kernel.kallsyms]   [k] __netif_receive_skb_core
          25.92%  kpktgend_0   [kernel.kallsyms]   [k] kfree_skb
           7.81%  kpktgend_0   [pktgen]            [k] pktgen_thread_worker
           5.62%  kpktgend_0   [kernel.kallsyms]   [k] ip_rcv
           2.70%  kpktgend_0   [kernel.kallsyms]   [k] netif_receive_skb_internal
           2.34%  kpktgend_0   [kernel.kallsyms]   [k] netif_receive_skb_sk
           1.44%  kpktgend_0   [kernel.kallsyms]   [k] __build_skb
      
      * With this patch:
      
      Result: OK: 6214833(c6214731+d101) usec, 100000000 (60byte,0frags)
        16090536pps 7723Mb/sec (7723457280bps) errors: 100000000
      
          41.23%  kpktgend_0      [kernel.kallsyms]  [k] __netif_receive_skb_core
          26.57%  kpktgend_0      [kernel.kallsyms]  [k] kfree_skb
           7.72%  kpktgend_0      [pktgen]           [k] pktgen_thread_worker
           5.55%  kpktgend_0      [kernel.kallsyms]  [k] ip_rcv
           2.78%  kpktgend_0      [kernel.kallsyms]  [k] netif_receive_skb_internal
           2.06%  kpktgend_0      [kernel.kallsyms]  [k] netif_receive_skb_sk
           1.43%  kpktgend_0      [kernel.kallsyms]  [k] __build_skb
      
      * Without this patch + tc ingress:
      
              tc filter add dev eth4 parent ffff: protocol ip prio 1 \
                      u32 match ip dst 4.3.2.1/32
      
      Result: OK: 9269001(c9268821+d179) usec, 100000000 (60byte,0frags)
        10788648pps 5178Mb/sec (5178551040bps) errors: 100000000
      
          40.99%  kpktgend_0   [kernel.kallsyms]  [k] __netif_receive_skb_core
          17.50%  kpktgend_0   [kernel.kallsyms]  [k] kfree_skb
          11.77%  kpktgend_0   [cls_u32]          [k] u32_classify
           5.62%  kpktgend_0   [kernel.kallsyms]  [k] tc_classify_compat
           5.18%  kpktgend_0   [pktgen]           [k] pktgen_thread_worker
           3.23%  kpktgend_0   [kernel.kallsyms]  [k] tc_classify
           2.97%  kpktgend_0   [kernel.kallsyms]  [k] ip_rcv
           1.83%  kpktgend_0   [kernel.kallsyms]  [k] netif_receive_skb_internal
           1.50%  kpktgend_0   [kernel.kallsyms]  [k] netif_receive_skb_sk
           0.99%  kpktgend_0   [kernel.kallsyms]  [k] __build_skb
      
      * With this patch + tc ingress:
      
              tc filter add dev eth4 parent ffff: protocol ip prio 1 \
                      u32 match ip dst 4.3.2.1/32
      
      Result: OK: 9308218(c9308091+d126) usec, 100000000 (60byte,0frags)
        10743194pps 5156Mb/sec (5156733120bps) errors: 100000000
      
          42.01%  kpktgend_0   [kernel.kallsyms]   [k] __netif_receive_skb_core
          17.78%  kpktgend_0   [kernel.kallsyms]   [k] kfree_skb
          11.70%  kpktgend_0   [cls_u32]           [k] u32_classify
           5.46%  kpktgend_0   [kernel.kallsyms]   [k] tc_classify_compat
           5.16%  kpktgend_0   [pktgen]            [k] pktgen_thread_worker
           2.98%  kpktgend_0   [kernel.kallsyms]   [k] ip_rcv
           2.84%  kpktgend_0   [kernel.kallsyms]   [k] tc_classify
           1.96%  kpktgend_0   [kernel.kallsyms]   [k] netif_receive_skb_internal
           1.57%  kpktgend_0   [kernel.kallsyms]   [k] netif_receive_skb_sk
      
      Note that the results are very similar before and after.
      
      I can see gcc gets the code under the ingress static key out of the hot path.
      Then, on that cold branch, it generates the code to accomodate the netfilter
      ingress static key. My explanation for this is that this reduces the pressure
      on the instruction cache for non-users as the new code is out of the hot path,
      and it comes with minimal impact for tc ingress users.
      
      Using gcc version 4.8.4 on:
      
      Architecture:          x86_64
      CPU op-mode(s):        32-bit, 64-bit
      Byte Order:            Little Endian
      CPU(s):                8
      [...]
      L1d cache:             16K
      L1i cache:             64K
      L2 cache:              2048K
      L3 cache:              8192K
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      Acked-by: NAlexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e687ad60
    • P
      f7191483
  17. 05 4月, 2015 2 次提交
  18. 13 11月, 2014 1 次提交
    • F
      netfilter: fix various sparse warnings · 56768644
      Florian Westphal 提交于
      net/bridge/br_netfilter.c:870:6: symbol 'br_netfilter_enable' was not declared. Should it be static?
        no; add include
      net/ipv4/netfilter/nft_reject_ipv4.c:22:6: symbol 'nft_reject_ipv4_eval' was not declared. Should it be static?
        yes
      net/ipv6/netfilter/nf_reject_ipv6.c:16:6: symbol 'nf_send_reset6' was not declared. Should it be static?
        no; add include
      net/ipv6/netfilter/nft_reject_ipv6.c:22:6: symbol 'nft_reject_ipv6_eval' was not declared. Should it be static?
        yes
      net/netfilter/core.c:33:32: symbol 'nf_ipv6_ops' was not declared. Should it be static?
        no; add include
      net/netfilter/xt_DSCP.c:40:57: cast truncates bits from constant value (ffffff03 becomes 3)
      net/netfilter/xt_DSCP.c:57:59: cast truncates bits from constant value (ffffff03 becomes 3)
        add __force, 3 is what we want.
      net/ipv4/netfilter/nf_log_arp.c:77:6: symbol 'nf_log_arp_packet' was not declared. Should it be static?
        yes
      net/ipv4/netfilter/nf_reject_ipv4.c:17:6: symbol 'nf_send_reset' was not declared. Should it be static?
        no; add include
      Signed-off-by: NFlorian Westphal <fw@strlen.de>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      56768644
  19. 25 8月, 2014 1 次提交
  20. 08 8月, 2014 1 次提交
  21. 14 10月, 2013 1 次提交
  22. 31 7月, 2013 1 次提交
  23. 23 5月, 2013 2 次提交
    • P
      netfilter: don't panic on error while walking through the init path · 6d11cfdb
      Pablo Neira Ayuso 提交于
      Don't panic if we hit an error while adding the nf_log or pernet
      netfilter support, just bail out.
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      Acked-by: NGao feng <gaofeng@cn.fujitsu.com>
      6d11cfdb
    • F
      netfilter: add nf_ipv6_ops hook to fix xt_addrtype with IPv6 · 2a7851bf
      Florian Westphal 提交于
      Quoting https://bugzilla.netfilter.org/show_bug.cgi?id=812:
      
      [ ip6tables -m addrtype ]
      When I tried to use in the nat/PREROUTING it messes up the
      routing cache even if the rule didn't matched at all.
      [..]
      If I remove the --limit-iface-in from the non-working scenario, so just
      use the -m addrtype --dst-type LOCAL it works!
      
      This happens when LOCAL type matching is requested with --limit-iface-in,
      and the default ipv6 route is via the interface the packet we test
      arrived on.
      
      Because xt_addrtype uses ip6_route_output, the ipv6 routing implementation
      creates an unwanted cached entry, and the packet won't make it to the
      real/expected destination.
      
      Silently ignoring --limit-iface-in makes the routing work but it breaks
      rule matching (--dst-type LOCAL with limit-iface-in is supposed to only
      match if the dst address is configured on the incoming interface;
      without --limit-iface-in it will match if the address is reachable
      via lo).
      
      The test should call ipv6_chk_addr() instead.  However, this would add
      a link-time dependency on ipv6.
      
      There are two possible solutions:
      
      1) Revert the commit that moved ipt_addrtype to xt_addrtype,
         and put ipv6 specific code into ip6t_addrtype.
      2) add new "nf_ipv6_ops" struct to register pointers to ipv6 functions.
      
      While the former might seem preferable, Pablo pointed out that there
      are more xt modules with link-time dependeny issues regarding ipv6,
      so lets go for 2).
      Signed-off-by: NFlorian Westphal <fw@strlen.de>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      2a7851bf
  24. 19 4月, 2013 1 次提交
    • P
      netfilter: add my copyright statements · f229f6ce
      Patrick McHardy 提交于
      Add copyright statements to all netfilter files which have had significant
      changes done by myself in the past.
      
      Some notes:
      
      - nf_conntrack_ecache.c was incorrectly attributed to Rusty and Netfilter
        Core Team when it got split out of nf_conntrack_core.c. The copyrights
        even state a date which lies six years before it was written. It was
        written in 2005 by Harald and myself.
      
      - net/ipv{4,6}/netfilter.c, net/netfitler/nf_queue.c were missing copyright
        statements. I've added the copyright statement from net/netfilter/core.c,
        where this code originated
      
      - for nf_conntrack_proto_tcp.c I've also added Jozsef, since I didn't want
        it to give the wrong impression
      Signed-off-by: NPatrick McHardy <kaber@trash.net>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      f229f6ce
  25. 06 4月, 2013 2 次提交
  26. 03 12月, 2012 1 次提交
  27. 03 9月, 2012 2 次提交