1. 23 Apr, 2014 1 commit
  2. 14 Apr, 2014 2 commits
    • netfilter: nf_tables: fix nft_cmp_fast failure on big endian for size < 4 · b855d416
      Committed by Patrick McHardy
      nft_cmp_fast is used for equality comparisons of size <= 4. For
      comparisons of size < 4 bytes a mask is calculated that is applied to
      both the data from userspace (during initialization) and the register
      value (during runtime). Both values are stored using (in effect) memcpy
      to a memory area that is then interpreted as u32 by nft_cmp_fast.
      
      This works fine on little endian since smaller types have the same base
      address, however on big endian this is not true and the smaller types
      are interpreted as a big number with trailing zero bytes.
      
      The mask therefore must not include the lower bytes, but the higher bytes
      on big endian. Add a helper function that does a cpu_to_le32 to switch
      the bytes on big endian. Since we're dealing with a mask of just consecutive
      bits, this works out fine.
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
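
The mask handling described in this commit can be sketched in userspace C. This is an illustrative model only, not the kernel code: `to_le32()` stands in for the kernel's `cpu_to_le32()`, and `cmp_fast_mask()` and `cmp_fast_eq()` are hypothetical names. It assumes the compared length is a whole number of bytes between 1 and 4.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Byte-swap a 32-bit value. */
static uint32_t swab32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Detect host endianness by looking at the first byte of the value 1. */
static int host_is_big_endian(void)
{
    const uint32_t probe = 1;
    unsigned char first;

    memcpy(&first, &probe, 1);
    return first == 0;
}

/* Hypothetical stand-in for cpu_to_le32(): a no-op on little endian. */
static uint32_t to_le32(uint32_t v)
{
    return host_is_big_endian() ? swab32(v) : v;
}

/* Mask selecting the first len_bits bits in memory order. */
static uint32_t cmp_fast_mask(unsigned int len_bits)
{
    assert(len_bits >= 8 && len_bits <= 32 && len_bits % 8 == 0);
    return to_le32(~0u >> (32 - len_bits));
}

/* Compare len bytes that were memcpy'd into 32-bit storage. */
static int cmp_fast_eq(const unsigned char *data,
                       const unsigned char *reg_bytes, unsigned int len)
{
    uint32_t want = 0;
    uint32_t reg = 0xa5a5a5a5u;   /* stale bytes beyond len, as in a reused register */
    uint32_t mask = cmp_fast_mask(len * 8);

    memcpy(&want, data, len);
    memcpy(&reg, reg_bytes, len);
    return (reg & mask) == (want & mask);
}
```

On little endian the mask for two bytes is 0x0000ffff. After the swap on big endian it becomes 0xffff0000, which covers the bytes that memcpy actually wrote.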
    • netfilter: nf_conntrack: initialize net.ct.generation · ee214d54
      Committed by Andrey Vagin
      [  251.920788] INFO: trying to register non-static key.
      [  251.921386] the code is fine but needs lockdep annotation.
      [  251.921386] turning off the locking correctness validator.
      [  251.921386] CPU: 2 PID: 15715 Comm: socket_listen Not tainted 3.14.0+ #294
      [  251.921386] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
      [  251.921386]  0000000000000000 000000009d18c210 ffff880075f039b8 ffffffff816b7ecd
      [  251.921386]  ffffffff822c3b10 ffff880075f039c8 ffffffff816b36f4 ffff880075f03aa0
      [  251.921386]  ffffffff810c65ff ffffffff810c4a85 00000000fffffe01 ffffffffa0075172
      [  251.921386] Call Trace:
      [  251.921386]  [<ffffffff816b7ecd>] dump_stack+0x45/0x56
      [  251.921386]  [<ffffffff816b36f4>] register_lock_class.part.24+0x38/0x3c
      [  251.921386]  [<ffffffff810c65ff>] __lock_acquire+0x168f/0x1b40
      [  251.921386]  [<ffffffff810c4a85>] ? trace_hardirqs_on_caller+0x105/0x1d0
      [  251.921386]  [<ffffffffa0075172>] ? nf_nat_setup_info+0x252/0x3a0 [nf_nat]
      [  251.921386]  [<ffffffff816c1215>] ? _raw_spin_unlock_bh+0x35/0x40
      [  251.921386]  [<ffffffffa0075172>] ? nf_nat_setup_info+0x252/0x3a0 [nf_nat]
      [  251.921386]  [<ffffffff810c7272>] lock_acquire+0xa2/0x120
      [  251.921386]  [<ffffffffa008ab90>] ? ipv4_confirm+0x90/0xf0 [nf_conntrack_ipv4]
      [  251.921386]  [<ffffffffa0055989>] __nf_conntrack_confirm+0x129/0x410 [nf_conntrack]
      [  251.921386]  [<ffffffffa008ab90>] ? ipv4_confirm+0x90/0xf0 [nf_conntrack_ipv4]
      [  251.921386]  [<ffffffffa008ab90>] ipv4_confirm+0x90/0xf0 [nf_conntrack_ipv4]
      [  251.921386]  [<ffffffff815e7b00>] ? ip_fragment+0x9f0/0x9f0
      [  251.921386]  [<ffffffff815d8c5a>] nf_iterate+0xaa/0xc0
      [  251.921386]  [<ffffffff815e7b00>] ? ip_fragment+0x9f0/0x9f0
      [  251.921386]  [<ffffffff815d8d14>] nf_hook_slow+0xa4/0x190
      [  251.921386]  [<ffffffff815e7b00>] ? ip_fragment+0x9f0/0x9f0
      [  251.921386]  [<ffffffff815e98f2>] ip_output+0x92/0x100
      [  251.921386]  [<ffffffff815e8df9>] ip_local_out+0x29/0x90
      [  251.921386]  [<ffffffff815e9240>] ip_queue_xmit+0x170/0x4c0
      [  251.921386]  [<ffffffff815e90d5>] ? ip_queue_xmit+0x5/0x4c0
      [  251.921386]  [<ffffffff81601208>] tcp_transmit_skb+0x498/0x960
      [  251.921386]  [<ffffffff81602d82>] tcp_connect+0x812/0x960
      [  251.921386]  [<ffffffff810e3dc5>] ? ktime_get_real+0x25/0x70
      [  251.921386]  [<ffffffff8159ea2a>] ? secure_tcp_sequence_number+0x6a/0xc0
      [  251.921386]  [<ffffffff81606f57>] tcp_v4_connect+0x317/0x470
      [  251.921386]  [<ffffffff8161f645>] __inet_stream_connect+0xb5/0x330
      [  251.921386]  [<ffffffff8158dfc3>] ? lock_sock_nested+0x33/0xa0
      [  251.921386]  [<ffffffff810c4b5d>] ? trace_hardirqs_on+0xd/0x10
      [  251.921386]  [<ffffffff81078885>] ? __local_bh_enable_ip+0x75/0xe0
      [  251.921386]  [<ffffffff8161f8f8>] inet_stream_connect+0x38/0x50
      [  251.921386]  [<ffffffff8158b157>] SYSC_connect+0xe7/0x120
      [  251.921386]  [<ffffffff810e3789>] ? current_kernel_time+0x69/0xd0
      [  251.921386]  [<ffffffff810c4a85>] ? trace_hardirqs_on_caller+0x105/0x1d0
      [  251.921386]  [<ffffffff810c4b5d>] ? trace_hardirqs_on+0xd/0x10
      [  251.921386]  [<ffffffff8158c36e>] SyS_connect+0xe/0x10
      [  251.921386]  [<ffffffff816caf69>] system_call_fastpath+0x16/0x1b
      [  312.014104] INFO: rcu_sched detected stalls on CPUs/tasks: {} (detected by 0, t=60003 jiffies, g=42359, c=42358, q=333)
      [  312.015097] INFO: Stall ended before state dump start
      
      Fixes: 93bb0ceb ("netfilter: conntrack: remove central spinlock nf_conntrack_lock")
      Cc: Jesper Dangaard Brouer <brouer@redhat.com>
      Cc: Pablo Neira Ayuso <pablo@netfilter.org>
      Cc: Patrick McHardy <kaber@trash.net>
      Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
      Cc: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: Andrey Vagin <avagin@openvz.org>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
  3. 08 Apr, 2014 1 commit
    • netfilter: nf_conntrack: flush net_gre->keymap_list only from gre helper · 8142b227
      Committed by Andrey Vagin
      nf_ct_gre_keymap_flush() removes a nf_ct_gre_keymap object from
      net_gre->keymap_list and frees the object, but it doesn't clear
      the reference to this object in ct_pptp_info->keymap[dir].
      nf_ct_gre_keymap_destroy() may then release the same object again.
      
      So nf_ct_gre_keymap_flush() may be called only when we are sure that
      nf_ct_gre_keymap_destroy() will not be called.
      
      nf_ct_gre_keymap is created by nf_ct_gre_keymap_add() and the right way
      to destroy it is to call nf_ct_gre_keymap_destroy().
      
      This patch marks nf_ct_gre_keymap_flush() as static, so this patch can
      break compilation of third party modules, which use
      nf_ct_gre_keymap_flush. I'm not sure this is the right way to deprecate
      this function.
      
      [  226.540793] general protection fault: 0000 [#1] SMP
      [  226.541750] Modules linked in: nf_nat_pptp nf_nat_proto_gre
      nf_conntrack_pptp nf_conntrack_proto_gre ip_gre ip_tunnel gre
      ppp_deflate bsd_comp ppp_async crc_ccitt ppp_generic slhc xt_nat
      iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat
      nf_conntrack veth tun bridge stp llc ppdev microcode joydev pcspkr
      serio_raw virtio_console virtio_balloon floppy parport_pc parport
      pvpanic i2c_piix4 virtio_net drm_kms_helper ttm ata_generic virtio_pci
      virtio_ring virtio drm i2c_core pata_acpi [last unloaded: ip_tunnel]
      [  226.541776] CPU: 0 PID: 49 Comm: kworker/u4:2 Not tainted 3.14.0-rc8+ #101
      [  226.541776] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
      [  226.541776] Workqueue: netns cleanup_net
      [  226.541776] task: ffff8800371e0000 ti: ffff88003730c000 task.ti: ffff88003730c000
      [  226.541776] RIP: 0010:[<ffffffff81389ba9>]  [<ffffffff81389ba9>] __list_del_entry+0x29/0xd0
      [  226.541776] RSP: 0018:ffff88003730dbd0  EFLAGS: 00010a83
      [  226.541776] RAX: 6b6b6b6b6b6b6b6b RBX: ffff8800374e6c40 RCX: dead000000200200
      [  226.541776] RDX: 6b6b6b6b6b6b6b6b RSI: ffff8800371e07d0 RDI: ffff8800374e6c40
      [  226.541776] RBP: ffff88003730dbd0 R08: 0000000000000000 R09: 0000000000000000
      [  226.541776] R10: 0000000000000001 R11: ffff88003730d92e R12: 0000000000000002
      [  226.541776] R13: ffff88007a4c42d0 R14: ffff88007aef0000 R15: ffff880036cf0018
      [  226.541776] FS:  0000000000000000(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
      [  226.541776] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [  226.541776] CR2: 00007f07f643f7d0 CR3: 0000000036fd2000 CR4: 00000000000006f0
      [  226.541776] Stack:
      [  226.541776]  ffff88003730dbe8 ffffffff81389c5d ffff8800374ffbe4 ffff88003730dc28
      [  226.541776]  ffffffffa0162a43 ffffffffa01627c5 ffff88007a4c42d0 ffff88007aef0000
      [  226.541776]  ffffffffa01651c0 ffff88007a4c45e0 ffff88007aef0000 ffff88003730dc40
      [  226.541776] Call Trace:
      [  226.541776]  [<ffffffff81389c5d>] list_del+0xd/0x30
      [  226.541776]  [<ffffffffa0162a43>] nf_ct_gre_keymap_destroy+0x283/0x2d0 [nf_conntrack_proto_gre]
      [  226.541776]  [<ffffffffa01627c5>] ? nf_ct_gre_keymap_destroy+0x5/0x2d0 [nf_conntrack_proto_gre]
      [  226.541776]  [<ffffffffa0162ab7>] gre_destroy+0x27/0x70 [nf_conntrack_proto_gre]
      [  226.541776]  [<ffffffffa0117de3>] destroy_conntrack+0x83/0x200 [nf_conntrack]
      [  226.541776]  [<ffffffffa0117d87>] ? destroy_conntrack+0x27/0x200 [nf_conntrack]
      [  226.541776]  [<ffffffffa0117d60>] ? nf_conntrack_hash_check_insert+0x2e0/0x2e0 [nf_conntrack]
      [  226.541776]  [<ffffffff81630142>] nf_conntrack_destroy+0x72/0x180
      [  226.541776]  [<ffffffff816300d5>] ? nf_conntrack_destroy+0x5/0x180
      [  226.541776]  [<ffffffffa011ef80>] ? kill_l3proto+0x20/0x20 [nf_conntrack]
      [  226.541776]  [<ffffffffa011847e>] nf_ct_iterate_cleanup+0x14e/0x170 [nf_conntrack]
      [  226.541776]  [<ffffffffa011f74b>] nf_ct_l4proto_pernet_unregister+0x5b/0x90 [nf_conntrack]
      [  226.541776]  [<ffffffffa0162409>] proto_gre_net_exit+0x19/0x30 [nf_conntrack_proto_gre]
      [  226.541776]  [<ffffffff815edf89>] ops_exit_list.isra.1+0x39/0x60
      [  226.541776]  [<ffffffff815eecc0>] cleanup_net+0x100/0x1d0
      [  226.541776]  [<ffffffff810a608a>] process_one_work+0x1ea/0x4f0
      [  226.541776]  [<ffffffff810a6028>] ? process_one_work+0x188/0x4f0
      [  226.541776]  [<ffffffff810a64ab>] worker_thread+0x11b/0x3a0
      [  226.541776]  [<ffffffff810a6390>] ? process_one_work+0x4f0/0x4f0
      [  226.541776]  [<ffffffff810af42d>] kthread+0xed/0x110
      [  226.541776]  [<ffffffff8173d4dc>] ? _raw_spin_unlock_irq+0x2c/0x40
      [  226.541776]  [<ffffffff810af340>] ? kthread_create_on_node+0x200/0x200
      [  226.541776]  [<ffffffff8174747c>] ret_from_fork+0x7c/0xb0
      [  226.541776]  [<ffffffff810af340>] ? kthread_create_on_node+0x200/0x200
      [  226.541776] Code: 00 00 55 48 8b 17 48 b9 00 01 10 00 00 00 ad de
      48 8b 47 08 48 89 e5 48 39 ca 74 29 48 b9 00 02 20 00 00 00 ad de 48
      39 c8 74 7a <4c> 8b 00 4c 39 c7 75 53 4c 8b 42 08 4c 39 c7 75 2b 48 89
      42 08
      [  226.541776] RIP  [<ffffffff81389ba9>] __list_del_entry+0x29/0xd0
      [  226.541776]  RSP <ffff88003730dbd0>
      [  226.612193] ---[ end trace 985ae23ddfcc357c ]---
      
      Cc: Pablo Neira Ayuso <pablo@netfilter.org>
      Cc: Patrick McHardy <kaber@trash.net>
      Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
      Cc: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: Andrey Vagin <avagin@openvz.org>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
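
The double release described in this commit can be modelled in a few lines of userspace C. This is a simplified sketch, not the kernel code: every type and function name is hypothetical, and a `live` flag with a counter replaces a real free so the second release is detected instead of crashing.

```c
#include <assert.h>
#include <stddef.h>

#define DIR_MAX 2

struct keymap {
    struct keymap *next;
    int live;                          /* 1 while allocated, 0 once released */
};

struct keymap_list { struct keymap *head; };
struct pptp_info   { struct keymap *keymap[DIR_MAX]; };

static int double_releases;            /* counts releases of a dead object */

static void keymap_release(struct keymap *km)
{
    if (!km->live)
        double_releases++;
    km->live = 0;
}

/* Link the object into the list and remember it in the per-connection info. */
static void keymap_add(struct keymap_list *l, struct pptp_info *info,
                       int dir, struct keymap *km)
{
    assert(dir >= 0 && dir < DIR_MAX);
    km->live = 1;
    km->next = l->head;
    l->head = km;
    info->keymap[dir] = km;
}

/* Flush: releases every list entry but leaves info->keymap[] dangling. */
static void keymap_flush(struct keymap_list *l)
{
    while (l->head) {
        struct keymap *km = l->head;

        l->head = km->next;
        keymap_release(km);
    }
}

/* Destroy: unlinks, releases and clears the reference. */
static void keymap_destroy(struct keymap_list *l, struct pptp_info *info)
{
    for (int dir = 0; dir < DIR_MAX; dir++) {
        struct keymap *km = info->keymap[dir];

        if (!km)
            continue;
        for (struct keymap **pp = &l->head; *pp; pp = &(*pp)->next) {
            if (*pp == km) {
                *pp = km->next;
                break;
            }
        }
        keymap_release(km);
        info->keymap[dir] = NULL;
    }
}

/* Returns how many objects were released twice in the chosen scenario. */
static int run_scenario(int flush_first)
{
    struct keymap a = {0}, b = {0};
    struct keymap_list list = { NULL };
    struct pptp_info info = { { NULL, NULL } };

    double_releases = 0;
    keymap_add(&list, &info, 0, &a);
    keymap_add(&list, &info, 1, &b);
    if (flush_first)
        keymap_flush(&list);
    keymap_destroy(&list, &info);
    return double_releases;
}
```

In this model, calling destroy alone releases each object once, while a flush followed by destroy releases both objects a second time.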
  4. 04 Apr, 2014 6 commits
  5. 28 Mar, 2014 1 commit
  6. 19 Mar, 2014 1 commit
  7. 18 Mar, 2014 1 commit
  8. 17 Mar, 2014 3 commits
    • netfilter: connlimit: use rbtree for per-host conntrack obj storage · 7d084877
      Committed by Florian Westphal
      With the current match design, every invocation of the connlimit_match
      function means we have to perform (number_of_conntracks % 256) lookups
      in the conntrack table [ to perform GC/delete stale entries ].
      This is also the reason why ____nf_conntrack_find() in perf top has
      > 20% cpu time per core.
      
      This patch changes the storage to rbtree which cuts down the number of
      ct objects that need testing.
      
      When looking up a new tuple, we only test the connections of the host
      objects we visit while searching for the wanted host/network (or
      the leaf we need to insert at).
      
      The slot count is reduced to 32.  Increasing slot count doesn't
      speed up things much because of rbtree nature.
      
      before patch (50kpps rx, 10kpps tx):
      +  20.95%  ksoftirqd/0  [nf_conntrack] [k] ____nf_conntrack_find
      +  20.50%  ksoftirqd/1  [nf_conntrack] [k] ____nf_conntrack_find
      +  20.27%  ksoftirqd/2  [nf_conntrack] [k] ____nf_conntrack_find
      +   5.76%  ksoftirqd/1  [nf_conntrack] [k] hash_conntrack_raw
      +   5.39%  ksoftirqd/2  [nf_conntrack] [k] hash_conntrack_raw
      +   5.35%  ksoftirqd/0  [nf_conntrack] [k] hash_conntrack_raw
      
      after (90kpps, 51kpps tx):
      +  17.24%       swapper  [nf_conntrack]    [k] ____nf_conntrack_find
      +   6.60%   ksoftirqd/2  [nf_conntrack]    [k] ____nf_conntrack_find
      +   2.73%       swapper  [nf_conntrack]    [k] hash_conntrack_raw
      +   2.36%       swapper  [xt_connlimit]    [k] count_tree
      
      Obvious disadvantages to previous version are the increase in code
      complexity and the increased memory cost.
      
      Partially based on Eric Dumazet's fq scheduler.
      Reviewed-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
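
The per-host tree lookup described in this commit can be illustrated with a small userspace sketch. It rests on several assumptions: a plain unbalanced binary search tree stands in for the kernel rbtree, a simple counter stands in for the per-host list of conntrack entries, and all names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* One node per host; the key is the host address. */
struct host_node {
    struct host_node *left, *right;
    uint32_t addr;
    unsigned int conns;                /* connections seen for this host */
};

/*
 * Walk from the root towards addr. Only nodes on the search path are
 * touched. With add != 0 a missing host is inserted at the leaf position
 * the walk ended on, and an existing host has its counter incremented.
 * Returns the connection count for addr, or 0 if the host is unknown
 * (or if allocation fails).
 */
static unsigned int count_host(struct host_node **root, uint32_t addr, int add)
{
    struct host_node **link = root;
    struct host_node *n;

    assert(root != NULL);
    while ((n = *link) != NULL) {
        if (addr < n->addr)
            link = &n->left;
        else if (addr > n->addr)
            link = &n->right;
        else
            return add ? ++n->conns : n->conns;
    }
    if (!add)
        return 0;

    n = calloc(1, sizeof(*n));
    if (n == NULL)
        return 0;
    n->addr = addr;
    n->conns = 1;
    *link = n;
    return 1;
}

/* Release the whole tree. */
static void free_hosts(struct host_node *n)
{
    if (n == NULL)
        return;
    free_hosts(n->left);
    free_hosts(n->right);
    free(n);
}
```

A balanced tree such as the kernel rbtree keeps the search path logarithmic in the number of hosts, which this unbalanced stand-in does not guarantee.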
    • netfilter: connlimit: make same_source_net signed · 50e0e9b1
      Committed by Florian Westphal
      same_source_net() currently returns 1 if both addresses are in the same
      network. Make it work like memcmp/strcmp so it can be used as an rbtree
      search function.
      Reviewed-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
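
A memcmp-style comparison of this kind could look as follows for IPv4. This is a hedged sketch, not the kernel function: `same_source_net_v4()` is a hypothetical name, and the addresses and mask are assumed to be in host byte order already.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Three-way comparison of two addresses under a network mask:
 * negative if addr sorts before other, 0 if both are in the same
 * network, positive otherwise. A tree search can branch on the sign.
 */
static int same_source_net_v4(uint32_t addr, uint32_t mask, uint32_t other)
{
    const uint32_t a = addr & mask;
    const uint32_t b = other & mask;

    return (a > b) - (a < b);
}

/* Usage example with a /24 mask. */
static void same_source_net_v4_example(void)
{
    const uint32_t mask = 0xffffff00u;

    /* 192.168.1.5 and 192.168.1.200 share a network. */
    assert(same_source_net_v4(0xc0a80105u, mask, 0xc0a801c8u) == 0);
    /* 192.168.1.5 sorts before 192.168.2.1. */
    assert(same_source_net_v4(0xc0a80105u, mask, 0xc0a80201u) < 0);
    /* And the reverse order is positive. */
    assert(same_source_net_v4(0xc0a80201u, mask, 0xc0a80105u) > 0);
}
```

Comparing with `(a > b) - (a < b)` instead of subtracting avoids overflow when the two values are far apart.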
    • netfilter: connlimit: use keyed locks · 1442e750
      Committed by Florian Westphal
      connlimit currently suffers from spinlock contention, example for
      4-core system with rps enabled:
      
      +  20.84%   ksoftirqd/2  [kernel.kallsyms] [k] _raw_spin_lock_bh
      +  20.76%   ksoftirqd/1  [kernel.kallsyms] [k] _raw_spin_lock_bh
      +  20.42%   ksoftirqd/0  [kernel.kallsyms] [k] _raw_spin_lock_bh
      +   6.07%   ksoftirqd/2  [nf_conntrack]    [k] ____nf_conntrack_find
      +   6.07%   ksoftirqd/1  [nf_conntrack]    [k] ____nf_conntrack_find
      +   5.97%   ksoftirqd/0  [nf_conntrack]    [k] ____nf_conntrack_find
      +   2.47%   ksoftirqd/2  [nf_conntrack]    [k] hash_conntrack_raw
      +   2.45%   ksoftirqd/0  [nf_conntrack]    [k] hash_conntrack_raw
      +   2.44%   ksoftirqd/1  [nf_conntrack]    [k] hash_conntrack_raw
      
      Keyed locks may allow parallel lookup/insert/delete if the entry is
      hashed to another slot.  With patch:
      
      +  20.95%  ksoftirqd/0  [nf_conntrack] [k] ____nf_conntrack_find
      +  20.50%  ksoftirqd/1  [nf_conntrack] [k] ____nf_conntrack_find
      +  20.27%  ksoftirqd/2  [nf_conntrack] [k] ____nf_conntrack_find
      +   5.76%  ksoftirqd/1  [nf_conntrack] [k] hash_conntrack_raw
      +   5.39%  ksoftirqd/2  [nf_conntrack] [k] hash_conntrack_raw
      +   5.35%  ksoftirqd/0  [nf_conntrack] [k] hash_conntrack_raw
      +   2.00%  ksoftirqd/1  [kernel.kallsyms] [k] __rcu_read_unlock
      
      Improved rx processing rate from ~35kpps to ~50 kpps.
      Reviewed-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
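
The idea of keyed locks can be sketched in userspace C: the hash of an entry selects one lock from a small array, so entries that hash to different locks can be processed in parallel. The sketch makes several assumptions: C11 `atomic_flag` spinlocks stand in for kernel spinlocks, the slot and lock counts are illustrative, and all names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define SLOT_COUNT 256u                /* illustrative number of hash slots */
#define LOCK_COUNT 32u                 /* illustrative number of locks */

/* Entries in the same slot must always map to the same lock. */
static_assert(SLOT_COUNT % LOCK_COUNT == 0,
              "each lock must guard a whole number of slots");

static atomic_flag slot_locks[LOCK_COUNT];
static unsigned int slot_conns[SLOT_COUNT];

/* Put every lock into the unlocked state. */
static void locks_init(void)
{
    for (unsigned int i = 0; i < LOCK_COUNT; i++)
        atomic_flag_clear(&slot_locks[i]);
}

/* The key: which lock protects the slot this hash falls into. */
static unsigned int lock_index(uint32_t hash)
{
    return hash % LOCK_COUNT;
}

/* Count one more connection in the slot for hash, under its keyed lock. */
static unsigned int slot_add(uint32_t hash)
{
    atomic_flag *lock = &slot_locks[lock_index(hash)];
    unsigned int n;

    while (atomic_flag_test_and_set_explicit(lock, memory_order_acquire))
        ;                              /* spin until the lock is free */
    n = ++slot_conns[hash % SLOT_COUNT];
    atomic_flag_clear_explicit(lock, memory_order_release);
    return n;
}
```

Two callers contend only when their hashes select the same lock. With a single global lock, every caller would contend.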
  9. 15 Mar, 2014 1 commit
  10. 13 Mar, 2014 1 commit
  11. 12 Mar, 2014 4 commits
  12. 08 Mar, 2014 5 commits
  13. 07 Mar, 2014 8 commits
  14. 06 Mar, 2014 5 commits