1. 14 Jun, 2015 · 8 commits
    • netfilter: ipset: Introduce RCU locking in list type · 00590fdd
      Jozsef Kadlecsik authored
      Standard rculist is used.
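      A minimal sketch of the standard rculist pattern referred to here,
      with illustrative names (my_elem, list_set_test, list_set_del) rather
      than the actual ipset code::

       struct my_elem {
               struct list_head list;
               u32 value;
               struct rcu_head rcu;    /* for kfree_rcu() */
       };

       /* Readers traverse the list locklessly; writers use the _rcu
        * list helpers under the set's lock. */
       static int list_set_test(struct list_head *head, u32 value)
       {
               struct my_elem *e;
               int found = 0;

               rcu_read_lock();
               list_for_each_entry_rcu(e, head, list)
                       if (e->value == value) {
                               found = 1;
                               break;
                       }
               rcu_read_unlock();
               return found;
       }

       static void list_set_del(struct my_elem *e)
       {
               list_del_rcu(&e->list);  /* readers may still see the entry */
               kfree_rcu(e, rcu);       /* free only after a grace period */
       }
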
      Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
    • netfilter: ipset: Introduce RCU locking in hash:* types · 18f84d41
      Jozsef Kadlecsik authored
      Three types of data need to be protected in the case of the hash types:
      
      a. The hash buckets: standard RCU pointer operations are used, as
         sketched below this list.
      b. The element blobs in the hash buckets are stored in an array, and
         a bitmap is used for book-keeping to tell which elements of the
         array are used or free.
      c. The networks-per-cidr values and the cidr values themselves are
         stored in fixed-size arrays and need no protection. The values are
         modified in such an order that, in the worst case, an element test
         is repeated once with the same cidr value.
      
      The ipset hash approach uses arrays instead of lists and therefore is
      incompatible with rhashtable.
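
      A rough sketch of points (a) and (b), using hypothetical structures
      and helpers (elem_equal(), ahash_data()) rather than the real
      ip_set_hash_gen.h code::

       struct hbucket {
               struct rcu_head rcu;     /* for kfree_rcu() */
               unsigned long used;      /* bitmap of live slots, point (b) */
               u8 size;                 /* slots allocated in value[] */
               unsigned char value[];   /* element blobs */
       };

       struct htable {
               u8 htable_bits;
               struct hbucket __rcu *bucket[];
       };

       /* Point (a): readers fetch the bucket pointer under RCU. */
       static bool hash_test(struct htable *t, u32 key, const void *data)
       {
               const struct hbucket *n = rcu_dereference_bh(t->bucket[key]);
               u8 i;

               for (i = 0; i < n->size; i++) {
                       if (!test_bit(i, &n->used))
                               continue;        /* slot is free, skip it */
                       if (elem_equal(ahash_data(n, i), data))
                               return true;
               }
               return false;
       }

       /* Writers publish a new bucket copy and free the old one only
        * after a grace period has passed. */
       static void hash_replace(struct htable *t, u32 key,
                                struct hbucket *old, struct hbucket *new)
       {
               rcu_assign_pointer(t->bucket[key], new);
               kfree_rcu(old, rcu);
       }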
      
      Performance was tested by Jesper Dangaard Brouer:
      
      Simple drop in raw/PREROUTING
      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      
      Dropping via simple iptables net-mask match::
      
       iptables -t raw -N simple || iptables -t raw -F simple
       iptables -t raw -I simple  -s 198.18.0.0/15 -j DROP
       iptables -t raw -D PREROUTING -j simple
       iptables -t raw -I PREROUTING -j simple
      
      Drop performance in "raw": 11.3Mpps
      
      Generator: sending 12.2Mpps (tx:12264083 pps)
      
      Drop via original ipset in RAW table
      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      
      Create a set with lots of elements::
      
       sudo ./ipset destroy test
       echo "create test hash:ip hashsize 65536" > test.set
       for x in `seq 0 255`; do
          for y in `seq 0 255`; do
              echo "add test 198.18.$x.$y" >> test.set
          done
       done
       sudo ./ipset restore < test.set
      
      Dropping via ipset::
      
       iptables -t raw -F
       iptables -t raw -N net198 || iptables -t raw -F net198
       iptables -t raw -I net198 -m set --match-set test src -j DROP
       iptables -t raw -I PREROUTING -j net198
      
      Drop performance in "raw" with ipset: 8Mpps
      
      Perf report numbers for the ipset drop in "raw"::
      
       +   24.65%  ksoftirqd/1  [ip_set]           [k] ip_set_test
       -   21.42%  ksoftirqd/1  [kernel.kallsyms]  [k] _raw_read_lock_bh
          - _raw_read_lock_bh
             + 99.88% ip_set_test
       -   19.42%  ksoftirqd/1  [kernel.kallsyms]  [k] _raw_read_unlock_bh
          - _raw_read_unlock_bh
             + 99.72% ip_set_test
       +    4.31%  ksoftirqd/1  [ip_set_hash_ip]   [k] hash_ip4_kadt
       +    2.27%  ksoftirqd/1  [ixgbe]            [k] ixgbe_fetch_rx_buffer
       +    2.18%  ksoftirqd/1  [ip_tables]        [k] ipt_do_table
       +    1.81%  ksoftirqd/1  [ip_set_hash_ip]   [k] hash_ip4_test
       +    1.61%  ksoftirqd/1  [kernel.kallsyms]  [k] __netif_receive_skb_core
       +    1.44%  ksoftirqd/1  [kernel.kallsyms]  [k] build_skb
       +    1.42%  ksoftirqd/1  [kernel.kallsyms]  [k] ip_rcv
       +    1.36%  ksoftirqd/1  [kernel.kallsyms]  [k] __local_bh_enable_ip
       +    1.16%  ksoftirqd/1  [kernel.kallsyms]  [k] dev_gro_receive
       +    1.09%  ksoftirqd/1  [kernel.kallsyms]  [k] __rcu_read_unlock
       +    0.96%  ksoftirqd/1  [ixgbe]            [k] ixgbe_clean_rx_irq
       +    0.95%  ksoftirqd/1  [kernel.kallsyms]  [k] __netdev_alloc_frag
       +    0.88%  ksoftirqd/1  [kernel.kallsyms]  [k] kmem_cache_alloc
       +    0.87%  ksoftirqd/1  [xt_set]           [k] set_match_v3
       +    0.85%  ksoftirqd/1  [kernel.kallsyms]  [k] inet_gro_receive
       +    0.83%  ksoftirqd/1  [kernel.kallsyms]  [k] nf_iterate
       +    0.76%  ksoftirqd/1  [kernel.kallsyms]  [k] put_compound_page
       +    0.75%  ksoftirqd/1  [kernel.kallsyms]  [k] __rcu_read_lock
      
      Drop via ipset in RAW table with RCU-locking
      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      
      With RCU locking, the RW-lock is gone.
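
      The shape of the fast-path change, as a simplified sketch (the actual
      code is in ip_set_core.c; set->variant->kadt() is the per-type
      lookup)::

       /* Before: every packet bounced the per-set rwlock, which is
        * what _raw_read_lock_bh/_raw_read_unlock_bh show above. */
       read_lock_bh(&set->lock);
       ret = set->variant->kadt(set, skb, par, IPSET_TEST, opt);
       read_unlock_bh(&set->lock);

       /* After: the test runs inside an RCU read-side critical
        * section, with no shared cache line to bounce. */
       rcu_read_lock_bh();
       ret = set->variant->kadt(set, skb, par, IPSET_TEST, opt);
       rcu_read_unlock_bh();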
      
      Drop performance in "raw" with ipset with RCU-locking: 11.3Mpps
      Performance-tested-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
    • netfilter: ipset: Introduce RCU locking in bitmap:* types · 96f51428
      Jozsef Kadlecsik authored
      Not much is required, because the bitmap types use atomic bit
      operations. However, the logic of adding elements changed slightly:
      first the MAC address is updated (which is not atomic), then the
      element is activated (added). The extensions may call kfree_rcu(),
      therefore we call rcu_barrier() at module removal.
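      A minimal sketch of that ordering for a bitmap:ip,mac-style type,
      with illustrative names (elem, map->members; the barrier is shown
      for clarity)::

       /* Add: store the MAC with plain writes first, then set the
        * membership bit atomically, so a concurrent reader never
        * matches against a half-written address. */
       ether_addr_copy(elem->ether, mac);      /* not atomic */
       smp_mb__before_atomic();                /* order stores vs. set_bit */
       set_bit(id, map->members);              /* element becomes live */

       /* Test: check the bit first, only then look at the MAC. */
       if (test_bit(id, map->members) &&
           ether_addr_equal(elem->ether, mac))
               return 1;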
      Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
    • netfilter: ipset: Prepare the ipset core to use RCU at set level · b57b2d1f
      Jozsef Kadlecsik authored
      Replace rwlock_t with spinlock_t in "struct ip_set" and change the
      locking accordingly. Convert the comment extension into an RCU-aware
      object. Also, simplify the timeout routines.
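      An RCU-aware comment extension might look roughly like this (a
      sketch with assumed names, not the exact ip_set_comment.h code)::

       struct ip_set_comment_rcu {
               struct rcu_head rcu;
               char str[];             /* the comment text itself */
       };

       /* Replace a comment: publish the new string with
        * rcu_assign_pointer(), free the old one only after all
        * readers are done with it. */
       static void comment_replace(struct ip_set_comment_rcu __rcu **slot,
                                   const char *str)
       {
               struct ip_set_comment_rcu *n, *old;

               n = kmalloc(sizeof(*n) + strlen(str) + 1, GFP_ATOMIC);
               if (!n)
                       return;
               strcpy(n->str, str);
               old = rcu_dereference_protected(*slot, 1);
               rcu_assign_pointer(*slot, n);
               if (old)
                       kfree_rcu(old, rcu);
       }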
      Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
    • netfilter: ipset: Remove rbtree from hash:net,iface · bd55389c
      Jozsef Kadlecsik authored
      Remove the rbtree in order to introduce RCU instead of the rwlock in ipset.
      Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
    • netfilter: ipset: Make sure listing doesn't grab a set which is just being destroyed. · 9c1ba5c8
      Jozsef Kadlecsik authored
      There was a small window during the destruction of all sets in which
      a concurrent listing of all sets could grab a set that was just being
      destroyed.
      Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
    • netfilter: ipset: Fix parallel resizing and listing of the same set · c4c99783
      Jozsef Kadlecsik authored
      When elements are added to a hash:* type of set and a resize is
      triggered, a parallel listing could start to list the original set
      (before resizing) and "continue" with listing the new set. Fix it by
      reference counting and by listing from the original hash table
      throughout. As a consequence, the original hash table may be destroyed
      from either the resizing or the listing function, as sketched below.
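      A sketch of the shape of the fix, with hypothetical names (the real
      reference handling lives in ip_set_hash_gen.h)::

       struct htable {
               atomic_t ref;           /* dumpers still using this table */
               struct rcu_head rcu;    /* for a delayed free */
               /* buckets follow */
       };

       /* A lister pins the table it started from, so a concurrent
        * resize cannot free it mid-dump. */
       static void htable_get(struct htable *t)
       {
               atomic_inc(&t->ref);
       }

       /* The last user, whether the resizer or the lister, frees it. */
       static void htable_put(struct htable *t)
       {
               if (atomic_dec_and_test(&t->ref))
                       kfree_rcu(t, rcu);
       }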
      Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>